Geek Trivia: 404-letter words
The time has come to celebrate yet another of those all-too-unrecognized geek-centric holidays (which I may have just made up): 404 Day! Every April 4th, Web surfers of every persuasion should take time out to celebrate that one universal experience of all Internet consumers and professionals -- the 404 Page Not Found error. No matter which sites you frequent, which ISP you use, or which end of the browser zealot spectrum you fall on, we've all had our share of 404s.
So, where did the 404 come from (besides the server, of course)? Like pretty much everything World Wide Web-related, the 404 is an official component of the Hypertext Transfer Protocol (HTTP) specification ratified by the World Wide Web Consortium (W3C).
It first appeared in the version 0.9 HTTP spec, adopted in 1992. If you track down that document, you'll notice a rather telling signature: TimBL. That's the byline of one Tim Berners-Lee, he of the "I invented the World Wide Web and the first Web browser" fame. The same guy who made the modern Web page possible also invented the Page Not Found.
Genius though he was, Berners-Lee didn't spin the HTTP status codes out of whole cloth but based them on the preexisting File Transfer Protocol (FTP) status codes. If you compare the two code listings, you'll find only 10 overlapping codes: 100, 200, 202, 425, 426, 500, 501, 502, 503, and 504.
Only 100 and 200 have similar meanings under both standards -- Continue and OK, respectively -- so it's clear Berners-Lee didn't copy FTP into HTTP. For the record, there is no code 404 in FTP, so that infamous error message is original to the Hypertext Transfer Protocol by way of TimBL.
Rumor has it that, whether or not Berners-Lee suspected that code 404 would become famous by virtue of link rot and lazy sysadmins, he intended that particular numeric to include a sly inside joke. You see, the HTTP status code system bears a striking resemblance to the CERN laboratory building numbering system. CERN, the Swiss techno-mecca, is the birthplace of the World Wide Web, leading some to infer that code 404 is a subtle reference to room 404 at CERN.
The only problem with that theory -- or, rather, that urban legend -- is that there is no room 404 at CERN, and there never has been. The real meaning and origin of the 404 code is far more mundane, with each digit having a specific significance.
WHAT DO THE NUMBERS IN STATUS CODE 404 SIGNIFY UNDER THE FORMAL HTTP SPEC?
What do the numbers signify in the famous code 404 Page Not Found error, according to the HTTP status code specification from the World Wide Web Consortium?
In simplest terms, HTTP code 404 means Page Not Found (duh). But specifically, HTTP 404 is actually two phrases: Client Error and Not Found. The first 4 is the error class (Client Error), and 04 is the specific error (Not Found).
By design, the HTTP code spec is extensible for up to 100 codes per class. There are five recognized classes, quoted below from the current HTTP spec:
1xx: Informational: Request received, continuing process.
2xx: Success: The action was successfully received, understood, and accepted.
3xx: Redirection: Further action must be taken in order to complete the request.
4xx: Client Error: The request contains bad syntax or cannot be fulfilled.
5xx: Server Error: The server failed to fulfill an apparently valid request.
Within each class, the remaining two digits identify a specific condition. There are 53 specific codes recognized by the W3C, and most software recognizes these codes as written.
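For the code-inclined, that digit-splitting is easy to sketch. The Python snippet below is purely illustrative: `STATUS_CLASSES` and `split_status` are made-up names, not part of any spec or library, and the class names are simply copied from the list above.

```python
# Class names as given in the five-class list above.
STATUS_CLASSES = {
    1: "Informational",
    2: "Success",
    3: "Redirection",
    4: "Client Error",
    5: "Server Error",
}


def split_status(code: int) -> tuple:
    """Split a three-digit status code into (class digit, detail, class name)."""
    if not 100 <= code <= 599:
        raise ValueError(f"not a valid HTTP status code: {code}")
    # The hundreds digit is the class; the remaining two digits are the detail.
    class_digit, detail = divmod(code, 100)
    return class_digit, detail, STATUS_CLASSES[class_digit]


print(split_status(404))  # (4, 4, 'Client Error')
```

So 404 comes apart as class 4 (Client Error) plus detail 04, which the spec assigns to Not Found.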
That said, no browser has to recognize the specific three-digit code; it's just a recommendation. All a W3C-compliant browser is required to do is recognize the error class -- the first digit.
So, I could institute a fictional HTTP 499 code, "Your browser is a scruffy-looking nerf herder" -- there is no 499 code in the W3C spec -- and a compliant browser would have no idea what that specific code meant but would recognize it as a general client error. It would be up to a human to interpret the lame Empire Strikes Back in-joke.
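Here's roughly how a client might cope with that hypothetical 499. This is a minimal Python sketch, not how any particular browser is implemented: `interpret` is a made-up helper, the table of known codes is borrowed from Python's standard `http.HTTPStatus` enum, and the fallback to the x00 code of the class follows the rule the HTTP/1.1 spec gives for unrecognized codes.

```python
from http import HTTPStatus


def interpret(code: int) -> str:
    """Return a phrase for a status code, falling back to its class's x00 code."""
    if not 100 <= code <= 599:
        raise ValueError(f"not a valid HTTP status code: {code}")
    try:
        # Known code: use the conventional reason phrase.
        return HTTPStatus(code).phrase
    except ValueError:
        # Unknown code: keep only the first digit and treat it as x00.
        fallback = (code // 100) * 100
        return f"unrecognized, treated like {fallback} {HTTPStatus(fallback).phrase}"


print(interpret(404))  # Not Found
print(interpret(499))  # unrecognized, treated like 400 Bad Request
```

The nerf herder text itself never enters into it; software acts on the number, and the wording is there for the human reading the page.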
Incidentally, this does mean that on some level, the 404 error is famous simply because no one found a good reason to stray from the W3C code suggestions. That's also why the exact wording of a 404 return is up to the individual server and sysadmin, busting out anything from a Not Found to a Page Not Found to an elaborately worded Hitchhiker's Guide to the Galaxy reference. That's not just techno-humor -- that's erroneously outrageous Geek Trivia.