Have you ever been to a site looking for content that simply no longer exists? Perhaps you had a link (or someone gave it to you), and it now gets a "file not found" error. Of course, if a desired page has simply been moved, you may be able to find the content with enough digging (or via the Google site search I just wrote about), but what if it really is gone? Are you stuck? Maybe not. If you've never seen the "web archive", you're in for a treat.
The Internet Archive is an ambitious project which for years has been archiving, at regular intervals, the current state of web pages on millions of sites. You can simply visit the site, put in a domain name (or the complete URL to a specific page), and if it's been archived, you'll see the old page in all its glory. Of course, on the surface it seems just plain fun (the site even refers to itself frivolously as the "Wayback Machine"). For instance, search for google.com and you'll see that their oldest hit is from 1998:
http://web.archive.org/web/19981202230410/http://www.google.com/
It's pretty amazing to see how simple it was then, as it is now. Yahoo also started out much simpler:
http://web.archive.org/web/19961017235908/http://www2.yahoo.com/
Of course, for each of these (and any archived site) there may be dozens of points in time at which the archive captured what the site looked like.
But back to the real point in this entry: if you try to visit a URL and it's no longer there, whether it's the whole site or a single page, try the archive. It doesn't just archive the front page; it spiders as much of the site as it can. Indeed, once you call up a page you can often follow the links on it and find that other pages have been archived as well.
For instance, if you try to visit http://www.allaire.com, it now takes you to http://www.macromedia.com (which will someday soon take you to http://www.adobe.com, but that's another story). But visit the archive, and you can see that there are lots of past versions of the Allaire site archived (from 1997-2004):
http://web.archive.org/web/*/http://www.allaire.com
But my real point was that you may want to search for some specific page on a site. For instance, I often come across a web site or email with a link to a Microsoft article that no longer lives at that URL. Often I can find that specific article by looking up the old URL in the archive. It's just awesome. Try it out. (There's even a way to set up a shortcut in your browser to jump to the archive for a page automatically. More on that another time.)
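The archive links shown above all seem to follow one pattern: the archive's address, then either a 14-digit timestamp (year, month, day, hour, minute, second) or a * to list every capture, then the original URL. Here's a minimal Python sketch that builds such links. The function name wayback_url and the constant ARCHIVE_PREFIX are just names I made up for illustration, and the pattern is inferred from the example links in this entry rather than from any official documentation.

```python
from datetime import datetime
from typing import Optional

# Prefix taken from the example links in this entry.
ARCHIVE_PREFIX = "http://web.archive.org/web"


def wayback_url(page_url: str, when: Optional[datetime] = None) -> str:
    """Build an archive link for page_url (hypothetical helper).

    With no date, use "*" to ask for the list of all captures.
    With a date, use a 14-digit YYYYMMDDhhmmss timestamp, as seen
    in the google.com and yahoo.com examples.
    """
    stamp = when.strftime("%Y%m%d%H%M%S") if when else "*"
    return f"{ARCHIVE_PREFIX}/{stamp}/{page_url}"


if __name__ == "__main__":
    # List every capture of the old Allaire site.
    print(wayback_url("http://www.allaire.com"))
    # Link to the 1998 capture of google.com mentioned earlier.
    print(wayback_url("http://www.google.com/", datetime(1998, 12, 2, 23, 4, 10)))
```

Paste either printed link into your browser and you should land on the same pages as the examples above.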
If the content is recently missing, you can generally find it using the
Google search "cache:www.yoursite.com/yourpage.html". I find that the web
archive tends to lag on indexing many sites, while Google usually has a more
recent copy.
Sean Tierney [legaltech@gmail.com]
Excellent point, Sean. I had meant to hint at that, too, when I wrote this,
and was certainly planning to follow up with a future entry about that
awesome tool (both how to right-click on a page to see any cached copy,
and how to make the cache easier to access from the Google toolbar).