A collection Lesk mentions in Chapter 12 of his book is the Internet Archive, located at http://www.archive.org/index.php. The goal of the Archive is to capture the development of the Internet. The information in the Archive comes from sweeping the Web approximately every other month. The amount of storage devoted to this project is more than I can comprehend. The project matters because it preserves artifacts that exist only in digital form and builds a library of them for researchers, historians and scholars.
The Wayback Machine is a tool for searching archived Web sites. Currently, one must enter a URL and a date, but a full search engine is planned for the future. Given that Web sites change frequently, this approach makes sense as a way to capture the transitions. I was amused to read in the FAQ section that the Wayback Machine is named after Mr. Peabody’s WABAC machine from the Rocky and Bullwinkle cartoon show.
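To make the "URL and date" lookup concrete, here is a minimal sketch in Python of how such a request can be written as a single Web address. It assumes the address pattern the Wayback Machine uses for archived pages (a date stamp followed by the original URL) and assumes that a date without a time of day is accepted and resolved to a nearby capture. The function name `wayback_url` and the constant `WAYBACK_PREFIX` are my own, chosen for illustration.

```python
from datetime import date

# Assumed prefix for archived pages on the Wayback Machine.
WAYBACK_PREFIX = "https://web.archive.org/web"


def wayback_url(page_url: str, when: date) -> str:
    """Build an address asking for the capture of page_url nearest to `when`."""
    # The date is written as YYYYMMDD, e.g. 20071107 for 7 November 2007.
    stamp = when.strftime("%Y%m%d")
    return f"{WAYBACK_PREFIX}/{stamp}/{page_url}"


if __name__ == "__main__":
    # Hypothetical lookup: the Archive's own home page in November 2007.
    print(wayback_url("http://www.archive.org/", date(2007, 11, 7)))
```

The sketch only builds the address; it does not contact the Archive or check whether a capture exists for that date.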
The Internet Archive is a large operation and has a larger goal than just capturing the development of the Internet. It partners with the Library of Congress and the Smithsonian, and it has announced an agreement with NASA to archive and manage NASA’s photographs, film and video, free to all. It also has an open education resource section with university lectures, some of them downloadable. A profile of Brewster Kahle published in GOOD Magazine on 11/7/07 (http://www.goodmagazine.com/section/Portraits/digital_alexandria) states that the organization has a goal of digitizing the world’s published information and is scanning 12,000 books per month. The repository will be openly available to all. The article contrasts the approach of the Internet Archive with the Google project, which is for-profit and may limit public access. I do feel relieved to read this. The Google project has some good points, but I have been concerned about access to information. The Internet Archive is a site to study further and to review periodically for the strides it is making.