Archive for June, 2007

Internet Archive at IWAW

June 21, 2007

On June 23rd Internet Archive will be presenting at the International Web Archiving Workshop (IWAW) in Vancouver.

Brad will be starting off the day of sessions with a presentation on the Wayback Machine. Here is the abstract from the paper Brad is presenting.

‘Wayback’ is an open-source, Java software package for browser-based access of archived web material, offering a variety of operation modes and opportunities for extension. In its basic, usual configuration it can both list available URL captures by date and offer recursive archive browsing starting from any capture. Advanced configurations offer better performance for challenging archived material and improved navigation.

‘Wayback’ is implemented as a collection of loosely coupled alternate implementations of core modules, for which an overview of each is provided. The functionality and implementation is also contrasted with its inspiration and predecessor, the Internet Archive’s classic public Wayback Machine software, and other ways of accessing archived web material. Finally, future directions for improvement are outlined.

After 4pm Gordon will be giving updates on IA’s tool and format developments.

Please come and introduce yourself if you are attending the workshop!


Asian Tsunami Archive

June 12, 2007

A web collection documenting the devastating Asian tsunami of 2004 is now live and searchable. The project is the result of a collaboration between Internet Archive, The Singapore Internet Research Centre and The collection contains over 1500 sites relating to the disaster. The collection is indexed for text search and can also be searched by specific URL.

Internet Archive believes community-driven web archiving projects are an essential part of preserving the web. Internet coverage of and response to key global events can unfold rapidly and in a unique and valuable way. Blogs, major news outlets and alternative media can offer different perspectives on the same event, and all can be important to researchers, historians and the general public. It is essential to work cross-institutionally and quickly to archive the information, reactions and outcomes of important events that affect everyone.

Other community Archive collections include the Hurricanes Katrina and Rita Archive, and the San Francisco Earthquake Centennial archive.

Now Featuring Archive-It Collections

June 4, 2007

Archive-It recently introduced an OAI-PMH metadata feed for all Archive-It collections. This feed has been submitted to the OAIster catalog and has already been harvested, so you will now see hits from Archive-It collections in your OAIster search results.

We are also planning to integrate an SRU protocol in our search engine very soon. The Archive-It team is very excited about providing new ways for our partners and their patrons to be able to access their Archive-It collections.

Crawling Around the World

June 1, 2007

Thank you to everyone who has submitted seeds to our 2-billion-page web crawl. Most of our submissions came from US and international libraries and archives.

We received over 18,000 seeds from over 60 countries.

We are currently in the process of preparing the seeds and the crawl will begin on Monday, June 4.