Archive for January, 2008

Call for papers for the 8th International Web Archiving Workshop

January 14, 2008

Julien Masanès, program chair for the 2008 International Web Archiving Workshop, recently issued the workshop call for papers. From the call:


The main international event in this domain since 2001, the workshop will likely
take place on the 18th and 19th of September 2008, in conjunction with ECDL in
Aarhus (Denmark).
The workshop will provide a cross-domain overview of active research and
practice in all domains concerned with the acquisition, maintenance and
preservation of digital objects for long-term access, with a particular focus
on web archiving and studies on the effective usage of this type of archive.

Important Dates:

Paper submission: 19th of May 2008 (URL for submission coming soon).

Notification of acceptance: June 16th, 2008

Camera-ready copy due: July 14th, 2008

Workshop: September 18th and 19th, 2008

Please submit papers using the ACM template.


Case studies:
• Web Archiving Projects,
• Digital Archeology,
• Cyberculture Studies,
• Web Metrics,
• Web Publishing Models.

Data acquisition:
• Harvesting Technology, Focused Crawling,
• Deep Web Capture,
• Site Architecture Migration,
• Authenticity Control of Captured Documents,
• Acquisition of Dynamic Objects,
• Submission Systems,
• Data Ingest,
• Automated Metadata Capture.

Storage Models and Architecture:
• Hierarchical Storage Models,
• Redundant Storage,
• Distributed Storage,
• Storage Media Migration,
• Cost Models,
• Media Life-Time Analysis.

Digital Preservation:
• Conversion/Migration Strategies,
• Emulation Approaches,
• Data Abstraction Technologies,
• Self-Aware Objects,
• Testbeds, File Format Repositories,
• Document Functionality and Behaviour.

• Access Provision,
• Navigation,
• Web Indexing,
• Collection Analysis,
• Information Retrieval,
• Interface Models.

Policy and Social Issues:
• Economics of Information,
• Intellectual Property Rights,
• Challenges and Caveats of Web Archives,
• Scenarios and Visions,
• Privacy Aspects.

Workshop Officials:

Julien Masanès (European Archive / e-mail : julien AT
Andreas Rauber (Vienna University of Technology, Austria)

See details on


Access to Around the World in 2 Billion Pages!

January 2, 2008

Thanks to a generous grant from the Mellon Foundation, the Internet Archive completed a 2 billion page web crawl in 2007. This is the largest web crawl attempted by the Internet Archive. The project was designed to take a global snapshot of the Web.

Please browse through the resulting collection.

Special thanks to the memory institutions that contributed URLs to the crawl. The crawl began with 18,000 websites from over 60 countries.