Worldwide Wayback Machine Updated: 25% Larger


Our primary web archive — the Worldwide Wayback Machine accessible at web.archive.org — has just finished a major index update, meaning that many new months of recent web crawls are now viewable.

Some material from as late as April 2007 is live, and the overall index has grown by about 25%. If you wanted more recent material for any site, give your lookups another try.


10 Responses to “Worldwide Wayback Machine Updated: 25% Larger”

  1. Peter Says:

    Web Archive is a very good service for webmasters who want to analyze the growth of sites, so keep it going, thank you!

  2. Gojomo Says:

    Darrell –

    Most of the crawls that feed the Wayback Machine do fetch all constituent parts (images, stylesheets, etc.) of a page — so when it displays in the WM, there’s a reasonable chance of it being a strong (if not perfect) likeness. Also, unless page design thwarts our URL-rewriting, display of pages on the WM does not rely on content from the original site.

    So, it shouldn’t be the case that mere removal of contents from the original site ruins the WM display.

    Separate archiving of bitmap screen captures from typical browsers has occasionally been discussed as a means of dealing with rich content that’s hard to otherwise save, but there are no current plans to expand the web archive with such snapshots.

  3. Darrell Says:

    Too bad the wayback machine doesn’t actually take a “screen capture” of the site. Once you remove some contents from your site, it’s gone.

  4. mouse Says:

    I’m happy. My site (orangemusic) is now available in the Wayback Machine. Yahoo =)

  5. Gojomo Says:

    ruth falk,

    Sites with robots.txt blocks cannot be collected for the Wayback Machine, and if a robots.txt is added after initial collection, we remove them from the Wayback Machine. Sorry, the site you reference is not available.

    – Gordon @ IA

  6. Gojomo Says:

    Nadia,

    I’m not sure what you mean. I definitely see spoof news content for sharesite.org in 2001-2002 and then for sharesite.net in late 2003-2005. At other times, it appears these sites had other owners. Can you be more specific about what you used to find, but now cannot?

    – Gordon @ IA

  7. Nadia Says:

    I’m pretty disappointed in the Wayback Machine. There was a spoof financial news site called http://www.Sharesite.org, and then http://www.Sharesite.net which ran from about 2000 / 2001 for a couple of years. I used to find the articles using the Wayback search, but the whole lot seems now to have been overwritten with some utterly unrelated garbage.

    What’s the deal with that?

    Nadia

  8. ruth falk Says:

    How do I find an old website that has been blocked: netG.com?

    Any clues would be helpful.

    Thanks

  9. Tommy Says:

    Today I blogged something about Web Archive.
    It’s really funny to see what was online on my own page in earlier times.
    Keep it up.
    Greetings from Germany

  10. Tommy's Blog Says:

    http://www.tommy.de in the Internet Archive since 1998…

    Today, just for fun, I looked in the Internet Archive to see how long http://www.tommy.de has actually been online and what could be found on my web pages back then.
    The first entry dates from December 3, 1998.
    By the way, anyone can, on http://www...

Comments are closed.

