Internet Archive to unveil massive Wayback Machine data center

The Wayback Machine stores 85 billion Web pages dating back to '96

By Lucas Mearian
March 19, 2009 12:00 PM ET

Computerworld - The Internet Archive organization plans next week to announce the opening of a new data center to house three petabytes of information for its Wayback Machine, the digital time capsule that stores archived versions of Web pages dating back to 1996.

For example, this is what Computerworld's Web site looked like in 1997, what Google looked like in 1998 and what CNN looked like in 2000.

The Wayback Machine houses 85 billion Web pages archived over more than a dozen years, which amounts to three petabytes of data, or about 150 times the content of the Library of Congress. Only five years ago, the Wayback Machine contained about 30 billion Web pages. The archive is expected to continue to grow by 100TB of data per month now that the new data center is live.
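For a rough sense of scale, the figures above can be run through some back-of-the-envelope arithmetic. The sketch below is illustrative only: it assumes decimal units (1PB = 10^15 bytes, 1TB = 10^12 bytes), treats the page counts and capacity as exact, and uses variable names that are invented for this example rather than taken from the Internet Archive.

```python
# Back-of-the-envelope arithmetic on the figures quoted in the article.
# Assumption: decimal storage units (1 PB = 10**15 bytes, 1 TB = 10**12 bytes).
PB = 10**15
TB = 10**12

pages_now = 85 * 10**9             # 85 billion archived Web pages
pages_five_years_ago = 30 * 10**9  # about 30 billion pages five years earlier
archive_bytes = 3 * PB             # three petabytes of data
growth_per_month = 100 * TB        # expected growth of 100TB per month

# Average stored size per archived page, in bytes.
avg_page_bytes = archive_bytes / pages_now

# Expected growth over a year, expressed in petabytes.
yearly_growth_pb = growth_per_month * 12 / PB

# Average number of pages added per year over the past five years.
pages_added_per_year = (pages_now - pages_five_years_ago) / 5

print(f"Average page size: {avg_page_bytes / 1000:.1f} KB")
print(f"Yearly growth: {yearly_growth_pb:.1f} PB")
print(f"Pages added per year: {pages_added_per_year / 10**9:.0f} billion")
```

Under those assumptions, the average archived page works out to roughly 35KB, the stated growth rate adds about 1.2PB a year, and the page count has grown by about 11 billion pages annually.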

The Internet Archive's massive database is mirrored to the Bibliotheca Alexandrina, the new Library of Alexandria in Egypt, for disaster recovery purposes.

According to an event invitation from Sun Microsystems Inc., the Internet Archive is moving from a traditional data center filled with standard Linux servers to one that runs Solaris 10 with ZFS on Sun Fire X4500 servers inside a Sun Modular Datacenter. The modular system is an all-in-one data center housed in a metal shipping container for mobility.

Because of the modular design, Sun said the data center was deployed in a tenth of the time it would take to build a typical bricks-and-mortar data center. The Wayback Machine Sun Modular Datacenter can service 500 inquiries a second, Sun said. A spokesperson for the Internet Archive said the user interface on the Wayback Machine will not change.

The Internet Archive is a nonprofit organization located in the Presidio in San Francisco, with data centers in Redwood City and Mountain View, Calif. The archive keeps not only snapshots of Web pages but also software, movies, books and audio clips.

Users can surf the Wayback Machine by typing in the address of a Web site or Web page and then choosing from a series of dates that reflect the stored images. The site does not currently support keyword search.
