GeoCities, once the Internet's third most-visited domain, is back from the dead -- unofficially.
When Yahoo announced early last year that the Web hosting service would close on Oct. 26, 2009, digital historian Jason Scott knew he had to act. "For hundreds of thousands of people, this was their first Web site," he wrote in a blog post. "This was where you went to get the chance to publish your ideas to the largest audience you might ever have dreamed of having."
Founded in 1994 as Beverly Hills Internet, Yahoo GeoCities was one of the first services to offer an easy way for early Internet users to publish their own Web pages. Whereas most hosting options of the 1990s were expensive, GeoCities' free hosting space became the home for thousands of sites built around "neighborhoods," including those focused on conservation, fashion, military, sports, finance and travel.
Yahoo bought the service in 1999, but the availability of affordable personal hosting -- including Yahoo's own Web hosting -- led the search firm to announce in April 2009 that GeoCities would be shuttered and that any data its owners did not personally archive would be irrecoverable.
Scott, who has stored thousands of text files from the pre-Internet era on his Web site, TextFiles.com, mobilized a 25-person archive team to download as many of the GeoCities sites as they could. Since Yahoo refused to release a list of directories or users whose sites were hosted on GeoCities, the team used automated scripts to probe for data, downloading whatever they found over a period of six months. As the shutdown date approached, the team working with Scott learned of other groups doing the same thing, so they started sharing usernames, creating more comprehensive databases.
About 100,000 Web site accounts were captured and saved. "It was like running through a burning building and deciding what to save," Scott said in an interview.
Now, a year later, Scott has released a torrent file on Pirate Bay containing everything his Archive Team saved. The 642GB file, which took two weeks to compress, unpacks to 909GB of content and offers data Scott said will appeal primarily to academics, historians and collectors. It also shows the ease with which the data could be preserved and made available.
People who want to browse the data online without downloading the entire torrent may do so at Geociti.es, or by exploring the results of other archivists who also set out to save GeoCities. Among them is Jacques Mattheij, who managed to capture even more than Scott's team -- around 1 million accounts and 5TB of data, all of which is available at Reocities.com.
"The Reocities.com project is one of the most worthwhile and satisfying things I've ever done online," said Mattheij. "There isn't a day that passes that someone does not give me more confirmation that this was the right thing to do and that lots of value would have been lost."
One such affirmation came from a cancer survivor who had published her story online to help others but failed to save a copy before GeoCities closed. "My Web site was like my diary of my experiences, complete with pictures," she wrote to Mattheij. "It's more than sentimental to me -- it would be a huge support to have this now as I'm getting retested and biopsied for a possible cancer recurrence."
"Whenever I read e-mail like that, I get a lump in my throat -- but without Yahoo doing the right thing there is no way to help these people," Mattheij said.
Both Mattheij and Scott have also gotten requests to take down sites, although Yahoo itself is unlikely to make such a request. In 1999, Yahoo amended its terms of service to declare a copyright on users' data. The resulting uproar prompted it to withdraw the change, and now Yahoo cannot claim copyright infringement over GeoCities content, said Scott.
Unlike Scott, Mattheij has no plans to release his entire collection as a torrent because of concerns it would be used as fodder to create spam sites. He also stressed that he doesn't think its fate will mirror that of the original site. "[I am] pretty committed and [am] actually thinking about setting up a foundation to preserve GeoCities for an indefinite period," said Mattheij.
Other GeoCities archival efforts include GeoCities.ws, OoCities and The Internet Archive.
"There are at least three [other] organized groups who saw the value in this and threw crazy amounts of resources at it," said Scott. "That tells me it was a valid thing to do, outside of my own belief that it was."