Internet Archive expands book digitizing effort
Internet Archive has so far scanned about 100,000 books
IDG News Service - The Internet Archive has received a grant from the Alfred P. Sloan Foundation to expand its book-digitizing efforts, which so far have resulted in the scanning of about 100,000 books now available on the group's Web site.
The grant will also benefit the Open Content Alliance, an initiative launched in October 2005 and backed by the Internet Archive, Yahoo Inc. and others to digitize books and multimedia material and make them available online, the Internet Archive announced Wednesday.
The scanned works hosted by the Internet Archive are also available for indexing by any search engine that adheres to the OCA's open-access terms for the content. These principles include providing "the greatest possible degree of access to and reuse of collections in the archive, while respecting the rights of content owners and contributors," according to the OCA Web site.
The Sloan Foundation awarded the grant to support the digitization of historical collections from five major libraries by the Internet Archive, a nonprofit organization building an online library of texts, audio, video, software and Web pages.
The $1 million grant will be used in part to scan the complete personal library of founding father and U.S. President John Adams, housed at the Boston Public Library. Meanwhile, the Getty Research Institute in Los Angeles is making available art, architecture and performing arts books.
The archive of publications issued by New York City's Metropolitan Museum of Art will also be digitized, as will California Gold Rush primary texts from the University of California at Berkeley's Bancroft Library. Finally, the Internet Archive will scan the James Birney Collection of Anti-Slavery materials from Johns Hopkins University libraries in Baltimore.
Scanning books to make them available online has become a controversial practice, primarily because of Google Inc.'s approach. The search engine giant is digitizing library collections that include copyrighted books without always asking for permission from the copyright owners. It indexes the full text of these works and makes them searchable through its Book Search service.
Google faces lawsuits alleging that this practice violates copyright law. Google claims it is protected by the fair use principle, because it displays only snippets of text from copyrighted works.
The Internet Archive has refrained from digitizing copyrighted books, although it is interested in seeing copyright issues worked out, because its ultimate goal is to provide access to as many works as possible for the benefit of people worldwide, said Brewster Kahle, Internet Archive founder.
For example, Kahle is interested in sorting out the issue of books whose copyright owners can't be found, often called "orphan works," as well as the issue of copyrighted works that are out of print. In these two cases, Kahle believes that libraries should take a leading role in finding "the right path through it." In the case of in-print copyrighted books, a collaboration between libraries and publishers could generate significant progress, he said.
While others are criticizing Google for its wholesale scanning of copyrighted works, Kahle finds fault with the agreements the company is hammering out with its partner libraries. In his opinion, the contracts put too many restrictions on how libraries and people may use and share digital copies of public-domain works. "Google has bound the libraries pretty tightly," he said. "Public-domain works should stay in the public domain."
Google didn't immediately respond to a request for comment.
In addition to Yahoo and the aforementioned libraries, participants in the OCA include Microsoft Corp., Adobe Systems Inc., Columbia University, Hewlett-Packard Co., the University of Toronto, Xerox Corp. and the University of North Carolina at Chapel Hill.