
Google buys reCAPTCHA to boost book scanning efforts

By Juan Carlos Perez
September 16, 2009 01:23 PM ET

IDG News Service - Google Inc. plans to accelerate its massive efforts to scan tens of millions of books and periodicals with the acquisition today of a company called reCAPTCHA.

ReCAPTCHA is a well-known provider of CAPTCHA technology, which is used to prevent spammers from using computers to automatically register for online services, such as Web mail accounts, and to complete Web site registrations.

CAPTCHA, which stands for "Completely Automated Public Turing test to tell Computers and Humans Apart," requires users to type randomly chosen words that appear as images, a process that is easy for humans but hard for computers to do correctly.

What attracted Google to reCAPTCHA is that the company has linked its core authentication service with efforts to digitize print books and periodicals. The search company has a massive effort underway in that area for its Google Books and Google News Archive services.

ReCAPTCHA takes its word images from scanned print materials. Every time people solve a CAPTCHA from the company, they are also, as a byproduct, helping to turn scanned words into plain text that can be indexed and made searchable by search engines.
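The byproduct described above can be illustrated with a minimal sketch. It assumes each challenge pairs a word whose text is already known (used to check the user) with a scanned word that software could not read, and that a transcription is accepted once several users agree on it. The function names and the three-answer threshold are hypothetical choices for illustration; the article does not describe the company's actual implementation.

```python
from collections import Counter
from typing import List, Optional

# Hypothetical threshold: how many matching answers are required before a
# transcription is trusted. The article gives no such number.
MIN_AGREEING_ANSWERS = 3


def normalize(answer: str) -> str:
    """Trim and lowercase an answer so trivially different typings match."""
    return answer.strip().lower()


def check_challenge(control_word: str, control_answer: str) -> bool:
    """Pass the user only if the already-known word was typed correctly."""
    return normalize(control_answer) == normalize(control_word)


def transcribe(answers: List[str]) -> Optional[str]:
    """Return the most common answer for an unreadable scanned word,
    or None if no answer has reached the agreement threshold yet."""
    counts = Counter(normalize(a) for a in answers if a.strip())
    if not counts:
        return None
    word, votes = counts.most_common(1)[0]
    return word if votes >= MIN_AGREEING_ANSWERS else None
```

In this sketch, only answers from users who pass `check_challenge` would be fed to `transcribe`, so the same typing step both screens out automated sign-ups and contributes a vote toward the plain-text version of a scanned word.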

"So we'll be applying the technology within Google not only to increase fraud and spam protection for Google products but also to improve our books and newspaper scanning process," reads a post in Google's official blog authored by Luis von Ahn, cofounder of reCAPTCHA, and Will Cathcart, a Google product manager.

The reCAPTCHA service is used by about 100,000 Web sites, and it is helping to digitize old editions of The New York Times.

Reprinted with permission from IDG.net. Story copyright 2014 International Data Group. All rights reserved.