
Building a better spam-blocking CAPTCHA

New approaches may give the CAPTCHA antispam technology a second chance

By Steven J. Vaughan-Nichols
January 23, 2009 12:00 PM ET

Computerworld - How do you let people create user accounts or post comments on your Web site without letting spam bots in? Simple -- make your users prove they're human. Many Web sites use CAPTCHA (Completely Automated Public Turing Test to Tell Computers and Humans Apart) technology to try to tell the bots from the people.

The idea behind CAPTCHA is simple enough. It presents users with an image showing an obfuscated string of letters that they must type in to get an e-mail or social networking account, for instance, or to enter a comment on an online forum. The theory is that only humans can decipher the letters hidden in the image and type in the correct code, and for a time it was an effective tool to keep the bots out.

A basic CAPTCHA

But while no one has yet come up with a computer that can fool people into thinking it's another person, computers are great at fooling other computers. These days, malware makers and spammers regularly trick the CAPTCHA systems at big-name Web sites such as Yahoo Mail, Gmail and Craigslist, and use these sites to automate their attacks.

So what can we do? Can CAPTCHA be saved?

The rise and fall of CAPTCHA

CAPTCHA was created in 2000 by researchers at Carnegie Mellon University, and by 2007, the technology was being used almost everywhere on the Web. For example, if you try to leave a comment on this story, you'll need to jump through a CAPTCHA hoop before you can leave a message.

Unfortunately, beginning in early 2008, crackers started getting the better of the CAPTCHA systems. In short order, Yahoo Mail's, Gmail's and Hotmail's CAPTCHA defenses were cracked.

Then, adding insult to injury, the crackers started releasing their work in the form of do-it-yourself CAPTCHA cracking software that anyone could use. For example, a program called CL Auto Posting Tool attempts to post bogus ads to Craigslist while automatically overcoming Craigslist's antispam protections.

These programs work by using OCR (optical character recognition) software to try to make sense of CAPTCHA's disguised text. If they fail, they try again. They take advantage of the fact that some CAPTCHA systems don't automatically give users a new CAPTCHA image to puzzle out after a wrong answer. Instead, such systems let you, or a cracker program, keep working at the same hidden text until it's solved.

Get one of these programs, aim it at the site you want to have bogus accounts on, and you can start spreading spam, anonymously flaming people you don't like, and sending thousands of people links to your malware-infested site.

It's not that the OCR-based cracker programs are that good. They're not. As CAPTCHA expert Sumeet Prasad from security firm Websense explained in a blog post, only 10% to 15% of the attempts on Hotmail are successful, but a CAPTCHA cracker program needs only six seconds per attempt. If a site allows an unlimited number of chances to crack a single image, that means it will take, on average, about a minute or less to break in.
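The arithmetic behind that estimate can be sketched in a few lines of Python. This is a back-of-the-envelope model, not anything taken from Websense: it assumes every attempt is an independent trial with the same chance of success, and the function name `expected_seconds_to_crack` is made up for illustration.

```python
def expected_seconds_to_crack(success_rate: float, seconds_per_attempt: float) -> float:
    """Mean time until the first successful attempt.

    Assumption: attempts are independent, so the number of tries follows
    a geometric distribution whose mean is 1 / success_rate.
    """
    if not 0 < success_rate <= 1:
        raise ValueError("success_rate must be greater than 0 and at most 1")
    expected_attempts = 1 / success_rate
    return expected_attempts * seconds_per_attempt


if __name__ == "__main__":
    # Figures quoted in the article: 10% to 15% success, six seconds per try.
    for rate in (0.10, 0.15):
        seconds = expected_seconds_to_crack(rate, 6)
        print(f"{rate:.0%} success rate: about {seconds:.0f} seconds on average")
```

Under those assumptions, a 10% success rate works out to roughly 60 seconds and a 15% rate to roughly 40 seconds, which is where the figure of about a minute or less comes from.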

Because they are clearly insecure, CAPTCHA systems that allow unlimited or multiple attempts are becoming uncommon. Still, today's automated bots are capable of breaking even those systems that make users respond to a new CAPTCHA image after the first or second unsuccessful attempt. (On average, of course, the bots' efforts are less likely to succeed against one-try CAPTCHA systems.) That said, simple CAPTCHA systems, such as the ones that use random, undistorted letters against a plain background, are still in common use and are easily breakable.
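For readers who run their own sites, here is a minimal sketch of what a one-try policy might look like on the server side. It is not how any of the sites named in this story work. The class `ChallengeStore`, its method names and the in-memory dictionary are all assumptions made for illustration, and the step that renders the text as a distorted image is left out entirely.

```python
import secrets
import string


class ChallengeStore:
    """Hypothetical in-memory store in which each challenge allows one guess."""

    def __init__(self) -> None:
        self._answers: dict[str, str] = {}

    def issue(self) -> tuple[str, str]:
        """Create a challenge and return (challenge_id, text to render as an image)."""
        challenge_id = secrets.token_urlsafe(16)
        text = "".join(secrets.choice(string.ascii_uppercase) for _ in range(6))
        self._answers[challenge_id] = text
        return challenge_id, text

    def check(self, challenge_id: str, guess: str) -> bool:
        """Consume the challenge, whether or not the guess is right."""
        # pop() removes the stored answer, so a wrong guess cannot be retried
        # against the same image; the caller has to issue a fresh challenge.
        answer = self._answers.pop(challenge_id, None)
        return answer is not None and answer == guess.upper()
```

A site using something like this would call `issue()` again after every failed `check()`, so an OCR program never gets a second look at the same image.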

Another way to crack a badly designed CAPTCHA program is to reuse the session identification URL of a solved CAPTCHA image. In this case, either the cracker or, more likely, a cracking program first gets the right answer to a CAPTCHA. It then reconnects to the Web site using a URL that contains the session ID of the solved CAPTCHA, this time with a new username. Presto! You have an automated site cracker with a 100% success rate until the session ID eventually expires.
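One way to close that particular hole is to make a solved session good for exactly one signup. The sketch below shows the idea under stated assumptions: `SolvedSessions`, `mark_solved` and `redeem` are hypothetical names, the five-minute lifetime is an arbitrary choice, and a real site would keep this state somewhere more durable than a Python dictionary.

```python
import secrets
import time


class SolvedSessions:
    """Hypothetical tracker: a session that passed the CAPTCHA can be used once."""

    def __init__(self, ttl_seconds: float = 300.0) -> None:
        self._ttl = ttl_seconds
        self._expires_at: dict[str, float] = {}

    def mark_solved(self) -> str:
        """Record a solved CAPTCHA and return the session ID to hand to the client."""
        session_id = secrets.token_urlsafe(16)
        self._expires_at[session_id] = time.monotonic() + self._ttl
        return session_id

    def redeem(self, session_id: str) -> bool:
        """Accept a session ID for one signup, then forget it."""
        # pop() deletes the entry, so replaying the same URL with a new
        # username fails even if the session has not yet expired.
        expires_at = self._expires_at.pop(session_id, None)
        return expires_at is not None and time.monotonic() < expires_at
```

With this in place, the first signup that presents a solved session ID succeeds and every later request carrying the same ID is refused, so the replay trick described above no longer yields a 100% success rate.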


