Skip the navigation

QuickStudy: Optical Character Recognition

By Sami Lais
July 29, 2002 12:00 PM ET

Computerworld - Suppose you wanted to digitize the novel Moby Dick overnight. You could stay up all night typing and still not finish. Or you could use a high-end scanner and in minutes scan all of author Herman Melville's works into a computer using optical character recognition (OCR) technology.

This is the technology long used by libraries and government agencies to make lengthy documents quickly available electronically. Advances in OCR technology have spurred its increasing use by enterprises.

For many document-input tasks, OCR is the most cost-effective and speedy method available. And each year, the technology frees acres of storage space once given over to file cabinets and boxes full of paper documents.

Before OCR can be used, the source material must be scanned using an optical scanner (and sometimes a specialized circuit board in the PC) to read in the page as a bitmap (a pattern of dots). Software to recognize the images is also required.

The OCR software then processes these scans to differentiate between images and text and determine what letters are represented in the light and dark areas.

Older OCR systems match these images against stored bitmaps based on specific fonts. The hit-or-miss results of such pattern-recognition systems helped establish OCR's reputation for inaccuracy.

Today's OCR engines add the multiple algorithms of neural network technology to analyze the stroke edge, the line of discontinuity between the text characters, and the background. Allowing for irregularities of printed ink on paper, each algorithm averages the light and dark along the side of a stroke, matches it to known characters and makes a best guess as to which character it is. The OCR software then averages or polls the results from all the algorithms to obtain a single reading.



Additional Resources
Forrester Consulting - Optimizing Users and Applications in a Mobile World
WHITE PAPER
Solving application issues over the WAN requires careful consideration. Based on their independent research, Forrester Consulting offers recommendations on how to tackle application performance issues, insufficient bandwidth and the inability to quickly restore users in a disaster.

Read now.

Security KnowledgeVault
WHITE PAPER
Security is not an option. This KnowledgeVault Series offers professional advice how to be proactive in the fight against cybercrimes and multi-layered security threats; how to adopt a holistic approach to protecting and managing data; and how to hire a qualified security assessor. Make security your Number 1 priority.

Read now.

Cut Communications Costs Once and for All
WHITE PAPER
New IP-based communications systems are being deployed by small and midsized businesses at a rapid rate. Learn how these organizations are enabling faster responsiveness, creating better customer experiences, speeding office or mobile interactions, and dramatically reducing existing communications costs.

Read now.

App Development White Papers
The Keys to Distributed & Agile Application Development
How leading firms are winning with strategies for efficient application development, without relying on co-location.
Overcome Top 7 Admin Challenges of Active Directory
As Active Directory's role in the enterprise has drastically increased, so has the need to secure the data. Gain insight on creating repeatable,...
Insiders Can Ruin Your Company. Take Action.
Did you know that 80 percent of threats to an organization come from the inside? The threat from insiders is often overlooked in...
Top Solutions and Tools to Prevent Devastating Malware
Custom malware frequently goes undetected. According to Forrester Research, the best way to reduce risk of breach is to deploy file integrity monitoring...
Streamline Compliance and Increase ROI
Streamline, simplify, and automate compliance related activities; especially those that impact multiple business units. This white paper from NetIQ, outlines solutions that will...
All App Development White Papers
App Development Webcasts
Reduced TCO for Communications Applications with New Oracle SPARC Servers
In this webcast learn how Oracle's new SPARC T4 servers and SPARC Supercluster deliver the security, performance, and scalability required for 4G network...
Optimizing Networks for the Cloud
Join guest speaker, Rohit Mehra, IDC Director of Enterprise Communications Infrastructure, to explore current trends, discuss best practices for optimizing Data Center and...
Apps QuickStart Series Part 2: Designing and Deploying SQL Server on VMware vSphere
Download this webcast to learn about the design considerations for virtualizing SQL workloads, performance and scalability information and high-availability options, as well as...
Apps QuickStart Series Part 1: Designing and Deploying Exchange 2010 on VMware vSphere
Download this webcast to learn the virtual hardware design considerations for Exchange 2010, deployment using the building block approach, options for high-availability and...
Customer Spotlight: How IPC The Hospitalist Company Implemented Oracle on VMware
Have you been looking to hear about customer's experiences with the new VMware vCenter Site Recovery Manager product? View this webcast to learn...
All App Development Webcasts
Newsletter Sign-Up

Receive the latest news test, reviews and trends on your favorite technology topics

Choose a newsletter
  1. View all newsletters | Privacy Policy
IT Jobs