QuickStudy: Optical Character Recognition
Computerworld - Suppose you wanted to digitize the novel Moby Dick overnight. You could stay up all night typing and still not finish. Or you could use a high-end scanner and in minutes scan all of author Herman Melville's works into a computer using optical character recognition (OCR) technology.
This is the technology long used by libraries and government agencies to make lengthy documents quickly available electronically. Advances in OCR technology have spurred its increasing use by enterprises.
For many document-input tasks, OCR is the most cost-effective and speedy method available. And each year, the technology frees acres of storage space once given over to file cabinets and boxes full of paper documents.
Before OCR can be used, the source material must be scanned using an optical scanner (and sometimes a specialized circuit board in the PC) to read in the page as a bitmap (a pattern of dots). Software to recognize the images is also required.
The OCR software then processes these scans to differentiate between images and text and determine what letters are represented in the light and dark areas.
Older OCR systems match these images against stored bitmaps based on specific fonts. The hit-or-miss results of such pattern-recognition systems helped establish OCR's reputation for inaccuracy.
Today's OCR engines add the multiple algorithms of neural network technology to analyze the stroke edge, the line of discontinuity between the text characters, and the background. Allowing for irregularities of printed ink on paper, each algorithm averages the light and dark along the side of a stroke, matches it to known characters and makes a best guess as to which character it is. The OCR software then averages or polls the results from all the algorithms to obtain a single reading.
Additional Resources


White Papers & Webcasts
Natural User Interface for Enterprise Applications
Learn how a revolutionary user interface can make a complex enterprise application so intuitive even casual users can jump right in....
Why Now is the Right Time for the Linux Desktop
(Source: Novell) Faced with tighter budgets, enterprises are rethinking their desktop strategies to deliver the same - if not better - services and...
Moving Beyond Monolithic - What's Next for Enterprise Application Architectures?
This white paper reviews the current state of enterprise application architecture and presents a prediction on what might come next....
Novell Opens PR Video
Is the Linux desktop for me? Customers are looking for ways to be more flexible and save money. Using Linux offers a great...
SUSE Linux Enterprise Server Deployment Approach Guide
This document is intended for IT professionals and managers who are considering deploying SUSE Linux Enterprise Server. Novell has had a number of...
Usability Is Everything
Learn what sets Workday's HR and Payroll solutions apart from the competition....
SUSE Linux Enterprise Desktop Data Sheet
SUSE Linux Enterprise Desktop is the market's only enterprise-quality Linux desktop ready. It delivers seamless interoperability with existing enterprise systems and dozens of...
The Value of Real SaaS at Workday
Cost savings, speed to value, and innovation brought to the enterprise by Workday's software-as-a-service solutions for HR and Payroll....
SUSE Linux Enterprise Server Data Sheet
SUSE Linux Enterprise Server is a highly reliable, interoperable and manageable server operating system built to power mission-critical workloads in physical and virtual...
SaaS at Flextronics, Inc.
Dave Smoley, CIO of Flextronics, discusses the real value of software-as-a-service and why he chose Workday for his HR solution....
Subscribe to Computerworld
