Computerworld - Suppose you wanted to digitize the novel Moby Dick overnight. You could stay up all night typing and still not finish. Or you could use a high-end scanner and in minutes scan all of author Herman Melville's works into a computer using optical character recognition (OCR) technology.
This is the technology long used by libraries and government agencies to make lengthy documents quickly available electronically. Advances in OCR technology have spurred its increasing use by enterprises.
Before OCR can be used, the source material must be scanned using an optical scanner (and sometimes a specialized circuit board in the PC) to read in the page as a bitmap (a pattern of dots). Software to recognize the images is also required.
The OCR software then processes these scans to differentiate between images and text and determine what letters are represented in the light and dark areas.
Older OCR systems match these images against stored bitmaps based on specific fonts. The hit-or-miss results of such pattern-recognition systems helped establish OCR's reputation for inaccuracy.
Today's OCR engines add the multiple algorithms of neural network technology to analyze the stroke edge, the line of discontinuity between the text characters, and the background. Allowing for irregularities of printed ink on paper, each algorithm averages the light and dark along the side of a stroke, matches it to known characters and makes a best guess as to which character it is. The OCR software then averages or polls the results from all the algorithms to obtain a single reading.
- Silicon Valley's 19 Coolest Places to Work
- Is Windows 8 Development Worth the Trouble?
- 8 Books Every IT Leader Should Read This Year
- 10 Hot Hadoop Startups to Watch
- Slideshow: 7 security mistakes people make with their mobile device
- iOS vs. Android: Which is more secure?
- 11 sure signs you've been hacked
- Gartner Magic Quadrant for Client Management Tools The client management tool market is maturing and evolving to adapt to consumerization, desktop virtualization, and an ongoing need to improve efficiency.
- Logicalis eBook: SAP HANA: The Need for Speed Without timely business insights, organizations today can suffer logistical, manufacturing, and even financial disaster in a matter of minutes
- Neustar 2014 DDoS Attacks and Impact Report For the third consecutive year, Neustar surveyed hundreds of companies on distributed denial of service (DDoS) attacks. The survey reveals evidence that the...
- Acxiom Case Study This case study, which focuses on Acxiom, explores how the company was able to secure employee data, reduce migration costs and boost productivity...
- Top 4 Digital Signage Fails Join RMG Networks for a look at four of the most common reasons digital signage fails in corporate businesses. Learn about strategies to...
- Building Tomorrow's Infrastructure Listen to this podcast to discover how Crider Foods worked with PC Connection to update their IT infrastructure, while maintaining compliance and control. All Desktop Apps White Papers | Webcasts