

OCR software ranges in price from freeware all the way up to tens of thousands of dollars. What explains the difference between these applications? Here’s the breakdown. OCR freeware uses the SimpleOCR or Tesseract engines and provides limited scanning and output format capabilities. Recognition quality is generally poor except for the highest quality document images.

The “Pro” versions of most desktop OCR applications support the creation of zone templates that can be used to OCR specific regions on batches of documents. Most OCR applications also have “Lite” versions that lack the ability to manually create zones, so it’s important to get the correct version. With these applications it is often not possible to output this data as “fields” in a structured data file like CSV, Excel or XML. What you typically get is a text file for each document with a line of text for each zone. The zones are designed more for excluding regions you don’t want or manually overriding the detection of text, tables and images in the document.

If you need to capture specific data in multiple documents and output it to structured data files or a SQL database, batch OCR applications are the best option. If you need to capture data formatted in tables and output it to CSV or Excel, desktop OCR applications do this quite well as long as the tables have a regular format with well-defined columns. To capture handprint, irregular tables, large numbers of data points, or data that doesn’t always appear in the same place on every page, forms processing software is what you need.
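
To illustrate the gap between a line of text per zone and true “fields” in a structured file, here is a minimal Python sketch. It is not how any particular product works. The template name, field names and coordinates are made up for illustration, and `ocr_zone` is a hypothetical callable standing in for whatever OCR engine you use (with the Tesseract engine, for example, it could wrap `pytesseract.image_to_string(image.crop(box))`).

```python
import csv
import io
from typing import Any, Callable, Dict, Iterable, Tuple

# A zone is a rectangle in pixels: (left, top, right, bottom).
Zone = Tuple[int, int, int, int]

# Hypothetical zone template: names and coordinates are illustrative only.
INVOICE_TEMPLATE: Dict[str, Zone] = {
    "invoice_number": (400, 50, 600, 80),
    "date": (400, 90, 600, 120),
    "total": (400, 700, 600, 730),
}

# ocr_zone(page, box) -> recognized text for that region of the page.
OcrZone = Callable[[Any, Zone], str]


def zones_to_text_lines(page: Any, template: Dict[str, Zone], ocr_zone: OcrZone) -> str:
    """Zone-template style output: one line of text per zone, no field names."""
    return "\n".join(ocr_zone(page, box) for box in template.values())


def zones_to_csv(pages: Iterable[Any], template: Dict[str, Zone], ocr_zone: OcrZone) -> str:
    """Structured output: one row per document, one named column per zone."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=list(template), lineterminator="\n")
    writer.writeheader()
    for page in pages:
        writer.writerow({name: ocr_zone(page, box) for name, box in template.items()})
    return buffer.getvalue()
```

The first function mirrors the per-document text file described above. The second shows what field-level output adds: the zone names become column headers, so the result can be loaded directly into a spreadsheet or database.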
