Training, Manual training | OmniPage Pro 12 ScanSoft guide

Chapter 4

Training

Training is the process of changing the OCR solutions assigned to character shapes in the image. It is useful for uniformly degraded documents or when an unusual typeface is used throughout a document. Training will be less useful for texts with random distortions. Here is an example, based on the letter “g”, which can be printed in different ways:

The first two examples do not need training, because both shapes are normal for the letter “g” and the program can handle them. The third example could benefit from training because the shape of “g” is unusual, and all instances of “g” in the text are likely to look like this. The fourth example is not good for training, because the first “g” is poorly printed, and this shape is unlikely to appear again in the document.
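
The reasoning above can be pictured with a small conceptual sketch. This is not how OmniPage Pro stores or applies training internally; the `Glyph` class, the `shape_id` labels, and the `apply_training` function are hypothetical names invented for illustration. The sketch only shows why a correction for a consistently unusual shape pays off across a page, while a one-off poorly printed shape gains nothing from it.

```python
from dataclasses import dataclass


@dataclass
class Glyph:
    shape_id: str  # hypothetical label for a character shape found in the image
    solution: str  # the character currently assigned to that shape by OCR


def apply_training(glyphs, training):
    """Return new glyphs in which every trained shape gets its trained solution.

    Shapes without a training entry keep their current OCR solution.
    """
    return [Glyph(g.shape_id, training.get(g.shape_id, g.solution)) for g in glyphs]


# Assumed scenario: an unusual but consistent "g" shape is misread as "q"
# wherever it occurs, and one smudged character is also misread as "q".
page = [
    Glyph("odd_g", "q"),
    Glyph("plain_o", "o"),
    Glyph("odd_g", "q"),
    Glyph("smudge_17", "q"),
]

# A single correction made for the recurring shape.
training = {"odd_g": "g"}

trained = apply_training(page, training)
text = "".join(g.solution for g in trained)
print(text)
```

Both instances of the recurring shape are corrected by the one training entry. The smudged character is untouched, because its shape does not match anything that was trained and is unlikely to recur.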

You can use training to improve recognition of special symbols such as @, ® and ©, or to recognize supported accented letters more reliably. The purpose of training is not to teach the program to read characters from non-supported languages or alphabets.

OmniPage Pro 12 offers two types of training: manual training and automatic training (IntelliTrain). Data coming from both types of training are combined and available for saving to a training file.

When you leave a page on which training data was generated, you will be asked how to apply it to other existing pages in the document.

Manual training

To do manual training, place the insertion point in front of the character you want to train, or select a group of characters (up to one word) and choose Train Character... from the Tools menu or the shortcut menu. You will see an enlarged view of the character(s) to be trained, along with the current OCR solution. Change this to the desired solution and click OK. The program takes this training and examines the rest of the page. If it
