Kofax INDICIUS 6.0 User Manual

Chapter 4

Getting Started Guide (Classification and Separation)

Step 6.1: Analysis using the Overview tab

The Overview tab was first used in Step 3: Initial Analysis and displays statistical
information on a document set. Once the pages have been read in Step 5: Read Page
Content, the Overview chart is updated to indicate how clean (accurate) Transformation
Studio has judged the set to be. Each document type in the chart is color-coded
according to the following criteria:

Table 4-2. Color Coding of Document Types in Overview Chart

Color    Label       Description

Green    Clean       The document type does not have very much variation and
                     needs little or no work in Cleanup Documents.

Orange   Poor        The document type has some variation and will need some
                     work in Cleanup Documents.

Red      Very Poor   The document type has a lot of variation in the text
                     content. It may need a lot of attention within Cleanup
                     Documents or may not be suitable for text classification.

Gray     Unknown     No information is available as the document type has not
                     been read or is “(Unknown),” that is, no type is assigned
                     to the documents.
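
For readers who export or post-process Overview results, the mapping in Table 4-2 can be written as a simple lookup. This is only an illustrative sketch: the names COLOR_BY_LABEL, chart_color and types_needing_cleanup are hypothetical and are not part of Transformation Studio, and the sketch assumes that a document type with no label yet is treated as "Unknown".

```python
# Label -> chart color, transcribed from Table 4-2.
# All names in this sketch are hypothetical; they are not a product API.
COLOR_BY_LABEL = {
    "Clean": "Green",
    "Poor": "Orange",
    "Very Poor": "Red",
    "Unknown": "Gray",
}


def chart_color(label=None):
    """Return the Table 4-2 color for a label.

    Assumption: a missing label (type not read, or no type assigned)
    is treated as "Unknown".
    """
    if label is None:
        label = "Unknown"
    return COLOR_BY_LABEL[label]


def types_needing_cleanup(labels_by_type):
    """List document types labeled Poor or Very Poor, in sorted order."""
    return sorted(
        doc_type
        for doc_type, label in labels_by_type.items()
        if label in ("Poor", "Very Poor")
    )
```

For example, given the made-up labels {"Invoice": "Clean", "Letter": "Very Poor", "Form": "Poor"}, types_needing_cleanup returns the Form and Letter types, which are the ones the table marks as needing work in Cleanup Documents.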

Note: This data is also visible by displaying the tool tip for a document type in the
chart (hover the mouse over the column).

The analysis of the documents is based on the page content (text) that was read. This
means that a document type will occasionally appear to be poor when it is actually
clean but suitable only for a classification method other than text (for example,
image or templated classification).

Step 6.2: Cleanup

Using the Cleanup Documents tab is an efficient way to clean up your document set
with assistance from Transformation Studio. Possible problem documents are
identified automatically and displayed for manual confirmation. In addition,
documents that will help Transformation Studio to refine its analysis of the
document type in the most efficient way are displayed. These documents are
continually updated based on the confirmation (or re-classification) of the last
document.

Within Cleanup Documents there are two steps:
