Step 2: configure text classification, Crea – Kofax INDICIUS 6.0 User Manual


Chapter 4


Getting Started Guide (Classification and Separation)

Figure 4-28. Project Explorer showing Configuration Resources

Step 2: Configure Text Classification

Document Text Classifier

The classifier is created using the Build Document Text Classifier tab. Typically the
text classifier is trained on the documents in the Sample Documents set (after it has
been cleaned during document set management). Training options are selected
before the build process is started.

It is possible to specify whether training is restricted to documents that have been
confirmed, whether extra pages are trained on, and whether to further limit which
pages within a document are used in the training. Typically the first two options are
not selected (that is, all documents are used while training but not extra pages). The
pages used are limited only to save processing time (the unused pages will not
need to be read in production), and only if the document type can be robustly
identified from a subset of pages.
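
To make the effect of the three training options concrete, the following is a minimal Python sketch of how they might filter a document set before training. It is an illustration only and does not use any INDICIUS API: the `Page`, `Document`, and `TrainingOptions` types, the `is_extra` flag, and the `select_training_pages` function are hypothetical names, and the assumption that the page limit means "the first N pages" is ours, not the manual's.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class Page:
    text: str
    is_extra: bool = False  # hypothetical marker for an "extra" page


@dataclass
class Document:
    doc_type: str
    pages: List[Page]
    confirmed: bool = False


@dataclass
class TrainingOptions:
    # Defaults mirror the typical setup: all documents, no extra pages,
    # no page limit.
    confirmed_only: bool = False
    include_extra_pages: bool = False
    max_pages: Optional[int] = None  # assumption: limit = first N pages


def select_training_pages(
    docs: List[Document], opts: TrainingOptions
) -> List[Tuple[str, str]]:
    """Return (document type, page text) pairs that would be trained on."""
    selected = []
    for doc in docs:
        # Option 1: restrict training to confirmed documents.
        if opts.confirmed_only and not doc.confirmed:
            continue
        # Option 2: decide whether extra pages are trained on.
        pages = [p for p in doc.pages if opts.include_extra_pages or not p.is_extra]
        # Option 3: further limit which pages within a document are used.
        if opts.max_pages is not None:
            pages = pages[: opts.max_pages]
        selected.extend((doc.doc_type, p.text) for p in pages)
    return selected


docs = [
    Document("Invoice", [Page("a"), Page("b"), Page("c", is_extra=True)], confirmed=True),
    Document("Letter", [Page("d"), Page("e")]),
]
typical = select_training_pages(docs, TrainingOptions())
first_page_only = select_training_pages(docs, TrainingOptions(max_pages=1))
```

With the default options, both documents contribute their regular pages and the extra page is skipped. Setting `max_pages=1` keeps only the first page of each document, which is worthwhile only when that page alone identifies the document type.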

To build the document text classifier

1. Select Configuration | Build Document Text Classifier into... | Configuration
“Document Classification” to display the Build Document Text Classifier tab.

Sample Documents will already be selected in the “Training Document Set”
list and the document types within the set will be listed in the table.
