Kofax Getting Started with Ascent Xtrata Pro User Manual

Chapter 3

Ascent Xtrata Pro User's Guide

Max. feature length
Specifies the maximum number of characters used for a feature. The value
should not exceed 64 characters.

Min. feature frequency
Specifies the minimum number of times a substring must appear inside the
training set of a class to be used as a feature for content classification.

Start features at beginning of word
Specifies that a feature substring must always start at the beginning of a word. If
not checked, the substring can start anywhere.

Max. words per feature
Limits the number of words per feature. A value of zero means unlimited words,
although the total number of characters of the words per feature cannot exceed
the “Max. feature length” property.
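
The following Python sketch illustrates how the four properties above could interact when candidate features are selected. It is not the product's implementation: the function names and parameters are hypothetical, whitespace is assumed to separate words, and "frequency" is assumed to mean the number of training documents of a class that contain the substring.

```python
from collections import Counter


def candidate_features(text, max_len=64, start_at_word=True, max_words=0):
    """Enumerate candidate substrings of one document (hypothetical helper)."""
    features = set()
    n = len(text)
    for start in range(n):
        if text[start].isspace():
            continue
        # "Start features at beginning of word": skip starts inside a word.
        if start_at_word and start > 0 and not text[start - 1].isspace():
            continue
        # "Max. feature length": never look further than max_len characters.
        for end in range(start + 1, min(start + max_len, n) + 1):
            sub = text[start:end]
            if sub[-1].isspace():
                continue  # ignore variants that end in whitespace
            # "Max. words per feature": zero means unlimited words.
            if max_words and len(sub.split()) > max_words:
                break
            features.add(sub)
    return features


def select_features(class_docs, min_freq=2, **options):
    """Keep substrings found in at least min_freq documents of one class.

    Assumption: frequency is counted per document, not per occurrence.
    """
    counts = Counter()
    for doc in class_docs:
        counts.update(candidate_features(doc, **options))
    return {feature for feature, count in counts.items() if count >= min_freq}


docs = ["invoice number 42", "invoice total due"]
word_start = select_features(docs, min_freq=2, max_len=7, max_words=1)
anywhere = select_features(docs, min_freq=2, max_len=7,
                           start_at_word=False, max_words=1)
```

In this sketch, "invoice" is selected in both runs because it appears in both sample documents, while "voice" is selected only when features are allowed to start anywhere in a word.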

Use fuzzy string match
Enables fuzzy string matching of features. This makes matching more tolerant,
at the cost of slower classification performance.

Fuzzy length
Configures the fuzzy string comparison.

Min. class entropy
Controls the importance of a feature, depending on the number of classes in
which it appears. A value of 1.0 requires that a feature only appears inside the
sample documents of a single class; otherwise, it is not used for classification.
The lower the value, the more classes in which a feature can appear within the
training set.
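
This guide does not state the formula behind this property. The Python sketch below shows one possible reading that matches the behavior described above: a score of 1.0 when a feature occurs in a single class, falling toward 0.0 as the feature spreads evenly over all classes. The score definition (one minus the normalized Shannon entropy) and all names are assumptions made for illustration only.

```python
import math


def class_concentration(counts_per_class):
    """Return 1.0 for a single-class feature, 0.0 for an evenly spread one.

    counts_per_class maps each class name to how often the feature occurs
    in that class's sample documents. Hypothetical score, not the product's.
    """
    total = sum(counts_per_class.values())
    if total == 0:
        return 0.0  # the feature never occurs, so it carries no information
    n_classes = len(counts_per_class)
    if n_classes < 2:
        return 1.0  # with one class, every feature is trivially single-class
    entropy = -sum(
        (count / total) * math.log(count / total)
        for count in counts_per_class.values()
        if count > 0
    )
    return 1.0 - entropy / math.log(n_classes)


def keep_feature(counts_per_class, min_class_entropy):
    """Keep the feature only if its score reaches the configured threshold."""
    return class_concentration(counts_per_class) >= min_class_entropy


single = {"invoice": 5, "letter": 0, "order": 0}
shared = {"invoice": 3, "letter": 1, "order": 0}
```

Under this reading, `keep_feature(single, 1.0)` accepts the feature, while `shared` is rejected at 1.0 and is accepted only once the threshold is lowered far enough.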

Note

You must retrain the project before any changes to these settings take effect.
