Chapter 3 - classification, Introduction, Concept of classification – Kofax Getting Started with Ascent Xtrata Pro User Manual


Chapter 3

Classification

Introduction

Ascent Xtrata Pro automatically classifies documents based on format, content, and
the subsequent extraction of items. Classification is performed in the first processing
step, separately from extraction. However, the classification results may
subsequently be changed based on the extraction results.

Ascent Xtrata Pro features a full framework of classification technologies that can be
used together in a flat structure or in a hierarchy. This chapter introduces you to the
classification methods and their usage.

Concept of Classification

In the context of document capture, classification signifies the assignment of a
document to a category. A category is one element of a predefined classification
scheme, which is also called the class hierarchy.

The classification result is the name of the class (in the current hierarchy) for which a
document matches predefined classification criteria. A class hierarchy is defined for
each project; therefore, the set of classification results is limited by the set of defined
classes and their properties.
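
As an illustration only, the relationship between a class hierarchy and a classification result can be sketched in Python. This is not the Ascent Xtrata Pro API; the names DocClass and classify, and the keyword-based criteria, are hypothetical and chosen only to show that a result is always the name of a class defined in the hierarchy.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional


@dataclass
class DocClass:
    """One class in a hypothetical class hierarchy."""
    name: str
    matches: Callable[[str], bool]  # predefined classification criteria
    children: List["DocClass"] = field(default_factory=list)


def classify(text: str, node: DocClass) -> Optional[str]:
    """Return the name of the deepest class whose criteria the text matches."""
    if not node.matches(text):
        return None
    for child in node.children:
        result = classify(text, child)
        if result is not None:
            return result
    return node.name


# Hypothetical project hierarchy: a root class with two child classes.
hierarchy = DocClass(
    name="Document",
    matches=lambda text: True,
    children=[
        DocClass("Invoice", lambda text: "invoice" in text.lower()),
        DocClass("Order", lambda text: "purchase order" in text.lower()),
    ],
)
```

In this sketch, a text containing "Invoice No. 123" is assigned to the class Invoice, while a text that matches neither child class falls back to the root class Document. No result outside the three defined class names is possible.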

Classification can be based either on the physical format or layout of a single
document page or on the content returned from full-text OCR. In the simplest case, if
all of the documents are single-page documents or deal with only a single subject,
there is no need to subdivide the documents into smaller parts, such as pages or
paragraphs.

On the other hand, if the documents are more complex, it is necessary to analyze and
break them into smaller parts in order to determine the overall classification result.
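
The guide does not state here how the results for the individual parts are combined. As a rough sketch, assuming a simple majority vote over pages (an assumption for illustration, not the product's documented method), the idea might look like this in Python. The function names and the keyword-based page classifier are hypothetical.

```python
from collections import Counter
from typing import Callable, List


def classify_document(pages: List[str],
                      classify_page: Callable[[str], str]) -> str:
    """Classify each page, then return the most frequent page-level class."""
    if not pages:
        raise ValueError("a document needs at least one page")
    votes = Counter(classify_page(page) for page in pages)
    return votes.most_common(1)[0][0]


def toy_page_classifier(page_text: str) -> str:
    """Hypothetical page-level criterion based on a single keyword."""
    return "Invoice" if "invoice" in page_text.lower() else "Other"


pages = ["Invoice 12", "invoice total due", "terms and conditions"]
overall = classify_document(pages, toy_page_classifier)
```

Here two of the three pages are classified as Invoice, so the overall result for the document is Invoice. Other combination rules, such as weighting the first page more heavily, would fit the same structure.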
