Chemometrics, Calibration methods, Principal component analysis: pca – BUCHI NIRCal User Manual



NIRCal 5.5 Manual, Version A


3 Chemometrics

3.1 Calibration Methods

3.1.1 Principal Component Analysis: PCA

Principal Component Analysis (PCA) is a mathematical and statistical evaluation of a large amount of
chemical data. In this case, the chemical data are the measured NIR spectra.
PCA is performed for two reasons:
- to reduce the amount of data without losing necessary information. Noise is cut off by limiting the
number of primary principal components (PCs);
- to evaluate measured spectra automatically once a calibration has been created.
With today's powerful computers, the primary objective is no longer the reduction of the data volume.
Today, the main goal of PCA is to find and automatically evaluate characteristics of identity, quality
and quantity in the spectra.

Each spectrum measured with the NIRFlex N-500 consists of 1,501 data points, which correspond to the
intensity values at the 1,501 support points on the wavenumber scale.
In order to obtain a good calibration, a large number of spectra is needed. For 100 substance spectra,
this already gives 150,100 data points (100 x 1,501), which places an enormous computing workload on
the computer.
To achieve acceptable computing times, the spectral data are therefore efficiently compressed with
the aid of PCA without losing any important information. For this purpose, PCA utilises the
redundancy occurring in the spectra. PCA extracts so-called principal components, which are
mutually uncorrelated and orthogonal to one another, yet are still capable of adequately
reconstructing the original spectra.
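
The compression described above can be illustrated with a minimal sketch. It does not show how
NIRCal computes its PCA; it assumes synthetic spectra built from three invented absorption bands
plus noise, and uses a singular value decomposition from NumPy as one common way to obtain principal
components. All names (bands, concentrations, k) are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)

n_spectra, n_points = 100, 1501   # 100 spectra with 1,501 support points each
n_sources = 3                     # assumed number of underlying bands (illustrative)

# Synthetic "spectra": mixtures of three smooth bands plus a little noise.
x = np.linspace(0.0, 1.0, n_points)
bands = np.stack([np.exp(-((x - c) / 0.08) ** 2) for c in (0.25, 0.5, 0.75)])
concentrations = rng.uniform(0.0, 1.0, size=(n_spectra, n_sources))
spectra = concentrations @ bands + 0.001 * rng.standard_normal((n_spectra, n_points))

# PCA via singular value decomposition of the mean-centred data.
mean_spectrum = spectra.mean(axis=0)
centred = spectra - mean_spectrum
U, s, Vt = np.linalg.svd(centred, full_matrices=False)

k = 3                             # number of principal components kept
loadings = Vt[:k]                 # principal components, shape (k, 1501)
scores = centred @ loadings.T     # compressed data, shape (100, k)

# Reconstruct the spectra from the compressed representation.
reconstructed = mean_spectrum + scores @ loadings
rel_error = np.linalg.norm(spectra - reconstructed) / np.linalg.norm(spectra)

print(spectra.size, "values compressed to", scores.size, "scores")
print("relative reconstruction error:", rel_error)
print("components orthonormal:", np.allclose(loadings @ loadings.T, np.eye(k)))
```

In this toy case the 150,100 values are represented by 300 scores plus the stored mean spectrum and
loadings; real spectra generally need more components than this idealised example.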
The PCA is always performed on the calibration spectra set, within the selected wavenumber range and
with the selected pretreatment.
A geometric explanation helps to visualize the PCA. It is not possible to imagine a space of 1,501
dimensions (the selected wavelengths), in which each wavelength or wavenumber corresponds to one
dimension. In this space, however, a spectrum can be represented as a single point. For three
dimensions, this can be shown graphically:


In mathematical terms, this point is equivalent to a vector with 1,501 components (I1 ... I1501).
Several calibration spectra produce a "cloud" of points (a cluster) in space. For a set of spectra,
or points, in the 1,501-dimensional space, a coordinate transformation is now performed such that the
new origin lies at the mean centre of all the spectra (mean centering) and the new space directions
(the principal components) lie along the directions of greatest variance in the spectra.
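
The coordinate transformation can be sketched in three dimensions, where it is still easy to follow.
The example below is an illustration under stated assumptions, not NIRCal's implementation: it uses
an invented cloud of 200 points and takes the principal components as the eigenvectors of the
covariance matrix, sorted by decreasing variance.

```python
import numpy as np

rng = np.random.default_rng(1)

# A toy "cloud" of 200 points in 3 dimensions (stand-in for 1,501 dimensions).
# The cloud is deliberately stretched most along one direction and shifted
# away from the origin.
spread = np.array([5.0, 1.0, 0.2])
offset = np.array([10.0, -3.0, 2.0])
cloud = rng.standard_normal((200, 3)) * spread + offset

# Step 1: mean centring moves the origin to the centre of the cloud.
centre = cloud.mean(axis=0)
centred = cloud - centre

# Step 2: the new space directions are the eigenvectors of the covariance
# matrix, sorted by decreasing eigenvalue (variance along that direction).
cov = np.cov(centred, rowvar=False)
eigvals, eigvecs = np.linalg.eigh(cov)   # eigh returns ascending eigenvalues
order = np.argsort(eigvals)[::-1]
variances = eigvals[order]
components = eigvecs[:, order]           # columns are the new directions

# Step 3: coordinates of every point in the new system (the scores).
scores = centred @ components

print("mean of scores (new origin):", scores.mean(axis=0))
print("variance along each new direction:", scores.var(axis=0, ddof=1))
```

The variance of the scores along each new axis equals the corresponding eigenvalue, so the first
direction carries the greatest variance and each following direction carries less.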
The new space directions are calculated in such a way that the features with the widest variances
(differences) across all spectra are captured in the first space directions, while the higher space
directions gradually evolve into noise. Space directions that contain no information other than
noise are no longer taken into account.
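
How the higher space directions fade into noise can be seen by looking at the share of total
variance each direction carries. The sketch below uses synthetic data driven by two invented
factors plus noise. The 99 % cumulative-variance cut-off is a hypothetical rule chosen for
illustration; it is not taken from the manual and is not necessarily the criterion NIRCal applies
when selecting primary PCs.

```python
import numpy as np

rng = np.random.default_rng(2)

# Synthetic data: 50 "spectra" of 200 points, driven by 2 real factors plus noise.
x = np.linspace(0.0, 1.0, 200)
factors = np.stack([np.sin(2 * np.pi * x), np.cos(2 * np.pi * x)])
amounts = rng.uniform(-1.0, 1.0, size=(50, 2))
data = amounts @ factors + 0.01 * rng.standard_normal((50, 200))

# Singular values of the mean-centred data, largest first.
centred = data - data.mean(axis=0)
singular_values = np.linalg.svd(centred, compute_uv=False)

# Fraction of the total variance carried by each space direction.
explained = singular_values**2 / np.sum(singular_values**2)
cumulative = np.cumsum(explained)

# Hypothetical rule: keep the fewest directions that reach 99 % of the variance.
threshold = 0.99
n_primary = int(np.searchsorted(cumulative, threshold) + 1)

print("explained variance of the first 5 directions:", explained[:5])
print("number of directions kept:", n_primary)
```

Directions beyond the retained ones each explain only a tiny fraction of the variance, which is
consistent with them containing little besides noise.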
