Metrohm Vision – Theory User Manual


1.2 Multilinear Regression

1.2.1 Overview

Multilinear Regression (MLR) is an extension of simple linear regression. This method uses
information at more than one wavelength to create a calibration equation of the following form:

$$[c] = K(0) + \sum_{i=1}^{n} K(i)\,A_i$$

MLR is useful when the information at a single wavelength does not yield a model that performs
suitably well.
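
The calibration equation above can be illustrated with a minimal least-squares sketch. It assumes that [c] is the constituent concentration, A_i the absorbance at the i-th chosen wavelength, K(0) the intercept and K(i) the coefficient for wavelength i. The function names and the sample values are hypothetical, and this is not Vision's own implementation.

```python
def fit_mlr(absorbances, concentrations):
    """Least-squares fit of c = K(0) + sum_i K(i) * A_i.

    Returns [K(0), K(1), ..., K(n)]. Illustrative only.
    """
    # Prepend 1.0 to every sample so that the first coefficient is the intercept K(0).
    rows = [[1.0] + list(a) for a in absorbances]
    p = len(rows[0])
    # Build the normal equations (X^T X) k = X^T c as an augmented matrix.
    aug = [
        [sum(r[i] * r[j] for r in rows) for j in range(p)]
        + [sum(r[i] * c for r, c in zip(rows, concentrations))]
        for i in range(p)
    ]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(p):
        pivot = max(range(col, p), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("singular system: wavelengths may be perfectly colinear")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(p):
            if r != col:
                factor = aug[r][col] / aug[col][col]
                aug[r] = [a - factor * b for a, b in zip(aug[r], aug[col])]
    return [aug[i][p] / aug[i][i] for i in range(p)]


def predict_concentration(k, absorbance):
    """Apply the calibration equation to one sample's absorbances."""
    return k[0] + sum(ki * ai for ki, ai in zip(k[1:], absorbance))


# Synthetic data (made up for illustration): absorbances at two wavelengths,
# with concentrations generated exactly from c = 1 + 2*A_1 + 3*A_2.
absorbances = [[0.1, 0.5], [0.2, 0.3], [0.4, 0.9], [0.6, 0.2], [0.8, 0.7]]
concentrations = [1.0 + 2.0 * a1 + 3.0 * a2 for a1, a2 in absorbances]

k = fit_mlr(absorbances, concentrations)
print(k)  # coefficients close to K(0)=1, K(1)=2, K(2)=3
```

Because the synthetic concentrations contain no noise, the fit recovers the generating coefficients. With real spectra the coefficients would only approximate the underlying relationship.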

1.2.2 Colinearity

When using more than one wavelength in MLR, there is a risk of colinearity among the chosen
wavelengths. This means simply that absorbance values at two or more of the wavelengths used for
the calibration are correlated: they describe behavior in the data set that is related, not unique.

While such a model may describe the calibration set very well, it may be very sensitive to noise or
systematic errors in the calibration data that may not be representative of samples in general.
Consequently, MLR models with colinearity among analysis wavelengths may not provide reliable
analysis of real samples. Vision provides an intercorrelation table for evaluating the extent of
colinearity in an MLR model.
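
The idea of an intercorrelation table can be sketched as a matrix of pairwise correlation coefficients between the absorbances at the chosen wavelengths. The layout of Vision's own table is not described here, so the function names and data below are assumptions made for illustration.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def intercorrelation_table(absorbances):
    """Pairwise correlations between wavelengths.

    `absorbances` holds one row per sample and one column per wavelength.
    """
    columns = list(zip(*absorbances))
    return [[pearson(ci, cj) for cj in columns] for ci in columns]


# Synthetic data (made up): wavelength 3 is an exact linear function of
# wavelength 1, so the two are perfectly colinear; wavelength 2 is not.
w1 = [0.1, 0.2, 0.3, 0.4, 0.5]
w2 = [0.5, 0.1, 0.4, 0.2, 0.3]
w3 = [2.0 * a + 0.1 for a in w1]
samples = [list(row) for row in zip(w1, w2, w3)]

table = intercorrelation_table(samples)
for row in table:
    print(["%.2f" % value for value in row])
```

An off-diagonal entry with magnitude near 1 indicates that two wavelengths carry largely the same information, which is the situation the text warns about.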

1.2.3 Multiple Correlation Coefficient

A multi-wavelength analog of the correlation coefficient is the multiple correlation coefficient. This
parameter has the same significance as the simple correlation coefficient and is similarly useful for
estimating the quality of a model. As with simple correlation, the multiple correlation coefficient can
take values from zero (0) to one (1), with zero indicating a complete lack of relationship
between spectral and constituent data and one (1) signifying a perfect fit.
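
One common way to compute a multiple correlation coefficient is as the correlation between the reference constituent values and the values fitted by the model. The sketch below uses that definition as an assumption, since the exact formula used by Vision is not given here. For a least-squares fit with an intercept the result lies between 0 and 1. The function name and data are hypothetical.

```python
import math


def multiple_correlation(actual, fitted):
    """Correlation between reference values and model-fitted values.

    Returns 0.0 when either sequence has no variation, because no
    relationship can then be measured.
    """
    n = len(actual)
    mean_a, mean_f = sum(actual) / n, sum(fitted) / n
    saf = sum((a - mean_a) * (f - mean_f) for a, f in zip(actual, fitted))
    saa = sum((a - mean_a) ** 2 for a in actual)
    sff = sum((f - mean_f) ** 2 for f in fitted)
    if saa == 0 or sff == 0:
        return 0.0
    return saf / math.sqrt(saa * sff)


# Made-up reference values and fitted values from a hypothetical model.
reference = [1.0, 2.0, 3.0, 4.0]
fitted_values = [1.1, 1.9, 3.2, 3.8]

r_multiple = multiple_correlation(reference, fitted_values)
print(round(r_multiple, 4))
```

A value close to 1 means the fitted values track the reference values closely. It does not by itself show that the model will perform well on new samples.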

1.3 Partial Least Squares (PLS)

1.3.1 Overview

Partial Least Squares (PLS) is a regression method that allows use of many wavelengths – whether
broad segments or even the entire spectrum – while avoiding the problem of colinearity that besets
MLR. Unlike traditional least squares methods, PLS does not assume that spectral data are exact and
all errors are in constituent values. Instead, the spectral and constituent data are simultaneously
modeled in steps that incrementally account for spectral signal and constituent values. In each step,
part of the spectral data (called a “factor”) and a corresponding part of the constituent data are
subtracted from the data set, leaving spectral and constituent residuals.

With the determination of each factor, the residual information in the calibration data set (the
information yet to be modeled) becomes smaller. Partial calibrations for each factor
(loadings for spectral data and scores for constituent values) are used to calculate the amount of
variance modeled by each factor. At the end of the process, the partial calibrations are assembled
into one overall calibration equation.
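
The factor-by-factor procedure described above can be sketched with a textbook NIPALS-style PLS algorithm for a single constituent. This is an assumption about the general technique, not a description of Vision's internal algorithm, and all names and data are hypothetical. Each pass extracts one factor, subtracts its contribution from both the spectral and the constituent data, and records how much constituent residual remains.

```python
import math


def pls1_fit(spectra, values, n_factors):
    """Fit a single-constituent PLS model one factor at a time.

    Returns (x_mean, y_mean, factors, residual_ss). residual_ss[k] is the
    constituent residual sum of squares after k factors.
    """
    n, m = len(spectra), len(spectra[0])
    x_mean = [sum(row[j] for row in spectra) / n for j in range(m)]
    y_mean = sum(values) / n
    # Mean-centered spectral residuals E and constituent residuals f.
    E = [[row[j] - x_mean[j] for j in range(m)] for row in spectra]
    f = [v - y_mean for v in values]
    factors = []
    residual_ss = [sum(v * v for v in f)]
    for _ in range(n_factors):
        # Weight vector: direction in the spectra that covaries most with f.
        w = [sum(E[i][j] * f[i] for i in range(n)) for j in range(m)]
        norm = math.sqrt(sum(v * v for v in w))
        if norm < 1e-12:
            break  # nothing left to model
        w = [v / norm for v in w]
        # Per-sample amount of this factor.
        t = [sum(E[i][j] * w[j] for j in range(m)) for i in range(n)]
        tt = sum(v * v for v in t)
        # Spectral pattern p and constituent coefficient q for this factor.
        p = [sum(E[i][j] * t[i] for i in range(n)) / tt for j in range(m)]
        q = sum(f[i] * t[i] for i in range(n)) / tt
        # Subtract the factor, leaving spectral and constituent residuals.
        E = [[E[i][j] - t[i] * p[j] for j in range(m)] for i in range(n)]
        f = [f[i] - t[i] * q for i in range(n)]
        factors.append((w, p, q))
        residual_ss.append(sum(v * v for v in f))
    return x_mean, y_mean, factors, residual_ss


def pls1_predict(model, spectrum):
    """Predict by applying each factor's partial calibration in turn."""
    x_mean, y_mean, factors, _ = model
    e = [spectrum[j] - x_mean[j] for j in range(len(spectrum))]
    y = y_mean
    for w, p, q in factors:
        t = sum(e[j] * w[j] for j in range(len(e)))
        y += t * q
        e = [e[j] - t * p[j] for j in range(len(e))]
    return y


# Synthetic data (made up): the third wavelength is the sum of the first
# two, so the wavelengths are colinear by construction.
pairs = [(0.1, 0.2), (0.2, 0.1), (0.3, 0.4), (0.4, 0.3), (0.5, 0.5)]
spectra = [[a, b, a + b] for a, b in pairs]
values = [0.5 + 2.0 * a + b for a, b in pairs]

model = pls1_fit(spectra, values, n_factors=2)
print(model[3])  # residual sum of squares shrinks with each factor
print(pls1_predict(model, [0.25, 0.45, 0.70]))
```

In this sketch the colinear third wavelength causes no difficulty, because each factor is a combination of all wavelengths rather than a coefficient fitted to each wavelength separately. Applying the factors in sequence during prediction is equivalent to using a single assembled calibration equation.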
