The Application of Partial Least Squares to Non-linear Systems


The Application of Partial Least Squares
to Non-linear Systems
in the Process Industries
Elaine Martin and Julian Morris
Centre for Process Analytics and Control Technology
CPACT
School of Chemical Engineering and Advanced Materials
University of Newcastle, England
Centre for Process Analytics and Control Technology (CPACT)
University of Newcastle, UK
Overview of the Presentation

Motivation for the Application of “Data Mining” in Non-linear
Process Systems

Process Modelling and Analysis of Non-linear Systems

Constrained Partial Least Squares

Local Linear Modelling

Prediction Intervals for Non-linear Partial Least Squares

Conclusions
Data Rich Information Poor
[Diagram labels: Modern Process Control Systems; Process Monitoring for Early Warning and Fault Detection; Process Optimisation; Enhanced Profitability and Improved Customer Satisfaction]
Process Modelling

Mechanistic models developed from process mass and energy
balances and kinetics provide the ideal form given that:
• process understanding exists
• time is available to construct the model.

Data based models are useful alternatives when there is:
• limited process understanding
• process data available from a range of operating conditions.

Hybrid models combine the two approaches.
Process Modelling

Traditionally two types of variables have been used in the
development of a process model or process performance
monitoring scheme:
• Process variables (X)
• Quality variables (Y)

In practice, a third class exists:
• Confounding variables (Z).

A confounding variable is any extraneous factor that is related to,
and affects, the two sets of variables under study, X and Y.

It can distort the true relationship between the two sets of
variables, which is the relationship of primary interest.
Global Process Variation
[Figure: plot showing the trajectory of the confounding variable, a confidence ellipse including confounding variation, a confidence ellipse excluding confounding variation, and a point of mal-operation.]
Partial Least Squares
$X = T P^{T} + E$ (X-block outer relationship, monitoring)
$U = T B$ (inner relationship, prediction)
$Y = U Q^{T} + F$ (Y-block outer relationship, monitoring)

$E_1 = X - t_1 p_1^{T}$
$F_1 = Y - b_1 t_1 q_1^{T}$
X and Y-block scores are calculated recursively
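
The recursive calculation of the scores can be illustrated with a minimal NIPALS-style sketch. It assumes mean-centred data; the function name `nipals_pls`, the convergence tolerance and the random test data are illustrative assumptions, not part of the presentation.

```python
import numpy as np

def nipals_pls(X, Y, n_components, tol=1e-10, max_iter=500):
    """NIPALS-style PLS sketch for mean-centred X (n x m) and Y (n x k)."""
    E, F = X.astype(float).copy(), Y.astype(float).copy()
    T, P, Q, b = [], [], [], []
    for _ in range(n_components):
        u = F[:, 0].copy()                  # initial Y-block score
        for _ in range(max_iter):
            w = E.T @ u
            w /= np.linalg.norm(w)          # X-block weights
            t = E @ w                       # X-block score
            q = F.T @ t
            q /= np.linalg.norm(q)          # Y-block loadings
            u_new = F @ q                   # Y-block score
            converged = np.linalg.norm(u_new - u) < tol * np.linalg.norm(u_new)
            u = u_new
            if converged:
                break
        p = E.T @ t / (t @ t)               # X-block loadings
        b_h = (u @ t) / (t @ t)             # inner regression coefficient
        E = E - np.outer(t, p)              # deflate X block: E - t p^T
        F = F - b_h * np.outer(t, q)        # deflate Y block: F - b t q^T
        T.append(t); P.append(p); Q.append(q); b.append(b_h)
    return (np.column_stack(T), np.column_stack(P),
            np.column_stack(Q), np.array(b), E, F)

# Illustrative data (assumed, not from the presentation).
rng = np.random.default_rng(0)
X = rng.normal(size=(30, 5))
X -= X.mean(axis=0)
Y = X @ rng.normal(size=(5, 2)) + 0.1 * rng.normal(size=(30, 2))
Y -= Y.mean(axis=0)
T, P, Q, b, E, F = nipals_pls(X, Y, n_components=2)
```

After the loop, X equals T P^T plus the final residual E, and Y equals T diag(b) Q^T plus the final residual F, which mirrors the outer and inner relationships listed above.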
Constrained PLS

To exclude the nuisance source of variability, a necessary
condition is that the derived latent variables, $t_h$ and $u_h$, are not
correlated with the confounding variables:

$Z^{T} t_h = 0$ and $Z^{T} u_h = 0$ for $h = 1, \ldots, A$.

The idea of constrained PLS is to apply these constraints to
ordinary PLS.

Ordinary PLS:
$t_h = \arg\min_{t} \lVert E_h - t w_h^{T} \rVert^{2}$, $u_h = \arg\min_{u} \lVert F_h - u q_h^{T} \rVert^{2}$

Constrained PLS:
$t_h = \arg\min_{Z^{T} t = 0} \lVert E_h - t w_h^{T} \rVert^{2}$, $u_h = \arg\min_{Z^{T} u = 0} \lVert F_h - u q_h^{T} \rVert^{2}$
Constrained PLS

Standard constrained optimisation techniques can be used to
solve the equations in each iteration.

An algorithm has been developed that enhances the efficiency
of the constrained PLS algorithm.

The other steps of constrained PLS are as for ordinary PLS.

The resulting latent variables can then be used for process
monitoring with the knowledge that they are not confounded with
the nuisance source of variability.

Any unusual variation detected from these latent variables can
then be assumed to be related to abnormal process behaviour.
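
For a single score vector, the constrained least squares problem has a closed-form solution: the unconstrained score is projected onto the space orthogonal to the columns of Z. The sketch below shows that one step only. It is not the enhanced algorithm mentioned above, and the function name `constrained_score` and the random data are assumptions made for illustration.

```python
import numpy as np

def constrained_score(E, w, Z):
    """Minimise ||E - t w^T||^2 over t subject to Z^T t = 0."""
    t_free = E @ w / (w @ w)                 # unconstrained least squares score
    # Remove the part of t_free that lies in the column space of Z.
    coef, *_ = np.linalg.lstsq(Z, t_free, rcond=None)
    return t_free - Z @ coef

# Illustrative data (assumed): 20 samples, 4 process variables,
# 2 confounding variables.
rng = np.random.default_rng(1)
E = rng.normal(size=(20, 4))
Z = rng.normal(size=(20, 2))
w = rng.normal(size=4)
t = constrained_score(E, w, Z)
```

The returned score satisfies Z^T t = 0 up to rounding, so it carries no variation that is linearly related to the confounding variables.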
Industrial Application

An industrial semi-discrete batch manufacturing operation is
used to illustrate the advantages of the constrained PLS
algorithm over ordinary PLS.

The process involves the production of a variety of products
(recipes), some of which are only manufactured in small
quantities to meet the requirements of specialised markets.

The objective of the analysis was to build a monitoring scheme
to detect the onset of subtle changes in production and final
product quality.
An Industrial Application

For simplicity, three recipes are selected to demonstrate the
methodology.

A total of thirty-six process variables, including flow rates,
pressures and temperatures, are recorded every minute, whilst
five quality variables are measured off-line in the quality
laboratory every two hours.

A nominal process monitoring scheme was developed using
both ordinary PLS and constrained PLS from 41 ‘ideal’ batches.

A further 6 batches, A4, A10, A29, A35, A38 and B32 were used
for model validation. These batches were known to lie outside
the desirable specification limits.
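
The monitoring charts for this application report Hotelling's T2 and the squared prediction error (SPE). A minimal sketch of how these two statistics are commonly computed from latent variable scores and X-block residuals is given here. The function name `t2_and_spe`, the use of score variances from the nominal batches and the small numerical example are assumptions, and control limits are not shown.

```python
import numpy as np

def t2_and_spe(T_ref, T_new, E_new):
    """Hotelling's T2 from scores and SPE from X-block residuals.

    T_ref: scores of the nominal (reference) batches, one row per batch.
    T_new: scores of the batches to be monitored.
    E_new: X-block residuals of the batches to be monitored.
    """
    var = T_ref.var(axis=0, ddof=1)            # variance of each latent variable
    t2 = np.sum(T_new ** 2 / var, axis=1)      # scaled distance in score space
    spe = np.sum(E_new ** 2, axis=1)           # squared residual length
    return t2, spe

# Small illustrative example (assumed values).
T_ref = np.array([[1.0, 2.0], [-1.0, -2.0], [1.0, -2.0], [-1.0, 2.0]])
T_new = np.array([[2.0, 4.0]])
E_new = np.array([[1.0, 2.0, 2.0]])
t2, spe = t2_and_spe(T_ref, T_new, E_new)
```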
Industrial Application
Ordinary Partial Least Squares
[Figure: bivariate scores plots with points labelled A, B and C. Left panel: latent variable 1 (t1) versus latent variable 2 (t2). Right panel: latent variable 3 (t3) versus latent variable 4 (t4).]
Industrial Application
Ordinary Partial Least Squares
[Figure: bivariate scores plot (t1 versus t2) with batches A4, A10, A29, A35, A38 and B32 marked; Hotelling's T2 and SPE charts.]
Industrial Application
Constrained Partial Least Squares
[Figure: scores plots with batches A4, A10, A29, A35, A38 and B32 marked. Left panel: LV 1 (t1) versus LV 2 (t2). Right panel: LV 3 (t3) versus LV 4 (t4).]
Industrial Application
Constrained Partial Least Squares
[Figure: Hotelling's T2 chart and Squared Prediction Error (SPE) chart.]
Constrained PLS - Conclusions

Constrained PLS possesses the following important
characteristics:

It removes that information correlated with the confounding
variables.

The information excluded by constrained PLS contains only
variation associated with the confounding variables.

The derived constrained PLS latent variables achieve
optimality in terms of extracting as much of the available
information as possible contained in the process and quality
data.
Local Linear and Non-linear
Multi-way Partial Least Squares Batch Monitoring
Batch Process Modelling and Monitoring

Batch processes exhibit non-linear, time variant and dynamic
behaviour.

These characteristics challenge the linear multivariate
statistical technique of multi-way Partial Least Squares (PLS)
that has traditionally been applied in batch process
performance monitoring.

A local model based approach has been developed to
overcome these limitations.
Local Model Approach

Batch processes often exhibit distinct phases of process
operation. Instead of modelling a non-linear, time variant
batch process with a single global model, the batch trajectories are
subdivided into individual operating regions.

A local linear PLS model is then developed for each operating
region. Each model can comprise a different number of latent
variables.

A validity function then creates a smooth transition between the
local models to build a global non-linear model.
Validity Function

The validity function determines which operating region the
process lies within at each time point:
• Identification of the most appropriate local model
• Weighting of local models if two or more are applicable

The validity function is a fuzzy logic, rule-based function:
• Rules based on process variable behaviour, for example:
IF x1 is LOW AND x2 is HIGH
THEN model 1 is valid
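
A rule of this kind can be turned into model weights as sketched below. The sigmoid membership functions, the use of the minimum for AND, the second "otherwise" rule and the names `low`, `high`, `validities` and `blend` are all assumptions made for illustration; the presentation does not specify them.

```python
import numpy as np

def low(x, centre=0.0, width=1.0):
    """Membership of 'x is LOW': close to 1 well below centre, close to 0 well above."""
    return 1.0 / (1.0 + np.exp((x - centre) / width))

def high(x, centre=0.0, width=1.0):
    """Membership of 'x is HIGH', taken here as the complement of LOW."""
    return 1.0 - low(x, centre, width)

def validities(x1, x2):
    """Validity of two local models, normalised to sum to one."""
    # Rule 1: IF x1 is LOW AND x2 is HIGH THEN model 1 is valid (AND taken as min).
    v1 = np.minimum(low(x1), high(x2))
    # Rule 2 (hypothetical): otherwise model 2 is valid.
    v2 = 1.0 - v1
    v = np.stack([v1, v2])
    return v / v.sum(axis=0)

def blend(local_predictions, v):
    """Validity-weighted combination of the local model predictions."""
    return np.sum(v * np.asarray(local_predictions), axis=0)

v = validities(-5.0, 5.0)           # x1 clearly LOW, x2 clearly HIGH
y_hat = blend([10.0, 20.0], v)      # local predictions are assumed values
```

Because the validities vary smoothly with the process variables, the blended prediction moves gradually from one local model to the next rather than switching abruptly.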
Dynamic Feature Addition

Batch process variables also exhibit serial and cross
correlation.

The Auto Regressive with eXogenous inputs (ARX) structure is a
time series structure used to model such data:
$\hat{y}(t) = a_1 y(t-1) + \cdots + a_k y(t-k) + b_1 x(t-1) + \cdots + b_j x(t-j)$

Including past input and output process variables into the X
data matrix of a PLS model encapsulates some of the dynamic
features within the model.
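
Building the augmented X data matrix from past inputs and outputs can be sketched as follows for a single input and a single output. The function name `lagged_matrix` and the toy series are assumptions; with several process variables the lagged columns of each variable would be stacked side by side in the same way.

```python
import numpy as np

def lagged_matrix(x, y, n_x_lags, n_y_lags):
    """Rows [y(t-1), ..., y(t-k), x(t-1), ..., x(t-j)] with targets y(t)."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    start = max(n_x_lags, n_y_lags)          # first t with a full history
    rows = []
    for t in range(start, len(y)):
        past_y = y[t - n_y_lags:t][::-1]     # y(t-1), ..., y(t-k)
        past_x = x[t - n_x_lags:t][::-1]     # x(t-1), ..., x(t-j)
        rows.append(np.concatenate([past_y, past_x]))
    return np.array(rows), y[start:]

# Toy series (assumed values) with k = 2 output lags and j = 1 input lag.
x = np.arange(6.0)
y = 10.0 * np.arange(6.0)
X_lag, targets = lagged_matrix(x, y, n_x_lags=1, n_y_lags=2)
```

The matrix `X_lag` would then take the place of the X block in the PLS model, with `targets` as the corresponding Y values.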
Application to an Industrial Process

A fed-batch fermentation process is used to demonstrate local
model performance monitoring.

17 batches with good operating conditions and high yield were
selected for the nominal model.

30 batches with standard operating conditions but mid to low
yield were used to assess the monitoring charts.

A model was developed using local dynamic PLS and global
dynamic PLS.
Operating Region Specification

Operating regions were specified using process knowledge:
• 4 operating regions identified
• Regions based on conditions within the fermenter

Operating region 1: initial start up of the fermenter before
optimum conditions are reached
Operating region 2: initialisation of product growth
Operating region 3: maximum growth rate of product
Operating region 4: reactions are complete
Operating Region Specification
[Figure: time profiles (time 0 to 400) of pH (x1), addition rate of chemical A (x6) and potency (y1).]
Validity Function

Fuzzy logic rules are used to determine movement between
operating regions.

The rules are applied to power, substrate addition rate and
respiration rate.

[Figure: validity (0 to 1) of models 1 to 4 against time.]
Global Dynamic PLS
[Figure: predicted and actual values of potency against observation number; residuals of the global dynamic PLS model.]
Prediction using Local Dynamic PLS Model
[Figure: predicted and actual potency for each model (models 1 to 4) against observation number; residuals of the local dynamic PLS models against potency.]
Performance Monitoring and Fault Detection
[Figure: local SPE chart with varying control limit; global SPE chart with constant control limit.]
Fault Detection
[Figure: monitoring charts for batch 3. Local chart (axis "local t2", labelled "Local SPE chart"): process fault detected. Global chart (axis "global t2", labelled "Global SPE chart"): false alarm.]
Conclusions

Inclusion of dynamic behaviour improves model performance
through the removal of process structure within the model.

The fuzzy rule-based validity function approach allows batch
specific movement between models.

The local model approach to performance monitoring leads to
control charts with improved model limits.

Local model monitoring charts detect faults and process
deviations earlier than the global model equivalent.
Non-linear Partial Least Squares
Prediction Intervals and Leverage
Non-linear Partial Least Squares

A simple approach to non-linear PLS has been to extend the
input matrix (X) by including non-linear combinations of the
original variables (such as logarithms, squared values and
cross-products) and then performing linear PLS.

If there is no a priori knowledge, then there is no limitation as to
the number (and kind) of transformations that might be applied.

Thus by pre-treating data sets in this way, the number of
non-linear terms can increase excessively, resulting in large input
and output matrices and results that are difficult to interpret.
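
The growth in the number of terms is easy to see with a small sketch that appends only squares and pairwise cross-products to the input matrix. The function name `expand_inputs` and the choice of these two transformations are assumptions for illustration.

```python
from itertools import combinations

import numpy as np

def expand_inputs(X):
    """Append squared terms and pairwise cross-products to the columns of X."""
    n, m = X.shape
    squares = X ** 2
    cross = [X[:, i] * X[:, j] for i, j in combinations(range(m), 2)]
    cross = np.column_stack(cross) if cross else np.empty((n, 0))
    return np.hstack([X, squares, cross])

small = expand_inputs(np.array([[1.0, 2.0, 3.0]]))   # 3 variables become 9 columns
wide = expand_inputs(np.ones((5, 36)))               # 36 variables become 702 columns
```

With m original variables this expansion gives 2m + m(m-1)/2 columns, so thirty-six variables already produce 702 columns before any further transformations are considered.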
Non-linear Partial Least Squares

A more structured approach to the development of a non-linear
PLS model is to modify the NIPALS algorithm by introducing a
non-linear function that relates the output scores u to the input
scores t, without modifying the input and output variables:
$u = f(t) + e = f(X, w) + e$

Wold et al. (1989) proposed a non-linear PLS algorithm which
retained the framework of linear PLS but used second order
polynomial (quadratic) regression:
$u_j = c_{0j} + c_{1j} t_j + c_{2j} t_j^{2} + e_j$
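
The quadratic inner relationship for one latent variable can be fitted by ordinary least squares, as in the sketch below. It covers the inner regression step only and does not reproduce the weight updating of the algorithm of Wold et al.; the function names and the synthetic scores are assumptions.

```python
import numpy as np

def fit_quadratic_inner(t, u):
    """Least squares fit of u = c0 + c1 t + c2 t^2 for one latent variable."""
    c2, c1, c0 = np.polyfit(t, u, deg=2)     # polyfit returns highest power first
    return c0, c1, c2

def predict_inner(t, coeffs):
    """Evaluate the fitted quadratic inner relationship at scores t."""
    c0, c1, c2 = coeffs
    return c0 + c1 * t + c2 * t ** 2

# Synthetic scores (assumed) generated from a known quadratic.
t = np.linspace(-2.0, 2.0, 21)
u = 1.0 + 0.5 * t - 2.0 * t ** 2
coeffs = fit_quadratic_inner(t, u)
u_hat = predict_inner(t, coeffs)
```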
Prediction Intervals for Non-linear PLS

As for every regression technique, a measure for assessing the
reliability of the predicted values is required.

A common approach is through the use of prediction intervals.
These are the upper and lower confidence limits of the predicted
values.

The larger the magnitude of these intervals, the less precise is
the prediction.

A methodology used to evaluate prediction intervals for neural
network models has been extended to linear and non-linear
partial least squares algorithms.
Calculation of Prediction Intervals

The prediction intervals are computed using a first order Taylor
series expansion and the Jacobian matrix of the functional
mapping provided by the PLS algorithms.

Given a set of input and output training data, X and Y,
respectively, a PLS regression model is built and the Jacobian
matrix F is computed for the same set of training data.

When the PLS regression model is used to predict a new output
value, corresponding to a new sample of input variables, the
vector of partial derivatives $f_*$ is computed and the prediction
interval is evaluated:

$PI(y_*, \alpha) = \hat{y}_* \pm t_{n-p,\,1-\alpha/2} \, s \, \sqrt{1 + f_*^{T} (F^{T} F)^{-1} f_*}$
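
A minimal numerical sketch of this calculation is given below. It assumes that n is the number of training samples, p the number of model parameters and s the residual standard deviation estimated with n - p degrees of freedom, none of which is defined on the slide. A straight-line model stands in for the PLS mapping, so that the Jacobian with respect to the parameters is simply the design matrix, and the t critical value is passed in from a table.

```python
import numpy as np

def prediction_interval(F, f_new, y_hat, s, t_crit):
    """Interval y_hat +/- t_crit * s * sqrt(1 + f^T (F^T F)^-1 f)."""
    h = f_new @ np.linalg.solve(F.T @ F, f_new)   # leverage-like term
    half_width = t_crit * s * np.sqrt(1.0 + h)
    return y_hat - half_width, y_hat + half_width

# Stand-in model (assumed): y = b0 + b1 x, whose Jacobian with respect
# to (b0, b1) has rows [1, x].
x = np.linspace(0.0, 1.0, 10)
F = np.column_stack([np.ones_like(x), x])
y = 2.0 + 3.0 * x + 0.05 * np.sin(7.0 * x)        # small deterministic disturbance
beta, *_ = np.linalg.lstsq(F, y, rcond=None)
n, p = F.shape
s = np.sqrt(np.sum((y - F @ beta) ** 2) / (n - p))

f_new = np.array([1.0, 0.5])                       # partial derivatives at x = 0.5
y_new = f_new @ beta
# 2.306 is the tabulated 97.5% point of Student's t with n - p = 8 degrees of freedom.
lower, upper = prediction_interval(F, f_new, y_new, s, t_crit=2.306)
```

For a non-linear PLS model the rows of F and the vector f_new would instead hold the partial derivatives of the fitted mapping, evaluated at the training samples and at the new sample respectively.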

Case Study

The data were generated from the simulation of a pH
neutralisation system.

Samples were collected under steady state operating
conditions, thus no time correlation existed between any two
consecutive samples.

The data included four input variables (flowrates of the inlet and
outlet streams of the neutralisation tank) and one output variable
(pH value measured in the outlet stream) and were noise free.
pH Neutralisation Process
[Figure: schematic of the neutralisation tank showing streams q1, q2, q3 and q4, the level h and the pH measurement.]
Radial Basis Function PLS

An error-based updating radial basis function PLS model was
built using 350 data samples.

It was constructed from one latent variable, with twenty-one
nodes included in the inner radial basis function model.

In excess of 99% of the total variance of the output variable was
captured by this representation.
Radial Basis Function PLS
[Figure: time series plot for the test data with predictions; pH against sample number.]
Leverage

The quantity

$1 + f_*^{T} (F^{T} F)^{-1} f_*$

is similar in form to leverage.

It can be used to provide an additional metric for assessing the
quality of the regression model.

This is achieved by computing the critical value of the
chi-square distribution, with the appropriate degrees of freedom, for
predefined confidence levels, e.g. 95% and 99%, and plotting the
value of this quantity for each sample together with the critical
value of the distribution divided by (n-p).
Leverage

When the ‘leverage’ is smaller than the critical value, the
corresponding predicted value is considered to be reliable with
the predefined confidence level and vice versa, when the
‘leverage’ is larger than the limit, the predicted value is
considered to be unreliable.
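
The comparison can be sketched as below. The quantity is computed for each new sample and compared with a chi-square critical value divided by (n - p). The degrees of freedom are not stated in the text, so taking them as n - p is an assumption, as are the function names and the straight-line stand-in model.

```python
import numpy as np

def leverage_like(F, F_new):
    """1 + f^T (F^T F)^-1 f for each row f of F_new."""
    G = np.linalg.inv(F.T @ F)
    return 1.0 + np.einsum("ij,jk,ik->i", F_new, G, F_new)

def is_reliable(values, chi2_crit, n, p):
    """True where the quantity lies below the critical value divided by (n - p)."""
    return values < chi2_crit / (n - p)

# Stand-in Jacobian (assumed): straight-line model with rows [1, x].
x = np.linspace(0.0, 1.0, 20)
F = np.column_stack([np.ones_like(x), x])
F_new = np.array([[1.0, 0.5],      # inside the training range
                  [1.0, 50.0]])    # far outside the training range
values = leverage_like(F, F_new)
n, p = F.shape
# 28.869 is the tabulated 95% point of chi-square with 18 degrees of freedom;
# using n - p = 18 degrees of freedom is an assumption.
flags = is_reliable(values, chi2_crit=28.869, n=n, p=p)
```

In this example the sample inside the training range is flagged as reliable and the extrapolated sample is not, which is the behaviour described above.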
Radial Basis Function PLS
[Figure: leverage for the test data, plotted with the absolute value of the prediction error against sample number; prediction intervals for pH against sample number.]
Conclusions - PLS Prediction Intervals

A methodology proposed for prediction intervals in neural
network modelling was extended to non-linear PLS algorithms.

This approach was known to give approximate, but generally
reliable, results whilst being less computationally expensive than
other more mathematically precise approaches such as the
likelihood, lack-of-fit, jackknife and bootstrap.

The development of the algorithm led to the definition of a
metric, the leverage, which can be used in conjunction with, or
as an alternative to, prediction intervals.
Conclusions
DATA RICH INFORMATION POOR
DATA
INFORMATION
KNOWLEDGE
Acknowledgements

EBM acknowledges Dr Pino Baffi, Dr Baibing Li, Miss Nicola
Fletcher and colleagues in CPACT for the many stimulating
discussions.

EBM acknowledges colleagues at BASF Ag. for stimulating the
research, in particular Gerhard Krennrich and Pekka Teppola.

EBM acknowledges Pfizer for providing the data.