Representation of Processes: Nonlinear Empirical Models
CHEE825 Fall 2005, J. McLellan
Neural Network Models of Process Behaviour
• generally modeling input-output behaviour
• empirical models - no attempt to model physical structure
• estimated from plant data
Neural Networks...
• structure motivated by physiological structure of the brain
• individual nodes or cells - “neurons” - sometimes called “perceptrons”
• neuron characteristics - notion of “firing” or threshold behaviour
Stages of Neural Network Model Development
• data collection - training set, validation set
• specification / initialization - structure of network, initial values
• “learning” or training - estimation of parameters
• validation - ability to predict new data set collected under same conditions
Data Collection
• expected range and point of operation
• size of input perturbation signal
• type of input perturbation signal - random input sequence? number of levels (two or more?) - see the sketch below
• validation data set
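As an illustration of the perturbation-signal choices above, here is a minimal sketch (my addition, not from the slides) of a two-level random input sequence generator; the levels, switching probability, and record length are illustrative assumptions:

```python
import numpy as np

def random_binary_sequence(n_samples, low=-1.0, high=1.0,
                           switch_prob=0.1, seed=0):
    """Two-level random input sequence: at each sample, switch levels
    with probability switch_prob, which sets how long the signal
    dwells at each level relative to the process dynamics."""
    rng = np.random.default_rng(seed)
    u = np.empty(n_samples)
    level = high
    for k in range(n_samples):
        if rng.random() < switch_prob:
            level = low if level == high else high
        u[k] = level
    return u

u = random_binary_sequence(700)   # e.g., a 700-point training record
```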
Model Structure
• numbers and types of nodes - input, “hidden”, output
• depends on type of neural network - e.g., Feedforward Neural Network; e.g., Recurrent Neural Network
• types of neuron functions - threshold behaviour - e.g., sigmoid function, ordinary differential equation
“Learning” (Training)
• estimation of network parameters - weights, thresholds and bias terms
• nonlinear optimization problem
• objective function - typically sum of squares of output prediction error
• optimization algorithm - gradient-based method or variation
Validation
• use estimated NN model to predict outputs for new data set
• if prediction unacceptable, “re-train” NN model with modifications - e.g., number of neurons
• diagnostics - sum of squares of prediction error; R² - coefficient of determination
Feedforward Neural Networks
• signals flow forward from input through hidden nodes to output - no internal feedback
• input nodes - receive external inputs (e.g., controls) and scale to [0,1] range
• hidden nodes - collect weighted sums of inputs from other nodes and act on the sum with a nonlinear function
Feedforward Neural Networks (FNN)
• output nodes - similar to hidden nodes BUT they produce signals leaving the network (outputs)
• FNN has one input layer, one output layer, and can have many hidden layers
FNN - Neuron Model
• state of the i-th neuron in layer l+1:

$$y_i^{l+1} = f\left( \sum_{j=1}^{N_l} w_{ij}^{l+1}\, y_j^l \;-\; \theta_i^{l+1} \right)$$

where $y_j^l$ is the state of neuron $j$ in layer $l$, $w_{ij}^{l+1}$ is a weight, $\theta_i^{l+1}$ is the threshold value, and $f$ is the activation function.
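A minimal sketch of the neuron update above (my construction; the function and variable names are assumptions), using the sigmoidal activation defined on a later slide:

```python
import numpy as np

def sigmoid(x):
    """Sigmoidal activation f(x) = 1 / (1 + exp(-x))."""
    return 1.0 / (1.0 + np.exp(-x))

def layer_forward(y_prev, W, theta):
    """State of the neurons in layer l+1:
    y[i] = f( sum_j W[i, j] * y_prev[j] - theta[i] )"""
    return sigmoid(W @ y_prev - theta)

# Example: 3 scaled inputs in [0, 1] feeding a layer of 2 neurons
y0 = np.array([0.2, 0.7, 0.5])
W1 = 0.1 * np.random.default_rng(0).standard_normal((2, 3))  # initial weights
theta1 = np.zeros(2)                                         # threshold values
y1 = layer_forward(y0, W1, theta1)
```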
FNN parameters
• weights - $w_{ij}^{l+1}$ is the weight on the output from the $j$-th neuron in layer $l$ entering neuron $i$ in layer $l+1$
• threshold - determines value of function when inputs to neuron are zero
• bias - provision for additional constants to be added
FNN Activation Function
• typically sigmoidal function

$$f(x) = \frac{1}{1 + e^{-x}}$$
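One property worth noting (a standard fact, added here because the training slides rely on it): the sigmoid's derivative is expressible in terms of its own value, which makes the gradient computations in backpropagation inexpensive:

$$f'(x) = \frac{e^{-x}}{\left(1 + e^{-x}\right)^2} = f(x)\,\big(1 - f(x)\big)$$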
FNN Structure
[network diagram: input layer → hidden layer → output layer]
Mathematical Basis
• approximation of functions - e.g., Cybenko, 1989 - J. of Mathematics of Control, Signals and Systems
• sigmoidal networks can approximate to arbitrary degree given a sufficiently large number of nodes
Training FNN’s
• calculate sum of squares of output prediction error

$$E = \sum_j \left( y_j - \hat{y}_j \right)^2$$

• take current iterates of parameters, calculate forward and evaluate $E$
• update estimates of weights working backwards - “backpropagation”
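A minimal sketch of this training loop for a single hidden layer with a linear output node (my construction; the step size, epoch count, and initialization are illustrative assumptions, and a real implementation would add a stopping criterion):

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def train_fnn(X, y, n_hidden=6, rate=0.05, n_epochs=500, seed=0):
    """Gradient-descent training of a one-hidden-layer FNN with a
    linear output node, minimizing E = sum_j (y_j - yhat_j)**2.
    Weight corrections are computed backwards from the output error
    ("backpropagation")."""
    rng = np.random.default_rng(seed)
    W1 = 0.1 * rng.standard_normal((n_hidden, X.shape[1]))
    th1 = np.zeros(n_hidden)                  # hidden thresholds
    W2 = 0.1 * rng.standard_normal(n_hidden)
    th2 = 0.0                                 # output threshold
    for _ in range(n_epochs):
        for x_k, y_k in zip(X, y):
            h = sigmoid(W1 @ x_k - th1)       # forward pass, hidden layer
            e = (W2 @ h - th2) - y_k          # output prediction error
            delta = 2 * e * W2 * h * (1 - h)  # error backpropagated via f' = f(1-f)
            W2 = W2 - rate * 2 * e * h        # output-layer correction
            th2 = th2 + rate * 2 * e
            W1 = W1 - rate * np.outer(delta, x_k)
            th1 = th1 + rate * delta
    return W1, th1, W2, th2
```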
Estimation
• typically using a gradient-based optimization method
• make adjustments proportional to

$$\frac{\partial E}{\partial w_{ij}^{l+1}}$$

• issues - highly over-parameterized models - potential for singularity
• e.g., Levenberg-Marquardt algorithm
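The slide names Levenberg-Marquardt; one way to apply it (my sketch, not the course's code) is to pack all network parameters into a single vector and hand the per-sample residuals to scipy.optimize.least_squares, which provides an LM method. The data and network size below are placeholders:

```python
import numpy as np
from scipy.optimize import least_squares

def fnn_residuals(p, X, y, n_hidden):
    """Per-sample residuals y_k - yhat_k for a one-hidden-layer FNN
    with all parameters packed into the vector p."""
    n_in = X.shape[1]
    i = 0
    W1 = p[i:i + n_hidden * n_in].reshape(n_hidden, n_in); i += n_hidden * n_in
    th1 = p[i:i + n_hidden]; i += n_hidden
    W2 = p[i:i + n_hidden]; i += n_hidden
    th2 = p[i]
    H = 1.0 / (1.0 + np.exp(-(X @ W1.T - th1)))   # hidden-layer states
    return y - (H @ W2 - th2)

# Hypothetical data record, just to make the sketch runnable
rng = np.random.default_rng(0)
X = rng.uniform(0, 1, (200, 3))
y = np.sin(X @ np.array([2.0, -1.0, 0.5]))
n_hidden = 6
p0 = 0.1 * rng.standard_normal(n_hidden * X.shape[1] + 2 * n_hidden + 1)
fit = least_squares(fnn_residuals, p0, args=(X, y, n_hidden), method="lm")
```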
How to use FNN for modeling dynamic behaviour?
• structure of FNN suggests a static model
• model dynamic behaviour as a nonlinear difference equation
• essentially a NARMAX model
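A sketch of setting up that nonlinear difference equation as a one-step-ahead prediction problem (my helper; the lag orders n_a, n_b are assumptions, chosen to match the first-order example on the following slides):

```python
import numpy as np

def lagged_regressors(u, y, n_a=1, n_b=2):
    """Build (X, target) pairs for a one-step-ahead predictor:
    target y[k+1] from [y[k], ..., y[k-n_a+1], u[k], ..., u[k-n_b+1]].
    With n_a=1, n_b=2 this reproduces the (y_k, u_k, u_{k-1}) inputs
    of the first-order example below."""
    start = max(n_a, n_b) - 1
    rows, targets = [], []
    for k in range(start, len(u) - 1):
        rows.append(np.r_[y[k - n_a + 1:k + 1][::-1],   # y[k], y[k-1], ...
                          u[k - n_b + 1:k + 1][::-1]])  # u[k], u[k-1], ...
        targets.append(y[k + 1])
    return np.array(rows), np.array(targets)
```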
Linear discrete time transfer function
• transfer function

$$y_k = \frac{b z^{-1}}{1 - a z^{-1}}\, u_k$$

• equivalent difference equation

$$y_{k+1} = a\, y_k + b\, u_k$$
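A quick simulation of this difference equation (the parameter values a = 0.8, b = 0.2 are illustrative, not from the slides), useful for generating training records for the FNN example that follows:

```python
import numpy as np

def simulate_first_order(u, a=0.8, b=0.2, y0=0.0):
    """Simulate y[k+1] = a*y[k] + b*u[k] over an input record u."""
    y = np.empty(len(u) + 1)
    y[0] = y0
    for k in range(len(u)):
        y[k + 1] = a * y[k] + b * u[k]
    return y
```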
FNN Structure - 1st order linear example
[network diagram: input layer (u_k, u_{k-1}, y_k) → hidden layer → output layer (y_{k+1})]
FNN model for 1st order linear example
• essentially modelling an algebraic relationship between past and present inputs and outputs
• nonlinear activation function not required
• weights required - correspond to coefficients in discrete transfer function
Applications of FNN’s
• process modeling - bioreactors, pulp and paper, ...
• nonlinear control
• data reconciliation
• fault detection
• some industrial applications - many academic (simulation) studies
“Typical dimensions”
• Dayal et al., 1994 - 3-state jacketed CSTR as a basis
• 700 data points in training set
• 6 inputs, 1 hidden layer with 6 nodes, 1 output
Advantages of Neural Net Models
• limited process knowledge required - but be careful (e.g., Dayal et al. paper)
• flexible - can model difficult relationships directly (e.g., inverse of a nonlinear control problem)
Disadvantages
• potential for large computational requirements - implications for real-time application
• highly over-parameterized
• limited insight into process structure
• amount of data required
• limited to range of data collection
Recurrent Neural Networks
• neurons contain differential equation model - 1st order linear + nonlinearity
• contain feedback and feedforward components
• can represent continuous dynamics
• e.g., You and Nikolaou, 1993
Nonlinear Empirical Model Representations
• Volterra Series (continuous and discrete)
• Nonlinear Auto-Regressive Moving Average with Exogenous Inputs (NARMAX)
• Cascade Models
Volterra Series Models
• higher-order convolution models
• continuous time:

$$y(t) = \int_0^\infty h_1(\tau_1)\,u(t-\tau_1)\,d\tau_1 \;+\; \int_0^\infty\!\!\int_0^\infty h_2(\tau_1,\tau_2)\,u(t-\tau_1)\,u(t-\tau_2)\,d\tau_1\,d\tau_2 \;+\; \int_0^\infty\!\!\int_0^\infty\!\!\int_0^\infty h_3(\tau_1,\tau_2,\tau_3)\,u(t-\tau_1)\,u(t-\tau_2)\,u(t-\tau_3)\,d\tau_1\,d\tau_2\,d\tau_3 \;+\; \cdots$$
Volterra Series Model
• discrete time:

$$y_k = \sum_{j_1} h_1(j_1)\,u_{k-j_1} \;+\; \sum_{j_1}\sum_{j_2} h_2(j_1,j_2)\,u_{k-j_1}\,u_{k-j_2} \;+\; \sum_{j_1}\sum_{j_2}\sum_{j_3} h_3(j_1,j_2,j_3)\,u_{k-j_1}\,u_{k-j_2}\,u_{k-j_3} \;+\; \cdots$$
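A sketch of evaluating a second-order truncation of this model with finite memory M (my helper; the kernel values would come from estimation, which the next slide discusses):

```python
import numpy as np

def volterra2_output(u, h1, h2):
    """Second-order discrete Volterra model with memory M = len(h1):
    y[k] = sum_j1 h1[j1] u[k-j1]
         + sum_j1 sum_j2 h2[j1, j2] u[k-j1] u[k-j2].
    Causality: the sums look backwards from the current sample."""
    M = len(h1)
    y = np.zeros(len(u))
    for k in range(M - 1, len(u)):
        w = u[k - M + 1:k + 1][::-1]   # u[k], u[k-1], ..., u[k-M+1]
        y[k] = h1 @ w + w @ h2 @ w     # first- plus second-order terms
    return y
```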
Volterra Series models...
• can be estimated directly from data or derived from state space models
• causality - limits of sum or integration
• functions $h_i$ - referred to as the $i$-th order kernel
• applications - typically second-order (e.g., Pearson et al., 1994 - binder)
NARMAX models
• nonlinear difference equation models
• typical form:

$$y_{k+1} = f(y_k, y_{k-1}, \ldots, u_k, u_{k-1}, \ldots)$$

• dependence on lagged y’s - autoregressive
• dependence on lagged u’s - exogenous inputs (strictly, the moving average part of NARMAX refers to dependence on lagged noise terms)
NARMAX examples
• with products, cross-products (see the least-squares sketch below), e.g.:

$$y_{k+1} = a\,y_k + b\,u_k + c\,y_k u_k + d\,u_k^3$$

• 2nd order Volterra model:

$$y_{k+1} = \sum_{j_1} h_1(j_1)\,u_{k-j_1} + \sum_{j_1}\sum_{j_2} h_2(j_1,j_2)\,u_{k-j_1}\,u_{k-j_2}$$

– as NARMAX model in u only, with second order terms
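Because a polynomial NARMAX model such as the first example is linear in its coefficients, it can be estimated by ordinary least squares; a minimal sketch (my construction), assuming regressors matching that example form:

```python
import numpy as np

def fit_polynomial_narmax(u, y):
    """Least-squares fit of y[k+1] = a*y[k] + b*u[k]
                                   + c*y[k]*u[k] + d*u[k]**3.
    Each regressor is linear in its coefficient, so numpy's lstsq
    solves the estimation problem directly."""
    Y = y[1:]                                  # targets y[k+1]
    Phi = np.column_stack([y[:-1],             # y[k]
                           u[:-1],             # u[k]
                           y[:-1] * u[:-1],    # cross-product y[k]*u[k]
                           u[:-1] ** 3])       # cubic input term
    coeffs, *_ = np.linalg.lstsq(Phi, Y, rcond=None)
    return coeffs                              # (a, b, c, d)
```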
Nonlinear Cascade Models
• made from serial and parallel arrangements of linear dynamic and static nonlinear elements
• e.g., 1st order linear dynamic element fed into a “squaring” element (see the sketch below)
– obtain products of lagged inputs
– cf. second order Volterra term
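A sketch of the example cascade: a first-order linear dynamic element feeding a static squaring element (the parameter values are illustrative). Expanding the filter state as a weighted sum of past inputs and squaring it shows the products of lagged inputs noted above:

```python
import numpy as np

def wiener_cascade(u, a=0.8, b=0.2):
    """First-order linear dynamic element x[k+1] = a*x[k] + b*u[k],
    fed into a static squaring element y[k] = x[k]**2.
    Since x[k] = b * sum_j a**j * u[k-1-j], squaring yields products
    of lagged inputs - cf. a second-order Volterra term."""
    x = 0.0
    y = np.empty(len(u))
    for k, uk in enumerate(u):
        y[k] = x ** 2          # static nonlinearity acting on the state
        x = a * x + b * uk     # linear dynamic element
    return y
```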