Representation of Processes


Nonlinear Empirical Models
CHEE825, Fall 2005 - J. McLellan

Neural Network Models of Process Behaviour
• generally modeling input-output behaviour
• empirical models - no attempt to model physical structure
• estimated from plant data

Neural Networks...
• structure motivated by physiological structure of brain
• individual nodes or cells - “neurons” - sometimes called “perceptrons”
• neuron characteristics - notion of “firing” or threshold behaviour

Stages of Neural Network Model Development
• data collection - training set, validation set
• specification / initialization - structure of network, initial values
• “learning” or training - estimation of parameters
• validation - ability to predict new data set collected under same conditions
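A minimal end-to-end sketch of these stages in Python, assuming NumPy and scikit-learn are available; the simulated first-order process, the two-level random input, and the network size are illustrative assumptions, not values from the slides:

```python
import numpy as np
from sklearn.neural_network import MLPRegressor
from sklearn.metrics import r2_score

rng = np.random.default_rng(0)

# --- data collection: excite a (simulated) first-order process with a random input ---
def simulate(u, a=0.8, b=0.5, noise=0.01):
    y = np.zeros(len(u))
    for k in range(len(u) - 1):
        y[k + 1] = a * y[k] + b * u[k] + noise * rng.standard_normal()
    return y

u_train = rng.choice([-1.0, 1.0], size=700)   # two-level random input sequence
u_val = rng.choice([-1.0, 1.0], size=300)     # separate validation experiment
y_train, y_val = simulate(u_train), simulate(u_val)

# regressors: lagged inputs and outputs -> one-step-ahead prediction target
def regressors(u, y):
    X = np.column_stack([y[1:-1], u[1:-1], u[:-2]])   # y_k, u_k, u_{k-1}
    t = y[2:]                                         # target y_{k+1}
    return X, t

X_train, t_train = regressors(u_train, y_train)
X_val, t_val = regressors(u_val, y_val)

# --- specification / initialization and training ---
net = MLPRegressor(hidden_layer_sizes=(6,), activation="logistic",
                   max_iter=5000, random_state=0)
net.fit(X_train, t_train)

# --- validation: predict the fresh data set ---
print("R^2 on validation data:", r2_score(t_val, net.predict(X_val)))
```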

Data Collection
• expected range and point of operation
• size of input perturbation signal
• type of input perturbation signal - random input sequence? number of levels (two or more?)
• validation data set

Model Structure
• numbers and types of nodes - input, “hidden”, output
• depends on type of neural network - e.g., Feedforward Neural Network, Recurrent Neural Network
• types of neuron functions - threshold behaviour, e.g., sigmoid function, ordinary differential equation

“Learning” (Training)
• estimation of network parameters - weights, thresholds and bias terms
• nonlinear optimization problem
• objective function - typically sum of squares of output prediction error
• optimization algorithm - gradient-based method or variation

Validation
• use estimated NN model to predict outputs for new data set
• if prediction unacceptable, “re-train” NN model with modifications - e.g., number of neurons
• diagnostics - sum of squares of prediction error, R² (coefficient of determination)
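For reference, both diagnostics can be computed directly; a minimal sketch with NumPy, where `y_meas` and `y_pred` are illustrative measured and predicted validation outputs:

```python
import numpy as np

y_meas = np.array([0.0, 0.4, 0.7, 0.9, 1.0])    # illustrative validation data
y_pred = np.array([0.1, 0.35, 0.75, 0.85, 1.05])

sse = np.sum((y_meas - y_pred) ** 2)                       # sum of squares of prediction error
r2 = 1.0 - sse / np.sum((y_meas - y_meas.mean()) ** 2)     # coefficient of determination
print(f"SSE = {sse:.4f}, R^2 = {r2:.3f}")
```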

Feedforward Neural Networks
• signals flow forward from input through hidden nodes to output - no internal feedback
• input nodes - receive external inputs (e.g., controls) and scale to [0,1] range
• hidden nodes - collect weighted sums of inputs from other nodes and act on the sum with a nonlinear function

Feedforward Neural Networks (FNN)
• output nodes - similar to hidden nodes BUT they produce signals leaving the network (outputs)
• FNN has one input layer, one output layer, and can have many hidden layers

FNN - Neuron Model
• i-th neuron in layer l+1:

$$y_i^{l+1} = f\!\left( \sum_{j=1}^{N_l} w_{ij}^{l+1}\, y_j^{l} \;-\; \theta_i^{l+1} \right)$$

where $y_i^{l+1}$ is the state of the neuron, $w_{ij}^{l+1}$ a weight, $\theta_i^{l+1}$ the threshold value, and $f$ the activation function.
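A minimal sketch of this neuron computation in Python; the incoming signals, weights, and threshold value are arbitrary illustrative numbers:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

# outputs y_j^l of the N_l neurons in layer l feeding neuron i in layer l+1
y_layer_l = np.array([0.2, 0.7, 0.5])
w_i = np.array([0.4, -1.1, 0.8])   # weights w_ij^{l+1}
theta_i = 0.1                      # threshold value

# state of neuron i in layer l+1: weighted sum, offset by the threshold,
# passed through the activation function
y_i = sigmoid(np.dot(w_i, y_layer_l) - theta_i)
print(y_i)
```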

FNN parameters
• weights - $w_{ij}^{l+1}$ is the weight on the output from the j-th neuron in layer l entering neuron i in layer l+1
• threshold - determines value of function when inputs to neuron are zero
• bias - provision for additional constants to be added

FNN Activation Function
• typically a sigmoidal function:

$$f(x) = \frac{1}{1 + e^{-x}}$$

FNN Structure
[diagram: input layer, hidden layer, output layer]

Mathematical Basis
• approximation of functions
• e.g., Cybenko, 1989 - J. of Mathematics of Control, Signals and Systems
• approximation to arbitrary degree given a sufficiently large number of sigmoidal nodes

Training FNN’s
• calculate sum of squares of output prediction error:

$$E = \sum_j \left( y_j - \hat{y}_j \right)^2$$

• take current iterates of parameters, calculate forward through the network and evaluate E
• update estimates of weights working backwards - “backpropagation”
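A minimal backpropagation sketch for a single-hidden-layer FNN with a sigmoidal activation, written with NumPy; the network size, training data, and step size are illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(1)
sigmoid = lambda x: 1.0 / (1.0 + np.exp(-x))

# illustrative training data: 2 inputs -> 1 output
X = rng.uniform(-1, 1, size=(50, 2))
y = 0.5 * X[:, 0] - 0.3 * X[:, 1] ** 2

# initialize weights and offsets (input->hidden, hidden->output)
W1 = rng.normal(scale=0.5, size=(2, 6))
b1 = np.zeros(6)
W2 = rng.normal(scale=0.5, size=(6,))
b2 = 0.0
eta = 0.05                                    # step size

for epoch in range(2000):
    # forward pass
    h = sigmoid(X @ W1 + b1)                  # hidden-layer states
    y_hat = h @ W2 + b2                       # linear output node
    err = y_hat - y
    E = np.sum(err ** 2)                      # sum of squares of prediction error

    # backward pass: gradients of E, propagated from output layer to input layer
    dW2 = 2 * h.T @ err
    db2 = 2 * np.sum(err)
    dh = 2 * np.outer(err, W2) * h * (1 - h)  # chain rule through the sigmoid
    dW1 = X.T @ dh
    db1 = dh.sum(axis=0)

    # update the weight estimates, working backwards from the output layer
    W2 -= eta * dW2 / len(X); b2 -= eta * db2 / len(X)
    W1 -= eta * dW1 / len(X); b1 -= eta * db1 / len(X)

print("final sum-of-squares error:", E)
```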

Estimation
• typically using a gradient-based optimization method
• make adjustments proportional to $\partial E / \partial w_{ij}^{l+1}$
• issues - highly over-parameterized models, potential for singularity
• e.g., Levenberg-Marquardt algorithm
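Because the objective is a sum of squared residuals, the same single-hidden-layer network can be trained with a Levenberg-Marquardt style solver. A minimal sketch using `scipy.optimize.least_squares`; the data, network size, and parameter packing scheme are illustrative assumptions:

```python
import numpy as np
from scipy.optimize import least_squares

rng = np.random.default_rng(2)
sigmoid = lambda x: 1.0 / (1.0 + np.exp(-x))

# illustrative data: 1 input -> 1 output
X = np.linspace(-2, 2, 100).reshape(-1, 1)
y = np.tanh(1.5 * X[:, 0])

n_in, n_hid = 1, 4

def unpack(p):
    """Split the flat parameter vector into layer weights and offsets."""
    i = 0
    W1 = p[i:i + n_in * n_hid].reshape(n_in, n_hid); i += n_in * n_hid
    b1 = p[i:i + n_hid]; i += n_hid
    W2 = p[i:i + n_hid]; i += n_hid
    b2 = p[i]
    return W1, b1, W2, b2

def residuals(p):
    W1, b1, W2, b2 = unpack(p)
    y_hat = sigmoid(X @ W1 + b1) @ W2 + b2
    return y_hat - y                     # least_squares minimizes the sum of their squares

p0 = 0.1 * rng.standard_normal(n_in * n_hid + n_hid + n_hid + 1)
sol = least_squares(residuals, p0, method="lm")     # Levenberg-Marquardt
print("sum of squares at solution:", 2 * sol.cost)  # sol.cost = 0.5 * sum of squared residuals
```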

How to use FNN for modeling dynamic behaviour?
• structure of FNN suggests a static model
• model dynamic behaviour as a nonlinear difference equation
• essentially a NARMAX model

Linear discrete time transfer function
• transfer function:

$$y_{k+1} = \frac{b z^{-1}}{1 - a z^{-1}}\, u_{k+1}$$

• equivalent difference equation:

$$y_{k+1} = a\, y_k + b\, u_k$$
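A minimal sketch of this equivalence in Python: simulate the difference equation directly for a given input sequence (the parameter values and step input are illustrative):

```python
import numpy as np

a, b = 0.8, 0.5          # illustrative first-order parameters
u = np.ones(30)          # unit step input
y = np.zeros(len(u) + 1)

# y_{k+1} = a*y_k + b*u_k, k = 0, 1, ...
for k in range(len(u)):
    y[k + 1] = a * y[k] + b * u[k]

print(y[-1], "-> steady-state gain b/(1-a) =", b / (1 - a))
```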

FNN Structure - 1st order linear example
[diagram: input layer ($y_k$, $u_k$, $u_{k-1}$), hidden layer, output layer ($y_{k+1}$)]

FNN model for 1st order linear example
• essentially modelling an algebraic relationship between past and present inputs and outputs
• nonlinear activation function not required
• weights required - correspond to coefficients in discrete transfer function

Applications of FNN’s
• process modeling - bioreactors, pulp and paper
• nonlinear control
• data reconciliation
• fault detection
• some industrial applications - many academic (simulation) studies

“Typical dimensions”
• Dayal et al., 1994 - 3-state jacketted CSTR as a basis
• 700 data points in training set
• 6 inputs, 1 hidden layer with 6 nodes, 1 output

Advantages of Neural Net Models
• limited process knowledge required - but be careful (e.g., Dayal et al. paper)
• flexible - can model difficult relationships directly (e.g., inverse of a nonlinear control problem)

Disadvantages
• potential for large computational requirements - implications for real-time application
• highly over-parameterized
• limited insight into process structure
• amount of data required
• limited to range of data collection

Recurrent Neural Networks
• neurons contain differential equation model - 1st order linear + nonlinearity
• contain feedback and feedforward components
• can represent continuous dynamics
• e.g., You and Nikolaou, 1993

Nonlinear Empirical Model Representations
• Volterra Series (continuous and discrete)
• Nonlinear Auto-Regressive Moving Average with Exogenous Inputs (NARMAX)
• Cascade Models

Volterra Series Models
• higher-order convolution models
• continuous time:

$$y(t) = \int_0^{\infty} h_1(\tau_1)\, u(t-\tau_1)\, d\tau_1 \;+\; \int_0^{\infty}\!\!\int_0^{\infty} h_2(\tau_1, \tau_2)\, u(t-\tau_1)\, u(t-\tau_2)\, d\tau_1\, d\tau_2 \;+\; \int_0^{\infty}\!\!\int_0^{\infty}\!\!\int_0^{\infty} h_3(\tau_1, \tau_2, \tau_3)\, u(t-\tau_1)\, u(t-\tau_2)\, u(t-\tau_3)\, d\tau_1\, d\tau_2\, d\tau_3 \;+\; \cdots$$

Volterra Series Model
• discrete time:

$$y_k = \sum_{j_1=1}^{\infty} h_1(j_1)\, u_{k-j_1} \;+\; \sum_{j_1=1}^{\infty} \sum_{j_2=1}^{\infty} h_2(j_1, j_2)\, u_{k-j_1}\, u_{k-j_2} \;+\; \sum_{j_1=1}^{\infty} \sum_{j_2=1}^{\infty} \sum_{j_3=1}^{\infty} h_3(j_1, j_2, j_3)\, u_{k-j_1}\, u_{k-j_2}\, u_{k-j_3} \;+\; \cdots$$
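A minimal sketch of a truncated (finite-memory, second-order) discrete Volterra model in Python; the kernels h1, h2, the memory length, and the input sequence are illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(3)
M = 5                                   # kernel memory (number of lags retained)
h1 = 0.5 * 0.6 ** np.arange(1, M + 1)   # first-order kernel h1(j)
h2 = 0.1 * rng.standard_normal((M, M))  # second-order kernel h2(j1, j2)

u = rng.standard_normal(50)

def volterra2(u, h1, h2):
    y = np.zeros(len(u))
    for k in range(M, len(u)):
        lags = u[k - M:k][::-1]                  # u_{k-1}, u_{k-2}, ..., u_{k-M}
        y[k] = h1 @ lags + lags @ h2 @ lags      # first- plus second-order contributions
    return y

print(volterra2(u, h1, h2)[:10])
```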

Volterra Series models...
• can be estimated directly from data or derived from state space models
• causality - limits of sum or integration
• functions $h_i$ - referred to as the i-th order kernel
• applications - typically second-order (e.g., Pearson et al., 1994 - binder)

NARMAX models
• nonlinear difference equation models
• typical form:

$$y_{k+1} = f\!\left( y_k, y_{k-1}, \ldots, u_k, u_{k-1}, \ldots \right)$$

• dependence on lagged y’s - autoregressive
• dependence on lagged u’s - moving average
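A minimal sketch of simulating a NARMAX-type difference equation in Python; the particular nonlinear function f and the input sequence are illustrative choices, not taken from the slides:

```python
import numpy as np

rng = np.random.default_rng(4)

# illustrative nonlinear difference equation:
# y_{k+1} = f(y_k, y_{k-1}, u_k, u_{k-1})
def f(yk, ykm1, uk, ukm1):
    return 0.6 * yk - 0.1 * ykm1 + 0.4 * uk + 0.2 * uk * yk - 0.05 * ukm1 ** 2

u = rng.choice([-1.0, 1.0], size=40)
y = np.zeros(len(u) + 1)

for k in range(1, len(u)):
    y[k + 1] = f(y[k], y[k - 1], u[k], u[k - 1])

print(y[:10])
```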

NARMAX examples
• with products and cross-products of lagged inputs and outputs (e.g., terms such as $y_k u_k$ or $u_k^2$)
• 2nd order Volterra model - can be written as a NARMAX model in u only, with second order terms

Nonlinear Cascade Models
• made from serial and parallel arrangements of linear dynamic and static nonlinear elements
• e.g., 1st order linear dynamic element fed into a “squaring” element
  – obtain products of lagged inputs
  – cf. second order Volterra term
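A minimal sketch of this cascade example in Python: a first-order linear dynamic element followed by a static squaring element; expanding the square shows the products of lagged inputs that a second-order Volterra kernel would capture (parameter values and input are illustrative):

```python
import numpy as np

rng = np.random.default_rng(5)
u = rng.standard_normal(100)

# linear dynamic element: x_{k+1} = a*x_k + b*u_k  (first-order filter)
a, b = 0.7, 0.3
x = np.zeros(len(u) + 1)
for k in range(len(u)):
    x[k + 1] = a * x[k] + b * u[k]

# static nonlinear element: squaring
y = x ** 2

# since x_k = b * sum_j a**(j-1) * u_{k-j}, squaring gives
# y_k = b**2 * sum_{j1,j2} a**(j1-1) * a**(j2-1) * u_{k-j1} * u_{k-j2},
# i.e., products of lagged inputs -- a second-order Volterra-type term
print(y[:10])
```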