No Slide Title

Transcript No Slide Title

Neuro-Fyzzy Methods
for
Modeling and Identification
Presented by:
Ali Maleki
Presentation Agenda
Introduction
Fuzzy Systems
Artificial Neural Networks
Neuro-Fuzzy Modeling
Simulation Examples
Introduction
Control Systems
Competion
Environment requirements
Energy and material costs
Demand for robust, fault-tolerant systems
Extra needs for Effective process modeling techniques
Conventional modeling?
lack precise, formal knowledg about the system
Strongly nonlinear behavior,
High degree of uncertainty,
Time varying characteristics
Introduction
Solution: Neuro-fuzzy modeling
A powerful tool which can facilitate the effective
development of models by combining information from
different source:
Empirical models
Heuristics
Data
Neuro-fuzzy models
Describe systems by means of fuzzy if-then rules
Represented in a network structure
Apply algorithms from the area of Neural Networks
Introduction
Neuro-fuzzy modeling
- neural networks
- fuzzy systems
Both are motivated by imitating human reasoning process
Relationships:
In neural networks :
implicitly, coded in the network and its parameters
In fuzzy systems:
explicitly, in the form of if–then rules
Neuro–fuzzy systems combine the semantic transparency of rule-based
fuzzy systems with the learning capability of neural networks
Nonlinear system identification
NARX (nonlinear autoregressive with exogenous input) model
Regressor vector:
Dynamic order of the system:
Represented by the number of lags nu and ny
Task of nonlinear system identification:
Infer unknown function f from available data sequences
Nonlinear system identification
Multivariable systems:
Nonlinear state-space discription
Task of nonlinear system identification:
Infer unknown functions g and h from available data sequences
Neural Networks
Neuro-fuzzy systems
Splines
Interpolated look-up tables
Accurate Predictor
Accurate predictor +
model that can be used to learn something about system +
Analyse system properties
Fuzzy Models
Fuzzy Models
Definition:
A mathematical model which in some way uses fuzzy sets
In system identification:
rule-based fuzzy models
Example:
IF heating is high THEN temperature increase is fast
Linguistic terms
To make such a model operational: Linguistic terms must be
defined more precisely using fuzzy sets
Fuzzy sets defined through their membership functions
Fuzzy Models
Types of fuzzy models:
(depending on the structure of if-then rules)
Mamdani Model
IF D1 is low and D2 is high THEN D is medium
Takagi-Sugeno Model
IF D1 is low and D2 is high THEN D=k
(zero-
IF D1 is low and D2 is high THEN D=0.7D1+0.2D2+0.1
(first-
order)
order)
Mamdani Model
Linguistic terms
Number of rules
Linguistic fuzzy model is useful for representing qualitative
knowledge
Example – Mamdani Model
Oxygen flow rate
Gas burner
Heating power
Linguistic terms is defined by membership function
Example – Mamdani Model
Oxygen flow rate
Gas burner
Heating power
Membership functions can be defined
by the model developer based on prior knowledge
or
(automatically) by using data (constructed - adjusted)
Takagi-Sugeno Model
Mamdani model is typically used in knowledge-based (expert)
systems
In data driven identification, Takagi_Sageno model has becom
popular
Consequent parameter vector
Scalar offset
Number of rules
Takagi-Sugeno Model
The output y is computed by taking the weighted average of the
individual rules contribution:
Degree of fulfilment of the ith rule
In a special case:
Takagi-Sugeno Model (zero-order)
Rules:
Input-output equation:
This model is a special case of the Mamdani system, in which
Consequent fuzzy sets degenerate to singletons (real numbers)
Takagi-Sugeno Model
usually,
antecedent fuzzy sets are usually defined to describe distinct,
partly overlapping regions in the input space
Then,
The parameters ai are approximate local linear models of the
considered nonlinear system
TS model:
Piece-wise linear approximation of a nonlinear function
Example: Takagi-Sugeno Model
static characteristic of an actuator with
dead zone
and a
non-symmetrical response for positive and negative inputs
Fuzzy Logic Operators
In fuzzy systems with multiple inputs, the antecedent
proposition is usually represented as a combination of terms,
by using logic operators
‘and’ (conjunction)
‘or’ (disjunction) and
‘not’ (complement)
In fuzzy set theory,
several families of operators have been introduced for these
logical connectives
Example: Fuzzy Logic Operators
conjunctive form of the antecedent
degree of fulfillment:
minimum conjunction operator
product conjunction operator
Dynamic Fuzzy Models
TS NARX model
Regressor vector
Dynamic Fuzzy Models
TS state-space model
Advantages of the state-space modeling approach:
• structure of the model can easily be related to the physical
structure of the real system (model parameters are physically
relevant)
• This is not necessarily the case with input-output models.
• Dimension of the regression problem in state-space modeling
is often smaller than with input–output models
Artificial Neural Networks
ANNs:
• Inspired by the functionality of biological neural networks
• Can generalizing from a limited amount of training data
• black-box models of nonlinear, multivariable static and
dynamic systems
• can be trained by using input–output data
ANNs
•
•
•
consist of:
Neurons
Interconnection among them
Weights assigned to these interconnections
Multi-Layer Neural Network
•
•
•
One input Layer
One output layer
A number of hidden layers
Activation function:
•
Tangent hyperbolic
•
•
Threshold function
Sigmoidal function
Linear neurons
Multi-Layer Neural Network - Training
Training definition:
• adaptation of weights in a multi-layer network such that
the error between the desired output and the network
output is minimized
Training steps:
• Feedforward computation
• Weight adaptation
Gradient-descent optimization
Error backpropagation
Multi-Layer Neural Network - Structure
A network with one hidden layer is sufficient for most
approximation tasks
More layers:
• Can give a better fit
• But the training takes longer
number of neurons in the hidden layer:
• Too few neurons give a poor fit
• Too many neurons result in overtraining of the net
(poor generalization to unseen data)
Dynamic Neural Networks
static feedforward network combined with an external
feedback connection
First-order NARX model
Dynamic Neural Networks
Recurrent Networks
Feedback:
•
•
•
Internally in the neurons (Elman network)
Internally to other neurons in the same layer
Internally to neurons in preceding layer (Hopfield Network)
Hopfield Network
Elman Network:
Error Backpropagation
Input:
Desired output:
Error:
Cost function:
Adjusting the weights:
minimization of the cost function
Error Backpropagation
network’s output y is nonlinear in the weights
Therefore,
The training of a MNN is thus a nonlinear optimization problem
Methods:
• Error backpropagation (first-order gradient)
• Newton, Levenberg-Marquardt methods (second-order
gradient)
• Genetic algorithms and many others techniques
Error Backpropagation
First-order gradient
Update rule:
Weight vector in iteration n
Learning rate
Jacobian of the network
nonlinear optimization problem is thus solved by using the first
term of its Taylor series expansion
Error Backpropagation
Second-order gradient:
Second-order gradient method make use of the second term
Hessian
Update rule:
Error Backpropagation
Difference between first-order and Second-order gradient
methods:
Size and direction of gradient-decent step
Second order methods are usually more effective than first-order ones
Error Backpropagation
For output layer
Update law for output weights:
For hidden layer
Update law for hidden layer weights:
Backpropagation error
Radial Basis Function Network
RBF network is a twolayer network as figure below
Ususal choise for basis function is Gaussian function
Radial Basis Function Network
Adjustable weights are only present in the output layer
Free parameters of RBF nets are
• Output weights
• Parameters of the basis functions (centers and radii)
Output is linear in the weights, and these weights can be
estimated by least-squares methods
adaptation of the RBF parameters (center and radial) is a
nonlinear optimization problem that can be solved by the
gradient-descent techniques
Neuro-Fuzzy Modeling
Fuzzy system as a layered structure (network), similar to ANNs
of the RBF type
gradientdescent training algorithms for parameter optimization
This approach is usually referred to as Neuro-Fuzzy Modeling
•
•
Zero-order TS fuzzy model
First-order TS fuzzy model
Neuro-Fuzzy Modeling
Zero-order TS fuzzy model
Typical membership function
Input-output equation
Neuro-Fuzzy Modeling
First-order TS fuzzy model
Input-output equation
Neuro-Fuzzy Networks - Constructing
•
•
Prior knowledge (can be of a rather approximate nature)
Process data
Integration of knowledge and data:
• Expert knowledge as a collection of if–then rules
•
(Initial model creation – fine tune using process data)
Fuzzy rules are constructed from scratch by using
numerical data
(expert can confront the information stored in the rule base with his
own knowledge)
(can modify the rules)
(supply additional rules to extend the validity of the model)
Comparision the second method with Truly black-box structures
possibility to interpret the obtained results
Neuro-Fuzzy Networks – Structure and parameters
System identification steps:
• Structure identification
• Parameter estimation
•
choice of the model’s structure determines the flexibility of
the model in the approximation of (unknown) systems
•
model with a rich structure can approximate more
complicated functions, but, will have worse generalization
properties
•
Good generalization means that a model fitted to one
data set will also perform well on another data set from
the same process.
Neuro-Fuzzy Networks – Structure and parameters
Structure selection process involves:
•
Selection of input variables
•
Number and type of membership functions, number of rules
(These two structural parameters are mutually related)
Neuro-Fuzzy Networks – Structure and parameters
Selection of input variables
• Physical inputs
• Dynamic regressors (defined by the input and output lags)
Typical sources of information:
• Prior knowledge
• Insight in the process behavior
• Purpose of the modeling exercise
Automatic data-driven selection can then be used to compare
different structures in terms of some specified performance
criteria.
Neuro-Fuzzy Networks – Structure and parameters
Number and type of membership functions, number of rules
•
Determine the level of detail (granularity) of the model
Typical criteria
• Purpose of modeling
• Amount of available information (knowledge and data)
Automated methods can be used to add or remove membership
functions and rules.
Neuro-Fuzzy Networks – Gradient-based learning
Zero-order ANFIS model
Consequent parameters
Jacobian
Update law
Centers and spreads of the Gaussian membership functions
Neuro-Fuzzy Networks – Hybrid Learning
Output-layer parameters in RBF networks can be estimated by
linear least-squares (LS) techniques
LS methods are more effective than the gradient-based update rule
Hybrid methods:
•
•
One-shot least-squares estimation of the consequent parameters
Iterative gradient-based optimization of the membership functions
Choice of LS estimation method:
•
•
In terms of error minimization is not crucial
If consequent parameters are to be interpreted as local models
great care must be taken
PROBLEM: over-parameterization
numerical problems - over-fitting - meaningless parameter estimates
Example – Hybrid Learning
Approximation of second-order polynomial by a first-order
ANFIS model
Membership function for t1<u<t2
Model:
Output of TS model
•
Model has four free parameters, while three are sufficient to fit the
polynomial
Neuro-Fuzzy Networks – Hybrid Learning
To avoid over-parameterization, the basic least-squares criterion
can be combined with additional criteria for local fit, or with
constraints on the parameter values
•
•
•
Local Least-Squares Estimation
Constrained Estimation
Multi-Objective Optimization
Hybrid Learning - Global LS Estimation
Hybrid Learning - Local LS Estimation
While the global solution gives the minimal prediction error, it
may bias the estimates of the consequents as parameters of
local models.
If locally relevant model parameters are required, a weighted LS
approach applied per rule should be used.
•
•
The consequent parameters of the individual rules are estimated
independently (result is not influenced by the interactions of the rules)
Larger prediction error is obtained than with global least squares
Example – Global and Local LS Estimation
Approximation of second-order polynomial by a first-order ANFIS model
Model:
Local LS estimation
Global LS estimation
Hybrid Learning - Constrained Estimation
Knowledge about the dynamic system such as its stability,
minimal or maximal static gain, or its settling time can be
translated into convex constraints on the consequent
parameters
Hybrid Learning - Multi-Objective Optimization
Minimize the weighted sum of the global and local identification
criteria
Initialization of Antecedent Membership Functions
For a successful application of gradient-descent learning to the
membership function parameters, good initialization is
important
Initialization methods:
• Template-Based Membership Functions
• Discrete Search Methods
• Fuzzy Clustering
Initialization - Template-Based Membership Functions
domains of the antecedent variables are a priori partitioned by a
number of membership functions (usually evenly spaced and
shaped)
severe drawback of this approach is that the number of rules in
the model grows exponentially
•
•
•
Complexity of the system’s behavior is typically not uniform.
Some operating regions can be well approximated by a local linear
model, while other regions require a rather fine partitioning.
In order to obtain an efficient representation with as few rules as
possible, the membership functions must be placed such that they
capture the non-uniform behavior of the system.
Initialization - Discrete Search Methods
Iterative tree-search algorithms can be applied to decompose the antecedent
space into hyper-rectangles by axis-orthogonal splits.
Advantage: Its effectiveness for high-dimensional data and the transparency
of the obtained partition.
Drawback: Tree building procedure is sub-optimal (greedy) and hence the
number of rules obtained can be quite large
Initialization – Fuzzy Clutering
Based on the similarity, data vectors are clustered such that the data within a
cluster are as similar as possible, and data from different clusters are as
dissimilar as possible.
The number of clusters in the data can either be determined a priori or
sought automatically by using cluster validity measures and merging
techniques
Simulation Example
1. A simple fitting problem of a univariate static function
•
It demonstrates the typical construction procedure of a neuro-fuzzy
model. Numerical results show that an improvement in performance
is achieved at the expense of obtaining if-then rules that are not
completely relevant as local descriptions of the system.
2. Modeling of a nonlinear dynamic system
•
Illustrates that the performance of a neuro-fuzzy model does not
necessarily improve after training. This is due to overfitting which in
the case of dynamic systems can easily occur when the data only
sparsely cover the domains
Simulation Example - Static Function
ANFIS model with linear consequent function
Number of rules: five rules
Construction of initial model:
Gustafson-Kessel algorithm
Fit of the function with initial model – local models - membership functions
Simulation Example - Static Function
this initial model can easily be interpreted in terms of the local behavior
It is reasonably accurate (RMS= 0.0258)
ANFIS method, 100 learning epochs
anfis function of the MATLAB Fuzzy Logic Toolbox
Fit of the function with fine-tuned model, local models, membership functions
Simulation Example - Static Function
RMS error is about 23 times better than the initial model
Initial model
RMS error = 0.0258
Fine-tuned model
RMS error = 0.0011
Simulation Example - Static Function
after learning, the local models are much further from the true local
description of the function
Initial model
Fine-tuned model
Fine-tuned model are thus less accurate in describing the system locally
Simulation Example – pH Neutralization Process
Influent streams
Acid
Buffer
Base
Acid flowrate = cte
Buffer flowrate = cte
Base stream flowrate
Neutralization
tank
Effluent stream
Neutralization
tank
pH in the tank
Simulation Example - pH Neutralization Process
Identification and validation data sets:
•
•
Simulating the model by Hall and Seborg for random change of the
influent base stream flow rate
N = 499 samples with the sampling time of 15 s.
The process is approximated as a first–order discrete-time NARX model
Simulation Example - pH Neutralization Process
Membership functions
Befor training
After training
Simulation Example - pH Neutralization Process
Rules:
Initial Rules:
Fine Tuned Rules: after 1000 epochs of hybrid learning using the ANFIS
function of the MATLAB Fuzzy Logic Toolbox
Simulation Example - pH Neutralization Process
Overtraining Problem:
•
Comparision of RMS ERROR befor and after training
•
Prediction befor and after training
References
[1] Robert Babuska, “Neuro-Fuzzy Methods for Modeling and Identification”,
Recent Advances in Intelligent Paradigms and Application, SpringerVerilag, 2002
[2] Robert Babuska, “Fuzzy Modeling and Identification Toolbox User’s Guide
- For Use with MATLAB”, 1998.
[3] A. Trabelsi, F. Lafont, M. Kamoun and G. Enea, “Identification of
Nonlinear Systems by Adaptive Fuzzy Takagi-Sugeno Model”,
International Journal of Computational Cognition, Volume 2, Number 3,
Pages 137–153, September 2004.
[4] Jan Jantzen, “Neurofuzzy Modelling”, Technical University of Denmark,
Department of Automation,1998.
THANK YOU VERY MUCH
For your
Attention
Presented by:
Ali Maleki
Titration Curve
Acid flowrate = cte
Buffer flowrate = cte
Base stream flowrate
Neutralization
tank
pH in the tank

No Slide Title

Transcript No Slide Title

Directory