Transcript Slide 1

Model Calibration and Validation
Dr. Dawei HAN
Department of Civil Engineering
University of Bristol, UK
Slide 1 of 55
Q  f ( R, T ,...)
A mathematical model is used to represent the real system
Slide
2 of 55
‘All models are wrong, some are useful.’
George Box (1919-2013)
ARMA model
Princeton University (graduated from University College London)
Slide 3 of 55
White-box models --- Black-box models
White box (glass box): all necessary information available, e.g. F = ma
Black box: no a priori information; calibration and validation needed
Slide 4 of 55
Deterministic model --- Stochastic model
Deterministic model: the same input gives the same output
Stochastic model: the same input gives different outputs
Randomness (pdf for input, parameters, output)
Slide 5 of 55
All computer models are deterministic:
the same input gives the same output.
Slide 6 of 55
Ensemble simulation
The input as a pdf, the output as a pdf
Uncertainty in models and their parameters
Randomness:
1) For each model (pdf for input, parameters, output)
2) Committee models (combining models)
Slide 7 of 55
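The randomness above can be made concrete with a Monte Carlo ensemble. Below is a minimal sketch (not from the slides): it propagates pdfs for the input and one parameter of a hypothetical linear runoff relation through the model to obtain an output pdf. The runoff form and all numbers are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(42)

def runoff(rain, coeff):
    """Hypothetical runoff relation Q = coeff * R (an assumption for the demo)."""
    return coeff * rain

n_members = 1000
rain = rng.normal(loc=50.0, scale=10.0, size=n_members)  # input R as a pdf
coeff = rng.uniform(0.3, 0.7, size=n_members)            # parameter as a pdf

ensemble = runoff(rain, coeff)  # one output per ensemble member -> output pdf

print(f"ensemble mean Q = {ensemble.mean():.1f}")
print(f"5%-95% interval = {np.percentile(ensemble, [5, 95])}")
```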
Ensemble weather simulation
Does it represent the real probability distribution?
Slide 8 of 55
Climate models (which one to trust?)
Slide 9 of 55
Is the natural system really stochastic?
Einstein: Germany/USA, 1879-1955, Nobel Prize (1921)
Bohr: Danish physicist, 1885-1962, Nobel Prize (1922)
Slide 10 of 55
Quantum Mechanics
Einstein: “God does not play dice.”
Bohr: “Einstein, stop telling God what to do.”
http://www.aip.org/history/einstein/ae63.htm
Slide 11 of 55
Coin tossing: random?
Slide 12 of 55
“There is nothing random about this world.”
--- Prof. Persi Diaconis, Stanford University
http://www-stat.stanford.edu/~cgates/PERSI/cv.html
Slide 13 of 55
Do hydrological systems appear stochastic because we have insufficient information about them?
Slide 14 of 55
Hydraulic modelling in a data-rich world
Professor Paul Bates, University of Bristol
So, more information is available (e.g., remote sensing).
http://www.ggy.bris.ac.uk/staff/staff_bates.html
Slide 15 of 55
Not all information is useful.
Useful information?
Matlab user guide: Fuzzy Logic Toolbox
Slide 16 of 55
Questions for a modeller
How complicated should the model be?
What input data should be used?
How long should the records used for model development be?
Slide 17 of 55
How complicated should the model be?
Slide 18 of 55
The data
Model too simple (underfitting)
Model too complicated (overfitting)
Slide 19 of 55
A suitable model
Slide 20 of 55
Occam's razor (Ockham's razor)
One should not increase, beyond what is necessary, the number of entities required to explain anything.
William of Ockham
1288-1348
Ockham village, Surrey, England
Slide 21 of 55
"Make everything as simple as possible, but not simpler."
Einstein
Slide 22 of 55
Model selection method
Cross validation
Akaike information criterion
Bayesian information criterion
…
Slide 23 of 55
Model calibration (training, learning)
Aim: to predict future data drawn from the same distribution
http://www.cs.cmu.edu/~awm/
Slide 24 of 55
Holdout validation
1) Randomly choose 30% of the data as a test set
2) The remainder is the training set
3) Perform regression on the training set
4) Estimate future performance with the test set
http://www.cs.cmu.edu/~awm/
Slide 25 of 55
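A minimal sketch of the four holdout steps above, on synthetic data with a linear fit (the dataset and model are assumptions, not the slide's actual scatter):

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic data standing in for the scatter on the slide (hypothetical).
x = np.linspace(0, 10, 50)
y = 0.5 * x + rng.normal(scale=1.0, size=x.size)

# 1) Randomly choose 30% of the data as a test set.
idx = rng.permutation(x.size)
n_test = int(0.3 * x.size)
test, train = idx[:n_test], idx[n_test:]

# 2)-3) Perform (linear) regression on the remaining 70%.
coeffs = np.polyfit(x[train], y[train], deg=1)

# 4) Estimate future performance as mean squared error on the test set.
pred = np.polyval(coeffs, x[test])
mse = np.mean((y[test] - pred) ** 2)
print(f"test-set MSE = {mse:.2f}")
```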
Model parameter estimation (fitting to the data)
Least squares method
Maximum likelihood
Maximum a posteriori
Nonlinear optimisation
Genetic algorithms
…
Slide 26 of 55
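As one hedged illustration of the list above, the sketch below estimates parameters by maximum likelihood using a generic nonlinear optimiser; for Gaussian noise this reduces to least squares. The linear model and data are invented for the example.

```python
import numpy as np
from scipy.optimize import minimize

rng = np.random.default_rng(1)
x = np.linspace(0, 10, 50)
y = 2.0 * x + 1.0 + rng.normal(scale=1.5, size=x.size)

def neg_log_likelihood(params):
    """Gaussian negative log-likelihood; minimising over a, b is least squares."""
    a, b, log_sigma = params
    sigma = np.exp(log_sigma)          # parametrise log(sigma) to keep sigma > 0
    resid = y - (a * x + b)
    return 0.5 * np.sum((resid / sigma) ** 2) + x.size * log_sigma

result = minimize(neg_log_likelihood, x0=[0.0, 0.0, 0.0])
a, b, log_sigma = result.x
print(f"a = {a:.2f}, b = {b:.2f}, sigma = {np.exp(log_sigma):.2f}")
```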
Estimate future performance with the test set
Linear regression
Mean Squared Error = 2.4
http://www.cs.cmu.edu/~awm/
Slide 27 of 55
Estimate future performance with the test set
Quadratic regression
Mean Squared Error = 0.9
http://www.cs.cmu.edu/~awm/
Slide 28 of 55
Estimate future performance with the test set
Join the dots
Mean Squared Error = 2.2
http://www.cs.cmu.edu/~awm/
Slide 29 of 55
The test set method
Positive:
• Very simple
Negative:
• Wastes data: 30% less data for model calibration
• If you don’t have much data, the test set might just be lucky or unlucky
Slide 30 of 55
Cross Validation
Repeatedly partitioning a sample of data into training and testing subsets
Seymour Geisser
1929-2004
University of Minnesota
http://en.wikipedia.org/wiki/Seymour_Geisser
Slide 31 of 55
Leave-one-out Cross Validation
Mean Squared Error of 9 sets = 2.2 (single test 2.4)
Slide 32 of 55
Leave-one-out Cross Validation
Mean Squared Error of 9 sets = 0.962 (single test 0.9)
Slide 33 of 55
Leave-one-out Cross Validation
Mean Squared Error of 9 sets = 3.33 (single test 2.2)
Slide 34 of 55
Leave-one-out Cross Validation
Positive:
• Wastes only one data point per fit
Negative:
• More computation
• A single test point might be too small a test set
Slide 35 of 55
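A sketch of leave-one-out CV in the spirit of the 9-point examples above: fit on all points but one, test on the point left out, and average. The quadratic data are an assumption; the loop is the standard procedure.

```python
import numpy as np

def loo_cv_mse(x, y, degree):
    """Leave-one-out CV: fit on all points except i, test on point i."""
    errors = []
    for i in range(x.size):
        mask = np.arange(x.size) != i
        coeffs = np.polyfit(x[mask], y[mask], degree)
        pred = np.polyval(coeffs, x[i])
        errors.append((y[i] - pred) ** 2)
    return np.mean(errors)

rng = np.random.default_rng(2)
x = np.linspace(0, 8, 9)                       # 9 points, as on the slides
y = 0.4 * x**2 + rng.normal(scale=1.0, size=x.size)

for degree in (1, 2):
    print(f"degree {degree}: LOO MSE = {loo_cv_mse(x, y, degree):.3f}")
```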
k-fold Cross Validation
k = 3
Randomly break the dataset into k partitions (in our example we’ll have k = 3 partitions, coloured red, green and blue)
Slide 36 of 55
3-fold Cross Validation
For the red partition: train on all the points not in the red partition, then find the test-set sum of errors on the red points.
Ditto with the other two colours; use the mean error of the three sets.
Slide 37 of 55
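The same idea as a k-fold sketch: a random partition plays the role of the red/green/blue colouring, and the mean test error over the k folds is returned. Data and model degree are again illustrative.

```python
import numpy as np

def k_fold_mse(x, y, degree, k=3, seed=0):
    """k-fold CV: each fold is held out once while the rest trains the model."""
    rng = np.random.default_rng(seed)
    idx = rng.permutation(x.size)
    folds = np.array_split(idx, k)             # the k 'colours' on the slide
    errors = []
    for fold in folds:
        train = np.setdiff1d(idx, fold)
        coeffs = np.polyfit(x[train], y[train], degree)
        pred = np.polyval(coeffs, x[fold])
        errors.append(np.mean((y[fold] - pred) ** 2))
    return np.mean(errors)                     # mean error of the k test sets

rng = np.random.default_rng(3)
x = np.linspace(0, 8, 30)
y = 0.4 * x**2 + rng.normal(scale=1.0, size=x.size)
print(f"3-fold MSE (quadratic) = {k_fold_mse(x, y, degree=2):.3f}")
```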
3-fold Cross Validation
Slide 38 of 55
Other model selection methods
Akaike information criterion
Bayesian information criterion
…
AIC and BIC: only need the training error
Slide 39 of 55
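AIC and BIC need only the training error plus a penalty on the number of fitted parameters k. For Gaussian errors they are commonly written AIC = n ln(RSS/n) + 2k and BIC = n ln(RSS/n) + k ln n. The sketch below applies these forms to polynomial fits; the data are invented.

```python
import numpy as np

rng = np.random.default_rng(4)
x = np.linspace(0, 8, 40)
y = 0.4 * x**2 + rng.normal(scale=1.0, size=x.size)

n = x.size
for degree in (1, 2, 5):
    k = degree + 1                              # number of fitted parameters
    coeffs = np.polyfit(x, y, degree)
    rss = np.sum((y - np.polyval(coeffs, x)) ** 2)   # training error only
    aic = n * np.log(rss / n) + 2 * k           # Gaussian-error form of AIC
    bic = n * np.log(rss / n) + k * np.log(n)   # BIC penalises k more as n grows
    print(f"degree {degree}: AIC = {aic:.1f}, BIC = {bic:.1f}")
```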
Model input data selection methods
The Gamma test (model-free)
Information theory (model-free)
Cross validation
…
Slide 40 of 55
Four example data sources
Line
Logistic function
Sine
Mackey-Glass
Slide 41 of 55
From the measured data
Slide 42 of 55
Fit models to the data
(cross validation)
Underfitting
Overfitting
Slide 43 of 55
The Gamma Test
It estimates what proportion of the variance of the target value is caused by the unknown function, and what proportion is caused by the random (noise) variable.
G is an estimate of the noise variance relative to the best possible model results.
Slide 44 of 55
500 points generated from the function with added noise of variance 0.075.
The Gamma-estimated noise variance is 0.073.
Slide 45 of 55
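A rough sketch of the near-neighbour Gamma test (following the published idea: compute gamma(p) and delta(p) over the p nearest neighbours in input space, regress gamma on delta, and read the noise-variance estimate off the intercept). The target function, sample size and noise level below loosely mimic the slide's example, but the exact function used there is not stated, so everything here is an assumption.

```python
import numpy as np
from scipy.spatial import cKDTree

def gamma_test(X, y, p_max=10):
    """Near-neighbour Gamma test sketch: the intercept of the gamma(p) vs
    delta(p) regression estimates the variance of the noise on y."""
    tree = cKDTree(X)
    # k = p_max + 1 because each point is its own nearest neighbour (distance 0).
    dist, idx = tree.query(X, k=p_max + 1)
    delta = np.mean(dist[:, 1:] ** 2, axis=0)                 # input-space term
    gamma = np.mean(0.5 * (y[idx[:, 1:]] - y[:, None]) ** 2, axis=0)
    slope, intercept = np.polyfit(delta, gamma, 1)
    return intercept                                          # the Gamma statistic

# 500 noisy samples of a smooth (assumed) function, as in the slide's flavour.
rng = np.random.default_rng(5)
X = rng.uniform(0, 1, size=(500, 1))
noise_var = 0.075
y = np.sin(2 * np.pi * X[:, 0]) + rng.normal(scale=np.sqrt(noise_var), size=500)
print(f"true noise variance = {noise_var}")
print(f"Gamma estimate      = {gamma_test(X, y):.3f}")
```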
If G is small, the output value is largely determined by the input variables.
If G is large:
1) Some important input variables are missing
2) Too much measurement noise
3) The data record is too short
4) Gaps in the data record
Slide 46 of 55
Gamma Archive
http://users.cs.cf.ac.uk/Antonia.J.Jones/GammaArchive/IndexPage.htm
The Gamma Test, Prof. Antonia Jones
Computer Science, Cardiff University
Slide 47 of 55
[Figure: journal paper, Hydrological Processes, 2008]
Slide 48 of 55
Information Theory
“A Mathematical Theory of Communication” (1948)
Claude Shannon
MIT
1916-2001
Slide 49 of 55
Information entropy
A measure of the uncertainty associated with a random variable:
H(X) = -Σ p(x) log p(x)
Slide 50 of 55
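A direct sketch of the entropy formula, computed in bits for a few toy distributions:

```python
import numpy as np

def entropy(p):
    """Shannon entropy H(X) = -sum p(x) log2 p(x), in bits."""
    p = np.asarray(p, dtype=float)
    p = p[p > 0]                      # convention: 0 * log 0 = 0
    return -np.sum(p * np.log2(p))

print(entropy([0.5, 0.5]))           # fair coin: 1.0 bit
print(entropy([0.9, 0.1]))           # biased coin: ~0.47 bits, less uncertain
print(entropy([1.0]))                # certain outcome: 0 bits
```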
Transinformation
Measures the redundant or mutual information between X and Y. It is described as the difference between the total entropy and the joint entropy: T(X;Y) = H(X) + H(Y) - H(X,Y).
[Figure: transinformation (y-axis, 0.4 to 0.8) versus number of data (x-axis, 0 to 1400)]
Slide 51 of 55
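Transinformation can be estimated from samples by histogramming: compute H(X), H(Y) and H(X,Y) from the marginal and joint bin counts, then take their difference as above. The binning and test data below are illustrative assumptions.

```python
import numpy as np

def entropy_from_counts(counts):
    """Shannon entropy (bits) of a distribution given by bin counts."""
    p = counts / counts.sum()
    p = p[p > 0]
    return -np.sum(p * np.log2(p))

def transinformation(x, y, bins=10):
    """T(X;Y) = H(X) + H(Y) - H(X,Y), estimated from a 2-D histogram."""
    joint, _, _ = np.histogram2d(x, y, bins=bins)
    h_x = entropy_from_counts(joint.sum(axis=1))
    h_y = entropy_from_counts(joint.sum(axis=0))
    h_xy = entropy_from_counts(joint.ravel())
    return h_x + h_y - h_xy

rng = np.random.default_rng(6)
x = rng.normal(size=2000)
print(f"dependent pair:   {transinformation(x, x + 0.5 * rng.normal(size=2000)):.3f}")
print(f"independent pair: {transinformation(x, rng.normal(size=2000)):.3f}")
```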
Prediction is very difficult, especially if it's about the future.
Niels Bohr, Nobel laureate in Physics
Slide 52 of 55
Heraclitus (ancient Greek philosopher)
You can't step in the same river twice.
Change is real, and stability is illusory.
Slide 53 of 55
Energy is conserved, but entropy is always increasing
Slide 54 of 55
Nonstationarity of the Earth, solar system, universe
Slide 55 of 55
The End
Thank you
Slide 56 of 55