Model Calibration

Overall procedure of validation
• Calibration: adjusting physical modeling parameters in the model to improve agreement with experimental data.
• Validation: model accuracy assessment by comparison of model outputs with experimental measurements.
• Prediction: if calibrated by experiment, predict at untried conditions and validate again (validated model -> blind prediction -> new experiment).
Figure 12.4 Validation, calibration, and prediction (Oberkampf and Barone, 2004).
-1-
Optional
Approaches for calibration
• Traditional (deterministic) calibration
– Parameters are estimated as a single value that minimizes the
squared error between the computer model and the experimental data.
– As a result, the model is given by a single function.
• Statistical calibration
– Also called Calibration under Uncertainty (CUU).
– Parameters are estimated using statistical inference techniques to
incorporate the uncertainty due to observation error.
– As a result, the model is given with confidence bounds.
-2-
Approaches for statistical calibration
– Based on Section 13.5 of the Oberkampf textbook.
• Frequentist approach
– Parameters are constants, but unknown because of limited data.
– The most popular method follows two steps:
1. Point-estimate the parameters by maximum likelihood estimation (MLE).
2. Draw samples of the parameters by the bootstrap technique.
– The advantage is that frequentist methods are simpler and easier to use than
Bayesian methods, but they are less often applied to calibration problems.
• Bayesian approach
– Parameters are treated as random variables, characterized by a
probability distribution conditional on the data.
– Also called Bayesian updating: before updating, the distribution is the prior;
after updating, it is the posterior.
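As a minimal illustration of prior-to-posterior updating, here is a conjugate normal-mean update in Python (a sketch; the prior, data, and noise values are made up for illustration and are not from the lecture):

```python
import numpy as np

def update_normal_mean(prior_mu, prior_var, data, noise_var):
    """Conjugate Bayesian update for an unknown mean with known noise variance.

    Prior N(prior_mu, prior_var); likelihood y_i ~ N(mu, noise_var).
    The posterior is again normal; return its mean and variance.
    """
    n = len(data)
    post_var = 1.0 / (1.0 / prior_var + n / noise_var)
    post_mu = post_var * (prior_mu / prior_var + np.sum(data) / noise_var)
    return post_mu, post_var

# Vague prior on the mean, then condition on five noisy observations.
mu, var = update_normal_mean(prior_mu=0.0, prior_var=10.0,
                             data=np.array([1.1, 0.9, 1.2, 1.0, 0.8]),
                             noise_var=0.3**2)
```

Conditioning on data always shrinks the posterior variance below the prior variance, which is the "updating" the slide refers to.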
-3-
Calibration in order of complexity
• Deterministic calibration
– Carry out parameter estimation using an optimization technique to obtain a
single function as the calibrated model.
– The method is not very useful because uncertainty is not included; it is like
using only the mean value of an uncertain quantity in a design decision.
• Statistical calibration without discrepancy
– Carry out parameter estimation using statistical techniques to obtain
confidence bounds of the calibrated model.
– The Bayesian approach is common, with MCMC as the technique for
estimating the parameters in a probabilistic way.
– Due to lack of knowledge, the model often differs inherently from reality;
no matter how much data is used for calibration, they may fail to agree.
– Without accounting for this, i.e., assuming the model is correct, we end up
with large errors, mistakenly attributed to the experiments rather than the model.
-4-
Calibration in order of complexity
• Statistical calibration with discrepancy
– How do we model the discrepancy?
Gaussian process regression (GPR) is employed to express the
discrepancy in an approximate manner.
– The estimation includes not only the calibration parameters but also the
associated GPR parameters.
– The discrepancy term has two purposes:
1. Close the gap between the model and reality, further improving
the calibration.
2. Validate the model accuracy: if the discrepancy is small, the model is good.
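A minimal sketch of GPR for a discrepancy term (Python/NumPy; the squared-exponential kernel, fixed hyperparameters, and the "wrong" model with q = 0.62 are illustrative assumptions, not the lecture's settings): fit a GP to the residuals between a miscalibrated model and reality, then predict the discrepancy at new points.

```python
import numpy as np

def sq_exp_kernel(x1, x2, length=0.5, variance=1.0):
    """Squared-exponential covariance between two sets of 1-D inputs."""
    d = x1[:, None] - x2[None, :]
    return variance * np.exp(-0.5 * (d / length) ** 2)

rng = np.random.default_rng(0)
x = np.linspace(0.0, 3.0, 11)
truth = 1.5 + 3.5 * np.exp(-1.7 * x)                   # reality
model = 5.0 * np.exp(-0.62 * x)                        # calibrated but still wrong model
resid = truth - model + rng.normal(0.0, 0.05, x.size)  # observed discrepancy

# Standard GP regression: posterior mean of the discrepancy at test points.
K = sq_exp_kernel(x, x) + (0.05 ** 2) * np.eye(x.size)  # noise on the diagonal
x_test = np.linspace(0.0, 3.0, 31)
delta_mean = sq_exp_kernel(x_test, x) @ np.linalg.solve(K, resid)
```

Adding `delta_mean` to the model output closes most of the model-form gap, which is exactly the first purpose listed above; its magnitude serves the second purpose, as a diagnostic of model accuracy.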
-5-
Calibration in order of complexity
• Statistical calibration with surrogate model
– During MCMC, thousands of model evaluations are needed.
If the model is expensive, a surrogate model should be introduced.
– GPR is employed for this purpose, where the design of computer
experiments (DACE) is critical in the process.
– The estimation then includes three parts: the calibration parameters, the GPR
parameters of the surrogate model, and the GPR parameters of the discrepancy.
Efficiency decreases quickly as the number of parameters increases.
– MLE plug-in approach:
The surrogate GPR model is deterministic; its parameters are point-estimated,
and only the others are estimated probabilistically.
– Full Bayesian approach:
Includes all the parameters in the estimation. This is the ultimate
complexity in calibration, and the topic Kennedy and O'Hagan (KOH) addressed.
-6-
Outline of the calibration lecture
• Motivating example
• Deterministic calibration
• Statistical calibration without discrepancy
– Bayesian approach
– Frequentist approach
• Statistical calibration with discrepancy
– GPR revisited.
• Statistical calibration with surrogate model
– MLE plug-in approach
– Full Bayesian approach
• Applications
-7-
Motivating example
• Problem addressed in
– Loeppky, Jason L., Derek Bingham, and William J. Welch. "Computer
model calibration or tuning in practice." Technometrics, submitted for
publication (2006).
– Bayarri, Maria J., et al. "A framework for validation of computer models."
Technometrics 49.2 (2007).
– Originally Fogler, H. S., (1999), Elements of Chemical Reaction
Engineering, Prentice Hall.
• Chemical kinetics model
yT  x   1.5  3.5exp  1.7 x 
– Describes a chemical reaction process with initial chemical
concentration 5 and reaction rate 1.7. Amount of chemical remaining at
time x is investigated.
-8-
Motivating example
• Chemical kinetics model
yT  x   1.5  3.5exp  1.7 x 
y F  x   yT  x    ,  ~ N  0,0.3
– Make virtual experimental (or observation) data with the noise
Three replicates are made at 11 points of equal interval in [0,3].
– Repeated with data for right figure given in note page
[Figure: two panels of virtual experimental data (three replicates per point), amount of chemical y in [0, 6] versus time x in [0, 3].]
• Objective
– Find a computer model that simulates the observations as closely as possible.
-9-
Simplest computer model
y_M = m(x | q) = 5 exp(-q x)
– A somewhat wrong guess due to lack of knowledge.
– Calibrate q to minimize the SSE between model and data:
min_q f(q) = sum_i (y_i^F - y_i^M)^2,  where y_i^M = m(x_i, q)
• Optimum solution from Matlab fminsearch
q = 0.6223
SSE = 10.97, RMSE = 0.5766
where SSE = sum_i (y_i^F - y_i^M)^2 and RMSE = sqrt(SSE / n)
• Using nlinfit and nlparci with the second data set,
but with n - 1 (see notes page):
q = 0.6271, RMSE = 0.4855, CI = [0.5556, 0.6986]
- 10 -
[Figure: data and calibrated model; q = 0.6223, sumsq = 10.97.]
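The deterministic calibration above can be reproduced in outline (a Python/SciPy sketch standing in for the lecture's Matlab fminsearch; the virtual data depend on the random seed, so the numbers only roughly match q = 0.6223):

```python
import numpy as np
from scipy.optimize import minimize_scalar

rng = np.random.default_rng(1)
x = np.repeat(np.linspace(0.0, 3.0, 11), 3)            # 11 points, 3 replicates
y_f = 1.5 + 3.5 * np.exp(-1.7 * x) + rng.normal(0.0, 0.3, x.size)

def sse(q):
    """Sum of squared errors between the model 5 exp(-q x) and the data."""
    return np.sum((y_f - 5.0 * np.exp(-q * x)) ** 2)

res = minimize_scalar(sse, bounds=(0.0, 3.0), method="bounded")
q_hat = float(res.x)
rmse = float(np.sqrt(res.fun / x.size))
```

Note that the fitted RMSE lands well above the noise level 0.3, which foreshadows the discussion below.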
Discussion
• If the error were due only to noise, we would have expected
RMSE = 0.3
• Given the true function, we can see that the model is not
good. What are the clues without looking at the true
function?
• What is the hierarchy of calibration methods without
discrepancy?
Calibration with improved models
• Computer model and optimum solution
y_M = m(x | q) = q1 exp(-q2 x)
– Two-parameter optimization problem:
q1 = 4.351, q2 = 0.511
SSE = 8.92, RMSE = 0.520
– The solution improved, but a substantial gap remains.

[Figure: data and calibrated model; q1 = 4.351, q2 = 0.511, sumsq = 8.922.]
• Computer model and optimum solution
y  m  x | q   q1  q2 exp  q3 x 
6
M
– Three parameters optimization problem
q1  1.558, q 2  3.588, q3  1.899
SSE  2.77, RMSE  0.290
q1 = 1.558, q2 =3.588, q3 =1.899, SSE = 2.774
5
4
3
2
1
– Excellent match (true q1 = 1.5, q2 =3.5, q3 =1.7)
– Model change made on ad-hoc basis. Besides, the close match is
undoubtedly just luck. Is this possible in the real practice ?
0
- 12 -
0
0.5
1
1.5
2
2.5
3
Calibration under uncertainty
• Bayesian approach
– Assume that the model is an accurate representation of reality:
y_M = m(x | q) = 5 exp(-q x)
– Field data is given by
y_i^F = y_i^M + e_i = m(x_i, q) + e_i,  e_i ~ N(0, s^2)
– Posterior distribution of the unknown parameters (q, s^2):
p(q, s^2 | Y^F) ∝ (s^2)^(-n/2 - 1) exp( -(1 / (2 s^2)) (Y^F - m(X, q))^T (Y^F - m(X, q)) )
• Posterior distribution

[Figure: joint posterior surface and its contours over (q, s).]
- 13 -
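A bare-bones random-walk Metropolis sampler for this posterior (a Python sketch, not the lecture's implementation; the proposal scales, seed, and iteration count are illustrative, so the means only roughly match the slide's values):

```python
import numpy as np

rng = np.random.default_rng(2)
x = np.repeat(np.linspace(0.0, 3.0, 11), 3)              # 11 points, 3 replicates
y_f = 1.5 + 3.5 * np.exp(-1.7 * x) + rng.normal(0.0, 0.3, x.size)
n = x.size

def log_post(q, s2):
    """Log of p(q, s^2 | Y^F), proportional to (s^2)^(-n/2-1) exp(-SSE(q)/(2 s^2))."""
    if q <= 0.0 or s2 <= 0.0:
        return -np.inf
    sse = np.sum((y_f - 5.0 * np.exp(-q * x)) ** 2)
    return -(n / 2.0 + 1.0) * np.log(s2) - sse / (2.0 * s2)

# Random-walk Metropolis over (q, s2).
q, s2 = 0.5, 0.5
lp = log_post(q, s2)
samples = []
for _ in range(5000):
    q_prop, s2_prop = q + rng.normal(0.0, 0.05), s2 + rng.normal(0.0, 0.05)
    lp_prop = log_post(q_prop, s2_prop)
    if np.log(rng.uniform()) < lp_prop - lp:             # accept/reject step
        q, s2, lp = q_prop, s2_prop, lp_prop
    samples.append((q, np.sqrt(s2)))

burned = samples[1000:]                                  # discard burn-in
q_mean = float(np.mean([s[0] for s in burned]))
s_mean = float(np.mean([s[1] for s in burned]))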
Calibration under uncertainty
• Posterior samples after MCMC (N = 5e3)

[Figure: trace plots and histograms of the posterior samples of q and s.]

Means of q and s over repeated MCMC runs:
0.6295  0.5921
0.6235  0.5870
0.6249  0.5866
0.6269  0.5952
Simple optimization: q = 0.6223, s = 0.5766
• Posterior prediction
For each posterior sample (q_i, s_i):
y_i^M = m(x | q_i) = 5 exp(-q_i x)
y_i^P = y_i^M + e_i = m(x, q_i) + e_i,  e_i ~ N(0, s_i^2)

[Figure: posterior predictive samples overlaid on the data.]

- 14 -
Calibration under uncertainty
• Frequentist approach
– Likelihood of Y^F:
L(Y^F | q, s^2) = (s^2)^(-n/2) exp( -(1 / (2 s^2)) (Y^F - m(X, q))^T (Y^F - m(X, q)) )
• Maximum likelihood estimation
max_{q, s^2} L(Y^F | q, s^2)
Optimum solution: q* = 0.6223, s* = 0.5766
[Figure: likelihood surface and its contours over (q, s), with the maximum at (0.6223, 0.5766).]
- 15 -
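Why the MLE reproduces the least-squares answer: profiling the log-likelihood over s^2 gives the standard result below (with n = 33 from three replicates at 11 points, consistent with SSE = 10.97):

```latex
\log L(\mathbf{Y}^F \mid q, \sigma^2)
  = -\frac{n}{2}\log\sigma^2 - \frac{\mathrm{SSE}(q)}{2\sigma^2} + \text{const},
\qquad
\mathrm{SSE}(q) = \sum_{i=1}^{n}\bigl(y_i^F - m(x_i, q)\bigr)^2 .

\frac{\partial \log L}{\partial \sigma^2} = 0
\;\Longrightarrow\;
\hat{\sigma}^2 = \frac{\mathrm{SSE}(\hat{q})}{n},
\qquad
\hat{q} = \arg\min_q \mathrm{SSE}(q).

\hat{q} = 0.6223,
\qquad
\hat{\sigma} = \sqrt{10.97/33} \approx 0.5766 .
```

So the MLE of q is exactly the least-squares estimate, and s* equals the earlier RMSE because both divide the SSE by n rather than n - 1.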
Calibration under uncertainty
• Bootstrap sampling
– Make virtual experimental data by plugging the estimated parameters into
the model and adding noise:
y^F = y^M + e = 5 exp(-q* x) + e,  e ~ N(0, s*^2)
– Carry out MLE using these data, and repeat N times to
obtain samples of the parameters.
[Figure: bootstrap confidence bounds of the calibrated model over the data, x in [0, 3].]
Meeker, William Q., and Luis A. Escobar. Statistical
Methods for Reliability Data. Vol. 314. Wiley, 1998.
- 16 -
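The parametric bootstrap above, in outline (a Python sketch, not the lecture's code; q* and s* are taken from the slide, and the resample count of 200 is arbitrary):

```python
import numpy as np
from scipy.optimize import minimize_scalar

rng = np.random.default_rng(3)
x = np.repeat(np.linspace(0.0, 3.0, 11), 3)    # 11 points, 3 replicates
q_star, s_star = 0.6223, 0.5766                # MLE from the original data

def fit_q(y):
    """Least-squares estimate of q for the model 5 exp(-q x) (also the MLE)."""
    sse = lambda q: np.sum((y - 5.0 * np.exp(-q * x)) ** 2)
    return minimize_scalar(sse, bounds=(0.0, 3.0), method="bounded").x

# Parametric bootstrap: simulate data from the fitted model, refit, repeat.
q_boot = []
for _ in range(200):
    y_sim = 5.0 * np.exp(-q_star * x) + rng.normal(0.0, s_star, x.size)
    q_boot.append(fit_q(y_sim))
q_boot = np.array(q_boot)
ci = np.percentile(q_boot, [2.5, 97.5])        # bootstrap confidence interval
```

Plugging each bootstrap sample of q back into 5 exp(-q x) and taking pointwise percentiles yields the confidence bounds of the model shown in the figure.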
Calibration under uncertainty
• Discussion
– Confidence bounds of the model are now obtained;
e.g., at x = 1.5, the bound is (0.75, 3.19).
– Due to the incorrect model, we end up with a large bound. However, this is
the best available solution under this condition.
– Within these large bounds, not only the measurement error but also the model
error is included. We need to account for this by introducing a
discrepancy function.
[Figure: bootstrap confidence bounds of the model over the data.]
- 17 -