An Introduction to Computer Experiments and their Design

An Introduction to Computer Experiments
and their Design Problems
Tony O’Hagan
University of Sheffield
8 Sept 2006, DEMA2006
Slide 1
Outline
1. Computer codes and their problems
2. Gaussian process representation
3. Design
4. Conclusions
www.mucm.group.shef.ac.uk
Slide 2
Models and uncertainty
 In almost all fields of science, technology, industry and policy making, people use mechanistic models to describe complex real-world processes
 For understanding, prediction, control
 Growing realisation of the importance of uncertainty in model predictions
 Can we trust them?
 Without any quantification of output uncertainty, it’s easy to dismiss them
Slide 3
Computer codes
 A computer code is a software implementation of a mathematical model for some real process
 Given suitable inputs x that define a particular instance, the code output y = f(x) predicts the true value of that real process
 A single run of the model can take an appreciable amount of time
 In some cases, months!
 Even a few seconds can be too long for tasks that require many thousands of runs
Slide 4
What are models for?
 Prediction and optimisation
 What will the model output be for these inputs?
 What inputs will optimise the output?
 Uncertainty analysis
 Given uncertainty in model inputs, how
uncertain are outputs?
 Which input uncertainties are most influential?
 Calibration and data assimilation
 How can we use data to improve the model?
 Many of these tasks implicitly require many
model runs
Slide 5
Computation
 Consider uncertainty analysis
 Given uncertain input X, what can we say about
the distribution of output Y = f(X)?
 Monte Carlo is the simplest method
 Sample x1, x2, …, xN from distribution of X
 Run model to get outputs y1, y2, …, yN
 Use this as a sample of the output distribution
 Easy to implement but impractical if the model takes more than a few seconds to run
 10,000 minutes is a week
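The Monte Carlo recipe above can be sketched in a few lines. The function f here is a cheap hypothetical stand-in for an expensive simulator, and the input distribution is an arbitrary choice for illustration:

```python
import numpy as np

rng = np.random.default_rng(0)

def f(x):
    # Hypothetical stand-in for the expensive computer code
    return np.sin(3 * x) + 0.5 * x ** 2

# Sample x1, ..., xN from the distribution of X (here X ~ N(1, 0.5^2))
N = 10_000
xs = rng.normal(loc=1.0, scale=0.5, size=N)

# Run the model for each sampled input -- the step that becomes
# infeasible when a single run takes minutes or hours
ys = f(xs)

# Treat ys as a sample from the output distribution of Y = f(X)
print(f"mean = {ys.mean():.3f}, sd = {ys.std():.3f}")
```

With 10,000 runs this is exact bookkeeping but brutal arithmetic: it is only viable because the stand-in code is instantaneous.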
Slide 6
Gaussian process representation
 More efficient approach
 First work in early 1980s – DACE
 Represent the code as an unknown function
 f(.) becomes a random process
 We represent it as a Gaussian process
 Training runs
 Run the model for a sample of x values
 Condition the GP on the observed data
 Typically requires many fewer runs than MC
 And the x values don’t need to be chosen randomly
Slide 7
Bayesian formulation
 Prior beliefs about the function, conditional on hyperparameters
 e.g. f(·) | β, σ², B ~ GP(h(·)ᵀβ, σ²c(·, ·)), with correlation function c(x, x′) = exp{−(x − x′)ᵀB(x − x′)}
 Data: the training runs y = (f(x1), …, f(xN))ᵀ
 Posterior beliefs about the function, conditional on hyperparameters
 Again a GP, with updated mean and covariance functions
Slide 8
Emulation
 The analysis is completed by prior distributions for, and posterior estimation of, the hyperparameters
 The roughness parameters in B are crucial
 The posterior distribution is known as an emulator of the computer code
 The posterior mean estimates what the code would produce for any untried x (prediction)
 With uncertainty about that prediction given by the posterior variance
 It correctly reproduces the training data
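A minimal emulator along these lines can be sketched as follows. This is a zero-mean GP with a squared-exponential covariance and a hand-picked length-scale, so it is a simplification of the formulation above: a real analysis would include a regression mean h(·)ᵀβ and estimate the roughness hyperparameters rather than fixing them.

```python
import numpy as np

def sq_exp(a, b, length=1.0):
    # Squared-exponential covariance; `length` plays the role of a
    # roughness (length-scale) hyperparameter, fixed by hand here
    return np.exp(-0.5 * ((a[:, None] - b[None, :]) / length) ** 2)

def emulate(x_train, y_train, x_new, length=1.0, jitter=1e-10):
    # Condition a zero-mean GP on noise-free training runs
    K = sq_exp(x_train, x_train, length) + jitter * np.eye(len(x_train))
    Ks = sq_exp(x_new, x_train, length)
    Kss = sq_exp(x_new, x_new, length)
    mean = Ks @ np.linalg.solve(K, y_train)       # posterior mean
    cov = Kss - Ks @ np.linalg.solve(K, Ks.T)     # posterior covariance
    return mean, np.sqrt(np.clip(np.diag(cov), 0.0, None))

# Cheap stand-in for a slow computer code (hypothetical)
f = lambda x: x * np.sin(x)
x_train = np.array([0.5, 2.0, 3.5, 5.0])          # four training runs

# At the training inputs the emulator reproduces the data,
# with essentially zero posterior variance
mean, sd = emulate(x_train, f(x_train), x_train)

# At an untried input the posterior sd quantifies prediction uncertainty
m_new, sd_new = emulate(x_train, f(x_train), np.array([1.2]))
print(f"sd at training points: {sd.max():.2e}, at x = 1.2: {sd_new[0]:.2f}")
```

Evaluating `emulate` over a grid of x values reproduces the behaviour shown on the next three slides: the mean interpolates the runs and the sd swells between them.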
Slide 9
2 code runs
 Consider one input and one output
 Emulator estimate interpolates data
 Emulator uncertainty grows between data
points
[Figure: emulator fit to the 2 training runs (dat2) — posterior mean interpolating the points, with uncertainty bands between them; output plotted against x on (0, 6)]
Slide 10
3 code runs
 Adding another point changes estimate and
reduces uncertainty
[Figure: emulator fit to 3 training runs (dat3) — updated mean and narrower uncertainty bands; output plotted against x on (0, 6)]
Slide 11
5 code runs
 And so on
[Figure: emulator fit to 5 training runs (dat5); output plotted against x on (0, 6)]
Slide 12
Frequentist formulation
 Pretend the function is actually sampled from a Gaussian process population of functions
 Absurd, really!
 But the properties of the inferences depend on it
 The best linear unbiased predictor is the same as the Bayesian posterior mean
 With weak prior distributions
 Similarly for variances
Slide 13
Then what?
 Use the emulator to make inferences about other things of interest
 E.g. uncertainty analysis, calibration
 Conceptually very straightforward in the Bayesian framework
 But of course can be computationally hard
 The frequentist approach has not generally been extended to some of the more complex analyses
Slide 14
Design
 The design problem is to choose x1, x2, …, xN
 The design space is usually rectangular
 Often rather arbitrary
 May be high dimensional
 The objective is to build an accurate emulator across the design space
 Formally optimising for some specific analysis is generally inappropriate (and too hard)
 The usual approach is to aim for a design that fills the design space uniformly
 Minimises uncertainty between design points
Slide 15
Latin hypercubes
 LH designs
 Divide the range of each variable into N equal segments
 Choose a value in each segment (uniformly)
 Permute each coordinate randomly
 Covers each coordinate evenly
 Maximin LH
 Generate many LH designs
 Choose the one for which the minimum distance between points is greatest
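The construction above is straightforward to code directly. This sketch builds random LH designs on [0, 1]^d and keeps the candidate with the largest minimum pairwise distance; the function names and the candidate count of 200 are illustrative choices, not part of any standard library:

```python
import numpy as np

rng = np.random.default_rng(1)

def latin_hypercube(n, d):
    # One uniform draw inside each of the n equal segments of [0, 1],
    # then an independent random permutation of each coordinate
    u = (np.arange(n)[:, None] + rng.random((n, d))) / n
    return np.stack([rng.permutation(u[:, j]) for j in range(d)], axis=1)

def min_pairwise_dist(x):
    # Smallest Euclidean distance between any two design points
    dist = np.sqrt(((x[:, None, :] - x[None, :, :]) ** 2).sum(-1))
    return dist[np.triu_indices(len(x), k=1)].min()

# Maximin LH: generate many LH designs and keep the most spread out
designs = [latin_hypercube(10, 2) for _ in range(200)]
best = max(designs, key=min_pairwise_dist)
print(f"maximin distance over 200 candidates: {min_pairwise_dist(best):.3f}")
```

Plotting `best` gives pictures like the two slides that follow: an arbitrary LH design can be poor, while the maximin winner spreads its points out.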
Slide 16
Poor LH design
[Figure: a poor LH design — points on the unit square (x1 vs x2) cover each coordinate evenly but leave large empty regions]
Slide 17
Maximin LH design
[Figure: a maximin LH design — points on the unit square (X vs Y) are well spread out, maximising the minimum pairwise distance]
Slide 18
Projection
 Projections of LH designs onto lower dimensional spaces are also LH designs
 Not necessarily maximin, but usually quite even
 Important because typically only a few inputs are influential
 There are other ways of generating space-filling designs
 Low discrepancy sequences
 These don’t necessarily have good projections
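The projection property is easy to verify numerically: dropping coordinates from an LH design leaves each remaining coordinate with exactly one point per segment. A small illustrative check, using the standard LH construction:

```python
import numpy as np

rng = np.random.default_rng(2)

n, d = 8, 3
# A random n-point LH design in d dimensions: one draw per segment
# of [0, 1] in each coordinate, independently permuted
u = (np.arange(n)[:, None] + rng.random((n, d))) / n
design = np.stack([rng.permutation(u[:, j]) for j in range(d)], axis=1)

# Project onto the first two coordinates: still an LH design,
# because each retained column keeps one point per segment
proj = design[:, :2]
segments = np.sort(np.floor(proj * n).astype(int), axis=0)
print(segments.T)  # each row is 0, 1, ..., n-1
```

The same check would pass for any subset of coordinates, since the segment structure is built per coordinate; what a projection can lose is the maximin spacing, not the Latin property.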
Slide 19
Other considerations
 Maximin LH designs don’t have points close together
 By definition!
 But such pairs help to identify hyperparameters
 Particularly roughness parameters
 Maybe add extra points differing from existing ones only by a small amount in one dimension
 Sequential designs would be very helpful
 Low discrepancy sequences
 Adaptive designs for partitioned emulators
Slide 20
Some design challenges
 Space-filling designs that are good in all projections
 Understanding the value of low-distance pairs
 Designs for non-rectangular or unbounded design spaces
 Sequential/adaptive design
 E.g. a good 150-point design with a good 100-point subset
 Adaptation to roughnesses and heterogeneity
 Design of real-world experiments for calibration
Slide 21
MUCM
 This is a substantial and topical research area
 MUCM (Managing Uncertainty in Complex Models) is a new £2M research project
 Funded by the RCUK Basic Technology scheme
 4-year grant, 7 RAs + 4 PhDs in 5 centres
 Henry Wynn (LSE) leading the design work
 But enough problems for lots of people to work on!
 mucm.group.shef.ac.uk
 Year-long programme at SAMSI (USA)
Slide 22