Engineering subprogramme, 7 November 2006


Tony O’Hagan

Outline

Three parts:
• Turbofan engine vibration model
• Reification
• Predictors and validation

Part 1: The new model

Turbofan vibration model

• Rolls-Royce, Derby, UK – maker of civil aeroplane engines
• Simulator of a fan assembly; our example has 24 blades
• Primary concern is with vibration
• If amplitude is too high on any one blade it may break
• In effect this will destroy the engine
(Image: Rolls-Royce Trent 500 engine)

Model details

• 24 inputs are the vibration resonant frequencies of the blades
• 24 outputs are the amplitudes of vibration of the blades
• Other factors:
  • Amount of damping – more damping results in more complex behaviour and longer model run times
  • Model resolution – it is possible to run the solver on higher or lower resolution grids
  • Could also vary e.g. number of blades, operating rpm and temperature

Parameter uncertainty

• It is not possible to manufacture and assemble blades to be all identical and perfectly oriented
• Variation in the resonant frequencies of the blades creates complex variations in their vibration amplitudes
• The uncertainty distribution on each model input is the distribution achieved within manufacturing tolerances

Question:

Given an assembly of blades sampled from this distribution, what is the risk of high amplitude vibrations resulting?
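In outline, this risk can be estimated by Monte Carlo: sample blade-frequency assemblies from the manufacturing-tolerance distribution, evaluate the simulator (or, more cheaply, an emulator of it) on each, and record how often the largest blade amplitude exceeds a safe limit. The sketch below is illustrative only: the tolerance distribution, the stand-in amplitude_model and the threshold are hypothetical, not values from the talk.

import numpy as np

rng = np.random.default_rng(0)

def amplitude_model(freqs):
    # Toy stand-in for the fan simulator: each blade's amplitude grows as its
    # resonant frequency approaches the average of its neighbours'.
    # Purely illustrative -- not the real physics.
    neighbours = 0.5 * (np.roll(freqs, 1) + np.roll(freqs, -1))
    return 1.0 / (0.01 + np.abs(freqs - neighbours))

n_assemblies = 10_000
n_blades = 24
nominal_freq, tol_sd = 1.0, 0.01    # hypothetical manufacturing spread
threshold = 80.0                    # hypothetical 'dangerous amplitude' limit

exceed = 0
for _ in range(n_assemblies):
    freqs = rng.normal(nominal_freq, tol_sd, size=n_blades)   # one sampled assembly
    if np.max(amplitude_model(freqs)) > threshold:
        exceed += 1

print("Estimated risk of a high-amplitude blade:", exceed / n_assemblies)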

Emulation

Strategy:
• Emulate a single output = blade 1 amplitude
• 24 inputs = frequencies of blades 1 to 24
• Because of rotational symmetry, each model run gives up to 24 design points (see the sketch after this list)
• Simulate random blade assemblies
Results:
• Output depends most strongly on the blade 1 input
• Also on the neighbouring inputs, 2 and 24, etc.
• But there are high-order dependencies on all inputs
• So far we have failed to emulate accurately, even with very many design points
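As a rough sketch of how the rotational symmetry is used: if one simulator run maps blade frequencies (x1, ..., x24) to amplitudes (y1, ..., y24), then relabelling the blades so that blade k plays the role of blade 1 gives another input-output pair for the single-output emulator. The function below assumes the run is stored as NumPy arrays; the names are illustrative.

import numpy as np

def augment_by_rotation(freqs, amps):
    # Turn one 24-blade simulator run into up to 24 design points for an
    # emulator of blade-1 amplitude, exploiting the fan's rotational symmetry.
    # freqs, amps: arrays of shape (24,) from a single run.
    n = len(freqs)
    X = np.empty((n, n))
    y = np.empty(n)
    for k in range(n):
        X[k] = np.roll(freqs, -k)   # relabel so blade k becomes blade 1
        y[k] = amps[k]              # its amplitude is the emulator output
    return X, y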

Challenges

What’s going on here?

Can we find a way to achieve the original strategy?

Should we try instead to emulate max amplitude?

This may also be badly behaved!

Part 2: Reification

Reification – background

1. Kennedy & O’Hagan (2001), “Bayesian calibration of computer models” – KO’H henceforth
2. Goldstein & Rougier (2006), “Reified Bayesian modelling and inference for physical systems” – GR henceforth

GR discuss two problems with KO’H:
• The meaning of the calibration parameters is unclear
• Assuming a stationary model discrepancy, independent of the code, is inconsistent if better models are possible
Reification is their solution.
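For reference, the KO’H framework (in its simplest form, and in the notation used later in these slides) relates an observation y of reality at inputs x to the simulator f by

y = f(x, θ) + δ(x) + e

where θ are the calibration parameters, δ(x) is the model discrepancy, modelled as a stationary Gaussian process independent of the code, and e is observation error. The two GR criticisms concern the meaning of θ and the independence assumption on δ in this equation.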

Meaning of calibration parameters

• The model is wrong
• We need prior distributions for the calibration parameters
• Some may just be tuning parameters with no physical meaning – how can we assign priors to these?
• Even for those that have physical meanings, the model may fit the observational data better with wrong values
• What does a prior mean for a parameter in a wrong model?

Example: some kind of machine

• The simulator says output is proportional to input: energy in gives work out
• The proportionality parameter has a physical meaning
• Observations are made with error
• Without model discrepancy, this is a simple linear model
• The least-squares (LS) estimate of the slope is 0.568
• But the true parameter value is 0.65

Data:

X:  1.0    1.2    1.4    1.6    1.8    2.0
Y:  0.559  0.693  0.868  0.913  1.028  1.075

(Figure: scatter plot of Y against X.)
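As a check on the numbers quoted in the cases that follow, the least-squares slopes can be computed directly from the six (X, Y) pairs above. A short NumPy script (illustrative only):

import numpy as np

x = np.array([1.0, 1.2, 1.4, 1.6, 1.8, 2.0])
y = np.array([0.559, 0.693, 0.868, 0.913, 1.028, 1.075])

# Simulator form: output proportional to input, y = theta * x (no intercept)
slope_origin = np.sum(x * y) / np.sum(x * x)
print(slope_origin)             # about 0.568, versus the true value 0.65

# Allowing a constant offset, roughly what a constant-mean discrepancy ends up supplying
A = np.vstack([x, np.ones_like(x)]).T
slope, intercept = np.linalg.lstsq(A, y, rcond=None)[0]
print(slope, intercept)         # slope about 0.518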

Model discrepancy

(Figure: the data with two fitted lines – the red line is the LS fit, the black line is the simulator with the true parameter 0.65.)

• The model is wrong
• In reality there are energy losses

(Figure: the input–output relationship over a wider input range.)

Case 1

Suppose we have:
• No model discrepancy term
• A weak prior on the slope

Then we'll get:
• Calibration close to the LS value, 0.568
• Quite good predictive performance in [0, 2+]
• Poor estimation of the physical parameter

(Figure: observations and calibrated fit.)

Case 2

Suppose we have:
• No model discrepancy term
• An informative prior on the slope, based on knowledge of the physical parameter, centred around 0.65

Then we'll get:
• Calibration between the LS and prior values
• Not so good predictive performance
• Poor estimation of the physical parameter

(Figure: observations and calibrated fit.)

Without model discrepancy

• Calibration is just nonlinear regression: y = f(x, θ) + e, where f is the computer code
• Quite good predictive performance can be achieved if there is a θ for which the model gets close to reality
• Prior information based on the physical meaning of θ can be misleading
  • Poor calibration
  • Poor prediction
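For a general black-box simulator f this nonlinear-regression view of calibration can be carried out with an off-the-shelf optimiser. A minimal sketch, reusing the toy machine example (the names are illustrative):

import numpy as np
from scipy.optimize import least_squares

def f(x, theta):
    # Stand-in for the computer code; here the toy 'machine' simulator y = theta * x.
    return theta * x

x_obs = np.array([1.0, 1.2, 1.4, 1.6, 1.8, 2.0])
y_obs = np.array([0.559, 0.693, 0.868, 0.913, 1.028, 1.075])

def residuals(theta):
    return y_obs - f(x_obs, theta[0])

fit = least_squares(residuals, x0=[1.0])
print(fit.x)    # about 0.568: no discrepancy term, no prior, as in Case 1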

Case 3

Suppose we have:
• A GP model (KO’H) discrepancy term with constant mean
• A weak prior on the mean
• A weak prior on the slope

Then we'll get:
• Calibration close to the LS value for a regression with non-zero intercept – the GP takes the intercept
• The slope estimate, 0.518, is now even further from the true physical parameter value, albeit more uncertain
• The discrepancy estimate 'corrects' generally upwards

(Figure: observations and calibrated fit.)

Case 4

Suppose we have:
• A GP model (KO’H) discrepancy term with constant mean
• A weak prior on the mean
• An informative prior on the slope, based on knowledge of the physical parameter, centred around 0.65

Then we'll get:
• Something like linear regression with an informative prior on the slope
• The slope estimate is a compromise and loses its physical meaning
• Predictive accuracy is weakened

(Figure: observations and calibrated fit.)

Adding simple discrepancy

• Although the GP discrepancy of KO’H is in principle flexible and nonparametric, it still fits primarily through its mean function
• Prediction looks like the result of fitting the regression model with nonlinear f plus the discrepancy mean
• This process does not give physical meaning to the calibrated parameters, even with informative priors
• The augmented regression model is also wrong

Reification

• GR introduce a new entity, the 'reified' model
• To reify is to attribute the status of reality
• Thus a reified simulator is one that we can treat as real, and in which the calibration parameters should take their physical values
• Hence prior distributions on them can be meaningfully specified and should not distort the analysis
• GR's reified model is a kind of thought experiment
• It is conceptually a model that corrects such (scientific and computational) deficiencies as we can identify in f

• The GR reified model is not regarded as perfect – it still has a simple additive model discrepancy, as in KO’H
• The discrepancy attached to the original model is now made up of two parts:
  • The difference between f and the reified model, for which there is substantive prior information
  • The discrepancy of the reified model, independent of both models
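In symbols (the notation here is mine, following the spirit of GR): writing f* for the reified simulator and θ* for its physically meaningful parameters, reality is represented as

y = f*(x, θ*) + δ*(x) + e

so that the discrepancy attached to the original simulator f decomposes as

δ(x) = [f*(x, θ*) − f(x, θ)] + δ*(x)

with substantive prior information available for the first, model-to-reified-model term, and with δ*(x), the reified model's own discrepancy, independent of both models.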

Reification doubts

• Can the reified model's parameters be regarded as having physical meaning?
  • Allowing for model discrepancy between the reified model and reality makes this questionable
• Do we need the reified model?
  • Broadly speaking, the decomposition of the original model's discrepancy is sensible
  • But it amounts to no more than thinking carefully about model discrepancy and modelling it as informatively as possible

Case 5

Suppose we have:
• A GP model discrepancy term with a mean function that reflects the acknowledged deficiency of the model in ignoring losses to friction
• An informative prior on the slope, based on knowledge of the physical parameter

Then we'll get:
• Something more like the original intention of bringing in the model discrepancy!
• The slope parameter is not too distorted, the model correction has physical meaning, and predictive performance is good

(Figure: observations and calibrated fit.)
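To make Case 5 concrete with the toy data, suppose (my assumption, not anything specified in the talk) that the energy losses are crudely represented by a discrepancy mean of the form −γx². Fitting the simulator-plus-discrepancy mean by least squares then pulls the slope back towards its physical value:

import numpy as np

x = np.array([1.0, 1.2, 1.4, 1.6, 1.8, 2.0])
y = np.array([0.559, 0.693, 0.868, 0.913, 1.028, 1.075])

# Mean structure: theta * x (simulator) minus gamma * x**2 (hypothetical loss term)
A = np.vstack([x, -x**2]).T
theta, gamma = np.linalg.lstsq(A, y, rcond=None)[0]
print(theta, gamma)    # theta comes out near 0.65; gamma is small and positive

With an informative discrepancy mean the slope retains its physical interpretation, in contrast to Cases 1 to 4.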

Moral

• There is no substitute for thinking
• Model discrepancy should be modelled as informatively as possible
• Inevitably, though, the discrepancy function will to a greater or lesser extent correct for unpredicted deficiencies
• Then the physical interpretations of the calibration parameters can be compromised
• If this is not recognised in their priors, those priors can distort the analysis

Final comments

• There is much more in GR than I have dealt with here – it definitely repays careful reading
  • E.g. the relationships between different simulators of the same reality
• Their paper will appear in JSPI with discussion
• This presentation is a pilot for my discussion!

Part 3: Validation

Simulators, emulators, predictors

• A simulator is a model, representing some real-world process
• An emulator is a statistical description of a simulator
  • Not just a fast surrogate
  • A full probabilistic specification of beliefs
• A predictor is a statistical description of reality
  • A full probabilistic specification of beliefs
  • Emulator + a representation of the relationship between simulator and reality
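One simple reading of the last point, as a sketch: if the emulator gives a mean and variance for the simulator output at x, and the simulator-to-reality relationship is summarised by a discrepancy mean and variance, then (assuming additive, independent components, which is my simplification rather than a definition from the talk) the predictor combines them:

def predictor_moments(emulator_mean, emulator_var, discrepancy_mean, discrepancy_var):
    # Combine an emulator of the simulator with a model-discrepancy term into a
    # predictor of reality, assuming the two components are additive and independent.
    mean = emulator_mean + discrepancy_mean
    var = emulator_var + discrepancy_var
    return mean, var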

Validation

What can be meaningfully called validation?

• Validation should have the sense of demonstrating that something is right
• The simulator is inevitably wrong, so there is no meaningful sense in which we can validate it
• What about the emulator?
  • It makes statements like, “We give probability 0.9 to the output f(x) lying in the range [a, b] if the model is run with inputs x.”
  • This can be right in the sense that (at least) 90% of such intervals turn out to contain the true output

Validating the emulator

• Strictly, we can't demonstrate that the emulator actually is valid in that sense
• The best we can do is to check that the truth on a number of new runs lies appropriately within probability bounds
• And apply as many such checks as we feel we need to give reasonable confidence in the emulator's validity
• In practice, check it against as many (well-chosen) new runs as possible
• Do Q-Q plots of standardised residuals and other diagnostic checks
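A minimal sketch of these checks, assuming the emulator supplies a predictive mean and standard deviation for each held-out run (the function and variable names are illustrative; a Student-t reference distribution may be more appropriate when the emulator variance is estimated):

import numpy as np
from scipy import stats
import matplotlib.pyplot as plt

def validate_emulator(y_true, mu, sd, level=0.90):
    # y_true: simulator outputs on new, held-out runs
    # mu, sd: emulator predictive means and standard deviations at those inputs
    z = (y_true - mu) / sd                          # standardised residuals
    half_width = stats.norm.ppf(0.5 + level / 2.0)
    coverage = np.mean(np.abs(z) <= half_width)     # should be about `level`
    print(f"Empirical {level:.0%} interval coverage: {coverage:.2f}")
    stats.probplot(z, dist="norm", plot=plt)        # Q-Q plot against N(0, 1)
    plt.show()
    return coverage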

Validating a predictor

• The predictor is also a stochastic entity, so we can validate it in the same way
• Although getting enough observations of reality may be difficult
• We may have to settle for the predictor not yet having been shown to be invalid!

Validity, quality, adequacy

• So, a predictor/emulator is valid if the truth lies appropriately within its probability bounds
  • It could be conservative
  • We need severe testing tools for verification
• The quality of a predictor is determined by how tight those bounds are
  • Refinement versus calibration
• A predictor is adequate for purpose if the bounds are tight enough
  • If we are satisfied the predictor is valid over the relevant range, we can determine adequacy

Conclusion – terminology

• I would like to introduce the word 'predictor', alongside the already accepted 'emulator' and 'simulator'
• I would like the word 'validate' to be used in the sense I have used above
  • Not in the sense that Bayarri, Berger, et al. have applied it, which has more to do with fitness for purpose, and hence involves not just validity but quality
• Models can have many purposes, but validity can be assessed independently of purpose