Transcript
Model Calibration and Validation
Dr. Dawei Han, Department of Civil Engineering, University of Bristol, UK
Slide 1 of 55

Q = f(R, T, ...)
A mathematical model is used to represent the real system.
Slide 2 of 55

"All models are wrong, some are useful."
George Box (1919-2013), Princeton University (graduated from University College London); originator of the ARMA model.
Slide 3 of 55

White box model (glass box): F = ma; all necessary information is available.
Black box model: no a priori information; calibration and validation are needed.
Slide 4 of 55

Deterministic model: the same input gives the same output.
Stochastic model: the same input gives different outputs, because of randomness (a pdf for the input, the parameters, or the output).
Slide 5 of 55

All computer models are deterministic: the same input gives the same output.
Slide 6 of 55

Ensemble simulation: the input as a pdf, the output as a pdf.
Uncertainty in models and their parameters. Randomness enters 1) within each model (pdf for input, parameters, output) and 2) through committee models (combining models).
Slide 7 of 55

Ensemble weather simulation: does it represent the real probability distribution?
Slide 8 of 55

Climate models: which one to trust?
Slide 9 of 55

Is the natural system really stochastic?
Einstein (Germany/USA, 1879-1955), Nobel Prize 1921. Bohr (Danish physicist, 1885-1962), Nobel Prize 1922.
Slide 10 of 55

Quantum mechanics. Einstein: "God does not play dice." Bohr: "Einstein, stop telling God what to do."
http://www.aip.org/history/einstein/ae63.htm
Slide 11 of 55

Coin tossing: random?
Slide 12 of 55

"There is nothing random about this world." --- Prof. Diaconis, Stanford University
http://www-stat.stanford.edu/~cgates/PERSI/cv.html
Slide 13 of 55

Do hydrological systems appear stochastic only because insufficient information is available to us?
Slide 14 of 55

Hydraulic modelling in a data-rich world. Professor Paul Bates, University of Bristol. So more information is becoming available (e.g., remote sensing).
http://www.ggy.bris.ac.uk/staff/staff_bates.html
Slide 15 of 55

Not all information is useful. What counts as useful information? (Matlab user guide: Fuzzy toolbox)
Slide 16 of 55

Questions for a modeller: How complicated should the model be? What input data should be used? How long a record should be used for model development?
Slide 17 of 55

How complicated should the model be?
Slide 18 of 55

The data: a model that is too simple (underfitting) versus a model that is too complicated (overfitting).
Slide 19 of 55

A suitable model.
Slide 20 of 55

Occam's razor (Ockham's razor): one should not increase, beyond what is necessary, the number of entities required to explain anything. William of Ockham (1288-1348), Ockham village, Surrey, England.
Slide 21 of 55

"Make everything as simple as possible, but not simpler." (Einstein)
Slide 22 of 55

Model selection methods: cross validation, Akaike information criterion, Bayesian information criterion, ... (illustrative code sketches follow below)
Slide 23 of 55
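The underfitting/overfitting contrast of slides 19-20 can be made concrete with a small experiment. Below is a minimal sketch in Python (the data are made up for illustration): fitting polynomials of increasing degree to nine noisy points drives the training error towards zero, with the degree-8 fit playing the role of the "join the dots" model even though it generalises worst.

    import numpy as np

    # Hypothetical data: a noisy straight line, nine points
    rng = np.random.default_rng(0)
    x = np.linspace(0.0, 1.0, 9)
    y = 2.0 * x + rng.normal(0.0, 0.5, x.size)

    # Training error alone always rewards the most complicated model:
    # with 9 points, a degree-8 polynomial interpolates them exactly.
    for degree in (1, 2, 8):
        coeffs = np.polyfit(x, y, degree)
        mse = np.mean((np.polyval(coeffs, x) - y) ** 2)
        print(f"degree {degree}: training MSE = {mse:.3f}")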
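Of the model selection methods listed on slide 23, AIC and BIC penalise complexity directly instead of holding data out. A sketch under the usual Gaussian-error assumption, with constant terms dropped (the function name is mine):

    import numpy as np

    def aic_bic(y, y_hat, k):
        """AIC and BIC for a least-squares fit with k parameters,
        assuming Gaussian errors and dropping constant terms."""
        n = len(y)
        rss = np.sum((np.asarray(y) - np.asarray(y_hat)) ** 2)
        aic = n * np.log(rss / n) + 2 * k          # Akaike information criterion
        bic = n * np.log(rss / n) + k * np.log(n)  # Bayesian information criterion
        return aic, bic

Lower values are better; the k log n penalty in BIC grows with the record length, so BIC tends to pick simpler models than AIC on long records.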
Model calibration (training, learning)? The aim is to predict future data drawn from the same distribution.
http://www.cs.cmu.edu/~awm/
Slide 24 of 55

Holdout validation: 1) randomly choose 30% of the data as a test set; 2) the remainder is the training set; 3) perform regression on the training set; 4) estimate future performance with the test set.
http://www.cs.cmu.edu/~awm/
Slide 25 of 55

Model parameter estimation (fitting to the data): least squares method, maximum likelihood, maximum a posteriori, nonlinear optimisation, genetic algorithms, ...
Slide 26 of 55

Estimate future performance with the test set. Linear regression: Mean Squared Error = 2.4.
http://www.cs.cmu.edu/~awm/
Slide 27 of 55

Estimate future performance with the test set. Quadratic regression: Mean Squared Error = 0.9.
http://www.cs.cmu.edu/~awm/
Slide 28 of 55

Estimate future performance with the test set. Join the dots: Mean Squared Error = 2.2.
http://www.cs.cmu.edu/~awm/
Slide 29 of 55

The test set method. Positive: very simple. Negative: it wastes data (30% less data for model calibration), and if you do not have much data, the test set might just be lucky or unlucky.
Slide 30 of 55

Cross validation: repeated partitioning of a sample of data into training and testing subsets. Seymour Geisser (1929-2004), University of Minnesota.
http://en.wikipedia.org/wiki/Seymour_Geisser
Slide 31 of 55

Leave-one-out cross validation (linear regression): Mean Squared Error of 9 sets = 2.2 (single test: 2.4).
Slide 32 of 55

Leave-one-out cross validation (quadratic regression): Mean Squared Error of 9 sets = 0.962 (single test: 0.9).
Slide 33 of 55

Leave-one-out cross validation (join the dots): Mean Squared Error of 9 sets = 3.33 (single test: 2.2).
Slide 34 of 55

Leave-one-out cross validation. Positive: only one data point is wasted. Negative: more computation, and one test point might be too small.
Slide 35 of 55

k-fold cross validation (k = 3): randomly break the dataset into k partitions (in our example we'll have k = 3 partitions, coloured red, green and blue).
Slide 36 of 55

3-fold cross validation: for the red partition, train on all the points not in the red partition and find the test-set sum of errors on the red points. Do the same with the other two colours, then use the mean error of the three sets.
Slide 37 of 55

3-fold cross validation.
Slide 38 of 55

Other model selection methods: Akaike information criterion, Bayesian information criterion, ... AIC and BIC only need the training error.
Slide 39 of 55

Model data input selection methods: the Gamma test (model free), information theory (model free), cross validation, ...
Slide 40 of 55

Four example data sources: line, logistic function, sine, Mackey-Glass.
Slide 41 of 55

From the measured data.
Slide 42 of 55

Fit models to the data (cross validation): underfitting, overfitting.
Slide 43 of 55

The Gamma test: it estimates what proportion of the variance of the target value is caused by the unknown function and what proportion is caused by the random variable. Gamma is an estimate of the noise variance relative to the best possible model results.
Slide 44 of 55

500 points were generated from the function with added noise of variance 0.075; the Gamma-estimated noise variance is 0.073.
Slide 45 of 55

If Gamma is small, the output value is largely determined by the input variables. If Gamma is large: 1) some important input variables are missing; 2) there is too much measurement noise; 3) the data record is too short; 4) there are gaps in the data record.
Slide 46 of 55

Gamma Archive
http://users.cs.cf.ac.uk/Antonia.J.Jones/GammaArchive/IndexPage.htm
Gamma Test, Computer Science, Cardiff University, Prof. Antonia Jones.
Slide 47 of 55
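The four-step holdout recipe on slide 25 translates almost line for line into code. A minimal sketch with synthetic data (the 30% split ratio follows the slide; everything else is illustrative):

    import numpy as np

    rng = np.random.default_rng(1)
    x = rng.uniform(0.0, 1.0, 30)
    y = np.sin(2.0 * np.pi * x) + rng.normal(0.0, 0.3, x.size)  # made-up data

    # 1) Randomly choose 30% of the data as a test set; 2) the rest trains.
    idx = rng.permutation(x.size)
    n_test = int(0.3 * x.size)
    test, train = idx[:n_test], idx[n_test:]

    # 3) Perform regression on the training set (quadratic, as on slide 28).
    coeffs = np.polyfit(x[train], y[train], 2)

    # 4) Estimate future performance with the test set.
    test_mse = np.mean((np.polyval(coeffs, x[test]) - y[test]) ** 2)
    print(f"test MSE = {test_mse:.3f}")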
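Of the parameter estimation methods on slide 26, least squares is the workhorse. For a model that is linear in its parameters it reduces to solving the normal equations A'A beta = A'y; a sketch (variable and function names are mine):

    import numpy as np

    def least_squares(x, y):
        """Fit y = b0 + b1*x by ordinary least squares."""
        A = np.column_stack([np.ones_like(x), x])  # design matrix
        # lstsq solves the least-squares problem in a numerically stable way
        beta, *_ = np.linalg.lstsq(A, y, rcond=None)
        return beta  # [intercept, slope]

Under Gaussian noise the least squares solution coincides with the maximum likelihood estimate, which is why the two methods appear side by side on the slide.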
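The 3-fold procedure of slides 36-38 generalises to any k, and leave-one-out (slides 32-35) is simply the case k = n. A minimal sketch (function name is mine):

    import numpy as np

    def kfold_mse(x, y, degree, k, seed=0):
        """Mean test MSE of a degree-`degree` polynomial over k random folds."""
        rng = np.random.default_rng(seed)
        folds = np.array_split(rng.permutation(x.size), k)
        errors = []
        for fold in folds:  # each fold takes a turn as the test set
            train = np.setdiff1d(np.arange(x.size), fold)
            coeffs = np.polyfit(x[train], y[train], degree)
            errors.append(np.mean((np.polyval(coeffs, x[fold]) - y[fold]) ** 2))
        return np.mean(errors)

    # Leave-one-out cross validation is the special case k = x.size.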
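The Gamma test of slides 44-46 needs no model at all: for each point it compares squared distances to near neighbours in input space (delta) with the corresponding half squared output differences (gamma), and the intercept of the delta-gamma regression estimates the noise variance. A sketch of this neighbour-based idea (not the Cardiff reference implementation; the parameter p is the number of neighbours considered):

    import numpy as np
    from scipy.spatial import cKDTree

    def gamma_test(X, y, p=10):
        """Estimate the output noise variance from (X, y) data alone.

        For k = 1..p, delta_k is the mean squared distance to the k-th
        nearest neighbour in input space and gamma_k the half mean squared
        output difference; the regression intercept at delta = 0 is Gamma.
        """
        X = np.asarray(X, dtype=float)
        y = np.asarray(y, dtype=float)
        if X.ndim == 1:
            X = X[:, None]
        dist, idx = cKDTree(X).query(X, k=p + 1)  # column 0 is the point itself
        deltas = [np.mean(dist[:, k] ** 2) for k in range(1, p + 1)]
        gammas = [0.5 * np.mean((y[idx[:, k]] - y) ** 2) for k in range(1, p + 1)]
        slope, intercept = np.polyfit(deltas, gammas, 1)
        return intercept

The slide 45 experiment (500 points, added noise variance 0.075, estimate 0.073) is exactly this construction at work: a small Gamma means the inputs largely determine the output, while a large Gamma points to the causes listed on slide 46.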
HYDROLOGICAL PROCESSES, 2008.
Slide 48 of 55

Information theory: "A Mathematical Theory of Communication" (1948). Shannon (1916-2001), MIT.
Slide 49 of 55

Information entropy: a measure of the uncertainty associated with a random variable.
H(X) = -sum over x of p(x) log p(x)
Slide 50 of 55

Transinformation measures the redundant or mutual information between two random variables. It is described as the difference between the total entropy and the joint entropy: T(X, Y) = H(X) + H(Y) - H(X, Y).
[Figure: transinformation (roughly 0.4 to 0.8) plotted against the number of data (0 to 1400)]
Slide 51 of 55

"Prediction is very difficult, especially if it's about the future." Niels Bohr, Nobel laureate in physics.
Slide 52 of 55

Heraclitus (Ancient Greek): "You can't step in the same river twice." Change is real, and stability is illusory.
Slide 53 of 55

Energy is conserved, but entropy is always increasing.
Slide 54 of 55

Nonstationarity of the earth, the solar system, the universe.
Slide 55 of 55

The End. Thank you.
Slide 56 of 55
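The two quantities on slides 50-51 are short to compute from histogram estimates of the pdfs. A minimal sketch (function names and the bin count are mine; histogram estimators are biased for small samples, which is one reason a transinformation estimate changes with the number of data, as in slide 51's figure):

    import numpy as np

    def entropy(p):
        """Shannon entropy H = -sum p log2 p over the non-zero bins."""
        p = p[p > 0]
        return -np.sum(p * np.log2(p))

    def transinformation(x, y, bins=10):
        """T(X, Y) = H(X) + H(Y) - H(X, Y), estimated from a 2-D histogram."""
        counts, _, _ = np.histogram2d(x, y, bins=bins)
        pxy = counts / counts.sum()                   # joint pmf estimate
        px, py = pxy.sum(axis=1), pxy.sum(axis=0)     # marginal pmfs
        return entropy(px) + entropy(py) - entropy(pxy.ravel())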