STATISTICAL METHODS TO DERIVE EXTREMES: BASICS, APPLICATIONS

E.I. KHLEBNIKOVA
VOEIKOV MAIN GEOPHYSICAL OBSERVATORY, SAINT-PETERSBURG, RUSSIA

Contact details: Dr. Elena I. Khlebnikova
Tel: +7 812 2472102
Fax: +7 812 2478661
e-mail: [email protected]

[Figure: probability density P with the left tail, the mean and the right tail marked]

Extremes: events connected with the behavior of a meteorological variable at the tails of its distribution.
Generalizations: a) meteorological field, b) complex of variables.

OUTLINE

Introduction
1. The Generalized Extreme Value (GEV) distribution.
1.1. Basics.
1.2. Methods of parameter estimation.
1.3. Drawbacks and extensions.
1.4. Choices at the application stage. Examples.
2. Threshold exceedances approach to derive extremes.
2.1. Basics.
2.3. The Generalized Pareto Distribution (GPD).
2.4. Example.
3. Climate change and extremes.
4. Conclusions.

CLASSIC EXTREME VALUE ANALYSIS

Basic set-up: $X^{(m)} = \max\{V_1, V_2, \ldots, V_m\}$, where $V_1, V_2, \ldots$ are i.i.d. with $F(x) = P\{V_i \le x\}$. Then, for any $m$,

$$P\{X^{(m)} \le x\} = [F(x)]^m .$$

When $m \to \infty$, one may use the Three Types Theorem:

$$P\left\{\frac{X^{(m)} - u_m}{b_m} \le x\right\} = F^m(b_m x + u_m) \to G(x),$$

where $G(x)$ belongs to one of three types.

Type 1 (Fisher-Tippett 1, Gumbel):
$$G_1(x) = \exp(-e^{-x}), \quad -\infty < x < \infty \qquad (1)$$

Type 2 (Fréchet):
$$G_2(x) = \begin{cases} 0, & x \le 0 \\ \exp(-x^{-\alpha}), & x > 0,\ \alpha > 0 \end{cases} \qquad (2)$$

Type 3 (Weibull):
$$G_3(x) = \begin{cases} \exp(-|x|^{\alpha}), & x < 0,\ \alpha > 0 \\ 1, & x \ge 0 \end{cases} \qquad (3)$$

GENERALIZED EXTREME VALUE (GEV) DISTRIBUTION

$$G(x) = \begin{cases} \exp\left\{-\left[1 - k(x - \mu)/\sigma\right]^{1/k}\right\}, & k \ne 0 \\ \exp\left\{-\exp\left[-(x - \mu)/\sigma\right]\right\}, & k = 0 \end{cases} \qquad (1)$$

$\mu$ - location parameter
$\sigma$ - scale parameter
$k < 0$ - "long-tailed" case
$k = 0$ - "exponential" tail
$k > 0$ - "short-tailed" case, finite endpoint

DOMAINS OF ATTRACTION OF EV DISTRIBUTIONS

Type I (Gumbel)
a) normal distribution, $F(x) = \Phi(x) = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{x} e^{-y^2/2}\, dy$
b) monotonic transformation of the normal distribution (log-normal and others)
c) Gumbel distribution, $F(x) = \exp(-e^{-x})$
d) exponential distribution, $F(x) = 1 - e^{-x}$

Type II (Fréchet)
a) Pareto distribution, $F(x) = 1 - a x^{-\alpha}$, $\alpha > 0$, $a > 0$
b) Fréchet distribution

Type III (Weibull)
a) uniform distribution, $F(x) = x$
b) truncated exponential distribution
c) Weibull distribution
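Ranges of definition for the examples above: Pareto distribution, $x \ge a^{1/\alpha}$; uniform distribution, $0 \le x \le 1$.

CHOICE OF DISTRIBUTION TYPE

1. Theoretical consideration of distribution functions ($x_F$ is the upper endpoint of $F$)

Type I (Gumbel): there exists $g(t) > 0$ such that
$$\frac{1 - F(t + x g(t))}{1 - F(t)} \to \exp(-x), \quad t \to x_F,\ x_F \le \infty$$

Type II (Fréchet):
$$\frac{1 - F(tx)}{1 - F(t)} \to x^{-\alpha}, \quad \alpha > 0,\ t \to \infty,\ x_F = \infty$$

Type III (Weibull):
$$\frac{1 - F(x_F - xh)}{1 - F(x_F - h)} \to x^{\alpha}, \quad \alpha > 0,\ h \to 0,\ x_F < \infty$$

2. Graphical method
3. Hypothesis testing

TYPES OF ASYMPTOTIC EXTREMAL DISTRIBUTION FUNCTION

[Figure: $-\ln(-\ln P)$ plotted against X for the Gumbel (k = 0), Weibull (k = +0.3) and Fréchet (k = -0.3) cases]

ESTIMATORS OF PLOTTING POSITIONS FOR EMPIRICAL PROBABILITIES

$$p_j = (j - 0.5)/n \qquad (1)$$
$$p_j = j/(n + 1) \qquad (2)$$
$$p_j = (j - 0.3)/(n + 0.4) \qquad (3)$$
$$p_j = (j - c)/(n + 1 - 2c) \qquad (4)$$

Formula (1) is the best in many respects.

HYPOTHESIS TESTING (ACCORDING TO GUMBEL)

$X_1 \le X_2 \le \ldots \le X_n$ is the ordered sample, $X^*$ is the median.
$H_0: X_i \sim G_1$

$$D = \ln\left[(X_n - X^*)/(X^* - X_1)\right] \qquad (1)$$

Under $H_0$, $D \sim N(\mu_D, \sigma_D)$ with

$$\mu_D = \ln\left\{\frac{\ln n}{\ln\left[-\ln(1 - 0.5^{1/n})\right] - \ln(\ln 2)}\right\} \qquad (2)$$
$$\sigma_D = \left[0.861 \ln(n) - 0.490\right]^{-1} \qquad (3)$$

If $|D - \mu_D| / \sigma_D > 1.96$, $H_0$ is rejected at the 5% significance level.

METHODS OF PARAMETER ESTIMATION

GUMBEL DISTRIBUTION (k = 0)

1. Linear approximation of empirical probabilities in double-logarithmic scale (graphical method)
$X_1 \le X_2 \le \ldots \le X_n$ - the ordered sample of maxima
$p_j$ - empirical probabilities
$(X_j, Y_j)$ - Gumbel plot, with $Y_j = -\ln(-\ln(p_j))$

2. Method of moments
Method of empirical reduced moments: for $X_1 \le X_2 \le \ldots \le X_n$ with empirical probabilities $p_j$ and $Y_j = -\ln(-\ln(p_j))$, let
$\tilde{m}_x, \tilde{\sigma}_x$ - mean value and standard deviation of X,
$\tilde{m}_y, \tilde{\sigma}_y$ - mean value and standard deviation of Y.
Then
$$\sigma = \tilde{\sigma}_x / \tilde{\sigma}_y, \qquad \mu = \tilde{m}_x - \tilde{m}_y\, \tilde{\sigma}_x / \tilde{\sigma}_y .$$

3. Maximum likelihood method
The ML estimate of $\theta$ maximizes the likelihood function
$$L_\theta(X_1, \ldots, X_n) = \ln P_\theta(X_1, \ldots, X_n),$$
considered, for fixed $X_1, \ldots, X_n$, as a function of $\theta$.

ML equations for the Gumbel distribution:
$$\mu = -\sigma \ln\left[\frac{1}{n} \sum_{j=1}^{n} \exp(-X_j/\sigma)\right] \qquad (1)$$
$$\sigma = \frac{1}{n} \sum_{j=1}^{n} X_j - \frac{\sum_{j=1}^{n} X_j \exp(-X_j/\sigma)}{\sum_{j=1}^{n} \exp(-X_j/\sigma)} \qquad (2)$$

GEV DISTRIBUTION

1. Method of moments
Probability-weighted moments, defined for $k = 0, 1, 2, \ldots$: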
$$b_k = E\left[X\, (F(X))^k\right] \qquad (1)$$

(In $b_k$ the index $k$ is a moment order, not the shape parameter.)

Unbiased estimates of $b_k$ (k = 0, 1, 2) from the ordered sample $X_1 \le \ldots \le X_n$:
$$b_0 = \frac{1}{n} \sum_{j=1}^{n} X_j, \qquad b_k = \frac{1}{n} \sum_{j=k+1}^{n} \frac{(j-1)(j-2)\cdots(j-k)}{(n-1)(n-2)\cdots(n-k)}\, X_j \qquad (2)$$

Estimates of the parameters of the GEV distribution:
$$k = 7.8590\,c + 2.9554\,c^2, \quad \text{where } c = \frac{2b_1 - b_0}{3b_2 - b_0} - \frac{\ln 2}{\ln 3},$$
$$\sigma = \frac{(2b_1 - b_0)\,k}{\Gamma(1 + k)\,(1 - 2^{-k})}, \qquad \mu = b_0 + \sigma\,\frac{\Gamma(1 + k) - 1}{k} \qquad (3)$$

2. Maximum likelihood method

DESIGN VALUES AND THEIR ACCURACY

T - return period
$$P\{X^{(m)} \le X_T\} = 1 - \frac{1}{T} \qquad (1)$$
$$X_T = \mu + \frac{\sigma}{k}\left[1 - \left(-\ln\left(1 - \frac{1}{T}\right)\right)^{k}\right], \quad k \ne 0 \qquad (2)$$
$$X_T = \mu - \sigma \ln\left[-\ln\left(1 - \frac{1}{T}\right)\right], \quad k = 0 \qquad (3)$$

Standard errors (Abild et al., 1992) for k = 0:
$$\mathrm{s.e.}(\tilde{X}_T) = \left(\sigma^2\,(0.608 z^2 + 0.514 z + 1.109)/n\right)^{1/2}, \quad \text{where } z = -\ln\left(-\ln\left(1 - \frac{1}{T}\right)\right) \qquad (4)$$

GEV-MODEL. COMMON PRACTICE

1. Obtaining the ordered sample of extremes.
2. Calculation of plotting positions.
3. Gumbel probability plotting.
4. Hypothesis testing.
5. Estimation of initial values for parameters (by the method of moments).
6. Estimation of parameters by the maximum likelihood method (for 2- and 3-parameter distributions).
7. Derivation of design values.
8. Risk analysis and decision-making.

MAXIMUM DISTRIBUTIONS FOR WIND SPEED SERIES (ST. PETERSBURG)

[Figure: $-\ln(-\ln(p))$ plotted against wind speed, m/s, for annual and monthly maxima]

Distinctions: a) time scale in use, b) location parameter (theory for a normal process: $\mu_{year} \approx \mu_{month} + \sigma \ln(12)$).

ANNUAL MINIMUM TEMPERATURE OF 5-DAY PERIOD (ST. PETERSBURG)

[Figure: $-\ln(-\ln(p))$ plotted against temperature for the periods 1936-65 and 1966-95]

ANNUAL MAXIMUM TEMPERATURE (ST. PETERSBURG)

[Figure: $-\ln(-\ln(p))$ plotted against temperature for the periods 1936-65 and 1966-95]

EXTENDING THE CLASSICAL METHOD

$X_{n,1} \ge X_{n,2} \ge \ldots \ge X_{n,r}$ - the r largest order statistics of an i.i.d. sample of size n, and $Y_j = (X_{n,j} - \mu)/\sigma$.

Joint density:
$$f(Y_1, \ldots, Y_r) = \sigma^{-r} \exp\left[-(1 - kY_r)^{1/k} - \left(1 - \frac{1}{k}\right) \sum_{j=1}^{r} \ln(1 - kY_j)\right], \quad k \ne 0 \qquad (1)$$
$$f(Y_1, \ldots, Y_r) = \sigma^{-r} \exp\left[-\exp(-Y_r) - \sum_{j=1}^{r} Y_j\right], \quad k = 0 \qquad (2)$$

The parameters $\mu$, $\sigma$, $k$ can be computed by the ML method.

DRAWBACKS (DEPENDENCE, SEASONALITY)

Dependence: the theory may be applied provided the parameters are adequately adjusted.
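Theoretical parameters for a stationary normal process:
$$b_T = (2 \ln(\nu T))^{-1/2} \ \text{- scale parameter}, \qquad u_T = (2 \ln(\nu T))^{1/2} \ \text{- location parameter}$$
$\nu$ - mean number of upcrossings of the zero level per unit time
$1/\nu$ - time scale used to make T dimensionless

Seasonality:
- partitioning the data by season,
- seasonal adjustment.

THE INFLUENCE OF NON-STATIONARITY ON EXTREME DISTRIBUTIONS

[Figure: $-\ln(-\ln(p))$ plotted against temperature for the annual maximum and for monthly maxima]

EXTREMES AND THE SCALE OF REGION

[Figure: standardized threshold C, defined by the probability that the maximum of X over the region does not exceed C, plotted against the negative logarithm of the exceedance probability for a fixed point, the zone 35-80°N and the Northern Hemisphere]

Standardized threshold of non-exceedance for monthly mean air temperature. January, Northern Hemisphere. According to Khlebnikova, 1987.

THRESHOLD EXCEEDANCES APPROACH

[Figure: a series over the period T with the threshold u and the excesses above it]

Let $X \sim F$, with $x_F$ the upper endpoint of $F$. The distribution of excesses over the threshold $u$ is
$$F_u(y) = P\{X - u \le y \mid X > u\} = \frac{F(u + y) - F(u)}{1 - F(u)} .$$

Theorem: as $u \to x_F$,
$$F_u(y) \approx H(y, \sigma_u, k), \quad \text{where } H(y, \sigma, k) = 1 - \left(1 - \frac{ky}{\sigma}\right)^{1/k} \text{ is the GPD.}$$

GENERALIZED PARETO DISTRIBUTION

$$H(x, \sigma, k) = \begin{cases} 1 - (1 - kx/\sigma)^{1/k}, & k \ne 0 \\ 1 - \exp(-x/\sigma), & k = 0 \end{cases} \qquad (1)$$

$\sigma > 0$ - scale parameter, $k$ - shape parameter
$k \le 0$: $0 \le x < \infty$
$k > 0$: $0 \le x \le \sigma/k$

$$EX = \sigma/(1 + k), \quad k > -1 \qquad (2)$$
$$\mathrm{Var}\,X = \sigma^2/\left((1 + k)^2 (1 + 2k)\right), \quad k > -\tfrac{1}{2} \qquad (3)$$

ESTIMATION OF GPD PARAMETERS

a) Linear approximation based on the mean excess function
$$E[X - u \mid X > u] = \frac{\sigma}{1 + k} - \frac{k}{1 + k}\,u \qquad (1)$$

b) Method of moments
$Y = X - u$; $\bar{Y}$, $s^2$ - sample mean and variance of Y
$$k = \frac{1}{2}\left(\frac{\bar{Y}^2}{s^2} - 1\right); \qquad \sigma = \frac{1}{2}\,\bar{Y}\left(\frac{\bar{Y}^2}{s^2} + 1\right) \qquad (2)$$

c) Method of probability-weighted moments
$$k = \frac{b_0}{2b_1 - b_0} - 2; \qquad \sigma = (1 + k)\,b_0 \qquad (3)$$
where $b_0, b_1$ are the probability-weighted moments of Y.

d) Maximum likelihood method

POISSON-GPD MODEL (combination of properties of exceedances)

1. The number of exceedances N of the level u has a Poisson distribution with mean $\lambda$.
2. The excess values $y_1, \ldots, y_n$ over the threshold u follow the GPD.

Estimation of design values $X_T$ (with $\lambda$ taken as the mean number of exceedances per year when T is in years):
$$X_T = u + \frac{\sigma}{k}\left(1 - (\lambda T)^{-k}\right), \quad k \ne 0$$
$$X_T = u + \sigma \ln(\lambda T), \quad k = 0$$

EXAMPLE

Data: the set of daily mean temperatures in Arkhangelsk (65°N, 40°E), 1936-2001, January.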
Purpose: to derive design values for the minimum.
Methods in use:
1) based on the GEV distribution for the monthly minimum,
2) based on the GPD for excesses.

APPROXIMATION BY GEV METHOD

[Figure: $-\ln(-\ln(p))$ plotted against temperature; empirical points and the fitted Weibull-type curve, k = 0.34]

Monthly minimum daily mean air temperature. January. Arkhangelsk.

GPD MODEL FOR EXCEEDANCES

[Figure: number of exceedances and mean excess plotted against the threshold level, from -18 to -38 °C]

ESTIMATES OF QUANTILES BASED ON DIFFERENT THRESHOLDS

[Figure: quantile $X_T$ plotted against the threshold for T = 10, 20, 40 and 100]

COMPARATIVE ESTIMATES OF QUANTILES USING ANNUAL MINIMA AND GPD DISTRIBUTIONS

[Figure: quantile $X_T$ plotted against the return period T; curves for the minimum and for thresholds of -22, -23, -25, -27 and -35 °C]

UNCERTAINTY OF ESTIMATION ABOUT EXTREME QUANTILES

[Figure: profile log-likelihood plotted against $X_T$ for T = 25, 50 and 100]

Profile likelihood plots for the T-year return value $X_T$ (flow rates from the River Nidd). Adapted from Davison and Smith (1990). Poisson-GPD model:
$$X_T = u + \frac{\sigma}{k}\left[1 - (\lambda T)^{-k}\right], \quad u = 100$$

CLIMATE CHANGES IN ORDER STATISTICS

[Figure: annual minimum temperature, °C, plotted against year, 1930-2010, with a linear trend of 1.5 °C/10 years]

Annual minimum air temperature. Yakutsk (Russia), 1936-2001.

CHANGES IN EXTREME QUANTILES. According to I. Matyasovszky (2000)

[Figure: quantile, °C, plotted against year, 1900-2000, for Pecs, Szeged, Miscolc and Budapest]

Daily minimum temperatures in Hungary (winter, q = 0.05).

EXTREMES AND CLIMATE CHANGE

1. The problem of interpreting the concept of "return period" under climate change conditions.
2. Combination of EV models with models of inter- and intra-annual variability.
3. Application of the Monte Carlo technique based on well-developed stochastic models of meteorological processes, taking climate forecasts into account.

CONCLUSIONS

1. The most commonly used approach to derive extremes is based on the GEV distribution for maxima of meteorological series.
2. There are many fitting methods for calculating the parameters of the GEV distribution. Modifications of the method of moments can be recommended for estimating initial values of the parameters.
3. The most efficient estimates of the parameters are provided by the maximum likelihood method. Computational algorithms implementing this method are available.
4. An alternative approach to derive extremes is based on considering threshold exceedances and using the GPD for excesses.
5. Uncertainty in the estimated extreme value parameters has to be taken into account when interpreting these characteristics for applied purposes.
6. It is recommended to use the maximum likelihood method accompanied by the Monte Carlo technique for evaluating confidence intervals.
7. It is necessary to develop new approaches to the interpretation of climate variability for applications under changing climate conditions. Such approaches should be based on combining extreme value models with advanced statistical models of meteorological processes on different time and space scales.

REFERENCES

1. Gumbel, E.J. (1958). Statistics of extremes. Columbia Univ. Press, New York.
2. Cramér, H., and M.R. Leadbetter (1967). Stationary and related stochastic processes. Sample function properties and their applications. John Wiley, New York.
3. Leadbetter, M.R., G. Lindgren, and H. Rootzén (1983). Extremes and related properties of random sequences and processes.
4. Davison, A.C., and R.L. Smith (1990). Models for exceedances over high thresholds (with discussion). J. R. Statist. Soc., 52, 393-442.
5. Reiss, R.-D., and M. Thomas (2001). Statistical modeling of extreme values from insurance, finance, hydrology, and other fields. Second edition. Birkhäuser.
6. Buishand, T.A. (1989). Statistics of extremes in climatology. Statistica Neerlandica, 43, 1-30.
7. Farago, T., and R.W. Katz (1990). Extremes and design values in climatology. Report No. WCAP-14, WMO/TD-No. 386, World Meteorological Organization, Geneva.
8. Palutikof, J.P., B.B. Brabson, D.H. Lister, and S.T. Adcock (1999). A review of methods to calculate extreme wind speeds. Meteorological Applications, 6, 119-132.
9. Khlebnikova, E.I., I.A. Sall, and E.E. Sibir (1988). On the use of space characteristics of excursions of meteorological fields for climate change analysis (in Russian). Trudy GGO, 516, 110-120.
10. Matyasovszky, I. (2000). A method to estimate temporal behavior of extreme quantiles. Időjárás, 104, No. 1, 43-51.

WEB PAGES

1. http://www.esig.ucar.edu/extremevalues/extreme.html, including "Lecture notes on environmental statistics" by Richard Smith (Chapter 8 on extremes)
2. http://www.cru.uea.uk.projects/mice/extremes_description.pdf

SOFTWARE

1. Xtremes: http://www.xtremes.de Windows software for statistical analysis of extremes.
2. Alec McNeil's S-Plus routines: http://www.math/ethz.ch/~mcneil/software.html
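
ILLUSTRATIVE SKETCH

As a complement to the packages listed above, the Gumbel (k = 0) part of the common practice for the GEV model can be sketched in a few lines of Python using only the standard library. This is a minimal sketch under stated assumptions, not the author's implementation: it uses plotting-position formula (1), the method of empirical reduced moments, the k = 0 design-value formula and the k = 0 standard-error formula quoted from Abild et al. (1992). All function names are hypothetical, the data are synthetic, and applying the standard-error formula to moment-based (rather than maximum likelihood) estimates should be treated as indicative only.

```python
import math
import random


def plotting_positions(n):
    """Empirical probabilities p_j = (j - 0.5) / n for j = 1..n."""
    return [(j - 0.5) / n for j in range(1, n + 1)]


def sample_mean_std(values):
    """Sample mean and standard deviation (n - 1 in the denominator)."""
    n = len(values)
    mean = sum(values) / n
    var = sum((v - mean) ** 2 for v in values) / (n - 1)
    return mean, math.sqrt(var)


def fit_gumbel_reduced_moments(maxima):
    """Gumbel location and scale by the method of empirical reduced moments."""
    x = sorted(maxima)
    # Reduced variate Y_j = -ln(-ln(p_j)) at the plotting positions.
    y = [-math.log(-math.log(p)) for p in plotting_positions(len(x))]
    m_x, s_x = sample_mean_std(x)
    m_y, s_y = sample_mean_std(y)
    scale = s_x / s_y
    location = m_x - m_y * scale
    return location, scale


def design_value(location, scale, return_period):
    """T-year design value for the Gumbel case (k = 0)."""
    return location - scale * math.log(-math.log(1.0 - 1.0 / return_period))


def design_value_std_error(scale, n, return_period):
    """Standard error for k = 0, as quoted from Abild et al. (1992)."""
    z = -math.log(-math.log(1.0 - 1.0 / return_period))
    return math.sqrt(scale ** 2 * (0.608 * z ** 2 + 0.514 * z + 1.109) / n)


def simulate_gumbel(n, location, scale, seed=0):
    """Hypothetical helper: synthetic maxima by inverting the Gumbel CDF."""
    rng = random.Random(seed)
    # Clamp away from 0 so that log(-log(u)) is always defined.
    return [location - scale * math.log(-math.log(max(rng.random(), 1e-12)))
            for _ in range(n)]


if __name__ == "__main__":
    # Synthetic stand-in for 60 years of annual maxima (not real data).
    maxima = simulate_gumbel(60, location=20.0, scale=3.0)
    loc, scale = fit_gumbel_reduced_moments(maxima)
    for T in (10, 50, 100):
        x_t = design_value(loc, scale, T)
        se = design_value_std_error(scale, len(maxima), T)
        print(f"T={T:>3}: X_T = {x_t:.1f} +/- {1.96 * se:.1f}")
```

In practice the moment estimates would serve only as initial values for a maximum likelihood fit, and the Gumbel hypothesis would be tested before the k = 0 formulas are used.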