www.ec.gc.ca [email protected] Verification of multimodel ensemble forecasts using the TIGGE dataset Laurence J. Wilson Environment Canada Anna Ghelli ECMWF With thanks to Marcel Vallée.
Outline
• Introduction
  – TIGGE goals and verification
• Status of verification of TIGGE ensembles
  – Standard methods
  – Spatial methods
• Precipitation verification project: plan and early results
• Summary
• SEE the extended abstract also

November 7, 2015

Verification and the goals of TIGGE
• Goals:
  – Enhance collaborative research
  – Enable evolution towards GIFS
  – Develop ensemble combination methods; bias removal
• Essential question: If we are going to move towards a GIFS, then we must demonstrate that the benefits of combined ensembles are worth the effort with respect to single-center ensembles. OR: Do we get a “better” pdf by merging ensembles?
• Verification
  – Relevant, user-oriented

Status of Verification of TIGGE ensembles
• Mostly model-oriented verification so far
  – Upper air data
  – Against analyses
  – Standard scoring and case studies
• Studies on the TIGGE website
  – Park et al., 2008
    • First study involving several months of data
    • Found modest improvement with combined ensembles, greatest benefits in tropics and lower atmosphere
    • “Advantage” of using one’s own analysis as truth
  – Pappenberger et al.
    • Case study of flooding event in Romania
    • User-oriented, Q-Q plots, RPS and RMSE main scores used
    • Multimodel ensemble has best average properties, ECMWF next.

Studies using TIGGE data (cont’d)
• Johnson and Swinbank, 2008
  – Study of calibration/combination methods
  – Used only 3 ensembles
  – MSLP and 2m temperature, but from analyses
  – Multimodel ensemble improves on individual ensembles, but not by much in general. More at 2m than 500 mb
• Matsueda 2008
  – Comparison of 5 combined ensembles vs ECMWF alone
  – RMSE skill and RPSS with ECMWF as standard forecast
  – Multimodel EPS outperforms ECMWF at medium and longer ranges.
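The RPSS used in the Matsueda comparison measures the ranked probability score (RPS) of a forecast relative to a reference forecast, here ECMWF alone. The following is a minimal sketch of that idea, assuming probabilities over ordered categories and the unnormalized RPS. The function names and the sample numbers are illustrative assumptions, not taken from the study.

```python
def rps(probs, obs_category):
    """Ranked probability score for a single forecast.

    probs: forecast probabilities for ordered categories (summing to 1).
    obs_category: index of the category that was observed.
    """
    cum_fcst = 0.0
    score = 0.0
    for k, p in enumerate(probs):
        cum_fcst += p
        # Cumulative observation is a step function at the observed category.
        cum_obs = 1.0 if k >= obs_category else 0.0
        score += (cum_fcst - cum_obs) ** 2
    return score


def rpss(fcst_probs, ref_probs, obs_categories):
    """Skill relative to a reference: 1 - mean(RPS_fcst) / mean(RPS_ref)."""
    n = len(obs_categories)
    rps_f = sum(rps(p, o) for p, o in zip(fcst_probs, obs_categories)) / n
    rps_r = sum(rps(p, o) for p, o in zip(ref_probs, obs_categories)) / n
    return 1.0 - rps_f / rps_r


# Hypothetical two-case example with three ordered categories.
multimodel = [[0.1, 0.3, 0.6], [0.7, 0.2, 0.1]]
reference = [[0.2, 0.4, 0.4], [0.5, 0.3, 0.2]]
observed = [2, 0]
skill = rpss(multimodel, reference, observed)
```

A positive value means the forecast scores better than the reference on the sample; zero means no improvement.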
Status of TIGGE-related verification
• Current efforts
  – Verification of surface variables?
  – This conference ---?
  – Studies using spatial methods
    • Ebert – application of CRA technique to ensemble forecasts. So far, only ECMWF.
    • Application of Wilks minimum spanning tree or T. Gneiting’s multi-dimensional rank histogram for TC centers. (idea stage)
  – Precipitation verification project:

Precipitation verification project
• Goal: to verify global 24h precipitation forecasts from all the ensembles in the TIGGE archive and combinations
• One region at a time, using highest density observations
• Canada and Europe so far
• Methodology
  – Cherubini et al upscaling, verify only where data available
  – Single station, nearest gridpoint where data is sparser
  – Kernel density fitting following Peel and Wilson to look at extremes of distributions.

Precipitation verification project: methodology - Europe
• Upscaling:
  – 1x1 gridboxes, limit of model resolution
  – Average obs over grid boxes, at least 9 stns per grid box (Europe data)
  – Verify only where enough data
  – Matches obs and model resolution locally
  – Answers questions about the quality of the forecasts within the capabilities of the model
  – Most likely users are modelers.
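The upscaling step above can be sketched roughly as follows: gauge totals are grouped into grid boxes, and a box average is kept only when the box holds at least 9 stations. This is a minimal sketch, not the project's code. It assumes that "1x1" means 1 degree by 1 degree in latitude and longitude, and the function name `upscale` and the tuple layout are hypothetical.

```python
import math
from collections import defaultdict

MIN_STATIONS = 9  # minimum gauges per grid box, as stated for the Europe data


def upscale(observations, box_size=1.0, min_stations=MIN_STATIONS):
    """Average gauge totals over grid boxes that hold enough stations.

    observations: iterable of (lat, lon, precip_mm) tuples.
    Returns {(box_lat, box_lon): mean_precip_mm}, keyed by the south-west
    corner of each box. Boxes with too few gauges are left out, so
    verification only happens where enough data is available.
    """
    boxes = defaultdict(list)
    for lat, lon, precip in observations:
        key = (math.floor(lat / box_size) * box_size,
               math.floor(lon / box_size) * box_size)
        boxes[key].append(precip)
    return {key: sum(vals) / len(vals)
            for key, vals in boxes.items()
            if len(vals) >= min_stations}


# Hypothetical data: nine gauges in one box, a single gauge in another.
gauges = [(45.0 + 0.1 * i, 7.5, float(i)) for i in range(9)] + [(50.2, 3.3, 12.0)]
box_means = upscale(gauges)
```

In this example only the box with nine gauges yields a value; the lone gauge at (50.2, 3.3) is dropped rather than treated as a box average.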
European Verification
- Upscaled observations according to Cherubini et al (2002)
- OBS from gauges in Spain, Portugal, France, Italy, Switzerland, Netherlands, Romania, Czech Republic, Croatia, Austria, Denmark, UK, Ireland, Finland and Slovenia
- At least 9 stns needed per grid box to estimate average
- 24h precip totals, thresholds 1, 3, 5, 10, 15, 20, 25, 30 mm
- One year (Oct 07 to Oct 08)

[Figure: Reliability – Summer 08 – Europe – 42h]
[Figure: Reliability – Summer 08 – Europe – 114h]
[Figure: Reliability – Winter 07-08 – Europe – 114h]
[Figure: ROC – Summer 08 – Europe – 42h]
[Figure: ROC – Summer 08 – Europe – 114h]

Precipitation verification project: methodology - Canada
• Single station verification
  – Canadian verification over 20 widely-spaced stations, only one station per gridbox; comparison of nearest gridpoint fcst to obs
  – Pointwise verification; we cannot upscale properly because we do not have the necessary data density.
  – Valid nevertheless as absolute verification of model predictions

Results – Canada – ROC curves – 24h
[Figure: two ROC curve panels, Hit Rate vs. False Alarm Rate, points labelled by probability threshold. Panel labels: ecmwf and msc. Values printed on the panels: 0.825 (0.838) and 0.785 (0.842)]

Results – Canada – ROC Curves – 144h
[Figure: two ROC curve panels, Hit Rate vs. False Alarm Rate, points labelled by probability threshold. Panel labels: msc and ecmwf. Values printed on the panels: 0.664 (0.66) and 0.703 (0.701)]

RMSE of pcpn probability – Canada – Oct 07 to Oct 08 – 20 stns
[Figure: panels for the 2.0 mm and 10 mm thresholds. BOM in blue (darker blue); ECMWF in red; UKMET in green; CMC in gray; NCEP in cyan (lighter blue)]

Verification of TIGGE forecasts with respect to surface observations – next steps
• Combined ensemble verification
• Other regions
  – Southern Africa should be next.
  – Non-GTS data is available
• Evaluation of extreme events – kernel density fitting to ensembles.
• Other high-density observation datasets such as SHEF in the US
• Other variables: TC tracks and related surface weather
• Use of spatial verification methods
• THEN maybe we will know the answer to the TIGGE question.

Issues for TIGGE verification
• Use of analyses as truth – advantage of one’s own model.
  Alternatives:
  – Each center’s own analysis
  – Analyses as ensemble (weighted or not)
  – Random selection from all analyses
  – Use “best” analysis; eliminate the related model from comparison
  – Average analysis (may have different statistical characteristics)
  – Model-independent analysis (restricted to data-rich areas, but that is where verification might be most important for most users)
• Problem goes away for verification against observations (as long as they are not qc’d with respect to any model)

[Figure: Park et al study – impact of analysis used as truth in verification]

Issues for TIGGE Verification (cont’d)
• Bias adjustment/calibration
  – Reason: to eliminate “artificial” spread in combined ensemble arising from systematic differences in component models
  – First (mean) and second (spread) moments
  – Several studies have been or are being undertaken
  – Results on benefits not conclusive so far
    • Due to too small sample for bias estimation?
  – Alternative: Rather than correcting bias, eliminate inter-ensemble component of bias and spread variation.

[Figure: ROC – Winter 07-08 – Europe – 42h]
[Figure: ROC – Winter 07-08 – Europe – 114h]
[Figure: Reliability – Winter 07-08 – Europe – 42h]

Results – Canada – Brier Skill, Resolution and Reliability
[Figure: Brier Skill and components - POP, ~6400 cases, 20 stns. Y axis: BSS, REL and RES (-0.4 to 0.2). X axis: Forecast day (1 to 10). Series: BSS, REL and RES for MSC and for ECMWF]
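For readers unfamiliar with the quantities in the last figure, the Brier score can be split into reliability (REL), resolution (RES) and uncertainty, and the Brier skill score (BSS) against sample climatology follows from them. The sketch below assumes the standard Murphy decomposition with cases grouped by distinct forecast probability. Whether the plotted components were computed or scaled this way is not stated on the slide, and the function name and sample data are hypothetical.

```python
from collections import defaultdict


def brier_decomposition(probs, outcomes):
    """Murphy decomposition of the Brier score.

    probs: forecast probabilities (e.g. fraction of members above a threshold).
    outcomes: 1 if the event occurred, else 0.
    Returns (bs, rel, res, unc, bss), where BS = REL - RES + UNC and
    BSS = (RES - REL) / UNC relative to sample climatology.
    Requires the event to occur in some but not all cases (UNC > 0).
    """
    n = len(probs)
    base_rate = sum(outcomes) / n
    bins = defaultdict(list)
    for p, o in zip(probs, outcomes):
        bins[p].append(o)  # group cases by distinct forecast value
    rel = res = 0.0
    for p, obs in bins.items():
        freq = sum(obs) / len(obs)  # observed frequency given forecast p
        rel += len(obs) * (p - freq) ** 2
        res += len(obs) * (freq - base_rate) ** 2
    rel /= n
    res /= n
    unc = base_rate * (1.0 - base_rate)
    bs = sum((p - o) ** 2 for p, o in zip(probs, outcomes)) / n
    bss = (res - rel) / unc
    return bs, rel, res, unc, bss


# Hypothetical sample: six probability-of-precipitation forecasts.
pop = [0.0, 0.0, 0.5, 0.5, 1.0, 1.0]
rain = [0, 0, 0, 1, 1, 1]
bs, rel, res, unc, bss = brier_decomposition(pop, rain)
```

Grouping by exact forecast value keeps the identity BS = REL - RES + UNC exact, which suits ensemble probabilities that take only a limited set of values.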