www.ec.gc.ca
[email protected]
Verification of multimodel ensemble
forecasts using the
TIGGE dataset
Laurence J. Wilson
Environment Canada
Anna Ghelli
ECMWF
With thanks to Marcel Vallée
Outline
• Introduction – TIGGE goals and
verification
• Status of verification of TIGGE
ensembles
– Standard methods
– Spatial methods
• Precipitation verification project: plan and
early results
• Summary
• See also the extended abstract
November 7, 2015
Verification and the goals of TIGGE
• Goals:
– Enhance collaborative research
– Enable evolution towards a Global Interactive Forecast System (GIFS)
– Develop ensemble combination methods; bias
removal
• Essential question: If we are going to move
towards a GIFS, then we must demonstrate
that the benefits of combined ensembles are
worth the effort with respect to single-center
ensembles. OR: Do we get a “better” pdf by
merging ensembles?
• Verification – Relevant, user-oriented
Status of Verification of TIGGE
ensembles
• Mostly model-oriented verification so far
– Upper air data
– Against analyses
– Standard scoring and case studies
• Studies on the TIGGE website
– Park et al., 2008
• First study involving several months of data
• Found modest improvement with combined ensembles, greatest
benefits in tropics and lower atmosphere
• “Advantage” of using one’s own analysis as truth
– Pappenberger et al.
• Case study of flooding event in Romania
• User-oriented; Q-Q plots, RPS and RMSE were the main scores used
• Multimodel ensemble has best average properties, ECMWF next.
Studies using TIGGE data (cont’d)
• Johnson and Swinbank, 2008
– Study of calibration/combination methods
– Used only 3 ensembles
– MSLP and 2 m temperature, but from analyses
– Multimodel ensemble improves on individual ensembles, but
not by much in general. More at 2 m than 500 mb
• Matsueda 2008
– Comparison of 5 combined ensembles vs ECMWF alone
– RMSE skill and RPSS with ECMWF as standard forecast
– Multimodel EPS outperforms ECMWF at medium and longer
ranges.
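The RPSS comparison described above, with one ensemble used as the standard (reference) forecast, can be sketched as follows. This is only an illustration under stated assumptions, not the code used in any of the cited studies: the function names are hypothetical, and the category boundaries are passed in as a plain list of thresholds.

```python
def cumulative_probs(members, thresholds):
    """Fraction of ensemble members at or below each threshold (forecast CDF)."""
    n = len(members)
    return [sum(m <= t for m in members) / n for t in thresholds]


def rps(members, obs, thresholds):
    """Ranked probability score for one forecast: summed squared difference
    between the forecast CDF and the observed step-function CDF."""
    fc = cumulative_probs(members, thresholds)
    ob = [1.0 if obs <= t else 0.0 for t in thresholds]
    return sum((f - o) ** 2 for f, o in zip(fc, ob))


def rpss(forecasts, references, observations, thresholds):
    """Skill of `forecasts` relative to `references`.

    1 is a perfect score, 0 means no gain over the reference, and negative
    values mean the reference is better. The reference RPS must be non-zero.
    """
    rps_f = sum(rps(f, o, thresholds) for f, o in zip(forecasts, observations))
    rps_r = sum(rps(r, o, thresholds) for r, o in zip(references, observations))
    return 1.0 - rps_f / rps_r


# Hypothetical usage: a combined ensemble scored against a single-center one.
thresholds = [1.0, 5.0, 10.0]            # mm, illustrative only
combined = [[0.0, 0.5, 2.0, 6.0]]        # one case, four members
single = [[4.0, 6.0, 8.0, 12.0]]
observed = [0.2]
skill = rpss(combined, single, observed, thresholds)
```

Because the RPS compares cumulative distributions, a forecast that misses by one category is penalized less than one that misses by several.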
Status of TIGGE – related verification
• Current efforts – Verification of surface
variables?
– This conference ---?
– Studies using spatial methods
• Ebert – application of CRA technique to ensemble
forecasts. So far, only ECMWF.
• Application of Wilks' minimum spanning tree or T.
Gneiting's multi-dimensional rank histogram to tropical cyclone (TC)
centers (idea stage).
– Precipitation verification project:
Precipitation verification project
• Goal: to verify global 24h precipitation forecasts from
all the ensembles in the TIGGE archive and
combinations
• One region at a time, using highest density
observations
• Canada and Europe so far
• Methodology
– Cherubini et al. upscaling; verify only where data are available
– Single station, nearest gridpoint, where data are sparser
– Kernel density fitting following Peel and Wilson to look at
extremes of distributions.
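Kernel density fitting to an ensemble can be sketched as below. This is a generic Gaussian-kernel illustration and not necessarily the Peel and Wilson formulation: the function names are hypothetical and the rule-of-thumb bandwidth is an assumption. For precipitation, a symmetric kernel also places some probability below zero, which a real application would need to handle.

```python
import math
import statistics


def silverman_bandwidth(members):
    """Rule-of-thumb bandwidth (an assumption for this sketch).
    Requires at least two members with non-zero spread."""
    n = len(members)
    return 1.06 * statistics.stdev(members) * n ** (-0.2)


def normal_cdf(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def exceedance_probability(members, threshold, bandwidth=None):
    """P(X > threshold) under a Gaussian kernel density fitted to the members."""
    h = bandwidth if bandwidth is not None else silverman_bandwidth(members)
    tail = sum(1.0 - normal_cdf((threshold - m) / h) for m in members)
    return tail / len(members)


# Hypothetical 24h precipitation members (mm).
members = [8.0, 9.0, 10.0, 11.0, 12.0]
p_above_13 = exceedance_probability(members, 13.0)
```

Unlike a simple member count, the fitted density gives a non-zero probability to thresholds beyond the largest member, which is why it is of interest for the extremes of the distribution.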
Precipitation verification project :
methodology - Europe
• Upscaling:
– 1x1 gridboxes, limit of model resolution
– Average obs over grid boxes, at least 9 stns
per grid box (Europe data)
– Verify only where enough data
– Matches obs and model resolution locally
– Answers questions about the quality of the
forecasts within the capabilities of the model
– Most likely users are modelers.
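The upscaling steps above can be sketched as follows. This is a minimal illustration, not the Cherubini et al. implementation: the function name, the tuple layout and the use of a simple arithmetic mean are assumptions, and the box size is taken to be in the same units as the station coordinates.

```python
import math
from collections import defaultdict


def upscale_observations(stations, box_size=1.0, min_stations=9):
    """Average station totals over grid boxes, keeping only boxes that
    contain at least `min_stations` reports.

    `stations` is an iterable of (lat, lon, precip_mm) tuples.
    Returns {(lat_index, lon_index): mean_precip_mm}, where the indices
    are the box's south-west corner divided by `box_size`.
    """
    boxes = defaultdict(list)
    for lat, lon, precip in stations:
        key = (math.floor(lat / box_size), math.floor(lon / box_size))
        boxes[key].append(precip)
    return {
        key: sum(values) / len(values)
        for key, values in boxes.items()
        if len(values) >= min_stations
    }


# Hypothetical data: nine gauges in one box, eight in another.
dense = [(45.0 + 0.1 * i, 7.5, float(i + 1)) for i in range(9)]
sparse = [(50.0 + 0.1 * i, 2.5, 10.0) for i in range(8)]
upscaled = upscale_observations(dense + sparse)
```

Only the box with nine gauges is returned; the box with eight is dropped, so no verification is attempted there.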
European Verification
• Upscaled observations according to Cherubini et al. (2002)
• OBS from gauges in Spain, Portugal, France, Italy, Switzerland, Netherlands, Romania, Czech Republic, Croatia, Austria, Denmark, UK, Ireland, Finland and Slovenia
• At least 9 stns needed per grid box to estimate average
• 24h precip totals, thresholds 1, 3, 5, 10, 15, 20, 25, 30 mm
• One year (Oct 07 to Oct 08)
Reliability – Summer 08 – Europe – 42h
Reliability – Summer 08 – Europe – 114h
Reliability – Winter 07-08 – Europe – 114h
ROC – Summer 08 – Europe – 42h
ROC – Summer 08 – Europe – 114h
Precipitation verification project:
methodology - Canada
• Single station verification
– Canadian verification over
20 widely-spaced stations,
only one station per gridbox;
comparison of nearest
gridpoint fcst to obs
– Pointwise verification; we cannot
upscale properly because we
do not have the necessary
data density.
– Valid nevertheless as
absolute verification of
model predictions
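Matching a station to its nearest gridpoint can be sketched as below for a regular latitude-longitude grid. The function name and grid description are hypothetical. "Nearest" is taken in grid-index space rather than by great-circle distance, and longitude wrap-around is not handled.

```python
def nearest_gridpoint_value(field, lat0, lon0, dlat, dlon, lat, lon):
    """Return the value of `field` at the grid point closest to (lat, lon).

    `field[i][j]` is valid at latitude lat0 + i * dlat and
    longitude lon0 + j * dlon.
    """
    i = round((lat - lat0) / dlat)
    j = round((lon - lon0) / dlon)
    if not (0 <= i < len(field) and 0 <= j < len(field[0])):
        raise ValueError("station lies outside the grid")
    return field[i][j]


# Hypothetical 2 x 3 forecast field on a 1-degree grid starting at 40N, 80W.
forecast = [[0.0, 1.0, 2.0],
            [3.0, 4.0, 5.0]]
value = nearest_gridpoint_value(forecast, 40.0, -80.0, 1.0, 1.0, 40.6, -78.7)
```

Repeating the lookup for every ensemble member gives the set of forecast values to compare with the single station observation.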
Results – Canada – ROC curves – 24h
[Figure: two ROC curve panels (hit rate vs. false alarm rate), labelled ecmwf and msc, with curve points labelled 0 to 1. Values printed on the panels: 0.825 (0.838) and 0.785 (0.842).]
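For reference, the points of a ROC curve and the area under it can be computed from probability forecasts as sketched below. This is a generic illustration with hypothetical function names. It uses a trapezoidal area and is not claimed to reproduce the values shown on the slides.

```python
def roc_points(probs, events, prob_thresholds):
    """(false alarm rate, hit rate) for each probability threshold.

    `probs` are forecast probabilities, e.g. the fraction of members
    exceeding a precipitation threshold; `events` are 1 where the event
    was observed and 0 otherwise. Both outcomes must occur in the sample.
    """
    n_event = sum(events)
    n_nonevent = len(events) - n_event
    points = []
    for p_t in prob_thresholds:
        hits = sum(1 for p, e in zip(probs, events) if p >= p_t and e)
        false_alarms = sum(1 for p, e in zip(probs, events) if p >= p_t and not e)
        points.append((false_alarms / n_nonevent, hits / n_event))
    return points


def roc_area(points):
    """Trapezoidal area under the curve, with (0, 0) and (1, 1) added."""
    pts = sorted(set(points) | {(0.0, 0.0), (1.0, 1.0)})
    return sum((x1 - x0) * (y0 + y1) / 2.0
               for (x0, y0), (x1, y1) in zip(pts, pts[1:]))


# Hypothetical sample of four forecasts.
probs = [0.9, 0.8, 0.1, 0.0]
events = [1, 1, 0, 0]
area = roc_area(roc_points(probs, events, [0.1, 0.5, 0.9]))
```

An area of 0.5 corresponds to no discrimination between events and non-events, and 1.0 to perfect discrimination.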
Results – Canada – ROC Curves – 144h
[Figure: two ROC curve panels (hit rate vs. false alarm rate), labelled msc and ecmwf, with curve points labelled 0 to 1. Values printed on the panels: 0.664 (0.66) and 0.703 (0.701).]
RMSE of precipitation probability – Canada – Oct 07 to Oct 08 – 20 stns
[Figure: two panels, for the 2.0 mm and 10 mm thresholds. BOM in blue (darker blue); ECMWF in red; UKMET in green; CMC in gray; NCEP in cyan (lighter blue).]
Verification of TIGGE forecasts with respect to
surface observations – next steps
• Combined ensemble verification
• Other regions – Southern Africa should be next.
– Non-GTS data is available
• Evaluation of extreme events – kernel density
fitting to ensembles.
• Other high-density observation datasets such
as SHEF in the US
• Other variables: TC tracks and related surface
weather
• Use of spatial verification methods
• THEN maybe we will know the answer to the
TIGGE question.
Issues for TIGGE verification
• Use of analyses as truth – advantage of one’s own
model. Alternatives:
– Each center's own analysis
– Analyses as ensemble (weighted or not)
– Random selection from all analyses
– Use “best” analysis; eliminate the related model from
comparison
– Average analysis (may have different statistical characteristics)
– Model-independent analysis (restricted to data-rich areas,
but that is where verification might be most important for most
users)
• Problem goes away for verification against
observations (as long as they are not quality-controlled with respect
to any model)
Park et al. study – impact of analysis used
as truth in verification
Issues for TIGGE Verification (cont’d)
• Bias adjustment/calibration
– Reason: to eliminate “artificial” spread in
combined ensemble arising from systematic
differences in component models
– First (mean) and second (spread) moments
– Several studies have been or are being undertaken
– Results on benefits not conclusive so far
• Due to too small sample for bias estimation?
– Alternative: Rather than correcting bias,
eliminate inter-ensemble component of bias
and spread variation.
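A first- and second-moment adjustment applied before pooling can be sketched as below. This is a generic illustration, not the method of any particular study: `mean_bias` and `spread_factor` are hypothetical parameters that would be estimated from a training sample. The linear form suits variables such as temperature; applied to precipitation it can produce negative values.

```python
import statistics


def adjust_ensemble(members, mean_bias, spread_factor=1.0):
    """Remove a mean bias and rescale the spread about the ensemble mean.

    `mean_bias` is the average (forecast - observation) error;
    `spread_factor` > 1 inflates the spread and < 1 deflates it.
    """
    ens_mean = statistics.fmean(members)
    return [(ens_mean - mean_bias) + spread_factor * (m - ens_mean)
            for m in members]


def combine(ensembles, biases, spread_factors):
    """Pool the adjusted members of several component ensembles."""
    pooled = []
    for members, bias, factor in zip(ensembles, biases, spread_factors):
        pooled.extend(adjust_ensemble(members, bias, factor))
    return pooled


# Hypothetical 2 m temperature members (deg C) from two centers.
pooled = combine([[1.0, 2.0, 3.0], [4.0, 6.0]],
                 biases=[1.0, 0.0],
                 spread_factors=[2.0, 1.0])
```

Without the bias step, the systematic offset between the two component ensembles would show up as extra spread in the pooled set.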
ROC – Winter 07-08 – Europe – 42h
ROC – Winter 07-08 – Europe – 114h
Reliability – Winter 07-08 – Europe – 42h
Results – Canada – Brier Skill,
Resolution and Reliability
[Figure: Brier Skill and components - POP; ~6400 cases, 20 stns. BSS, REL and RES for MSC and ECMWF plotted against forecast day (1 to 10); vertical axis from -0.4 to 0.2.]
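The Brier score and its reliability (REL) and resolution (RES) components can be computed as sketched below. The function names are hypothetical. Sample climatology is assumed as the reference for the skill score, which may differ from the reference used for the results above.

```python
from collections import defaultdict


def brier_decomposition(probs, events):
    """Return (BS, REL, RES, UNC), where BS = REL - RES + UNC.

    Forecasts are grouped by their distinct probability values, which makes
    the identity exact; ensemble probabilities take only a few such values.
    """
    n = len(probs)
    base_rate = sum(events) / n
    groups = defaultdict(list)
    for p, e in zip(probs, events):
        groups[p].append(e)
    rel = sum(len(v) * (p - sum(v) / len(v)) ** 2
              for p, v in groups.items()) / n
    res = sum(len(v) * (sum(v) / len(v) - base_rate) ** 2
              for v in groups.values()) / n
    unc = base_rate * (1.0 - base_rate)
    bs = sum((p - e) ** 2 for p, e in zip(probs, events)) / n
    return bs, rel, res, unc


def brier_skill_score(probs, events):
    """BSS relative to sample climatology: (RES - REL) / UNC.
    Undefined when the event always or never occurs (UNC = 0)."""
    _, rel, res, unc = brier_decomposition(probs, events)
    return (res - rel) / unc


# Hypothetical probability-of-precipitation forecasts and outcomes.
probs = [0.0, 0.0, 1.0, 1.0, 0.5, 0.5]
events = [0, 0, 1, 1, 1, 0]
bs, rel, res, unc = brier_decomposition(probs, events)
bss = brier_skill_score(probs, events)
```

Smaller REL and larger RES are both better, so a positive BSS requires the resolution term to exceed the reliability term.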