Document 7846935

Download Report

Transcript Document 7846935

NOAA THORPEX PI WORKSHOP
January 17-19, 2006, Camp Springs, MD
Assimilation of real observations using the
GFS model and the Maximum Likelihood
Ensemble Filter
Milija Zupanski and Arif Albayrak
Cooperative Institute for Research in the Atmosphere
Colorado State University
Fort Collins, CO 80523-1375
E-mail: [email protected]
NOAA THORPEX, JAN 17-19, 2006
Outline
 Maximum Likelihood Ensemble Filter (MLEF)
 Hessian preconditioning
 THORPEX related research
 Details of algorithm implementation on the NCEP computers
 Preliminary results with GFS T62L28 and real observations
 Future
NOAA THORPEX, JAN 17-19, 2006
Maximum Likelihood Ensemble Filter (MLEF)
(Zupanski 2005, MWR; Zupanski and Zupanski 2006, MWR)
- Estimate of the conditional mode of the posterior PDF
- Ensembles used to estimate the uncertainty of the conditional mode
- Non-differentiable minimization with Hessian preconditioning:
(Generalized conjugate-gradient and BFGS quasi-Newton algorithms)
- Augmented control variable: initial conditions, model bias, empirical
parameters, boundary conditions
- Not sample based
• Becomes the Reduced-Rank KF for linear operators and Gaussian PDF
• Error covariances computed from minimization, not from a sample
MLEF currently implemented with:
- Carbon and biosphere transport models: LPDM, PCTM, SIB (D. Zupanski)
- GEOS-5 column model (D. Zupanski)
- RAMS CSU non-hydrostatic model: Mesoscale, LES (D. Zupanski)
- CSU Shallow-water model (M. Zupanski)
- NCEP GFS model (M. Zupanski)
- Ensemble Cloud Model (NASA) + Super Parameterization (M. Zupanski)
NOAA THORPEX, JAN 17-19, 2006
Forecast (prior) error covariance
Ensemble
size
12
f
P
 [p
f
1
f
2
p
f
NE
 p ]
pif  M ( x a  pia )  M ( x a )
•
•
•
•
 p1f,i 
 f 
 p2,i 
f
pi  
 


pf 
 S ,i 
State-space
dimension
Control vector is the most likely forecast
Square-root used in the algorithm (full covariance can be calculated)
No sampling of error covariance
Provides dynamic continuity between the analysis and forecast
NOAA THORPEX, JAN 17-19, 2006
MLEF Analysis Step
Minimize cost function in the subspace spanned by ensemble perturbations
J 
1
1
[ x  x f ]T Pf-1 [ x  x f ]  [ yobs - H ( x )]T R 1 [ yobs - H ( x )]
2
2
Analysis increment is a linear combination of pf vectors
x a  x f  Pf1/ 2 w  w1 p1f  w2 p2f    wN pNf E
Similar to variational, however:
 Non-differentiable iterative minimization with superior preconditioning
 Solution in ensemble subspace (reduced rank)
 Analysis uncertainty estimate
NOAA THORPEX, JAN 17-19, 2006
Hessian preconditioning in MLEF
Change of variable (Hessian preconditioning)
xk  x  P
f

Z b  z1b
1/ 2
f
I  Z  Z 
b T
z2b   z Nb E
b
-1 / 2
ζk
Ensemble-size matrix


zib  R 1 / 2 H ( x f  pif ) - H ( x f )

Background (first guess)
ζ k 1  ζ k   k d k
k – iteration index
Physical space
x0
-gx
Preconditioning space
xmin

J=const.
-g
min
J=const.
Use ETKF transformation in Hessian preconditioning
NOAA THORPEX, JAN 17-19, 2006
Analysis (posterior) error covariance
Pa1 2  Pf1 2 ( I  Z T Z )-1 2
Pa1 2  [ p1a
p2a  pNa E ]
zi  R -1 2 H ( x a  pif )  R -1 2 H ( x a )
Analysis (minimizer)
• Analysis error covariance estimated from minimization algorithm
• At the minimum: Inverse Hessian = Analysis error covariance
Conjugate-gradient algorithm:
1
 2J 
a
1
P  G   2 
 x  x  min
Quasi-Newton algorithm:
P a  G 1  ViT1 V0TV0 Vi 1  i si siT
Vi  I   i yi siT
yi  gi 1  gi
si  xi 1  xi
NOAA THORPEX, JAN 17-19, 2006
THORPEX related research
MLEF with T62L28 GFS and real observations
• Code development completed and debugged
• GFS model + interpolation from model to observations (using the SSI code)
• Currently tested PREPBUFR observations, will include satellite/radar/lidar
Compare MLEF with other EnKF methods
• Evaluate if dynamical localization holds
• Ensemble size, robustness of the algorithm
• Code efficiency: script driven algorithm, exploits the NCEP code structure
Dual-resolution MLEF
• Control forecast in operational resolution, ensembles in coarse resolution
• Save computational time, possible practical advantages
• Test this capability, using two GFS resolutions (T62, T126)
• Evaluate the ensemble size issue
NOAA THORPEX, JAN 17-19, 2006
Implementation on the NCEP computers
(IBM SP - snow)
GFS model
MLEF Driving Script
Interface
Parallel programming
Single script
(ensemble or control)
MLEF programs
Column matrix
representation
On-the-fly LoadLeveler
script submission for
ensemble runs in parallel
Continuous data flow
using observation dates
Keep the original GFS and SSI directories,
compilation separate from the MLEF
Observation operator
SSI interpolation to
observation location
QC obs: Setuprhsall.f90
NOAA THORPEX, JAN 17-19, 2006
Practical implementation issues
MLEF is script driven (UNIX)
• Allows quick adjustment to the upgraded codes of GFS and SSI (GSI)
• Minimal adjustment of the operational GFS and SSI scripts
• Separate directories, original compilation
Parallel ensemble runs
• Defined number of ensembles and number of CPUs
• On-the-fly LL script optimally combines ensembles in groups
• Run scripts in parallel, check when done
Quality Control (QC) of observations
• SSI has a quality control of observations: Reject, Keep, Change obs error
• Each ensemble run has its own QC, thus not necessarily the same outcome as
the control
• Critically important to keep track of the QC-ed obs in calculation
NOAA THORPEX, JAN 17-19, 2006
MLEF with GFS model and real
observations: Preliminary Results
 Cold start on January 7, 2004, 06 UTC
 4 data assimilation cycles
 6-hour cycle
 200 ensembles
 Ensemble initiation: Random perturbation of spectral coefficients
 Observations from NCEP operational file:
surface pressure, winds, temperature ~ 360,000 obs per cycle
 Control variables: spectral coefficients - T, ln(ps), div, vort
 GFS T62L28 model
NOAA THORPEX, JAN 17-19, 2006
Innovation vector frequency histogram
[ HPf H T  R]1/ 2 [ y  H ( x a )]
Cycles 1-4
INNOVATION VECTOR HISTOGRAM
(GFS T62L28 + MLEF + real obs: 200 ens)
1.00
PDF
0.80
0.60
0.40
0.20
0.00
-5
-4
-3
-2
-1
0
1
2
3
4
5
Bin Categories
• Insufficient width => overestimation of Pf or R ?
• Width improves with cycles (only 4 cycles at present, however)
• Close to the Gaussian distribution N(0,1) => Adequate PDF assumptions,
MLEF does not show divergence - need more cycles
NOAA THORPEX, JAN 17-19, 2006
INFORMATION (TRANSFORM) MATRIX SPECTRUM
[ I  Z T Z ]1  U[ I  Λ]1U T  [ I  R1/ 2 HPf H T R 1/ 2 ]1
Cycles 1-4: Eigenvalues [ I  Λ]1
Information Matrix Power Spectrum
(GFS T62L28 + MLEF + real obs: 200 ens)
1
Eigenvalue
0.8
Cycle 1
Cycle 2
Cycle 3
Cycle 4
0.6
0.4
0.2
0
1
41
81
121
161
• Each cycle benefits from observations (eigenvalues < 1)
• Similar, steady improvement in cycles 2-4 => expected to continue
NOAA THORPEX, JAN 17-19, 2006
Future
Complete GFS + MLEF + real observations experiments
• Explore the need for error covariance localization
• Ensemble initiation techniques
• SSI observation error compatibility: disproportionate cost function
decrease
Comparison operational analysis and other EnKF
• Adopt the common post-processing and evaluation programs
Prepare for dual-resolution MLEF
• T126 control, T62 ensembles
• Adjust control forecast script
• Interpolation from T126 to T62
Add all operational observations
• Algorithmically simple, just change the observation masks
• Special attention to specific humidity and ozone: Gaussian PDF?
NOAA THORPEX, JAN 17-19, 2006