
Advanced data assimilation methods
with evolving forecast error covariance
Four-dimensional variational analysis
(4D-Var)
Shu-Chih Yang (with EK)
Find the optimal analysis

Two estimates of the true temperature T_t:

$$T_1 = T_t + \varepsilon_1 \ (\text{forecast}), \qquad T_2 = T_t + \varepsilon_2 \ (\text{observation})$$

Best estimate of the true value:

• Least squares approach
Find the optimal weights to minimize the analysis error variance:

$$T_a = \frac{\sigma_2^2}{\sigma_1^2 + \sigma_2^2}\,T_1 + \frac{\sigma_1^2}{\sigma_1^2 + \sigma_2^2}\,T_2$$

• Variational approach
Find the analysis that will minimize a cost function measuring its distance to the background and to the observation:

$$J(T) = \frac{1}{2}\left[\frac{(T - T_1)^2}{\sigma_1^2} + \frac{(T - T_2)^2}{\sigma_2^2}\right], \qquad \frac{dJ}{dT} = 0 \ \text{for} \ T = T_a$$

Both methods give the same T_a!
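A quick numerical check that the two approaches agree (a sketch; the numbers below are assumed for illustration):

```python
import numpy as np

# Illustrative sketch; the values are assumed
T1, sigma1 = 2.0, 1.0    # forecast and its error standard deviation
T2, sigma2 = 4.0, 0.5    # observation and its error standard deviation

# Least squares: weights inversely proportional to the error variances
Ta_ls = (sigma2**2 * T1 + sigma1**2 * T2) / (sigma1**2 + sigma2**2)

# Variational: minimize J(T) by brute force over a fine grid
T = np.linspace(0.0, 6.0, 60001)
J = 0.5 * ((T - T1)**2 / sigma1**2 + (T - T2)**2 / sigma2**2)
Ta_var = T[np.argmin(J)]

print(Ta_ls, Ta_var)     # both ~= 3.6: the two approaches give the same Ta
```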

3D-Var

How do we find an optimal analysis of a 3-D field of model variable x_a, given a background field x_b and a set of observations y^o?

$$J(\mathbf{x}) = \underbrace{\frac{1}{2}(\mathbf{x}-\mathbf{x}_b)^T\mathbf{B}^{-1}(\mathbf{x}-\mathbf{x}_b)}_{J_b:\ \text{distance to forecast}} + \underbrace{\frac{1}{2}[\mathbf{y}^o - H(\mathbf{x})]^T\mathbf{R}^{-1}[\mathbf{y}^o - H(\mathbf{x})]}_{J_o:\ \text{distance to observations}}$$

To find the solution in 3D-Var directly, set ∇J(x_a) = 0 (so that J(x_a) = J_min) and solve

$$(\mathbf{B}^{-1} + \mathbf{H}^T\mathbf{R}^{-1}\mathbf{H})(\mathbf{x}_a - \mathbf{x}_b) = \mathbf{H}^T\mathbf{R}^{-1}[\mathbf{y}^o - H(\mathbf{x}_b)]$$

Usually solved as

$$(\mathbf{I} + \mathbf{B}\mathbf{H}^T\mathbf{R}^{-1}\mathbf{H})(\mathbf{x}_a - \mathbf{x}_b) = \mathbf{B}\mathbf{H}^T\mathbf{R}^{-1}[\mathbf{y}^o - H(\mathbf{x}_b)] \qquad \text{(Eq. 5.5.9)}$$
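A sketch of the direct solution with small numpy matrices (sizes and values are assumed; H is taken as linear):

```python
import numpy as np

# Toy sizes: 3-variable state, 2 observations (all values assumed)
B = np.eye(3)                            # background error covariance
R = np.diag([0.5, 0.5])                  # observation error covariance
H = np.array([[1.0, 0.0, 0.0],
              [0.0, 1.0, 0.0]])          # linear observation operator
xb = np.array([1.0, 2.0, 3.0])
yo = np.array([1.5, 1.8])

d = yo - H @ xb                          # innovation
# Solve (B^-1 + H^T R^-1 H)(xa - xb) = H^T R^-1 d
A = np.linalg.inv(B) + H.T @ np.linalg.inv(R) @ H
dx = np.linalg.solve(A, H.T @ np.linalg.inv(R) @ d)
xa = xb + dx
print(xa)
```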
Minimize the cost function J(x)

A descent algorithm is used to find the minimum of the cost function. This requires the gradient of the cost function, ∇J:

$$\delta J = \left(\frac{\partial J}{\partial \mathbf{x}}\right)^T \delta\mathbf{x}, \qquad \nabla J = \frac{\partial J}{\partial \mathbf{x}}$$

Ex: the "steepest descent" method, which iteratively steps in the direction of −∇J.
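Continuing the same toy system, a sketch of a steepest-descent minimization (the fixed step size alpha is an assumption; operational systems use line search or conjugate-gradient methods):

```python
import numpy as np

Binv = np.eye(3)                         # B^-1 for B = I
Rinv = 2.0 * np.eye(2)                   # R^-1 for R = 0.5 I
H = np.array([[1.0, 0.0, 0.0],
              [0.0, 1.0, 0.0]])
xb = np.array([1.0, 2.0, 3.0])
yo = np.array([1.5, 1.8])

def grad_J(x):
    """Gradient of the 3D-Var cost for a linear H."""
    return Binv @ (x - xb) - H.T @ Rinv @ (yo - H @ x)

x = xb.copy()                  # start the descent from the background
alpha = 0.3                    # fixed step size (assumed)
for _ in range(100):
    x -= alpha * grad_J(x)     # step opposite the gradient
print(x)                       # converges to the same xa as the direct solve
```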
4D-Var

J(x) is generalized to include observations at different times: find the initial condition such that its forecast best fits the observations within the assimilation interval.

[Figure: the previous forecast from x_b and the corrected forecast from x_a over the assimilation window t_0, ..., t_i, ..., t_n, with observations y^o distributed in time]
The cost function becomes

$$J(\mathbf{x}(t_0)) = \frac{1}{2}[\mathbf{x}(t_0)-\mathbf{x}_b(t_0)]^T\mathbf{B}_0^{-1}[\mathbf{x}(t_0)-\mathbf{x}_b(t_0)] + \frac{1}{2}\sum_{i=0}^{N}[\mathbf{y}_i^o - H(\mathbf{x}_i)]^T\mathbf{R}_i^{-1}[\mathbf{y}_i^o - H(\mathbf{x}_i)]$$

We need to derive ∇J(x(t_0)) in order to minimize J(x(t_0)).


Separate J(x(t_0)) into "background" and "observation" terms:

$$J = J_b + J_o, \qquad \frac{\partial J}{\partial \mathbf{x}(t_0)} = \frac{\partial J_b}{\partial \mathbf{x}(t_0)} + \frac{\partial J_o}{\partial \mathbf{x}(t_0)}$$
First, let's consider J_b(x(t_0)). Given a symmetric matrix A and a function J = ½ xᵀAx, the gradient is given by ∇J = Ax. Therefore

$$J_b = \frac{1}{2}[\mathbf{x}(t_0)-\mathbf{x}_b(t_0)]^T\mathbf{B}_0^{-1}[\mathbf{x}(t_0)-\mathbf{x}_b(t_0)] \;\Longrightarrow\; \frac{\partial J_b}{\partial \mathbf{x}(t_0)} = \mathbf{B}_0^{-1}[\mathbf{x}(t_0)-\mathbf{x}_b(t_0)]$$
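A quick finite-difference check of this gradient rule (sketch with an assumed random symmetric A):

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((3, 3))
A = 0.5 * (A + A.T)              # symmetrize so that grad J = A x holds
x = rng.standard_normal(3)

J = lambda x: 0.5 * x @ A @ x
grad = A @ x                     # analytic gradient

eps = 1e-6
fd = np.array([(J(x + eps * np.eye(3)[k]) - J(x - eps * np.eye(3)[k])) / (2 * eps)
               for k in range(3)])
print(np.allclose(grad, fd, atol=1e-5))  # True
```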
J_o is more complicated, because it involves the integration of the model:

$$J_o = \frac{1}{2}\sum_{i=0}^{N}[H(\mathbf{x}_i)-\mathbf{y}_i^o]^T\mathbf{R}_i^{-1}[H(\mathbf{x}_i)-\mathbf{y}_i^o]$$

If J = ½ yᵀAy, where y = y(x), then

$$\frac{\partial J}{\partial \mathbf{x}} = \left(\frac{\partial \mathbf{y}}{\partial \mathbf{x}}\right)^T\mathbf{A}\mathbf{y}, \qquad \text{where}\ \left(\frac{\partial \mathbf{y}}{\partial \mathbf{x}}\right)_{k,l} = \frac{\partial y_k}{\partial x_l}\ \text{is a matrix.}$$

Since x_i = M_i[x(t_{i-1})],

$$\frac{\partial(H(\mathbf{x}_i)-\mathbf{y}_i^o)}{\partial\mathbf{x}_0} = \mathbf{H}_i\frac{\partial\mathbf{x}_i}{\partial\mathbf{x}_0} = \mathbf{H}_i\mathbf{L}(t_0,t_i) = \mathbf{H}_i\mathbf{L}_{i-1}\mathbf{L}_{i-2}\cdots\mathbf{L}_0$$

and its transpose is

$$[\mathbf{H}_i\mathbf{L}_{i-1}\mathbf{L}_{i-2}\cdots\mathbf{L}_0]^T = \mathbf{L}_0^T\cdots\mathbf{L}_{i-2}^T\mathbf{L}_{i-1}^T\mathbf{H}_i^T = \mathbf{L}^T(t_i,t_0)\mathbf{H}_i^T$$

so that

$$\frac{\partial J_o}{\partial\mathbf{x}(t_0)} = \sum_{i=0}^{N}\mathbf{L}^T(t_0,t_i)\mathbf{H}_i^T\mathbf{R}_i^{-1}[H(\mathbf{x}_i)-\mathbf{y}_i^o]$$

Here H_iᵀR_i⁻¹[H(x_i) − y_i^o] is the weighted increment at observation time t_i, in model coordinates, and the adjoint model integrates each increment backwards to t_0.
Simple example: use the adjoint model to integrate backward in time over a window t_0, t_1, t_2, t_3, t_4 with observational increments d_0, ..., d_4:

$$\frac{\partial J_o}{\partial\mathbf{x}_0} = \mathbf{d}_0 + \mathbf{L}_0^T(\mathbf{d}_1 + \mathbf{L}_1^T(\mathbf{d}_2 + \mathbf{L}_2^T(\mathbf{d}_3 + \mathbf{L}_3^T\mathbf{d}_4)))$$

$$\frac{\partial J_b}{\partial\mathbf{x}_0} = \mathbf{B}_0^{-1}[\mathbf{x}(t_0)-\mathbf{x}_b(t_0)], \qquad \mathbf{d}_i = \mathbf{H}_i^T\mathbf{R}_i^{-1}[H(\mathbf{x}_i)-\mathbf{y}_i^o]$$

Start from the end!
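A minimal sketch of this backward ("start from the end") accumulation, assuming the TLM matrices L_i and increments d_i are given as numpy arrays:

```python
import numpy as np

def grad_Jo(L_list, d_list):
    """Accumulate dJo/dx0 = d0 + L0^T (d1 + L1^T (d2 + ...)),
    starting from the last observation time and sweeping backward."""
    g = d_list[-1]                       # start from the end!
    for L, d in zip(reversed(L_list), reversed(d_list[:-1])):
        g = d + L.T @ g                  # adjoint step back one interval
    return g

# Tiny assumed example: 2-variable model, 3 observation times (t0, t1, t2)
L_list = [np.array([[1.0, 0.1], [0.0, 1.0]])] * 2   # L0, L1
d_list = [np.ones(2), np.ones(2), np.ones(2)]       # d0, d1, d2
print(grad_Jo(L_list, d_list))
```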
• In each iteration, J is used
 to determine the
direction to search the Jmin.
• 4D-Var provides the best estimation of the
analysis state and error covariance is evolved
implicitly.
3D-Var vs. 4D-Var

[Figure: schematic of 3D-Var (single-time J_b and J_o) and 4D-Var (previous and corrected forecasts fitting observations y^o over the window t_0, ..., t_i, ..., t_n); from http://www.ecmwf.int/]

1. 4D-Var assumes a perfect model. It will give the same credence to older observations as to newer observations (algorithm modified by Derber, 1989).
2. The background error covariance is time-independent in 3D-Var, but evolves implicitly in 4D-Var.
3. In 4D-Var, the adjoint model is required to compute ∇J.
Practical implementation: use the incremental form

$$J(\delta\mathbf{x}_0) = \frac{1}{2}\delta\mathbf{x}_0^T\mathbf{B}_0^{-1}\delta\mathbf{x}_0 + \frac{1}{2}\sum_{i=0}^{N}[\mathbf{H}_i\mathbf{L}(t_0,t_i)\delta\mathbf{x}_0 - \mathbf{d}_i^o]^T\mathbf{R}_i^{-1}[\mathbf{H}_i\mathbf{L}(t_0,t_i)\delta\mathbf{x}_0 - \mathbf{d}_i^o]$$

where δx = x − x_b and d^o = y^o − H(x_b).

With this form, it is possible to choose a "simplification operator" S to solve the cost function in a low-dimensional space (i.e., to change the control variable): let w = S δx and minimize J(w).

The choice of the simplification operator:
• lower resolution
• simplification of the physical processes

Example of using the simplification operator: both the TLM and the adjoint use a low resolution and simplified physics, due to the limitation of the computational cost.
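A sketch of the incremental cost function and its gradient for a toy linear setup (the helper name is mine; propagators and innovations are assumed precomputed):

```python
import numpy as np

def incremental_J_and_grad(dx0, B0inv, H_list, L_list, Rinv_list, d_list):
    """Incremental 4D-Var cost and gradient. L_list[i] is the precomputed
    propagator L(t0, ti); d_list[i] is the innovation yo_i - H(xb_i)."""
    J = 0.5 * dx0 @ B0inv @ dx0
    grad = B0inv @ dx0
    for H, L, Rinv, d in zip(H_list, L_list, Rinv_list, d_list):
        r = H @ (L @ dx0) - d               # residual in observation space
        J += 0.5 * r @ Rinv @ r
        grad += L.T @ H.T @ (Rinv @ r)      # adjoint maps the residual back to t0
    return J, grad

# Tiny assumed example: 2-variable state, identity operators, 3 obs times
n, I = 2, np.eye(2)
J, g = incremental_J_and_grad(np.zeros(n), I, [I]*3, [I]*3, [I]*3, [np.ones(n)]*3)
print(J, g)
```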

Example with the Lorenz 3-variable model

Nonlinear model, x = [x_1, x_2, x_3]:

$$\frac{dx_1}{dt} = -px_1 + px_2, \qquad \frac{dx_2}{dt} = rx_1 - x_1x_3 - x_2, \qquad \frac{dx_3}{dt} = x_1x_2 - bx_3$$
Tangent linear model, δx = [δx_1, δx_2, δx_3]:

$$\mathbf{L} = \left.\frac{\partial M}{\partial\mathbf{x}}\right|_{\mathbf{x}_i} = \begin{bmatrix} -p & p & 0 \\ r - x_3 & -1 & -x_1 \\ x_2 & x_1 & -b \end{bmatrix}$$

Adjoint model, x* = [x*_1, x*_2, x*_3]:

$$\mathbf{L}^T = \left(\left.\frac{\partial M}{\partial\mathbf{x}}\right|_{\mathbf{x}_i}\right)^T = \begin{bmatrix} -p & r - x_3 & x_2 \\ p & -1 & x_1 \\ 0 & -x_1 & -b \end{bmatrix}$$

• The background state is needed in both L and Lᵀ (need to save the model trajectory).
• In a complex NWP model, it is impossible to write this matrix form explicitly. (A code sketch for the Lorenz case follows.)
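A minimal Python sketch of these operators (p = 10, r = 28, b = 8/3 are the standard Lorenz parameter values; the function names are mine):

```python
import numpy as np

p, r, b = 10.0, 28.0, 8.0 / 3.0   # standard Lorenz-63 parameters

def model_rhs(x):
    """Nonlinear Lorenz model: the right-hand side dx/dt."""
    x1, x2, x3 = x
    return np.array([-p * x1 + p * x2,
                     r * x1 - x1 * x3 - x2,
                     x1 * x2 - b * x3])

def tlm_matrix(x):
    """Tangent linear operator: the Jacobian of the RHS at trajectory point x."""
    x1, x2, x3 = x
    return np.array([[-p,      p,    0.0],
                     [r - x3, -1.0, -x1],
                     [x2,      x1,  -b]])

# The adjoint operator is simply the transpose, evaluated at the SAME
# trajectory point, which is why the model trajectory must be saved.
adj_matrix = lambda x: tlm_matrix(x).T
```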
Example of tangent linear and adjoint codes, using a forward scheme to integrate in time.

In the tangent linear model, the x_3 equation

$$\frac{\delta x_3(t+\Delta t) - \delta x_3(t)}{\Delta t} = x_2(t)\delta x_1(t) + x_1(t)\delta x_2(t) - b\,\delta x_3(t)$$

becomes

$$\delta x_3(t+\Delta t) = \delta x_3(t) + [x_2(t)\delta x_1(t) + x_1(t)\delta x_2(t) - b\,\delta x_3(t)]\Delta t \qquad \text{(forward in time)}$$

In the adjoint model, the corresponding statements are (backward in time):

$$x_3^*(t) = x_3^*(t) + (1 - b\Delta t)\,x_3^*(t+\Delta t)$$
$$x_2^*(t) = x_2^*(t) + (x_1(t)\Delta t)\,x_3^*(t+\Delta t)$$
$$x_1^*(t) = x_1^*(t) + (x_2(t)\Delta t)\,x_3^*(t+\Delta t)$$
$$x_3^*(t+\Delta t) = 0$$

* Try an example in Appendix B (B.1.15)
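A sketch of these statements as code, together with the dot-product test <L δx, x*> = <δx, Lᵀ x*> commonly used to verify an adjoint (variable names are mine):

```python
import numpy as np

b, dt = 8.0 / 3.0, 0.01

def tlm_x3_step(x, dx):
    """Forward-in-time tangent linear statement for x3."""
    x1, x2, _ = x
    dx1, dx2, dx3 = dx
    dx3_new = dx3 + (x2 * dx1 + x1 * dx2 - b * dx3) * dt
    return np.array([dx1, dx2, dx3_new])

def adj_x3_step(x, ax):
    """Backward-in-time adjoint of the statement above."""
    x1, x2, _ = x
    ax1, ax2, ax3 = ax          # ax3 plays the role of x3*(t+dt)
    ax1 += x2 * dt * ax3
    ax2 += x1 * dt * ax3
    ax3 = (1.0 - b * dt) * ax3  # x3*(t) contribution; then x3*(t+dt) = 0
    return np.array([ax1, ax2, ax3])

# Dot-product test: <L dx, ax> should equal <dx, L^T ax>
rng = np.random.default_rng(1)
x, dx, ax = rng.standard_normal((3, 3))
lhs = tlm_x3_step(x, dx) @ ax
rhs = dx @ adj_x3_step(x, ax)
print(np.isclose(lhs, rhs))    # True if the adjoint is coded correctly
```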

RMS error of 3D-Var and 4D-Var in the Lorenz model

Experiments: DA cycle with observations every 8Δt, R = 2I; 4D-Var assimilation window: 24Δt.

[Figure: RMS analysis error of 3D-Var and 4D-Var compared with the observation error level]
4D-Var in the Lorenz model (Kalnay et al., 2005): RMS analysis error vs. window length

Window (x Δt):            8     16    24    32    40    48    56    64    72
Fixed window:             0.59  0.59  0.47  0.43  0.62  0.95  0.96  0.91  0.98
Start with short window:  0.59  0.51  0.47  0.43  0.42  0.39  0.44  0.38  0.43
Impact of the window length
• Lengthening the assimilation window reduces the RMS analysis error up to a window of 32 steps.
• For longer windows, the error increases because the cost function has multiple minima.
• This problem can be overcome by the quasi-static variational assimilation approach (Pires et al., 1996), which starts from a shorter window and progressively increases the length of the window.
Schematic of multiple minima and increasing window size (Pires et al., 1996)

[Figure: cost function J(x) with local minima J_min1 and J_min2; successive windows 1, 2, ..., final, with the direct long-window minimization marked as failed]
Dependence of the analysis error on B_0 (Win = 8)

B:     B3D-Var  50% B3D-Var  40% B3D-Var  30% B3D-Var  20% B3D-Var  10% B3D-Var  5% B3D-Var
RMSE:  0.78     0.59         0.53         0.52         0.51         0.50         0.65

With still smaller amplitudes of B the analysis diverges (RMSE > 2.5).
• Since the forecast state from 4D-Var is more accurate than that from 3D-Var, the amplitude of B should be smaller than the one used in 3D-Var.
• Using a covariance proportional to B3D-Var and tuning its amplitude is a good strategy for estimating B.