Lecture One:
Understanding the 3DVAR in a Practical Way
Jidong Gao
[email protected]
Center for Analysis and Prediction of Storms
University of Oklahoma
A simple example
• A variational data assimilation algorithm can involve:
  variational calculus;
  inverse problem theory;
  estimation theory;
  optimal control theory; and
  various computational theories.
• Can we present a relatively easy but comprehensive outline of variational data assimilation without getting bogged down in all those complicated theories?
• Let's begin with a very simple example that involves all the major aspects of variational data assimilation.
• Consider the temperature in our classroom. The room is air-conditioned. From the room's characteristics, the power of the AC, etc., we estimate (forecast) that the room temperature should be Tb = 18°C. We call this the background T. This estimate cannot be perfect, since there are various random disturbances: the door of the room may be opened at random, the air conditioner does not work perfectly to spec, etc. Let the standard deviation of the background be σb = 1°C.
• On the other hand, suppose there is a thermometer installed in this room, and its reading is, say, To = 20°C. We call this the observation. The reading is not perfect either; let the standard deviation of the temperature measured by the thermometer be σo = 0.5°C.
• The question is: what is the best estimate of the room temperature?
• The simple way is to make a weighted average,

$$T_a = \frac{\sigma_o^2 T_b + \sigma_b^2 T_o}{\sigma_o^2 + \sigma_b^2} = 19.6\,^{\circ}\mathrm{C}$$
• We know that the above weights are the optimal weights determined by minimum variance estimation theory, as you have learned from Drs. Carr and Xue's lectures.
Based on Bayesian estimation, or maximum likelihood estimation theory, we can show that the best estimate is obtained by minimizing the following cost function:

$$J = \frac{1}{2}\frac{(T - T_b)^2}{\sigma_b^2} + \frac{1}{2}\frac{(T - T_o)^2}{\sigma_o^2}$$
The minimizing T is the solution of the equation

$$\frac{dJ}{dT} = \frac{T - T_b}{\sigma_b^2} + \frac{T - T_o}{\sigma_o^2} = 0.$$
It should be easy to see that this is the maximum likelihood estimate. The answer is the same as the weighted average derived above, as the explicit solve below confirms.
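Carrying out that solve explicitly (a one-line rearrangement of the stationarity condition above):

$$\frac{T_a - T_b}{\sigma_b^2} + \frac{T_a - T_o}{\sigma_o^2} = 0 \;\Rightarrow\; T_a = \frac{\sigma_o^2 T_b + \sigma_b^2 T_o}{\sigma_b^2 + \sigma_o^2} = \frac{0.25 \times 18 + 1 \times 20}{1.25} = 19.6\,^{\circ}\mathrm{C}.$$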
Posterior error
• You may ask how good this estimate is. This is actually a crucial question: in the world of data assimilation, the estimate of the accuracy of the result is of the same importance as the result itself!
The error variance of the estimate is defined by

$$\sigma^2 = E\left[(T_a - T_t)^2\right],$$

where T_t is the true temperature. Going through some algebraic manipulations yields

$$\sigma^2 = \frac{\sigma_b^2 \sigma_o^2}{\sigma_b^2 + \sigma_o^2} = \frac{0.5^2}{1^2 + 0.5^2} = 0.2.$$

Obviously,

$$\sigma^2 < \sigma_b^2 = 1 \qquad \text{and} \qquad \sigma^2 < \sigma_o^2 = 0.5^2 = 0.25.$$

Why is the analysis closer to the observation?
(Background: 18°C, Obs: 20°C, Analysis: 19.6°C)
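The omitted algebra compresses to one standard identity: the precisions (inverse variances) add,

$$\frac{1}{\sigma^2} = \frac{1}{\sigma_b^2} + \frac{1}{\sigma_o^2} \;\Rightarrow\; \sigma^2 = \frac{\sigma_b^2 \sigma_o^2}{\sigma_b^2 + \sigma_o^2}.$$

And because σo < σb, the observation carries the larger weight, σb²/(σb² + σo²) = 0.8, which is why the analysis (19.6°C) lies closer to the observation (20°C) than to the background (18°C).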
Comments:
This simple example shows how to solve the data assimilation problem in the variational way: we need to minimize the cost function J. To do this, we have to calculate the gradient of the cost function with respect to the analysis variable T, dJ/dT, and set it to zero.
For real 3DVAR/4DVAR, the scalar T is replaced by a vector, and the two scalar error variances are replaced by error covariance matrices. The solution procedures are quite similar.
The 3DVAR Formulation
A general cost function is defined as

$$J = J_B + J_O + J_c = \frac{1}{2}(x - x_b)^T B^{-1}(x - x_b) + \frac{1}{2}[H(x) - y^o]^T R^{-1}[H(x) - y^o] + J_c$$

Goal: Find the analysis state x that minimizes J. Where:
x is the NWP model state;
xb is the background state;
B is the background error covariance matrix;
R is the observation error covariance matrix;
yo is the observation vector;
y = H(x) is the operator that brings the model state x to the observed variables (for 4DVAR, H includes the full prediction model);
Jc represents the dynamic constraints.
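For later reference (standard, though not written on the slide): with $\mathbf{H}$ denoting the Jacobian of $H$ linearized about the current state, the gradient that the minimization algorithm will need is

$$\nabla_x J = B^{-1}(x - x_b) + \mathbf{H}^T R^{-1}[H(x) - y^o] + \nabla_x J_c.$$

The transpose $\mathbf{H}^T$ is exactly what the adjoint technique, discussed at the end of this lecture, provides.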
What should we know before we can solve the 3DVAR problem?
• The cost function J measures the fit of the analysis x to the background xb and to the observations yo, while the dynamic constraints (Jc) are also satisfied in some way.
• What really is J? A vector, a single number, or a matrix? (It is a scalar, as the flow chart below notes.)
• What should we know before the minimization? The following ingredients:
• B: unknown, but it can be estimated. It is specified a priori; 3DVAR is thus also called an a priori estimate. B is vitally important!! It decides how observations spread to nearby grid points. However, B is also the most difficult ingredient to get: its dimension is huge (on the order of 10^10 to 10^14 elements) and its direct inversion is impossible. Simplification is necessary, and this has been a very active research area in data assimilation over the past 30 years. Within the data assimilation community there are two basic methods: 1) assume B is diagonal, which can be done only in spectral space (Parrish and Derber 1992); this approximation is not acceptable for grid-point models. 2) Model B by a parameterized formulation; this reduces the dimension, and the inversion of B can be avoided through judicious choices of control variables (Huang 2000; Purser et al. 2003a,b). A sketch of this second approach is given after this list.
• R: the observation error covariance matrix, which also includes the representativeness error. It is usually diagonal and can be determined "off-line" for each type of observation used.
• xb: the background state, which usually comes from a previous forecast.
• yo: the observations. Every new type of observation may have a positive or a negative impact on the whole 3DVAR system. Active research areas: OU, radar data; Wisconsin, satellite data.
• Jc: one or more equation constraints. This can also be a good research topic.
• y = H(x): the forward observation operator (including the interpolation operator). There is also a lot of research in this area.
With all of the above taken care of and coded, we can begin to think about the minimization.
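To illustrate the second approach to B, here is a minimal one-dimensional recursive-filter sketch in the spirit of Purser et al. (2003a,b). It is only an illustration under simplifying assumptions (the routine name and the coefficient alpha are mine; boundary treatment and amplitude normalization are omitted): repeated forward/backward sweeps approximate a Gaussian correlation model, one common way to apply the effect of B without ever forming or inverting the matrix.

subroutine recursive_filter(n, x, alpha)
  ! One forward and one backward sweep of a first-order recursive
  ! filter, applied in place to x(1:n). Repeated sweeps approximate
  ! a Gaussian correlation model.
  ! Sketch only: boundary conditions and amplitude normalization
  ! are omitted; alpha (0 < alpha < 1) sets the correlation length.
  integer :: n, i
  real :: x(n), alpha
  do i = 2, n                        ! forward sweep
     x(i) = alpha*x(i-1) + (1.0 - alpha)*x(i)
  end do
  do i = n-1, 1, -1                  ! backward sweep
     x(i) = alpha*x(i+1) + (1.0 - alpha)*x(i)
  end do
  return
end subroutine recursive_filter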
The procedure for minimization of J by iteration (flow chart):
1) Start from a first guess x = (u, v, p, t, …).
2) Calculate the cost function J (a scalar).
3) Calculate the gradient of the cost function, dJ/dx.
4) The minimization algorithm finds the new control variable x = (u, v, p, t, …):
$$x_{k+1} = x_k + \alpha f(\nabla J)$$
5) Check the convergence criterion: if NO, repeat the iteration loop from step 2); if YES, output x = (u, v, p, t, …) to the model forecast.
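To make the loop concrete, here is a minimal steepest-descent version of it, applied to the one-dimensional temperature example from the beginning of the lecture. The step size and tolerance are illustrative choices of mine; a real 3DVAR uses a conjugate-gradient or quasi-Newton algorithm instead of this simple update.

program descent_demo
  ! Steepest-descent iteration loop for the 1-D temperature
  ! cost function J = (T-Tb)^2/(2*sb2) + (T-To)^2/(2*so2).
  real :: t, tb, to, sb2, so2, grad, step
  integer :: k
  tb = 18.0; to = 20.0              ! background and observation
  sb2 = 1.0**2; so2 = 0.5**2        ! error variances
  t = tb                            ! first guess = background
  step = 0.1
  do k = 1, 100                     ! iteration loop
     grad = (t - tb)/sb2 + (t - to)/so2   ! gradient dJ/dT
     if (abs(grad) < 1.0e-5) exit         ! convergence criterion
     t = t - step*grad                    ! update the control variable
  end do
  print *, 'analysis T =', t              ! converges to 19.6
end program descent_demo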
From the flow chart, there are three important tasks for the minimization procedure:
1) Calculate the cost function.
2) Calculate the gradient of the cost function.
3) Select a minimization algorithm.
The first task was already discussed; the second task usually requires the use of the adjoint technique; and the third one is also crucial! Developing efficient minimization algorithms has been an active research topic in applied mathematics for the past 40 years. For us, we just need to pick a good one and know how to use it. You may find one in the book Numerical Recipes, online at www.nr.com.
Simple example of how to use a minimization algorithm

$$\begin{cases} x + y + z = 6 \\ 2x - y - z = 3 \\ x + 2y + z = 8 \end{cases}$$

To solve this in a variational way, first define a cost function; let it be

$$J = \frac{1}{2}(x + y + z - 6)^2 + \frac{1}{2}(2x - y - z - 3)^2 + \frac{1}{2}(x + 2y + z - 8)^2$$
• Then we need the gradient of the cost function; it is a vector with three components:

$$\frac{dJ}{dx} = 6x + y - 20, \qquad \frac{dJ}{dy} = x + 6y + 4z - 19, \qquad \frac{dJ}{dz} = 4y + 3z - 11.$$
Subroutine FCN(N, X, F)
  integer :: N
  real :: X(N), F
  ! Cost function J; the 0.5 factor matches the definition of J
  ! above and keeps F consistent with the gradient in GRAD
  F = 0.5*((X(1)+X(2)+X(3)-6.0)**2 + (2.0*X(1)-X(2)-X(3)-3.0)**2 &
         + (X(1)+2.0*X(2)+X(3)-8.0)**2)
  return; end

Subroutine GRAD(N, X, G)
  integer :: N
  real :: X(N), G(N)
  ! Gradient of J with respect to x(1)=x, x(2)=y, x(3)=z
  G(1) = 6.0*X(1) + X(2) - 20.0
  G(2) = X(1) + 6.0*X(2) + 4.0*X(3) - 19.0
  G(3) = 4.0*X(2) + 3.0*X(3) - 11.0
  return; end
Program main
  Integer, parameter :: n = 3
  Integer :: i, maxfn
  Real :: fvalue, x(n), g(n), xguess(n)
  Real :: dfred, gradtl
  External :: fcn, grad
  Do i = 1, n; xguess(i) = 0.0; end do   ! Provide the first guess
  dfred = 0.002      ! Accuracy criterion on the cost function
  gradtl = 1.0E-7    ! Accuracy criterion on the norm of the gradient
  maxfn = 50         ! Maximum number of function evaluations
  Call umcgg(fcn, grad, n, xguess, gradtl, maxfn, dfred, x, g, fvalue)  ! minimization algorithm
  print *, 'x=', x(1), x(2), x(3)
end
A minimization algorithm like this umcgg, one of the conjugate gradient methods, requires you to provide the subroutines FCN and GRAD for calculating J and its gradient. We quickly get the answer (x, y, z) = (3, 2, 1) after only two or three iterations; because the problem is simple, not many iterations are needed.
Comments:
• All variational data assimilation algorithms work in similar ways: you define a cost function, get its gradient, and feed them into a minimization algorithm along with a first guess of the solution.
• But large, real problems are not that easy to implement. One of the outstanding problems is how to calculate the gradient of the cost function efficiently. This brings in the adjoint technique, which allows us to efficiently calculate the transposes of the large matrices found in the gradient calculation, as illustrated below.
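A tiny example of why a transpose shows up (my notation, not from the slides): for a linear least-squares cost function

$$J = \frac{1}{2}\|Ax - b\|^2, \qquad \nabla J = A^T (Ax - b).$$

The adjoint technique evaluates products such as $A^T v$ (or $\mathbf{H}^T v$ in 3DVAR/4DVAR) directly, one operation at a time, without ever forming the huge matrix explicitly.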
• What is the adjoint, and how do we use it?
• It is only a mathematical tool to help you get the gradient of the cost function.
• In R. M. Errico's paper "What is an adjoint model?", it is said that "…the adjoint is used as a tool for efficiently determining the optimal solutions. Without this tool, the optimization problem (including minimization and maximization) could not be solved in a reasonable time for application to real-time forecasting". This is a good statement. But what does it mean exactly?
• A simple maximization example
Suppose we have 200 m of fence and want to use it to enclose a rectangular yard with the maximum possible area. How can we do it? Let x and y be the length and the width of the yard, respectively. Then we can define a cost function with a Lagrange multiplier λ, such as

$$J = xy + \lambda(2x + 2y - 200),$$

and the gradient of the cost function,

$$\frac{dJ}{dx} = y + 2\lambda, \qquad \frac{dJ}{dy} = x + 2\lambda.$$

The 'multiplier' λ is equivalent to the 'adjoint variable'; the role of this parameter is to help calculate the gradient of the cost function!
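For completeness (the slide stops at the gradient), setting both components to zero together with the constraint gives the expected answer:

$$y + 2\lambda = 0, \quad x + 2\lambda = 0 \;\Rightarrow\; x = y = -2\lambda; \qquad 2x + 2y = 200 \;\Rightarrow\; x = y = 50\ \mathrm{m},\ \lambda = -25,$$

i.e., the square yard encloses the maximum area (2500 m²).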