Statistical Method for long

Statistical Methods
for long-range forecast
By Syunji Takahashi
Climate Prediction Division
JMA
Let’s think about the chaos of our atmosphere
using the Lorenz system
Lorenz equations (nonlinear equations)
$\frac{dX}{dt} = -10X + 10Y$
$\frac{dY}{dt} = 28X - Y - XZ$
$\frac{dZ}{dt} = -\frac{8}{3}Z + XY$
: approximate equations of convection
: represent the essential nature of our atmosphere (chaotic feature)
: easy to solve on a PC
X: a component of the stream function
Y, Z: two components of temperature
Lorenz system making chaos!
[Figure: Trajectory of the solution on the X-Z plane (axes: x, z)]
[Figure: Time series of solution X for two solutions with slightly different initial values]
Features of the solution
Circling alternately around two centers (Lorenz attractor)
With no fixed period
A small difference soon becomes large
(Chaos)
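The sensitivity to initial values can be checked with a few lines of code. The following is a minimal sketch, assuming a fourth-order Runge-Kutta scheme, a time step of 0.01, a starting point of (1, 1, 1) and a perturbation of 1e-6 in X. None of these choices comes from the slides; only the coefficients 10, 28 and 8/3 do.

```python
def lorenz(state, sigma=10.0, r=28.0, b=8.0 / 3.0):
    """Right-hand side of the Lorenz equations."""
    x, y, z = state
    return (-sigma * x + sigma * y, r * x - y - x * z, -b * z + x * y)


def rk4_step(state, dt):
    """Advance the state by one time step with the classical RK4 scheme."""
    k1 = lorenz(state)
    k2 = lorenz(tuple(s + 0.5 * dt * k for s, k in zip(state, k1)))
    k3 = lorenz(tuple(s + 0.5 * dt * k for s, k in zip(state, k2)))
    k4 = lorenz(tuple(s + dt * k for s, k in zip(state, k3)))
    return tuple(
        s + dt / 6.0 * (a1 + 2.0 * a2 + 2.0 * a3 + a4)
        for s, a1, a2, a3, a4 in zip(state, k1, k2, k3, k4)
    )


def integrate_x(state, n_steps, dt=0.01):
    """Return the time series of X, including the initial value."""
    xs = [state[0]]
    for _ in range(n_steps):
        state = rk4_step(state, dt)
        xs.append(state[0])
    return xs


# Two runs whose initial X differs by only 1e-6 (assumed perturbation size).
x_a = integrate_x((1.0, 1.0, 1.0), 3000)
x_b = integrate_x((1.0 + 1e-6, 1.0, 1.0), 3000)

early_gap = max(abs(a - b) for a, b in zip(x_a[:200], x_b[:200]))
late_gap = max(abs(a - b) for a, b in zip(x_a[2000:], x_b[2000:]))
print("largest gap in the first 200 steps:", early_gap)
print("largest gap after step 2000:", late_gap)
```

The two runs stay close at first and then separate completely, which is the behaviour described in the list above.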
Predictability Problem
Predicting the average value within the period
[Figure: Time series of the solution, with the averaging period marked]
[Figure: Initial Disturbance and Predicted Value (x-axis: Initial Disturbance, y-axis: Predicted Value)]
[Figure: Probability Density of Predicted Value (x-axis: Predicted Value, y-axis: Probability Density)]
Statistics of 3000 simulations with small, stochastically generated disturbances
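The ensemble experiment can be sketched as follows. This is only an illustration under stated assumptions: 100 members instead of 3000 (to keep the run short), uniform disturbances of up to 0.5 added to X, a spin-up of 400 steps and an averaging period of 200 steps. These numbers and the RK4 scheme are not taken from the slides.

```python
import random
import statistics


def lorenz(state):
    """Right-hand side of the Lorenz equations (sigma=10, r=28, b=8/3)."""
    x, y, z = state
    return (-10.0 * x + 10.0 * y, 28.0 * x - y - x * z, -8.0 / 3.0 * z + x * y)


def rk4_step(state, dt=0.01):
    """One classical RK4 step."""
    k1 = lorenz(state)
    k2 = lorenz(tuple(s + 0.5 * dt * k for s, k in zip(state, k1)))
    k3 = lorenz(tuple(s + 0.5 * dt * k for s, k in zip(state, k2)))
    k4 = lorenz(tuple(s + dt * k for s, k in zip(state, k3)))
    return tuple(
        s + dt / 6.0 * (a1 + 2.0 * a2 + 2.0 * a3 + a4)
        for s, a1, a2, a3, a4 in zip(state, k1, k2, k3, k4)
    )


def period_mean_x(state, n_spinup, n_average):
    """Integrate n_spinup steps, then return the mean of X over n_average steps."""
    for _ in range(n_spinup):
        state = rk4_step(state)
    total = 0.0
    for _ in range(n_average):
        state = rk4_step(state)
        total += state[0]
    return total / n_average


random.seed(0)
members = []  # pairs of (initial disturbance, predicted period mean)
for _ in range(100):
    disturbance = random.uniform(-0.5, 0.5)
    predicted = period_mean_x((1.0 + disturbance, 1.0, 1.0), 400, 200)
    members.append((disturbance, predicted))

predicted_values = [p for _, p in members]
print("ensemble mean:", statistics.mean(predicted_values))
print("ensemble spread:", statistics.stdev(predicted_values))
```

A histogram of `predicted_values` plays the role of the probability density panel, and a scatter plot of `members` plays the role of the disturbance panel.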
Predictability of the Second Kind
$\frac{dX}{dt} = \cdots - \frac{X - X_0}{T}$
$\frac{dY}{dt} = \cdots - \frac{Y - Y_0}{T}$
$\frac{dZ}{dt} = \cdots - \frac{Z - Z_0}{T}$
[Figure: Solution of Lorenz system with a forcing (time series)]
[Figure: Initial Difference and Predicted Value (x-axis: Initial Disturbance, y-axis: Predicted Value)]
[Figure: Probability Density of Predicted Value (x-axis: Predicted Value, y-axis: Probability Density)]
Forcing generates bias in the solutions → Signal
Why does the probability function become normal?
Central limit theorem
The mean of any stochastic variables tends to be normally distributed
(not strictly speaking)
Central Dogma of Statistics
$\bar{x} = \frac{1}{N} \sum_{i=1}^{N} x_i$
$N \to \infty: \quad P(\bar{x}) \to \frac{1}{\sqrt{2\pi}\,\sigma} \exp\left[ -\frac{(\bar{x} - \mu)^2}{2\sigma^2} \right]$
[Figure: Probability Density of Predicted Value (x-axis: Predicted Value, y-axis: Probability Density)]
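A quick numerical check of the central limit theorem is sketched below. It assumes uniform random variables on [0, 1), N = 30 and 5000 trials, all chosen for illustration only. For a normal distribution about 68% of the values lie within one standard deviation of the mean, so the printed fraction should be close to that.

```python
import random
import statistics

random.seed(1)
N = 30         # number of variables averaged in each trial (assumed)
TRIALS = 5000  # number of sample means (assumed)

# Each entry is the mean of N independent uniform variables.
means = [statistics.fmean(random.random() for _ in range(N)) for _ in range(TRIALS)]

mu = 0.5                                  # mean of a uniform variable on [0, 1)
sigma = (1.0 / 12.0) ** 0.5 / N ** 0.5    # standard deviation of the mean of N such variables

within_one_sigma = sum(abs(m - mu) <= sigma for m in means) / TRIALS
print("mean of sample means:", statistics.fmean(means))
print("fraction within one sigma:", within_one_sigma)
```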
Chaotic feature of the atmosphere
and long-range forecast
In both numerical and statistical prediction:
1 We can’t predict the long-term future precisely.
2 A possible target of long-range forecasts is the biased state caused by boundary forcing.
3 A possible target of long-range forecasts is the time-averaged state.
4 Probabilistic forecasts are essential.
5 Noise can be assumed to be normally distributed.
History of long-range forecast at JMA and
statistical methods
1942 long-range forecasting started
frequent cool summers
1943 formal issuance of 1-month, 3-month and seasonal forecasts
harmonic analysis
criticism of the accuracy
1949 division closed
frequent unusual weather
1953 long-range forecasting restarted
increase in upper-air sounding data
simple regression method
1974 long-range forecast division established
analog method
increasing demand for climate information
1987 publication of the monthly report on the climate system
multiple regression method
spectral analysis
1996 reorganization of the climate prediction division
1-month numerical prediction
2003 3-month numerical prediction
probability forecast
OCN
CCA
Statistical methods at JMA
1 Analog Method
Cluster Analysis
not being used
How similar are they?
2 Spectral or Harmonic Methods
not being used
sometimes good, but not always
3 Optimal Climate Normal (OCN)
now being used
simple !
4 Multiple Regression Analysis / Simple Regression
not being used
available technique
5 Canonical Correlation Analysis (CCA)
now being used
fashionable technique !
Concept of Analog Method
Basis of analog method
Similar states will evolve similarly and remain similar in the future.
Analog method using 500 hPa patterns
Search for past years whose 500 hPa pattern is similar to that of the target year,
and predict the future of the target year from what followed in those similar years.
Select 10 similar years; the frequency distribution of the temperatures that
followed in those years is taken as a probability forecast.
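The selection step can be sketched as below. This is a toy illustration, not the operational procedure: the patterns are three-element lists rather than 500 hPa fields, the years and outcome categories are invented, and k = 3 is used instead of 10. Two common similarity measures, Euclidean distance and the correlation coefficient, are offered as options.

```python
import math
from collections import Counter


def euclid_distance(a, b):
    """Euclidean distance between two patterns."""
    return math.sqrt(sum((p - q) ** 2 for p, q in zip(a, b)))


def correlation(a, b):
    """Pearson correlation coefficient between two patterns."""
    mean_a, mean_b = sum(a) / len(a), sum(b) / len(b)
    num = sum((p - mean_a) * (q - mean_b) for p, q in zip(a, b))
    den = math.sqrt(sum((p - mean_a) ** 2 for p in a) * sum((q - mean_b) ** 2 for q in b))
    return num / den


def analog_forecast(target, patterns, outcomes, k, use_correlation=False):
    """Pick the k most similar past years and return them with the outcome frequencies."""
    if use_correlation:
        def score(year):
            return -correlation(target, patterns[year])  # high correlation = similar
    else:
        def score(year):
            return euclid_distance(target, patterns[year])  # small distance = similar
    analog_years = sorted(patterns, key=score)[:k]
    counts = Counter(outcomes[year] for year in analog_years)
    return analog_years, {category: n / k for category, n in counts.items()}


# Hypothetical toy data: pattern of each past year and what followed it.
patterns = {
    1991: [1.0, 2.0, 3.0],
    1992: [1.1, 2.1, 2.9],
    1993: [-1.0, -2.0, -3.0],
    1994: [0.8, 1.8, 3.2],
    1995: [3.0, 2.0, 1.0],
}
outcomes = {1991: "above", 1992: "above", 1993: "below", 1994: "normal", 1995: "below"}

years, probabilities = analog_forecast([1.0, 2.0, 3.1], patterns, outcomes, k=3)
print(years, probabilities)
```

Switching `use_correlation` on can change which years are selected, because the two measures rank the candidates differently.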
Definitions of Similarity or Distance
Euclidean distance
$d = \sqrt{\sum_i (x_i - y_i)^2} = \sqrt{(\vec{x} - \vec{y})^t (\vec{x} - \vec{y})}$
Correlation coefficient
$r = \frac{\sum_i (x_i - \bar{x})(y_i - \bar{y})}{\sqrt{\sum_i (x_i - \bar{x})^2 \sum_i (y_i - \bar{y})^2}}$
Different definition, different results!
[Figure: the forecasting year and the years judged similar to it]
Cluster Analysis
grouping method
Repeatedly merging the nearest pair of members or groups
Various definitions of the distance between groups are available
[Figure: Distance Space]
Example of Cluster Analysis
Data: 1971-2000 sequences of monthly temp. (Feb.)
Distance: unity minus correlation coefficient
Group distance: Ward method (It is my favorite)
[Figure: Dendrogram (distance axis from 0 to 2.5) of AUSTRA, INDIA, NEW CAL, PERU, VENEZ, PHILIP, THAI(2), KAZAK, THAI(1), JAPN]
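The clustering used in this example can be sketched in a few lines. The code below is an illustration under assumptions: it implements Ward's criterion through the Lance-Williams update formula applied to squared distances, it uses "unity minus correlation coefficient" as the distance between members, and the four short series are invented toy data, not the station records.

```python
import math


def correlation(a, b):
    """Pearson correlation coefficient between two series."""
    mean_a, mean_b = sum(a) / len(a), sum(b) / len(b)
    num = sum((p - mean_a) * (q - mean_b) for p, q in zip(a, b))
    den = math.sqrt(sum((p - mean_a) ** 2 for p in a) * sum((q - mean_b) ** 2 for q in b))
    return num / den


def ward_clustering(series):
    """Agglomerative clustering; returns a list of merges (id_a, id_b, height)."""
    sizes = {i: 1 for i in range(len(series))}  # cluster id -> number of members
    d2 = {}                                     # (id_a, id_b) with id_a < id_b -> squared distance
    for i in sizes:
        for j in sizes:
            if i < j:
                d2[(i, j)] = (1.0 - correlation(series[i], series[j])) ** 2
    merges = []
    next_id = len(series)
    while len(sizes) > 1:
        (i, j), dij2 = min(d2.items(), key=lambda item: item[1])  # nearest pair
        ni, nj = sizes.pop(i), sizes.pop(j)
        del d2[(i, j)]
        for k, nk in sizes.items():
            dik2 = d2.pop((min(i, k), max(i, k)))
            djk2 = d2.pop((min(j, k), max(j, k)))
            # Lance-Williams update for Ward's method.
            d2[(k, next_id)] = ((ni + nk) * dik2 + (nj + nk) * djk2 - nk * dij2) / (ni + nj + nk)
        sizes[next_id] = ni + nj
        merges.append((i, j, math.sqrt(dij2)))
        next_id += 1
    return merges


# Hypothetical toy series: 0 and 1 rise together, 2 and 3 fall together.
toy_series = [
    [1.0, 2.0, 3.0, 4.0, 5.0],
    [2.0, 4.0, 5.0, 8.0, 11.0],
    [5.0, 4.0, 3.0, 2.0, 1.5],
    [9.0, 7.0, 6.0, 4.0, 2.0],
]
merges = ward_clustering(toy_series)
for a, b, height in merges:
    print("merge", a, "and", b, "at height", round(height, 3))
```

The list of merges and their heights is the information a dendrogram displays: members 0 and 1 join early, members 2 and 3 join early, and the two groups join last at a large height.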
Analysis of time series
Sometimes an obvious cycle appears in the sequence of a meteorological element.
In that case, prediction using the periodicity is very effective.
[Figure: 3 month mean temperature of Eastern Japan (x-axis: accumulated month, y-axis: 3 months mean temperature (0.1℃))]
[Figure: Power spectrum of 3 month mean temp. (x-axis: Frequency (1/month), y-axis: Power (100*℃*℃))]
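A power spectrum of this kind can be estimated with a plain discrete Fourier transform. The sketch below assumes a synthetic 84-month series made of a strong 12-month cycle and a weak 4-month cycle; the series and the normalization of the power are illustrative choices, not those used for the figure.

```python
import cmath
import math


def power_spectrum(x):
    """Return (frequency, power) pairs for frequencies k/n, k = 1 .. n/2."""
    n = len(x)
    mean = sum(x) / n
    anomalies = [v - mean for v in x]  # remove the mean before transforming
    spectrum = []
    for k in range(1, n // 2 + 1):
        coeff = sum(a * cmath.exp(-2j * math.pi * k * t / n) for t, a in enumerate(anomalies))
        spectrum.append((k / n, abs(coeff) ** 2 / n))
    return spectrum


# Synthetic monthly series (assumed): annual cycle plus a weaker 4-month cycle.
series = [
    10.0 * math.sin(2.0 * math.pi * t / 12.0) + 2.0 * math.cos(2.0 * math.pi * t / 4.0)
    for t in range(84)
]

spectrum = power_spectrum(series)
peak_frequency, peak_power = max(spectrum, key=lambda pair: pair[1])
print("dominant frequency (1/month):", peak_frequency)
print("dominant period (months):", 1.0 / peak_frequency)
```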
Prediction by Auto-Regression Model
Assuming an auto-regression model such as
$x_i = a_1 x_{i-1} + a_2 x_{i-2} + \cdots + a_m x_{i-m} + \varepsilon_i$
$\varepsilon_i \sim N(0, \sigma^2)$
Determine the coefficients and the variance of the noise from past data.
We can then predict the future as
$x_{t+1} = a_1 x_t + a_2 x_{t-1} + \cdots + a_m x_{t-m+1} + \varepsilon_{t+1}$
and so on.
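The fitting and prediction steps can be sketched for the simplest useful case, m = 2. The example assumes a zero-mean series (anomalies), least-squares estimation of the coefficients, and a noise-free 12-month sinusoid as test data; these are illustrative choices. Future noise terms are unknown, so the prediction sets them to their expected value of zero.

```python
import math


def fit_ar2(x):
    """Least-squares estimates of a1, a2 and the noise variance for an AR(2) model."""
    s11 = s12 = s22 = b1 = b2 = 0.0
    for i in range(2, len(x)):
        p, q, y = x[i - 1], x[i - 2], x[i]
        s11 += p * p
        s12 += p * q
        s22 += q * q
        b1 += p * y
        b2 += q * y
    det = s11 * s22 - s12 * s12
    a1 = (b1 * s22 - b2 * s12) / det
    a2 = (s11 * b2 - s12 * b1) / det
    residual = sum((x[i] - a1 * x[i - 1] - a2 * x[i - 2]) ** 2 for i in range(2, len(x)))
    sigma2 = residual / (len(x) - 2 - 2)  # equations used minus coefficients fitted
    return a1, a2, sigma2


def predict_ar2(x, a1, a2, steps):
    """Iterate the fitted model forward, with future noise set to zero."""
    history = list(x[-2:])
    forecast = []
    for _ in range(steps):
        nxt = a1 * history[-1] + a2 * history[-2]
        forecast.append(nxt)
        history.append(nxt)
    return forecast


# Synthetic zero-mean series with a 12-month cycle (assumed test data).
past = [10.0 * math.sin(2.0 * math.pi * t / 12.0) for t in range(80)]
a1, a2, sigma2 = fit_ar2(past)
forecast = predict_ar2(past, a1, a2, steps=6)
truth = [10.0 * math.sin(2.0 * math.pi * t / 12.0) for t in range(80, 86)]
print("a1, a2:", a1, a2)
print("largest forecast error:", max(abs(f - t) for f, t in zip(forecast, truth)))
```

For a pure sinusoid the model is exact, which corresponds to the case where the periodicity is obvious. With a noisy or irregular series the fitted cycle can drift away from the observations.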
Successful Case
[Figure: 3 month mean temperature and prediction (x-axis: accumulated month, y-axis: 3 months mean temperature (0.1℃))]
Failure Case
[Figure: 3 month mean temperature and prediction (x-axis: accumulated month, y-axis: 3 months mean temperature (0.1℃))]
Optimal Climate Normal (OCN)
The normal, i.e. the mean of the past 30 years, is not always the optimal ‘first estimate’
when there is an obvious increasing or decreasing trend or a climatic jump.
Investigating the past data, the 10-year mean is the optimal first estimate
for both temperature and precipitation in Japan.
[Figure, two panels: Sequence of monthly mean temp. in Eastern Japan (Jan.) (x-axis: year, 1960 to 2000; y-axis: Temp. deviation (0.1℃))]
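The search for the best averaging length can be sketched as a hindcast experiment. The code below uses a synthetic series with a linear trend plus random noise (both assumed for illustration) and compares the error of the k-year mean as a first estimate for several values of k. It does not use JMA data, so the best k it reports says nothing about Japan.

```python
import random


def hindcast_rmse(series, k, first_target=30):
    """RMSE when each value is predicted by the mean of the preceding k values."""
    squared_errors = []
    for t in range(first_target, len(series)):
        first_estimate = sum(series[t - k:t]) / k
        squared_errors.append((series[t] - first_estimate) ** 2)
    return (sum(squared_errors) / len(squared_errors)) ** 0.5


random.seed(2)
# Synthetic anomalies (assumed): trend of 0.05 per year plus noise of standard deviation 0.5.
series = [0.05 * year + random.gauss(0.0, 0.5) for year in range(100)]

scores = {k: hindcast_rmse(series, k) for k in (5, 10, 20, 30)}
for k, rmse in scores.items():
    print(f"{k:2d}-year mean: RMSE = {rmse:.3f}")
print("best averaging length:", min(scores, key=scores.get))
```

With a trend in the data, a long averaging period lags behind the recent climate, while a very short one is noisy. The optimal length balances the two effects.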
Break Time for 10 minutes
Excel files of
Chaos,
Cluster analysis,
and Spectral analysis
are prepared in this PC.
Situation of Multiple Regression Model
Columns: year; predictand Temp. (Predictand Vector); predictors NHPV, FEPV, NHZI, FEZI, OKHOTK, MIDH, OKINAW, OGASH, WPAH (Predictor Matrix)
year    Temp.    NHPV    FEPV    NHZI    FEZI  OKHOTK    MIDH  OKINAW   OGASH    WPAH
1980      -13    -4.0    -4.9     6.1   -13.6    27.9     0.4    13.8     9.6     0.6
1981        8     1.9   -25.4     2.1   -12.7    -1.2     0.3    -3.3    -5.0    -2.1
1982      -21     0.7    16.4     1.0    10.2    -8.2   -11.5   -21.4    -3.8     6.5
1983      -11   -14.8    20.2     5.6    -0.4    -6.1   -11.6    -3.9     2.9    14.9
1984        7     8.7   -18.4     3.2   -17.3   -19.9    -5.6    -6.6   -21.9     1.9
1985        5    25.1     5.6    10.6    -7.2   -20.5    -6.8   -12.8   -11.6    -1.7
1986      -12   -20.8    30.5     3.6   -13.9    11.1   -10.3    -8.7   -15.2    -2.4
1987        9   -33.2    -6.4    14.1   -12.8   -12.2    -2.7     5.9     8.5     0.0
1988      -19   -32.5    -3.3     4.0   -17.4    33.8     2.6    -2.6     1.3     7.4
1989       -9    43.5     3.2     2.1    14.9    26.1     2.2    -7.1    -1.7    -0.6
1990        6    -3.1    -9.2    -0.1   -24.4    -4.9     4.0     9.5     1.5    -2.3
1991        5     8.5    13.7    -4.8    11.1    -0.8    -0.8     9.6     6.1    -0.9
1992        0    -4.8    14.4   -23.9     7.6    -3.9    -4.1    -5.0     3.7     1.9
1993      -21    16.6   -12.2     5.5   -28.3     7.1   -12.7    -3.0     2.8     9.3
1994       24    21.1    14.8     3.6    11.2   -14.2    13.8     8.0    -0.5     2.8
1995        5   -58.7    19.1     0.8     3.5   -15.5    -1.0    11.2    11.6     3.9
1996        6    31.4   -42.9   -28.5    -0.8     1.4     5.2    10.6    -3.4     2.8
1997        4   -12.0   -39.8   -24.8    29.4    -5.3     1.5    -4.5    -2.3     0.4
1998        5     3.9   -27.9    -3.5     6.2    50.0    14.2    14.5    16.0    -5.1
1999        4    -8.7    33.6    12.1   -27.1    -1.6     9.2    -6.6    -0.6   -14.3
2000       15    19.5    -9.2    43.6    -2.4     4.1     4.6    -7.4    -4.4   -15.1
Other label on the slide: Independent Data
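Multiple Regression Equation
The multiple regression model assumes that the predictand vector is the
sum of a linear combination of the predictors and noise.
$y(t) = \beta_0 + \beta_1 x_1(t) + \cdots + \beta_{m-1} x_{m-1}(t) + \varepsilon$
This can be rewritten as
$\vec{y} = X\vec{\beta} + \vec{\varepsilon}$
$X = \begin{pmatrix} 1 & x_{11} & \cdots & x_{1,m-1} \\ \vdots & \vdots & & \vdots \\ 1 & x_{n1} & \cdots & x_{n,m-1} \end{pmatrix}, \quad \vec{\beta} = \begin{pmatrix} \beta_0 \\ \vdots \\ \beta_{m-1} \end{pmatrix}$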
Determine Regression Coefficients
The coefficient vector is usually estimated so as to
minimize the sum of squared errors of the predicted vector.
$S = \|\vec{y} - \hat{\vec{y}}\|^2 = \|\vec{y} - X\vec{a}\|^2 \to \text{minimize}$
$(X^t X)\,\vec{a} = X^t \vec{y}$
$\vec{a} = (X^t X)^{-1} X^t \vec{y}$
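The normal equations can be solved directly in a short program. The sketch below builds $X^t X$ and $X^t \vec{y}$ and solves for $\vec{a}$ by Gaussian elimination with partial pivoting; the solver, the function names and the toy data are illustrative assumptions. A singular $X^t X$ is reported as an error rather than inverted.

```python
def solve_linear(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    aug = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("X^t X is singular; coefficients cannot be determined")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):  # back substitution
        known = sum(aug[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (aug[r][n] - known) / aug[r][r]
    return x


def fit_regression(predictors, y):
    """Return a = (X^t X)^{-1} X^t y, where X has a leading column of ones."""
    X = [[1.0] + list(row) for row in predictors]
    m = len(X[0])
    xtx = [[sum(row[i] * row[j] for row in X) for j in range(m)] for i in range(m)]
    xty = [sum(row[i] * yi for row, yi in zip(X, y)) for i in range(m)]
    return solve_linear(xtx, xty)


# Toy data (assumed): y = 1 + 2*x1 - 3*x2 exactly, so the fit should recover 1, 2, -3.
predictors = [(1.0, 2.0), (2.0, 1.0), (3.0, 5.0), (4.0, 3.0), (5.0, 8.0), (6.0, 4.0)]
y = [1.0 + 2.0 * x1 - 3.0 * x2 for x1, x2 in predictors]
coefficients = fit_regression(predictors, y)
print(coefficients)
```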
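Visual Image of Multiple Regression
Predictand Vector: $\vec{y} = (y_1, \ldots, y_n)$
Predicted Vector: $\hat{\vec{y}} = (\hat{y}_1, \ldots, \hat{y}_n) = X\vec{a}$
Residual Vector: $\vec{y} - \hat{\vec{y}}$
$\hat{\vec{y}} = X\vec{a}$ is the orthogonal projection of $\vec{y}$ onto S(X).
Subspace S(X): $S(X) = \{ \vec{z} : \vec{z} = X\vec{\beta},\ \vec{\beta} \in V^m \}$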
Visual Image of Multiple Regression
Error Vector: $\vec{\varepsilon} \sim N(0, \sigma^2 I)$
True Regression Vector: $X\vec{\beta}$
Projection of Error Vector: $P\vec{\varepsilon}$
Residual Vector: $(I - P)\vec{\varepsilon}$
$P = X (X^t X)^{-1} X^t$
Property of Regression Residual
$E[S] = (n - m)\,\sigma^2$
$\hat{\sigma}^2 = \frac{S}{n - m}$
$FPE = (n + m)\,\sigma^2$
$\widehat{FPE} = \frac{n + m}{n - m}\, S$
Detecting Trend Using Simple Regression
[Figure: Sequence of winter mean temp. in Tokyo, Japan (x-axis: year, 1940 to 2000; y-axis: Mean Temperature (℃))]
$y = a_0 + a_1 x + \varepsilon$
$a_1$: trend (℃/year)
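Property of Calculated Trend
Confidence Interval of the estimated trend
$\hat{a}_1 - t_{n-2} \sqrt{\frac{S}{(n-2)\,S_{xx}}} \le a_1 \le \hat{a}_1 + t_{n-2} \sqrt{\frac{S}{(n-2)\,S_{xx}}}$
Confidence Interval of the regression line
$\pm\, t_{n-2} \sqrt{\frac{S}{n-2} \left( \frac{1}{n} + \frac{(x - \bar{x})^2}{S_{xx}} \right)}$
where $S_{xx} = \sum_i (x_i - \bar{x})^2$ and S is the residual sum of squares.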
Property of Calculated Trend
[Figure: Sequence of winter mean temp. in Tokyo, Japan, with the regression line and the confidence interval of the regression line (x-axis: year, 1940 to 2000; y-axis: Mean Temperature (℃))]
trend = 4.90 (℃/100 years)
Confidence Interval of the estimated trend
3.6 ≤ trend ≤ 6.10 (℃/100 years)
The warming trend in Tokyo is significant.
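The trend and its confidence interval can be computed as in the sketch below. It assumes the standard simple-regression formulas, takes the critical value of Student's t with n - 2 degrees of freedom as an input read from a t table, and uses a small invented data set rather than the Tokyo record.

```python
def trend_with_interval(x, y, t_crit):
    """Return the slope and its confidence interval (lower, upper).

    t_crit is the two-sided critical value of Student's t with n - 2
    degrees of freedom, supplied by the caller (for example from a table).
    """
    n = len(x)
    x_mean, y_mean = sum(x) / n, sum(y) / n
    sxx = sum((xi - x_mean) ** 2 for xi in x)
    sxy = sum((xi - x_mean) * (yi - y_mean) for xi, yi in zip(x, y))
    a1 = sxy / sxx                  # estimated trend
    a0 = y_mean - a1 * x_mean       # estimated intercept
    s = sum((yi - a0 - a1 * xi) ** 2 for xi, yi in zip(x, y))  # residual sum of squares
    half_width = t_crit * (s / ((n - 2) * sxx)) ** 0.5
    return a1, a1 - half_width, a1 + half_width


# Invented data: a trend of 0.5 per step plus a small alternating disturbance.
x = [float(i) for i in range(10)]
y = [0.5 * i + (0.2 if i % 2 == 0 else -0.2) for i in range(10)]

# 2.306 is the two-sided 95% value of t for 8 degrees of freedom.
trend, lower, upper = trend_with_interval(x, y, t_crit=2.306)
print(f"trend = {trend:.3f}, interval = [{lower:.3f}, {upper:.3f}]")
```

If the interval does not contain zero, the trend is significant at the chosen level, which is the criterion applied to the Tokyo series.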
Urbanization and Warming Trend
[Figure: Temperature Trend and Urbanization Index (x-axis: Housing-Land Ratio (%), y-axis: Temperature Trend (℃/100 years))]
Estimation of Global Warming Trend
intercept = 0.65 (℃/100 years)
(the temperature trend extrapolated to a housing-land ratio of 0%)
Confidence Interval of the intercept (constant)
$\hat{a}_0 - t_{n-2} \sqrt{\frac{S}{n-2} \left( \frac{1}{n} + \frac{\bar{x}^2}{S_{xx}} \right)} \le a_0 \le \hat{a}_0 + t_{n-2} \sqrt{\frac{S}{n-2} \left( \frac{1}{n} + \frac{\bar{x}^2}{S_{xx}} \right)}$
0.47 ≤ intercept ≤ 0.82 (℃/100 years)
The global warming trend and its confidence interval can
be estimated even from data affected by urbanization.
Why is CCA Currently Used?
Disadvantages of the multiple regression method
1 The covariance matrix may be singular
If $X^t X$ is singular, $(X^t X)^{-1}$ cannot be calculated.
2 Correlations among the predictands are not taken into account
Regressions for many predictands are determined independently.
CCA Flow Chart
Predictors: X, Predictor Matrix (Real Space) → A, Predictor Matrix (EOF Space) → U, Canonical Variables (CCA Space)
Predictands: Y, Predictand Matrix (Real Space) → B, Predictand Matrix (EOF Space) → V, Canonical Variables (CCA Space)
Multiple Regression: $\hat{Y} = X (X^t X)^{-1} X^t Y$
CCA Prediction: $\hat{V} = \rho U$, $\hat{Y} = H U$
Transform Real Space to EOF Space
$S(X) = \mathrm{Kernel}(X^t)^{\perp}$
$D = \mathrm{diag}(\lambda_1, \ldots, \lambda_p)$
$X^t X E = E D$
$X X^t A = A D$
$A = X E D^{-1/2}$
$S(X) = S(X X^t) = S(A)$
$P = X (X^t X)^{-1} X^t = A A^t$
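Determining Canonical Component
$S(X) = S(A), \quad S(Y) = S(B)$
$\vec{u} = A\vec{r}, \quad \vec{v} = B\vec{s}$
$\vec{u}^t \vec{u} = 1, \quad \vec{v}^t \vec{v} = 1$
$A A^t \vec{v} = \rho\, \vec{u}$
$B B^t \vec{u} = \rho\, \vec{v}$
[Figure: vector u in S(X) = S(A) and vector v in S(Y) = S(B)]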
Summary of Multiple Regression and CCA
1 Multiple regression and CCA are fashionable tools that can handle large volumes of data.
2 Selection of variables is very important for successful prediction using independent data. Stepwise and all-subsets methods are available.
3 The rank deficiency problem can be avoided by transforming the real data into EOF space, in both multiple regression and CCA.
4 Correlation between predictands can be taken into account in CCA.
CCA is the most fashionable tool.
Thank you