Decomposition Method - City University of Hong Kong
Decomposition Method
1
Types of Data
Time series data: a sequence of observations
measured over time (usually at equally spaced
intervals, e.g., weekly, monthly and annually).
Examples of time series data include:
Gross Domestic Product each quarter;
annual rainfall;
daily stock market index
Cross-sectional data: data on one or more variables
collected at the same point in time
2
Time Series vs Causal Modeling
Causal (regression) models: the investigator
specifies some behavioural relationship and
estimates the parameters using regression
techniques;
Time series models: the investigator uses the
past data of the target variable to forecast the
present and future values of the variable
3
Time Series vs Causal Modeling
On the other hand, there are many cases
when one cannot, or one prefers not to,
build causal models:
1. insufficient information is known about the
behavioural relationship;
2. lack of, or conflicting, theories;
3. insufficient data on explanatory variables;
4. expertise may be unavailable;
5. time series models may be more accurate
4
Time Series vs Causal Modeling
Direct benefits of using time series models:
1. Little storage capacity is needed;
2. some time series models are automatic in that
user intervention is not required to update the
forecasts each period;
3. some time series models are evolutionary in
that the models adapt as new information is
received;
5
Classical Decomposition of
Time Series
Trend – does not necessarily imply a
monotonically increasing or decreasing series
but simply a lack of constant mean, though in
practice, we often use a linear or quadratic
function to predict the trend;
Cycle – refers to patterns or waves in the data
that are repeated after approximately equal
intervals with approximately equal intensity. For
example, some economists believe that “business
cycles” repeat themselves every 4 or 5 years;
6
Classical Decomposition of
Time Series
Seasonal – refers to a cycle of one year
duration;
Random (irregular) – refers to the
(unpredictable) variation not covered by the
above
7
Decomposition Method
Multiplicative Models
$Y_t = TR_t \times SN_t \times CL_t \times IR_t$
Additive Models
$Y_t = TR_t + SN_t + CL_t + IR_t$
Find the estimates of these four components.
8
Multiplicative Decomposition
Examples:
(1) US Retail and Food Services Sales from
1996 Q1 to 2008 Q1
Figure 2.1
(2) Quarterly Number of Visitor Arrivals in Hong
Kong from 2002 Q1 to 2008 Q1
Figure 2.2
9
Figure 2.1 US Retail Sales
[Line chart: US Retail & Food Services Sales. Vertical axis: Sales Y(t) (in MN US$), 0 to 500,000; horizontal axis: Time, quarterly from Q1 96 to Q1 08.]
10
Figure 2.2 Visitor Arrivals
[Line chart: Number of Visitor Arrivals in Hong Kong. Vertical axis: Number of Visitors Y(t), 0 to 3,000,000; horizontal axis: Time, quarterly from Q1 02 to Q1 08.]
11
Cycles are often difficult to identify with a
short time series.
Classical decomposition typically combines
cycles and trend as one entity:
$Y_t = TC_t \times SN_t \times IR_t$
12
Illustration : Consider the following 4-year
quarterly time series on sales volume:
Period (t)  Year  Quarter  Sales
1           1     1        72
2           1     2        110
3           1     3        117
4           1     4        172
5           2     1        76
6           2     2        112
7           2     3        130
8           2     4        194
9           3     1        78
10          3     2        119
11          3     3        128
12          3     4        201
13          4     1        81
14          4     2        134
15          4     3        141
16          4     4        216
Figure 2.3
14
Step 1 : Estimation of seasonal
component (SNt)
$Y_t = TC_t \times SN_t \times IR_t$
$\widehat{SN}_t = \dfrac{Y_t}{TC_t \times IR_t}$
Moving Average for periods 1 – 4 $= \dfrac{72 + 110 + 117 + 172}{4} = 117.75$
Moving Average for periods 2 – 5 $= \dfrac{110 + 117 + 172 + 76}{4} = 118.75$
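The two averages above can be checked with a short Python sketch. The helper name `moving_average` is illustrative and not part of the slides.

```python
# Quarterly sales from the illustration (periods 1..16).
sales = [72, 110, 117, 172, 76, 112, 130, 194,
         78, 119, 128, 201, 81, 134, 141, 216]

def moving_average(series, window=4):
    """Return the mean of every run of `window` consecutive observations."""
    return [sum(series[i:i + window]) / window
            for i in range(len(series) - window + 1)]

ma = moving_average(sales)
print(ma[0], ma[1])  # averages for periods 1-4 and 2-5
```

With 16 observations and a window of 4 there are 13 such averages.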
15
Period (t)  Year  Quarter  Sales  MA(t)
1           1     1        72
2           1     2        110    117.75
3           1     3        117    118.75
4           1     4        172    119.25
5           2     1        76     122.5
6           2     2        112    128
7           2     3        130    128.5
8           2     4        194    130.25
9           3     1        78     129.75
10          3     2        119    131.5
11          3     3        128    132.25
12          3     4        201    136
13          4     1        81     139.25
14          4     2        134    143
15          4     3        141
16          4     4        216
Each MA(t) entry is the 4-quarter average centered at t + 0.5 (e.g., 117.75 covers periods 1 – 4).
16
Assuming the average of the observations is
also the median of the observations, the MA
for periods 1 – 4, 2 – 5, 3 – 6 are centered at
positions 2.5, 3.5 and 4.5 respectively.
17
To get an average centered at periods 3, 4, 5 etc. the
means of two consecutive moving averages are
calculated:
Centered Moving Average for period 3 $= \dfrac{117.75 + 118.75}{2} = 118.25$
Centered Moving Average for period 4 $= \dfrac{118.75 + 119.25}{2} = 119$
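A minimal sketch of the centering step, assuming quarterly data so that each pair of neighbouring 4-quarter averages straddles a whole period. The dictionary `cma`, keyed by period number, is an illustrative choice.

```python
# Quarterly sales from the illustration (periods 1..16).
sales = [72, 110, 117, 172, 76, 112, 130, 194,
         78, 119, 128, 201, 81, 134, 141, 216]

# ma[i] averages periods i+1 .. i+4, so it is centered at i + 2.5.
ma = [sum(sales[i:i + 4]) / 4 for i in range(len(sales) - 3)]

# Averaging ma[i] (center i + 2.5) and ma[i + 1] (center i + 3.5)
# gives a value centered on the whole period i + 3.
cma = {i + 3: (ma[i] + ma[i + 1]) / 2 for i in range(len(ma) - 1)}

print(cma[3], cma[4])
```

Centered averages exist only for periods 3 to 14; the first two and last two quarters are lost.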
18
Period (t)  Year  Quarter  Sales  MA(t)   CMA(t)
1           1     1        72
2           1     2        110    117.75
3           1     3        117    118.75  118.25
4           1     4        172    119.25  119
5           2     1        76     122.5   120.875
6           2     2        112    128     125.25
7           2     3        130    128.5   128.25
8           2     4        194    130.25  129.375
9           3     1        78     129.75  130
10          3     2        119    131.5   130.625
11          3     3        128    132.25  131.875
12          3     4        201    136     134.125
13          4     1        81     139.25  137.625
14          4     2        134    143     141.125
15          4     3        141
16          4     4        216
Each MA(t) entry is the 4-quarter average centered at t + 0.5; each CMA(t) entry is centered at t.
19
Because the CMAt contains no seasonality and
irregularity, the seasonal component may be
estimated by
$\widetilde{SN}_t = \dfrac{Y_t}{CMA_t}$
For example,
$\widetilde{SN}_3 = \dfrac{117}{118.25} = 0.989$
$\widetilde{SN}_4 = \dfrac{172}{119} = 1.445$
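As a sketch, the ratio of each observation to its centered moving average can be computed for every period that has one. The name `sn_raw` is illustrative.

```python
# Quarterly sales from the illustration (periods 1..16).
sales = [72, 110, 117, 172, 76, 112, 130, 194,
         78, 119, 128, 201, 81, 134, 141, 216]

ma = [sum(sales[i:i + 4]) / 4 for i in range(len(sales) - 3)]
cma = {i + 3: (ma[i] + ma[i + 1]) / 2 for i in range(len(ma) - 1)}

# Initial seasonal estimate: observation divided by its centered average.
# Period t is stored at index t - 1 of `sales`.
sn_raw = {t: sales[t - 1] / cma[t] for t in cma}

for t in (3, 4):
    print(t, round(sn_raw[t], 3))
```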
20
Period (t)  Year  Quarter  Sales  MA(t)   CMA(t)   SN~(t)
1           1     1        72
2           1     2        110    117.75
3           1     3        117    118.75  118.25   0.989429175
4           1     4        172    119.25  119      1.445378151
5           2     1        76     122.5   120.875  0.628748707
6           2     2        112    128     125.25   0.894211577
7           2     3        130    128.5   128.25   1.013645224
8           2     4        194    130.25  129.375  1.499516908
9           3     1        78     129.75  130      0.6
10          3     2        119    131.5   130.625  0.911004785
11          3     3        128    132.25  131.875  0.970616114
12          3     4        201    136     134.125  1.49860205
13          4     1        81     139.25  137.625  0.588555858
14          4     2        134    143     141.125  0.949512843
15          4     3        141
16          4     4        216
Each MA(t) entry is the 4-quarter average centered at t + 0.5; CMA(t) and SN~(t) are centered at t.
21
After all $\widetilde{SN}_t$'s have been computed, they are
further averaged to eliminate irregularities in the
series. We also adjust the seasonal indices so that
they sum to the number of seasons in a year (i.e., 4
for quarterly data, 12 for monthly data). (Why?)
22
Quarter  Average
1        (0.628748707 + 0.6 + 0.588555858)/3 = 0.6058
2        (0.894211577 + 0.911004785 + 0.949512843)/3 = 0.9182
3        (0.989429175 + 1.013645224 + 0.970616114)/3 = 0.9912
4        (1.445378151 + 1.499516908 + 1.49860205)/3 = 1.4812
Sum = 3.9964
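The averaging and adjustment can be sketched as below. The rescaling by 4 / (sum of averages) is an assumption about how the adjustment is done, but it reproduces the index 0.6063 used for Quarter 1 in Step 2. The names `raw`, `avg` and `index` are illustrative.

```python
# Initial seasonal estimates SN~(t) copied from the table, keyed by period t.
raw = {3: 0.989429175, 4: 1.445378151, 5: 0.628748707, 6: 0.894211577,
       7: 1.013645224, 8: 1.499516908, 9: 0.6, 10: 0.911004785,
       11: 0.970616114, 12: 1.49860205, 13: 0.588555858, 14: 0.949512843}

# Group the estimates by quarter (period 1 is Quarter 1) and average them.
by_quarter = {q: [] for q in (1, 2, 3, 4)}
for t, ratio in raw.items():
    by_quarter[(t - 1) % 4 + 1].append(ratio)
avg = {q: sum(v) / len(v) for q, v in by_quarter.items()}

# Rescale so that the four indices sum to exactly 4.
total = sum(avg.values())
index = {q: a * 4 / total for q, a in avg.items()}

for q in (1, 2, 3, 4):
    print(q, round(avg[q], 4), round(index[q], 4))
```

Forcing the indices to sum to 4 makes them average to 1 over a year, so dividing by them removes the seasonal pattern without shifting the overall level of the series.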
23
Step 2 : Estimation of Trend/Cycle
Define deseasonalized (or seasonally adjusted)
series as
$D_t = Y_t / \widehat{SN}_t$
for example, D1 = 72/0.6063 = 118.7506
24
25
TCt may be estimated by regression using a linear
trend:
$D_t = \beta_0 + \beta_1 t + \varepsilon_t, \quad t = 1, 2, 3, \dots$
$\widehat{TC}_t = \hat{D}_t = b_0 + b_1 t,$
where $b_0$ and $b_1$ are least squares estimates of
$\beta_0$ and $\beta_1$ respectively.
26
EXCEL regression output :
So,
$\widehat{TC}_t = 113.6997914 + 1.854638009\,t$
27
For example,
$\widehat{TC}_1 = 113.6997914 + 1.854638009(1) = 115.5544294$
$\widehat{TC}_2 = 113.6997914 + 1.854638009(2) = 117.4090674$
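A sketch of Step 2 in plain Python, standing in for the EXCEL regression. It recomputes the seasonal indices from Step 1, deseasonalizes the series and fits the trend by ordinary least squares. The helper names `seasonal_indices` and `fit_line` are illustrative, and the index adjustment (rescaling to sum to 4) is an assumption.

```python
# Quarterly sales from the illustration (periods 1..16).
sales = [72, 110, 117, 172, 76, 112, 130, 194,
         78, 119, 128, 201, 81, 134, 141, 216]

def seasonal_indices(y):
    """Step 1: ratio-to-CMA indices for quarterly data, scaled to sum to 4."""
    ma = [sum(y[i:i + 4]) / 4 for i in range(len(y) - 3)]
    # (ma[i] + ma[i + 1]) / 2 is centered on period i + 3, stored at y[i + 2].
    ratios = {i + 3: y[i + 2] / ((ma[i] + ma[i + 1]) / 2)
              for i in range(len(ma) - 1)}
    avg = {}
    for q in (1, 2, 3, 4):
        vals = [r for t, r in ratios.items() if (t - 1) % 4 + 1 == q]
        avg[q] = sum(vals) / len(vals)
    total = sum(avg.values())
    return {q: a * 4 / total for q, a in avg.items()}

def fit_line(ts, ys):
    """Ordinary least squares for ys = b0 + b1 * ts; returns (b0, b1)."""
    n = len(ts)
    t_bar, y_bar = sum(ts) / n, sum(ys) / n
    b1 = (sum((t - t_bar) * (y - y_bar) for t, y in zip(ts, ys))
          / sum((t - t_bar) ** 2 for t in ts))
    return y_bar - b1 * t_bar, b1

sn = seasonal_indices(sales)
periods = list(range(1, len(sales) + 1))
deseasonalized = [y / sn[(t - 1) % 4 + 1] for t, y in zip(periods, sales)]
b0, b1 = fit_line(periods, deseasonalized)
print(round(deseasonalized[0], 4), round(b0, 4), round(b1, 4))
```

Run on the sales data, this should give an intercept and slope close to the 113.6997914 and 1.854638009 quoted from the regression output.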
28
29
Step 3 : Computation of fitted
values and out-of-sample forecasts
$\hat{Y}_t = \widehat{TC}_t \times \widehat{SN}_t$
In-sample fit :
$\hat{Y}_1 = 115.5544 \times 0.6063 = 70.0621$
$\hat{Y}_{16} = 143.3740 \times 1.4825 = 212.5516$
30
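Out-of-sample forecast :
$\hat{Y}_{17} = \widehat{TC}_{17} \times \widehat{SN}_{17}$
$= (113.700 + 1.855 \times 17) \times 0.6063$
$= 145.2286 \times 0.6063$
$= 88.054$
$\hat{Y}_{18} = \widehat{TC}_{18} \times \widehat{SN}_{18}$
$= (113.700 + 1.855 \times 18) \times 0.9191$
$= 147.0833 \times 0.9191$
$= 135.1796$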
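The fitted values and forecasts can be sketched as a single function of the period number. The trend coefficients are the ones quoted from the regression output. The seasonal indices are rounded to four decimals, and the Quarter 3 index (0.9921) is an assumption recomputed from the Step 1 averages, since it is not quoted on the slides.

```python
# Trend coefficients quoted from the regression output.
B0, B1 = 113.6997914, 1.854638009

# Seasonal indices rounded to 4 decimals. Q1, Q2 and Q4 appear in the text;
# Q3 is an assumption recomputed from the Step 1 averages.
SN = {1: 0.6063, 2: 0.9191, 3: 0.9921, 4: 1.4825}

def trend(t):
    """Estimated trend/cycle for period t."""
    return B0 + B1 * t

def predict(t):
    """Fitted value (t <= 16) or forecast (t > 16): trend times seasonal index."""
    quarter = (t - 1) % 4 + 1
    return trend(t) * SN[quarter]

for t in (1, 16, 17, 18):
    print(t, round(trend(t), 4), round(predict(t), 3))
```

Because the indices here are rounded, the results agree with the slide values only to about two decimal places.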
31
32
Figure 2.4
33
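Measuring Forecast Accuracy :
Let $e_t = Y_t - \hat{Y}_t$ be the forecast errors.
1) Mean Squared Error
$\mathrm{MSE} = \dfrac{1}{n}\sum_{t=1}^{n} e_t^2$
$\mathrm{RMSE} = \sqrt{\mathrm{MSE}}$
2) Mean Absolute Deviation
$\mathrm{MAD} = \dfrac{1}{n}\sum_{t=1}^{n} |e_t|$
$\mathrm{RMAD} = \sqrt{\mathrm{MAD}}$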
34
et     Method A  Method B
       –2        –4
       1.5       0.7
       –1        0.5
       2.1       1.4
       0.7       0.1
MSE    2.43      3.742
MAD    1.46      1.34
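A short sketch that recomputes both measures for the two sets of errors. The function names `mse` and `mad` are illustrative.

```python
import math

def mse(errors):
    """Mean squared error: average of the squared forecast errors."""
    return sum(e * e for e in errors) / len(errors)

def mad(errors):
    """Mean absolute deviation: average of the absolute forecast errors."""
    return sum(abs(e) for e in errors) / len(errors)

method_a = [-2, 1.5, -1, 2.1, 0.7]
method_b = [-4, 0.7, 0.5, 1.4, 0.1]

for name, errors in (("A", method_a), ("B", method_b)):
    print(name, round(mse(errors), 3), round(math.sqrt(mse(errors)), 3),
          round(mad(errors), 3))
```

The two measures rank the methods differently: Method A has the smaller MSE and Method B the smaller MAD, because squaring gives extra weight to Method B's single large error of –4.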
35
Naive Prediction
$\hat{Y}_t = Y_{t-1}$
Theil’s U Statistic
$U = \sqrt{\dfrac{\sum_{t=2}^{n} (Y_t - \hat{Y}_t)^2}{\sum_{t=2}^{n} (Y_t - Y_{t-1})^2}}$
If U = 1, the forecasts are no better than the naive forecast.
If U = 0, the forecasts fit perfectly.
The smaller the value of U, the better the forecasts.
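A minimal sketch of Theil's U, assuming the square-root form with both sums starting at the second observation (the naive forecast needs a previous value). The function name `theil_u` is illustrative.

```python
import math

def theil_u(actual, forecast):
    """Root of (squared model errors) over (squared naive errors), from t = 2."""
    n = len(actual)
    model = sum((actual[t] - forecast[t]) ** 2 for t in range(1, n))
    naive = sum((actual[t] - actual[t - 1]) ** 2 for t in range(1, n))
    return math.sqrt(model / naive)

# Quarterly sales from the illustration (periods 1..16).
sales = [72, 110, 117, 172, 76, 112, 130, 194,
         78, 119, 128, 201, 81, 134, 141, 216]

# The naive forecast repeats the previous observation; the first entry is a
# placeholder that the sums never use.
naive_forecast = [sales[0]] + sales[:-1]

print(theil_u(sales, naive_forecast))  # naive forecast against itself
print(theil_u(sales, sales))           # perfect forecast
```

The two calls illustrate the benchmark cases: the naive forecast scores exactly 1 and a perfect forecast scores 0.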
36
MSE = 11.932
MAD = 2.892
Theil’s U = 0.0546
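These figures appear to be the in-sample accuracy of the multiplicative decomposition of the sales series. The sketch below re-runs the three steps end to end under these assumptions: indices rescaled to sum to 4, MSE and MAD averaged over all 16 periods, and Theil's U in square-root form with sums starting at period 2. All variable names are illustrative.

```python
import math

# Quarterly sales from the illustration (periods 1..16).
sales = [72, 110, 117, 172, 76, 112, 130, 194,
         78, 119, 128, 201, 81, 134, 141, 216]
n = len(sales)
periods = list(range(1, n + 1))

def quarter(t):
    return (t - 1) % 4 + 1

# Step 1: seasonal indices from ratios to the centered moving average.
ma = [sum(sales[i:i + 4]) / 4 for i in range(n - 3)]
ratios = {i + 3: sales[i + 2] / ((ma[i] + ma[i + 1]) / 2)
          for i in range(len(ma) - 1)}
avg = {}
for q in (1, 2, 3, 4):
    vals = [r for t, r in ratios.items() if quarter(t) == q]
    avg[q] = sum(vals) / len(vals)
scale = 4 / sum(avg.values())
sn = {q: a * scale for q, a in avg.items()}

# Step 2: least squares trend fitted to the deseasonalized series.
d = [y / sn[quarter(t)] for t, y in zip(periods, sales)]
t_bar, d_bar = sum(periods) / n, sum(d) / n
b1 = (sum((t - t_bar) * (x - d_bar) for t, x in zip(periods, d))
      / sum((t - t_bar) ** 2 for t in periods))
b0 = d_bar - b1 * t_bar

# Step 3: fitted values, then the accuracy measures.
fitted = [(b0 + b1 * t) * sn[quarter(t)] for t in periods]
errors = [y - f for y, f in zip(sales, fitted)]
mse = sum(e * e for e in errors) / n
mad = sum(abs(e) for e in errors) / n
u = math.sqrt(sum(e * e for e in errors[1:])
              / sum((sales[i] - sales[i - 1]) ** 2 for i in range(1, n)))
print(round(mse, 3), round(mad, 3), round(u, 4))
```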
37
Out-of-Sample Forecasts
1) Ex-post forecast
Prediction for the period in which actual
observations are available
2) Ex-ante forecast
Prediction for the period in which actual
observations are not available.
38
[Time line: “back”casting before T1; in-sample simulation over the estimation period T1 to T2; ex-post forecast from T2 to T3 (today); ex-ante forecast after T3.]
39
Additive Decomposition
$Y_t = TC_t + SN_t + IR_t$
[Two sketches of Yt against Time, each with a Trend line: one labelled (Multiplicative Seasonality), the other (Additive Seasonality).]
40
Multiplicative decomposition is used when the time
series exhibits increasing or decreasing seasonal
variation ($Y_t = TC_t \times SN_t \times IR_t$)
Year  Quarter  TCt   SNt  Yt     Yt – Yt-1
Yr 1  Q1       11.5  1.5  17.25
Yr 1  Q2       13    0.5  6.5    –10.75
Yr 1  Q3       14.5  0.8  11.6   5.1
Yr 1  Q4       16    1.2  19.2   7.6
Yr 2  Q1       17.5  1.5  26.25
Yr 2  Q2       19    0.5  9.5    –16.75
Yr 2  Q3       20.5  0.8  16.4   6.9
Yr 2  Q4       22    1.2  26.4   10
41
Additive decomposition is used when the time
series exhibits constant seasonal variation
($Y_t = TC_t + SN_t + IR_t$)
Year  Quarter  TCt   SNt   Yt    Yt – Yt-1
Yr 1  Q1       11.5  1.8   13.3
Yr 1  Q2       13    –1    12    –1.3
Yr 1  Q3       14.5  –1.5  13    1
Yr 1  Q4       16    0.7   16.7  3.7
Yr 2  Q1       17.5  1.8   19.3
Yr 2  Q2       19    –1    18    –1.3
Yr 2  Q3       20.5  –1.5  19    1
Yr 2  Q4       22    0.7   22.7  3.7
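The contrast between the two tables can be reproduced with a short sketch: the same trend values are combined with multiplicative and with additive seasonal terms (no irregular component), and the quarter-to-quarter changes within each year are compared. The helper `within_year_changes` is illustrative.

```python
# Trend/cycle values for the eight quarters in the two tables.
tc = [11.5, 13, 14.5, 16, 17.5, 19, 20.5, 22]

# Seasonal terms from the tables, repeated for both years.
sn_mult = [1.5, 0.5, 0.8, 1.2] * 2
sn_add = [1.8, -1, -1.5, 0.7] * 2

y_mult = [t * s for t, s in zip(tc, sn_mult)]
y_add = [t + s for t, s in zip(tc, sn_add)]

def within_year_changes(y):
    """Changes Q1->Q2, Q2->Q3 and Q3->Q4, listed separately for each year."""
    return [[round(y[i] - y[i - 1], 2) for i in range(start + 1, start + 4)]
            for start in (0, 4)]

print(within_year_changes(y_mult))  # swings widen as the trend rises
print(within_year_changes(y_add))   # swings are the same in both years
```

With a rising trend, the multiplicative series shows larger within-year swings in Year 2 than in Year 1, while the additive series repeats the same swings.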
42
Step 1 : Estimation of seasonal
component (SNt)
Calculation of MAt and CMAt is the same as per
multiplicative decomposition
Initial seasonal component may be estimated by
$\widetilde{SN}_t = Y_t - CMA_t$
For example,
$\widetilde{SN}_3 = 117 - 118.25 = -1.25$
$\widetilde{SN}_4 = 172 - 119 = 53$
43
Seasonal indices are averaged and adjusted
so that they sum to zero (Why?)
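A sketch of the additive version of Step 1. The adjustment shown, subtracting the mean of the four quarterly averages, is an assumption about how the indices are made to sum to zero; it reproduces the Quarter 1 index of –50.80208333 used later in the fitted value for period 1. Variable names are illustrative.

```python
# Quarterly sales from the illustration (periods 1..16).
sales = [72, 110, 117, 172, 76, 112, 130, 194,
         78, 119, 128, 201, 81, 134, 141, 216]

ma = [sum(sales[i:i + 4]) / 4 for i in range(len(sales) - 3)]

# Initial additive estimate: observation minus its centered moving average.
# (ma[i] + ma[i + 1]) / 2 is centered on period i + 3, stored at sales[i + 2].
deviations = {i + 3: sales[i + 2] - (ma[i] + ma[i + 1]) / 2
              for i in range(len(ma) - 1)}

# Average the deviations quarter by quarter (period 1 is Quarter 1).
avg = {}
for q in (1, 2, 3, 4):
    vals = [d for t, d in deviations.items() if (t - 1) % 4 + 1 == q]
    avg[q] = sum(vals) / len(vals)

# Shift the averages so that the four indices sum to zero.
shift = sum(avg.values()) / 4
index = {q: a - shift for q, a in avg.items()}

print(deviations[3], deviations[4])
print({q: round(v, 4) for q, v in index.items()})
```

Indices that sum to zero add nothing to the series over a full year, so subtracting them leaves the overall level unchanged.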
44
45
Step 2 : Estimation of Trend/Cycle
Deseasonalized series is defined as
$D_t = Y_t - \widehat{SN}_t$
TCt may be estimated by regression as per
multiplicative decomposition
46
i.e.,
$D_t = \beta_0 + \beta_1 t + \varepsilon_t$
and
$\widehat{TC}_t = \hat{D}_t = b_0 + b_1 t$
as per multiplicative decomposition
47
So,
$\widehat{TC}_t = 113.2270833 + 1.980637255\,t$
and
$\hat{Y}_t = \widehat{TC}_t + \widehat{SN}_t$
For example,
$\widehat{TC}_1 = 113.2270833 + 1.980637255(1) = 115.2077206$
and
$\hat{Y}_1 = 115.2077206 + (-50.80208333) = 64.40563725$
48
MSE = 27.911
MAD = 4.477
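The additive decomposition can be sketched end to end in the same way as the multiplicative one. Assumptions: seasonal indices are shifted to sum to zero, the trend is fitted by ordinary least squares, and MSE and MAD are averaged over all 16 periods. Variable names are illustrative.

```python
# Quarterly sales from the illustration (periods 1..16).
sales = [72, 110, 117, 172, 76, 112, 130, 194,
         78, 119, 128, 201, 81, 134, 141, 216]
n = len(sales)
periods = list(range(1, n + 1))

def quarter(t):
    return (t - 1) % 4 + 1

# Step 1: additive seasonal indices (observation minus centered average).
ma = [sum(sales[i:i + 4]) / 4 for i in range(n - 3)]
dev = {i + 3: sales[i + 2] - (ma[i] + ma[i + 1]) / 2
       for i in range(len(ma) - 1)}
avg = {}
for q in (1, 2, 3, 4):
    vals = [d for t, d in dev.items() if quarter(t) == q]
    avg[q] = sum(vals) / len(vals)
shift = sum(avg.values()) / 4
sn = {q: a - shift for q, a in avg.items()}

# Step 2: least squares trend fitted to the deseasonalized series.
d = [y - sn[quarter(t)] for t, y in zip(periods, sales)]
t_bar, d_bar = sum(periods) / n, sum(d) / n
b1 = (sum((t - t_bar) * (x - d_bar) for t, x in zip(periods, d))
      / sum((t - t_bar) ** 2 for t in periods))
b0 = d_bar - b1 * t_bar

# Step 3: fitted values (trend plus seasonal index) and accuracy measures.
fitted = [b0 + b1 * t + sn[quarter(t)] for t in periods]
errors = [y - f for y, f in zip(sales, fitted)]
mse = sum(e * e for e in errors) / n
mad = sum(abs(e) for e in errors) / n
print(round(b0, 4), round(b1, 4), round(fitted[0], 4))
print(round(mse, 3), round(mad, 3))
```

On this series the additive fit has a larger MSE and MAD than the multiplicative fit (27.911 against 11.932, and 4.477 against 2.892).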
49