Statistical Forecasting Models
(Lesson - 07)
Best Bet to See the Future
Dr. C. Ertuna
Statistical Forecasting Models
• Time Series Models: independent variable
is time.
– Moving Average
– Exponential Smoothing
– Holt-Winters Model
• Explanatory Methods: independent
variable is one or more factor(s).
– Regression
Time Series Models
• Statistical Time Series Models are very
useful for short range forecasting problems
such as weekly sales.
• Time series models assume that whatever
forces have influenced the variables in
question (sales) in the recent past will
continue into the near future.
Time Series Components
A time series can be described by models based on the following components:
$T_t$ = Trend Component
$S_t$ = Seasonal Component
$C_t$ = Cyclical Component
$I_t$ = Irregular Component
Using these components we can define a time series as the sum of its components, an additive model:
$$X_t = T_t + S_t + C_t + I_t$$
Alternatively, in other circumstances we might define a time series as the product of its components, a multiplicative model, often represented as a logarithmic model:
$$X_t = T_t \times S_t \times C_t \times I_t$$
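A short worked step may help show why the multiplicative model is often written in logarithmic form. Assuming every component is positive (an assumption not stated on the slide), taking logarithms turns the product into a sum, so the logged series can be treated as an additive model:

```latex
\begin{aligned}
X_t &= T_t \times S_t \times C_t \times I_t \\
\log X_t &= \log T_t + \log S_t + \log C_t + \log I_t
\end{aligned}
```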
Components of Time Series Data
• A linear trend is any long-term increase or
decrease in a time series in which the rate of
change is relatively constant.
• A seasonal component is a pattern that is repeated
throughout a time series and has a recurrence
period of at most one year.
• A cyclical component is a pattern within the time
series that repeats itself throughout the time series
and has a recurrence period of more than one year.
Components of Time Series Data
• The irregular (or random) component
refers to changes in the time-series data that
are unpredictable and cannot be associated
with the trend, seasonal, or cyclical
components.
Stationary Time Series Models
Time series with constant mean and variance
are called stationary time series.
When Trend, Seasonal, or Cyclical effects are
not significant then
a) Moving Average Models and
b) Exponential Smoothing Models
are useful over short time periods.
Moving Average Models
• Simple Moving Average forecast is
computed as the average of the most recent
k-observations.
• Weighted Moving Average forecast is
computed as the weighted average of the
most recent k-observations where the most
recent observation has the highest weight.
Moving Average Models
• Simple Moving Average Forecast
$$F_t = E(Y_t) = \frac{\sum_{i=t-k}^{t-1} Y_i}{k}$$
• Weighted Moving Average Forecast
$$F_t = E(Y_t) = \sum_{i=t-k}^{t-1} w_i Y_i, \qquad \sum_{i=t-k}^{t-1} w_i = 1$$
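The two forecasts defined above can be sketched in a few lines of Python. This is a minimal illustration, not the course's own implementation: the function names are hypothetical, the weights are assumed to be ordered from oldest to newest, and the three observations are the first burglary counts (months 42 to 44) from the example that follows.

```python
def simple_moving_average(history, k):
    """Forecast the next period as the mean of the last k observations."""
    window = history[-k:]
    return sum(window) / k


def weighted_moving_average(history, weights):
    """Forecast the next period as a weighted mean of the most recent observations.

    weights are ordered oldest to newest and are assumed to sum to 1.
    """
    window = history[-len(weights):]
    return sum(w * y for w, y in zip(weights, window))


burglaries = [88, 44, 60]  # months 42, 43, 44
sma_45 = simple_moving_average(burglaries, k=3)
wma_45 = weighted_moving_average(burglaries, weights=[0.1, 0.3, 0.6])
print(round(sma_45, 1), round(wma_45, 1))  # 64.0 58.0
```

The weighted result, 58.0, matches the first wMA value in the burglary spreadsheet.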
Weighted Moving Average
Weights (100.00%, =SUM(C4:C6)): all weights should add up to exactly 1. The further an observation is from the forecast period, the lower its weight; the most recent observation has the highest weight.
Month   Actual Burglaries   wMA (k = 3)      Formula
42      88                  0.1 (weight)
43      44                  0.3 (weight)
44      60                  0.6 (weight)
45      56                  58.0             =B5*$C$6+B4*$C$5+B3*$C$4
46      70                  56.0             =B6*$C$6+B5*$C$5+B4*$C$4
47      91                  64.8             =B7*$C$6+B6*$C$5+B5*$C$4
48      54                  81.2             :
49      60                  66.7             :
50      48                  61.3             :
51      35                  52.2             :
52      49                  41.4             :
53      44                  44.7             :
54      61                  44.6             :
55      68                  54.7             :
56      82                  63.5             :
57      71                  75.7             :
58      50                  74.0
59                          59.5             Preliminary forecasted number of burglaries
MSE = 256.3    =SUMXMY2(B7:B20,C7:C20)/COUNT(B7:B20)
RMSE = 16.01   =SQRT(C22)
• To determine the best weights and period (k), we can use forecast accuracy.
• MSE (Mean Square Error) is a good measure of forecast accuracy.
• RMSE is the square root of the MSE.
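The MSE and RMSE figures on this slide can be reproduced outside the spreadsheet. The sketch below is one possible Python translation of the worksheet formulas, using the burglary counts for months 42 to 58 and the weights 0.1, 0.3, 0.6 from the table; the variable names are illustrative only.

```python
import math

# Actual burglaries for months 42..58, copied from the table above.
actual = [88, 44, 60, 56, 70, 91, 54, 60, 48, 35, 49, 44, 61, 68, 82, 71, 50]
weights = [0.1, 0.3, 0.6]  # oldest to newest; must sum to 1
k = len(weights)

# forecasts[j] is the forecast for the period after actual[j + k - 1],
# so the last entry is the forecast for month 59.
forecasts = [
    sum(w * y for w, y in zip(weights, actual[i - k:i]))
    for i in range(k, len(actual) + 1)
]

# Compare forecasts with the observations they were made for (months 45..58).
errors = [y - f for y, f in zip(actual[k:], forecasts)]
mse = sum(e * e for e in errors) / len(errors)
rmse = math.sqrt(mse)

print(round(forecasts[-1], 1))        # 59.5 (month 59)
print(round(mse, 1), round(rmse, 2))  # 256.3 16.01
```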
Data: Evens - Burglaries
Weighted Moving Average
(Same data as the previous table: weights 0.1, 0.3, 0.6; forecast for month 59 = 59.5; MSE = 256.3; RMSE = 16.01.)
• Tools / Solver
• Set Target Cell: Cell containing RMSE value
• Equal to: Min
• By Changing Cells: Cells containing weights
• Subject to constraints: Cell containing sum of the weights = 1
• Options / (check) Assume Non-Negativity
• Solve ----- Keep Solver Solution ----- OK
Weighted Moving Average
Weights after running Solver (sum = 100.00%):
Month   Actual Burglaries   wMA (k = 3)
42      88                  0.0285 (weight)
43      44                  0.2093 (weight)
44      60                  0.7622 (weight)
45      56                  57.5
46      70                  56.5
47      91                  66.8
48      54                  85.6
49      60                  62.2
50      48                  59.6
51      35                  50.7
52      49                  38.4
53      44                  46.0
54      61                  44.8
55      68                  57.1
56      82                  65.8
57      71                  78.5
58      50                  73.2
59                          55.3
MSE = 250.6
RMSE = 15.83
• The best weights for a given k (in this case 3) are determined by Solver through minimizing RMSE.
• The same procedure can be applied to models with different values of k; the one with the lowest RMSE can be considered the model with the best forecasting period.
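For readers without Excel Solver, the same idea can be approximated with a plain grid search. The sketch below is not what Solver does internally; it simply tries every non-negative weight triple that sums to 1 in steps of 0.01 (the step size and function names are assumptions made for this illustration) and keeps the triple with the lowest RMSE.

```python
import math

# Actual burglaries for months 42..58.
actual = [88, 44, 60, 56, 70, 91, 54, 60, 48, 35, 49, 44, 61, 68, 82, 71, 50]


def wma_rmse(series, weights):
    """RMSE of one-step-ahead weighted moving average forecasts (weights oldest to newest)."""
    k = len(weights)
    sq_errors = [
        (series[i] - sum(w * y for w, y in zip(weights, series[i - k:i]))) ** 2
        for i in range(k, len(series))
    ]
    return math.sqrt(sum(sq_errors) / len(sq_errors))


def best_weights_k3(series, step=0.01):
    """Grid search over non-negative weight triples that sum to 1."""
    n = round(1 / step)
    best = None
    for a in range(n + 1):
        for b in range(n + 1 - a):
            weights = (a / n, b / n, (n - a - b) / n)
            score = wma_rmse(series, weights)
            if best is None or score < best[0]:
                best = (score, weights)
    return best


start_rmse = wma_rmse(actual, (0.1, 0.3, 0.6))
best_rmse, best_w = best_weights_k3(actual)
print(round(start_rmse, 2))  # 16.01
print(best_w, round(best_rmse, 2))
```

Because the grid is coarse, the weights it finds should be close to, but not necessarily identical with, the Solver weights shown in the table.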
Moving Average Models
Months   Crime   Forecast (k=3)   Errors
50       48
51       35      #N/A             #N/A
52       49      #N/A             #N/A
53       44      44.00            #N/A
54       61      42.67            #N/A
55       68      51.33            6.33
56       82      57.67            8.21
57       71      70.33            10.59
58       50      73.67            9.13
59               67.67            12.32
[Chart: Moving Average. Actual and Forecast crimes (0 to 90) by month (50 to 59).]
• Tools / Data Analysis / Moving Average
• Input Range: the observations with their title (without the time column)
• Output Range: select the column next to the input range, starting one row below the first observation
• The chart misaligns the forecasted values: the forecast for the 59th month is aligned with the 58th month.
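The alignment issue noted above is easier to avoid when each forecast is stored under the month it predicts. A minimal Python sketch, using the crime counts for months 50 to 58 from the table (the dictionary layout is just one way to make the alignment explicit):

```python
months = list(range(50, 59))
crimes = [48, 35, 49, 44, 61, 68, 82, 71, 50]
k = 3

# Key each forecast by the month it predicts: the mean of months m-3..m-1
# is the forecast for month m, so the last key is month 59.
forecast_by_month = {}
for i in range(k, len(crimes) + 1):
    target_month = months[0] + i
    forecast_by_month[target_month] = sum(crimes[i - k:i]) / k

for month, value in forecast_by_month.items():
    print(month, round(value, 2))
```

The printed values for months 53 to 59 correspond to the forecast column of the table, ending with 67.67 for month 59.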
Exponential Smoothing
Exponential smoothing is a time-series smoothing
and forecasting technique that produces an
exponentially weighted moving average in which
each smoothing calculation or forecast is dependent
upon all previously observed values.
• The smoothing factor “α” is a value between 0 and 1; an α closer to 1 gives more weight to recent observations and hence a more rapidly changing forecast.
Exponential Smoothing Model
$$F_t = F_{t-1} + \alpha (Y_{t-1} - F_{t-1})$$
or
$$F_t = \alpha Y_{t-1} + (1 - \alpha) F_{t-1}$$
where:
$F_t$ = Forecast value for period t
$Y_{t-1}$ = Actual value for period t-1
$F_{t-1}$ = Forecast value for period t-1
$\alpha$ = Alpha (smoothing constant)
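Substituting the recursion into itself shows why every forecast depends on all previously observed values, with weights that decay exponentially. The expansion below is a standard algebraic step rather than something shown on the slide; n is the number of substitutions performed.

```latex
\begin{aligned}
F_t &= \alpha Y_{t-1} + (1-\alpha) F_{t-1} \\
    &= \alpha Y_{t-1} + \alpha (1-\alpha) Y_{t-2} + (1-\alpha)^2 F_{t-2} \\
    &= \alpha \sum_{j=0}^{n-1} (1-\alpha)^j \, Y_{t-1-j} + (1-\alpha)^n F_{t-n}
\end{aligned}
```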
Exponential Smoothing Model
Month   Crimes   Forecast (alpha=0.7)
50      48       #N/A
51      35       48.0
52      49       38.9
53      44       46.0
54      61       44.6
55      68       56.1
56      82       64.4
57      71       76.7
58      50       72.7
59      ?        56.8
[Chart: Exponential Smoothing. Actual and Forecast crimes (0 to 90) by month (50 to 59).]
• Tools / Data Analysis / Exponential Smoothing
• Input Range: the observations with their title (without the time column)
• Output Range: select the column next to the input range, starting at the row of the first observation
• Damping Factor: 1-α (not α)
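The same forecasts can be computed directly from the smoothing formula. The sketch below is a hypothetical Python version, assuming (as the spreadsheet does) that the first forecast is seeded with the first actual observation; the function name is illustrative.

```python
def exponential_smoothing(actual, alpha):
    """Return one-step-ahead forecasts; the last item is for the next, unobserved period.

    The first forecast is seeded with the first actual observation.
    """
    forecasts = [actual[0]]  # forecast for the second period
    for y in actual[1:]:
        forecasts.append(alpha * y + (1 - alpha) * forecasts[-1])
    return forecasts


crimes = [48, 35, 49, 44, 61, 68, 82, 71, 50]         # months 50..58
forecasts = exponential_smoothing(crimes, alpha=0.7)  # months 51..59
print([round(f, 2) for f in forecasts])               # last value: 56.82 (month 59)
```

Note that this function takes α itself, whereas Excel's dialog asks for the damping factor 1-α.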
Exponential Smoothing Model
Spreadsheet layout (α = 0.7 in cell C1):
Row   A (Month)   B (Crime)   C
1     Month       Crime       0.7
2     50          48          #N/A
3     51          35          48.00   ! Actual observation B2
4     52          49          38.90
5     53          44          45.97
6     54          61          44.59
7     55          68          56.08
8     56          82          64.42
9     57          71          76.73
10    58          50          72.72
11    59          ?           56.82   =$C$1*B10+(1-$C$1)*C10
MSE = 193.0   =SUMXMY2(B3:B10,C3:C10)/COUNT(B3:B10)
• To determine best “α” we can use forecast accuracy.
• MSE = Mean Square Error is a good measure for forecast accuracy.
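Choosing α by forecast accuracy can be sketched as a simple search. The code below first reproduces the MSE for α = 0.7 and then tries α from 0.00 to 1.00 in steps of 0.01; the step size and the function name are assumptions made for this example, and a spreadsheet Solver run may land on a slightly different value.

```python
def smoothing_mse(actual, alpha):
    """MSE of one-step-ahead exponential smoothing forecasts, seeded with the first observation."""
    forecast = actual[0]
    sq_errors = []
    for y in actual[1:]:
        sq_errors.append((y - forecast) ** 2)
        forecast = alpha * y + (1 - alpha) * forecast
    return sum(sq_errors) / len(sq_errors)


crimes = [48, 35, 49, 44, 61, 68, 82, 71, 50]  # months 50..58

mse_07 = smoothing_mse(crimes, 0.7)
print(round(mse_07, 1))  # 193.0

# Try alpha = 0.00, 0.01, ..., 1.00 and keep the value with the lowest MSE.
candidates = [i / 100 for i in range(101)]
best_alpha = min(candidates, key=lambda a: smoothing_mse(crimes, a))
print(best_alpha, round(smoothing_mse(crimes, best_alpha), 1))
```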
Holt-Winters Model
The Holt-Winters forecasting model can be used to forecast series with a trend. The model consists of an exponential smoothing component (E, w) and a trend component (T, v), each with its own smoothing factor.
Holt-Winters Model
$$F_{t+k} = E_t + k T_t$$
$$E_t = w Y_t + (1 - w)(E_{t-1} + T_{t-1})$$
$$T_t = v (E_t - E_{t-1}) + (1 - v) T_{t-1}$$
where:
$F_{t+k}$ = Forecast value k periods from t
$Y_t$ = Actual value for period t
$E_{t-1}$ = Estimated value for period t-1
$T_t$ = Trend for period t
w = Smoothing constant for estimates
v = Smoothing factor for trend
k = Number of periods
Initialization:
1. $E_1$ and $T_1$ are not defined.
2. $E_2 = Y_2$
3. $T_2 = Y_2 - Y_1$
Holt-Winters Model
[Chart: Holt-Winter Forecasting. Sales and F (0.0 to 60.0) by month (1 to 13).]
Spreadsheet layout (w = 0.7 in cell D1, v = 0.5 in cell E1; Sales in column C, E in column D, T in column E, F in column F):
Row   Month   Sales   E      T      F
3     1       4.8     N/A    N/A
4     2       4.0     4.0    -0.8
5     3       5.5     4.8    0.0    3.2
6     4       15.6    12.4   3.8    4.8
7     5       23.1    21.0   6.2    16.1
8     6       23.3    24.5   4.8    27.2
9     7       31.4    30.8   5.6    29.3
10    8       46.0    43.1   8.9    36.3
11    9       46.1    47.9   6.9    52.1
12    10      41.9    45.8   2.4    54.8
13    11      45.5    46.3   1.4    48.1
14    12      53.5    51.8   3.5    47.7
15    13                            55.24
• E_2 = Y_2 and T_2 = (Y_2-Y_1)
• E_12 = $D$1*C14+(1-$D$1)*(D13+E13)
• T_12 = $E$1*(D14-D13)+(1-$E$1)*E13
• F_13 = D14+E14
Holt-Winters Model
• Set up the E (smoothing component), T (trend component), and F (forecasted values) columns next to Y (actual observations), in that sequence
• Determine initial “w” and “v” values
• Leave E, T, and F blank for the base period (t=1)
• Set E2 = Y2
• Set T2 = Y2-Y1 (Note: F2 is blank)
Holt-Winters Model
• Formulate E3 = w*Y3 + (1-w)*(E2+T2)
• Formulate T3 = v*(E3-E2) + (1-v)*T2
• Formulate F3 = E2 + T2
• Copy the formulas down until reaching one cell further than the last observation (Yn).
• Compute MSE using Y’s and F’s
• Use Solver to determine optimal “w” and “v”.
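The spreadsheet steps above can be sketched in Python as follows. This is an illustrative translation of the worksheet formulas, not an official implementation; the function name is hypothetical, lists are 0-based (so index 1 is period t = 2), and the sales figures are the twelve monthly values from the example spreadsheet.

```python
def holt_forecasts(y, w, v):
    """Return (E, T, F) lists indexed like y (0-based); None where a value is undefined.

    F has one extra entry at the end: the forecast for the period after the last observation.
    """
    n = len(y)
    E = [None] * n
    T = [None] * n
    F = [None] * (n + 1)
    # Initialization at the second observation (t = 2 in the slides).
    E[1] = y[1]
    T[1] = y[1] - y[0]
    for t in range(2, n):
        F[t] = E[t - 1] + T[t - 1]
        E[t] = w * y[t] + (1 - w) * F[t]
        T[t] = v * (E[t] - E[t - 1]) + (1 - v) * T[t - 1]
    F[n] = E[n - 1] + T[n - 1]  # one step past the last observation
    return E, T, F


sales = [4.8, 4.0, 5.5, 15.6, 23.1, 23.3, 31.4, 46.0, 46.1, 41.9, 45.5, 53.5]
E, T, F = holt_forecasts(sales, w=0.7, v=0.5)
print(round(F[-1], 2))  # 55.24 (forecast for month 13)
```

With w = 0.7 and v = 0.5 the final forecast agrees with the 55.24 shown in the spreadsheet.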
Holt-Winters Model
Solver setup for Holt-Winters:
• Target Cell: MSE (min)
• Changing Cells: w and v
• Constraints: w <= 1, w >= 0, v <= 1, v >= 0
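As a rough stand-in for this Solver setup, the sketch below searches w and v on a 0.01 grid between 0 and 1 and keeps the pair with the lowest MSE. The grid search, its step size, and the function name are assumptions for illustration; Solver uses a different optimization method and may report slightly different values.

```python
def holt_mse(y, w, v):
    """MSE between observations and one-step-ahead Holt forecasts, starting at the third period."""
    level, trend = y[1], y[1] - y[0]  # E2 = Y2, T2 = Y2 - Y1
    sq_errors = []
    for obs in y[2:]:
        forecast = level + trend
        sq_errors.append((obs - forecast) ** 2)
        new_level = w * obs + (1 - w) * forecast
        trend = v * (new_level - level) + (1 - v) * trend
        level = new_level
    return sum(sq_errors) / len(sq_errors)


sales = [4.8, 4.0, 5.5, 15.6, 23.1, 23.3, 31.4, 46.0, 46.1, 41.9, 45.5, 53.5]

# Search w and v over 0.00, 0.01, ..., 1.00, which enforces 0 <= w, v <= 1.
grid = [i / 100 for i in range(101)]
best_mse, best_w, best_v = min(
    (holt_mse(sales, w, v), w, v) for w in grid for v in grid
)
start_mse = holt_mse(sales, 0.7, 0.5)
print(round(start_mse, 2), best_w, best_v, round(best_mse, 2))
```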
Forecasting with Crystal Ball
• CBTools / CB Predictor
– [Input Data] Select Range, First Row, First Column, then Next
– [Data Attribute] Data is in periods, etc., then Next
– [Method Gallery] Select All, then Next
– [Results] Number of periods to forecast [1]; select Paste Forecasts at cell; then Run
Forecasting with Crystal Ball
Actual Revenues of EASTMAN KODAK (Data: EASTMANK)
Year   Actual Revenue
1975   5.0
1976   5.4
1977   6.0
1978   7.0
1979   8.0
1980   9.7
1981   10.3
1982   10.8
1983   10.2
1984   10.6
1985   10.6
1986   11.5
1987   13.3
1988   17.0
1989   18.4
1990   18.9
1991   19.4
1992   20.2
1993   16.3
1994   13.7
1995   15.3
1996   16.2
1997   14.5
1998   13.4
1999   14.1
Forecasting with Crystal Ball
Method Errors:
Rank   Method                         RMSE     MAD      MAPE
Best   Double Exponential Smoothing   1.5043   0.9871   7.68%
2nd    Single Exponential Smoothing   1.5147   1.1566   9.03%
3rd    Single Moving Average          1.5453   1.2042   9.40%
4th    Double Moving Average          2.0855   1.592    11.16%
Method Parameters:
Rank   Method                         Parameter   Value
Best   Double Exponential Smoothing   Alpha       0.999
                                      Beta        0.051
2nd    Single Exponential Smoothing   Alpha       0.999
3rd    Single Moving Average          Periods     1
4th    Double Moving Average          Periods     2
Forecast:
Date   Lower: 5%   Forecast   Upper: 95%
2000   11.9        14.4       17.0
[Chart: Actual Revenue (0.0 to 25.0) by year, 1975 to 1999, showing Data, Fitted, Forecast, Upper: 95%, and Lower: 5%.]
Performance of a Model
Performance of a model is measured by Theil’s U.
The Theil's U statistic typically falls between 0 and 1. When U = 0, the predictive performance of the model is excellent; when U = 1, the forecasting performance is no better than using the last actual observation as the forecast. A value above 1 means the model does worse than that naive forecast.
Theil’s U versus RMSE
The difference between RMSE (or MAD or MAPE) and Theil’s U is that the former are measures of ‘fit’: they measure how well the model fits the historical data.
Theil's U, on the other hand, measures how well the model predicts compared with a ‘naive’ model. A naive model forecasts by repeating the most recent value of the variable as the next forecasted value.
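To make the comparison with the naive model concrete, here is a small Python sketch of one common definition of Theil's U (often called U2), based on relative changes. The slides do not give a formula, so treating this as the definition is an assumption, and Crystal Ball's exact computation may differ. The function name is hypothetical, and the observations must be non-zero.

```python
import math


def theils_u(actual, forecast):
    """Theil's U (U2 form): model errors relative to the errors of a naive 'no change' forecast.

    forecast[t] is the model's forecast for actual[t]; the first period is skipped
    because the naive forecast needs a previous observation.
    """
    periods = range(1, len(actual))
    model = sum(((forecast[t] - actual[t]) / actual[t - 1]) ** 2 for t in periods)
    naive = sum(((actual[t] - actual[t - 1]) / actual[t - 1]) ** 2 for t in periods)
    return math.sqrt(model / naive)


crimes = [48, 35, 49, 44, 61, 68, 82, 71, 50]

naive_forecast = [None] + crimes[:-1]  # repeat the previous observation
perfect_forecast = list(crimes)

print(theils_u(crimes, naive_forecast))    # 1.0
print(theils_u(crimes, perfect_forecast))  # 0.0
```

The two test cases mirror the interpretation in the text: a model that only repeats the last observation scores 1, and a perfect forecast scores 0.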
Choosing Forecasting Model
The forecasting model should be the one with the lowest Theil’s U.
If the model with the best Theil’s U is not the same as the model with the best RMSE, then you need to run CB again with only the best Theil’s U model checked to obtain the forecasted value.
P.S. CB uses the forecast value of the lowest-RMSE model (the best model according to CB)!
Determining Performance
Theil’s U determines the forecasting performance of the model.
The interpretation in everyday language is as follows:
Interpret (1 - Theil’s U):
1.00 – 0.80   High (strong) forecasting power
0.80 – 0.60   Moderately high forecasting power
0.60 – 0.40   Moderate forecasting power
0.40 – 0.20   Weak forecasting power
0.20 – 0.00   Very weak forecasting power
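The scale above can be expressed as a small lookup function. The sketch is illustrative only: the function name is hypothetical, and because the bands share their end points (for example 0.80 appears in two rows), assigning each boundary value to the higher band is an assumption.

```python
def forecasting_power(theils_u):
    """Map 1 - Theil's U to the verbal scale above (boundary handling is an assumption)."""
    score = 1 - theils_u
    if score >= 0.80:
        return "High (strong) forecasting power"
    if score >= 0.60:
        return "Moderately high forecasting power"
    if score >= 0.40:
        return "Moderate forecasting power"
    if score >= 0.20:
        return "Weak forecasting power"
    return "Very weak forecasting power"


print(forecasting_power(0.15))  # High (strong) forecasting power
print(forecasting_power(0.55))  # Moderate forecasting power
```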
Regression or Time Series Forecast
Here is the guiding principle for when to apply Regression and when to apply a Time Series Forecast.
• How one thing (the dependent variable) changes as something else (one or more independent variables) changes is a question of directional relationship. For directional relationships we can use regression.
• If the independent variable is TIME (how a variable changes as time changes), then we can use either regression or time series forecasting models.
Explanatory Methods
Simple Linear Regression Model: The simplest inferential forecasting model is the simple linear regression model, where time (t) is the independent variable and the least-squares line is used to forecast the future values of Yt.
Regression in Forecasting Trends
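$$Y_t = \beta_0 + \beta_1 t + \varepsilon_t$$
$$F_t = E(Y_t) = \beta_0 + \beta_1 t$$
where:
$Y_t$ = Value of trend at time t
$\beta_0$ = Intercept of the trend line
$\beta_1$ = Slope of the trend line
$\varepsilon_t$ = Random error at time t
t = Time (t = 1, 2, . . . )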
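A least-squares trend line of this form can be fitted with the closed-form slope and intercept formulas. The sketch below is a minimal Python version with a hypothetical function name; it uses the Eastman Kodak revenue series from the Crystal Ball example, coded as t = 1 for 1975 through t = 25 for 1999, and extrapolates to t = 26. No expected values are shown because the slides do not report a trend-line fit for this series.

```python
def fit_trend(y):
    """Least-squares intercept and slope for y regressed on t = 1, 2, ..., len(y)."""
    n = len(y)
    t = range(1, n + 1)
    t_mean = sum(t) / n
    y_mean = sum(y) / n
    slope = sum((ti - t_mean) * (yi - y_mean) for ti, yi in zip(t, y)) / sum(
        (ti - t_mean) ** 2 for ti in t
    )
    intercept = y_mean - slope * t_mean
    return intercept, slope


# Eastman Kodak actual revenue, 1975..1999 (t = 1..25), from the Crystal Ball example.
revenue = [5.0, 5.4, 6.0, 7.0, 8.0, 9.7, 10.3, 10.8, 10.2, 10.6, 10.6, 11.5, 13.3,
           17.0, 18.4, 18.9, 19.4, 20.2, 16.3, 13.7, 15.3, 16.2, 14.5, 13.4, 14.1]

b0, b1 = fit_trend(revenue)
forecast_2000 = b0 + b1 * (len(revenue) + 1)  # t = 26
print(round(b0, 2), round(b1, 2), round(forecast_2000, 2))
```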
Regression in Forecasting Seasonality
• Many time series have a distinct seasonal pattern. (For example, room sales are usually highest around the summer.)
• Multiple regression models can be used to forecast a time series with seasonal components.
• The use of dummy variables for seasonality is common.
– Dummy variables needed = number of seasons – 1
– For example: quarterly seasonality needs 3 dummies, monthly seasonality needs 11 dummies, etc.
– The load of each seasonal variable (dummy) is measured relative to the omitted season, whose load is hidden in the intercept.
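Regression in Forecasting Seasonality
$$Y_t = \beta_0 + \beta_1 t + \beta_2 Q_1 + \beta_3 Q_2 + \beta_4 Q_3 + \varepsilon_t$$
$$F_t = E(Y_t) = \beta_0 + \beta_1 t + \beta_2 Q_1 + \beta_3 Q_2 + \beta_4 Q_3$$
where:
$Q_1$ = 1 if quarter is 1, = 0 otherwise
$Q_2$ = 1 if quarter is 2, = 0 otherwise
$Q_3$ = 1 if quarter is 3, = 0 otherwise
$\beta_2$ = the load of Q1 above Q4
$\beta_0$ = the overall intercept + the load of Q4
$\varepsilon_t$ = Random error at time t
t = Time (t = 1, 2, . . . )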
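The dummy-variable regression above can be sketched without any external libraries by solving the normal equations. Everything in this block is an illustrative assumption rather than the course's method: the helper names are hypothetical, time is coded as t = 1, 2, ... (not as the Year.quarter value used in the spreadsheet, so the coefficients will not match the slide's figures), period 1 is assumed to be a first quarter, and the dummies are strict 0/1 indicators as defined above. The data are the sixteen quarterly power loads from the example that follows.

```python
def solve_linear_system(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [rhs] for row, rhs in zip(A, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (M[r][n] - sum(M[r][c] * x[c] for c in range(r + 1, n))) / M[r][r]
    return x


def design_row(t):
    """[1, t, Q1, Q2, Q3] for period t = 1, 2, ..., assuming period 1 falls in quarter 1."""
    quarter = (t - 1) % 4 + 1
    return [1.0, float(t)] + [1.0 if quarter == q else 0.0 for q in (1, 2, 3)]


def fit_seasonal_trend(y):
    """Ordinary least squares via the normal equations (X'X) beta = X'y."""
    X = [design_row(t) for t in range(1, len(y) + 1)]
    p = len(X[0])
    XtX = [[sum(row[i] * row[j] for row in X) for j in range(p)] for i in range(p)]
    Xty = [sum(row[i] * yi for row, yi in zip(X, y)) for i in range(p)]
    return solve_linear_system(XtX, Xty)


def predict(beta, t):
    """Fitted or forecast value for period t."""
    return sum(b * x for b, x in zip(beta, design_row(t)))


# Quarterly power load, 1973 Q1 .. 1976 Q4, from the example that follows.
power_load = [106.8, 89.2, 110.7, 91.7, 108.6, 98.9, 120.1, 102.1,
              113.1, 94.2, 120.5, 107.4, 116.2, 104.4, 131.7, 117.9]

beta = fit_seasonal_trend(power_load)
print([round(b, 2) for b in beta])  # b0, b1 (trend), then Q1, Q2, Q3 loads relative to Q4
print(round(predict(beta, 17), 1))  # forecast for 1977 Q1
```

Quarter 4 has no dummy of its own, so its load is absorbed into the intercept, as described in the definitions above.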
Seasonal Regression
[Chart: Seasonal Regression. Predicted Power Load and Actual Power Load in MegaWatts (80.00 to 135.00) by Year/Quarter, 1973.1 to 1976.4.]
Year     Q1   Q2   Q3   Power Load
1973.1   1    0    0    106.8
1973.2   0    2    0    89.2
1973.3   0    0    3    110.7
1973.4   0    0    0    91.7
1974.1   1    0    0    108.6
1974.2   0    2    0    98.9
1974.3   0    0    3    120.1
1974.4   0    0    0    102.1
1975.1   1    0    0    113.1
1975.2   0    2    0    94.2
1975.3   0    0    3    120.5
1975.4   0    0    0    107.4
1976.1   1    0    0    116.2
1976.2   0    2    0    104.4
1976.3   0    0    3    131.7
1976.4   0    0    0    117.9
E(Y_Q1) = -10801.6 + 5.52 * Year.1 + 8.06
E(Y_Q2) = -10801.6 + 5.52 * Year.2 - 3.50
E(Y_Q3) = -10801.6 + 5.52 * Year.3 + 5.51
E(Y_Q4) = -10801.6 + 5.52 * Year.4
Next Lesson
(Lesson - 09)
Introduction to Optimization