Transcript Slide 1

Appendix B: A Primer of Time Series
Forecasting Models
B.1 A Primer of Time Series Forecasting Models
1
The Universal Time Series Model
$g(Y_t) = f(T_t, S_t, I_t)$
$g$: TRANSFORMATION
$T_t$: TREND
$S_t$: SEASONAL
$I_t$: ERROR (Irregular)
2
Additive Decomposition of the Airline Data
T: Linear Trend
S: Seasonal Average
I: Irregular Component
3
Types of Models
Stationary Only: $g(Y_t) = f(I_t)$
Trend and Stationary: $g(Y_t) = f(T_t, I_t)$
Seasonal and Stationary: $g(Y_t) = f(S_t, I_t)$
Trend, Seasonal, and Stationary: $g(Y_t) = f(T_t, S_t, I_t)$
4
Exponential Smoothing Models (ESM)
Stationary Only
– Simple Exponential Smoothing (one parameter)
Trend and Stationary
– Double (Brown) Exponential Smoothing (one parameter)
– Linear (Holt) Exponential Smoothing (two parameters)
– Damped-Trend Exponential Smoothing (three parameters)
continued...
5
Exponential Smoothing Models (ESM)
Seasonal and Stationary
– Seasonal Exponential Smoothing (two parameters) (Both additive and multiplicative types are supported.)
Trend, Seasonal, and Stationary
– Holt-Winters Additive (three parameters)
– Holt-Winters Multiplicative (three parameters)
6
Exponential Smoothing Premise
Weighted averages of past values can produce good forecasts of the future.
The weights should emphasize the most recent data.
Forecasting should require only a few parameters.
Forecast equations should be simple and easy to implement.
7
ESM as Weighted Averages
[Figure: weights applied to past values Y1 through Y8 to predict Y9, in two panels: Sample Mean and Random Walk]
8
ESM as Weighted Averages
Sample Mean
$\hat{Y}_{n+1} = \sum_{t=1}^{n} w_t Y_t = w_1 Y_1 + w_2 Y_2 + \cdots + w_n Y_n = \sum_{t=1}^{n} \frac{1}{n} Y_t = \frac{1}{n} \sum_{t=1}^{n} Y_t = \bar{Y}$, with $w_t = \frac{1}{n}$
For Y1 through Y8: $\hat{Y}_9 = \frac{1}{8} \sum_{t=1}^{8} Y_t$
The mean is a weighted average where all weights are the same.
9
ESM as Weighted Averages
Random Walk
$\hat{Y}_{n+1} = \sum_{t=1}^{n} w_t Y_t = Y_n$, with $w_n = 1$ and $w_t = 0$ for $t = 1, 2, \ldots, n-1$
For Y1 through Y8: $\hat{Y}_9 = Y_8$
A random walk forecast is a weighted average where all weights are 0 except the most recent, which is 1.
10
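The two weighting schemes on these slides can be sketched in a few lines of Python. This is only an illustration: the function names and the eight data values are made up for the example and do not come from the slides.

```python
def weighted_forecast(values, weights):
    """Forecast the next value as the weighted sum of past values."""
    if len(values) != len(weights):
        raise ValueError("values and weights must have the same length")
    return sum(w * y for w, y in zip(weights, values))


def mean_weights(n):
    """Sample mean: every past value gets the same weight 1/n."""
    return [1.0 / n] * n


def random_walk_weights(n):
    """Random walk: weight 1 on the most recent value, 0 elsewhere."""
    return [0.0] * (n - 1) + [1.0]


# Hypothetical history Y1..Y8 (illustrative numbers only).
history = [12.0, 15.0, 11.0, 14.0, 13.0, 16.0, 15.0, 18.0]

mean_forecast = weighted_forecast(history, mean_weights(len(history)))
rw_forecast = weighted_forecast(history, random_walk_weights(len(history)))

print(mean_forecast)  # the sample mean of the history
print(rw_forecast)    # the last observed value
```

Both forecasts come from the same function; only the weight vector changes, which is the point the slides make before introducing exponentially decaying weights.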
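The Exponential Smoothing Coefficient
Forecast Equation
$\hat{Y}_{t+1} = \alpha Y_t + (1-\alpha)\hat{Y}_t$
$= \alpha Y_t + (1-\alpha)[\alpha Y_{t-1} + (1-\alpha)\hat{Y}_{t-1}]$
$= \alpha Y_t + \alpha(1-\alpha)Y_{t-1} + (1-\alpha)^2 \hat{Y}_{t-1}$
$= \alpha Y_t + \alpha(1-\alpha)Y_{t-1} + (1-\alpha)^2[\alpha Y_{t-2} + (1-\alpha)\hat{Y}_{t-2}]$
$= \alpha Y_t + \alpha(1-\alpha)Y_{t-1} + \alpha(1-\alpha)^2 Y_{t-2} + (1-\alpha)^3 \hat{Y}_{t-2} = \cdots$
$\approx \alpha \sum_{i=0}^{T} (1-\alpha)^i Y_{t-i}$ (the leftover term $(1-\alpha)^{T+1}\hat{Y}_{t-T}$ shrinks toward 0 as $T$ grows)
11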
Simple Exponential Smoothing
[Figure: weights applied to past values to predict Y9, in two panels: $\alpha = 0.5$ (Y3 through Y8 shown) and $\alpha = 0.25$ (Y1 through Y8 shown)]
As the parameter grows larger, the most recent values are emphasized more.
12
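A small Python sketch can make the decaying weights concrete. It assumes the weight on the value i steps back is the smoothing parameter times (1 minus the parameter) to the power i, as in the expansion of the forecast equation. The function names, the data, and the choice to start the recursion at the first observation are assumptions made for the example.

```python
def ses_weights(alpha, n_lags):
    """Weight on the value i steps back, for i = 0 .. n_lags - 1."""
    return [alpha * (1.0 - alpha) ** i for i in range(n_lags)]


def ses_forecast(values, alpha, initial):
    """Run the recursion: forecast = alpha * Y + (1 - alpha) * forecast."""
    forecast = initial
    for y in values:
        forecast = alpha * y + (1.0 - alpha) * forecast
    return forecast


weights_half = ses_weights(0.5, 4)      # 0.5, 0.25, 0.125, 0.0625
weights_quarter = ses_weights(0.25, 4)  # 0.25, 0.1875, 0.140625, ...

# Hypothetical history (illustrative numbers only).
history = [12.0, 15.0, 11.0, 14.0, 13.0, 16.0, 15.0, 18.0]
alpha = 0.25
initial = history[0]  # assumed starting forecast

recursive = ses_forecast(history, alpha, initial)

# Same forecast written as an explicit weighted sum plus the leftover
# weight that still sits on the starting forecast.
n = len(history)
expanded = sum(w * y for w, y in zip(ses_weights(alpha, n), reversed(history)))
expanded += (1.0 - alpha) ** n * initial

print(recursive, expanded)
```

With the larger parameter the first four weights already sum to 0.9375, so older values barely matter. With the smaller parameter the weights decay more slowly and the forecast draws on a longer history.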
ESM for Seasonal Data
…
Jan00 Jan01 Jan02 Jan03 Jan04
Feb00 Feb01 Feb02 Feb03 Feb04
Weights decay with respect to the seasonal factor.
13
ESM Seasonal Factors
First seasonal factor s1 is always “natural.”
First season: January, Monday, Q1
Additive model: factors average to 0
Multiplicative model: factors average to 1
Monthly Seasonal Factors: s1 s2 s3 s4 s5 s6 s7 s8 s9 s10 s11 s12
14
Smoothing Weights
$\alpha$  Level smoothing weight
$\gamma$  Trend smoothing weight
$\delta$  Seasonal smoothing weight
$\phi$  Trend damping weight
The choice of a Greek letter is arbitrary. The software uses names rather than Greek symbols.
15
ESM Parameters and Keywords
ESM                      Parameters                      Name in Repository
Simple                   $\alpha$                        simple
Double                   $\alpha$                        double
Linear (Holt)            $\alpha$, $\gamma$              linear
Damped-Trend             $\alpha$, $\gamma$, $\phi$      damptrend
Seasonal                 $\alpha$, $\delta$              seasonal
Additive Winters         $\alpha$, $\gamma$, $\delta$    addwinters
Multiplicative Winters   $\alpha$, $\gamma$, $\delta$    winters
16
Box-Jenkins ARIMAX Models
ARIMAX: AutoRegressive Integrated Moving Average with eXogenous variables.
AR: Autoregressive → Time series is a function of its own past.
MA: Moving Average → Time series is a function of past shocks (deviations, innovations, errors, and so on).
I: Integrated → Differencing provides stochastic trend and seasonal components, so forecasting requires integration (undifferencing).
X: Exogenous → Time series is influenced by external factors. (These input variables can actually be endogenous or exogenous.)
17
Box-Jenkins ARMA Models
Theory: Given a stationary time series, there exists an ARMA model that approximates the true model arbitrarily closely → universal approximator.
Reality: Given a stationary time series, there is no guarantee that you can find the best ARMA approximator.
Theory: Apply differencing operators until what remains is a stationary time series.
Reality: Differencing might not be the best way to model trend and seasonality. After differencing, the time series could still be nonstationary.
18
Box-Jenkins Forecasting Myths
Myth: Box and Jenkins invented ARIMA models.
Fact: Box and Jenkins brought together existing theory and added some of their results, and thus popularized the use of ARIMA models.
Myth: Box-Jenkins forecasting only works for stationary time series.
Fact: Box-Jenkins forecasting provides a general methodology for forecasting any time series. ARIMA models are nonstationary models that can be decomposed into the usual trend, seasonal, and stationary components.
19
Historical Impediments to Box-Jenkins Modeling
History
– Models are sophisticated and require training and experience to use them successfully.
– Modelers are prone to overfitting the data, which leads to poor forecasts.
– Software is unavailable, unreliable, or too slow for forecasting many time series.
Today
– Techniques exist for automatic model selection.
– Honest assessment techniques prevent overfitting.
– Modern software is reliable and fast.
20
ARIMA Model Specification
ARIMA(p, d, q)(P, D, Q)
p indicates a simple autoregressive order.
P indicates a seasonal autoregressive order.
$p = 1$: $y_t = \phi y_{t-1} + \varepsilon_t$
$p = 3$: $y_t = \phi_1 y_{t-1} + \phi_2 y_{t-2} + \phi_3 y_{t-3} + \varepsilon_t$
$P = 1$: $y_t = \phi_s y_{t-s} + \varepsilon_t$
For monthly data, $s = 12$: $y_t = \phi_{12} y_{t-12} + \varepsilon_t$.
21
AR(1): The Toothpaste Series
$\mathrm{tpst}_t = 0.8\,\mathrm{tpst}_{t-1} + \varepsilon_t$
22
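To see what an AR(1) series with coefficient 0.8 looks like, the recursion can be simulated directly. This is a minimal sketch: the function names, the zero starting value, and the standard normal shocks are assumptions made for illustration, not details of the toothpaste data.

```python
import random


def simulate_ar1(phi, shocks, start=0.0):
    """Generate y_t = phi * y_{t-1} + shock_t for each shock in order."""
    series = []
    previous = start
    for shock in shocks:
        current = phi * previous + shock
        series.append(current)
        previous = current
    return series


def forecast_ar1(phi, last_value, horizon):
    """h-step forecasts: future shocks have mean 0, so the forecast decays."""
    return [last_value * phi ** h for h in range(1, horizon + 1)]


rng = random.Random(0)  # fixed seed so the run is repeatable
shocks = [rng.gauss(0.0, 1.0) for _ in range(200)]
tpst = simulate_ar1(0.8, shocks)

# A single unit shock followed by zeros shows the geometric decay.
impulse = simulate_ar1(0.8, [1.0, 0.0, 0.0, 0.0])
print(impulse)
print(forecast_ar1(0.8, tpst[-1], 3))
```

The impulse response shrinks by a factor of 0.8 each period, which is why an AR(1) forecast reverts toward the series mean as the horizon grows.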
ARIMA Model Specification
ARIMA(p, d, q)(P, D, Q)
d indicates a simple difference of the series.
D indicates a seasonal difference.
$d = 1$: $\Delta y_t = (y_t - y_{t-1})$
$D = 1$: $\Delta_s y_t = (y_t - y_{t-s})$
For monthly data, $s = 12$: $\Delta_{12} y_t = (y_t - y_{t-12})$.
23
ARIMA(1, 1, 0)(0, 0, 0): The Crocs Series
$\Delta \mathrm{Croc}_t = 0.8(\Delta \mathrm{Croc}_{t-1}) + \varepsilon_t$ and $\Delta \mathrm{Croc}_t = (\mathrm{Croc}_t - \mathrm{Croc}_{t-1})$
24
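Differencing and the integration (undifferencing) step needed to get forecasts back on the original scale can be sketched as follows. The function names and the sample values are hypothetical. The same code covers a simple difference (lag 1) and a seasonal difference (lag s).

```python
def difference(series, lag=1):
    """Return y_t - y_{t-lag} for every t where both values exist."""
    return [series[t] - series[t - lag] for t in range(lag, len(series))]


def undifference(differences, initial_values):
    """Rebuild the series from its differences.

    initial_values holds the first `lag` observations of the original
    series; the lag is inferred from its length.
    """
    lag = len(initial_values)
    series = list(initial_values)
    for d in differences:
        # The value `lag` steps back is always at index -lag.
        series.append(series[-lag] + d)
    return series


levels = [3, 5, 4, 8, 7, 10]          # illustrative numbers only
simple = difference(levels, lag=1)     # [2, -1, 4, -1, 3]
seasonal = difference(levels, lag=2)   # [1, 3, 3, 2]

print(undifference(simple, levels[:1]))
print(undifference(seasonal, levels[:2]))
```

Both calls to `undifference` recover the original levels, which is the sense in which forecasting a differenced series "requires integration".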
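ARIMA Model Specification
ARIMA(p, d, q)(P, D, Q)
q indicates a simple moving average order.
Q indicates a seasonal moving average order.
$q = 1$: $y_t = \varepsilon_t - \theta \varepsilon_{t-1}$
$q = 3$: $y_t = \varepsilon_t - \theta_1 \varepsilon_{t-1} - \theta_2 \varepsilon_{t-2} - \theta_3 \varepsilon_{t-3}$
$Q = 1$: $y_t = \varepsilon_t - \theta_s \varepsilon_{t-s}$
For monthly data, $s = 12$: $y_t = \varepsilon_t - \theta_{12} \varepsilon_{t-12}$.
25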
ARIMA(0, 0, 1)(0, 1, 0): The Pork Bellies Series
$\Delta_s \mathrm{PB}_t = \varepsilon_t - 0.4\,\varepsilon_{t-1}$ and $\Delta_s \mathrm{PB}_t = (\mathrm{PB}_t - \mathrm{PB}_{t-12})$
Summer Peaks; “BLT effect”
26
Types of ARIMA Models
Stationary Only: $g(Y_t) = f(I_t)$
Trend and Stationary: $g(Y_t) = f(T_t, I_t)$
Seasonal and Stationary: $g(Y_t) = f(S_t, I_t)$
Trend, Seasonal, and Stationary: $g(Y_t) = f(T_t, S_t, I_t)$
ARIMAX models accommodate exogenous variables.
27
Intermittent Demand Models (IDM)
Intermittent time series have a large number of values
that are zero. These types of series commonly occur in
Internet, inventory, sales, and other data where the
demand for a particular item is intermittent. Typically,
when the value of the series associated with a particular
time period is nonzero, demand occurs. When the value
is zero (or missing), no demand occurs.
Source: SAS®9 Online Help and Documentation
28
Intermittent Demand Data
[Figure: Demand plotted against Time; the values are mostly zeros]
29
Intermittent Demand Models (IDM)
[Figure: Demand plotted against Time, with the demand Size and the Interval between demands marked]
30
Intermittent Demand Models (IDM)
[Figure: two panels, Demand Size against Index and Demand Interval against Index]
Average Demand = Demand Size divided by Demand Interval
31
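The split of an intermittent series into demand sizes and demand intervals can be sketched as below. The function name and the sample series are hypothetical, and counting the first interval from the start of the series is an assumption, since the slides do not say how the first interval is measured.

```python
def decompose_demand(series):
    """Split a series into nonzero demand sizes and the gaps between them.

    A value of 0 or None is treated as "no demand". Each interval counts
    the periods since the previous demand (or since the start of the
    series for the first demand, which is an assumed convention).
    """
    sizes, intervals = [], []
    periods_since_demand = 0
    for value in series:
        periods_since_demand += 1
        if value:
            sizes.append(value)
            intervals.append(periods_since_demand)
            periods_since_demand = 0
    return sizes, intervals


demand = [0, 0, 4, 0, 0, 0, 6, 0, 2, 0]  # illustrative numbers only
sizes, intervals = decompose_demand(demand)

# Average demand per period at each demand occurrence.
average_demand = [s / i for s, i in zip(sizes, intervals)]

print(sizes)           # [4, 6, 2]
print(intervals)       # [3, 4, 2]
print(average_demand)
```

The two derived series are indexed by demand occurrence rather than by time period, matching the "Index" axes on the slide.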
Two IDM Choices
Croston’s Method = Two smoothing models
– The Interval component is fit with an ESM.
– The Size component is fit with an ESM.
– The forecast of Average Demand is Forecast Size/Forecast Interval.
Average Demand Method = One smoothing model
– Average demand is calculated directly from the data and forecast with an ESM.
32
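The two choices can be compared with a short sketch that uses simple exponential smoothing as the ESM for every component. The function names, the smoothing weight of 0.5, the initialization at the first observation, and the data are all assumptions made for illustration. The software may select a different ESM and estimate the weights.

```python
def ses_level(values, alpha):
    """Smoothed level after the last value, starting at the first value."""
    level = values[0]
    for value in values[1:]:
        level = alpha * value + (1.0 - alpha) * level
    return level


def croston_forecast(sizes, intervals, alpha):
    """Croston-style forecast: smoothed size divided by smoothed interval."""
    return ses_level(sizes, alpha) / ses_level(intervals, alpha)


def average_demand_forecast(sizes, intervals, alpha):
    """Average demand method: smooth the size/interval ratios directly."""
    ratios = [s / i for s, i in zip(sizes, intervals)]
    return ses_level(ratios, alpha)


# Hypothetical demand sizes and intervals (illustrative numbers only).
sizes = [4.0, 6.0, 2.0]
intervals = [3.0, 4.0, 2.0]

croston = croston_forecast(sizes, intervals, alpha=0.5)
direct = average_demand_forecast(sizes, intervals, alpha=0.5)

print(croston)  # 3.5 / 2.75
print(direct)   # 29 / 24
```

The two methods give different numbers on the same data because a ratio of smoothed values is not the same as a smoothed ratio.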
Unobserved Components Models (UCMs)
Unobserved components models are also called
structural models in the time series literature. A UCM
decomposes the response series into components such
as trend, seasonals, cycles, and the regression effects
due to predictor series. The components in the model are
supposed to capture the salient features of the series that
are useful in explaining and predicting its behavior.
Source: SAS®9 Online Help and Documentation
33
Unobserved Components Models (UCMs)
also known as structural time series models
decompose the time series into four components:
– trend
– season
– cycle
– irregular
General form:
Yt = Trend + Season + Cycle + Regressors
34
UCMs
Each component captures some important feature of the series dynamics.
Components in the model have their own models.
Each component has its own source of error.
The coefficients for trend, season, and cycle are dynamic.
The coefficients are testable.
Each component has its own forecasts.
35
Types of UCM Models
Stationary Only: $g(Y_t) = f(I_t)$
Trend and Stationary: $g(Y_t) = f(T_t, I_t)$
Seasonal and Stationary: $g(Y_t) = f(S_t, I_t)$
Trend, Seasonal, and Stationary: $g(Y_t) = f(T_t, S_t, I_t)$
UCM models accommodate exogenous variables.
36
Types of Models
UCM Statement                   Model Types
irregular                       Stationary Only (White Noise)
level, slope, irregular         Trend and Stationary
season (or cycle), irregular    Seasonal and Stationary
all statements                  Trend, Seasonal, and Stationary
37
Specifying UCMs
Unobserved components models are available through
the HPFDIAGNOSE and HPFUCMSPEC procedures. The
syntax used by these procedures is similar to that used by
the UCM procedure in SAS/ETS software.
38
Which Model Type?
Performance: time required to derive coefficients and create forecasts
Accuracy
Usability: ease of going from data to forecasts and interpreting results
39
Performance
Best to worst:
1. ESM
2. IDM
3. ARIMAX
4. UCM
40
Accuracy
Best to Worst:
1. ARIMAX, UCM
2. ESM
Intermittent Demand - Best to Worst:
1. IDM (when appropriate).
2. Others can be used, but they generally provide
unacceptable accuracy.
41
Usability
Best to worst:
1. ESM
2. UCM
3. ARIMAX
42
Mean Absolute Percent Error (MAPE)
Absolute percent error for one time point: $100\% \times |\text{Actual} - \text{Forecast}| / \text{Actual}$
Interpretation: the size of forecast error relative to the magnitude of the actual value
Mean absolute percent error: the average of all of the individual absolute percent errors
MAPE is one of the most common accuracy measures in business forecasting. As a selection criterion, choose the model with the smallest value of MAPE.
43
Mean Absolute Error (MAE)
Absolute error for one time point: $|\text{Actual} - \text{Forecast}|$
Interpretation: the size of the forecast error
Mean absolute error: the average of all of the individual absolute errors
MAE is not commonly used as an accuracy measure in business forecasting. As a selection criterion, choose the model with the smallest value of MAE.
44
MAPE versus MAE
            Holiday Sales Day   Low Sales Day   Mean
Actual      1,000               300
Forecast    900                 400
APE         10%                 33.3%           21.67% MAPE
AE          100                 100             100 MAE
An error of 100 on a large sales day is usually not as serious as an error of 100 on a low sales day, but MAE weights both equally.
45
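The two-day comparison can be reproduced with a few lines of Python. The function names are hypothetical. Dividing by the absolute value of the actual is an assumption made so the percent error stays nonnegative, and the measure is undefined when an actual value is zero.

```python
def mape(actuals, forecasts):
    """Mean of 100 * |actual - forecast| / |actual| over all time points."""
    errors = [100.0 * abs(a - f) / abs(a) for a, f in zip(actuals, forecasts)]
    return sum(errors) / len(errors)


def mae(actuals, forecasts):
    """Mean of |actual - forecast| over all time points."""
    errors = [abs(a - f) for a, f in zip(actuals, forecasts)]
    return sum(errors) / len(errors)


actuals = [1000.0, 300.0]   # holiday sales day, low sales day
forecasts = [900.0, 400.0]

mape_value = mape(actuals, forecasts)
mae_value = mae(actuals, forecasts)

# Averaging the unrounded APEs (10% and 33.33...%) gives about 21.67.
print(round(mape_value, 2))
print(mae_value)  # both absolute errors are 100, so the mean is 100.0
```

The absolute errors are identical on both days, so only the percent-based measure reflects that the miss on the low sales day is relatively larger.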
Root Mean Square Error (RMSE)
Squared error for one time point: $(\text{Actual} - \text{Forecast})^2$
Interpretation: the squared size of the forecast error
Mean squared error (MSE): the average of all of the individual squared errors, adjusted for the number of estimated model parameters
Root mean square error: the square root of MSE
RMSE is commonly used as an accuracy measure in industrial, economic, and scientific forecasting. As a selection criterion, choose the model with the smallest value of RMSE.
46
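A minimal sketch of the RMSE calculation follows. The slide says that MSE is adjusted for the number of estimated parameters but does not give the formula. Dividing the sum of squared errors by (number of points minus number of parameters) is an assumption here, and the function name and data are hypothetical.

```python
import math


def rmse(actuals, forecasts, n_params=0):
    """Square root of SSE / (n - n_params).

    With n_params=0 this is the plain root mean square error. The
    degrees-of-freedom divisor is an assumed form of the adjustment.
    """
    n = len(actuals)
    if n - n_params <= 0:
        raise ValueError("need more observations than estimated parameters")
    sse = sum((a - f) ** 2 for a, f in zip(actuals, forecasts))
    return math.sqrt(sse / (n - n_params))


# Two points, each off by 100: RMSE with no adjustment is 100.
unadjusted = rmse([1000.0, 300.0], [900.0, 400.0])

# Four points, SSE = 2, two estimated parameters: sqrt(2 / 2) = 1.
adjusted = rmse([3.0, 5.0, 7.0, 9.0], [2.0, 5.0, 8.0, 9.0], n_params=2)

print(unadjusted, adjusted)
```

Because errors are squared before averaging, a single large miss raises RMSE more than it raises MAE.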
Classes of Models
Exponential smoothing models
ARIMAX models
UCM models
Simple regression models
– are predefined trend components: linear, quadratic, cubic, log-linear, exponential, and so on
– are predefined seasonal dummies
– include a combination of one or more simple predefined components
Simple models
– the mean
– a random walk
– a random walk with drift
47
Performance
Simple models have no performance issues.
Exponential smoothing models can be constructed quickly and easily, so they always have good performance.
ARIMAX models require many more computer cycles than simple or exponential smoothing models, but are based on algorithms that were refined over the past 30 years. Thus, creating a custom fit ARIMAX model is feasible even for large numbers of series.
UCM models are very computer intensive and should be tried only on small data sets or individual time series.
48
Forecasting with SAS Forecast Studio
Functionality:
– Only automatically generated and custom ARIMAX or UCM models accommodate event, input, and outlier (exogenous) variables.
– Pre-existing ESM models and ARIMA models (for example, those shipped in the default catalog) do not accommodate exogenous variables.
– Automatically generated ARIMAX models can select the best combinations of exogenous variables for each series diagnosed (identified).
– Custom, user-defined ARIMAX models must be specified to explicitly accommodate exogenous variables.
49
Static Linear Regression with Two Variables
$Y = \beta_0 + \beta_1 X_1 + \beta_2 X_2 + \varepsilon$
$Y$ is the target (response/dependent) variable.
$X_1$ and $X_2$ are input (predictor/independent) variables.
$\varepsilon$ is the error term.
$\beta_0$, $\beta_1$, and $\beta_2$ are parameters.
$\beta_0$ is the intercept or constant term.
$\beta_1$ and $\beta_2$ are partial regression coefficients.
50
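Time Series Regression
Static Regression
$Y = \beta_0 + \beta_1 X_1 + \cdots + \beta_k X_k + \varepsilon$
Time Series Regression with Ordinary Regressors
$Y_t = \beta_0 + \beta_1 X_{1t} + \cdots + \beta_k X_{kt} + \varepsilon_t$
Time Series Regression with Dynamic Regressors
$Y_t = \beta_0 + \beta_{10} X_{1,t} + \beta_{11} X_{1,t-1} + \cdots + \beta_{1m_1} X_{1,t-m_1}$
$\quad + \beta_{20} X_{2,t} + \beta_{21} X_{2,t-1} + \cdots + \beta_{2m_2} X_{2,t-m_2}$
$\quad + \cdots$
$\quad + \beta_{k0} X_{k,t} + \beta_{k1} X_{k,t-1} + \cdots + \beta_{km_k} X_{k,t-m_k} + \varepsilon_t$
51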
Common Transfer Functions
Contemporaneous Regression
$\omega(B) = \omega_0$
Model
$Y_t = \beta_0 + \omega_0 X_t + Z_t$
$\phi(B) Z_t = \theta(B) \varepsilon_t$
52
Common Transfer Functions
Dynamic Regression: One Lag Term
$\omega(B) = \omega_0 + \omega_1 B$
Model
$Y_t = \beta_0 + \omega_0 X_t + \omega_1 X_{t-1} + Z_t$
$\phi(B) Z_t = \theta(B) \varepsilon_t$
53
Common Transfer Functions
Dynamic Regression: One Shifted Term
$\omega(B) = \omega_k B^k$
Model
$Y_t = \beta_0 + \omega_k X_{t-k} + Z_t$
$\phi(B) Z_t = \theta(B) \varepsilon_t$
54
Common Transfer Functions
Dynamic Regression: One Shifted and One Lag Term
$\omega(B) = \omega_1 B + \omega_2 B^2$
Model
$Y_t = \beta_0 + \omega_1 X_{t-1} + \omega_2 X_{t-2} + Z_t$
$\phi(B) Z_t = \theta(B) \varepsilon_t$
55
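All four transfer functions are polynomials in the backshift operator B, so one helper that applies a list of lag coefficients to the input series covers each case. The sketch below is illustrative only: the function name, the coefficient values, and the input series are made up. It uses plus signs between lag terms, and it leaves out the intercept and the ARMA noise term.

```python
def apply_transfer(x, omegas):
    """Apply a lag polynomial to x.

    omegas[j] is the coefficient on x[t - j]. Output starts at the first
    t for which every required lag of x is available.
    """
    max_lag = len(omegas) - 1
    return [
        sum(w * x[t - j] for j, w in enumerate(omegas))
        for t in range(max_lag, len(x))
    ]


x = [1.0, 2.0, 3.0, 4.0, 5.0]  # illustrative input series

# Contemporaneous regression: only a lag-0 coefficient.
contemporaneous = apply_transfer(x, [2.0])

# One lag term: coefficients on lag 0 and lag 1.
one_lag = apply_transfer(x, [2.0, 1.0])

# One shifted term with k = 2: only a lag-2 coefficient.
shifted = apply_transfer(x, [0.0, 0.0, 3.0])

# One shifted and one lag term: coefficients on lag 1 and lag 2.
shifted_and_lag = apply_transfer(x, [0.0, 1.0, 0.5])

print(contemporaneous)   # [2.0, 4.0, 6.0, 8.0, 10.0]
print(one_lag)           # [5.0, 8.0, 11.0, 14.0]
print(shifted)           # [3.0, 6.0, 9.0]
print(shifted_and_lag)   # [2.5, 4.0, 5.5]
```

A zero coefficient at lag 0 is what makes a term "shifted": the input affects the response only after a delay.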