Transcript Slide 1
Appendix B: A Primer of Time Series Forecasting Models

B.1 A Primer of Time Series Forecasting Models

The Universal Time Series Model
$g(Y_t) = f(T_t, S_t, I_t)$
$g$: TRANSFORMATION, $T_t$: TREND, $S_t$: SEASONAL, $I_t$: ERROR (Irregular)

Additive Decomposition of the Airline Data
T: Linear Trend
S: Seasonal Average
I: Irregular Component

Types of Models
$g(Y_t) = f(I_t)$: Stationary Only
$g(Y_t) = f(T_t, I_t)$: Trend and Stationary
$g(Y_t) = f(S_t, I_t)$: Seasonal and Stationary
$g(Y_t) = f(T_t, S_t, I_t)$: Trend, Seasonal, and Stationary

Exponential Smoothing Models (ESM)
Stationary Only
– Simple Exponential Smoothing (one parameter)
Trend and Stationary
– Double (Brown) Exponential Smoothing (one parameter)
– Linear (Holt) Exponential Smoothing (two parameters)
– Damped-Trend Exponential Smoothing (three parameters)
continued...

Exponential Smoothing Models (ESM)
Seasonal and Stationary
– Seasonal Exponential Smoothing (two parameters)
(Both additive and multiplicative types are supported.)
Trend, Seasonal, and Stationary
– Holt-Winters Additive (three parameters)
– Holt-Winters Multiplicative (three parameters)

Exponential Smoothing Premise
Weighted averages of past values can produce good forecasts of the future.
The weights should emphasize the most recent data.
Forecasting should require only a few parameters.
Forecast equations should be simple and easy to implement.

ESM as Weighted Averages
[Figure: weights applied to past values Y1 through Y8 to predict Y9, for the Sample Mean and the Random Walk]

ESM as Weighted Averages
Sample Mean
$\hat{Y}_{n+1} = \sum_{t=1}^{n} w_t Y_t = w_1 Y_1 + w_2 Y_2 + \cdots + w_n Y_n$
$\hat{Y}_{n+1} = \sum_{t=1}^{n} \frac{1}{n} Y_t = \frac{1}{n} \sum_{t=1}^{n} Y_t = \bar{Y}$, with $w_t = \frac{1}{n}$
[Figure: equal weights on Y1 through Y8; $\hat{Y}_9 = \frac{1}{8} \sum_{t=1}^{8} Y_t$]
The mean is a weighted average where all weights are the same.

ESM as Weighted Averages
Random Walk
$\hat{Y}_{n+1} = \sum_{t=1}^{n} w_t Y_t = Y_n$
$w_n = 1$, $w_t = 0$ for $t = 1, 2, \ldots, n-1$
A random walk forecast is a weighted average where all weights are 0 except the most recent, which is 1.
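
The sample mean and the random walk can both be written as the same weighted sum with different weight vectors. The Python sketch below illustrates this under stated assumptions: the function name `weighted_forecast` and the eight sample values are made up for the example and are not part of the course material.

```python
def weighted_forecast(values, weights):
    """Forecast the next value as a weighted sum of past values."""
    if len(values) != len(weights):
        raise ValueError("values and weights must have the same length")
    return sum(w * y for w, y in zip(weights, values))


# Hypothetical series Y1..Y8 (made-up numbers, for illustration only).
y = [12.0, 15.0, 11.0, 14.0, 13.0, 16.0, 15.0, 18.0]
n = len(y)

# Sample mean: every weight equals 1/n.
mean_weights = [1.0 / n] * n
mean_forecast = weighted_forecast(y, mean_weights)

# Random walk: all weights are 0 except the most recent, which is 1.
rw_weights = [0.0] * (n - 1) + [1.0]
rw_forecast = weighted_forecast(y, rw_weights)

print("Sample mean forecast of Y9:", mean_forecast)
print("Random walk forecast of Y9:", rw_forecast)
```

Only the weight vector changes between the two forecasts. Exponential smoothing fits the same frame, with weights that decay as the values get older.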
[Figure: random walk weights on Y1 through Y8; $\hat{Y}_9 = Y_8$]

The Exponential Smoothing Coefficient
Forecast Equation
$\hat{Y}_{t+1} = \alpha Y_t + (1-\alpha)\hat{Y}_t$
$= \alpha Y_t + (1-\alpha)[\alpha Y_{t-1} + (1-\alpha)\hat{Y}_{t-1}]$
$= \alpha Y_t + \alpha(1-\alpha) Y_{t-1} + (1-\alpha)^2 \hat{Y}_{t-1}$
$= \alpha Y_t + \alpha(1-\alpha) Y_{t-1} + (1-\alpha)^2 [\alpha Y_{t-2} + (1-\alpha)\hat{Y}_{t-2}]$
$= \alpha Y_t + \alpha(1-\alpha) Y_{t-1} + \alpha(1-\alpha)^2 Y_{t-2} + (1-\alpha)^3 \hat{Y}_{t-2}$
$\approx \sum_{i=0}^{T} \alpha (1-\alpha)^i Y_{t-i}$

Simple Exponential Smoothing
[Figure: weights applied to past values to predict Y9, for smoothing weights of 0.5 and 0.25]
As the parameter grows larger, the most recent values are emphasized more.

ESM for Seasonal Data
[Figure: weights on … Jan00, Jan01, Jan02, Jan03, Jan04 and on Feb00, Feb01, Feb02, Feb03, Feb04]
Weights decay with respect to the seasonal factor.

ESM Seasonal Factors
First seasonal factor $s_1$ is always “natural.”
First season: January, Monday, Q1
Additive model: factors average to 0
Multiplicative model: factors average to 1
[Figure: Monthly Seasonal Factors, $s_1$ through $s_{12}$]

Smoothing Weights
$\alpha$: Level smoothing weight
$\gamma$: Trend smoothing weight
$\delta$: Seasonal smoothing weight
$\phi$: Trend damping weight
The choice of a Greek letter is arbitrary. The software uses names rather than Greek symbols.

ESM Parameters and Keywords
ESM                       Parameters                      Name in Repository
Simple                    $\alpha$                        simple
Double                    $\alpha$                        double
Linear (Holt)             $\alpha$, $\gamma$              linear
Damped-Trend              $\alpha$, $\gamma$, $\phi$      damptrend
Seasonal                  $\alpha$, $\delta$              seasonal
Additive Winters          $\alpha$, $\gamma$, $\delta$    addwinters
Multiplicative Winters    $\alpha$, $\gamma$, $\delta$    winters

Box-Jenkins ARIMAX Models
ARIMAX: AutoRegressive Integrated Moving Average with eXogenous variables.
AR: Autoregressive. Time series is a function of its own past.
MA: Moving Average. Time series is a function of past shocks (deviations, innovations, errors, and so on).
I: Integrated. Differencing provides stochastic trend and seasonal components, so forecasting requires integration (undifferencing).
X: Exogenous. Time series is influenced by external factors. (These input variables can actually be endogenous or exogenous.)

Box-Jenkins ARMA Models
Theory: Given a stationary time series, there exists an ARMA model that approximates the true model arbitrarily closely (a universal approximator).
Reality: Given a stationary time series, there is no guarantee that you can find the best ARMA approximator.
Theory: Apply differencing operators until what remains is a stationary time series.
Reality: Differencing might not be the best way to model trend and seasonality. After differencing, the time series could still be nonstationary.

Box-Jenkins Forecasting Myths
Myth: Box and Jenkins invented ARIMA models.
Fact: Box and Jenkins brought together existing theory and added some of their results, and thus popularized the use of ARIMA models.
Myth: Box-Jenkins forecasting only works for stationary time series.
Fact: Box-Jenkins forecasting provides a general methodology for forecasting any time series. ARIMA models are nonstationary models that can be decomposed into the usual trend, seasonal, and stationary components.

Historical Impediments to Box-Jenkins Modeling
History
– Models are sophisticated and require training and experience to use them successfully.
– Modelers are prone to overfitting the data, which leads to poor forecasts.
– Software is unavailable, unreliable, or too slow for forecasting many time series.
Today
– Techniques exist for automatic model selection.
– Honest assessment techniques prevent overfitting.
– Modern software is reliable and fast.

ARIMA Model Specification
ARIMA(p, d, q)(P, D, Q)
p indicates a simple autoregressive order.
P indicates a seasonal autoregressive order.
p = 1: $y_t = \phi y_{t-1} + \varepsilon_t$
p = 3: $y_t = \phi_1 y_{t-1} + \phi_2 y_{t-2} + \phi_3 y_{t-3} + \varepsilon_t$
P = 1: $y_t = \phi_s y_{t-s} + \varepsilon_t$
For monthly data, s = 12: $y_t = \phi_{12} y_{t-12} + \varepsilon_t$.

AR(1): The Toothpaste Series
$\mathrm{tpst}_t = 0.8\,\mathrm{tpst}_{t-1} + \varepsilon_t$

ARIMA Model Specification
ARIMA(p, d, q)(P, D, Q)
d indicates a simple difference of the series.
D indicates a seasonal difference.
d = 1: $\nabla y_t = y_t - y_{t-1}$
D = 1: $\nabla_s y_t = y_t - y_{t-s}$
For monthly data, s = 12: $\nabla_{12} y_t = y_t - y_{t-12}$.

ARIMA(1, 1, 0)(0, 0, 0): The Crocs Series
$\nabla \mathrm{Croc}_t = 0.8\,\nabla \mathrm{Croc}_{t-1} + \varepsilon_t$ and $\nabla \mathrm{Croc}_t = \mathrm{Croc}_t - \mathrm{Croc}_{t-1}$

ARIMA Model Specification
ARIMA(p, d, q)(P, D, Q)
q indicates a simple moving average order.
Q indicates a seasonal moving average order.
q = 1: $y_t = \varepsilon_t - \theta \varepsilon_{t-1}$
q = 3: $y_t = \varepsilon_t - \theta_1 \varepsilon_{t-1} - \theta_2 \varepsilon_{t-2} - \theta_3 \varepsilon_{t-3}$
Q = 1: $y_t = \varepsilon_t - \theta_s \varepsilon_{t-s}$
For monthly data, s = 12: $y_t = \varepsilon_t - \theta_{12} \varepsilon_{t-12}$.

ARIMA(0, 0, 1)(0, 1, 0): The Pork Bellies Series
$\nabla_s \mathrm{PB}_t = \varepsilon_t - 0.4\,\varepsilon_{t-1}$ and $\nabla_s \mathrm{PB}_t = \mathrm{PB}_t - \mathrm{PB}_{t-12}$, with s = 12
Summer Peaks; “BLT effect”

Types of ARIMA Models
$g(Y_t) = f(I_t)$: Stationary Only
$g(Y_t) = f(T_t, I_t)$: Trend and Stationary
$g(Y_t) = f(S_t, I_t)$: Seasonal and Stationary
$g(Y_t) = f(T_t, S_t, I_t)$: Trend, Seasonal, and Stationary
ARIMAX models accommodate exogenous variables.

Intermittent Demand Models (IDM)
Intermittent time series have a large number of values that are zero. These types of series commonly occur in Internet, inventory, sales, and other data where the demand for a particular item is intermittent. Typically, when the value of the series associated with a particular time period is nonzero, demand occurs. When the value is zero (or missing), no demand occurs.
Source: SAS®9 Online Help and Documentation

Intermittent Demand Data
[Figure: Demand plotted against Time; mostly zeros]

Intermittent Demand Models (IDM)
[Figure: Demand plotted against Time, with the demand Size and the Interval between demands marked]

Intermittent Demand Models (IDM)
[Figure: Demand Size plotted against Index, and Demand Interval plotted against Index]
Average Demand = Demand Size divided by Demand Interval

Two IDM Choices
Croston’s Method = Two smoothing models
– The Interval component is fit with an ESM.
– The Size component is fit with an ESM.
– The forecast of Average Demand is Forecast Size/Forecast Interval.
Average Demand Method = One smoothing model
– Average demand is calculated directly from the data and forecast with an ESM.

Unobserved Components Models (UCMs)
Unobserved components models are also called structural models in the time series literature. A UCM decomposes the response series into components such as trend, seasonals, cycles, and the regression effects due to predictor series. The components in the model are supposed to capture the salient features of the series that are useful in explaining and predicting its behavior.
Source: SAS®9 Online Help and Documentation

Unobserved Components Models (UCMs)
Also known as structural time series models.
They decompose a time series into four components:
– trend
– season
– cycle
– irregular
General form: $Y_t$ = Trend + Season + Cycle + Regressors

UCMs
Each component captures some important feature of the series dynamics.
Components in the model have their own models.
Each component has its own source of error.
The coefficients for trend, season, and cycle are dynamic.
The coefficients are testable.
Each component has its own forecasts.

Types of UCM Models
$g(Y_t) = f(I_t)$: Stationary Only
$g(Y_t) = f(T_t, I_t)$: Trend and Stationary
$g(Y_t) = f(S_t, I_t)$: Seasonal and Stationary
$g(Y_t) = f(T_t, S_t, I_t)$: Trend, Seasonal, and Stationary
UCM models accommodate exogenous variables.

Types of Models
UCM Statement                   Model Types
irregular                       Stationary Only (White Noise)
level, slope, irregular         Trend and Stationary
season (or cycle), irregular    Seasonal and Stationary
all statements                  Trend, Seasonal, and Stationary

Specifying UCMs
Unobserved components models are available through the HPFDIAGNOSE and HPFUCMSPEC procedures. The syntax used by these procedures is similar to that used by the UCM procedure in SAS/ETS software.

Which Model Type?
Performance: time required to derive coefficients and create forecasts
Accuracy
Usability: ease of going from data to forecasts and interpreting results

Performance
Best to worst:
1. ESM
2. IDM
3. ARIMAX
4. UCM

Accuracy
Best to worst:
1. ARIMAX, UCM
2. ESM
Intermittent demand, best to worst:
1. IDM (when appropriate).
2. Others can be used, but they generally provide unacceptable accuracy.

Usability
Best to worst:
1. ESM
2. UCM
3. ARIMAX

Mean Absolute Percent Error (MAPE)
Absolute percent error for one time point: 100% × |Actual − Forecast| / Actual
Interpretation: the size of forecast error relative to the magnitude of the actual value
Mean absolute percent error: the average of all of the individual absolute percent errors
MAPE is one of the most common accuracy measures in business forecasting.
As a selection criterion, choose the model with the smallest value of MAPE.

Mean Absolute Error (MAE)
Absolute error for one time point: |Actual − Forecast|
Interpretation: the size of the forecast error
Mean absolute error: the average of all of the individual absolute errors
MAE is not commonly used as an accuracy measure in business forecasting.
As a selection criterion, choose the model with the smallest value of MAE.

MAPE versus MAE
            Holiday Sales Day    Low Sales Day    Mean
Actual      1,000                300
Forecast    900                  400
APE         10%                  33.3%            21.65% (MAPE)
AE          100                  100              100 (MAE)
An error of 100 on a large sales day is usually not as serious as an error of 100 on a low sales day, but MAE weights both equally.

Root Mean Square Error (RMSE)
Squared error for one time point: $(\text{Actual} - \text{Forecast})^2$
Interpretation: the squared size of the forecast error
Mean squared error (MSE): the average of all of the individual squared errors, adjusted for the number of estimated model parameters
Root mean square error: the square root of MSE
RMSE is commonly used as an accuracy measure in industrial, economic, and scientific forecasting.
As a selection criterion, choose the model with the smallest value of RMSE.

Classes of Models
Exponential smoothing models
ARIMAX models
UCM models
Simple regression models
– are predefined trend components: linear, quadratic, cubic, log-linear, exponential, and so on
– are predefined seasonal dummies
– include a combination of one or more simple predefined components
Simple models
– the mean
– a random walk
– a random walk with drift

Performance
Simple models have no performance issues.
Exponential smoothing models can be constructed quickly and easily, so they always have good performance.
ARIMAX models require many more computer cycles than simple or exponential smoothing models, but are based on algorithms that were refined over the past 30 years. Thus, creating a custom fit ARIMAX model is feasible even for large numbers of series.
UCM models are very computationally intensive and should be tried only on small data sets or individual time series.

Forecasting with SAS Forecast Studio
Functionality:
Only automatically generated and custom ARIMAX or UCM models accommodate event, input, and outlier (exogenous) variables.
Pre-existing ESM models and ARIMA models (for example, those shipped in the default catalog) do not accommodate exogenous variables.
Automatically generated ARIMAX models can select the best combinations of exogenous variables for each series diagnosed (identified).
Custom, user-defined ARIMAX models must be specified to explicitly accommodate exogenous variables.

Static Linear Regression with Two Variables
$Y = \beta_0 + \beta_1 X_1 + \beta_2 X_2 + \varepsilon$
$Y$ is the target (response/dependent) variable.
$X_1$ and $X_2$ are input (predictor/independent) variables.
$\varepsilon$ is the error term.
$\beta_0$, $\beta_1$, and $\beta_2$ are parameters.
$\beta_0$ is the intercept or constant term.
$\beta_1$ and $\beta_2$ are partial regression coefficients.

Time Series Regression
Static Regression
$Y = \beta_0 + \beta_1 X_1 + \cdots + \beta_k X_k + \varepsilon$
Time Series Regression with Ordinary Regressors
$Y_t = \beta_0 + \beta_1 X_{1t} + \cdots + \beta_k X_{kt} + \varepsilon_t$
Time Series Regression with Dynamic Regressors
$Y_t = \beta_0 + \beta_{10} X_{1,t} + \beta_{11} X_{1,t-1} + \cdots + \beta_{1m_1} X_{1,t-m_1}$
$\quad + \beta_{20} X_{2,t} + \beta_{21} X_{2,t-1} + \cdots + \beta_{2m_2} X_{2,t-m_2}$
$\quad + \cdots + \beta_{k0} X_{k,t} + \beta_{k1} X_{k,t-1} + \cdots + \beta_{km_k} X_{k,t-m_k} + \varepsilon_t$

Common Transfer Functions
Contemporaneous Regression
$\omega(B) = \omega_0$
Model: $Y_t = \beta_0 + \omega_0 X_t + Z_t$, where $\phi(B) Z_t = \theta(B) \varepsilon_t$

Common Transfer Functions
Dynamic Regression: One Lag Term
$\omega(B) = \omega_0 + \omega_1 B$
Model: $Y_t = \beta_0 + \omega_0 X_t + \omega_1 X_{t-1} + Z_t$, where $\phi(B) Z_t = \theta(B) \varepsilon_t$

Common Transfer Functions
Dynamic Regression: One Shifted Term
$\omega(B) = \omega_k B^k$
Model: $Y_t = \beta_0 + \omega_k X_{t-k} + Z_t$, where $\phi(B) Z_t = \theta(B) \varepsilon_t$

Common Transfer Functions
Dynamic Regression: One Shifted and One Lag Term
$\omega(B) = \omega_1 B + \omega_2 B^2$
Model: $Y_t = \beta_0 + \omega_1 X_{t-1} + \omega_2 X_{t-2} + Z_t$, where $\phi(B) Z_t = \theta(B) \varepsilon_t$
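
Each transfer function listed here is a recipe for combining current and lagged values of the input series. The Python sketch below evaluates only the regression part, the intercept plus the weighted lags of X, and leaves out the noise term Z. It is a minimal illustration under stated assumptions: the function name `transfer_term`, the weights, and the input values are hypothetical, and the lag weights are added with plus signs.

```python
def transfer_term(x, omega, beta0=0.0):
    """Return beta0 + sum_j omega[j] * x[t - j] for each time index t.

    omega[j] is the weight on lag j, so:
      [w0]         -> contemporaneous regression
      [w0, w1]     -> one lag term
      [0, w1, w2]  -> one shifted and one lag term
    Entries where the required lags are not yet available are None.
    """
    max_lag = len(omega) - 1
    out = []
    for t in range(len(x)):
        if t < max_lag:
            out.append(None)  # not enough history for the longest lag
            continue
        out.append(beta0 + sum(w * x[t - j] for j, w in enumerate(omega)))
    return out


# Hypothetical input series and weights (illustration only).
x = [1.0, 2.0, 4.0, 3.0, 5.0]

contemporaneous = transfer_term(x, [0.5], beta0=10.0)
one_lag = transfer_term(x, [0.5, 0.25], beta0=10.0)
shifted_and_lag = transfer_term(x, [0.0, 0.5, 0.25], beta0=10.0)

print(contemporaneous)
print(one_lag)
print(shifted_and_lag)
```

In a fitted model the weights would be estimated jointly with the ARMA noise model for Z. Here they are fixed by hand to show how shifting and lagging change which input values enter each time point.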