Transcript Chapter 20
Time Series Analysis and Forecasting
Chapter 20

Introduction
• Any variable that is measured over time in sequential order is called a time series.
• We analyze time series to detect patterns.
• The patterns help in forecasting future values of the time series.

Export for Sweden in the years 1993-2004
Source: www.scb.se

t (Year)   Export (million skr)
1993         499 507
1994         589 383
1995         700 874
1996         694 152
1997         789 378
1998         842 471
1999         890 690
2000       1 018 544
2001       1 048 444
2002       1 042 952
2003       1 070 177
2004       1 182 568
2005       ?
2006       ?

[Figure: Swedish Export (million skr) by Year, 1993-2004]

Earlier: $Y_t$ was explained by $X_{1t}, X_{2t}, \ldots, X_{kt}$.
Now: $Y_t$ is explained by earlier values of $Y$.

Components of a Time Series
• A time series can consist of four components.
– Long-term trend (T).
– Cyclical effect (C).
– Seasonal effect (S).
– Random variation (R).

A trend is a long-term, relatively smooth pattern or direction that usually persists for more than one year.

A cycle is a wavelike pattern describing long-term behavior (for more than one year).
[Figure: a cyclical pattern, 6/90 to 6/02]
Cycles are seldom regular, and often appear in combination with other components.

The seasonal component of the time series exhibits short-term (less than one year), calendar-repetitive behavior.
[Figure: a seasonal pattern, 6/97 to 6/99]

Random variation comprises the irregular, unpredictable changes in the time series. It tends to hide the other (more predictable) components. We try to remove random variation and thereby identify the other components.

Smoothing Techniques
• To produce a better forecast we need to determine which components are present in a time series.
• To identify the components present in the time series, we first need to remove the random variation.
• This can be done by smoothing techniques.

Moving Averages
– A k-period moving average for time period t is the arithmetic average of the time series values around period t.
– For example: a 3-period moving average at period t is calculated by $(y_{t-1} + y_t + y_{t+1})/3$.

Dow Jones Index

Month       2004      2005      2006
January     2984.28   3326.83   3728.56
February    3038.65   3427.49
March       2996.78   3353.00
April       2945.94   3270.30
May         2964.20   3378.14
June        3042.96   3348.42
July        3006.14   3484.17
August      3028.03   3468.54
September   3097.13   3538.87
October     3160.25   3460.16
November    3322.93   3653.10
December    3395.82   3638.06

[Figure: Dow Jones Index and its 3-term moving average, MA(DOWJON,3,3), by MONTH]

Yt    Three-term MA       Five-term MA
y1    --                  --
y2    (y1 + y2 + y3)/3    --
y3    (y2 + y3 + y4)/3    (y1 + y2 + y3 + y4 + y5)/5
y4    (y3 + y4 + y5)/3    (y2 + y3 + y4 + y5 + y6)/5
y5    (y4 + y5 + y6)/3    (y3 + y4 + y5 + y6 + y7)/5
…     …                   …
yn    --                  --

Centered Moving Average
With an even number of terms, a moving average falls between two periods. Averaging each pair of adjacent 4-term moving averages centers the result on a period. Each 4-term MA below is listed on the row of the second of its four periods.

Period   Sales ($ million)   4-term MA   4-term centered MA
1        170
2        148                 152.25
3        141                 150.00      151.125
4        150                 147.25      148.625
5        161                 145.00      146.125
6        137                 147.00      146.000
7        132                 146.00      146.500
8        158
9        157

Drawbacks
– The moving average method does not provide smoothed values (moving average values) for the first and last set of periods.
– The moving average method considers only the observations included in the calculation of the average value, and “forgets” the rest.

Exponentially Smoothed Time Series
$S_t = w y_t + (1 - w) S_{t-1}$
$S_t$ = exponentially smoothed time series at time t.
$y_t$ = time series at time t.
$S_{t-1}$ = exponentially smoothed time series at time t-1.
$w$ = smoothing constant, where $0 \le w \le 1$.
• The exponential smoothing method provides smoothed values for all the time periods observed.
• When smoothing the time series at time t, the exponential smoothing method considers all the data available at t ($y_t, y_{t-1}, \ldots$).
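The smoothing techniques above can be sketched in a few lines of Python. This is a minimal illustration rather than the method used to produce the slides: the function names are hypothetical, and the starting value $S_1 = y_1$ for exponential smoothing is an assumption, since the text does not say how the recursion is initialized. The data are the nine sales figures from the centered moving average table.

```python
def moving_average(values, k):
    """k-period moving averages, one per full window of k observations."""
    return [sum(values[i:i + k]) / k for i in range(len(values) - k + 1)]


def centered_moving_average(values, k):
    """Centered MA for even k: average each pair of adjacent k-period MAs."""
    ma = moving_average(values, k)
    return [(a + b) / 2 for a, b in zip(ma, ma[1:])]


def exponential_smoothing(values, w):
    """S_t = w*y_t + (1 - w)*S_{t-1}; S_1 = y_1 is an assumed starting value."""
    if not 0 <= w <= 1:
        raise ValueError("w must satisfy 0 <= w <= 1")
    smoothed = [values[0]]
    for y in values[1:]:
        smoothed.append(w * y + (1 - w) * smoothed[-1])
    return smoothed


sales = [170, 148, 141, 150, 161, 137, 132, 158, 157]  # $ million, periods 1..9

ma3 = moving_average(sales, 3)            # corresponds to periods 2..8
cma4 = centered_moving_average(sales, 4)  # corresponds to periods 3..7
s_02 = exponential_smoothing(sales, 0.2)  # small w: heavy smoothing
s_07 = exponential_smoothing(sales, 0.7)  # large w: light smoothing

print("3-term MA:         ", ma3)
print("4-term centered MA:", cma4)
print("Exp. smoothing .2: ", [round(s, 2) for s in s_02])
print("Exp. smoothing .7: ", [round(s, 2) for s in s_07])
```

Note that the moving averages return fewer values than there are observations, while exponential smoothing returns one value per period, which is the contrast drawn in the drawbacks list.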
[Figure: Exponential smoothing, w = .2. A small w provides a lot of smoothing.]
[Figure: Exponential smoothing, w = .7. A large w provides little smoothing.]

Trend and Seasonal Effects

Trend Analysis
• The trend component of a time series can be linear or non-linear.
• It is easy to isolate the trend component using linear regression.
– For a linear trend use the model $y = b_0 + b_1 t + e$.
– For a non-linear trend with one (major) change in slope use a polynomial model, for example $y = b_0 + b_1 t + b_2 t^2 + e$.

Trend analysis
The purpose is to
• Describe the trend component in order to make forecasts.
• Detrend the time series in order to carry out a seasonal analysis.

Seasonal Analysis
• Seasonal variation may occur within a year or within a shorter period (month, week).
• To measure the seasonal effects we construct seasonal indexes.
• Seasonal indexes express the degree to which the seasons differ from the average time series value across all seasons.

Computing Seasonal Indexes
• Remove the effects of the seasonal and random variations by regression analysis: $\hat{y}_t = b_0 + b_1 t$.
• For each time period compute the ratio $y_t / \hat{y}_t$, which removes most of the trend variation. This is based on the multiplicative model.
• For each season calculate the average of $y_t / \hat{y}_t$, which provides the measure of seasonality.
• If necessary, adjust the averages so that they average to 1 across the seasons, that is, so that their sum equals the number of seasons.
• Example 20.3 (Xm20-03)
– Calculate the quarterly seasonal indexes for hotel occupancy rate in order to measure seasonal variation.
– Data (occupancy rate):

Year   Quarter 1   Quarter 2   Quarter 3   Quarter 4
1996   0.561       0.702       0.800       0.568
1997   0.575       0.738       0.868       0.605
1998   0.594       0.738       0.729       0.600
1999   0.622       0.708       0.806       0.632
2000   0.665       0.835       0.873       0.670

• Perform regression analysis for the model $y = b_0 + b_1 t + e$, where t represents the time and y represents the occupancy rate.
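The regression step for Example 20.3 can be sketched as an ordinary least-squares fit of the rate on the period number t = 1, ..., 20. This is a minimal illustration with a hypothetical helper name (`fit_linear_trend`), using the closed-form least-squares formulas in place of whatever statistical package was used for the slides.

```python
# Occupancy rates from Example 20.3, 1996 Q1 through 2000 Q4 (t = 1..20).
rates = [
    0.561, 0.702, 0.800, 0.568,
    0.575, 0.738, 0.868, 0.605,
    0.594, 0.738, 0.729, 0.600,
    0.622, 0.708, 0.806, 0.632,
    0.665, 0.835, 0.873, 0.670,
]


def fit_linear_trend(y):
    """Least-squares estimates (b0, b1) for y = b0 + b1*t + e, with t = 1..n."""
    n = len(y)
    t = range(1, n + 1)
    t_mean = sum(t) / n
    y_mean = sum(y) / n
    s_ty = sum((ti - t_mean) * (yi - y_mean) for ti, yi in zip(t, y))
    s_tt = sum((ti - t_mean) ** 2 for ti in t)
    b1 = s_ty / s_tt            # slope: change in rate per quarter
    b0 = y_mean - b1 * t_mean   # intercept
    return b0, b1


b0, b1 = fit_linear_trend(rates)
print(f"y_hat = {b0:.6f} + {b1:.6f} t")
```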
Time (t)   Rate
1          0.561
2          0.702
3          0.800
4          0.568
5          0.575
6          0.738
7          0.868
8          0.605
…          …

$\hat{y}_t = .639368 + .005246\,t$

[Figure: Rate versus t, with the fitted regression line]
The regression line represents the trend.

The Ratios $y_t / \hat{y}_t$

t   $y_t$   $\hat{y}_t$   Ratio $y_t / \hat{y}_t$
1   .561    .645          .561/.645 = .870
2   .702    .650          .702/.650 = 1.080
…   …       …             …

For example, $\hat{y}_1 = .639368 + .005246(1) = .645$.

[Figure: Rate/Predicted rate versus t]
No trend is observed, but seasonality and randomness still exist.

The Average Ratios by Seasons

Rate/Predicted rate:

t       Quarter 1   Quarter 2   Quarter 3   Quarter 4
1-4     0.870       1.080       1.221       0.860
5-8     0.864       1.100       1.284       0.888
9-12    0.865       1.067       1.046       0.854
13-16   0.879       0.993       1.122       0.874
17-20   0.913       1.138       1.181       0.900

• To remove most of the random variation but leave the seasonal effects, average the terms $y_t / \hat{y}_t$ for each season.

Average ratio for quarter 1: (.870 + .864 + .865 + .879 + .913)/5 = .878
Average ratio for quarter 2: (1.080 + 1.100 + 1.067 + .993 + 1.138)/5 = 1.076
Average ratio for quarter 3: (1.221 + 1.284 + 1.046 + 1.122 + 1.181)/5 = 1.171
Average ratio for quarter 4: (.860 + .888 + .854 + .874 + .900)/5 = .875

Adjusting the Average Ratios
• In this example the sum of all the averaged ratios must be 4, so that the average ratio per season is equal to 1.
• If the sum of all the ratios is not 4, we need to adjust them proportionately. Suppose the sum of ratios is equal to 4.1. Then each ratio will be multiplied by 4/4.1.
• In our problem the sum of all the averaged ratios is equal to 4: .878 + 1.076 + 1.171 + .875 = 4.0. No normalization is needed. These ratios become the seasonal indexes.

Interpreting the Seasonal Indexes
• The seasonal indexes tell us the ratio between the time series value in a given season and the overall seasonal average.
• In our problem (annual average occupancy = 100%):

Quarter     Index    Interpretation
Quarter 1   87.8%    12.2% below the annual average
Quarter 2   107.6%   7.6% above the annual average
Quarter 3   117.1%   17.1% above the annual average
Quarter 4   87.5%    12.5% below the annual average

The Smoothed Time Series
• The trend component and the seasonality component are recomposed using the multiplicative model. This is used for forecasting.

$\hat{y}_t = \hat{T}_t \times \hat{S}_t = (.639 + .0052\,t)\,\hat{S}_t$

In period #1 (quarter 1): $\hat{y}_1 = \hat{T}_1 \times \hat{S}_1 = (.639 + .0052(1))(.878) = .566$
In period #2 (quarter 2): $\hat{y}_2 = \hat{T}_2 \times \hat{S}_2 = (.639 + .0052(2))(1.076) = .699$

• We can also use indicator variables in order to analyze the seasonal effects.

Quarter   I1   I2   I3
1         1    0    0
2         0    1    0
3         0    0    1
4         0    0    0

Coefficients (Model 1, dependent variable: RATE)

             Unstandardized coefficients   Standardized
Term         B           Std. Error        Beta           t        Sig.
(Constant)   0.555       0.025                            22.350   0.000
I1           3.512E-03   0.024             0.015          0.143    0.888
I2           0.139       0.024             0.609          5.741    0.000
I3           0.205       0.024             0.897          8.510    0.000
T            5.037E-03   0.002             0.293          3.348    0.004

Seasonal analysis
The purpose is to
• Describe the seasonal component in order to make forecasts.
• Deseasonalize the time series (which, for example, makes it easier to compare time series values across seasons).

Deseasonalized Time Series
Seasonally adjusted time series = Actual time series / Seasonal index
By removing the seasonality, we can identify changes in the other components of the time series that might have occurred over time.

In period #1 (quarter 1): $y_1 / SI_1 = .561/.878 = .639$
In period #2 (quarter 2): $y_2 / SI_2 = .702/1.076 = .652$
In period #5 (quarter 1): $y_5 / SI_1 = .575/.878 = .655$
There was a gradual increase in occupancy rate.

[Figure: deseasonalized occupancy rate versus t]
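The seasonal-index, forecasting, and deseasonalizing steps of Example 20.3 can be sketched together in Python. This is a minimal illustration under stated assumptions: the trend coefficients are the rounded values reported for the fitted regression line, the helper names (`trend`, `seasonal_indexes`, `forecast`) are hypothetical, and the normalization step is always applied even though in this example it changes almost nothing.

```python
# Occupancy rates from Example 20.3, 1996 Q1 through 2000 Q4 (t = 1..20).
rates = [
    0.561, 0.702, 0.800, 0.568,
    0.575, 0.738, 0.868, 0.605,
    0.594, 0.738, 0.729, 0.600,
    0.622, 0.708, 0.806, 0.632,
    0.665, 0.835, 0.873, 0.670,
]
B0, B1 = 0.639368, 0.005246  # trend line reported in the example
SEASONS = 4                  # quarterly data


def trend(t):
    """Trend value T_t from the fitted regression line."""
    return B0 + B1 * t


def seasonal_indexes(y, seasons):
    """Average the ratios y_t / trend(t) per season, then rescale to sum to `seasons`."""
    ratios = [yt / trend(t) for t, yt in enumerate(y, start=1)]
    averages = []
    for s in range(seasons):
        season_ratios = ratios[s::seasons]  # every `seasons`-th ratio, starting at season s
        averages.append(sum(season_ratios) / len(season_ratios))
    scale = seasons / sum(averages)
    return [a * scale for a in averages]


indexes = seasonal_indexes(rates, SEASONS)


def forecast(t):
    """Multiplicative recomposition: trend times the seasonal index of period t."""
    return trend(t) * indexes[(t - 1) % SEASONS]


# Seasonally adjusted series: actual value divided by its seasonal index.
deseasonalized = [yt / indexes[(t - 1) % SEASONS] for t, yt in enumerate(rates, start=1)]

print("Seasonal indexes:", [round(i, 3) for i in indexes])
print("Fitted values, t = 1 and 2:", round(forecast(1), 3), round(forecast(2), 3))
print("Forecast, t = 21:", round(forecast(21), 3))
print("Deseasonalized, t = 1, 2, 5:",
      [round(deseasonalized[i], 3) for i in (0, 1, 4)])
```

Calling `forecast` with t beyond 20 extends the trend line and reuses the index of the matching quarter, which is how the recomposed model is used for forecasting.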