STAT 497 APPLIED TIME SERIES ANALYSIS

INTRODUCTION
DEFINITION
• A stochastic process $\{Y_t,\ t \in T\}$ is a collection of random variables, or a process that develops in time according to probabilistic laws.
• The theory of stochastic processes gives us a
formal way to look at time series variables.
DEFINITION
$\{Y(w, t);\ w \in \Omega,\ t \in T\}$ : Stochastic Process, where $w$ belongs to the sample space $\Omega$ and $t$ belongs to the index set $T$.
• For a fixed t, Y(w, t) is a random variable.
• For a given w, Y(w, t) is called a sample function or a realization as a function of t.
DEFINITION
• A time series is a realization or sample function from a certain stochastic process.
• A time series is a set of observations generated sequentially in time. Therefore, the observations are dependent on each other. This means that we do NOT have a random sample.
• We assume that observations are equally spaced
in time.
• We also assume that closer observations might
have stronger dependency.
DEFINITION
• A discrete time series is one in which the set $T_0$ of times at which observations are made is a discrete set. A continuous time series is obtained when observations are recorded continuously over some time interval.
EXAMPLES
• Data in business, economics, engineering,
environment, medicine, earth sciences, and other
areas of scientific investigations are often
collected in the form of time series.
• Hourly temperature readings
• Daily stock prices
• Weekly traffic volume
• Annual growth rate
• Seasonal ice cream consumption
• Electrical signals
EXAMPLES
[Figures: plots of several example series omitted from the transcript.]
EXAMPLES (SUNSPOT NUMBERS)
[Figure: plot of the sunspot-numbers series omitted from the transcript.]
OBJECTIVES OF TIME SERIES ANALYSIS
• Understanding the dynamic or time-dependent structure of the observations of a single series (univariate analysis)
• Forecasting of future observations
• Ascertaining the leading, lagging and feedback relationships among several series (multivariate analysis)
STEPS IN TIME SERIES ANALYSIS
• Model Identification (see the sketch after this list)
– Time series plot of the series
– Check for the existence of a trend or seasonality
– Check for sharp changes in behavior
– Check for possible outliers
• Remove the trend and the seasonal component to get stationary
residuals.
• Estimation
– MME
– MLE
• Diagnostic Checking
– Normality of error terms
– Independency of error terms
– Constant error variance (Homoscedasticity)
• Forecasting
– Exponential smoothing methods
– Minimum MSE forecasting
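
Below is the sketch referenced above: a minimal illustration of the identification-and-detrending steps, assuming only NumPy/Matplotlib and a simulated series standing in for real data.

```python
import numpy as np
import matplotlib.pyplot as plt

# Hypothetical data: a linear trend plus noise, standing in for a real series.
rng = np.random.default_rng(0)
t = np.arange(200)
y = 0.05 * t + rng.normal(0, 1, size=t.size)

# Model identification: plot the series and look for trend/seasonality,
# sharp changes in behavior, and possible outliers.
plt.plot(t, y)
plt.title("Time series plot")
plt.show()

# Remove the trend by first differencing to get (roughly) stationary residuals.
dy = np.diff(y)

# Crude diagnostic: mean and variance should be stable across the two halves.
half = dy.size // 2
print("means:", dy[:half].mean(), dy[half:].mean())
print("variances:", dy[:half].var(), dy[half:].var())
```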
CHARACTERISTICS OF A SERIES
• For a time series $\{Y_t,\ t = 0, \pm 1, \pm 2, \ldots\}$:
THE MEAN FUNCTION:
$$\mu_t = E(Y_t),$$ which exists iff $E|Y_t| < \infty$.
The expected value of the process at time t.
THE VARIANCE FUNCTION:
$$\sigma_t^2 = \gamma_0 = \mathrm{Var}(Y_t) = E(Y_t - \mu_t)^2 = E(Y_t^2) - \mu_t^2, \qquad \gamma_0 \ge 0$$
CHARACTERISTICS OF A SERIES
• THE AUTOCOVARIANCE FUNCTION:
$$\gamma_{t,s} = \mathrm{Cov}(Y_t, Y_s) = E[(Y_t - \mu_t)(Y_s - \mu_s)] = E(Y_t Y_s) - \mu_t \mu_s;\quad t, s = 0, \pm 1, \pm 2, \ldots$$
The covariance between the value at time t and the value at time s of the stochastic process $\{Y_t\}$.
• THE AUTOCORRELATION FUNCTION:
$$\rho_{t,s} = \mathrm{Corr}(Y_t, Y_s) = \frac{\gamma_{t,s}}{\sigma_t \sigma_s},\qquad -1 \le \rho_{t,s} \le 1$$
The correlation of the series with itself.
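
In practice these population quantities are estimated from a single realization. A minimal sketch of the usual sample estimators (the array name y and the simulated data are illustrative; the formulas assume the quantities depend only on the lag k):

```python
import numpy as np

def sample_autocovariance(y, k):
    """gamma_hat(k): average of (y_t - ybar)(y_{t+k} - ybar) over t."""
    n = y.size
    ybar = y.mean()
    return np.sum((y[: n - k] - ybar) * (y[k:] - ybar)) / n

def sample_autocorrelation(y, k):
    """rho_hat(k) = gamma_hat(k) / gamma_hat(0)."""
    return sample_autocovariance(y, k) / sample_autocovariance(y, 0)

rng = np.random.default_rng(1)
y = rng.normal(size=500)             # i.i.d. normal draws
print(sample_autocorrelation(y, 1))  # close to 0 for an i.i.d. series
```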
EXAMPLE
• Moving average process: Let $\varepsilon_t \sim \text{i.i.d.}(0, 1)$, and
$$X_t = \varepsilon_t + 0.5\,\varepsilon_{t-1}$$
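
As a quick check of the definitions above on this process (using $\mathrm{Var}(\varepsilon_t) = 1$ and independence of the $\varepsilon_t$):
$$E(X_t) = 0, \qquad \gamma_0 = \mathrm{Var}(X_t) = 1 + 0.5^2 = 1.25,$$
$$\gamma_1 = \mathrm{Cov}(\varepsilon_t + 0.5\,\varepsilon_{t-1},\ \varepsilon_{t+1} + 0.5\,\varepsilon_t) = 0.5, \qquad \gamma_k = 0 \ \text{for } |k| \ge 2.$$
None of these depend on t, only on the lag — a property that will be named stationarity shortly.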
EXAMPLE
• RANDOM WALK: Let $e_1, e_2, \ldots$ be a sequence of i.i.d. r.v.s with mean 0 and variance $\sigma_e^2$. The observed time series $\{Y_t,\ t = 1, 2, \ldots, n\}$ is obtained as
$$Y_1 = e_1$$
$$Y_2 = e_1 + e_2 \;\Rightarrow\; Y_2 = Y_1 + e_2$$
$$Y_3 = e_1 + e_2 + e_3 \;\Rightarrow\; Y_3 = Y_2 + e_3$$
$$\vdots$$
$$Y_t = e_1 + \cdots + e_t \;\Rightarrow\; Y_t = Y_{t-1} + e_t$$
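
A minimal simulation sketch of this construction (NumPy assumed, with $\sigma_e = 1$ for illustration):

```python
import numpy as np
import matplotlib.pyplot as plt

rng = np.random.default_rng(42)
e = rng.normal(0, 1, size=300)  # i.i.d. shocks, mean 0, variance 1
y = np.cumsum(e)                # Y_t = e_1 + ... + e_t, i.e. Y_t = Y_{t-1} + e_t

plt.plot(y)
plt.title("One realization of a random walk")
plt.show()
```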
A RULE ON THE COVARIANCE
• If $c_1, c_2, \ldots, c_m$ and $d_1, d_2, \ldots, d_n$ are constants and $t_1, t_2, \ldots, t_m$ and $s_1, s_2, \ldots, s_n$ are time points, then
$$\mathrm{Cov}\!\left(\sum_{i=1}^{m} c_i Y_{t_i},\ \sum_{j=1}^{n} d_j Y_{s_j}\right) = \sum_{i=1}^{m} \sum_{j=1}^{n} c_i d_j\, \mathrm{Cov}\big(Y_{t_i}, Y_{s_j}\big)$$
$$\mathrm{Var}\!\left(\sum_{i=1}^{m} c_i Y_{t_i}\right) = \sum_{i=1}^{m} c_i^2\, \mathrm{Var}\big(Y_{t_i}\big) + 2 \sum_{i=2}^{m} \sum_{j=1}^{i-1} c_i c_j\, \mathrm{Cov}\big(Y_{t_i}, Y_{t_j}\big)$$
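
As a sanity check, taking $m = 2$ and $c_1 = c_2 = 1$ in the variance formula recovers the familiar identity
$$\mathrm{Var}(Y_{t_1} + Y_{t_2}) = \mathrm{Var}(Y_{t_1}) + \mathrm{Var}(Y_{t_2}) + 2\,\mathrm{Cov}(Y_{t_1}, Y_{t_2}).$$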
JOINT PDF OF A TIME SERIES
• Remember that
$F_{X_1}(x_1)$ : the marginal cdf
$f_{X_1}(x_1)$ : the marginal pdf
$F_{X_1, X_2, \ldots, X_n}(x_1, x_2, \ldots, x_n)$ : the joint cdf
$f_{X_1, X_2, \ldots, X_n}(x_1, x_2, \ldots, x_n)$ : the joint pdf
JOINT PDF OF A TIME SERIES
• For the observed time series, say we have two time points, t and s.
• The marginal pdfs: $f_{Y_t}(y_t)$ and $f_{Y_s}(y_s)$
• The joint pdf: $f_{Y_t, Y_s}(y_t, y_s) \neq f_{Y_t}(y_t)\, f_{Y_s}(y_s)$, since the observations are dependent.
JOINT PDF OF A TIME SERIES
• Since we have only one observation for each r.v.
Yt, inference is too complicated if distributions
(or moments) change for all t (i.e. change over
time). So, we need a simplification.
[Figure: one realization plotted over t = 1, ..., 12; each point, e.g. the values labeled r.v. Y4 and r.v. Y6, is a single observation of a distinct random variable.]
JOINT PDF OF A TIME SERIES
• To be able to identify the structure of the
series, we need the joint pdf of Y1, Y2,…, Yn.
However, we have only one sample. That is,
one observation from each random variable.
Therefore, it is very difficult to identify the
joint distribution. Hence, we need an
assumption to simplify our problem. This
simplifying assumption is known as
STATIONARITY.
STATIONARITY
• The most vital and common assumption in
time series analysis.
• The basic idea of stationarity is that the
probability laws governing the process do not
change with time.
• The process is in statistical equilibrium.
TYPES OF STATIONARITY
• STRICT (STRONG OR COMPLETE) STATIONARY PROCESS: Consider a finite set of r.v.s $(Y_{t_1}, Y_{t_2}, \ldots, Y_{t_n})$ from a stochastic process $\{Y(w, t);\ t = 0, \pm 1, \pm 2, \ldots\}$.
• The n-dimensional distribution function is defined by
$$F_{Y_{t_1}, Y_{t_2}, \ldots, Y_{t_n}}(y_{t_1}, y_{t_2}, \ldots, y_{t_n}) = P\{w : Y_{t_1} \le y_1, \ldots, Y_{t_n} \le y_n\}$$
where $y_i$, $i = 1, 2, \ldots, n$ are any real numbers.
STRONG STATIONARITY
• A process is said to be first order stationary in distribution if its one-dimensional distribution function is time-invariant, i.e.,
$$F_{Y_{t_1}}(y_1) = F_{Y_{t_1+k}}(y_1) \quad \text{for any } t_1 \text{ and } k.$$
• Second order stationary in distribution if
$$F_{Y_{t_1}, Y_{t_2}}(y_1, y_2) = F_{Y_{t_1+k}, Y_{t_2+k}}(y_1, y_2) \quad \text{for any } t_1, t_2 \text{ and } k.$$
• n-th order stationary in distribution if
$$F_{Y_{t_1}, \ldots, Y_{t_n}}(y_1, \ldots, y_n) = F_{Y_{t_1+k}, \ldots, Y_{t_n+k}}(y_1, \ldots, y_n) \quad \text{for any } t_1, \ldots, t_n \text{ and } k.$$
STRONG STATIONARITY
n-th order stationarity in distribution (for every n) = strong stationarity
$\Rightarrow$ Shifting the time origin by an amount k has no effect on the joint distribution, which must therefore depend only on the time intervals between $t_1, t_2, \ldots, t_n$, not on absolute time t.
STRONG STATIONARITY
• So, for a strong stationary process:
i) $f_{Y_{t_1}, \ldots, Y_{t_n}}(y_1, \ldots, y_n) = f_{Y_{t_1+k}, \ldots, Y_{t_n+k}}(y_1, \ldots, y_n)$
ii) $E(Y_t) = E(Y_{t+k}) \;\Rightarrow\; \mu_t = \mu_{t+k} = \mu,\ \forall t, k$
The expected value of the series is constant over time, not a function of time.
iii) $\mathrm{Var}(Y_t) = \mathrm{Var}(Y_{t+k}) \;\Rightarrow\; \sigma_t^2 = \sigma_{t+k}^2 = \sigma^2,\ \forall t, k$
The variance of the series is constant over time (homoscedastic).
iv) $\mathrm{Cov}(Y_t, Y_s) = \mathrm{Cov}(Y_{t+k}, Y_{s+k}) \;\Rightarrow\; \gamma_{t,s} = \gamma_{t+k, s+k},\ \forall t, k$
$$\Rightarrow\; \gamma(t-s) = \gamma(t+k-s-k) = \gamma_h$$
The autocovariance is not constant in general and does not depend on absolute time; it depends only on the time interval, which we call the "lag", k.
STRONG STATIONARITY
[Figure: a stationary series plotted over t = 1, ..., 12; the observations Y1, Y2, ..., Yn share the same variance $\sigma^2$ at every time point.]
$$\mathrm{Cov}(Y_2, Y_1) = \gamma_{2,1} = \gamma_1,\qquad \mathrm{Cov}(Y_3, Y_2) = \gamma_{3,2} = \gamma_1,\qquad \ldots,\qquad \mathrm{Cov}(Y_n, Y_{n-1}) = \gamma_{n,n-1} = \gamma_1$$
The autocovariance is affected only by the time lag, k:
$$\mathrm{Cov}(Y_3, Y_1) = \gamma_{3,1} = \gamma_2,\qquad \mathrm{Cov}(Y_1, Y_3) = \gamma_{1,3} = \gamma_{-2}$$
STRONG STATIONARITY
v) $\mathrm{Corr}(Y_t, Y_s) = \mathrm{Corr}(Y_{t+k}, Y_{s+k}) \;\Rightarrow\; \rho_{t,s} = \rho_{t+k, s+k},\ \forall t, k$
$$\Rightarrow\; \rho(t-s) = \rho(t+k-s-k) = \rho_h$$
Letting $t \to t-k$ and $s \to t$,
$$\rho_{t-k,\,t} = \rho_{t,\,t+k},\ \forall t, k$$
• It is usually impossible to verify a distribution, particularly a joint distribution function, from an observed time series. So, we use a weaker sense of stationarity.
WEAK STATIONARITY
• WEAK (COVARIANCE) STATIONARITY OR STATIONARITY IN THE WIDE SENSE: A time series is said to be covariance stationary if its first and second order moments are unaffected by a change of time origin.
• That is, we have constant mean and variance, with covariance and correlation being functions of the time difference only.
WEAK STATIONARITY
EYt    , t
VarYt    2  , t
CovYt , Yt  k    k , t
CorrYt , Yt  k    k , t
From, now on, when we say “stationary”, we
imply weak stationarity.
37
EXAMPLE
• Consider a time series $\{Y_t\}$ where
$$Y_t = e_t$$
and $e_t \sim \text{i.i.d.}(0, \sigma^2)$. Is the process stationary?
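
A sketch of the check against the weak-stationarity conditions above:
$$E(Y_t) = 0,\qquad \mathrm{Var}(Y_t) = \sigma^2,\qquad \gamma_k = \mathrm{Cov}(e_t, e_{t+k}) = 0 \ \text{for } k \neq 0,$$
none of which depend on t, so the process is stationary.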
EXAMPLE
• MOVING AVERAGE: Suppose that $\{Y_t\}$ is constructed as
$$Y_t = \frac{e_t + e_{t-1}}{2}$$
and $e_t \sim \text{i.i.d.}(0, \sigma^2)$. Is the process $\{Y_t\}$ stationary?
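
A sketch of the check, using the covariance rule and independence of the $e_t$:
$$E(Y_t) = 0,\qquad \mathrm{Var}(Y_t) = \frac{\sigma^2}{2},\qquad \gamma_1 = \mathrm{Cov}\!\left(\frac{e_t + e_{t-1}}{2},\ \frac{e_{t+1} + e_t}{2}\right) = \frac{\sigma^2}{4},\qquad \gamma_k = 0 \ \text{for } |k| \ge 2.$$
All are free of t, so the process is stationary.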
EXAMPLE
• RANDOM WALK
$$Y_t = e_1 + e_2 + \cdots + e_t$$
where $e_t \sim \text{i.i.d.}(0, \sigma^2)$. Is the process $\{Y_t\}$ stationary?
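
A sketch of the check:
$$E(Y_t) = 0,\qquad \mathrm{Var}(Y_t) = t\,\sigma^2,\qquad \mathrm{Cov}(Y_t, Y_s) = \min(t, s)\,\sigma^2.$$
The variance grows with t, so the random walk is not stationary.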
EXAMPLE
• Suppose that the time series has the form
$$Y_t = a + bt + e_t$$
where a and b are constants and $\{e_t\}$ is a weakly stationary process with mean 0 and autocovariance function $\gamma_k$. Is $\{Y_t\}$ stationary?
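
A sketch of the check:
$$E(Y_t) = a + bt,$$
which depends on t unless b = 0, so $\{Y_t\}$ is not stationary even though $\{e_t\}$ is.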
EXAMPLE
$$Y_t = (-1)^t e_t$$
where $e_t \sim \text{i.i.d.}(0, \sigma^2)$. Is the process $\{Y_t\}$ stationary?
STRONG VERSUS WEAK STATIONARITY
• Strict stationarity means that the joint distribution only depends on the 'difference' h, not the times $(t_1, \ldots, t_k)$ themselves.
• Finite variance is not assumed in the definition of strong stationarity; therefore, strict stationarity does not necessarily imply weak stationarity. For example, an i.i.d. Cauchy process is strictly stationary but not weakly stationary.
• A nonlinear function of a strictly stationary process is still strictly stationary, but this is not true for weakly stationary processes. For example, the square of a covariance stationary process may not have finite variance.
• Weak stationarity usually does not imply strict stationarity, as higher moments of the process may depend on time t.
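
A minimal simulation sketch of the i.i.d. Cauchy example (NumPy assumed): the joint laws are shift-invariant, but the sample variance never settles down because the population variance does not exist.

```python
import numpy as np

rng = np.random.default_rng(7)
# i.i.d. Cauchy draws: strictly stationary as a process, but with no
# finite variance, hence not covariance (weakly) stationary.
for n in (10**3, 10**4, 10**5, 10**6):
    x = rng.standard_cauchy(n)
    print(n, x.var())  # keeps jumping around instead of converging
```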
STRONG VERSUS WEAK STATIONARITY
• If the process $\{X_t\}$ is a Gaussian time series, which means that the distribution functions of $\{X_t\}$ are all multivariate normal, weak stationarity also implies strict stationarity. This is because a multivariate normal distribution is fully characterized by its first two moments.
STRONG VERSUS WEAK STATIONARITY
• For example, a white noise process is weakly stationary but may not be strictly stationary, while a Gaussian white noise is strictly stationary. Also, general white noise only implies that the terms are uncorrelated, while Gaussian white noise also implies independence, because for a Gaussian process, uncorrelatedness implies independence. Therefore, a Gaussian white noise is just i.i.d. $N(0, \sigma^2)$.
STATIONARITY AND NONSTATIONARITY
• Stationary and nonstationary processes are very different in their properties, and they require different inference procedures. We will discuss this in detail throughout this course.