Christopher Dougherty EC220 - Introduction to econometrics (chapter 3) Slideshow: prediction Original citation: Dougherty, C.


Slide 1

Christopher Dougherty

EC220 - Introduction to econometrics
(chapter 3)
Slideshow: prediction
Original citation:
Dougherty, C. (2012) EC220 - Introduction to econometrics (chapter 3). [Teaching Resource]
© 2012 The Author
This version available at: http://learningresources.lse.ac.uk/129/
Available in LSE Learning Resources Online: May 2012
This work is licensed under a Creative Commons Attribution-ShareAlike 3.0 License. This license allows
the user to remix, tweak, and build upon the work even for commercial purposes, as long as the user
credits the author and licenses their new creations under the identical terms.
http://creativecommons.org/licenses/by-sa/3.0/

http://learningresources.lse.ac.uk/


Slide 2

PREDICTION
$$P_i = \beta_1 + \sum_{j=2}^{k} \beta_j X_{ji} + u_i$$

$$\hat{P}_i = b_1 + \sum_{j=2}^{k} b_j X_{ji}$$

$$\hat{P}^* = b_1 + \sum_{j=2}^{k} b_j X_j^*, \qquad \{X_2^*, X_3^*, \dots, X_k^*\}$$

In the previous sequence, we saw how to predict the price of a good or asset given the
composition of its characteristics. In this sequence, we discuss the properties of such
predictions.


Slide 3

PREDICTION

Suppose that, given a sample of n observations, we have fitted a pricing model with k – 1
characteristics, as shown.


Slide 4

PREDICTION

Suppose now that one encounters a new variety of the good with characteristics {X2*, X3*, ...,
Xk* }. Given the sample regression result, it is natural to predict that the price of the new
variety should be given by the third equation.


Slide 5

PREDICTION

What can one say about the properties of this prediction? First, it is natural to ask whether
it is fair, in the sense of not systematically overestimating or underestimating the actual
price. Second, we will be concerned about the likely accuracy of the prediction.


Slide 6

PREDICTION

$$P_i = \beta_1 + \beta_2 X_i + u_i$$

$$\hat{P}_i = b_1 + b_2 X_i$$

$$\hat{P}^* = b_1 + b_2 X^*, \qquad \{X^*\}$$

We will start by supposing that the good has only one relevant characteristic and that we
have fitted the simple regression model shown. Hence, given a new variety of the good with
characteristic {X*} , the model gives us the predicted price.
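As a concrete sketch, the point prediction can be computed directly from the OLS formulas. The prices and characteristic values below are hypothetical, chosen only for illustration:

```python
import numpy as np

# Hypothetical sample: prices P and a single characteristic X
X = np.array([30.0, 45.0, 60.0, 80.0, 95.0, 120.0, 150.0, 180.0])
P = np.array([55.0, 70.0, 88.0, 110.0, 125.0, 160.0, 190.0, 230.0])

# OLS estimates for P_i = beta1 + beta2*X_i + u_i
b2 = np.sum((X - X.mean()) * (P - P.mean())) / np.sum((X - X.mean()) ** 2)
b1 = P.mean() - b2 * X.mean()

# Predicted price for a new variety with characteristic X*
X_star = 100.0
P_hat_star = b1 + b2 * X_star
print(b1, b2, P_hat_star)
```

The fitted line passes through the sample means by construction, so the prediction at X* = 100 (inside the sample range) is a simple interpolation along that line.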


Slide 7

PREDICTION

$$PE = P^* - \hat{P}^*$$

We will define the prediction error of the model, PE, as the difference between the actual
price and the predicted price.


Slide 8

PREDICTION

$$P^* = \beta_1 + \beta_2 X^* + u^*$$

We will assume that the model applies to the new good and therefore the actual price is
generated as shown, where u* is the value of the disturbance term for the new good.


Slide 9

PREDICTION

$$PE = P^* - \hat{P}^* = (\beta_1 + \beta_2 X^* + u^*) - (b_1 + b_2 X^*)$$

Then the prediction error is as shown.



Slide 10

PREDICTION
$$E(PE) = E(\beta_1 + \beta_2 X^* + u^*) - E(b_1 + b_2 X^*)$$
$$= \beta_1 + \beta_2 X^* + E(u^*) - E(b_1) - X^* E(b_2)$$
$$= \beta_1 + \beta_2 X^* - \beta_1 - X^* \beta_2 = 0$$

We take expectations.



Slide 11

PREDICTION

1 and 2 are assumed to be fixed parameters, so they are not affected by taking
expectations. Likewise, X* is assumed to be a fixed quantity and unaffected by taking
expectations. However, u*, b1 and b2 are random variables.
10


Slide 12

PREDICTION

E(u*) = 0 because u* is randomly drawn from the distribution of u, which we have assumed has zero population mean. Under the usual OLS assumptions, b1 will be an unbiased estimator of β1 and b2 an unbiased estimator of β2.


Slide 13

PREDICTION

Hence the expectation of the prediction error is zero. The result generalizes easily to the
case where there are multiple characteristics and the new good embodies a new
combination of them.


Slide 14

PREDICTION
$$\sigma_{PE}^2 = \sigma_u^2 \left[ 1 + \frac{1}{n} + \frac{(X^* - \bar{X})^2}{\sum_{i=1}^{n} (X_i - \bar{X})^2} \right]$$

The population variance of the prediction error is given by the expression shown. Unsurprisingly, this implies that the further the value of X* is from the sample mean, the larger the population variance of the prediction error will be.
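A short numerical sketch of this formula confirms that the variance is smallest when X* equals the sample mean and grows with the squared distance from it. The sample of X values and the assumed value σu² = 1 are purely illustrative:

```python
import numpy as np

X = np.arange(1.0, 21.0)   # illustrative sample of n = 20 observations
n = len(X)
sigma_u2 = 1.0             # assumed disturbance variance

def var_pe(x_star):
    # sigma_PE^2 = sigma_u^2 * (1 + 1/n + (X* - Xbar)^2 / sum (X_i - Xbar)^2)
    return sigma_u2 * (1 + 1 / n
                       + (x_star - X.mean()) ** 2 / np.sum((X - X.mean()) ** 2))

print(var_pe(X.mean()))    # 1.05: the minimum, attained at X* = Xbar
print(var_pe(30.0))        # larger, since X* = 30 is far from Xbar = 10.5
```

Note the symmetry: X* values equidistant from the sample mean on either side give the same prediction-error variance.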




Slide 16

PREDICTION




It also implies, again unsurprisingly, that the larger the sample, the smaller the population variance of the prediction error will be, with a lower limit of σu².


Slide 17

PREDICTION




Provided that the regression model assumptions are valid, b1 and b2 will tend to their true values as the sample becomes large, so the only source of error in the prediction will be u*, which by definition has population variance σu².


Slide 18

PREDICTION



$$\mathrm{s.e.}(PE) = \sqrt{ s_u^2 \left[ 1 + \frac{1}{n} + \frac{(X^* - \bar{X})^2}{\sum_{i=1}^{n} (X_i - \bar{X})^2} \right] }$$

The standard error of the prediction error is calculated by taking the square root of the expression for the population variance, replacing the variance of u with the estimate obtained when fitting the model to the sample.


Slide 19

PREDICTION
[Figure: scatter and fitted line P̂ = b1 + b2X, showing the confidence interval for P* at X = X*]

$$\hat{P}^* - t_{crit} \times \mathrm{s.e.} \;\le\; P^* \;\le\; \hat{P}^* + t_{crit} \times \mathrm{s.e.}$$

Hence we are able to construct a confidence interval for a prediction. t_crit is the critical value of t, given the significance level selected and the number of degrees of freedom, and s.e. is the standard error of the prediction error.
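Putting the pieces together, here is a sketch of the full interval computation on simulated data. The sample values, the seed, and the 5% significance level are illustrative assumptions, not values from the slides:

```python
import numpy as np
from scipy.stats import t

# Simulated sample for P = beta1 + beta2*X + u (illustrative values)
rng = np.random.default_rng(0)
n = 25
X = np.linspace(10.0, 100.0, n)
P = 20 + 1.5 * X + rng.normal(0, 5, size=n)

# OLS fit
b2 = np.sum((X - X.mean()) * (P - P.mean())) / np.sum((X - X.mean()) ** 2)
b1 = P.mean() - b2 * X.mean()

# s_u^2 estimated from the residuals (n - 2 degrees of freedom)
resid = P - (b1 + b2 * X)
s_u2 = np.sum(resid ** 2) / (n - 2)

# Standard error of the prediction error at X*, then the 95% interval
X_star = 120.0   # outside the sample range, so the interval is relatively wide
se = np.sqrt(s_u2 * (1 + 1 / n
                     + (X_star - X.mean()) ** 2 / np.sum((X - X.mean()) ** 2)))
t_crit = t.ppf(0.975, df=n - 2)
P_hat = b1 + b2 * X_star
lower, upper = P_hat - t_crit * se, P_hat + t_crit * se
print(f"{lower:.1f} <= P* <= {upper:.1f}")
```

Because X* lies beyond the largest sample X, the (X* − X̄)² term inflates the standard error, exactly as the widening fan in the figure suggests.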


Slide 20

PREDICTION
[Figure: the upper and lower limits of the confidence interval for P*, drawn as functions of X* around the fitted line P̂ = b1 + b2X]

The confidence interval has been drawn as a function of X*. As we noted from the mathematical expression, it becomes wider the greater the distance from X* to the sample mean.


Slide 21

PREDICTION

With multiple explanatory variables, the expression for the prediction variance becomes
complex. One point to note is that multicollinearity may not have an adverse effect on
prediction precision, even though the estimates of the coefficients have large variances.


Slide 22

PREDICTION

$$Y = \beta_1 + \beta_2 X_2 + \beta_3 X_3 + u$$

Suppose X2 and X3 are positively correlated, and that β2 and β3 are both positive. Then it can be shown that cov(b2, b3) < 0. So if b2 is an overestimate, b3 is likely to compensate by being an underestimate, and (b2X2* + b3X3*) may be a relatively good estimate of (β2X2* + β3X3*). Similarly for other combinations.

For simplicity, suppose that there are two explanatory variables, that both have positive true
coefficients, and that they are positively correlated, the model being as shown, and that we
are predicting the value of Y *, given values X2* and X3*.


Slide 23

PREDICTION


Then if the effect of X2 is overestimated, so that b2 > β2, the effect of X3 will almost certainly be underestimated, with b3 < β3. As a consequence, the effects of the errors may to some extent cancel out, with the result that the linear combination may be close to (β2X2* + β3X3*).


Slide 24

PREDICTION

Simulation

Y  10  2 X 2  3 X 3  u
X 2  { 1 , 2 , 3 , 4 , ..., 17 , 18 , 19 , 20 }

X 3  { 2 , 2 , 4 , 4 , ..., 18 , 18 , 20 , 20 }
r X 2 , X 3  0 . 9962
u ~ N  0 ,1 
Y

*

 b1  b 2 X 2  b 3 X 3
*

*

 b1   b 2  b 3  X 2

*

This will be illustrated with a simulation, with the model and data shown. We fit the model and make the prediction Ŷ* = b1 + b2X2* + b3X3*.
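The design can be reproduced in a few lines. This is a minimal sketch of one simulated sample; the seed and the prediction point X2* = X3* = 10 are arbitrary choices, not taken from the slides:

```python
import numpy as np

rng = np.random.default_rng(1)

# The design from the slide
X2 = np.arange(1.0, 21.0)          # 1, 2, 3, ..., 20
X3 = 2.0 * np.ceil(X2 / 2.0)       # 2, 2, 4, 4, ..., 20, 20
u = rng.normal(0.0, 1.0, size=20)
Y = 10 + 2 * X2 + 3 * X3 + u

# OLS fit of Y on a constant, X2, and X3
A = np.column_stack([np.ones(20), X2, X3])
b1, b2, b3 = np.linalg.lstsq(A, Y, rcond=None)[0]

# Prediction for a new observation with X2* = X3* = 10 (true value 60)
Y_hat_star = b1 + b2 * 10 + b3 * 10
print(b2, b3, b2 + b3, Y_hat_star)
```

Even in a single sample, b2 and b3 individually can stray well away from 2 and 3, while their sum stays close to 5 and the prediction stays close to the true value.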


Slide 25

PREDICTION


Since X2 and X3 are virtually identical, this may be approximated as Ŷ* = b1 + (b2 + b3)X2*. Thus the predictive accuracy depends on how close (b2 + b3) is to (β2 + β3), that is, to 5.


Slide 26

PREDICTION

Simulation

[Figure: simulated distributions of b2 and b3 (standard deviation 0.45 each, centered on 2 and 3) and of their sum b2 + b3 (standard deviation 0.04, centered on 5)]

The figure shows the distributions of b2 and b3 for 10 million samples. Their distributions have relatively large variances around the true values, as should be expected given the multicollinearity. The actual standard deviation of each distribution is 0.45.


Slide 27

PREDICTION


The figure also shows the distribution of their sum. As anticipated, it is distributed around
5, but with a much lower standard deviation, 0.04, despite the multicollinearity affecting the
point estimates of the individual coefficients.
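The experiment behind the figure can be replicated on a smaller scale. The sketch below uses 20,000 replications rather than 10 million, which is enough to recover the reported standard deviations to roughly two decimal places:

```python
import numpy as np

rng = np.random.default_rng(42)

X2 = np.arange(1.0, 21.0)
X3 = 2.0 * np.ceil(X2 / 2.0)
A = np.column_stack([np.ones(20), X2, X3])
A_pinv = np.linalg.pinv(A)               # maps a Y vector to its OLS coefficients

reps = 20_000
U = rng.normal(0.0, 1.0, size=(reps, 20))
Y = 10 + 2 * X2 + 3 * X3 + U             # each row is one simulated sample
B = Y @ A_pinv.T                         # row r holds (b1, b2, b3) for sample r

b2, b3 = B[:, 1], B[:, 2]
print(b2.std(), b3.std(), (b2 + b3).std())   # roughly 0.45, 0.45, 0.04
```

Vectorizing the replications through the pseudoinverse keeps the whole experiment to a single matrix product, since the OLS estimator is linear in Y.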


Slide 28

Copyright Christopher Dougherty 2011.
These slideshows may be downloaded by anyone, anywhere for personal use.
Subject to respect for copyright and, where appropriate, attribution, they may be
used as a resource for teaching an econometrics course. There is no need to
refer to the author.
The content of this slideshow comes from Section 3.6 of C. Dougherty,
Introduction to Econometrics, fourth edition 2011, Oxford University Press.
Additional (free) resources for both students and instructors may be
downloaded from the OUP Online Resource Centre
http://www.oup.com/uk/orc/bin/9780199567089/.
Individuals studying econometrics on their own and who feel that they might
benefit from participation in a formal course should consider the London School
of Economics summer school course
EC212 Introduction to Econometrics
http://www2.lse.ac.uk/study/summerSchools/summerSchool/Home.aspx
or the University of London International Programmes distance learning course
20 Elements of Econometrics
www.londoninternational.ac.uk/lse.
