Chapter 1: Deriving Linear Regression Coefficients

Transcript

Christopher Dougherty
EC220 - Introduction to econometrics
(chapter 1)
Slideshow: deriving linear regression coefficients
Original citation:
Dougherty, C. (2012) EC220 - Introduction to econometrics (chapter 1). [Teaching Resource]
© 2012 The Author
This version available at: http://learningresources.lse.ac.uk/127/
Available in LSE Learning Resources Online: May 2012
This work is licensed under a Creative Commons Attribution-ShareAlike 3.0 License. This license allows
the user to remix, tweak, and build upon the work even for commercial purposes, as long as the user
credits the author and licenses their new creations under the identical terms.
http://creativecommons.org/licenses/by-sa/3.0/
http://learningresources.lse.ac.uk/
DERIVING LINEAR REGRESSION COEFFICIENTS
True model: $Y = \beta_1 + \beta_2 X + u$

[Figure: scatter diagram of three observations, with $Y_1$, $Y_2$, $Y_3$ plotted against $X$; X axis from 0 to 3, Y axis from 0 to 6]
This sequence shows how the regression coefficients for a simple regression model are derived using the least squares criterion (OLS, for ordinary least squares).

We will start with a numerical example with just three observations: (1,3), (2,5), and (3,6).
Fitted line: $\hat{Y} = b_1 + b_2 X$

[Figure: the scatter diagram with a fitted line added; the fitted values are $\hat{Y}_1 = b_1 + b_2$, $\hat{Y}_2 = b_1 + 2b_2$, $\hat{Y}_3 = b_1 + 3b_2$; the line has intercept $b_1$ and slope $b_2$]
Writing the fitted regression as $\hat{Y} = b_1 + b_2 X$, we will determine the values of b1 and b2 that minimize RSS, the sum of the squares of the residuals.

$e_1 = Y_1 - \hat{Y}_1 = 3 - b_1 - b_2$
$e_2 = Y_2 - \hat{Y}_2 = 5 - b_1 - 2b_2$
$e_3 = Y_3 - \hat{Y}_3 = 6 - b_1 - 3b_2$

[Figure: the residuals $e_1$, $e_2$, $e_3$ marked on the scatter diagram]
Given our choice of b1 and b2, the residuals are as shown.

SIMPLE REGRESSION ANALYSIS
$RSS = e_1^2 + e_2^2 + e_3^2 = (3 - b_1 - b_2)^2 + (5 - b_1 - 2b_2)^2 + (6 - b_1 - 3b_2)^2$
$\quad = (9 + b_1^2 + b_2^2 - 6b_1 - 6b_2 + 2b_1 b_2)$
$\quad\; + (25 + b_1^2 + 4b_2^2 - 10b_1 - 20b_2 + 4b_1 b_2)$
$\quad\; + (36 + b_1^2 + 9b_2^2 - 12b_1 - 36b_2 + 6b_1 b_2)$
$\quad = 70 + 3b_1^2 + 14b_2^2 - 28b_1 - 62b_2 + 12b_1 b_2$

The sum of the squares of the residuals is thus as shown above.
The quadratics have been expanded.
Like terms have been added together.
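
As a quick check on this algebra (my addition, not part of the original slideshow), the expansion can be reproduced symbolically in Python with sympy:

```python
import sympy as sp

b1, b2 = sp.symbols('b1 b2')

# RSS for the three observations (1,3), (2,5), (3,6)
RSS = (3 - b1 - b2)**2 + (5 - b1 - 2*b2)**2 + (6 - b1 - 3*b2)**2

# Should agree with 70 + 3*b1**2 + 14*b2**2 - 28*b1 - 62*b2 + 12*b1*b2
print(sp.expand(RSS))
```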

$\partial RSS / \partial b_1 = 6b_1 + 12b_2 - 28 = 0$
$\partial RSS / \partial b_2 = 12b_1 + 28b_2 - 62 = 0$

For a minimum, the partial derivatives of RSS with respect to b1 and b2 should be zero. (We
should also check a second-order condition.)
The first-order conditions give us two equations in two unknowns.

$\Rightarrow \; b_1 = 1.67, \quad b_2 = 1.50$

Solving them, we find that RSS is minimized when b1 and b2 are equal to 1.67 and 1.50,
respectively.
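
Because the two first-order conditions are linear in b1 and b2, the solution is easy to verify numerically. A minimal sketch with NumPy (my addition, not part of the original slideshow):

```python
import numpy as np

# First-order conditions:   6*b1 + 12*b2 = 28
#                          12*b1 + 28*b2 = 62
A = np.array([[6.0, 12.0],
              [12.0, 28.0]])
c = np.array([28.0, 62.0])

b1, b2 = np.linalg.solve(A, c)
print(b1, b2)  # 1.6667 and 1.5, matching the slide
```
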
DERIVING LINEAR REGRESSION COEFFICIENTS
True model: $Y = \beta_1 + \beta_2 X + u$
Fitted line: $\hat{Y} = b_1 + b_2 X$

[Figure: the scatter diagram and fitted line again]
Here is the scatter diagram again.

Fitted line: $\hat{Y} = 1.67 + 1.50X$

$\hat{Y}_1 = 3.17, \quad \hat{Y}_2 = 4.67, \quad \hat{Y}_3 = 6.17$

[Figure: the fitted line $\hat{Y} = 1.67 + 1.50X$ drawn through the scatter diagram; intercept 1.67, slope 1.50]
The fitted line and the fitted values of Y are as shown.
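
These fitted values and residuals can be reproduced in a couple of lines; a small sketch (my addition):

```python
import numpy as np

X = np.array([1.0, 2.0, 3.0])
Y = np.array([3.0, 5.0, 6.0])

b1, b2 = 5 / 3, 1.5            # the least squares coefficients
Y_hat = b1 + b2 * X            # fitted values: 3.17, 4.67, 6.17
e = Y - Y_hat                  # residuals
print(Y_hat, e, np.sum(e**2))  # RSS is about 0.17
```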

[Figure: scatter diagram for the general case, with $n$ observations $(X_1, Y_1), \dots, (X_n, Y_n)$]
Now we will do the same thing for the general case with n observations.

Fitted line: $\hat{Y} = b_1 + b_2 X$

[Figure: the general scatter diagram with a fitted line; $\hat{Y}_1 = b_1 + b_2 X_1$, $\hat{Y}_n = b_1 + b_2 X_n$; intercept $b_1$, slope $b_2$]
Given our choice of b1 and b2, we will obtain a fitted line as shown.

$e_1 = Y_1 - \hat{Y}_1 = Y_1 - b_1 - b_2 X_1$
$\quad \dots$
$e_n = Y_n - \hat{Y}_n = Y_n - b_1 - b_2 X_n$

[Figure: the residual $e_1$ marked on the diagram]
The residual for the first observation is defined.

[Figure: the residual $e_n$ also marked on the diagram]
Similarly we define the residuals for the remaining observations. That for the last one is
marked.

For comparison, the numerical example gave
$RSS = (3 - b_1 - b_2)^2 + (5 - b_1 - 2b_2)^2 + (6 - b_1 - 3b_2)^2 = 70 + 3b_1^2 + 14b_2^2 - 28b_1 - 62b_2 + 12b_1 b_2$

In the general case,
$RSS = e_1^2 + \dots + e_n^2 = (Y_1 - b_1 - b_2 X_1)^2 + \dots + (Y_n - b_1 - b_2 X_n)^2$
$\quad = Y_1^2 + b_1^2 + b_2^2 X_1^2 - 2b_1 Y_1 - 2b_2 X_1 Y_1 + 2b_1 b_2 X_1$
$\quad\; + \dots$
$\quad\; + Y_n^2 + b_1^2 + b_2^2 X_n^2 - 2b_1 Y_n - 2b_2 X_n Y_n + 2b_1 b_2 X_n$
$\quad = \sum Y_i^2 + n b_1^2 + b_2^2 \sum X_i^2 - 2b_1 \sum Y_i - 2b_2 \sum X_i Y_i + 2b_1 b_2 \sum X_i$

RSS, the sum of the squares of the residuals, is defined for the general case. The data for the numerical example are shown for comparison.
The quadratics are expanded.
Like terms are added together.
Note that in this equation the observations on X and Y are just data that determine the
coefficients in the expression for RSS.
The choice variables in the expression are b1 and b2. This may seem a bit strange because
in elementary calculus courses b1 and b2 are usually constants and X and Y are variables.
However, if you have any doubts, compare what we are doing in the general case with what
we did in the numerical example.

$\partial RSS / \partial b_1 = 2n b_1 - 2\sum Y_i + 2b_2 \sum X_i = 0$

The first derivative with respect to b1.

$n b_1 = \sum Y_i - b_2 \sum X_i$
$b_1 = \bar{Y} - b_2 \bar{X}$

With some simple manipulation we obtain a tidy expression for b1.

$\partial RSS / \partial b_2 = 2b_2 \sum X_i^2 - 2\sum X_i Y_i + 2b_1 \sum X_i = 0$

The first derivative with respect to b2.
SIMPLE REGRESSION ANALYSIS
$b_2 \sum X_i^2 - \sum X_i Y_i + b_1 \sum X_i = 0$

Divide through by 2.

$b_2 \sum X_i^2 - \sum X_i Y_i + (\bar{Y} - b_2 \bar{X}) \sum X_i = 0$

We now substitute for b1, using the expression obtained for it, and thus obtain an equation that contains b2 only.

$b_2 \sum X_i^2 - \sum X_i Y_i + (\bar{Y} - b_2 \bar{X})\, n\bar{X} = 0$

$\bar{X} = \frac{\sum X_i}{n} \;\Rightarrow\; \sum X_i = n\bar{X}$

The definition of the sample mean has been used.

$b_2 \sum X_i^2 - \sum X_i Y_i + n\bar{X}\bar{Y} - n b_2 \bar{X}^2 = 0$

The last two terms have been disentangled.

$b_2 \left( \sum X_i^2 - n\bar{X}^2 \right) = \sum X_i Y_i - n\bar{X}\bar{Y}$

Terms not involving b2 have been transferred to the right side.

$b_2 = \frac{\sum X_i Y_i - n\bar{X}\bar{Y}}{\sum X_i^2 - n\bar{X}^2}$

Hence we obtain an expression for b2.

$b_2 = \frac{\sum X_i Y_i - n\bar{X}\bar{Y}}{\sum X_i^2 - n\bar{X}^2} = \frac{\sum (X_i - \bar{X})(Y_i - \bar{Y})}{\sum (X_i - \bar{X})^2}$

In practice, we shall use an alternative expression. We will demonstrate that it is equivalent.

$\sum (X_i - \bar{X})(Y_i - \bar{Y}) = \sum X_i Y_i - \sum X_i \bar{Y} - \sum \bar{X} Y_i + \sum \bar{X}\bar{Y}$
$\quad = \sum X_i Y_i - \bar{Y} \sum X_i - \bar{X} \sum Y_i + n\bar{X}\bar{Y}$
$\quad = \sum X_i Y_i - \bar{Y}(n\bar{X}) - \bar{X}(n\bar{Y}) + n\bar{X}\bar{Y}$
$\quad = \sum X_i Y_i - n\bar{X}\bar{Y}$

Expanding the numerator, we obtain the terms shown.
In the second term the mean value of Y is a common factor. In the third, the mean value of
X is a common factor. The last term is the same for all i.
We use the definitions of the sample means to simplify the expression.
Hence we have shown that the numerators of the two expressions are the same.

$\sum (X_i - \bar{X})^2 = \sum X_i^2 - n\bar{X}^2$

The denominator is mathematically a special case of the numerator, replacing Y by X. Hence the two expressions are equivalent.
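
The equivalence is also easy to confirm numerically; a sketch with simulated data (the data-generating values are arbitrary choices of mine, not from the lecture):

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=20)
Y = 2.0 + 1.5 * X + rng.normal(size=20)
n = len(X)

# Form 1: sums of products of the raw data
b2_raw = (X @ Y - n * X.mean() * Y.mean()) / (X @ X - n * X.mean() ** 2)

# Form 2: sums of products of deviations from the sample means
x, y = X - X.mean(), Y - Y.mean()
b2_dev = (x @ y) / (x @ x)

print(b2_raw, b2_dev)  # equal up to floating-point error
```
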
DERIVING LINEAR REGRESSION COEFFICIENTS
True model: $Y = \beta_1 + \beta_2 X + u$
Fitted line: $\hat{Y} = b_1 + b_2 X$

[Figure: the general scatter diagram with fitted line, as before]
The scatter diagram is shown again. We will summarize what we have done. We
hypothesized that the true model is as shown, we obtained some data, and we fitted a line.

$b_1 = \bar{Y} - b_2 \bar{X}, \qquad b_2 = \frac{\sum (X_i - \bar{X})(Y_i - \bar{Y})}{\sum (X_i - \bar{X})^2}$

We chose the parameters of the fitted line so as to minimize the sum of the squares of the
residuals. As a result, we derived the expressions for b1 and b2.
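
Putting the two expressions together, the whole derivation condenses to a few lines of code. A minimal sketch (the function name and data are my own illustrative choices):

```python
import numpy as np

def ols_simple(X, Y):
    """Least squares coefficients for Y = b1 + b2*X, using the
    expressions derived above: b2 from deviations, then b1."""
    x, y = X - X.mean(), Y - Y.mean()
    b2 = (x @ y) / (x @ x)
    b1 = Y.mean() - b2 * X.mean()
    return b1, b2

# The three-observation example from earlier in the sequence
print(ols_simple(np.array([1.0, 2.0, 3.0]),
                 np.array([3.0, 5.0, 6.0])))  # (1.67, 1.50) after rounding
```
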

True model: $Y = \beta_2 X + u$
Fitted line: $\hat{Y} = b_2 X$

Typically, an intercept should be included in the regression specification. Occasionally,
however, one may have reason to fit the regression without an intercept. In the case of a
simple regression model, the true and fitted models become as shown.

$e_i = Y_i - \hat{Y}_i = Y_i - b_2 X_i$

We will derive the expression for b2 from first principles using the least squares criterion.
The residual in observation i is ei = Yi – b2Xi.

$RSS = \sum_{i=1}^{n} (Y_i - b_2 X_i)^2 = \sum_{i=1}^{n} Y_i^2 - 2b_2 \sum_{i=1}^{n} X_i Y_i + b_2^2 \sum_{i=1}^{n} X_i^2$

With this, we obtain the expression for the sum of the squares of the residuals.

$\frac{dRSS}{db_2} = 2b_2 \sum_{i=1}^{n} X_i^2 - 2\sum_{i=1}^{n} X_i Y_i = 0$

Differentiating with respect to b2, we obtain the first-order condition for a minimum.

$b_2 = \frac{\sum_{i=1}^{n} X_i Y_i}{\sum_{i=1}^{n} X_i^2}$

Hence, we obtain the OLS estimator of b2 for this model.
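
In code, this estimator is a one-liner; a minimal sketch (illustrative data of my own):

```python
import numpy as np

X = np.array([1.0, 2.0, 3.0])
Y = np.array([3.0, 5.0, 6.0])

b2 = (X @ Y) / (X @ X)  # sum(Xi*Yi) / sum(Xi**2)
print(b2)               # about 2.21 for these data
```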

$\frac{d^2 RSS}{db_2^2} = 2\sum_{i=1}^{n} X_i^2 > 0$

The second derivative is positive, confirming that we have found a minimum.

Copyright Christopher Dougherty 2011.
These slideshows may be downloaded by anyone, anywhere for personal use.
Subject to respect for copyright and, where appropriate, attribution, they may be
used as a resource for teaching an econometrics course. There is no need to
refer to the author.
The content of this slideshow comes from Section 1.3 of C. Dougherty,
Introduction to Econometrics, fourth edition 2011, Oxford University Press.
Additional (free) resources for both students and instructors may be
downloaded from the OUP Online Resource Centre
http://www.oup.com/uk/orc/bin/9780199567089/.
Individuals studying econometrics on their own and who feel that they might
benefit from participation in a formal course should consider the London School
of Economics summer school course
EC212 Introduction to Econometrics
http://www2.lse.ac.uk/study/summerSchools/summerSchool/Home.aspx
or the University of London International Programmes distance learning course
20 Elements of Econometrics
www.londoninternational.ac.uk/lse.