Least Squares Algebra

Professor William Greene
Department of Economics
Econometrics I
Vocabulary



Some terms to be used in the discussion.

Population characteristics and entities vs.
sample quantities and analogs

Residuals and disturbances

Population regression line and sample
regression
Objective: Learn about the conditional mean
function. Estimate $\beta$ and $\sigma^2$
First step: Mechanics of fitting a line
(hyperplane) to a set of data
Fitting Criteria




The set of points in the sample
Fitting criteria - what are they:


Least squares

and so on
Why least squares?
A fundamental result:
Sample moments are “good” estimators of
their population counterparts
We will spend the next few weeks using this
principle and applying it to least squares
computation.
An Analogy Principle for Estimating $\beta$
In the population:
$E[y \mid X] = X\beta$, so
$E[y - X\beta \mid X] = 0$.
Continuing: $E[x_i \varepsilon_i] = 0$.
Summing: $\sum_i E[x_i \varepsilon_i] = \sum_i 0 = 0$.
Exchange $\sum_i$ and $E[\cdot]$: $E[\sum_i x_i \varepsilon_i] = E[X'\varepsilon] = 0$, that is,
$E[X'(y - X\beta)] = 0$.
Choose b, the estimator of $\beta$, to mimic this population
result: i.e., mimic the population mean with the sample
mean.
Find b such that
$$\frac{1}{N} X'e = 0 = \frac{1}{N} X'(y - Xb)$$
As we will see, the solution is the least squares coefficient
vector.
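The sample moment condition can be checked numerically. The following is a minimal sketch using simulated data; the sample size, the coefficient values in `beta`, and the variable names are assumptions made for illustration, not part of the lecture.

```python
import numpy as np

rng = np.random.default_rng(0)
N = 200
# Constant plus two simulated regressors (hypothetical data).
X = np.column_stack([np.ones(N), rng.normal(size=(N, 2))])
beta = np.array([1.0, 2.0, -0.5])       # hypothetical population coefficients
y = X @ beta + rng.normal(size=N)

# Solve the K sample moment equations (1/N) X'(y - Xb) = 0 for b.
b = np.linalg.solve(X.T @ X / N, X.T @ y / N)

e = y - X @ b
sample_moments = X.T @ e / N            # zero up to rounding error
print(b)
print(sample_moments)
```

The factor 1/N cancels when solving for b, so the same b results from solving X'Xb = X'y directly.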
Population and Sample Moments
We showed that $E[\varepsilon_i \mid x_i] = 0$ and $\mathrm{Cov}[x_i, \varepsilon_i] = 0$.
If so, and if $E[y \mid X] = X\beta$, then
$\beta = (\mathrm{Var}[x_i])^{-1}\,\mathrm{Cov}[x_i, y_i]$.
This will provide a population analog to the
statistics we compute with the data.
U.S. Gasoline Market, 1960-1995
Least Squares

Example will be $G_i$ regressed on
$x_i = [1, PG_i, Y_i]$

Fitting criterion: Fitted equation will be
$\hat{y}_i = b_1 x_{i1} + b_2 x_{i2} + \cdots + b_K x_{iK}$.

Criterion is based on residuals:
$e_i = y_i - (b_1 x_{i1} + b_2 x_{i2} + \cdots + b_K x_{iK})$
Make ei as small as possible.
Form a criterion and minimize it.
Fitting Criteria

Sum of residuals: $\sum_{i=1}^N e_i$

Sum of squares: $\sum_{i=1}^N e_i^2$

Sum of absolute values of residuals: $\sum_{i=1}^N |e_i|$

Absolute value of sum of residuals: $\left|\sum_{i=1}^N e_i\right|$

We focus on $\sum_{i=1}^N e_i^2$ now and $\sum_{i=1}^N |e_i|$ later.
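To see why the choice of criterion matters, the sketch below evaluates all four criteria for two candidate lines. The five data points and the helper `criteria` are made up for illustration. A flat line through the mean of y makes the residuals cancel, so the sum of residuals is essentially zero even though the fit is poor; the sum of squares does not have this problem.

```python
import numpy as np

x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.1, 3.9, 6.2, 7.8, 10.1])    # made-up data, roughly y = 2x

def criteria(intercept, slope):
    """Evaluate the four fitting criteria for the line intercept + slope * x."""
    e = y - (intercept + slope * x)
    return {
        "sum": e.sum(),
        "sum_sq": (e ** 2).sum(),
        "sum_abs": np.abs(e).sum(),
        "abs_sum": abs(e.sum()),
    }

flat = criteria(y.mean(), 0.0)   # flat line through the mean of y
close = criteria(0.0, 2.0)       # line that tracks the data closely

print(flat)
print(close)
```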
$$\sum_{i=1}^N e_i^2 = e'e = (y - Xb)'(y - Xb)$$
A digression on multivariate calculus.
Matrix and vector derivatives.
Derivative of a scalar with respect to a vector
Derivative of a column vector wrt a row vector
Other derivatives
Least Squares Normal Equations
$$\frac{\partial \sum_{i=1}^N e_i^2}{\partial b} = \frac{\partial \sum_{i=1}^N (y_i - x_i'b)^2}{\partial b} = \frac{\partial (y - Xb)'(y - Xb)}{\partial b} = -2X'(y - Xb) = 0$$
Dimensions: $\partial(1 \times 1)/\partial(K \times 1)$; $(-2)(N \times K)'(N \times 1) = (-2)(K \times N)(N \times 1) = K \times 1$
Note: Derivative of 1x1 wrt Kx1 is a Kx1 vector.
Solution: $-2X'(y - Xb) = 0 \Rightarrow X'y = X'Xb$
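The derivative formula can be sanity-checked against a finite-difference approximation. This is a sketch on simulated data; the step size `h`, the trial vector `b`, and the helper `sum_sq` are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(1)
N, K = 50, 3
X = rng.normal(size=(N, K))
y = rng.normal(size=N)
b = rng.normal(size=K)            # an arbitrary trial coefficient vector

def sum_sq(c):
    """Sum of squared residuals e'e at coefficient vector c."""
    e = y - X @ c
    return e @ e

# Analytic gradient from the slide: -2 X'(y - Xb), a K x 1 vector.
analytic = -2 * X.T @ (y - X @ b)

# Central finite differences, one coordinate of b at a time.
h = 1e-6
numeric = np.array([
    (sum_sq(b + h * np.eye(K)[k]) - sum_sq(b - h * np.eye(K)[k])) / (2 * h)
    for k in range(K)
])

print(analytic)
print(numeric)
```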
Least Squares Solution
Assuming it exists: $b = (X'X)^{-1}X'y$
Note the analogy:
$$\beta = [\mathrm{Var}(x)]^{-1}\,[\mathrm{Cov}(x, y)]$$
$$b = \left[\frac{1}{N}X'X\right]^{-1}\left[\frac{1}{N}X'y\right]$$
Suggests something desirable about least squares
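A short sketch of the solution and of the analogy, on simulated data (the data-generating values are hypothetical). It solves the normal equations, compares the result with NumPy's `np.linalg.lstsq`, and, because this X contains a constant, also checks that the slope coefficients equal the inverse sample variance matrix of the regressors times their sample covariances with y.

```python
import numpy as np

rng = np.random.default_rng(2)
N = 100
Z = rng.normal(size=(N, 2))                      # two non-constant regressors
X = np.column_stack([np.ones(N), Z])
y = 1.0 + Z @ np.array([0.5, -1.5]) + rng.normal(size=N)

# Normal equations X'Xb = X'y; solve() avoids forming the inverse explicitly.
b = np.linalg.solve(X.T @ X, X.T @ y)

# Cross-check with a library least squares routine.
b_lstsq, *_ = np.linalg.lstsq(X, y, rcond=None)

# Sample analog of Var(x)^{-1} Cov(x, y) for the slopes.
S = np.cov(np.column_stack([Z, y]), rowvar=False)
slopes = np.linalg.solve(S[:2, :2], S[:2, 2])

print(b, b_lstsq, slopes)
```

Solving the linear system rather than inverting X'X is a numerical convenience chosen for this sketch, not something the slides prescribe.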
Second Order Conditions
Necessary Condition: First derivatives = 0
$$\frac{\partial (y - Xb)'(y - Xb)}{\partial b} = -2X'(y - Xb)$$
Sufficient Condition: Second derivatives ...
$$\frac{\partial^2 (y - Xb)'(y - Xb)}{\partial b\,\partial b'} = \frac{\partial \left[\dfrac{\partial (y - Xb)'(y - Xb)}{\partial b}\right]}{\partial b'} = \frac{\partial\,(K \times 1 \text{ column vector})}{\partial\,(1 \times K \text{ row vector})} = 2X'X$$
Does b Minimize e'e?
$$\frac{\partial^2 e'e}{\partial b\,\partial b'} = 2X'X = 2\begin{bmatrix}
\sum_{i=1}^N x_{i1}^2 & \sum_{i=1}^N x_{i1}x_{i2} & \cdots & \sum_{i=1}^N x_{i1}x_{iK} \\
\sum_{i=1}^N x_{i2}x_{i1} & \sum_{i=1}^N x_{i2}^2 & \cdots & \sum_{i=1}^N x_{i2}x_{iK} \\
\vdots & \vdots & \ddots & \vdots \\
\sum_{i=1}^N x_{iK}x_{i1} & \sum_{i=1}^N x_{iK}x_{i2} & \cdots & \sum_{i=1}^N x_{iK}^2
\end{bmatrix}$$
If there were a single b, we would require this to be
positive, which it would be: $2x'x = 2\sum_{i=1}^N x_i^2 > 0$.
The matrix counterpart of a positive number is a
positive definite matrix.
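As a numerical illustration, the sketch below forms the second derivative matrix 2X'X for simulated data, inspects its characteristic roots (eigenvalues, a topic the course takes up later), and confirms that moving away from b in a few random directions increases e'e. The data and the perturbation scale are assumptions for illustration.

```python
import numpy as np

rng = np.random.default_rng(3)
N, K = 60, 3
X = rng.normal(size=(N, K))
y = rng.normal(size=N)

b = np.linalg.solve(X.T @ X, X.T @ y)

H = 2 * X.T @ X                         # second derivative matrix
eigenvalues = np.linalg.eigvalsh(H)     # H is symmetric, so eigvalsh applies

def sum_sq(c):
    """Sum of squared residuals e'e at coefficient vector c."""
    e = y - X @ c
    return e @ e

# e'e at a few randomly perturbed coefficient vectors.
worse = [sum_sq(b + rng.normal(scale=0.1, size=K)) for _ in range(5)]

print(eigenvalues)
print(sum_sq(b), worse)
```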
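Sample Moments - Algebra
$$\begin{aligned}
X'X &= \begin{bmatrix}
\sum_{i=1}^N x_{i1}^2 & \sum_{i=1}^N x_{i1}x_{i2} & \cdots & \sum_{i=1}^N x_{i1}x_{iK} \\
\sum_{i=1}^N x_{i2}x_{i1} & \sum_{i=1}^N x_{i2}^2 & \cdots & \sum_{i=1}^N x_{i2}x_{iK} \\
\vdots & \vdots & \ddots & \vdots \\
\sum_{i=1}^N x_{iK}x_{i1} & \sum_{i=1}^N x_{iK}x_{i2} & \cdots & \sum_{i=1}^N x_{iK}^2
\end{bmatrix} \\
&= \sum_{i=1}^N \begin{bmatrix} x_{i1} \\ x_{i2} \\ \vdots \\ x_{iK} \end{bmatrix}
\begin{bmatrix} x_{i1} & x_{i2} & \cdots & x_{iK} \end{bmatrix} \\
&= \sum_{i=1}^N \begin{bmatrix}
x_{i1}^2 & x_{i1}x_{i2} & \cdots & x_{i1}x_{iK} \\
x_{i2}x_{i1} & x_{i2}^2 & \cdots & x_{i2}x_{iK} \\
\vdots & \vdots & \ddots & \vdots \\
x_{iK}x_{i1} & x_{iK}x_{i2} & \cdots & x_{iK}^2
\end{bmatrix} \\
&= \sum_{i=1}^N x_i x_i'
\end{aligned}$$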
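The identity between X'X and the sum of outer products of the rows of X can be verified directly. A minimal sketch with a small simulated X (the dimensions are arbitrary):

```python
import numpy as np

rng = np.random.default_rng(4)
N, K = 20, 3
X = rng.normal(size=(N, K))

# Accumulate the K x K outer product x_i x_i' over the N observations.
outer_sum = np.zeros((K, K))
for x_i in X:                        # x_i is row i of X, treated as a K-vector
    outer_sum += np.outer(x_i, x_i)

print(np.allclose(outer_sum, X.T @ X))
```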
Positive Definite Matrix
Matrix C is positive definite if $a'Ca > 0$
for any nonzero a.
Generally hard to check. Requires a look at
characteristic roots (later in the course).
For some matrices, it is easy to verify. X'X is
one of these.
$$a'X'Xa = (a'X')(Xa) = (Xa)'(Xa) = v'v = \sum_{i=1}^N v_i^2 \ge 0$$
Could v = 0? v = 0 means Xa = 0. Is this possible?
Not for a nonzero a if the columns of X are linearly independent (X has full column rank).
Conclusion: $b = (X'X)^{-1}X'y$ does indeed minimize e'e.
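The quadratic form argument, and the one way it can fail, can be illustrated numerically. In this sketch the matrix `X_bad` is a deliberately collinear construction (its third column is the sum of the first two), introduced only for illustration.

```python
import numpy as np

rng = np.random.default_rng(5)
N = 30
X = rng.normal(size=(N, 3))
a = rng.normal(size=3)

v = X @ a
quad = a @ X.T @ X @ a          # a'X'Xa, equal to v'v

# Collinear case: the third column is the sum of the first two,
# so a_bad = (1, 1, -1)' gives X_bad a_bad = 0 with a_bad nonzero.
X_bad = np.column_stack([X[:, 0], X[:, 1], X[:, 0] + X[:, 1]])
a_bad = np.array([1.0, 1.0, -1.0])
v_bad = X_bad @ a_bad
rank_bad = np.linalg.matrix_rank(X_bad)

print(quad, v @ v)
print(rank_bad)
```

With `X_bad`, X'X is singular, so the "assuming it exists" qualifier on the least squares solution fails.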
Algebraic Results - 1
In the population: $E[X'\varepsilon] = 0$
In the sample: $\frac{1}{N}\sum_{i=1}^N x_i e_i = 0$
Residuals vs. Disturbances
Disturbances (population): $y_i = x_i'\beta + \varepsilon_i$
Partitioning y: $y = E[y \mid X] + \varepsilon$
= conditional mean + disturbance
Residuals (sample): $y_i = x_i'b + e_i$
Partitioning y: $y = Xb + e$
= projection + residual
(Note: Projection into the column space of X, i.e., the
set of linear combinations of the columns of X. Xb is one of these.)
Algebraic Results - 2
A "residual maker": $M = I - X(X'X)^{-1}X'$
$e = y - Xb = y - X(X'X)^{-1}X'y = My$
My = The residuals that result when y is regressed on X
MX = 0
(This result is fundamental!)
How do we interpret this result in terms of residuals?
When a column of X is regressed on all of X, we get a
perfect fit and zero residuals.
(Therefore) My = MXb + Me = Me = e
(You should be able to prove this.)
y = Py + My, $P = X(X'X)^{-1}X' = I - M$.
PM = MP = 0.
Py is the projection of y into the column space of X.
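These results are easy to confirm on a small example. The sketch below builds P and M explicitly for simulated data (the dimensions are kept small on purpose, since M is N x N) and checks MX = 0, My = e, PM = 0, and y = Py + My.

```python
import numpy as np

rng = np.random.default_rng(6)
N, K = 12, 3
X = np.column_stack([np.ones(N), rng.normal(size=(N, K - 1))])
y = rng.normal(size=N)

P = X @ np.linalg.solve(X.T @ X, X.T)   # projection matrix X (X'X)^{-1} X'
M = np.eye(N) - P                       # residual maker

b = np.linalg.solve(X.T @ X, X.T @ y)
e = y - X @ b

print(np.abs(M @ X).max())              # MX = 0 up to rounding
print(np.abs(M @ y - e).max())          # My = e up to rounding
print(np.abs(P @ M).max())              # PM = 0 up to rounding
```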
The M Matrix
$M = I - X(X'X)^{-1}X'$ is an N x N matrix
- M is symmetric: M = M'
- M is idempotent: MM = M
(just multiply it out)
- M is singular; $M^{-1}$ does not exist.
(We will prove this later as a side result
in another derivation.)
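The three properties can be checked numerically before they are proved. A sketch on simulated data follows; the rank tolerance `1e-10` is an arbitrary choice for this example. For an X with full column rank, the rank of M comes out as N - K, which is less than N, so M cannot be inverted.

```python
import numpy as np

rng = np.random.default_rng(7)
N, K = 10, 3
X = rng.normal(size=(N, K))

M = np.eye(N) - X @ np.linalg.solve(X.T @ X, X.T)

symmetric = np.allclose(M, M.T)
idempotent = np.allclose(M @ M, M)
# Explicit tolerance so rounding noise is not counted toward the rank.
rank_M = np.linalg.matrix_rank(M, tol=1e-10)

print(symmetric, idempotent, rank_M)
```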

Results when X Contains a Constant Term
$X = [1, x_2, \ldots, x_K]$
The first column of X is a column of ones.
Since $X'e = 0$, $x_1'e = 0$: the residuals sum to zero.
$y = Xb + e$
Define $i = [1, 1, \ldots, 1]'$ = a column of N ones.
$i'y = \sum_{i=1}^N y_i = N\bar{y}$
$i'y = i'Xb + i'e = i'Xb$
implies (after dividing by N)
$\bar{y} = \bar{x}'b$ (the regression line passes through the means).
These do not apply if the model has no constant term.
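The role of the constant term shows up clearly in a side-by-side fit. In this sketch the data are simulated with a nonzero intercept and regressors whose means are not zero (both assumptions chosen so the contrast is visible); `fit` is a hypothetical helper.

```python
import numpy as np

rng = np.random.default_rng(8)
N = 40
Z = rng.normal(loc=2.0, size=(N, 2))    # regressors with nonzero means
y = 3.0 + Z @ np.array([1.0, -2.0]) + rng.normal(size=N)

def fit(X):
    """Return least squares coefficients and residuals for regressor matrix X."""
    b = np.linalg.solve(X.T @ X, X.T @ y)
    return b, y - X @ b

X_const = np.column_stack([np.ones(N), Z])
b_c, e_c = fit(X_const)                 # with a constant term
b_n, e_n = fit(Z)                       # same regressors, no constant term

sum_with = e_c.sum()                    # zero up to rounding
sum_without = e_n.sum()                 # generally not zero
mean_gap_with = y.mean() - X_const.mean(axis=0) @ b_c   # ybar - xbar'b

print(sum_with, sum_without, mean_gap_with)
```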
Least Squares
Residuals
Least Squares Residuals
Least Squares Algebra-3
[Diagram: e, I, X, X'X, and M]
M is N x N, potentially huge
Least Squares Algebra-4
MX = 0
