NOTES ON MULTIPLE REGRESSION USING MATRICES Tony E. Smith ESE 502: Spatial Data Analysis  Multiple Regression  Matrix Formulation of Regression  Applications to Regression.

NOTES ON MULTIPLE REGRESSION USING MATRICES
Tony E. Smith
ESE 502: Spatial Data Analysis
• Multiple Regression
• Matrix Formulation of Regression
• Applications to Regression Analysis
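SIMPLE LINEAR MODEL
• Data: $(y_i, x_i),\ i = 1,\dots,n$
• Parameters: $(\beta_0, \beta_1),\ \sigma^2$
• Model:
$$Y_i = \beta_0 + \beta_1 x_i + \varepsilon_i, \quad i = 1,\dots,n$$
$$\varepsilon_i \ \text{iid} \sim N(0, \sigma^2), \quad i = 1,\dots,n$$
$$\Rightarrow\ E(Y_i \mid x_i) = \beta_0 + \beta_1 x_i, \quad i = 1,\dots,n$$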
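SIMPLE REGRESSION ESTIMATION
• Estimate Conditional Mean: $E(Y \mid x) = \beta_0 + \beta_1 x$
• Data Points: $(y_i, x_i)$
• Predicted Value: $\hat{y}_i = \hat\beta_0 + \hat\beta_1 x_i$
[Figure: Line of Best Fit, showing a data point $y_i$ and its predicted value $\hat{y}_i$ at $x_i$]
where:
$$(\hat\beta_0, \hat\beta_1) = \arg\min_{(\beta_0, \beta_1)} \sum_{i=1}^{n} [\, y_i - (\beta_0 + \beta_1 x_i) \,]^2$$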
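As a concrete illustration of the minimization above, the following sketch computes the line of best fit for a small made-up data set. The closed-form slope and intercept used here are the standard ones for simple regression (they are not written out on the slide), and the function names `fit_line` and `sum_of_squares` as well as the data are hypothetical.

```python
def fit_line(x, y):
    """Return (b0, b1) minimizing sum_i [y_i - (b0 + b1*x_i)]^2."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    # Slope: sum of cross-deviations divided by sum of squared x-deviations.
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    b1 = sxy / sxx
    # Intercept: the fitted line passes through the point of sample means.
    b0 = y_bar - b1 * x_bar
    return b0, b1


def sum_of_squares(b0, b1, x, y):
    """Sum of squared deviations of y_i from the line b0 + b1*x_i."""
    return sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y))


# Hypothetical data, for illustration only.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1]

b0, b1 = fit_line(x, y)
y_hat = [b0 + b1 * xi for xi in x]  # predicted values
print(b0, b1, sum_of_squares(b0, b1, x, y))
```

Moving either coefficient away from the fitted values can only increase the sum of squares, which is what "best fit" means here.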
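STANDARD LINEAR MODEL
• Data: $(y_i, x_{i1}, \dots, x_{ik}),\ i = 1,\dots,n$
• Parameters: $(\beta_0, \beta_1, \dots, \beta_k),\ \sigma^2$
• Model:
$$Y_i = \beta_0 + \sum_{j=1}^{k} \beta_j x_{ij} + \varepsilon_i, \quad i = 1,\dots,n$$
$$\varepsilon_i \ \text{iid} \sim N(0, \sigma^2), \quad i = 1,\dots,n$$
$$\Rightarrow\ E(Y_i \mid x_{i1}, \dots, x_{ik}) = \beta_0 + \sum_{j=1}^{k} \beta_j x_{ij}$$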
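STANDARD LINEAR MODEL (k = 2)
• Data: $(y_i, x_{i1}, x_{i2}),\ i = 1,\dots,n$
• Parameters: $(\beta_0, \beta_1, \beta_2),\ \sigma^2$
• Model:
$$Y_i = \beta_0 + \beta_1 x_{i1} + \beta_2 x_{i2} + \varepsilon_i, \quad i = 1,\dots,n$$
$$\varepsilon_i \ \text{iid} \sim N(0, \sigma^2), \quad i = 1,\dots,n$$
$$\Rightarrow\ E(Y_i \mid x_{i1}, x_{i2}) = \beta_0 + \beta_1 x_{i1} + \beta_2 x_{i2}$$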
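REGRESSION ESTIMATION (for k = 2)
• Data Points: $(y_i, x_{i1}, x_{i2})$
• Predicted Value: $\hat{y}_i = \hat\beta_0 + \hat\beta_1 x_{i1} + \hat\beta_2 x_{i2}$
[Figure: Plane of Best Fit over the $(x_1, x_2)$ plane, showing a data point $y_i$ and its predicted value $\hat{y}_i$ above $(x_{i1}, x_{i2})$]
where:
$$(\hat\beta_0, \hat\beta_1, \hat\beta_2) = \arg\min_{(\beta_0, \beta_1, \beta_2)} \sum_{i=1}^{n} [\, y_i - (\beta_0 + \beta_1 x_{i1} + \beta_2 x_{i2}) \,]^2$$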
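MATRIX REPRESENTATION OF THE STANDARD LINEAR MODEL
• Vectors and Matrices (shown for k = 2):
$$Y = \begin{pmatrix} Y_1 \\ Y_2 \\ \vdots \\ Y_n \end{pmatrix}, \quad
X = \begin{pmatrix} 1 & x_{11} & x_{12} \\ 1 & x_{21} & x_{22} \\ \vdots & \vdots & \vdots \\ 1 & x_{n1} & x_{n2} \end{pmatrix}, \quad
\beta = \begin{pmatrix} \beta_0 \\ \beta_1 \\ \beta_2 \end{pmatrix}, \quad
\varepsilon = \begin{pmatrix} \varepsilon_1 \\ \varepsilon_2 \\ \vdots \\ \varepsilon_n \end{pmatrix}$$
• Matrix Reformulation of the Model:
$$Y = X\beta + \varepsilon, \quad \varepsilon \sim N(0, \sigma^2 I_n)$$
where:
$$0 = \begin{pmatrix} 0 \\ \vdots \\ 0 \end{pmatrix} \quad \text{and} \quad
I_n = \begin{pmatrix} 1 & 0 & \cdots & 0 \\ 0 & 1 & & \vdots \\ \vdots & & \ddots & 0 \\ 0 & \cdots & 0 & 1 \end{pmatrix}$$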
LINEAR TRANSFORMATIONS IN ONE DIMENSION
• Linear Function: $f(x) = a \cdot x$
$$\Rightarrow\ f(1) = a \cdot 1 = a$$
$$\Rightarrow\ f(x) = f(1) \cdot x$$
• Graphic Depiction:
[Figure: number line marking $0$, $1$, $a$, $x$, and $a \cdot x$]
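LINEAR TRANSFORMATIONS IN TWO DIMENSIONS
• Linear Transformation:
$$f(x) = f\begin{pmatrix} x_1 \\ x_2 \end{pmatrix} = \begin{pmatrix} a_{11} x_1 + a_{12} x_2 \\ a_{21} x_1 + a_{22} x_2 \end{pmatrix}$$
$$\Rightarrow\ f\begin{pmatrix} 1 \\ 0 \end{pmatrix} = \begin{pmatrix} a_{11} \\ a_{21} \end{pmatrix}, \quad f\begin{pmatrix} 0 \\ 1 \end{pmatrix} = \begin{pmatrix} a_{12} \\ a_{22} \end{pmatrix}$$
$$\Rightarrow\ f\begin{pmatrix} x_1 \\ x_2 \end{pmatrix} = f\begin{pmatrix} 1 \\ 0 \end{pmatrix} x_1 + f\begin{pmatrix} 0 \\ 1 \end{pmatrix} x_2$$
• Graphical Depiction of Linear Transformation:
[Figure: the vector $x = (x_1, x_2)'$ with the unit vectors $(1,0)'$ and $(0,1)'$, their images $f(1,0)'$ and $f(0,1)'$, the scaled images $f(1,0)'\,x_1$ and $f(0,1)'\,x_2$, and their sum $f(x)$]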
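SOME MATRIX CONVENTIONS
• Transposes of Vectors and Matrices:
$$a = a_{(k \times 1)} = \begin{pmatrix} a_1 \\ \vdots \\ a_k \end{pmatrix} \ \Rightarrow\ a' = a'_{(1 \times k)} = (a_1, \dots, a_k)$$
$$A = A_{(k \times n)} = \begin{pmatrix} a_{11} & \cdots & a_{1n} \\ \vdots & & \vdots \\ a_{k1} & \cdots & a_{kn} \end{pmatrix} \ \Rightarrow\ A' = A'_{(n \times k)} = \begin{pmatrix} a_{11} & \cdots & a_{k1} \\ \vdots & & \vdots \\ a_{1n} & \cdots & a_{kn} \end{pmatrix}$$
• Symmetric (Square) Matrices: $A' = A$
• Important Example: $(A'A)' = A'A$, so $A'A$ is symmetric
• Row Representation of Matrices:
$$A = \begin{pmatrix} a_{11} & \cdots & a_{1n} \\ \vdots & & \vdots \\ a_{k1} & \cdots & a_{kn} \end{pmatrix} = \begin{pmatrix} (a_{11}, \dots, a_{1n}) \\ \vdots \\ (a_{k1}, \dots, a_{kn}) \end{pmatrix} = \begin{pmatrix} a_1' \\ \vdots \\ a_k' \end{pmatrix}$$
• Column Representation of Matrices:
$$A = \begin{pmatrix} a_{11} & \cdots & a_{1n} \\ \vdots & & \vdots \\ a_{k1} & \cdots & a_{kn} \end{pmatrix} = \left[ \begin{pmatrix} a_{11} \\ \vdots \\ a_{k1} \end{pmatrix}, \dots, \begin{pmatrix} a_{1n} \\ \vdots \\ a_{kn} \end{pmatrix} \right] = (a^1, \dots, a^n)$$
where $a^j = (a_{1j}, \dots, a_{kj})'$ denotes the $j$-th column of $A$.
• Inner Product of Vectors:
$$a = \begin{pmatrix} a_1 \\ \vdots \\ a_n \end{pmatrix}, \ x = \begin{pmatrix} x_1 \\ \vdots \\ x_n \end{pmatrix} \ \Rightarrow\ a'x = \sum_{i=1}^{n} a_i x_i = x'a$$
• Matrix Multiplication:
$$A = A_{k \times n} = \begin{pmatrix} a_1' \\ \vdots \\ a_k' \end{pmatrix}, \ x = \begin{pmatrix} x_1 \\ \vdots \\ x_n \end{pmatrix} \ \Rightarrow\ Ax = \begin{pmatrix} a_1'x \\ \vdots \\ a_k'x \end{pmatrix}$$
$$B = B_{n \times m} = (b_1, \dots, b_m) \ \Rightarrow\ AB = \begin{pmatrix} a_1'b_1 & \cdots & a_1'b_m \\ \vdots & & \vdots \\ a_k'b_1 & \cdots & a_k'b_m \end{pmatrix}$$
• Transposes: $(AB)' = B'A'$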
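MATRIX REPRESENTATIONS OF LINEAR TRANSFORMATIONS
• For any Two-Dimensional Linear Transformation:
$$f(x) = f\begin{pmatrix} x_1 \\ x_2 \end{pmatrix} = f\begin{pmatrix} 1 \\ 0 \end{pmatrix} x_1 + f\begin{pmatrix} 0 \\ 1 \end{pmatrix} x_2$$
with:
$$f\begin{pmatrix} 1 \\ 0 \end{pmatrix} = \begin{pmatrix} a_{11} \\ a_{21} \end{pmatrix}, \quad f\begin{pmatrix} 0 \\ 1 \end{pmatrix} = \begin{pmatrix} a_{12} \\ a_{22} \end{pmatrix}$$
$$\Rightarrow\ f(x) = \begin{pmatrix} a_{11} x_1 + a_{12} x_2 \\ a_{21} x_1 + a_{22} x_2 \end{pmatrix} = \begin{pmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{pmatrix} \begin{pmatrix} x_1 \\ x_2 \end{pmatrix} = Ax$$
• Graphical Depiction of Matrix Representation:
[Figure: the vector $x = (x_1, x_2)'$ and the unit vectors, with the columns $(a_{11}, a_{21})'$ and $(a_{12}, a_{22})'$ of $A$, their scalings by $x_1$ and $x_2$, and the sum $Ax$]
• Inversion of Square Matrices (as Linear Transformations):
$$A\, I_n = \begin{pmatrix} a_{11} & \cdots & a_{1n} \\ \vdots & & \vdots \\ a_{n1} & \cdots & a_{nn} \end{pmatrix} \left[ \begin{pmatrix} 1 \\ \vdots \\ 0 \end{pmatrix}, \dots, \begin{pmatrix} 0 \\ \vdots \\ 1 \end{pmatrix} \right] = \left[ \begin{pmatrix} a_{11} \\ \vdots \\ a_{n1} \end{pmatrix}, \dots, \begin{pmatrix} a_{1n} \\ \vdots \\ a_{nn} \end{pmatrix} \right]$$
$$A^{-1} \left[ \begin{pmatrix} a_{11} \\ \vdots \\ a_{n1} \end{pmatrix}, \dots, \begin{pmatrix} a_{1n} \\ \vdots \\ a_{nn} \end{pmatrix} \right] = \left[ \begin{pmatrix} 1 \\ \vdots \\ 0 \end{pmatrix}, \dots, \begin{pmatrix} 0 \\ \vdots \\ 1 \end{pmatrix} \right]$$
$$\Rightarrow\ A^{-1} A = I_n = A A^{-1}$$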
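The matrix representation and its inverse can be checked numerically. The sketch below assumes NumPy is available and uses a hypothetical 2×2 matrix `A` and vector `x`. It confirms that the images of the unit vectors are the columns of `A`, that their weighted sum is `Ax`, and that the inverse maps the columns back to the unit vectors.

```python
import numpy as np

# Hypothetical 2x2 matrix and vector, for illustration only.
A = np.array([[2.0, 1.0],
              [1.0, 3.0]])
e1 = np.array([1.0, 0.0])
e2 = np.array([0.0, 1.0])
x = np.array([4.0, -2.0])

# Images of the unit vectors: these are the columns of A.
col1, col2 = A @ e1, A @ e2

# f(x) = f(e1)*x1 + f(e2)*x2, which should agree with the product Ax.
fx = col1 * x[0] + col2 * x[1]

# The inverse transformation sends each column of A back to a unit vector.
A_inv = np.linalg.inv(A)

print(fx, A @ x)
print(A_inv @ col1, A_inv @ col2)
```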
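DETERMINANTS OF SQUARE MATRICES
$$A = \begin{pmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{pmatrix} \ \Rightarrow\ \det(A) = a_{11} a_{22} - a_{12} a_{21}$$
$$\Rightarrow\ |\det(A)| = \text{Area of the image of the unit square under } A$$
[Figure: parallelogram spanned by the columns $(a_{11}, a_{21})'$ and $(a_{12}, a_{22})'$ of $A$]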
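NONSINGULAR SQUARE MATRICES
$$A^{-1} \text{ exists} \ \Leftrightarrow\ \begin{pmatrix} a_{11} \\ a_{21} \end{pmatrix} \text{ and } \begin{pmatrix} a_{12} \\ a_{22} \end{pmatrix} \text{ are not collinear} \ \Leftrightarrow\ \det(A) \neq 0$$
$$\Leftrightarrow\ A \text{ is nonsingular}$$
[Figure: the non-collinear column vectors $(a_{11}, a_{21})'$ and $(a_{12}, a_{22})'$ of $A$]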
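LEAST-SQUARES ESTIMATION
• General Regression Matrices:
$$\beta = \begin{pmatrix} \beta_0 \\ \vdots \\ \beta_k \end{pmatrix}, \quad
y = \begin{pmatrix} y_1 \\ \vdots \\ y_n \end{pmatrix}, \quad
x_i = \begin{pmatrix} 1 \\ x_{i1} \\ \vdots \\ x_{ik} \end{pmatrix}$$
$$X = \begin{pmatrix} x_1' \\ x_2' \\ \vdots \\ x_n' \end{pmatrix} = \begin{pmatrix} 1 & x_{11} & \cdots & x_{1k} \\ 1 & x_{21} & \cdots & x_{2k} \\ \vdots & \vdots & & \vdots \\ 1 & x_{n1} & \cdots & x_{nk} \end{pmatrix}, \quad X' = (x_1, x_2, \dots, x_n)$$
• General Sum-of-Squares:
$$S(\beta) = \sum_{i=1}^{n} [\, y_i - (\beta_0 + \beta_1 x_{i1} + \cdots + \beta_k x_{ik}) \,]^2 = \sum_{i=1}^{n} (y_i - x_i'\beta)^2$$
$$= \sum_{i=1}^{n} y_i^2 - 2 \sum_{i=1}^{n} y_i\, x_i'\beta + \sum_{i=1}^{n} (x_i'\beta)^2$$
$$\Rightarrow\ S(\beta) = y'y - 2\, y'X\beta + \beta' X'X \beta$$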
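The equality between the term-by-term sum and the matrix expression for the sum of squares can be confirmed numerically. The sketch below assumes NumPy; the simulated data, the seed, and the trial coefficient vector `beta` are hypothetical and serve only to exercise the identity.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical data: n observations, k regressors plus a column of ones.
n, k = 6, 2
X = np.column_stack([np.ones(n), rng.normal(size=(n, k))])
y = rng.normal(size=n)
beta = np.array([0.5, -1.0, 2.0])  # an arbitrary trial value, not an estimate

# Observation-by-observation form: sum_i (y_i - x_i' beta)^2
S_sum = sum((y[i] - X[i] @ beta) ** 2 for i in range(n))

# Matrix form: y'y - 2 y'X beta + beta' X'X beta
S_mat = y @ y - 2 * (y @ X) @ beta + beta @ (X.T @ X) @ beta

print(S_sum, S_mat)
```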
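DIFFERENTIATION OF FUNCTIONS
• General Derivative:
$$\frac{d}{dx} f(x) = \lim_{\Delta \to 0} \frac{f(x + \Delta) - f(x)}{\Delta}$$
• Example: $f(x) = x^2$
$$\frac{d}{dx} f(x) = \lim_{\Delta \to 0} \frac{(x + \Delta)^2 - x^2}{\Delta}
= \lim_{\Delta \to 0} \frac{(x^2 + 2x\Delta + \Delta^2) - x^2}{\Delta}
= \lim_{\Delta \to 0} (2x + \Delta) = 2x$$
[Figure: graph of $f(x) = x^2$ showing $f(x_o)$, $f(x_o + \Delta)$, and the tangent slope $\frac{d}{dx} f(x_o) = 2x_o$ at $x_o$]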
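PARTIAL DERIVATIVES
$$z = f(x_1, x_2)$$
$$\frac{\partial}{\partial x_1} f(x_1^o, x_2^o) = \lim_{\Delta \to 0} \frac{f(x_1^o + \Delta,\, x_2^o) - f(x_1^o, x_2^o)}{\Delta}$$
[Figure: surface $z = f(x_1, x_2)$ with the point $(x_1^o, x_2^o)$ and the slice at fixed $x_2^o$ along which the slope in the $x_1$ direction is taken]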
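VECTOR DERIVATIVES
• Derivative Notation for $f(x) = f(x_1, \dots, x_n)$:
$$\nabla_i f(x) = \frac{\partial}{\partial x_i} f(x_1, \dots, x_n), \quad i = 1,\dots,n$$
• Gradient Vector:
$$\nabla_x f(x) = \begin{pmatrix} \nabla_1 f(x) \\ \vdots \\ \nabla_n f(x) \end{pmatrix} = \begin{pmatrix} \frac{\partial}{\partial x_1} f(x) \\ \vdots \\ \frac{\partial}{\partial x_n} f(x) \end{pmatrix}$$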
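TWO IMPORTANT EXAMPLES
• Linear Functions:
$$f(x) = a'x = \sum_{i=1}^{n} a_i x_i \ \Rightarrow\ \nabla_i f(x) = a_i, \quad i = 1,\dots,n \ \Rightarrow\ \nabla_x f(x) = a$$
• Quadratic Functions:
$$f(x) = x'Ax = (x_1, \dots, x_n) \begin{pmatrix} a_1'x \\ \vdots \\ a_n'x \end{pmatrix} = \sum_{i=1}^{n} x_i (a_i'x) = \sum_{i=1}^{n} \sum_{j=1}^{n} x_i a_{ij} x_j$$
• Quadratic Derivatives:
$$f(x) = x'Ax = \sum_{k=1}^{n} \sum_{h=1}^{n} x_k a_{kh} x_h$$
$$\Rightarrow\ \nabla_i f(x) = \sum_{h=1}^{n} a_{ih} x_h + \sum_{k=1}^{n} a_{ki} x_k = a_i'x + (a^i)'x$$
where $a_i'$ is the $i$-th row of $A$ and $a^i$ is the $i$-th column of $A$ (that is, $(a^i)'$ is the $i$-th row of $A'$).
$$\Rightarrow\ \nabla_x f(x) = \begin{pmatrix} a_1'x \\ \vdots \\ a_n'x \end{pmatrix} + \begin{pmatrix} (a^1)'x \\ \vdots \\ (a^n)'x \end{pmatrix} = Ax + A'x$$
• Symmetric Case:
$$(A = A') \ \Rightarrow\ \nabla_x (x'Ax) = 2Ax$$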
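Both gradient formulas can be sanity-checked against finite differences. The sketch below assumes NumPy; the helper `numerical_gradient`, the matrix `A`, and the vectors `a` and `x0` are hypothetical. The matrix is deliberately not symmetric, so the general result $Ax + A'x$ is exercised before the symmetric special case.

```python
import numpy as np


def numerical_gradient(f, x, h=1e-6):
    """Central-difference approximation to the gradient of f at x."""
    grad = np.zeros_like(x)
    for i in range(x.size):
        step = np.zeros_like(x)
        step[i] = h
        grad[i] = (f(x + step) - f(x - step)) / (2 * h)
    return grad


# Hypothetical inputs; A is deliberately not symmetric.
A = np.array([[1.0, 2.0, 0.0],
              [4.0, 3.0, -1.0],
              [0.5, 1.0, 2.0]])
a = np.array([1.0, -2.0, 0.5])
x0 = np.array([0.3, -1.2, 2.0])

# Linear function a'x: the gradient should be a.
grad_linear = numerical_gradient(lambda x: a @ x, x0)

# Quadratic function x'Ax: the gradient should be Ax + A'x.
grad_quad = numerical_gradient(lambda x: x @ A @ x, x0)

# Symmetric case: with S = S', the gradient of x'Sx should be 2Sx.
S = (A + A.T) / 2
grad_sym = numerical_gradient(lambda x: x @ S @ x, x0)

print(grad_linear, grad_quad, grad_sym)
```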
MINIMIZATION OF FUNCTIONS
• First-Order Condition:
$$\frac{d}{dx} f(x^*) = 0$$
• Example: $f(x) = a - 2bx + x^2$
$$\frac{d}{dx} f(x) = -2b + 2x$$
$$0 = \frac{d}{dx} f(x^*) = -2b + 2x^* \ \Rightarrow\ x^* = b$$
[Figure: graph of $f(x)$ with its minimum at $x^*$]
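TWO-DIMENSIONAL MINIMIZATION
$$z = g(x_1, x_2)$$
[Figure: surface $z = g(x_1, x_2)$ with its minimum at $x^o = (x_1^o, x_2^o)$]
$$\frac{\partial}{\partial x_1} g(x_1^o, x_2^o) = 0, \quad \frac{\partial}{\partial x_2} g(x_1^o, x_2^o) = 0$$
$$\Rightarrow\ \nabla_x g(x^o) = 0$$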
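LEAST SQUARES ESTIMATION
• Solution for $\hat\beta = (\hat\beta_0, \hat\beta_1, \dots, \hat\beta_k)'$:
$$\min_{\beta}\ S(\beta) = y'y - 2(y'X)\beta + \beta' X'X \beta$$
$$\Rightarrow\ 0 = \nabla_\beta S(\hat\beta) = -2X'y + 2X'X\hat\beta$$
$$\Rightarrow\ X'X\hat\beta = X'y$$
$$\Rightarrow\ \hat\beta = (X'X)^{-1} X'y \quad \text{if} \quad \det(X'X) \neq 0$$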
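The normal equations translate directly into a few lines of NumPy. The sketch below is a minimal illustration on simulated data; the seed, the sample size, and the coefficient vector `beta_true` are all hypothetical. It solves $X'X\hat\beta = X'y$ and then checks that the gradient of $S$ vanishes at the solution.

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical simulated data with k = 2 regressors and an intercept column.
n, k = 50, 2
X = np.column_stack([np.ones(n), rng.normal(size=(n, k))])
beta_true = np.array([1.0, 2.0, -0.5])
y = X @ beta_true + rng.normal(scale=0.3, size=n)

XtX = X.T @ X
Xty = X.T @ y

# Solve the normal equations X'X b = X'y (requires det(X'X) != 0).
beta_hat = np.linalg.solve(XtX, Xty)

# First-order condition: the gradient -2X'y + 2X'X b should be zero at b = beta_hat.
gradient = -2 * Xty + 2 * XtX @ beta_hat

print(beta_hat)
print(gradient)
```

Using `np.linalg.solve` rather than forming $(X'X)^{-1}$ explicitly is a numerical choice made here, not something the notes prescribe; both give the same $\hat\beta$ when $X'X$ is well conditioned.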
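NON-MATRIX VERSION (k = 2)
• Data: $(y_i, x_{i1}, x_{i2}),\ i = 1,\dots,n$, with $(\bar{y}, \bar{x}_1, \bar{x}_2)$ = sample means
$$(\tilde{y}_i, \tilde{x}_{i1}, \tilde{x}_{i2}) = (y_i - \bar{y},\ x_{i1} - \bar{x}_1,\ x_{i2} - \bar{x}_2) \quad \text{(deviation form)}$$
• Beta Estimates:
$$\hat\beta_0 = \bar{y} - \hat\beta_1 \bar{x}_1 - \hat\beta_2 \bar{x}_2, \quad \text{where:}$$
$$\hat\beta_1 = \frac{\left( \sum_{i=1}^{n} \tilde{y}_i \tilde{x}_{i1} \right) \left( \sum_{i=1}^{n} \tilde{x}_{i2}^2 \right) - \left( \sum_{i=1}^{n} \tilde{y}_i \tilde{x}_{i2} \right) \left( \sum_{i=1}^{n} \tilde{x}_{i1} \tilde{x}_{i2} \right)}{\left( \sum_{i=1}^{n} \tilde{x}_{i1}^2 \right) \left( \sum_{i=1}^{n} \tilde{x}_{i2}^2 \right) - \left( \sum_{i=1}^{n} \tilde{x}_{i1} \tilde{x}_{i2} \right)^2}$$
$$\hat\beta_2 = \frac{\left( \sum_{i=1}^{n} \tilde{y}_i \tilde{x}_{i2} \right) \left( \sum_{i=1}^{n} \tilde{x}_{i1}^2 \right) - \left( \sum_{i=1}^{n} \tilde{y}_i \tilde{x}_{i1} \right) \left( \sum_{i=1}^{n} \tilde{x}_{i1} \tilde{x}_{i2} \right)}{\left( \sum_{i=1}^{n} \tilde{x}_{i1}^2 \right) \left( \sum_{i=1}^{n} \tilde{x}_{i2}^2 \right) - \left( \sum_{i=1}^{n} \tilde{x}_{i1} \tilde{x}_{i2} \right)^2}$$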

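EXPECTED VALUES OF RANDOM MATRICES
• Random Vectors and Matrices:
$$Y = Y_{n \times 1} = \begin{pmatrix} Y_1 \\ \vdots \\ Y_n \end{pmatrix}, \quad
Y = Y_{n \times k} = \begin{pmatrix} Y_{11} & \cdots & Y_{1k} \\ \vdots & & \vdots \\ Y_{n1} & \cdots & Y_{nk} \end{pmatrix}$$
• Expected Values:
$$E(Y) = \begin{pmatrix} E(Y_1) \\ \vdots \\ E(Y_n) \end{pmatrix}, \quad
E(Y) = \begin{pmatrix} E(Y_{11}) & \cdots & E(Y_{1k}) \\ \vdots & & \vdots \\ E(Y_{n1}) & \cdots & E(Y_{nk}) \end{pmatrix}$$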
EXPECTATIONS OF LINEAR FUNCTIONS OF RANDOM VECTORS
• Linear Combinations:
$$a'Y = \sum_{i=1}^{n} a_i Y_i \ \Rightarrow\ E(a'Y) = \sum_{i=1}^{n} a_i E(Y_i) = a'E(Y)$$
• Linear Transformations:
$$AY = \begin{pmatrix} a_1'Y \\ \vdots \\ a_n'Y \end{pmatrix} \ \Rightarrow\ E(AY) = \begin{pmatrix} a_1'E(Y) \\ \vdots \\ a_n'E(Y) \end{pmatrix} = A\,E(Y)$$
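EXPECTATIONS OF LINEAR FUNCTIONS OF RANDOM MATRICES
• Left Multiplication:
$$AY = A_{h \times n} Y_{n \times k} = \begin{pmatrix} a_1'Y_1 & \cdots & a_1'Y_k \\ \vdots & & \vdots \\ a_h'Y_1 & \cdots & a_h'Y_k \end{pmatrix}$$
where $Y_j$ denotes the $j$-th column of $Y$.
$$\Rightarrow\ E(AY) = \begin{pmatrix} a_1'E(Y_1) & \cdots & a_1'E(Y_k) \\ \vdots & & \vdots \\ a_h'E(Y_1) & \cdots & a_h'E(Y_k) \end{pmatrix} = A\,E(Y)$$
• Right Multiplication (by symmetry of inner products):
$$YB = Y_{k \times n} B_{n \times h} \ \Rightarrow\ E(YB) = E(Y)\,B$$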
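COVARIANCE OF RANDOM VECTORS
• Random Variables: $E(Y_i) = \mu_i,\ i = 1,\dots,n$
$$\mathrm{cov}(Y_i, Y_j) = \sigma_{ij} = E[(Y_i - \mu_i)(Y_j - \mu_j)], \quad i, j = 1,\dots,n$$
• Random Vectors: $E(Y) = \mu = (\mu_1, \dots, \mu_n)'$
$$\mathrm{cov}(Y) = \Sigma = \begin{pmatrix} \sigma_{11} & \cdots & \sigma_{1n} \\ \vdots & & \vdots \\ \sigma_{n1} & \cdots & \sigma_{nn} \end{pmatrix}
= \begin{pmatrix} E[(Y_1 - \mu_1)(Y_1 - \mu_1)] & \cdots & E[(Y_1 - \mu_1)(Y_n - \mu_n)] \\ \vdots & & \vdots \\ E[(Y_n - \mu_n)(Y_1 - \mu_1)] & \cdots & E[(Y_n - \mu_n)(Y_n - \mu_n)] \end{pmatrix}$$
$$= E \begin{pmatrix} (Y_1 - \mu_1)(Y_1 - \mu_1) & \cdots & (Y_1 - \mu_1)(Y_n - \mu_n) \\ \vdots & & \vdots \\ (Y_n - \mu_n)(Y_1 - \mu_1) & \cdots & (Y_n - \mu_n)(Y_n - \mu_n) \end{pmatrix}
= E \left[ \begin{pmatrix} Y_1 - \mu_1 \\ \vdots \\ Y_n - \mu_n \end{pmatrix} (Y_1 - \mu_1, \dots, Y_n - \mu_n) \right]$$
$$\Rightarrow\ \mathrm{cov}(Y) = \Sigma = E[(Y - \mu)(Y - \mu)']$$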
COVARIANCE OF LINEAR FUNCTIONS OF RANDOM VECTORS
• Linear Transformations: $E(Y) = \mu$
$$\mathrm{cov}(AY) = E[(AY - A\mu)(AY - A\mu)']$$
$$= E[A (Y - \mu)(Y - \mu)' A']$$
$$= A\, E[(Y - \mu)(Y - \mu)' A'] \quad \text{(Left Mult)}$$
$$= A\, E[(Y - \mu)(Y - \mu)']\, A' \quad \text{(Right Mult)}$$
$$\Rightarrow\ \mathrm{cov}(AY) = A\, \mathrm{cov}(Y)\, A'$$
• Linear Combinations:
$$\mathrm{cov}(a'Y) = a'\, \mathrm{cov}(Y)\, a$$
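The same algebra holds when expectations are replaced by sample averages, so the rule can be checked exactly with sample covariances. The sketch below assumes NumPy and uses hypothetical simulated draws, a hypothetical matrix `A`, and a hypothetical weight vector `a`.

```python
import numpy as np

rng = np.random.default_rng(2)

# Hypothetical sample: each column of Y is one draw of a 3-dimensional vector.
Y = rng.normal(size=(3, 1000))
A = np.array([[1.0, 2.0, 0.0],
              [0.0, -1.0, 3.0]])
a = np.array([0.5, -1.0, 2.0])

cov_Y = np.cov(Y)        # 3x3 sample covariance (rows are variables)
cov_AY = np.cov(A @ Y)   # 2x2 sample covariance of the transformed draws

# Variance of the linear combination a'Y, with the same (n - 1) normalization.
var_aY = np.var(a @ Y, ddof=1)

print(cov_AY)
print(A @ cov_Y @ A.T)
print(var_aY, a @ cov_Y @ a)
```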
TRANSLATIONS OF RANDOM VECTORS
• Translation: $Y \to b + Y$
• Means:
$$E(b + Y) = E(b) + E(Y) = b + E(Y)$$
$$\Rightarrow\ E(b + AY) = b + A\,E(Y)$$
• Covariances: $E(Y) = \mu$
$$\mathrm{cov}(b + Y) = E[(b + Y - \{b + \mu\})(b + Y - \{b + \mu\})']$$
$$= E[(Y - \mu)(Y - \mu)'] = \mathrm{cov}(Y)$$
$$\Rightarrow\ \mathrm{cov}(b + AY) = A\, \mathrm{cov}(Y)\, A'$$
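RESIDUAL VECTOR IN THE STANDARD LINEAR MODEL
• Linear Model Assumption: $\varepsilon_i \ \text{iid} \sim N(0, \sigma^2),\ i = 1,\dots,n$
• Residual Means:
$$E(\varepsilon_i) = 0,\ i = 1,\dots,n \ \Rightarrow\ E(\varepsilon) = 0$$
• Residual Covariances:
$$\mathrm{var}(\varepsilon_i) = E(\varepsilon_i^2) = \sigma^2, \quad \mathrm{cov}(\varepsilon_i, \varepsilon_j) = E(\varepsilon_i \varepsilon_j) = 0,\ j \neq i$$
$$\Rightarrow\ \mathrm{cov}(\varepsilon) = E[(\varepsilon - 0)(\varepsilon - 0)'] = E(\varepsilon \varepsilon')$$
$$= E \begin{pmatrix} \varepsilon_1 \varepsilon_1 & \cdots & \varepsilon_1 \varepsilon_n \\ \vdots & & \vdots \\ \varepsilon_n \varepsilon_1 & \cdots & \varepsilon_n \varepsilon_n \end{pmatrix}
= \begin{pmatrix} E(\varepsilon_1 \varepsilon_1) & \cdots & E(\varepsilon_1 \varepsilon_n) \\ \vdots & & \vdots \\ E(\varepsilon_n \varepsilon_1) & \cdots & E(\varepsilon_n \varepsilon_n) \end{pmatrix}
= \begin{pmatrix} \sigma^2 & & 0 \\ & \ddots & \\ 0 & & \sigma^2 \end{pmatrix}$$
$$\Rightarrow\ \mathrm{cov}(\varepsilon) = \sigma^2 I_n$$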
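MOMENTS OF BETA ESTIMATES
• Linear Model: $Y = X\beta + \varepsilon, \quad \varepsilon \sim N(0, \sigma^2 I_n)$
$$\Rightarrow\ \hat\beta = (X'X)^{-1} X'Y = (X'X)^{-1} X'(X\beta + \varepsilon)$$
$$= (X'X)^{-1} X'X \beta + (X'X)^{-1} X'\varepsilon = \beta + (X'X)^{-1} X'\varepsilon$$
• Mean of Beta Estimates:
$$E(\hat\beta) = \beta + (X'X)^{-1} X' E(\varepsilon) \ \Rightarrow\ E(\hat\beta) = \beta \quad \text{(Unbiased Estimator)}$$
• Covariance of Beta Estimates:
$$\mathrm{cov}(\hat\beta) = V = \mathrm{cov}[\beta + (X'X)^{-1} X'\varepsilon] = \mathrm{cov}[(X'X)^{-1} X'\varepsilon]$$
$$= (X'X)^{-1} X'\, \mathrm{cov}(\varepsilon)\, X (X'X)^{-1} = \sigma^2 (X'X)^{-1} X'X (X'X)^{-1}$$
$$\Rightarrow\ \mathrm{cov}(\hat\beta) = V = \sigma^2 (X'X)^{-1}$$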
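The two algebraic steps in this derivation can be verified numerically without any simulation averaging. The sketch below assumes NumPy; the design matrix, `beta`, `sigma`, and the seed are hypothetical. It checks the decomposition $\hat\beta = \beta + (X'X)^{-1}X'\varepsilon$ for one draw of $\varepsilon$, and that the "sandwich" expression collapses to $\sigma^2 (X'X)^{-1}$.

```python
import numpy as np

rng = np.random.default_rng(3)

# Hypothetical design, coefficients, and error scale.
n, sigma = 40, 0.5
X = np.column_stack([np.ones(n), rng.normal(size=(n, 2))])
beta = np.array([1.0, -2.0, 0.5])
eps = rng.normal(scale=sigma, size=n)
Y = X @ beta + eps

XtX_inv = np.linalg.inv(X.T @ X)
M = XtX_inv @ X.T          # the matrix (X'X)^{-1} X'
beta_hat = M @ Y

# cov(M eps) = M cov(eps) M' with cov(eps) = sigma^2 I_n ...
V_sandwich = M @ (sigma ** 2 * np.eye(n)) @ M.T
# ... which should simplify to sigma^2 (X'X)^{-1}.
V_closed = sigma ** 2 * XtX_inv

print(beta_hat - beta)
print(np.max(np.abs(V_sandwich - V_closed)))
```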
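ESTIMATION OF RESIDUAL VARIANCE
• Residual Variance:
$$\sigma^2 = \mathrm{var}(\varepsilon_i) = E(\varepsilon_i^2), \quad i = 1,\dots,n$$
• Residual Estimates:
$$\hat\varepsilon_i = y_i - \hat{y}_i, \quad i = 1,\dots,n$$
• Natural Estimate of Variance:
$$\hat\sigma^2 = \frac{1}{n} \sum_{i=1}^{n} \hat\varepsilon_i^2 = \frac{1}{n}\, \hat\varepsilon'\hat\varepsilon, \quad \text{where } \hat\varepsilon = (\hat\varepsilon_1, \dots, \hat\varepsilon_n)'$$
• Bias-Corrected Estimate of Variance:
$$s^2 = \frac{1}{n - (k+1)}\, \hat\varepsilon'\hat\varepsilon \quad \text{(Compensates for Least Squares)}$$
$$S(\hat\beta) = (y - \hat{y})'(y - \hat{y}) = \hat\varepsilon'\hat\varepsilon$$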
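The two variance estimates differ only in their divisor, which the following sketch makes explicit. It assumes NumPy and uses hypothetical simulated data with $k = 2$ regressors; the generating coefficients and noise scale are made up for illustration.

```python
import numpy as np

rng = np.random.default_rng(4)

# Hypothetical simulated data with k = 2 regressors and an intercept column.
n, k = 30, 2
X = np.column_stack([np.ones(n), rng.normal(size=(n, k))])
y = X @ np.array([2.0, 1.0, -1.0]) + rng.normal(scale=0.7, size=n)

beta_hat = np.linalg.solve(X.T @ X, X.T @ y)
resid = y - X @ beta_hat          # estimated residuals, y - y_hat
rss = resid @ resid               # S(beta_hat), the residual sum of squares

sigma2_hat = rss / n              # natural estimate (divides by n)
s2 = rss / (n - (k + 1))          # bias-corrected estimate (divides by n - (k+1))

print(sigma2_hat, s2)
```

Because $n - (k+1) < n$, the corrected estimate $s^2$ is always the larger of the two.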
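ESTIMATION OF BETA COVARIANCE
• Beta Covariance Matrix:
$$V = \mathrm{cov}(\hat\beta) = \sigma^2 (X'X)^{-1} = \begin{pmatrix} v_{00} & \cdots & v_{0k} \\ \vdots & & \vdots \\ v_{k0} & \cdots & v_{kk} \end{pmatrix}$$
• Beta Covariance Estimates:
$$\hat{V} = \mathrm{cov}_{est}(\hat\beta) = s^2 (X'X)^{-1} = \begin{pmatrix} \hat{v}_{00} & \cdots & \hat{v}_{0k} \\ \vdots & & \vdots \\ \hat{v}_{k0} & \cdots & \hat{v}_{kk} \end{pmatrix}$$
$$\Rightarrow\ \mathrm{var}_{est}(\hat\beta_j) = \hat{v}_{jj}, \quad j = 0, 1, \dots, k$$
$$\Rightarrow\ \text{std-err}(\hat\beta_j) = \sqrt{\hat{v}_{jj}}$$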
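Putting the pieces together, the standard errors are the square roots of the diagonal of $s^2 (X'X)^{-1}$. The sketch below assumes NumPy and hypothetical simulated data; it is meant as an illustration of the formulas above rather than a full regression routine.

```python
import numpy as np

rng = np.random.default_rng(5)

# Hypothetical simulated data with k = 2 regressors and an intercept column.
n, k = 30, 2
X = np.column_stack([np.ones(n), rng.normal(size=(n, k))])
y = X @ np.array([2.0, 1.0, -1.0]) + rng.normal(scale=0.7, size=n)

XtX_inv = np.linalg.inv(X.T @ X)
beta_hat = XtX_inv @ X.T @ y
resid = y - X @ beta_hat
s2 = resid @ resid / (n - (k + 1))    # bias-corrected variance estimate

V_hat = s2 * XtX_inv                  # estimated covariance matrix of beta_hat
std_err = np.sqrt(np.diag(V_hat))     # standard errors of the k + 1 estimates

for j in range(k + 1):
    print(f"beta_hat[{j}] = {beta_hat[j]: .4f}  (std-err {std_err[j]:.4f})")
```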