
Topic 12: Multiple Linear Regression
Outline
• Multiple Regression
– Data and notation
– Model
– Inference
• Recall notes from Topic 3 for simple
linear regression
Data for Multiple Regression
• Yi is the response variable
• Xi1, Xi2, … , Xi,p-1 are p-1
explanatory (or predictor) variables
• Cases denoted by i = 1 to n
Multiple Regression Model
Yi = β0 + β1Xi1 + β2Xi2 + … + βp-1Xi,p-1 + εi
• Yi is the value of the response variable
for the ith case
• β0 is the intercept
• β1, β2, … , βp-1 are the regression
coefficients for the explanatory
variables
Multiple Regression Model
Yi = β0 + β1Xi1 + β2Xi2 + … + βp-1Xi,p-1 + εi
• Xi,k is the value of the kth explanatory
variable for the ith case
• εi are independent, Normally
distributed random errors with mean 0
and variance σ²
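A hypothetical NumPy sketch (all names and numbers made up) that simulates data from this model with two explanatory variables:

import numpy as np

# Hypothetical illustration: simulate n = 50 cases from
# Yi = β0 + β1*Xi1 + β2*Xi2 + εi, with εi ~ N(0, σ²)
rng = np.random.default_rng(0)
n = 50
beta = np.array([2.0, 1.5, -0.7])   # (β0, β1, β2), chosen arbitrarily
X1 = rng.uniform(0, 10, n)          # first explanatory variable
X2 = rng.uniform(0, 10, n)          # second explanatory variable
sigma = 2.0
eps = rng.normal(0, sigma, n)       # independent Normal errors, mean 0
Y = beta[0] + beta[1] * X1 + beta[2] * X2 + eps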
Multiple Regression
Parameters
• β0 is the intercept
• β1, β2, … , βp-1 are the regression
coefficients for the explanatory
variables
• σ² is the variance of the error term
Interesting special cases
• Yi = β0 + β1Xi + β2Xi^2 + … + βp-1Xi^(p-1) + εi
(polynomial of order p-1)
• X’s can be indicator or dummy variables
taking the values 0 and 1 (or any other
two distinct numbers)
• Interactions between explanatory
variables (represented as the product of
explanatory variables)
Interesting special cases
• Consider the model
Yi = β0 + β1Xi1 + β2Xi2 + β3Xi1Xi2 + εi
• If Xi2 is a dummy variable
– Yi = β0 + β1Xi1 + εi
(when Xi2 = 0)
– Yi = β0 + β1Xi1 + β2 + β3Xi1 + εi
(when Xi2 = 1)
= (β0 + β2) + (β1 + β3)Xi1 + εi
– Modeling two different regression
lines at the same time (see the sketch
below)
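A small sketch of the dummy-plus-interaction case above (simulated data; the fit uses np.linalg.lstsq):

import numpy as np

# Simulated example: a 0/1 dummy X2 with an interaction term encodes
# two lines: (b0, b1) when X2 = 0 and (b0+b2, b1+b3) when X2 = 1
rng = np.random.default_rng(1)
n = 40
X1 = rng.uniform(0, 10, n)
X2 = rng.integers(0, 2, n).astype(float)   # dummy variable: 0 or 1
Y = 1.0 + 0.5 * X1 + 3.0 * X2 + 1.5 * X1 * X2 + rng.normal(0, 1, n)

# Design matrix: intercept, X1, X2, and the product X1*X2
X = np.column_stack([np.ones(n), X1, X2, X1 * X2])
b = np.linalg.lstsq(X, Y, rcond=None)[0]
print("line for X2 = 0: intercept", b[0], "slope", b[1])
print("line for X2 = 1: intercept", b[0] + b[2], "slope", b[1] + b[3])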
Model in Matrix Form
Y = Xβ + ε
where Y is n×1, X is n×p, β is p×1,
and ε is n×1
ε ~ N(0, σ²I), with I the n×n identity
Y ~ N(Xβ, σ²I)
Least Squares
Find b to minimize
SSE = (Y − Xb)′(Y − Xb)
Obtain the normal equations
X′Xb = X′Y
Least Squares Solution
b = (X′X)⁻¹X′Y
Fitted (predicted) values
Ŷ = Xb = X(X′X)⁻¹X′Y = HY
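A sketch of these formulas on simulated data (forming (X′X)⁻¹ explicitly is fine for illustration, though lstsq is preferred numerically):

import numpy as np

rng = np.random.default_rng(3)
n = 25
X = np.column_stack([np.ones(n), rng.normal(size=(n, 2))])
Y = X @ np.array([1.0, 0.5, -2.0]) + rng.normal(0, 1, n)

XtX_inv = np.linalg.inv(X.T @ X)
b = XtX_inv @ X.T @ Y              # b = (X′X)⁻¹X′Y
H = X @ XtX_inv @ X.T              # hat matrix H = X(X′X)⁻¹X′
Y_hat = H @ Y                      # fitted values, same as X @ b
print(np.allclose(Y_hat, X @ b))   # True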
Residuals
e = Y − Ŷ
= Y − HY
= (I − H)Y
I − H is symmetric and idempotent:
(I − H)(I − H) = I − H
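A quick numerical check of these two properties on a made-up design matrix:

import numpy as np

rng = np.random.default_rng(4)
X = np.column_stack([np.ones(20), rng.normal(size=(20, 2))])
H = X @ np.linalg.inv(X.T @ X) @ X.T
M = np.eye(20) - H               # I − H
print(np.allclose(M, M.T))       # symmetric
print(np.allclose(M @ M, M))     # idempotent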
Covariance Matrix of
residuals
• Cov(e) = σ²(I − H)(I − H)′ = σ²(I − H)
• Var(ei) = σ²(1 − hii)
• hii = X′i(X′X)⁻¹Xi
• X′i = (1, Xi1, …, Xi,p-1)
• Residuals are usually correlated
• Cov(ei, ej) = −σ²hij
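A sketch computing the leverages hii and the residual variances σ²(1 − hii) on simulated data:

import numpy as np

rng = np.random.default_rng(5)
n, sigma = 20, 2.0
X = np.column_stack([np.ones(n), rng.uniform(0, 10, n)])
H = X @ np.linalg.inv(X.T @ X) @ X.T
h = np.diag(H)                   # leverages hii
print(sigma**2 * (1 - h))        # Var(ei), one value per case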
Estimation of σ
s² = (Y − Xb)′(Y − Xb) / (n − p)
= SSE / (n − p)
= SSE / dfE
= MSE
s = √s² = Root MSE
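A sketch of this estimate on simulated data with known σ = 2 (so s² should land near 4):

import numpy as np

rng = np.random.default_rng(6)
n, p = 100, 3
X = np.column_stack([np.ones(n), rng.normal(size=(n, p - 1))])
Y = X @ np.array([1.0, 2.0, -1.0]) + rng.normal(0, 2.0, n)  # true σ = 2
b = np.linalg.lstsq(X, Y, rcond=None)[0]
e = Y - X @ b
s2 = (e @ e) / (n - p)           # s² = SSE/(n − p) = MSE
print(s2, np.sqrt(s2))           # s² and s (Root MSE)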
Distribution of b
• b = (X′X)⁻¹X′Y
• Since Y ~ N(Xβ, σ²I):
• E(b) = ((X′X)⁻¹X′)Xβ = β
• Cov(b) = σ²((X′X)⁻¹X′)((X′X)⁻¹X′)′ = σ²(X′X)⁻¹
• σ²(X′X)⁻¹ is estimated by s²(X′X)⁻¹
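A sketch computing s²(X′X)⁻¹ and the resulting standard errors (simulated data):

import numpy as np

rng = np.random.default_rng(7)
n, p = 80, 3
X = np.column_stack([np.ones(n), rng.normal(size=(n, p - 1))])
Y = X @ np.array([0.5, 1.0, -1.5]) + rng.normal(0, 1.0, n)
b = np.linalg.lstsq(X, Y, rcond=None)[0]
s2 = np.sum((Y - X @ b) ** 2) / (n - p)
cov_b = s2 * np.linalg.inv(X.T @ X)   # estimated Cov(b)
print(b)                              # estimates b0, b1, b2
print(np.sqrt(np.diag(cov_b)))        # their standard errors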
ANOVA Table
• Sources of variation are
– Model (SAS) or Regression (KNNL)
– Error (SAS, KNNL) or Residual
– Total
• SS and df add as before
– SSM + SSE = SSTO
– dfM + dfE = dfTotal
Sums of Squares
SSM = Σ (Ŷi − Ȳ)²
SSE = Σ (Yi − Ŷi)²
SSTO = Σ (Yi − Ȳ)²
(each sum runs over i = 1, …, n)
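A numerical check of the decomposition SSM + SSE = SSTO on simulated data:

import numpy as np

rng = np.random.default_rng(8)
n = 60
X = np.column_stack([np.ones(n), rng.normal(size=(n, 2))])
Y = X @ np.array([1.0, 1.0, 1.0]) + rng.normal(0, 1, n)
Y_hat = X @ np.linalg.lstsq(X, Y, rcond=None)[0]
SSM = np.sum((Y_hat - Y.mean()) ** 2)
SSE = np.sum((Y - Y_hat) ** 2)
SSTO = np.sum((Y - Y.mean()) ** 2)
print(np.isclose(SSM + SSE, SSTO))   # True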
Degrees of Freedom
dfM = p − 1
dfE = n − p
dfTotal = n − 1
Mean Squares
MSM = SSM/dfM
MSE = SSE/dfE
MST = SSTO/dfTotal
Mean Squares
MSM = Σ (Ŷi − Ȳ)² / (p − 1)
MSE = Σ (Yi − Ŷi)² / (n − p)
MST = Σ (Yi − Ȳ)² / (n − 1)
(each sum runs over i = 1, …, n)
ANOVA Table
Source   SS     df        MS    F
Model    SSM    dfM       MSM   MSM/MSE
Error    SSE    dfE       MSE
Total    SSTO   dfTotal   MST
ANOVA F test
• H0: β1 = β2 = … = βp-1 = 0
• Ha: βk ≠ 0 for at least one k = 1, …, p-1
• Under H0, F ~ F(p-1,n-p)
• Reject H0 if F is large, use P-value
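A sketch of the F test on simulated data, with the P-value taken from the F(p−1, n−p) upper tail via scipy.stats:

import numpy as np
from scipy import stats

rng = np.random.default_rng(9)
n, p = 50, 3
X = np.column_stack([np.ones(n), rng.normal(size=(n, p - 1))])
Y = X @ np.array([2.0, 1.0, 0.5]) + rng.normal(0, 1, n)
Y_hat = X @ np.linalg.lstsq(X, Y, rcond=None)[0]
MSM = np.sum((Y_hat - Y.mean()) ** 2) / (p - 1)
MSE = np.sum((Y - Y_hat) ** 2) / (n - p)
F = MSM / MSE
p_value = stats.f.sf(F, p - 1, n - p)   # upper-tail probability
print(F, p_value)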
P-value of F test
• The P-value for the F significance test
tells us one of the following:
– there is no evidence to conclude that
any of our explanatory variables can
help us to model the response variable
using this kind of model (P ≥ .05)
– one or more of the explanatory
variables in our model is potentially
useful for predicting the response
variable in a linear model (P ≤ .05)
R²
• The squared multiple regression
correlation (R²) gives the proportion of
variation in the response variable
explained by all the explanatory variables
• It is usually expressed as a percent
• It is sometimes called the coefficient of
multiple determination (KNNL p 226)
R²
• R² = SSM/SSTO
– the proportion of variation explained
• R² = 1 − (SSE/SSTO)
– 1 − the proportion not explained
• Can express the F test in terms of R²:
F = [ R²/(p−1) ] / [ (1 − R²)/(n−p) ]
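A check that the R² form of F matches MSM/MSE on simulated data:

import numpy as np

rng = np.random.default_rng(10)
n, p = 40, 3
X = np.column_stack([np.ones(n), rng.normal(size=(n, p - 1))])
Y = X @ np.array([1.0, 0.8, -0.4]) + rng.normal(0, 1, n)
Y_hat = X @ np.linalg.lstsq(X, Y, rcond=None)[0]
SSE = np.sum((Y - Y_hat) ** 2)
SSTO = np.sum((Y - Y.mean()) ** 2)
R2 = 1 - SSE / SSTO
F_from_R2 = (R2 / (p - 1)) / ((1 - R2) / (n - p))
F_from_MS = ((SSTO - SSE) / (p - 1)) / (SSE / (n - p))   # MSM/MSE
print(np.isclose(F_from_R2, F_from_MS))   # True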
Background Reading
• We went over KNNL 6.1 - 6.5