Multiple Regression - James Madison University


Multiple Regression
• More than one explanatory/independent variable
yt = β1 + β2 x2t + β3 x3t + … + βk xkt + et
yt = E(yt) + et
• This makes a slight change to the interpretation of the coefficients
• This changes the measure of degrees of freedom
• We need to modify one of the assumptions
EXAMPLE: trt = β1 + β2 pt + β3 at + et
EXAMPLE: qdt = β1 + β2 pt + β3 inct + et
EXAMPLE: gpat = β1 + β2 SATt + β3 STUDYt + et
7.1
Interpretation of Coefficient
yt = β1 + β2 x2t + β3 x3t + et
∂y/∂x2 = β2 (holding x3 constant)
∂y/∂x3 = β3 (holding x2 constant)
• β2 measures the change in Y from a change in X2, holding X3 constant.
• β3 measures the change in Y from a change in X3, holding X2 constant.
7.2
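A minimal numerical sketch of this interpretation, using the gpa example from slide 7.1 (the data below are simulated and purely hypothetical, not the JMU sample; numpy is assumed):

```python
import numpy as np

# Hypothetical simulated data for gpa_t = beta1 + beta2*SAT_t + beta3*STUDY_t + e_t
rng = np.random.default_rng(0)
T = 200
sat = rng.uniform(900, 1500, T)
study = rng.uniform(0, 30, T)
gpa = 1.0 + 0.0012 * sat + 0.02 * study + rng.normal(0, 0.3, T)

# Least squares fit (column of ones for the intercept)
X = np.column_stack([np.ones(T), sat, study])
b1, b2, b3 = np.linalg.lstsq(X, gpa, rcond=None)[0]

# b3 is the change in predicted gpa from one more study hour, holding SAT constant
yhat_10hrs = b1 + b2 * 1200 + b3 * 10
yhat_11hrs = b1 + b2 * 1200 + b3 * 11
print(b3, yhat_11hrs - yhat_10hrs)   # identical by construction
```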
Assumptions of the Multiple Regression Model
1. The regression model is linear in the parameters and error term:
yt = β1 + β2 x2t + β3 x3t + … + βk xkt + et
2. The error term has a mean of zero:
E(et) = 0, so E(yt) = β1 + β2 x2t + β3 x3t + … + βk xkt
3. The error term has constant variance: Var(et) = E(et²) = σ²
4. The error term is not correlated with itself (no serial correlation):
Cov(ei, ej) = E(ei ej) = 0 for i ≠ j
5. The data on the x's are not random (and thus are uncorrelated with the error term: Cov(X, e) = E(Xe) = 0), and they are NOT exact linear functions of other explanatory variables.
6. (Optional) The error term has a normal distribution: et ~ N(0, σ²)
7.3
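A small simulation sketch of a data-generating process that satisfies assumptions 1–6 (all parameter values and variable choices are hypothetical; numpy is assumed):

```python
import numpy as np

rng = np.random.default_rng(42)
T = 200

# Fixed (non-random) regressors that are not exact linear functions
# of one another (assumption 5)
x2 = np.linspace(1, 20, T)
x3 = np.sqrt(x2) + np.sin(x2)

# True parameters (hypothetical)
beta1, beta2, beta3, sigma = 4.0, 0.7, -1.2, 2.0

# Errors: mean zero, constant variance, no serial correlation, normal
# (assumptions 2, 3, 4 and the optional assumption 6)
e = rng.normal(0.0, sigma, T)

# Assumption 1: y is linear in the parameters and the error term
y = beta1 + beta2 * x2 + beta3 * x3 + e
```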
Estimation of the Multiple Regression Model
• Let’s use a model with 2 independent variables:
yt = β1 + β2 x2t + β3 x3t + et
• A scatterplot of points is now a scatter “cloud”. We want to fit the
best “line” through these points. In 3 dimensions, the line becomes
a plane.
• The estimated “line” and a residual are defined as before:
ŷt = b1 + b2 x2t + b3 x3t
êt = yt − ŷt
• The idea is to choose values for b1, b2, and b3 such that the sum of
squared residuals is minimized.
7.4
Σ êt² = Σ (yt − ŷt)² = Σ (yt − b1 − b2 x2t − b3 x3t)²
From here, we minimize this expression with respect to b1, b2, and b3. We set these three derivatives equal to zero and solve for b1, b2, and b3, which gives the following formulas:
b2 = [ Σ y*t x*2t · Σ (x*3t)² − Σ y*t x*3t · Σ x*2t x*3t ] / [ Σ (x*2t)² · Σ (x*3t)² − ( Σ x*2t x*3t )² ]

b3 = [ Σ y*t x*3t · Σ (x*2t)² − Σ y*t x*2t · Σ x*2t x*3t ] / [ Σ (x*2t)² · Σ (x*3t)² − ( Σ x*2t x*3t )² ]

b1 = ȳ − b2 x̄2 − b3 x̄3

Where:
y*t = (yt − ȳ)
x*2t = (x2t − x̄2)
x*3t = (x3t − x̄3)
[What is going on here? In the formula for b2, notice that if x3 were omitted from the model, the formula reduces to the familiar formula from Chapter 3.]
7.5
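A quick numerical check of these formulas (the data are hypothetical; numpy is assumed), comparing the deviation-form expressions above against a direct least-squares solve:

```python
import numpy as np

rng = np.random.default_rng(1)
T = 50
x2 = rng.normal(10, 2, T)
x3 = 0.5 * x2 + rng.normal(0, 1, T)          # x2 and x3 are correlated
y = 3.0 + 1.2 * x2 - 0.7 * x3 + rng.normal(0, 1, T)

# Deviations from the sample means
ys, x2s, x3s = y - y.mean(), x2 - x2.mean(), x3 - x3.mean()

denom = (x2s**2).sum() * (x3s**2).sum() - (x2s * x3s).sum()**2
b2 = ((ys * x2s).sum() * (x3s**2).sum() - (ys * x3s).sum() * (x2s * x3s).sum()) / denom
b3 = ((ys * x3s).sum() * (x2s**2).sum() - (ys * x2s).sum() * (x2s * x3s).sum()) / denom
b1 = y.mean() - b2 * x2.mean() - b3 * x3.mean()

# Same answer from a direct least-squares solve on [1, x2, x3]
X = np.column_stack([np.ones(T), x2, x3])
print(np.array([b1, b2, b3]))
print(np.linalg.lstsq(X, y, rcond=None)[0])   # should match
```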
You may wonder why the multiple regression formulas on slide 7.5
aren’t equal to:
b2 = Σ (yt − ȳ)(x2t − x̄2) / Σ (x2t − x̄2)²

b3 = Σ (yt − ȳ)(x3t − x̄3) / Σ (x3t − x̄3)²
7.6
We can use a Venn diagram to illustrate the idea of regression as analysis of variance.
[Venn diagrams omitted: for bivariate (simple) regression, two overlapping circles labeled y and x; for multiple regression, three overlapping circles labeled y, x2, and x3.]
7.7
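A short sketch of why the simple-regression formulas above do not recover the multiple-regression coefficients when x2 and x3 overlap (are correlated); the data are hypothetical and numpy is assumed:

```python
import numpy as np

rng = np.random.default_rng(2)
T = 500
x2 = rng.normal(0, 1, T)
x3 = 0.8 * x2 + rng.normal(0, 0.6, T)        # x2 and x3 share variance
y = 1.0 + 2.0 * x2 + 3.0 * x3 + rng.normal(0, 1, T)

# Simple-regression formula applied to x2 alone
b2_simple = ((y - y.mean()) * (x2 - x2.mean())).sum() / ((x2 - x2.mean())**2).sum()

# Multiple-regression coefficient on x2
X = np.column_stack([np.ones(T), x2, x3])
b_multiple = np.linalg.lstsq(X, y, rcond=None)[0]

print(b2_simple)        # picks up part of x3's effect: well above 2
print(b_multiple[1])    # close to the true value 2
```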
Example of Multiple Regression
Suppose we want to estimate a model of home prices using data
on the size of the house (sqft), the number of bedrooms (bed) and the
number of bathrooms (bath). We get the following results:
prîcet = 129.062 + 0.1548 sqftt − 21.588 bedt − 12.193 batht
How does a negative coefficient estimate on bed and bath make
sense?
7.8
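One way to read the negative coefficients is as the effect of an extra bedroom or bathroom holding sqft fixed, i.e., carving another room out of the same floor space. A hedged arithmetic sketch, assuming (as is common for data sets like this) that price is measured in thousands of dollars and using a purely illustrative 300 sqft figure:

```python
# Effect of one more bedroom with total square footage held constant
extra_bed_same_sqft = -21.588                      # roughly $21,600 lower price

# Effect of one more bedroom that also adds, say, 300 sqft of space
extra_bed_plus_300sqft = 0.1548 * 300 + (-21.588)  # roughly $24,900 higher price
print(extra_bed_same_sqft, extra_bed_plus_300sqft)
```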
7.9
Expected Value
E(b1) = β1
E(b2) = β2
E(b3) = β3
We will omit the proofs. The Least Squares estimator for multiple regression is unbiased, regardless of the number of independent variables.
Variance Formulas With 2 Independent Variables
Var(b2) = σ² / [(1 − r23²) Σ (x2t − x̄2)²]
Var(b3) = σ² / [(1 − r23²) Σ (x3t − x̄3)²]
Where r23 is the correlation between x2 and x3 and the parameter σ² is the variance of the error term.
We need to estimate σ² using the formula
σ̂² = Σ êt² / (T − k)
This estimate has T − k degrees of freedom.
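A sketch of these variance formulas in code (hypothetical data; numpy assumed), checking the formula-based variances against the usual matrix expression σ̂²(X′X)⁻¹:

```python
import numpy as np

rng = np.random.default_rng(3)
T = 60
x2 = rng.normal(5, 2, T)
x3 = 0.4 * x2 + rng.normal(0, 1.5, T)
y = 1.0 + 0.5 * x2 + 2.0 * x3 + rng.normal(0, 1, T)

X = np.column_stack([np.ones(T), x2, x3])
k = X.shape[1]                                   # coefficients, incl. intercept
b = np.linalg.lstsq(X, y, rcond=None)[0]
resid = y - X @ b

sigma2_hat = (resid**2).sum() / (T - k)          # estimated error variance, T-k df

r23 = np.corrcoef(x2, x3)[0, 1]
var_b2 = sigma2_hat / ((1 - r23**2) * ((x2 - x2.mean())**2).sum())
var_b3 = sigma2_hat / ((1 - r23**2) * ((x3 - x3.mean())**2).sum())

# Same numbers from the matrix form sigma2_hat * (X'X)^{-1}
cov_b = sigma2_hat * np.linalg.inv(X.T @ X)
print(var_b2, cov_b[1, 1])
print(var_b3, cov_b[2, 2])
```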
Gauss-Markov Theorem
Under assumptions 1–5 of the linear regression model (the 6th assumption isn't needed for the theorem to be true), the least squares estimators b1, b2, …, bk have the smallest variance of all linear and unbiased estimators of β1, β2, …, βk. They are BLUE (Best Linear Unbiased Estimators).
7.10
Confidence Intervals and Hypothesis Testing
7.11
• The methods for constructing confidence intervals and conducting
hypothesis tests are the same as they were for simple regression.
• The format for a confidence interval is:
bi ± tc se(bi)
Where tc depends on the level of confidence and has T − k degrees of freedom. T is the number of observations and k is the number of independent variables plus one for the intercept.
• Hypothesis Tests:
Ho: βi = c
H1: βi ≠ c
t = (bi − βi) / se(bi)
Use the value of c for βi when calculating t. If t > tc or t < −tc, reject Ho.
If c is 0, then we call it a test of significance.
7.12
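A sketch of the confidence-interval and t-test mechanics (the estimate, standard error, and sample size below are hypothetical; scipy is assumed for the critical value):

```python
from scipy import stats

# Hypothetical estimates from a regression with T = 55 observations and
# k = 3 coefficients (intercept plus two slopes)
T, k = 55, 3
b_i, se_b_i = 0.85, 0.30

# 95% confidence interval: b_i +/- t_c * se(b_i), with T - k degrees of freedom
t_c = stats.t.ppf(0.975, df=T - k)
ci = (b_i - t_c * se_b_i, b_i + t_c * se_b_i)

# Test of significance: Ho: beta_i = 0 vs H1: beta_i != 0
t_stat = (b_i - 0) / se_b_i
reject = abs(t_stat) > t_c

print(ci, t_stat, reject)
```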
Goodness of Fit
• R² measures the proportion of the variance in the dependent variable that is explained by the independent variables. Recall that
R² = SSR / SST = 1 − SSE / SST = 1 − Σ êt² / Σ (yt − ȳ)²
• Because Least Squares chooses the line that produces the smallest sum of squared residuals, it also produces the line with the largest R². Least Squares also has the property that the inclusion of additional independent variables will never increase, and will often lower, the sum of squared residuals, meaning that R² will never fall and will often rise when new independent variables are added, even if those variables have no economic justification.
• Adjusted R²: adjust R² for degrees of freedom
R̄² = 1 − [Σ êt² / (T − k)] / [Σ (yt − ȳ)² / (T − 1)]
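A small sketch computing R² and adjusted R² from residuals (hypothetical data; numpy assumed):

```python
import numpy as np

rng = np.random.default_rng(4)
T = 55
x2 = rng.normal(0, 1, T)
x3 = rng.normal(0, 1, T)
y = 1.0 + 0.6 * x2 + 0.3 * x3 + rng.normal(0, 1, T)

X = np.column_stack([np.ones(T), x2, x3])
k = X.shape[1]
b = np.linalg.lstsq(X, y, rcond=None)[0]
resid = y - X @ b

sse = (resid**2).sum()                  # sum of squared residuals
sst = ((y - y.mean())**2).sum()         # total sum of squares

r2 = 1 - sse / sst
adj_r2 = 1 - (sse / (T - k)) / (sst / (T - 1))
print(r2, adj_r2)
```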
Example: Grades at JMU
A sample of 55 JMU students was taken Fall 2002. Data on
• GPA
• SAT scores
• Credit Hours Completed
• Hours of Study per Week
• Hours at a Job per week
• Hours at Extracurricular Activities
Three models were estimated:
gpat = β1 + β2 SATt + et
gpat = β1 + β2 SATt + β3 CREDITSt + β4 STUDYt + β5 JOBt + β6 ECt + et
gpat = β1 + β2 SATt + β3 CREDITSt + β4 STUDYt + β5 JOBt + et
7.13
7.14
Here is our simple regression model.

Regression Statistics
  Multiple R           0.270081672
  R Square             0.072944109
  Adjusted R Square    0.055452489
  Standard Error       0.455494651
  Observations         55

              Coefficients   Standard Error   t Stat         P-value
  Intercept   1.799734629    0.631013633      2.852132721    0.006178434
  SAT Total   0.001057599    0.000517894      2.042114498    0.046129915

Here is our multiple regression model. Both R² and Adjusted R² have increased with the inclusion of 4 additional independent variables.

Regression Statistics
  Multiple R           0.465045986
  R Square             0.216267769
  Adjusted R Square    0.136295092
  Standard Error       0.4355661
  Observations         55

                             Coefficients   Standard Error   t Stat         P-value
  Intercept                  1.538048132    0.686920438      2.239048435    0.029726041
  SAT Total                  0.000943615    0.000503684      1.873427614    0.066980023
  Credit Hours (completed)   0.003382201    0.002664489      1.269361823    0.210308107
  Study (Hrs/wk)             0.010762866    0.006782587      1.586837918    0.118982276
  Job (Hrs/wk)               -0.00795843    0.006452798      -1.23333021    0.22333641
  EC (Hrs/wk)                0.002606617    0.009216993      0.282805582    0.778517157
7.15
Notice that the exclusion of EC increases adjusted R² but reduces R².

Regression Statistics
  Multiple R           0.463668569
  R Square             0.214988542
  Adjusted R Square    0.152187625
  Standard Error       0.431540195
  Observations         55

                             Coefficients   Standard Error   t Stat         P-value
  Intercept                  1.53616637     0.680539354      2.257277792    0.028390614
  SAT Total                  0.000951057    0.000498346      1.908425924    0.062085531
  Credit Hours (completed)   0.003441283    0.002631734      1.307610208    0.196986336
  Study (Hrs/wk)             0.011267319    0.006483348      1.737885925    0.088386942
  Job (Hrs/wk)               -0.0080306     0.006388155      -1.25710723    0.214555464
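As a check on the "adjusted R² up, R² down" observation, a short sketch reproducing the reported adjusted R² values from R², T, and k, using the equivalent relation R̄² = 1 − (1 − R²)(T − 1)/(T − k):

```python
# Reported values from the two JMU models above (T = 55 observations)
T = 55

def adj_r2(r2, T, k):
    # k counts the intercept plus the slope coefficients
    return 1 - (1 - r2) * (T - 1) / (T - k)

# Model with EC: 5 regressors + intercept -> k = 6
print(adj_r2(0.216267769, T, 6))   # ~0.1363, matches the reported 0.136295
# Model without EC: 4 regressors + intercept -> k = 5
print(adj_r2(0.214988542, T, 5))   # ~0.1522, matches the reported 0.152188
```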