Stats 845

Summary of the Statistics used in Multiple Regression
The Least Squares Estimates: $\hat{\beta}_0, \hat{\beta}_1, \hat{\beta}_2, \ldots, \hat{\beta}_p$
- The values that minimize
$$RSS = \sum_{i=1}^{n} (y_i - \hat{y}_i)^2 = \sum_{i=1}^{n} (y_i - \beta_0 - \beta_1 x_{i1} - \beta_2 x_{i2} - \cdots - \beta_p x_{ip})^2.$$
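As a quick illustration, these estimates can be computed numerically. The following is a minimal sketch in Python/NumPy; the small data set is invented purely for illustration:

```python
import numpy as np

# Invented data: n = 5 cases, p = 2 independent variables
x1 = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
x2 = np.array([2.0, 1.0, 4.0, 3.0, 6.0])
y  = np.array([3.1, 4.9, 8.2, 9.8, 14.1])

# Design matrix: a leading column of ones carries the intercept beta_0
X = np.column_stack([np.ones_like(y), x1, x2])

# The least squares estimates are the coefficients minimizing RSS
beta_hat = np.linalg.lstsq(X, y, rcond=None)[0]

y_hat = X @ beta_hat
RSS = np.sum((y - y_hat) ** 2)
print(beta_hat, RSS)
```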
The Analysis of Variance Table Entries
a) Adjusted Total Sum of Squares (SSTotal)
$$SS_{Total} = \sum_{i=1}^{n} (y_i - \bar{y})^2, \qquad d.f. = n - 1$$
b) Residual Sum of Squares (SSError)
$$RSS = SS_{Error} = \sum_{i=1}^{n} (y_i - \hat{y}_i)^2, \qquad d.f. = n - p - 1$$
c) Regression Sum of Squares (SSReg)
$$SS_{Reg} = SS(\beta_1, \beta_2, \ldots, \beta_p) = \sum_{i=1}^{n} (\hat{y}_i - \bar{y})^2, \qquad d.f. = p$$
Note:
$$\sum_{i=1}^{n} (y_i - \bar{y})^2 = \sum_{i=1}^{n} (\hat{y}_i - \bar{y})^2 + \sum_{i=1}^{n} (y_i - \hat{y}_i)^2,$$
i.e. SSTotal = SSReg + SSError.
The Analysis of Variance Table

Source       Sum of Squares   d.f.    Mean Square                      F
Regression   SSReg            p       SSReg/p = MSReg                  MSReg/s²
Error        SSError          n-p-1   SSError/(n-p-1) = MSError = s²
Total        SSTotal          n-1
Uses:
1. To estimate σ² (the error variance).
- Use s² = MSError to estimate σ².
2. To test the hypothesis
H0: β1 = β2 = ... = βp = 0.
- Use the test statistic
F = MSReg/s² = [(1/p)SSReg]/[(1/(n-p-1))SSError].
- Reject H0 if F > Fα(p, n-p-1).
3. To compute other statistics that are useful in describing the
relationship between Y (the dependent variable) and X1,
X2, ... ,Xp (the independent variables).
a) R² = the coefficient of determination
= SSReg/SSTotal
$$= \frac{\sum_{i=1}^{n} (\hat{y}_i - \bar{y})^2}{\sum_{i=1}^{n} (y_i - \bar{y})^2}$$
= the proportion of variance in Y explained by X1, X2, ..., Xp
1 - R² = the proportion of variance in Y that is left unexplained by X1, X2, ..., Xp
= SSError/SSTotal.
b) Ra² = "R² adjusted" for degrees of freedom
= 1 - [the proportion of variance in Y that is left unexplained by X1, X2, ..., Xp, adjusted for d.f.]
= 1 - [(1/(n-p-1))SSError]/[(1/(n-1))SSTotal]
= 1 - [(n-1)SSError]/[(n-p-1)SSTotal]
= 1 - [(n-1)/(n-p-1)][1 - R²].
c) R = √R² = the multiple correlation coefficient of Y with X1, X2, ..., Xp
$$= \sqrt{\frac{SS_{Reg}}{SS_{Total}}}$$
= the maximum correlation between Y and a linear combination of X1, X2, ..., Xp.
Comment: The statistics F, R2, Ra2 and R are
equivalent statistics.
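The whole ANOVA table and the statistics in uses 1-3 can be computed in a few lines. A hedged sketch (Python/NumPy/SciPy), assuming a design matrix X whose first column is all ones:

```python
import numpy as np
from scipy import stats

def anova_summary(X, y):
    """ANOVA entries, F test, R^2, adjusted R^2 and R for a fitted model.
    X: n x (p+1) design matrix whose first column is all ones."""
    n, p_plus_1 = X.shape
    p = p_plus_1 - 1
    beta_hat = np.linalg.lstsq(X, y, rcond=None)[0]
    y_hat = X @ beta_hat

    ss_total = np.sum((y - y.mean()) ** 2)       # d.f. = n - 1
    ss_error = np.sum((y - y_hat) ** 2)          # d.f. = n - p - 1
    ss_reg   = ss_total - ss_error               # d.f. = p

    s2 = ss_error / (n - p - 1)                  # MSError, estimates sigma^2
    F = (ss_reg / p) / s2                        # MSReg / s^2
    p_value = stats.f.sf(F, p, n - p - 1)        # P(F(p, n-p-1) > F)

    r2 = ss_reg / ss_total                       # coefficient of determination
    r2_adj = 1 - (n - 1) / (n - p - 1) * (1 - r2)
    return {"F": F, "p": p_value, "R2": r2, "R2_adj": r2_adj, "R": np.sqrt(r2)}
```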
Properties of the Least Squares Estimators $\hat{\beta}_0, \hat{\beta}_1, \hat{\beta}_2, \ldots, \hat{\beta}_p$:
1. Normally distributed (if the error terms are Normally distributed).
2. Unbiased estimators of the linear parameters β0, β1, β2, ..., βp.
3. Minimum variance (minimum standard error) among all unbiased estimators of the linear parameters β0, β1, β2, ..., βp.
Comments:
The standard error of $\hat{\beta}_i$, S.E.$(\hat{\beta}_i) = s_{\hat{\beta}_i}$, depends on
1. the error variance σ² (and σ),
2. $s_{X_i}$, the standard deviation of Xi (the ith independent variable),
3. the sample size n,
4. the correlations between all pairs of variables.
The standard error of $\hat{\beta}_i$, S.E.$(\hat{\beta}_i) = s_{\hat{\beta}_i}$,
• decreases as σ decreases,
• decreases as $s_{X_i}$ increases,
• decreases as n increases,
• increases as the correlation between pairs of independent variables increases.
- In fact the standard error of the least squares estimates can be extremely high if there is a high correlation between one of the independent variables and a linear combination of the remaining independent variables (the problem of Multicollinearity), as the simulation sketch below illustrates.
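The multicollinearity effect is easy to see in a small simulation. A sketch (Python/NumPy; the sample size and correlation levels are arbitrary choices): as the correlation between X1 and X2 rises toward 1, the diagonal entries of (XᵀX)⁻¹, and hence the standard errors, explode.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 100
for rho in (0.0, 0.9, 0.99):
    x1 = rng.normal(size=n)
    # Build x2 so that corr(x1, x2) is approximately rho
    x2 = rho * x1 + np.sqrt(1.0 - rho**2) * rng.normal(size=n)
    X = np.column_stack([np.ones(n), x1, x2])
    a = np.linalg.inv(X.T @ X)          # the a_ij matrix
    # S.E.(beta_i) = s * sqrt(a_ii); the sqrt(a_ii) factor grows with rho
    print(rho, np.sqrt(np.diag(a))[1:])
```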
The Covariance Matrix, the Correlation Matrix and the (XᵀX)⁻¹ Matrix

The Covariance Matrix
$$\begin{bmatrix} [S.E.(\hat{\beta}_0)]^2 & \mathrm{Cov}(\hat{\beta}_0, \hat{\beta}_1) & \cdots & \mathrm{Cov}(\hat{\beta}_0, \hat{\beta}_p) \\ & [S.E.(\hat{\beta}_1)]^2 & \cdots & \mathrm{Cov}(\hat{\beta}_1, \hat{\beta}_p) \\ & & \ddots & \vdots \\ & & & [S.E.(\hat{\beta}_p)]^2 \end{bmatrix}$$
where
$$\mathrm{cov}(\hat{\beta}_i, \hat{\beta}_j) = r_{ij}\, s_{\hat{\beta}_i} s_{\hat{\beta}_j} = r_{ij}\, S.E.(\hat{\beta}_i)\, S.E.(\hat{\beta}_j)$$
and where $r_{ij}$ = the correlation between $\hat{\beta}_i$ and $\hat{\beta}_j$.
The Correlation Matrix
$$\begin{bmatrix} 1 & r_{01} & \cdots & r_{0p} \\ & 1 & \cdots & r_{1p} \\ & & \ddots & \vdots \\ & & & 1 \end{bmatrix}$$
The (XᵀX)⁻¹ Matrix
$$(X^T X)^{-1} = \begin{bmatrix} a_{00} & a_{01} & \cdots & a_{0p} \\ & a_{11} & \cdots & a_{1p} \\ & & \ddots & \vdots \\ & & & a_{pp} \end{bmatrix}$$
If we multiply each entry of the (XᵀX)⁻¹ matrix by s² = MSError, it turns into the covariance matrix of $\hat{\beta}_0, \hat{\beta}_1, \hat{\beta}_2, \ldots, \hat{\beta}_p$. Thus
$$[S.E.(\hat{\beta}_i)]^2 = s^2 a_{ii} \qquad \text{and} \qquad \mathrm{Cov}(\hat{\beta}_i, \hat{\beta}_j) = s^2 a_{ij}.$$
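In code this is one matrix product. A minimal sketch (Python/NumPy, same design-matrix convention as the earlier sketches):

```python
import numpy as np

def coef_covariance(X, y):
    """Covariance matrix of the least squares estimates: s^2 (X'X)^(-1)."""
    n, k = X.shape                               # k = p + 1 columns
    beta_hat = np.linalg.lstsq(X, y, rcond=None)[0]
    resid = y - X @ beta_hat
    s2 = resid @ resid / (n - k)                 # MSError on n - p - 1 d.f.
    a = np.linalg.inv(X.T @ X)                   # entries a_ij
    cov = s2 * a                                 # Cov(beta_i, beta_j) = s^2 a_ij
    se = np.sqrt(np.diag(cov))                   # S.E.(beta_i) = s sqrt(a_ii)
    corr = cov / np.outer(se, se)                # r_ij
    return cov, se, corr
```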
These matrices can be used to compute standard errors for linear combinations of the regression coefficients. Namely, for
$$\hat{L} = c_0 \hat{\beta}_0 + c_1 \hat{\beta}_1 + \cdots + c_p \hat{\beta}_p,$$
$$S.E.(\hat{L}) = s_{\hat{L}} = \sqrt{\sum_{i=0}^{p} c_i^2\, [S.E.(\hat{\beta}_i)]^2 + 2\sum_{i<j} c_i c_j\, \mathrm{cov}(\hat{\beta}_i, \hat{\beta}_j)}$$
$$= \sqrt{\sum_{i=0}^{p} c_i^2 s_{\hat{\beta}_i}^2 + 2\sum_{i<j} c_i c_j\, r_{ij}\, s_{\hat{\beta}_i} s_{\hat{\beta}_j}} \;=\; s\sqrt{\sum_{i=0}^{p} c_i^2 a_{ii} + 2\sum_{i<j} c_i c_j a_{ij}}.$$
For example, if $\hat{L} = \hat{\beta}_i - \hat{\beta}_j$, then
$$S.E.(\hat{\beta}_i - \hat{\beta}_j) = s_{\hat{\beta}_i - \hat{\beta}_j} = \sqrt{[S.E.(\hat{\beta}_i)]^2 + [S.E.(\hat{\beta}_j)]^2 + 2(1)(-1)\,\mathrm{cov}(\hat{\beta}_i, \hat{\beta}_j)}$$
$$= \sqrt{[S.E.(\hat{\beta}_i)]^2 + [S.E.(\hat{\beta}_j)]^2 - 2\,\mathrm{cov}(\hat{\beta}_i, \hat{\beta}_j)} = \sqrt{s_{\hat{\beta}_i}^2 + s_{\hat{\beta}_j}^2 - 2\, r_{ij}\, s_{\hat{\beta}_i} s_{\hat{\beta}_j}} = s\sqrt{a_{ii} + a_{jj} - 2 a_{ij}}.$$
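In matrix form the whole formula is the quadratic form c'Σc, where Σ is the covariance matrix of the estimates. A sketch building on coef_covariance above:

```python
import numpy as np

def se_linear_combination(c, cov):
    """S.E. of L-hat = c_0 b_0 + ... + c_p b_p given Cov(beta-hat).
    The quadratic form c' cov c expands to the sum plus cross terms above."""
    c = np.asarray(c, dtype=float)
    return float(np.sqrt(c @ cov @ c))

# Example: S.E.(beta_1 - beta_2) in a model with intercept plus two slopes
# cov, se, corr = coef_covariance(X, y)
# se_diff = se_linear_combination([0.0, 1.0, -1.0], cov)
```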
An Example
Suppose one is interested in how the cost per month (Y) of heating a plant is determined by the average atmospheric temperature in the month (X1) and the number of operating days in the month (X2). The data on these variables were collected for n = 25 months selected at random and are given on the following page.
Y = cost per month of heating the plant
X1 = average atmospheric temperature in the month
X2 = the number of operating days for the plant in the month
Month   Y      X1     X2
1       1098   35.3   20
2       1113   29.7   20
3       1251   30.8   23
4       840    58.8   20
5       927    61.4   21
6       873    71.3   22
7       636    74.4   11
8       850    76.7   23
9       782    70.7   21
10      914    57.5   20
11      824    46.4   20
12      1219   28.9   21
13      1188   28.1   21
14      957    39.1   19
15      1094   46.8   23
16      958    48.5   20
17      1009   59.3   22
18      811    70.0   22
19      683    70.0   22
20      888    74.5   23
21      768    72.1   20
22      847    58.1   21
23      886    44.6   20
24      1036   33.4   20
25      1108   28.6

The Least Squares Estimates:

           Estimate   Standard Error
Constant   912.6      110.28
X1         -7.24      0.80
X2         20.29      4.577
The Covariance Matrix

           Constant   X1        X2
Constant   12162
X1         -49.203    .63390
X2         -464.36    .76796    20.947

The Correlation Matrix

           Constant   X1        X2
Constant   1.000      -.1764    -.0920
X1                    1.000     .0210
X2                              1.000
The (XᵀX)⁻¹ Matrix

           Constant    X1             X2
Constant   2.778747
X1         -0.011242   0.14207×10⁻³
X2         -0.106098   0.175467×10⁻³  0.478599×10⁻²
The Analysis of Variance Table

Source       df    SS       MS       F
Regression   2     541871   270936   61.899
Error        22    96287    4377
Total        24    638158
Summary Statistics (R², Radjusted² = Ra², and R)
R² = 541871/638158 = .8491
(explained variance in Y: 84.91%)
Ra² = 1 - [1 - R²][(n-1)/(n-p-1)]
= 1 - [1 - .8491][24/22]
= .8354 (83.54%)
R = √.8491 = .9215
= the multiple correlation coefficient
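As a consistency check, the pieces of the three tables above tie together:
$$F = \frac{MS_{Reg}}{s^2} = \frac{270936}{4377} = 61.9,$$
$$S.E.(\hat{\beta}_0) = \sqrt{12162} = 110.28, \quad S.E.(\hat{\beta}_1) = \sqrt{0.63390} = 0.80, \quad S.E.(\hat{\beta}_2) = \sqrt{20.947} = 4.577,$$
the diagonal of the covariance matrix reproducing the standard errors in the estimates table.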
[Scatterplot: COST versus TEMP]
[Scatterplot: COST versus DAYS]
[Three-dimensional scatterplot of Cost, Temp and Days]
Example
Motor Vehicle example
Variables
1. (Y) mpg – Mileage
2. (X1) engine – Engine size.
3. (X2) horse – Horsepower.
4. (X3) weight – Weight.
Select Analyze->Regression->Linear.
To print the correlation matrix or the covariance matrix of the estimates, select Statistics and check the box for the covariance matrix of the estimates.
Here is the table giving the estimates and their
standard errors.
Coefficients(a)

Model 1      Unstandardized B   Std. Error   Standardized Beta   t        Sig.
(Constant)   44.015             1.272                            34.597   .000
ENGINE       -5.53E-03          .007         -.074               -.786    .432
HORSE        -5.56E-02          .013         -.273               -4.153   .000
WEIGHT       -4.62E-03          .001         -.504               -6.186   .000

a. Dependent Variable: MPG
Here is the table giving the correlation matrix
and covariance matrix of the regression
estimates:
Coefficient Correlations(a)

Model 1
Correlations     WEIGHT      HORSE       ENGINE
  WEIGHT         1.000       -.129       -.725
  HORSE          -.129       1.000       -.518
  ENGINE         -.725       -.518       1.000
Covariances      WEIGHT      HORSE       ENGINE
  WEIGHT         5.571E-07   -1.29E-06   -3.81E-06
  HORSE          -1.29E-06   1.794E-04   -4.88E-05
  ENGINE         -3.81E-06   -4.88E-05   4.941E-05

a. Dependent Variable: MPG
What is missing in SPSS is the covariances and correlations with the intercept estimate (the constant). These can be found by using the following trick:
1. Introduce a new variable (called constnt).
2. The new "variable" takes on the value 1 for all cases.
Select Transform->Compute. In the dialogue box that appears, type in the name of the target variable - constnt - and type in '1' for the Numeric Expression. This variable is now added to the data file.
Add this new variable (constnt) to the list of independent variables.
Under Options make sure the box – Include constant in equation – is unchecked.
The coefficient of the new variable will be the constant.
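Outside SPSS the trick is unnecessary: when the design matrix carries a column of ones, s²(XᵀX)⁻¹ already contains the intercept's row and column. A hedged sketch (Python/NumPy; the arrays engine, horse, weight, mpg are assumed to hold the data):

```python
import numpy as np

def full_coef_covariance(engine, horse, weight, mpg):
    """Covariance matrix of all four estimates, intercept included."""
    # The column of ones plays exactly the role of the 'constnt' variable
    X = np.column_stack([np.ones_like(mpg), engine, horse, weight])
    b = np.linalg.lstsq(X, mpg, rcond=None)[0]
    resid = mpg - X @ b
    s2 = resid @ resid / (len(mpg) - X.shape[1])   # MSError
    return s2 * np.linalg.inv(X.T @ X)             # 4 x 4, incl. the constant
```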
Here are the estimates of the parameters with
their standard errors
Coefficients(a,b)

Model 1   Unstandardized B   Std. Error   Standardized Beta   t        Sig.
ENGINE    -5.53E-03          .007         -.049               -.786    .432
HORSE     -5.56E-02          .013         -.250               -4.153   .000
WEIGHT    -4.62E-03          .001         -.577               -6.186   .000
CONSTNT   44.015             1.272        1.781               34.597   .000

a. Dependent Variable: MPG
b. Linear Regression through the Origin
Note the agreement of the parameter estimates and their standard errors with those previously calculated.
Here is the correlation matrix and the
covariance matrix of the estimates.
Coefficient Correlations(a,b)

Model 1
Correlations     CONSTNT      ENGINE      HORSE       WEIGHT
  CONSTNT        1.000        .761        -.318       -.824
  ENGINE         .761         1.000       -.518       -.725
  HORSE          -.318        -.518       1.000       -.129
  WEIGHT         -.824        -.725       -.129       1.000
Covariances      CONSTNT      ENGINE      HORSE       WEIGHT
  CONSTNT        1.619        6.808E-03   -5.427E-03  -7.821E-04
  ENGINE         6.808E-03    4.941E-05   -4.88E-05   -3.81E-06
  HORSE          -5.427E-03   -4.88E-05   1.794E-04   -1.29E-06
  WEIGHT         -7.821E-04   -3.81E-06   -1.29E-06   5.571E-07

a. Dependent Variable: MPG
b. Linear Regression through the Origin
Testing Hypotheses related to Multiple Regression
The General Linear Hypothesis
H0:
h11β1 + h12β2 + h13β3 + ... + h1pβp = h1
h21β1 + h22β2 + h23β3 + ... + h2pβp = h2
...
hq1β1 + hq2β2 + hq3β3 + ... + hqpβp = hq
where h11, h12, h13, ..., hqp and h1, h2, h3, ..., hq are known coefficients.
Examples
1. H0: β1 = 0
2. H0: β1 = 0, β2 = 0, β3 = 0
3. H0: β1 = β2
4. H0: β1 = β2, β3 = β4
5. H0: β1 = (1/2)(β2 + β3)
6. H0: β1 = (1/2)(β2 + β3), β3 = (1/3)(β4 + β5 + β6)
The Complete Model
Y = β0 + β1X1 + β2X2 + β3X3 + ... + βpXp + e
The Reduced Model
The model implied by H0.
You are interested in knowing whether the complete model can be simplified to the reduced model.
Testing the General Linear Hypothesis
The F-test for H0 is performed by carrying out two runs of a multiple regression package.

Run 1: Fit the complete model, resulting in the following ANOVA table:

Source             df       Sum of Squares
Regression         p        SSReg
Residual (Error)   n-p-1    SSError
Total              n-1      SSTotal
Run 2: Fit the reduced model (q parameters eliminated), resulting in the following ANOVA table:

Source             df         Sum of Squares
Regression         p-q        SS1Reg
Residual (Error)   n-p+q-1    SS1Error
Total              n-1        SSTotal
The Test:
The test is carried out using the test statistic
$$F = \frac{\frac{1}{q}\,[\text{Reduction in the Residual Sum of Squares}]}{\text{Residual Mean Square for the Complete Model}} = \frac{\frac{1}{q}\, SS_{H_0}}{s^2}$$
where SSH0 = SS1Error - SSError = SSReg - SS1Reg and s² = SSError/(n - p - 1).
The test statistic, F, has an F-distribution with ν1 = q d.f. in the numerator and ν2 = n - p - 1 d.f. in the denominator if H0 is true.
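The two-run recipe translates directly to code. A minimal sketch (Python/NumPy/SciPy; the caller supplies design matrices for the complete and reduced models):

```python
import numpy as np
from scipy import stats

def general_linear_hypothesis_test(X_full, X_reduced, y):
    """F-test of H0 via two least squares runs (complete vs reduced model)."""
    def rss(X):
        b = np.linalg.lstsq(X, y, rcond=None)[0]
        r = y - X @ b
        return r @ r

    rss_full = rss(X_full)                        # SSError for complete model
    rss_reduced = rss(X_reduced)                  # SS1Error for reduced model
    df_error = len(y) - X_full.shape[1]           # n - p - 1
    q = X_full.shape[1] - X_reduced.shape[1]      # parameters eliminated

    ss_h0 = rss_reduced - rss_full                # reduction in residual SS
    F = (ss_h0 / q) / (rss_full / df_error)       # (SS_H0 / q) / s^2
    return F, stats.f.sf(F, q, df_error)          # statistic and p-value
```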
[Figure: density of the F-distribution when H0 is true; the critical region lies to the right of Fα(q, n - p - 1)]

The Critical Region
Reject H0 if F > Fα(q, n - p - 1).
The ANOVA Table for the Test:

Source                    df      Sum of Squares   Mean Square        F
Regression                p-q     SS1Reg           [1/(p-q)]SS1Reg    MS1Reg/s²
(for the reduced model)
Departure from H0         q       SSH0             (1/q)SSH0          MSH0/s²
Residual (Error)          n-p-1   SSError          s²
Total                     n-1     SSTotal
Some Examples:
Four independent variables: X1, X2, X3, X4
The Complete Model
Y = β0 + β1X1 + β2X2 + β3X3 + β4X4 + e
1) a) H0: β3 = 0, β4 = 0 (q = 2)
b) The Reduced Model:
Y = β0 + β1X1 + β2X2 + e
Dependent Variable: Y
Independent Variables: X1, X2
2) a) H0: β3 = 4.5, β4 = 8.0 (q = 2)
b) The Reduced Model:
Y – 4.5X3 – 8.0X4 = β0 + β1X1 + β2X2 + e
Dependent Variable: Y – 4.5X3 – 8.0X4
Independent Variables: X1, X2
Example
Motor Vehicle example
Variables
1. (Y) mpg – Mileage.
2. (X1) engine – Engine size.
3. (X2) horse – Horsepower.
4. (X3) weight – Weight.
Suppose we want to test:
H0: β1 = 0 against HA: β1 ≠ 0,
i.e. engine size (engine) has no effect on mileage (mpg).
The Full model:
Y = β0 + β1X1 + β2X2 + β3X3 + e
(mpg) (engine) (horse) (weight)
The reduced model:
Y = β0 + β2X2 + β3X3 + e
The ANOVA Table for the Full model:

ANOVA(b)
Model 1      Sum of Squares   df    Mean Square   F         Sig.
Regression   16098.158        3     5366.053      269.664   .000(a)
Residual     7720.836         388   19.899
Total        23818.993        391

a. Predictors: (Constant), WEIGHT, HORSE, ENGINE
b. Dependent Variable: MPG
The ANOVA Table for the Reduced model:

ANOVA(b)
Model 1      Sum of Squares   df    Mean Square   F         Sig.
Regression   16085.855        2     8042.928      404.583   .000(a)
Residual     7733.138         389   19.880
Total        23818.993        391

a. Predictors: (Constant), WEIGHT, HORSE
b. Dependent Variable: MPG
The reduction in the residual sum of squares
= 7733.138452 - 7720.835649 = 12.30280251
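Dividing this reduction (on q = 1 d.f.) by s² from the full model gives the F statistic reported in the table below:
$$F = \frac{SS_{H_0}/q}{s^2} = \frac{12.30280251/1}{19.89906095} = 0.618,$$
which, with (1, 388) d.f., has p-value 0.4322. As a check, this agrees with the t-test for ENGINE in the coefficients table: t = -.786 and (-.786)² = .618.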
The ANOVA Table for testing H0: β1 = 0 against HA: β1 ≠ 0:

Source       Sum of Squares   df    Mean Square   F           Sig.
Regression   16085.85502      2     8042.927509   404.18628   0.0000
β1 = 0       12.30280251      1     12.30280251   0.6182605   0.4322
Residual     7720.835649      388   19.89906095
Total        23818.99347      391
Now suppose we want to test:
H0: β1 = 0, β2 = 0 against HA: β1 ≠ 0 or β2 ≠ 0,
i.e. engine size (engine) and horsepower (horse) have no effect on mileage (mpg).
The Full model:
Y = β0 + β1X1 + β2X2 + β3X3 + e
(mpg) (engine) (horse) (weight)
The reduced model:
Y = β0 + β3X3 + e
The ANOVA Table for the Full model:

ANOVA(b)
Model 1      Sum of Squares   df    Mean Square   F         Sig.
Regression   16098.158        3     5366.053      269.664   .000(a)
Residual     7720.836         388   19.899
Total        23818.993        391

a. Predictors: (Constant), WEIGHT, HORSE, ENGINE
b. Dependent Variable: MPG
The ANOVA Table for the Reduced model:

ANOVA(b)
Model 1      Sum of Squares   df    Mean Square   F         Sig.
Regression   15519.970        1     15519.970     729.337   .000(a)
Residual     8299.023         390   21.280
Total        23818.993        391

a. Predictors: (Constant), WEIGHT
b. Dependent Variable: MPG
The reduction in the residual sum of squares
= 8299.023 - 7720.835649 = 578.1875392
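The F statistic for the table below is then
$$F = \frac{SS_{H_0}/q}{s^2} = \frac{578.1875392/2}{19.89906095} = \frac{289.0937696}{19.89906095} = 14.53,$$
which, with (2, 388) d.f., lies far beyond the 5% critical value (roughly 3.0), so H0 is rejected.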
The ANOVA Table for testing H0: β1 = 0, β2 = 0 against HA: β1 ≠ 0 or β2 ≠ 0:

Source           Sum of Squares   df    Mean Square   F           Sig.
Regression       15519.97028      1     15519.97028   779.93481   0.0000
β1 = 0, β2 = 0   578.1875392      2     289.0937696   14.528011   0.0000
Residual         7720.835649      388   19.89906095
Total            23818.99347      391
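For reference, both nested-model tests above can be reproduced with the statsmodels package rather than by hand. A hedged sketch: the file name cars.csv and the lower-case column names are assumptions about how the data are stored.

```python
import pandas as pd
from statsmodels.formula.api import ols
from statsmodels.stats.anova import anova_lm

cars = pd.read_csv("cars.csv")   # hypothetical file: mpg, engine, horse, weight

full     = ols("mpg ~ engine + horse + weight", data=cars).fit()
reduced1 = ols("mpg ~ horse + weight", data=cars).fit()   # H0: beta_1 = 0
reduced2 = ols("mpg ~ weight", data=cars).fit()           # H0: beta_1 = beta_2 = 0

print(anova_lm(reduced1, full))   # should reproduce F = 0.618, Sig. = 0.4322
print(anova_lm(reduced2, full))   # should reproduce F = 14.528, Sig. = 0.0000
```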