Transcript Chapter 12

Chapter Twelve
Multiple Regression
and Model Building
McGraw-Hill/Irwin
Copyright © 2004 by The McGraw-Hill Companies, Inc. All rights reserved.
Multiple Regression
12.1 The Linear Regression Model
12.2 The Least Squares Estimates and Prediction
12.3 The Mean Squared Error and the Standard Error
12.4 Model Utility: R², Adjusted R², and the F Test
12.5 Testing the Significance of an Independent Variable
12.6 Confidence Intervals and Prediction Intervals
12.7 Dummy Variables
12.8 Model Building and the Effects of Multicollinearity
12.9 Residual Analysis in Multiple Regression
12.1 The Linear Regression Model
The linear regression model relating y to x1, x2, …, xk is

  y = μ_{y|x1, x2, …, xk} + ε = β0 + β1x1 + β2x2 + … + βkxk + ε

where
  μ_{y|x1, x2, …, xk} = β0 + β1x1 + β2x2 + … + βkxk is the mean value of the dependent variable y when the values of the independent variables are x1, x2, …, xk.
  β0, β1, β2, …, βk are the regression parameters relating the mean value of y to x1, x2, …, xk.
  ε is an error term that describes the effects on y of all factors other than the independent variables x1, x2, …, xk.
Example: The Linear Regression Model
Example 12.1: The Fuel Consumption Case
Week   Average Hourly Temperature, x1 (°F)   Chill Index, x2   Fuel Consumption, y (MMcf)
1      28.0                                  18                12.4
2      28.0                                  14                11.7
3      32.5                                  24                12.4
4      39.0                                  22                10.8
5      45.9                                  8                 9.4
6      57.8                                  16                9.5
7      58.1                                  1                 8.0
8      62.5                                  0                 7.5

  y = β0 + β1x1 + β2x2 + ε
The Linear Regression Model Illustrated
Example 12.1: The Fuel Consumption Case
The Regression Model Assumptions
Model
  y = μ_{y|x1, x2, …, xk} + ε = β0 + β1x1 + β2x2 + … + βkxk + ε

Assumptions about the model error terms, the ε's:
Mean Zero: The mean of the error terms is equal to 0.
Constant Variance: The variance of the error terms, σ², is the same for every combination of values of x1, x2, …, xk.
Normality: The error terms follow a normal distribution for every combination of values of x1, x2, …, xk.
Independence: The values of the error terms are statistically independent of each other.
12.2 Least Squares Estimates and Prediction
Estimation/Prediction Equation:

  ŷ = b0 + b1x01 + b2x02 + … + bkx0k

ŷ is the point estimate of the mean value of the dependent variable when the values of the independent variables are x01, x02, …, x0k. It is also the point prediction of an individual value of the dependent variable when the values of the independent variables are x01, x02, …, x0k.
b1, b2, …, bk are the least squares point estimates of the parameters β1, β2, …, βk.
x01, x02, …, x0k are specified values of the independent predictor variables x1, x2, …, xk.
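As a concrete illustration, here is a minimal Python sketch (numpy assumed available; the variable names are ours, not the text's) that computes the least squares point estimates for the fuel consumption data of Example 12.1:

```python
# Least squares point estimates for the fuel consumption data (Example 12.1).
import numpy as np

x1 = np.array([28.0, 28.0, 32.5, 39.0, 45.9, 57.8, 58.1, 62.5])  # avg hourly temperature (°F)
x2 = np.array([18.0, 14.0, 24.0, 22.0, 8.0, 16.0, 1.0, 0.0])     # chill index
y  = np.array([12.4, 11.7, 12.4, 10.8, 9.4, 9.5, 8.0, 7.5])      # fuel consumption (MMcf)

X = np.column_stack([np.ones_like(x1), x1, x2])   # design matrix with an intercept column
b, *_ = np.linalg.lstsq(X, y, rcond=None)         # least squares estimates b0, b1, b2
print(b)        # approx [13.1087, -0.0900, 0.0825], matching the Minitab output below

y_hat = np.array([1.0, 40.0, 10.0]) @ b           # point estimate/prediction at Temp = 40, Chill = 10
print(y_hat)    # approx 10.333
```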
Example: Least Squares Estimation
Example 12.3: The Fuel Consumption Case
Minitab Output

FuelCons = 13.1 - 0.0900 Temp + 0.0825 Chill

Predictor   Coef      StDev     T       P
Constant    13.1087   0.8557    15.32   0.000
Temp        -0.09001  0.01408   -6.39   0.001
Chill       0.08249   0.02200   3.75    0.013

S = 0.3671    R-Sq = 97.4%    R-Sq(adj) = 96.3%

Analysis of Variance
Source          DF   SS       MS       F       P
Regression      2    24.875   12.438   92.30   0.000
Residual Error  5    0.674    0.135
Total           7    25.549

Predicted Values (Temp = 40, Chill = 10)
Fit      StDev Fit   95.0% CI           95.0% PI
10.333   0.170       (9.895, 10.771)    (9.293, 11.374)
Example: Point Predictions and Residuals
Example 12.3: The Fuel Consumption Case
Week   Temperature, x1 (°F)   Chill Index, x2   Observed y (MMcf)   Predicted ŷ = 13.1087 - .0900x1 + .0825x2   Residual e = y - ŷ
1      28.0                   18                12.4                12.0733                                     0.3267
2      28.0                   14                11.7                11.7433                                     -0.0433
3      32.5                   24                12.4                12.1631                                     0.2369
4      39.0                   22                10.8                11.4131                                     -0.6131
5      45.9                   8                 9.4                 9.6372                                      -0.2372
6      57.8                   16                9.5                 9.2260                                      0.2740
7      58.1                   1                 8.0                 7.9616                                      0.0384
8      62.5                   0                 7.5                 7.4831                                      0.0169
12.3 Mean Square Error and Standard Error
  SSE = Σ eᵢ² = Σ (yᵢ - ŷᵢ)²    Sum of Squared Errors

  s² = MSE = SSE / (n - (k+1))    Mean Square Error, the point estimate of the residual variance σ²

  s = √MSE = √(SSE / (n - (k+1)))    Standard Error, the point estimate of the residual standard deviation σ
Example 12.3: The Fuel Consumption Case

Analysis of Variance
Source          DF   SS       MS       F       P
Regression      2    24.875   12.438   92.30   0.000
Residual Error  5    0.674    0.135
Total           7    25.549

  s² = MSE = SSE / (n - (k+1)) = 0.674 / (8 - 3) = 0.1348
  s = √s² = √0.1348 = 0.3671
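A quick check of these numbers in Python (a sketch; the residuals are taken from the table in Example 12.3):

```python
import numpy as np

resid = np.array([0.3267, -0.0433, 0.2369, -0.6131, -0.2372, 0.2740, 0.0384, 0.0169])
n, k = 8, 2
sse = np.sum(resid ** 2)       # sum of squared errors, approx 0.674
mse = sse / (n - (k + 1))      # s^2, approx 0.1348
s = np.sqrt(mse)               # standard error, approx 0.3671
print(sse, mse, s)
```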
12.4 Model Utility: Multiple Coefficient of Determination, R²
The multiple coefficient of determination R² is

  R² = Explained variation / Total variation

R² is the proportion of the total variation in y explained by the linear regression model.

  Total variation = Explained variation + Unexplained variation
  Total variation = Σ (yᵢ - ȳ)²    Total Sum of Squares (SSTO)
  Explained variation = Σ (ŷᵢ - ȳ)²    Regression Sum of Squares (SSR)
  Unexplained variation = Σ (yᵢ - ŷᵢ)²    Error Sum of Squares (SSE)

  Multiple correlation coefficient: R = √R²
12.4 Model Utility: Adjusted R²
The adjusted multiple coefficient of determination is

  R̄² = (R² - k/(n-1)) · ((n-1) / (n-(k+1)))

Fuel Consumption Case:

S = 0.3671    R-Sq = 97.4%    R-Sq(adj) = 96.3%

Analysis of Variance
Source          DF   SS       MS       F       P
Regression      2    24.875   12.438   92.30   0.000
Residual Error  5    0.674    0.135
Total           7    25.549

  R² = 24.875 / 25.549 = 0.974
  R̄² = (0.974 - 2/(8-1)) · ((8-1) / (8-(2+1))) = 0.963
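The same arithmetic as a Python sketch (sums of squares taken from the ANOVA table above):

```python
ssto, sse = 25.549, 0.674      # total and error sums of squares from the ANOVA table
n, k = 8, 2
r2 = (ssto - sse) / ssto                                 # R^2, approx 0.974
r2_adj = (r2 - k / (n - 1)) * ((n - 1) / (n - (k + 1)))  # adjusted R^2, approx 0.963
print(r2, r2_adj)
```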
12.4 Model Utility: F Test for the Linear Regression Model
To test H0: β1 = β2 = … = βk = 0 versus
Ha: At least one of β1, β2, …, βk is not equal to 0

Test Statistic:

  F(model) = (Explained variation / k) / (Unexplained variation / [n - (k+1)])

Reject H0 in favor of Ha if:
  F(model) > Fα, or
  p-value < α

Fα is based on k numerator and n-(k+1) denominator degrees of freedom.
Example: F Test for Linear Regression
Example 12.5: The Fuel Consumption Case — Minitab Output

Analysis of Variance
Source          DF   SS       MS       F       P
Regression      2    24.875   12.438   92.30   0.000
Residual Error  5    0.674    0.135
Total           7    25.549

Test Statistic:

  F(model) = (Explained variation / k) / (Unexplained variation / [n - (k+1)])
           = (24.875 / 2) / (0.674 / (8 - 3)) = 92.30

F test at the α = 0.05 level of significance: reject H0, since

  F(model) = 92.30 > 5.79 = F.05 and
  p-value = 0.000 < 0.05 = α

Fα is based on 2 numerator and 5 denominator degrees of freedom.
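A sketch of the same F test in Python (scipy assumed available for the F distribution):

```python
from scipy import stats

expl, unexpl = 24.875, 0.674     # explained and unexplained variation from the ANOVA table
n, k = 8, 2
df1, df2 = k, n - (k + 1)
f_model = (expl / df1) / (unexpl / df2)     # approx 92.30
f_crit = stats.f.ppf(0.95, df1, df2)        # F.05 with (2, 5) df, approx 5.79
p_value = stats.f.sf(f_model, df1, df2)     # far below 0.05; Minitab reports 0.000
print(f_model, f_crit, p_value)
```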
12.5 Testing the Significance of an Independent Variable
If the regression assumptions hold, we can reject H0: βj = 0 at the α level of significance (probability of a Type I error equal to α) if and only if the appropriate rejection point condition holds or, equivalently, if the corresponding p-value is less than α.

Alternative     Reject H0 if:                  p-Value
Ha: βj > 0      t > tα                         Area under the t distribution to the right of t
Ha: βj < 0      t < -tα                        Area under the t distribution to the left of t
Ha: βj ≠ 0      |t| > tα/2, that is,           Twice the area under the t distribution
                t > tα/2 or t < -tα/2          to the right of |t|

Test Statistic:

  t = bj / s_bj

100(1-α)% Confidence Interval for βj:

  [bj ± tα/2 s_bj]

tα, tα/2 and p-values are based on n-(k+1) degrees of freedom.
Example: Testing and Estimation for βj
Example 12.6: The Fuel Consumption Case — Minitab Output

Predictor   Coef      StDev     T       P
Constant    13.1087   0.8557    15.32   0.000
Temp        -0.09001  0.01408   -6.39   0.001
Chill       0.08249   0.02200   3.75    0.013

Test:

  t = b2 / s_b2 = 0.08249 / 0.02200 = 3.75 > 2.571 = t.025
  p-value = 2 · P(t > 3.75) = 0.013

Interval:

  [b2 ± tα/2 s_b2] = [0.08249 ± (2.571)(0.02200)] = [0.08249 ± 0.05656] = [0.02593, 0.13905]

Chill is significant at the α = 0.05 level, but not at the α = 0.01 level.

tα, tα/2 and p-values are based on 5 degrees of freedom.
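The same test and interval as a Python sketch (scipy assumed available):

```python
from scipy import stats

b2, s_b2, df = 0.08249, 0.02200, 5       # Chill coefficient, its standard error, n-(k+1)
t = b2 / s_b2                            # approx 3.75
p_value = 2 * stats.t.sf(abs(t), df)     # two-sided p-value, approx 0.013
t_crit = stats.t.ppf(0.975, df)          # t.025 with 5 df, approx 2.571
print(t, p_value, (b2 - t_crit * s_b2, b2 + t_crit * s_b2))   # interval approx (0.026, 0.139)
```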
12.6 Confidence and Prediction Intervals
Prediction equation:

  ŷ = b0 + b1x01 + b2x02 + … + bkx0k

If the regression assumptions hold,

100(1-α)% confidence interval for the mean value of y:

  [ŷ ± tα/2 s √(Distance value)]

100(1-α)% prediction interval for an individual value of y:

  [ŷ ± tα/2 s √(1 + Distance value)]

(Computing the distance value requires matrix algebra.)

tα/2 is based on n-(k+1) degrees of freedom.
Example: Confidence and Prediction Intervals
Example 12.9: The Fuel Consumption Case — Minitab Output

FuelCons = 13.1 - 0.0900 Temp + 0.0825 Chill

Predicted Values (Temp = 40, Chill = 10)
Fit      StDev Fit   95.0% CI           95.0% PI
10.333   0.170       (9.895, 10.771)    (9.293, 11.374)

95% Confidence Interval:

  [ŷ ± tα/2 s √(Distance value)]
  = [10.333 ± (2.571)(0.3671) √0.2144515]
  = [10.333 ± 0.438] = [9.895, 10.771]

95% Prediction Interval:

  [ŷ ± tα/2 s √(1 + Distance value)]
  = [10.333 ± (2.571)(0.3671) √(1 + 0.2144515)]
  = [10.333 ± 1.041] = [9.292, 11.374]
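Statistical software handles the distance value internally. Here is a sketch using statsmodels (assumed installed) that reproduces both intervals:

```python
import numpy as np
import statsmodels.api as sm

x1 = np.array([28.0, 28.0, 32.5, 39.0, 45.9, 57.8, 58.1, 62.5])
x2 = np.array([18.0, 14.0, 24.0, 22.0, 8.0, 16.0, 1.0, 0.0])
y  = np.array([12.4, 11.7, 12.4, 10.8, 9.4, 9.5, 8.0, 7.5])

fit = sm.OLS(y, sm.add_constant(np.column_stack([x1, x2]))).fit()
pred = fit.get_prediction([[1.0, 40.0, 10.0]])    # Temp = 40, Chill = 10
print(pred.predicted_mean)        # approx 10.333
print(pred.conf_int())            # 95% CI for the mean value, approx (9.895, 10.771)
print(pred.conf_int(obs=True))    # 95% PI for an individual value, approx (9.293, 11.374)
```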
12.7 Dummy Variables
Example 12.11 The Electronics World Case
Store   Number of Households, x   Location   Location Dummy, DM   Sales Volume, y
1       161                       Street     0                    157.27
2       99                        Street     0                    93.28
3       135                       Street     0                    136.81
4       120                       Street     0                    123.79
5       164                       Street     0                    153.51
6       221                       Mall       1                    241.74
7       179                       Mall       1                    201.54
8       204                       Mall       1                    206.71
9       214                       Mall       1                    229.78
10      101                       Mall       1                    135.22

Location Dummy Variable:
  DM = 1 if a store is in a mall location, 0 otherwise
Example: Regression with a Dummy Variable
Example 12.11: The Electronics World Case
Minitab Output
Sales = 17.4 + 0.851 Households + 29.2 DM

Predictor   Coef      StDev     T       P
Constant    17.360    9.447     1.84    0.109
Househol    0.85105   0.06524   13.04   0.000
DM          29.216    5.594     5.22    0.001

S = 7.329    R-Sq = 98.3%    R-Sq(adj) = 97.8%

Analysis of Variance
Source          DF   SS      MS      F        P
Regression      2    21412   10706   199.32   0.000
Residual Error  7    376     54
Total           9    21788
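A minimal numpy sketch of the same dummy-variable fit (variable names are ours):

```python
import numpy as np

households = np.array([161, 99, 135, 120, 164, 221, 179, 204, 214, 101], dtype=float)
dm = np.array([0, 0, 0, 0, 0, 1, 1, 1, 1, 1], dtype=float)   # 1 = mall, 0 = street
sales = np.array([157.27, 93.28, 136.81, 123.79, 153.51,
                  241.74, 201.54, 206.71, 229.78, 135.22])

X = np.column_stack([np.ones(10), households, dm])
b, *_ = np.linalg.lstsq(X, sales, rcond=None)
print(b)   # approx [17.360, 0.851, 29.216]; the DM coefficient estimates the mall-vs-street
           # difference in mean sales volume for a fixed number of households
```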
12.8 Model Building and the Effects of Multicollinearity
Example: The Sales Territory Performance Case
Sales     Time     MktPoten   Adver      MktShare   Change   Accts    WkLoad   Rating
3669.88   43.10    74065.11   4582.88    2.51       0.34     74.86    15.05    4.9
3473.95   108.13   58117.30   5539.78    5.51       0.15     107.32   19.97    5.1
2295.10   13.82    21118.49   2950.38    10.91      -0.72    96.75    17.34    2.9
4675.56   186.18   68521.27   2243.07    8.27       0.17     195.12   13.40    3.4
6125.96   161.79   57805.11   7747.08    9.15       0.50     180.44   17.64    4.6
2134.94   8.94     37806.94   402.44     5.51       0.15     104.88   16.22    4.5
5031.66   365.04   50935.26   3140.62    8.54       0.55     256.10   18.80    4.6
3367.45   220.32   35602.08   2086.16    7.07       -0.49    126.83   19.86    2.3
6519.45   127.64   46176.77   8846.25    12.54      1.24     203.25   17.42    4.9
4876.37   105.69   42053.24   5673.11    8.85       0.31     119.51   21.41    2.8
2468.27   57.72    36829.71   2761.76    5.38       0.37     116.26   16.32    3.1
2533.31   23.58    33612.67   1991.85    5.43       -0.65    142.28   14.51    4.2
2408.11   13.82    21412.79   1971.52    8.48       0.64     89.43    19.35    4.3
2337.38   13.82    20416.87   1737.38    7.80       1.01     84.55    20.02    4.2
4586.95   86.99    36272.00   10694.20   10.34      0.11     119.51   15.26    5.5
2729.24   165.85   23093.26   8618.61    5.15       0.04     80.49    15.87    3.6
3289.40   116.26   26879.59   7747.89    6.64       0.68     136.58   7.81     3.4
2800.78   42.28    39571.96   4565.81    5.45       0.66     78.86    16.00    4.2
3264.20   52.84    51866.15   6022.70    6.31       -0.10    136.58   17.44    3.6
3453.62   165.04   58749.82   3721.10    6.35       -0.03    138.21   17.98    3.1
1741.45   10.57    23990.82   860.97     7.37       -1.63    75.61    20.99    1.6
2035.75   13.82    25694.86   3571.51    8.39       -0.43    102.44   21.66    3.4
1578.00   8.13     23736.35   2845.50    5.15       0.04     76.42    21.46    2.7
4167.44   58.54    34314.29   5060.11    12.88      0.22     136.58   24.78    2.8
2799.97   21.14    22809.53   3552.00    9.14       -0.74    88.62    24.96    3.9
Correlation Matrix
Example: The Sales Territory Performance Case
Multicollinearity
Multicollinearity refers to the condition where the independent variables (or predictors) in a model are dependent on, related to, or correlated with each other.

Effects
  Hinders our ability to use the bj's, t statistics, and p-values to assess the relative importance of predictors.
  Does not hinder our ability to predict the dependent (or response) variable.

Detection
  Scatter plot matrix
  Correlation matrix
  Variance inflation factors (VIF) — see the sketch after this list
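For VIFs, a minimal Python sketch (numpy assumed; VIF_j = 1/(1 - R_j²), where R_j² comes from regressing predictor j on the other predictors):

```python
import numpy as np

def vif(X):
    """Variance inflation factor for each column of X (predictor columns only, no intercept)."""
    n, p = X.shape
    factors = []
    for j in range(p):
        xj = X[:, j]
        others = np.column_stack([np.ones(n), np.delete(X, j, axis=1)])
        coef, *_ = np.linalg.lstsq(others, xj, rcond=None)   # regress x_j on the rest
        resid = xj - others @ coef
        r2_j = 1.0 - (resid @ resid) / np.sum((xj - xj.mean()) ** 2)
        factors.append(1.0 / (1.0 - r2_j))   # a large VIF (a common rule of thumb: > 10) flags trouble
    return factors
```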
12.9 Residual Analysis in Multiple Regression
For an observed value yᵢ, the residual is

  eᵢ = yᵢ - ŷᵢ = yᵢ - (b0 + b1xi1 + … + bkxik)

If the regression assumptions hold, the residuals should look like a random sample from a normal distribution with mean 0 and variance σ².
Residual Plots (a plotting sketch follows this list)
Residuals versus each independent variable
Residuals versus predicted y’s
Residuals in time order (if the response is a time series)
Histogram of residuals
Normal plot of the residuals
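A matplotlib sketch of several of these plots for the fuel consumption residuals (values taken from the residual table in Example 12.3):

```python
import numpy as np
import matplotlib.pyplot as plt

x1 = np.array([28.0, 28.0, 32.5, 39.0, 45.9, 57.8, 58.1, 62.5])   # temperature
x2 = np.array([18.0, 14.0, 24.0, 22.0, 8.0, 16.0, 1.0, 0.0])      # chill index
pred = np.array([12.0733, 11.7433, 12.1631, 11.4131, 9.6372, 9.2260, 7.9616, 7.4831])
resid = np.array([0.3267, -0.0433, 0.2369, -0.6131, -0.2372, 0.2740, 0.0384, 0.0169])

fig, axes = plt.subplots(2, 2, figsize=(8, 6))
axes[0, 0].scatter(x1, resid); axes[0, 0].set_title("Residuals vs Temp")
axes[0, 1].scatter(x2, resid); axes[0, 1].set_title("Residuals vs Chill")
axes[1, 0].scatter(pred, resid); axes[1, 0].set_title("Residuals vs predicted y")
axes[1, 1].hist(resid, bins=5); axes[1, 1].set_title("Histogram of residuals")
for ax in axes.flat[:3]:
    ax.axhline(0.0, linestyle="--")    # residuals should scatter randomly about 0
plt.tight_layout()
plt.show()
```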
Multiple Regression
Summary:
12.1 The Linear Regression Model
12.2 The Least Squares Estimates and Prediction
12.3 The Mean Squared Error and the Standard Error
12.4 Model Utility: R², Adjusted R², and the F Test
12.5 Testing the Significance of an Independent Variable
12.6 Confidence Intervals and Prediction Intervals
12.7 Dummy Variables
12.8 Model Building and the Effects of Multicollinearity
12.9 Residual Analysis in Multiple Regression