Transcript Chapter 12
Chapter Twelve: Multiple Regression and Model Building
McGraw-Hill/Irwin. Copyright © 2004 by The McGraw-Hill Companies, Inc. All rights reserved.

Multiple Regression
12.1 The Linear Regression Model
12.2 The Least Squares Estimates and Prediction
12.3 The Mean Squared Error and the Standard Error
12.4 Model Utility: R², Adjusted R², and the F Test
12.5 Testing the Significance of an Independent Variable
12.6 Confidence Intervals and Prediction Intervals
12.7 Dummy Variables
12.8 Model Building and the Effects of Multicollinearity
12.9 Residual Analysis in Multiple Regression

12.1 The Linear Regression Model

The linear regression model relating y to x1, x2, …, xk is

  y = μ(y|x1, x2, …, xk) + ε = β0 + β1x1 + β2x2 + … + βkxk + ε

where

  μ(y|x1, x2, …, xk) = β0 + β1x1 + β2x2 + … + βkxk

is the mean value of the dependent variable y when the values of the independent variables are x1, x2, …, xk; β0, β1, β2, …, βk are the regression parameters relating the mean value of y to x1, x2, …, xk; and ε is an error term that describes the effects on y of all factors other than the independent variables x1, x2, …, xk.

Example 12.1: The Fuel Consumption Case

  Week  Average Hourly Temperature, x1 (°F)  Chill Index, x2  Fuel Consumption, y (MMcf)
  1     28.0                                 18               12.4
  2     28.0                                 14               11.7
  3     32.5                                 24               12.4
  4     39.0                                 22               10.8
  5     45.9                                  8                9.4
  6     57.8                                 16                9.5
  7     58.1                                  1                8.0
  8     62.5                                  0                7.5

Model: y = β0 + β1x1 + β2x2 + ε

The Linear Regression Model Illustrated (Example 12.1: The Fuel Consumption Case)
[Figure: the regression plane fitted to the fuel consumption data.]

The Regression Model Assumptions

Model: y = μ(y|x1, x2, …, xk) + ε = β0 + β1x1 + β2x2 + … + βkxk + ε

Assumptions about the model error terms, the ε's:
Mean Zero: The mean of the error terms is equal to 0.
Constant Variance: The variance of the error terms, σ², is the same for every combination of values of x1, x2, …, xk.
Normality: The error terms follow a normal distribution for every combination of values of x1, x2, …, xk.
Independence: The values of the error terms are statistically independent of each other.

12.2 Least Squares Estimates and Prediction

Estimation/prediction equation:

  ŷ = b0 + b1x01 + b2x02 + … + bkx0k

ŷ is the point estimate of the mean value of the dependent variable when the values of the independent variables are x01, x02, …, x0k. It is also the point prediction of an individual value of the dependent variable when the values of the independent variables are x01, x02, …, x0k. b1, b2, …, bk are the least squares point estimates of the parameters β1, β2, …, βk, and x01, x02, …, x0k are specified values of the independent predictor variables x1, x2, …, xk.

Example 12.3: The Fuel Consumption Case (Minitab Output)

  FuelCons = 13.1 - 0.0900 Temp + 0.0825 Chill

  Predictor   Coef      StDev     T       P
  Constant    13.1087   0.8557    15.32   0.000
  Temp        -0.09001  0.01408   -6.39   0.001
  Chill       0.08249   0.02200   3.75    0.013

  S = 0.3671    R-Sq = 97.4%    R-Sq(adj) = 96.3%

  Analysis of Variance
  Source          DF   SS      MS      F       P
  Regression      2    24.875  12.438  92.30   0.000
  Residual Error  5    0.674   0.135
  Total           7    25.549

  Predicted Values (Temp = 40, Chill = 10)
  Fit     StDev Fit   95.0% CI          95.0% PI
  10.333  0.170       (9.895, 10.771)   (9.293, 11.374)

Example: Point Predictions and Residuals (Example 12.3: The Fuel Consumption Case)

Predicted fuel consumption is ŷ = 13.1087 − 0.0900x1 + 0.0825x2; the residual is e = y − ŷ.

  Week  x1 (°F)  x2   Observed y  Predicted ŷ  Residual e
  1     28.0     18   12.4        12.0733       0.3267
  2     28.0     14   11.7        11.7433      -0.0433
  3     32.5     24   12.4        12.1631       0.2369
  4     39.0     22   10.8        11.4131      -0.6131
  5     45.9      8    9.4         9.6372      -0.2372
  6     57.8     16    9.5         9.2260       0.2740
  7     58.1      1    8.0         7.9616       0.0384
  8     62.5      0    7.5         7.4831       0.0169

12.3 Mean Square Error and Standard Error

  SSE = Σ eᵢ² = Σ (yᵢ − ŷᵢ)²          (sum of squared errors)
  s² = MSE = SSE / (n − (k + 1))       (mean square error, point estimate of the residual variance σ²)
  s = √MSE = √(SSE / (n − (k + 1)))    (standard error, point estimate of the residual standard deviation σ)
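The least squares estimates, residuals, and standard error above can be reproduced numerically. A minimal sketch using NumPy on the fuel consumption data (not part of the slides; small differences from the Minitab output are rounding):

```python
# Least squares fit for the fuel consumption data (Examples 12.1/12.3),
# reproducing b0, b1, b2, SSE, MSE, and the standard error s.
import numpy as np

temp  = np.array([28.0, 28.0, 32.5, 39.0, 45.9, 57.8, 58.1, 62.5])   # x1
chill = np.array([18, 14, 24, 22, 8, 16, 1, 0], dtype=float)          # x2
y     = np.array([12.4, 11.7, 12.4, 10.8, 9.4, 9.5, 8.0, 7.5])

# Design matrix with an intercept column, then solve the normal equations.
X = np.column_stack([np.ones_like(y), temp, chill])
b, *_ = np.linalg.lstsq(X, y, rcond=None)   # b = (b0, b1, b2)

resid = y - X @ b
SSE = float(resid @ resid)
n, k = len(y), 2
MSE = SSE / (n - (k + 1))   # s^2
s = MSE ** 0.5              # standard error

print(np.round(b, 4))   # approx [13.1087, -0.0900, 0.0825]
print(round(s, 4))      # approx 0.3671
```
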
Example 12.3: The Fuel Consumption Case

  Analysis of Variance
  Source          DF   SS      MS      F       P
  Regression      2    24.875  12.438  92.30   0.000
  Residual Error  5    0.674   0.135
  Total           7    25.549

  s² = MSE = SSE / (n − (k + 1)) = 0.674 / (8 − 3) = 0.1348
  s = √s² = √0.1348 = 0.3671

12.4 Model Utility: Multiple Coefficient of Determination, R²

The multiple coefficient of determination R² is

  R² = (Explained variation) / (Total variation)

R² is the proportion of the total variation in y explained by the linear regression model.

  Total variation = Explained variation + Unexplained variation
  Total variation       = Σ (yᵢ − ȳ)²    Total Sum of Squares (SSTO)
  Explained variation   = Σ (ŷᵢ − ȳ)²    Regression Sum of Squares (SSR)
  Unexplained variation = Σ (yᵢ − ŷᵢ)²   Error Sum of Squares (SSE)

The multiple correlation coefficient is R = √R².

12.4 Model Utility: Adjusted R²

The adjusted multiple coefficient of determination is

  R̄² = (R² − k/(n − 1)) · (n − 1)/(n − (k + 1))

Fuel Consumption Case (S = 0.3671, R-Sq = 97.4%, R-Sq(adj) = 96.3%):

  R² = 24.875 / 25.549 = 0.974
  R̄² = (0.974 − 2/(8 − 1)) · (8 − 1)/(8 − (2 + 1)) = 0.963

12.4 Model Utility: F Test for the Linear Regression Model

To test H0: β1 = β2 = … = βk = 0 versus Ha: at least one of β1, β2, …, βk is not equal to 0.

Test statistic:

  F(model) = (Explained variation / k) / (Unexplained variation / [n − (k + 1)])

Reject H0 in favor of Ha if F(model) > F_α or p-value < α. F_α is based on k numerator and n − (k + 1) denominator degrees of freedom.

Example 12.5: The Fuel Consumption Case (Minitab Output)

  F(model) = (24.875 / 2) / (0.674 / (8 − 3)) = 92.30

Reject H0 at the α = 0.05 level of significance, since

  F(model) = 92.30 > 5.79 = F.05  and  p-value = 0.000 < 0.05 = α

F_α is based on 2 numerator and 5 denominator degrees of freedom.
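R², adjusted R², and the F statistic above all follow directly from the ANOVA sums of squares. A short numerical check (assuming SciPy is available for the F distribution; the slides' Minitab values include more internal precision, so small rounding differences are expected):

```python
# R^2, adjusted R^2, and the F test computed from the ANOVA sums of squares
# in Example 12.3: SSR = 24.875, SSE = 0.674, SSTO = 25.549, n = 8, k = 2.
from scipy import stats

SSR, SSE, SSTO = 24.875, 0.674, 25.549
n, k = 8, 2

R2 = SSR / SSTO
R2_adj = (R2 - k / (n - 1)) * (n - 1) / (n - (k + 1))   # textbook form of adjusted R^2
F = (SSR / k) / (SSE / (n - (k + 1)))

p_value = stats.f.sf(F, k, n - (k + 1))      # P(F with 2 and 5 df exceeds F)
F_crit = stats.f.ppf(0.95, k, n - (k + 1))   # F_.05 with 2 and 5 df

print(round(R2, 3), round(R2_adj, 3))   # approx 0.974 and 0.963
print(round(F, 1))                      # approx 92.3
print(round(F_crit, 2))                 # approx 5.79
```

The textbook form (R² − k/(n−1))·(n−1)/(n−(k+1)) is algebraically the same as the more common 1 − (1 − R²)(n−1)/(n−(k+1)).
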
12.5 Testing the Significance of an Independent Variable

If the regression assumptions hold, we can reject H0: βj = 0 at the α level of significance (probability of Type I error equal to α) if and only if the appropriate rejection point condition holds or, equivalently, if the corresponding p-value is less than α.

  Alternative    Reject H0 if:                 p-Value
  Ha: βj > 0     t > t_α                       area under the t distribution to the right of t
  Ha: βj < 0     t < −t_α                      area under the t distribution to the left of t
  Ha: βj ≠ 0     |t| > t_{α/2}, that is,       twice the area under the t distribution to the right of |t|
                 t > t_{α/2} or t < −t_{α/2}

Test statistic:

  t = bj / s_{bj}

100(1 − α)% confidence interval for βj:

  [bj ± t_{α/2} s_{bj}]

t_α, t_{α/2}, and p-values are based on n − (k + 1) degrees of freedom.

Example 12.6: The Fuel Consumption Case — Testing and Estimation for β2 (Minitab Output)

  Predictor   Coef      StDev     T       P
  Constant    13.1087   0.8557    15.32   0.000
  Temp        -0.09001  0.01408   -6.39   0.001
  Chill       0.08249   0.02200   3.75    0.013

Test:

  t = b2 / s_{b2} = 0.08249 / 0.02200 = 3.75 > 2.571 = t.025
  p-value = 2 P(t > 3.75) = 0.013

Interval:

  [b2 ± t_{α/2} s_{b2}] = [0.08249 ± (2.571)(0.02200)] = [0.08249 ± 0.05656] = [0.02593, 0.13905]

Chill is significant at the α = 0.05 level, but not at α = 0.01. t_α, t_{α/2}, and p-values are based on 5 degrees of freedom.

12.6 Confidence and Prediction Intervals

Prediction:

  ŷ = b0 + b1x01 + b2x02 + … + bkx0k

If the regression assumptions hold, a 100(1 − α)% confidence interval for the mean value of y is

  [ŷ ± t_{α/2} · s · √(Distance value)]

and a 100(1 − α)% prediction interval for an individual value of y is

  [ŷ ± t_{α/2} · s · √(1 + Distance value)]

(Computing the distance value requires matrix algebra.) t_{α/2} is based on n − (k + 1) degrees of freedom.

Example 12.9: The Fuel Consumption Case (Minitab Output)

  FuelCons = 13.1 - 0.0900 Temp + 0.0825 Chill

  Predicted Values (Temp = 40, Chill = 10)
  Fit     StDev Fit   95.0% CI          95.0% PI
  10.333  0.170       (9.895, 10.771)   (9.293, 11.374)

  95% confidence interval: [10.333 ± (2.571)(0.3671)√0.2144515] = [10.333 ± 0.438] = [9.895, 10.771]
  95% prediction interval: [10.333 ± (2.571)(0.3671)√(1 + 0.2144515)] = [10.333 ± 1.041] = [9.292, 11.374]

12.7 Dummy Variables

Example 12.11: The Electronics World Case

  Store  Number of Households, x  Location  Location Dummy, DM  Sales Volume, y
  1      161                      Street    0                   157.27
  2       99                      Street    0                    93.28
  3      135                      Street    0                   136.81
  4      120                      Street    0                   123.79
  5      164                      Street    0                   153.51
  6      221                      Mall      1                   241.74
  7      179                      Mall      1                   201.54
  8      204                      Mall      1                   206.71
  9      214                      Mall      1                   229.78
  10     101                      Mall      1                   135.22

Location dummy variable: DM = 1 if a store is in a mall location, 0 otherwise.

Example: Regression with a Dummy Variable (Example 12.11, Minitab Output)

  Sales = 17.4 + 0.851 Households + 29.2 DM

  Predictor   Coef     StDev     T       P
  Constant    17.360   9.447     1.84    0.109
  Househol    0.85105  0.06524   13.04   0.000
  DM          29.216   5.594     5.22    0.001

  S = 7.329    R-Sq = 98.3%    R-Sq(adj) = 97.8%

  Analysis of Variance
  Source          DF   SS      MS      F        P
  Regression      2    21412   10706   199.32   0.000
  Residual Error  7    376     54
  Total           9    21788

12.8 Model Building and the Effects of Multicollinearity

Example: The Sales Territory Performance Case

[Table: data for 25 sales territories on Sales, Time, MktPoten, Adver, MktShare, Change, Accts, WkLoad, and Rating.]
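With this many candidate predictors, correlation among them is a central concern. Two standard screening tools are the predictor correlation matrix and variance inflation factors. A sketch on synthetic stand-in data (the territory values themselves are not reproduced here):

```python
# Screening predictors for collinearity with a correlation matrix and
# variance inflation factors, VIF_j = 1 / (1 - R_j^2), where R_j^2 comes
# from regressing predictor j on the remaining predictors.
# The data below are SYNTHETIC stand-ins, not the territory data;
# x3 is constructed to be nearly collinear with x1.
import numpy as np

rng = np.random.default_rng(0)
n = 50
x1 = rng.normal(size=n)
x2 = rng.normal(size=n)
x3 = x1 + 0.05 * rng.normal(size=n)   # nearly a copy of x1
X = np.column_stack([x1, x2, x3])

# Correlation matrix of the predictors (columns as variables).
print(np.round(np.corrcoef(X, rowvar=False), 2))

def vif(X, j):
    """1 / (1 - R_j^2) from regressing column j on the other columns."""
    y = X[:, j]
    A = np.column_stack([np.ones(n), np.delete(X, j, axis=1)])
    b, *_ = np.linalg.lstsq(A, y, rcond=None)
    resid = y - A @ b
    r2 = 1.0 - (resid @ resid) / ((y - y.mean()) @ (y - y.mean()))
    return 1.0 / (1.0 - r2)

vifs = [vif(X, j) for j in range(X.shape[1])]
print([round(v, 1) for v in vifs])   # x1 and x3 get large VIFs; x2 stays near 1
```

A common rule of thumb flags VIFs above about 10 as evidence of severe multicollinearity.
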
Correlation Matrix — Example: The Sales Territory Performance Case
[Figure: correlation matrix for the sales territory variables.]

Multicollinearity

Multicollinearity refers to the condition where the independent variables (or predictors) in a model are dependent, related, or correlated with each other.

Effects:
- Hinders the ability to use the bj's, t statistics, and p-values to assess the relative importance of predictors.
- Does not hinder the ability to predict the dependent (or response) variable.

Detection:
- Scatter plot matrix
- Correlation matrix
- Variance inflation factors (VIF)

12.9 Residual Analysis in Multiple Regression

For an observed value of yᵢ, the residual is

  eᵢ = yᵢ − ŷᵢ = yᵢ − (b0 + b1xᵢ1 + … + bkxᵢk)

If the regression assumptions hold, the residuals should look like a random sample from a normal distribution with mean 0 and variance σ².

Residual plots:
- Residuals versus each independent variable
- Residuals versus the predicted ŷ's
- Residuals in time order (if the response is a time series)
- Histogram of the residuals
- Normal plot of the residuals

Multiple Regression Summary:
12.1 The Linear Regression Model
12.2 The Least Squares Estimates and Prediction
12.3 The Mean Squared Error and the Standard Error
12.4 Model Utility: R², Adjusted R², and the F Test
12.5 Testing the Significance of an Independent Variable
12.6 Confidence Intervals and Prediction Intervals
12.7 Dummy Variables
12.8 Model Building and the Effects of Multicollinearity
12.9 Residual Analysis in Multiple Regression
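As a closing check, the residual diagnostics of Section 12.9 can be computed for the fuel consumption fit. A small numeric sketch using the residuals and fitted values listed in Example 12.3 (the graphical plots themselves are omitted):

```python
# Numeric residual diagnostics for the fuel consumption fit, using the
# residual and predicted-value columns from Example 12.3.
import numpy as np

residuals = np.array([0.3267, -0.0433, 0.2369, -0.6131,
                      -0.2372, 0.2740, 0.0384, 0.0169])
fitted = np.array([12.0733, 11.7433, 12.1631, 11.4131,
                   9.6372, 9.2260, 7.9616, 7.4831])

s = 0.3671  # standard error from the Minitab output
standardized = residuals / s

# With an intercept in the model, the residuals should average ~0.
print(round(float(residuals.mean()), 4))

# No standardized residual beyond +/-3 suggests no gross outlier.
print(bool(np.abs(standardized).max() < 3))

# OLS residuals are orthogonal to the fitted values, so this
# correlation should be ~0 up to rounding of the listed values.
print(round(float(np.corrcoef(residuals, fitted)[0, 1]), 4))
```
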