Transcript Chapter 12b
Chapter 12b •Testing for significance—the t-test •Developing confidence intervals for estimates of β1. •Testing for significance—the f-test •Using Excel’s regression tool. Testing for Significance To test for a significant regression relationship, we must conduct a hypothesis test to determine whether the value of b1 is zero. Two tests are commonly used: t Test and F Test Both the t test and F test require an estimate of s 2, the variance of e in the regression model. Testing for Significance • An Estimate of s The mean square error (MSE) provides the estimate of s 2, and the notation s2 is also used. s 2 = MSE = SSE/(n 2) where: SSE (yi yˆi ) 2 ( yi b0 b1 xi ) 2 Testing for Significance • An Estimate of s • To estimate s we take the square root of s 2. • The resulting s is called the standard error of the estimate. SSE s MSE n2 Testing for Significance: t Test • Hypotheses H0 : b1 0 H a : b1 0 • Test Statistic b1 t sb1 Testing for Significance: t Test Rejection Rule Reject H0 if t < -t or t > t where: t is based on a t distribution with n - 2 degrees of freedom Testing for Significance: t Test Using the Test Statistic 1. Determine the hypotheses. H0 : b1 0 H a : b1 0 2. Specify the level of significance. = .05 b1 3. Select the test statistic. t sb1 4. State the rejection rule. Reject H0 if |t| > 3.182 (3 degrees of freedom) Testing for Significance: t Test Using the Test Statistic 5. Compute the value of the test statistic. b1 5 t 4.63 sb1 1.08 6. Determine whether to reject H0. Because t = 4.63 > 3.182, we reject H0. At the .05 level of significance, the sample evidence indicates that there is a significant relationship between the number of TV ads aired and the number of cars sold. Confidence Interval for b1 We can use a 95% confidence interval for b1 to test the hypotheses just used in the t test. H0 is rejected if the hypothesized value of b1 is not included in the confidence interval for b1. Confidence Interval for b1 • The form of a confidence interval for b1 is: t /2 sb1 b1 is the point estimator b1 t /2 sb1 is the margin of error where t / 2 is the t value providing an area of /2 in the upper tail of a t distribution with n - 2 degrees of freedom Confidence Interval for b1 • Rejection Rule Reject H0 if 0 is not included in the confidence interval for b1. • 95% Confidence Interval for b1 b1 t / 2 sb1 = 5 +/- 3.182(1.08) = 5 +/- 3.44 or 1.56 to 8.44 • Conclusion 0 is not included in the confidence interval. Reject H0 Testing for Significance: F Test Hypotheses H0 : b1 0 H a : b1 0 Test Statistic F = MSR/MSE Testing for Significance: F Test Rejection Rule Reject H0 if F > F where: F is based on an F distribution with 1 degree of freedom in the numerator and n - 2 degrees of freedom in the denominator Testing for Significance: F Test Using the Test Statistic 1. Determine the hypotheses. H0 : b1 0 H a : b1 0 2. Specify the level of significance. 3. Select the test statistic. 4. State the rejection rule. = .05 F = MSR/MSE Reject H0 if F > 10.13 (1 d.f. in numerator, 3 d.f. in denominator) Testing for Significance: F Test Using the Test Statistic 5. Compute the value of the test statistic. F = MSR/MSE = 100/4.667 = 21.43 6. Determine whether to reject H0. Because F = 21.43 > 10.13, we reject H0. At the .05 level of significance, the statistical evidence is sufficient to conclude that we have a significant relationship between the number of TV ads aired and the number of cars sold. Some Cautions about the Interpretation of Significance Tests Rejecting H0: b1 = 0 and concluding that the relationship between x and y is significant does not enable us to conclude that a cause-and-effect relationship is present between x and y. Just because we are able to reject H0: b1 = 0 and demonstrate statistical significance does not enable us to conclude that there is a linear relationship between x and y. Using Excel’s Regression Tool Up to this point, you have seen how Excel can be used for various parts of a regression analysis. Excel also has a comprehensive tool in its Data Analysis package called Regression. The Regression tool can be used to perform a complete regression analysis. Using Excel’s Regression Tool Formula Worksheet (showing data) 1 2 3 4 5 6 7 A Week 1 2 3 4 5 B TV Ads 1 3 2 1 3 C Cars Sold 14 24 18 17 27 D Using Excel’s Regression Tool Performing the Regression Analysis Step 1 Select the Tools pull-down menu Step 2 Choose the Data Analysis option Step 3 Choose Regression from the list of Analysis Tools Using Excel’s Regression Tool Performing the Regression Analysis Step 4 When the Regression dialog box appears: Enter C1:C6 in the Input Y Range box Enter B1:B6 in the Input X Range box Select Labels Select Confidence Level Enter 95 in the Confidence Level box Select Output Range Enter A9 (any cell) in the Ouput Range box Click OK to begin the regression analysis Using Excel’s Regression Tool Regression Dialog Box Using Excel’s Regression Tool 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 Value Worksheet A Week 1 2 3 4 5 B TV Ads 1 3 2 1 3 C Cars Sold 14 24 18 17 27 SUMMARY OUTPUT D E F G H I Data Regression Statistics Output Regression Statistics Multiple R 0.936585812 R Square 0.877192982 Adjusted R Square 0.83625731 Standard Error 2.160246899 Observations 5 Estimated Regression Equation Output ANOVA Output ANOVA df Regression Residual Total Intercept TV Ads SS 1 3 4 MS F Significance F 100 100 21.42857 0.018986231 14 4.666667 114 Coefficients Standard Error t Stat P-value 10 2.366431913 4.225771 0.024236 5 1.08012345 4.6291 0.018986 Lower 95% Upper 95% Lower 95.0% Upper 95.0% 2.468950436 17.53104956 2.468950436 17.53104956 1.562561893 8.437438107 1.562561893 8.437438107 Using Excel’s Regression Tool Estimated Regression Equation Output (left portion) A 22 23 24 Intercept 25 TV Ads 26 B C D E Coeffic. Std. Err. t Stat P-value 10 2.36643 4.2258 0.02424 5 1.08012 4.6291 0.01899 Note: Columns F-I are not shown. Using Excel’s Regression Tool Estimated Regression Equation Output (right portion) A 22 23 24 Intercept 25 TV Ads 26 B F G H I Coeffic. Low. 95% Up. 95% Low. 95.0% Up. 95.0% 10 2.46895 17.53105 2.46895044 17.5310496 5 1.562562 8.437438 1.56256189 8.43743811 Note: Columns C-E are hidden. Using Excel’s Regression Tool ANOVA Output A 16 17 18 19 20 21 22 B C D E F ANOVA df Regression Residual Total SS MS F Significance F 1 100 100 21.4286 0.018986231 3 14 4.66667 4 114 Using Excel’s Regression Tool Regression Statistics Output A 9 10 11 12 13 14 15 16 B Regression Statistics Multiple R 0.936585812 R Square 0.877192982 Adjusted R Square 0.83625731 Standard Error 2.160246899 Observations 5 C Using the Estimated Regression Equation for Estimation and Prediction Confidence Interval Estimate of E(yp) y p t /2 s y p Prediction Interval Estimate of yp y p t /2 sind where: confidence coefficient is 1 - and t/2 is based on a t distribution with n - 2 degrees of freedom Point Estimation If 3 TV ads are run prior to a sale, we expect the mean number of cars sold to be: y^ = 10 + 5(3) = 25 cars Using Excel to Develop Confidence and Prediction Interval Estimates Formula Worksheet (confidence interval portion) D 1 2 3 4 E F CONFIDENCE INTERVAL xp 3 x bar =AVERAGE(B2:B6) x p -x bar =F2-F3 5 (x p -x bar)2 =F4^2 (x p -x bar)2 Variance of y hat Std. Dev of y hat t Value Margin of Error Point Estimate Lower Limit Upper Limit =DEVSQ(B2:B6) =D20*(1/B15+F5/F6) =SQRT(F7) =TINV(0.05,3) =F9*F8 =B24+B25*F2 =F11-F10 =F11+F10 6 7 8 9 10 11 12 13 G Using Excel to Develop Confidence and Prediction Interval Estimates Value Worksheet (confidence interval portion) D 1 2 3 4 E CONFIDENCE INTERVAL xp 3 x bar 2.0 x p -x bar 1.0 5 (x p -x bar)2 1.0 (x p -x bar)2 Variance of y hat Std. Dev of y hat t Value Margin of Error Point Estimate Lower Limit Upper Limit 4.0 2.1000 1.4491 3.1824 4.6118 25.0 20.39 29.61 6 7 8 9 10 11 12 13 F G Confidence Interval for E(yp) The 95% confidence interval estimate of the mean number of cars sold when 3 TV ads are run is: 25 + 4.61 = 20.39 to 29.61 cars Using Excel to Develop Confidence and Prediction Interval Estimates Formula Worksheet (prediction interval portion) 1 2 3 4 H I PREDICTION INTERVAL Variance of y ind =D20+F7 Std. Dev. of y ind =SQRT(I2) Margin of Error =F9*I3 5 Lower Limit =F11-I4 6 7 Upper Limit =F11+I4 Using Excel to Develop Confidence and Prediction Interval Estimates Value Worksheet (prediction interval portion) 1 2 3 4 H PREDICTION INTERVAL Variance of y ind 6.76667 Std. Dev. of y ind 2.60128 Margin of Error 8.27845 5 Lower Limit 16.72 6 7 Upper Limit 33.28 I Prediction Interval for yp The 95% prediction interval estimate of the number of cars sold in one particular week when 3 TV ads are run is: 25 + 8.28 = 16.72 to 33.28 cars