Chapter 23: Inferences About Means
Confidence Intervals & Hypotheses About Means
To create confidence intervals and test hypotheses about means:
Base both on the sampling model.
The CLT tells us that the sampling model is Normal.
The standard error is just the estimated standard deviation of the sampling model.

Gosset’s t
Gosset worked as a quality control engineer.
He noticed that with small sample sizes, his tests for quality weren’t quite right.
When he used the estimated standard error, the shape of the sampling model changed; he called the new model a t-distribution.
Student’s t-models form a whole family of related distributions that depend on a parameter known as degrees of freedom (df).

A Sampling Distribution for Means
When the conditions are met, the standardized sample mean,
$t = \frac{\bar{y} - \mu}{SE(\bar{y})}$,
follows a Student’s t-model with $n - 1$ degrees of freedom.
We estimate the standard error with
$SE(\bar{y}) = \frac{s}{\sqrt{n}}$.

Gosset’s Model
When Gosset corrected the model for the extra uncertainty, the margin of error got bigger.
When you use Gosset’s model instead of the Normal model, your confidence interval will be slightly wider and your P-values slightly larger.

“To t or not to t?”
If you know $\sigma$, use z (very rare!).
Whenever you use s to estimate $\sigma$, use t.
Student’s t-models are unimodal, symmetric, and bell-shaped.

Assumptions and Conditions
Independence Assumption
Randomization condition
10% condition
Normal Population Assumption
Nearly Normal condition: The data come from a distribution that is unimodal and symmetric. Check by making a histogram or Normal probability plot.

One-sample t-interval
When the conditions are met, find the confidence interval for the population mean, $\mu$.
Since the standard error of the mean is $SE(\bar{y}) = \frac{s}{\sqrt{n}}$, the interval is
$\bar{y} \pm t^*_{n-1} \times SE(\bar{y})$.
The critical value $t^*_{n-1}$ depends on the particular confidence level, C, that you specify and on the number of degrees of freedom, $n - 1$, which we get from the sample size.
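The claim that Gosset’s model gives a wider interval than the Normal model can be checked numerically. The sketch below is an illustration only and is not part of the original slides. It uses just the Python standard library; the helper names `t_pdf`, `t_cdf`, and `t_critical` are hypothetical, and the t critical value is approximated by numerical integration and bisection rather than taken from a statistics package.

```python
import math
from statistics import NormalDist

def t_pdf(x, df):
    """Density of Student's t-model with df degrees of freedom."""
    log_c = math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
    c = math.exp(log_c) / math.sqrt(df * math.pi)
    return c * (1 + x * x / df) ** (-(df + 1) / 2)

def t_cdf(x, df, steps=2000):
    """P(T <= x), using Simpson's rule on [0, |x|] plus symmetry."""
    a = abs(x)
    h = a / steps
    total = t_pdf(0.0, df) + t_pdf(a, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area = total * h / 3
    return 0.5 + area if x >= 0 else 0.5 - area

def t_critical(confidence, df):
    """Two-sided critical value t* for the given confidence level (bisection)."""
    target = 0.5 + confidence / 2
    lo, hi = 0.0, 50.0
    for _ in range(60):
        mid = (lo + hi) / 2
        if t_cdf(mid, df) < target:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

confidence = 0.95
z_star = NormalDist().inv_cdf(0.5 + confidence / 2)  # Normal critical value
print(f"z* = {z_star:.3f}")
for df in (2, 5, 22, 100):
    # t* exceeds z* for every df and approaches it as df grows
    print(f"df = {df:3d}: t* = {t_critical(confidence, df):.3f}")
```

Because the margin of error is the critical value times the standard error, a larger critical value translates directly into a wider interval; the gap shrinks as the degrees of freedom increase.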
A One-Sample t-Interval for the Mean
Identify the parameter: Find a 90% confidence interval for the mean speed of vehicles driving on Triphammer Road.
Look at the data: Enter the data into list L1.
Check the conditions:
Randomization: We have a convenience sample, but we have reason to believe it is representative.
10%: The cars observed were fewer than 10% of all cars traveling on Triphammer Road.
Nearly Normal Condition: The histogram is unimodal and symmetric.
State the sampling distribution model for the statistic. Under these conditions the sampling distribution of the mean can be modeled by Student’s t-model with 22 degrees of freedom: $df = n - 1 = 23 - 1 = 22$.
Choose your method. We will use a one-sample t-interval for the mean.
Construct the confidence interval: Under STAT TESTS choose TInterval.
We know $n = 23$ cars, $\bar{y} = 31.0$ mph, $s = 4.25$ mph.
Margin of Error: $ME = t^*_{22} \times SE(\bar{y}) = 1.717 \times 0.886 = 1.521$ mph
Interpretation: We are 90% confident that the true mean speed of all vehicles on Triphammer Road is between 29.5 and 32.5 miles per hour.
Caution: This was not a random sample of vehicles. It was a convenience sample taken at one time of the day. The drivers could possibly have seen the police device and may have slowed down. We are reluctant to extend our inference to other situations.

A One-Sample t-Test for the Mean
State the hypotheses: We want to know whether the mean speed of vehicles on Triphammer Road exceeds the posted speed limit of 30 mph.
$H_0$: Mean speed, $\mu = 30$ mph
$H_A$: Mean speed, $\mu > 30$ mph
Check the conditions (using the histogram of the speeds):
Randomization: We have a convenience sample, but we have reason to believe it is representative.
10%: The cars observed were fewer than 10% of all cars traveling on Triphammer Road.
Nearly Normal Condition: The histogram is unimodal and symmetric.
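The margin-of-error arithmetic in the Triphammer Road interval example above can be re-derived from the summary statistics. This is a minimal sketch under stated assumptions: it works from the rounded values on the slide (n = 23, mean 31.0 mph, s = 4.25 mph) rather than the raw data, and the helpers `t_pdf`, `t_cdf`, and `t_critical` are hypothetical standard-library stand-ins for the calculator's TInterval routine.

```python
import math

def t_pdf(x, df):
    """Density of Student's t-model with df degrees of freedom."""
    log_c = math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
    c = math.exp(log_c) / math.sqrt(df * math.pi)
    return c * (1 + x * x / df) ** (-(df + 1) / 2)

def t_cdf(x, df, steps=2000):
    """P(T <= x), using Simpson's rule on [0, |x|] plus symmetry."""
    a = abs(x)
    h = a / steps
    total = t_pdf(0.0, df) + t_pdf(a, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area = total * h / 3
    return 0.5 + area if x >= 0 else 0.5 - area

def t_critical(confidence, df):
    """Two-sided critical value t* for the given confidence level (bisection)."""
    target = 0.5 + confidence / 2
    lo, hi = 0.0, 50.0
    for _ in range(60):
        mid = (lo + hi) / 2
        if t_cdf(mid, df) < target:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

# Summary statistics as rounded on the slide
n, y_bar, s = 23, 31.0, 4.25
confidence = 0.90

df = n - 1                          # 22 degrees of freedom
se = s / math.sqrt(n)               # standard error of the mean
t_star = t_critical(confidence, df) # critical value for 90% confidence
me = t_star * se                    # margin of error
lower, upper = y_bar - me, y_bar + me

print(f"SE = {se:.3f} mph, t* = {t_star:.3f}, ME = {me:.3f} mph")
print(f"90% interval: ({lower:.1f}, {upper:.1f}) mph")
```

Working from rounded summaries means the last digit of the margin of error may differ slightly from a calculation on the raw speeds.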
A One-Sample t-Test for the Mean
State the sampling distribution model for the statistic. Under these conditions the sampling distribution of the mean can be modeled by Student’s t-model with 22 degrees of freedom: $df = n - 1 = 23 - 1 = 22$.
Choose your method. We will use a one-sample t-test for the mean.
STAT TESTS T-Test, Calculate
STAT TESTS T-Test, Draw
Conclusion: Link the P-value to your decision about the null hypothesis and state your conclusion in context.
The P-value of 0.126 says that if the true mean speed of vehicles on Triphammer Road were 30 mph, samples of 23 vehicles could be expected to have an observed mean of at least 31.0 mph 12.6% of the time. That P-value is not small enough for us to reject the hypothesis that the true mean is 30 mph at any reasonable alpha level. We conclude that there is not enough evidence to say that the average speed is too high.

Intervals and Tests
Confidence intervals and significance tests are built from the same calculations.
The confidence interval contains all the null hypothesis values you can’t reject.
A level C confidence interval contains all of the possible null hypothesis values that would be retained by a two-sided hypothesis test at $\alpha$-level $1 - C$.
When the hypothesis is one-sided, the corresponding $\alpha$-level is $(1 - C)/2$.

Sample Size
Before collecting data, it is a good idea to know whether the sample size is large enough to give you a good chance of being able to tell you what you want to know.
An example: the movie download (p. 456)
ME = 8 min, SD = 10 min, 95% confidence interval
$8 = 1.96 \times \frac{10}{\sqrt{n}}$, so $\sqrt{n} = \frac{1.96 \times 10}{8} = 2.45$ and $n = 6.0025$.
Use $6 - 1 = 5$ degrees of freedom to substitute an appropriate $t^*$ value.
$8 = 2.571 \times \frac{10}{\sqrt{n}}$, so $\sqrt{n} = \frac{2.571 \times 10}{8} = 3.214$ and $n = 10.33$.
Round up, so $n = 11$ movies.

CAUTION!!
Beware multimodality. Look for the possibility that the data come from two groups. If so, separate the groups and analyze each group separately.
Beware skewed data.
Set outliers aside. Report on these values separately.
Conduct an analysis of non-outlying points, along with a separate discussion of outliers.
Watch out for bias. Think about possible sources of bias in your measurements.
Make sure data are independent.
Make sure that data are from an appropriately randomized sample.
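Returning to the sample size example (the movie download, with a target margin of error of 8 minutes and a guessed standard deviation of 10 minutes), the two-pass calculation can be sketched as follows. This is an illustration under stated assumptions, not the textbook's own procedure in code: the first-pass n is rounded to the nearest whole number to pick the degrees of freedom, as the slide does, and the helpers `t_pdf`, `t_cdf`, and `t_critical` are hypothetical standard-library approximations of a t table.

```python
import math
from statistics import NormalDist

def t_pdf(x, df):
    """Density of Student's t-model with df degrees of freedom."""
    log_c = math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
    c = math.exp(log_c) / math.sqrt(df * math.pi)
    return c * (1 + x * x / df) ** (-(df + 1) / 2)

def t_cdf(x, df, steps=2000):
    """P(T <= x), using Simpson's rule on [0, |x|] plus symmetry."""
    a = abs(x)
    h = a / steps
    total = t_pdf(0.0, df) + t_pdf(a, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area = total * h / 3
    return 0.5 + area if x >= 0 else 0.5 - area

def t_critical(confidence, df):
    """Two-sided critical value t* for the given confidence level (bisection)."""
    target = 0.5 + confidence / 2
    lo, hi = 0.0, 50.0
    for _ in range(60):
        mid = (lo + hi) / 2
        if t_cdf(mid, df) < target:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

target_me = 8.0    # desired margin of error, minutes
sd_guess = 10.0    # guessed standard deviation, minutes
confidence = 0.95

# Pass 1: solve ME = z* x SD / sqrt(n) for n using the Normal critical value
z_star = NormalDist().inv_cdf(0.5 + confidence / 2)
n_first = (z_star * sd_guess / target_me) ** 2

# Pass 2: redo the calculation with t* for df = (first-pass n) - 1
df = round(n_first) - 1   # assumption: round to the nearest integer, as on the slide
t_star = t_critical(confidence, df)
n_second = (t_star * sd_guess / target_me) ** 2
n_needed = math.ceil(n_second)  # always round the sample size up

print(f"pass 1: z* = {z_star:.3f}, n = {n_first:.2f}")
print(f"pass 2: df = {df}, t* = {t_star:.3f}, n = {n_second:.2f}")
print(f"sample size to use: {n_needed}")
```

The second pass matters most when the first-pass n is small, because that is where t* differs most from z*.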