#### Transcript Hypothesis Testing

Class 9, March 23, 2006

• From the exam, just review one part of the calculations from the normal distribution.
• Total carb intake: µ = 124 g/1000 cal, σ = 20 g/1000 cal.
• For every 100 boys, how many would have carb intake between 90 and 110 g/1000 cal?
• 2 points to note here:
  1. It is asking for a number, not a percentage.
  2. Draw the normal distribution.

Review: estimate of the mean
Confidence intervals, that is, what is the probability that a population mean will fall within a certain range, based on an estimate using a sample mean.

Usual Intervals
• Usually use 90%, 95% or 99%.
• What are the corresponding Z values?
• For now, start with 95%.
• Last time we found that Z = 1.96.
• Just accept that for the moment; we will review how to get it shortly.

Example
Estimate the mean birth weight of newborns with gestational age 40 weeks, based on a sample of 5 such newborns with a mean birth weight of 3500 grams and a standard deviation of 430 grams.
Since there are only 5 in the sample, what do we have to assume? We have to assume that the population from which the sample is drawn is approximately normal and that these are 5 randomly chosen newborns.

µ is between the two values calculated by $\bar{X} - Z\,\sigma/\sqrt{n}$ and $\bar{X} + Z\,\sigma/\sqrt{n}$.

Always draw the N.D.
The shaded area can help us to see what we mean by $\bar{X} + Z\,\sigma/\sqrt{n}$. It is the border of the shaded area to the right of the mean. We are saying that the mean lies between that border and the corresponding left-side border.

Write the equation
$(\bar{X} - Z\,\sigma/\sqrt{n}) \le \mu \le (\bar{X} + Z\,\sigma/\sqrt{n})$
µ lies between the two values within the parentheses.

Practical Statement of Result
• With 95% confidence, we can say that µ will be ≤ $\bar{X} + 1.96 \times \text{S.E.}$ and ≥ $\bar{X} - 1.96 \times \text{S.E.}$
• We call 1.96 the reliability coefficient.

The Math
• $\bar{X}$ = 3500 g, n = 5, σ = 430 g
• Review what the symbols mean.
• Find the quantity $Z\,\sigma/\sqrt{n}$: $\sigma/\sqrt{n} = 430/\sqrt{5} = 430/2.24 = 192$
• 1.96 × 192 = 376
• µ is between 3500 − 376 and 3500 + 376
• 3124 ≤ µ ≤ 3876 with 95% confidence

How did we get Z?
• Review for 95%: what is the area not included? Divide it into 2 tails and look up Z.
• Try for 90%.
• Try for 99%.

Applications
• If we know a population mean and standard deviation, we can calculate the probability that a random sample will have a mean beyond a stated value.
• A certain large human population has a cranial length that is approximately normally distributed with mean 185.6 mm and σ of 12.7 mm. µ = 185.6 mm, σ = 12.7 mm
• What is the probability that a random sample of size 10 from this population will have a mean greater than 190?
• We can calculate this probability, but why would we? Usefulness??
• Let’s say that it is accepted knowledge that the population has a certain mean.
• I am working with a group of people.
• I want to know if they fit into this population with regard to the particular parameter. If the probability of the mean of the sample is very low, perhaps it is not really from the same population.

Education Example
• Third-graders in the U.S. have an average reading score of 124.
• Third-graders in a particular school have a mean reading score of 120. What’s the probability that they are from the same population?

Back to Cranial Length
• µ = 185.6 mm, σ = 12.7 mm
• What is the probability that a random sample of size 10 from this population will have a mean greater than 190?
• Have to find how far 190 is from 185.6 in units of standard error of the mean:
  $Z = \dfrac{190 - 185.6}{12.7/\sqrt{10}}$

Probability of Mean of 190
Did you draw a normal dist???
• Z = 4.4 / (12.7 / 3.16)
• Z = 1.09
• Area = 0.138
• The probability is 13.8%
[Figure: normal curve with the mean 185.6 at Z = 0 and 190 at Z = 1.09; the shaded right tail has area 0.138.]

Hypothesis Testing
A frequently used method in clinical research

Another Look at Confidence Intervals
• 95% C.I.: The space in each tail is 0.025.
• The sum is 0.050.
• This value, the space not covered in the C.I., is called alpha.
• α is called the Significance Level.
• Each tail contains α/2.

Using a Confidence Interval
• A school district administrator took sample(s) of fifth grade reading scores for all the schools in Nassau County and then estimated a 90% C.I. for the mean: 87.60 to 92.34 with a sigma of 10.
• All the individual schools then compared their mean to this.
• If they did not fall within the interval, they were considered significantly better or significantly worse, and faculty hirings were affected by the results.

Hypothesis Testing
Just a new way to ask the questions. The methods used are virtually the same as for C.I.’s.

The More Usual Approach
• The most frequent application of estimating the mean is to look for effects that put a group outside the interval.
• We usually set up research so that we are looking for a difference.

A New Drug to Lower Cholesterol
• Everyone has been using my competitor’s product, named reductum.
• I have developed a drug called drasticchangum.
• I want to find out if my product has results different from the accepted treatment.

Set Up a Hypothesis
• My hope is that there is a difference between my product and theirs.
• I set up Hypothesis A that says there is a difference.
• In order to test it, I state another hypothesis saying that there is NO DIFFERENCE between the 2 treatments.
• I want to reject this hypothesis.

Another Wording
• HA says that the sample comes from a population with a mean different from that of the original population.
• H0 says that the sample comes from a population with a mean the same as that of the original population. We call this mean µ0.

The Hypothesis to Reject
• This is called the Null Hypothesis, H0.
• I now go about the process to reject it.
• I have to choose a significance level.
• Let’s say α = 0.01, analogous to a C.I. of 99%.

Set-Up
• Each tail has probability of α/2.
  Call it the Rejection Zone.
• For α = 0.01, each tail has 0.005, and this occurs for any Z that is over 2.58. Check the table of the Normal Distribution.

Results
• The population treated by reductum has a mean change in cholesterol of 30 units with a sigma of 4.5.
• I have a preliminary sample of 10 patients treated by drasticchangum. Their mean change is 34 units. I assume that they come from a population that has the same sigma as the first.

Hypotheses
• HA: Called the Alternative Hypothesis. This is the result I want to support: “µ for the new population differs from 30. µ ≠ 30.”
• H0: The Null Hypothesis. I want to reject this: µ for the new population is the same as the old, that is 30. “µ = 30”.

The Test Statistic
• $Z = (\bar{X} - \mu_0) / (\sigma/\sqrt{n})$
• $Z = (34 - 30) / (4.5/\sqrt{10})$
• Z = 2.81
• This is larger than 2.58, so it is in the rejection zone.

Excellent Result
• We can reject the null hypothesis.
• We can accept the alternative hypothesis.
• I can say that drasticchangum causes a different change in cholesterol level from reductum.
• Is there anything else we might prefer to say?
• How about “Drasticchangum causes a greater change than does reductum.”?

Different Hypotheses
• HA: The mean for DC is greater than 30.
• H0: The mean for DC is equal to 30.
• How can we reject H0?

One-Sided Tests
• So far we have divided alpha into the two tails.
• Let’s put the whole alpha into just one tail.
• Can we still reject the null hypothesis at the level of α = 0.01 (99% confidence)?
• Look up the tail with area = 0.01.
• Now Z just has to exceed 2.33.
• It’s even easier to reject.

Easier to Reject
• If we choose the correct side to consider, then it is easier to reject the null hypothesis
• and accept the alternative hypothesis.
• Remember to express the correct alternative hypothesis before starting the math.
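The cholesterol test above can be rechecked numerically. The following is a minimal sketch, assuming Python 3.8+ and its standard-library `statistics.NormalDist` in place of the printed normal table; the helper names `z_statistic` and `z_critical` are made up for this illustration and are not part of the lecture.

```python
from math import sqrt
from statistics import NormalDist

STD_NORMAL = NormalDist()  # standard normal: mean 0, standard deviation 1


def z_statistic(sample_mean, mu0, sigma, n):
    """Distance of the sample mean from mu0, in units of standard error."""
    return (sample_mean - mu0) / (sigma / sqrt(n))


def z_critical(alpha, two_sided=True):
    """Z value that marks the start of the rejection zone for a given alpha."""
    tail_area = alpha / 2 if two_sided else alpha
    return STD_NORMAL.inv_cdf(1 - tail_area)


# Numbers from the lecture: reductum mean 30, sigma 4.5,
# preliminary drasticchangum sample of 10 patients with mean 34.
z_calc = z_statistic(34, 30, 4.5, 10)
z_two = z_critical(0.01, two_sided=True)    # alpha split over both tails
z_one = z_critical(0.01, two_sided=False)   # all of alpha in the upper tail

print(f"Z calc = {z_calc:.2f}")
print(f"two-sided Z crit = {z_two:.3f}, reject H0: {abs(z_calc) > z_two}")
print(f"one-sided Z crit = {z_one:.3f}, reject H0: {z_calc > z_one}")
```

The one-sided critical value comes out smaller than the two-sided one, which is the sense in which the one-sided test is "easier to reject".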
The Two Things We Can Say
• We reject the null hypothesis that the sample came from the original population.
• We accept the alternative hypothesis that the sample came from a population with a mean greater than that of the original population.

So…Nothing so new
• Hypothesis Testing is a cinch, because you already knew most of it from Confidence Intervals of the Normal Distribution.
• Let’s just look at some details.

Confidence
• Not only are we saying that drasticchangum has a greater effect than does reductum,
• but we have a level of significance, i.e. a statement of probability that we are correct.
• But that means our conclusion could also be incorrect in reality even while satisfying our statistical conditions.

Possible Errors
• In our example, we stated that the new sample came from a different population than the original.
• We could be incorrect.
• What are the chances that we have stated a difference when there isn’t one?
• If the sample really came from the old population, the probability of a result this extreme is at most 0.01.

Rejection Zone
• If the new sample came from a population with a mean equal to the original mean, there is only a 0.01 probability that the sample mean would fall far enough above µ0 to give a Z of 2.33 or more.
• This would be a rare event but a possible one.

Type I Error
• If we say that there is a difference when there really isn’t one, that is called a Type I Error.
• We were statistically allowed to reject the null hypothesis, but only some spiritual being somewhere knew that we shouldn’t have done so.
• We may never know that the result was incorrect, or …
• After 1000 patients have tried drasticchangum, we may find that they come from a population with a mean the same as that of reductum.
The chance of committing a Type I error is equal to alpha.

OR…
• We may be able to state the probability even more precisely.
• The value of “p”.
• Use the value calculated for Z, 2.81.
• The area under the tail of the Normal Distribution beyond Z = 2.81 is 0.0025.
• If the means were equal, the chance of getting a Z this large would be about 0.0025.
• A report would state p ≈ 0.0025.

Conclusion
• The null hypothesis is rejected with a p value of about 0.0025.
• The alternative hypothesis is accepted, stating that the effect of drasticchangum is greater than that of reductum.

An Example
• Weight Watchers Inc claims that those who have followed their program for at least a year maintain some weight loss even after they have left the formal program.
• A competitive group claims that those who have left the program have a rebound effect and actually weigh more than they would have if they had never tried the WW program.

Before Gathering Data
• HA: There is a difference between the average for former weight watchers and the general population.
• H0: The mean weight for both groups is the same. µest = µ0
• Test the null hypothesis with a significance level of 0.05.
• Is the test one-sided or two-sided?

Are the Wts different from the general population?
• Considering the population of women from ages 30 to 60, it is accepted that the average weight is 150 lb with a variance of 20 (so σ = √20 lb).
• A sample of 20 former weight watchers has an average weight of 147 pounds.

Establish Zcrit & determine Zcalc
• For α = 0.05 and a 2-sided test, what is Zcrit?
• Check the table for area of 0.025.
• Z = 1.96
• $Z_{calc} = (147 - 150)/(\sqrt{20}/\sqrt{20}) = -3/1 = -3$, so |Zcalc| = 3
• Do we reject the null hypothesis?
• Do we accept the alternative hypothesis?
• Yes to both, and what is p? p ≈ 0.003 (0.0013 in each tail)

Review p
• We got p by looking up the tail area corresponding to a Zcalc of 3.0 (0.0013) and doubling it for the two-sided test. Check it.
• What does it mean that p ≈ 0.003?
• It means that if we reject, the chance of a Type I error is about 0.003.
• It means that if the sample really came from the population with mean of 150, the chance of a sample mean this far from 150 is about 0.003.

Could we have set up a different test?
• Yes, we could have asked a one-sided type question.
• How would WW Inc have asked the question?
• One-sided: WW average is less than population.
• Is there an advantage for them in choosing this?
• Yes, a one-sided test puts all of α in one tail, so the rejection zone on that side is larger.
• Easier for them to show that their former clients weigh less than the overall population.
• Would p be any different?
• Yes. The one-sided p is the area of a single tail, about 0.0013, half of the two-sided value.
• What else is different? Zcrit (1.645 instead of 1.96).

Another Sample
• A pharmaceutical company says that it is more effective to take a weight loss drug.
• They say they have a sample of 100 former weight watchers with an average weight of 149.5.
• Now do a two-sided test on this. $Z_{calc} = (149.5 - 150)/(\sqrt{20}/\sqrt{100}) = -0.5/\sqrt{0.2} = -1.12$, so |Zcalc| = 1.12 < 1.96
• Can we reject the null hypothesis?
• No.
• Can we accept the null hypothesis that the average weight of the former WW’s equals the general population?
• NO, NO, NO, NO, NO, NO
• We never accept the null hypothesis.
• All we can say is: not enough evidence to reject.

Think About It
• 149.5 is very close to 150.
• Does that mean the sample came from the population with the original mean?
• No.
• Could have come from a population with a mean of 149.5, or 150.5, or 149.9, or 148, or whatever.
• We cannot accept the null hypothesis.

So at least we don’t reject
• Is the pharmaceutical company a little happier?
• Maybe, but the WW’s aren’t.
• Could the WW’s be correct and the new sample just didn’t show it?
• Yes, this is possible.
• Called a Type II error.

Type II Error
• We fail to reject a null hypothesis on statistical grounds.
• But that greater being in heaven knows that we are incorrect. There is really a difference, even if small.
• And what can the WW’s do?
• They might take an even larger sample and try to find a difference.

The two types of errors
• Type I: we rejected when the reality is that there is no difference.
• We can evaluate the chance of this error. It is equal to alpha.
• Type II: we did not reject when the reality is that there is a difference.
• We cannot easily evaluate the chance of a Type II error.
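A quick numerical recheck of the two weight samples discussed above, as a rough sketch rather than the lecture's own method. It assumes Python 3.8+ (`statistics.NormalDist`), reads the stated "variance of 20" as σ = √20 as the Zcalc formula does, and uses a hypothetical helper named `two_sided_z_test`.

```python
from math import sqrt
from statistics import NormalDist

STD_NORMAL = NormalDist()

MU0 = 150.0          # accepted population mean weight (lb)
SIGMA = sqrt(20.0)   # assumption: "variance of 20" means sigma = sqrt(20)
ALPHA = 0.05


def two_sided_z_test(sample_mean, n, mu0=MU0, sigma=SIGMA, alpha=ALPHA):
    """Return (z_calc, z_crit, p_value, reject) for a two-sided Z test."""
    z_calc = (sample_mean - mu0) / (sigma / sqrt(n))
    z_crit = STD_NORMAL.inv_cdf(1 - alpha / 2)
    # Two-sided p: area beyond |z_calc| in both tails.
    p_value = 2 * (1 - STD_NORMAL.cdf(abs(z_calc)))
    return z_calc, z_crit, p_value, abs(z_calc) > z_crit


samples = [("20 former members", 147.0, 20),
           ("100 former members", 149.5, 100)]

for label, mean, n in samples:
    z_calc, z_crit, p, reject = two_sided_z_test(mean, n)
    verdict = "reject H0" if reject else "not enough evidence to reject H0"
    print(f"{label}: Z = {z_calc:.2f}, Zcrit = {z_crit:.2f}, p = {p:.4f} -> {verdict}")
```

The first sample lands in the rejection zone and the second does not; for the second, the code reports only a failure to reject, never an acceptance of H0.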

The Power of a Test
• Whenever we do not reject the null hypothesis, there is a chance of a Type II Error. The chance of that error is called β.
• In this course we will not learn how to evaluate the chance of the error.
• The probability of NOT making a Type II Error, 1 − β, is called the Power of a Test.
• The Power of a Test can be increased.

Two Ways to Decrease β, i.e. Increase the Power
• One way is to make the rejection zone larger.
• But then we are increasing the chance of a Type I Error.
• The other way is to increase the sample size.
• We can calculate a value of n for predetermined values of α and β.
• We won’t cover this.

Practice Problem
• Chapter 10 # 12
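Although the class does not cover how to compute β, the two ways of increasing power listed above can be illustrated with a small sketch. Everything here is an assumption made for illustration: the true mean of 149.5 is hypothetical, σ = √20 is borrowed from the weight example, and `power_two_sided` is a made-up helper built on Python's `statistics.NormalDist`.

```python
from math import sqrt
from statistics import NormalDist

STD_NORMAL = NormalDist()


def power_two_sided(mu0, true_mean, sigma, n, alpha):
    """Power (1 - beta) of a two-sided Z test when the real mean is true_mean."""
    z_crit = STD_NORMAL.inv_cdf(1 - alpha / 2)
    # Where Zcalc is centered if the population mean is really true_mean.
    shift = (true_mean - mu0) / (sigma / sqrt(n))
    lower_tail = STD_NORMAL.cdf(-z_crit - shift)      # P(Zcalc < -z_crit)
    upper_tail = 1 - STD_NORMAL.cdf(z_crit - shift)   # P(Zcalc > +z_crit)
    return lower_tail + upper_tail


MU0 = 150.0
SIGMA = sqrt(20.0)   # assumption: variance of 20, as in the weight example
TRUE_MEAN = 149.5    # hypothetical real mean, not a value from the lecture

base = power_two_sided(MU0, TRUE_MEAN, SIGMA, n=100, alpha=0.05)
wider_zone = power_two_sided(MU0, TRUE_MEAN, SIGMA, n=100, alpha=0.10)
bigger_sample = power_two_sided(MU0, TRUE_MEAN, SIGMA, n=1000, alpha=0.05)

print(f"n=100,  alpha=0.05: power = {base:.3f}")
print(f"n=100,  alpha=0.10: power = {wider_zone:.3f}  (larger rejection zone)")
print(f"n=1000, alpha=0.05: power = {bigger_sample:.3f}  (larger sample)")
```

Raising α buys power at the cost of more Type I errors, while raising n buys power without changing α, which is why a larger sample is usually the preferred route.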