Transcript Slide 1
Statistics for Business and Economics
Module 2: Regression and time series analysis, Spring 2010
Lecture 2: Chi-squared tests; goodness-of-fit & independence
Priyantha Wijayatunga, Department of Statistics, Umeå University, [email protected]
These materials are adapted from copyrighted lecture slides (© 2009 W.H. Freeman and Company) from the homepage of the book The Practice of Business Statistics: Using Data for Decisions, Second Edition, by Moore, McCabe, Duckworth and Alwan.

Goodness-of-fit test and analysis of two-way contingency tables
Reference to the book: Chapters 9.1, 9.2 and 2.5
– Summarizing different data types
– Testing goodness-of-fit of models for multinomial observations
– Chi-squared distribution, chi-squared test and p-values
– Two-way contingency tables and describing relationships in two-way tables
– The hypothesis: no association (or independence)
– Conditional distributions and marginal distributions
– Chi-square test vs. z-test

Techniques to summarize data
1. One variable – univariate methods
2. Two variables – bivariate methods
Graphical displays: two interval variables – scatterplot; two categorical variables – clustered bar chart; more than two variables – graphical displays are hard.
Observations can be taken
1. At the same time – cross-sectional data (e.g., market surveys: brand preferences of 100 people, etc.)
2. At successive times repeatedly – time series data (e.g., the price of a certain stock over the last 5 years)
Note: succession can be in space too, but we omit such discussions.

Describing the relationship between two nominal/ordinal variables
A contingency / cross-classification / cross-tabulation table is used to describe (two or more) nominal variables.
Ex: Are profession and newspaper reading habits related? A sample of 354 people were asked about their professions and newspaper preferences.

Raw data (first and last records):
Person   Occupation     Newspaper
1        White-collar   Post
2        White-collar   Sun
3        Professional   Sun
...
354      Blue-collar    Mail

Cross-tabulation (counts):
Newspaper    WC     BC     Pro    Total
Globe        27     29     33      89
Mail         18     43     51     112
Post         38     21     22      81
Sun          37     15     20      72
Total       120    108    126     354

Relative conditional frequencies (within each occupation):
Newspaper    WC              BC              Pro
Globe        27/120 = 0.23   29/108 = 0.27   33/126 = 0.26
Mail         18/120 = 0.15   43/108 = 0.40   51/126 = 0.40
Post         38/120 = 0.32   21/108 = 0.19   22/126 = 0.17
Sun          37/120 = 0.31   15/108 = 0.14   20/126 = 0.16
Total        120 (1)         108 (1)         126 (1)

Time series data
Observations are repeated at successive times.
Ex: Total amount of tax collected (in billions of US$) from 1993 to 2002 in the USA.
Year   1993   1994   1995   1996   1997   1998   1999   2000   2001   2002
Tax     594    625    686    755    848    940   1032   1137   1178   1038
(Line plot of tax in US$ billions against year.)

Binary and multinomial observations
1. Binomial experiment: a nominal variable has two outcomes.
   Eg: Do the majority of people like the new economic policies or not?
2. Multinomial experiment: for a nominal variable that has three or more outcomes, we test more than two proportions.
   Eg: Do people have equal preferences among five brands of tea?
Note: multinomial cases can sometimes be reduced to the binomial case!

Multinomial experiment: example
100 persons took part in a survey about different brands of coffee, say Ellips, Gexus, Luber and Loflia. Each person tasted these four kinds of coffee (in a blind test) and noted which one they liked best.
The result of the test is as follows:
Brand                                 Ellips   Gexus   Luber   Loflia
Observed number of persons (frequency)   26      28      16       30

General question of interest: Does the result of the survey show that any of the brands is more popular than the others, or are they all equally popular? In statistical terms we can formulate the problem as:
Null hypothesis: all the coffee brands are equally popular.
Alternative hypothesis: the coffee brands are not all equally popular.
If the null hypothesis is true, we would expect the following result of the survey:
Brand                                 Ellips   Gexus   Luber   Loflia
Expected number of persons (frequency)    25      25      25       25

With a significance level of 5%, can we say anything about whether the null hypothesis is true or not?
One way of measuring how much the observed table differs from the expected table is to look at the squared differences:
(26 − 25)², (28 − 25)², (16 − 25)², (30 − 25)²
However, there is a problem: the same absolute difference of 10 is relatively much larger between 10 and 20 than between 10000 and 10010. How can we take this into account? Divide by the expected value and formulate a test statistic (the chi-squared statistic):
χ²_obs = (26 − 25)²/25 + (28 − 25)²/25 + (16 − 25)²/25 + (30 − 25)²/25 = 4.64
If the null hypothesis is true, χ²_obs ought to be close to zero. Is 4.64 so far from zero that we can reject the null hypothesis? What is the sampling distribution of χ²_obs if the null hypothesis is true?

Chi-squared statistic
The chi-squared statistic involves two things:
1. A continuous distribution: the χ²-distribution.
2. A statistical test where the sampling distribution of the test statistic χ²_obs is χ²-distributed.
Brand                       Ellips   Gexus   Luber   Loflia
Observed frequency (fo)        26      28      16       30
Expected frequency (fe)        25      25      25       25
χ²_obs = Σ (fo − fe)²/fe ~ χ²_{n−1}   (n = number of categories)
χ²_obs = (26 − 25)²/25 + (28 − 25)²/25 + (16 − 25)²/25 + (30 − 25)²/25 = 4.64

Chi-squared distribution
The χ² distribution is a parametric distribution with a parameter v called the degrees of freedom. The distribution looks different for different degrees of freedom: the larger v is, the more symmetric the distribution and the larger its expected value and standard deviation.
(Plot: density of the chi-squared distribution with df = 5, with the right-tail area P(χ² > c) = 0.05 beyond the critical value c marked.)
Eg: looking up the chi-squared table with df = 6: if χ² = 16.1, the p-value is between 0.01 and 0.02.
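As a minimal sketch of these computations (my addition, not part of the original slides), the chi-squared statistic for the coffee data and the tail-probability lookups can be reproduced in Python with NumPy and SciPy; the observed counts come from the slide above.

```python
import numpy as np
from scipy.stats import chi2

# Observed coffee-brand counts (Ellips, Gexus, Luber, Loflia) and equal expected counts under H0
observed = np.array([26, 28, 16, 30])
expected = np.full(4, observed.sum() / 4)          # 25 in each category if all brands are equally popular

# Chi-squared statistic: sum of (fo - fe)^2 / fe
chi2_obs = ((observed - expected) ** 2 / expected).sum()
print(chi2_obs)                                    # 4.64

print(chi2.sf(chi2_obs, df=3))                     # p-value for the coffee data, about 0.20
print(chi2.sf(16.1, df=6))                         # the table-lookup example: about 0.013, between 0.01 and 0.02
```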
Chi-squared tail probabilities for critical values (each entry is the value with right-tail probability p under the χ² distribution with the given df):

df      0.25    0.2     0.15    0.1     0.05    0.025   0.02    0.01    0.005   0.0025  0.001   0.0005
1       1.32    1.64    2.07    2.71    3.84    5.02    5.41    6.63    7.88    9.14    10.83   12.12
2       2.77    3.22    3.79    4.61    5.99    7.38    7.82    9.21    10.60   11.98   13.82   15.20
3       4.11    4.64    5.32    6.25    7.81    9.35    9.84    11.34   12.84   14.32   16.27   17.73
4       5.39    5.99    6.74    7.78    9.49    11.14   11.67   13.28   14.86   16.42   18.47   20.00
5       6.63    7.29    8.12    9.24    11.07   12.83   13.39   15.09   16.75   18.39   20.51   22.11
6       7.84    8.56    9.45    10.64   12.59   14.45   15.03   16.81   18.55   20.25   22.46   24.10
7       9.04    9.80    10.75   12.02   14.07   16.01   16.62   18.48   20.28   22.04   24.32   26.02
8       10.22   11.03   12.03   13.36   15.51   17.53   18.17   20.09   21.95   23.77   26.12   27.87
9       11.39   12.24   13.29   14.68   16.92   19.02   19.68   21.67   23.59   25.46   27.88   29.67
10      12.55   13.44   14.53   15.99   18.31   20.48   21.16   23.21   25.19   27.11   29.59   31.42
11      13.70   14.63   15.77   17.28   19.68   21.92   22.62   24.72   26.76   28.73   31.26   33.14
12      14.85   15.81   16.99   18.55   21.03   23.34   24.05   26.22   28.30   30.32   32.91   34.82
13      15.98   16.98   18.20   19.81   22.36   24.74   25.47   27.69   29.82   31.88   34.53   36.48
14      17.12   18.15   19.41   21.06   23.68   26.12   26.87   29.14   31.32   33.43   36.12   38.11
15      18.25   19.31   20.60   22.31   25.00   27.49   28.26   30.58   32.80   34.95   37.70   39.72
16      19.37   20.47   21.79   23.54   26.30   28.85   29.63   32.00   34.27   36.46   39.25   41.31
17      20.49   21.61   22.98   24.77   27.59   30.19   31.00   33.41   35.72   37.95   40.79   42.88
18      21.60   22.76   24.16   25.99   28.87   31.53   32.35   34.81   37.16   39.42   42.31   44.43
19      22.72   23.90   25.33   27.20   30.14   32.85   33.69   36.19   38.58   40.88   43.82   45.97
20      23.83   25.04   26.50   28.41   31.41   34.17   35.02   37.57   40.00   42.34   45.31   47.50
21      24.93   26.17   27.66   29.62   32.67   35.48   36.34   38.93   41.40   43.78   46.80   49.01
22      26.04   27.30   28.82   30.81   33.92   36.78   37.66   40.29   42.80   45.20   48.27   50.51
23      27.14   28.43   29.98   32.01   35.17   38.08   38.97   41.64   44.18   46.62   49.73   52.00
24      28.24   29.55   31.13   33.20   36.42   39.36   40.27   42.98   45.56   48.03   51.18   53.48
25      29.34   30.68   32.28   34.38   37.65   40.65   41.57   44.31   46.93   49.44   52.62   54.95
26      30.43   31.79   33.43   35.56   38.89   41.92   42.86   45.64   48.29   50.83   54.05   56.41
27      31.53   32.91   34.57   36.74   40.11   43.19   44.14   46.96   49.64   52.22   55.48   57.86
28      32.62   34.03   35.71   37.92   41.34   44.46   45.42   48.28   50.99   53.59   56.89   59.30
29      33.71   35.14   36.85   39.09   42.56   45.72   46.69   49.59   52.34   54.97   58.30   60.73
30      34.80   36.25   37.99   40.26   43.77   46.98   47.96   50.89   53.67   56.33   59.70   62.16
40      45.62   47.27   49.24   51.81   55.76   59.34   60.44   63.69   66.77   69.70   73.40   76.09
50      56.33   58.16   60.35   63.17   67.50   71.42   72.61   76.15   79.49   82.66   86.66   89.56
60      66.98   68.97   71.34   74.40   79.08   83.30   84.58   88.38   91.95   95.34   99.61   102.70
80      88.13   90.41   93.11   96.58   101.90  106.60  108.10  112.30  116.30  120.10  124.80  128.30
100     109.10  111.70  114.70  118.50  124.30  129.60  131.10  135.80  140.20  144.30  149.40  153.20

For our data: χ²_obs = 4.64, level of significance 0.05, and χ²_0.05;3 = 7.81. Since χ²_obs < χ²_0.05;3, we do not reject H0 at the 0.05 level of significance. (From the df = 3 row of the table, 4.64 is exactly the critical value for p = 0.20, so the p-value is about 0.20, well above 0.05.)

Chi-squared goodness-of-fit test
Used to test whether a variable with two or more possible categories has a specific distribution. (Do the observed frequencies in the different categories align with what we would expect from some theory?)
Steps:
1. Formulate the null and alternative hypotheses.
2. Compute the frequencies expected if the null hypothesis is true (expected counts).
3. Note the observed frequencies (how many are there in each category?).
4. Use the differences between the expected and the observed values to compute the value of the χ² statistic (called χ²_obs).
5. Compare your value with the critical value of χ²_{n−1}, or compare the p-value with your level of significance.
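A small Python sketch of these steps for the coffee data (my addition, using scipy.stats.chisquare; the counts come from the slides above):

```python
from scipy.stats import chi2, chisquare

observed = [26, 28, 16, 30]              # step 3: the observed coffee-brand counts
# steps 2 and 4: with no f_exp given, chisquare uses equal expected counts (25 each)
stat, p_value = chisquare(observed)
print(stat, p_value)                     # about 4.64 and 0.20

critical = chi2.ppf(0.95, df=3)          # step 5: critical value for alpha = 0.05, about 7.81
print(stat > critical)                   # False, so H0 is not rejected at the 5% level
```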
Example 2
A political analyst believes that 45%, 40% and 15% of the voters will vote for political parties A, B and C respectively in the forthcoming election. In order to test her belief, a statistician did a survey: 200 randomly selected voters were asked about their voting preference, and it was found that 102, 82 and 16 voters were going to vote for parties A, B and C respectively. Can the statistician infer at the 5% level of significance that the political analyst's belief is correct?
H0: pA = 0.45, pB = 0.40, pC = 0.15
Ha: at least one of the population proportions differs from the above
Political party             A      B      C    Total
Observed frequency (fo)   102     82     16      200
Expected frequency (fe)    90     80     30      200

Calculate the chi-squared statistic:
χ²_obs = Σ (fo − fe)²/fe = (102 − 90)²/90 + (82 − 80)²/80 + (16 − 30)²/30 = 8.18
This statistic follows a χ²_{3−1} distribution if H0 is true. Look up the tabulated value from the chi-squared distribution with 2 degrees of freedom and level of significance 0.05: χ²_0.05;2 = 5.99.
We reject H0 at the 0.05 level of significance since χ²_obs > χ²_0.05;2.
P-value: P(χ² ≥ 8.18) ≈ 0.017, i.e. between 0.01 and 0.02. One can do the testing with the p-value too.

Example 3
During the first 13 weeks of the season, the TV watchers on Saturday evenings were distributed as follows:
SVT1 28%   SVT2 25%   TV3 18%   TV4 29%
After a change in the TV programme presentation, a sample of 300 households was taken and the following numbers were observed:
SVT1 70 households   SVT2 89 households   TV3 46 households   TV4 95 households
Has the change in the TV programme presentation changed the pattern of TV watchers?
Channel                     SVT1        SVT2        TV3         TV4        Total
Observed frequency            70          89          46          95         300
Expected frequency      300×0.28    300×0.25    300×0.18    300×0.29        300
                            = 84        = 75        = 54        = 87
χ²_obs = 6.87,  χ²_0.05;3 = 7.81,  so χ²_obs < χ²_0.05;3.
Therefore we do not reject the null hypothesis at the 0.05 level of significance. That is, there is no evidence that the change in the programme presentation has affected people's TV watching habits.
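As a minimal sketch (my addition, not from the slides), Example 3 can be checked with scipy.stats.chisquare by passing the expected counts implied by the old viewing shares:

```python
import numpy as np
from scipy.stats import chisquare

observed = np.array([70, 89, 46, 95])                  # SVT1, SVT2, TV3, TV4 in the new sample
old_shares = np.array([0.28, 0.25, 0.18, 0.29])        # distribution before the change
expected = 300 * old_shares                            # 84, 75, 54, 87

stat, p_value = chisquare(observed, f_exp=expected)
print(round(stat, 2), round(p_value, 3))               # about 6.87 and 0.076 > 0.05: do not reject H0
```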
Two-way contingency tables
An experiment has a two-way design if two categorical factors are studied with several levels of each factor. Two-way tables organize data about two categorical variables.
Example: we call Education the row variable and Age group the column variable. Each combination of values for these two variables is called a cell.

Describing relations
The cells of a two-way table represent the intersection of a given level of one categorical factor with a given level of the other categorical factor. We can also compute each count as a percent of the column total. These percents should add up to 100% and together are the conditional distributions of education level given age group. Here the percents are calculated by age range (columns).

Hypothesis: no association
Again, we want to know if the differences in sample proportions are likely to have occurred just by chance, because of random sampling. We use the chi-square (χ²) test to assess the null hypothesis of no relationship between the two categorical variables of a two-way table.
H0: there is no relationship between these two categorical variables.
Are the conditional probability distributions the same (or very close)?

Expected counts in two-way tables
H0: there is no relationship between these two categorical variables.
Ha: there is a relationship between these two categorical variables.
To test this hypothesis, we compare the actual counts from the sample data with the expected counts computed under the null hypothesis of no relationship (i.e., assuming the null hypothesis is true). The expected count in any cell of a two-way table when H0 is true is
expected count = (row total × column total) / table total.

The chi-square test
The chi-square statistic (χ²) is a measure of how much the observed cell counts in a two-way table diverge from the expected cell counts. The formula for the χ² statistic, summed over all r × c cells in the table, is
χ² = Σ (observed count − expected count)² / expected count.
Large values of χ² represent strong deviations from the distribution expected under H0, providing evidence against H0. However, since χ² is a sum, how large a χ² is required for statistical significance depends on the number of comparisons made. If H0 is true, the chi-square statistic has approximately a χ² distribution with (r − 1)(c − 1) degrees of freedom. The P-value for the chi-square test is the area to the right of the observed statistic under the χ² distribution with df = (r − 1)(c − 1): P(χ² ≥ X²).

Example 1
In order to see whether people's political beliefs and gender are associated, a survey was conducted on 2771 randomly selected people and the findings were recorded as follows (within brackets: the conditional probability of political belief given gender):
Gender     Democratic     Independent    Republican     Total
Female     573 (0.38)     516 (0.34)     422 (0.28)     1511
Male       386 (0.31)     475 (0.38)     399 (0.32)     1260
Total      959            991            821            2771

H0: "political belief" and "gender" are independent. Ha: they are dependent. Level of significance = 0.05.
Under H0 the expected frequencies are fe = (column total) × (row total) / (grand total); they are shown within brackets below:
Gender     Democratic     Independent    Republican     Total
Female     573 (522.9)    516 (540.4)    422 (447.7)    1511
Male       386 (436.1)    475 (450.6)    399 (373.3)    1260
Total      959            991            821            2771

χ²_obs = Σ (fo − fe)²/fe = (573 − 522.9)²/522.9 + (516 − 540.4)²/540.4 + ... + (399 − 373.3)²/373.3 = 16.22
df = (r − 1)(c − 1) = (2 − 1)(3 − 1) = 2,  χ²_0.05;2 = 5.99
We reject H0 since χ²_obs > χ²_0.05;2, and the p-value is less than 0.0005. There is strong evidence for a dependency.
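For a two-way table like this, SciPy's chi2_contingency carries out the whole computation (expected counts, statistic, degrees of freedom and p-value). A sketch for Example 1 (my addition; the counts are taken from the slide above):

```python
import numpy as np
from scipy.stats import chi2_contingency

# Rows: Female, Male; columns: Democratic, Independent, Republican
observed = np.array([[573, 516, 422],
                     [386, 475, 399]])

chi2_stat, p_value, dof, expected = chi2_contingency(observed)
print(round(chi2_stat, 2), dof)    # about 16.2 with df = 2
print(p_value)                     # about 0.0003, i.e. below 0.0005
print(expected.round(1))           # matches the bracketed expected counts above
```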
Cocaine addiction
Cocaine produces short-term feelings of physical and mental well-being. To maintain the effect, the drug may have to be taken more frequently and at higher doses. After stopping use, users feel tired, sleepy and depressed. The pleasurable high followed by unpleasant after-effects encourages repeated compulsive use, which can easily lead to dependency.
Desipramine is an antidepressant affecting the brain chemicals that may become unbalanced and cause depression. It was therefore tested for recovery from cocaine addiction. Treatment with desipramine was compared to a standard treatment (lithium, with strong anti-manic effects) and a placebo.
Is there a relationship between treatment (desipramine, lithium, placebo) and outcome (relapse or not)?

Do we have the same percentages of relapse for each treatment? If not, there should be some relation between the two variables. Expected relapse counts:
                 No relapse          Relapse
Desipramine      25×26/74 ≈ 8.78     25×48/74 ≈ 16.22
Lithium          9.14                16.86
Placebo          8.08                14.92

Table of counts, "actual & expected", with three rows and two columns:
                 No relapse       Relapse
Desipramine      15  (8.78)       10  (16.22)
Lithium           7  (9.14)       19  (16.86)
Placebo           4  (8.08)       19  (14.92)
df = (3 − 1) × (2 − 1) = 2
χ² = (15 − 8.78)²/8.78 + (10 − 16.22)²/16.22 + (7 − 9.14)²/9.14 + (19 − 16.86)²/16.86 + (4 − 8.08)²/8.08 + (19 − 14.92)²/14.92 = 10.74
χ² components: 4.41, 2.39, 0.50, 0.27, 2.06, 1.12

Cocaine addiction: H0: there is no relationship between treatment (desipramine, lithium, placebo) and outcome (relapse or not).
From the χ² table with df = 2: the statistic (about 10.7) lies between 10.60 (p = 0.005) and 11.98 (p = 0.0025), so 0.0025 < p-value < 0.005, and we reject H0.
Minitab statistical software output for the cocaine study gives a p-value of 0.005, or half a percent. This is very significant. We reject the null hypothesis of no association and conclude that there is a significant relationship between treatment (desipramine, lithium, placebo) and outcome (relapse or not).

Marginal distributions
We can look at each categorical variable separately in a two-way table by studying the row totals and the column totals. They represent the marginal distributions, expressed in counts or percentages. (They are called marginal because they are written as if in a margin.)
2000 U.S. census: the marginal distributions can then be displayed on separate bar graphs, typically expressed as percents instead of raw counts. Each graph represents only one of the two variables, completely ignoring the second one. This gives the marginal distribution of education; similarly, the column totals give the marginal distribution of age.

Conditional distributions
The calculated percents within a two-way table represent the conditional distributions, describing the "relationship" between the two variables. For every two-way table, there are two sets of possible conditional distributions (column percents or row percents). For column percents, divide each cell count by the column total. The sum of the percents in each column should be 100, except for possible small round-off errors. When one variable is clearly explanatory, it makes sense to describe the relationship by comparing the conditional distributions of the response variable for each value (level) of the explanatory variable.

Conditional distribution
In the table below, the 25 to 34 age group occupies the first column. To find the complete distribution of education in this age group, look only at that column. Compute each count as a percent of the column total.
These percents should add up to 100% because all persons in this age group fall in one of the education categories. These four percents together are the conditional distribution of education given the 25 to 34 age group. (2000 U.S. census data.)

Conditional distributions
The percents within the table represent the conditional distributions. Comparing the conditional distributions allows you to describe the "relationship" between the two categorical variables. Here the percents are calculated by age range (columns):
29.30% = 11071 / 37785 = cell total / column total.
The conditional distributions can be compared graphically using side-by-side bar graphs of one variable for each value of the other variable.

Music and wine purchase decision
What is the relationship between the type of music played in supermarkets and the type of wine purchased? We want to compare the conditional distributions of the response variable (wine purchased) for each value of the explanatory variable (music played). Therefore, we calculate column percents.
Calculations: when no music was played, 84 bottles of wine were sold; of these, 30 were French wine.
30 / 84 = 0.357 = cell total / column total, so 35.7% of the wine sold was French when no music was played.
We calculate the column conditional percents similarly for each of the nine cells in the table. For every two-way table, there are two sets of possible conditional distributions. Does background music in supermarkets influence customer purchasing decisions?
– Wine purchased for each kind of music played (column percents)
– Music played for each kind of wine purchased (row percents)

Computing expected counts
When testing the null hypothesis that there is no relationship between the two categorical variables of a two-way table, we compare the actual counts from the sample data with the expected counts given H0. The expected count in any cell of a two-way table when H0 is true is (row total × column total) / table total. Although in real life counts must be whole numbers, an expected count need not be: the expected count is the mean over many repetitions of the study, assuming no relationship.

Music and wine purchase decision
The null hypothesis is that there is no relationship between music and wine sales; the alternative is that these two variables are related. What is the expected count in the upper-left cell of the two-way table under H0?
Column total 84: number of bottles sold without music
Row total 99: number of bottles of French wine sold
Table total 243: all bottles sold during the study
This expected cell count is thus (84 × 99) / 243 = 34.222. Nine similar calculations produce the table of expected counts.
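A tiny sketch of this expected-count rule (my addition; only the three totals above come from the slides, and the helper function name is hypothetical):

```python
def expected_count(row_total, col_total, table_total):
    """Expected cell count under H0 of no association: (row total x column total) / table total."""
    return row_total * col_total / table_total

# Upper-left cell: French wine (row total 99) under no music (column total 84), 243 bottles in all
print(expected_count(99, 84, 243))   # 34.222..., matching the slide
```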
Computing the chi-square statistic
The chi-square statistic (χ²) is a measure of how much the observed cell counts in a two-way table diverge from the expected cell counts. The formula for the χ² statistic, summed over all r × c cells in the table, is
χ² = Σ (observed count − expected count)² / expected count.
Tip: first calculate the χ² components, (observed − expected)²/expected, for each cell of the table, and then sum them up to arrive at the χ² statistic.

Music and wine purchase decision
H0: no relationship between music and wine. Ha: music and wine are related.
From the observed and expected counts we calculate nine χ² components and sum them to produce the χ² statistic. We find that the χ² statistic under H0 is 18.28.
The two-way table has a 3×3 design (3 levels of music and 3 levels of wine). Thus, the degrees of freedom for the χ² distribution for this test is (r − 1)(c − 1) = (3 − 1)(3 − 1) = 4.
From the χ² table with df = 4: 16.42 < χ² = 18.28 < 18.47, so 0.001 < p-value < 0.0025, which is very significant. There is a significant relationship between the type of music played and wine purchases in supermarkets.

Interpreting the χ² output
The values summed to make up χ² are called the χ² components. When the test is statistically significant, the largest components point to the conditions most different from the expectations based on H0.
Music and wine purchase decision: the nine χ² components are 0.5209, 2.3337, 0.5209, 0.0075, 7.6724, 6.4038, 0.3971, 0.0004, 0.4223. Two components (7.6724 and 6.4038) contribute most to the χ² total: the largest effect is for sales of Italian wine, which are strongly affected by Italian and French music. Actual proportions show that Italian music helps sales of Italian wine, but French music hinders it.

When is it safe to use a χ² test?
We can safely use the chi-square test when:
– the samples are simple random samples (SRS);
– all individual expected counts are 1 or more (≥ 1);
– no more than 20% of the expected counts are less than 5.
For a 2×2 table, this implies that all four expected counts should be 5 or more.

Chi-square test vs. z-test for two proportions
When comparing only two proportions, such as in a 2×2 table where the columns represent counts of "success" and "failure", we can test H0: p1 = p2 vs. Ha: p1 ≠ p2 equally well with a two-sided z test or with a chi-square test with 1 degree of freedom, and we get the same p-value. In fact, the two test statistics are related: X² = z².

Successful firms
Franchise businesses are sometimes given an exclusive territory by contract. This means that the new outlet will not have to compete with other outlets of the same chain within its own territory. How does the presence of an exclusive-territory clause in the contract relate to the survival of the business? A random sample of 170 new franchises recorded two categorical variables for each firm: (1) whether the firm was successful or not (based on economic criteria) and (2) whether or not the firm had an exclusive-territory contract. This is a 2×2 table (two levels for success, yes/no; two levels for exclusive territory, yes/no), so df = (2 − 1)(2 − 1) = 1.
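The X² = z² identity can be checked numerically. This sketch is my addition and uses made-up counts, since the actual successful-firms counts are not reproduced in these slides; the numbers are purely illustrative:

```python
import numpy as np
from scipy.stats import chi2_contingency, norm

# Hypothetical 2x2 table: rows = exclusive territory (yes/no), columns = success (yes/no)
table = np.array([[108, 34],    # made-up counts, NOT the real study data
                  [15, 13]])

# Chi-square test without Yates continuity correction (the correction would break the exact identity)
chi2_stat, p_chi2, dof, _ = chi2_contingency(table, correction=False)

# Two-proportion z test for P(success | exclusive) vs P(success | not exclusive), pooled standard error
n1, n2 = table[0].sum(), table[1].sum()
p1, p2 = table[0, 0] / n1, table[1, 0] / n2
p_pool = table[:, 0].sum() / (n1 + n2)
z = (p1 - p2) / np.sqrt(p_pool * (1 - p_pool) * (1 / n1 + 1 / n2))
p_z = 2 * norm.sf(abs(z))

print(round(chi2_stat, 4), round(z ** 2, 4))   # identical up to rounding
print(round(p_chi2, 4), round(p_z, 4))         # identical two-sided p-values
```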
Successful firms
To compare firms that have an exclusive territory with those that do not, we start by examining column percents (the conditional distribution). The difference between the percent of successes among the two types of firms is quite large. The chi-square test can tell us whether or not these differences can plausibly be attributed to chance (random sampling). Specifically, we will test
H0: no relationship between the exclusive clause and success
Ha: there is some relationship between the two variables
The chi-square output from Minitab shows that the p-value is significant at α = 5% (p = 1.5%), so we reject H0: we have found a significant relationship between an exclusive territory and the success of a franchised firm. (Computer output using CrunchIt! gives the same result.)

Computations for two-way tables
When analyzing relationships between two categorical variables, follow this procedure:
1. Calculate descriptive statistics that convey the important information in the table (usually column or row percents).
2. Find the expected counts and use them to compute the X² statistic.
3. Compare your X² statistic to the chi-square critical values from Table F to find the approximate P-value for your test.
4. Draw a conclusion about the association between the row and column variables.

Comparing several populations
Select independent SRSs from each of c populations, of sizes n1, n2, ..., nc. Classify each individual in a sample according to a categorical response variable with r possible values. There are c different probability distributions, one for each population. The null hypothesis is that the distributions of the response variable are the same in all c populations. The alternative hypothesis says that these c distributions are not all the same.

Cocaine addiction
Cocaine produces short-term feelings of physical and mental well-being. To maintain the effect, the drug may have to be taken more frequently and at higher doses. After stopping use, users feel tired, sleepy and depressed. The pleasurable high followed by unpleasant after-effects encourages repeated compulsive use, which can easily lead to dependency. We compare treatment with an antidepressant (desipramine), a standard treatment (lithium) and a placebo.
Population 1: antidepressant treatment (desipramine)
Population 2: standard treatment (lithium)
Population 3: placebo ("sugar pill")

H0: the proportions of success (no relapse) are the same in all three populations.
Observed proportions of no relapse: desipramine 15/25 = 0.60, lithium 7/26 = 0.27, placebo 4/23 = 0.17.
Under H0 the common success proportion is estimated as 26/74 ≈ 35% for every population, so the expected counts are:
                 No relapse             Relapse
Desipramine      25 × 0.35 ≈ 8.78       25 × 0.65 ≈ 16.22
Lithium          26 × 0.35 ≈ 9.14       26 × 0.65 ≈ 16.86
Placebo          23 × 0.35 ≈ 8.08       23 × 0.65 ≈ 14.92

Table of counts, "actual & expected", with three rows and two columns:
                 No relapse       Relapse
Desipramine      15  (8.78)       10  (16.22)
Lithium           7  (9.14)       19  (16.86)
Placebo           4  (8.08)       19  (14.92)
df = (3 − 1) × (2 − 1) = 2
χ² = (15 − 8.78)²/8.78 + (10 − 16.22)²/16.22 + (7 − 9.14)²/9.14 + (19 − 16.86)²/16.86 + (4 − 8.08)²/8.08 + (19 − 14.92)²/14.92 = 10.74
χ² components: 4.41, 2.39, 0.50, 0.27, 2.06, 1.12

Cocaine addiction: H0: the proportions of success (no relapse) are the same in all three populations.
From the χ² table with df = 2: the statistic (about 10.7) lies between 10.60 (p = 0.005) and 11.98 (p = 0.0025), so 0.0025 < p-value < 0.005, and we reject H0.
The proportions of success are not the same in all three populations (desipramine, lithium, placebo): desipramine is a more successful treatment.
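To close, a short sketch (my addition) verifying the cocaine computation with scipy.stats.chi2_contingency; the observed counts are the ones in the table above:

```python
import numpy as np
from scipy.stats import chi2_contingency

# Rows: Desipramine, Lithium, Placebo; columns: no relapse, relapse
observed = np.array([[15, 10],
                     [7, 19],
                     [4, 19]])

chi2_stat, p_value, dof, expected = chi2_contingency(observed)
print(round(chi2_stat, 2), dof)     # about 10.73 with df = 2 (the slide's 10.74 uses rounded expected counts)
print(round(p_value, 4))            # about 0.005: reject H0 at the 5% level
print(expected.round(2))            # 8.78/16.22, 9.14/16.86, 8.08/14.92, as in the table above
```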