Transcript Slide 1

Statistics for Business and
Economics
Module 2: Regression and time series analysis
Spring 2010
Lecture 2: Chi–squared tests; goodness–of–fit & independence
Priyantha Wijayatunga, Department of Statistics, Umeå University
[email protected]
These materials are adapted from copyrighted lecture slides (© 2009 W.H.
Freeman and Company) from the homepage of the book:
The Practice of Business Statistics Using Data for Decisions, Second Edition,
by Moore, McCabe, Duckworth and Alwan.
Goodness–of–fit test and analysis of
two–way contingency tables
Reference to the book: Chapter 9.1, 9.2 and 2.5

Summarizing different data types

Testing goodness–of–fit of models for multinomial observations

Chi–squared distribution and Chi–squared test and p–values

Two-way contingency tables and describing relationships in two-way tables

The hypothesis: no association (or independence)

Conditional distributions and marginal distributions

Chi-square test vs. z-test
Techniques to summarize data
1. One variable – univariate methods
2. Two variables – bivariate methods
Graphical displays
Two interval variables – scatterplot
Two categorical variables – clustered bar chart
More than two variables – graphical displays are hard
Observations can be taken
1. At the same time –cross sectional data
Market surveys: e.g. brand preferences of 100 people, etc.
2. At successive times repeatedly –time series data
Price of a certain stock over the last 5 years
Note: succession can be in space too. But we omit such discussions
Describing Relationship between Two
Nominal/Ordinal Variables
A contingency / cross–classification / cross–tabulation table is
used to describe (two or more) nominal variables
Ex: Are profession and newspaper reading habits
related? A sample of people are asked about their
professions and newspaper preferences
Person   Occupation     Newspaper
1        White-collar   Post
2        White-collar   Sun
3        Professional   Sun
.        .              .
.        .              .

Newsp    Occ: WC   BC    Pro   Total
Globe    27        29    33    89
Mail     18        43    51    112
Post     38        21    22    81
Sun      37        15    20    72
Total    120       108   126   354
Relative conditional frequencies

            Occupation
Newspaper   WC            BC            Pro
Globe       27/120=0.23   29/108=0.27   33/126=0.26
Mail        18/120=0.15   43/108=0.40   51/126=0.40
Post        38/120=0.32   21/108=0.19   22/126=0.17
Sun         37/120=0.31   15/108=0.14   20/126=0.16
Total       120 (1)       108 (1)       126 (1)
Time series data
Observations are repeated at successive times
Ex: Total amount of tax collected (in billions, US$) from
year 1993 to 2002 in the USA.

Year  1993  1994  1995  1996  1997  1998  1999  2000  2001  2002
Tax    594   625   686   755   848   940  1032  1137  1178  1038

[Figure: time series plot of tax (US$ billions) against year]
Binary and multinary observations
1. Binomial Experiment: A nominal variable has two
outcomes
Eg: Do the majority of people like new economic policies or not?
2. Multinomial Experiment: For a nominal variable that
has three or more outcomes, we test more than two
proportions
Eg: Do the people have equal preferences on five brands of tea?
Note: Multinomial cases can sometimes be reduced to the binomial case!
Multinary experiment: example
100 persons took part in a survey about different brands
of coffee, say Ellips, Gexus, Luber and Loflia. Each of the
persons tasted these four different kinds of coffee (in a
blind test), and noted which one they liked the best. The
result of the test is as follows:
Brand                       Ellips   Gexus   Luber   Loflia
Observed number of
persons/occurrences         26       28      16      30
(frequency)
General question of interest
Does the result of the survey show that any of the brands
are more popular than the others, or are they all equal?
In statistical terms we can formulate the problem as:
Null hypothesis: All the coffee brands are equally popular.
Alternative hypothesis: Not all the coffee brands are
equally popular.
If the null hypothesis is true, we could expect the following result of the
survey:
Brand                 Ellips   Gexus   Luber   Loflia
Expected number of
persons (frequency)   25       25      25      25
With a significance level of 5%, can we say anything about whether the
null hypothesis is true or not?
One way of measuring how much the observed table differs from the
expected table is to look at the squared differences:
Squared difference 26  25  28 25  16  25  30  25
2
2
2
2
However, there is a problem with the fact that the difference between
10 and 20 is relatively larger than the difference between 10000 and
10010.
How can we take this into account?
Divide with the expected value and formulate a test statistic (the chi–
squared statistic):

χ²obs = (26 − 25)²/25 + (28 − 25)²/25 + (16 − 25)²/25 + (30 − 25)²/25 = 4.64
If the null hypothesis is true, χ²obs ought to be close to zero. Is 4.64
so far away from zero that we can reject the null hypothesis?
What is the sampling distribution of χ²obs if the null hypothesis is true?
Chi-squared statistic
The chi-squared statistic involves two things:
1. A continuous distribution: the χ²-distribution
2. A statistical test where the sampling distribution of the test
statistic χ²obs is χ²-distributed.

Brand                     Ellips   Gexus   Luber   Loflia
Observed frequency (fo)   26       28      16      30
Expected frequency (fe)   25       25      25      25

χ²obs = Σ (fo − fe)² / fe ~ χ²(n−1)

χ²obs = (26 − 25)²/25 + (28 − 25)²/25 + (16 − 25)²/25 + (30 − 25)²/25 = 4.64
Chi-squared distribution
The distribution is a parametric distribution with a parameter v, which
is called the degrees of freedom.
The distribution looks different for different degrees of freedom.
The larger the v, the more symmetric the distribution, and the larger
the expected value and standard deviation.

[Figure: density of the chi-squared distribution with df = 5, marking the
critical value c for which P(χ² > c) = 0.05]
Eg: reading the chi–squared table
If χ² = 16.1 for df = 6, the p-value is between 0.01 and 0.02.

[Table: chi-squared tail probabilities for critical values; rows are
df = 1–30, 40, 50, 60, 80, 100; columns are p = 0.25, 0.2, 0.15, 0.1,
0.05, 0.025, 0.02, 0.01, 0.005, 0.0025, 0.001, 0.0005.
The df = 6 row reads: 7.84, 8.56, 9.45, 10.64, 12.59, 14.45, 15.03,
16.81, 18.55, 20.25, 22.46, 24.10; since 15.03 < 16.1 < 16.81, the tail
probability is between 0.02 and 0.01.]
For our data
χ²obs = 4.64
Level of significance α = 0.05
χ²(0.05; 3) = 7.81
χ²obs < χ²(0.05; 3)
We do not reject H0 at the level of significance 0.05
p–value = P(χ²₃ ≥ 4.64) ≈ 0.20 (4.64 is exactly the df = 3 table entry for p = 0.2)
Chi–squared Goodness–of–fit test
Used to test to see if a variable with two or more possible categories
has a specific distribution.
(Do the observed frequencies in different categories align with what
we can expect from some theory?)
Steps
 Formulate null and alternative hypotheses
 Compute the expected frequencies if the null hypothesis is true
(expected counts)
 Note the observed frequencies (how many are in each category?)
 Use the difference between the expected and the observed values
and compute the value of the χ²-statistic (called χ²obs)
 Compare your value with the critical value of χ²(n−1), or compare the
p-value with your level of significance.
Example 2
A political analyst believes that 45%, 40% and 15% of the voters
will vote for political parties A, B and C respectively in the forthcoming
election. In order to test her belief a statistician did a survey: 200
randomly selected voters were asked for their voting preference and
it was found that 102, 82 and 16 voters were going to vote for parties
A, B and C respectively. Can the statistician infer at the 5% level of
significance that the political analyst’s belief is correct?

H0: pA = 0.45, pB = 0.40, pC = 0.15
Ha: at least one of the population proportions is different from the above

Political party           A     B    C    Total
Observed frequency (fo)   102   82   16   200
Expected frequency (fe)   90    80   30   200
Example 2
Calculate the chi–squared statistic:

χ²obs = Σ (fo − fe)²/fe = (102 − 90)²/90 + (82 − 80)²/80 + (16 − 30)²/30 = 8.18

This statistic follows a χ²(3−1)-distribution if H0 is true.
Look at the tabulated value from the chi–squared distribution with
degrees of freedom 2 and level of significance 0.05.
It is χ²(0.05; 2) = 5.99
We reject H0 at level of significance 0.05 since χ²obs > χ²(0.05; 2)
P–value = P(χ²₂ ≥ 8.18) ≈ 0.017
One can do the testing with the p–value too
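The same calculation can be sketched in Python, using the counts and claimed proportions from the example:

```python
# Example 2: goodness-of-fit test of the analyst's claimed vote shares.
observed = [102, 82, 16]              # parties A, B, C (n = 200)
claimed = [0.45, 0.40, 0.15]          # H0 proportions
n = sum(observed)
expected = [p * n for p in claimed]   # [90.0, 80.0, 30.0]

chi2_obs = sum((fo - fe) ** 2 / fe for fo, fe in zip(observed, expected))
print(round(chi2_obs, 2))             # 8.18
# Compare with the critical value chi2(0.05; df=2) = 5.99 from the table:
print(chi2_obs > 5.99)                # True -> reject H0
```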
Example 3
For the first 13 weeks of the season, the TV watchers on Saturday evenings
were distributed as follows:

SVT1  28%    SVT2  25%
TV3   18%    TV4   29%

After a change of the TV program presentation, a sample of 300
households was taken and the following numbers were observed:

SVT1  70 households    SVT2  89 households
TV3   46 households    TV4   95 households

Has the change in the TV program presentation changed the pattern of
TV watchers?
Example 3

Channel              SVT1       SVT2       TV3        TV4        Total
Observed frequency   70         89         46         95         300
Expected frequency   300×0.28   300×0.25   300×0.18   300×0.29   300
                     =84        =75        =54        =87

χ²obs = 6.87
χ²(0.05; 3) = 7.81
χ²obs < χ²(0.05; 3)
Therefore we do not reject the null hypothesis at the level of significance
0.05.
That is, there is no evidence that the change in the program presentation
has affected the TV watching habits of the people.
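The expected counts and χ²obs for this example can be reproduced in Python (shares and counts as in the slides):

```python
# Example 3: do viewing shares still follow the old distribution?
observed = {"SVT1": 70, "SVT2": 89, "TV3": 46, "TV4": 95}
old_shares = {"SVT1": 0.28, "SVT2": 0.25, "TV3": 0.18, "TV4": 0.29}
n = sum(observed.values())   # 300 households

chi2_obs = sum(
    (observed[ch] - old_shares[ch] * n) ** 2 / (old_shares[ch] * n)
    for ch in observed
)
print(round(chi2_obs, 2))    # 6.87
# chi2(0.05; 3) = 7.81, so 6.87 < 7.81: do not reject H0.
```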
Two-way contingency tables
 An experiment has a two-way design if two categorical factors are
studied with several levels of each factor.
 Two-way tables organize data about two categorical variables.
 Example: We call Education the row variable and Age group the
column variable.
 Each combination of values for these two variables is called a cell.
Describing relations

The cells of a two-way table represent the intersection of a given
level of one categorical factor with a given level of the other
categorical factor.

We can also compute each count as a percent of the column total.
These percents should add up to 100% and together are the
conditional distributions of education level given age group.
Here the
percents are
calculated by age
range (columns).
Hypothesis: no association
Again, we want to know if the differences in sample proportions are likely to
have occurred just by chance, because of the random sampling.
We use the chi-square (χ²) test to assess the null hypothesis of no
relationship between the two categorical variables of a two-way table.
H0 : there is no relationship between these two categorical variables.
Are these
conditional
probability
distributions the
same (very
close)?
Expected counts in two-way tables
H0 : there is no relationship between these two categorical variables.
Ha : there is a relationship between these two categorical variables
To test this hypothesis, we compare actual counts from the sample
data with expected counts, given the null hypothesis of no relationship
(assuming the null hypothesis is true).
The expected count in any cell of a two-way table when H0 is true is:

expected count = (row total × column total) / table total
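As a minimal sketch, the rule (expected count = row total × column total / table total) can be computed for any table of counts; the 2×2 counts below are invented purely for illustration:

```python
# Expected counts under H0 (no relationship) for a two-way table:
# expected = row total * column total / table total.
# The 2x2 counts are hypothetical, chosen only to illustrate the formula.
counts = [[30, 20],
          [20, 30]]

row_totals = [sum(row) for row in counts]        # [50, 50]
col_totals = [sum(col) for col in zip(*counts)]  # [50, 50]
total = sum(row_totals)                          # 100

expected = [[r * c / total for c in col_totals] for r in row_totals]
print(expected)   # [[25.0, 25.0], [25.0, 25.0]]
```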
The chi-square test
The chi-square statistic (χ²) is a measure of how much the observed
cell counts in a two-way table diverge from the expected cell counts.
The formula for the χ² statistic is:

χ² = Σ (observed count − expected count)² / expected count

(summed over all r × c cells in the table)

Large values of χ² represent strong deviations from the expected
distribution under H0, providing evidence against H0.
However, since χ² is a sum, how large a χ² is required for statistical
significance will depend on the number of comparisons made.
If H0 is true, the chi-square test statistic has approximately a χ²
distribution with (r − 1)(c − 1) degrees of freedom.
The P-value for the chi-square test is the area to the right of X² under
the χ² distribution with df = (r − 1)(c − 1): P(χ² ≥ X²).
Example 1
In order to see if people’s political beliefs and gender are associated, a
survey was conducted on 2771 randomly selected people and the
findings were recorded as follows:

Gender    Democratic   Independent   Republican   Total
Female    573 (0.38)   516 (0.34)    422 (0.28)   1511
Male      386 (0.31)   475 (0.38)    399 (0.32)   1260
Total     959          991           821          2771

Within brackets: the cond. prob. of political belief given gender
Example 1
H0: ”Political beliefs” and ”Gender” are independent
Ha: They are dependent
Level of significance = 0.05
Under H0, expected frequencies: fe = (column total) × (row total) / (total)

Gender    Democratic    Independent   Republican    Total
Female    573 (522.9)   516 (540.4)   422 (447.7)   1511
Male      386 (436.1)   475 (450.6)   399 (373.3)   1260
Total     959           991           821           2771

Within brackets are the expected frequencies

χ²obs = Σ (fo − fe)²/fe
      = (573 − 522.9)²/522.9 + (516 − 540.4)²/540.4 + .... + (399 − 373.3)²/373.3
      = 16.22

df = (r − 1) × (c − 1) = (2 − 1) × (3 − 1) = 2
χ²(0.05; 2) = 5.99
We reject H0 since χ²obs > χ²(0.05; 2)
p-value < 0.0005
There is strong evidence for a dependency
Cocaine addiction
Cocaine produces short-term feelings of physical and mental well-being. To
maintain the effect, the drug may have to be taken more frequently and at
higher doses. After stopping use, users will feel tired, sleepy, and depressed.
The pleasurable high, followed by
unpleasant after-effects, encourages
repeated compulsive use which can
easily lead to dependency.
Desipramine is an antidepressant
affecting the brain chemicals that may
become unbalanced and cause
depression. It was thus tested for
recovery from cocaine addiction.
Treatment with desipramine was compared to a standard treatment (lithium,
with strong anti-manic effects) and a placebo. Is there a relationship
between treatment (desipramine, lithium, placebo) and outcome (relapse or
not)?
Cocaine addiction
Observed (for No): do we have the same percentages for the ”Yes”
category? If not, there should be some relation between the two variables.

Expected relapse counts
              No                Yes
Desipramine   25×26/74 ≈ 8.78   25×48/74 ≈ 16.22
Lithium       9.14              16.86
Placebo       8.08              14.92
Cocaine addiction
Table of counts, “actual & expected,” with three rows and two columns:

              No relapse   Relapse
Desipramine   15 (8.78)    10 (16.22)
Lithium       7 (9.14)     19 (16.86)
Placebo       4 (8.08)     19 (14.92)

df = (3−1)×(2−1) = 2
χ² = (15 − 8.78)²/8.78 + (10 − 16.22)²/16.22
   + (7 − 9.14)²/9.14 + (19 − 16.86)²/16.86
   + (4 − 8.08)²/8.08 + (19 − 14.92)²/14.92
   = 10.74

χ² components: 4.41, 0.50, 2.06 (no relapse); 2.39, 0.27, 1.12 (relapse)
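The same statistic can be computed directly from the observed 3×2 counts (a sketch; counts are from the slides):

```python
# Cocaine study: chi-squared statistic for the 3x2 treatment/outcome table.
observed = [[15, 10],   # Desipramine: no relapse, relapse
            [7, 19],    # Lithium
            [4, 19]]    # Placebo

row_totals = [sum(r) for r in observed]            # [25, 26, 23]
col_totals = [sum(c) for c in zip(*observed)]      # [26, 48]
total = sum(row_totals)                            # 74

chi2 = 0.0
for i, row in enumerate(observed):
    for j, fo in enumerate(row):
        fe = row_totals[i] * col_totals[j] / total
        chi2 += (fo - fe) ** 2 / fe

print(round(chi2, 2))
# about 10.73 (the slides' 10.74 rounds the expected counts first)
```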
Cocaine addiction:
H0: there is no relationship between treatment (desipramine, lithium,
placebo) and outcome (relapse or not).
[Chi-squared table lookup]
X² = 10.71 and df = 2
For df = 2 the table gives 10.60 at p = 0.005 and 11.98 at p = 0.0025,
so 10.60 < X² < 11.98 and 0.0025 < p-value < 0.005: reject the H0.
Cocaine addiction
Minitab statistical software output
for the cocaine study:
The p-value is 0.005 or half a
percent. This is very significant.
We reject the null hypothesis of
no association and conclude that
there is a significant relationship
between treatment (desipramine,
lithium, placebo) and outcome
(relapse or not).
Marginal distributions
We can look at each categorical variable separately in a two-way table
by studying the row totals and the column totals. They represent the
marginal distributions, expressed in counts or percentages (they are
written as if in a margin).
2000 U.S. census
The marginal distributions can then be displayed on separate bar graphs, typically
expressed as percents instead of raw counts. Each graph represents only one of
the two variables, completely ignoring the second one.
Marginal distribution of education
Similarly we can do it for column totals to obtain
the marginal distribution of age
Conditional distributions
The calculated percents within a two-way table represent the
conditional distributions, describing the “relationship” between both
variables.
For every two-way table, there are two sets of possible conditional
distributions (column percents or row percents).
For column percents, divide each cell count by the column total. The sum
of the percents in each column should be 100, except for possible small
round-off errors.
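The column-percent recipe can be sketched in Python, reusing the newspaper/occupation counts from the earlier slide:

```python
# Column percents (conditional distributions of newspaper given occupation):
# divide each cell count by its column total.
counts = {                  # columns: WC, BC, Pro
    "Globe": [27, 29, 33],
    "Mail":  [18, 43, 51],
    "Post":  [38, 21, 22],
    "Sun":   [37, 15, 20],
}
col_totals = [sum(col) for col in zip(*counts.values())]   # [120, 108, 126]

for paper, row in counts.items():
    print(paper, [round(fo / tot, 2) for fo, tot in zip(row, col_totals)])

# Each column of percents sums to 1 (up to round-off).
```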
When one variable is clearly explanatory, it makes sense to describe
the relationship by comparing the conditional distributions of the
response variable for each value (level) of the explanatory variable.
Conditional Distribution
 In the table below, the 25 to 34 age group occupies the first column. To find
the complete distribution of education in this age group, look only at that
column. Compute each count as a percent of the column total.
 These percents should add up to 100% because all persons in this age
group fall in one of the education categories. These four percents together
are the conditional distribution of education, given the 25 to 34 age group.
2000 U.S. census
Conditional distributions
The percents within the table represent the conditional distributions.
Comparing the conditional distributions allows you to describe the
“relationship” between both categorical variables.
Here the
percents are
calculated by age
range (columns).
29.30% = 11071/37785 = cell total / column total
The conditional distributions can be graphically compared using side by
side bar graphs of one variable for each value of the other variable.
Here the percents are
calculated by age range
(columns).
Music and wine purchase decision
What is the relationship between type of music
played in supermarkets and type of wine purchased?
We want to compare the conditional distributions of the response
variable (wine purchased) for each value of the explanatory
variable (music played). Therefore, we calculate column percents.
Calculations: When no music was played, there were
84 bottles of wine sold. Of these, 30 were French wine.
30/84 = 0.357, i.e. 35.7% of the wine sold was French
when no music was played.
We calculate the column
conditional percents similarly for
each of the nine cells in the table:

30/84 = 35.7% = cell total / column total
For every two-way table, there are two
sets of possible conditional distributions.
Does background music in
supermarkets influence
customer purchasing
decisions?
Wine purchased for each kind of
music played (column percents)
Music played for each
kind of wine purchased
(row percents)
Computing expected counts
When testing the null hypothesis that there is no relationship between
both categorical variables of a two-way table, we compare actual
counts from the sample data with expected counts given H0.
The expected count in any cell of a two-way table when H0 is true is:

expected count = (row total × column total) / table total
Although in real life counts must be whole numbers, an expected count
need not be. The expected count is the mean over many repetitions of
the study, assuming no relationship.
Music and wine purchase decision
The null hypothesis is that there is no relationship between music and wine
sales. The alternative is that these two variables are related.
What is the expected count in the upper-left cell
of the two-way table, under H0?
Column total 84: Number of bottles sold
without music
Row total 99: Number of bottles of French wine
sold
Table total 243: all bottles sold during the study
This expected cell count is thus
(84)(99) / 243 = 34.222
Nine similar calculations
produce the table of
expected counts:
Computing the chi-square statistic
The chi-square statistic (2) is a measure of how much the observed
cell counts in a two-way table diverge from the expected cell counts.
The formula for the 2 statistic is:
(summed over all r * c cells in the table)
2  
observed count -
expected count
expect ed count
2
Tip: First, calculate the 2 components, (observed-expected)2/expected, for

each cell of the table, and then sum them up to arrive at the 2 statistic.
Music and wine purchase decision
H0: No relationship between music and wine
Observed counts
We calculate nine X2
components and sum them
to produce the X2 statistic:
Ha: Music and wine are related
Expected counts
Music and wine purchase decision
H0: No relationship between music and wine
Ha: Music and wine are related
We found that the X2 statistic under H0 is 18.28.
The two-way table has a 3x3 design (3 levels of
music and 3 levels of wine). Thus, the degrees of
freedom for the X2 distribution for this test is:
(r – 1)(c – 1) = (3 – 1)(3 – 1) = 4
[Chi-squared table lookup, df = 4]
16.42 (p = 0.0025) < X² = 18.28 < 18.47 (p = 0.001)
0.0025 > p-value > 0.001: very significant.
There is a significant relationship between the type of music played and
wine purchases in supermarkets.
Interpreting the χ² output
 The values summed to make up χ² are called the χ² components.
When the test is statistically significant, the largest components point
to the conditions most different from the expectations based on H0.
Music and wine purchase decision
X² components (one per cell of the 3×3 table):
0.5209, 2.3337, 0.5209, 0.0075, 7.6724, 6.4038, 0.3971, 0.0004, 0.4223

Two chi-square components contribute most to the X² total: the largest
effect is for sales of Italian wine, which are strongly affected by Italian
and French music.
Actual proportions show that Italian music helps sales of Italian wine,
but French music hinders it.
When is it safe to use a χ² test?
We can safely use the chi-square test when:

The samples are simple random samples (SRS).

All individual expected counts are 1 or more (≥1)

No more than 20% of expected counts are less than 5 (< 5)
 For a 2x2 table, this implies that all four expected counts
should be 5 or more.
Chi-square test vs. z-test for two proportions
When comparing only two proportions, such as in a 2x2 table where
the columns represent counts of “success” and “failure,” we can test
H0: p1 = p2 vs. Ha: p1 ≠ p2
equally with a two-sided z test or with a chi-square test with 1 degree
of freedom and get the same p-value. In fact, the two test statistics are
related: X² = (z)².
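The identity X² = z² can be checked numerically; the counts below are hypothetical, chosen only to illustrate that the pooled two-proportion z statistic squares to the chi-square statistic:

```python
# For a 2x2 table, the pooled z statistic squares to the chi-squared
# statistic. The success counts and group sizes are made up for illustration.
from math import sqrt

success = [33, 22]   # successes in groups 1 and 2 (hypothetical)
n = [50, 50]         # group sizes (hypothetical)

p1, p2 = success[0] / n[0], success[1] / n[1]
p_pool = sum(success) / sum(n)
z = (p1 - p2) / sqrt(p_pool * (1 - p_pool) * (1 / n[0] + 1 / n[1]))

# Chi-squared statistic for the same data arranged as a 2x2 table.
table = [[success[0], n[0] - success[0]],
         [success[1], n[1] - success[1]]]
row_t = [sum(r) for r in table]
col_t = [sum(c) for c in zip(*table)]
total = sum(row_t)
chi2 = sum((table[i][j] - row_t[i] * col_t[j] / total) ** 2
           / (row_t[i] * col_t[j] / total)
           for i in range(2) for j in range(2))

print(round(z ** 2, 6) == round(chi2, 6))   # True: X^2 = z^2
```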
Successful firms
Franchise businesses are sometimes given an exclusive territory by contract.
This means that the new outlet will not have to compete with other outlets of
the same chain within its own territory. How does the presence of an
exclusive-territory clause in the contract relate to the survival of the business?
A random sample of 170 new franchises recorded two categorical variables
for each firm: (1) whether the firm was successful or not (based on economic
criteria) and (2) whether or not the firm had an exclusive-territory contract.
This is a 2x2 table (two levels
for success, yes/no; two levels
for exclusive territory, yes/no).
df = (2 − 1)(2 − 1) = 1
Successful firms
How does the presence of an exclusive-territory clause in the
contract relate to the survival of the business?
To compare firms that have
an exclusive territory with
those that do not, we start by
examining column percents
(conditional distribution):
The difference between the percent of successes among the two types of firms
is quite large. The chi-square test can tell us whether or not these differences
can be plausibly attributed to chance (random sampling). Specifically, we will
test
H0: No relationship between exclusive clause and success
Ha: There is some relationship between the two variables
Successful firms
Here is the chi-square
output from Minitab:
The p-value is significant at α = 5% (p = 1.5%), thus we reject H0: we have
found a significant relationship between an exclusive territory and the success of
a franchised firm.
Successful firms
Computer output
using Crunch It!
Computations for two-way tables
When analyzing relationships between two categorical variables, follow
this procedure:
1. Calculate descriptive statistics that convey the important information
in the table—usually column or row percents.
2. Find the expected counts and use them to compute the X2 statistic.
3. Compare your X2 statistic to the chi-square critical values from Table
F to find the approximate P-value for your test.
4. Draw a conclusion about the association between the row and
column variables.
Comparing several populations
Select independent SRSs from each of c populations, of sizes
n1, n2, . . . , nc. Classify each individual in a sample according to a
categorical response variable with r possible values. There are c
different probability distributions, one for each population.
The null hypothesis is that the distributions of the response variable are
the same in all c populations. The alternative hypothesis says that
these c distributions are not all the same.
Cocaine addiction
Cocaine produces short-term feelings of physical and mental well-being. To
maintain the effect, the drug may have to be taken more frequently and at
higher doses. After stopping use, users will feel tired, sleepy, and depressed.
The pleasurable high, followed by
unpleasant after-effects, encourages
repeated compulsive use which can
easily lead to dependency.
We compare treatment with an antidepressant (desipramine), a standard
treatment (lithium), and a placebo.
Population 1: Antidepressant treatment (desipramine)
Population 2: Standard treatment (lithium)
Population 3: Placebo (“sugar pill”)
Cocaine addiction
H0: The proportions of success (no relapse)
are the same in all three populations.

Observed proportions of no relapse:
Desipramine 15/25 = 0.60, Lithium 7/26 = 0.27, Placebo 4/23 = 0.17
Under H0, the common proportion of no relapse is estimated by the
overall rate 26/74 ≈ 35%.

Expected relapse counts
              No                          Yes
Desipramine   25×26/74 ≈ 8.78 = 25×0.35   16.22 ≈ 25×0.65
Lithium       9.14 = 26×0.35              16.86 ≈ 26×0.65
Placebo       8.08 = 23×0.35              14.92 ≈ 23×0.65
Cocaine addiction
Table of counts, “actual & expected,” with three rows and two columns:

              No relapse   Relapse
Desipramine   15 (8.78)    10 (16.22)
Lithium       7 (9.14)     19 (16.86)
Placebo       4 (8.08)     19 (14.92)

df = (3−1)×(2−1) = 2
χ² = (15 − 8.78)²/8.78 + (10 − 16.22)²/16.22
   + (7 − 9.14)²/9.14 + (19 − 16.86)²/16.86
   + (4 − 8.08)²/8.08 + (19 − 14.92)²/14.92
   = 10.74

χ² components: 4.41, 0.50, 2.06 (no relapse); 2.39, 0.27, 1.12 (relapse)
Cocaine addiction:
H0: The proportions of success (no relapse)
are the same in all three populations.

[Chi-squared table lookup]
X² = 10.71 and df = 2
For df = 2 the table gives 10.60 at p = 0.005 and 11.98 at p = 0.0025,
so 10.60 < X² < 11.98 and 0.0025 < p-value < 0.005: reject the H0.
The proportions of success are not the same in
all three populations (Desipramine, Lithium, Placebo).
Desipramine is a more successful treatment.