Transcript Slide 1
Analysis of two-way tables
- Formulas and models for two-way tables
- Goodness of fit
IPS chapters 9.3 and 9.4
© 2006 W.H. Freeman and Company
Objectives (IPS chapters 9.3 and 9.4)
Formulas and models for two-way tables - Goodness of fit
Computations for two-way tables
Computing conditional distributions
Computing expected counts
Computing the chi-square statistic
Finding the p-value with Table F
Models for two-way tables
Comparing several populations
Testing for independence
Testing for goodness of fit
Computations for two-way tables
When analyzing relationships between two categorical variables, follow
this procedure:
1. Calculate descriptive statistics that convey the important information
in the table—usually column or row percents.
2. Find the expected counts and use them to compute the X2 statistic.
3. Compare your X2 statistic to the chi-square critical values from Table
F to find the approximate P-value for your test.
4. Draw a conclusion about the association between the row and
column variables.
Computing conditional distributions
The calculated percents within a two-way table represent the
conditional distributions describing the “relationship” between both
variables.
For every two-way table, there are two sets of possible conditional
distributions (column percents or row percents).
For column percents, divide each cell count by the column total. The sum
of the percents in each column should be 100, except for possible small
roundoff errors.
When one variable is clearly explanatory, it makes sense to describe
the relationship by comparing the conditional distributions of the
response variable for each value (level) of the explanatory variable.
Music and wine purchase decision
What is the relationship between type of music
played in supermarkets and type of wine purchased?
We want to compare the conditional distributions of the response
variable (wine purchased) for each value of the explanatory
variable (music played). Therefore, we calculate column percents.
Calculations: When no music was played, there were
84 bottles of wine sold. Of these, 30 were French wine.
30/84 = 0.357, so 35.7% of the wine sold was French
when no music was played.
We calculate the column
conditional percents similarly for
each of the nine cells in the table:
$\frac{30}{84} = 35.7\% = \frac{\text{cell total}}{\text{column total}}$
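The column percents can be checked with a short script. This is a minimal sketch in Python, not part of the original slides. Only the count 30 and the totals 84, 99, and 243 appear in this transcript; the other cell counts are assumed from the textbook example and are consistent with those totals. The names observed, col_totals, and col_percents are illustrative.

```python
# Observed counts: rows are wine types, columns are music types.
# ASSUMPTION: only the upper-left count (30) and the totals 84, 99, 243
# appear in the transcript; the other cells are assumed from the textbook.
wines = ["French", "Italian", "Other"]
music = ["None", "French", "Italian"]
observed = [
    [30, 39, 30],
    [11, 1, 19],
    [43, 35, 35],
]

# Column totals: bottles sold under each type of music.
col_totals = [sum(row[j] for row in observed) for j in range(len(music))]

# Column percent = cell count / column total, expressed in percent.
col_percents = [
    [100 * observed[i][j] / col_totals[j] for j in range(len(music))]
    for i in range(len(wines))
]

for wine, row in zip(wines, col_percents):
    print(wine, [round(p, 1) for p in row])
```

Each column of col_percents should sum to 100, apart from floating-point roundoff.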
Computing expected counts
When testing the null hypothesis that there is no relationship between
the two categorical variables of a two-way table, we compare actual
counts from the sample data with the counts expected under H0.
The expected count in any cell of a two-way table when H0 is true is:
$\text{expected count} = \frac{\text{row total} \times \text{column total}}{n}$
where n is the table total.
Although in real life counts must be whole numbers, an expected count
need not be. The expected count is the mean over many repetitions of
the study, assuming no relationship.
Music and wine purchase decision
The null hypothesis is that there is no relationship between music and wine
sales. The alternative is that these two variables are related.
What is the expected count in the upper-left cell
of the two-way table, under H0?
Column total 84: Number of bottles sold
without music
Row total 99: Number of bottles of French wine
sold
Table total 243: all bottles sold during the study
This expected cell count is thus
(84)(99) / 243 = 34.222
Nine similar calculations
produce the table of
expected counts:
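The expected counts for all nine cells can be sketched as follows. This is an illustration only: the full table of observed counts is not in this transcript, so the cells other than the upper-left 30 are assumed from the textbook example (they do reproduce the totals 84, 99, and 243 quoted above).

```python
# ASSUMPTION: observed counts taken from the textbook example; rows are
# French, Italian, Other wine and columns are no, French, Italian music.
observed = [
    [30, 39, 30],
    [11, 1, 19],
    [43, 35, 35],
]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
n = sum(row_totals)  # table total

# Expected count under H0 = row total * column total / table total.
expected = [[r * c / n for c in col_totals] for r in row_totals]

for row in expected:
    print([round(e, 3) for e in row])
```

The upper-left entry is (99)(84)/243, matching the 34.222 computed by hand.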
Computing the chi-square statistic
The chi-square statistic (X2) is a measure of how much the observed
cell counts in a two-way table diverge from the expected cell counts.
The formula for the X2 statistic is:
$X^2 = \sum \frac{(\text{observed count} - \text{expected count})^2}{\text{expected count}}$
where the sum runs over all cells of the table.
Music and wine purchase decision
H0: No relationship between music and wine
Ha: Music and wine are related
We calculate nine X2 components, one per cell, from the observed and
expected counts, and sum them to produce the X2 statistic.
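The nine components and their sum can be computed as in the sketch below. The function name chi_square_components is hypothetical, and the observed counts are again assumed from the textbook example rather than read from this transcript.

```python
def chi_square_components(observed):
    """Return the (observed - expected)^2 / expected term for every cell."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    n = sum(row_totals)
    components = []
    for i, row in enumerate(observed):
        out = []
        for j, count in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            out.append((count - expected) ** 2 / expected)
        components.append(out)
    return components

# ASSUMPTION: counts from the textbook example (wine by music).
observed = [
    [30, 39, 30],
    [11, 1, 19],
    [43, 35, 35],
]

components = chi_square_components(observed)
statistic = sum(sum(row) for row in components)
print(round(statistic, 2))
```

With these counts the statistic agrees with the 18.28 used later in the example.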
Finding the p-value with table F
χ2 distributions are a family of distributions that can take only positive
values, are skewed to the right, and are distinguished by “degrees of
freedom”.
Table F gives upper
critical values for many
χ2 distributions.
Table F
df = (r−1)(c−1)
Ex: In a 4x3 table, df = 3*2 = 6. If X2 = 16.1, the p-value is
between 0.01 and 0.02.
Upper-tail probability p (columns) by degrees of freedom df (rows):
 df    0.25     0.2    0.15     0.1    0.05   0.025    0.02    0.01   0.005  0.0025   0.001  0.0005
  1    1.32    1.64    2.07    2.71    3.84    5.02    5.41    6.63    7.88    9.14   10.83   12.12
  2    2.77    3.22    3.79    4.61    5.99    7.38    7.82    9.21   10.60   11.98   13.82   15.20
  3    4.11    4.64    5.32    6.25    7.81    9.35    9.84   11.34   12.84   14.32   16.27   17.73
  4    5.39    5.99    6.74    7.78    9.49   11.14   11.67   13.28   14.86   16.42   18.47   20.00
  5    6.63    7.29    8.12    9.24   11.07   12.83   13.39   15.09   16.75   18.39   20.51   22.11
  6    7.84    8.56    9.45   10.64   12.59   14.45   15.03   16.81   18.55   20.25   22.46   24.10
  7    9.04    9.80   10.75   12.02   14.07   16.01   16.62   18.48   20.28   22.04   24.32   26.02
  8   10.22   11.03   12.03   13.36   15.51   17.53   18.17   20.09   21.95   23.77   26.12   27.87
  9   11.39   12.24   13.29   14.68   16.92   19.02   19.68   21.67   23.59   25.46   27.88   29.67
 10   12.55   13.44   14.53   15.99   18.31   20.48   21.16   23.21   25.19   27.11   29.59   31.42
 11   13.70   14.63   15.77   17.28   19.68   21.92   22.62   24.72   26.76   28.73   31.26   33.14
 12   14.85   15.81   16.99   18.55   21.03   23.34   24.05   26.22   28.30   30.32   32.91   34.82
 13   15.98   16.98   18.20   19.81   22.36   24.74   25.47   27.69   29.82   31.88   34.53   36.48
 14   17.12   18.15   19.41   21.06   23.68   26.12   26.87   29.14   31.32   33.43   36.12   38.11
 15   18.25   19.31   20.60   22.31   25.00   27.49   28.26   30.58   32.80   34.95   37.70   39.72
 16   19.37   20.47   21.79   23.54   26.30   28.85   29.63   32.00   34.27   36.46   39.25   41.31
 17   20.49   21.61   22.98   24.77   27.59   30.19   31.00   33.41   35.72   37.95   40.79   42.88
 18   21.60   22.76   24.16   25.99   28.87   31.53   32.35   34.81   37.16   39.42   42.31   44.43
 19   22.72   23.90   25.33   27.20   30.14   32.85   33.69   36.19   38.58   40.88   43.82   45.97
 20   23.83   25.04   26.50   28.41   31.41   34.17   35.02   37.57   40.00   42.34   45.31   47.50
 21   24.93   26.17   27.66   29.62   32.67   35.48   36.34   38.93   41.40   43.78   46.80   49.01
 22   26.04   27.30   28.82   30.81   33.92   36.78   37.66   40.29   42.80   45.20   48.27   50.51
 23   27.14   28.43   29.98   32.01   35.17   38.08   38.97   41.64   44.18   46.62   49.73   52.00
 24   28.24   29.55   31.13   33.20   36.42   39.36   40.27   42.98   45.56   48.03   51.18   53.48
 25   29.34   30.68   32.28   34.38   37.65   40.65   41.57   44.31   46.93   49.44   52.62   54.95
 26   30.43   31.79   33.43   35.56   38.89   41.92   42.86   45.64   48.29   50.83   54.05   56.41
 27   31.53   32.91   34.57   36.74   40.11   43.19   44.14   46.96   49.64   52.22   55.48   57.86
 28   32.62   34.03   35.71   37.92   41.34   44.46   45.42   48.28   50.99   53.59   56.89   59.30
 29   33.71   35.14   36.85   39.09   42.56   45.72   46.69   49.59   52.34   54.97   58.30   60.73
 30   34.80   36.25   37.99   40.26   43.77   46.98   47.96   50.89   53.67   56.33   59.70   62.16
 40   45.62   47.27   49.24   51.81   55.76   59.34   60.44   63.69   66.77   69.70   73.40   76.09
 50   56.33   58.16   60.35   63.17   67.50   71.42   72.61   76.15   79.49   82.66   86.66   89.56
 60   66.98   68.97   71.34   74.40   79.08   83.30   84.58   88.38   91.95   95.34   99.61  102.70
 80   88.13   90.41   93.11   96.58  101.90  106.60  108.10  112.30  116.30  120.10  124.80  128.30
100  109.10  111.70  114.70  118.50  124.30  129.60  131.10  135.80  140.20  144.30  149.40  153.20
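Reading Table F amounts to finding the two neighboring critical values that enclose the statistic. The sketch below does this for a small excerpt of the table (df = 2, 4, 6); bracket_p_value, P_LEVELS, and TABLE_F are hypothetical names introduced for illustration.

```python
# Tail probabilities p that head the columns of Table F.
P_LEVELS = [0.25, 0.2, 0.15, 0.1, 0.05, 0.025, 0.02, 0.01,
            0.005, 0.0025, 0.001, 0.0005]

# Excerpt of Table F: upper critical values for a few degrees of freedom.
TABLE_F = {
    2: [2.77, 3.22, 3.79, 4.61, 5.99, 7.38, 7.82, 9.21,
        10.60, 11.98, 13.82, 15.20],
    4: [5.39, 5.99, 6.74, 7.78, 9.49, 11.14, 11.67, 13.28,
        14.86, 16.42, 18.47, 20.00],
    6: [7.84, 8.56, 9.45, 10.64, 12.59, 14.45, 15.03, 16.81,
        18.55, 20.25, 22.46, 24.10],
}

def bracket_p_value(x2, df):
    """Return (lower, upper) bounds on the p-value; None marks an open end."""
    critical = TABLE_F[df]
    # Critical values increase from left to right, so count those exceeded.
    k = sum(1 for c in critical if x2 >= c)
    upper = P_LEVELS[k - 1] if k > 0 else None          # p is at most this
    lower = P_LEVELS[k] if k < len(critical) else None  # p is above this
    return lower, upper

print(bracket_p_value(16.1, 6))
```

For the 4x3 example (X2 = 16.1, df = 6) this gives the bounds 0.01 and 0.02 quoted above.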
Music and wine purchase decision
H0: No association between music and wine
Ha: Music and wine are related
We found that the X2 statistic under H0 is 18.28.
The two-way table has a 3x3 design (3 levels of
music and 3 levels of wine). Thus, the degrees of
freedom for the χ2 distribution for this test are:
(r – 1)(c – 1) = (3 – 1)(3 – 1) = 4
Table F, df = 4: 16.42 < X2 = 18.28 < 18.47
0.0025 > p-value > 0.001: very significant
There is a significant relationship between the type of music played
and wine purchases in supermarkets.
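Table F only brackets the p-value. When df is even, the upper-tail probability of a χ2 distribution has a simple closed form, which gives a quick check without a table. The helper chi_square_upper_tail below is a hypothetical name, and the sketch deliberately covers even df only.

```python
import math

def chi_square_upper_tail(x2, df):
    """P(chi-square with df degrees of freedom > x2), for even df only."""
    if df <= 0 or df % 2 != 0:
        raise ValueError("this closed form needs a positive even df")
    half = x2 / 2
    # exp(-x/2) times the sum of (x/2)^i / i! for i = 0 .. df/2 - 1.
    total = sum(half ** i / math.factorial(i) for i in range(df // 2))
    return math.exp(-half) * total

p_value = chi_square_upper_tail(18.28, 4)
print(p_value)
```

For X2 = 18.28 with df = 4 the result falls between 0.001 and 0.0025, in agreement with the Table F bracket.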
Interpreting the X2 output
The values summed to make up X2 are called the X2 components.
When the test is statistically significant, the largest components point
to the conditions most different from the expectations based on H0.
Music and wine purchase decision
X2 components:
0.5209   2.3337   0.5209
0.0075   7.6724   6.4038
0.3971   0.0004   0.4223
Two chi-square components contribute most to the X2 total: the largest
effect is for sales of Italian wine, which are strongly affected
by Italian and French music.
Actual proportions show that Italian music helps sales of Italian
wine, but French music hinders it.
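Picking out the dominant components can also be done programmatically. In this sketch the nine component values are those listed on the slide; the row and column labels are an assumption (rows taken as French, Italian, Other wine and columns as no, French, Italian music), chosen to match the interpretation given above.

```python
# X2 components as listed on the slide, read row by row.
components = [
    [0.5209, 2.3337, 0.5209],
    [0.0075, 7.6724, 6.4038],
    [0.3971, 0.0004, 0.4223],
]
# ASSUMPTION: the labels are not given next to the values in the transcript.
wines = ["French", "Italian", "Other"]
music = ["None", "French", "Italian"]

cells = [
    (components[i][j], wines[i], music[j])
    for i in range(3)
    for j in range(3)
]
cells.sort(reverse=True)  # largest component first

total = sum(value for value, _, _ in cells)
top_two = cells[:2]
share = sum(value for value, _, _ in top_two) / total

for value, wine, tune in top_two:
    print(f"{wine} wine, {tune} music: {value}")
print(round(share, 2))
```

Under these assumed labels the two largest components both belong to Italian wine and together account for roughly three quarters of the X2 total.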
Models for two-way tables
The chi-square test is an overall technique for comparing any number
of population proportions, testing for evidence of a relationship between
two categorical variables. We can either:
Compare several populations: Randomly select several SRSs, each
from a different population (or from a population subjected to different
treatments) → experimental study.
Test for independence: Take one SRS and classify the individuals in
the sample according to two categorical variables (attribute or condition)
→ observational study, historical design.
Both models use the X2 test to test the hypothesis of no relationship.
Comparing several populations
Select independent SRSs from each of c populations, of sizes
n1, n2, . . . , nc. Classify each individual in a sample according to a
categorical response variable with r possible values. There are c
different probability distributions, one for each population.
The null hypothesis is that the distributions of the response variable are
the same in all c populations. The alternative hypothesis says that
these c distributions are not all the same.
Cocaine addiction
Back to the cocaine problem.
The pleasurable high followed by
unpleasant after-effects encourages
repeated compulsive use, which can
easily lead to dependency.
We compare treatment with an antidepressant (desipramine), a standard
treatment (lithium), and a placebo.
Population 1: Antidepressant treatment (desipramine)
Population 2: Standard treatment (lithium)
Population 3: Placebo (“nothing pill”)
Cocaine addiction
H0: The proportions of success (no relapse)
are the same in all three populations.
[Bar graphs: observed and expected percent of patients with no relapse; the expected value is 35% in each treatment group]
Expected relapse counts
              No      Yes
Desipramine   8.78    16.22
Lithium       9.14    16.86
Placebo       8.08    14.92
Cocaine addiction
Table of counts: “actual / expected,” with three rows and two
columns:
              No relapse    Relapse
Desipramine   15 / 8.78     10 / 16.22
Lithium        7 / 9.14     19 / 16.86
Placebo        4 / 8.08     19 / 14.92
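df = (3−1)*(2−1) = 2
$X^2 = \frac{(15-8.78)^2}{8.78} + \frac{(10-16.22)^2}{16.22} + \frac{(7-9.14)^2}{9.14} + \frac{(19-16.86)^2}{16.86} + \frac{(4-8.08)^2}{8.08} + \frac{(19-14.92)^2}{14.92} = 10.74$
X2 components:
              No relapse    Relapse
Desipramine   4.41          2.39
Lithium       0.50          0.27
Placebo       2.06          1.12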
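The same computation can be sketched in code from the observed counts in the table above. The variable names are illustrative. Because the sketch does not round the expected counts to two decimals, its total comes out near 10.73, slightly below the 10.74 obtained from the rounded values.

```python
# Observed counts per treatment: (no relapse, relapse), as in the table above.
observed = {
    "Desipramine": (15, 10),
    "Lithium": (7, 19),
    "Placebo": (4, 19),
}

# Column totals over the three treatments, and the table total.
col_totals = [sum(counts[j] for counts in observed.values()) for j in range(2)]
n = sum(col_totals)

components = {}
for treatment, counts in observed.items():
    row_total = sum(counts)
    # Expected count under H0 = row total * column total / table total.
    expected = [row_total * c / n for c in col_totals]
    components[treatment] = [
        (o - e) ** 2 / e for o, e in zip(counts, expected)
    ]

x2 = sum(sum(row) for row in components.values())
df = (len(observed) - 1) * (len(col_totals) - 1)
print(round(x2, 2), df)
```

The largest component is the no-relapse cell for desipramine, which is where the observed count most exceeds its expectation.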
Cocaine addiction: Table F
H0: The proportions of success (no relapse)
are the same in all three populations.
Table F, df = 2: 10.60 < X2 = 10.74 < 11.98
0.0025 < p < 0.005, so we reject H0.
The proportions of successes differ in the three populations. From the
individual chi-square components we see that desipramine is the most
successful treatment.
Testing for independence
Suppose we now have a single sample from a population. For each
individual in this SRS of size n we measure two categorical variables.
The results are then summarized in a two-way table.
The null hypothesis is that the row and column variables are
independent. The alternative hypothesis is that the row and column
variables are dependent.
Successful firms
How does the presence of an exclusive-territory clause in the contract for a
franchise business relate to the survival of that business?
A random sample of 170 new franchises recorded two categorical variables
for each firm: (1) whether the firm was successful or not (based on economic
criteria) and (2) whether or not the firm had an exclusive-territory contract.
This is a 2x2, two-way table
(2 levels for business success, yes/no,
2 levels for exclusive territory, yes/no).
We will test H0: The variables exclusive clause and success are independent.
Successful firms
Computer output for
the chi-square test
using Minitab:
The p-value is significant at α = 5%, so we reject H0.
The existence of an exclusive-territory clause in a franchise’s contract and the
success of that franchise are not independent variables.
Parental smoking
Does parental smoking influence the incidence of smoking in children when
they reach high school? Randomly chosen high school students were asked
whether they smoked (columns) and whether their parents smoked (rows).
Examine the computer output for the chi-square test performed on these data.
What does it tell you?
Sample size?
Hypotheses?
Are data ok for X2 test?
Interpretation?
Testing for goodness of fit
We have used the chi-square test as the tool to compare two or
more distributions based on some sample data from each one.
We now consider a variation where there is only one sample and we
want to compare it with some hypothesized distribution.
Data for n observations on a categorical variable with k possible
outcomes are summarized as observed counts, n1, n2, . . . , nk.
The null hypothesis specifies probabilities p1, p2, . . . , pk for each of
the possible outcomes.
Car accidents and day of the week
A study of 667 drivers who were using a cell phone when they were involved
in a collision on a weekday examined the relationship between these
accidents and the day of the week.
Are the accidents equally likely to occur on any day of the working week?
H0 specifies that all 5 days are equally likely for car accidents: each pi = 1/5.
The chi-square goodness of fit test
Data for n observations on a categorical variable with k possible
outcomes are summarized as observed counts, n1, n2, . . . , nk in k
cells.
H0 specifies probabilities p1, p2, . . . , pk for the possible outcomes.
For each cell, multiply the total number of observations n by the
specified probability pi:
expected count = npi
The chi-square statistic follows the chi-square distribution with k − 1
degrees of freedom:
$X^2 = \sum \frac{(\text{observed count} - \text{expected count})^2}{\text{expected count}}$
Car accidents and day of the week
H0 specifies that all days are equally likely for
car accidents: each pi = 1/5.
The expected count for each of the five days is npi = 667(1/5) = 133.4.
$X^2 = \sum \frac{(\text{observed} - \text{expected})^2}{\text{expected}} = \sum_{\text{day}} \frac{(\text{count}_{\text{day}} - 133.4)^2}{133.4} = 8.49$
This statistic follows the chi-square distribution with 5 − 1 = 4 degrees of freedom.
Table F, df = 4: 7.78 < X2 = 8.49 < 9.49
The p-value is thus between 0.1 and 0.05, which is not significant at α = 5%.
There is no significant evidence of different car accident rates for different
weekdays when the driver was using a cell phone.
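The goodness of fit computation can be sketched as below. The function name goodness_of_fit is hypothetical. The daily counts are not in this transcript: the five values used are assumed from the textbook example, and they do add up to 667 and reproduce X2 = 8.49. The p-value line uses the closed form of the χ2 upper tail that holds for df = 4 only.

```python
import math

def goodness_of_fit(observed, probabilities):
    """Return (expected counts, X2 statistic, degrees of freedom)."""
    n = sum(observed)
    expected = [n * p for p in probabilities]  # expected count = n * p_i
    statistic = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    df = len(observed) - 1
    return expected, statistic, df

# ASSUMPTION: Monday-to-Friday counts recalled from the textbook example.
counts = [133, 126, 159, 136, 113]
expected, x2, df = goodness_of_fit(counts, [1 / 5] * 5)

# Upper-tail probability for df = 4: exp(-x/2) * (1 + x/2).
p_value = math.exp(-x2 / 2) * (1 + x2 / 2)

print(round(x2, 2), df, round(p_value, 3))
```

The p-value lands between 0.05 and 0.1, consistent with the Table F reading.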