Transcript Total

RMTD 404
Lecture 4
Chi Squares
 In Chapter 1, you learned to differentiate between quantitative (aka
measurement or numerical) and qualitative (aka frequency or categorical)
variables. Most of this course will focus on statistical
procedures that can be applied to quantitative variables. This
chapter, however, focuses on qualitative variables.
 This chapter describes four concepts relating to the use of the term chi-square (χ²). They are:
1. A sampling distribution named the chi-square distribution.
2. A statistical test for comparing marginal proportions of a categorical
   variable to theory-based proportions (goodness-of-fit chi-square).
3. A statistical test for evaluating independence of two categorical
   variables (Pearson’s chi-square).
4. A statistical test for comparing the relative fit of two models to the
   same data (Likelihood Ratio chi-square).
• This is the probability density function of the normal distribution. It
gives the height of the curve (the y axis) at an observed score (x) as a
function of the population mean (μ) and standard deviation (σ):

$$f(x) = \frac{1}{\sqrt{2\pi\sigma^2}} \exp\left\{-.5\left[\frac{x - \mu}{\sigma}\right]^2\right\}$$
• The chi-square distribution takes the following form (k is the
degrees of freedom):

$$f(\chi^2) = \frac{1}{2^{k/2}\,\Gamma(k/2)}\,(\chi^2)^{(k/2) - 1}\,e^{-\chi^2/2}$$

• Fortunately, you don’t really need to use these equations
because tables exist that contain computed values of the
areas under these curves and computer programs
automatically produce these areas.
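To make the remark about computed areas concrete, here is a minimal sketch that evaluates the chi-square density and the area to the right of a given value using only the Python standard library. The function names and the series-based approach to the tail area are illustrative assumptions, not something taken from the lecture; statistical packages use more refined numerical routines.

```python
import math


def chi2_pdf(x, k):
    """Height of the chi-square density at x for k degrees of freedom.

    Returns 0 for x <= 0 to keep the sketch simple.
    """
    if x <= 0:
        return 0.0
    half_k = k / 2
    return x ** (half_k - 1) * math.exp(-x / 2) / (2 ** half_k * math.gamma(half_k))


def chi2_upper_tail(x, k, terms=500):
    """Area under the chi-square curve to the right of x.

    Uses the power series for the lower regularized incomplete gamma
    function P(k/2, x/2) and returns 1 - P. Adequate for table-sized values.
    """
    if x <= 0:
        return 1.0
    a, z = k / 2, x / 2
    term = 1.0 / a
    total = term
    for n in range(1, terms):
        term *= z / (a + n)
        total += term
    lower = total * math.exp(-z + a * math.log(z) - math.lgamma(a))
    return 1.0 - lower


# Area beyond the tabled .05 critical value for 3 degrees of freedom
# (should come out close to .05).
print(round(chi2_upper_tail(7.82, 3), 3))
```

The tail area is what a printed chi-square table encodes: the tabled critical value is the point on the x axis that leaves the chosen proportion of the area to its right.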
• An interesting feature of the chi-square distribution is how its shape
changes as the parameter k increases. In fact, as k approaches ∞, the
chi-square distribution becomes normal in shape.

[Figure: chi-square density curves for k = 5, k = 6, k = 7, and k = 8]
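One way to see the drift toward a normal shape, offered as a supplementary sketch rather than as part of the lecture: the skewness of a chi-square distribution with k degrees of freedom is √(8/k), which shrinks toward 0 (the skewness of a normal curve) as k grows. The function name below is hypothetical.

```python
import math


def chi2_skewness(k):
    """Skewness of a chi-square distribution with k degrees of freedom."""
    return math.sqrt(8 / k)


# Skewness for the four curves in the figure, then for much larger k.
for k in (5, 6, 7, 8, 100, 10000):
    print(k, round(chi2_skewness(k), 3))
```

The values fall steadily as k increases, which matches the visual impression that the curves become more symmetric.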
Chi-Square Goodness of Fit
• One application of the chi-square distribution comes into play when we
want to compare observed relative frequencies (percentages) to theory-based
relative frequencies. Recall that in a hypothesis testing framework,
we compare observed statistics to null parameters and determine the
likelihood of obtaining the observed statistic due to sampling error under
the assumption that the null parameter is correct.
• We use the chi-square goodness-of-fit test in contexts in which we have a
single categorical variable, and we want to determine whether observed
classifications are consistent with a theory. In this case, our observed
statistics are the proportions associated with each classification, and our
null parameters are the expected proportions for each classification.
• For example, we might be interested in determining whether a purposive
sample that we have drawn is comparable to the US population with
respect to ethnicity. That is, we want to determine whether the observed
proportions of Asians, African Americans, Hispanics, and Caucasians in
our sample reflect the proportions of these groups in the general
population.
• In this case, the frequencies and proportions are observed, as in the
following table. The theory-based null parameters (shown in the bottom
row of the table) are obtained from the US Census.

                                 Asian   African American   Hispanic   Caucasian
Observed Frequencies               30           50             30         200
Observed Proportions (ng / N)     .10          .16            .10         .65
Census Proportions                .04          .12            .10         .74
• In the case of the chi-square goodness-of-fit test, the chi-square statistic is
defined as:

$$\chi^2 = \sum \frac{(O - E)^2}{E} = \sum_{i=1}^{k} \frac{(O_i - E_i)^2}{E_i}$$

• O is the observed frequency, and k equals the number of classifications in
the table (i.e., the number of cells). The expected value (designated E) is
defined as the null (i.e., theory-based) proportion for that classification (ρ)
times the sample size (N):

$$E_i = \rho_i N$$

• Also note the meaning of the expected frequencies (E). These values
constitute what we would expect to be the values of the observed
frequencies (O) if, indeed, our theory was true. That is, the expected
number of cases in each group should be consistent with ρ (our theory-based
proportions).
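The two definitions above translate directly into a few lines of code. This is a minimal sketch using the ethnicity table from the example; the function and variable names are illustrative assumptions.

```python
def expected_frequencies(null_proportions, n):
    """E_i = rho_i * N for each classification."""
    return [rho * n for rho in null_proportions]


def chi_square_statistic(observed, expected):
    """Sum of (O_i - E_i)^2 / E_i over all classifications."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


# Order: Asian, African American, Hispanic, Caucasian
observed = [30, 50, 30, 200]
census = [0.04, 0.12, 0.10, 0.74]   # theory-based null proportions
n = sum(observed)                   # sample size N = 310

expected = expected_frequencies(census, n)
chi_sq = chi_square_statistic(observed, expected)

print([round(e, 1) for e in expected])   # [12.4, 37.2, 31.0, 229.4]
print(round(chi_sq, 2))
```

Keeping the expected frequencies unrounded until the final step avoids small discrepancies that creep in when intermediate values are rounded by hand.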
• Hence, the chi-square statistic tells us how far, overall, the observed
cell frequencies are from the theory-based expectations, relative to the
size of those expectations.
• The table below shows the computations for the example. The sum of the
last row, the chi-square statistic, equals 33.19 (computed from unrounded
values).

                        Asian   African American   Hispanic   Caucasian
Observed Frequencies      30           50             30         200
Expected Frequencies     12.4         37.2            31        229.4
Observed - Expected      17.6         12.8           -1.0       -29.4
(O - E)^2              309.76       163.84           1.00      864.36
(O - E)^2 / E           24.98         4.40           0.03        3.77

• We can compare our obtained chi-square value of 33.19, which has 3 degrees
of freedom (degrees of freedom equals k - 1, the number of columns that are
free to vary in the table), to the table values of the chi-square statistic,
that is, the range of values that occur due solely to random sampling.
According to this table (Appendix on page 671), the critical value of the
chi-square distribution with 3 degrees of freedom for α = .05 equals 7.82.
• Because 33.19 exceeds 7.82, the observed differences between our sample
and our expected values are extremely unlikely if the null hypothesis (that
the vector of observed probabilities equals the vector of theory-based
probabilities) is true, so we reject the null hypothesis.
• For the sake of being thorough, let’s summarize how we would utilize the
chi-square goodness-of-fit test.
1. Determine which test statistic is required for your problem and data.
   The chi-square goodness-of-fit statistic is relevant when you want to
   compare the observed frequencies or proportions for a single
   categorical variable to the frequencies predicted by a theory.
2. State your research hypothesis: that the observed frequencies were
   not generated by the population described by your theory.
3. State the alternative hypothesis: that the observed proportions are not
   equal to the theory-based proportions (i.e., ρ_observed ≠ ρ_theory; this
   is a non-directional test).
4. State the null hypothesis: that the observed proportions are equal to
   the theory-based proportions (i.e., ρ_observed = ρ_theory; here, ρ is the
   population parameter estimated by p, which is not the p-value but the
   observed proportion in each group).
5. Compute your observed chi-square value.
6. Determine the critical value for your test based on your degrees of
   freedom and desired α level OR determine the p-value for the
   observed chi-square value based on its degrees of freedom.
7. Compare the observed chi-square value to your critical value OR
   compare the p-value for the observed chi-square statistic to your
   chosen α, and make a decision to reject or retain your null hypothesis.
8. Make a substantive interpretation of your test results.
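Steps 5 through 7 can be sketched as a single function. This is an illustrative outline, not a prescribed implementation: the function name is hypothetical, and the small dictionary of α = .05 critical values stands in for the printed chi-square table.

```python
# Critical values for alpha = .05, as listed in standard chi-square tables.
CRITICAL_05 = {1: 3.84, 2: 5.99, 3: 7.82}


def goodness_of_fit_decision(observed, null_proportions, critical_values=CRITICAL_05):
    """Return (chi-square, degrees of freedom, reject_null) for one variable."""
    n = sum(observed)
    expected = [rho * n for rho in null_proportions]
    # Step 5: compute the observed chi-square value.
    chi_sq = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    # Step 6: look up the critical value for k - 1 degrees of freedom.
    df = len(observed) - 1
    critical = critical_values[df]
    # Step 7: reject the null hypothesis if the statistic exceeds the critical value.
    return chi_sq, df, chi_sq > critical


chi_sq, df, reject = goodness_of_fit_decision(
    [30, 50, 30, 200], [0.04, 0.12, 0.10, 0.74]
)
print(round(chi_sq, 2), df, reject)
```

Step 8, the substantive interpretation, is deliberately left out of the code: it depends on the research question, not on the arithmetic.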
Chi-Square Test of Association
• A second important application involving the chi-square distribution
allows us to evaluate whether two categorical variables are related to one
another (aka associated). If they are not related, we say that they are
independent of one another (i.e., knowledge about one of the variables
does not tell us anything about the other variable).
• One way of depicting the relationship between two variables involves
creating a contingency table (aka crosstab) showing the frequencies for
each pairing of levels of the two variables. Consider the table below, a
contingency table comparing the SES quartiles of two ethnicity groups.
Note that the cell frequencies within a row or column constitute
conditional totals, and the conditional totals in a row or column sum to
the marginal totals (i.e., row and column totals).

                    Q1    Q2    Q3    Q4   Total
African American    11     7     4     2      24
Caucasian           28    58    54    49     189
Total               39    65    58    51     213
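As a small illustration of how the marginal totals arise from the cell frequencies, the sketch below rebuilds the Total row and column of the contingency table. The variable names are assumptions made for the example.

```python
# Cell frequencies by ethnicity group, ordered Q1, Q2, Q3, Q4.
table = {
    "African American": [11, 7, 4, 2],
    "Caucasian": [28, 58, 54, 49],
}

# Row (ethnicity) marginal totals: sum across the SES quartiles.
row_totals = {group: sum(cells) for group, cells in table.items()}

# Column (SES quartile) marginal totals: sum down each column.
column_totals = [sum(column) for column in zip(*table.values())]

grand_total = sum(row_totals.values())

print(row_totals)      # {'African American': 24, 'Caucasian': 189}
print(column_totals)   # [39, 65, 58, 51]
print(grand_total)     # 213
```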
• The distinction between conditional and marginal distributions is an important
one because it highlights the manner in which the Pearson chi-square test of
association is linked to the hypothesis testing framework.
• Specifically, when we believe that there is no relationship between ethnicity and
SES, we can predict cell frequencies based on the marginal frequencies. That is,
when there is no relationship between the two variables, the conditional
distributions of ethnicity across the SES quartiles should be similar enough to
one another that we can conclude that any observed differences are due to
sampling error. Specifically, when there is no association, all of the conditional
distributions should merely be random deviations from the marginal
distributions.
• Treat each SES quartile as a sample. Sample 1 gives us p11 and p21. Sample 2
gives us p12 and p22. Sample 3 gives us p13 and p23. Sample 4 gives us p14 and
p24. All of these are assumed to differ from the marginal distribution, p1+ and
p2+, only because of sampling error.

                    Q1    Q2    Q3    Q4   Total
African American   p11   p12   p13   p14    p1+
Caucasian          p21   p22   p23   p24    p2+
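The comparison between conditional and marginal proportions can be sketched numerically. The example below assumes, as the notation above suggests, that p_ij is the proportion of ethnicity group i within SES quartile j and that p_i+ is the proportion of group i in the whole sample; the variable names are illustrative.

```python
observed = [
    [11, 7, 4, 2],      # African American: Q1, Q2, Q3, Q4
    [28, 58, 54, 49],   # Caucasian: Q1, Q2, Q3, Q4
]

column_totals = [sum(column) for column in zip(*observed)]
n = sum(column_totals)

# Conditional proportions p_ij: share of group i within quartile j.
conditional = [
    [cell / total for cell, total in zip(row, column_totals)]
    for row in observed
]

# Marginal proportions p_i+: share of group i in the full sample.
marginal = [sum(row) / n for row in observed]

for row, p_marginal in zip(conditional, marginal):
    print([round(p, 3) for p in row], "vs marginal", round(p_marginal, 3))
```

Under independence, each row of conditional proportions should hover around its marginal proportion; large gaps are what the chi-square test of association picks up.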
• To perform the Pearson chi-square test of association we do the following:
1. Determine which statistic is required for your problem and data. The
   chi-square test of association is relevant when you want to compare observed
   frequencies of two categorical variables to those implied by the table’s
   marginals when no relationship exists.
2. State the research hypothesis: There is a relationship between ethnicity and
   SES in the population.
3. State the alternative hypothesis: Ethnicity and SES are associated in the
   population.
4. State the null hypothesis: Ethnicity and SES are independent (note that our
   test determines whether the observed ps are too different to have been
   generated from the same ρ due to random sampling variation).
5. Compute the chi-square statistic.
6. Determine the critical value for your test based on your degrees of
   freedom [(R - 1)(C - 1)] and desired α level OR determine the p-value for
   the observed test statistic.
7. Compare the observed chi-square value to your critical value or compare
   the p-value for the observed chi-square to your chosen α, and make a
   decision to reject or retain your null hypothesis.
8. Make a substantive interpretation of your test results.
• Again, recall the formula for the chi-square statistic:

$$\chi^2 = \sum \frac{(O - E)^2}{E} = \sum_{i=1}^{R} \sum_{j=1}^{C} \frac{(O_{ij} - E_{ij})^2}{E_{ij}}$$

• Note that in this case, we sum the values across both rows and columns. In the
case of the test of association, we compute our expected values based on the
marginal totals (rather than state them based on substantive theory):

$$E_{ij} = \frac{R_i C_j}{N}$$

where R_i is the row total for the cell and C_j is the column total for the cell.
• However, just because our expected values come from numbers does not
mean that we are not imposing a substantive theory on their generation. In
fact, we are imposing a substantive theory. Computing expected frequencies
based on the marginal totals implies that the cell frequencies depend only on
the marginal distributions of ethnicity and SES. That is, the proportion of
members in each ethnicity group will be the same at each level of SES. Hence, our
computation of the expected value imposes a theory-based assumption that
there is no relationship between ethnicity and SES.
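The expected-value formula can be applied to the ethnicity by SES table in a few lines. This is a minimal sketch; the function name is an assumption made for illustration.

```python
def expected_from_margins(observed):
    """E_ij = (row total i) * (column total j) / N for every cell."""
    row_totals = [sum(row) for row in observed]
    column_totals = [sum(column) for column in zip(*observed)]
    n = sum(row_totals)
    return [[r * c / n for c in column_totals] for r in row_totals]


observed = [
    [11, 7, 4, 2],      # African American: Q1, Q2, Q3, Q4
    [28, 58, 54, 49],   # Caucasian: Q1, Q2, Q3, Q4
]

for row in expected_from_margins(observed):
    print([round(e, 2) for e in row])
# [4.39, 7.32, 6.54, 5.75]
# [34.61, 57.68, 51.46, 45.25]
```

Notice that each row of expected frequencies has the same proportional split across quartiles (39 : 65 : 58 : 51), which is exactly the no-association assumption described above.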


• For our example, we need to determine the critical value for our
hypothesis test. To do this, we need to state an α level; we’ll use the
traditional .05 level. We also need to know the degrees of freedom for the
test. Recall that the shape of the chi-square distribution changes
depending on the value of k.
• Degrees of freedom is a concept that will reappear several times in this
course. We use the term degrees of freedom to relay the fact that only a
portion of the values in our data set are free to vary once we impose the
null assumptions. In our case, only some of the cell frequencies are free to
vary once we impose the marginal totals on the table. As shown, in our
case only three of the cells are free to vary.

          Q1      Q2      Q3      Q4     Total
Black     11       7       4    FIXED      24
White   FIXED   FIXED   FIXED   FIXED     189
Total     39      65      58      51      213

• In general, the degrees of freedom for a chi-square test of association are
defined by

$$df = (R - 1)(C - 1)$$

• That means that the degrees of freedom for our example equals 3 [or
(2 - 1)(4 - 1)]. From the chi-square table, we see that the critical value
for our test equals 7.82.
• Here are expected frequencies for each cell.

                     Q1      Q2      Q3      Q4    Total
African American    4.39    7.32    6.54    5.75     24
Caucasian          34.61   57.68   51.46   45.25    189
Total                39      65      58      51     213

• And the differences between the observed and expected frequencies (aka
residuals). Notice that, relative to what the null assumption of no
association predicts, blacks are overrepresented in the first SES quartile
and whites are overrepresented in quartiles 2 through 4.

                     Q1      Q2      Q3      Q4    Total
African American    6.61   -0.32   -2.54   -3.75     24
Caucasian          -6.61    0.32    2.54    3.75    189
Total                39      65      58      51     213

• Here are the squared differences between observed and expected frequencies.

                     Q1      Q2      Q3      Q4    Total
African American   43.63    0.10    6.43   14.04     24
Caucasian          43.63    0.10    6.43   14.04    189
Total                39      65      58      51     213

• And here are the squared differences divided by the expected frequencies. Each
of these is equivalent to a chi-square with one degree of freedom. Based on this,
we see that the largest deviations are due to the frequency of blacks in the
extreme SES quartiles. The sum of these values is our chi-square statistic (15.07).

                     Q1      Q2      Q3      Q4    Total
African American    9.93    0.01    0.98    2.44     24
Caucasian           1.26    0.00    0.12    0.31    189
Total                39      65      58      51     213
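The four tables above correspond to one pass over the cells. The following sketch, with a hypothetical function name, returns the chi-square statistic, its degrees of freedom, and the per-cell contributions so that the largest deviations can be inspected.

```python
def pearson_chi_square(observed):
    """Return (chi-square, df, per-cell contributions) for a contingency table."""
    row_totals = [sum(row) for row in observed]
    column_totals = [sum(column) for column in zip(*observed)]
    n = sum(row_totals)

    contributions = []
    for row, r in zip(observed, row_totals):
        cells = []
        for o, c in zip(row, column_totals):
            e = r * c / n                  # expected frequency from the margins
            cells.append((o - e) ** 2 / e)  # squared residual scaled by E
        contributions.append(cells)

    chi_sq = sum(sum(cells) for cells in contributions)
    df = (len(observed) - 1) * (len(observed[0]) - 1)
    return chi_sq, df, contributions


observed = [
    [11, 7, 4, 2],      # African American: Q1, Q2, Q3, Q4
    [28, 58, 54, 49],   # Caucasian: Q1, Q2, Q3, Q4
]
chi_sq, df, contributions = pearson_chi_square(observed)
print(round(chi_sq, 2), df)   # 15.07 3
for cells in contributions:
    print([round(c, 2) for c in cells])
```

Returning the per-cell contributions alongside the total makes it easy to see which cells drive the result, which is the basis for the residual-based interpretation.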
• Because our observed chi-square of 15.07 is greater than our critical value
(7.82), we reject the null hypothesis and conclude that there is a
relationship between ethnicity and SES. Note that the p-value for the
observed statistic (which is reported by most statistical software; the
critical value typically is not reported) equals .002 (less than α = .05),
indicating that the observed pattern of cell frequencies is highly unlikely
under the null assumption of independence.
• A substantive interpretation of this test might read something like this:
• A chi-square test of association indicated that ethnicity and SES are
related, χ²(3) = 15.07, p = .002. Examination of the residuals indicates that
blacks appear in the low SES category too frequently and in the high SES
category too infrequently to assume that the observed frequencies are
random departures from a model of independence.
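For readers curious where the reported p-value comes from, here is a sketch that computes the upper-tail area for the special case of 3 degrees of freedom, where the chi-square tail has a closed form in terms of the complementary error function. The function name is hypothetical, and the formula applies only when df = 3; software uses general-purpose routines instead.

```python
import math


def chi2_p_value_df3(x):
    """Upper-tail area of a chi-square distribution with exactly 3 df.

    Closed form: erfc(sqrt(x / 2)) + sqrt(2 * x / pi) * exp(-x / 2).
    """
    return math.erfc(math.sqrt(x / 2)) + math.sqrt(2 * x / math.pi) * math.exp(-x / 2)


print(round(chi2_p_value_df3(15.07), 3))   # 0.002
```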
Example – Goodness of Fit
• 6.1 The chairperson of a psychology department suspects that some of her
faculty members are more popular with students than are others. There
are three sections of introductory psychology, taught at 10:00am, 11:00am,
and 12:00pm by professors Anderson, Klatsky, and Kamm. The number of
students who enroll for each is

Professor Anderson   Professor Klatsky   Professor Kamm
        32                   25                10

• State the null hypothesis, run the appropriate chi-square test, and interpret
the results.
• The null hypothesis in this study is that students in the population enroll
at random across the three sections, so that each professor is equally likely
to be chosen.

H0: πAnderson = πKlatsky = πKamm

            Professor Anderson   Professor Klatsky   Professor Kamm   TOTAL
Observed            32                   25                10           67
Expected           22.3                 22.3              22.3          67

$$\chi^2 = \sum \frac{(O - E)^2}{E} = \frac{(32 - 22.3)^2}{22.3} + \frac{(25 - 22.3)^2}{22.3} + \frac{(10 - 22.3)^2}{22.3} = 11.33$$

• The critical χ²(2) = 5.99 at the 0.05 level. So?
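The arithmetic for exercise 6.1 can be checked with a short script. This is a sketch under the stated null hypothesis of equal enrollment; the variable names are illustrative. It keeps the expected frequency unrounded (67 / 3), so the statistic comes out at about 11.31 rather than the 11.33 obtained by hand with E rounded to 22.3.

```python
enrollments = {"Anderson": 32, "Klatsky": 25, "Kamm": 10}

n = sum(enrollments.values())       # 67 students in total
expected = n / len(enrollments)     # equal enrollment under H0

chi_sq = sum((o - expected) ** 2 / expected for o in enrollments.values())
df = len(enrollments) - 1
critical_value = 5.99               # chi-square table, df = 2, alpha = .05

print(round(chi_sq, 2), df, chi_sq > critical_value)   # 11.31 2 True
```

Either way the statistic is well above the critical value, so the rounding does not change the decision.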
• 6.2 The data in 6.1 will not really answer the question the
chairperson wants answered. What is the problem, and how
could the experiment be improved?
Example – Test of Association
• 6.8 We know that smoking has all sorts of ill effects on people; among other
things, there is evidence that it affects fertility. Weinberg and Gladen (1986)
examined the effect of smoking on the ease with which women become
pregnant. The researchers asked 586 women who had planned pregnancies how
many menstrual cycles it had taken for them to become pregnant after
discontinuing contraception. Weinberg and Gladen also sorted the women into
whether they were smokers or nonsmokers. The data follow.

              1 Cycle   2 Cycles   3+ Cycles   Total
Smokers          29        16         55        100
Nonsmokers      198       107        181        486
Total           227       123        236        586

• Is there an association between smoking and pregnancy?
• The expected values are in the parentheses.

               1 Cycle         2 Cycles        3+ Cycles      Total
Smokers       29 (38.74)      16 (20.99)      55 (40.27)       100
Nonsmokers   198 (188.26)    107 (102.01)    181 (195.73)      486
Total        227             123             236               586

$$E_{\text{smoker, 1 cycle}} = \frac{227 \times 100}{586} = 38.74$$

$$\chi^2 = \frac{(29 - 38.74)^2}{38.74} + \frac{(16 - 20.99)^2}{20.99} + \dots + \frac{(181 - 195.73)^2}{195.73} \approx 10.87$$

• The critical χ²(2) = 5.99 at the 0.05 level. So?
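As a check on the hand arithmetic for exercise 6.8, the sketch below recomputes every expected frequency from the marginal totals and then sums the cell contributions. The variable names are illustrative assumptions, and the .05 critical value for 2 degrees of freedom is the one quoted above.

```python
observed = [
    [29, 16, 55],      # smokers: 1 cycle, 2 cycles, 3+ cycles
    [198, 107, 181],   # nonsmokers: 1 cycle, 2 cycles, 3+ cycles
]

row_totals = [sum(row) for row in observed]
column_totals = [sum(column) for column in zip(*observed)]
n = sum(row_totals)

# Expected frequency for each cell: row total * column total / N.
expected = [[r * c / n for c in column_totals] for r in row_totals]

chi_sq = sum(
    (o - e) ** 2 / e
    for obs_row, exp_row in zip(observed, expected)
    for o, e in zip(obs_row, exp_row)
)
df = (len(observed) - 1) * (len(observed[0]) - 1)
critical_value = 5.99   # chi-square table, df = 2, alpha = .05

for exp_row in expected:
    print([round(e, 2) for e in exp_row])
print(round(chi_sq, 2), df, chi_sq > critical_value)
```

A quick sanity check on any table of expected frequencies is that each row and column of expected values must add up to the same marginal total as the observed values.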