Chi-square (χ2)
Fenster
Chi-Square

Chi-Square χ2

Tests of Statistical Significance for
Nominal Level Data (Note: can also be
used for ordinal level data).
Chi-Square
Chi-Square is an elegant and beautiful test
 The assumptions required to use the test
are very weak. That is to say, we do not
have to make many assumptions about
how the data are distributed.

Chi-Square

We ask the following question: are the
frequencies empirically obtained (by this
we mean OBSERVED) significantly
different from those that would have
been EXPECTED under some general set
of assumptions?
Chi-Square
Assumptions to use Chi-Square Test:
 Samples are randomly selected from the
population.
 EXPECTED frequencies (to be defined
later) are greater than 5 in every cell. But
even this assumption can be modified with
the use of Yates' correction. WE DO NOT
NEED TO ASSUME NORMALITY!!!!

Chi-Square

This may not surprise you. After all, the
concept of a normal distribution has no
meaning for nominal level data and
chi-square is a test for nominal level data.
Chi-square is so popular because of this weak
set of assumptions.
Chi-Square


Ho for chi-square: If there is no relationship
between the dependent and independent
variables, the column percentages will not
change as we move across levels of the
INDEPENDENT variable.
Note: We covered this earlier in the course. We
said there is no relationship between two
variables if the column percentages do not
change across levels of the independent variable.
Chi-Square
We can compute an EXPECTED set of
frequencies from the MARGINAL totals of
the dependent and independent variables.
 To calculate EXPECTED frequencies we
take the row total multiplied by the column
total and divide by the grand total

Chi-Square

$$\text{Expected frequency} = \frac{(\text{Row total}) \times (\text{Column total})}{\text{Grand total}}$$

OBSERVED Frequencies are those
frequencies that are empirically obtained.

Those are the frequencies that are given
to us.
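The rule for expected frequencies can be sketched in a few lines of Python. This is only an illustration: the function name `expected_frequencies` and the small 2 X 2 table are hypothetical and do not come from the course materials.

```python
def expected_frequencies(observed):
    """Return expected counts for a table of observed counts.

    `observed` is a list of rows; each row is a list of cell counts.
    """
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    grand_total = sum(row_totals)
    # Expected count for a cell = (row total * column total) / grand total.
    return [[r * c / grand_total for c in col_totals] for r in row_totals]


# Hypothetical observed counts, used only to show the call.
example = [[10, 20], [30, 40]]
expected = expected_frequencies(example)
```

Note that the expected frequencies depend only on the marginal totals, not on how the counts are arranged inside the table.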
Chi-Square

$$\chi^2 = \sum \frac{(\text{Observed frequency} - \text{Expected frequency})^2}{\text{Expected frequency}}$$
Chi-Square
Usually this formula is written

$$\chi^2 = \sum \frac{(O - E)^2}{E}$$


The larger the difference between
observed and expected frequencies the
larger the value for χ2.
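A minimal sketch of the formula, assuming the observed and expected frequencies are given as two lists in the same cell order. The function name `chi_square` and the counts are hypothetical. They are chosen only to show that a larger gap between O and E gives a larger χ2.

```python
def chi_square(observed, expected):
    """Sum (O - E)^2 / E over paired observed and expected cell counts."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


# Hypothetical counts: the expected frequency is 50 in both cells.
close = chi_square([48, 52], [50, 50])  # observed counts near the expected ones
far = chi_square([30, 70], [50, 50])    # observed counts far from the expected ones
```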
Chi-Square
If you look at a chi-square table, you will
see many different χ2 distributions.
 Which one should you use?
 You use the χ2 distribution with the
appropriate number of degrees of
freedom.
 For χ2 degrees of freedom are given with
the following formula: df= (r-1) X (c-1)

Chi-Square
That is to say
 (1) we take the number of rows we have
and subtract one.
 (2) We take the number of columns we
have and subtract one.
 (3) We then multiply the numbers we get
for the first two parts.
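The three steps above can be written as a one-line calculation. The sketch below assumes the table is stored as a list of rows WITHOUT the row and column totals. The name `degrees_of_freedom` is illustrative only.

```python
def degrees_of_freedom(table):
    """Return (r - 1) * (c - 1) for a table given as a list of rows."""
    n_rows = len(table)     # step 1 uses the number of rows
    n_cols = len(table[0])  # step 2 uses the number of columns
    return (n_rows - 1) * (n_cols - 1)  # step 3 multiplies the two results


# Placeholder tables: only their shape matters here, not the counts.
df_2x2 = degrees_of_freedom([[0, 0], [0, 0]])
df_3x3 = degrees_of_freedom([[0] * 3 for _ in range(3)])
```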

Chi-Square
Logic of the χ2 test:
 We do not expect observed and expected
frequencies to be EXACTLY the same.
 Observed and expected values can vary
simply by sampling variability.
 However, if the value of χ2 turns out to be
larger than that expected by chance, we
shall be in a position to reject the null
hypothesis.

Chi-Square
EXAMPLE:
 Let us say one was interested in
investigating the relationship between
gender and opinions on accountability.
 Our null hypothesis is that gender makes
no difference in attitudes towards
accountability. Our research hypothesis is
that gender makes a difference in attitudes
towards accountability.

Chi-Square
                                                 Gender
Opinion on Accountability                     Male   Female   Row Totals
Accountability good for educational system     126       99          225
Accountability bad for educational system       71      162          233
Col. Totals                                    197      261          458
Chi-Square
It is important to note that the numbers in
each cell are actual frequencies rather
than percentages.
 Let us go through our six-step hypothesis
testing method in this case.

Chi-Square
Step 1: State the null hypothesis
 Ho: Gender does not make a difference
when predicting attitudes towards
educational accountability.
 Step 2: State the research hypothesis
 H1: Gender does make a difference when
predicting attitudes towards educational
accountability.

Chi-Square
Step 3: Select a significance level: Let’s
choose α=.01
 Step 4: Collect and summarize the sample
data:
 Calculation of χ2:
 Compute the EXPECTED FREQUENCIES
for EACH CELL

Chi-Square
Computing the EXPECTED
FREQUENCIES
 Cell a: males who believe that
accountability is good for the educational
system

$$\frac{(197)(225)}{458} = 96.8$$

Chi-Square
Cell b: females who believe that
accountability is good for the educational
system

$$\frac{(261)(225)}{458} = 128.2$$

Chi-Square
Cell c: males who believe that
accountability is bad for the educational
system

$$\frac{(197)(233)}{458} = 100.2$$

Chi-Square
Cell d: females who believe that
accountability is bad for the educational
system

$$\frac{(261)(233)}{458} = 132.8$$

Set up a chi-square table

Cell    f observed   f expected   f obs - f exp   (f obs - f exp)^2   (f obs - f exp)^2 / f exp
A              126         96.8            29.2              852.64                       8.808
B               99        128.2           -29.2              852.64                       6.651
C               71        100.2           -29.2              852.64                       8.509
D              162        132.8            29.2              852.64                       6.420
Total          458        458                 0                                          30.388
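The table above can be reproduced with a short Python sketch. The cell labels A to D and the counts are taken from the example. The function name `chi_square_rows` and the `round_expected` switch are assumptions added for illustration. The switch shows the effect of rounding the expected frequencies to one decimal place, as was done in the hand calculation.

```python
# Observed counts and marginal totals from the gender-by-accountability example.
observed = {"A": 126, "B": 99, "C": 71, "D": 162}
row_total = {"A": 225, "B": 225, "C": 233, "D": 233}
col_total = {"A": 197, "B": 261, "C": 197, "D": 261}
GRAND_TOTAL = 458


def chi_square_rows(round_expected):
    """Build one (cell, f_obs, f_exp, diff, diff^2, diff^2 / f_exp) row per cell."""
    rows = []
    for cell, f_obs in observed.items():
        f_exp = row_total[cell] * col_total[cell] / GRAND_TOTAL
        if round_expected:
            f_exp = round(f_exp, 1)  # one decimal place, as in the hand calculation
        diff = f_obs - f_exp
        rows.append((cell, f_obs, f_exp, diff, diff ** 2, diff ** 2 / f_exp))
    return rows


rounded_rows = chi_square_rows(round_expected=True)
exact_rows = chi_square_rows(round_expected=False)
rounded_total = sum(row[5] for row in rounded_rows)
exact_total = sum(row[5] for row in exact_rows)

for cell, f_obs, f_exp, diff, squared, contribution in rounded_rows:
    print(f"{cell} {f_obs:5d} {f_exp:7.1f} {diff:7.1f} {squared:8.2f} {contribution:7.3f}")
print(f"chi-square with rounded expected values:   {rounded_total:.3f}")
print(f"chi-square with unrounded expected values: {exact_total:.3f}")
```

With rounded expected frequencies the total agrees with the table, apart from rounding in the individual contributions. Carrying full precision gives a slightly larger value, roughly 30.43. The difference comes only from rounding.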
Chi-Square





Step 5
Obtaining the sampling distribution. Look at a
chi-square table. We will use the chi-square test
with 1 degree of freedom. Why one? df=(r-1) X
(c-1)
We have 2 rows and 2 columns.
so we get df= (2-1) X (2-1)= 1 X 1=1
With our choice of α=.01, we get a χ2 critical of
6.635 (found in chi-square table, p. 566)
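If a printed table is not at hand, the critical value for ONE degree of freedom can be checked with the Python standard library. The sketch relies on a known fact that is not stated in these slides: a chi-square variable with 1 degree of freedom is the square of a standard normal variable. The function name `chi_square_critical_df1` is hypothetical, and the approach does NOT apply to tables with more than one degree of freedom.

```python
from statistics import NormalDist


def chi_square_critical_df1(alpha):
    """Upper-tail critical value of chi-square with 1 degree of freedom.

    Because chi-square with 1 df is the square of a standard normal
    variable, its upper-tail cutoff at level alpha is the square of the
    two-tailed normal cutoff at the same alpha.
    """
    z = NormalDist().inv_cdf(1 - alpha / 2)
    return z ** 2


critical = chi_square_critical_df1(0.01)
```

For α=.01 this agrees with the tabled value of 6.635 to three decimal places.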
Chi-Square

If we find a χ2 greater than or equal to 6.635, we
reject the null hypothesis and conclude that
gender does make a difference when predicting
attitudes towards educational accountability.

If we find a χ2 less than 6.635, we fail to reject
the null hypothesis: we do not have sufficient
evidence to conclude that gender makes a
difference when predicting attitudes towards
educational accountability.
Chi-Square
Note: All χ2 tests are one-tailed tests.
 Chi-square can only tell you whether a
relationship is statistically significant.
 Chi-square cannot tell you anything about
the DIRECTIONALITY of the relationship.
 You must inspect the column percentages
as you move across categories of the
independent variable to determine
DIRECTIONALITY.
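As an illustration, the column percentages for the gender example can be computed as follows. The dictionary layout and the function name `column_percentages` are assumptions made for this sketch. Gender is the independent variable, so each percentage is taken within a gender column.

```python
# Observed counts from the example: rows are opinions, columns are genders.
observed = {
    "good": {"male": 126, "female": 99},
    "bad": {"male": 71, "female": 162},
}


def column_percentages(table):
    """Express every cell as a percentage of its column total."""
    columns = list(next(iter(table.values())))
    col_totals = {c: sum(row[c] for row in table.values()) for c in columns}
    return {
        r: {c: 100 * row[c] / col_totals[c] for c in columns}
        for r, row in table.items()
    }


pct = column_percentages(observed)
for opinion, row in pct.items():
    print(opinion, {gender: round(value, 1) for gender, value in row.items()})
```

About 64% of males but only about 38% of females say accountability is good for the educational system. That difference in column percentages is what gives the relationship its direction.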

Chi-Square




Another way to determine DIRECTIONALITY is to look
at the RESIDUALS (you can instruct SPSS to present
the residuals in your output file).
RESIDUALS are simply the OBSERVED cell count
minus the EXPECTED value.
If the RESIDUALS are NEGATIVE, you are getting
fewer OBSERVED cases than EXPECTED in a CELL.
If the RESIDUALS are POSITIVE, you are getting
more OBSERVED cases than EXPECTED in a CELL.
Chi-Square
To determine DIRECTIONALITY, look at
the SIGN changes of the RESIDUALS as
you move across categories of the
independent variable.
 Let us assume that the RESIDUALS start
out NEGATIVE and end up POSITIVE.
This would imply that the independent
variable is related to the dependent
variable.
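A small sketch of the residual check, using the observed counts and the rounded expected frequencies from the gender example. The helper names `residuals` and `sign` are hypothetical. SPSS reports the same quantities as unstandardized residuals.

```python
# Rows: accountability good, accountability bad. Columns: male, female.
observed = [[126, 99], [71, 162]]
expected = [[96.8, 128.2], [100.2, 132.8]]  # rounded expected frequencies


def residuals(observed, expected):
    """Residual for each cell = observed count minus expected count."""
    return [
        [o - e for o, e in zip(obs_row, exp_row)]
        for obs_row, exp_row in zip(observed, expected)
    ]


def sign(value):
    """Return '+', '-' or '0' so that sign changes are easy to read."""
    return "+" if value > 0 else "-" if value < 0 else "0"


res = residuals(observed, expected)
signs = [[sign(r) for r in row] for row in res]
```

In the "good" row the residual is positive for males and negative for females. In the "bad" row the signs are reversed. More males than expected, and fewer females than expected, say accountability is good.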

Chi-Square
Step 6: Make a decision:
 χ2 observed= 30.388 and χ2 critical=
6.635
 Decision: REJECT Ho :
 χ2 observed is greater than χ2 critical
 We easily reject the null hypothesis and
conclude that gender does make a
difference when predicting attitudes
towards educational accountability.

Chi-Square
Two points to note in this example.
 We had one degree of freedom.
 By one degree of freedom we mean that
only one number in the table is actually
free to vary.
 Assume we know the row and column
totals.
 Once we know one number in a 2 X 2
table, we can find the other three.

Chi-Square
a        b        a+b
c        d        c+d
a+c      b+d      a+b+c+d

If I knew the row and column totals, there
is only one cell that is free to vary.
Chi-Square

WE ONLY NEED TO KNOW ONE CELL
TO KNOW THE ENTIRE TABLE. THIS IS
WHY WE HAD ONE DEGREE OF
FREEDOM.
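To see this concretely, the sketch below recovers the whole 2 X 2 table of the gender example from cell a and the marginal totals. The function name `complete_2x2` is an assumption made for illustration.

```python
def complete_2x2(a, row_totals, col_totals):
    """Recover cells b, c and d of a 2 x 2 table from cell a and the margins."""
    b = row_totals[0] - a  # first row:     a + b = first row total
    c = col_totals[0] - a  # first column:  a + c = first column total
    d = row_totals[1] - c  # second row:    c + d = second row total
    return [[a, b], [c, d]]


# Cell a and the marginal totals from the gender example.
table = complete_2x2(126, row_totals=(225, 233), col_totals=(197, 261))
```

Knowing only that cell a is 126 fixes the other three cells at 99, 71, and 162.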
Chi-Square

In our example, f observed - f expected has
the same magnitude in every cell (-29.2 or
29.2) because the table has only one
degree of freedom. If a table has more
than one degree of freedom, f observed - f
expected will generally differ from cell
to cell.
Chi-Square
How many cells do we need to know in a
3 X 3 table? The formula tells
us the answer is (r-1) X (c-1):

(3-1) X (3-1) = 2 X 2 = 4

Let us see how we get df to equal 4.
a        b        c        a+b+c
d        e        f        d+e+f
g        h        i        g+h+i
a+d+g    b+e+h    c+f+i    a+b+c+d+e+f+g+h+i
Chi-Square





Let us say we know cell a.
Could we find all the other cells in the table?
Not this time. Let’s say we know cells a and b.
If we know cells a and b, then we can find cell
c, but we still cannot find any other cells.
Only if we know four cells (a, b, d, and e)
can we find the other five cells.
This is why we have four degrees of freedom in
a 3 X 3 table.
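The same argument can be sketched for a table of any size. The code assumes the known cells form the top-left block of (r-1) rows and (c-1) columns, which for a 3 X 3 table means cells a, b, d, and e. The function name `complete_table` and the 3 X 3 counts are hypothetical.

```python
def complete_table(known_block, row_totals, col_totals):
    """Fill in an r x c table from its top-left (r-1) x (c-1) block and margins."""
    # Last cell of each known row = row total minus the known cells in that row.
    rows = [
        list(row) + [row_totals[i] - sum(row)]
        for i, row in enumerate(known_block)
    ]
    # Each cell of the last row = column total minus the cells above it.
    last_row = [
        col_totals[j] - sum(row[j] for row in rows)
        for j in range(len(col_totals))
    ]
    return rows + [last_row]


# Hypothetical 3 x 3 example: cells a, b, d, e are known, plus all the margins.
table = complete_table(
    known_block=[[1, 2], [4, 5]],
    row_totals=[6, 15, 24],
    col_totals=[12, 15, 18],
)
```

Four known cells are enough to recover the other five. With fewer than four, at least one cell would remain undetermined.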
Chi-Square
One other point about chi-square.
 Chi-square can tell you whether a
relationship is significant.
 Chi-square can also tell you what cells are
most important in determining the
significance of the relationship. In our
example we find that all cells contribute to
the significance of the relationship.

Set up a chi-square table

Cell    f observed   f expected   f obs - f exp   (f obs - f exp)^2   (f obs - f exp)^2 / f exp
A              126         96.8            29.2              852.64                       8.808
B               99        128.2           -29.2              852.64                       6.651
C               71        100.2           -29.2              852.64                       8.509
D              162        132.8            29.2              852.64                       6.420
Total          458        458                 0                                          30.388
Chi-Square

Three of our cells (A, B, and C) have individual
χ2 contributions greater than the critical value
(6.635) needed to establish statistical
significance for the entire relationship.
Since χ2 contributions cannot be negative, we can
determine whether part of our relationship drives
the entire relationship to statistical
significance.
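The cell-by-cell comparison can be sketched as follows, using the observed counts, the rounded expected frequencies, and the critical value of 6.635 from the example. The variable names are illustrative only.

```python
CRITICAL = 6.635  # critical value used in the example (alpha = .01, df = 1)

# (observed, expected) pairs for cells A to D, expected values rounded as in the table.
cells = {
    "A": (126, 96.8),
    "B": (99, 128.2),
    "C": (71, 100.2),
    "D": (162, 132.8),
}

# Contribution of each cell to chi-square: (O - E)^2 / E.
contributions = {name: (o - e) ** 2 / e for name, (o, e) in cells.items()}

# Cells whose contribution alone exceeds the critical value.
exceeding = [name for name, value in contributions.items() if value > CRITICAL]
```

Here `exceeding` holds cells A, B, and C. Cell D falls just short of the critical value on its own, although it still adds to the total.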
SPSS Command Syntax for Crosstabs
Note: You can get EXPECTED
frequencies in SPSS by going into
 ANALYZE
 DESCRIPTIVE STATISTICS
 CROSSTABS and clicking on
CROSSTABS
 Dependent variable goes into row box
 Independent variable goes into column
box

SPSS Command Syntax for Crosstabs
Click on Cells
 Click on EXPECTED
 Also click on UNSTANDARDIZED under
Residuals. Click Continue.
 Click on the STATISTICS box
 Click on Chi-Square.
 You may also want to click on the
Contingency Coefficient and Lambda.
