Chi-Squared tests (2):
Chi-Squared tests (2):
Use with nominal (categorical) data – when all you have is
the frequency with which certain events have occurred.
[Figure: categorical data, e.g. "non-psycho" vs. "psycho" (avoid this, where possible), contrasted with a score per participant (aim for this, where possible).]
The 2 “Goodness of
Fit” test:
Compares an observed
frequency distribution
with an expected
frequency distribution.
Useful when you have the observed frequencies for a number of mutually exclusive categories, and you want to decide if they have occurred equally frequently.
[Figure: No. of squirrels killed yearly on the A27. Chart of number of dead squirrels (0 to 60) by year of study (1 to 7), showing observed frequency and expected frequency.]
Which soap-powder name do shoppers like best?
Each of 100 shoppers picks the powder name they like most.
Number of shoppers picking each name (observed frequencies):

Washo   Scruba   Musty   Stainzoff   Beeo   total
40      35       5       10          10     100

Expected frequency for each category is
total no. of observations / number of categories:
100 / 5 = 20.
The formula for Chi-Square:

$$\chi^2 = \sum \frac{(O - E)^2}{E}$$

            Washo   Scruba   Musty   Stainzoff   Beeo   total
O:          40      35       5       10          10     100
E:          20      20       20      20          20     100
(O-E):      20      15       -15     -10         -10
(O-E)²:     400     225      225     100         100
(O-E)²/E:   20      11.25    11.25   5           5

χ² = 52.5
Chi-squared is the sum, across categories, of the squared
difference between each observed frequency and its associated
expected frequency, divided by that expected frequency.
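As a check on the arithmetic, here is a minimal Python sketch of the goodness-of-fit calculation for the soap-powder data. It assumes equal expected frequencies in every category, as in the example; the function name `goodness_of_fit` is illustrative, not taken from any library.

```python
# Observed counts from the soap-powder example (100 shoppers, 5 names).
observed = {"Washo": 40, "Scruba": 35, "Musty": 5, "Stainzoff": 10, "Beeo": 10}

def goodness_of_fit(counts):
    """Chi-squared statistic, assuming equal expected frequencies."""
    total = sum(counts)
    expected = total / len(counts)  # e.g. 100 / 5 = 20 in every category
    # Sum of (O - E)^2 / E over all categories.
    return sum((o - expected) ** 2 / expected for o in counts)

chi_squared = goodness_of_fit(list(observed.values()))
print(f"chi-squared = {chi_squared:.1f}")  # prints: chi-squared = 52.5
```

Each term in the sum corresponds to one column of the (O-E)²/E row in the worked table.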
The bigger the value of 2, the greater the
difference between observed and expected
frequencies.
But how big does 2 have to be, to be regarded
as “big”? Is 52.5 “big”?
We compare our obtained 2 value to 2 values which
would be obtained by chance.
To do this, we need the “degrees of freedom”: this is
the number of categories (or “cells”) minus one.
We have a 2 value of 52.5, with 5-1 = 4 d.f.
Tables show how likely various values of χ² are to
occur by chance, e.g.:

        probability level:
d.f.    .05      .01      .001
1       3.84     6.63     10.83
2       5.99     9.21     13.82
3       7.81     11.34    16.27
4       9.49     13.28    18.46
5       11.07    etc.     etc.
52.5 is bigger than 18.46, a value of χ² which will
occur by chance less than 1 time in 1000 (p < .001).
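Instead of reading a printed table, the tail probability can be computed directly. The sketch below uses only Python's standard library and the closed-form finite series for the chi-squared distribution with whole-number d.f.; `chi2_sf` is a hypothetical helper name, and a statistics package would normally be used instead.

```python
import math

def chi2_sf(x, df):
    """Upper-tail probability P(chi-squared >= x) for integer d.f. >= 1."""
    if x <= 0:
        return 1.0
    if df % 2 == 0:
        # Even d.f.: exp(-x/2) * sum_{i=0}^{df/2 - 1} (x/2)^i / i!
        term, total = 1.0, 1.0
        for i in range(1, df // 2):
            term *= (x / 2) / i
            total += term
        return math.exp(-x / 2) * total
    # Odd d.f.: erfc(sqrt(x/2)) plus a finite correction series.
    term = math.sqrt(2 * x / math.pi) * math.exp(-x / 2)
    total = 0.0
    for i in range(1, (df - 1) // 2 + 1):
        total += term
        term *= x / (2 * i + 1)
    return math.erfc(math.sqrt(x / 2)) + total

# The tabled .05 critical value for 4 d.f. gives a tail area close to .05.
print(round(chi2_sf(9.49, 4), 3))
# The obtained value of 52.5 with 4 d.f. is far below the .001 level.
print(chi2_sf(52.5, 4) < 0.001)
```

The first line of output should be close to .05, which is what the 9.49 entry in the table encodes.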
The sampling distribution of chi-square:
Frequency with which χ² values occur purely by
chance:
With 4 d.f., χ² values of 9.49 or more
are likely to occur by chance on less
than .05 of occasions.
Our obtained χ² = 52.5, with 4 d.f., p < .001.
A χ² value this large is highly unlikely to have
arisen by chance.
It appears that the distribution of shoppers’
choices across soap-powder names is not
random. Some names get picked more than we
would expect by chance and some get picked
less.
The 2 test of association between two
independent variables:
Another common use of 2 is to determine whether
there is an association between two independent
variables.
Is there an association between gender (male or
female: IV A) and soap powder (Washo, Musty, etc.: IV
B)?
This gives a 2 x 5 contingency table.
Data for a random sample of 100 shoppers, 70 men and 30 women:

          Washoe   Scrubbup   Musty   Stainoff   Nogunge   total
male      10       12         5       3          40        70
female    6        2          1       20         1         30
totals:   16       14         6       23         41        100
To calculate expected frequencies:

$$E = \frac{\text{row total} \times \text{column total}}{\text{grand total}}$$
Work out the expected frequency for each cell (expected frequencies in brackets):

          Washoe      Scrubbup    Musty      Stainoff    Nogunge     total
male      10 (11.2)   12 (9.8)    5 (4.2)    3 (16.1)    40 (28.7)   70
female    6 (4.8)     2 (4.2)     1 (1.8)    20 (6.9)    1 (12.3)    30
totals:   16          14          6          23          41          100

e.g. 11.2 = (16 * 70)/100; 6.9 = (23 * 30)/100, etc.
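The expected frequencies for the whole table can be produced in one pass. This is a small Python sketch of the row total * column total / grand total rule applied to the gender by soap-powder counts; `expected_frequencies` is an illustrative name.

```python
# Observed counts: rows are male, female; columns are the five powder names.
observed = [
    [10, 12, 5, 3, 40],   # male
    [6, 2, 1, 20, 1],     # female
]

def expected_frequencies(table):
    """Expected count per cell: row total * column total / grand total."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    return [[r * c / grand_total for c in col_totals] for r in row_totals]

expected = expected_frequencies(observed)
for label, row in zip(["male", "female"], expected):
    print(label, [round(e, 1) for e in row])
```

The expected frequencies in each row and column add up to the same totals as the observed frequencies, which is a quick way to check the working.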
Using exactly the same formula as before, we get χ²
= 52.94.
d.f. = (number of rows - 1) * (number of columns - 1).
We have two rows and five columns,
so d.f. = (2-1) * (5-1) = 4 d.f.
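The statistic and its degrees of freedom for a contingency table can be sketched in Python as follows. The function name `chi_squared_association` is hypothetical, and no continuity correction is applied.

```python
observed = [
    [10, 12, 5, 3, 40],   # male
    [6, 2, 1, 20, 1],     # female
]

def chi_squared_association(table):
    """Return (chi_squared, df) for a rows x columns table of counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    chi_squared = 0.0
    for i, row in enumerate(table):
        for j, o in enumerate(row):
            e = row_totals[i] * col_totals[j] / grand_total  # expected count
            chi_squared += (o - e) ** 2 / e
    # d.f. = (number of rows - 1) * (number of columns - 1)
    df = (len(row_totals) - 1) * (len(col_totals) - 1)
    return chi_squared, df

chi_squared, df = chi_squared_association(observed)
print(f"chi-squared = {chi_squared:.2f}, d.f. = {df}")
# prints: chi-squared = 52.94, d.f. = 4
```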
Use the same table to assess the chances of
obtaining a Chi-Squared value as large as this by
chance; again p < .001.
Conclusion: our observed frequencies are
significantly different from the frequencies we would
expect to obtain if there were no association
between the two variables: i.e. the pattern of name
preferences is different for men and women.
Chi-Square test merely tells you that there is some
relationship (an association) between the two
variables in question: it does not tell you anything
about the causal relationship between the two
variables.
Here, it is reasonable to assume that gender
causes people to pick different soap powder
names; it's unlikely that soap powder names
cause people to be male or female.
However, in principle causality
could equally well run in either direction.
Assumptions of the Chi-Square test:
1. Observations must be independent: each
subject must contribute to one and only one
category. Otherwise the test results are
completely invalid.
2. Problems arise when expected frequencies are
very small. Chi-Square should not be used if
more than 20% of the expected frequencies have
a value of less than 5. (It does not matter what
the observed frequencies are).
Two solutions: combine some categories (if this
is meaningful in your experiment), OR obtain
more data (make the sample size bigger).
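The second assumption is easy to check mechanically once the expected frequencies are known. The sketch below applies the 20% rule stated above to the expected frequencies from the gender by soap-powder table; `fraction_below` is an illustrative helper.

```python
def fraction_below(expected, threshold=5):
    """Share of expected frequencies that fall below the threshold."""
    cells = [e for row in expected for e in row]
    return sum(e < threshold for e in cells) / len(cells)

# Expected frequencies from the gender by soap-powder example.
expected = [
    [11.2, 9.8, 4.2, 16.1, 28.7],   # male
    [4.8, 4.2, 1.8, 6.9, 12.3],     # female
]

share = fraction_below(expected)
print(f"{share:.0%} of expected frequencies are below 5")
print("within the 20% rule" if share <= 0.20 else "exceeds the 20% rule")
```

For that table, 4 of the 10 expected frequencies are below 5, so by this rule some categories would need to be combined, or more data collected, before relying on the result.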
χ² test of association: the one-d.f. case:

                 Like Statistics?
Degree:          Yes:    No:    Row total:
BA:              13      10     23
BSc:             5       24     29
Column total:    18      34     52
If you have only 1 d.f. (as with a 2 x 2 table), the χ²
value obtained is inflated; some statisticians
therefore advocate using "Yates' Correction for
Continuity" to make the χ² test more conservative
(i.e. make the obtained χ² value smaller and
hence less likely to be significant).
Same procedure as before, except
(a) take the absolute value of O - E (i.e., ignore
any negative signs).
(b) Subtract 0.5 from each absolute O - E, before squaring it.

$$\chi^2 = \sum \frac{(|O - E| - 0.5)^2}{E}$$

Without Yates’ Correction: χ² = 8.74.
With Yates’ Correction: χ² = 7.09.
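Both values can be reproduced with a short Python sketch for the degree by "Like Statistics?" table. It follows the two steps described here literally (absolute difference, then subtract 0.5); the function name and the `yates` flag are illustrative choices.

```python
observed = [
    [13, 10],  # BA:  likes statistics yes / no
    [5, 24],   # BSc: likes statistics yes / no
]

def chi_squared_2x2(table, yates=False):
    """Chi-squared for a 2 x 2 table, optionally with Yates' correction."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    chi_squared = 0.0
    for i, row in enumerate(table):
        for j, o in enumerate(row):
            e = row_totals[i] * col_totals[j] / grand_total
            diff = abs(o - e)      # step (a): ignore the sign
            if yates:
                diff -= 0.5        # step (b): subtract 0.5 before squaring
            chi_squared += diff ** 2 / e
    return chi_squared

print(f"without correction: {chi_squared_2x2(observed):.2f}")              # 8.74
print(f"with Yates' correction: {chi_squared_2x2(observed, True):.2f}")    # 7.09
```

In a 2 x 2 table every cell has the same absolute O - E (about 5.04 here), so the correction shrinks all four terms by the same proportion.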
Why you should avoid using Chi-Square if you can:
Design studies so that you can avoid using Chi-Square!
Frequency data give little information about
participants' performance: all you have is
knowledge about which category someone is in, a
very crude measure.
It's much more informative to obtain one or more
scores per participant; scores give you more
information about performance than categorical
data (and can be used with better statistical tests).
e.g. IQ: which is better - to know participants are
“bright” or “dim”, or have their actual IQ scores?
χ² Goodness of Fit test on the "fast food" data, using
SPSS/PASW:
Are all brands mentioned equally frequently?
Analyze > Nonparametric Tests > Chi-Square

Brand first mentioned
                 Observed N   Expected N   Residual
Burger King      57           50.0         7.0
Domino Pizza     1            50.0         -49.0
KFC              44           50.0         -6.0
McDonalds        274          50.0         224.0
Pizza Express    1            50.0         -49.0
Pizza Hut        10           50.0         -40.0
Wimpy            3            50.0         -47.0
Other            10           50.0         -40.0
Total            400

Test Statistics
                 Brand first mentioned
Chi-Square(a)    1209.440
df               7
Asymp. Sig.      .000
a. 0 cells (.0%) have expected frequencies less than
5. The minimum expected cell frequency is 50.0.
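The SPSS figures can be cross-checked by hand from the observed counts. This Python sketch recomputes the expected N, the residuals and the Chi-Square value, assuming equal expected frequencies across the eight brand categories as in the output.

```python
# Observed N for "brand first mentioned" (400 responses, 8 categories).
observed = {
    "Burger King": 57, "Domino Pizza": 1, "KFC": 44, "McDonalds": 274,
    "Pizza Express": 1, "Pizza Hut": 10, "Wimpy": 3, "Other": 10,
}

total = sum(observed.values())
expected = total / len(observed)   # 400 / 8 = 50.0 per brand

# Residual = observed - expected, as in the SPSS table.
residuals = {brand: n - expected for brand, n in observed.items()}
chi_squared = sum(r ** 2 / expected for r in residuals.values())
df = len(observed) - 1

print(f"chi-squared = {chi_squared:.3f}, d.f. = {df}")
# prints: chi-squared = 1209.440, d.f. = 7
```

Most of the statistic comes from the McDonalds residual of 224, which alone contributes 224² / 50, roughly 1003.5.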
χ² test of association on the "fast food" data, using
SPSS/PASW:
Is there an association between gender and brand first mentioned?
Analyze > Descriptive Statistics > Crosstabs...
χ² test of association on the "fast food" data (continued):
Is there an association between gender and brand first mentioned?
11 response categories gives too many expected
frequencies < 5.
Therefore confined analysis to Burger King, KFC and McDonalds.
(Use "Select Cases" on "Data" menu to filter out
unwanted response categories).

Case Processing Summary
                                            Cases
                               Valid            Missing          Total
                               N     Percent    N     Percent    N     Percent
Sex * Brand first mentioned    375   100.0%     0     .0%        375   100.0%

Sex * Brand first mentioned Crosstabulation
                                  Brand first mentioned
Sex                               Burger King   KFC     McDonalds   Total
Male      Count                   30            21      135         186
          Expected Count          28.3          21.8    135.9       186.0
Female    Count                   27            23      139         189
          Expected Count          28.7          22.2    138.1       189.0
Total     Count                   57            44      274         375
          Expected Count          57.0          44.0    274.0       375.0
Chi-Square Tests
                                Value     df    Asymp. Sig. (2-sided)
Pearson Chi-Square              .283(a)   2     .868
Likelihood Ratio                .283      2     .868
Linear-by-Linear Association    .135      1     .714
N of Valid Cases                375
a. 0 cells (.0%) have expected count less than 5. The
minimum expected count is 21.82.

Conclusion: no significant association
between gender and brand first mentioned.
(χ²(2) = 0.28, p = .87)
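The Pearson Chi-Square line of the SPSS output can be reproduced from the crosstabulated counts. The Python sketch below recomputes the statistic, the d.f., the smallest expected count and the p-value; it relies on the fact that with 2 d.f. the upper-tail probability of chi-squared is exactly exp(-x/2), so only the standard library is needed.

```python
import math

# Counts of brand first mentioned (Burger King, KFC, McDonalds) by sex.
observed = [
    [30, 21, 135],   # male
    [27, 23, 139],   # female
]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
grand_total = sum(row_totals)

# Expected count per cell: row total * column total / grand total.
expected = [[r * c / grand_total for c in col_totals] for r in row_totals]

chi_squared = sum(
    (o - e) ** 2 / e
    for obs_row, exp_row in zip(observed, expected)
    for o, e in zip(obs_row, exp_row)
)
df = (len(row_totals) - 1) * (len(col_totals) - 1)

# With exactly 2 d.f., P(chi-squared >= x) = exp(-x / 2).
p_value = math.exp(-chi_squared / 2)

print(f"chi-squared({df}) = {chi_squared:.3f}, p = {p_value:.3f}")
print(f"minimum expected count = {min(min(row) for row in expected):.2f}")
```

The results should agree with the SPSS table: a Pearson value of about .283 with 2 d.f., p of about .868, and a minimum expected count of 21.82.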