Transcript PPch15

Chapter 15
Nonparametric Methods:
Chi-Square Applications
1
Nonparametric
• One-Look.Com Definition:
  • adjective: not involving an estimation of the parameters of a statistic
  • adjective: not requiring knowledge of underlying distribution: used to describe or relating to statistical methods that do not require assumptions about the form of the underlying distribution
• You mean we can test without assuming a normal curve?
  • Yes!
2
Goals
1. Conduct a test of hypothesis comparing an observed set of frequencies to an expected set of frequencies
   i. Goodness-of-fit tests:
      1) Equal Expected Frequencies
      2) Unequal Expected Frequencies
2. List the characteristics of the Chi-square distribution
We can test a hypothesis without assuming the data distribution is normal!
3
Chi-square (χ²) Applications
1. Testing method where we don’t need assumptions about the shape of the data
2. Testing methods for nominal data
   • Data with no natural order
   • Examples:
     • Gender
     • Brand preference
     • Color
There will be two differences from earlier tests when we do our hypothesis testing:
   • Look up critical value of Chi-square in appendix B
   • Use new formula for Calculated Test Statistic
4
Conduct A Test Of Hypothesis Comparing
An Observed Set Of Frequencies To An
Expected Set Of Frequencies
1. Goodness-of-fit tests:
1. Equal Expected Frequencies
2. Unequal Expected Frequencies
5
Purpose Of Goodness-of-fit Tests:
1. Compare an observed distribution (sample) to an expected distribution (population)
2. We will ask the question:
   1. Is the difference between the observed values and the expected values:
      • Due to chance (sampling error):
        • The observed distribution is the same as the expected distribution
      • Not due to chance:
        • The observed distribution is not the same as the expected distribution
6
Hypothesis Testing:
Equal Expected Frequencies
• Step 1: State null and alternate hypotheses
  • Ho : There is no significant difference between the set of observed frequencies and the set of expected frequencies
  • H1 : There is a difference between the observed and expected frequencies
• Step 2: Select a level of significance
  • α = .01 or .05…
7
Hypothesis Testing
• Step 3: Identify the test statistic (Chi Square = χ²) and draw curve with critical value
  • Right Tail test
  • Use α and df to look up critical value in appendix B
    • k = number of categories
    • (k – 1) = degrees of freedom

Degrees of Freedom (df), (k - 1), k = # of categories
Alpha = risk that true Ho will be rejected

df      0.10      0.05      0.02      0.01
 1     2.706     3.841     5.412     6.635
 2     4.605     5.991     7.824     9.210
 3     6.251     7.815     9.837    11.345
 4     7.779     9.488    11.668    13.277
 5     9.236    11.070    13.388    15.086
 6    10.645    12.592    15.033    16.812
 7    12.017    14.067    16.622    18.475
 8    13.362    15.507    18.168    20.090
 9    14.684    16.919    19.679    21.666
10    15.987    18.307    21.161    23.209
11    17.275    19.675    22.618    24.725
12    18.549    21.026    24.054    26.217
13    19.812    22.362    25.472    27.688
14    21.064    23.685    26.873    29.141
15    22.307    24.996    28.259    30.578
16    23.542    26.296    29.633    32.000
17    24.769    27.587    30.995    33.409
18    25.989    28.869    32.346    34.805
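
The Step 3 lookup can be sketched in Python. This is a minimal sketch and not part of the original slides: the names CRITICAL_05 and critical_value are hypothetical, and only the alpha = .05 column for df 1 to 10 is copied from the appendix B values on this slide.

```python
# Appendix B critical values for alpha = .05, keyed by degrees of freedom.
# Partial table (df 1..10 only); the names here are illustrative assumptions.
CRITICAL_05 = {
    1: 3.841, 2: 5.991, 3: 7.815, 4: 9.488, 5: 11.070,
    6: 12.592, 7: 14.067, 8: 15.507, 9: 16.919, 10: 18.307,
}

def critical_value(k):
    """Return the alpha = .05 critical value for k categories (df = k - 1)."""
    df = k - 1
    if df not in CRITICAL_05:
        raise ValueError(f"df = {df} is not in this partial table")
    return CRITICAL_05[df]

print(critical_value(11))  # 11 categories -> df = 10 -> 18.307
```

The function takes the number of categories rather than df so that the k – 1 step is explicit.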
8
Hypothesis Testing
• Step 4: Formulate a decision rule
  • If our calculated test statistic is greater than 18.307 (the appendix B critical value for α = .05 and df = 10), we reject Ho and accept H1; otherwise we fail to reject Ho
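
The decision rule is a single right-tail comparison. A minimal sketch, assuming a hypothetical function name decide and made-up test statistics (20.5 and 9.2) chosen only to show both outcomes:

```python
def decide(chi_square, critical):
    """Right-tail rule: reject Ho only when the statistic exceeds the critical value."""
    if chi_square > critical:
        return "reject Ho"
    return "fail to reject Ho"

# 18.307 is the critical value quoted in the decision rule; the statistics are made up.
print(decide(20.5, 18.307))  # reject Ho
print(decide(9.2, 18.307))   # fail to reject Ho
```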
9
Hypothesis Testing
• Step 5: Take a random sample, compute the calculated test statistic, compare it to critical value, and make decision to reject or not reject the null hypothesis

Chi Square = χ²
Observed (sample data) frequency in a particular category = fo
Expected (pop data) frequency in a particular category = fe
Number of categories = k
degrees of freedom = df = k-1
Sample size = n

1st: fe
  Equal Expected Frequencies:
    f_e = \frac{\sum f_o}{k}
  Unequal Expected Frequencies:
    fe will be given, or fe = n*% for cell

2nd: χ²
  \chi^2 = \sum \left[ \frac{(f_o - f_e)^2}{f_e} \right]
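
The Step 5 arithmetic can be sketched in Python. This is an illustrative sketch under stated assumptions: the function names are hypothetical, and the observed counts are made up for the example rather than taken from the slides.

```python
def expected_equal(observed):
    """Equal expected frequencies: fe = (sum of fo) / k for every category."""
    k = len(observed)
    n = sum(observed)
    return [n / k] * k

def expected_unequal(n, proportions):
    """Unequal expected frequencies: fe = n * (proportion for the cell)."""
    return [n * p for p in proportions]

def chi_square(observed, expected):
    """Sum of (fo - fe)^2 / fe over all categories."""
    return sum((fo - fe) ** 2 / fe for fo, fe in zip(observed, expected))

# Made-up counts for three categories (n = 60, so fe = 20 in each cell).
observed = [10, 20, 30]
print(chi_square(observed, expected_equal(observed)))  # 10.0
```

For the unequal case, `chi_square([45, 35, 20], expected_unequal(100, [0.5, 0.3, 0.2]))` compares made-up counts with expected frequencies of 50, 30, and 20 and gives roughly 1.33.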


10
Hypothesis Testing
• Step 5: Conclude:
  • Either:
    • The sample evidence suggests that there is not a difference between the observed and expected frequencies
      • The observed distribution is the same as the expected distribution
  • Or:
    • The sample evidence suggests that there is a difference between the observed and expected frequencies
      • The observed distribution is not the same as the expected distribution
11
List The Characteristics Of The
Chi-square Distribution
• It is positively skewed
  • However, as the degrees of freedom increase, the curve approaches normal
• It is non-negative
  • Because (fo – fe)² is never negative
• There is a family of chi-square distributions
  • df determines which curve to use
  • df = k – 1
  • k = # of categories
12
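
The claim that the curve is positively skewed but approaches normal as df grows can be checked numerically. The sketch below relies on a standard property not stated on the slides: the skewness of a chi-square distribution equals sqrt(8 / df). The function name is hypothetical.

```python
import math

def chi_square_skewness(df):
    """Skewness of a chi-square distribution with df degrees of freedom."""
    return math.sqrt(8 / df)

# Skewness stays positive but shrinks toward 0 (the normal curve's skewness)
# as the degrees of freedom increase.
for df in (3, 5, 10, 100):
    print(df, round(chi_square_skewness(df), 3))
```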
χ² Distribution
[Figure: chi-square distribution curves for df = 3, df = 5, and df = 10; horizontal axis: χ²]
13
Limitations Of Chi-Square
• Because fe is used in the denominator, very small fe could result in very large calculated test statistic
• In general, avoid using Chi-Square when fe is too small:
  1. If there are only two cells: fe should be 5 or more in each cell (fe >= 5)
  2. If there are more than two cells: do not use Chi-Square if more than 20% of fe cells contain values less than 5
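
The small-fe guideline can be written as a quick check before running the test. This is a sketch under an assumed reading of the slide: with two cells every fe must be at least 5, and with more than two cells at most 20% of the cells may have fe below 5. The function name and the sample lists are hypothetical.

```python
def chi_square_is_appropriate(expected):
    """Apply the small-fe guideline to a list of expected frequencies."""
    k = len(expected)
    small = sum(1 for fe in expected if fe < 5)
    if k == 2:
        # Two cells: every fe should be 5 or more.
        return small == 0
    # More than two cells: at most 20% of cells may have fe < 5.
    # Integer form of small / k <= 0.20, which avoids float rounding.
    return small * 5 <= k

print(chi_square_is_appropriate([50, 4]))              # False
print(chi_square_is_appropriate([20, 30, 40, 6, 4]))   # True (1 of 5 cells is small)
print(chi_square_is_appropriate([20, 30, 4, 3]))       # False (2 of 4 cells are small)
```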
14