Transcript Ch 13 Notes
YMS - 13.1
Test for Goodness of Fit
Intro to Three Chi-Squared Tests
Goodness of Fit - used to determine whether a specified
population distribution is valid (testing your sample
data against a stated claim)
Homogeneity - organize data in a two-way table and
compare two or more population proportions (Ho: the
proportions are the same in every population)
Independence/Association - also organizes data in a
two-way table but then determines whether the
distribution of one variable is related to the
other
Chi-Square Basics
The more the observed counts differ from the expected
counts, the more evidence we have to reject Ho
Plot the data before testing (segmented bar graphs)
Test statistic - sum over all categories of the squared
difference between observed and expected counts
divided by expected: X^2 = sum of (O - E)^2 / E
Reasoning behind degrees of freedom and p-value is
the same as it has been for every other test (top
paragraph on p731)
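A minimal Python sketch of the test statistic described above. The function name and the counts are made up for illustration and do not come from the textbook.

```python
def chi_square_statistic(observed, expected):
    """Sum of (O - E)^2 / E over all categories or cells."""
    if len(observed) != len(expected):
        raise ValueError("observed and expected must have the same length")
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


# Hypothetical counts for a four-category variable (illustration only).
observed = [18, 22, 31, 29]
expected = [25, 25, 25, 25]

statistic = chi_square_statistic(observed, expected)
print(statistic)
```

The further the observed counts drift from the expected counts, the larger this number gets, which is why large values count as evidence against Ho.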
Chi-Square Distributions
Total area under the curve is equal to 1 (just like
any other density curve)
Each curve with df > 2 begins at 0 on
the horizontal axis, peaks, and then approaches the
horizontal axis asymptotically from above (for df = 1
and df = 2 the curve has no peak and only decreases).
Each curve is skewed right. As df increases, the
curve becomes more symmetrical and looks
more like a normal curve.
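The properties listed above can be checked numerically. This is a rough sketch using only the standard library. The helper names are hypothetical, the area is a midpoint-rule approximation, and sqrt(8/df) is the standard skewness formula for a chi-square distribution.

```python
import math


def chi2_pdf(x, df):
    """Height of the chi-square density curve at x > 0 for df degrees of freedom."""
    half = df / 2
    return math.exp((half - 1) * math.log(x) - x / 2
                    - half * math.log(2) - math.lgamma(half))


def area_under_curve(df, upper=200.0, steps=20_000):
    """Midpoint-rule approximation of the area under the curve on (0, upper)."""
    width = upper / steps
    return sum(chi2_pdf((i + 0.5) * width, df) for i in range(steps)) * width


def skewness(df):
    """Skewness of the chi-square distribution: sqrt(8 / df)."""
    return math.sqrt(8 / df)


for df in (3, 10, 50):
    # Area should be close to 1; skewness should shrink as df grows.
    print(df, round(area_under_curve(df), 4), round(skewness(df), 3))
```

The skewness is always positive (skewed right) and shrinks toward 0 as df increases, which matches the curve looking more like a normal curve.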
Goodness of Fit Test
Ho – “The actual population proportions
are equal to the hypothesized proportions”
or list with proportions such as in Example
13.1 on p729
Ha – “The actual population proportions
are different from the hypothesized
proportions” or “At least one of the
proportions differs from the stated values.”
Conditions
all individual expected counts are at least 1
no more than 20% of the expected counts are
less than 5
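A short sketch of how the two conditions might be checked. The function name is hypothetical. The proportions are the color proportions listed later in these notes, and the sample sizes of 60 and 30 are made up.

```python
def conditions_met(expected):
    """True if every expected count is at least 1 and
    no more than 20% of the expected counts are below 5."""
    all_at_least_1 = all(e >= 1 for e in expected)
    share_below_5 = sum(1 for e in expected if e < 5) / len(expected)
    return all_at_least_1 and share_below_5 <= 0.20


# Hypothesized proportions from the Goodness of Fit example in these notes.
proportions = [0.13, 0.14, 0.13, 0.24, 0.20, 0.16]

# Hypothetical sample sizes (illustration only).
print(conditions_met([60 * p for p in proportions]))  # expected counts all above 5
print(conditions_met([30 * p for p in proportions]))  # four of six expected counts below 5
```

Note that the check is on expected counts, not observed counts.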
Test statistic - can be found using lists
P-value - χ²cdf(test statistic, upper bound, df)
found in distribution menu (use a very large upper bound)
Some calculators have this test built in, but most
don’t (they will all do the two-way table tests)
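For anyone without the calculator function, here is a hedged Python sketch of the same calculation using only the standard library. The names `chi2_sf` and `goodness_of_fit` are hypothetical, and df is assumed to be a positive whole number. The proportions come from the Goodness of Fit example in these notes; the observed counts are invented.

```python
import math


def chi2_sf(x, df):
    """Area to the right of x under the chi-square curve (the p-value).

    Plays the role of the calculator's cdf call with a very large upper bound.
    """
    if x <= 0:
        return 1.0
    half_x = x / 2
    if df % 2 == 0:
        a, tail = 1.0, math.exp(-half_x)             # exact tail for df = 2
    else:
        a, tail = 0.5, math.erfc(math.sqrt(half_x))  # exact tail for df = 1
    # Each pass raises the degrees of freedom by 2 (incomplete-gamma recurrence).
    while a < df / 2:
        tail += math.exp(a * math.log(half_x) - half_x - math.lgamma(a + 1))
        a += 1
    return tail


def goodness_of_fit(observed, proportions):
    """Return (statistic, df, p_value) for a chi-square goodness of fit test."""
    n = sum(observed)
    expected = [n * p for p in proportions]
    statistic = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    df = len(observed) - 1  # number of categories minus 1
    return statistic, df, chi2_sf(statistic, df)


proportions = [0.13, 0.14, 0.13, 0.24, 0.20, 0.16]  # from the notes
observed = [9, 6, 8, 15, 12, 10]                    # hypothetical bag of 60

statistic, df, p_value = goodness_of_fit(observed, proportions)
print(statistic, df, p_value)
```

A small p-value would be evidence that at least one proportion differs from the stated values.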
p736 #13.1, 13.2, 13.7 and M&M activity
Simulations
Use if we don’t have the resources to
gather a representative sample
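One possible way to set up such a simulation, offered as a sketch rather than the textbook's procedure: draw many samples from the hypothesized proportions and see how often the simulated statistic is at least as large as the one observed. The function names, seed, trial count, and observed counts are all hypothetical.

```python
import random


def chi_square_statistic(observed, expected):
    """Sum of (O - E)^2 / E over all categories."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


def simulate_p_value(observed, proportions, trials=5000, seed=1):
    """Estimate the p-value by sampling from the hypothesized proportions."""
    rng = random.Random(seed)
    n = sum(observed)
    expected = [n * p for p in proportions]
    actual = chi_square_statistic(observed, expected)
    categories = list(range(len(proportions)))
    as_extreme = 0
    for _ in range(trials):
        counts = [0] * len(proportions)
        for category in rng.choices(categories, weights=proportions, k=n):
            counts[category] += 1
        if chi_square_statistic(counts, expected) >= actual:
            as_extreme += 1
    return as_extreme / trials


proportions = [0.13, 0.14, 0.13, 0.24, 0.20, 0.16]  # from the notes
observed = [9, 6, 8, 15, 12, 10]                    # hypothetical counts

p_value = simulate_p_value(observed, proportions)
print(p_value)
```

The estimate depends on the seed and the number of trials, so it will only roughly agree with the p-value from the chi-square curve.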
Follow-Up Analysis
Which component contributed most to the
test statistic?
Calculate (O - E)^2 / E for each observed
count to determine which one is furthest
from expected (the largest component).
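The follow-up analysis can be sketched in a few lines of Python. The category labels and counts are hypothetical.

```python
def components(observed, expected):
    """Each category's contribution (O - E)^2 / E to the chi-square statistic."""
    return [(o - e) ** 2 / e for o, e in zip(observed, expected)]


# Hypothetical labels and counts (illustration only).
labels = ["A", "B", "C", "D"]
observed = [18, 22, 31, 29]
expected = [25, 25, 25, 25]

parts = components(observed, expected)
largest = labels[parts.index(max(parts))]

for label, part in zip(labels, parts):
    print(label, round(part, 2))
print("largest contributor:", largest)
```

The components add up to the test statistic, so the largest one shows where most of the evidence against Ho comes from.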
YMS - 13.2
Inference for Two-Way Tables
Problem of Multiple Comparisons
How do we do many comparisons at once
with some overall measure of confidence
in all of our conclusions?
If we used two-sample z procedures many
times, it would tell how different each pair
is, but not how likely it is that we get n
sample proportions spread so far apart.
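A rough illustration of why many separate comparisons are a problem. This sketch treats the comparisons as independent, which pairwise comparisons that share groups are not, so the numbers are only a ballpark. The function names and the 0.05 level are assumptions.

```python
import math


def pairwise_comparisons(groups):
    """Number of two-sample comparisons needed to compare every pair of groups."""
    return math.comb(groups, 2)


def chance_of_false_alarm(comparisons, alpha=0.05):
    """Chance of at least one false rejection if each test uses level alpha
    and the tests are treated as independent (a simplifying assumption)."""
    return 1 - (1 - alpha) ** comparisons


for groups in (2, 3, 4, 5):
    k = pairwise_comparisons(groups)
    print(groups, k, round(chance_of_false_alarm(k), 3))
```

A single chi-square test avoids this by giving one overall p-value for all the proportions at once.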
Two – Way Tables
Gives counts for both successes and
failures
r x c table showing the relationship
between two categorical variables
Example: Create an r x c table and find the
expected counts in each cell.
Expected Count
row total x column total divided by table total
(finding proportion and multiplying by count)
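The expected count rule can be sketched as follows. The function name and the 2 x 3 table are hypothetical.

```python
def expected_counts(table):
    """Expected count for each cell: row total x column total / table total."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    table_total = sum(row_totals)
    return [[r * c / table_total for c in col_totals] for r in row_totals]


# Hypothetical 2 x 3 table of observed counts (illustration only).
observed = [
    [20, 30, 50],
    [30, 20, 50],
]

expected = expected_counts(observed)
for row in expected:
    print(row)
```

Here each row holds half of the observations, so each expected count is half of its column total.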
p748 #13.14-13.15
Homogeneity of populations
Chi-Square statistic, conditions and follow-up
are the same as for G of F test
Degrees of freedom equal (r – 1)(c – 1)
Ho states that distribution of response variable is
the same in all c populations of the r x c two-way
table (Example: All treatments for cocaine
addicts are equally effective)
Ha says there is at least one proportion that is
different
Chi-Square test in calculator
Enter observed counts into matrix [A] and
TI-83 will generate expected counts
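A Python sketch that mirrors the calculator workflow: give it the observed table and it returns the expected counts, the statistic, df = (r - 1)(c - 1), and the p-value. The names are hypothetical, the counts are invented, and `chi2_sf` is a standard-library stand-in for the calculator's chi-square cdf.

```python
import math


def chi2_sf(x, df):
    """Area to the right of x under the chi-square curve with df degrees of freedom."""
    if x <= 0:
        return 1.0
    half_x = x / 2
    if df % 2 == 0:
        a, tail = 1.0, math.exp(-half_x)             # exact tail for df = 2
    else:
        a, tail = 0.5, math.erfc(math.sqrt(half_x))  # exact tail for df = 1
    # Each pass raises the degrees of freedom by 2 (incomplete-gamma recurrence).
    while a < df / 2:
        tail += math.exp(a * math.log(half_x) - half_x - math.lgamma(a + 1))
        a += 1
    return tail


def two_way_chi_square(table):
    """Return (expected, statistic, df, p_value) for an r x c table of counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    expected = [[r * c / total for c in col_totals] for r in row_totals]
    statistic = sum(
        (o - e) ** 2 / e
        for obs_row, exp_row in zip(table, expected)
        for o, e in zip(obs_row, exp_row)
    )
    df = (len(row_totals) - 1) * (len(col_totals) - 1)
    return expected, statistic, df, chi2_sf(statistic, df)


# Hypothetical counts: rows are three treatments, columns are (success, failure).
observed = [[15, 10], [8, 17], [7, 18]]

expected, statistic, df, p_value = two_way_chi_square(observed)
print(expected)
print(statistic, df, p_value)
```

The same calculation serves both the homogeneity test and the independence test; only the hypotheses and the way the data were collected differ.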
Practice: p756 #13.16-13.17
Homework: p761 #13.19 and 13.21
Association/Independence
Two-way table classifies observations from a
single population in two ways (by two categorical
variables, not just success/failure)
Ho states “There is no relationship between the
two categorical variables” or “The two variables
are independent.” (Remember to put in context)
Ha says there is a relationship or they are not
independent
Expected counts will equal row total times
column total divided by table total
Distinguishing Between Tests
Goodness of Fit is the only one not in a
two way table
Homogeneity is in a two-way table with
samples from two or more populations
Association/Independence is another two-way
table but it comes from a single
sample of a single population
Different Hypotheses
Goodness of Fit
Null: p(br) = 0.13, p(y) = 0.14, p(r) = 0.13, p(bl) = 0.24,
p(o) = 0.20, p(g) = 0.16
Alt: At least one of the color proportions differs from
the stated proportion
Homogeneity
Null: All of the treatments to quit smoking are equally
effective.
Alt: At least one of the treatments has a different rate
of effectiveness.
Association/Independence
Null: There is no relationship between student smoking
habits and parent smoking habits.
Alt: There is a relationship between student smoking
and parent smoking habits.
Chi-Square and Z-test
The tests yield the same results when counts
come from a 2 x 2 table, with the chi-square stat
being the square of the two-proportion z statistic
(same p-value as the two-sided z test)
Use z test to compare just two proportions
because you can choose for it to be one-sided
and it has a related confidence interval for the
difference in the proportions
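The link between the two tests can be checked numerically. This sketch uses hypothetical counts (30 successes out of 50 in one group, 20 out of 50 in the other) and the pooled two-proportion z statistic. The function names are made up.

```python
import math


def two_proportion_z(x1, n1, x2, n2):
    """Pooled two-sample z statistic for comparing two proportions."""
    pooled = (x1 + x2) / (n1 + n2)
    standard_error = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    return (x1 / n1 - x2 / n2) / standard_error


def chi_square_2x2(x1, n1, x2, n2):
    """Chi-square statistic for the 2 x 2 table of successes and failures."""
    table = [[x1, n1 - x1], [x2, n2 - x2]]
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    return sum(
        (table[i][j] - row_totals[i] * col_totals[j] / total) ** 2
        / (row_totals[i] * col_totals[j] / total)
        for i in range(2)
        for j in range(2)
    )


# Hypothetical counts (illustration only).
z = two_proportion_z(30, 50, 20, 50)
chi2 = chi_square_2x2(30, 50, 20, 50)
print(z ** 2, chi2)  # the two values agree up to rounding
```

Squaring z throws away its sign, which is why the chi-square test cannot be one-sided.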
p768 #13.29-13.30
Classify p770 #31-39