Transcript Ch 13 Notes

YMS - 13.1
Test for Goodness of Fit
Intro to Three Chi-Squared Tests
 Goodness of Fit - used to determine whether a specified
population distribution is valid (testing your sample
counts against a stated claim)
 Homogeneity – organize data in a two-way table and
compare two or more population proportions (Ho: the
proportions are the same in every population)
 Independence/Association – also organizes data in a
two-way table but then determines whether the
distribution of one variable has been influenced by
another
Chi-Square Basics
 The more the observed counts differ from the expected
counts, the more evidence we have to reject Ho
 Plot the data before testing (segmented bar graphs)
 Test statistic - for each cell, square the difference
between observed and expected counts and divide by the
expected count, then add up the results for all cells
 Reasoning behind degrees of freedom and P-value is
the same as it has been for every other test (top
paragraph on p731)
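The test statistic described above can be sketched in Python. This is only an illustration, not the textbook's or the calculator's procedure; the function name chi_square_statistic and the counts are made up for the example.

```python
def chi_square_statistic(observed, expected):
    """Sum of (observed - expected)^2 / expected over all categories."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))

# Hypothetical counts for three categories (not from the notes).
observed = [18, 22, 20]
expected = [20, 20, 20]

statistic = chi_square_statistic(observed, expected)
print(statistic)
```

A larger value means the observed counts sit further from the expected counts, which is more evidence against Ho.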
Chi-Square Distributions
 Total area under the curve is equal to 1 (just like
any other density curve)
 Each curve (except when df = 1 or 2) begins at 0 on
the horizontal axis, peaks, and then approaches the
horizontal axis asymptotically from above.
 Each curve is skewed right. As df increases, the
curve becomes more symmetrical and looks
more like a normal curve.
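These properties can be checked numerically. The sketch below assumes the standard chi-square density formula and the standard skewness formula sqrt(8/df); the helper names, the df values, and the integration settings are choices made for this example only.

```python
import math

def chi_square_density(x, df):
    """Height of the chi-square density curve at x (treated as 0 for x <= 0)."""
    if x <= 0:
        return 0.0
    k = df / 2.0
    log_height = (k - 1) * math.log(x) - x / 2.0 - k * math.log(2) - math.lgamma(k)
    return math.exp(log_height)

def area_under_curve(df, upper=150.0, steps=30000):
    """Trapezoid-rule approximation of the area from 0 to `upper`."""
    step = upper / steps
    heights = [chi_square_density(i * step, df) for i in range(steps + 1)]
    return step * (sum(heights) - 0.5 * (heights[0] + heights[-1]))

def skewness(df):
    """Skewness of a chi-square distribution: sqrt(8 / df)."""
    return math.sqrt(8.0 / df)

# The area should stay close to 1 while the skewness shrinks as df grows.
for df in (3, 10, 50):
    print(df, round(area_under_curve(df), 4), round(skewness(df), 3))
```

The skewness column gets smaller as df increases, which matches the curve looking more symmetrical.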
Goodness of Fit Test
 Ho – “The actual population proportions
are equal to the hypothesized proportions”
or list with proportions such as in Example
13.1 on p729
 Ha – “The actual population proportions
are different from the hypothesized
proportions” or “At least one of the
proportions differs from the stated values.”
 Conditions
 all individual expected counts are at least 1
 no more than 20% of the expected counts are
less than 5
 Test statistic - can be found using lists
 P-value - χ²cdf(test statistic, upper bound, df),
found in the distribution menu (df = number of
categories - 1)
 Some calculators have this test, but most
don’t (they will all do the other ones)
 p736 #13.1, 13.2, 13.7 and M&M activity
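The goodness of fit steps above (expected counts, conditions, test statistic, P-value) can be sketched in Python. The hypothesized proportions are the color proportions listed in the goodness of fit null hypothesis later in these notes. The observed counts are hypothetical, and chi_square_upper_tail is a helper written for this sketch (a series for the incomplete gamma function) that stands in for the calculator's distribution menu.

```python
import math

def chi_square_upper_tail(x, df):
    """Area to the right of x under the chi-square curve with df degrees of freedom."""
    if x <= 0:
        return 1.0
    a, z = df / 2.0, x / 2.0
    term = 1.0 / a
    total = term
    n = 1
    while term > 1e-15 * total:
        term *= z / (a + n)
        total += term
        n += 1
    lower = total * math.exp(-z + a * math.log(z) - math.lgamma(a))
    return max(0.0, 1.0 - lower)

hypothesized = {"brown": 0.13, "yellow": 0.14, "red": 0.13,
                "blue": 0.24, "orange": 0.20, "green": 0.16}
# Hypothetical observed counts for one sample of 100 (not real data).
observed = {"brown": 10, "yellow": 18, "red": 12,
            "blue": 20, "orange": 25, "green": 15}

n = sum(observed.values())
expected = {color: n * p for color, p in hypothesized.items()}

# Conditions: every expected count at least 1, no more than 20% below 5.
all_at_least_1 = all(e >= 1 for e in expected.values())
share_below_5 = sum(e < 5 for e in expected.values()) / len(expected)
conditions_met = all_at_least_1 and share_below_5 <= 0.20

statistic = sum((observed[c] - expected[c]) ** 2 / expected[c] for c in expected)
df = len(expected) - 1  # number of categories minus 1
p_value = chi_square_upper_tail(statistic, df)

print(conditions_met, round(statistic, 4), df, round(p_value, 4))
```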
Simulations
Use if we don’t have the resources to
gather a representative sample
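One way to read this slide is that a sample can be simulated from the hypothesized distribution instead of collected. The sketch below is an assumption about what such a simulation might look like, not the activity from the textbook; the function name simulate_counts, the sample size, and the seed are made up.

```python
import random

def simulate_counts(proportions, sample_size, rng):
    """Draw a random sample from the given proportions and tally each category."""
    categories = list(proportions)
    weights = [proportions[c] for c in categories]
    draws = rng.choices(categories, weights=weights, k=sample_size)
    return {c: draws.count(c) for c in categories}

hypothesized = {"brown": 0.13, "yellow": 0.14, "red": 0.13,
                "blue": 0.24, "orange": 0.20, "green": 0.16}

rng = random.Random(13)  # fixed seed so the run is repeatable
simulated = simulate_counts(hypothesized, 100, rng)
print(simulated)
```

The simulated counts can then be treated like observed counts in a goodness of fit test.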
Follow-Up Analysis
 Which component contributed most to the
test statistic?
 Calculate (O - E)²/E for each observed
count to determine which one is furthest
from expected.
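The follow-up step can be sketched in Python by listing each category's component of the statistic. The observed counts are hypothetical; the expected counts assume a sample of 100 and the color proportions given later in these notes.

```python
def components(observed, expected):
    """Each category's contribution (O - E)^2 / E to the chi-square statistic."""
    return {c: (observed[c] - expected[c]) ** 2 / expected[c] for c in observed}

# Hypothetical observed counts and the matching expected counts (n = 100).
observed = {"brown": 10, "yellow": 18, "red": 12,
            "blue": 20, "orange": 25, "green": 15}
expected = {"brown": 13.0, "yellow": 14.0, "red": 13.0,
            "blue": 24.0, "orange": 20.0, "green": 16.0}

contributions = components(observed, expected)
largest = max(contributions, key=contributions.get)

for color, value in contributions.items():
    print(color, round(value, 4))
print("largest contributor:", largest)
```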
YMS - 13.2
Inference for Two-Way Tables
Problem of Multiple Comparisons
 How do we do many comparisons at once
with some overall measure of confidence
in all of our conclusions?
 If we used two-sample z procedures many
times, it would tell how different each pair
is, but not how likely it is that we get n
sample proportions spread so far apart.
Two-Way Tables
 Gives counts for both successes and
failures
 r x c table showing the relationship
between two categorical variables
 Example: Create an r x c table and find the
expected counts in each cell.
Expected Count
row total x column total divided by table total
(finding proportion and multiplying by count)
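The expected count rule can be sketched in Python for any r x c table. The function name expected_counts and the 3 x 2 table of counts are illustrative only and are not taken from these notes.

```python
def expected_counts(table):
    """Expected count for each cell: row total x column total / table total."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    return [[r * c / total for c in col_totals] for r in row_totals]

# Illustrative counts: 3 rows (groups) by 2 columns (success, failure).
table = [[14, 10],
         [6, 18],
         [4, 20]]

expected = expected_counts(table)
for row in expected:
    print(row)
```

Each row here has the same total, so every row gets the same expected counts.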
p748 #13.14-13.15
Homogeneity of populations
 Chi-Square statistic, conditions and follow-up
are the same as for G of F test
 Degrees of freedom equal (r – 1)(c – 1)
 Ho states that the distribution of the response variable
is the same in all c populations of the r x c two-way
table (Example: All treatments for cocaine
addicts are equally effective)
 Ha says there is at least one proportion that is
different
Chi-Square test in calculator
Enter observed counts into matrix [A] and
TI-83 will generate expected counts
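For comparison, the same steps can be sketched in Python: expected counts from the observed table, the chi-square statistic, df = (r - 1)(c - 1), and the P-value. This is not the calculator's routine. The counts are illustrative, and chi_square_upper_tail is a helper written for this sketch.

```python
import math

def chi_square_upper_tail(x, df):
    """Area to the right of x under the chi-square curve with df degrees of freedom."""
    if x <= 0:
        return 1.0
    a, z = df / 2.0, x / 2.0
    term = 1.0 / a
    total = term
    n = 1
    while term > 1e-15 * total:
        term *= z / (a + n)
        total += term
        n += 1
    lower = total * math.exp(-z + a * math.log(z) - math.lgamma(a))
    return max(0.0, 1.0 - lower)

def chi_square_two_way(table):
    """Return (statistic, df, p_value) for an r x c table of observed counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    statistic = 0.0
    for i, row in enumerate(table):
        for j, obs in enumerate(row):
            exp = row_totals[i] * col_totals[j] / total
            statistic += (obs - exp) ** 2 / exp
    df = (len(table) - 1) * (len(table[0]) - 1)
    return statistic, df, chi_square_upper_tail(statistic, df)

# Illustrative observed counts: 3 treatments by 2 outcomes.
observed = [[14, 10],
            [6, 18],
            [4, 20]]

statistic, df, p_value = chi_square_two_way(observed)
print(round(statistic, 4), df, round(p_value, 4))
```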
Practice: p756 #13.16-13.17
Homework: p761 #13.19 and 13.21
Association/Independence
 Two-way table classifies observations from a
single population in two ways (two categorical
variables, not just success/failure)
 Ho states “There is no relationship between the
two categorical variables” or “The two variables
are independent.” (Remember to put in context)
 Ha says there is a relationship or they are not
independent
 Expected counts will equal row total times
column total divided by table total
Distinguishing Between Tests
 Goodness of Fit is the only one not in a
two way table
 Homogeneity is in a two-way table with
samples from two or more populations
 Association/Independence is another two-way
table but it comes from a single
sample of a single population
Different Hypotheses
 Goodness of Fit
 Null: p(br) = 0.13, p(y) = 0.14, p(r) = 0.13, p(bl) = 0.24,
p(o) = 0.20, p(g) = 0.16
 Alt: At least one of the color proportions differs from
the stated proportion
 Homogeneity
 Null: All of the treatments to quit smoking are equally
effective.
 Alt: At least one of the treatments has a different rate
of effectiveness.
 Association/Independence
 Null: There is no relationship between student smoking
habits and parent smoking habits.
 Alt: There is a relationship between student smoking
habits and parent smoking habits.
Chi-Square and Z-test
 The tests yield the same results when counts
come from a 2 x 2 table, with the chi-square stat just
being the square of the z statistic (two-sided z test)
 Use z test to compare just two proportions
because you can choose for it to be one-sided
and it has a related confidence interval for the
difference in the proportions
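The link between the two tests can be checked with a short Python sketch. It assumes the pooled two-proportion z statistic; the function names and the counts are made up for the example.

```python
import math

def two_proportion_z(x1, n1, x2, n2):
    """Pooled two-sample z statistic for comparing two proportions."""
    p1, p2 = x1 / n1, x2 / n2
    pooled = (x1 + x2) / (n1 + n2)
    std_error = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    return (p1 - p2) / std_error

def chi_square_2x2(x1, n1, x2, n2):
    """Chi-square statistic for the 2 x 2 table of successes and failures."""
    table = [[x1, n1 - x1], [x2, n2 - x2]]
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    return sum((table[i][j] - row_totals[i] * col_totals[j] / total) ** 2
               / (row_totals[i] * col_totals[j] / total)
               for i in range(2) for j in range(2))

# Hypothetical counts: 14 successes out of 24 versus 6 successes out of 24.
z = two_proportion_z(14, 24, 6, 24)
chi2 = chi_square_2x2(14, 24, 6, 24)
print(round(z ** 2, 6), round(chi2, 6))
```

The two printed values agree up to rounding, which is the relationship stated above.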
p768 #13.29-13.30
Classify p770 #31-39