Transcript STA291

STA291
Statistical Methods
Lecture 25
Goodness-of-Fit Tests
Given the following…
1) Counts of items in each of several categories
2) A model that predicts the distribution of the relative
frequencies
…this question naturally arises:
“Does the actual distribution differ from the model
because of random error, or do the differences mean
that the model does not fit the data?”
In other words, “How good is the fit?”
Goodness-of-Fit Tests
Example : Credit Cards
At a major credit card bank, the percentages of people who
historically apply for the Silver, Gold, and Platinum cards are
60%, 30%, and 10% respectively. In a recent sample of
customers, 110 applied for Silver, 55 for Gold, and 35 for
Platinum. Is there evidence to suggest the percentages have
changed?
Null Hypothesis: The distribution of types of credit card
applications is no different from the historic distribution.
Test the hypothesis with a chi-square goodness-of-fit
test.
Goodness-of-Fit Tests
Assumptions and Conditions
Counted Data Condition – The data must be
counts for the categories of a categorical variable.
Independence Assumption – The counts should be
independent of each other. Think about whether this is
reasonable.
Randomization Condition – The counted
individuals should be a random sample of the
population. Guard against auto-correlated
samples.
Goodness-of-Fit Tests
Sample Size Assumption
There must be enough data, so check the following
condition:
Expected Cell Frequency Condition – The expected count
must be at least 5 individuals per cell.
Goodness-of-Fit Tests
Chi-Square Model
To decide if the null model is plausible, look at the
differences between the observed values and the
values expected if the model were true.
$$\chi^2 = \sum_{\text{all cells}} \frac{(\text{Observed} - \text{Expected})^2}{\text{Expected}} = \sum_{\text{all cells}} \frac{(f_o - f_e)^2}{f_e}$$
Note that χ² “accumulates” the relative squared
deviation of each cell from its expected value.
So, χ² gets “big” when
i) the data set is large and/or
ii) the model is a poor fit.
Goodness-of-Fit Tests
The Chi-Square Calculation
1. Find the expected values. These come from the
null hypothesis value.
2. Compute the residuals, $(f_o - f_e)$.
3. Square the residuals, $(f_o - f_e)^2$.
4. Compute the components. Find $\frac{(f_o - f_e)^2}{f_e}$ for each
cell.
5. Find the sum of the components, $\chi^2 = \sum_{\text{all cells}} \frac{(f_o - f_e)^2}{f_e}$.
6. Find the degrees of freedom (no. of cells – 1).
7. Test the hypothesis, finding the p-value or
comparing the test statistic from step 5 to the
appropriate critical value.
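The calculation steps above can be sketched in Python as follows. This is a minimal illustration using only the standard library; the function name `chi_square_gof` and its arguments are illustrative choices, not part of the lecture. It stops at the statistic and the degrees of freedom, leaving the p-value to a table or software. Exact arithmetic on the credit card data gives 12.5, which the lecture reports as 12.499.

```python
def chi_square_gof(observed, null_proportions):
    """Return (chi-square statistic, degrees of freedom) for a goodness-of-fit test."""
    n = sum(observed)
    # Step 1: expected counts under the null model
    expected = [n * p for p in null_proportions]
    # Steps 2-4: residual, squared residual, and component for each cell
    components = [(fo - fe) ** 2 / fe for fo, fe in zip(observed, expected)]
    # Step 5: sum the components
    statistic = sum(components)
    # Step 6: degrees of freedom = number of cells - 1
    df = len(observed) - 1
    return statistic, df


# Credit card example: Silver, Gold, Platinum
stat, df = chi_square_gof([110, 55, 35], [0.60, 0.30, 0.10])
print(round(stat, 3), df)
```

Step 7 (the p-value or critical-value comparison) is deliberately left out of this sketch.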
Goodness-of-Fit Tests
Example : Credit Cards
At a major credit card bank, the percentages of people
who historically apply for the Silver, Gold, and
Platinum cards are 60%, 30%, and 10% respectively.
In a recent sample of customers, 110 applied for
Silver, 55 for Gold, and 35 for Platinum. Is there
evidence to suggest the percentages have changed?
What type of test do you conduct?
What are the expected values?
Find the test statistic and p-value.
State conclusions.
Goodness-of-Fit Tests
Example : Credit Cards
What type of test do you conduct?
This is a goodness-of-fit test comparing a single sample to
previous information (the null model).
What are the expected values?
            Silver   Gold   Platinum
Observed       110     55         35
Expected       120     60         20
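As a rough check of the expected values, each one is the sample size times the historic proportion. The sketch below assumes the data are held in plain dictionaries (the variable names are illustrative) and also checks the Expected Cell Frequency Condition of at least 5 per cell.

```python
# Historic proportions (the null model) and the observed sample counts
historic = {"Silver": 0.60, "Gold": 0.30, "Platinum": 0.10}
observed = {"Silver": 110, "Gold": 55, "Platinum": 35}

n = sum(observed.values())  # total sample size

# Expected count for each card type = n * historic proportion
expected = {card: n * p for card, p in historic.items()}

# Expected Cell Frequency Condition: every expected count is at least 5
condition_met = all(count >= 5 for count in expected.values())

for card in observed:
    print(f"{card:<9}{observed[card]:>9}{expected[card]:>9.0f}")
print("Expected Cell Frequency Condition met:", condition_met)
```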
Goodness-of-Fit Tests
Example : Credit Cards
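Find the test statistic and p-value.
$$\chi^2 = \sum_{\text{all cells}} \frac{(\text{Obs} - \text{Exp})^2}{\text{Exp}} = \frac{(110 - 120)^2}{120} + \frac{(55 - 60)^2}{60} + \frac{(35 - 20)^2}{20} \approx 12.499$$
p-value: to be found from the chi-square distribution.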
Interpreting Chi-Square Values
The Chi-Square Distribution
The χ² distribution is right-skewed and becomes broader
with increasing degrees of freedom.
The χ² test is a one-sided test.
Goodness-of-Fit Tests
Example : Credit Cards
Is there evidence to suggest the percentages have
changed?
With the test statistic χ² = 12.499, find the p-value:
Using df = 2 and technology (Excel: “=1-CHISQ.DIST(12.499, 2, TRUE)”), the p-value =
0.001931
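For readers without Excel, the same upper-tail probability can be sketched in Python. When the degrees of freedom are even, the chi-square upper tail has a closed form, P(X > x) = e^(-x/2) · Σ (x/2)^k / k! for k = 0, ..., df/2 - 1, so no statistics library is needed. The helper name `chi2_upper_tail` is hypothetical, and this sketch handles even df only.

```python
import math


def chi2_upper_tail(x, df):
    """Upper-tail probability P(X > x) for a chi-square variable with even df."""
    if df <= 0 or df % 2 != 0:
        raise ValueError("this sketch only handles positive even df")
    half = x / 2
    # Closed form for even df: exp(-x/2) * sum_{k < df/2} (x/2)^k / k!
    return math.exp(-half) * sum(half ** k / math.factorial(k) for k in range(df // 2))


p_value = chi2_upper_tail(12.499, 2)
print(f"{p_value:.6f}")
```

For odd degrees of freedom a statistics package or a chi-square table would be needed instead.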
State conclusions.
Reject the null hypothesis. There is sufficient
evidence that customers are not applying for cards in the
traditional proportions.
Examining the Residuals
When we reject a null hypothesis, we can examine the
residuals in each cell to discover which values are
extraordinary.
Because we might compare residuals for cells with
very different counts, we should examine standardized
residuals:
$$\frac{f_o - f_e}{\sqrt{f_e}}$$
Note that standardized residuals from goodness-of-fit
tests are distributed as z-scores (which we already
know how to interpret and analyze).
Examining the Residuals
Standardized residuals for the credit card
data:
Card Type   Standardized Residual
Silver      -0.91287
Gold        -0.6455
Platinum     3.354102
• Neither the Silver nor the Gold value is
remarkable.
• The largest, Platinum, at 3.35, is where the
difference from historic values lies.
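The standardized residuals in the table can be reproduced with a short Python sketch. The observed and expected counts come from the credit card example; the cutoff of 2 used to flag a cell is an assumed rule of thumb for z-scores, not a value stated in the lecture.

```python
import math

observed = {"Silver": 110, "Gold": 55, "Platinum": 35}
expected = {"Silver": 120, "Gold": 60, "Platinum": 20}

# Standardized residual for each cell: (f_o - f_e) / sqrt(f_e)
std_resid = {
    card: (observed[card] - expected[card]) / math.sqrt(expected[card])
    for card in observed
}

for card, z in std_resid.items():
    # Assumed rule of thumb: flag residuals more than 2 in absolute value
    flag = "  <- unusually large" if abs(z) > 2 else ""
    print(f"{card:<9}{z:>9.4f}{flag}")
```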
The Chi-Square Test for Homogeneity
Assumptions and Conditions
Counted Data Condition – Data must be counts.
Independence Assumption – Counts need to be
independent of each other. Check for randomization.
Randomization Condition – Random samples
or a stratified sample are needed.
Sample Size Assumption – There must be enough
data, so check the following condition.
Expected Cell Frequency Condition – Expect at
least 5 individuals per cell.
The Chi-Square Test for Homogeneity
Following the pattern of the goodness-of-fit test,
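compute the component for each cell:
$$\text{Component} = \frac{(f_o - f_e)^2}{f_e}$$
Then, sum the components:
$$\chi^2 = \sum_{\text{all cells}} \frac{(f_o - f_e)^2}{f_e}$$
The degrees of freedom are $(R - 1) \times (C - 1)$, where R is the number
of rows and C is the number of columns in the table.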
The Chi-Square Test for Homogeneity
Example: More Credit Cards
A market researcher for the credit card bank wants to
know if the distribution of applications by card is the
same for the past 3 mailings. She takes a random
sample of 200 from each mailing and counts the
number of applications for each type of card.
                  Type of Card
             Silver   Gold   Platinum   Total
Mailing 1       120     50         30     200
Mailing 2       115     50         35     200
Mailing 3       105     55         40     200
Total           340    155        105     600
The Chi-Square Test for Homogeneity
Example: More Credit Cards
A market researcher for the credit card bank wants to
know if the distribution of applications by card is the same
for the past 3 mailings.
[Figure: bar chart of application counts by card type (Silver, Gold, Platinum) for Mailing 1, Mailing 2, and Mailing 3; vertical axis from 0 to 250.]
But, are the differences real or just natural sampling variation?
Our null hypothesis is that the relative frequency distributions
are the same (homogeneous) for each mailing.
Test the hypothesis with a chi-square test for homogeneity.
The Chi-Square Test for Homogeneity
Example: More Credit Cards
A market researcher for the credit card bank wants to
know if the distribution of applications by card is the
same for the past 3 mailings.
Use the total % to determine the expected
counts for each table column (type of card):
                  Type of Card
             Silver     Gold   Platinum   Total
Mailing 1    113.33    51.67         35     200
Mailing 2    113.33    51.67         35     200
Mailing 3    113.33    51.67         35     200
Total           340      155        105     600
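The expected counts in this table can be sketched in Python as (row total × column total) / grand total, which is the same as applying the overall percentage for each card type to each mailing's sample of 200. The nested-list layout and variable names are illustrative assumptions.

```python
# Observed counts: rows are Mailings 1-3, columns are Silver, Gold, Platinum
observed = [
    [120, 50, 30],
    [115, 50, 35],
    [105, 55, 40],
]

row_totals = [sum(row) for row in observed]        # 200 per mailing
col_totals = [sum(col) for col in zip(*observed)]  # totals per card type
grand_total = sum(row_totals)

# Expected count for each cell = row total * column total / grand total
expected = [[r * c / grand_total for c in col_totals] for r in row_totals]

for row in expected:
    print("  ".join(f"{value:7.2f}" for value in row))
```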
The Chi-Square Test for Homogeneity
Example : More Credit Cards
A market researcher for the credit card bank wants to
know if the distribution of applications by card is the
same for the past 3 mailings. She takes a random
sample of 200 from each mailing and counts the
number of applications for each type of card.
Find the test statistic.
$$\chi^2 = \sum_{\text{all cells}} \frac{(\text{Obs} - \text{Exp})^2}{\text{Exp}} = \frac{(120 - 113.33)^2}{113.33} + \frac{(50 - 51.67)^2}{51.67} + \cdots + \frac{(40 - 35)^2}{35} = 2.7806$$
Given p-value = 0.5952, state conclusions.
Fail to reject the null. There is insufficient evidence to
suggest that the distributions are different for the three
mailings.
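The full homogeneity calculation for the three mailings can be sketched in Python as a check on the statistic and p-value. It uses only the standard library. The p-value line relies on the closed form for the chi-square upper tail with 4 degrees of freedom, e^(-x/2) · (1 + x/2), so it applies to this 3 × 3 table and not to tables in general.

```python
import math

# Observed counts: rows are Mailings 1-3, columns are Silver, Gold, Platinum
observed = [
    [120, 50, 30],
    [115, 50, 35],
    [105, 55, 40],
]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
grand_total = sum(row_totals)
expected = [[r * c / grand_total for c in col_totals] for r in row_totals]

# Sum the components (f_o - f_e)^2 / f_e over all nine cells
statistic = sum(
    (fo - fe) ** 2 / fe
    for obs_row, exp_row in zip(observed, expected)
    for fo, fe in zip(obs_row, exp_row)
)

# Degrees of freedom: (R - 1) * (C - 1)
df = (len(observed) - 1) * (len(observed[0]) - 1)

# Upper-tail probability; this closed form is valid only because df = 4 here
half = statistic / 2
p_value = math.exp(-half) * (1 + half)

print(round(statistic, 4), df, round(p_value, 4))
```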
Looking back
• Recognize when a chi-square test of goodness of fit or homogeneity is appropriate.
• For each test, find the expected cell frequencies.
• For each test, check the assumptions and corresponding conditions and know how to complete the test.
• Interpret a chi-square test.
• Examine the standardized residuals.