Tests of Significance


Hypothesis Tests
1. A lottery advertises that 10% of people
who buy a lottery ticket win a prize.
Recently, the organization that oversees
this lottery has received several complaints
claiming that there are fewer winners than
there should be.
How can I tell if there really are too few winners?
A hypothesis test will help me decide!
Take a sample & find p̂
But how do I know if this particular p̂ is one
that is representative of the population?
What do hypothesis tests answer?
Is our p̂ one of the sample proportions that are
likely to occur? Or is it one that isn't likely to occur?
Could our p̂ have happened just by random chance,
or is it statistically significant?
Is it . . .
– a random occurrence due to natural variation?
– a biased occurrence due to some other reason?
Statistically significant: NOT a random chance occurrence!
The Idea
How does a murder trial work?
1) Assume the person is innocent
2) Must have sufficient evidence to prove guilty
Hypothesis tests use the same process!
1) Assume your suspicion is NOT correct
2) See if data provides evidence against that assumption
Steps for a Hypothesis Test
Same as confidence intervals, except we add hypotheses
1) Check conditions
2) State hypotheses & define
parameters
3) Calculations
4) Conclusion in context
Conditions for a Proportions z-Test
YAY! These are the same as confidence intervals!!
• SRS (or randomly assigned treatments)
• Sampling dist. is (approx.) normal
– np > 10 and n(1 – p) > 10
• Independence
– Pop. is at least 10n
Conditions for a Means t-Test
YAY! These are the same as confidence intervals!!
• SRS (or randomly assigned treatments)
• Sampling dist. is (approx.) normal
– Pop. is normal (given)
– CLT (n > 30)
– Graph shows normality
2. Bottles of a popular cola are supposed to
contain 300 mL of cola. There is some variation
from bottle to bottle. An inspector, who
suspects that the bottler is under-filling,
measures the contents of six randomly selected
bottles. Are the conditions met?
299.4 297.7 298.9 300.2 297 301
• SRS of bottles
• Normal prob. plot is linear → sampling dist.
is approx. normal
Writing Hypothesis Statements
• Null hypothesis: The statement being tested;
“nothing suspicious is going on”
H0:
• Alternative hypothesis: The statement we
suspect is true
Ha:
Writing Hypothesis Statements
Null hypothesis:
H0: parameter = hypothesized value
Alternative hypothesis:
Ha: parameter > hyp. value (right tail test)
Ha: parameter < hyp. value (left tail test)
Ha: parameter ≠ hyp. value (two-tailed test)
3. A lottery advertises that 10% of people who
buy a lottery ticket win a prize. Recently, the
organization that oversees this lottery has
received several complaints claiming that there
are fewer winners than there should be.
State the hypotheses we'd use to test a sample of
lottery tickets.
H0: p = .1
Ha: p < .1
Where p is the true
proportion of winners
4. A consumer magazine advertises a new
compact car as getting, on average, 22 mpg. A
dealership believes this ad underrates the car's
mileage.
State the hypotheses we'd use to test a sample
of compact cars.
H0: μ = 22
Ha: μ > 22
Where μ is the
true mean mpg
5. The carbon dioxide (CO2) level in a home varies greatly,
but a typical level is around .06%. Since CO2 concentration
outdoors is typically lower, an indoor level of less than .06%
may indicate that the home is not properly sealed. Indoor
CO2 levels above .06%, on the other hand, may cause
residents to feel drowsy and report that the air feels poor.
State the hypotheses we'd use to test CO2 levels in a sample
of homes.
H0: p = .0006
Ha: p ≠ .0006
Where p is the true proportion
of CO2 in a home
The Golden Rules of Hypotheses
• ALWAYS refer to populations (parameters)
• H0 always equals a value
• H0 always means "nothing interesting is
happening"
Are these valid hypotheses? If not, why?
a) H0: μ = 15; Ha: μ = 15
Must be different than H0
b) H0: x̄ = 123; Ha: x̄ < 123
Must use parameter (population); x̄ is a statistic (sample)
c) H0: p = .1; Ha: p ≠ .1
p is the population proportion
d) H0: μ = .4; Ha: μ > .6
Must use same number as H0
e) H0: p ≠ 0; Ha: p = 0
H0 must be "="
P-value
• Assuming H0 is true, p-value =
probability that the statistic (x̄ or p̂)
would be as extreme as, or more extreme than,
what we actually found
– P(our data | H0)
In other words . . . is our
statistic far out in the tails of
the sampling distribution?
Level of Significance
• Threshold of enough evidence to doubt the
null hypothesis
• Called α
– Can be any probability
– Usual values: .1, .05, .01
– When in doubt, use .05
Statistically Significant
• When p-value < α
– If p-value < α, reject the null
hypothesis (guilty)
– If p-value > α, fail to reject the null
hypothesis (not guilty)
p-value low, reject the H0
Golden Rules of p-Values
• We're assuming H0 is true, so reject/fail to reject
H0, not Ha
• Large p-values support the H0, but never prove
it is true!
• Two-tailed (≠) tests have double the p-value of
one-tail tests
• Never accept the null hypothesis!
– No jury ever says "We find the defendant
innocent." They only say "Not guilty."
At an α level of .05, would you reject or
fail to reject H0 for the given p-values?
a) .03 → Reject
b) .15 → Fail to reject
c) .45 → Fail to reject
d) .023 → Reject
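The decision rule used in this exercise can be sketched in a few lines of Python. The function name `decide` is made up for illustration, and the default α = .05 simply matches the question above.

```python
def decide(p_value, alpha=0.05):
    """Apply the rule 'p-value low, reject the H0'."""
    return "Reject" if p_value < alpha else "Fail to reject"

# The four p-values from the exercise
for label, p in zip("abcd", [0.03, 0.15, 0.45, 0.023]):
    print(f"{label}) {p} -> {decide(p)}")
```

Note that the function only ever returns "Reject" or "Fail to reject"; there is deliberately no "Accept" outcome.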
Calculating p-Values
• For a z-test:
– normalcdf(lower, upper) with z-scores
• For a t-test:
– tcdf(lower, upper, df) with t-scores
Draw & shade a curve, and
calculate the p-value:
1) right-tail test, z = 1.6
p-value = .0548
2) left-tail test, z = -2.4
p-value = .0082
3) two-tailed test, z = 2.3
p-value = .0107*2 = .0214
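For anyone without the calculator, the three p-values above can be checked with a short Python sketch. The `normalcdf` function here is a stand-in written for this example (it is not the calculator's own routine); it uses the standard library's `math.erfc`, and `BIG` plays the role of 1E99.

```python
import math

def normalcdf(lower, upper):
    """Area under the standard normal curve between two z-scores."""
    def phi(z):
        # Standard normal CDF written with the complementary error function
        return 0.5 * math.erfc(-z / math.sqrt(2))
    return phi(upper) - phi(lower)

BIG = 1e99  # stands in for "infinity", as on the calculator

right_tail = normalcdf(1.6, BIG)       # 1) right-tail test, z = 1.6
left_tail = normalcdf(-BIG, -2.4)      # 2) left-tail test, z = -2.4
two_tailed = 2 * normalcdf(2.3, BIG)   # 3) two-tailed test, z = 2.3

print(round(right_tail, 4), round(left_tail, 4), round(two_tailed, 4))
```

The two-tailed value is just the one-tail area doubled, which is the same shortcut used in part 3.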
Hypothesis Test Conclusions
1) The decision (reject or fail to reject
H0) and why (p-value & α)
AND
2) The results (in terms of Ha) in context
“Since the p-value </> α, I
reject/fail to reject the H0.
There is/is not sufficient
evidence to suggest that Ha.”
Be sure to write Ha in context!
6. To be considered two percent milk, a carton of
milk must have at most a 2.5% fat concentration. A
consumer randomly selects 25 two percent milk
cartons and computes a z-test statistic of 2.1. Write
the hypotheses, calculate the p-value, and write the
appropriate conclusion for α = .05.
z = 2.1
H0: p = .025
Ha: p > .025
Where p is the true proportion of fat in 2% milk
p-value = normalcdf(2.1, 1E99) = .0179
Since p-value < α, I reject the H0. There is
sufficient evidence to suggest that the
concentration of milkfat is greater than 2.5%.
7. A lottery advertises that 10% of people who buy a
lottery ticket win a prize. Recently, the organization that
oversees this lottery has received several complaints
claiming that there are fewer winners than there should be.
A group of citizens collects a random sample of lottery
tickets and finds a test statistic of -1.35. Assume the
conditions are met. Write the hypotheses, calculate the p-value, and write the appropriate conclusion for α = 0.05.
z = -1.35
H0: p = .1
Ha: p < .1
Where p is the true proportion of winners
p-value = normalcdf(-1E99, -1.35) = .0885
Since p-value > α, I fail to reject the H0. There is
not sufficient evidence to suggest that the true
proportion of winners is less than 10%.
Test Statistic for a Proportion
$$\text{test statistic} = \frac{\text{statistic} - \text{parameter}}{\text{SD of statistic}}$$
$$z = \frac{\hat{p} - p}{\sqrt{\dfrac{p(1 - p)}{n}}}$$
8. A company is willing to renew its advertising
contract with a local radio station only if the
station can prove that more than 20% of the
residents of the city have heard the ad. The
radio station conducts a random sample of 400
people and finds that 90 have heard the ad. Is
this sufficient evidence for the company to
renew its contract?
Conditions:
• SRS of people
• np = 400(.2) = 80 > 10, n(1 – p) = 400(.8) = 320 > 10 → sampling
dist. is approx. normal
(We assume the H0 is true, so use the value of
p from the H0 in the conditions)
• Population of people is at least 4000
H0: p = .2
Ha: p > .2
where p is the true proportion of people
who heard the ad
p̂ = 90/400 = .225
$$z = \frac{.225 - .2}{\sqrt{\dfrac{.2(.8)}{400}}} = 1.25$$
(Again: We have a value of p from the H0,
so we can use that in the formula)
p-value = .1056
α = .05
Since p-value > α, we fail to reject the H0. There is not sufficient
evidence to suggest that the true proportion of people who heard the
ad is greater than .2.
9. A supernatural magazine claims that 63% of
Americans believe in ghosts. Gallup surveys
200 randomly selected Americans and finds that
54% of them say they believe in ghosts. At the
1% significance level, does the Gallup poll give
us evidence to doubt the magazine's claim?
Conditions:
• SRS of Americans
• np = 200(.63) = 126 > 10, n(1 – p) = 200(.37) = 74 > 10 → sampling
dist. is approx. normal
• Population of Americans is at least 2000
H0: p = .63
Ha: p ≠ .63
where p is the true proportion of Americans
who believe in ghosts
$$z = \frac{.54 - .63}{\sqrt{\dfrac{.63(.37)}{200}}} = -2.64$$
p-value = .0084
α = .01
Since p-value < α, we reject the H0. There is sufficient evidence to
suggest that the true proportion of people who believe in ghosts is
not .63.
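The two proportion tests above (the radio ad and the ghost survey) follow the same recipe, so here is a rough Python sketch of it. The function `one_proportion_z_test` and its `tail` argument are names invented for this example; the condition check uses the hypothesized p, since we assume H0 is true.

```python
import math

def normal_cdf(z):
    """Standard normal CDF, P(Z <= z)."""
    return 0.5 * math.erfc(-z / math.sqrt(2))

def one_proportion_z_test(p_hat, n, p0, tail):
    """Return (z, p_value) for H0: p = p0.

    tail is 'left', 'right', or 'two', matching the direction of Ha.
    """
    # Conditions use p0 because we assume H0 is true
    if n * p0 <= 10 or n * (1 - p0) <= 10:
        raise ValueError("np > 10 and n(1 - p) > 10 not both met")
    se = math.sqrt(p0 * (1 - p0) / n)   # SD of p-hat under H0
    z = (p_hat - p0) / se
    if tail == "right":
        p_value = normal_cdf(-z)
    elif tail == "left":
        p_value = normal_cdf(z)
    elif tail == "two":
        p_value = 2 * normal_cdf(-abs(z))
    else:
        raise ValueError("tail must be 'left', 'right', or 'two'")
    return z, p_value

# Problem 8: 90 of 400 heard the ad, Ha: p > .2
z_radio, p_radio = one_proportion_z_test(90 / 400, 400, 0.2, "right")
# Problem 9: 54% of 200 believe in ghosts, Ha: p != .63
z_ghost, p_ghost = one_proportion_z_test(0.54, 200, 0.63, "two")

print(round(z_radio, 2), round(p_radio, 4))
print(round(z_ghost, 2), round(p_ghost, 4))
```

The results should agree with the hand calculations above up to rounding.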
Test Statistic for a Mean
$$\text{test statistic} = \frac{\text{statistic} - \text{parameter}}{\text{standard deviation of statistic}}$$
$$t_{df} = \frac{\bar{x} - \mu}{\dfrac{s}{\sqrt{n}}}$$
10. Kraft buys milk from several suppliers as the essential raw
material for its cheese. Kraft suspects that some producers are
adding water to their milk to increase their profits. Excess water
can be detected by determining the freezing point of milk.
The freezing temperature of natural milk varies normally, with a
mean of -0.545 degrees and a standard deviation of 0.008. Added
water raises the freezing temperature toward 0 degrees, the
freezing point of water (in Celsius).
A laboratory manager measures the freezing temperature of five
randomly selected lots of milk from one producer and finds a mean
of -0.538 degrees. Is there sufficient evidence to suggest that this
producer is adding water to the milk?
Conditions:
• SRS of milk
• Freezing temp. is normal
H0: μ = -0.545
Ha: μ > -0.545
where μ is the true mean freezing temperature of milk
.538 .545
z
 1.9566
.008
5
p-value = .0252
α = .05
Plug values
into formula
Calculate p-value
Conclusion:
Compare p-value to α &
make a decision
Since p-value < α, we reject the H0.
There is sufficient evidence to suggest that the true mean freezing
temperature is greater than -0.545. (So the producer is adding
water to the milk.)
Write a conclusion in
context in terms of Ha
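As a check on the milk example, here is a small Python sketch of a z-test for a mean when the population standard deviation is given. The function name `z_test_mean_right_tail` is invented for this example, and it handles only the right-tail case used in this problem.

```python
import math

def z_test_mean_right_tail(x_bar, mu0, sigma, n):
    """Return (z, p_value) for H0: mu = mu0 vs Ha: mu > mu0, sigma known."""
    se = sigma / math.sqrt(n)             # SD of the sample mean
    z = (x_bar - mu0) / se
    # Right-tail area P(Z >= z) under the standard normal curve
    p_value = 0.5 * math.erfc(z / math.sqrt(2))
    return z, p_value

# Five lots, sample mean -0.538, natural milk: mean -0.545, SD 0.008
z_milk, p_milk = z_test_mean_right_tail(-0.538, -0.545, 0.008, 5)

print(round(z_milk, 4), round(p_milk, 4))
```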
11. The Degree of Reading Power (DRP) is a test of the
reading ability of children. Here are DRP scores for a
random sample of 44 third-grade students in a suburban
district:
(data on note page)
At the .1 significance level, is there sufficient evidence
to suggest that this district’s third graders' reading ability
is different than the national average of 34?
Conditions:
• SRS of third-graders
• n > 30 → normal samp. dist. (CLT)
H0: μ = 34
Ha: μ ≠ 34
where μ is the true mean reading
ability of the district’s third-graders
$$t_{43} = \frac{35.091 - 34}{\dfrac{11.189}{\sqrt{44}}} = .6467$$
p-value = .5212
α = .1
Conclusion:
Since p-value > α, we fail to reject the H0.
There is not sufficient evidence to suggest that the true mean
reading ability of the district’s third-graders is different than the
national average of 34.
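The DRP calculation can be reproduced from the summary statistics alone. Python's standard library has no t-distribution, so the sketch below approximates the t CDF by numerically integrating the density with Simpson's rule; `t_pdf`, `t_cdf`, and `t_test_mean` are helper names made up for this example, not functions from any library.

```python
import math

def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    log_c = math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
    c = math.exp(log_c) / math.sqrt(df * math.pi)
    return c * (1 + x * x / df) ** (-(df + 1) / 2)

def t_cdf(t, df, steps=2000):
    """Approximate P(T <= t) with Simpson's rule on [0, |t|] (steps is even)."""
    b = abs(t)
    h = b / steps
    total = t_pdf(0, df) + t_pdf(b, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area = total * h / 3                  # area between 0 and |t|
    return 0.5 + area if t >= 0 else 0.5 - area

def t_test_mean(x_bar, mu0, s, n, tail):
    """Return (t, df, p_value); tail is 'left', 'right', or 'two'."""
    df = n - 1
    t = (x_bar - mu0) / (s / math.sqrt(n))
    if tail == "right":
        p_value = 1 - t_cdf(t, df)
    elif tail == "left":
        p_value = t_cdf(t, df)
    else:
        p_value = 2 * (1 - t_cdf(abs(t), df))
    return t, df, p_value

# DRP scores: n = 44, sample mean 35.091, sample SD 11.189, Ha: mu != 34
t_stat, df, p_value = t_test_mean(35.091, 34, 11.189, 44, "two")

print(round(t_stat, 4), df, round(p_value, 4))
```

On a calculator, `tcdf` does the same job as the hand-rolled `t_cdf` here.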
12. a) In 2011, Mrs. Field's chocolate chip cookies were
selling at a mean rate of $1323 per week. A random
sample of 30 weeks in 2012 in the same stores showed
that the cookies were selling at an average rate of $1228
per week with standard deviation of $275.
Compute a 95% confidence interval for the mean weekly
sales rate.
CI = ($1125.30, $1330.70)
Based on this interval, is the mean weekly sales rate
statistically lower than the 2011 figure?
Since $1323 is in the interval, we do not have significant
evidence that the mean weekly sales rate is lower than the 2011 figure.
12. b) In 2011, Mrs. Field's chocolate chip cookies
were selling at a mean rate of $1323 per week. A
random sample of 30 weeks in 2012 in the same stores
showed that the cookies were selling at an average
rate of $1228 per week with standard deviation $275.
Does this indicate that the sales of the cookies in 2012
were lower than the 2011 figure?
H0: μ = 1323
Ha: μ < 1323
where μ is the true mean
cookie sales per week
$$t_{29} = \frac{1228 - 1323}{\dfrac{275}{\sqrt{30}}} = -1.892$$
p-value = .034
α = 0.05
Since p-value < α, we reject the H0. There is sufficient evidence to
suggest that the sales of cookies were lower than the 2011 figure.
Why did we get different answers?
We reject at α = .05, but we fail to reject with a 95% CI.
In a one-tail test, all of α (5%)
goes into that tail.
But a CI has two tails with equal area, so
there should also be 5% in the upper tail.
[Figure: curve with .05 in each tail and .90 in the middle; α = .05]
Tail probabilities between the significance level (α) and
the confidence level must match for CIs and tests to
give the same results
90% CI in 12(a): ($1142.70, $1313.30)
Since $1323 is not in this interval, we would reject
H0, just like we did with the hypothesis test!
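To see the tail-matching idea in numbers, the sketch below recomputes the cookie example: the left-tail p-value, the 95% CI, and the 90% CI. It assumes nothing beyond the summary statistics in problem 12; the helpers `t_pdf`, `t_cdf`, and `t_star` are names invented for this example (numerical integration plus bisection), not library functions.

```python
import math

def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    log_c = math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
    return math.exp(log_c) / math.sqrt(df * math.pi) * (1 + x * x / df) ** (-(df + 1) / 2)

def t_cdf(t, df, steps=2000):
    """Approximate P(T <= t) with Simpson's rule on [0, |t|]."""
    b = abs(t)
    h = b / steps
    total = t_pdf(0, df) + t_pdf(b, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area = total * h / 3
    return 0.5 + area if t >= 0 else 0.5 - area

def t_star(conf_level, df):
    """Critical value t* leaving (1 - conf_level)/2 in each tail, by bisection."""
    target = 1 - (1 - conf_level) / 2
    lo, hi = 0.0, 100.0
    for _ in range(60):
        mid = (lo + hi) / 2
        if t_cdf(mid, df) < target:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

x_bar, s, n, mu0 = 1228, 275, 30, 1323
df = n - 1
se = s / math.sqrt(n)

# One-tail test (Ha: mu < 1323): all of alpha sits in the lower tail
p_value = t_cdf((x_bar - mu0) / se, df)

def conf_interval(level):
    margin = t_star(level, df) * se
    return x_bar - margin, x_bar + margin

ci95 = conf_interval(0.95)   # 2.5% in each tail
ci90 = conf_interval(0.90)   # 5% in each tail, matching alpha = .05

print(round(p_value, 3), [round(v, 2) for v in ci95], [round(v, 2) for v in ci90])
print("1323 in 95% CI:", ci95[0] <= mu0 <= ci95[1])
print("1323 in 90% CI:", ci90[0] <= mu0 <= ci90[1])
```

Only the 90% interval puts 5% in the lower tail, so it is the one that should agree with the one-tail test at α = .05.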
Matched Pairs Test
A special type of
t-inference
Two Types
• Pair people by certain
characteristics
• Randomly select a
treatment for person A
• Person B gets the other
treatment
• B's treatment is
dependent on A's
• Every person gets both
treatments
• Random order, or
before/after
measurements
• Measures are
dependent on the
person
Is this matched pairs?
a) A college wants to see if there’s a difference in time it took
last year’s class to find a job after graduation and the time it
took the class from five years ago to find work after
graduation. Researchers take a random sample from both
classes and measure the number of days between graduation
and first day of employment
No, there's no pairing of individuals
(independent samples)
Is this matched pairs?
b) A pharmaceutical company wants to test its new weight-loss drug. Before giving the drug to a random sample,
company researchers take a weight measurement on each
person. After a month of using the drug, each person’s
weight is measured again.
Yes, each person is their own pair
Is this matched pairs?
c) In a taste test, a researcher asks people in a random
sample to taste a certain brand of spring water and rate it.
Another random sample of people is asked to taste a
different brand of water and rate it. The researcher wants to
compare these samples.
No, people aren't being paired.
If the same people tasted both brands (in a random
order), then it would be matched pairs.
13. a) A whale-watching company noticed that many
customers wanted to know whether it was better to book an
excursion in the morning or the afternoon. To test this
question, the company recorded the number of whales
spotted in the morning and afternoon on 15 randomly
selected days over the past month.
Day          1   2   3   4   5   6   7   8   9  10  11  12  13  14  15
Morning      8   9   7   9  10  13  10   8   2   5   7   7   6   8   7
Afternoon    8  10   9   8   9  11   8  10   4   7   8   9   6   6   9
Each pair is dependent on the day,
making this data matched pairs
We can only handle one set of data in a t-test, so
let's find the differences
Day          1   2   3   4   5   6   7   8   9  10  11  12  13  14  15
Morning      8   9   7   9  10  13  10   8   2   5   7   7   6   8   7
Afternoon    8  10   9   8   9  11   8  10   4   7   8   9   6   6   9
Differences  0  -1  -2   1   1   2   2  -2  -2  -2  -1  -2   0   2  -2
You can subtract either way; just
be careful when writing your Ha
Conditions:
• SRS of days
• Normal prob. plot is linear → sampling dist. is approx.
normal
(The differences are the data you're testing, so use
them to check the conditions)
Is there sufficient evidence that more whales are sighted in the
afternoon?
Be careful writing the Ha!
Think about how we subtracted: (morning – afternoon)
If more are sighted in the
afternoon, should the differences
be + or -?
H0 = "nothing is going on" → What
should the mean difference be?
H0: μD = 0
Ha: μD < 0
where μD is the true mean difference
in whale sightings from morning
minus afternoon
If we subtracted (afternoon – morning),
we would have Ha: μD > 0
Differences: 0 -1 -2 1 1 2 2 -2 -2 -2 -1 -2 0 2 -2
Perform a t-test using
the differences (L3)
$$t_{14} = \frac{-.4 - 0}{\dfrac{1.639}{\sqrt{15}}} = -.945$$
p = .1803
α = .05
If we subtracted
(afternoon – morning),
the test statistic would
be t = +.945, but the
p-value would be the
same
Since p-value > α, I fail to reject H0. There is not sufficient
evidence to suggest that more whales are sighted in the
afternoon than in the morning.
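The matched pairs test is just a one-sample t-test on the differences, which the following Python sketch makes explicit. It starts from the fifteen (morning – afternoon) differences listed above; `t_pdf` and `t_cdf` are helper names invented for this example, approximating the t CDF by Simpson's rule because the standard library has no t-distribution.

```python
import math
from statistics import mean, stdev

def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    log_c = math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
    return math.exp(log_c) / math.sqrt(df * math.pi) * (1 + x * x / df) ** (-(df + 1) / 2)

def t_cdf(t, df, steps=2000):
    """Approximate P(T <= t) with Simpson's rule on [0, |t|]."""
    b = abs(t)
    h = b / steps
    total = t_pdf(0, df) + t_pdf(b, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area = total * h / 3
    return 0.5 + area if t >= 0 else 0.5 - area

# Differences (morning - afternoon) for the 15 days
diffs = [0, -1, -2, 1, 1, 2, 2, -2, -2, -2, -1, -2, 0, 2, -2]

n = len(diffs)
d_bar = mean(diffs)                 # sample mean difference
s_d = stdev(diffs)                  # sample SD of the differences
t_stat = (d_bar - 0) / (s_d / math.sqrt(n))

# Ha: mu_D < 0 (more whales in the afternoon), so use the left tail
p_value = t_cdf(t_stat, n - 1)

print(round(d_bar, 1), round(s_d, 3), round(t_stat, 3), round(p_value, 4))
```

Flipping the subtraction to (afternoon – morning) would flip the sign of every difference and of `t_stat`, and the matching right-tail area would be the same p-value.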
Differences: 0 -1 -2 1 1 2 2 -2 -2 -2 -1 -2 0 2 -2
13. b) Construct a 90% confidence interval for the mean
difference in whale sightings. Does your conclusion
match the conclusion from the hypothesis test?
(-1.145, .345)
We are 90% confident that the true mean difference in
whale sightings (morning – afternoon) is between -1.145
and .345.
Since 0 is included in this interval, we do not have
evidence to suggest more whales are sighted in the
afternoon.