Statistics Chapter 7: Inferences Based on a Single Sample: Estimation with Confidence Interval


Statistics
Chapter 7: Inferences Based on a
Single Sample: Estimation with
Confidence Interval
Where We’ve Been



Populations are characterized by numerical measures called parameters.
Decisions about population parameters are based on sample statistics.
Inferences involve uncertainty reflected in the sampling distribution of the statistic.
McClave, Statistics, 11th ed. Chapter 7:
Inferences Based on a Single Sample:
Estimation by Confidence Intervals
Where We’re Going



Estimate a parameter based on a large sample.
Use the sampling distribution of the statistic to form a confidence interval for the parameter.
Select the proper sample size when estimating a parameter.
7.1: Identifying the Target
Parameter

The unknown population parameter
that we are interested in estimating is
called the target parameter.
Parameter | Key Word or Phrase | Type of Data
µ | Mean, average | Quantitative
p | Proportion, percentage, fraction, rate | Qualitative
7.2: Large-Sample Confidence
Interval for a Population Mean

A point estimator of a population
parameter is a rule or formula that tells
us how to use the sample data to
calculate a single number that can be
used to estimate the population
parameter.
7.2: Large-Sample Confidence
Interval for a Population Mean

Suppose a sample of 225 college
students watch an average of 28 hours
of television per week, with a standard
deviation of 10 hours.

What can we conclude about all college
students’ television time?
7.2: Large-Sample Confidence
Interval for a Population Mean

Assuming a normal distribution for television hours, we can be 95%* sure that

$$\mu = \bar{x} \pm 1.96\frac{\sigma}{\sqrt{n}}$$
$$\mu = 28 \pm 1.96\frac{10}{\sqrt{225}}$$
$$\mu = 28 \pm 1.96(.67)$$
$$\mu = 28 \pm 1.31$$

*In the standard normal distribution, approximately 95% of the area under the curve is in the interval -1.96 … +1.96
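The arithmetic for the television example can be checked with a short Python sketch. The function name `large_sample_mean_ci` is hypothetical, and the sample standard deviation of 10 hours is assumed to stand in for σ, which is reasonable only because n = 225 is large.

```python
from math import sqrt
from statistics import NormalDist


def large_sample_mean_ci(xbar, s, n, confidence=0.95):
    """Return (lower, upper) for xbar ± z_{alpha/2} * s / sqrt(n)."""
    alpha = 1 - confidence
    # z_{alpha/2}: about 1.96 when confidence is 0.95
    z = NormalDist().inv_cdf(1 - alpha / 2)
    margin = z * s / sqrt(n)
    return xbar - margin, xbar + margin


# Television example: n = 225 students, mean 28 hours, standard deviation 10 hours
lower, upper = large_sample_mean_ci(28, 10, 225)
print(f"95% CI for mu: ({lower:.2f}, {upper:.2f})")  # (26.69, 29.31)
```

The interval agrees with 28 ± 1.31 up to rounding.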
7.2: Large-Sample Confidence
Interval for a Population Mean

An interval estimator or confidence interval is a formula that tells us how to use sample data to calculate an interval that estimates a population parameter.
$$\mu = \bar{x} \pm z\,\sigma_{\bar{x}}$$
7.2: Large-Sample Confidence
Interval for a Population Mean


The confidence coefficient is the probability that a
randomly selected confidence interval encloses the
population parameter.
The confidence level is the confidence coefficient
expressed as a percentage.
(90%, 95% and 99% are very commonly used.)
95% sure: $\mu = \bar{x} \pm z\,\sigma_{\bar{x}}$
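One way to see what the confidence coefficient means is to simulate many samples and count how often the resulting interval encloses µ. The sketch below is only an illustration: it assumes a normal population with µ = 28 and σ = 10 (values borrowed from the television example), and the function name `coverage_rate` is hypothetical.

```python
import random
from math import sqrt
from statistics import NormalDist, mean, stdev


def coverage_rate(mu, sigma, n, confidence=0.95, trials=2000, seed=0):
    """Fraction of simulated intervals xbar ± z * s / sqrt(n) that contain mu."""
    rng = random.Random(seed)
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    hits = 0
    for _ in range(trials):
        # Draw one random sample of size n from the assumed normal population
        sample = [rng.gauss(mu, sigma) for _ in range(n)]
        xbar = mean(sample)
        margin = z * stdev(sample) / sqrt(n)
        if xbar - margin <= mu <= xbar + margin:
            hits += 1
    return hits / trials


rate = coverage_rate(mu=28, sigma=10, n=225)
print(f"Share of intervals containing mu: {rate:.3f}")
```

With a 95% confidence level, the printed share should come out close to 0.95, varying slightly with the seed and the number of trials.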
7.2: Large-Sample Confidence
Interval for a Population Mean

The area outside the confidence interval is called α.
95% sure: $\mu = \bar{x} \pm z\,\sigma_{\bar{x}}$
So we are left with (100 – 95)% = 5% = α uncertainty about µ.
7.2: Large-Sample Confidence
Interval for a Population Mean


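Large-Sample 100(1-α)% Confidence Interval for µ
$$\mu = \bar{x} \pm z_{\alpha/2}\,\sigma_{\bar{x}} = \bar{x} \pm z_{\alpha/2}\frac{\sigma}{\sqrt{n}}$$
If σ is unknown and n is large, the confidence interval becomes
$$\mu = \bar{x} \pm z_{\alpha/2}\,s_{\bar{x}} = \bar{x} \pm z_{\alpha/2}\frac{s}{\sqrt{n}}$$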
7.2: Large-Sample Confidence
Interval for a Population Mean
For the confidence interval to be valid …
the sample must be random and …
the sample size n must be large.
If n is large, the sampling distribution of the sample mean is approximately normal, and s is a good estimate of σ.
7.3: Small-Sample Confidence
Interval for a Population Mean
Large Sample
Sampling distribution of $\bar{x}$ is normal
Known σ or large n
Standard normal (z) distribution
$$\mu = \bar{x} \pm z_{\alpha/2}\frac{\sigma}{\sqrt{n}}$$

Small Sample
Sampling distribution of $\bar{x}$ is unknown
Unknown σ and small n
Student’s t distribution (with n-1 degrees of freedom)
$$\mu = \bar{x} \pm t_{\alpha/2}\frac{s}{\sqrt{n}}$$
7.3: Small-Sample Confidence
Interval for a Population Mean
Large Sample
$$\mu = \bar{x} \pm z_{\alpha/2}\frac{\sigma}{\sqrt{n}}$$
Small Sample
$$\mu = \bar{x} \pm t_{\alpha/2}\frac{s}{\sqrt{n}}$$
7.3: Small-Sample Confidence
Interval for a Population Mean
For the confidence
interval to be valid …
the sample must be
random and …
the population must have
a relative frequency
distribution that is
approximately normal*
* If not, see Chapter 14
7.3: Small-Sample Confidence
Interval for a Population Mean

Suppose a sample of 25 college
students watch an average of 28 hours
of television per week, with a standard
deviation of 10 hours.

What can we conclude about all college
students’ television time?
7.3: Small-Sample Confidence
Interval for a Population Mean

Assuming a normal distribution for television hours, we can be 95% sure that
$$\mu = \bar{x} \pm 2.064\frac{s}{\sqrt{n}}$$
$$\mu = 28 \pm 2.064\frac{10}{\sqrt{25}}$$
$$\mu = 28 \pm 2.064(2)$$
$$\mu = 28 \pm 4.128$$
where 2.064 is $t_{.025}$ with n - 1 = 24 degrees of freedom.
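The small-sample calculation can be sketched in Python as follows. The Python standard library has no t distribution, so the critical value 2.064 ($t_{.025}$ with 24 degrees of freedom) is taken from the slide as a constant. The names `T_CRITICAL_24_DF` and `small_sample_mean_ci` are hypothetical.

```python
from math import sqrt

# t_{.025} with 24 degrees of freedom, as read from a t table (value used on the slide)
T_CRITICAL_24_DF = 2.064


def small_sample_mean_ci(xbar, s, n, t_critical):
    """Return (lower, upper) for xbar ± t_{alpha/2} * s / sqrt(n)."""
    margin = t_critical * s / sqrt(n)  # standard error here is 10 / sqrt(25) = 2
    return xbar - margin, xbar + margin


# Television example with n = 25 students, mean 28 hours, standard deviation 10 hours
lower, upper = small_sample_mean_ci(28, 10, 25, T_CRITICAL_24_DF)
print(f"95% CI for mu: ({lower:.3f}, {upper:.3f})")  # (23.872, 32.128)
```

With the same mean and standard deviation, the interval for n = 25 (28 ± 4.128) is about three times as wide as the one for n = 225 (28 ± 1.31).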
7.4: Large-Sample Confidence
Interval for a Population Proportion

Sampling distribution of p̂

The mean of the sampling distribution is p, the population proportion.
The standard deviation of the sampling distribution is
$$\sigma_{\hat{p}} = \sqrt{\frac{pq}{n}}$$
where $q = 1 - p$.
For large samples the sampling distribution is approximately normal. Large is defined as
$$0 \le \hat{p} \pm 3\sigma_{\hat{p}} \le 1$$
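The large-sample condition can be checked mechanically. In the sketch below, $\sigma_{\hat{p}}$ is estimated with $\sqrt{\hat{p}\hat{q}/n}$, which is an assumption made here because the true p is unknown in practice. The function name `is_large_sample` is hypothetical, and the two calls use illustrative values.

```python
from math import sqrt


def is_large_sample(p_hat, n):
    """Check whether p_hat ± 3 * sigma_p_hat stays inside [0, 1].

    sigma_p_hat is estimated by sqrt(p_hat * q_hat / n) (an assumption,
    since the population proportion p is not known).
    """
    sigma = sqrt(p_hat * (1 - p_hat) / n)
    return 0 <= p_hat - 3 * sigma and p_hat + 3 * sigma <= 1


print(is_large_sample(0.70, 1500))  # True: .70 ± 3(.0118) lies well inside [0, 1]
print(is_large_sample(0.01, 100))   # False: the lower end falls below 0
```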
7.4: Large-Sample Confidence
Interval for a Population Proportion

[Figure: sampling distribution of p̂]
7.4: Large-Sample Confidence
Interval for a Population Proportion
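We can be 100(1-α)% confident that
$$p = \hat{p} \pm z_{\alpha/2}\,\sigma_{\hat{p}} = \hat{p} \pm z_{\alpha/2}\sqrt{\frac{pq}{n}} \approx \hat{p} \pm z_{\alpha/2}\sqrt{\frac{\hat{p}\hat{q}}{n}}$$
where
$$\hat{p} = \frac{x}{n}$$
and
$$\hat{q} = 1 - \hat{p}$$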
7.4: Large-Sample Confidence
Interval for a Population Proportion
A nationwide poll of nearly 1,500 people … conducted by the syndicated cable television show Dateline: USA found that more than 70 percent of those surveyed believe there is intelligent life in the universe, perhaps even in our own Milky Way Galaxy.
What proportion of the entire population agree, at the 95% confidence level?
$$p = \hat{p} \pm z_{\alpha/2}\sqrt{\frac{\hat{p}\hat{q}}{n}}$$
$$p = .70 \pm 1.96\sqrt{\frac{(.70)(.30)}{1500}}$$
$$p = .70 \pm (1.96)(.012)$$
$$p = .70 \pm .023$$
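The poll calculation can be reproduced with a brief Python sketch. It treats "nearly 1,500" as n = 1500 and "more than 70 percent" as p̂ = .70, as the slide does. The function name `proportion_ci` is hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def proportion_ci(p_hat, n, confidence=0.95):
    """Return (lower, upper) for p_hat ± z_{alpha/2} * sqrt(p_hat * q_hat / n)."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    margin = z * sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - margin, p_hat + margin


# Poll example: p_hat = .70, n = 1500
lower, upper = proportion_ci(0.70, 1500)
print(f"95% CI for p: ({lower:.3f}, {upper:.3f})")  # (0.677, 0.723)
```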
7.4: Large-Sample Confidence
Interval for a Population Proportion

If p is close to 0 or 1, Wilson’s adjustment for estimating p yields better results
$$\tilde{p} \pm z_{\alpha/2}\sqrt{\frac{\tilde{p}(1-\tilde{p})}{n+4}}$$
where
$$\tilde{p} = \frac{x+2}{n+4}$$
7.4: Large-Sample Confidence
Interval for a Population Proportion
Suppose in a particular year the proportion of firms declaring bankruptcy that had shown profits the previous year is .002. If 100 firms are sampled and one had declared bankruptcy, what is the 95% CI on the proportion of profitable firms that will tank the next year?
$$p = \tilde{p} \pm z_{\alpha/2}\sqrt{\frac{\tilde{p}(1-\tilde{p})}{n+4}}$$
$$\tilde{p} = \frac{x+2}{n+4} = \frac{1+2}{100+4} = .0288$$
$$p = .0288 \pm 1.96\sqrt{\frac{.0288(1-.0288)}{100+4}}$$
$$p = .0288 \pm .032$$
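A minimal Python sketch of the adjusted interval, applied to the bankruptcy example (x = 1, n = 100), might look like this. The function name `wilson_adjusted_ci` is hypothetical, and it returns the adjusted estimate and the margin rather than the endpoints so the result can be compared with the slide.

```python
from math import sqrt
from statistics import NormalDist


def wilson_adjusted_ci(x, n, confidence=0.95):
    """Return (p_tilde, margin) for the adjusted interval p_tilde ± margin.

    p_tilde = (x + 2) / (n + 4), and the margin uses n + 4 in place of n.
    """
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    p_tilde = (x + 2) / (n + 4)
    margin = z * sqrt(p_tilde * (1 - p_tilde) / (n + 4))
    return p_tilde, margin


# Bankruptcy example: 1 of 100 sampled firms, so p_tilde = 3 / 104
p_tilde, margin = wilson_adjusted_ci(1, 100)
print(f"p = {p_tilde:.4f} ± {margin:.3f}")
```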
7.5: Determining the Sample Size

To be within a certain sampling error (SE) of µ with a level of confidence equal to 100(1-α)%, we can solve
$$z_{\alpha/2}\left(\frac{\sigma}{\sqrt{n}}\right) = SE$$
for n:
$$n = \frac{(z_{\alpha/2})^2\sigma^2}{SE^2}$$
7.5: Determining the Sample Size

The value of σ will almost always be unknown, so we need an estimate:
s from a previous sample
approximate the range, R, and use R/4
Round the calculated value of n upwards to be sure you don’t have too small a sample.
7.5: Determining the Sample Size

Suppose we need to know the mean
driving distance for a new composite
golf ball within 3 yards, with 95%
confidence. A previous study had a
standard deviation of 25 yards. How
many golf balls must we test?
7.5: Determining the Sample Size
Suppose we need to know the mean driving distance for a new composite golf ball within 3 yards, with 95% confidence. A previous study had a standard deviation of 25 yards. How many golf balls must we test?
$$n = \frac{(z_{\alpha/2})^2\sigma^2}{SE^2}$$
$$n = \frac{(1.96)^2(25)^2}{3^2}$$
$$n = 266.78 \approx 267$$
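The sample-size formula for a mean, including the rounding-up step, can be sketched in Python. The function name `sample_size_for_mean` is hypothetical, and σ = 25 yards is the estimate from the previous study.

```python
from math import ceil
from statistics import NormalDist


def sample_size_for_mean(sigma, se, confidence=0.95):
    """Smallest n with z_{alpha/2} * sigma / sqrt(n) <= se, rounded up."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    return ceil((z * sigma / se) ** 2)


# Golf ball example: sigma estimated as 25 yards, sampling error of 3 yards
n_required = sample_size_for_mean(25, 3)
print(n_required)  # 267
```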
7.5: Determining the Sample Size

For a confidence interval on the population proportion, p, we can solve
$$z_{\alpha/2}\sqrt{\frac{pq}{n}} = SE$$
for n:
$$n = \frac{(z_{\alpha/2})^2(pq)}{SE^2}$$
To estimate p, use the sample proportion from a prior study, or use p = .5.
Round the value of n upward to ensure the sample size is large enough to produce the required level of confidence.
7.5: Determining the Sample Size

How many cellular phones must a
manufacturer test to estimate the
fraction defective, p, to within .01 with
90% confidence, if an initial estimate of
.10 is used for p?
7.5: Determining the Sample Size
How many cellular phones must a manufacturer test to estimate the fraction defective, p, to within .01 with 90% confidence, if an initial estimate of .10 is used for p?
$$SE = z_{\alpha/2}\sqrt{\frac{pq}{n}}$$
$$n = \frac{(z_{\alpha/2})^2(pq)}{(SE)^2}$$
$$n = \frac{(1.645)^2(.1)(.9)}{(.01)^2}$$
$$n = 2435.4 \approx 2436$$
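The same calculation for a proportion can be sketched in Python. The function name `sample_size_for_proportion` is hypothetical. It takes z as an argument so that the table value 1.645 used on the slide can be compared with the unrounded critical value from `statistics.NormalDist`.

```python
from math import ceil
from statistics import NormalDist


def sample_size_for_proportion(p, se, z):
    """Smallest n with z * sqrt(p * q / n) <= se, rounded up."""
    return ceil(z ** 2 * p * (1 - p) / se ** 2)


# Cellular phone example: p = .10, SE = .01, 90% confidence
n_table = sample_size_for_proportion(0.10, 0.01, z=1.645)
n_exact = sample_size_for_proportion(0.10, 0.01, z=NormalDist().inv_cdf(0.95))
print(n_table)  # 2436, matching the slide
print(n_exact)  # 2435, because the unrounded z is slightly below 1.645
```

The two answers differ by one unit only because the unrounded result (about 2434.99) falls just under an integer. Either value is reasonable given that the initial estimate of p is itself approximate.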