Sampling Distributions and the Central Limit Theorem

© 2010 Pearson Prentice Hall. All rights reserved
Statistics such as $\bar{x}$ are random variables since
their value varies from sample to sample. As
such, they have probability distributions
associated with them. In this chapter we
focus on the shape, center and spread of
statistics such as $\bar{x}$.

The sampling distribution of a statistic is a
probability distribution for all possible values
of the statistic computed from a sample of size
n.
The sampling distribution of the sample mean
$\bar{x}$ is the probability distribution of all
possible values of the random variable $\bar{x}$
computed from a sample of size n from a
population with mean $\mu$ and standard
deviation $\sigma$.

Parallel Example 1: Sampling Distribution of the Sample
Mean-Normal Population
The weights of pennies minted after 1982 are
approximately normally distributed with mean
2.46 grams and standard deviation 0.02 grams.
Approximate the sampling distribution of the
sample mean by obtaining 200 simple random
samples of size n = 5 from this population.
The data on the following slide represent the
sample means for the 200 simple random samples
of size n = 5.
For example, the first sample of n = 5 had the
following data:
2.493  2.466  2.473  2.492  2.471
Note: $\bar{x} = 2.479$ for this sample.
[Table: Sample Means for Samples of Size n = 5]
The mean of the 200 sample means is 2.46, the
same as the mean of the population.
The standard deviation of the sample means is
0.0086, which is smaller than the standard
deviation of the population.
The next slide shows the histogram of the
sample means.
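The simulation in this example can be sketched in Python. This is a minimal sketch, not the procedure used to produce the slides: the function name `simulate_sample_means`, the seed, and the use of Python's standard `random` module are assumptions made for illustration. Because the samples are random, the results should be close to, but not identical to, the values 2.46 and 0.0086 reported above.

```python
import random
import statistics


def simulate_sample_means(n, num_samples=200, mu=2.46, sigma=0.02, seed=1):
    """Draw num_samples random samples of size n from a normal population
    and return the list of their sample means.

    The function name and the seed are illustrative assumptions.
    """
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    means = []
    for _ in range(num_samples):
        sample = [rng.gauss(mu, sigma) for _ in range(n)]
        means.append(statistics.mean(sample))
    return means


means_n5 = simulate_sample_means(n=5)
print("mean of sample means:", round(statistics.mean(means_n5), 4))
print("std. dev. of sample means:", round(statistics.stdev(means_n5), 4))
```

Calling `simulate_sample_means(n=20)` instead gives a comparable run for a larger sample size.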
What role does n, the sample size, play in the
standard deviation of the distribution of the
sample mean?
As the size of the sample gets larger, we do not
expect as much spread in the sample means
since larger observations will offset smaller
observations.
Parallel Example 2: The Impact of Sample Size on Sampling
Variability
• Approximate the sampling distribution of the
sample mean by obtaining 200 simple random
samples of size n = 20 from the population of
weights of pennies minted after 1982 ($\mu = 2.46$
grams and $\sigma = 0.02$ grams).
The mean of the 200 sample means for n =20 is still 2.46, but the
standard deviation is now 0.0045 (0.0086 for n = 5).
As expected, there is less variability in the distribution
of the sample mean with n =20 than with n =5.
The Mean and Standard Deviation of the
Sampling Distribution of $\bar{x}$
Suppose that a simple random sample of size n is
drawn from a large population with mean $\mu$ and
standard deviation $\sigma$. The sampling distribution of
$\bar{x}$ will have mean $\mu_{\bar{x}} = \mu$ and standard deviation
$$\sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}}.$$
The standard deviation of the sampling distribution of
$\bar{x}$ is called the standard error of the mean and is
denoted $\sigma_{\bar{x}}$.
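As a quick check of the standard error formula, the short sketch below evaluates it for the two penny sample sizes used earlier. The helper name `standard_error` is an assumption for illustration. The theoretical values come out near 0.0089 for n = 5 and 0.0045 for n = 20, which is in line with the 0.0086 and 0.0045 observed in the 200 simulated samples.

```python
import math


def standard_error(sigma, n):
    """Standard error of the mean: sigma divided by the square root of n."""
    return sigma / math.sqrt(n)


sigma = 0.02  # population standard deviation of penny weights (grams)
for n in (5, 20):
    print(f"n = {n:2d}: standard error = {standard_error(sigma, n):.4f}")
```

Quadrupling the sample size from 5 to 20 halves the standard error, since the sample size enters through its square root.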
The Shape of the Sampling
Distribution of $\bar{x}$ If X is Normal
If a random variable X is normally distributed,
the distribution of the sample mean $\bar{x}$ is
normally distributed.
Parallel Example 3: Describing the Distribution of the
Sample Mean
The weights of pennies minted after 1982 are
approximately normally distributed with mean
2.46 grams and standard deviation 0.02
grams.
What is the probability that in a simple random
sample of 10 pennies minted after 1982, we
obtain a sample mean of at least 2.465 grams?
Solution
• $\bar{x}$ is normally distributed with $\mu_{\bar{x}} = 2.46$ and
$\sigma_{\bar{x}} = \frac{0.02}{\sqrt{10}} \approx 0.0063$.
• $Z = \frac{2.465 - 2.46}{0.0063} \approx 0.79$.
• $P(Z > 0.79) = 1 - 0.7852 = 0.2148$.
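The same probability can be checked numerically. The sketch below uses `statistics.NormalDist` from the Python standard library to model the sampling distribution of the sample mean directly; the variable names are illustrative assumptions. Because it does not round the standard error or Z before looking up the probability, the result can differ slightly from the table-based answer in the last decimal places.

```python
import math
from statistics import NormalDist

mu, sigma, n = 2.46, 0.02, 10
std_error = sigma / math.sqrt(n)            # standard error of the mean
sampling_dist = NormalDist(mu, std_error)   # distribution of the sample mean

z = (2.465 - mu) / std_error                # standardized value of 2.465
prob = 1 - sampling_dist.cdf(2.465)         # P(sample mean >= 2.465)

print(f"standard error = {std_error:.4f}")
print(f"z = {z:.2f}")
print(f"P(sample mean >= 2.465) = {prob:.4f}")
```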

Parallel Example 4: Sampling from a Population that is Not
Normal
The following table and histogram
give the probability distribution
for rolling a fair die:
Face on Die    Relative Frequency
1              0.1667
2              0.1667
3              0.1667
4              0.1667
5              0.1667
6              0.1667
$\mu = 3.5$, $\sigma = 1.708$
Note that the population distribution is NOT normal.
Estimate the sampling distribution of $\bar{x}$ by obtaining
200 simple random samples of size n = 4 and
calculating the sample mean for each of the 200
samples. Repeat for n = 10 and 30.
Histograms of the sampling distribution of the sample
mean for each sample size are given on the next
slide.
Key Points from Example 4
• The mean of the sampling distribution is equal to
the mean of the parent population, and the
standard deviation of the sampling distribution of
the sample mean is $\frac{\sigma}{\sqrt{n}}$, regardless of the sample
size.
• The Central Limit Theorem: the shape of the
distribution of the sample mean becomes
approximately normal as the sample size n
increases, regardless of the shape of the
population.
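The die-rolling experiment in Example 4 can be sketched as follows. This is a minimal illustration under stated assumptions: the function name `simulate_die_means`, the seed, and the use of the standard `random` module are not from the slides. For each sample size it prints the mean and standard deviation of the 200 simulated sample means next to the theoretical value of sigma divided by the square root of n.

```python
import random
import statistics

FACES = [1, 2, 3, 4, 5, 6]

# Population mean and standard deviation of a fair die.
pop_mean = statistics.mean(FACES)
pop_sd = statistics.pstdev(FACES)


def simulate_die_means(n, num_samples=200, seed=1):
    """Return num_samples sample means, each computed from n rolls of a
    fair die. The function name and the seed are illustrative assumptions."""
    rng = random.Random(seed)
    means = []
    for _ in range(num_samples):
        rolls = [rng.choice(FACES) for _ in range(n)]
        means.append(statistics.mean(rolls))
    return means


print(f"population: mean = {pop_mean}, std. dev. = {pop_sd:.3f}")
for n in (4, 10, 30):
    means = simulate_die_means(n)
    print(
        f"n = {n:2d}: mean = {statistics.mean(means):.3f}, "
        f"std. dev. = {statistics.stdev(means):.3f}, "
        f"sigma/sqrt(n) = {pop_sd / n ** 0.5:.3f}"
    )
```

Plotting a histogram of each list of means would show the shape moving toward a bell curve as n grows, even though the population itself is uniform.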
Parallel Example 5: Using the Central Limit Theorem
Suppose that the mean time for an oil change at a “10-minute
oil change joint” is 11.4 minutes with a standard deviation
of 3.2 minutes.
(a) If a random sample of n = 35 oil changes is selected,
describe the sampling distribution of the sample mean.
(b) If a random sample of n = 35 oil changes is selected, what
is the probability the mean oil change time is less than 11
minutes?
Solution:
(a) $\bar{x}$ is approximately normally distributed
with mean $\mu_{\bar{x}} = 11.4$ and standard deviation
$\sigma_{\bar{x}} = \frac{3.2}{\sqrt{35}} \approx 0.5409$.
(b) $Z = \frac{11 - 11.4}{0.5409} \approx -0.74$, so $P(Z < -0.74) = 0.23$.
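The oil change calculation can be reproduced by standardizing to a Z-score and then using the standard normal distribution. The sketch below is an illustration only, with assumed variable names, and it relies on the Central Limit Theorem approximation described in part (a).

```python
import math
from statistics import NormalDist

mu, sigma, n = 11.4, 3.2, 35
std_error = sigma / math.sqrt(n)   # standard deviation of the sample mean

z = (11 - mu) / std_error          # Z-score for a sample mean of 11 minutes
prob = NormalDist().cdf(z)         # P(Z < z) under the standard normal

print(f"standard error = {std_error:.4f}")
print(f"z = {z:.2f}")
print(f"P(sample mean < 11) = {prob:.2f}")
```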
Point Estimate of a Population Proportion
Suppose that a random sample of size n is obtained
from a population in which each individual either
does or does not have a certain characteristic. The
sample proportion, denoted $\hat{p}$ (read “p-hat”), is
given by
$$\hat{p} = \frac{x}{n}$$
where x is the number of individuals in the sample
with the specified characteristic. The sample
proportion $\hat{p}$ is a statistic that estimates the
population proportion, p.
Parallel Example 1: Computing a Sample Proportion
In a Quinnipiac University Poll conducted in May of
2008, 1,745 registered voters nationwide were
asked whether they approved of the way George
W. Bush is handling the economy. 349 responded
“yes”. Obtain a point estimate for the proportion
of registered voters who approve of the way
George W. Bush is handling the economy.
Solution: $\hat{p} = \frac{349}{1745} = 0.2$
Parallel Example 2: Using Simulation to Describe the
Distribution of the Sample Proportion
According to a Time poll conducted in June of
2008, 42% of registered voters believed that
gay and lesbian couples should be allowed to
marry.
Describe the sampling distribution of the sample
proportion for samples of size n=10, 50, 100.
Key Points from Example 2
• Shape: As the size of the sample, n, increases,
the shape of the sampling distribution of the
sample proportion becomes approximately
normal.
• Center: The mean of the sampling
distribution of the sample proportion equals
the population proportion, p.
• Spread: The standard deviation of the
sampling distribution of the sample
proportion decreases as the sample size, n,
increases.
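The simulation behind these key points can be sketched as below. The slides do not say how many samples were simulated or which tool was used, so `num_samples=200`, the seed, and the function name `simulate_sample_proportions` are all assumptions made for illustration.

```python
import random
import statistics


def simulate_sample_proportions(n, p=0.42, num_samples=200, seed=1):
    """Return num_samples simulated sample proportions for samples of size n.

    Each sampled individual has the characteristic with probability p.
    num_samples, the seed, and the function name are illustrative assumptions.
    """
    rng = random.Random(seed)
    proportions = []
    for _ in range(num_samples):
        x = sum(1 for _ in range(n) if rng.random() < p)  # count of "yes"
        proportions.append(x / n)
    return proportions


for n in (10, 50, 100):
    p_hats = simulate_sample_proportions(n)
    print(
        f"n = {n:3d}: mean = {statistics.mean(p_hats):.3f}, "
        f"std. dev. = {statistics.stdev(p_hats):.3f}"
    )
```

The printed means should stay near 0.42 for every sample size, while the standard deviations shrink as n increases.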
Sampling Distribution of $\hat{p}$
For a simple random sample of size n with
population proportion p:
• The shape of the sampling distribution of $\hat{p}$ is
approximately normal provided $np(1-p) \ge 10$.
• The mean of the sampling distribution of $\hat{p}$ is
$\mu_{\hat{p}} = p$.
• The standard deviation of the sampling
distribution of $\hat{p}$ is
$$\sigma_{\hat{p}} = \sqrt{\frac{p(1-p)}{n}}.$$
Sampling Distribution of $\hat{p}$
• The model on the previous slide requires that the
sampled values are independent. When sampling
from finite populations, this assumption is verified
by checking that the sample size n is no more than
5% of the population size N ($n \le 0.05N$).
• Regardless of whether $np(1-p) \ge 10$ or not, the
mean of the sampling distribution of $\hat{p}$ is p, and
the standard deviation is
$$\sigma_{\hat{p}} = \sqrt{\frac{p(1-p)}{n}}.$$
Parallel Example 3: Describing the Sampling Distribution of
the Sample Proportion
According to a Time poll conducted in June of
2008, 42% of registered voters believed that
gay and lesbian couples should be allowed to
marry. Suppose that we obtain a simple
random sample of 50 voters and determine
which believe that gay and lesbian couples
should be allowed to marry. Describe the
sampling distribution of the sample proportion
for registered voters who believe that gay and
lesbian couples should be allowed to marry.
Solution
The sample of n=50 is smaller than 5% of the
population size (all registered voters in the
U.S.).
Also, np(1-p)=50(0.42)(0.58)=12.18≥10.
The sampling distribution of the sample
proportion is therefore approximately normal
with mean = 0.42 and standard deviation
$$\sigma_{\hat{p}} = \sqrt{\frac{0.42(1 - 0.42)}{50}} \approx 0.0698.$$
(Note: this is very close to the standard deviation of 0.072
found using simulation in Example 2.)
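The checks in this solution can be wrapped in a small helper. This is only a sketch: the function name `describe_p_hat` and its return format are assumptions for illustration, and the 5% independence condition is not checked in code because the slides do not give a numeric population size.

```python
import math


def describe_p_hat(p, n):
    """Return the mean, the standard deviation, and the value of n*p*(1-p)
    for the sampling distribution of the sample proportion.

    The function name and return format are illustrative assumptions.
    """
    np1p = n * p * (1 - p)
    return {
        "mean": p,
        "std_dev": math.sqrt(p * (1 - p) / n),
        "np(1-p)": np1p,
        "approximately_normal": np1p >= 10,  # normality condition from the slides
    }


result = describe_p_hat(p=0.42, n=50)
for name, value in result.items():
    print(f"{name}: {value}")
```

Calling `describe_p_hat(p=0.42, n=10)` shows the normality condition failing for the smallest sample size in Example 2.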
 8-37