Transcript StatsChap8

Chapter 8: Sampling Distributions
Copyright © 2013, 2010 and 2007 Pearson Education, Inc.
Section 8.1: Distribution of the Sample Mean
Statistics such as the mean (x̄) are random
variables. Their values vary from sample to
sample, so they have probability distributions
associated with them.

In this chapter we focus on the shape (normal?),
center (peak) and spread (deviation) of
distributions of x̄.
The sampling distribution of the sample mean
is the probability distribution of all possible
values of the random variable x̄.
Take many samples of size “n” from a
population whose mean is μ and standard
deviation is σ.
Then plot just the means (x̄) of each sample.
Illustrating Sampling Distributions
Step 1: Obtain a simple random sample of size “n”
Step 2: Compute the sample mean.
Step 3: Repeat Steps 1 and 2 until all possible
simple random samples of size n have
been obtained (in theory). In practice,
take as many n-sized samples as
practicable (time & money…).
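The three steps above can be sketched in Python for a population small enough that every possible sample can be listed. The population values and the sample size below are made up for illustration and are not from the slides.

```python
from itertools import combinations
from statistics import mean

# Hypothetical tiny population (illustrative values, not from the slides).
population = [2, 4, 6, 8, 10]
n = 2  # size of each sample

# Steps 1-3: list every possible simple random sample of size n
# (drawn without replacement) and compute the mean of each one.
sample_means = [mean(sample) for sample in combinations(population, n)]

print(len(sample_means))    # number of possible samples, C(5, 2) = 10
print(mean(sample_means))   # mean of all the sample means
print(mean(population))     # population mean, for comparison
```

For this toy population the mean of all ten sample means equals the population mean, which is the "center" property the following slides illustrate with pennies.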
Sampling Distribution of the Sample
Mean: Normal Population
The weights of pennies minted after 1982 are approx
normally distributed with mean = 2.46g (grams) and
std dev = 0.02g. (28g = 1 oz)
Approximate the sampling distribution of the sample
mean by taking 200 simple random samples of size
n = 5 pennies from this population (in other words, find
the mean weight of each 5-penny sample and then plot
just the 200 x̄'s, not the weights of all 1000 pennies.)
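A minimal simulation sketch of this experiment, assuming the penny weights are exactly normal with the stated mean and standard deviation. The seed and the function name are arbitrary choices, and the simulated numbers will not match the slide's data exactly.

```python
import random
from statistics import mean, stdev

random.seed(1)  # arbitrary seed so the run is repeatable

MU, SIGMA = 2.46, 0.02      # population mean and std dev from the slide (grams)
N, NUM_SAMPLES = 5, 200     # 200 samples of 5 pennies each

def simulate_sample_means(n, num_samples):
    """Return the mean weight of each of num_samples samples of n pennies."""
    return [mean([random.gauss(MU, SIGMA) for _ in range(n)])
            for _ in range(num_samples)]

penny_means = simulate_sample_means(N, NUM_SAMPLES)
print(round(mean(penny_means), 4))   # close to the population mean 2.46
print(round(stdev(penny_means), 4))  # noticeably smaller than 0.02
```

Only the 200 sample means are kept and summarized, which mirrors the instruction to plot the x̄'s rather than the 1000 individual weights.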
The data on the following slide represent the
sample means for the 200 simple random
samples, each of size n = 5.
For example, the first sample of n = 5 pennies
had the following weight data (in g):
2.493  2.466  2.473  2.492  2.471
Note: x̄ = 2.479g for this sample
Sample Means for Samples of Size n = 5
The mean of the 200 sample means is
2.46g, the same as the mean of the
population.
The standard deviation of the sample
means (“standard error”) is 0.0086g,
which is smaller than the standard
deviation of the population.
The next slide shows the histogram of all
200 sample means.
What role does “n”, the size of each sample,
play in the value of the standard deviation of the
sample means?
As the size “n” of each sample increases, the
standard deviation of the distribution of the
sample mean decreases.
The Impact of Sample Size “n” on Sampling
Variation (Variance, Std Dev)
Approximate the distribution of the sample means
by obtaining 200 simple random samples of size
n = 20 pennies (not 5)….
from the same population of pennies minted
after 1982 (μ = 2.46 grams and σ = 0.02 grams)
The mean of the 200 sample means for n = 20 pennies is
still 2.46g, but the standard deviation is now 0.0045g
(it was 0.0086g for n = 5). There is less variation in the
distribution of the sample means with
n = 20 than with n = 5, and the curve is more “normal”.
The Mean & Standard Deviation of the
Sampling Distribution of x̄
If a random sample of size “n” is drawn from a
large population with mean “μ” and std dev “σ”,
then the distribution of the x̄'s will have:
mean: $\mu_{\bar{x}} = \mu$
std dev: $\sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}}$
$\sigma_{\bar{x}}$ is the “standard error of the mean”
or just “standard error”
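As a quick check of the standard error formula against the penny example, the sketch below evaluates σ/√n for the two sample sizes used earlier. The function name is a made-up helper for illustration.

```python
from math import sqrt

def standard_error(sigma, n):
    """Standard deviation of the sample mean: sigma / sqrt(n)."""
    return sigma / sqrt(n)

SIGMA = 0.02  # penny population std dev from the slides (grams)
for n in (5, 20):
    print(n, round(standard_error(SIGMA, n), 4))
# prints "5 0.0089" and then "20 0.0045"
```

These theoretical values sit close to the 0.0086g and 0.0045g observed in the two sets of 200 simulated samples.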
Theorem
If the parent population of the random
variable X is normally distributed, the
distribution of the sample means x̄ from that
population will also be normally distributed.

Distribution of the Sample Mean
The weights of pennies minted after 1982 (pop)
are normally distributed with:
mean = 2.46g and std dev = 0.02g.
What is the probability that, in a random sample
of 10 of these pennies, the mean weight of the
sample is at least 2.465 grams?
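x̄ is normally distributed with
$\mu_{\bar{x}} = 2.46$ g
$\sigma_{\bar{x}} = \frac{0.02}{\sqrt{10}} \approx 0.0063$ g
$Z = \frac{2.465 - 2.46}{0.0063} \approx 0.79$
P(Z > 0.79) = 1 - 0.7852 = 0.2148
normcdf(-1E99, 0.79) = 0.7852
Or: normcdf(0.79, 1E99) = 0.2148
Or: normcdf(2.465, 1E99, 2.46, 0.0063) = 0.2137
(The last value differs slightly because Z is not rounded to 0.79.)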
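The same probability can be reproduced without a calculator. This is a sketch using Python's standard-library `statistics.NormalDist` in place of normcdf; the variable names are arbitrary.

```python
from math import sqrt
from statistics import NormalDist

mu, sigma, n = 2.46, 0.02, 10     # penny population and sample size from the slide

se = sigma / sqrt(n)              # standard error of the sample mean
x_bar = NormalDist(mu, se)        # sampling distribution of the sample mean

z = (2.465 - mu) / se             # z-score of 2.465 g
p_at_least = 1 - x_bar.cdf(2.465) # P(sample mean >= 2.465)

print(round(se, 4), round(z, 2), round(p_at_least, 4))  # p is about 0.21
```

Because nothing is rounded along the way, the result lands between the two calculator answers quoted on the slide.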
Now…
Let's describe the distribution of the
sample means if the parent population is
non-normal,
or
(same thing) we don't know whether the
parent is normal or not.
Sampling from a Population that is Not Normal (or Unknown)
The following table and histogram
give the probability distribution
for rolling a fair die:
Result (Face on Die)    Relative Frequency or Probability
1                       0.1667
2                       0.1667
3                       0.1667
4                       0.1667
5                       0.1667
6                       0.1667
μ = 3.5, σ = 1.708
Note that the population distribution is Uniform, NOT normal
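The values of μ and σ quoted for the die follow from the usual formulas for a discrete probability distribution. A short sketch of that arithmetic:

```python
from math import sqrt

faces = [1, 2, 3, 4, 5, 6]
prob = 1 / 6  # each face of a fair die is equally likely

# Mean: sum of (value * probability).
mu = sum(face * prob for face in faces)

# Std dev: square root of the sum of (squared deviation * probability).
sigma = sqrt(sum((face - mu) ** 2 * prob for face in faces))

print(round(mu, 3), round(sigma, 3))  # 3.5 1.708
```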
Probability Experiment
1. Roll 4 dice (n = 4). Find the mean value of the results, x̄.
2. Repeat this 200 times. Calculate the sample mean for
each of the 200 rolls. Repeat for n = 10 and n = 30.
3. Estimate the sampling distribution of x̄ for each “n”.
Histograms of the distribution of the sample means for each
sample size are given on the next slide.
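The dice experiment can be approximated in software. The sketch below assumes a fair six-sided die simulated with `random.randint`; the seed and function name are arbitrary, and the output is a numeric summary rather than the histograms on the slides.

```python
import random
from statistics import mean, stdev

random.seed(8)  # arbitrary seed so the run is repeatable

def dice_sample_means(n, repeats=200):
    """Roll n fair dice, record the mean, and repeat `repeats` times."""
    return [mean([random.randint(1, 6) for _ in range(n)])
            for _ in range(repeats)]

results = {n: dice_sample_means(n) for n in (4, 10, 30)}

for n, means in results.items():
    # Center should stay near 3.5; spread should shrink as n grows.
    print(n, round(mean(means), 2), round(stdev(means), 2))
```

Plotting each list in `results` as a histogram would show the shape moving from roughly triangular toward a bell curve as n increases.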

Key Points from Dice Experiment

The mean of the sampling distribution is equal to the
mean of the parent population. The std error of the
distribution of the sample means is $\sigma / \sqrt{n}$.
Both results hold regardless of “n”.

The Central Limit Theorem: the shape of the
distribution of the sample means becomes approx
normal as the sample size n increases (rule of thumb: n ≥ 30),
regardless of the shape of the parent population (so sample
means can use z-scores).
Using the Central Limit Theorem
Suppose that the mean time for an auto oil change at
Karlos’ Speedy Oil Change is 11.4 minutes with a standard
deviation of 3.2 min.
(a) If a random sample of n = 35 oil changes is selected,
describe the sampling distribution of the sample mean.
(b) If a sample of n = 35 oil changes at Karlos’ is selected,
what is the probability that the mean of these 35 oil change
times is less than 11 minutes?
Using the Central Limit Theorem
(a) Solution: since n = 35 > 30, x̄ is approximately normally distributed
with mean $\mu_{\bar{x}} = 11.4$ min and std. dev.
$\sigma_{\bar{x}} = \frac{3.2}{\sqrt{35}} \approx 0.5409$ min.
(b) What is the probability that the mean of 35 oil change times is less
than 11 minutes?
$Z = \frac{11 - 11.4}{0.5409} \approx -0.74$
P(Z < -0.74) = 0.2296
Or….
2:Distr:2: normcdf(-1E99,11.0,11.4,0.5409) = 0.2298
In other words, if you did this experiment (n = 35) 100 times, in about 23 of those
times (roughly 1/4) the mean oil change time would be less than 11 mins.
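The oil change calculation can also be checked in Python. This sketch uses the standard-library `statistics.NormalDist` as a stand-in for normcdf and relies on the Central Limit Theorem approximation described above.

```python
from math import sqrt
from statistics import NormalDist

mu, sigma, n = 11.4, 3.2, 35   # oil change times (minutes) and sample size

se = sigma / sqrt(n)           # standard error of the sample mean
p_less_than_11 = NormalDist(mu, se).cdf(11)  # P(sample mean < 11)

print(round(se, 4))              # 0.5409
print(round(p_less_than_11, 4))  # about 0.23
```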
Section 8.2: Distribution of the Sample Proportion (not Mean)
POINT ESTIMATE OF A POPULATION PROPORTION
If a random sample of size “n” is obtained from a
population in which each outcome is binomial
(Yes/No; True/False; Success/Fail), the sample
proportion, p̂ (“p-hat”), is
$\hat{p} = \frac{x}{n}$
where x is the number of “successes”.
The sample proportion p̂ is a statistic
that estimates
the population proportion, ρ (Gr: rho).

Computing a Sample Proportion
In a 2003 national Harris Poll, 1745 registered voters
were asked whether they approved of the way President
Bush was handling the economy (wthtm). 349
responded “yes”, and the other 1396 said various other
unprintable things.
Obtain a point estimate (%) for the proportion of
registered voters who approved of the way President
Bush was handling the economy.
$\hat{p} = \frac{x}{n} = \frac{349}{1745} = 0.2000$ (20%)
Using Simulation to Describe the
Distribution of the Sample Proportion
According to a Time poll in 2008, 42% of
registered voters believed that gay/lesbian
couples should be allowed to (civil) marry.
Describe the sampling distribution of the sample
proportion for samples of size:
n = 10, 50, 100.
Note: We are using simulations to create the histograms on the
following slides.
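A rough sketch of how such a simulation might be set up. The number of repetitions (1000), the seed, and the function name are assumptions, since the slides do not say how their histograms were generated.

```python
import random
from math import sqrt
from statistics import mean, stdev

random.seed(42)  # arbitrary seed so the run is repeatable
P = 0.42         # population proportion from the Time poll

def simulate_p_hats(n, repeats=1000):
    """Draw `repeats` samples of n voters; return each sample proportion."""
    return [sum(random.random() < P for _ in range(n)) / n
            for _ in range(repeats)]

p_hats = {n: simulate_p_hats(n) for n in (10, 50, 100)}

for n, values in p_hats.items():
    # Compare the simulated spread with sqrt(p * q / n).
    print(n, round(mean(values), 3), round(stdev(values), 3),
          round(sqrt(P * (1 - P) / n), 3))
```

Each simulated voter counts as a "success" with probability 0.42, so every entry in a list is one possible value of the sample proportion for that n.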
Key Points from Time Poll

Shape: As the size of the sample “n” increases, the
shape of the sampling distribution of p̂
becomes approximately normal.
Center: The mean of the sampling distribution of p̂
equals the population proportion, ρ.
Spread: The std dev of the sampling distribution of p̂
decreases as the sample size n increases.
Sampling Distribution of p̂
For a simple random sample of size n with point
estimate p̂ and population proportion ρ:
1. The shape of the sampling distribution of p̂ is
approximately normal provided npq ≥ 10.
2. The mean of the sampling distribution of p̂ is
$\mu_{\hat{p}} = \rho$
3. The standard deviation of the sampling
distribution of p̂ is
$\sigma_{\hat{p}} = \sqrt{\frac{pq}{n}}$
Note: p here is the population proportion ρ, and p + q = 1 or q = 1 - p
Normality for Qualitative
Variables (Proportions)…
In order for us to use z-scores for
probabilities of proportions, the
distribution of p̂ must be approx
normal.
Your text uses npq ≥ 10 for this
requirement, but most books use both
np ≥ 5 and also nq ≥ 5 to assume
normality.
Normality Requirement
In this civil marriage problem: p = 0.42
(42% approve), so q = 1 - p = 0.58
(58% disapprove).
If n = 50, then: npq = 12.18 > 10,
np = 21 > 5 and nq = 29 > 5.
But if n = 20: npq = 4.872 < 10, even though
np = 8.4 > 5 and nq = 11.6 > 5.
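The two normality rules can be compared side by side with a small helper. The function and key names below are hypothetical, chosen only for this illustration.

```python
def normality_checks(n, p):
    """Evaluate both rules of thumb for treating p-hat as approximately normal."""
    q = 1 - p
    return {
        "npq": n * p * q,
        "np": n * p,
        "nq": n * q,
        "text_rule": n * p * q >= 10,              # npq >= 10
        "alt_rule": n * p >= 5 and n * q >= 5,     # np >= 5 and nq >= 5
    }

for n in (50, 20):
    checks = normality_checks(n, 0.42)
    print(n, round(checks["npq"], 3), checks["text_rule"], checks["alt_rule"])
# n = 50 passes both rules; n = 20 fails npq >= 10 but passes the other rule
```

The n = 20 case shows that the two rules can disagree, so it matters which one a course or textbook adopts.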
Sampling Distribution of the Sample Proportion
According to a Time poll conducted in 2008, 42% of
registered voters believed that gay/lesbian couples
should be allowed to civil marry.
Suppose that we obtain a random sample of 50
voters and determine which of those voters believe
that gay/lesbian couples should be allowed to
marry (“Success” = x).
Describe the sampling distribution of the sample
proportion for registered voters who believe that
gay and lesbian couples should be allowed to marry.
Proportion Marriage Survey
np(1 - p) = npq = 50(0.42)(0.58) = 12.18 ≥ 10.
The sampling distribution of the sample
proportion is therefore approximately normal with:
mean = 0.42
(42% of the 50 = Success, 58% of the 50 = Fail)
and standard deviation =
$\sigma_{\hat{p}} = \sqrt{\frac{(0.42)(0.58)}{50}} \approx 0.0698 = 6.98\%$
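The mean and standard deviation for this survey can be computed directly. The helper name below is made up for the example.

```python
from math import sqrt

def p_hat_distribution(p, n):
    """Return (mean, std dev) of the sampling distribution of p-hat."""
    q = 1 - p
    return p, sqrt(p * q / n)

mean_p_hat, sd_p_hat = p_hat_distribution(0.42, 50)
print(mean_p_hat, round(sd_p_hat, 4))  # 0.42 0.0698
```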
Compute Probabilities of a Sample
Proportion
According to the Centers for Disease Control and Prevention
(CDC) in 2004, 18.8% of school-aged children
(aged 6-11 yrs) were overweight.
(a) In a random sample of 90 school-aged children, what is the
probability that at least 19% are overweight?
(b) Suppose in a random sample of 90 school-aged children you
find 25 overweight children.
What might you conclude?
Note: we are dealing with % of children, not quantity of
children, because this variable is proportion.
Porky Pigs: a) In a sample of 90 children,
what is the prob that at least 19% are overweight?
1. npq = 90(0.188)(0.812) ≈ 13.739 ≥ 10
So, p̂ is approximately normal with mean = 0.188
and standard deviation =
$\sigma_{\hat{p}} = \sqrt{\frac{(0.188)(0.812)}{90}} \approx 0.0412$
$Z = \frac{0.19 - 0.188}{0.0412} \approx 0.0485$
$P(Z \ge 0.0485) = 1 - 0.5193 = 0.4807$
Porky Pigs: b) In a sample of 90, you find 25 overweight
children. What might you conclude?
1. As before, p̂ is approximately normal with mean = 0.188
and standard deviation = 0.0412.
$\hat{p} = \frac{25}{90} \approx 0.2778$
$Z = \frac{0.2778 - 0.188}{0.0412} \approx 2.179$
2. normcdf(2.179, 1E99) = 0.0147 = prob of finding at least 25/90 O/W
3. If the true population proportion is 0.188 (18.8% O/W), then
finding 25/90 or 27.8% O/W children in a sample is
“unusual” (any data point outside 2σ).
4. If we repeated this experiment 100 times, we would only
expect to get this many O/W children (25/90 or more) approx 1 or 2 times.
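Both parts of the CDC example can be checked with the normal approximation. This is a sketch using the standard-library `statistics.NormalDist` in place of normcdf, with arbitrary variable names.

```python
from math import sqrt
from statistics import NormalDist

p, n = 0.188, 90                 # population proportion and sample size
se = sqrt(p * (1 - p) / n)       # std dev of the sample proportion
p_hat_dist = NormalDist(p, se)   # approximate sampling distribution of p-hat

# (a) probability that at least 19% of the sample is overweight
prob_a = 1 - p_hat_dist.cdf(0.19)

# (b) probability of 25 or more overweight children out of 90
prob_b = 1 - p_hat_dist.cdf(25 / 90)

print(round(se, 4))      # 0.0412
print(round(prob_a, 2))  # roughly 0.48
print(round(prob_b, 3))  # roughly 0.015
```

A result as small as the one in part (b) is what the slide calls "unusual": the observed proportion lies more than two standard deviations above 0.188.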