Transcript Ch7

Chapter 7
Sampling Distributions
McGraw-Hill/Irwin
Copyright © 2010 by The McGraw-Hill Companies, Inc. All rights reserved.
Chapter Outline
7.1 The Sampling Distribution of the Sample Mean
7.2 The Sampling Distribution of the Sample Proportion
7.1 Sampling Distribution of the Sample Mean
The sampling distribution of the sample mean x̄ is the probability distribution of the population of the sample means obtainable from all possible samples of size n from a population of size N
Example 7.1: The Risk Analysis Case
Population of the values of six grand prizes
In order, the values of the prizes (in thousands of dollars) are 10, 20, 30, 40, 50 and 60
• Label each prize A, B, C, …, F in order of increasing value
• The mean prize is $35,000 with a standard deviation of $17,078
Example 7.1: The Risk Analysis Case Continued
Any one of these prizes is as likely to be picked as any other of the six
• Uniform distribution with N = 6
• Each prize has a probability of being picked of 1/6 = 0.1667
Example 7.1: The Risk Analysis Case #3
Figure 7.2 (a)
Example 7.1: The Risk Analysis Case #4
Now, select all possible samples of size n = 2 from this population of size N = 6
• Select all possible pairs of prizes
How to select?
• Sample randomly
• Sample without replacement
• Sample without regard to order
Example 7.1: The Risk Analysis Case #5
Result: There are 15 possible samples of size n = 2
Calculate the sample mean of each and every sample
Example 7.1: The Risk Analysis Case #6
Sample Mean   Frequency   Probability
15            1           1/15
20            1           1/15
25            2           2/15
30            2           2/15
35            3           3/15
40            2           2/15
45            2           2/15
50            1           1/15
55            1           1/15
Table 7.2 (b)
Figure 7.2 (b)
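The enumeration behind Table 7.2 (b) can be reproduced with a short script. The following is a minimal Python sketch (Python and the variable names are illustrative choices, not part of the original slides) that lists every sample of size n = 2 from the six prize values, without replacement and without regard to order, and tallies the sample means.

```python
from collections import Counter
from fractions import Fraction
from itertools import combinations

# Prize values in thousands of dollars (Example 7.1).
prizes = [10, 20, 30, 40, 50, 60]

# All samples of size n = 2, without replacement and without regard to order.
samples = list(combinations(prizes, 2))

# Mean of each sample, kept exact with Fraction, then tallied.
sample_means = [Fraction(a + b, 2) for a, b in samples]
frequency = Counter(sample_means)

print(f"{len(samples)} possible samples")
print("Sample Mean   Frequency   Probability")
for m in sorted(frequency):
    count = frequency[m]
    print(f"{int(m):<13} {count:<11} {count}/{len(samples)}")
```

Using `itertools.combinations` matches the three sampling rules on the slide, since it never repeats an element and ignores order.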
Observations
Although the population of N = 6 grand prizes has a uniform distribution, the histogram of the 15 sample mean prizes:
• Seems to be centered over the population mean prize of $35,000, and
• Appears to be bell-shaped and less spread out than the histogram of individual prizes
General Conclusions
1. If the population of individual items is normal, then the population of all sample means is also normal
2. Even if the population of individual items is not normal, there are circumstances when the population of all sample means is approximately normal (Central Limit Theorem)
General Conclusions Continued
3. The mean of all possible sample means equals the population mean
4. The standard deviation σ_x̄ of all sample means is less than the standard deviation of the population
• Each sample mean averages out the high and the low measurements, and so is closer to the population mean than many of the individual population measurements
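Conclusions 3 and 4 can be checked against the six-prize population of Example 7.1. This is a minimal sketch, assuming Python's standard `statistics` module and illustrative variable names; it compares the population mean and standard deviation with those of the 15 sample means.

```python
from itertools import combinations
from statistics import mean, pstdev

prizes = [10, 20, 30, 40, 50, 60]  # thousands of dollars (Example 7.1)

# Population parameters (pstdev divides by N, as for a full population).
population_mean = mean(prizes)
population_sd = pstdev(prizes)

# Means of all 15 samples of size 2, treated as a population in their own right.
sample_means = [mean(pair) for pair in combinations(prizes, 2)]
mean_of_means = mean(sample_means)
sd_of_means = pstdev(sample_means)

print(f"population:   mean = {population_mean}, sd = {population_sd:.3f}")
print(f"sample means: mean = {mean_of_means}, sd = {sd_of_means:.3f}")
```

The two means agree, and the spread of the sample means comes out smaller than the spread of the individual prizes.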
The Sampling Distribution of x̄
1. If the population being sampled is normal, then so is the sampling distribution of the sample mean, x̄
2. The mean µ_x̄ of the sampling distribution of x̄ is µ_x̄ = µ
• That is, the mean of all possible sample means is the same as the population mean
The Sampling Distribution of x̄ #2
3. The variance σ²_x̄ of the sampling distribution of x̄ is
$$\sigma_{\bar{x}}^2 = \frac{\sigma^2}{n}$$
• That is, the variance of the sampling distribution of x̄ is
  • Directly proportional to the variance of the population
  • Inversely proportional to the sample size
The Sampling Distribution of x̄ #3
• The standard deviation σ_x̄ of the sampling distribution of x̄ is
$$\sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}}$$
• That is, the standard deviation of the sampling distribution of x̄ is
  • Directly proportional to the standard deviation of the population
  • Inversely proportional to the square root of the sample size
Notes
• The formulas for σ²_x̄ and σ_x̄ hold exactly if the sampled population is infinite
• The formulas hold approximately if the sampled population is finite but much larger than the sample size
• x̄ is the point estimate of µ, and the larger the sample size n, the more accurate the estimate
  • Because as n increases, σ_x̄ decreases in proportion to 1/√n
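To make the 1/√n behavior concrete, here is a small Python sketch. The helper name `standard_error` and the value σ = 0.8 are assumptions chosen only for illustration; the function simply evaluates σ/√n for several sample sizes.

```python
import math

def standard_error(sigma, n):
    """Standard deviation of the sample mean for samples of size n."""
    return sigma / math.sqrt(n)

sigma = 0.8  # assumed population standard deviation, for illustration only
for n in (5, 50, 500, 5000):
    print(f"n = {n:>4}: sigma_xbar = {standard_error(sigma, n):.4f}")
```

Because of the square root, quadrupling the sample size only halves the standard deviation of the sample mean.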
Effect of Sample Size
Figure 7.3
Example 7.2: Car Mileage Statistical Inference
• x̄ = 31.56 mpg for a sample of size n = 50 and σ = 0.8
• If the population mean µ is exactly 31, what is the probability of observing a sample mean that is greater than or equal to 31.56?
Example 7.2: Car Mileage Statistical Inference #2
• Calculate the probability of observing a sample mean that is greater than or equal to 31.56 mpg if µ = 31 mpg
  • Want P(x̄ ≥ 31.56 if µ = 31)
• Use s as the point estimate for σ so that
$$\sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}} = \frac{0.8}{\sqrt{50}} \approx 0.113$$
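Example 7.2: Car Mileage Statistical Inference #3
$$
\begin{aligned}
P(\bar{x} \ge 31.56 \text{ if } \mu = 31) &= P\left(z \ge \frac{31.56 - \mu_{\bar{x}}}{\sigma_{\bar{x}}}\right) \\
&= P\left(z \ge \frac{31.56 - 31}{0.113}\right) \\
&= P(z \ge 4.96)
\end{aligned}
$$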
Example 7.2: Car Mileage Statistical Inference #4
• z = 4.96 > 3.99, so P(z ≥ 4.96) < 0.00003
  • If µ = 31 mpg, fewer than 3 in 100,000 samples have a mean at least as large as the one observed
• Have either of the following explanations:
  • If µ = 31 mpg, very unlucky in picking this sample
  • Not unlucky, µ is not 31 mpg, but is really larger
• Difficult to believe such a small chance would occur, so conclude that there is strong evidence that µ does not equal 31 mpg
  • Instead, the evidence indicates that µ is larger than 31 mpg
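The calculation in Example 7.2 can be repeated without a normal table. The sketch below is an illustration in Python, not part of the original slides; it uses the identity P(Z ≥ z) = 0.5 · erfc(z/√2) from the standard `math` module. Note that it keeps the standard deviation of the sample mean unrounded, so z comes out near 4.95 rather than the 4.96 obtained on the slides after rounding to 0.113.

```python
import math

x_bar = 31.56   # observed sample mean (mpg)
mu = 31.0       # hypothesized population mean (mpg)
sigma = 0.8     # standard deviation used in the example
n = 50          # sample size

# Standard deviation of the sample mean, then the z value of the observed mean.
sigma_xbar = sigma / math.sqrt(n)
z = (x_bar - mu) / sigma_xbar

# Upper-tail area of the standard normal distribution.
tail_probability = 0.5 * math.erfc(z / math.sqrt(2))

print(f"sigma_xbar = {sigma_xbar:.4f}")
print(f"z = {z:.2f}")
print(f"P(Z >= z) = {tail_probability:.2e}")
```

The tail area is far below the 0.00003 bound read from the table, which supports the same conclusion.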
Central Limit Theorem
• Now consider a non-normal population
• Still have: µ_x̄ = µ and σ_x̄ = σ/√n
  • Exactly correct if infinite population
  • Approximately correct if population size N finite but much larger than sample size n
• But if population is non-normal, what is the shape of the sampling distribution of x̄?
  • The sampling distribution is approximately normal if the sample is large enough, even if the population is non-normal (Central Limit Theorem)
The Central Limit Theorem #2
• No matter what probability distribution describes the population, if the sample size n is large enough, then the population of all possible sample means is approximately normal with mean µ_x̄ = µ and standard deviation σ_x̄ = σ/√n
• Further, the larger the sample size n, the closer the sampling distribution of the sample mean is to being normal
  • In other words, the larger n, the better the approximation
Effect of Sample Size
Figure 7.5
How Large?
• How large is “large enough”?
• If the sample size is at least 30, then for most sampled populations, the sampling distribution of sample means is approximately normal
  • Here, if n is at least 30, it will be assumed that the sampling distribution of x̄ is approximately normal
• If the population is normal, then the sampling distribution of x̄ is normal regardless of the sample size
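A simulation can illustrate how the sampling distribution approaches normality as n grows. This is a rough sketch under stated assumptions: the exponential population (mean 1, standard deviation 1), the seed, the number of repetitions, and the use of skewness as a gauge of non-normality are all illustrative choices, not taken from the slides.

```python
import random
import statistics

random.seed(7)  # fixed seed so the run is repeatable

def sample_mean_exponential(n):
    """Mean of n draws from an exponential population with mean 1 and sd 1."""
    return statistics.fmean(random.expovariate(1.0) for _ in range(n))

def skewness(values):
    """Moment-based skewness; near 0 for a symmetric, bell-shaped distribution."""
    m = statistics.fmean(values)
    s = statistics.pstdev(values)
    return statistics.fmean(((v - m) / s) ** 3 for v in values)

results = {}
for n in (1, 5, 30):
    means = [sample_mean_exponential(n) for _ in range(20000)]
    results[n] = (statistics.fmean(means), statistics.pstdev(means), skewness(means))
    avg, sd, skew = results[n]
    print(f"n = {n:>2}: mean = {avg:.3f}, sd = {sd:.3f}, skewness = {skew:.2f}")
```

The mean of the sample means stays near 1, their standard deviation tracks 1/√n, and the skewness shrinks toward 0 as n increases, even though the population itself is strongly skewed.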
Unbiased Estimates
• A sample statistic is an unbiased point estimate of a population parameter if the mean of all possible values of the sample statistic equals the population parameter
• x̄ is an unbiased estimate of µ because µ_x̄ = µ
  • In general, the sample mean is always an unbiased estimate of µ
  • The sample median is often an unbiased estimate of µ, but not always
Unbiased Estimates Continued
The sample variance s² is an unbiased estimate of σ²
• That is why s² has a divisor of n – 1 and not n
However, s is not an unbiased estimate of σ
• Even so, the usual practice is to use s as an estimate of σ
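The role of the n – 1 divisor can be verified exactly on a small population. The sketch below is an illustration in Python and makes one assumption that differs from Example 7.1: samples of size 2 are drawn with replacement, so that the observations are independent. It averages each statistic over all 36 equally likely ordered samples from the six prize values.

```python
from itertools import product
from statistics import fmean, pstdev, pvariance, stdev, variance

population = [10.0, 20.0, 30.0, 40.0, 50.0, 60.0]  # prize values, thousands of dollars

# Every ordered sample of size 2 drawn WITH replacement (an assumption, see above).
samples = list(product(population, repeat=2))

avg_s2 = fmean(variance(s) for s in samples)         # divisor n - 1
avg_s2_div_n = fmean(pvariance(s) for s in samples)  # divisor n
avg_s = fmean(stdev(s) for s in samples)             # square root of the n - 1 version

sigma2 = pvariance(population)
sigma = pstdev(population)

print(f"sigma^2 = {sigma2:.2f}, average s^2 (n - 1) = {avg_s2:.2f}, average with divisor n = {avg_s2_div_n:.2f}")
print(f"sigma = {sigma:.3f}, average s = {avg_s:.3f}")
```

With the n – 1 divisor the average sample variance matches the population variance, the divisor n underestimates it, and the average of s falls below the population standard deviation.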
Minimum Variance Estimates
Want the sample statistic to have a small standard deviation
• All values of the sample statistic should be clustered around the population parameter
• Then, the statistic from any sample should be close to the population parameter
Minimum Variance Estimates Continued
Given a choice between unbiased estimates, choose the one with the smallest standard deviation
• When the population is normal, the sample mean and the sample median are both unbiased estimates of µ
• The sampling distribution of sample means generally has a smaller standard deviation than that of sample medians
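The comparison between the sample mean and the sample median can be explored by simulation. This is a minimal sketch, assuming a normal population; the values of µ, σ, n, the seed, and the number of repetitions are illustrative assumptions, not figures from the slides.

```python
import random
import statistics

random.seed(11)  # fixed seed so the run is repeatable

mu, sigma = 31.0, 0.8        # assumed normal population
n, repetitions = 25, 10000   # assumed sample size and number of samples

means, medians = [], []
for _ in range(repetitions):
    sample = [random.gauss(mu, sigma) for _ in range(n)]
    means.append(statistics.fmean(sample))
    medians.append(statistics.median(sample))

sd_of_means = statistics.pstdev(means)
sd_of_medians = statistics.pstdev(medians)

print(f"average of sample means   = {statistics.fmean(means):.3f}, sd = {sd_of_means:.3f}")
print(f"average of sample medians = {statistics.fmean(medians):.3f}, sd = {sd_of_medians:.3f}")
```

Both statistics center on µ, but the sample medians vary more from sample to sample, which is why the sample mean is preferred here.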
7.2 The Sampling Distribution of the Sample Proportion
• The probability distribution of all possible sample proportions is the sampling distribution of the sample proportion
• If a random sample of size n is taken from a population in which the proportion of interest is p, then the sampling distribution of the sample proportion p̂
  • Is approximately normal, if n is large
  • Has a mean that equals p
  • Has standard deviation
$$\sigma_{\hat{p}} = \sqrt{\frac{p(1 - p)}{n}}$$
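The mean and standard deviation of the sample proportion can be checked exactly. The sketch below is an illustration in Python under a stated assumption: the number of successes in the sample follows a binomial distribution (independent draws), and the helper name `sample_proportion_moments` and the values n = 100, p = 0.3 are hypothetical.

```python
import math

def sample_proportion_moments(n, p):
    """Exact mean and standard deviation of p-hat = X / n, with X ~ Binomial(n, p)."""
    probabilities = [math.comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(n + 1)]
    mean = sum((k / n) * pr for k, pr in enumerate(probabilities))
    variance = sum((k / n - mean) ** 2 * pr for k, pr in enumerate(probabilities))
    return mean, math.sqrt(variance)

n, p = 100, 0.3  # illustrative values, not from the slides
mean_p_hat, sd_p_hat = sample_proportion_moments(n, p)
formula_sd = math.sqrt(p * (1 - p) / n)

print(f"mean of p-hat = {mean_p_hat:.4f} (p = {p})")
print(f"sd of p-hat   = {sd_p_hat:.4f} (formula gives {formula_sd:.4f})")
```

The exact values agree with the mean p and the standard deviation formula above.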