Transcript 8.1
Chapter 8
Section 1
Distributions of the
Sample Mean
Sullivan – Fundamentals of Statistics – 2nd Edition – Chapter 8 Section 1 – Slide 1 of 29
Chapter 8 – Section 1
● Learning objectives
1 Understand the concept of a sampling distribution
2 Describe the distribution of the sample mean for
samples obtained from normal populations
3 Describe the distribution of the sample mean for
samples obtained from a population that is not normal
Chapter 8 – Section 1
● Often the population is too large to perform a
census … so we take a sample
● How do the results of the sample apply to the
population?
What’s the relationship between the sample mean
and the population mean?
What’s the relationship between the sample standard
deviation and the population standard deviation?
● This is statistical inference
Chapter 8 – Section 1
● We want to use the sample mean x̄ to estimate
the population mean μ
● If we want to estimate the mean height of
eight-year-old girls, we can proceed as follows
Randomly select 100 eight-year-old girls
Compute the sample mean of the 100 heights
Use that as our estimate
● This is using the sample mean to estimate the
population mean
Chapter 8 – Section 1
● However, if we take a series of different random
samples
Sample 1 – we compute sample mean x̄₁
Sample 2 – we compute sample mean x̄₂
Sample 3 – we compute sample mean x̄₃
Etc.
● Each time we sample, we may get a different
result
● The sample mean x̄ is a random variable!
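A minimal simulation sketch of this idea. The population used here (heights that are normal with mean 50 inches and standard deviation 2.5 inches) is a hypothetical choice for illustration and does not come from the slides.

```python
import random
import statistics

random.seed(1)  # fixed seed so the run is repeatable

# Hypothetical population parameters (assumptions for illustration only)
POP_MEAN = 50.0  # inches
POP_SD = 2.5     # inches

def sample_mean(n):
    """Draw n heights at random and return their sample mean."""
    heights = [random.gauss(POP_MEAN, POP_SD) for _ in range(n)]
    return statistics.fmean(heights)

# Three separate random samples of 100 girls each
means = [sample_mean(100) for _ in range(3)]
for i, m in enumerate(means, start=1):
    print(f"Sample {i}: sample mean = {m:.2f}")
```

Each run of `sample_mean(100)` gives a slightly different value, which is what it means for the sample mean to be a random variable.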
Chapter 8 – Section 1
● Because the sample mean is a random variable
The sample mean has a mean
The sample mean has a standard deviation
The sample mean has a probability distribution
● This is called the sampling distribution of the
sample mean
Chapter 8 – Section 1
● When we use the sample mean to estimate the
population mean, we are estimating a parameter
(a fixed number) with a random variable
● The sampling distribution of the sample mean
depends on
The sample size n
The population mean μ
The population standard deviation σ
Chapter 8 – Section 1
● Example
● We have the data
1, 7, 11, 12, 17, 17, 17, 21, 21, 21, 22, 22
and we want to take samples of size n = 3
● First, a histogram of the entire data set
Chapter 8 – Section 1
● A histogram of the entire data set
● Definitely skewed left … not bell shaped
Chapter 8 – Section 1
● Taking some samples of size 3
● The first sample, 17, 21, 12, has a mean of 16.7
Chapter 8 – Section 1
● More sample means from more samples
Chapter 8 – Section 1
● A histogram of 20 sample means
Chapter 8 – Section 1
● The original data set was skewed left, but the set
of sample means is close to bell shaped
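The slides look at 20 samples. As a sketch of the full picture, the 12 data values are few enough that every possible sample of size 3 can be listed. The choice to sample without replacement, and to treat repeated values as distinct items, is an assumption made here for illustration.

```python
from itertools import combinations
import statistics

data = [1, 7, 11, 12, 17, 17, 17, 21, 21, 21, 22, 22]

# Every possible sample of size 3, drawn without replacement
sample_means = [sum(s) / 3 for s in combinations(data, 3)]

pop_mean = statistics.fmean(data)
pop_sd = statistics.pstdev(data)
mean_of_means = statistics.fmean(sample_means)
sd_of_means = statistics.pstdev(sample_means)

print(len(sample_means))  # C(12, 3) = 220 possible samples
print(f"population mean = {pop_mean:.2f}, mean of sample means = {mean_of_means:.2f}")
print(f"population sd   = {pop_sd:.2f}, sd of sample means   = {sd_of_means:.2f}")
```

The sample means center on the population mean and are less spread out than the individual data values.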
Chapter 8 – Section 1
● Learning objective 2
Describe the distribution of the sample mean for
samples obtained from normal populations
Chapter 8 – Section 1
● If we know that the population has a normal
distribution, then the sampling distribution (i.e.
the distribution of x̄) will also be normal
● In fact, the sampling distribution
Will be normally distributed
Will have a mean equal to the mean of the population
Will have a standard deviation less than the standard
deviation of the population (for n > 1)
Chapter 8 – Section 1
● Why does it have a smaller standard deviation?
● The population standard deviation
Is a measure of the distance between an individual
value and the population mean
Corresponds to the sampling error for a sample of
size n = 1
● The standard deviation of the sample mean
Is a measure of the distance between the sample
mean and the population mean
● It makes sense that the estimate is more
accurate if we’ve taken more values (a larger n)
Chapter 8 – Section 1
● The Law of Large Numbers says that
As we add more observations to the sample
(i.e. as n gets larger), the difference between
the sample mean x̄ and the population mean μ
approaches 0
● That is to say, we get to the right answer with
larger and larger sample sizes
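A small simulation sketch of the Law of Large Numbers. The population values (normal with mean 20 and standard deviation 12) are assumptions chosen for illustration.

```python
import random
import statistics

random.seed(8)  # fixed seed so the run is repeatable

MU, SIGMA = 20.0, 12.0  # assumed population values for illustration

def gap_from_mu(n):
    """Absolute difference between the mean of n random draws and MU."""
    draws = [random.gauss(MU, SIGMA) for _ in range(n)]
    return abs(statistics.fmean(draws) - MU)

for n in (10, 1_000, 100_000):
    print(f"n = {n:>7}: |xbar - mu| = {gap_from_mu(n):.4f}")
```

The gap tends to shrink as n grows, although any single run can still wobble.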
Chapter 8 – Section 1
● The standard error, σ_x̄, is the standard deviation
of the sample mean
● The formula for σ_x̄ is
σ_x̄ = σ / √n
● This is an extremely important formula
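A short derivation sketch of where the formula comes from. It assumes the n observations X_1, …, X_n are independent draws that share the population variance σ², which the slide leaves implicit.

```latex
\begin{aligned}
\operatorname{Var}(\bar{x})
  &= \operatorname{Var}\!\left(\frac{1}{n}\sum_{i=1}^{n} X_i\right)
   = \frac{1}{n^{2}}\sum_{i=1}^{n}\operatorname{Var}(X_i)
   = \frac{n\sigma^{2}}{n^{2}}
   = \frac{\sigma^{2}}{n}, \\
\sigma_{\bar{x}}
  &= \sqrt{\operatorname{Var}(\bar{x})}
   = \frac{\sigma}{\sqrt{n}}.
\end{aligned}
```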
Chapter 8 – Section 1
● What does this mean?
● If we have a normally distributed random
variable X, then the distribution of the sample
mean x̄ is completely determined
We know that it’s also normally distributed
We know its mean
We know its standard deviation
● So … we can do all of our calculations!
Chapter 8 – Section 1
● If a simple random sample of size n is drawn
from a large population, then the sampling
distribution of x̄ has
Mean μ_x̄ = μ and
Standard deviation σ_x̄ = σ / √n
● In addition, if the population is normally
distributed, then
The sampling distribution is normally distributed
Chapter 8 – Section 1
● Example
● If the random variable X has a normal
distribution with a mean of 20 and a standard
deviation of 12
If we choose samples of size n = 4, then the sample
mean will have a normal distribution with a mean of
20 and a standard deviation of 6
If we choose samples of size n = 9, then the sample
mean will have a normal distribution with a mean of
20 and a standard deviation of 4
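The two standard deviations in this example follow directly from σ / √n. A minimal sketch, where the helper name `standard_error` is hypothetical:

```python
import math

def standard_error(sigma, n):
    """Standard deviation of the sample mean: sigma / sqrt(n)."""
    if n < 1:
        raise ValueError("sample size n must be at least 1")
    return sigma / math.sqrt(n)

print(standard_error(12, 4))  # 6.0
print(standard_error(12, 9))  # 4.0
```

Because n sits under a square root, quadrupling the sample size only halves the standard error.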
Chapter 8 – Section 1
● Learning objective 3
Describe the distribution of the sample mean for
samples obtained from a population that is not normal
Chapter 8 – Section 1
● This is great if our random variable X has a
normal distribution
● However … what if X does not have a normal
distribution
● What can we do?
Wouldn’t it be very nice if the sampling distribution of
x̄ were also normal?
This is almost true …
Chapter 8 – Section 1
● The Central Limit Theorem states
Regardless of the shape of the population distribution,
the sampling distribution of x̄
becomes approximately normal
as the sample size n increases
● Thus
If the random variable X is normally distributed, then
the sampling distribution is normally distributed also
For other random variables X (with a finite standard
deviation), the sampling distributions are approximately
normally distributed when n is large enough
Chapter 8 – Section 1
● This approximation, of the sampling distribution
being normal, is good for large sample sizes …
large values of n
● How large does n have to be?
● A rule of thumb – if n is 30 or higher, this
approximation is probably pretty good
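A simulation sketch of the rule of thumb. The population here is an exponential distribution with mean 1 and standard deviation 1, a strongly right-skewed choice made for illustration and not taken from the slides.

```python
import math
import random
import statistics

random.seed(30)  # fixed seed so the run is repeatable

N = 30        # sample size (the rule-of-thumb threshold)
REPS = 5_000  # number of simulated samples

# Assumed population: exponential with rate 1 (mean 1, sd 1, right skewed)
sample_means = [
    statistics.fmean([random.expovariate(1.0) for _ in range(N)])
    for _ in range(REPS)
]

mean_of_means = statistics.fmean(sample_means)
sd_of_means = statistics.stdev(sample_means)
theory_se = 1.0 / math.sqrt(N)

# For a normal distribution, about 95% of values lie within 1.96 sd of the mean
share_within = sum(
    abs(m - 1.0) < 1.96 * theory_se for m in sample_means
) / REPS

print(f"mean of sample means = {mean_of_means:.3f} (theory: 1.000)")
print(f"sd of sample means   = {sd_of_means:.3f} (theory: {theory_se:.3f})")
print(f"share within 1.96 standard errors = {share_within:.3f}")
```

Even though individual values are far from normal, the simulated sample means behave much like a normal distribution with mean μ and standard deviation σ / √n.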
Chapter 8 – Section 1
● Example
● We’ve been told that the average weight of
giraffes is 2400 pounds with a standard
deviation of 300 pounds
● We’ve measured 50 giraffes and found that the
sample mean was 2600 pounds
● Is our data consistent with what we’ve been
told?
Chapter 8 – Section 1
● The sample mean is approximately normal with
mean 2400 (the same as the population) and a
standard deviation of 300 / √50 ≈ 42.4
● Using our calculations for the general normal
distribution, 2600 is 200 pounds over 2400, and
200 pounds is 200 / 42.4 ≈ 4.7 standard deviations
of the sample mean
● From our normal calculator, there is about a 1
chance in 1 million of a sample mean this high
● Something is definitely strange … we’ll see what
to do later in inferential statistics
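The giraffe calculation can be checked with a few lines. This is a sketch that stands in for the "normal calculator" mentioned on the slide, using only the Python standard library.

```python
import math

mu, sigma, n = 2400, 300, 50
observed_mean = 2600

se = sigma / math.sqrt(n)      # standard deviation of the sample mean
z = (observed_mean - mu) / se  # how many standard errors above the mean

# Upper-tail probability P(Z > z) for a standard normal, computed with the
# complementary error function, which stays accurate far out in the tail
tail = 0.5 * math.erfc(z / math.sqrt(2))

print(f"standard error = {se:.1f}")
print(f"z = {z:.2f}")
print(f"P(sample mean >= 2600) = {tail:.2e}")
```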
Summary: Chapter 8 – Section 1
● The sample mean is a random variable with a
distribution called the sampling distribution
If the sample size n is sufficiently large (30 or more is
a good rule of thumb), then this distribution is
approximately normal
The mean of the sampling distribution is equal to the
mean of the population
The standard deviation of the sampling distribution is
equal to σ / √n
Chapter 8 – Example 1
● The combined (verbal + quantitative reasoning) score on the GRE is
normally distributed with mean 1066 and standard deviation 191.
(Source: www.ets.org/Media/Tests/GRE/pdf/01210.pdf.) Suppose n
= 15 randomly selected students take the GRE on the same day.
a. What is the probability that a randomly selected student scores above
1100 on the GRE?
b. Describe the sampling distribution of the sample mean.
c. What is the probability that a random sample of 15 students has a
mean GRE score that is less than 1100?
d. What is the probability that a random sample of 15 students has a
mean GRE score that is 1100 or above?
Chapter 8 – Example 1
● The combined (verbal + quantitative reasoning) score on the GRE is
normally distributed with mean 1066 and standard deviation 191.
(Source: www.ets.org/Media/Tests/GRE/pdf/01210.pdf.) Suppose n
= 15 randomly selected students take the GRE on the same day.
a. What is the probability that a randomly selected student scores above
1100 on the GRE? (0.4286)
b. Describe the sampling distribution of the sample mean. (It is normal
with mean 1066 and standard deviation 49.3)
c. What is the probability that a random sample of 15 students has a
mean GRE score that is less than 1100? (0.7549)
d. What is the probability that a random sample of 15 students has a
mean GRE score that is 1100 or above? (0.2451)
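A sketch that reproduces these answers with Python's `statistics.NormalDist`. The values come out slightly different from the ones in parentheses, presumably because those were read from a table with z rounded to two decimals; that explanation is an assumption.

```python
import math
from statistics import NormalDist

mu, sigma, n = 1066, 191, 15

individual = NormalDist(mu, sigma)           # one student's score
xbar = NormalDist(mu, sigma / math.sqrt(n))  # mean score of 15 students

p_a = 1 - individual.cdf(1100)  # part a
se = xbar.stdev                 # part b
p_c = xbar.cdf(1100)            # part c
p_d = 1 - p_c                   # part d

print(f"a. P(X > 1100)     = {p_a:.4f}")
print(f"b. xbar is normal with mean {mu} and sd {se:.1f}")
print(f"c. P(xbar < 1100)  = {p_c:.4f}")
print(f"d. P(xbar >= 1100) = {p_d:.4f}")
```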
Chapter 8 – Example 2
● In the United States, the year each coin was minted is printed on the coin.
To find the age of a coin, simply subtract the year printed on the coin from the
current year. The ages of circulating pennies are right skewed.
Assume the ages of circulating pennies have a mean of 12.2 years and a
standard deviation of 9.9 years.
a. Based on the information given, can we determine the probability that a
randomly selected penny is over 10 years old?
b. What is the probability that a random sample of 40 circulating pennies has a
mean less than 10 years?
c. What is the probability that a random sample of 40 circulating pennies has a
mean greater than 10 years?
d. What is the probability that a random sample of 40 circulating pennies has a
mean greater than 15 years? Would this be unusual?
Chapter 8 – Example 2
● In the United States, the year each coin was minted is printed on the coin.
To find the age of a coin, simply subtract the year printed on the coin from the
current year. The ages of circulating pennies are right skewed.
Assume the ages of circulating pennies have a mean of 12.2 years and a
standard deviation of 9.9 years.
a. Based on the information given, can we determine the probability that a
randomly selected penny is over 10 years old? (No, because the population of
the ages of circulating pennies is not normally distributed, and the mean and
standard deviation alone do not tell us its exact distribution.)
b. What is the probability that a random sample of 40 circulating pennies has a
mean less than 10 years? (0.0793)
c. What is the probability that a random sample of 40 circulating pennies has a
mean greater than 10 years? (0.9207)
d. What is the probability that a random sample of 40 circulating pennies has a
mean greater than 15 years? Would this be unusual? (0.0367; yes)
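A sketch of parts b to d, relying on the Central Limit Theorem because n = 40 is above the rule-of-thumb threshold of 30. The 0.05 cutoff for "unusual" is an assumed convention and is not stated on the slide; the printed values differ slightly from the table-based answers in parentheses.

```python
import math
from statistics import NormalDist

mu, sigma, n = 12.2, 9.9, 40

# Approximate sampling distribution of the mean age of 40 pennies
xbar = NormalDist(mu, sigma / math.sqrt(n))

p_b = xbar.cdf(10)      # mean less than 10 years
p_c = 1 - p_b           # mean greater than 10 years
p_d = 1 - xbar.cdf(15)  # mean greater than 15 years

UNUSUAL_CUTOFF = 0.05   # assumed convention for calling an event unusual
is_unusual = p_d < UNUSUAL_CUTOFF

print(f"b. P(xbar < 10) = {p_b:.4f}")
print(f"c. P(xbar > 10) = {p_c:.4f}")
print(f"d. P(xbar > 15) = {p_d:.4f}, unusual: {is_unusual}")
```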