Transcript 8.1

Chapter 8
Section 1
Distributions of the
Sample Mean
Sullivan – Fundamentals of Statistics – 2nd Edition – Chapter 8 Section 1 – Slide 1 of 29
Chapter 8 – Section 1
● Learning objectives
1 Understand the concept of a sampling distribution
2 Describe the distribution of the sample mean for
samples obtained from normal populations
3 Describe the distribution of the sample mean for
samples obtained from a population that is not normal
● Often the population is too large to perform a
census … so we take a sample
● How do the results of the sample apply to the
population?
▪ What’s the relationship between the sample mean
and the population mean?
▪ What’s the relationship between the sample standard
deviation and the population standard deviation?
● This is statistical inference
● We want to use the sample mean x̄ to estimate
the population mean μ
● If we want to estimate the mean height of
eight-year-old girls, we can proceed as follows
▪ Randomly select 100 eight-year-old girls
▪ Compute the sample mean of the 100 heights
▪ Use that as our estimate
● This is using the sample mean to estimate the
population mean
● However, if we take a series of different random
samples
▪ Sample 1 – we compute sample mean x̄₁
▪ Sample 2 – we compute sample mean x̄₂
▪ Sample 3 – we compute sample mean x̄₃
▪ Etc.
● Each time we sample, we may get a different
result
● The sample mean x̄ is a random variable!
● Because the sample mean is a random variable
▪ The sample mean has a mean
▪ The sample mean has a standard deviation
▪ The sample mean has a probability distribution
● This is called the sampling distribution of the
sample mean
● When we use the sample mean to estimate the
population mean, we are estimating a parameter
(number) with a random variable
● The sampling distribution of the sample mean
depends on
▪ The sample size n
▪ The population mean μ
▪ The population standard deviation σ
● Example
● We have the data
1, 7, 11, 12, 17, 17, 17, 21, 21, 21, 22, 22
and we want to take samples of size n = 3
● First, a histogram of the entire data set
● Definitely skewed left … not bell shaped
● Taking some samples of size 3
● The first sample, 17, 21, 12, has a mean of 16.7
● More sample means from more samples
● A histogram of 20 sample means
● The original data set was skewed left, but the set
of sample means is close to bell shaped
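The sampling experiment in this example can be reproduced with a short Python sketch. It is only an illustration: the seed, the variable names, and the choice to draw each sample without replacement are assumptions, not part of the slides. It draws 20 random samples of size 3 from the data set, then enumerates every possible sample of size 3 to show that the sample means center on the population mean.

```python
import random
from itertools import combinations
from statistics import mean

# Data set from the example above.
data = [1, 7, 11, 12, 17, 17, 17, 21, 21, 21, 22, 22]
population_mean = mean(data)

# Draw 20 random samples of size 3 (without replacement within a sample)
# and record each sample mean. The seed is arbitrary and only makes the
# run repeatable.
rng = random.Random(0)
sample_means = [mean(rng.sample(data, 3)) for _ in range(20)]

# Enumerate every possible sample of size 3 to get the exact sampling
# distribution of the sample mean for this small data set.
all_means = [mean(c) for c in combinations(data, 3)]
mean_of_all_means = mean(all_means)

print("population mean:", population_mean)
print("20 simulated sample means:", [round(m, 1) for m in sample_means])
print("number of possible samples:", len(all_means))
print("mean of all possible sample means:", round(mean_of_all_means, 2))
```

The individual sample means vary from run to run (change the seed to see this), but the average over all possible samples matches the population mean.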
● If we know that the population has a normal
distribution, then the sampling distribution (i.e.
the distribution of x̄) will also be normal
● In fact, the sampling distribution
▪ Will be normally distributed
▪ Will have a mean equal to the mean of the population
▪ Will have a standard deviation less than the standard
deviation of the population (for any sample size n > 1)
● Why does it have a smaller standard deviation?
● The population standard deviation
▪ Is a measure of the distance between an individual
value and the population mean
▪ Is, in effect, the sampling error for a sample of size n = 1
● The standard deviation of the sample mean
▪ Is a measure of the distance between the sample
mean and the population mean
● It makes sense that the estimate is more
accurate if we’ve taken more values (a larger n)
● The Law of Large Numbers says that
As we add more observations to the sample
(i.e. as n gets larger), the difference between
the sample mean x̄ and the population mean μ
approaches 0
● That is to say, we get to the right answer with
larger and larger sample sizes
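A small simulation can make the Law of Large Numbers concrete. The sketch below is illustrative only: it assumes a hypothetical normal population with mean 20 and standard deviation 12, and the seed and sample sizes are arbitrary choices.

```python
import random

# Hypothetical population (an assumption for this sketch):
# normal with mean 20 and standard deviation 12.
MU, SIGMA = 20.0, 12.0
rng = random.Random(1)  # arbitrary seed so the run is repeatable

def sample_mean(n):
    """Mean of n independent draws from the population."""
    return sum(rng.gauss(MU, SIGMA) for _ in range(n)) / n

# Gap between the sample mean and the population mean for growing n.
gaps = {n: abs(sample_mean(n) - MU) for n in (10, 1_000, 100_000)}
for n, gap in gaps.items():
    print(f"n = {n:>7}: |xbar - mu| = {gap:.4f}")
```

For any single run the gap need not shrink at every step, but for very large n it is reliably close to 0.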
● The standard error, σ_x̄, is the standard deviation
of the sample mean
● The formula for σ_x̄ is
σ_x̄ = σ / √n
● This is an extremely important formula
● What does this mean?
● If we have a normally distributed random
variable X, then the distribution of the sample
mean x̄ is completely determined
▪ We know that it’s also normally distributed
▪ We know its mean
▪ We know its standard deviation
● So … we can do all of our calculations!
● If a simple random sample of size n is drawn
from a large population, then the sampling
distribution has
▪ Mean μ_x̄ = μ and
▪ Standard deviation σ_x̄ = σ / √n
● In addition, if the population is normally
distributed, then
▪ The sampling distribution is normally distributed
● Example
● If the random variable X has a normal
distribution with a mean of 20 and a standard
deviation of 12
▪ If we choose samples of size n = 4, then the sample
mean will have a normal distribution with a mean of
20 and a standard deviation of 12 / √4 = 6
▪ If we choose samples of size n = 9, then the sample
mean will have a normal distribution with a mean of
20 and a standard deviation of 12 / √9 = 4
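The arithmetic in this example is just the standard error formula applied twice. A minimal Python sketch follows; the function name standard_error is a hypothetical helper introduced here, not something from the slides.

```python
import math

def standard_error(sigma, n):
    """Standard deviation of the sample mean: sigma / sqrt(n)."""
    if n < 1:
        raise ValueError("sample size n must be at least 1")
    return sigma / math.sqrt(n)

# Population standard deviation 12, as in the example above.
se_4 = standard_error(12, 4)
se_9 = standard_error(12, 9)
print(se_4, se_9)  # 6.0 4.0
```

Quadrupling the sample size halves the standard error, because n enters through a square root.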
● This is great if our random variable X has a
normal distribution
● However … what if X does not have a normal
distribution?
● What can we do?
▪ Wouldn’t it be very nice if the sampling distribution of
x̄ were also normal?
▪ This is almost true …
● The Central Limit Theorem states
Regardless of the shape of the population distribution,
the sampling distribution of x̄
becomes approximately normal
as the sample size n increases
● Thus
▪ If the random variable X is normally distributed, then
the sampling distribution is normally distributed also
▪ For all other random variables X, the sampling
distributions are approximately normally distributed
when n is large
● This approximation, of the sampling distribution
being normal, is good for large sample sizes …
large values of n
● How large does n have to be?
● A rule of thumb – if n is 30 or higher, this
approximation is probably pretty good
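The rule of thumb can be checked by simulation. The sketch below is an illustration under stated assumptions: the population is taken to be exponential with mean 1 and standard deviation 1 (strongly right skewed), and the seed and number of repetitions are arbitrary. It draws many samples of size n = 30 and compares the sample means with what a normal distribution would predict.

```python
import math
import random
from statistics import mean, stdev

rng = random.Random(2)   # arbitrary seed so the run is repeatable
N, REPS = 30, 5000       # sample size and number of simulated samples

# Assumed population: exponential with rate 1 (mean 1, sd 1), which is
# strongly right skewed and clearly not normal.
sample_means = [
    sum(rng.expovariate(1.0) for _ in range(N)) / N
    for _ in range(REPS)
]

mean_of_means = mean(sample_means)
sd_of_means = stdev(sample_means)
expected_se = 1.0 / math.sqrt(N)   # sigma / sqrt(n) with sigma = 1

# For a normal distribution, about 68% of values fall within one
# standard deviation of the mean.
within_one_se = sum(abs(m - 1.0) <= expected_se for m in sample_means) / REPS

print(f"mean of sample means: {mean_of_means:.3f} (population mean 1)")
print(f"sd of sample means:   {sd_of_means:.3f} (sigma/sqrt(n) = {expected_se:.3f})")
print(f"share within one SE:  {within_one_se:.3f} (normal would give about 0.68)")
```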
● Example
● We’ve been told that the average weight of
giraffes is 2400 pounds with a standard
deviation of 300 pounds
● We’ve measured 50 giraffes and found that the
sample mean was 2600 pounds
● Is our data consistent with what we’ve been
told?
● The sample mean is approximately normal with
mean 2400 (the same as the population) and a
standard deviation of 300 / √ 50 = 42.4
● Using our calculations for the general normal
distribution, 2600 is 200 pounds over 2400, and
200 pounds is 200 / 42.4 = 4.7 standard deviations
above the mean
● From our normal calculator, there is about a 1
chance in 1 million of a sample mean this high or higher
● Something is definitely strange … we’ll see what
to do later in inferential statistics
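The giraffe calculation can be checked without a normal table. This is a minimal sketch using only the Python standard library; the variable names are illustrative, and the upper-tail probability is computed from the complementary error function rather than from the slide's "normal calculator".

```python
import math

mu, sigma, n = 2400, 300, 50   # claimed mean, claimed sd, sample size
observed_mean = 2600

se = sigma / math.sqrt(n)            # standard deviation of the sample mean
z = (observed_mean - mu) / se        # standard deviations above the mean

# P(Z > z) for a standard normal Z, using P(Z > z) = 0.5 * erfc(z / sqrt(2)).
p_upper = 0.5 * math.erfc(z / math.sqrt(2))

print(f"standard error = {se:.1f}")
print(f"z = {z:.2f}")
print(f"P(sample mean >= 2600) = {p_upper:.2e}")
```

The probability comes out on the order of one in a million, in line with the slide.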
Summary: Chapter 8 – Section 1
● The sample mean is a random variable with a
distribution called the sampling distribution
▪ If the sample size n is sufficiently large (30 or more is
a good rule of thumb), then this distribution is
approximately normal
▪ The mean of the sampling distribution is equal to the
mean of the population
▪ The standard deviation of the sampling distribution is
equal to σ / √n
Chapter 8 – Example 1
● The combined (verbal + quantitative reasoning) score on the GRE is
normally distributed with mean 1066 and standard deviation 191.
(Source: www.ets.org/Media/Tests/GRE/pdf/01210.pdf.) Suppose n
= 15 randomly selected students take the GRE on the same day.
a. What is the probability that a randomly selected student scores above
1100 on the GRE? (0.4286)
b. Describe the sampling distribution of the sample mean. (It is normal
with mean 1066 and standard deviation 191 / √15 ≈ 49.3)
c. What is the probability that a random sample of 15 students has a
mean GRE score that is less than 1100? (0.7549)
d. What is the probability that a random sample of 15 students has a
mean GRE score that is 1100 or above? (0.2451)
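The answers to Example 1 can be reproduced with Python's statistics.NormalDist (available from Python 3.8). This is a sketch, with variable names chosen for illustration. The results differ from the slide answers in the third or fourth decimal place, presumably because the slide values come from a table lookup with z rounded to two decimals.

```python
import math
from statistics import NormalDist

mu, sigma, n = 1066, 191, 15

# (a) A single student: X is normal with mean 1066 and sd 191.
individual = NormalDist(mu, sigma)
p_a = 1 - individual.cdf(1100)

# (b) The sample mean is normal with the same mean and sd sigma / sqrt(n).
se = sigma / math.sqrt(n)
xbar = NormalDist(mu, se)

# (c) and (d) Probabilities for the mean of 15 students.
p_c = xbar.cdf(1100)
p_d = 1 - p_c

print(f"(a) P(X > 1100)     = {p_a:.4f}")
print(f"(b) sd of xbar      = {se:.1f}")
print(f"(c) P(xbar < 1100)  = {p_c:.4f}")
print(f"(d) P(xbar >= 1100) = {p_d:.4f}")
```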
Chapter 8 – Example 2
● In the United States, the year each coin was minted is printed on the coin.
To find the age of a coin, simply subtract the year printed on the coin from
the current year. The ages of circulating pennies are right skewed.
Assume the ages of circulating pennies have a mean of 12.2 years and a
standard deviation of 9.9 years.
a. Based on the information given, can we determine the probability that a
randomly selected penny is over 10 years old? (No, because the population of
the ages of circulating pennies is not normally distributed.)
b. What is the probability that a random sample of 40 circulating pennies has a
mean less than 10 years? (0.0793)
c. What is the probability that a random sample of 40 circulating pennies has a
mean greater than 10 years? (0.9207)
d. What is the probability that a random sample of 40 circulating pennies has a
mean greater than 15 years? Would this be unusual? (0.0367; yes)
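Parts b through d of Example 2 rely on the Central Limit Theorem: with n = 40, the sample mean is treated as approximately normal even though the population is right skewed. The sketch below reproduces the numbers with statistics.NormalDist. Two things in it are assumptions rather than content from the slides: the variable names, and the 5% cutoff used to call a result "unusual". Small differences from the slide answers are expected, since table lookups round z to two decimals.

```python
import math
from statistics import NormalDist

mu, sigma, n = 12.2, 9.9, 40

# By the Central Limit Theorem (n = 40 >= 30), xbar is approximately
# normal with mean mu and standard deviation sigma / sqrt(n).
se = sigma / math.sqrt(n)
xbar = NormalDist(mu, se)

p_b = xbar.cdf(10)        # P(xbar < 10)
p_c = 1 - p_b             # P(xbar > 10)
p_d = 1 - xbar.cdf(15)    # P(xbar > 15)

# Assumed convention: call an event unusual if its probability is below 5%.
unusual = p_d < 0.05

print(f"sd of xbar    = {se:.3f}")
print(f"(b) P(xbar < 10) = {p_b:.4f}")
print(f"(c) P(xbar > 10) = {p_c:.4f}")
print(f"(d) P(xbar > 15) = {p_d:.4f}, unusual: {unusual}")
```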