Sampling Distributions


Chapter 9: Sampling Distributions
9.1 Introduction
• In real life, calculating the parameters of a population is usually prohibitive because populations are very large.
• Rather than investigating the whole population, we take a sample, calculate a statistic related to the parameter of interest, and make an inference.
• The sampling distribution of the statistic is the tool that tells us how close the statistic is to the parameter.
9.2 Sampling Distribution of the Mean
• An example
A die is thrown infinitely many times. Let X represent the number of spots showing on any throw.
The probability distribution of X is

x      1    2    3    4    5    6
p(x)   1/6  1/6  1/6  1/6  1/6  1/6

\[ E(X) = 1(1/6) + 2(1/6) + 3(1/6) + \dots + 6(1/6) = 3.5 \]
\[ V(X) = (1 - 3.5)^2(1/6) + (2 - 3.5)^2(1/6) + \dots + (6 - 3.5)^2(1/6) = 2.92 \]
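The expected value and variance of the die can be checked numerically. The following is a minimal Python sketch; the helper names `expected_value` and `variance` are illustrative and do not come from the slides.

```python
from fractions import Fraction

# Probability distribution of X for a fair six-sided die: p(x) = 1/6 for x = 1..6.
die = {x: Fraction(1, 6) for x in range(1, 7)}

def expected_value(dist):
    """E(X): sum of x * p(x) over all outcomes."""
    return sum(x * p for x, p in dist.items())

def variance(dist):
    """V(X): sum of (x - E(X))^2 * p(x) over all outcomes."""
    mu = expected_value(dist)
    return sum((x - mu) ** 2 * p for x, p in dist.items())

print(float(expected_value(die)))       # 3.5
print(round(float(variance(die)), 2))   # 2.92
```

Exact fractions are used so that the results (7/2 and 35/12) are free of floating-point rounding until they are printed.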
Throwing a die twice – sample mean
• Suppose we want to estimate \(\mu\) from the mean \(\bar{x}\) of a sample of size n = 2.
• What is the distribution of \(\bar{x}\)?
Throwing a die twice – sample mean

Sample  Throws  Mean    Sample  Throws  Mean    Sample  Throws  Mean
1       1,1     1.0     13      3,1     2.0     25      5,1     3.0
2       1,2     1.5     14      3,2     2.5     26      5,2     3.5
3       1,3     2.0     15      3,3     3.0     27      5,3     4.0
4       1,4     2.5     16      3,4     3.5     28      5,4     4.5
5       1,5     3.0     17      3,5     4.0     29      5,5     5.0
6       1,6     3.5     18      3,6     4.5     30      5,6     5.5
7       2,1     1.5     19      4,1     2.5     31      6,1     3.5
8       2,2     2.0     20      4,2     3.0     32      6,2     4.0
9       2,3     2.5     21      4,3     3.5     33      6,3     4.5
10      2,4     3.0     22      4,4     4.0     34      6,4     5.0
11      2,5     3.5     23      4,5     4.5     35      6,5     5.5
12      2,6     4.0     24      4,6     5.0     36      6,6     6.0
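The distribution of \(\bar{x}\) when n = 2

\(\bar{x}\)     1.0   1.5   2.0   2.5   3.0   3.5   4.0   4.5   5.0   5.5   6.0
\(p(\bar{x})\)  1/36  2/36  3/36  4/36  5/36  6/36  5/36  4/36  3/36  2/36  1/36

\[ E(\bar{x}) = 1.0(1/36) + 1.5(2/36) + \dots = 3.5 \]
\[ V(\bar{x}) = (1.0 - 3.5)^2(1/36) + (1.5 - 3.5)^2(2/36) + \dots = 1.46 \]

Note: \(\mu_{\bar{x}} = \mu_x\) and \(\sigma^2_{\bar{x}} = \sigma^2_x / 2\).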
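The distribution of the sample mean for n = 2 can be reproduced by enumerating all 36 equally likely samples. This is a small Python sketch; the variable names are illustrative.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

# All 36 equally likely ordered samples of size n = 2 from a fair die.
samples = list(product(range(1, 7), repeat=2))

# Count how often each sample mean occurs, then convert counts to probabilities.
counts = Counter(Fraction(a + b, 2) for a, b in samples)
dist = {mean: Fraction(c, len(samples)) for mean, c in sorted(counts.items())}

mu_xbar = sum(m * p for m, p in dist.items())
var_xbar = sum((m - mu_xbar) ** 2 * p for m, p in dist.items())

for mean in dist:
    print(f"{float(mean):.1f}  {counts[mean]}/36")
print(float(mu_xbar), round(float(var_xbar), 2))   # 3.5 1.46
```

The variance of the sample mean comes out as 35/24, which is exactly half of the population variance 35/12.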
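Sampling Distribution of the Mean
n = 5:   \(\mu_{\bar{x}} = 3.5\),  \(\sigma^2_{\bar{x}} = .5833 \; (= \sigma^2_x / 5)\)
n = 10:  \(\mu_{\bar{x}} = 3.5\),  \(\sigma^2_{\bar{x}} = .2917 \; (= \sigma^2_x / 10)\)
n = 25:  \(\mu_{\bar{x}} = 3.5\),  \(\sigma^2_{\bar{x}} = .1167 \; (= \sigma^2_x / 25)\)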
Notice that \(\sigma^2_{\bar{x}}\) is smaller than \(\sigma^2_x\). The larger the sample size, the smaller \(\sigma^2_{\bar{x}}\). Therefore, \(\bar{x}\) tends to fall closer to \(\mu\) as the sample size increases.
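The shrinking variance for n = 5, 10 and 25 can be verified exactly rather than by simulation. The sketch below builds the distribution of the sum of n die throws by repeated convolution; the function names `mean_distribution` and `moments` are illustrative, not from the slides.

```python
from fractions import Fraction

def mean_distribution(n, faces=6):
    """Exact distribution of the mean of n fair-die throws, as {mean: probability}."""
    p = Fraction(1, faces)
    sums = {0: Fraction(1)}              # distribution of the running sum
    for _ in range(n):
        nxt = {}
        for s, ps in sums.items():
            for face in range(1, faces + 1):
                nxt[s + face] = nxt.get(s + face, Fraction(0)) + ps * p
        sums = nxt
    return {Fraction(s, n): ps for s, ps in sums.items()}

def moments(dist):
    """Return (mean, variance) of a discrete distribution."""
    mean = sum(x * p for x, p in dist.items())
    var = sum((x - mean) ** 2 * p for x, p in dist.items())
    return mean, var

for n in (5, 10, 25):
    mean, var = moments(mean_distribution(n))
    print(n, float(mean), round(float(var), 4))
```

For each n the mean stays at 3.5 while the variance equals 35/(12n), matching the values .5833, .2917 and .1167 quoted above.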
Sampling Distribution of the Mean
Demonstration: The variance of the sample mean is smaller than the variance of the population.
Let us take samples of two observations from the population 1, 2, 3.
[Figure: repeated samples of two observations drawn from the population; the sample means are 1.5, 2 and 2.5.]
Compare the variability of the population to the variability of the sample mean.
Sampling Distribution of the Mean
Also,
Expected value of the population = (1 + 2 + 3)/3 = 2
Expected value of the sample mean = (1.5 + 2 + 2.5)/3 = 2
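The demonstration with the population 1, 2, 3 can be sketched in a few lines of Python. One assumption is made here: the samples of two observations are drawn without replacement, since that reproduces exactly the three sample means 1.5, 2 and 2.5 listed above.

```python
from itertools import combinations
from statistics import fmean, pvariance

population = [1, 2, 3]

# Assumption: samples of size two are drawn without replacement,
# giving the pairs (1, 2), (1, 3) and (2, 3).
sample_means = [fmean(pair) for pair in combinations(population, 2)]

print(sample_means)                              # [1.5, 2.0, 2.5]
print(fmean(population), fmean(sample_means))    # 2.0 2.0
print(pvariance(population), pvariance(sample_means))
```

Both expected values equal 2, while the variance of the sample means (1/6) is smaller than the variance of the population (2/3).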
The Central Limit Theorem
• If a random sample is drawn from any population, the sampling distribution of the sample mean is approximately normal for a sufficiently large sample size.
• The larger the sample size, the more closely the sampling distribution of \(\bar{x}\) will resemble a normal distribution.
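A simulation can illustrate the theorem. The sketch below is not from the slides: it assumes a strongly skewed exponential population (an arbitrary choice), draws many samples of size n, and measures the skewness of the resulting sample means. A normal distribution has skewness 0, so the skewness should move toward 0 as n grows.

```python
import random
from statistics import fmean, pstdev

def sample_mean_skewness(n, reps=5000, seed=0):
    """Skewness of `reps` simulated sample means, each based on n exponential draws."""
    rng = random.Random(seed)   # fixed seed so the run is repeatable
    means = [fmean(rng.expovariate(1.0) for _ in range(n)) for _ in range(reps)]
    center, spread = fmean(means), pstdev(means)
    return fmean(((m - center) / spread) ** 3 for m in means)

for n in (1, 5, 30):
    print(n, round(sample_mean_skewness(n), 2))
```

The printed skewness drops as n increases, which is the behavior the theorem describes. The exact figures depend on the seed and the number of repetitions.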
Sampling Distribution of the Sample Mean
1. \(\mu_{\bar{x}} = \mu_x\)
2. \(\sigma^2_{\bar{x}} = \sigma^2_x / n\)
3. If x is normal, \(\bar{x}\) is normal. If x is nonnormal, \(\bar{x}\) is approximately normally distributed for a sufficiently large sample size.
Sampling Distribution of the Sample Mean
• Example 9.1
The amount of soda pop in each bottle is normally distributed with a mean of 32.2 ounces and a standard deviation of .3 ounces.
Find the probability that a bottle bought by a customer will contain more than 32 ounces.
Solution
• The random variable X is the amount of soda in a bottle.
\[ P(x > 32) = P\left(\frac{x - \mu}{\sigma_x} > \frac{32 - 32.2}{.3}\right) = P(z > -.67) = 0.7486 \]
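Sampling Distribution of the Sample Mean
• Find the probability that a carton of four bottles will have a mean of more than 32 ounces of soda per bottle.
• Solution
Define the random variable as the mean amount of soda per bottle.
\[ P(\bar{x} > 32) = P\left(\frac{\bar{x} - \mu_{\bar{x}}}{\sigma_{\bar{x}}} > \frac{32 - 32.2}{.3/\sqrt{4}}\right) = P(z > -1.33) = 0.9082 \]
[Figure: the distribution of x (area 0.7486 to the right of 32, \(\mu = 32.2\)) compared with the narrower distribution of \(\bar{x}\) (area 0.9082 to the right of 32, \(\mu_{\bar{x}} = 32.2\)).]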
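Both probabilities in Example 9.1 can be computed directly instead of read from a z table. This is a minimal sketch using Python's standard-library `statistics.NormalDist`; the variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

mu, sigma = 32.2, 0.3    # population mean and standard deviation (ounces)

# One bottle: X is normal with mean 32.2 and standard deviation 0.3.
p_single = 1 - NormalDist(mu, sigma).cdf(32)

# Mean of a carton of n = 4 bottles: standard deviation shrinks to 0.3 / sqrt(4).
n = 4
p_carton = 1 - NormalDist(mu, sigma / sqrt(n)).cdf(32)

print(round(p_single, 4), round(p_carton, 4))
```

The results (about 0.7475 and 0.9088) differ slightly from the table values 0.7486 and 0.9082 because the table lookup rounds z to two decimals.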
Sampling Distribution of the Sample Mean
• Example 9.2
Dean's claim: The average weekly income of B.B.A. graduates one year after graduation is $600.
Suppose the distribution of weekly income has a standard deviation of $100. What is the probability that 25 randomly selected graduates have an average weekly income of less than $550?
Solution
\[ P(\bar{x} < 550) = P\left(\frac{\bar{x} - \mu}{\sigma_{\bar{x}}} < \frac{550 - 600}{100/\sqrt{25}}\right) = P(z < -2.5) = 0.0062 \]
Sampling Distribution of the Sample Mean
• Example 9.2 – continued
If a random sample of 25 graduates actually had an average weekly income of $550, what would you conclude about the validity of the claim that the average weekly income is $600?
Solution
With \(\mu = 600\), the probability of observing a sample mean as low as 550 is very small (0.0062). The claim that the mean weekly income is $600 is probably unjustified.
It is more reasonable to assume that \(\mu\) is smaller than $600, because then a sample mean of $550 becomes more probable.
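The probability used in Example 9.2 can be reproduced with a short calculation. The sketch below uses `statistics.NormalDist` from the Python standard library; the variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

mu, sigma, n = 600, 100, 25          # claimed mean, standard deviation, sample size
standard_error = sigma / sqrt(n)     # 100 / 5 = 20

# P(x-bar < 550) when the sample mean is normal with mean 600 and standard error 20.
p = NormalDist(mu, standard_error).cdf(550)
print(round(p, 4))    # 0.0062
```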
Using Sampling Distributions for Inference
• To make inferences about population parameters we use sampling distributions (as in Example 9.2).
• The symmetry of the normal distribution, along with the sampling distribution of the mean, leads to:
\[ P(-1.96 < z < 1.96) = .95, \quad \text{or} \quad P\left(-1.96 < \frac{\bar{x} - \mu}{\sigma/\sqrt{n}} < 1.96\right) = .95 \]
where \(-z_{.025} = -1.96\) and \(z_{.025} = 1.96\).
This can be written as
\[ P\left(-1.96\frac{\sigma}{\sqrt{n}} < \bar{x} - \mu < 1.96\frac{\sigma}{\sqrt{n}}\right) = .95 \]
which becomes
\[ P\left(\mu - 1.96\frac{\sigma}{\sqrt{n}} < \bar{x} < \mu + 1.96\frac{\sigma}{\sqrt{n}}\right) = .95 \]
Using Sampling Distributions for Inference
[Figure: the standard normal distribution of Z, with area .95 between -1.96 and 1.96 and .025 in each tail, shown next to the normal distribution of \(\bar{x}\) centered at \(\mu = 600\), with area .95 between \(600 - 1.96(100/\sqrt{25})\) and \(600 + 1.96(100/\sqrt{25})\) and .025 in each tail.]
Using Sampling Distributions for Inference
\[ P\left(600 - 1.96\frac{100}{\sqrt{25}} < \bar{x} < 600 + 1.96\frac{100}{\sqrt{25}}\right) = .95 \]
which reduces to
\[ P(560.8 < \bar{x} < 639.2) = .95 \]
• Conclusion
  - There is a 95% chance that the sample mean falls within the interval [560.8, 639.2] if the population mean is 600.
  - Since the sample mean was 550, the population mean is probably not 600.
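The interval for the sample mean follows from the formula \(\mu \pm z_{.025}\,\sigma/\sqrt{n}\). Here is a minimal Python sketch of that arithmetic; the variable names are illustrative, and `inv_cdf` is used to obtain the 1.96 critical value instead of hard-coding it.

```python
from math import sqrt
from statistics import NormalDist

mu, sigma, n = 600, 100, 25
z = NormalDist().inv_cdf(0.975)       # about 1.96: leaves .025 in the upper tail
half_width = z * sigma / sqrt(n)      # about 39.2

lower, upper = mu - half_width, mu + half_width
print(round(lower, 1), round(upper, 1))    # 560.8 639.2

observed_mean = 550
print(lower <= observed_mean <= upper)     # False: 550 lies outside the interval
```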
9.3 Sampling Distribution of a Proportion
• The parameter of interest for nominal data is the proportion of times a particular outcome (success) occurs.
• To estimate the population proportion p we use the sample proportion.
The estimate of p is \(\hat{p} = X/n\), where X is the number of successes and n is the sample size.
• Since X is binomial, probabilities about \(\hat{p}\) can be calculated from the binomial distribution.
• Yet, for inference about \(\hat{p}\) we prefer to use the normal approximation to the binomial.
Normal approximation to the Binomial
• The normal approximation to the binomial works best when
  - the number of experiments (sample size) is large, and
  - the probability of success, p, is close to 0.5.
• For the approximation to provide good results, two conditions should be met:
  \(np \ge 5\); \(n(1 - p) \ge 5\)
Normal approximation to the Binomial
Example
Approximate the binomial probability P(X = 10) when n = 20 and p = .5.
The parameters of the normal distribution used to approximate the binomial are:
\(\mu = np\); \(\sigma^2 = np(1 - p)\)
Normal approximation to the Binomial
Let us build a normal distribution to approximate the binomial P(X = 10).
\(\mu = np = 20(.5) = 10\);
\(\sigma^2 = np(1 - p) = 20(.5)(1 - .5) = 5\);
\(\sigma = 5^{1/2} = 2.24\)
The approximation:
\[ P(X_{Binomial} = 10) = .176 \approx P(9.5 < Y_{Normal} < 10.5) = P\left(\frac{9.5 - 10}{2.24} < Z < \frac{10.5 - 10}{2.24}\right) = .1742 \]
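The exact binomial probability and its normal approximation can be compared side by side. This is a small Python sketch using only the standard library; the variable names are illustrative.

```python
from math import comb, sqrt
from statistics import NormalDist

n, p, k = 20, 0.5, 10

# Exact binomial probability P(X = 10).
exact = comb(n, k) * p**k * (1 - p) ** (n - k)

# Normal approximation with mu = np and sigma^2 = np(1 - p),
# using the continuity correction: P(9.5 < Y < 10.5).
approx_dist = NormalDist(n * p, sqrt(n * p * (1 - p)))
approx = approx_dist.cdf(k + 0.5) - approx_dist.cdf(k - 0.5)

print(round(exact, 4), round(approx, 4))
```

The approximation here comes out near .177 rather than .1742 because no rounding of z to two decimals is involved. Either way it is close to the exact value .176.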
Normal approximation to the Binomial
• More examples of normal approximation to the binomial
\(P(X \le 4) \approx P(Y < 4.5)\)
\(P(X \ge 14) \approx P(Y > 13.5)\)
Approximate Sampling Distribution of a Sample Proportion
• From the laws of expected value and variance, it can be shown that \(E(\hat{p}) = p\) and \(V(\hat{p}) = p(1 - p)/n\).
• If both np > 5 and n(1 - p) > 5, then
\[ z = \frac{\hat{p} - p}{\sqrt{p(1 - p)/n}} \]
is approximately standard normally distributed.
• Example 9.3
A state representative received 52% of the votes in the last election.
One year later the representative wanted to study his popularity.
If his popularity has not changed, what is the probability that more than half of a sample of 300 voters would vote for him?
Solution
The number of respondents who prefer the representative is binomial with n = 300 and p = .52. Thus, np = 300(.52) = 156 and n(1 - p) = 300(1 - .52) = 144 (both greater than 5).
\[ P(\hat{p} > .50) = P\left(\frac{\hat{p} - p}{\sqrt{p(1 - p)/n}} > \frac{.50 - .52}{\sqrt{(.52)(1 - .52)/300}}\right) = P(z > -.69) = .7549 \]
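Example 9.3 can be checked numerically. The sketch below treats the sample proportion as approximately normal with mean p and variance p(1 - p)/n, as stated above; the variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

n, p = 300, 0.52

# Conditions for the normal approximation: both should exceed 5.
print(round(n * p), round(n * (1 - p)))    # 156 144

# Standard error of the sample proportion.
standard_error = sqrt(p * (1 - p) / n)

# P(p-hat > .50) under the normal approximation.
prob = 1 - NormalDist(p, standard_error).cdf(0.50)
print(round(prob, 4))
```

The result is about .756, close to the table-based .7549, which rounds z to -.69.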
9.4 Sampling Distribution of the Difference Between Two Means
• Independent samples are drawn from each of two normal populations.
• We're interested in the sampling distribution of the difference between the two sample means, \(\bar{x}_1 - \bar{x}_2\).
Sampling Distribution of the Difference Between Two Means
• The distribution of \(\bar{x}_1 - \bar{x}_2\) is normal if
  - the two samples are independent, and
  - the parent populations are normally distributed.
• If the two populations are not both normally distributed, but the sample sizes are 30 or more, the distribution of \(\bar{x}_1 - \bar{x}_2\) is approximately normal.
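Sampling Distribution of the Difference Between Two Means
• Applying the laws of expected value and variance we have:
\[ E(\bar{x}_1 - \bar{x}_2) = E(\bar{x}_1) - E(\bar{x}_2) = \mu_1 - \mu_2 \]
\[ V(\bar{x}_1 - \bar{x}_2) = V(\bar{x}_1) + V(\bar{x}_2) = \frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2} \]
• We can define:
\[ Z = \frac{(\bar{x}_1 - \bar{x}_2) - (\mu_1 - \mu_2)}{\sqrt{\dfrac{\sigma_1^2}{n_1} + \dfrac{\sigma_2^2}{n_2}}} \]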
Sampling Distribution of the Difference Between Two Means
Example 9.4
The starting salaries of MBA students from two universities (WLU and UWO) are $62,000 (standard deviation = $14,500) and $60,000 (standard deviation = $18,300).
What is the probability that the sample mean of WLU students will exceed the sample mean of UWO students? (\(n_{WLU} = 50\); \(n_{UWO} = 60\))
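Sampling Distribution of the Difference Between Two Means
• Example 9.4 – Solution
We need to determine \(P(\bar{x}_1 - \bar{x}_2 > 0)\).
\(\mu_1 - \mu_2\) = 62,000 - 60,000 = $2,000
\[ \sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}} = \sqrt{\frac{14{,}500^2}{50} + \frac{18{,}300^2}{60}} = 3{,}128 \text{ (dollars)} \]
\[ P(\bar{x}_1 - \bar{x}_2 > 0) = P\left(\frac{(\bar{x}_1 - \bar{x}_2) - (\mu_1 - \mu_2)}{\sqrt{\dfrac{\sigma_1^2}{n_1} + \dfrac{\sigma_2^2}{n_2}}} > \frac{0 - 2000}{3128}\right) = P(z > -.64) = .5 + .2389 = .7389 \]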
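The calculation in Example 9.4 can be sketched in Python as follows. The variable names are illustrative; the approach simply applies the mean and variance of the difference between two sample means given earlier in this section.

```python
from math import sqrt
from statistics import NormalDist

mu1, sigma1, n1 = 62_000, 14_500, 50    # WLU
mu2, sigma2, n2 = 60_000, 18_300, 60    # UWO

# Mean and standard error of the difference between the two sample means.
diff_mean = mu1 - mu2
diff_se = sqrt(sigma1**2 / n1 + sigma2**2 / n2)
print(diff_mean, round(diff_se))    # 2000 3128

# P(x-bar_1 - x-bar_2 > 0) under the normal model.
prob = 1 - NormalDist(diff_mean, diff_se).cdf(0)
print(round(prob, 4))
```

The probability comes out near .7387, in line with the table-based .7389 obtained after rounding z to -.64.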