STAT 113 - Purdue University

Chapter 5.6:
Hypergeometric Distribution, Binomial Approx. to HG
Chris Morgan, MATH G160
[email protected]
January 8, 2012
Lecture 13
1
2
Hypergeometric Distribution
With the binomial distribution you sample with
replacement and count the number of successes after a set
number of trials. But what if you sample without
replacement?
Now the trials are no longer independent and we can no
longer use the binomial to model this situation. We need to
look for another type of distribution that will describe
these problems.
3
Hypergeometric Distribution
- Given a population with N members
- We are interested in an outcome that can be classified
as a success or a failure
- Let the probability of success in the population be p
- Draw a sample of size n from this population without
replacement
4
Hypergeometric Distribution
Examples:
- The probability of a full house in a poker hand.
- The probability of 3 brown M&M’s in a selection of 5
M&M’s from a bag with 20 brown M&M’s and 30 other
colors.
- The probability of selecting 3 out of 5 defective parts
on a moving conveyor belt containing 100 parts.
5
Hypergeometric Distribution
Notation: X ~ Hyp(N, n, p)
PMF:
$$P(X = x) = \frac{\binom{m}{x}\binom{N-m}{n-x}}{\binom{N}{n}} = \frac{\binom{Np}{x}\binom{N(1-p)}{n-x}}{\binom{N}{n}}$$
where: x = the # of successes
n = the # of trials, or how many we’re choosing
f(x) = the probability of x successes in n trials
N = # of elements in population
m = # of elements in population labeled a success
p = probability of success in ENTIRE population (m/N)
6
Hypergeometric Distribution
Expectation and Variance:
$$E(X) = \frac{mn}{N}$$
$$\mathrm{Var}(X) = \frac{m\,n\,(N-n)(N-m)}{(N-1)\,N^2}$$
7
Hypergeometric Distribution
PMF:
$$P(X = x) = \frac{\binom{m}{x}\binom{N-m}{n-x}}{\binom{N}{n}}$$
$\binom{N}{n}$ is the number of ways n elements can be selected from a population of
size N
$\binom{m}{x}$ is the number of ways that x successes can be selected from a total of
m successes in the population
$\binom{N-m}{n-x}$ is the number of ways that n-x failures can be selected from a total of
N-m failures in the population
8
Hypergeometric Distribution
Notation: X ~ Hyp(N, n, p)
PMF:
$$P(X = x) = \frac{\binom{r}{x}\binom{N-r}{n-x}}{\binom{N}{n}}$$
where: x = the # of successes
n = the # of trials, or how many we’re choosing
f(x) = the probability of x successes in n trials
N = # of elements in population
r = # of elements in population labeled a success
p = probability of success in ENTIRE population
9
Hypergeometric Distribution
Expectation and Variance:
$$E(X) = n\left(\frac{r}{N}\right)$$
$$\mathrm{Var}(X) = n\left(\frac{r}{N}\right)\left(1 - \frac{r}{N}\right)\left(\frac{N-n}{N-1}\right)$$
10
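As a quick sanity check on the expectation and variance formulas, the Python sketch below compares them with moments computed directly from the PMF. The function names are illustrative, and the test values (N = 50, r = 20, n = 5, taken from the M&M example) are just one convenient choice.

```python
from math import comb


def hyp_pmf(x, N, n, r):
    """P(X = x): x successes in n draws without replacement
    from N items, r of which are successes."""
    if x < 0 or x > n:
        return 0.0
    # math.comb returns 0 when asked to choose more items than are available
    return comb(r, x) * comb(N - r, n - x) / comb(N, n)


def hyp_mean(N, n, r):
    # E(X) = n (r / N)
    return n * (r / N)


def hyp_var(N, n, r):
    # Var(X) = n (r/N) (1 - r/N) (N - n)/(N - 1)
    p = r / N
    return n * p * (1 - p) * (N - n) / (N - 1)


def moments_from_pmf(N, n, r):
    """Mean and variance obtained by summing over the PMF directly."""
    mean = sum(x * hyp_pmf(x, N, n, r) for x in range(n + 1))
    second = sum(x * x * hyp_pmf(x, N, n, r) for x in range(n + 1))
    return mean, second - mean ** 2


N, n, r = 50, 5, 20
print("formulas:", hyp_mean(N, n, r), hyp_var(N, n, r))
print("from PMF:", moments_from_pmf(N, n, r))
```

The two printed pairs should agree up to floating-point rounding.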
Hypergeometric Distribution
PMF:
$$P(X = x) = \frac{\binom{r}{x}\binom{N-r}{n-x}}{\binom{N}{n}}$$
$\binom{N}{n}$ is the number of ways n elements can be selected from a population of
size N
$\binom{r}{x}$ is the number of ways that x successes can be selected from a total of
r successes in the population
$\binom{N-r}{n-x}$ is the number of ways that n-x failures can be selected from a total of
N-r failures in the population
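The counting argument translates almost directly into code. Here is a minimal Python sketch using the standard library's math.comb. The function name hypergeom_pmf is an illustrative choice, and the numbers are the M&M example (20 brown, 30 other colors, 5 selected, 3 brown).

```python
from math import comb


def hypergeom_pmf(x, N, n, r):
    """P(X = x) for n draws without replacement from a population
    of N items containing r successes."""
    if x < 0 or x > n:
        return 0.0
    ways_successes = comb(r, x)          # choose x of the r successes
    ways_failures = comb(N - r, n - x)   # choose n-x of the N-r failures
    ways_total = comb(N, n)              # choose any n of the N items
    return ways_successes * ways_failures / ways_total


# M&M example: N = 50 candies, r = 20 brown, n = 5 selected, x = 3 brown
p_three_brown = hypergeom_pmf(3, N=50, n=5, r=20)
total = sum(hypergeom_pmf(x, 50, 5, 20) for x in range(6))

print("P(X = 3) =", p_three_brown)
print("sum over all x =", total)
```

Summing the PMF over every possible x is a cheap way to confirm the function is a valid probability distribution.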
11
Hypergeometric Example #1
A bag of Skittles has 20 reds and 80 pieces of other colors.
Find the probability that you randomly select 4 reds in a
handful of 10 Skittles…
a) With replacement
b) Without replacement
12
Hypergeometric Example #1a
a) With replacement:
X ~ Bin(n = 10, p = 0.2)
$$P(X = 4) = \binom{10}{4}(0.2)^4(0.8)^6 \approx 0.0881$$
13
Hypergeometric Example #1b
b) Without replacement:
X ~ Hyp(N=100, n=10, p=0.2)
$$P(X = 4) = \frac{\binom{Np}{x}\binom{N(1-p)}{n-x}}{\binom{N}{n}} = \frac{\binom{20}{4}\binom{80}{6}}{\binom{100}{10}} \approx 0.0841$$
14
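Parts (a) and (b) of the Skittles example can be reproduced with a few lines of Python. This is only a sketch: the helper names binom_pmf and hypergeom_pmf are illustrative, not from the lecture.

```python
from math import comb


def binom_pmf(x, n, p):
    """P(X = x) for n independent trials (sampling with replacement)."""
    return comb(n, x) * p ** x * (1 - p) ** (n - x)


def hypergeom_pmf(x, N, n, r):
    """P(X = x) for n draws without replacement, r successes among N items."""
    return comb(r, x) * comb(N - r, n - x) / comb(N, n)


# 20 reds among 100 Skittles, handful of 10, exactly 4 reds
with_replacement = binom_pmf(4, n=10, p=0.2)
without_replacement = hypergeom_pmf(4, N=100, n=10, r=20)

print(round(with_replacement, 4))     # 0.0881
print(round(without_replacement, 4))  # 0.0841
```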
Hypergeometric Example #1c
c) What is the expected number of red Skittles in a handful
of 20 pieces?
With replacement:
E(X) = np = 20(0.2) = 4
Without replacement:
E(X) = n(r/N) = 20(20/100) =4
15
Hypergeometric Example #1c
c) What is the variance of the number of red Skittles in a
handful of 20 pieces?
With replacement:
Var(X) = np(1-p) = 20(0.2)(0.8) = 3.2
Without replacement:
$$\mathrm{Var}(X) = \frac{N-n}{N-1}\,np(1-p) = \frac{100-20}{100-1}\cdot 20(0.2)(0.8) \approx 2.5859$$
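The only difference between the two variances is the factor (N - n)/(N - 1), which shrinks the variance when sampling without replacement. A short Python sketch of the comparison, with illustrative function names:

```python
def binom_var(n, p):
    """Variance when sampling with replacement: np(1 - p)."""
    return n * p * (1 - p)


def hypergeom_var(N, n, r):
    """Variance when sampling without replacement:
    the binomial variance times (N - n)/(N - 1)."""
    p = r / N
    return (N - n) / (N - 1) * n * p * (1 - p)


# 20 reds among 100 Skittles, handful of 20
var_with = binom_var(20, 0.2)
var_without = hypergeom_var(N=100, n=20, r=20)

print(round(var_with, 4))     # 3.2
print(round(var_without, 4))  # 2.5859
```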
16
Hypergeometric Example #2
In a jar there are 20,000 coins, 500 of which are quarters.
You select 5 coins randomly. What is the probability that
you get exactly 2 quarters?
a) With replacement?
b) Without replacement?
17
Hypergeometric Example #3
A marine biologist has been tracking manatees in the
Miami region. There are a total of 200 manatees in the
region, and 80 of them have been tagged with their
information recorded. Each day he will take a random
sample of 12 manatees (without replacement) and will
continue to record information on those that have been
tagged. Let T be the number of manatees that have been
tagged in your sample.
T ~ Hyp(N=200, n=12, p=80/200=0.4)
18
Hypergeometric Example #3a
N = number in POPULATION
n = how many we’re choosing
r = number of “desirable” objects or “success” objects in
population
p = probability of success in ENTIRE population = r/N
What is the probability you choose exactly 3 tagged
manatees?
19
Hypergeometric Example #3b
Given that you have fewer than 4 tagged manatees, what is the
probability you have exactly 3 tagged manatees?
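One way to set up this conditional probability is P(T = 3 | T < 4) = P(T = 3) / P(T ≤ 3), since the event T = 3 is contained in T < 4. The Python sketch below assumes the parameters stated for T (N = 200, n = 12, r = 80); the helper name is illustrative.

```python
from math import comb


def hypergeom_pmf(x, N, n, r):
    """P(X = x) for n draws without replacement, r successes among N items."""
    return comb(r, x) * comb(N - r, n - x) / comb(N, n)


N, n, r = 200, 12, 80

p_exactly_3 = hypergeom_pmf(3, N, n, r)
# "fewer than 4" means T is 0, 1, 2 or 3
p_fewer_than_4 = sum(hypergeom_pmf(k, N, n, r) for k in range(4))

p_3_given_fewer_than_4 = p_exactly_3 / p_fewer_than_4
print(p_3_given_fewer_than_4)
```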
20
Hypergeometric Example #3c
If he continues this same procedure for 4 days (all days
independent of one another), what is the probability that
he has exactly 3 tagged manatees in his sample all 4 days?
How many tagged manatees do you expect to see in a sample
of 12?
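Because the days are independent, the probability of the same outcome on all 4 days is the single-day probability raised to the 4th power, and the expected count uses E(T) = n(r/N). A small Python sketch under the same assumed parameters (N = 200, n = 12, r = 80), with illustrative names:

```python
from math import comb


def hypergeom_pmf(x, N, n, r):
    """P(X = x) for n draws without replacement, r successes among N items."""
    return comb(r, x) * comb(N - r, n - x) / comb(N, n)


N, n, r = 200, 12, 80

p_3_one_day = hypergeom_pmf(3, N, n, r)
# independence across days lets us multiply the daily probabilities
p_3_all_four_days = p_3_one_day ** 4

expected_tagged = n * (r / N)

print(p_3_all_four_days)
print(expected_tagged)
```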
21
Hypergeometric Example #4
How often do we REALLY know the population size?
Collecting records from all the marine biologists in Florida,
we have a total of 500 tagged manatees. How do we
estimate the population size?
Suppose we sample from this LARGE population: say there are a
total of 2,000 manatees (500 of them tagged), and we select 10
of them.
22
Hypergeometric Example #4a
1. What is the exact distribution for the number of tagged
manatees in our sample?
2. What is the exact probability we have exactly 4 tagged
manatees in our sample?
23
Hypergeometric Example #4b
3. What is a good approximation for the number of tagged
manatees in our sample?
4. What is the approximate probability we select 4 tagged
manatees in the sample?
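The exact and approximate answers for this example can be compared numerically. The sketch below assumes the numbers stated in the example (N = 2,000, 500 tagged, sample of 10) and uses the binomial with p = 500/2000 as the approximation; function names are illustrative.

```python
from math import comb


def hypergeom_pmf(x, N, n, r):
    """Exact: n draws without replacement, r successes among N items."""
    return comb(r, x) * comb(N - r, n - x) / comb(N, n)


def binom_pmf(x, n, p):
    """Approximation: treat the n draws as independent with p = r/N."""
    return comb(n, x) * p ** x * (1 - p) ** (n - x)


N, n, r = 2000, 10, 500

exact = hypergeom_pmf(4, N, n, r)
approx = binom_pmf(4, n, r / N)

print("exact (hypergeometric):", exact)
print("approx (binomial):     ", approx)
print("absolute difference:   ", abs(exact - approx))
```

Here N = 2,000 is far larger than 20n = 200, so the two values should be very close.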
24
Hypergeometric Example #5
Little Johnny has a jar containing 10 blue marbles and 12
red marbles. He reaches into the jar and selects 5 marbles
without replacement. Let X denote the number of red
marbles he obtains.
a) Identify the distribution and parameters corresponding
to the random variable X.
25
Hypergeometric Example #5
b) What is the probability Johnny obtains exactly 3 red
marbles?
c) If Johnny repeats this experiment a large number of
times, on average how many red marbles can he expect to
obtain?
26
Hypergeometric Example #6a
In a certain mid-west town consisting of 100 residents,
60% are in favor of raising the local sales tax rate while the
other 40% are opposed. Suppose a sample of 10 residents
is taken without replacement. Let X denote the number of
residents who are in favor of raising taxes.
a) Identify the distribution and parameters corresponding
to the random variable X.
27
Hypergeometric Example #6b
b) What is the probability of obtaining at least 9 residents
who are in favor of raising taxes?
c) What number of residents in favor of raising taxes can
we expect to obtain?
28
Hypergeometric Example #7a
Axline Computers manufactures personal computers at
two plants, one in Texas and the other in Hawaii. The
Texas plant has 40 employees; the Hawaii plant has 20. A
random sample of 10 employees is to be asked to fill out a
benefits questionnaire. Let X be the number of workers from Hawaii in the sample.
a) What is the probability that none of the employees in the
sample are from the Hawaii plant?
29
Hypergeometric Example #7b
b) What is the probability that one of the employees in the
sample works at the plant in Hawaii?
c) What is the probability that two or more of the
employees in the sample work at the plant in Hawaii?
d) What is the expected number of employees from the
Hawaii plant to be included in the sample?
30
Approximating Hypergeometric
We can approximate the hypergeometric distribution by
the binomial distribution if: N > 20*n
This is because N is so big that there is very little chance of
drawing the same object twice. So even though HG is without
replacement and Bin is with replacement, with such a
large N sampling with replacement behaves almost like
sampling without replacement.
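To see the rule of thumb in action, the Python sketch below checks N > 20*n and measures the largest gap between the hypergeometric and binomial PMFs. The function names are illustrative, and the two parameter sets are borrowed from the Skittles example and the baseball convention example.

```python
from math import comb


def hypergeom_pmf(x, N, n, r):
    return comb(r, x) * comb(N - r, n - x) / comb(N, n)


def binom_pmf(x, n, p):
    return comb(n, x) * p ** x * (1 - p) ** (n - x)


def binomial_approx_ok(N, n):
    """Rule of thumb from the slide: approximate when N > 20*n."""
    return N > 20 * n


def max_pmf_gap(N, n, r):
    """Largest absolute difference between the two PMFs over x = 0..n."""
    p = r / N
    return max(abs(hypergeom_pmf(x, N, n, r) - binom_pmf(x, n, p))
               for x in range(n + 1))


cases = {
    "Skittles (N=100, n=10, r=20)": (100, 10, 20),
    "Convention (N=5000, n=10, r=3500)": (5000, 10, 3500),
}
for label, (N, n, r) in cases.items():
    print(label, binomial_approx_ok(N, n), max_pmf_gap(N, n, r))
```

The case that satisfies the rule should show a much smaller gap than the one that does not.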
31
Approximation Example #1
You roll two 20-sided dice 400 times. Let X be the number
of double 20’s you see. Find the approximate probability
you see 2 double 20’s.
32
Approximation Example #2
A shoe store has 2,000 pairs of shoes, 800 are men’s
shoes, and the rest are women’s. You randomly select 4
pairs of shoes without replacement. What is the
approximate probability you select exactly 2 pairs of men’s
shoes?
33
Approximation Example #3
A Chicago baseball convention has 5,000 attendees
consisting of 3,500 Cubs fans and 1,500 White Sox fans.
Ten people are randomly chosen to participate in a contest
to win World Series tickets. What is the approximate
probability exactly 7 Cubs fans are selected?
34
Approximation Example #4
Suppose an earthquake will occur somewhere in California
each day with probability 0.005. Assuming earthquake
occurrences are weakly dependent from day to day, find the
approximate probability Californians will experience no
earthquakes this year. How many earthquakes can
Californians expect to experience in 2010?
35