Binomial and geometric models


Binomial Random Variables
Binomial Probability Distributions
Binomial Random Variables
Through 2/24/2011 NC State’s free-throw
percentage is 69.6% (146th out of 345 in Div. 1).
If in the 2/26/2011 game with GaTech, NCSU
shoots 11 free-throws, what is the probability
that:

NCSU makes exactly 8 free-throws?
NCSU makes at most 8 free-throws?
NCSU makes at least 8 free-throws?
“2-outcome” situations are very
common
Heads/tails
Democrat/Republican
Male/Female
Win/Loss
Success/Failure
Defective/Nondefective

Probability Model for this Common
Situation

Common characteristics
◦ repeated “trials”
◦ 2 outcomes on each trial

Leads to Binomial Experiment
Binomial Experiments

n identical trials
◦ n specified in advance

2 outcomes on each trial
◦ usually referred to as “success” and “failure”
p = “success” probability; q = 1 - p = “failure”
probability; both remain constant from trial to
trial
trials are independent

Binomial Random Variable
The binomial random variable X is
the number of “successes” in the n
trials.
Notation: X has a B(n, p)
distribution, where n is the number
of trials and p is the success
probability on each trial.

Examples
a. Yes; n=10; success=“major repairs within 3 months”; p=.05
b. No; n not specified in advance
c. No; p changes
d. Yes; n=1500; success=“chip is defective”; p=.10
Binomial Probability Distribution
n trials, p = success probability on each trial
probability distribution:
$p(x) = {}_nC_x \, p^x q^{n-x}, \quad x = 0, 1, 2, \ldots, n$
$E(x) = \sum_{x=0}^{n} x \, p(x) = \sum_{x=0}^{n} x \, {}_nC_x \, p^x q^{n-x} = np$
$Var(x) = E(x - \mu)^2 = npq$
Rationale for the Binomial Probability Formula
$P(x) = \frac{n!}{(n - x)! \, x!} \cdot p^x q^{n-x}$
First factor, $\frac{n!}{(n - x)! \, x!}$: number of outcomes with exactly x successes among n trials
Second factor, $p^x q^{n-x}$: probability of x successes among n trials for any one particular order
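The two factors in the formula can be checked numerically. The following is a minimal Python sketch, not part of the original slides; the helper name `binomial_pmf` is an assumption made for illustration. It evaluates the formula for n=10, p=.5 and confirms that the probabilities sum to 1, with mean np and variance npq.

```python
from math import comb

def binomial_pmf(x: int, n: int, p: float) -> float:
    """P(X = x) for X ~ B(n, p): (number of orderings) * (probability of one ordering)."""
    return comb(n, x) * p**x * (1 - p) ** (n - x)

n, p = 10, 0.5
pmf = [binomial_pmf(x, n, p) for x in range(n + 1)]

total = sum(pmf)                                   # should equal 1
mean = sum(x * px for x, px in enumerate(pmf))     # should equal n*p
variance = sum((x - mean) ** 2 * px for x, px in enumerate(pmf))  # should equal n*p*q

print(round(pmf[5], 3))       # p(5) for n=10, p=.5
print(total, mean, variance)
```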
Graph of p(x); x binomial n=10 p=.5;
p(0)+p(1)+ … +p(10)=1
The sum of all the
areas is 1
Think of p(x) as the area
of rectangle above x
p(5)=.246 is the area
of the rectangle above 5
[Figure: Binomial Probability Histogram: n=100, p=.5; horizontal axis x from 30 to 70, vertical axis p(x) from 0 to 0.09]
[Figure: Binomial Probability Histogram: n=100, p=.95; horizontal axis x from 70 to 100, vertical axis p(x) from 0 to 0.18]
Example
A production line produces motor
housings, 5% of which have cosmetic
defects. A quality control manager
randomly selects 4 housings from the
production line. Let x=the number of
housings that have a cosmetic defect.
Tabulate the probability distribution for x.
Solution
(i) D=defective, G=good
outcome   x   P(outcome)
GGGG      0   (.95)(.95)(.95)(.95)
DGGG      1   (.05)(.95)(.95)(.95)
GDGG      1   (.95)(.05)(.95)(.95)
:         :   :
DDDD      4   (.05)^4
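The outcome-by-outcome listing can be completed by brute force. The sketch below is an illustration only (the variable names are assumptions, not from the slides): it enumerates all 16 sequences of D and G, multiplies the per-housing probabilities, and groups the results by the number of defects x.

```python
from collections import defaultdict
from itertools import product

p_defect = 0.05  # probability a housing has a cosmetic defect

counts = defaultdict(int)    # number of outcomes with x defects
dist = defaultdict(float)    # total probability of x defects

for outcome in product("DG", repeat=4):
    x = outcome.count("D")
    # Independent trials: multiply .05 for each D and .95 for each G.
    prob = p_defect**x * (1 - p_defect) ** (4 - x)
    counts[x] += 1
    dist[x] += prob

for x in sorted(dist):
    print(x, counts[x], round(dist[x], 8))
```

The outcome counts per value of x (1, 4, 6, 4, 1) are the binomial coefficients that appear in part (ii) of the solution.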

Solution
(ii) x is a binomial random variable
$p(x) = {}_nC_x \, p^x q^{n-x}, \quad x = 0, 1, 2, \ldots, n$
$n = 4, \ p = .05 \ (q = .95)$
$p(0) = {}_4C_0 (.05)^0 (.95)^4 = .815$
$p(1) = {}_4C_1 (.05)^1 (.95)^3 = .171475$
$p(2) = {}_4C_2 (.05)^2 (.95)^2 = .01354$
$p(3) = {}_4C_3 (.05)^3 (.95)^1 = .00048$
$p(4) = {}_4C_4 (.05)^4 (.95)^0 = .00000625$
Solution
x      0      1         2        3        4
p(x)   .815   .171475   .01354   .00048   .00000625
Example (cont.)
What is the probability that at least 2 of
the housings will have a cosmetic defect?
P(x ≥ 2) = p(2)+p(3)+p(4) = .01354 + .00048 + .00000625 = .01402625

Example (cont.)
What is the probability that at most 1
housing will not have a cosmetic defect?
(at most 1 failure = at least 3 successes)
P(x ≥ 3) = p(3) + p(4) = .00048 + .00000625 =
.00048625
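The slide answers add the rounded table entries. As a cross-check, here is a small Python sketch (the helper `binomial_pmf` is an assumed name, not from the slides) that sums the unrounded probabilities; the results agree with the slide values to about five decimal places.

```python
from math import comb

def binomial_pmf(x: int, n: int, p: float) -> float:
    """P(X = x) for X ~ B(n, p)."""
    return comb(n, x) * p**x * (1 - p) ** (n - x)

n, p = 4, 0.05  # 4 housings, 5% defect rate; "success" = defect
pmf = [binomial_pmf(x, n, p) for x in range(n + 1)]

at_least_2 = sum(pmf[2:])  # P(x >= 2)
at_least_3 = sum(pmf[3:])  # P(x >= 3), i.e. at most 1 housing without a defect

print(f"{at_least_2:.8f}")
print(f"{at_least_3:.8f}")
```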

Using binomial tables; n=20, p=.3
P(x ≤ 5) = .4164
P(x > 8) = 1 - P(x ≤ 8) = 1 - .8867 = .1133 (x > 8 means x = 9, 10, 11, … , 20)
P(x < 9) = P(x ≤ 8) = .8867 (x < 9 means x = 8, 7, 6, … , 0)
P(x ≥ 10) = 1 - P(x ≤ 9) = 1 - .9520 = .0480
P(3 ≤ x ≤ 7) = P(x ≤ 7) - P(x ≤ 2) = .7723 - .0355 = .7368

Binomial n = 20, p = .3 (cont.)
P(2 < x ≤ 9) = P(x ≤ 9) - P(x ≤ 2)
= .9520 - .0355 = .9165
P(x = 8) = P(x ≤ 8) - P(x ≤ 7)
= .8867 - .7723 = .1144
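Every table lookup above reduces to differences of the cumulative probabilities P(x ≤ k). The following Python sketch is an illustration under that assumption; `binomial_cdf` and `F` are hypothetical helper names standing in for the printed table.

```python
from math import comb

def binomial_cdf(k: int, n: int, p: float) -> float:
    """P(X <= k) for X ~ B(n, p); plays the role of the cumulative table."""
    return sum(comb(n, x) * p**x * (1 - p) ** (n - x) for x in range(k + 1))

def F(k: int) -> float:
    """Cumulative probability for the n=20, p=.3 table."""
    return binomial_cdf(k, 20, 0.3)

p_more_than_8 = 1 - F(8)      # P(x > 8)
p_3_to_7 = F(7) - F(2)        # P(3 <= x <= 7)
p_over_2_to_9 = F(9) - F(2)   # P(2 < x <= 9)
p_exactly_8 = F(8) - F(7)     # P(x = 8)

for value in (F(5), p_more_than_8, p_3_to_7, p_over_2_to_9, p_exactly_8):
    print(round(value, 4))
```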

Color blindness
The frequency of color blindness (dyschromatopsia) in the
Caucasian American male population is estimated to be
about 8%. We take a random sample of size 25 from this population.
We can model this situation with a B(n = 25, p = 0.08) distribution.
What is the probability that five individuals or fewer in the sample are color blind?
Use Excel’s “=BINOMDIST(number_s,trials,probability_s,cumulative)”
P(x ≤ 5) = BINOMDIST(5, 25, .08, 1) = 0.9877
What is the probability that more than five will be color blind?
P(x > 5) = 1 - P(x ≤ 5) = 1 - 0.9877 = 0.0123
What is the probability that exactly five will be color blind?
P(x = 5) = BINOMDIST(5, 25, .08, 0) = 0.0329
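For readers without a spreadsheet at hand, the same three answers can be reproduced in Python. This is a sketch only: the function `binomdist` below is a hypothetical stand-in written to mirror the argument order of the spreadsheet call quoted above, not a library function.

```python
from math import comb

def binomdist(number_s: int, trials: int, probability_s: float, cumulative: bool) -> float:
    """Hypothetical stand-in for the spreadsheet call: P(X <= number_s) if cumulative, else P(X = number_s)."""
    def pmf(x: int) -> float:
        return comb(trials, x) * probability_s**x * (1 - probability_s) ** (trials - x)
    if cumulative:
        return sum(pmf(x) for x in range(number_s + 1))
    return pmf(number_s)

five_or_fewer = binomdist(5, 25, 0.08, True)    # P(x <= 5)
more_than_five = 1 - five_or_fewer              # P(x > 5)
exactly_five = binomdist(5, 25, 0.08, False)    # P(x = 5)

print(round(five_or_fewer, 4), round(more_than_five, 4), round(exactly_five, 4))
```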
[Figure: histogram of P(X = x) for B(n = 25, p = 0.08); horizontal axis: number of color blind individuals (x), 0 to 25; vertical axis: 0% to 30%]
x          P(X = x)   P(X <= x)
0          12.44%     12.44%
1          27.04%     39.47%
2          28.21%     67.68%
3          18.81%     86.49%
4          9.00%      95.49%
5          3.29%      98.77%
6          0.95%      99.72%
7          0.23%      99.95%
8          0.04%      99.99%
9          0.01%      100.00%
10 to 25   0.00%      100.00%
Probability distribution and histogram for
the number of color blind individuals among
25 Caucasian males.
What are the mean and standard deviation of the count of
color blind individuals in the SRS of 25 Caucasian American
males?
µ = np = 25*0.08 = 2
σ = √(np(1 - p)) = √(25*0.08*0.92) = 1.36
What if we take an SRS of size 10? Of size 75?
n = 10: µ = 10*0.08 = 0.8, σ = √(10*0.08*0.92) = 0.86
n = 75: µ = 75*0.08 = 6, σ = √(75*0.08*0.92) = 2.35
[Figures: histograms of P(X=x) against number of successes for p = .08, n = 10 and for p = .08, n = 75]
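The three mean and standard deviation calculations follow one pattern, µ = np and σ = √(np(1 - p)). A short Python sketch of that pattern (the function name `binomial_mean_sd` is an assumption for illustration):

```python
from math import sqrt

def binomial_mean_sd(n: int, p: float) -> tuple[float, float]:
    """Mean n*p and standard deviation sqrt(n*p*(1-p)) of a B(n, p) count."""
    return n * p, sqrt(n * p * (1 - p))

p = 0.08  # estimated frequency of color blindness
results = {n: binomial_mean_sd(n, p) for n in (10, 25, 75)}

for n, (mu, sigma) in results.items():
    print(f"n={n}: mean={mu:.2f}, sd={sigma:.2f}")
```

Note that the standard deviation grows with √n while the mean grows with n, which is why the larger-sample histogram looks relatively more concentrated around its mean.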
Recall Free-throw question
Through 2/24/11 NC State’s free-throw percentage was 69.6% (146th in Div. 1).
If in the 2/26/11 game with GaTech, NCSU shoots 11 free-throws, what is the probability that:
1. NCSU makes exactly 8 free-throws?
2. NCSU makes at most 8 free-throws?
3. NCSU makes at least 8 free-throws?
n=11; X=# of made free-throws; p=.696
1. $p(8) = {}_{11}C_8 (.696)^8 (.304)^3$
2. P(x ≤ 8) = .697
3. P(x ≥ 8) = 1 - P(x ≤ 7) = 1 - .4422 = .5578
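The three free-throw answers can be evaluated directly from the binomial formula. The sketch below is illustrative (helper names are assumptions) and also gives a numerical value for p(8), which the slide leaves as an expression.

```python
from math import comb

def binomial_pmf(x: int, n: int, p: float) -> float:
    """P(X = x) for X ~ B(n, p)."""
    return comb(n, x) * p**x * (1 - p) ** (n - x)

n, p = 11, 0.696  # 11 free-throws, 69.6% success probability

exactly_8 = binomial_pmf(8, n, p)                                 # question 1
at_most_8 = sum(binomial_pmf(x, n, p) for x in range(0, 9))       # question 2
at_least_8 = sum(binomial_pmf(x, n, p) for x in range(8, n + 1))  # question 3

print(round(exactly_8, 4), round(at_most_8, 4), round(at_least_8, 4))
```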
Recall from beginning of Lecture
Unit 4: Hardee’s vs The Colonel
Out of 100 taste-testers, 63 preferred
Hardee’s fried chicken, 37 preferred KFC
Evidence that Hardee’s is better? A
landslide?
What if there is no difference in the
chicken? (p=1/2, flip a fair coin)
Is 63 heads out of 100 tosses that unusual?

Use binomial rv to analyze
n=100 taste testers
x=# who prefer Hardee’s chicken
p=probability a taste tester chooses
Hardee’s
If p=.5, P(x ≥ 63) = .0061 (since the
probability is so small, p is probably NOT .5;
p is probably greater than .5, that is,
Hardee’s chicken is probably better).

Recall: Mothers Identify
Newborns





After spending 1 hour with their newborns,
blindfolded and nose-covered mothers were asked
to choose their child from 3 sleeping babies by
feeling the backs of the babies’ hands
22 of 32 women (69%) selected their own newborn
“far better than 33% one would expect…”
Is it possible the mothers are guessing?
Can we quantify “far better”?
Use binomial rv to analyze
n=32 mothers
x=# who correctly identify their own baby
p=probability a mother chooses her own baby
If p=.33, P(x ≥ 22) = .000044 (since the probability
is so small, p is probably NOT .33; p is probably
greater than .33, that is, mothers are probably not
guessing).
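Both the taste-test and the newborn analyses rest on an upper-tail probability P(x ≥ k) computed under a "no effect" value of p. A minimal Python sketch of that calculation follows; `upper_tail` is a hypothetical helper name, and the printed values should land close to the figures quoted on the slides.

```python
from math import comb

def upper_tail(k: int, n: int, p: float) -> float:
    """P(X >= k) for X ~ B(n, p): sum the binomial formula from k up to n."""
    return sum(comb(n, x) * p**x * (1 - p) ** (n - x) for x in range(k, n + 1))

# Taste test: 63 or more of 100 prefer Hardee's if each choice is a fair coin flip.
taste_test = upper_tail(63, 100, 0.5)

# Newborns: 22 or more of 32 mothers correct if each is guessing with p = .33.
mothers = upper_tail(22, 32, 0.33)

print(f"{taste_test:.4f}")
print(f"{mothers:.6f}")
```

In both cases a very small tail probability means the observed count would be rare if the "no effect" value of p were true.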

Geometric Random Variables
Geometric Probability Distributions
Through 2/24/2011 NC State’s free-throw
percentage was 69.6% (146th of 345 in Div.
1). In the 2/26/2011 game with GaTech,
what was the probability that the first
missed free-throw by the ‘Pack occurs on
the 5th attempt?
Binomial Experiments
n identical trials
◦ n specified in advance
2 outcomes on each trial
◦ usually referred to as “success” and “failure”
p = “success” probability; q = 1 - p = “failure” probability;
both remain constant from trial to trial
trials are independent
The binomial rv counts the number of successes
in the n trials
The Geometric Model
A geometric random variable counts the
number of trials until the first success is
observed.
A geometric random variable is completely
specified by one parameter, p, the probability
of success, and is denoted Geom(p).
Unlike a binomial random variable, the
number of trials is not fixed.
The Geometric Model (cont.)
Geometric probability model for Bernoulli trials:
Geom(p)
p = probability of success
q = 1 – p = probability of failure
X = # of trials until the first success occurs
$p(x) = P(X = x) = q^{x-1} p, \quad x = 1, 2, 3, 4, \ldots$
$E(X) = \mu = \frac{1}{p}$
$\sigma = \sqrt{\frac{q}{p^2}}$
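The geometric formulas can be sanity-checked numerically. The sketch below is an illustration, not part of the slides: `geometric_pmf` is an assumed helper name, and the infinite sums are approximated by truncating at 2000 terms, which is far more than needed for p = .25.

```python
from math import sqrt

def geometric_pmf(x: int, p: float) -> float:
    """P(X = x) for X ~ Geom(p): x-1 failures followed by one success."""
    return (1 - p) ** (x - 1) * p

p = 0.25
q = 1 - p
xs = range(1, 2001)  # truncation point is an assumption; the remaining tail is negligible

total = sum(geometric_pmf(x, p) for x in xs)
mean = sum(x * geometric_pmf(x, p) for x in xs)
sd = sqrt(sum((x - mean) ** 2 * geometric_pmf(x, p) for x in xs))

print(round(total, 6), round(mean, 6), round(sd, 6))
print(1 / p, sqrt(q / p**2))  # closed forms for comparison
```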
The Geometric Model (cont.)
The 10% condition: the trials must be
independent. If that assumption is violated, it is
still okay to proceed as long as the sample is
smaller than 10% of the population.
Example: 3% of 33,000 NCSU students are from
New Jersey. If NCSU students are selected 1
at a time, what is the probability that the first
student from New Jersey is the 15th student
selected?
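The slide poses this question without an answer. One way to work it, sketched in Python under the assumption that the geometric model with p = .03 applies (the variable names are illustrative only):

```python
population = 33_000
p = 0.03          # proportion of students from New Jersey
x = 15            # position of the first New Jersey student

# 10% condition: sampling without replacement is not strictly independent,
# but 15 students is far below 10% of 33,000.
ten_percent_ok = x < 0.10 * population

# Geometric model: 14 non-New-Jersey students, then one from New Jersey.
prob_first_nj_is_15th = (1 - p) ** (x - 1) * p

print(ten_percent_ok, round(prob_first_nj_is_15th, 4))
```

Under these assumptions the probability is roughly 2%.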

Example
The American Red Cross says that about 11% of the
U.S. population has Type B blood. A blood drive is
being held in your area.
1. How many blood donors should the American Red
Cross expect to collect from until it gets the first
donor with Type B blood?
Success=donor has Type B blood
X=number of donors until get first donor with Type B
blood
$p = .11; \quad E(X) = \frac{1}{p} = \frac{1}{.11} = 9.09$
Example (cont.)
2. What is the probability that the fourth blood
donor is the first donor with Type B blood?
$p(4) = q^{4-1} p = (.89)^{4-1}(.11) = .89^3 \times .11 = .0775$
Example (cont.)
3. What is the probability that the first Type B blood
donor is among the first four people in line?
p = .11; have to find
$p(1) + p(2) + p(3) + p(4)$
$= (.89^0 \times .11) + (.89^1 \times .11) + (.89^2 \times .11) + (.89^3 \times .11)$
$= .11 + .0979 + .087 + .078 = .3729$
(The terms above are rounded before adding; without rounding, the sum is $1 - .89^4 \approx .3726$.)
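The three Type B blood answers can be reproduced in a few lines. This Python sketch is illustrative (variable names are assumptions); it also shows the complement shortcut P(X ≤ 4) = 1 - q^4, since the first success falls within four trials unless all four trials fail.

```python
p = 0.11        # probability a donor has Type B blood
q = 1 - p

expected_donors = 1 / p                      # question 1: E(X) = 1/p
p_fourth_is_first = q**3 * p                 # question 2: p(4) = q^(4-1) * p
p_within_four = sum(q ** (x - 1) * p for x in range(1, 5))  # question 3
shortcut = 1 - q**4                          # same probability via the complement

print(round(expected_donors, 2), round(p_fourth_is_first, 4))
print(round(p_within_four, 4), round(shortcut, 4))
```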
Geometric Probability Distribution
p = 0.1
[Figure: geometric probability distribution for p = 0.1; horizontal axis x from 1 to 15, vertical axis p(x) from 0 to 0.12]
$p(1) = .9^0 \times .1 = .1$
$p(2) = .9^1 \times .1 = .09$
$p(3) = .9^2 \times .1 = .081$
$p(4) = .9^3 \times .1 = .0729$
$E(X) = \frac{1}{p} = \frac{1}{.1} = 10$
Geometric Probability Distribution
p = 0.25
[Figure: geometric probability distribution for p = 0.25; horizontal axis x from 1 to 15, vertical axis p(x) from 0 to 0.3]
$p(1) = .75^0 \times .25 = .25$
$p(2) = .75^1 \times .25 = .1875$
$p(3) = .75^2 \times .25 = .141$
$p(4) = .75^3 \times .25 = .1055$
$E(X) = \frac{1}{p} = \frac{1}{.25} = 4$
Example
Shanille O’Keal is a WNBA player who makes 25%
of her 3-point attempts.
1. The expected number of attempts until she makes
her first 3-point shot is what value?
$E(X) = \frac{1}{p} = \frac{1}{.25} = 4$
2. What is the probability that the first 3-point shot
she makes occurs on her 3rd attempt?
$p(3) = .75^2 \times .25 = .141$
Question from first slide
Through 2/24/2011 NC State’s free-throw
percentage was 69.6%. In the game with
GaTech what was the probability that the first
missed free-throw by the ‘Pack occurs on the
5th attempt?
“Success” = missed free throw
Success probability p = 1 - .696 = .304
$p(5) = .696^4 \times .304 = .0713$
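As an informal check on this answer, the free-throw sequence can be simulated. The sketch below is an illustration only: the seed, the number of simulated games, and the helper name `attempts_until_first_miss` are all assumptions, and the simulated estimate will only approximate the exact value.

```python
import random

def attempts_until_first_miss(p_miss: float, rng: random.Random) -> int:
    """Shoot until the first miss; return the attempt number of that miss."""
    attempts = 1
    while rng.random() >= p_miss:  # a made free-throw, so keep shooting
        attempts += 1
    return attempts

rng = random.Random(2011)   # fixed seed so the run is repeatable
p_miss = 1 - 0.696          # "success" = missed free-throw
trials = 200_000

hits = sum(attempts_until_first_miss(p_miss, rng) == 5 for _ in range(trials))
estimate = hits / trials            # simulated P(first miss on 5th attempt)
exact = 0.696**4 * p_miss           # geometric formula q^(x-1) * p with x = 5

print(round(exact, 4), round(estimate, 4))
```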
