Transcript + P - CUNY

Probability
Chapter 3
Prof. Felix Apfaltrer
[email protected]
Office:N518
Office Hours: 10:30am-noon
Phone: 212-220 7421
False positives and negatives
Pregnancy test results:
                        Positive test result         Negative test result
                        (test indicates pregnant)    (test indicates not pregnant)
Subject pregnant                  80                           5
Subject not pregnant               3                          11
• False positive: test incorrectly indicates woman pregnant when she is not.
• False negative: test incorrectly indicates woman is not pregnant when she is pregnant.
• True positive: test correctly indicates woman pregnant when she is.
• True negative: test correctly indicates woman not pregnant when she is not.
• Test sensitivity: the probability of a true positive, i.e. that the test is positive given that the subject is pregnant.
• Test specificity: the probability of a true negative, i.e. that the test is negative given that the subject is not pregnant.
• Ex: Abbot test pack indicates that their urine test has a 0.2% false positive and a 0.6% false negative rate.
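As a small illustration, sensitivity and specificity can be computed from the four counts of the pregnancy test table (80, 5, 3, 11). This is only a sketch; the variable names are chosen here for illustration and are not from the slides.

```python
from fractions import Fraction

# Counts from the pregnancy test table (variable names are illustrative).
true_pos, false_neg = 80, 5    # subject pregnant: positive / negative result
false_pos, true_neg = 3, 11    # subject not pregnant: positive / negative result

# Sensitivity: P(test positive | subject pregnant).
sensitivity = Fraction(true_pos, true_pos + false_neg)

# Specificity: P(test negative | subject not pregnant).
specificity = Fraction(true_neg, true_neg + false_pos)

print("sensitivity:", sensitivity, float(sensitivity))
print("specificity:", specificity, float(specificity))
```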
Overview
• Rare event rule: If under a
given assumption (lottery is
fair) the probability of a
particular observed event (5
consecutive lottery wins by
the same person) is
extremely small, the
assumption is probably not
correct.
Fundamentals
Definitions:
• Procedure: an action whose outcome (result) is random, e.g. rolling a die, rolling 2 dice, tossing a coin, …
• Event: any collection of outcomes of a procedure.
• Simple event: an event that cannot be simplified any further.
• Sample space of a procedure: the set of all simple events.
Notation:
• P denotes a probability.
• A, B, C denote specific events.
• P(A) denotes the probability of the event A occurring.
Examples:
• Procedure: rolling a die, 2 dice, …
• Event: For 1 die, any of 1, 2, 3, 4, 5, 6, “even”, “greater than 3”. For 2 dice: “sum is 7”, “sum is bigger than 10”, “1-1”, “1-2”, “2-1”, “both even”.
• Simple events: for 1 die: 1, 2, 3, 4, 5, 6. For 2 dice: 1-1, 1-2, 1-3, 1-4, 1-5, 1-6, 2-1, 2-2, 2-3, 2-4, 2-5, 2-6, 3-1, …, 6-6.
• Sample space of a procedure: the set of all simple events.
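The two-dice examples can be made concrete by listing the sample space and picking out events as subsets of it. The sketch below assumes two ordinary six-sided dice; names such as sample_space and sum_is_7 are illustrative.

```python
from itertools import product

faces = range(1, 7)

# Sample space for 2 dice: every ordered pair (first die, second die).
sample_space = list(product(faces, repeat=2))

# Events are collections of simple events (subsets of the sample space).
sum_is_7 = [o for o in sample_space if sum(o) == 7]
sum_bigger_than_10 = [o for o in sample_space if sum(o) > 10]
both_even = [o for o in sample_space if o[0] % 2 == 0 and o[1] % 2 == 0]

print(len(sample_space), "simple events")
print("sum is 7:", sum_is_7)
print("sum is bigger than 10:", sum_bigger_than_10)
print("both even:", both_even)
```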
Defining a probability
• Relative Frequency Approach: Observe a procedure a large number of times and count the number of times that event A occurs, then P(A) is estimated by
  P(A) = (number of times A occurs) / (number of trials)
• Classical Approach: If a procedure has n simple (different) events that can occur that are equally likely, and there are s different ways that A can occur, then
  P(A) = (number of ways A can occur) / (number of simple events) = s/n
• Subjective Probability: P(A), the probability of the event A, is estimated based on knowledge of relevant circumstances.
Examples:
• A tack falls up: repeat the experiment 1000 times and count how many times the tack falls up, then P(A) is the ratio of the number of times it falls up over the number of times the tack was thrown.
• Rolling a die: assuming the die is not loaded, each face has the same chance of landing face up.
  P(even) = (# of ways face even) / (total # of options) = 3/6
• Weather forecast: need to be an expert to estimate wisely if it will rain tomorrow or not.
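The first two approaches translate directly into two small functions. This is a sketch, not part of the slides: the function names are illustrative, and the list of tack throws is made up purely to show the calculation.

```python
from fractions import Fraction

def relative_frequency(observations, event):
    """Estimate P(A) as (times A occurred) / (number of trials)."""
    hits = sum(1 for outcome in observations if event(outcome))
    return Fraction(hits, len(observations))

def classical(sample_space, event):
    """P(A) = s / n for n equally likely simple events."""
    s = sum(1 for outcome in sample_space if event(outcome))
    return Fraction(s, len(sample_space))

# Classical approach: fair die, event "face is even".
die = [1, 2, 3, 4, 5, 6]
p_even = classical(die, lambda face: face % 2 == 0)

# Relative frequency approach: hypothetical tack throws (made-up data).
tack_throws = ["up", "down", "up", "up", "down", "up", "down", "up", "up", "down"]
p_up_estimate = relative_frequency(tack_throws, lambda t: t == "up")

print(p_even)          # 1/2
print(p_up_estimate)   # 3/5
```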
More examples
• Flying on a commercial plane. Find the probability that a randomly selected adult has flown on a plane.
  – 2 events: flown, or not.
  – Events not equally likely (cannot use classical approach).
  – Use relative frequency approach. Gallup poll: of 815 randomly selected adults, 710 indicated that they have flown.
  P(flew on commercial plane) = 710/815 = 0.87
• Roulette: Bet on number 13 in a roulette game. What is the probability that you will lose?
  – 38 slots, all equally likely, use classical approach. 37 result in loss.
  P(loss) = 37/38
• Meteorites: What is the probability that your house will be hit by a meteorite?
  – In absence of historical data, need the 3rd (subjective) approach. We know the chance is very small, say 0.000,000,001. This is a subjective estimate, a general ballpark.
Law of large numbers
Law of large numbers: As a procedure is repeated again and again, the relative frequency probability s/n of an event tends to approach the actual probability P(A).
• Example: 2 boys, 1 girl. What is the probability that, when a couple has 3 children, exactly 2 out of the 3 are boys?
  Assuming that having boys or girls is equally likely, use classical approach. Options are:
  – boy-boy-boy
  – boy-boy-girl
  – boy-girl-boy
  – boy-girl-girl
  – girl-boy-boy
  – girl-boy-girl
  – girl-girl-boy
  – girl-girl-girl
  8 possible outcomes, 3 correspond to exactly 2 boys:
  P(exactly 2 boys) = 3/8 = 0.375
• Example: Death penalty. In a Gallup poll, adults are randomly selected and asked if they are in favor of or against the death penalty. The responses include 319 who are for it, 133 who are against it, and 39 who have no opinion. Based on these results, estimate the probability that a randomly selected person is in favor of the death penalty.
  for           319
  against       133
  no opinion     39
  total         491
  P(for) = 319/491 = 0.65
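The law of large numbers can be illustrated with the "exactly 2 boys out of 3 children" event: the classical value is 3/8, and a simulated relative frequency should drift toward it as the number of trials grows. The sketch below assumes boys and girls are equally likely; the function name and the seed are illustrative choices.

```python
import random
from itertools import product

# Classical approach: enumerate the 8 equally likely outcomes.
outcomes = list(product("BG", repeat=3))
exact = sum(1 for o in outcomes if o.count("B") == 2) / len(outcomes)

def relative_frequency(trials, seed=0):
    """Simulate `trials` families of 3 children and return the
    fraction of them with exactly 2 boys."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        children = [rng.choice("BG") for _ in range(3)]
        if children.count("B") == 2:
            hits += 1
    return hits / trials

print("classical:", exact)
for n in (10, 100, 10_000, 100_000):
    print(n, "trials:", relative_frequency(n))
```

With few trials the estimate can be far from 0.375; with many trials it is typically close.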
Complementary probabilities and properties
• Thanksgiving day. What is the probability that Thanksgiving day falls on a
  a) Wednesday?
  b) Thursday?
  – Thanksgiving is always on a Thursday!
  a) Impossible: P(Thanksgiving on Wed) = 0
  b) Always true: P(Thanksgiving on Thu) = 1
Properties:
  – The probability of the impossible event is 0. P(ø) = 0.
  – The probability of the certain event is 1. P(sample space) = 1.
  – For any event A, 0 ≤ P(A) ≤ 1.
    For any event A, P(A) ≥ 0; P(A) = 0 only if A cannot happen.
    For any event A, P(A) ≤ 1; P(A) = 1 only if A happens for sure.
  – If Ac denotes the complement event to A, then P(A) + P(Ac) = 1.
Examples:
• If X denotes the number that the face of a die shows when it lands, then
  – P( X = 7 ) = 0
  – P( X ≤ 7 ) = 1
  – P( X ≥ 0 ) = 1
  – P( X not even ) = 1 - P( X even )
  – P( { X ≤ 2 }c ) = 1 - P( { X ≤ 2 } ) = 1 - 2/6 = 4/6 = 2/3 = P( X > 2 )
• If Y denotes the sum of the numbers on the faces when throwing 2 dice:
  – P( Y = 1 ) = 0
  – P( 2 ≤ Y ≤ 12 ) = 1
  – P( Y = 4 ) = 3/36, namely 1-3, 2-2, and 3-1
  – P( { Y = 2 }c ) = 1 - P( Y = 2 ) = 1 - 1/36 = 35/36
HW: p.120 #1-7
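The two-dice statements about Y can be checked by counting over the 36 equally likely outcomes. A minimal sketch, assuming fair dice; the helper prob is an illustrative name.

```python
from fractions import Fraction
from itertools import product

rolls = list(product(range(1, 7), repeat=2))  # 36 equally likely outcomes

def prob(event):
    """Classical probability of `event` over the two-dice sample space."""
    return Fraction(sum(1 for r in rolls if event(r)), len(rolls))

p_y_is_1 = prob(lambda r: sum(r) == 1)          # impossible event
p_y_in_range = prob(lambda r: 2 <= sum(r) <= 12)  # certain event
p_y_is_4 = prob(lambda r: sum(r) == 4)
p_y_not_2 = 1 - prob(lambda r: sum(r) == 2)     # complement rule

print(p_y_is_1, p_y_in_range, p_y_is_4, p_y_not_2)
```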
Venn diagrams
Addition Rule
A compound event is an event combining 2 or more simple events.
Notation:
P(A ∩ B): intersection of A and B (both A and B occur)
P(A ∪ B): union of A and B (either A or B or both occur)
[Venn diagrams: overlapping events A and B sharing the region A ∩ B; non-overlapping (disjoint) events A and B]
Addition Rule:
P(A ∪ B) = P(A) + P(B) – P(A ∩ B)
Mendel: hybridization experiments. Peas with purple (p) and white (w) flowers, green (g) and yellow (y) pods.
Flowers: 8 p, 6 w. Pods: 9 g, 5 y. (14 peas in total.)
P(g ∪ p) = P(g) + P(p) – P(g ∩ p) = 9/14 + 8/14 – 5/14
Idea: count data only once!
Events A and B are disjoint (or mutually exclusive) if they cannot both occur together.
In such a case, the intersection of the events is empty: A ∩ B = ø, and we recall that P(ø) = 0. We then have
P(A ∪ B) = P(A) + P(B)
Examples: addition rule
Pregnancy test results:
                        Positive test result         Negative test result
                        (test indicates pregnant)    (test indicates not pregnant)
Subject pregnant                  80                           5
Subject not pregnant               3                          11
Clinical trials of pregnancy test:
Assuming that 1 person is selected at
random from the 99 people in the
test, find the probability of selecting a
subject who is pregnant or had a
positive test result.
P(pregnant) = (80 + 5)/99
P(test positive) = (80 +3 ) / 99
P(pregnant and test positive) = 80 / 99
P(pregnant or test positive) =
P(pregnant) + P(test positive)
- P(pregnant and test positive)
= 85/99 + 83/99 - 80/99
= 88/99 = 8/9 = 0.889
Alternatively
P(pregnant or positive) =
P(pregnant and positive)
+ P(pregnant and negative)
+ P(not pregnant but positive)
= 80/99 + 5/99 + 3/99 = 88/99
Note that
Pregnant = (pregnant & pos) + (pregnant & neg)
Positive = (pregnant & pos) + (pos & not pregnant)
Subtract to avoid double counting!
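Both routes to P(pregnant or test positive) can be checked against the table counts. The sketch below stores the four cells in a dictionary (an illustrative layout, not from the slides) and compares the addition rule with a direct count.

```python
from fractions import Fraction

# Cell counts from the clinical trial table: (subject, test result) -> count.
counts = {
    ("pregnant", "positive"): 80,
    ("pregnant", "negative"): 5,
    ("not pregnant", "positive"): 3,
    ("not pregnant", "negative"): 11,
}
total = sum(counts.values())  # 99 subjects

def prob(pred):
    """Probability that a randomly selected subject satisfies pred(subject, test)."""
    return Fraction(sum(c for key, c in counts.items() if pred(*key)), total)

p_pregnant = prob(lambda s, t: s == "pregnant")
p_positive = prob(lambda s, t: t == "positive")
p_both = prob(lambda s, t: s == "pregnant" and t == "positive")

# Addition rule: subtract the overlap so it is counted only once.
by_rule = p_pregnant + p_positive - p_both

# Direct count of subjects who are pregnant or tested positive.
direct = prob(lambda s, t: s == "pregnant" or t == "positive")

print(by_rule, direct, round(float(by_rule), 3))
```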
Multiplication rule
• P( A and B ) = P( A ∩ B )
Example: Answer at random
1. True/false: A pound of feathers is heavier than a pound of gold.
2. Which has affected society most:
   a) Remote control
   b) Sneakers with high heels
   c) Hostess twinkies
   d) Computers
   e) Phone
• To answer at random q. 1, each choice has probability 1/2.
• To answer at random q. 2, each choice has probability 1/5.
• P(both answers correct)
  = P( T and (d) )
  = P(T) × P(d) = 1/2 × 1/5 = 1/10
[Tree diagram: T and F each branch into a, b, c, d, e, giving 10 equally likely outcomes]
HW: p.130 #13-20
Multiplication rule: independent events
Independence: occurrence of 1 event does not affect the other.
If events A and B are independent, then
P( A ∩ B ) = P(A) × P(B)
Example:
Throwing 2 dice. What is the probability that the first number is even and the second one is larger than 4?
Answer: Independent? YES!
A: 1st die even
B: second die larger than 4
P(A) = 3/6 = 1/2
P(B) = P(“face shows 5 or 6”) = 2/6 = 1/3
P(A ∩ B) = P(A) × P(B) = 1/2 × 1/3 = 1/6
Table of the 36 outcomes (die #1 - die #2):
1-1  1-2  1-3  1-4  1-5  1-6
2-1  2-2  2-3  2-4  2-5  2-6
3-1  3-2  3-3  3-4  3-5  3-6
4-1  4-2  4-3  4-4  4-5  4-6
5-1  5-2  5-3  5-4  5-5  5-6
6-1  6-2  6-3  6-4  6-5  6-6
Alternatively:
from the table there are 6 options that are good: 2-5, 2-6, 4-5, 4-6, 6-5, 6-6:
P(A ∩ B) = 6/36 = 1/6
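The dice example can be verified both ways in a few lines: multiply P(A) by P(B), and count the favorable outcomes among the 36. A sketch assuming fair dice; the names A, B and prob are illustrative.

```python
from fractions import Fraction
from itertools import product

rolls = list(product(range(1, 7), repeat=2))  # (die #1, die #2)

def prob(event):
    return Fraction(sum(1 for r in rolls if event(r)), len(rolls))

def A(r):
    return r[0] % 2 == 0   # 1st die even

def B(r):
    return r[1] > 4        # 2nd die larger than 4

p_a, p_b = prob(A), prob(B)
p_a_and_b = prob(lambda r: A(r) and B(r))
favorable = [r for r in rolls if A(r) and B(r)]

print(p_a, p_b, p_a * p_b, p_a_and_b)
print(favorable)
```

The counted probability equals the product P(A) × P(B), which is what independence means here.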
Multiplication rule (without replacement)
Genetics experiments:
If 2 peas are chosen at random without replacement, what is the probability that the first one has a green pod and the second one a yellow one?
1st selection: P(g) = 9/14 (14 peas, 9 green pods)
2nd selection: P(y) = 5/13 (13 peas left, 5 yellow pods)
P(g first, y second) = P(g) × P(y) = 9/14 × 5/13 = 0.247
• Must take into account: without replacement.
• Second pea chosen out of only 13 peas!
• Second event should take into account the fact that the first one occurred!
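The same number comes out of a brute-force count over all ordered pairs of distinct peas. This sketch assumes the 14 peas from the Mendel example (9 green pods, 5 yellow pods); the list layout is illustrative.

```python
from fractions import Fraction
from itertools import permutations

pods = ["g"] * 9 + ["y"] * 5   # 14 peas: 9 green pods, 5 yellow pods

# Multiplication rule, second factor conditioned on the first draw.
by_rule = Fraction(9, 14) * Fraction(5, 13)

# Brute force: all ordered pairs of two different peas (14 * 13 = 182).
ordered_pairs = list(permutations(range(len(pods)), 2))
favorable = sum(1 for i, j in ordered_pairs if pods[i] == "g" and pods[j] == "y")
by_counting = Fraction(favorable, len(ordered_pairs))

print(by_rule, by_counting, round(float(by_rule), 3))
```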
Conditional Probability
P( B | A ): conditional probability of B given A
  (probability of event B occurring given that event A has occurred)
P( A | B ): conditional probability of A given B
  (probability of event A occurring given that event B has occurred)
General multiplication rule:
P(A and B) = P( A | B ) × P(B)
Examples:
CD control damage: Water, Crushing, Puncture, Marking
5 damaged items: W, C, C, P, M. 2 items are selected randomly …
a) With replacement: probability of first item C, second item C.
   P(CC) = 2/5 × 2/5 = 4/25 = 0.16
b) Without replacement: P(CC).
   P(CC) = 2/5 × 1/4 = 2/20 = 0.1
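The two CD answers can be reproduced by enumerating every possible pair of draws. A minimal sketch: product models drawing with replacement, permutations models drawing without replacement; the helper name p_cc is illustrative.

```python
from fractions import Fraction
from itertools import permutations, product

items = ["W", "C", "C", "P", "M"]   # the 5 damaged items
idx = range(len(items))

with_replacement = list(product(idx, repeat=2))    # 5 * 5 = 25 ordered draws
without_replacement = list(permutations(idx, 2))   # 5 * 4 = 20 ordered draws

def p_cc(draws):
    """Probability that both selected items show Crushing damage."""
    hits = sum(1 for i, j in draws if items[i] == "C" and items[j] == "C")
    return Fraction(hits, len(draws))

print(p_cc(with_replacement), p_cc(without_replacement))
```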
More multiplication examples
• Probability of 4 aces in 4 cards:
  P(4 aces) = 4/52 × 3/51 × 2/50 × 1/49 = 0.00000369
• Pollsters sample without replacement, but treat events as independent if sample size is less than 5% of population.
• Quality control:
  - Former DVD defect rate: 3%
  - New DVD process. Claimed better!
  - 5000 DVDs produced
  - 200 sampled, 0 defective
  Is the claim plausible?
  P(0 DVDs defective) = P(all 200 ok)
  P(1 DVD ok) = 0.97
  – (assuming old 3% defect rate, then 97% are good, or 0.97)
  Assume independence
  – (the sample is 5% or less of the 5000: 200/5000 = 4%)
  P(all 200 DVDs ok)
  = P(DVD 1 ok and DVD 2 ok and … and DVD 200 ok)
  = P(DVD 1 ok) × P(DVD 2 ok) × … × P(DVD 200 ok)
  = 0.97 × 0.97 × … × 0.97
  = 0.97^200
  = 0.00226
  This probability is so small (rare event) that it is very unlikely to have found no defective DVDs in the 200 DVD sample “by chance”. Instead, it is more likely that the defect rate is lower.
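Both numbers on this slide are short products and can be recomputed directly. A sketch under the slide's assumptions (standard 52-card deck, old defect rate of 3%, independence between sampled DVDs); variable names are illustrative.

```python
from fractions import Fraction

# 4 aces in 4 cards, drawn without replacement from 52 cards.
p_four_aces = Fraction(4, 52) * Fraction(3, 51) * Fraction(2, 50) * Fraction(1, 49)

# 200 sampled DVDs all ok, assuming the old 3% defect rate still holds
# and treating the draws as independent (the sample is 4% of 5000).
p_one_ok = 1 - 0.03
p_all_200_ok = p_one_ok ** 200

print(p_four_aces, float(p_four_aces))
print(p_all_200_ok)
```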
Homework problems
7.138 Defective gas masks: 19,218 gas masks from the US military were tested, 10,332 were defective. Find the probability that 2 random gas masks from this population are defective, if the sampling is done
a) with replacement.
b) without replacement.
c) Compare and decide which to choose in this case.
a) 10332/19218 × 10332/19218 = 0.289036
b) 10332/19218 × 10331/19217 = 0.289023
c) The results are VERY similar. It makes more sense to use the first case, with replacement, because for a random selection of 2 it is very, very unlikely that the same mask is chosen twice. Therefore, we can assume independence.
14.139 Poll confidence levels:
• Public opinion polls usually have a “confidence level” of 95%, meaning that with a probability of 0.95, the poll results are within the margin of error.
• If 5 different groups conduct independent polls, what is the probability that all of them fall within the margin of error?
• P(all 5 polls within margin) = P(1 poll within margin)^5 = 0.95^5 = 0.77378
• Does the result suggest that with a confidence level of 95%, we can expect that almost all polls will be within the margin of error?
• Each single poll is within the margin of error with probability 0.95, so most polls are. But the chance that all 5 are within the margin of error is only about 0.77, so roughly 23% of the time at least one of the 5 falls outside it.
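The homework numbers can be recomputed in a few lines, which also shows how little the with/without replacement distinction matters for a population this large. A sketch using only the figures stated in the problems; variable names are illustrative.

```python
defective, total = 10_332, 19_218

# a) With replacement: the two draws are independent.
with_replacement = (defective / total) ** 2

# b) Without replacement: one defective mask is gone for the second draw.
without_replacement = (defective / total) * ((defective - 1) / (total - 1))

# Poll problem: 5 independent polls, each within the margin with probability 0.95.
p_all_five_within = 0.95 ** 5

print(round(with_replacement, 6))
print(round(without_replacement, 6))
print(round(p_all_five_within, 5))
```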
“At least one” event
• At least one = one or more
• Complement: none!
P(at least one girl among 3 children)
= 1 - P(no girl)
= 1 - 1/8
= 7/8
= 0.875
Outcomes: boy-boy-boy, boy-boy-girl, boy-girl-boy, boy-girl-girl, girl-boy-boy, girl-boy-girl, girl-girl-boy, girl-girl-girl (only boy-boy-boy has no girl).
P(at least one poll within confidence interval)
= 1 - P(no poll good)
= 1 - 0.05^5 = 0.9999997
Conditional probability
The conditional probability of the event A given B is denoted by P(A | B), and it is the probability that A occurs knowing that B has occurred already (happens always!).
Example:
Subject         Test positive   Negative   Total
pregnant              80            5       85 pregnant
not pregnant           3           11       14 not pregnant
Total            83 positive   16 negative  99
• 1 subject is selected randomly; find the probability of a subject testing positive, given that she is pregnant.
P(pos | pregnant) = 80/85 = 0.941, or
= P(positive and pregnant) / P(pregnant) = (80/99) / (85/99) = 0.941
HW: p.138 #7, 11, 13, 14
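Both ideas on this slide, the complement trick for "at least one" and the ratio definition of conditional probability, can be checked numerically. A sketch assuming equally likely boys and girls and the counts of the pregnancy table; names are illustrative.

```python
from fractions import Fraction
from itertools import product

# "At least one girl among 3 children" via the complement "no girl".
families = list(product("BG", repeat=3))
p_no_girl = Fraction(sum(1 for f in families if "G" not in f), len(families))
p_at_least_one_girl = 1 - p_no_girl

# Conditional probability from the table: P(positive | pregnant).
n_total = 99
n_pregnant = 85
n_pregnant_and_positive = 80
p_pos_given_pregnant = (
    Fraction(n_pregnant_and_positive, n_total) / Fraction(n_pregnant, n_total)
)

print(p_at_least_one_girl, float(p_at_least_one_girl))
print(p_pos_given_pregnant, round(float(p_pos_given_pregnant), 3))
```

The factor 1/99 cancels in the ratio, which is why 80/85 gives the same answer.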
Conditional Probability
Titanic
• P(man | died)
• P(died | man)
• P(boy or girl | survived)
• P(man or woman | died)
Titanic Mortality Rate
            Men    Women   Boys   Girls   Total
Survived    332     318     29     27      706
Died       1360     104     35     18     1517
Total      1692     422     64     45     2223
P(man | died) = P(man & died) / P(died) = 1360/1517 = 0.897
P(died | man) = P(man & died) / P(man) = 1360/1692 = 0.804
P(boy or girl | survived) = P({boy or girl} & survived) / P(survived) = 56/706 = 0.079
P(man or woman | died) = P({man or woman} & died) / P(died) = 1464/1517 = 0.965
NOTE:
P(man & died) / P(died) = (1360/2223) / (1517/2223) = 1360/1517 = 0.897
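The four Titanic probabilities follow mechanically from the table once it is stored as nested counts. The sketch below uses a dictionary layout and two helper functions whose names are illustrative, not from the slides.

```python
from fractions import Fraction

# Titanic counts: fate -> group -> number of people.
titanic = {
    "survived": {"men": 332, "women": 318, "boys": 29, "girls": 27},
    "died": {"men": 1360, "women": 104, "boys": 35, "girls": 18},
}

def p_group_given_fate(groups, fate):
    """P(person belongs to one of `groups` | fate): restrict to one row."""
    row = titanic[fate]
    return Fraction(sum(row[g] for g in groups), sum(row.values()))

def p_fate_given_group(fate, group):
    """P(fate | person belongs to `group`): restrict to one column."""
    column_total = sum(titanic[f][group] for f in titanic)
    return Fraction(titanic[fate][group], column_total)

print(float(p_group_given_fate(["men"], "died")))
print(float(p_fate_given_group("died", "men")))
print(float(p_group_given_fate(["boys", "girls"], "survived")))
print(float(p_group_given_fate(["men", "women"], "died")))
```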
Counting
Fundamental Counting rule: If event A can occur in m ways and event B in n ways, the events together can occur in a total of m × n ways.
Examples:
• Combination lock:
  – 3 dials with digits 0-9.
    • Total # of combinations: 10 × 10 × 10 = 10^3 = 1000.
  – Bike lock: 4 dials with digits 1-6.
    • 6 × 6 × 6 × 6 = 6^4 = 1296.
• Arrangements of ABC:
  ABC, ACB, BAC, BCA, CAB, CBA = 6
• In 3 spots, we have
  __3__ __2__ __1__
  3 choices for first spot, 2 for second and 1 for last, or 3 × 2 × 1 = 6.
NOTATION:
3 × 2 × 1 = 3!, or 3 factorial.
4! = 4 × 3 × 2 × 1 = 24, 5! = 5 × 4 × 3 × 2 × 1 = 120
also
5! = 5 × 4!
and note as well that, for example,
7! / 3! = (7 × 6 × 5 × 4 × 3 × 2 × 1) / (3 × 2 × 1) = 7 × 6 × 5 × 4
Factorials and permutations
A collection of n different objects can be arranged in n! different ways.
In the 1st spot there are n possible items to place, in the second one, n-1, in the third one, n-2, …, in the penultimate one, 2, and in the last one, only 1 choice.
Examples:
• Ways of seating 20 students in a class with 20 chairs:
  20! = 2,432,902,008,176,640,000
• Arrangements of ABC:
  ABC, ACB, BAC, BCA, CAB, CBA, totaling 6 arrangements.
  In 3 spots, we have
  __3__ __2__ __1__
  3 choices for first spot, 2 for second and 1 for last, or 3 × 2 × 1 = 6.
• Ways of seating 4 of the 20 students in 4 preassigned chairs:
  20 × 19 × 18 × 17 = 20! / 16! = 20! / (20-4)! = 20P4 = 116,280
• 20! / 16! = (20 × 19 × 18 × 17 × 16 × 15 × … × 3 × 2 × 1) / (16 × 15 × … × 3 × 2 × 1) = 20 × 19 × 18 × 17
• In general, nPk = n! / (n - k)! = n × (n-1) × (n-2) × … × (n-k+1)
• nPk is the number of permutations of k objects out of n total objects.
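These counts are easy to confirm with Python's standard library. A minimal sketch: math.factorial and math.perm (available in Python 3.8 and later) give the closed-form values, and itertools.permutations lists the arrangements of ABC explicitly.

```python
import math
from itertools import permutations

# All arrangements of ABC, listed explicitly.
arrangements = ["".join(p) for p in permutations("ABC")]
print(arrangements, len(arrangements))

# 20 students in 20 chairs: 20!
ways_all_20 = math.factorial(20)

# 4 of the 20 students in 4 preassigned chairs: 20P4 = 20! / 16!
ways_4_of_20 = math.perm(20, 4)

print(ways_all_20)
print(ways_4_of_20, math.factorial(20) // math.factorial(16))
```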
Permutations and Combinations
Permutations Rule (all items differ, order does count):
The number of permutations (or sequences) of k items selected from n available items (without replacement) is
nPk = n! / (n - k)!
Permutations (some items equal):
If there are n items, n1 alike, n2 alike, …, nk alike, the number of permutations of all items is
n! / (n1! × n2! × … × nk!)
Combinations (order does not count):
The number of combinations of k items selected from n different items (without replacement) is
nCk = n! / ((n - k)! × k!)
Example: elected officers
The board of trustees of a college has 9 members. Each year, a 3 person committee is chosen. At the same time, the board elects 3 officers (Prez, VP, and secretary).
a) How many slates of candidates for officers are possible?
b) How many different 3-person committees can be chosen?
a) Order does count for officers: it matters who is P, VP and S. Therefore, permutations of k=3 people out of n=9 different people, or
   9P3 = 9! / (9-3)! = 9 × 8 × 7 = 504
b) Order does not count for committee. Therefore, combinations of k=3 people out of n=9 different people, or
   9C3 = 9! / ((9-3)! × 3!) = 84
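The officer and committee counts can be confirmed both by formula and by brute-force enumeration. A sketch in which the 9 board members are simply numbered 0 to 8 (an illustrative stand-in); math.perm and math.comb need Python 3.8 or later.

```python
import math
from itertools import combinations, permutations

members = range(9)  # the 9 board members, numbered for illustration

# a) Officer slates: order matters (P, VP, S), so count ordered triples.
slates = sum(1 for _ in permutations(members, 3))

# b) Committees: order does not matter, so count unordered triples.
committees = sum(1 for _ in combinations(members, 3))

print(slates, math.perm(9, 3))
print(committees, math.comb(9, 3))

# Each committee of 3 can be ordered in 3! ways, which links the two counts.
print(slates == committees * math.factorial(3))
```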
p.14 Is the pollster lying?
• A pollster claims that 12 voters were randomly selected from a population of 200,000 voters (of which 30% are Republican) and all 12 are Republican. He claims this can easily happen by chance. Find the probability that the 12 are Republican when randomly selected, to see if we believe the claim.
• Assuming independence:
• P(12 Republicans) = P(#1 R) × P(#2 R) × P(#3 R) × … × P(#12 R)
  = P(R)^12
  = 0.3^12
  = 0.000,000,53
• Something is very fishy!
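The independence shortcut can be compared with the exact without-replacement calculation to see how little is lost. The sketch below assumes exactly 30% of the 200,000 voters, i.e. 60,000, are Republican; variable names are illustrative.

```python
population = 200_000
republicans = 60_000   # 30% of the population (assumed exact)
sample_size = 12

# Treating the 12 selections as independent.
p_independent = 0.3 ** sample_size

# Exact value when sampling without replacement.
p_exact = 1.0
for i in range(sample_size):
    p_exact *= (republicans - i) / (population - i)

print(p_independent)
print(p_exact)
print("relative difference:", abs(p_exact - p_independent) / p_independent)
```

The sample is far below 5% of the population, so the two values agree closely and either one marks the pollster's result as a rare event.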