Transcript Slide 1

Chapter 5
Probability (Part 2)
Contingency Tables
Tree Diagrams
Bayes’ Theorem
Counting Rules
McGraw-Hill/Irwin
Copyright © 2009 by The McGraw-Hill Companies, Inc.
Contingency Tables
What is a Contingency Table?
• A contingency table is a cross-tabulation of frequencies into rows and columns.
• A contingency table is like a frequency distribution for two variables.
(Diagram: a grid with Variable 1 across the columns, Col 1 to Col 3, and Variable 2 down the rows, Row 1 to Row 4; each row-column intersection is a cell.)
5B-2
Contingency Tables
Example: Salary Gains and MBA Tuition
• Consider the following cross-tabulation table
for n = 67 top-tier MBA programs: (Table 5.4)
5B-3
Contingency Tables
Example: Salary Gains and MBA Tuition
• Are large salary gains more likely to accrue
to graduates of high-tuition MBA programs?
• The frequencies indicate that MBA graduates
of high-tuition schools do tend to have large
salary gains.
• Also, most of the top-tier schools charge
high tuition.
• More precise interpretations of this data can
be made using the concepts of probability.
5B-4
Contingency Tables
Marginal Probabilities
• The marginal probability of a single event is
found by dividing a row or column total by
the total sample size.
• For example, find the marginal probability of
a medium salary gain (P(S2)).
P(S2) = 33/67 = .4925
• Conclude that about 49% of salary gains at
the top-tier schools were between $50,000
and $100,000 (medium gain).
5B-5
Contingency Tables
Marginal Probabilities
• Find the marginal probability of low tuition, P(T1).
P(T1) = 16/67 = .2388
• There is a 24% chance that a top-tier school’s
MBA tuition is under $40,000.
5B-6
Contingency Tables
Joint Probabilities
• A joint probability represents the intersection
of two events in a cross-tabulation table.
• Consider the joint event that the school has
low tuition and large salary gains
(denoted as P(T1 ∩ S3)).
5B-7
Contingency Tables
Joint Probabilities
• So, using the cross-tabulation table,
P(T1 ∩ S3) = 1/67 = .0149
• There is less than a 2% chance that a top-tier
school has both low tuition and large salary
gains.
5B-8
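The marginal and joint calculations above can be sketched in Python. Table 5.4 is not reproduced in this transcript, so the cell counts below are an assumption: they were inferred from the totals and probabilities quoted on the slides (for example 16/67, 33/67 and 1/67). The function names are illustrative only.

```python
# Cell counts inferred from figures quoted on the slides; Table 5.4 itself
# is not shown in this transcript. Rows: tuition T1..T3. Columns: gain S1..S3.
counts = {
    "T1": {"S1": 5, "S2": 10, "S3": 1},
    "T2": {"S1": 7, "S2": 11, "S3": 1},
    "T3": {"S1": 5, "S2": 12, "S3": 15},
}

n = sum(sum(row.values()) for row in counts.values())  # total sample size


def marginal_row(t):
    """P(T) = row total divided by n."""
    return sum(counts[t].values()) / n


def marginal_col(s):
    """P(S) = column total divided by n."""
    return sum(row[s] for row in counts.values()) / n


def joint(t, s):
    """P(T and S) = single cell count divided by n."""
    return counts[t][s] / n


print(n)                             # 67
print(round(marginal_col("S2"), 4))  # 0.4925
print(round(marginal_row("T1"), 4))  # 0.2388
print(round(joint("T1", "S3"), 4))   # 0.0149
```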
Contingency Tables
Conditional Probabilities
• Found by restricting ourselves to a single
row or column (the condition).
• For example, knowing that a school’s MBA
tuition is high (T3), we would restrict
ourselves to the third row of the table.
5B-9
Contingency Tables
Conditional Probabilities
• Find the probability that the salary gains are
small (S1) given that the MBA tuition is high
(T3).
P(S1 | T3) = 5/32 = .1563
• What does this mean?
5B-10
Contingency Tables
Independence
• To check for independent events in a
contingency table, compare the conditional
to the marginal probabilities.
• For example, if large salary gains (S3) were
independent of low tuition (T1), then
P(S3 | T1) = P(S3).
Conditional: P(S3 | T1) = 1/16 = .0625
Marginal: P(S3) = 17/67 = .2537
• What do you conclude about events S3 and T1?
5B-11
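A minimal sketch of the comparison between conditional and marginal probabilities follows. The cell counts are inferred from the values quoted on the slides (Table 5.4 is not shown in this transcript), and the 0.01 tolerance is an arbitrary illustration, not a formal statistical test.

```python
# Cell counts inferred from the slides' quoted values (assumption).
counts = {
    "T1": {"S1": 5, "S2": 10, "S3": 1},
    "T2": {"S1": 7, "S2": 11, "S3": 1},
    "T3": {"S1": 5, "S2": 12, "S3": 15},
}
n = sum(sum(row.values()) for row in counts.values())


def p_salary(s):
    """Marginal probability P(S): column total over n."""
    return sum(row[s] for row in counts.values()) / n


def p_salary_given_tuition(s, t):
    """Conditional probability P(S | T): restrict to row t only."""
    return counts[t][s] / sum(counts[t].values())


p_marginal = p_salary("S3")                          # 17/67
p_conditional = p_salary_given_tuition("S3", "T1")   # 1/16

TOLERANCE = 0.01  # hypothetical cutoff, chosen only for illustration
looks_independent = abs(p_conditional - p_marginal) < TOLERANCE

print(p_salary_given_tuition("S1", "T3"))  # 0.15625
print(p_conditional)                       # 0.0625
print(round(p_marginal, 4))                # 0.2537
print(looks_independent)                   # False
```

Because P(S3 | T1) is far from P(S3), the sketch reports that S3 and T1 do not look independent.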
Contingency Tables
Relative Frequencies
• Calculate the relative frequencies below for
each cell of the cross-tabulation table to
facilitate probability calculations.
• Symbolic notation for relative frequencies:
5B-12
Contingency Tables
Relative Frequencies
• Here are the resulting probabilities (relative
frequencies). For example,
P(T1 and S1) = 5/67
P(T2 and S2) = 11/67
P(T3 and S3) = 15/67
P(S1) = 17/67
P(T2) = 19/67
5B-13
Contingency Tables
Relative Frequencies
• The nine joint probabilities sum to 1.0000
since these are all the possible intersections.
• Summing across a row or down a
column gives the marginal probability for the
respective row or column.
5B-14
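The two properties above can be checked with exact fractions. This is a sketch: the 3 x 3 counts are inferred from the values quoted on the slides rather than copied from Table 5.4, which is not shown in this transcript.

```python
from fractions import Fraction

# Inferred counts (assumption): rows T1..T3, columns S1..S3.
counts = [
    [5, 10, 1],
    [7, 11, 1],
    [5, 12, 15],
]
n = sum(map(sum, counts))

# Relative frequency (joint probability) of every cell.
rel = [[Fraction(c, n) for c in row] for row in counts]

total = sum(sum(row) for row in rel)             # all nine joint probabilities
row_marginals = [sum(row) for row in rel]        # P(T1), P(T2), P(T3)
col_marginals = [sum(col) for col in zip(*rel)]  # P(S1), P(S2), P(S3)

print(total)             # 1
print(row_marginals[1])  # 19/67
print(col_marginals[0])  # 17/67
```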
Contingency Tables
Example: Payment Method and Purchase Quantity
• A small grocery store would like to know if
the number of items purchased by a customer
is independent of the type of payment method
the customer chooses to use.
• Why would this information be useful to the
store manager?
• The manager collected a random sample of
368 customer transactions.
5B-15
Contingency Tables
Example: Payment Method and Purchase Quantity
Here is the contingency table of frequencies:
5B-16
Contingency Tables
Example: Payment Method and Purchase Quantity
• Calculate the marginal probability that a
customer will use cash to make the payment.
• Let C be the event cash.
P(C) = 126/368 = .3424
• Now, is this probability the same if we
condition on number of items purchased?
5B-17
Contingency Tables
Example: Payment Method and Purchase Quantity
P(C | 1-5) = 30/88 = .3409
P(C | 6-10) = 46/135 = .3407
P(C | 10-20) = 31/89 = .3483
P(C | 20+) = 19/56 = .3393
• P(C) = .3424, so what do you conclude about
independence?
• Based on this, the manager might decide to
offer a cash-only lane that is not restricted to
the number of items purchased.
5B-18
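The comparison above can be reproduced from the frequencies quoted on the slide. This is a sketch: the dictionary layout is illustrative, and the 0.01 cutoff is an arbitrary assumption rather than a formal test of independence.

```python
# Cash transactions and total transactions per purchase-size group,
# taken from the ratios quoted on the slide (30/88, 46/135, 31/89, 19/56).
cash = {"1-5": 30, "6-10": 46, "10-20": 31, "20+": 19}
totals = {"1-5": 88, "6-10": 135, "10-20": 89, "20+": 56}

# Marginal probability of paying cash.
p_cash = sum(cash.values()) / sum(totals.values())

# Conditional probability of cash within each purchase-size group.
conditionals = {group: cash[group] / totals[group] for group in cash}

# Largest gap between any conditional probability and the marginal.
max_gap = max(abs(p - p_cash) for p in conditionals.values())

print(round(p_cash, 4))  # 0.3424
print({g: round(p, 4) for g, p in conditionals.items()})
# {'1-5': 0.3409, '6-10': 0.3407, '10-20': 0.3483, '20+': 0.3393}
print(max_gap < 0.01)    # True (0.01 is an illustrative cutoff)
```

Every conditional probability is close to P(C), which is what independence of payment method and purchase quantity would look like in a sample.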
Contingency Tables
How Do We Get a Contingency Table?
• Contingency tables require careful
organization and are created from raw data.
• Consider the data
of salary gain and
tuition for n = 67
top-tier MBA
schools.
5B-19
Contingency Tables
How Do We Get a Contingency Table?
• The data should be coded so that the values
can be placed into the contingency table.
• Once coded, tabulate the frequency in each cell of
the contingency table using MINITAB’s
Stat | Tables | Cross Tabulation
5B-20
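Outside MINITAB, the same tabulation step can be sketched with Python's standard library. The six coded records below are hypothetical and are not the MBA data; they only show how coded pairs turn into cell frequencies.

```python
from collections import Counter

# Hypothetical coded records: (tuition code, salary-gain code) per school.
records = [
    ("T1", "S1"), ("T1", "S2"), ("T3", "S3"),
    ("T2", "S2"), ("T3", "S3"), ("T3", "S1"),
]

cell_counts = Counter(records)          # frequency of each (row, column) pair
rows = sorted({t for t, _ in records})  # row labels
cols = sorted({s for _, s in records})  # column labels

# Build the table row by row; Counter returns 0 for pairs never seen.
table = [[cell_counts[(t, s)] for s in cols] for t in rows]

print("    " + "  ".join(cols))
for label, row in zip(rows, table):
    print(label + "  " + "   ".join(str(c) for c in row))
```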
Tree Diagrams
What is a Tree?
• A tree diagram or decision tree helps you visualize all
possible outcomes.
• Start with a contingency table.
• For example, this table gives expense ratios by fund
type for 21 bond funds and 23 stock funds.
5B-21
Tree Diagrams
What is a Tree?
• To label the tree, first calculate conditional
probabilities by dividing each cell frequency
by its column total.
• For example,
P(L | B) = 11/21 = .5238
• Here is the table of conditional probabilities
5B-22
Tree Diagrams
What is a Tree?
• The tree diagram
shows all events along with
their marginal, conditional and joint probabilities.
• To calculate joint probabilities, use
P(A ∩ B) = P(A | B)P(B) = P(B | A)P(A)
• The joint probability of each terminal event on the
tree can be obtained by multiplying the probabilities
along its branch.
• For example, P(B ∩ L) = P(L | B)P(B)
= (.5238)(.4773) = .2500
5B-23
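Multiplying along one branch of the tree can be sketched as follows. Only the counts quoted on the slides are used (21 bond funds, 23 stock funds, and 11 of the 21 bond funds in category L); reading L as "low expense ratio" is an assumption, since the table is not shown in this transcript.

```python
bond_funds = 21    # event B
stock_funds = 23
bond_and_l = 11    # bond funds in category L (assumed: low expense ratio)

total_funds = bond_funds + stock_funds

p_b = bond_funds / total_funds        # first branch: P(B)
p_l_given_b = bond_and_l / bond_funds  # second branch: P(L | B)
p_b_and_l = p_l_given_b * p_b         # terminal event: P(B and L)

print(round(p_b, 4))          # 0.4773
print(round(p_l_given_b, 4))  # 0.5238
print(round(p_b_and_l, 4))    # 0.25
```

The product agrees with the direct count 11/44, which is the joint relative frequency of that cell.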
Tree Diagrams
Tree Diagram for Fund Type and Expense Ratios
Figure 5.11
5B-24
Bayes’ Theorem
• Thomas Bayes (1702-1761) provided a
method (called Bayes’s Theorem) of revising
probabilities to reflect new information.
• The prior (marginal) probability of an event B
is revised after event A has been considered
to yield a posterior (conditional) probability.
• Bayes’s formula is:
P(B | A) = P(A | B)P(B) / P(A)
5B-25
Bayes’ Theorem
• Bayes’ formula begins as:
P(B | A) = P(A | B)P(B) / P(A)
• In some situations P(A) is not given.
Therefore, the most useful and common
form of Bayes’ Theorem is:
P(B | A) = P(A | B)P(B) / [P(A | B)P(B) + P(A | B')P(B')]
5B-26
Bayes’ Theorem
How Bayes’ Theorem Works
• Consider an over-the-counter pregnancy testing kit
and its “track record” of determining pregnancies.
• If a woman is actually pregnant, what is the test’s
“track record”?
Positive test: 96% of time. Negative test (false negative): 4% of time.
• If a woman is not pregnant, what is the test’s “track
record”?
Positive test (false positive): 1% of time. Negative test: 99% of time.
(Table 5.17)
5B-27
Bayes’ Theorem
How Bayes’ Theorem Works
• Suppose that 60% of the women who
purchase the kit are actually pregnant.
• Intuitively, if 1,000 women use this test, the
results should look like this.
5B-28
Bayes’ Theorem
How Bayes’ Theorem Works
• Of the 580 women who test positive, 576 will
actually be pregnant.
• So, the desired probability is:
P(Pregnant│Positive Test) = 576/580 = .9931
5B-29
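The frequency argument above can be reproduced with integer arithmetic. This is a sketch using only the percentages quoted on the slides (60% pregnant, 96% true positives, 1% false positives); the variable names are illustrative.

```python
women = 1000

pregnant = women * 60 // 100             # 60% of purchasers are pregnant
not_pregnant = women - pregnant

true_positives = pregnant * 96 // 100    # pregnant and test positive
false_positives = not_pregnant * 1 // 100  # not pregnant but test positive

all_positives = true_positives + false_positives
p_pregnant_given_positive = true_positives / all_positives

print(true_positives, false_positives, all_positives)  # 576 4 580
print(round(p_pregnant_given_positive, 4))             # 0.9931
```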
Bayes’ Theorem
How Bayes’ Theorem Works
• Now use Bayes’s Theorem to formally derive the
result P(Pregnant | Positive) = .9931:
• First define
A = positive test
A' = negative test
B = pregnant
B' = not pregnant
• From the contingency
table, we know that:
P(A | B) = .96
P(A | B') = .01
P(B) = .60
• And the complement of
each event is:
P(A' | B) = .04
P(A' | B') = .99
P(B') = .40
5B-30
Bayes’ Theorem
How Bayes’ Theorem Works
P(B | A) = P(A | B)P(B) / [P(A | B)P(B) + P(A | B')P(B')]
= (.96)(.60) / [(.96)(.60) + (.01)(.40)]
= .576 / (.576 + .004)
= .576 / .580
= .9931
• So, there is a 99.31% chance that a woman is
pregnant, given that the test is positive.
5B-31
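The two-event form of Bayes' Theorem used above can be written as a small function. This is a sketch; the function and parameter names are illustrative, and the inputs are the values from the slides.

```python
def bayes_posterior(p_a_given_b, p_a_given_not_b, p_b):
    """Return P(B | A) for a dichotomous event B.

    P(B | A) = P(A|B)P(B) / [P(A|B)P(B) + P(A|B')P(B')], with P(B') = 1 - P(B).
    """
    numerator = p_a_given_b * p_b
    denominator = numerator + p_a_given_not_b * (1 - p_b)
    return numerator / denominator


# A = positive test, B = pregnant.
posterior = bayes_posterior(p_a_given_b=0.96, p_a_given_not_b=0.01, p_b=0.60)
print(round(posterior, 4))  # 0.9931
```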
Bayes’ Theorem
How Bayes’ Theorem Works
• Bayes’s Theorem shows us how to revise
our prior probability of pregnancy to get the
posterior probability after the results of the
pregnancy test are known.
Prior (before the test): P(B) = .60
Posterior (after positive test result): P(B | A) = .9931
• Bayes’s Theorem is useful when a direct
calculation of a conditional probability is not
permitted due to lack of information.
5B-32
Bayes’ Theorem
How Bayes’ Theorem Works
• A tree diagram helps visualize the situation.
5B-33
Bayes’ Theorem
How Bayes’ Theorem Works
• The 2 branches showing a positive test (A) comprise a
reduced sample space, B ∩ A and B' ∩ A, so add their
probabilities to obtain the denominator of the fraction
whose numerator is P(B ∩ A).
5B-34
Bayes’ Theorem
General Form of Bayes’ Theorem
• A generalization of Bayes’s Theorem allows
event B to be polytomous (B1, B2, … Bn)
rather than dichotomous (B and B').
P(Bi | A) = P(A | Bi)P(Bi) / [P(A | B1)P(B1) + P(A | B2)P(B2) + ... + P(A | Bn)P(Bn)]
5B-35
Bayes’ Theorem
Example: Hospital Trauma Centers
(Table 5.18)
• Based on historical data, the percent of cases at 3
hospital trauma centers and the probability of a case
resulting in a malpractice suit are as follows:
• Let event A = a malpractice suit is filed
and Bi = patient was treated at trauma center i
5B-36
Bayes’ Theorem
Example: Hospital Trauma Centers
• Applying the general form of Bayes’
Theorem, find P(B1 | A).
P(B1 | A) = P(A | B1)P(B1) / [P(A | B1)P(B1) + P(A | B2)P(B2) + P(A | B3)P(B3)]
= (0.001)(0.50) / [(0.001)(0.50) + (0.005)(0.30) + (0.008)(0.20)]
= 0.0005 / (0.0005 + 0.0015 + 0.0016)
= 0.0005 / 0.0036
= 0.1389
5B-37
Bayes’ Theorem
Example: Hospital Trauma Centers
• Conclude that, given that a malpractice suit was filed, the
probability that the patient was treated at hospital 1 is .1389 or 13.89%.
• All the posterior probabilities for each hospital can
be calculated and then compared:
(Table 5.19)
5B-38
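The posterior probabilities for all three hospitals can be computed with the general form of Bayes' Theorem. This is a sketch using the priors and suit probabilities that appear in the worked calculation (0.50, 0.30, 0.20 and 0.001, 0.005, 0.008); the list layout is illustrative.

```python
priors = [0.50, 0.30, 0.20]          # P(B1), P(B2), P(B3): share of cases
likelihoods = [0.001, 0.005, 0.008]  # P(A | Bi): chance of a malpractice suit

# Joint probabilities P(A and Bi) = P(A | Bi) P(Bi).
joints = [lik * prior for lik, prior in zip(likelihoods, priors)]

# Denominator of Bayes' Theorem: P(A) is the sum of the joints.
p_a = sum(joints)

# Posterior probabilities P(Bi | A).
posteriors = [j / p_a for j in joints]

print(round(p_a, 4))                      # 0.0036
print([round(p, 4) for p in posteriors])  # [0.1389, 0.4167, 0.4444]
```

Hospital 1 handles half of the cases but accounts for the smallest share of suits, so its posterior probability falls well below its prior.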
Bayes’ Theorem
Example: Hospital Trauma Centers
• Intuitively, imagine there were 10,000
patients and calculate the frequencies:
Hospital   Malpractice Suit Filed    No Malpractice Suit Filed    Total
1          5 (= 5,000 x .001)        4,995 (= 5,000 - 5)          5,000 (= 10,000 x .5)
2          15 (= 3,000 x .005)       2,985 (= 3,000 - 15)         3,000 (= 10,000 x .3)
3          16 (= 2,000 x .008)       1,984 (= 2,000 - 16)         2,000 (= 10,000 x .2)
Total      36                        9,964                        10,000
5B-39
Bayes’ Theorem
Example: Hospital Trauma Centers
• Now, use these frequencies to find the
probabilities needed for Bayes’ Theorem.
• For example,
Hospital   Malpractice Suit Filed       No Malpractice Suit Filed    Total
1          P(B1|A) = 5/36 = .1389       P(B1|A') = .5013             P(B1) = .5
2          P(B2|A) = 15/36 = .4167      P(B2|A') = .2996             P(B2) = .3
3          P(B3|A) = 16/36 = .4444      P(B3|A') = .1991             P(B3) = .2
Total      P(A) = 36/10000 = .0036      P(A') = .9964                1.0000
5B-40
Bayes’ Theorem
Example: Hospital Trauma Centers
• Consider the following visual description of
the problem:
5B-41
Bayes’ Theorem
Example: Hospital Trauma Centers
• The initial sample space consists of 3
mutually exclusive and collectively
exhaustive events (hospitals B1, B2, B3).
5B-42
Bayes’ Theorem
Example: Hospital Trauma Centers
• As indicated by their relative areas, B1 is 50%
of the sample space, B2 is 30% and B3 is 20%.
5B-43
Bayes’ Theorem
Example: Hospital Trauma Centers
• But, given that a malpractice case has been filed
(event A), then the relevant sample space is reduced
to the yellow area of event A.
• The revised probabilities are the relative areas within
event A.
(Figure labels: P(B1 | A), P(B2 | A), P(B3 | A))
5B-44
Counting Rules
Fundamental Rule of Counting
• If event A can occur in n1 ways and event B
can occur in n2 ways, then events A and B
can occur in n1 x n2 ways.
• In general, m events can occur in
n1 x n2 x … x nm ways.
5B-45
Counting Rules
Example: Stock-Keeping Labels
• How many unique stock-keeping unit (SKU)
labels can a hardware store create by using
two letters (ranging from AA to ZZ) followed by
four digits (each 0 through 9)?
• For example,
AF1078: hex-head 6 cm bolts – box of 12
RT4855: Lime-A-Way cleaner – 16 ounce
LL3319: Rust-Oleum primer – gray 15 ounce
5B-46
Counting Rules
Example: Stock-Keeping Labels
• View the problem as filling six empty boxes:
• There are 26 ways to fill each of the 1st and 2nd
boxes and 10 ways to fill each of the 3rd through 6th.
• Therefore, there are 26 x 26 x 10 x 10 x 10 x
10 = 6,760,000 unique inventory labels.
5B-47
Counting Rules
Example: Shirt Inventory
• The L.L. Bean men’s cotton chambray shirt
comes in 6 colors (blue, stone, rust, green,
plum, indigo), 5 sizes (S, M, L, XL, XXL) and
two styles (short and long sleeves).
• Their stock might include 6 x 5 x 2 = 60
possible shirts.
• However, the number of each type of shirt to
be stocked depends on prior demand.
5B-48
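Both counting examples apply the fundamental rule of counting, which can be sketched as a product of the number of choices per position. The code below assumes Python 3.8 or later for `math.prod`; the variable names are illustrative.

```python
import math
import string
from itertools import product

# SKU labels: two letter slots (26 choices each), four digit slots (10 each).
sku_slots = [len(string.ascii_uppercase)] * 2 + [len(string.digits)] * 4
sku_count = math.prod(sku_slots)
print(sku_count)  # 6760000

# Shirts: enumerate every (color, size, style) combination explicitly.
colors = ["blue", "stone", "rust", "green", "plum", "indigo"]
sizes = ["S", "M", "L", "XL", "XXL"]
styles = ["short sleeve", "long sleeve"]
shirts = list(product(colors, sizes, styles))
print(len(shirts))  # 60
print(shirts[0])    # ('blue', 'S', 'short sleeve')
```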
Counting Rules
Factorials
• The number of ways that n items can be
arranged in a particular order is n factorial.
• n factorial is the product of all integers from
1 to n.
n! = n(n–1)(n–2)...1
• Factorials are useful for counting the
possible arrangements of any n items.
• There are n ways to choose the first, n-1
ways to choose the second, and so on.
5B-49
Counting Rules
Factorials
• As illustrated below, there are n ways to choose
the first item, n-1 ways to choose the second, n-2
ways to choose the third and so on.
5B-50
Counting Rules
Factorials
• A home appliance service truck must make 3 stops
(A, B, C).
• In how many ways could the three stops be
arranged?
3! = 3 x 2 x 1 = 6
• List all the possible arrangements:
{ABC, ACB, BAC, BCA, CAB, CBA}
• How many ways can you arrange 9 baseball players
in batting order rotation?
9! = 9 x 8 x 7 x 6 x 5 x 4 x 3 x 2 x 1 = 362,880
5B-51
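The factorial results above can be checked by enumerating the arrangements directly. This is a short sketch using the standard library; the stop labels come from the slide.

```python
import math
from itertools import permutations

stops = "ABC"

# Every ordering of the three stops.
routes = ["".join(order) for order in permutations(stops)]

print(routes)             # ['ABC', 'ACB', 'BAC', 'BCA', 'CAB', 'CBA']
print(math.factorial(3))  # 6
print(math.factorial(9))  # 362880
```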
Counting Rules
Permutations
• A permutation is an arrangement in a
particular order of r randomly sampled items
from a group of n items and is denoted by nPr
nPr = n! / (n - r)!
• In other words, how many ways can the r
items be arranged, treating each arrangement
as different (i.e., XYZ is different from ZYX)?
5B-52
Counting Rules
Example: Appliance Service Calls
• n = 5 home appliance customers (A, B, C, D, E) need
service calls, but the field technician can service
only r = 3 of them before noon.
• The order is important so each possible
arrangement of the three service calls is different.
• The number of possible permutations is:
nPr = n! / (n - r)! = 5! / (5 - 3)! = (5 x 4 x 3 x 2 x 1) / 2! = 120 / 2 = 60
5B-53
Counting Rules
Example: Appliance Service Calls
• The 60 permutations with r = 3 out of the n =
5 calls can be enumerated.
• There are 10 distinct groups of 3 customers:
ABC, ABD, ABE, ACD, ACE, ADE, BCD, BCE, BDE, CDE
• Each of these can be
arranged in 6 distinct ways:
ABC, ACB, BAC, BCA, CAB, CBA
• Since there are 10 groups of
3 customers and 6 arrangements per group, there are
10 x 6 = 60 permutations.
5B-54
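The count of 60 can be confirmed three ways: by listing the ordered schedules, by the permutation formula, and by the groups-times-arrangements argument. This sketch assumes Python 3.8 or later for `math.perm`.

```python
import math
from itertools import combinations, permutations

customers = "ABCDE"

schedules = list(permutations(customers, 3))  # ordered service schedules
groups = list(combinations(customers, 3))     # unordered groups of 3

print(len(schedules))                   # 60
print(math.perm(5, 3))                  # 60
print(len(groups) * math.factorial(3))  # 60
```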
Counting Rules
Combinations
• A combination is a collection of r items
chosen at random from n items where the
order of the selected items is not important
(i.e., XYZ is the same as ZYX).
• A combination is denoted nCr
nCr = n! / [r!(n - r)!]
5B-55
Counting Rules
Example: Appliance Service Calls Revisited
• n = 5 home appliance customers (A, B, C, D, E) need
service calls, but the field technician can service
only r = 3 of them before noon.
• This time order is not important.
• Thus, ABC, ACB, BAC, BCA, CAB, CBA would all be
considered the same event because they contain the
same 3 customers.
• The number of possible combinations is:
nCr = n! / [r!(n - r)!] = 5! / [3!(5 - 3)!] = (5 x 4 x 3 x 2 x 1) / [(3 x 2 x 1)(2 x 1)] = 120 / 12 = 10
5B-56
Counting Rules
Example: Appliance Service Calls Revisited
• The 10 combinations are far fewer than the 60
permutations in the previous example.
• The combinations are easily enumerated:
ABC, ABD, ABE, ACD, ACE,
ADE, BCD, BCE, BDE, CDE
5B-57
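The enumeration above can be reproduced with the standard library. This sketch assumes Python 3.8 or later for `math.comb` and `math.perm`, and also shows that dividing the permutation count by r! gives the combination count.

```python
import math
from itertools import combinations

# Unordered groups of 3 customers out of A..E.
groups = ["".join(g) for g in combinations("ABCDE", 3)]

print(groups)
# ['ABC', 'ABD', 'ABE', 'ACD', 'ACE', 'ADE', 'BCD', 'BCE', 'BDE', 'CDE']
print(math.comb(5, 3))                       # 10
print(math.perm(5, 3) // math.factorial(3))  # 10
```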
Applied Statistics in
Business & Economics
End of Chapter 5B
5B-58