Fundamental Principles of Counting


Welcome to Probability and the Theory of Statistics
• This class uses nearly every type of mathematics that you have
studied so far as well as some possibly new ideas.
• For some of you, the idea of counting collections of abstract objects
may be new. You should not wait for the exam (Exam 2) to make
yourself familiar with the ideas involved in this type of counting.
• The most important aspect of this course involves solving
problems--both applied and theoretical problems. In order to
become proficient at solving problems, you must work at it every
day. Do the homework in a timely fashion!
• A new feature of the class this fall involves the Challenge Problems.
These problems should help to stretch your gray matter.
Probability
• Why do we need a theory of probability?
• Areas of application of such a theory include:
• business/economics
• physics
• biology
• other areas of mathematics such as geometry
and number theory
Experiments with known outcomes
• The set of all possible outcomes of an experiment (or
situation) is called the sample space of the experiment (or
situation) and is denoted by S.
• Example. The sample space for predicting tomorrow’s
weather might be
S  { shower, storm, clear, cloudy }.
• Example. Suppose we count the number of times a coin is
flipped until the first head appears. The set of possible
outcomes is the set of positive integers.
S  {1, 2, 3, ...}.
• Note that in the latter example, S is infinite.
Events
• Any subset E of the sample space S is known as an event.
• Example. In the previous weather example, possible
events include: {cloudy}, {shower, storm}, {storm, cloudy},
{shower, storm, clear, cloudy}, and even the empty set ∅.
• For any two events E and F, the new event E ∪ F is the set
of all outcomes that are in either E or F or in both E and F.
E ∪ F is called the union of E and F.
• Example. {shower, storm} ∪ {storm, cloudy}
= {shower, storm, cloudy}.
Events, continued
• For any two events E and F, the new event E ∩ F is the set of
all outcomes that are in both E and F. E ∩ F is called the
intersection of E and F.
• Example. {shower, storm} ∩ {storm, cloudy} = {storm}.
• If E ∩ F = ∅, then E and F are said to be disjoint or mutually
exclusive.
• Question. Which pairs of the following subsets of the
positive integers are mutually exclusive? E = set of even
numbers, F = set of odd numbers, H = set of multiples of 3,
T = set of multiples of 6.
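• The set operations above map directly onto Python's built-in set type. Below is a minimal sketch (not part of the slides; truncating the positive integers to 1–100 is an illustrative assumption) that redoes the weather examples and checks which pairs from the Question are disjoint.

```python
# Sketch: set operations and a disjointness check on a truncated range 1..100.
from itertools import combinations

N = 100
E = {n for n in range(1, N + 1) if n % 2 == 0}   # even numbers
F = {n for n in range(1, N + 1) if n % 2 == 1}   # odd numbers
H = {n for n in range(1, N + 1) if n % 3 == 0}   # multiples of 3
T = {n for n in range(1, N + 1) if n % 6 == 0}   # multiples of 6

print({"shower", "storm"} | {"storm", "cloudy"})  # union example from the slide
print({"shower", "storm"} & {"storm", "cloudy"})  # intersection example

names = {"E": E, "F": F, "H": H, "T": T}
for (a, A), (b, B) in combinations(names.items(), 2):
    # A and B are mutually exclusive exactly when their intersection is empty
    print(a, b, "mutually exclusive?", A.isdisjoint(B))
```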
Union of a sequence of events
• If we have a sequence of events, E1, E2, E3, ... , then the
union of these events is defined to be that event which
consists of all outcomes that are in En for at least one value
of n, n = 1, 2, 3, ... . This union is written as ∪_{n=1}^∞ En.
• Example. Let S = R, the set of real numbers.
Let [a, b] = { x ∈ R | a ≤ x ≤ b } be a closed interval.
Let (a, b) = { x ∈ R | a < x < b } be an open interval. Let
En = [–1 + (1/n), 1 – (1/n)]. How would you describe
∪_{n=1}^∞ En in the simplest possible way?
Intersection of a sequence of events and complement of an event
• If we have a sequence of events, E1, E2, E3, ... , then the
intersection of these events is defined to be that event
which consists of all outcomes that are in all the events En,
n = 1, 2, 3, ... . This intersection is written as ∩_{n=1}^∞ En.
• Example. Let En = (–1/n, 1/n). How would you describe
∩_{n=1}^∞ En in the simplest possible way?
• For any event E, we define the new event E^c, referred to as
the complement of E, to consist of all outcomes in the
sample space S that are not in E.
• Example. If S is the set of positive integers and E is the set
of even numbers, then E^c is the set of _____ numbers.
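• A small numerical sketch (not from the slides) can build intuition for the two interval examples above: it simply prints the endpoints of En = [–1 + 1/n, 1 – 1/n] and of (–1/n, 1/n) for a few values of n.

```python
# Sketch: look at the first few intervals to see where their union and
# intersection are headed as n grows.
for n in (1, 2, 5, 10, 100, 1000):
    print(f"n={n:5d}:  En = [{-1 + 1/n:+.4f}, {1 - 1/n:+.4f}]   "
          f"(-1/n, 1/n) = ({-1/n:+.4f}, {+1/n:+.4f})")
# The En expand toward (but never reach) the endpoints -1 and 1, while the
# intervals (-1/n, 1/n) shrink down around the single point 0.
```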
Containment of events; laws of the algebra of sets
• For any two events, E and F, if all outcomes in E are also
in F, then we say that E is contained in F and we write
E ⊂ F (or equivalently, F ⊃ E).
• It follows that E = F <=> E ⊂ F and F ⊂ E.
• Some of the rules of set algebra:
Commutative laws
E ∪ F = F ∪ E
EF = FE
Associative laws
(E ∪ F) ∪ G = E ∪ (F ∪ G)
(EF)G = E(FG)
Distributive laws
(E ∪ F)G = EG ∪ FG
EF ∪ G = (E ∪ G)(F ∪ G)
• Venn diagrams are useful for showing the relations among
sets.
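• As a sanity check, the distributive laws can be tested on randomly generated sets; the Python sketch below (the small universe {0, ..., 9} and the sampling scheme are my own choices) does exactly that.

```python
# Sketch: check both distributive laws on many random triples of sets.
import random

random.seed(0)
universe = range(10)
for _ in range(1000):
    E = {x for x in universe if random.random() < 0.5}
    F = {x for x in universe if random.random() < 0.5}
    G = {x for x in universe if random.random() < 0.5}
    # (E ∪ F) ∩ G == (E ∩ G) ∪ (F ∩ G)
    assert (E | F) & G == (E & G) | (F & G)
    # (E ∩ F) ∪ G == (E ∪ G) ∩ (F ∪ G)
    assert (E & F) | G == (E | G) & (F | G)
print("distributive laws hold on all sampled triples")
```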
Frequency interpretation of probability
• The frequency interpretation of probability is one of several
ways of interpreting the meaning of the concept of probability.
According to this interpretation the probability of a certain event is
the proportion of times this event occurs when the experiment is
conducted a very large number of times. For other interpretations,
see: http://en.wikipedia.org/wiki/Probability_interpretations
• For a fair coin, we say that the probability of a head showing up is
1/2. Under the frequency interpretation of probability, this means
that if the coin is tossed a very large number of times, the fraction
of times a head is obtained will be approximately 1/2.
• Although we often use the frequency interpretation when thinking
about probability, it does not lend itself to the formation of a theory
in which it is possible to prove theorems. Instead, an axiomatic
approach due to Kolmogorov is used. Based on these axioms, the
frequency interpretation is obtained as a theorem (see Chapter 11:
Laws of Large Numbers).
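• The sketch below (not from the slides) simulates repeated tosses of a fair coin and prints the fraction of heads, which is the quantity the frequency interpretation refers to.

```python
# Sketch: fraction of heads in an increasing number of fair-coin tosses.
import random

random.seed(1)
for n in (10, 100, 1_000, 10_000, 100_000):
    heads = sum(random.random() < 0.5 for _ in range(n))
    print(f"{n:7d} tosses: fraction of heads = {heads / n:.4f}")
# As the number of tosses grows, the fraction tends to settle near 1/2,
# in line with the law of large numbers mentioned above.
```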
Axioms of Probability
• We take an abstract approach to defining probability by
stating some properties (axioms) that probability should have,
whatever it is that we mean by probability.
• Consider an experiment whose sample space is S. For each
event E in the sample space S, we assume that a real number
P(E) is defined and that the following axioms are satisfied:
Axiom 1. 0 ≤ P(E)
Axiom 2. P(S) = 1
Axiom 3. For any sequence of mutually exclusive events
E1, E2, E3, ... (that is, events for which EiEj = ∅, i ≠ j),
P(∪_{i=1}^∞ Ei) = Σ_{i=1}^∞ P(Ei).
• We refer to P(E) as the probability of event E.
Some simple propositions which follow from the axioms
• P(∅) = 0
• P(E) ≤ 1
• P(E^c) = 1 – P(E)
• If E ⊂ F, then P(E) ≤ P(F)
• P(E ∪ F) = P(E) + P(F) – P(EF)
• We call the latter proposition the inclusion-exclusion
identity. It has a generalization to n events (see textbook
for general case and next slide for n = 4).
Inclusion-Exclusion for n = 4.
• P(E1 E2  E3  E4) =
P(E1) + P(E2) + P(E3) + P(E4)
–[P(E1E2) + P(E1E3)+ P(E1E4)+ P(E2E3)+ P(E2E4)+ P(E3E4)]
+[P(E1E2E3) + P(E1E2E4) + P(E1E3E4) + P(E2E3E4)]
– P(E1E2E3E4)
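• The identity can be checked numerically. The Python sketch below (the point-mass assignment p and the four example events are made-up values) evaluates the alternating sum for n = 4 events and compares it with the probability of their union.

```python
# Sketch: inclusion-exclusion for four events on a small finite sample space.
from itertools import combinations

p = {1: 0.1, 2: 0.2, 3: 0.05, 4: 0.15, 5: 0.1, 6: 0.1, 7: 0.2, 8: 0.1}

def P(event):
    """Probability of an event = sum of the probabilities of its outcomes."""
    return sum(p[i] for i in event)

E = [{1, 2, 3}, {2, 4, 6}, {3, 6, 7}, {1, 7, 8}]   # E1, E2, E3, E4

union = set().union(*E)
rhs = 0.0
for k in range(1, len(E) + 1):
    sign = (-1) ** (k + 1)                     # +, -, +, - for k = 1, 2, 3, 4
    for combo in combinations(E, k):
        rhs += sign * P(set.intersection(*combo))
print(P(union), rhs)   # the two values agree (up to rounding)
```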
How to assign probabilities on a finite sample space
• Consider the sample space S = {1, 2, 3, …, N}. If we
assign probabilities pi to singleton events {i}, then we can
compute the probability of any event E by adding the
probabilities of the elements of E.
• Example. S = {1, 2, 3, …, 8} and E = {3, 6, 7}, so P(E) = p3 + p6 + p7.
(Figure: the eight outcomes, marked p1 through p8, shown as points in S, with p3, p6, and p7 enclosed by the event E.)
• Question. What condition must Σ_{i=1}^8 pi satisfy?
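• Below is a minimal sketch of this assignment in Python (the uniform values chosen here are just for illustration); the assert encodes the condition the pi must satisfy.

```python
# Sketch: assign probabilities to singletons and compute P(E) by summation.
p = {i: 1 / 8 for i in range(1, 9)}          # S = {1, ..., 8}, here uniform

# For p to define a probability, each p[i] must be >= 0 and they must sum to 1.
assert all(v >= 0 for v in p.values()) and abs(sum(p.values()) - 1) < 1e-12

def P(event):
    return sum(p[i] for i in event)

print(P({3, 6, 7}))    # P(E) = p3 + p6 + p7
```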
Example for inclusion-exclusion and Venn diagrams
• Judy is taking two books on her holiday vacation. With
probability 0.5 she will like the first book; with probability
0.4 she will like the second book; with probability 0.3 she
will like both books. What is the probability she will like
neither book?
(Venn diagram: within S, the region of B1 only has probability 0.2, the overlap B1B2 has 0.3, the region of B2 only has 0.1, and the region outside both books is marked "??".)
• Let Bi denote the event that Judy likes book i, i = 1,2. Then
the probability that she likes at least one of the books is
P(B1 ∪ B2) = P(B1) + P(B2) – P(B1B2) = 0.5 + 0.4 – 0.3 = 0.6
• The probability that she likes neither book is the complement:
P((B1 ∪ B2)^c) = 1 – P(B1 ∪ B2) = 1 – 0.6 = 0.4
Sample spaces having equally likely outcomes
• For many experiments or situations, it is natural to assume
that all outcomes are equally likely to occur. Let
S = {1, 2, ... , N} be a finite set containing N elements. It is
often (but not always) natural to assume that
P({1}) = P({2}) = ... = P({N}).
Then Axioms 2 and 3 imply that
P({i}) = 1/N, i = 1, 2, ... , N.
Next, Axiom 3 implies that for any event E,
P(E) = (number of outcomes in E) / (number of outcomes in S).
Example for “equally likely”
• Problem. If two dice are rolled, what is the probability that
the sum of the upturned faces is 7?
• In this case, we assume that all 36 possible outcomes are
equally likely. The sample space is shown below.
die 1 \ die 2      1        2        3        4        5        6
      1          (1,1)    (1,2)    (1,3)    (1,4)    (1,5)    (1,6)*
      2          (2,1)    (2,2)    (2,3)    (2,4)    (2,5)*   (2,6)
      3          (3,1)    (3,2)    (3,3)    (3,4)*   (3,5)    (3,6)
      4          (4,1)    (4,2)    (4,3)*   (4,4)    (4,5)    (4,6)
      5          (5,1)    (5,2)*   (5,3)    (5,4)    (5,5)    (5,6)
      6          (6,1)*   (6,2)    (6,3)    (6,4)    (6,5)    (6,6)
• For the event that the sum of the dice is 7, there are 6
outcomes (marked with an asterisk). The desired
probability is 6/36 = 1/6.
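• The same count can be reproduced by brute force; the sketch below (not from the slides) enumerates the 36 outcomes and picks out those summing to 7.

```python
# Sketch: enumerate the equally likely outcomes of two dice and count sums of 7.
from itertools import product

outcomes = list(product(range(1, 7), repeat=2))       # the sample space S
favorable = [(d1, d2) for d1, d2 in outcomes if d1 + d2 == 7]
print(favorable)                                      # the 6 starred outcomes
print(len(favorable) / len(outcomes))                 # 6/36 = 1/6 ≈ 0.1667
```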
Probability versus Odds
• We say that the odds in favor of an event A are r to s if
P(A) = r/(r + s).
• Similarly, the odds against an event A are s to r if
P(A^c) = s/(r + s).
• When the odds in favor of A are r to s, it follows that the
odds against A are s to r.
• Example. What are the odds against drawing an ace from
an ordinary deck of 52 cards? The odds in favor of drawing
an ace?
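• Below is a small helper sketch (the function name is my own) that converts odds in favor to a probability and applies it to the ace question.

```python
# Sketch: converting between odds and probability, using exact fractions.
from fractions import Fraction

def prob_from_odds_in_favor(r, s):
    """Odds in favor of A are r to s  ->  P(A) = r / (r + s)."""
    return Fraction(r, r + s)

p_ace = Fraction(4, 52)                 # 4 aces in a 52-card deck
odds_in_favor = (p_ace.numerator, p_ace.denominator - p_ace.numerator)
print(p_ace, "-> odds in favor", odds_in_favor, "-> odds against", odds_in_favor[::-1])
print(prob_from_odds_in_favor(*odds_in_favor) == p_ace)   # round-trip check
```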
Probability as a continuous set function
• A sequence of events {En, n ≥ 1} is:
increasing when E1 ⊂ E2 ⊂ E3 ⊂ ..., and then lim_{n→∞} En = ∪_{n=1}^∞ En;
decreasing when E1 ⊃ E2 ⊃ E3 ⊃ ..., and then lim_{n→∞} En = ∩_{n=1}^∞ En.
• Proposition. If {En, n ≥ 1} is either an increasing or a
decreasing sequence of events, then
lim_{n→∞} P(En) = P(lim_{n→∞} En).
• Example. Let S = [–1, 1] and let P([a, b]) = P((a, b)) =
(b – a)/2. Let En = [–1 + 1/n, 1 – 1/n] and Fn = (0, 1/n). What are
the values of P(∪_{n=1}^∞ En) and P(∩_{n=1}^∞ Fn)?
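• The sketch below (not a proof) evaluates P(En) and P(Fn) for increasing n under the length-based probability defined in the example, showing where the two limits are headed.

```python
# Sketch: continuity of probability for intervals inside S = [-1, 1],
# where P assigns (b - a)/2 to an interval with endpoints a < b.
def P_interval(a, b):
    return (b - a) / 2

for n in (1, 2, 10, 100, 10_000):
    En = (-1 + 1 / n, 1 - 1 / n)        # increasing sequence of closed intervals
    Fn = (0, 1 / n)                     # decreasing sequence of open intervals
    print(f"n={n:6d}  P(En)={P_interval(*En):.6f}  P(Fn)={P_interval(*Fn):.6f}")
# P(En) climbs toward 1 and P(Fn) drops toward 0, matching the probabilities
# of the limiting events (the union of the En and the intersection of the Fn).
```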
Probability as a measure of belief
• One way of thinking of probability is in terms of relative
frequency as described in the textbook. This way of thinking
requires events that can be repeated. However, there are times
when probability is used for events that can’t be repeated.
• Suppose you are on a jury and, in your mind, you assign a
probability of 0.9 to the event, “the defendant is guilty”. You
feel 90% sure that the defendant is guilty. Here, repetition of
the event doesn’t make sense. However, the axioms of
probability must still hold for this type of subjective probability
assignment. The sample space is S = {guilty, innocent}. It
follows that P({innocent}) = 1 – P({guilty}) = 0.1.
• Using the methods of Chapter 3, Conditional Probability and
Independence, subjective probabilities can be updated as more
information becomes available.
Random Selection of Points from Intervals
• A point is said to be randomly selected from an interval
(a, b) if any two subintervals of (a, b) that have the same
length are equally likely to include the point. The
probability associated with the event that the subinterval
(α, β) contains the point is defined to be (β – α)/(b – a).
• Problem. Pick a random time from the interval from
1:00am to 2:00am. What is the probability that the time
picked is later than 1:45am?
Solution. Using the above notation, a = 1, b = 2, α = 1.75,
β = 2, so the desired probability is 0.25. Also, if E is the
event that the time is later than 1:45am, then P(E) = 0.25.
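• A quick simulation sketch (not from the slides): drawing uniform times from (1, 2) and estimating the probability of landing after 1.75.

```python
# Sketch: Monte Carlo estimate of P(time later than 1:45am) for a uniform
# time on the interval (1, 2), measured in hours.
import random

random.seed(2)
trials = 100_000
later = sum(1 for _ in range(trials) if random.uniform(1, 2) > 1.75)
print(later / trials)        # close to the exact answer (2 - 1.75)/(2 - 1) = 0.25
```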
Appendix 1: A useful formula proved by math induction
a + ar + ar^2 + ... + ar^(n-1) = a(1 - r^n)/(1 - r),   r ≠ 1, n = 1, 2, 3, ...
Basis step: Put n = 1 on both sides. l.h.s. = a, r.h.s. = a(1 - r)/(1 - r) = a.
Since l.h.s. = r.h.s., the Basis Step is complete.
Induction hypothesis: Assume the formula is true for n = k:
a + ar + ar^2 + ... + ar^(k-1) = a(1 - r^k)/(1 - r).
Now add the next term, ar^k, to both sides of the latter equation:
a + ar + ar^2 + ... + ar^(k-1) + ar^k = a(1 - r^k)/(1 - r) + ar^k.
Get the original formula with n replaced by k + 1: simply add the fractions,
a(1 - r^k)/(1 - r) + ar^k(1 - r)/(1 - r) = a(1 - r^(k+1))/(1 - r),
which gives
a + ar + ar^2 + ... + ar^(k-1) + ar^k = a(1 - r^(k+1))/(1 - r).
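A quick numerical spot-check of the formula (the values of a, r, and n below are arbitrary):

```python
# Sketch: compare the term-by-term sum with the closed form a(1 - r^n)/(1 - r).
a, r = 3.0, 0.7
for n in (1, 2, 5, 10):
    lhs = sum(a * r**k for k in range(n))          # a + ar + ... + ar^(n-1)
    rhs = a * (1 - r**n) / (1 - r)
    print(n, lhs, rhs)                             # the two columns agree
```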
Appendix 2: An infinite series called a “geometric series”
• Let Sn = a + ar + ar^2 + ... + ar^(n-1). Then Sn is called the
nth partial sum of the infinite series: a + ar + ar^2 + ... + ar^n + ...
• The sum of the infinite series is defined to be S, where Sn → S
as n → ∞. When this limit exists as a real number, we say that
the series converges. When this limit does not exist, we say the
series diverges.
• When |r| < 1, the geometric series converges. Furthermore,
S = a/(1 - r). This follows from the formula proved in Appendix 1.
• We write
a + ar + ar^2 + ... + ar^n + ... = a/(1 - r),   |r| < 1.
• If we multiply through by r, we have
ar + ar^2 + ... + ar^n + ... = ar/(1 - r),   |r| < 1.
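• The convergence can be seen numerically; the sketch below (a = 1 and r = 0.5 are arbitrary choices) prints partial sums approaching a/(1 - r).

```python
# Sketch: partial sums of a geometric series with |r| < 1 approach a/(1 - r).
a, r = 1.0, 0.5
limit = a / (1 - r)
partial = 0.0
for n in range(1, 21):
    partial += a * r ** (n - 1)
    if n in (1, 2, 5, 10, 20):
        print(f"S_{n:<2d} = {partial:.8f}   (limit = {limit})")
```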
Appendix 3. An example of a countably infinite sample space.
• Let S = {1, 2, 3, … }. That is, S is the set of positive
integers. Assign probabilities to singletons by:
P({i}) = 1/2^i,   i = 1, 2, 3, ...
• Verify that P(S) =1.
• Let E be the set of odd integers. Evaluate P(E).
• Let En = {n, n+1, n+2, …}, n = 1, 2, 3, … . Evaluate P(En).
• Verify that lim_{n→∞} P(En) = P(lim_{n→∞} En) in this example.
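• A numerical sketch for this appendix (truncating S at M = 60 is an assumption made only so the sums are finite):

```python
# Sketch: approximate the sums in Appendix 3 by truncating S at a large M.
M = 60
p = {i: 1 / 2**i for i in range(1, M + 1)}

print(sum(p.values()))                                   # close to P(S) = 1
print(sum(v for i, v in p.items() if i % 2 == 1))        # close to P(odd integers)
for n in (1, 2, 3, 4, 10):
    print(n, sum(v for i, v in p.items() if i >= n))     # close to P(En) = 1/2**(n-1)
# P(En) shrinks toward 0 = P(∅), consistent with the continuity property from
# the "Probability as a continuous set function" slide, since the En decrease
# to the empty event.
```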