Lecture 2: Randomized algo for Approximate median and

Randomized Algorithms
CS648
Lecture 2
• Randomized Algorithm for Approximate Median
• Elementary Probability theory
RANDOMIZED MONTE CARLO ALGORITHM FOR APPROXIMATE MEDIAN
This lecture was delivered at a slow pace, in the style of a tutorial.
Reason: To show that designing and analyzing a randomized algorithm demands the right insight and just elementary probability.
A simple probability exercise
There is a coin that gives HEADS with probability ¼ and TAILS with probability ¾. The coin is tossed k times. What is the probability that we get at least k/2 HEADS?
[Stirling's approximation for the factorial: $n! \approx \sqrt{2\pi n}\,(n/e)^n$]
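Probability of getting "at least k/2 HEADS in k tosses"
Probability of getting at least k/2 HEADS:
\[
\begin{aligned}
&= \sum_{i=k/2}^{k} \Pr[\,i \text{ HEADS appear in } k \text{ tosses}\,] \\
&= \sum_{i=k/2}^{k} \binom{k}{i} (1/4)^{i} (3/4)^{k-i} \\
&\le \binom{k}{k/2} \sum_{i=k/2}^{k} (1/4)^{i} (3/4)^{k-i} \\
&= \binom{k}{k/2} (3/4)^{k} \sum_{i=k/2}^{k} (1/3)^{i} \\
&\le \binom{k}{k/2} (3/4)^{k} (1/3)^{k/2} \cdot \frac{3}{2} \\
&\le 4^{k/2} (3/4)^{k} (1/3)^{k/2} \\
&= (3/4)^{k/2} \\
&\le (1/2)^{k/5}
\end{aligned}
\]
The third line uses $\binom{k}{i} \le \binom{k}{k/2}$, and the fifth bounds the geometric sum by $(1/3)^{k/2} \cdot 3/2$.
Using Stirling's approximation, $\binom{k}{k/2} \approx \sqrt{2/(\pi k)} \cdot 4^{k/2}$, so $\frac{3}{2}\binom{k}{k/2} \le 4^{k/2}$ for $k \ge 2$.
Since $(3/4)^{5/2} \le 1/2$, we get $(3/4)^{k/2} \le (1/2)^{k/5}$.
Inverse exponential in k.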
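As a numeric sanity check on the bound derived above, the exact tail probability can be compared with (1/2)^(k/5) for a few values of k. This is only a sketch: the names tail_at_least_half and slide_bound are illustrative and do not come from the lecture, and for odd k the sum is assumed to start at the smallest integer that is at least k/2.

```python
from math import comb


def tail_at_least_half(k: int, p: float = 0.25) -> float:
    """Exact Pr[at least k/2 HEADS in k tosses] when Pr[HEADS] = p."""
    start = (k + 1) // 2  # smallest integer i with i >= k/2
    return sum(comb(k, i) * p**i * (1 - p) ** (k - i) for i in range(start, k + 1))


def slide_bound(k: int) -> float:
    """The bound (1/2)^(k/5) obtained at the end of the derivation."""
    return 0.5 ** (k / 5)


for k in (10, 20, 40, 80):
    # Columns: k, exact tail probability, bound from the derivation.
    print(k, tail_at_least_half(k), slide_bound(k))
```

The bound is loose, but both columns shrink exponentially as k grows, which is all the analysis of the algorithm needs.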
Approximate median
Definition: Given an array A[] storing n numbers and ϵ > 0, compute an element whose rank is in the range [(1−ϵ)n/2, (1+ϵ)n/2].
Best Deterministic Algorithm:
• "Median of Medians" algorithm for finding the exact median
• Running time: O(n)
• No faster algorithm is possible for the approximate median
Can you give a short proof?
½ - Approximate median
A Randomized Algorithm
Rand-Approx-Median(A)
1. Let k ← c log n;
2. S ← ∅;
3. For i = 1 to k
4.     x ← an element selected uniformly at random from A;
5.     S ← S ∪ {x};
6. Sort S.
7. Report the median of S.
Running time: O(log n loglog n)
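The pseudocode above might be rendered in Python roughly as follows. This is a sketch under stated assumptions, not the lecturer's implementation: the logarithm is taken base 2, the constant c defaults to 10 (the value used later in the analysis), sampling is with replacement and duplicates are kept in S, the upper median is returned when k is even, and the array is assumed non-empty.

```python
import math
import random


def rand_approx_median(a, c=10, rng=random):
    """Monte Carlo 1/2-approximate median of a non-empty list a.

    Samples k = ceil(c * log2(n)) elements uniformly at random (with
    replacement), sorts the sample and returns its (upper) median.
    """
    n = len(a)
    k = max(1, math.ceil(c * math.log2(n)))
    sample = sorted(rng.choice(a) for _ in range(k))  # O(k log k) time
    return sample[k // 2]


if __name__ == "__main__":
    data = list(range(1, 1001))
    random.shuffle(data)
    # With high probability the result has rank between n/4 and 3n/4.
    print(rand_approx_median(data))
```

Sorting only the k = O(log n) sampled elements is what gives the O(log n loglog n) running time; the array itself is never scanned.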
Analyzing the error probability of Rand-approx-median
[Figure: elements of A arranged in increasing order of values; the Left Quarter ends at rank n/4 and the Right Quarter starts at rank 3n/4.]
When does the algorithm err?
To answer this question, try to characterize what makes a sample S bad.
[Figure: elements of A arranged in increasing order of values; the median of S falls in the Right Quarter, beyond rank 3n/4.]
Observation: The algorithm makes an error only if k/2 or more of the sampled elements come from the Right Quarter (or from the Left Quarter).
Pr[an element selected randomly from A is from the Right Quarter] = ¼
Pr[out of k elements sampled from A, at least k/2 are from the Right Quarter]
\[ \le (1/2)^{k/5} = (1/2)^{2 \log n} = n^{-2} \quad \text{for } k = 10 \log n \]
The first inequality is exactly the same as the coin tossing exercise we did!
Main result we discussed
Theorem: The Rand-approx-median algorithm fails to report a ½-approximate median of array A[1..n] with probability at most $2n^{-2}$.
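The theorem can be probed empirically with a small simulation. The sketch below is an illustration, not part of the lecture: it identifies each element with its rank in 1..n (assuming distinct values), uses small sample sizes k so that failures are actually observable, and compares the observed failure rate with 2·(1/2)^(k/5), the bound obtained by adding the Left Quarter and Right Quarter cases. The names fails and failure_rate, the trial count, and the seed are all hypothetical choices.

```python
import random


def fails(n: int, k: int, rng: random.Random) -> bool:
    """One run: sample k ranks from 1..n; fail if the sample median
    has rank outside [n/4, 3n/4]."""
    ranks = sorted(rng.randint(1, n) for _ in range(k))
    median_rank = ranks[k // 2]
    return not (n / 4 <= median_rank <= 3 * n / 4)


def failure_rate(n: int, k: int, trials: int = 2000, seed: int = 0) -> float:
    """Fraction of failed runs over independent trials (seeded for repeatability)."""
    rng = random.Random(seed)
    return sum(fails(n, k, rng) for _ in range(trials)) / trials


for k in (10, 20, 40):
    # Columns: k, observed failure rate, bound 2 * (1/2)^(k/5).
    print(k, failure_rate(1000, k), 2 * 0.5 ** (k / 5))
```

With k = 10 log n the bound is already 2n^{-2}, so failures would essentially never show up in a simulation of this size; that is why smaller k values are used here.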
Homework: Design a randomized Monte Carlo algorithm for computing an ϵ-approximate median of array A[1..n] with running time O(log n loglog n) and error probability $n^{-c}$ for any given constants ϵ and c.
[Do this homework sincerely, without any friend's help.]
ELEMENTARY PROBABILITY THEORY
(IT IS SO SIMPLE THAT YOU UNDERESTIMATE ITS ELEGANCE AND POWER)
Elementary probability theory
(Relevant for CS648)
• We shall mainly deal with discrete probability theory in this course.
• We shall take the set-theoretic approach to explain probability theory.
Consider any random experiment:
o Tossing a coin 5 times.
o Throwing a die 2 times.
o Selecting a number uniformly at random from [1..n].
How can we capture the following facts in the theory of probability?
1. The outcome will always be from a specified set.
2. The likelihood of each possible outcome is non-negative.
3. We may be interested in a collection of outcomes.
Probability Space
Definition: The probability space associated with a random experiment is an ordered pair (Ω, P), where
• Ω is the set of all possible outcomes of the random experiment
• P : Ω → R such that
  – P(ω) ≥ 0 for each ω ∈ Ω
  – $\sum_{\omega \in \Omega} P(\omega) = 1$
Elements of Ω are called elementary events or sample points.
Event in a Probability Space
Definition: An event A in a probability space (Ω, P) is a subset of Ω. The probability of event A is defined as
\[ P(A) = \sum_{\omega \in A} P(\omega) \]
For the sake of compact notation, we extend P to events as described above.
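The two definitions above translate almost literally into code. The following sketch builds the probability space for "tossing a coin 5 times", assuming the coin is fair (the slides do not say so), checks the two conditions on P, and computes the probability of an event as a sum over its elementary events. The helper name prob and the chosen event are illustrative.

```python
from fractions import Fraction
from itertools import product

# Sample space: all length-5 strings over {H, T}.
omega = ["".join(t) for t in product("HT", repeat=5)]

# Assumption: a fair coin, so every elementary event is equally likely.
P = {w: Fraction(1, len(omega)) for w in omega}

# The two conditions in the definition of a probability space.
assert all(p >= 0 for p in P.values())
assert sum(P.values()) == 1


def prob(event, P):
    """P(A) = sum of P(w) over the elementary events w in A."""
    return sum((P[w] for w in event), Fraction(0))


# An event is just a subset of omega.
at_least_four_heads = {w for w in omega if w.count("H") >= 4}
print(prob(at_least_four_heads, P))  # 3/16
```

Using Fraction keeps the arithmetic exact, so the check that the probabilities sum to 1 is not affected by floating-point rounding.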
Exercises
A randomized algorithm can also be viewed as a random experiment.
1. What is the sample space associated with Randomized Quick sort ?
2. What is the sample space associated with Rand-approx-median
algorithm ?
Important Advice
In the following slides, we shall state well-known equations (highlighted in yellow boxes) from probability theory.
• You should internalize them fully.
• We shall use them crucially in this course.
• Make sincere attempts to solve the exercises that follow.
Union of two Events
Given two events A and B defined over a probability space (Ω, P), what is P(A ∪ B)?
P(A ∪ B) = P(A) + P(B) − P(A ∩ B)
Try to prove it by showing the following:
Each ω ∈ A ∪ B contributes exactly P(ω) to the right-hand side.
Union of three Events
Given three events A₁, A₂, A₃ defined over a probability space (Ω, P), what is P(A₁ ∪ A₂ ∪ A₃)?
P(A₁ ∪ A₂ ∪ A₃) = P(A₁) + P(A₂) + P(A₃)
− P(A₁ ∩ A₂) − P(A₂ ∩ A₃) − P(A₁ ∩ A₃)
+ P(A₁ ∩ A₂ ∩ A₃)
Try to prove this equation as well by showing the following:
Each ω ∈ A₁ ∪ A₂ ∪ A₃ contributes exactly P(ω) to the right-hand side.
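Before attempting the proof, it can help to see the three-event formula hold on a concrete space. The sketch below is a numeric check, not a proof; the uniform space on {1, ..., 30} and the three divisibility events are hypothetical choices made only for illustration.

```python
from fractions import Fraction

# Hypothetical example: uniform probability space on {1, ..., 30}.
omega = range(1, 31)


def prob(event):
    """Probability of an event under the uniform distribution on omega."""
    return Fraction(len(event), len(omega))


A1 = {w for w in omega if w % 2 == 0}  # divisible by 2
A2 = {w for w in omega if w % 3 == 0}  # divisible by 3
A3 = {w for w in omega if w % 5 == 0}  # divisible by 5

lhs = prob(A1 | A2 | A3)
rhs = (prob(A1) + prob(A2) + prob(A3)
       - prob(A1 & A2) - prob(A2 & A3) - prob(A1 & A3)
       + prob(A1 & A2 & A3))
print(lhs, rhs)  # 11/15 11/15
```

An outcome such as 30 lies in all three events: it is counted three times by the single terms, subtracted three times by the pairwise terms, and added back once by the triple term, for a net contribution of exactly P(ω).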
Exercises
• For events A₁, …, Aₙ defined over a probability space (Ω, P), prove that
\[
P\Big(\bigcup_{i=1}^{n} A_i\Big) = \sum_{i} P(A_i) - \sum_{i<j} P(A_i \cap A_j) + \sum_{i<j<k} P(A_i \cap A_j \cap A_k) - \dots + (-1)^{n+1} P(A_1 \cap A_2 \cap \dots \cap A_n)
\]
• There are n letters and n envelopes. For each letter, there is a unique envelope in which it should be placed. A careless postman places the letters randomly into envelopes (one letter in each envelope). What is the probability that no letter is placed correctly (into the envelope meant for it)?
Conditional Probability
The occurrence of one event can influence the likelihood of other events. This notion is formally captured by conditional probability as follows.
The probability of event A conditioned on event B, compactly written as P[A|B], means the following:
Given that event B has happened, what is the probability that event A has also happened?
You might have seen and used the following equation for conditional probability (defined when P[B] > 0).
\[ P[A \mid B] = \frac{P[A \cap B]}{P[B]} \]
Can you give a suitable reason to justify the validity of the above equation?
In particular, justify why P[A ∩ B] appears in the numerator and P[B] in the denominator.
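One way to build intuition for the equation is to compute a conditional probability twice on a small uniform space: once from the formula, and once by treating B as the new sample space and counting inside it. The sketch below uses two fair dice; the events chosen (sum equals 8, first die even) are hypothetical examples, and the agreement of the two numbers is an illustration for the uniform case, not the requested justification.

```python
from fractions import Fraction
from itertools import product

# Uniform space: ordered outcomes of two fair dice.
omega = list(product(range(1, 7), repeat=2))


def prob(event):
    """Probability of an event under the uniform distribution on omega."""
    return Fraction(len(event), len(omega))


A = {w for w in omega if w[0] + w[1] == 8}  # the sum is 8
B = {w for w in omega if w[0] % 2 == 0}     # the first die is even

# Conditional probability from the equation P[A|B] = P[A ∩ B] / P[B].
by_formula = prob(A & B) / prob(B)

# The same quantity obtained by counting inside B only.
by_restriction = Fraction(len(A & B), len(B))

print(prob(A), by_formula, by_restriction)  # 5/36 1/6 1/6
```

Here knowing that B happened raises the probability of A from 5/36 to 1/6.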
Exercises
• A man possesses five coins, two of which are double-headed, one is double-tailed, and two are normal. He shuts his eyes, picks a coin at random, and tosses it. What is the probability that the lower face of the coin is a head? He opens his eyes and sees that the coin is showing heads; what is the probability that the lower face is a head? He shuts his eyes again, and tosses the coin again. What is the probability that the lower face is a head? He opens his eyes and sees that the coin is showing heads; what is the probability that the lower face is a head? He discards this coin, picks another at random, and tosses it. What is the probability that it shows heads?
Partition of sample space and an "important equation"
A set of events A₁, …, Aₙ defined over a probability space (Ω, P) is said to induce a partition of Ω if
• $\bigcup_{i=1}^{n} A_i = \Omega$
• $A_i \cap A_j = \emptyset$ for all $i \ne j$
Given an event B, how can we express P(B) in terms of a given partition?
\[ P(B) = \sum_{i} P(A_i \cap B) \]
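The partition equation can be checked on a small example. In the sketch below, the uniform space on {1, ..., 12}, the partition by remainder modulo 3, and the event B are all hypothetical choices; the code first verifies the two partition conditions and then compares P(B) with the sum of the pieces P(Aᵢ ∩ B).

```python
from fractions import Fraction

# Hypothetical example: uniform probability space on {1, ..., 12}.
omega = range(1, 13)


def prob(event):
    """Probability of an event under the uniform distribution on omega."""
    return Fraction(len(event), len(omega))


# Partition of omega by remainder modulo 3.
parts = [{w for w in omega if w % 3 == r} for r in range(3)]

# Check the two conditions: the parts cover omega and are pairwise disjoint.
assert set().union(*parts) == set(omega)
assert all(parts[i].isdisjoint(parts[j])
           for i in range(3) for j in range(i + 1, 3))

B = {w for w in omega if w <= 5}

# P(B) as the sum of P(A_i ∩ B) over the parts of the partition.
total = sum((prob(a & B) for a in parts), Fraction(0))
print(prob(B), total)  # 5/12 5/12
```

Every outcome of B lies in exactly one part, so it is counted exactly once in the sum.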
Exercises
• There are n sticks, each of a different height. There are n vacant slots arranged along a line and numbered from 1 to n as we move from left to right. The sticks are placed into the slots according to a uniformly random permutation. A stick placed at the i-th slot is said to be a dominating stick if it is taller than every stick placed in slots 1 to i − 1. Find the probability that the i-th slot contains a dominating stick.
Independent Events
Two events A and B defined over a probability space (Ω, P) are said to be independent if the occurrence of one of them has no influence on the probability of the other. Mathematically, it means that
P(A|B) = P(A) and P(B|A) = P(B)
The following equation also compactly captures the independence of two events.
P(A ∩ B) = P(A) · P(B)
Question: Can two independent events ever be disjoint?
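The product form of the definition is easy to test mechanically on a finite space. The sketch below uses two tosses of a fair coin, a hypothetical example chosen to stay clear of the exercises; the helper name independent is illustrative.

```python
from fractions import Fraction
from itertools import product

# Uniform space: two tosses of a fair coin.
omega = list(product("HT", repeat=2))


def prob(event):
    """Probability of an event under the uniform distribution on omega."""
    return Fraction(len(event), len(omega))


def independent(a, b):
    """Check the product rule P(A ∩ B) = P(A) * P(B)."""
    return prob(a & b) == prob(a) * prob(b)


first_heads = {w for w in omega if w[0] == "H"}
second_heads = {w for w in omega if w[1] == "H"}
both_heads = first_heads & second_heads

print(independent(first_heads, second_heads))  # True
print(independent(first_heads, both_heads))    # False
```

The second pair fails the test because knowing that both tosses are heads fully determines the first toss.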
Exercises
1. Two fair dice are rolled. Show that the event that their sum is 7 is independent of the score shown by the first die.
2. Let (Ω, P) be a probability space where Ω = {1, 2, …, p} for a given prime number p, and each elementary event has probability 1/p. Show that if two events A and B defined over (Ω, P) are independent, then at least one of A and B is either ∅ or Ω.