Transcript Document

Divide and Conquer
The most well-known algorithm design strategy:
1. Divide an instance of the problem into two or more smaller instances
2. Solve the smaller instances recursively
3. Obtain the solution to the original (larger) instance by combining these solutions
Divide-and-conquer technique:
[Diagram: a problem of size n is divided into subproblem 1 and subproblem 2, each of size n/2; a solution to each subproblem is then combined into a solution to the original problem.]
Divide and Conquer
Examples
 Sorting: mergesort and quicksort
 Tree traversals
 Binary search
 Matrix multiplication: Strassen's algorithm
 Convex hull: QuickHull algorithm
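Of these examples, binary search is the smallest illustration of the pattern. A minimal sketch (the function below is our own illustration, not from the lecture):

```python
def binary_search(a, key):
    """Divide and conquer on a sorted list: halve the search range each step."""
    lo, hi = 0, len(a) - 1
    while lo <= hi:
        mid = (lo + hi) // 2      # divide at the midpoint
        if a[mid] == key:
            return mid
        elif a[mid] < key:
            lo = mid + 1          # conquer the right half
        else:
            hi = mid - 1          # conquer the left half
    return -1                     # key not present
```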
General Divide-and-Conquer recurrence:
T(n) = aT(n/b) + f(n), where f(n) = Θ(n^k)
1. a < b^k : T(n) = Θ(n^k)
2. a = b^k : T(n) = Θ(n^k log n)
3. a > b^k : T(n) = Θ(n^(log_b a))
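The three cases can be captured in a tiny helper; the function name and the returned string format are our own (the third case prints the numeric value of log_b a):

```python
import math

def master_theorem(a, b, k):
    """Classify T(n) = a*T(n/b) + Θ(n^k) by comparing a with b^k."""
    if a < b ** k:
        return f"Θ(n^{k})"
    elif a == b ** k:
        return f"Θ(n^{k} log n)"
    else:
        return f"Θ(n^{math.log(a, b):.2f})"   # exponent is log_b a
```

For mergesort, a = 2, b = 2, and f(n) = Θ(n) so k = 1: a = b^k, giving Θ(n log n).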
Mergesort
Algorithm:
 Split array A[1..n] in two and make copies of each half in arrays B[1..⌈n/2⌉] and C[1..⌊n/2⌋]
 Sort arrays B and C recursively
 Merge sorted arrays B and C into array A
Mergesort
Algorithm:
 Merge sorted arrays B and C into array A as follows:
– Repeat the following until no elements remain in one of the arrays:
 compare the first elements in the remaining unprocessed portions of the arrays
 copy the smaller of the two into A, incrementing the index indicating the unprocessed portion of that array
– Once all elements in one of the arrays are processed, copy the remaining unprocessed elements from the other array into A.
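The split/sort/merge steps above can be sketched in Python. This is a non-in-place sketch: the copies into arrays B and C are modeled with list slices.

```python
def mergesort(a):
    """Sort a list by splitting, recursively sorting, and merging."""
    if len(a) <= 1:
        return a
    mid = len(a) // 2
    b = mergesort(a[:mid])        # copy and sort the first half
    c = mergesort(a[mid:])        # copy and sort the second half
    # merge: repeatedly copy the smaller front element into the result
    merged, i, j = [], 0, 0
    while i < len(b) and j < len(c):
        if b[i] <= c[j]:
            merged.append(b[i]); i += 1
        else:
            merged.append(c[j]); j += 1
    # one list is exhausted; copy the rest of the other
    merged.extend(b[i:])
    merged.extend(c[j:])
    return merged
```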
How Merging Works
A: 7 2 1 6 4 9 5
Split the list:  B: 7 2 1 6   C: 4 9 5
Sort each list:  B: 1 2 6 7   C: 4 5 9
Merge the lists: A: 1 2 4 5 6 7 9
Putting it Together
A: 7 2 1 6 4 9 5
Split the list:  B: 7 2 1 6   C: 4 9 5
Sort each list:  B: 1 2 6 7   C: 4 5 9
Each list is sorted by recursively applying mergesort to the sub-lists:
Split again:  D: 7 2   E: 1 6   F: 4 9   G: 5
Split again:  H: 7   I: 2   J: 1   K: 6   L: 4   M: 9   N: 5
Efficiency of mergesort
 All cases have the same time efficiency: Θ(n log n)
 Number of comparisons is close to the theoretical minimum for comparison-based sorting:
   log₂(n!) ≈ n log₂ n − 1.44n
 Space requirement: Θ(n) (NOT in-place)
 Can be implemented without recursion (bottom-up)
Quicksort
 Select a pivot (partitioning element)
 Rearrange the list so that all the elements in the positions before the pivot are smaller than or equal to the pivot and those after the pivot are larger than the pivot (see algorithm Partition in section 4.2)
 Exchange the pivot with the last element in the first (i.e., ≤) sublist – the pivot is now in its final position
 Sort the two sublists recursively

After partitioning:  [ A[i] ≤ p | p | A[i] > p ]
The partition algorithm
Example: 8 1 12 2 6 10 14 15 4 13 9 11 3 7 5
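The steps above can be sketched as follows. This uses a Hoare-style partition with the first element as pivot, in the spirit of the Partition algorithm the slide cites; it is one of several valid partition variants, not necessarily the exact one in section 4.2.

```python
def quicksort(a, lo=0, hi=None):
    """In-place quicksort using the first element of each range as pivot."""
    if hi is None:
        hi = len(a) - 1
    if lo >= hi:
        return a
    p = a[lo]                       # pivot
    i, j = lo, hi + 1
    while True:
        i += 1
        while i <= hi and a[i] < p: # scan right for an element >= pivot
            i += 1
        j -= 1
        while a[j] > p:             # scan left for an element <= pivot
            j -= 1
        if i >= j:
            break
        a[i], a[j] = a[j], a[i]     # put the pair on the correct sides
    a[lo], a[j] = a[j], a[lo]       # pivot into its final position
    quicksort(a, lo, j - 1)         # sort the <= sublist
    quicksort(a, j + 1, hi)         # sort the > sublist
    return a
```

On the example list above, each element ends up in position equal to its value.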
Efficiency of quicksort
 Best case: split in the middle – Θ(n log n)
 Worst case: sorted array! – Θ(n²)
 Average case: random arrays – Θ(n log n)
Efficiency of quicksort
 Improvements:
– better pivot selection: median-of-three partitioning avoids the worst case on sorted files
– switch to insertion sort on small subfiles
– elimination of recursion
These combine for a 20–25% improvement.
 Considered the method of choice for internal sorting of large files (n ≥ 10,000)
QuickHull Algorithm
Inspired by quicksort, computes the convex hull:
 Assume points are sorted by x-coordinate values
 Identify extreme points P1 and P2 (part of the hull)
QuickHull Algorithm
 Compute the upper hull:
– find the point Pmax that is farthest away from line P1P2
– compute the hull to the left of line P1Pmax
– compute the hull to the left of line PmaxP2
 Compute the lower hull in a similar manner
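The algorithm can be sketched as follows; the cross-product test (twice the signed triangle area, which serves both as the "which side of the line" test and as the distance-from-line measure) and the helper names are our own choices.

```python
def quickhull(points):
    """Return the vertices of the convex hull of 2-D points (QuickHull sketch)."""
    def cross(o, a, b):
        # twice the signed area of triangle o-a-b; > 0 iff b lies left of o->a
        return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])

    def hull_side(p1, p2, pts):
        # points strictly to the left of the directed line p1->p2
        left = [p for p in pts if cross(p1, p2, p) > 0]
        if not left:
            return [p2]
        # Pmax: farthest from line p1-p2 (largest triangle area)
        pmax = max(left, key=lambda p: cross(p1, p2, p))
        # hull to the left of p1-Pmax, then to the left of Pmax-p2
        return hull_side(p1, pmax, left) + hull_side(pmax, p2, left)

    pts = sorted(set(points))       # sort by x (then y); drop duplicates
    if len(pts) < 3:
        return pts
    p1, p2 = pts[0], pts[-1]        # extreme points, part of the hull
    upper = hull_side(p1, p2, pts)  # upper hull (ends with p2)
    lower = hull_side(p2, p1, pts)  # lower hull (ends with p1)
    return [p1] + upper + lower[:-1]
```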
Efficiency of QuickHull algorithm
 Finding the point farthest away from line P1P2 can be done in linear time
 This gives the same efficiency as quicksort:
– Worst case: Θ(n²)
– Average case: Θ(n log n)
Efficiency of QuickHull algorithm
 If points are not initially sorted by x-coordinate value, this can be accomplished in Θ(n log n), with no increase in the asymptotic efficiency class
 Other algorithms for convex hull:
– Graham's scan
– DCHull
also in Θ(n log n)
Closest-Pair Problem: Divide and Conquer
 Brute force approach requires comparing every point with every other point
 Given n points, we must perform 1 + 2 + 3 + … + (n−2) + (n−1) comparisons:
   Σ_{k=1}^{n−1} k = (n−1)n / 2
 Brute force: O(n²)
 The divide-and-conquer algorithm yields O(n log n)
 Reminder: if n = 1,000,000 then
– n² = 1,000,000,000,000, whereas
– n log n ≈ 20,000,000
Closest-Pair Algorithm
Given: a set of points in 2-D
Step 1: Sort the points in one dimension
Let's sort based on the x-axis: O(n log n) using quicksort or mergesort
[Figure: the 14 points plotted in the plane, labeled 1–14 in order of x-coordinate]
Closest-Pair Algorithm
Step 2: Split the points, i.e., draw a line at the mid-point between points 7 and 8
[Figure: the points divided into Sub-Problem 1 (points 1–7) and Sub-Problem 2 (points 8–14)]
Closest-Pair Algorithm
Advantage: Normally, we'd have to compare each of the 14 points with every other point: (n−1)n/2 = 13·14/2 = 91 comparisons
Closest-Pair Algorithm
Advantage: Now we have two sub-problems of half the size. Thus we have to do 6·7/2 = 21 comparisons twice, which is 42 comparisons. The solution is d = min(d1, d2).
[Figure: d1 is the closest-pair distance found within Sub-Problem 1, d2 within Sub-Problem 2]
Closest-Pair Algorithm
Advantage: With just one split we cut the
number of comparisons in half. Obviously, we
gain an even greater advantage if we split the
sub-problems.
d = min(d1, d2)
Closest-Pair Algorithm
Problem: However, what if the closest two
points are each from different sub-problems?
Closest-Pair Algorithm
Here is an example where we have to compare points from sub-problem 1 to the points in sub-problem 2.
Closest-Pair Algorithm
However, we only have to compare points inside
the following “strip.”
d = min(d1, d2)
[Figure: only points within distance d of the dividing line need to be compared across the split]
Closest-Pair Algorithm
Step 3: But, we can continue the advantage by
splitting the sub-problems.
Closest-Pair Algorithm
Step 3: In fact we can continue to split until each
sub-problem is trivial, i.e., takes one comparison.
Closest-Pair Algorithm
Finally: The solution to each sub-problem is
combined until the final solution is obtained
Closest-Pair Algorithm
Finally: On the last step the "strip" will likely be very small. Thus, combining the two largest sub-problems won't require much work.
Closest-Pair Algorithm
 In this example, it takes 22 comparisons to find the closest pair.
 The brute force algorithm would have taken 91 comparisons.
 But the real advantage occurs when there are millions of points.
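The whole algorithm (sort by x, split, recurse, then check the strip) can be sketched as follows. For simplicity this sketch re-sorts the strip by y on every call, giving O(n log² n) rather than the optimal O(n log n); the function and variable names are our own.

```python
import math

def closest_pair(points):
    """Return the smallest distance between any two of the given 2-D points."""
    def dist(p, q):
        return math.hypot(p[0] - q[0], p[1] - q[1])

    def solve(pts):                 # pts is sorted by x
        n = len(pts)
        if n <= 3:                  # trivial sub-problem: brute force
            return min(dist(p, q) for i, p in enumerate(pts)
                       for q in pts[i + 1:])
        mid = n // 2
        midx = pts[mid][0]          # x-coordinate of the dividing line
        d = min(solve(pts[:mid]), solve(pts[mid:]))   # d = min(d1, d2)
        # only points within d of the dividing line can beat d
        strip = sorted((p for p in pts if abs(p[0] - midx) < d),
                       key=lambda p: p[1])
        for i, p in enumerate(strip):
            for q in strip[i + 1:i + 8]:   # at most 7 neighbors matter
                if q[1] - p[1] >= d:
                    break
                d = min(d, dist(p, q))
        return d

    return solve(sorted(points))
```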
Closest-Pair Problem: Divide and Conquer
 Here is another animation:
http://www.cs.mcgill.ca/~cs251/ClosestPair/ClosestPairApplet/ClosestPairApplet.html
Remember
 Homework is due on Friday
– In class and on paper
 There is a talk today:
– 4 PM, RB 340 or 328
– Darren Lim (faculty candidate)
– Bioinformatics: Secondary Structure Prediction in Proteins
Long Term
 HW7 will be given on Friday.
 It'll be due on Wednesday.
 It'll be returned on Friday.
 Exam 2 will be on Monday the 22nd.