Transcript Document

Chapter 4

Divide-and-Conquer

Copyright © 2007 Pearson Addison-Wesley. All rights reserved.

Tromino puzzle

Exercise 4.1.11. A tromino is an L-shaped tile formed by three adjacent 1-by-1 squares. The problem is to cover any 2^n-by-2^n chessboard that has one missing square (anywhere on the board) with trominoes. The trominoes should cover all squares except the missing one, with no overlaps.

Design a divide-and-conquer algorithm for this problem
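One standard divide-and-conquer solution can be sketched as follows (a hedged sketch, not necessarily the book's intended solution; all names are illustrative): place a single tromino at the center of the board so that it covers one cell in each of the three quadrants that do not contain the missing square; each quadrant then becomes a smaller instance of the same problem.

```python
def tromino_tile(n, missing):
    """Tile a 2^n-by-2^n board with one missing cell using L-trominoes.

    Each tromino's three cells get a unique positive id; the missing
    cell stays 0.
    """
    size = 2 ** n
    board = [[0] * size for _ in range(size)]
    counter = [0]  # shared tromino id counter

    def solve(top, left, s, mr, mc):
        if s == 1:
            return
        counter[0] += 1
        tid = counter[0]
        half = s // 2
        quads = [(top, left), (top, left + half),
                 (top + half, left), (top + half, left + half)]
        # the cell of each quadrant that touches the board's center
        centers = [(top + half - 1, left + half - 1),
                   (top + half - 1, left + half),
                   (top + half, left + half - 1),
                   (top + half, left + half)]
        holes = []
        for (r0, c0), (cr, cc) in zip(quads, centers):
            if r0 <= mr < r0 + half and c0 <= mc < c0 + half:
                holes.append((mr, mc))   # this quadrant already has the hole
            else:
                board[cr][cc] = tid      # central tromino covers its corner
                holes.append((cr, cc))
        for (r0, c0), (hr, hc) in zip(quads, holes):
            solve(r0, c0, half, hr, hc)

    solve(0, 0, size, *missing)
    return board
```

Each call places one tromino and spawns four subproblems of half the side length, mirroring the recursive structure asked for in the exercise.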

Divide-and-Conquer

The most well-known algorithm design strategy:

1. Divide an instance of the problem into two or more smaller instances

2. Solve the smaller instances (recursively)

3. Obtain a solution to the original (larger) instance by combining these solutions

Question: is it always better than brute force? Is there a counterexample?

Divide-and-Conquer Technique (cont.)

[Diagram: a problem of size n is split into subproblem 1 of size n/2 and subproblem 2 of size n/2; a solution to each subproblem is obtained, and the two are combined into a solution to the original problem.]

How would this technique be used on parallel computers?

Divide-and-Conquer Examples

Sorting: mergesort and quicksort

Binary tree traversals

Binary search (?)

Multiplication of large integers

Matrix multiplication: Strassen’s algorithm

Closest-pair and convex-hull algorithms

Cases: division by 2 and the general case

Typical case: divide by 2

General case:

An instance of size n is divided into b instances of size n/b, of which a must be solved (a and b constants, a ≥ 1, b > 1)

Assuming n is a power of b (to simplify the analysis), we have

T(n) = a T(n/b) + f(n)

where f(n) accounts for the time spent dividing the problem into smaller ones and/or combining their solutions

Example: summing a series with divide-and-conquer …
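The series-sum example can be sketched like this (an illustrative sketch; here a = b = 2 and f(n) ∈ Θ(1), since splitting the range and adding the two partial sums take constant time):

```python
def dc_sum(a, lo=0, hi=None):
    """Sum a[lo:hi] by divide-and-conquer: split the range in half,
    sum each half recursively, and combine with one addition."""
    if hi is None:
        hi = len(a)
    if hi - lo == 0:
        return 0
    if hi - lo == 1:      # base case: a single element
        return a[lo]
    mid = (lo + hi) // 2  # divide
    return dc_sum(a, lo, mid) + dc_sum(a, mid, hi)  # conquer + combine
```

By the general recurrence, T(n) = 2T(n/2) + Θ(1), which (a = 2 > b^d = 1) gives Θ(n), the same as a simple loop.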

General Divide-and-Conquer Recurrence

T(n) = aT(n/b) + f(n), where f(n) ∈ Θ(n^d), d ≥ 0

Master Theorem:
If a < b^d, T(n) ∈ Θ(n^d)
If a = b^d, T(n) ∈ Θ(n^d log n)
If a > b^d, T(n) ∈ Θ(n^(log_b a))

Note: the same results hold with O instead of Θ.

Examples:
T(n) = 4T(n/2) + n: T(n) ∈ ?
T(n) = 4T(n/2) + n^2: T(n) ∈ ?
T(n) = 4T(n/2) + n^3: T(n) ∈ ?

See Appendix B for a proof of the theorem.

General Divide-and-Conquer Recurrence

T(n) = aT(n/b) + f(n), where f(n) ∈ Θ(n^d), d ≥ 0

Master Theorem:
If a < b^d, T(n) ∈ Θ(n^d)
If a = b^d, T(n) ∈ Θ(n^d log n)
If a > b^d, T(n) ∈ Θ(n^(log_b a))

Examples:
T(n) = 2T(n/2) + 1 (a=2, b=2, f(n)=1, d=0): T(n) ∈ Θ(n)
T(n) = 2T(n/2) + n (a=2, b=2, f(n)=n, d=1): T(n) ∈ Θ(n log n)
T(n) = 2T(n/2) + n^2 (a=2, b=2, f(n)=n^2, d=2): T(n) ∈ Θ(n^2)
T(n) = 2T(n/2) + n^3 (a=2, b=2, f(n)=n^3, d=3): T(n) ∈ Θ(n^3)

Mergesort

Split array A[0..n-1] into two about-equal halves and make copies of each half in arrays B and C

Sort arrays B and C recursively

Merge the sorted arrays B and C into array A as follows:

Repeat the following until no elements remain in one of the arrays (n elements in total):

compare the first elements in the remaining unprocessed portions of the arrays

copy the smaller of the two into A, while incrementing the index indicating the unprocessed portion of that array

Once all elements in one of the arrays are processed, copy the remaining unprocessed elements from the other array into A.

Pseudocode of Mergesort

Pseudocode of Merge
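The pseudocode for Mergesort and Merge did not survive the transcript; here is a minimal Python sketch following the three steps described above (names are illustrative):

```python
def mergesort(a):
    """Sort list a by divide-and-conquer; returns a new sorted list."""
    if len(a) <= 1:
        return a
    mid = len(a) // 2
    b = mergesort(a[:mid])   # sort a copy of the left half
    c = mergesort(a[mid:])   # sort a copy of the right half
    return merge(b, c)

def merge(b, c):
    """Merge two sorted lists into one sorted list."""
    result, i, j = [], 0, 0
    while i < len(b) and j < len(c):
        if b[i] <= c[j]:     # <= keeps equal keys in order (stable)
            result.append(b[i]); i += 1
        else:
            result.append(c[j]); j += 1
    result.extend(b[i:])     # one of these slices is empty
    result.extend(c[j:])
    return result
```

Note the `<=` in the comparison: taking from the left array on ties is what makes this implementation stable, which is relevant to the question at the end of the analysis slide.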

Mergesort Example

8 3 2 9 7 1 5 4
8 3 2 9 | 7 1 5 4
8 3 | 2 9 | 7 1 | 5 4
8 | 3 | 2 | 9 | 7 | 1 | 5 | 4
3 8 | 2 9 | 1 7 | 4 5
2 3 8 9 | 1 4 5 7
1 2 3 4 5 7 8 9

Analysis of Mergesort

All cases have the same efficiency: Θ(n log n)

The number of comparisons in the worst case is close to the theoretical minimum for comparison-based sorting:

⌈log2 n!⌉ ≈ n log2 n - 1.44n

Space requirement: Θ(n) (not in-place)

Can be implemented without recursion (bottom-up)

Is it stable?

Quicksort

Select a pivot p (partitioning element); here, the first element.

Rearrange the list so that all elements in the first s positions are smaller than or equal to the pivot and all elements in the remaining n-s positions are larger than or equal to the pivot (see the next slide for an algorithm):

p | A[i] ≤ p | A[i] ≥ p

Exchange the pivot with the last element in the first (i.e., ≤) subarray; the pivot is now in its final position.

Sort the two subarrays recursively.

Partitioning Algorithm

Quicksort Example: 5 3 1 9 8 2 4 7
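The partitioning scheme described above can be sketched in Python (a sketch of Hoare-style partitioning with the first element as pivot, as in the description; not the book's exact pseudocode):

```python
def hoare_partition(a, lo, hi):
    """Partition a[lo..hi] around pivot a[lo]; return the pivot's
    final index.  Left scan stops at elements >= pivot, right scan
    at elements <= pivot; swap until the scans cross."""
    p = a[lo]
    i, j = lo, hi + 1
    while True:
        i += 1
        while i <= hi and a[i] < p:
            i += 1
        j -= 1
        while a[j] > p:
            j -= 1
        if i >= j:           # indices crossed (or met)
            break
        a[i], a[j] = a[j], a[i]
    a[lo], a[j] = a[j], a[lo]  # pivot into its final position
    return j

def quicksort(a, lo=0, hi=None):
    """Sort list a in place by recursive partitioning."""
    if hi is None:
        hi = len(a) - 1
    if lo < hi:
        s = quicksort_split = hoare_partition(a, lo, hi)
        quicksort(a, lo, s - 1)
        quicksort(a, s + 1, hi)
```

On the example list [5, 3, 1, 9, 8, 2, 4, 7], the first partition moves the pivot 5 to index 4, with {2, 3, 1, 4} on its left and {8, 9, 7} on its right.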


Is it stable?

Analysis of Quicksort

Best case: split in the middle: Θ(n log n)

n+1 comparisons if the scanning indices cross, n if they coincide

C_best(n) = 2 C_best(n/2) + n for n > 1, C_best(1) = 0 (Master Theorem)

Worst case: already-sorted array! Θ(n^2)

C_worst(n) = (n+1) + n + … + 3 = (n+1)(n+2)/2 - 3

Average case: random arrays: Θ(n log n)

C_avg(n) = (1/n) ∑_{s=0}^{n-1} [(n+1) + C_avg(s) + C_avg(n-1-s)] for n > 1, with C_avg(0) = 0, C_avg(1) = 0

Improvements (combined, 20-25% faster):

better pivot selection: median-of-three partitioning

switch to insertion sort on small subfiles

elimination of recursion

Considered the method of choice for internal sorting of large files (n ≥ 10000)

Binary Search

A very efficient algorithm for searching in a sorted array: compare the search key K against the middle element of A[0] . . . A[m] . . . A[n-1].

If K = A[m], stop (successful search); otherwise, continue searching by the same method in A[0..m-1] if K < A[m], and in A[m+1..n-1] if K > A[m].

l ← 0; r ← n-1
while l ≤ r do
    m ← ⌊(l+r)/2⌋
    if K = A[m] return m
    else if K < A[m] r ← m-1
    else l ← m+1
return -1

Analysis of Binary Search

Time efficiency:

worst-case recurrence: C_w(n) = 1 + C_w(⌊n/2⌋), C_w(1) = 1

solution: C_w(n) = ⌈log2(n+1)⌉

This is VERY fast: e.g., C_w(10^6) = 20

Optimal for searching a sorted array

Limitations: requires a sorted array (not a linked list)

A bad (degenerate) example of divide-and-conquer

Has a continuous counterpart, the bisection method, for solving equations in one unknown, f(x) = 0 (see Sec. 12.4)

Binary Tree Algorithms

A binary tree is a divide-and-conquer-ready structure!

Ex. 1: Classic traversals (preorder, inorder, postorder)

Algorithm Inorder(T)
    if T ≠ ∅
        Inorder(T_left)
        print(root of T)
        Inorder(T_right)

[Diagram: an example tree with root a, internal nodes b and c, leaves d and e, and external nodes marked •]

Efficiency: Θ(n)

Binary Tree Algorithms (cont.)

Ex. 2: Computing the height of a binary tree

h(T) = max{h(T_L), h(T_R)} + 1 if T ≠ ∅, and h(∅) = -1

Efficiency: Θ(n)
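The height recurrence can be sketched with a minimal node class (the `Node` class is illustrative, not from the source):

```python
class Node:
    """A binary tree node; an absent child is represented by None."""
    def __init__(self, left=None, right=None):
        self.left, self.right = left, right

def height(t):
    """Height of a binary tree; the empty tree (None) has height -1,
    matching h(empty) = -1 in the recurrence above."""
    if t is None:
        return -1
    return max(height(t.left), height(t.right)) + 1
```

Each node is visited once, so the running time is Θ(n), as stated.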

Multiplication of Large Integers

Consider the problem of multiplying two (large) n-digit integers represented by arrays of their digits, such as:

A = 12345678901357986429
B = 87654321284820912836

The grade-school algorithm forms one row of partial products per digit of B:

         a1 a2 … an
      ×  b1 b2 … bn
  (d10) d11 d12 … d1n
  (d20) d21 d22 … d2n
   …  …  …  …
  (dn0) dn1 dn2 … dnn

Efficiency: n^2 one-digit multiplications

First Divide-and-Conquer Algorithm

A small example: A · B where A = 2135 and B = 4014

A = (21·10^2 + 35), B = (40·10^2 + 14)

So, A · B = (21·10^2 + 35) · (40·10^2 + 14)
= 21·40·10^4 + (21·14 + 35·40)·10^2 + 35·14

In general, if A = A1 A2 and B = B1 B2 (where A and B are n-digit numbers and A1, A2, B1, B2 are n/2-digit numbers),

A · B = A1·B1·10^n + (A1·B2 + A2·B1)·10^(n/2) + A2·B2

Recurrence for the number of one-digit multiplications M(n):

M(n) = 4M(n/2), M(1) = 1

Solution: M(n) = n^2

Second Divide-and-Conquer Algorithm

A · B = A1·B1·10^n + (A1·B2 + A2·B1)·10^(n/2) + A2·B2

The idea is to decrease the number of multiplications from 4 to 3:

(A1 + A2) · (B1 + B2) = A1·B1 + (A1·B2 + A2·B1) + A2·B2,

i.e., (A1·B2 + A2·B1) = (A1 + A2) · (B1 + B2) - A1·B1 - A2·B2,

which requires only 3 multiplications at the expense of (4 - 1) extra additions/subtractions.

Recurrence for the number of multiplications M(n):

M(n) = 3M(n/2), M(1) = 1

Solution (by backward substitutions, using a^(log_b c) = c^(log_b a)):

M(n) = 3^(log_2 n) = n^(log_2 3) ≈ n^1.585
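This three-multiplication scheme is Karatsuba's algorithm; it can be sketched on Python integers (an illustrative sketch; the function name and digit-splitting details are assumptions, not the book's pseudocode):

```python
def karatsuba(x, y):
    """Multiply nonnegative integers using 3 recursive multiplications
    instead of 4, per the identity above."""
    if x < 10 or y < 10:              # one-digit base case
        return x * y
    n = max(len(str(x)), len(str(y)))
    half = n // 2
    p = 10 ** half
    x1, x2 = divmod(x, p)             # x = x1 * 10^half + x2
    y1, y2 = divmod(y, p)
    a = karatsuba(x1, y1)             # A1 * B1
    b = karatsuba(x2, y2)             # A2 * B2
    c = karatsuba(x1 + x2, y1 + y2)   # (A1 + A2) * (B1 + B2)
    # middle term recovered as c - a - b, at the cost of extra add/sub
    return a * p * p + (c - a - b) * p + b
```

For the slide's example, karatsuba(2135, 4014) computes 21·40, 35·14, and 56·54 at the top level, exactly the three products the identity prescribes.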

Example of Large-Integer Multiplication

2135 · 4014

Strassen’s Matrix Multiplication

Strassen [1969] observed that the product of two matrices can be computed as follows:

[C00 C01]   [A00 A01]   [B00 B01]
[C10 C11] = [A10 A11] * [B10 B11]

  [M1 + M4 - M5 + M7    M3 + M5           ]
= [M2 + M4              M1 + M3 - M2 + M6 ]

This requires 7 multiplications, whereas brute force requires 8.

Formulas for Strassen’s Algorithm

M1 = (A00 + A11) · (B00 + B11)
M2 = (A10 + A11) · B00
M3 = A00 · (B01 - B11)
M4 = A11 · (B10 - B00)
M5 = (A00 + A01) · B11
M6 = (A10 - A00) · (B00 + B01)
M7 = (A01 - A11) · (B10 + B11)
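As a quick check, the seven products can be verified against the direct 2-by-2 product (an illustrative sketch on scalar entries, not a full recursive implementation):

```python
def strassen_2x2(A, B):
    """Multiply 2x2 matrices (nested lists) using Strassen's 7 products."""
    (a00, a01), (a10, a11) = A
    (b00, b01), (b10, b11) = B
    m1 = (a00 + a11) * (b00 + b11)
    m2 = (a10 + a11) * b00
    m3 = a00 * (b01 - b11)
    m4 = a11 * (b10 - b00)
    m5 = (a00 + a01) * b11
    m6 = (a10 - a00) * (b00 + b01)
    m7 = (a01 - a11) * (b10 + b11)
    # combine per the C-matrix formulas on the previous slide
    return [[m1 + m4 - m5 + m7, m3 + m5],
            [m2 + m4, m1 + m3 - m2 + m6]]
```

In the full algorithm the entries a00, …, b11 are themselves (n/2)-by-(n/2) blocks and the seven products are computed recursively.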

Analysis of Strassen’s Algorithm

If n is not a power of 2, matrices can be padded with zeros.

Number of multiplications: M(n) = 7M(n/2), M(1) = 1

Solution: M(n) = 7^(log_2 n) = n^(log_2 7) ≈ n^2.807, vs. n^3 for the brute-force algorithm.

Algorithms with better asymptotic efficiency are known but they are even more complex.

Closest-Pair Problem by Divide-and-Conquer

Step 1 Divide the given points into two subsets S1 and S2 by a vertical line x = c so that half the points lie to the left of or on the line and half lie to the right of or on the line.

We can assume the points are sorted by x-coordinate (mergesort may be used, O(n log n)). The median of the x-coordinates can be used as c.

Closest Pair by Divide-and-Conquer (cont.)

Step 2 Find recursively the closest pairs for the left and right subsets.

Step 3 Set d = min{d 1 , d 2 } We can limit our attention to the points in the symmetric vertical strip of width 2d as possible closest pair. Let C 1 and C 2 be the subsets of points in the left subset S 1 and of the right subset S 2 , respectively, that lie in this vertical strip. The points in C 1 and C 2 are stored in increasing order of their y coordinates, which is maintained by merging during the execution of the next step.

Step 4 For every point P(x, y) in C1, we inspect the points in C2 that may be closer to P than d. There can be no more than 6 such points (because the points of C2 are at least d2 ≥ d apart from one another)!

Closest Pair by Divide-and-Conquer: Worst Case

The worst case scenario is depicted below:

Efficiency of the Closest-Pair Algorithm

The running time of the algorithm is described by

T(n) = 2T(n/2) + M(n), where M(n) ∈ O(n)

(M(n) is the time for merging the solutions). By the Master Theorem (with a = 2, b = 2, d = 1),

T(n) ∈ O(n log n)
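The four steps can be sketched compactly (an illustrative sketch: for simplicity the strip is re-sorted by y inside each call, giving O(n log^2 n) rather than the O(n log n) obtained by merging y-orders as in Step 3):

```python
from math import hypot, inf

def closest_pair(points):
    """Distance of the closest pair of 2-D points, by divide-and-conquer."""
    pts = sorted(points)                      # sort once by x (Step 1)

    def solve(p):
        n = len(p)
        if n <= 3:                            # brute force small instances
            return min((hypot(a[0] - b[0], a[1] - b[1])
                        for i, a in enumerate(p) for b in p[i + 1:]),
                       default=inf)
        mid = n // 2
        c = p[mid][0]                         # dividing vertical line x = c
        d = min(solve(p[:mid]), solve(p[mid:]))   # Steps 2-3
        # Step 4: only points within the strip of width 2d matter
        strip = sorted((q for q in p if abs(q[0] - c) < d),
                       key=lambda q: q[1])
        for i, a in enumerate(strip):
            for b in strip[i + 1:]:           # at most a few neighbors each
                if b[1] - a[1] >= d:
                    break
                d = min(d, hypot(a[0] - b[0], a[1] - b[1]))
        return d

    return solve(pts)
```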

Brute-Force Convex Hull

[Diagram: points P1 … P8 in the plane]

Simple but inefficient:

Two points P1 and P2 are part of the hull's boundary if and only if all the other points lie on a single side of the segment P1P2 (assuming no 3 points are collinear).

The line through (x1, y1) and (x2, y2) is defined by ax + by = c, where a = y2 - y1, b = x1 - x2, c = x1·y2 - y1·x2.

This line divides the plane into two half-planes: for all points in one of them, ax + by > c; for the other, ax + by < c.
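The side-of-the-line test can be sketched directly (an illustrative sketch; with O(n^2) pairs and a linear scan per pair, it is O(n^3) overall):

```python
def brute_force_hull_edges(points):
    """Return the segments (P, Q) on the hull boundary: those for which
    all other points lie strictly on one side of the line PQ
    (assumes no 3 points are collinear)."""
    edges = []
    for i, (x1, y1) in enumerate(points):
        for (x2, y2) in points[i + 1:]:
            a, b = y2 - y1, x1 - x2
            c = x1 * y2 - y1 * x2     # line through the pair: a*x + b*y = c
            sides = {(a * x + b * y > c)
                     for (x, y) in points
                     if (x, y) not in ((x1, y1), (x2, y2))}
            if len(sides) <= 1:       # every other point on one side
                edges.append(((x1, y1), (x2, y2)))
    return edges
```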

Quickhull Algorithm

Convex hull: the smallest convex set that includes the given points

Assume the points are sorted by x-coordinate values

Identify the extreme points P1 and P2 (leftmost and rightmost)

Compute the upper hull recursively:
• find the point Pmax that is farthest away from the line P1P2
• compute the upper hull of the points to the left of the line P1Pmax
• compute the upper hull of the points to the left of the line PmaxP2

Compute the lower hull in a similar manner

[Diagram: P1, Pmax, and P2, with the upper hull above the line P1P2]
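The recursion above can be sketched with the cross product standing in for the distance to the line (an illustrative sketch; the cross product is proportional to that distance, so its maximum identifies Pmax):

```python
def quickhull(points):
    """Convex hull vertices, in order, by the quickhull recursion."""
    def cross(o, a, b):
        # > 0 iff b lies strictly to the left of the directed line o -> a
        return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])

    def hull_side(p1, p2, pts):
        # keep only points strictly left of p1 -> p2, recurse around Pmax
        left = [p for p in pts if cross(p1, p2, p) > 0]
        if not left:
            return []
        pmax = max(left, key=lambda p: cross(p1, p2, p))  # farthest point
        return hull_side(p1, pmax, left) + [pmax] + hull_side(pmax, p2, left)

    pts = sorted(points)
    p1, p2 = pts[0], pts[-1]          # leftmost and rightmost points
    upper = hull_side(p1, p2, pts)    # upper hull
    lower = hull_side(p2, p1, pts)    # lower hull, by symmetry
    return [p1] + upper + [p2] + lower
```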

Efficiency of Quickhull Algorithm

Finding the point farthest away from the line P1P2 can be done in linear time

Time efficiency:
• worst case: Θ(n^2) (as quicksort)
• average case: Θ(????) (under reasonable assumptions about the distribution of the given points)

If the points are not initially sorted by x-coordinate value, this can be accomplished in O(n log n) time

Several O(n log n) algorithms for convex hull are known