Transcript Slides

CS 3343: Analysis of Algorithms
Lecture 16: Binary search trees & red-black trees
Review: Hash tables
[Figure: a hash table T with slots 0 .. m−1. The universe U of keys is much larger than the set K of actual keys and than the table size (|U| ≫ |K|, |U| ≫ m); h maps keys such as k1, k3, k4 to slots, and a collision occurs when h(k2) = h(k5).]
• Problem: collision
Chaining
• Chaining puts elements that hash to the
same slot in a linked list:
[Figure: the same universe U and actual keys K; each slot of T now points to a linked list of the keys that hash to that slot.]
Hashing with Chaining
• Chained-Hash-Insert (T, x)
– Insert x at the head of list T[h(key[x])].
– Worst-case complexity – O(1).
• Chained-Hash-Delete (T, x)
– Delete x from the list T[h(key[x])].
– Worst-case complexity – proportional to length of list with
singly-linked lists. O(1) with doubly-linked lists.
• Chained-Hash-Search (T, k)
– Search an element with key k in list T[h(k)].
– Worst-case complexity – proportional to length of list.
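To make the three operations concrete, here is a minimal Python sketch of a chained hash table (the class name, default table size, and use of Python lists as chains are illustrative choices, not from the slides):

# Minimal sketch of hashing with chaining (illustrative names).
class ChainedHashTable:
    def __init__(self, m=8):
        self.m = m
        self.table = [[] for _ in range(m)]   # T[0..m-1], each slot is a chain

    def _h(self, key):
        return hash(key) % self.m             # stand-in for a real hash function

    def insert(self, key, value):
        # Chained-Hash-Insert: put the element at the head of T[h(key)] -- O(1).
        self.table[self._h(key)].insert(0, (key, value))

    def search(self, key):
        # Chained-Hash-Search: scan the chain T[h(key)] -- proportional to chain length.
        for k, v in self.table[self._h(key)]:
            if k == key:
                return v
        return None

    def delete(self, key):
        # Chained-Hash-Delete: with a Python list this is a scan, not the O(1)
        # doubly-linked-list deletion mentioned on the slide.
        chain = self.table[self._h(key)]
        self.table[self._h(key)] = [(k, v) for k, v in chain if k != key]

(Python's built-in dict uses open addressing rather than chaining; the sketch above is only meant to mirror the slide's operations.)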
Analysis of Chaining
• Assume simple uniform hashing: each key in
table is equally likely to be hashed to any slot
• Given n keys and m slots in the table, the
load factor α = n/m = average # keys per slot
• Average cost of an unsuccessful search for a
key is Θ(1 + α) (Theorem 11.1)
• Average cost of a successful search is
Θ(2 + α/2) = Θ(1 + α) (Theorem 11.2)
• If the number of keys n is proportional to the
number of slots in the table, α = n/m = O(1)
– The expected cost of searching is constant if α is
constant
Hash Functions:
The Division Method
• h(k) = k mod m
– In words: hash k into a table with m slots using the slot given by
the remainder of k divided by m
– Example: m = 31 and k = 78 => h(k) = 16.
• Advantage: fast
• Disadvantage: value of m is critical
– Bad if keys bear relation to m
– Or if hash does not depend on all bits of k
• Pick m = prime number not too close to power of 2 (or 10)
Hash Functions:
The Multiplication Method
• For a constant A, 0 < A < 1:
• h(k) = ⌊m (kA mod 1)⌋ = ⌊m (kA − ⌊kA⌋)⌋
  (kA mod 1 is the fractional part of kA)
• Advantage: value of m is not critical
• Disadvantage: relatively slower
• Choose m = 2^p for easier implementation
• Choose A not too close to 0 or 1
• Knuth: good choice for A = (√5 − 1)/2
• Example: m = 1024, k = 123, A ≈ 0.6180339887…
  h(k) = ⌊1024 · (123 · 0.6180339887 mod 1)⌋
       = ⌊1024 · 0.018169…⌋ = 18.
A Universal Hash Function
• Choose a prime number p that is larger than all possible
keys
• Choose table size m ≥ n
• Randomly choose two integers a, b, such that
  1 ≤ a ≤ p − 1 and 0 ≤ b ≤ p − 1
• ha,b(k) = ((ak + b) mod p) mod m
• Example: p = 17, m = 6
  h3,4(8) = ((3·8 + 4) mod 17) mod 6 = 11 mod 6 = 5
• With a random pair of parameters a, b, the chance of a
collision between x and y is at most 1/m
• Expected search time for any input is Θ(1)
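A Python sketch of picking a random (a, b) pair once and reusing it, with the slide's concrete example as a check (function name is illustrative):

import random

def make_universal_hash(p, m):
    # Choose a and b at random; the resulting h_{a,b} is then fixed for the table.
    a = random.randint(1, p - 1)
    b = random.randint(0, p - 1)
    return lambda k: ((a * k + b) % p) % m

# The slide's concrete example: p = 17, m = 6, a = 3, b = 4.
h = lambda k: ((3 * k + 4) % 17) % 6
print(h(8))   # 5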
Today
• Binary search trees
• Red-black trees
Binary Search Trees
• Data structures that can support
dynamic set operations.
– Search, Minimum, Maximum, Predecessor,
Successor, Insert, and Delete.
• Can be used to build
– Dictionaries.
– Priority Queues.
• Basic operations take time proportional to
the height of the tree – O(h).
BST – Representation
• Represented by a linked data structure of nodes.
• root(T) points to the root of tree T.
• Each node contains fields:
– Key
– left – pointer to left child: root of left subtree (may be NIL)
– right – pointer to right child: root of right subtree (may be NIL)
– p – pointer to parent. p[root[T]] = NIL (optional).
– Satellite data
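A possible Python representation of these fields, using None to play the role of NIL (class and attribute names are illustrative and are reused in the sketches that follow):

class Node:
    """One BST node: key, left, right, p (parent), and satellite data."""
    def __init__(self, key, data=None):
        self.key = key
        self.data = data      # satellite data
        self.left = None      # root of left subtree (None plays the role of NIL)
        self.right = None     # root of right subtree
        self.p = None         # parent; the root's parent is None

class BST:
    def __init__(self):
        self.root = None      # root(T)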
Binary Search Tree Property
• Stored keys must
satisfy the binary
search tree property.
– ∀ y in the left subtree of x,
key[y] ≤ key[x].
– ∀ y in the right subtree of x,
key[y] ≥ key[x].
[Figure: example BST with root 56, containing the keys 12, 18, 24, 26, 27, 28, 56, 190, 200, 213.]
Inorder Traversal
The binary-search-tree property allows the keys of a binary search
tree to be printed, in (monotonically increasing) order, recursively.
Inorder-Tree-Walk (x)
1. if x ≠ NIL
2.    then Inorder-Tree-Walk(left[x])
3.         print key[x]
4.         Inorder-Tree-Walk(right[x])
How long does the walk take?
Θ(n)
[Figure: the example BST; the inorder walk prints 12, 18, 24, 26, 27, 28, 56, 190, 200, 213.]
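The walk translates directly into Python (a sketch assuming the Node representation from earlier, with None as NIL):

def inorder_tree_walk(x):
    # Prints the keys of the subtree rooted at x in sorted order; Theta(n) overall.
    if x is not None:
        inorder_tree_walk(x.left)
        print(x.key)
        inorder_tree_walk(x.right)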
Tree Search
Tree-Search(x, k)
1. if x = NIL or k = key[x]
2. then return x
3. if k < key[x]
4. then return Tree-Search(left[x], k)
5. else return Tree-Search(right[x], k)
Example: search for 27
Running time: O(h)
[Figure: the search for 27 follows the path 56 → 26 → 28 → 27 in the example BST.]
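A direct Python translation of Tree-Search (again a sketch, with None as NIL):

def tree_search(x, k):
    # Returns the node with key k in the subtree rooted at x, or None; O(h) time.
    if x is None or k == x.key:
        return x
    if k < x.key:
        return tree_search(x.left, k)
    return tree_search(x.right, k)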
Iterative Tree Search
Iterative-Tree-Search(x, k)
1. while x ≠ NIL and k ≠ key[x]
2.    do if k < key[x]
3.          then x ← left[x]
4.          else x ← right[x]
5. return x
The iterative tree search is more efficient on most computers.
The recursive tree search is more straightforward.
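And the iterative version in Python (same sketch conventions as above):

def iterative_tree_search(x, k):
    # Same result as the recursive version, but with a loop instead of recursion.
    while x is not None and k != x.key:
        x = x.left if k < x.key else x.right
    return x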
Finding Min & Max
The binary-search-tree property guarantees that:
» The minimum is located at the left-most node.
» The maximum is located at the right-most node.
Tree-Minimum(x)
1. while left[x] ≠ NIL
2.    do x ← left[x]
3. return x
Tree-Maximum(x)
1. while right[x] ≠ NIL
2.    do x ← right[x]
3. return x
Q: How long do they take?
Predecessor and Successor
• Successor of node x is the node y such that key[y] is the
smallest key greater than key[x].
• The successor of the largest key is NIL.
• Search consists of two cases.
– If node x has a non-empty right subtree, then x’s successor is the
minimum in the right subtree of x.
– If node x has an empty right subtree, then:
• As long as we move to the left up the tree (move up through right
children), we are visiting smaller keys.
• x’s successor y is the node that x is the predecessor of (x is the
maximum in y’s left subtree).
• In other words, x’s successor y, is the lowest ancestor of x whose left
child is also an ancestor of x.
Pseudo-code for Successor
Tree-Successor(x)
1. if right[x] ≠ NIL
2.    then return Tree-Minimum(right[x])
3. y ← p[x]
4. while y ≠ NIL and x = right[y]
5.    do x ← y
6.       y ← p[y]
7. return y
Example: successor of 56
Code for predecessor is symmetric.
Running time: O(h)
[Figure: the successor of 56 is 190, the minimum of 56's right subtree.]
Pseudo-code for Successor
Tree-Successor(x)
1. if right[x] ≠ NIL
2.    then return Tree-Minimum(right[x])
3. y ← p[x]
4. while y ≠ NIL and x = right[y]
5.    do x ← y
6.       y ← p[y]
7. return y
Example: successor of 28
Code for predecessor is symmetric.
Running time: O(h)
[Figure: the successor of 28 is 56 — the lowest node whose left child is an ancestor of x.]
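A Python sketch of Tree-Minimum and Tree-Successor matching the pseudocode above (this also covers the Finding Min & Max slide; None plays the role of NIL):

def tree_minimum(x):
    # Follow left pointers to the left-most node of the subtree rooted at x.
    while x.left is not None:
        x = x.left
    return x

def tree_successor(x):
    # Case 1: x has a right subtree -> the successor is its minimum.
    if x.right is not None:
        return tree_minimum(x.right)
    # Case 2: walk up until we move up from a left child; that parent is the successor.
    y = x.p
    while y is not None and x is y.right:
        x, y = y, y.p
    return y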
BST Insertion – Pseudocode
• Change the dynamic set
represented by a BST.
• Ensure the binary-search-tree
property holds after the change.
• Similar to Tree-Search.
• Insert z in place of NIL.
e.g. insert 195
[Figure: the example BST; 195 is inserted as the right child of 190.]
Tree-Insert(T, z)
1.  y ← NIL
2.  x ← root[T]
3.  while x ≠ NIL
4.     do y ← x
5.        if key[z] < key[x]
6.           then x ← left[x]
7.           else x ← right[x]
8.  p[z] ← y
9.  if y = NIL
10.    then root[T] ← z
11.    else if key[z] < key[y]
12.            then left[y] ← z
13.            else right[y] ← z
Running time: O(h)
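A Python sketch of Tree-Insert, assuming the BST/Node representation from earlier (T.root in place of root[T], None in place of NIL):

def tree_insert(T, z):
    # T is a BST object with a .root; z is a detached Node. O(h) time.
    y = None
    x = T.root
    while x is not None:          # walk down as in Tree-Search, remembering the parent
        y = x
        x = x.left if z.key < x.key else x.right
    z.p = y
    if y is None:
        T.root = z                # tree was empty
    elif z.key < y.key:
        y.left = z
    else:
        y.right = z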
Tree-Delete (T, x)
if x has no children                 → case 0
   then remove x
if x has one child                   → case 1
   then make p[x] point to the child
if x has two children (subtrees)     → case 2
   then swap x with its successor
        perform case 0 or case 1 to delete it
TOTAL: O(h) time to delete a node
Case 0
• X has no children
• e.g. delete 190
[Figure: 190 is a leaf of the example tree; it is simply removed.]
Case 1
• X has one child
• e.g. delete 28
[Figure: 28 has a single child, 27; p[28] is made to point to 27.]
Case 2
• X has two children
• e.g. delete 26
[Figure: 26 has two children in the example tree.]
Case 2
• X has two children
• e.g. delete 26
[Figure: 26 is swapped with its successor 27.]
Case 2
• X has two children
• e.g. delete 26
[Figure: after the swap, 26 has no children and is deleted via case 0.]
Case 2
• X has two children
• e.g. delete 26
[Figure: a modified example tree (containing 33) in which 26's successor 27 has a right child, 28.]
Case 2
• X has two children
• e.g. delete 26
[Figure: 26 is swapped with its successor 27.]
Case 2
• X has two children
• e.g. delete 26
[Figure: after the swap, 26 has one child (28) and is deleted via case 1.]
Case 2
• X has two children
• e.g. delete 26
[Figure: the resulting tree after 26 is deleted (case 1).]
Correctness of Tree-Delete
• How do we know case 2 should go to case 0 or
case 1 instead of back to case 2?
– Because when x has 2 children, its successor
is the minimum in its right subtree, and that
successor has no left child (hence 0 or 1
child).
• Equivalently, we could swap with predecessor
instead of the successor. It might be good to
alternate between the two to avoid creating a lopsided tree.
Deletion – Pseudocode
Tree-Delete(T, z)
/* Determine which node to splice out: either z or z's successor. */
1. if left[z] = NIL or right[z] = NIL
2.    then y ← z                      // case 0 or 1
3.    else y ← Tree-Successor(z)      // case 2
/* Set x to a non-NIL child of y, or to NIL if y has no children. */
4. if left[y] ≠ NIL
5.    then x ← left[y]
6.    else x ← right[y]
/* y is removed from the tree by manipulating pointers of p[y] and x. */
7. if x ≠ NIL
8.    then p[x] ← p[y]
/* Continued on next slide */
Deletion – Pseudocode
Tree-Delete(T, z) (Contd. from previous slide)
9.  if p[y] = NIL
10.    then root[T] ← x
11.    else if y = left[p[y]]
12.            then left[p[y]] ← x
13.            else right[p[y]] ← x
/* If z's successor was spliced out, copy its data into z. */
14. if y ≠ z
15.    then key[z] ← key[y]
16.         copy y's satellite data into z
17. return y
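A Python sketch of the same splice-out logic, reusing the tree_successor sketch from earlier (None plays the role of NIL; the spliced-out node y is returned):

def tree_delete(T, z):
    # Splice out z itself if it has at most one child; otherwise splice out its successor.
    if z.left is None or z.right is None:
        y = z
    else:
        y = tree_successor(z)
    # x is y's only child, or None if y has no children.
    x = y.left if y.left is not None else y.right
    # Remove y by linking x to y's parent.
    if x is not None:
        x.p = y.p
    if y.p is None:
        T.root = x
    elif y is y.p.left:
        y.p.left = x
    else:
        y.p.right = x
    # If the successor was spliced out, copy its key and satellite data into z.
    if y is not z:
        z.key = y.key
        z.data = y.data
    return y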
Querying a Binary Search Tree
• All dynamic-set search operations can be
supported in O(h) time.
• h = Θ(lg n) for a balanced binary tree (and for an
average tree built by adding nodes in random
order).
• h = Θ(n) for an unbalanced tree that resembles a
linear chain of n nodes in the worst case.
Red-black trees: Overview
• Red-black trees are a variation of binary
search trees to ensure that the tree is
balanced.
– Height is O(lg n), where n is the number of
nodes.
• Operations take O(lg n) time in the worst
case.
Red-black Tree
• Binary search tree + 1 bit per node: the
attribute color, which is either red or black.
• All other attributes of BSTs are inherited:
– key, left, right, and p.
• All empty trees (leaves) are colored black.
– We use a single sentinel, nil, for all the leaves
of red-black tree T, with color[nil] = black.
– The root’s parent is also nil[T ].
Red-black Tree – Example
[Figure: a red-black tree with root 26 and internal nodes 17, 41, 30, 38, 47, 50; all leaves point to the single sentinel nil[T].]
Remember: every internal node has two children, even though nil leaves are not usually shown.
Red-black Properties
1. Every node is either red or black.
2. The root is black.
3. Every leaf (nil) is black.
4. If a node is red, then both its children are black.
5. For each node, all paths from the node to
descendant leaves contain the same
number of black nodes.
Height of a Red-black Tree
• Height of a node:
– Number of edges in a longest path to a leaf.
• Black-height of a node x, bh(x):
– bh(x) is the number of black nodes (including
nil[T]) on the path from x to a leaf, not counting x.
• The black-height of a red-black tree is the black-height of its root.
– By Property 5, black height is well defined.
Height of a Red-black Tree
• Example: the figure shows the example tree with h and bh
annotated at each node (the root 26 has h = 4, bh = 2).
• Height of a node:
h(x) = # of edges in a longest path to a leaf.
• Black-height of a node:
bh(x) = # of black nodes on the path from x
to a leaf, not counting x.
• How are they related?
– bh(x) ≤ h(x) ≤ 2 bh(x)
Height of a red-black tree
Theorem. A red-black tree with n keys has height
h ≤ 2 log(n + 1).
Proof. (The book uses induction. Read carefully.)
INTUITION:
• Merge red nodes into their black parents.
• This process produces a tree in which each node
has 2, 3, or 4 children.
• The 2-3-4 tree has uniform depth h′ of leaves.
Proof (continued)
• We have h′ ≥ h/2, since at most half the nodes
on any root-to-leaf path are red.
• The number of internal nodes in each tree is n
⇒ n ≥ 2^h′ − 1
⇒ log(n + 1) ≥ h′ ≥ h/2
⇒ h ≤ 2 log(n + 1).
Operations on RB Trees
• All operations can be performed in O(lg n)
time.
• The query operations, which don’t modify
the tree, are performed in exactly the same
way as they are in BSTs.
• Insertion and Deletion are not
straightforward. Why?
Rotations
[Figure: Left-Rotate(T, x) turns node x with right child y into y with left child x; Right-Rotate(T, y) is the inverse. The subtrees α, β, γ are reattached so the BST property is preserved.]
Rotations
• Rotations are the basic tree-restructuring operation
for almost all balanced search trees.
• Rotation takes a red-black tree and a node,
• Changes pointers to change the local structure, and
• Won’t violate the binary-search-tree property.
• Left rotation and right rotation are inverses.
[Figure: the Left-Rotate(T, x) / Right-Rotate(T, y) diagram, as above.]
Left Rotation – Pseudo-code
Left-Rotate (T, x)
1.  y ← right[x]              // Set y.
2.  right[x] ← left[y]        // Turn y's left subtree into x's right subtree.
3.  if left[y] ≠ nil[T]
4.     then p[left[y]] ← x
5.  p[y] ← p[x]               // Link x's parent to y.
6.  if p[x] = nil[T]
7.     then root[T] ← y
8.     else if x = left[p[x]]
9.             then left[p[x]] ← y
10.            else right[p[x]] ← y
11. left[y] ← x               // Put x on y's left.
12. p[x] ← y
Rotation
• The pseudo-code for Left-Rotate assumes that
– right[x] ≠ nil[T ], and
– root's parent is nil[T ].
• Left Rotation on x makes x the left child of y, and turns
the left subtree of y into the right subtree of x.
• Pseudocode for Right-Rotate is symmetric:
exchange left and right everywhere.
• Time: O(1) for both Left-Rotate and Right-Rotate,
since a constant number of pointers are modified.
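A Python sketch of both rotations, mirroring the pseudocode above (None plays the role of nil[T], and T.root of root[T]):

def left_rotate(T, x):
    # Assumes x.right is not None.
    y = x.right
    x.right = y.left              # turn y's left subtree into x's right subtree
    if y.left is not None:
        y.left.p = x
    y.p = x.p                     # link x's parent to y
    if x.p is None:
        T.root = y
    elif x is x.p.left:
        x.p.left = y
    else:
        x.p.right = y
    y.left = x                    # put x on y's left
    x.p = y

def right_rotate(T, y):
    # Symmetric: exchange left and right everywhere. Assumes y.left is not None.
    x = y.left
    y.left = x.right
    if x.right is not None:
        x.right.p = y
    x.p = y.p
    if y.p is None:
        T.root = x
    elif y is y.p.left:
        y.p.left = x
    else:
        y.p.right = x
    x.right = y
    y.p = x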
Reminder: Red-black Properties
1. Every node is either red or black.
2. The root is black.
3. Every leaf (nil) is black.
4. If a node is red, then both its children are black.
5. For each node, all paths from the node to
descendant leaves contain the same
number of black nodes.
Insertion in RB Trees
• Insertion must preserve all red-black properties.
• Should an inserted node be colored Red? Black?
• Basic steps:
– Use Tree-Insert from BST (slightly modified) to insert a
node x into T.
• Procedure RB-Insert(x).
– Color the node x red.
– Fix the modified tree by re-coloring nodes and
performing rotation to preserve RB tree property.
• Procedure RB-Insert-Fixup.
Insertion
RB-Insert(T, z)
1.  y ← nil[T]
2.  x ← root[T]
3.  while x ≠ nil[T]
4.      do y ← x
5.         if key[z] < key[x]
6.            then x ← left[x]
7.            else x ← right[x]
8.  p[z] ← y
9.  if y = nil[T]
10.    then root[T] ← z
11.    else if key[z] < key[y]
12.            then left[y] ← z
13.            else right[y] ← z
RB-Insert(T, z) Contd.
14. left[z] ← nil[T]
15. right[z] ← nil[T]
16. color[z] ← RED
17. RB-Insert-Fixup (T, z)
How does it differ from the
Tree-Insert procedure of BSTs?
Which of the RB properties
might be violated?
Fix the violations by calling
RB-Insert-Fixup.
Insertion – Fixup
• Problem: we may have one pair of
consecutive reds where we did the
insertion.
• Solution: rotate it up the tree and away…
Three cases have to be handled…
Case 1 – uncle y is red
[Figure: p[z] and the uncle y are recolored black, the grandparent p[p[z]] is recolored red, and p[p[z]] becomes the new z. Here z is a right child and p[z] is a left child; similar if z is a left child or if p[z] is a right child.]
• p[p[z]] (z's grandparent) must be black, since z and p[z] are both red and there
are no other violations of property 4.
• Make p[z] and y black ⇒ now z and p[z] are NOT both red. But property 5
might now be violated.
• Make p[p[z]] red ⇒ restores property 5.
• What's the new problem now?
• The next iteration has p[p[z]] as the new z (i.e., z moves up 2 levels).
• When to stop?
Case 2 – y is black, z is a right child
[Figure: a left rotation around p[z] straightens the red z–p[z] pair; the subtrees are reattached.]
• Left rotate around p[z]: p[z] and z switch roles ⇒ now z is
a left child, and both z and p[z] are red.
• Takes us immediately to case 3.
• Similar if z is a left child and p[z] is a right child.
Case 3 – y is black, z is a left child
[Figure: recolor p[z] black and p[p[z]] red, then right-rotate on p[p[z]]; the uncle y stays black.]
• Make p[z] black and p[p[z]] red.
• Then right rotate on p[p[z]]. Ensures property 4 is maintained.
• No longer have 2 reds in a row.
• p[z] is now black ⇒ no more iterations.
• Similar if both z and p[z] are right children.
Insertion – Fixup
RB-Insert-Fixup (T, z)
1. while color[p[z]] = RED
2.    do if p[z] = left[p[p[z]]]             // p[z] is a left child
3.          then y ← right[p[p[z]]]          // y: uncle
4.               if color[y] = RED
5.                  then color[p[z]] ← BLACK       // Case 1
6.                       color[y] ← BLACK          // Case 1
7.                       color[p[p[z]]] ← RED      // Case 1
8.                       z ← p[p[z]]               // Case 1
Insertion – Fixup
RB-Insert-Fixup(T, z) (Contd.)
9.                  else if z = right[p[z]]        // color[y] ≠ RED
10.                         then z ← p[z]              // Case 2
11.                              LEFT-ROTATE(T, z)     // Case 2
12.                         color[p[z]] ← BLACK        // Case 3
13.                         color[p[p[z]]] ← RED       // Case 3
14.                         RIGHT-ROTATE(T, p[p[z]])   // Case 3
15.          else (if p[z] = right[p[p[z]]]) (same as lines 3–14
16.               with "right" and "left" exchanged)
17. color[root[T]] ← BLACK
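Putting the pieces together, here is a Python sketch of RB-Insert and RB-Insert-Fixup, building on the earlier Node/BST and left_rotate/right_rotate sketches and assuming each node also carries a color attribute; None children are treated as black nil leaves:

RED, BLACK = "red", "black"

def rb_insert(T, z):
    # Ordinary BST insertion, then color z red and repair the red-black properties.
    y = None
    x = T.root
    while x is not None:
        y = x
        x = x.left if z.key < x.key else x.right
    z.p = y
    if y is None:
        T.root = z
    elif z.key < y.key:
        y.left = z
    else:
        y.right = z
    z.left = z.right = None
    z.color = RED
    rb_insert_fixup(T, z)

def rb_insert_fixup(T, z):
    def is_red(n):
        return n is not None and n.color == RED

    while is_red(z.p):
        g = z.p.p                    # exists: z.p is red, so it is not the (black) root
        if z.p is g.left:
            y = g.right              # uncle
            if is_red(y):            # Case 1: recolor, move z up two levels
                z.p.color = BLACK
                y.color = BLACK
                g.color = RED
                z = g
            else:
                if z is z.p.right:   # Case 2: rotate so z and p[z] form a straight line
                    z = z.p
                    left_rotate(T, z)
                z.p.color = BLACK    # Case 3: recolor, then rotate the grandparent
                z.p.p.color = RED
                right_rotate(T, z.p.p)
        else:                        # mirror image: p[z] is a right child
            y = g.left
            if is_red(y):
                z.p.color = BLACK
                y.color = BLACK
                g.color = RED
                z = g
            else:
                if z is z.p.left:
                    z = z.p
                    right_rotate(T, z)
                z.p.color = BLACK
                z.p.p.color = RED
                left_rotate(T, z.p.p)
    T.root.color = BLACK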
Algorithm Analysis
• O(lg n) time to get through RB-Insert up to
the call of RB-Insert-Fixup.
• Within RB-Insert-Fixup:
– Each iteration takes O(1) time.
– Each iteration but the last moves z up 2 levels.
– O(lg n) levels ⇒ O(lg n) time.
– Thus, insertion in a red-black tree takes O(lg n)
time.
– Note: there are at most 2 rotations overall.
RB-Insert Example
IDEA: Insert x in tree. Color x red. Only red-black
property 4 might be violated. Move the violation up the
tree by recoloring until it can be fixed with rotations
and recoloring.
Example:
• Insert x = 15.
[Figure: example red-black tree with keys 3, 7, 8, 10, 11, 18, 22, 26; 15 is inserted as a new red node.]
Insertion into a red-black tree
IDEA: Insert x in tree. Color x red. Only red-black
property 4 might be violated. Move the violation up the
tree by recoloring until it can be fixed with rotations
and recoloring.
Example:
• Insert x = 15.
• Recolor, move the violation up the tree.
[Figure: Case 1 — the uncle is red, so recolor and move the violation up the tree.]
Insertion into a red-black tree
IDEA: Insert x in tree. Color x red. Only red-black
property 4 might be violated. Move the violation up the
tree by recoloring until it can be fixed with rotations
and recoloring.
Example:
• Insert x = 15.
• Recolor, move the violation up the tree.
[Figure: the tree after recoloring; the red-red violation has moved up.]
Insertion into a red-black tree
IDEA: Insert x in tree. Color x red. Only red-black
property 4 might be violated. Move the violation up the
tree by recoloring until it can be fixed with rotations
and recoloring.
Example:
• Insert x = 15.
• Recolor, move the violation up the tree.
• RIGHT-ROTATE(18).
[Figure: Case 2 — a right rotation at 18 is applied.]
Insertion into a red-black tree
IDEA: Insert x in tree. Color x red. Only red-black
property 4 might be violated. Move the violation up the
tree by recoloring until it can be fixed with rotations
and recoloring.
Example:
• Insert x = 15.
• Recolor, move the violation up the tree.
• RIGHT-ROTATE(18).
[Figure: the tree after RIGHT-ROTATE(18).]
Insertion into a red-black tree
IDEA: Insert x in tree. Color x red. Only red-black
property 4 might be violated. Move the violation up the
tree by recoloring until it can be fixed with rotations
and recoloring.
Example:
• Insert x = 15.
• Recolor, move the violation up the tree.
• RIGHT-ROTATE(18).
• LEFT-ROTATE(7)
[Figure: Case 3 — a left rotation at 7 is applied.]
Insertion into a red-black tree
IDEA: Insert x in tree. Color x red. Only red-black
property 4 might be violated. Move the violation up the
tree by recoloring until it can be fixed with rotations
and recoloring.
Example:
• Insert x = 15.
• Recolor, move the violation up the tree.
• RIGHT-ROTATE(18).
• LEFT-ROTATE(7)
[Figure: the tree after LEFT-ROTATE(7).]
Insertion into a red-black tree
IDEA: Insert x in tree. Color x red. Only red-black
property 4 might be violated. Move the violation up the
tree by recoloring until it can be fixed with rotations
and recoloring.
Example:
• Insert x = 15.
• Recolor, move the violation up the tree.
• RIGHT-ROTATE(18).
• LEFT-ROTATE(7) and recolor.
[Figure: the tree after the recoloring.]
Insertion into a red-black tree
IDEA: Insert x in tree. Color x red. Only red-black
property 4 might be violated. Move the violation up the
tree by recoloring until it can be fixed with rotations
and recoloring.
Example:
• Insert x = 15.
• Recolor, move the violation up the tree.
• RIGHT-ROTATE(18).
• LEFT-ROTATE(7) and recolor.
[Figure: the final red-black tree.]
Done!
Deletion
• Deletion, like insertion, should preserve all the RB
properties.
• The properties that may be violated depend on
the color of the deleted node.
– Red – OK. Why?
– Black?
• Steps:
– Do regular BST deletion.
– Fix any violations of RB properties that may result.
– We will skip. Read on your own.
Analysis
• O(lg n) time to get through RB-Delete up
to the call of RB-Delete-Fixup.
• Within RB-Delete-Fixup:
– O(lg n) time.