Lecture 6 - 2-4 Trees
More Trees: Multiway Trees and 2-4 Trees
Motivation of Multi-way Trees
Main memory vs. disk
◦ Assumption so far: we can store an entire data structure in the main memory of a computer.
◦ What if we have more data than can fit in main memory?
◦ Then the data structure must reside on disk.
◦ The rules of the game change, because the Big-Oh model no longer applies when operations do not all cost the same.
◦ Disk accesses are incredibly expensive.
◦ 1 disk access is worth about 4,000,000 instructions. (See the book for the derivation.)
◦ So we’re willing to do lots of calculations just to save disk accesses.
For example:
◦ Suppose we want to access the driving records of the citizens of Florida.
◦ 10 million items.
◦ Assume the data doesn’t fit in main memory.
◦ Assume that in 1 sec we can execute 25 million instructions or perform 6 disk accesses.
An unbalanced binary tree would be a disaster.
◦ In the worst case it has linear depth and could require 10 million disk accesses.
An AVL tree:
◦ In the typical case it has a depth close to log N; log2(10 million) ≈ 24 disk accesses, requiring 4 sec.
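The arithmetic behind these numbers can be checked with a short calculation. This is only a sketch using the figures assumed in the example (10 million records, 25 million instructions or 6 disk accesses per second); the variable names are made up for illustration.

```python
import math

N_RECORDS = 10_000_000               # driving records (from the example)
INSTRUCTIONS_PER_SEC = 25_000_000    # assumed CPU speed (from the example)
DISK_ACCESSES_PER_SEC = 6            # assumed disk speed (from the example)

# How many instructions fit in the time taken by one disk access?
instructions_per_access = INSTRUCTIONS_PER_SEC / DISK_ACCESSES_PER_SEC

# Worst case for an unbalanced binary tree: one disk access per record.
unbalanced_accesses = N_RECORDS
unbalanced_seconds = unbalanced_accesses / DISK_ACCESSES_PER_SEC

# Typical AVL depth is close to log2(N), rounded up to whole levels.
avl_accesses = math.ceil(math.log2(N_RECORDS))
avl_seconds = avl_accesses / DISK_ACCESSES_PER_SEC

print(f"{instructions_per_access:,.0f} instructions per disk access")
print(f"unbalanced tree: {unbalanced_accesses:,} accesses, "
      f"about {unbalanced_seconds / 86400:.0f} days")
print(f"AVL tree: {avl_accesses} accesses, {avl_seconds:.0f} sec")
```

The first figure comes out a little above 4 million, which is where the "about 4,000,000 instructions" rule of thumb comes from.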
The point is…
◦ Reduce the # of disk accesses to a very small constant, such as 3 or 4.
◦ We are willing to write complicated code to do this, because in comparison machine instructions are essentially free.
◦ As long as we’re not ridiculous.
We cannot go below log N using a BST
◦ Even an AVL tree
Solution?
◦ More branching, less height.
◦ Multiway tree
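To see how much height extra branching saves, the sketch below counts the levels needed to reach 10 million items when every node branches M ways. It assumes a perfectly full tree, and `min_levels` is a hypothetical helper written for this illustration.

```python
def min_levels(n_items: int, branching: int) -> int:
    """Smallest number of levels h such that branching**h >= n_items."""
    levels, reachable = 0, 1
    while reachable < n_items:
        reachable *= branching   # one more level multiplies the reach by M
        levels += 1
    return levels

for m in (2, 4, 32, 256):
    print(f"branching {m:>3}: {min_levels(10_000_000, m)} levels")
```

With binary branching this gives the 24 levels from the AVL example; a branching factor in the hundreds brings it down to the "3 or 4" range.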
Multiway Tree (or B-tree in book)
A multiway tree is similar to a BST
◦ In a BST
We need 1 key to decide which of 2 branches to take.
◦ In a multiway tree
We need M-1 keys to decide which of M branches to take, where M is the order of the multiway tree.
We also need to balance it,
otherwise, like a BST, it could degenerate into a linked list.
◦ Here is a multiway tree of order 5:
Specifications of a Multiway Tree
If a node contains k items (I1, I2, …, Ik), it will contain k+1 subtrees, specifically subtrees S1 through Sk+1.
◦ All values in S1 < I1.
◦ All values in S2 < I2, but ≥ I1.
◦ All values in S3 < I3, but ≥ I2.
◦ All values in Sk+1 ≥ Ik.
[Figure: multiway tree of order 5]
Root: 10 20 30
Children of the root: (3 7) (15) (22 28) (40 50 60)
Leaves: (1) (3 4 5) (7 8) | (10 12) (15 17) | (20 21) (22 26) (28 29) | (30) (40 45) (55) (65 70)
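The subtree rules translate directly into a lookup inside one node. The sketch below assumes the keys of a node are kept in a sorted Python list; `child_index` is a hypothetical helper, and it relies on the standard library's `bisect_right` to find the branch.

```python
from bisect import bisect_right

def child_index(keys, x):
    """0-based index of the subtree to follow for x.

    Index 0 is S1 and index len(keys) is Sk+1. A value equal to a key goes
    to the subtree on that key's right, matching the "but >= I1" style rules.
    """
    return bisect_right(keys, x)

root_keys = [10, 20, 30]
print(child_index(root_keys, 4))    # 0 -> S1, since 4 < 10
print(child_index(root_keys, 10))   # 1 -> S2, since 10 >= I1 and 10 < I2
print(child_index(root_keys, 26))   # 2 -> S3
print(child_index(root_keys, 65))   # 3 -> S4
```

Because a node holds only a handful of keys, this step costs a few comparisons, which is negligible next to the disk access needed to fetch the node.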
2-4 Trees
◦ A specific type of multiway tree.
1) Every node must have between 2 and 4 children.
(Thus each internal node must store between 1 and 3 keys.)
2) All external nodes (the null children of leaf nodes) have the same depth, and so do all leaf nodes.
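The two properties can be expressed as a small checker. This is a minimal sketch, not a full validator: the `Node` class and function names are assumptions made for the example, and the ordering of the keys is not checked.

```python
from dataclasses import dataclass, field

@dataclass
class Node:
    keys: list
    children: list = field(default_factory=list)   # empty list for a leaf

def leaf_depth(node, depth=0):
    """Return the common depth of all leaves below node, or raise ValueError."""
    if not 1 <= len(node.keys) <= 3:
        raise ValueError(f"node {node.keys} must store 1 to 3 keys")
    if not node.children:                            # leaf node
        return depth
    if len(node.children) != len(node.keys) + 1:     # k keys need k+1 children
        raise ValueError(f"node {node.keys} has the wrong number of children")
    depths = {leaf_depth(child, depth + 1) for child in node.children}
    if len(depths) != 1:
        raise ValueError(f"leaves below {node.keys} are at different depths")
    return depths.pop()

def is_2_4_tree(root):
    try:
        leaf_depth(root)
        return True
    except ValueError:
        return False

good = Node([10, 20], [Node([3, 7]), Node([13, 15, 17]), Node([22])])
too_full = Node([10, 20], [Node([3, 7]), Node([13, 15, 17, 18]), Node([22])])
print(is_2_4_tree(good), is_2_4_tree(too_full))
```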
Example of a 2-4 tree
[Figure: 2-4 tree]
Root: 10 20 30
Children of the root: (3 7) (15) (22 28) (40 50 60)
Leaves: (1) (3 4 5) (7 8) | (10 12) (15 17) | (20 21) (22 26) (28 29) | (30) (40 45) (55) (65 70)
Insert
Insert 4 into the 2-4 tree below.
◦ Compare 4 to the values in the root node: 4 < 10, so 4 goes in the subtree to the left of 10.
◦ 4 > 3 and 4 < 7, so 4 goes in the subtree to the right of 3 and left of 7.
[Figure: 2-4 tree with root (10 20 30) and leftmost child (3 7); 4 is added to the leaf between 3 and 7, in front of 5]
Problems with Insert
What if the node that a value gets inserted into is full?
◦ We could just insert the value into a new level of the tree.
◦ BUT then not ALL of the external nodes will have the same depth after this.
Insert 18 into the following tree:
[Figure: root (10 20); children (3 7) (13 15 17) (22)]
[Figure: after inserting 18: root (10 20); children (3 7) (13 15 17 18) (22)]
The node now has too many values.
You can send one of the values up to the parent (the book’s convention is to send the 3rd value).
[Figure: after moving 17 up: root (10 17 20); children (3 7) (13 15) (18) (22)]
Moving 17 up forces you to “split” the other three values: 13 and 15 stay in one node, and 18 goes into a new node.
Other Problems with Insert
In the last example, the parent node was able to “accept” the 17.
◦ What if the parent (here the root node) becomes full?
Insert 12 into the 2-4 tree below:
[Figure: root (10 20 30); children (5) (11 14 17) (25) (32 37)]
[Figure: after inserting 12: root (10 20 30); children (5) (11 12 14 17) (25) (32 37)]
Using the rule from before, let’s move the 3rd value, 14, up.
[Figure: after moving 14 up: root (10 14 20 30); children (5) (11 12) (17) (25) (32 37)]
Now the root has too many values (4) AND too many subtrees (5)!
We can just repeat the process and make a new root.
[Figure: after moving 20 up into a new root: root (20); children (10 14) (30); leaves (5) (11 12) (17) | (25) (32 37)]
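The whole insert procedure, including the split that sends the 3rd value up and the creation of a new root, can be sketched as follows. This is one possible implementation under stated assumptions, not the book's code: the `Node` class and the `insert`, `_insert` and `levels` names are invented here, values are stored in internal nodes as in the insert examples, and duplicate values are not considered.

```python
from bisect import bisect_left, insort

class Node:
    def __init__(self, keys, children=None):
        self.keys = keys
        self.children = children or []     # an empty list means a leaf

def _insert(node, value):
    """Insert below node. Return (value_sent_up, new_right_node) on a split, else None."""
    if not node.children:
        insort(node.keys, value)                       # leaf: add in sorted order
    else:
        i = bisect_left(node.keys, value)              # which subtree to descend into
        split = _insert(node.children[i], value)
        if split:                                      # child overflowed: accept its value
            key, right = split
            node.keys.insert(i, key)
            node.children.insert(i + 1, right)
    if len(node.keys) <= 3:
        return None
    # Overflow (4 values): send the 3rd value up and split the rest around it.
    up = node.keys[2]
    right = Node(node.keys[3:], node.children[3:])
    node.keys, node.children = node.keys[:2], node.children[:3]
    return up, right

def insert(root, value):
    """Insert value and return the (possibly new) root."""
    split = _insert(root, value)
    if split:                                          # the root itself overflowed
        key, right = split
        return Node([key], [root, right])
    return root

def levels(root):
    """Keys of every node, level by level (for inspection only)."""
    out, frontier = [], [root]
    while frontier:
        out.append([n.keys for n in frontier])
        frontier = [c for n in frontier for c in n.children]
    return out

# The "insert 12" example: the leaf splits, then the root splits.
root = Node([10, 20, 30], [Node([5]), Node([11, 14, 17]), Node([25]), Node([32, 37])])
root = insert(root, 12)
print(levels(root))
```

A split is the only way the tree grows in height, and it happens at the root, so all leaves stay at the same depth.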
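Deletion from a 2-4 Tree
Delete a non-leaf value
◦ Replace that value with the largest value in its left subtree, or the smallest value in its right subtree.
◦ That replacement value is then deleted from its leaf, so every deletion reduces to deleting from a leaf.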
Delete a value from a leaf node
◦ In the standard case, a value can simply be removed from a leaf node that stores more than one value.
◦ This requires no structural change.
Delete 5:
[Figure: before: root (10 20 30); leaves (5 7) (12 17) (23 27) (35). After: root (10 20 30); leaves (7) (12 17) (23 27) (35)]
BUT what if the leaf node has ONLY one value?
◦ If you get rid of it, the empty leaf would violate the 2-4 property that all leaf nodes MUST be at the same depth of the tree.
Break this up into 2 cases:
1) An adjacent sibling has more than one value stored in its node.
2) No adjacent sibling has more than one value stored in its node, and a fusion operation MUST be performed.
Case 1:
An adjacent sibling has more than one value stored in its node.
◦ Consider deleting 5 from the following tree:
[Figure: before: root (10 20 30); leaves (5) (12 17) (23 27) (35)]
◦ Take the 10 from the parent to replace the 5,
and then simply replace the 10 with the smallest value in its right subtree (12).
◦ This is okay, since there is more than one value in that subtree.
[Figure: after: root (12 20 30); leaves (10) (17) (23 27) (35)]
Case 2:
◦ No adjacent sibling has more than one value stored in its node, so a fusion operation MUST be performed.
◦ The fusion operation is a little more difficult, since it may result in needing another fusion at a parent node.
Delete 5:
[Figure: before: root (10 20 30); leaves (5) (15) (25) (35)]
Fuse the empty node with 15:
[Figure: root (10 20 30); leaves (15) (25) (35)]
The root now has 3 child nodes when, with 3 values, it should have 4. Thus we can drop a value; we will drop 10.
Drop one parent value into the fused node:
[Figure: after: root (20 30); leaves (10 15) (25) (35)]
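The three leaf-deletion situations (standard case, Case 1, Case 2) can be sketched for the bottom level of the tree. This is a simplified illustration, not a complete deletion routine: a parent is modelled as a plain list of values plus a list of leaf lists, `delete_from_leaf` is a hypothetical name, the parent is assumed to have at least two children, and a fusion that empties the parent is not propagated upward.

```python
def delete_from_leaf(parent_keys, leaves, i, value):
    """Delete value from leaves[i], repairing the leaf if it becomes empty.

    parent_keys[j] separates leaves[j] and leaves[j + 1]. Both lists are
    changed in place.
    """
    leaf = leaves[i]
    leaf.remove(value)
    if leaf:                                        # standard case: nothing to repair
        return
    # Case 1: an adjacent sibling has a spare value; move values through the parent.
    if i + 1 < len(leaves) and len(leaves[i + 1]) > 1:
        leaf.append(parent_keys[i])                 # parent value drops into the leaf
        parent_keys[i] = leaves[i + 1].pop(0)       # sibling's smallest value moves up
        return
    if i > 0 and len(leaves[i - 1]) > 1:
        leaf.append(parent_keys[i - 1])
        parent_keys[i - 1] = leaves[i - 1].pop()    # sibling's largest value moves up
        return
    # Case 2: fuse with an adjacent sibling and drop one parent value into it.
    if i + 1 < len(leaves):
        leaves[i + 1].insert(0, parent_keys.pop(i))
    else:
        leaves[i - 1].append(parent_keys.pop(i - 1))
    del leaves[i]                                   # the empty leaf disappears

# Case 2 example: delete 5 when no sibling can spare a value.
keys = [10, 20, 30]
leaves = [[5], [15], [25], [35]]
delete_from_leaf(keys, leaves, 0, 5)
print(keys, leaves)
```

In a full implementation the parent may itself end up with too few values after a fusion, in which case the same repair is applied one level up.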
Examples on the Board