Tree-Structured Indexes
Download
Report
Transcript Tree-Structured Indexes
B+-Tree Index
Chapter 10
Modified by Donghui Zhang
Nov 9, 2005
Database Management Systems 3ed, R. Ramakrishnan and J. Gehrke
1
Motivation
Suppose every disk page holds 133 records.
You are given 1334 = 0.3 billion records. They
occupy 1333 = 2.3 million disk pages.
You can utilize a small memory buffer of 134
pages.
You can build an index structure.
Given the key of a record, what is the
minimum guaranteed number of disk I/Os to
find the record?
Database Management Systems 3ed, R. Ramakrishnan and J. Gehrke
2
Content
B+-tree index
Structure
Search
Insert
Delete
Bulk-loading a B+-tree
Aggregation Query
SB-tree
Database Management Systems 3ed, R. Ramakrishnan and J. Gehrke
3
B+ Tree Structure
Insert/delete at log F N cost; keep tree heightbalanced. (F = fanout, N = # leaf pages)
Minimum 50% occupancy (except for root). Each
node contains d <= m <= 2d entries. The
parameter d is called the order of the tree.
Supports equality and range-searches efficiently.
Index Entries
(Direct search)
Data Entries
("Sequence set")
Database Management Systems 3ed, R. Ramakrishnan and J. Gehrke
4
B+ Tree Equality Search
Search begins at root, and key comparisons
direct it to a leaf.
Search for 15*…
Root
13
2*
3*
5*
7*
14* 16*
17
24
19* 20* 22*
30
24* 27* 29*
33* 34* 38* 39*
Based on the search for 15*, we know it is not in the tree!
Database Management Systems 3ed, R. Ramakrishnan and J. Gehrke
5
B+ Tree Range Search
Search all records whose ages are in [15,28].
Equality search 15*.
Follow sibling pointers.
Root
13
2*
3*
5*
7*
14* 16*
17
24
19* 20* 22*
30
24* 27* 29*
Database Management Systems 3ed, R. Ramakrishnan and J. Gehrke
33* 34* 38* 39*
6
B+ Trees in Practice
Typical order: 100. Typical fill-factor: 67%.
Can often hold top levels in buffer pool:
average fanout = 133
Level 1 =
1 page =
8 KB
Level 2 =
133 pages = 1 MB
Level 3 =
17,689 pages = 145 MB
Level 4 = 2,352,637 pages = 19 GB
With 1 MB buffer, can locate one record in 19
GB (or 0.3 billion records) in two I/Os!
Database Management Systems 3ed, R. Ramakrishnan and J. Gehrke
7
Inserting a Data Entry into a B+ Tree
Find correct leaf L.
Put data entry onto L.
If L has enough space, done!
Else, must split L (into L and a new node L2)
• Redistribute entries evenly, copy up middle key.
• Insert index entry pointing to L2 into parent of L.
This can happen recursively
To split index node, redistribute entries evenly, but
push up middle key. (Contrast with leaf splits.)
Splits “grow” tree; root split increases height.
Tree growth: gets wider or one level taller at top.
Database Management Systems 3ed, R. Ramakrishnan and J. Gehrke
8
Inserting 8* into Example B+ Tree
Find leaf, in the same way as the Search
algorithm.
Handle overflow by splitting.
Root
13
2*
3*
5*
7*
14* 16*
17
24
19* 20* 22*
30
24* 27* 29*
Database Management Systems 3ed, R. Ramakrishnan and J. Gehrke
33* 34* 38* 39*
9
Inserting 8* into Example B+ Tree
Observe how
minimum
occupancy is
guaranteed in
both leaf and
index pg splits.
Note difference
between copyup and push-up;
be sure you
understand the
reasons for this.
Entry to be inserted in parent node.
(Note that 5 is
s copied up and
continues to appear in the leaf.)
5
2*
3*
5*
7*
8*
Entry to be inserted in parent node.
(Note that 17 is pushed up and only
appears once in the index. Contrast
this with a leaf split.)
17
5
13
24
30
Database Management Systems 3ed, R. Ramakrishnan and J. Gehrke
10
Example B+ Tree After Inserting 8*
Root
17
5
2*
3*
24
13
5*
7* 8*
14* 16*
19* 20* 22*
30
24* 27* 29*
33* 34* 38* 39*
Notice that root was split, leading to increase in height.
In this example, we can avoid split by re-distributing
entries; however, this is usually not done in practice.
Database Management Systems 3ed, R. Ramakrishnan and J. Gehrke
11
Deleting a Data Entry from a B+ Tree
Start at root, find leaf L where entry belongs.
Remove the entry.
If L is at least half-full, done!
If L has only d-1 entries,
• Try to re-distribute, borrowing from sibling (adjacent
node with same parent as L).
• If re-distribution fails, merge L and sibling.
If merge occurred, must delete entry (pointing to L
or sibling) from parent of L.
Merge could propagate to root, decreasing height.
Database Management Systems 3ed, R. Ramakrishnan and J. Gehrke
12
Deleting 19* and 20*
Root
17
5
2*
3*
24
13
5*
7* 8*
14* 16*
19* 20* 22*
30
24* 27* 29*
33* 34* 38* 39*
Deleting 19* is easy.
Deleting 20* is done with re-distribution.
Database Management Systems 3ed, R. Ramakrishnan and J. Gehrke
13
After Deleting 19* and 20*
Root
17
5
2*
3*
27
13
5*
7* 8*
14* 16*
22* 24*
30
27* 29*
33* 34* 38* 39*
Notice, in re-distribution, how middle key is
copied up.
If delete 24*…
Database Management Systems 3ed, R. Ramakrishnan and J. Gehrke
14
... And Then Deleting 24*
Must merge.
Observe `toss’ of
index entry (on right),
and `pull down’ of
index entry (below).
30
22*
27*
29*
33*
34*
38*
39*
Root
5
2*
3*
5*
7*
8*
13
14* 16*
17
30
22* 27* 29*
Database Management Systems 3ed, R. Ramakrishnan and J. Gehrke
33* 34* 38* 39*
15
Example of Non-leaf Re-distribution
Tree is shown below during deletion of 24*. (What
could be a possible initial tree?)
In contrast to previous example, can re-distribute
entry from left child of root to right child.
Root
22
5
2* 3*
5* 7* 8*
13
14* 16*
17
30
20
17* 18*
20* 21*
Database Management Systems 3ed, R. Ramakrishnan and J. Gehrke
22* 27* 29*
33* 34* 38* 39*
16
After Re-distribution
Intuitively, entries are re-distributed by `pushing
through’ the splitting entry in the parent node.
It suffices to re-distribute index entry with key 20;
we’ve re-distributed 17 as well for illustration.
Root
17
5
2* 3*
5* 7* 8*
13
14* 16*
20
17* 18*
20* 21*
Database Management Systems 3ed, R. Ramakrishnan and J. Gehrke
22
30
22* 27* 29*
33* 34* 38* 39*
17
Bulk Loading of a B+ Tree
If we have a large collection of records, and we
want to create a B+ tree on some field, doing so
by repeatedly inserting records is very slow.
Bulk Loading can be done much more efficiently.
Initialization: Sort all data entries, insert pointer
to first (leaf) page in a new (root) page.
Root
3* 4*
Sorted pages of data entries; not yet in B+ tree
6* 9*
10* 11*
12* 13* 20* 22* 23* 31* 35* 36*
Database Management Systems 3ed, R. Ramakrishnan and J. Gehrke
38* 41* 44*
18
Bulk Loading (Contd.)
Index entries for leaf
pages always
entered into rightmost index page just
3*
above leaf level.
Assume pages in the
rightmost path to
have double page
size.
Split when double
plus one.
Root
10
6
4*
3* 4*
6* 9*
12
Data entry pages
20
not yet in B+ tree
10* 11* 12* 13* 20*22* 23* 31* 35* 36* 38*41* 44*
Root
12
6
10
6* 9*
Data entry pages
not yet in B+ tree
20
23
10* 11* 12* 13* 20*22* 23* 31* 35* 36* 38*41* 44*
Database Management Systems 3ed, R. Ramakrishnan and J. Gehrke
19
Summary of Bulk Loading
Option 1: multiple inserts.
Slow.
Does not give sequential storage of leaves.
Option 2: Bulk Loading
Has advantages for concurrency control.
Fewer I/Os during build.
Leaves will be stored sequentially (and linked, of
course).
Can control “fill factor” on pages.
Database Management Systems 3ed, R. Ramakrishnan and J. Gehrke
20
Exercise for B+-tree with Buffering
r, w, p, u: logical read/write, pin, unpin.
R, W: physical read/write.
↑ : one pin.
1
O : a dirty page.
2
3
Draw the evolution
of an LRU buffer for
4
5
6
7 8 9
an insertion to page
5 which results page 5 to split, followed by a
search in page 8. Assume the buffer initially
contains page 1 with pincount=1.
Database Management Systems 3ed, R. Ramakrishnan and J. Gehrke
21