Tree-Structured Indexes

Download Report

Transcript Tree-Structured Indexes

B+-Tree Index
Chapter 10
Modified by Donghui Zhang
Nov 9, 2005
Database Management Systems 3ed, R. Ramakrishnan and J. Gehrke
1
Motivation
Suppose every disk page holds 133 records.
 You are given 1334 = 0.3 billion records. They
occupy 1333 = 2.3 million disk pages.
 You can utilize a small memory buffer of 134
pages.
 You can build an index structure.
 Given the key of a record, what is the
minimum guaranteed number of disk I/Os to
find the record?

Database Management Systems 3ed, R. Ramakrishnan and J. Gehrke
2
Content

B+-tree index




Structure
Search
Insert
Delete
Bulk-loading a B+-tree
 Aggregation Query
 SB-tree

Database Management Systems 3ed, R. Ramakrishnan and J. Gehrke
3
B+ Tree Structure
Insert/delete at log F N cost; keep tree heightbalanced. (F = fanout, N = # leaf pages)
 Minimum 50% occupancy (except for root). Each
node contains d <= m <= 2d entries. The
parameter d is called the order of the tree.
 Supports equality and range-searches efficiently.

Index Entries
(Direct search)
Data Entries
("Sequence set")
Database Management Systems 3ed, R. Ramakrishnan and J. Gehrke
4
B+ Tree Equality Search
Search begins at root, and key comparisons
direct it to a leaf.
 Search for 15*…

Root
13
2*
3*
5*
7*
14* 16*
17
24
19* 20* 22*
30
24* 27* 29*
33* 34* 38* 39*
 Based on the search for 15*, we know it is not in the tree!
Database Management Systems 3ed, R. Ramakrishnan and J. Gehrke
5
B+ Tree Range Search

Search all records whose ages are in [15,28].
 Equality search 15*.
 Follow sibling pointers.
Root
13
2*
3*
5*
7*
14* 16*
17
24
19* 20* 22*
30
24* 27* 29*
Database Management Systems 3ed, R. Ramakrishnan and J. Gehrke
33* 34* 38* 39*
6
B+ Trees in Practice

Typical order: 100. Typical fill-factor: 67%.


Can often hold top levels in buffer pool:





average fanout = 133
Level 1 =
1 page =
8 KB
Level 2 =
133 pages = 1 MB
Level 3 =
17,689 pages = 145 MB
Level 4 = 2,352,637 pages = 19 GB
With 1 MB buffer, can locate one record in 19
GB (or 0.3 billion records) in two I/Os!
Database Management Systems 3ed, R. Ramakrishnan and J. Gehrke
7
Inserting a Data Entry into a B+ Tree
Find correct leaf L.
 Put data entry onto L.



If L has enough space, done!
Else, must split L (into L and a new node L2)
• Redistribute entries evenly, copy up middle key.
• Insert index entry pointing to L2 into parent of L.

This can happen recursively


To split index node, redistribute entries evenly, but
push up middle key. (Contrast with leaf splits.)
Splits “grow” tree; root split increases height.

Tree growth: gets wider or one level taller at top.
Database Management Systems 3ed, R. Ramakrishnan and J. Gehrke
8
Inserting 8* into Example B+ Tree
Find leaf, in the same way as the Search
algorithm.
 Handle overflow by splitting.

Root
13
2*
3*
5*
7*
14* 16*
17
24
19* 20* 22*
30
24* 27* 29*
Database Management Systems 3ed, R. Ramakrishnan and J. Gehrke
33* 34* 38* 39*
9
Inserting 8* into Example B+ Tree


Observe how
minimum
occupancy is
guaranteed in
both leaf and
index pg splits.
Note difference
between copyup and push-up;
be sure you
understand the
reasons for this.
Entry to be inserted in parent node.
(Note that 5 is
s copied up and
continues to appear in the leaf.)
5
2*
3*
5*
7*
8*
Entry to be inserted in parent node.
(Note that 17 is pushed up and only
appears once in the index. Contrast
this with a leaf split.)
17
5
13
24
30
Database Management Systems 3ed, R. Ramakrishnan and J. Gehrke
10
Example B+ Tree After Inserting 8*
Root
17
5
2*
3*
24
13
5*
7* 8*
14* 16*
19* 20* 22*
30
24* 27* 29*
33* 34* 38* 39*
 Notice that root was split, leading to increase in height.
 In this example, we can avoid split by re-distributing
entries; however, this is usually not done in practice.
Database Management Systems 3ed, R. Ramakrishnan and J. Gehrke
11
Deleting a Data Entry from a B+ Tree
Start at root, find leaf L where entry belongs.
 Remove the entry.



If L is at least half-full, done!
If L has only d-1 entries,
• Try to re-distribute, borrowing from sibling (adjacent
node with same parent as L).
• If re-distribution fails, merge L and sibling.
If merge occurred, must delete entry (pointing to L
or sibling) from parent of L.
 Merge could propagate to root, decreasing height.

Database Management Systems 3ed, R. Ramakrishnan and J. Gehrke
12
Deleting 19* and 20*
Root
17
5
2*
3*
24
13
5*
7* 8*
14* 16*
19* 20* 22*
30
24* 27* 29*
33* 34* 38* 39*
Deleting 19* is easy.
 Deleting 20* is done with re-distribution.

Database Management Systems 3ed, R. Ramakrishnan and J. Gehrke
13
After Deleting 19* and 20*
Root
17
5
2*
3*
27
13
5*
7* 8*
14* 16*
22* 24*
30
27* 29*
33* 34* 38* 39*
Notice, in re-distribution, how middle key is
copied up.
 If delete 24*…

Database Management Systems 3ed, R. Ramakrishnan and J. Gehrke
14
... And Then Deleting 24*
Must merge.
 Observe `toss’ of
index entry (on right),
and `pull down’ of
index entry (below).

30
22*
27*
29*
33*
34*
38*
39*
Root
5
2*
3*
5*
7*
8*
13
14* 16*
17
30
22* 27* 29*
Database Management Systems 3ed, R. Ramakrishnan and J. Gehrke
33* 34* 38* 39*
15
Example of Non-leaf Re-distribution
Tree is shown below during deletion of 24*. (What
could be a possible initial tree?)
 In contrast to previous example, can re-distribute
entry from left child of root to right child.

Root
22
5
2* 3*
5* 7* 8*
13
14* 16*
17
30
20
17* 18*
20* 21*
Database Management Systems 3ed, R. Ramakrishnan and J. Gehrke
22* 27* 29*
33* 34* 38* 39*
16
After Re-distribution
Intuitively, entries are re-distributed by `pushing
through’ the splitting entry in the parent node.
 It suffices to re-distribute index entry with key 20;
we’ve re-distributed 17 as well for illustration.

Root
17
5
2* 3*
5* 7* 8*
13
14* 16*
20
17* 18*
20* 21*
Database Management Systems 3ed, R. Ramakrishnan and J. Gehrke
22
30
22* 27* 29*
33* 34* 38* 39*
17
Bulk Loading of a B+ Tree
If we have a large collection of records, and we
want to create a B+ tree on some field, doing so
by repeatedly inserting records is very slow.
 Bulk Loading can be done much more efficiently.
 Initialization: Sort all data entries, insert pointer
to first (leaf) page in a new (root) page.

Root
3* 4*
Sorted pages of data entries; not yet in B+ tree
6* 9*
10* 11*
12* 13* 20* 22* 23* 31* 35* 36*
Database Management Systems 3ed, R. Ramakrishnan and J. Gehrke
38* 41* 44*
18
Bulk Loading (Contd.)



Index entries for leaf
pages always
entered into rightmost index page just
3*
above leaf level.
Assume pages in the
rightmost path to
have double page
size.
Split when double
plus one.
Root
10
6
4*
3* 4*
6* 9*
12
Data entry pages
20
not yet in B+ tree
10* 11* 12* 13* 20*22* 23* 31* 35* 36* 38*41* 44*
Root
12
6
10
6* 9*
Data entry pages
not yet in B+ tree
20
23
10* 11* 12* 13* 20*22* 23* 31* 35* 36* 38*41* 44*
Database Management Systems 3ed, R. Ramakrishnan and J. Gehrke
19
Summary of Bulk Loading

Option 1: multiple inserts.



Slow.
Does not give sequential storage of leaves.
Option 2: Bulk Loading




Has advantages for concurrency control.
Fewer I/Os during build.
Leaves will be stored sequentially (and linked, of
course).
Can control “fill factor” on pages.
Database Management Systems 3ed, R. Ramakrishnan and J. Gehrke
20
Exercise for B+-tree with Buffering
r, w, p, u: logical read/write, pin, unpin.
 R, W: physical read/write.
 ↑ : one pin.
1
 O : a dirty page.
2
3
 Draw the evolution
of an LRU buffer for
4
5
6
7 8 9
an insertion to page
5 which results page 5 to split, followed by a
search in page 8. Assume the buffer initially
contains page 1 with pincount=1.

Database Management Systems 3ed, R. Ramakrishnan and J. Gehrke
21