C-Store: An Introduction to Berkeley DB

Jianlin Feng
School of Software
SUN YAT-SEN UNIVERSITY
Mar. 13, 2009

Overview of Berkeley DB

Means the Berkeley Database
    An open-source, embedded transactional data management system
    A key/value store
Embedded?
    As a library that is linked with an application
    Hides data management from the end user
    Scales from bytes to petabytes
    Runs on everything from cell phones to large servers

Berkeley DB: Examples of Applications

Google Accounts
    Stores all user and service account information and preferences.
Amazon’s user customization
Berkeley DB has high reliability and high performance.

Berkeley DB: A Brief History (1)

Began life in 1991 as a dynamic linear hashing implementation.
    Designed to replace the historic UNIX database libraries: dbm, ndbm and hsearch
Released as a library in 4.4BSD in 1992.
    db-1.85 == Hash + B-Tree
The package LIBTP
    Transactional implementation of db-1.85
    A research prototype that was never released.

Berkeley DB: A Brief History (2)

In 1996, Seltzer and Bostic started Sleepycat Software.
Berkeley DB 2.0, released in 1997
    For use in the Netscape browser
    Transactional implementation
    The first commercial release
Berkeley DB 3.0, released in 1999
    Transformed into an object-oriented, handle-and-method style API.

Berkeley DB: A Brief History (3)

Berkeley DB 4.0, released in 2001
    Single-master, multiple-reader replication
    High availability
        Replicas can take over for a failed master
    High scalability
        Read-only replicas can reduce master load
    Similar ideas are adopted in C-Store.
In Feb. 2006, Oracle acquired Sleepycat.

Sleepycat Public License: a Dual License

The code
    Is open source
    And may be downloaded and used freely
However, redistribution requires
    Either that the package using Berkeley DB be released as open source
    Or that the distributors obtain a commercial license from Sleepycat (now part of Oracle, which acquired it in Feb. 2006).

Berkeley DB: Product Family Today

The original Berkeley DB library
Berkeley DB XML
    Atop the library
Berkeley DB Java Edition
    100% pure Java implementation

Berkeley DB: Product Family Architecture

Berkeley DB: The Design Philosophy

Provide mechanisms without specifying policies
For example, Berkeley DB is abstracted as a store of <key, value> pairs.
    Both keys and values are opaque byte strings.
    That is, Berkeley DB has no schema,
    and the application that embeds Berkeley DB is responsible for imposing its own schema on the data.
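
A minimal sketch of what "imposing a schema on opaque byte strings" can look like in practice. A plain Python dict stands in for the key/value store, and the record layout (`USER_FORMAT`) and the helper names are hypothetical, chosen only for illustration; none of this is the Berkeley DB API.

```python
import struct

# Hypothetical application-level schema: an unsigned 32-bit id,
# a 16-byte name field, and a 64-bit float balance (little-endian).
USER_FORMAT = "<I16sd"

def encode_user(user_id, name, balance):
    """Serialize a user record into an opaque byte string."""
    return struct.pack(USER_FORMAT, user_id, name.encode("utf-8"), balance)

def decode_user(raw):
    """Interpret an opaque byte string using the application's schema."""
    user_id, name, balance = struct.unpack(USER_FORMAT, raw)
    # struct pads the name field with NUL bytes; strip them on the way out.
    return user_id, name.rstrip(b"\x00").decode("utf-8"), balance

# The dict is only a stand-in for the store: it sees bytes, never fields.
store = {}
store[b"user:42"] = encode_user(42, "alice", 12.5)

record = decode_user(store[b"user:42"])
print(record)
```

The store never learns what a "user" is. Only `encode_user` and `decode_user` know the layout, which is the sense in which the application, not the database, owns the schema.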

Advantages of <key, value> pairs

An application is free to store data in whatever form is most natural to it.
    Objects (like structures in the C language)
    Rows, as in Oracle or SQL Server
    Columns, as in C-Store
Different data formats can be stored in the same database.
    As long as the application understands how to interpret the data items.

Indexing Key Values

Indexing methods
    B-Tree
    Hash
    Queue
    Recno: a record-number-based index implemented atop B-Tree
Data manipulation
    Put: store key/value pairs
    Get: retrieve key/value pairs
    Delete: remove key/value pairs
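
The three operations can be tried with Python's standard-library `dbm.dumb` module, used here only as a stand-in: it is a simple key/value store in the tradition of the historic dbm libraries, not Berkeley DB itself. The file name `example` is an arbitrary choice for this sketch.

```python
import dbm.dumb
import os
import tempfile

# Keep the database files in a throwaway directory.
workdir = tempfile.mkdtemp()
path = os.path.join(workdir, "example")

db = dbm.dumb.open(path, "c")    # "c": create the database if it is missing

db[b"fruit"] = b"apple"          # put: store a key/value pair
value = db[b"fruit"]             # get: retrieve the value for a key
del db[b"fruit"]                 # delete: remove the pair
still_there = b"fruit" in db     # the key is gone after the delete

db.close()
print(value, still_there)
```

Keys and values are both byte strings here as well, so the store does not interpret them.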

How Do Applications Access Key/Value Pairs?

Through handles on databases
    Similar to relational tables
Or through cursor handles
    Representing a specific place within a database
    Used for iteration, i.e., fetching one key/value pair at a time.
Databases are implemented atop the OS file system.
    A file may contain one or more databases.
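
To make the cursor idea concrete, here is a toy cursor over an in-memory store with sorted keys, roughly what iterating a B-Tree index feels like. The class `Cursor` and its methods `next` and `set_range` are hypothetical names for this sketch and are not the Berkeley DB cursor API.

```python
import bisect

class Cursor:
    """Toy cursor: a position within a store whose keys are kept sorted."""

    def __init__(self, items):
        self._items = items          # dict mapping bytes keys to bytes values
        self._keys = sorted(items)   # sorted key order, as in a B-Tree index
        self._pos = -1               # positioned before the first pair

    def set_range(self, key):
        """Move to the smallest key >= key; return that pair, or None."""
        self._pos = bisect.bisect_left(self._keys, key)
        return self._current()

    def next(self):
        """Advance one position; return the pair there, or None at the end."""
        self._pos += 1
        return self._current()

    def _current(self):
        if 0 <= self._pos < len(self._keys):
            k = self._keys[self._pos]
            return k, self._items[k]
        return None

database = {b"b": b"2", b"a": b"1", b"d": b"4"}

# Iterate: fetch one key/value pair at a time until the cursor runs out.
cursor = Cursor(database)
pairs = []
pair = cursor.next()
while pair is not None:
    pairs.append(pair)
    pair = cursor.next()

# Position a fresh cursor at the first key at or after b"c".
found = Cursor(database).set_range(b"c")
print(pairs, found)
```

The cursor holds only a position, so several cursors can walk the same database independently.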

Berkeley DB Replication: A Log-Shipping System

A Replication Group
    A single Master
    One or more read-only Replicas
All write operations must be processed transactionally by the Master.
The Master sends log records to each of the Replicas.
The Replicas apply log records only when they receive a transaction commit record.
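
The log-shipping rules above can be sketched in a few lines. This is a toy model under stated assumptions: the `Master` and `Replica` classes and the record layout `(kind, txn_id, key, value)` are invented for illustration, there is no networking or abort handling, and the Master applies writes to its own copy immediately.

```python
class Replica:
    """Read-only copy: buffers log records and applies them on commit."""

    def __init__(self):
        self.data = {}
        self._pending = {}   # txn_id -> list of buffered (op, key, value)

    def receive(self, record):
        kind, txn_id, key, value = record
        if kind == "commit":
            # Only now do the buffered writes become visible on the replica.
            for op, k, v in self._pending.pop(txn_id, []):
                if op == "put":
                    self.data[k] = v
                else:
                    self.data.pop(k, None)
        else:
            self._pending.setdefault(txn_id, []).append((kind, key, value))

class Master:
    """Single writer: logs every operation and ships it to all replicas."""

    def __init__(self, replicas):
        self.data = {}
        self.log = []
        self.replicas = replicas
        self._next_txn = 1

    def _ship(self, record):
        self.log.append(record)
        for replica in self.replicas:
            replica.receive(record)

    def begin(self):
        txn_id = self._next_txn
        self._next_txn += 1
        return txn_id

    def put(self, txn_id, key, value):
        self.data[key] = value   # simplification: no rollback on abort
        self._ship(("put", txn_id, key, value))

    def commit(self, txn_id):
        self._ship(("commit", txn_id, None, None))

replica = Replica()
master = Master([replica])

txn = master.begin()
master.put(txn, b"k", b"v")
before_commit = dict(replica.data)   # the write is shipped but not applied
master.commit(txn)
after_commit = dict(replica.data)    # the commit record triggers the apply
print(before_commit, after_commit)
```

Buffering until the commit record arrives means a replica never exposes the effects of a transaction that has not committed on the Master.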

Berkeley DB: Configuration Flexibility

Configuration flexibility is critical
    Due to a wide range of applications
Three ways
    Compile Time Configuration
    Feature Set Selection
    Runtime Configuration

Compile Time Configuration

Option 1: small footprint build
    --enable-smallbuild
    For use in a cell phone
    The compiled library contains only the B-Tree index and omits replication, cryptography, statistics collection, etc. The library is about 0.5 MB.
Option 2: higher concurrency locking
    --enable-fine-grained-lock-manager
    For use in a data center
    Lock-based concurrency control

Feature Set Selection

1. The Data Store (DS) feature set
    Most similar to the original db-1.85 library
    Good for temporary data storage
2. The Concurrent Data Store (CDS) feature set
    Acquires a single lock per API invocation
    Good for read-mostly applications
3. The Transactional Data Store (TDS) feature set
    Currently the most widely used feature set
    Acquires a single lock per page
4. The High Availability (HA) feature set
    Can continue running even after a site fails.
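
As a rough illustration of "a single lock per API invocation", the sketch below wraps an in-memory store so that every call takes exactly one lock for its whole duration. `LockedStore` is a hypothetical class, and the plain mutex is a simplification: it serializes readers too, which a store tuned for read-mostly workloads would presumably avoid.

```python
import threading

class LockedStore:
    """Toy store in which each API call acquires one lock, then releases it."""

    def __init__(self):
        self._lock = threading.Lock()
        self._data = {}

    def put(self, key, value):
        with self._lock:             # one lock for the whole call
            self._data[key] = value

    def get(self, key):
        with self._lock:
            return self._data.get(key)

    def delete(self, key):
        with self._lock:
            self._data.pop(key, None)

    def __len__(self):
        with self._lock:
            return len(self._data)

store = LockedStore()

def writer(worker_id):
    # Each worker writes 100 distinct keys through the locked API.
    for i in range(100):
        store.put(f"{worker_id}:{i}".encode(), b"x")

threads = [threading.Thread(target=writer, args=(n,)) for n in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()

total = len(store)
print(total)
```

Coarse locking like this is simple and cheap per call, but it limits concurrency among writers, which is the trade-off against the finer-grained, per-page locking of the transactional feature set.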

Runtime Configuration

Index Selection and Tuning
    Applications can select the page size in an index
Trading off Durability and Performance
    No-force log write
    Extreme case: applications can run completely in memory
Trading off Two-Phase Locking and Multiversion Concurrency Control.
Note: C-Store adopts similar ideas for high performance.
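
The durability/performance trade-off can be sketched with a tiny append-only log, assuming "no-force log write" means committing without forcing the log record to disk. The `Log` class and its `sync` parameter are hypothetical and are not Berkeley DB's configuration flags.

```python
import os
import tempfile

class Log:
    """Toy append-only log with an optional forced write on commit."""

    def __init__(self, path):
        self._file = open(path, "ab")

    def commit(self, record, sync=True):
        self._file.write(record + b"\n")
        if sync:
            # Durable commit: push the record through to stable storage.
            self._file.flush()
            os.fsync(self._file.fileno())
        # With sync=False the record may sit in a buffer for a while:
        # faster, but a crash at this point could lose the commit.

    def close(self):
        self._file.close()

log_path = os.path.join(tempfile.mkdtemp(), "txn.log")
log = Log(log_path)
log.commit(b"txn-1", sync=True)    # forced: pays for an fsync
log.commit(b"txn-2", sync=False)   # not forced: cheaper, weaker durability
log.close()

with open(log_path, "rb") as f:
    contents = f.read()
print(contents)
```

Both records reach the file here because the log is closed cleanly. The difference only shows up if the process or machine fails between the unforced commit and the next flush.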

Challenges of Berkeley DB’s Flexibility

Demands flexibility from Berkeley DB's designers
Demands flexibility from application developers
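
Any Dream? Any Idea?

iGoogle China University Student Innovation Design Contest
The 4th Software Innovation Design Contest, School of Software, Sun Yat-sen University
Some Research with Me?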

References

M. Seltzer. Berkeley DB: A Retrospective. IEEE Data Engineering Bulletin, Volume 30, Number 3, pp. 21-28, September 2007.
M. A. Olson, K. Bostic, M. Seltzer. Berkeley DB. USENIX Annual Technical Conference, pp. 183–192, June 6-11, 1999, Monterey, California, USA.
Oracle Berkeley DB Site. http://www.oracle.com/technology/products/berkeley-db