What is WiredTiger


Introduction to new high-performance storage engines in MongoDB 3.0 (formerly 2.8)
Henrik Ingo
Solutions Architect, MongoDB
Hi, I am
Henrik Ingo
@h_ingo
Agenda:
- MongoDB and NoSQL
- Storage Engine API
- WiredTiger configuration + performance
Most popular NoSQL database
5 NoSQL categories
• Key Value: Redis, Riak
• Graph: Neo4j
• Wide Column: Cassandra
• Document
• Map Reduce: Hadoop
MongoDB is a Document Database
Rich Queries
• Find Paul’s cars
• Find everybody in London with a car built between 1970 and 1980
Geospatial
• Find all of the car owners within 5km of Trafalgar Sq.
Text Search
• Find all the cars described as having leather seats
Aggregation
• Calculate the average value of Paul’s car collection
Map Reduce
• What is the ownership pattern of colors by geography over time? (is purple trending up in China?)
MongoDB
{
  first_name: 'Paul',
  surname: 'Miller',
  city: 'London',
  location: [45.123, 47.232],
  cars: [
    { model: 'Bentley',
      year: 1973,
      value: 100000, … },
    { model: 'Rolls Royce',
      year: 1965,
      value: 330000, … }
  ]
}
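A minimal sketch of two of the questions above, run against the slide's document in plain JavaScript so it works without a server. The collection name `owners` and both helper functions are hypothetical, and the elided fields (…) are simply left out. The filter object uses MongoDB's `$elemMatch`, `$gte` and `$lte` operators; the in-memory functions only mimic what the server would do.

```javascript
// Valid-JavaScript version of the slide's document (elided fields omitted).
const owners = [
  {
    first_name: "Paul",
    surname: "Miller",
    city: "London",
    location: [45.123, 47.232],
    cars: [
      { model: "Bentley", year: 1973, value: 100000 },
      { model: "Rolls Royce", year: 1965, value: 330000 },
    ],
  },
];

// MongoDB-style filter for "everybody in London with a car built between
// 1970 and 1980". In the shell it would be passed to find() on a
// collection (the name `owners` is an assumption, not from the slides).
const londonSeventiesFilter = {
  city: "London",
  cars: { $elemMatch: { year: { $gte: 1970, $lte: 1980 } } },
};

// Plain-JavaScript equivalent of that filter.
function londonSeventies(docs) {
  return docs.filter(
    (d) =>
      d.city === "London" &&
      d.cars.some((c) => c.year >= 1970 && c.year <= 1980)
  );
}

// "Calculate the average value of Paul's car collection", done in memory.
function averageCarValue(doc) {
  const total = doc.cars.reduce((sum, c) => sum + c.value, 0);
  return total / doc.cars.length;
}

const matches = londonSeventies(owners);
console.log(matches.map((d) => d.first_name)); // [ 'Paul' ]
console.log(averageCarValue(owners[0])); // 215000
```

Paul matches because of the 1973 Bentley; the 1965 Rolls Royce alone would not qualify.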
Operational Database Landscape
MongoDB 3.0 & storage engines
Current state in MongoDB 2.6
Read-heavy apps
• Great performance
– B-tree
– Low overhead
• Good scale-out perf
– Secondary reads
– Sharding
Write-heavy apps
• Good scale-out perf
– Sharding
• Per-node efficiency wish-list:
– Doc level locking
– Write-optimized data structures (LSM)
– Compression
Other
• Complex transactions
• In-memory engine
• SSD optimized engine
• etc...
How to get all of the above?
MongoDB 3.0 Storage Engine API
• Read-heavy app: MMAP
• Write-heavy app: WiredTiger
• Special app: 3rd party
MongoDB 3.0 Storage Engine API
• One at a time:
– Many engines built into mongod
– Choose 1 at startup
– All data stored by the same engine
– Incompatible on-disk data formats (obviously)
– Compatible client API
• Compatible Oplog & Replication
– Same replica set can mix different engines
– No-downtime migration possible
Some existing engines
• MMAPv1
– Improved MMAP (collection-level locking)
• WiredTiger
– Discussed next
• RocksDB
– LSM style engine developed by Facebook
– Based on LevelDB
• TokuMXse
– Fractal Tree indexing engine from Tokutek
Some rumored engines
• Heap
– In-memory engine
• Devnull
– Write all data to /dev/null
– Based on idea from famous flash animation...
– Oplog stored as normal
• SSD optimized engine (e.g. Fusion-IO)
• KV: simple key-value engine
https://github.com/mongodb/mongo/tree/master/src/mongo/db/storage
WiredTiger
What is WiredTiger
• Modern NoSQL database engine
– flexible schema
• Advanced database engine
– Secondary indexes, MVCC, non-locking algorithms
– Multi-statement transactions (not in MongoDB 3.0)
• Very modular, tunable
– Btree, LSM and columnar indexes
– Snappy, Zlib, 3rd-party compression
– Index prefix compression, etc...
• Built by creators of BerkeleyDB
• Acquired by MongoDB in 2014
• source.wiredtiger.com
Choosing WiredTiger at server startup
mongod --storageEngine wiredTiger
http://docs.mongodb.org/master/reference/program/mongod/#cmdoption--storageEngine
Main tunables exposed as MongoDB options
mongod --storageEngine wiredTiger \
    --wiredTigerCacheSizeGB 8 \
    --wiredTigerDirectoryForIndexes /data/indexes \
    --wiredTigerCollectionBlockCompressor zlib \
    --syncdelay 30
http://docs.mongodb.org/master/reference/program/mongod/#cmdoption--storageEngine
All WiredTiger options via configString (hidden)
mongod --storageEngine wiredTiger \
    --wiredTigerEngineConfigString \
    "cache_size=8GB,eviction=(threads_min=4,threads_max=8),checkpoint=(wait=30)" \
    --wiredTigerCollectionConfigString "block_compressor=zlib" \
    --wiredTigerIndexConfigString "type=lsm,block_compressor=zlib" \
    --wiredTigerDirectoryForIndexes /data/indexes
See docs for wiredtiger_open() & WT_SESSION::create()
http://source.wiredtiger.com/2.5.0/group__wt.html#ga9e6adae3fc6964ef837a62795c7840ed
http://source.wiredtiger.com/2.5.0/struct_w_t___s_e_s_s_i_o_n.html#a358ca4141d59c345f401c58501276bbb
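WiredTiger config strings are comma-separated key=value pairs, with nested groups in parentheses. A small sketch of building them from a plain object instead of concatenating by hand; `toConfigString` is a hypothetical helper, not part of MongoDB or WiredTiger, and it does no validation of option names.

```javascript
// Serialize a nested options object into WiredTiger's key=value syntax.
// Nested objects become parenthesized groups: eviction=(threads_min=4,...).
// Hypothetical helper: it does not check that the option names exist.
function toConfigString(options) {
  return Object.entries(options)
    .map(([key, value]) =>
      value !== null && typeof value === "object"
        ? `${key}=(${toConfigString(value)})`
        : `${key}=${value}`
    )
    .join(",");
}

const engineConfig = toConfigString({
  cache_size: "8GB",
  eviction: { threads_min: 4, threads_max: 8 },
  checkpoint: { wait: 30 },
});
const indexConfig = toConfigString({ type: "lsm", block_compressor: "zlib" });

console.log(engineConfig);
// cache_size=8GB,eviction=(threads_min=4,threads_max=8),checkpoint=(wait=30)
console.log(indexConfig);
// type=lsm,block_compressor=zlib
```

The resulting strings are what would be passed to --wiredTigerEngineConfigString and --wiredTigerIndexConfigString.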
Also via createCollection(), createIndex()
db.createCollection( "users",
  { storageEngine: {
      wiredTiger: {
        configString: "block_compressor=none" }
    }
  }
)
http://docs.mongodb.org/master/reference/method/db.createCollection/#db.createCollection
http://docs.mongodb.org/master/reference/method/db.collection.createIndex/#db.collection.createIndex
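The options document has the same nested shape every time, so it can be generated. A sketch, assuming only the three block compressors named in this deck are allowed; `wiredTigerOptions` and `COMPRESSORS` are hypothetical names used for illustration.

```javascript
// Block compressors mentioned in this deck; the allow-list is an assumption.
const COMPRESSORS = ["none", "snappy", "zlib"];

// Build the per-collection options document shown on the slide.
function wiredTigerOptions(blockCompressor) {
  if (!COMPRESSORS.includes(blockCompressor)) {
    throw new Error(`unknown block compressor: ${blockCompressor}`);
  }
  return {
    storageEngine: {
      wiredTiger: { configString: `block_compressor=${blockCompressor}` },
    },
  };
}

// In the mongo shell the result would be passed as the options argument:
//   db.createCollection("users", wiredTigerOptions("none"))
console.log(JSON.stringify(wiredTigerOptions("none")));
// {"storageEngine":{"wiredTiger":{"configString":"block_compressor=none"}}}
```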
More...
• db.serverStatus()
• db.collection.stats()
Understanding and Optimizing WiredTiger
Understanding WiredTiger architecture
[Diagram: WiredTiger SE layers, top to bottom]
• Btree, LSM, Columnar
• Cache (default: 50%)
• Compression: None, Snappy, Zlib
• OS Disk Cache (default: 50%)
• Physical disk
Covering 90% of your optimization needs
[Same diagram, with two costs marked]
• Decompression time: between Cache and OS Disk Cache
• Disk seek time: between OS Disk Cache and physical disk
Strategy 1: fit working set in Cache
• cache_size = 80% (Cache default: 50%)
Strategy 2: fit working set in OS Disk Cache
• cache_size = 10% (Cache default: 50%)
• OS Disk Cache gets the remaining 90% (default: 50%)
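Strategies 1 and 2 come down to one multiplication. A sketch of turning a share of RAM into a value for --wiredTigerCacheSizeGB; the 64 GB server is hypothetical, and rounding down to whole gigabytes (with a 1 GB floor) is an assumption made here, not something the slides specify.

```javascript
// Whole gigabytes to pass to --wiredTigerCacheSizeGB for a given share of RAM.
// Assumption: round down to an integer, but never go below 1 GB.
function cacheSizeGB(totalRamGB, fraction) {
  if (fraction <= 0 || fraction >= 1) {
    throw new RangeError("fraction must be between 0 and 1 (exclusive)");
  }
  return Math.max(1, Math.floor(totalRamGB * fraction));
}

const ramGB = 64; // hypothetical server
console.log(cacheSizeGB(ramGB, 0.8)); // strategy 1 → 51
console.log(cacheSizeGB(ramGB, 0.1)); // strategy 2 → 6
```

With strategy 2 the remaining memory is left to the operating system, which uses it as disk cache.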
Strategy 3: SSD disk + compression to save €
[Same diagram: physical disk is an SSD, compression enabled]
Strategy 4: SSD disk (no compression)
[Same diagram: physical disk is an SSD, compression set to None]
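The trade-off between strategies 3 and 4 is disk space against decompression time. A rough way to get a feel for the space side, using Node's built-in zlib module on synthetic documents. This is not WiredTiger's block compressor and the documents are made up, so the ratio it prints is only an illustration; measure on real data before deciding.

```javascript
const zlib = require("zlib");

// 1,000 synthetic documents with repetitive field names and values
// (hypothetical data, loosely modeled on the car-owner example).
const docs = Array.from({ length: 1000 }, (_, i) => ({
  first_name: "Paul",
  surname: "Miller",
  city: "London",
  car: { model: "Bentley", year: 1950 + (i % 50), value: 100000 + i },
}));

const raw = Buffer.from(JSON.stringify(docs), "utf8");
const compressed = zlib.deflateSync(raw); // zlib (deflate) compression
const restored = zlib.inflateSync(compressed); // and back again

console.log(`raw bytes:        ${raw.length}`);
console.log(`compressed bytes: ${compressed.length}`);
console.log(`ratio:            ${(raw.length / compressed.length).toFixed(1)}x`);
console.log(`round trip ok:    ${restored.equals(raw)}`);
```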
What problem is solved by LSM indexes?
Performance
• Easy: fast writes (no indexes)
• Easy: fast reads (add indexes)
• Hard: both fast reads and fast writes
– Smart schema design (hire a consultant)
– LSM index structures (or columnar)
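Why LSM helps with writes: a write only touches an in-memory table, which is later flushed as a sorted, immutable run; reads pay for this by checking several places. A toy sketch of that idea follows. `TinyLSM` is a hypothetical teaching class, not WiredTiger's implementation, and it leaves out durability, deletes and background merging.

```javascript
// Compare two [key, value] entries by key.
const byKey = ([a], [b]) => (a < b ? -1 : a > b ? 1 : 0);

// Binary search for a key in one sorted run of [key, value] pairs.
function binarySearch(run, key) {
  let lo = 0;
  let hi = run.length - 1;
  while (lo <= hi) {
    const mid = (lo + hi) >> 1;
    const [k, v] = run[mid];
    if (k === key) return v;
    if (k < key) lo = mid + 1;
    else hi = mid - 1;
  }
  return undefined;
}

class TinyLSM {
  constructor(memtableLimit = 4) {
    this.memtableLimit = memtableLimit;
    this.memtable = new Map(); // recent writes, held in memory
    this.runs = []; // immutable sorted runs, newest first
  }

  // Writes only touch the in-memory table; no existing run is modified.
  put(key, value) {
    this.memtable.set(key, value);
    if (this.memtable.size >= this.memtableLimit) this.flush();
  }

  // Turn the memtable into one sorted, immutable run.
  flush() {
    if (this.memtable.size === 0) return;
    this.runs.unshift([...this.memtable.entries()].sort(byKey));
    this.memtable = new Map();
  }

  // Reads check the memtable, then every run from newest to oldest.
  get(key) {
    if (this.memtable.has(key)) return this.memtable.get(key);
    for (const run of this.runs) {
      const hit = binarySearch(run, key);
      if (hit !== undefined) return hit;
    }
    return undefined;
  }

  // Merge all runs into one so reads have fewer places to look.
  compact() {
    if (this.runs.length < 2) return;
    const merged = new Map();
    // Apply oldest first so newer values overwrite older ones.
    for (const run of [...this.runs].reverse()) {
      for (const [k, v] of run) merged.set(k, v);
    }
    this.runs = [[...merged.entries()].sort(byKey)];
  }
}

const lsm = new TinyLSM(2);
lsm.put("b", 1);
lsm.put("a", 2); // memtable full, flushed as run [a, b]
lsm.put("b", 3); // newer value for b
lsm.put("c", 4); // flushed as run [b, c]
console.log(lsm.get("b"), lsm.runs.length); // 3 2
lsm.compact();
console.log(lsm.get("b"), lsm.runs.length); // 3 1
```

A B-tree index would instead update its pages in place on every insert, which is what gets expensive with several secondary indexes.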
2B inserts (with 3 secondary indexes)
http://smalldatum.blogspot.fi/2014/12/read-modify-write-optimized.html