Falcon is not InnoDB Kevin Lewis, Falcon Team Lead
Download
Report
Transcript Falcon is not InnoDB Kevin Lewis, Falcon Team Lead
Falcon is not InnoDB
Presented by,
MySQL & O’Reilly Media, Inc.
Kevin Lewis, Falcon Team Lead
Ann Harrison, Falcon Team
[email protected], [email protected]
Falcon is not InnoDB
Different design
Different concurrency methods
Different sweet spots
Different quirks
Different performance
Design differences
InnoDB is modeled after Oracle
Clustered storage
Old versions stored in log
Mixed MVCC and locking
Influenced by MySQL
Statement based logging
File per table / index
Table name rules
Design differences
Falcon derives loosely from Rdb and InterBase
Starkey design
Pure MVCC
Originally had no log – careful write for durability
Designed for large memory multi-processors
Page cache plus record cache
Finely grained multi-threading
Synchronization (read/write) on shared structures
Concurrency
Both default to Repeatable Read
Neither is exactly Repeatable Read per
ISO/SQL
Differ from each other in implementation
InnoDB mixes MVCC and locking
Falcon is pure MVCC
Differ from each other in quirks
Different sweet spots
True of all databases
If you design an application to make best use of
Database A, moving to Database B will be hard
The more you know about A, the harder it will
be to move to B
InnoDB is part of the MySQL family and will be
into the future
Different quirks
Serializable is serializable
Everything else is quirky in different ways
Standard repeatable read
Select returns the same results
Plus any insert that gets committed
Equivalent to pure locking
read locks + write locks w/o predicate locks
Mix in MVCC and you get write anomalies
Falcon quirk 1
On two tables
Insert into t1 (f1) select count (*) from t1M
Falcon quirk 1
Trans1:
Insert into t1 (f1)
select count (*)
from t1
Trans2:
Insert into t1 (f1)
select count (*)
from t1
Repeat
mysql> select * from t1;
+------+
| f1 |
+------+
| 0|
| 0|
| 1|
| 1|
+------+
4 rows in set (0.00 sec)
Falcon quirk 1
Innodb makes
transaction 2 wait for
transaction 1’s commit,
then stores the “right”
values in the table
Transaction 2 has an
inconsistent view of data
mysql> select count(*) from t1;
+----------+
| count(*) |
+----------+
|
0|
+----------+
mysql> insert into t1 (f1) select count(*)
from t1;
mysql> select count(*) from t1;
+----------+
| count(*) |
+----------+
|
1|
+----------+
mysql> select * from t1;
+------+
| f1 |
+------+
| 2|
+------+
Falcon quirk 2
Exchange values between two tables using two
transactions
Neither “sees” the other’s changes
Falcon quirk 2
Transaction1:
mysql> select * from dinner_menu;
mysql> update lunch_menu
+--------+-------+
->
set price =
| entree | price |
->
(select price * 0.5 from
+--------+-------+
->
from dinner_menu where
| steak | 25.00 |
->
dinner_menu.entree =
+--------+-------+
->
lunch_menu.entree);
mysql> select * from lunch_menu;
Transaction 2:
+--------+-------+
mysql> update dinner_menu
| entree | price |
->
set price =
+--------+-------+
->
(select price * 0.5
| steak | 5.00 |
->
from lunch_menu where
+--------+-------+
->
lunch_menu.entree =
->
dinner_menu.entree);
Falcon quirk 2
mysql> select * from lunch_menu;
+--------+-------+
| entree | price |
+--------+-------+
| steak | 12.50 |
+--------+-------+
mysql> select * from dinner_menu;
+--------+-------+
| entree | price |
+--------+-------+
| steak | 10.00 |
+--------+-------+
InnoDB transaction 2
waits for transaction 1 to
commit, then gets the
“correct” result
InnoDB quirk 1
mysql> select * from t1;
Select for update sees a
different scope than
normal select
| f1 |
+------+
+------+
|
1|
+------+
mysql> select * from t1 for update;
+------+
With consistent-read off,
Falcon does the same
| f1 |
+------+
|
5|
+------+
InnoDB quirk 2
InnoDB does implicit
“select for update” in
some subqueries
mysql> select * from t1;
+------+
| f1 |
+------+
|
1|
+------+
Falcon does not
mysql> create table t2 as select * from t1;
mysql> select * from t2;
+------+
| f1 |
+------+
|
5|
+------+
Falcon Architecture – short form
Serial
Log
Record cache
Gophers
Back end
Page Cache
I/O Threads
Front end
Tablespace
Files
Performance
Where we were last year
Performance peaks were good
Standard deviation excessive
25000
20000
15000
InnoDB(2008-4-16)
Falcon(2008-2-27)
10000
5000
0
8
16
32
64
128
Performance problem 1
Problem
Quick benchmarks had bad results
Symptom
Auto-commit / select * was slow
Solution
Reuse read-only transactions
Reduce the cost of transaction startup
Non-blocking scavenge
Falcon performance problem 2
Problem
DBT2 times degraded badly
Symptom
Running a monitoring task improved performance
Solution
First, slow down the front end
Put a limit on the number of Active transactions
that can be committed but not “write complete”
Second, speed up the back end
Falcon’s back end
From Log to Page Cache - Gopher threads
Add a pool (‘herd’) of Gophers threads
From Page Cache to disk
Add a pool of I/O threads
Direct IO
Page Consolidation
Thread Prioritization
Falcon performance problem 3
Problem
Index access (read and insert) was disappointing
Symptom
Significant (>10%) time spent in locating index entry
points
Solution
Add Supernode lookup to each index page
Compromise between density of prefix compression and
speed of binary search
Falcon performance problem 4
Problem
Dbt2 tests were disappointing and erratic
Symptom
Some tests were OK, many weren’t
Standard deviation was large
Solution
Hold the mutex in sync object to avoid missing a wakeup call between recognizing the need to wait and going
to sleep.
Falcon performance April 2008
CPU bound performance is better.
10 warehouse DBT2 (900 Mb)
16-way, 32GB Intel Caneland, 4 disk RAID 10
25000
20000
15000
Falcon(2008-4-9)
Falcon(2008-2-27)
10000
InnoDB(2008-4-16)
5000
0
8
16
32
64
128
Falcon vs InnoDB - Dec, 2007
18000
16000
14000
12000
10000
INNODB-20080219
FALCON-20071224
8000
6000
4000
2000
0
8
16
32
64
128
192
After Transaction Fix
18000
16000
14000
12000
INNODB-20080219
10000
FALCON-20071224
FALCON-20080215
8000
FALCON-20080219
6000
4000
2000
0
8
16
32
64
128
192
No Thread Stalls !
18000
16000
14000
12000
INNODB-20080219
10000
FALCON-20071224
FALCON-20080215
8000
FALCON-20080219
FALCON-20080326
6000
4000
2000
0
8
16
32
64
128
192
Today
18000
16000
14000
12000
INNODB-20080219
FALCON-20071224
10000
FALCON-20080215
FALCON-20080219
8000
FALCON-20080326
FALCON-20080414
6000
4000
2000
0
8
16
32
64
128
192
Record Cache / Page Cache
100 Warehouses (9GB) with 2GB Falcon Cache
3500
3000
2500
2000
350M/1650M
700M/1300M
1000M/1000M
1500
1300M/700M
1000
500
0
8
16
32
64
128
Falcon Feature Preview
http://forge.mysql.com/wiki/Falcon_Feature_Preview
Questions ?