One Size Almost Fits All - H

Download Report

Transcript One Size Almost Fits All - H

SYSTEM IS
THE ONLY THING I
CAN
TRUST
@ANDY_PAVLO
Thirty Years Ago…
2
INTERACTIVE
TRANSACTIONS
SMALL # OF CPU
CORES
SMALL MEMORY
SIZES
TPC-C BENCHMARK
Warehouse Order Processing
APPLICATION
NewOrder Transaction
1.Check item stock
level.
2.Create new order
information.
4
TPC-C BENCHMARK
Warehouse Order Processing
20,000
MySQL
Postgres
15,000
10,000
5,000
0
TXN/SEC
1
2
3
4
5
6 7 8
CPU CORES
9
10
11
12
5
TRADITIONAL DBMS
Measured CPU Cycles
30%
30%
28%
12%
OLTP THROUGH THE LOOKING GLASS,
AND WHAT WE FOUND THERE
SIGMOD, pp. 981-992, 2008.
BUFFER
POOL
LOCKING
RECOVERY
6
HARDWARE UPGRADE
REPLICATION
DISTRIBUTED CACHE
SHARDING MIDDLEWARE
NOSQL
HOW TO SCALE
UP WITHOUT
GIVING UP
TRANSACTIONS?
Distributed Main Memory
Transaction Processing System
H-STORE: A HIGH-PERFORMANCE, DISTRIBUTED
MAIN MEMORY TRANSACTION PROCESSING SYSTEM
Proc. VLDB Endow., vol. 1, iss. 2, pp. 1496-1499, 2008.
xMAIN MEMORY STORAGE
DISK ORIENTED
iSERIAL EXECUTION
CONCURRENT EXECUTION
/
COMPACT LOGGING
HEAVYWEIGHT RECOVERY
STORED
PROCEDURE
PARTITIONS
Transaction
Applicati
on
Procedure Name
VoteCount:
Transaction
Input
ParametersExecution
SELECT
ResultCOUNT(*)
FROM votes
WHERE phone_num = ?;
InsertVote:
INSERT INTO votes
VALUES (?, ?, ?);
run(phoneNum, contestantId, currentTime) {
result = execute(VoteCount, phoneNum);
if (result > MAX_VOTES) {
return (ERROR);
}
execute(InsertVote, phoneNum,
contestantId,
currentTime);
return (SUCCESS);
}
SINGLE-THREADED
CMD
SNAPSHOTS
LOG
EXECUTION ENGINES
11
TPC-C BENCHMARK
Warehouse Order Processing
20,000
MySQL
Postgres
H-Store
15,000
40x
10,000
5,000
0
TXN/SEC
1
2
3
4
5
6 7 8
CPU CORES
9
10
11
12
12
DISTRIBUTE
D
TRANSACTIO
TPC-C BENCHMARK
8 Cores per Node
10% Distributed Transactions
40,000
H-Store
30,000
20,000
10,000
0
TXN/SEC
1
2
NODES
3
4
14
DISTRIBUTED
TRANSACTIONS
Applicati
on
Query Count
P1
P2
P3
P4
15
KNOW WHAT
TRANSACTIONS
WILL DO BEFORE THEY
START
BUT PEOPLE
ALWAYS GIVE ME
BAD ADVICE
DON’T GET
INVOLVED
WITH
COMPUTERS.
YOU’LL
NEVER MAKE
ANY MONEY.
DON’T GET A
PHD.
EVERYONE WILL
THINK YOU ARE
A JERK.
THE DATABASE
SYSTEM
ALWAYS HAS MORE
INFORMAT
DO USE MACHINE
LEARNING TO
PREDICT
TRANSACTION
BEHAVIOR.
ON PREDICTIVE MODELING FOR OPTIMIZING
TRANSACTION EXECUTION IN PARALLEL OLTP SYSTEMS
Proc. VLDB Endow., Vol 5, Iss. 2, pp. 85-96, 2011
PREDICTIVE MODELS
22
SELECT * FROM
SELECT * FROM
WAREHOUSE
WHERE
SELECT * FROM
WAREHOUSE
WHERE
W_ID
= 10; * FROM
SELECT
WAREHOUSE
WHERE
W_ID
= 10;
WAREHOUSE
WHERE
W_ID
= 10; DISTRICT
SELECT
*
FROM
SELECT
W_ID* FROM
= 10; DISTRICT
WHERE
D_W_ID
= 10D_ID
AND
INSERT
INTO
ORDERS
D_W_ID
= 10 AND
INSERT
INTO
ORDERS
D_ID
=9;
O_D_ID,
=9;(O_W_ID,
(O_W_ID,
O_D_ID,
O_C_ID)
VALUES
(10, 9,
INSERT
INTO
O_C_ID)
VALUES
INSERT
INTOORDERS
ORDERS(10, 9,
12345);
(O_W_ID,
O_D_ID,
12345); O_D_ID,
(O_W_ID,
⋮
O_C_ID,…)
O_C_ID) VALUES
⋮ (10, 9,
(10,12345);
9, 12345,…);
Feature
Clusterer
⋮⋮
Decision Tree
Feature2
______
______
______
______
______
______
______
______
______
______
______
______
______
______
______
______
______
______
______
______
______
______
______
______
______
______
______
______
______
______
______
______
______
______
______
______
Model
Generator
Classifier
Feature1
Feature2
Markov Models
23
Applicati
on
DISTRIBUTED
TRANSACTIONS
24
TPC-C BENCHMARK
8 Cores per Node
10% Distributed Transactions
25,000
60,000
Naïve
OPTIMAL
Houdini
20,000
45,000
2x
15,000
30,000
10,000
15,000
5,000
0
TXN/SEC
1
2
NODES
3
4
25
Applicati
on
DISTRIBUTED
TRANSACTIONS
SP1 - Waiting for Query Result SP2 - Waiting for Query Request
SP3 - Two-Phase Commit
26
TRANSACTION STALL
POINTS
BASE PARTITION
18%
45%
37%
SP1 - Waiting for Query Result
SP3 - Two-Phase Commit
REMOTE
PARTITION
73%
5%
22%
SP2 - Waiting for Query Request
Real Work
27
DO SOMETHING
USEFUL
WHEN STALLED
DON’T BE
SURPRISED IF YOU
& KB DON’T LAST
THROUGH
GRAD SCHOOL.
DON’T BE STAN’S
STUDENT
IF YOU GO TO
BROWN.
DO USE MACHINE
LEARNING TO
SCHEDULE
SPECULATIVE TASKS.
THE ART OF SPECULATIVE EXECUTION
In Progress (August 2013)
SERIALIZABLE SCHEDULE
Distributed Transaction
Zzzz…
VERIFY
C
Single-Partition
Transaction
Speculative Transaction
C
C
Speculative Transaction
Single-Partition
Transaction
C
C
32
SPECULATIVE
TRANSACTIONS
…
Transaction Queue
Speculation Candidate:
WRITE X
Distributed Transaction:
READ X
READ X
33
SPECULATIVE QUERIES
QueryY:
SELECT S_QTY FROM STOCK
WHERE S_W_ID = ?
AND S_I_ID = ?;
Distributed Transaction:
34
Transaction Parameters:
w_id=0
i_w_ids=[1,0]
i_ids=[1001,1002]
GetWarehouse:
CheckStock:
SELECT *
FROMFROM
WAREHOUSE
S_QTY
STOCK
WHERE W_ID
= ?
S_W_ID
= ?
AND S_I_ID = ?;
35
VERIFICATION
Distributed Transaction
Speculative Transactions
Query1
Query2
Query3
Query1
Query3
Query3 Query3
Query1 Query2
Query3
36
TPC-C BENCHMARK
8 Cores per Node
10% Distributed Transactions
50,000
None
Spec Queries
Spec Txns
All
40,000
30,000
20,000
10,000
0
TXN/SEC
1
2
NODES
3
4
37
Optimize Single-Partition Execution
H-STORE: A HIGH-PERFORMANCE, DISTRIBUTED MAIN
MEMORY TRANSACTION PROCESSING SYSTEM
Proc. VLDB Endow., vol. 1, iss. 2, pp. 1496-1499, 2008.
Minimize Distributed Transactions
SKEW-AWARE AUTOMATIC DATABASE PARTITIONING IN
SHARED-NOTHING, PARALLEL OLTP SYSTEMS
Proceedings of SIGMOD, 2012.
Identify Distributed Transactions
ON PREDICTIVE MODELING FOR OPTIMIZING TRANSACTION
EXECUTION IN PARALLEL OLTP SYSTEMS
Proc. VLDB Endow., vol. 5, pp. 85-96, 2011.
Utilize Transaction Stalls
THE ART OF SPECULATIVE EXECUTION
In Progress (August 2013)
FUTURE
WORK
One Size
Almost
Fits All
N
N-
HSSTORE STORE STORE
41
One Size
Almost™
Fits All
N
N-
HSSTORE STORE STORE
43
Escape From
Planet
Zdonik
(i.e., Andy Needs to Get
Tenure)
Beyond the ‘Stores
• Non-Partitionable
Workloads.
• The Poor Man’s Spanner.
• Scientific Databases.
DON’T MESS IT UP
WITH KB.
Stan “The Thrill” Sam
Ugur
David
Zdonik Stonebraker Madden Cetintemel DeWitt
Evan
Saurya
Xin
Jones Velagapudi Jia
Carlo Justin
Yang
Curino DeBrabant Zou
Dan
Abadi
Visawee Ning
Angkana. Shi
John
Meehan