Making Database Applications Perform Using Program Analysis Alvin Cheung Samuel Madden Armando Solar-Lezama MIT Owen Arden Andrew C.

Download Report

Transcript Making Database Applications Perform Using Program Analysis Alvin Cheung Samuel Madden Armando Solar-Lezama MIT Owen Arden Andrew C.

Making Database Applications Perform
Using Program Analysis
Alvin Cheung
Samuel Madden
Armando Solar-Lezama
MIT
Owen Arden
Andrew C. Myers
Cornell
Developing Database
Applications
Application
Logic
Query
Java
SQL Database
Application Server
2/1/13
SQL
NEDB '13
2
Developing Database
Applications
Application
Logic
Query
Java
Stored
Procedures
PL/SQL
SQL Database
Application Server
2/1/13
SQL
NEDB '13
3
Developing Database
Applications
Application
Logic
SQL
Query
Java
Language Choice for Application Logic
Stored
Procedures
PL/SQL
Program analysis to the rescue!
Application Distribution
SQL Database
Application Server
2/1/13
NEDB '13
4
StatusQuo
• Express application logic in ways that
programmers are comfortable with
• Job of compiler & runtime to determine the
most efficient implementation
2/1/13
NEDB '13
5
Two Key Technologies
• Infer queries from imperative code
• Migrate computation between servers for
optimal performance
2/1/13
NEDB '13
6
Relational Operations in
Imperative Code
List getUsersWithRoles () {
List users = getUsersFromDB();
List roles = getRolesFromDB();
List results = new ArrayList();
for (User u : users) {
for (Role r : roles) {
if (u.roleId == r.id)
results.add(u); }}
return results; }
SELECT * FROM
user
SELECT * FROM
role
List getUsersWithRoles () {
return executeQuery(
“SELECT u FROM users u, roles r
convert to
WHERE
u.roleId == r.id
ORDER BY u.roleId, r.id”; }
2/1/13
NEDB '13
7
Relational Operations in
Imperative Code
List getUsersWithRoles () {
List users = getUsersFromDB();
List roles = getRolesFromDB();
List results = new ArrayList();
for (User u : users) {
for (Role r : roles) {
if (u.roleId == r.id)
results.add(u); }}
results
return results;
}
output variable
Goal
Find a variable that
we can rewrite into a
SQL expression
List getUsersWithRoles () {
return executeQuery(
“SELECT u FROM users u, roles r
convert to
WHERE
u.roleId == r.id
ORDER BY u.roleId, r.id”; }
2/1/13
NEDB '13
8
Query By Synthesis (QBS)
• Identify potential code fragments
– i.e., regions of code that fetches persistent
data and return values
• Find SQL expressions for output variables
• Try to prove that those expressions
preserve program semantics
– if so, convert the code!
2/1/13
NEDB '13
9
Search for Output Expressions
List getUsersWithRoles () {
users
List users
= query(select * from users);
roles
List roles
= query(select * from roles);
List results = [];
for (User u : users) {
for (Role r : roles) {
if (u.roleId == r.id)
results = results : [] }}
results
return results;
}
Infinite search space size!
2/1/13
NEDB '13
11
Constraints for
Output Expressions
List getUsersWithRoles () {
users
List users
= query(select * from users);
roles
List roles
= query(select * from roles);
List results = [];
loop invariant
for (User u : users) {
results = πouter
user( users[0 .. i] ⨝roleId = id roles )
for (Role r : roles) {
if (u.roleId == r.id)
results = results : [] }}
results =output
πuser( expression
users ⨝roleId = id roles )
results
return results;
}
If outer loop invariant
is true and outer loop terminates
then output expression is true
Still need a smarter
way to search
2/1/13
NEDB '13
12
Search for Output Expressions
and Invariants
• Use program synthesis as search engine
Symbolic desc. of
search space
Expression that
satisfies all the
constraints
Solution
constraints
Symbolic manipulation
Counter-example driven search
2/1/13
NEDB '13
13
Experiments
2/1/13
NEDB '13
14
Real-world Evaluation
Wilos (project management application) – 62k LOC
Operation type
# Fragments found
# Fragments converted
Projection
1
1
Selection
13
10
Join
7
7
Aggregation
11
10
Total
33
28
2/1/13
NEDB '13
15
Performance Evaluation:
Join Query
1000K
Execution time (ms)
original (lazy)
inferred (lazy)
100K
10K
Nested-loop join  Hash join!
O(n2)
O(n)
1K
100
0
2/1/13
20K
40K
60K
80K
Number of roles / users in DB
NEDB '13
100K
16
Developing Database
Applications
Application
Logic
Query
Java
Stored
Procedures
SQL
PL/SQL
Application Distribution
SQL Database
Application Server
2/1/13
NEDB '13
17
Running Example
discount = executeQuery("select discount from
customers
where id = " + cid);
totalAmount = orderTotal * (1 – discount);
credit = executeQuery("select credit from customers
where id = " + cid);
if (credit < totalAmount)
printToConsole("Only " + credit + " in account!");
else
executeUpdate("update customer set credit = " +
(credit – totalAmount) + "
where id = " + cid);
2/1/13
NEDB '13
18
Actual Execution
DB
APP
DB
APP
DB
discount = executeQuery("select discount from
customers
where id = " + cid);
totalAmount = orderTotal * (1 – discount);
credit = executeQuery("select credit from customers
where id = " + cid);
if (credit < totalAmount)
printToConsole("Only " + credit + " in account!");
else
executeUpdate("update customer set credit = " +
(credit – totalAmount) + "
where id = " + cid);
2/1/13
NEDB '13
19
Actual Execution
DB
discount = executeQuery("select discount from
customers
where id = " + cid);
network communication
APP
totalAmount = orderTotal * (1 – discount);
network communication
DB
credit = executeQuery("select credit from customers
where id = " + cid);
network communication
APP
DB
if (credit < totalAmount)
printToConsole("Only " + credit + " in account!");
else
network communication
executeUpdate("update customer set credit = " +
(credit – totalAmount) + " where
id = " + cid);
2/1/13
NEDB '13
20
Speeding up Execution
discount = executeQuery("select discount from
customers
where id = " + cid);
totalAmount = orderTotal * (1 – discount);
DB
credit = executeQuery("select credit from customers
where id = " + cid);
APP
DB
if (credit < totalAmount)
printToConsole("Only " + credit + " in account!");
else
executeUpdate("update customer set credit = " +
(credit – totalAmount) + " where
id = " + cid);
2/1/13
NEDB '13
21
Speeding up Execution
discount = executeQuery("select discount from
customers
where id = " + cid);
DB
APP
DB
totalAmount = orderTotal * (1 – discount);
data dependency
credit = executeQuery("select credit from customers
where id = " + cid);
control dependency
if (credit < totalAmount)
printToConsole("Only " + credit + " in account!");
else
executeUpdate("update customer set credit = " +
(credit – totalAmount) + "
where id = " + cid);
2/1/13
NEDB '13
22
Speeding up Execution
discount = executeQuery("select discount from
customers
where id = " + cid);
DB
APP
DB
totalAmount = orderTotal * (1 – discount);
data dependency
DB Server
credit = executeQuery("select credit from customers
where id = " + cid);
control dependency
if (credit < totalAmount)
printToConsole("Only " + credit + " in account!");
else
executeUpdate("update customer set credit = " +
(credit – totalAmount) + " where
id = " + cid);
2/1/13
NEDB '13
23
Introducing Pyxis
• “Store-procedurizes” DB apps and pushes
computation to the DB
• Adaptively controls the amount of
computation pushed to DB for optimal
performance
• No programmer intervention required
2/1/13
NEDB '13
24
Using Pyxis
2/1/13
NEDB '13
25
How Pyxis Works
Java
SQL
Java
SQL
Java
SQL
Java
Java
SQL
Java
SQL
Java
SQL
Java
Partition
Instrument
Java
control transfer
SQL
Java
SQL
Java
SQL
Deploy
Java
Monitor
App Server
2/1/13
DB Server
NEDB '13
26
How Pyxis Works
Java
SQL
Java
SQL
Java
SQL
Java
Java
SQL
Java
SQL
Java
SQL
Java
Partition
Instrument
Java
control transfer
SQL
Java
SQL
Deploy
Java
SQL
Java
Monitor
App Server
2/1/13
DB Server
NEDB '13
27
Generating Program Partitions
• Deploy and profile application as-is
• Construct a dependence graph of program
statements
– captures both control and data flow
• Formulate linear program from profile data
and dependence graph
– solution gives a partitioning of the source
code
2/1/13
NEDB '13
28
Experiments
2/1/13
NEDB '13
30
TPC-C on 16-core DB machine
JDBC
Manual
25
Average Latency (ms)
Pyxis
Pyxis generated implementation:
3x latency reduction
1.7x thruput increase
20
15
10
5
100
2/1/13
300
500
700
900
Average Thruput (xact / s)
NEDB '13
1100
1300
32
StatusQuo
Ease DB application development
Convert imperative program statements into
declarative SQL
Fully automatic code partitioning using
application and server characteristics
db.csail.mit.edu/statusquo
2/1/13
NEDB '13
33
Backup
2/1/13
NEDB '13
34