xCalls: Safe I/O in Memory Transactions Haris Volos, Andres Jaan Tack, Neelam Goyal+, Michael Swift, Adam Welc§ University of Wisconsin - Madison + §


Transactional Memory (TM)
• CMP leads to more concurrency within programs
• Synchronizing access to shared data via locks is hard
• atomic construct simplifies synchronizing access to shared data
With locks:
LOCK(L)
x = x - c;
y = y + c;
UNLOCK(L)
With atomic:
atomic {
x = x - c;
y = y + c;
}
Thread 1
atomic {
A = A - 20;
B = B + 20;
}
Thread 2
atomic {
B = B - 10;
A = A + 10;
}
• Atomicity: PERFORM both updates to A and B or none
• Isolation: OBSERVE both of the other (red) transaction's updates to A and B or none
• Conflict: both transactions store to A; ABORT conflicting transactions (the blue one on the slide)
2
A challenging world…
• Real world programs frequently take actions
outside of their own memory
– Firefox: ~1% of critical sections make a system call [Baugh TRANSACT ’07]
• Most TM systems apply only to user-level memory
Thread 1
atomic {
item = procItem(queue);
write (file, principal);      /* 1a */
write (file, item->header);   /* 1b */
}
Thread 2
atomic {
item = procItem(queue);
write (file, principal);      /* 2a */
write (file, item->header);   /* 2b */
}
• Interleaved writes: records from the two threads mix in the file
• On abort: memory updates are dropped, file writes are not dropped
3
State of the art
• Defer
• Undo
• Global Lock
atomic {
item = procItem(queue);
write (file, principal);
write (file, item->header);
send (socket, item->body);
}
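Figure: the file writes go to a NAS over the LAN; the send goes to the Internet and is deferred until COMMIT ("Perform send").
Drawbacks noted on the slide: failures are ignored; the global lock stops the world.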
4
Contribution
Figure (layers, top to bottom):
– Transactional Program, issuing xCalls and legacy calls
– xCall Library (runs in user mode)
– System call interface
– Transaction-unaware kernel
• xCall programming interface
– Exposes transactional semantics to programmer
– Enables I/O within transactions w/o stopping the world
– Exposes all failures to the program
5
Outline
• Motivation
• xCall Design & Implementation
• Evaluation
• Conclusion
6
Design overview
• Principles
1. As early as possible but not earlier
2. Expose all failures
• Components
– Atomic execution
– Isolation
– Error handling
7
Atomic execution
• Provide abort semantics for kernel data and I/O
– Expose to programmer when action is performed
Can reverse action?   Need result?   Execution     Example
Yes                   --             In-place      x_write()
No                    No             Defer         x_write_pipe()
No                    Yes            Global lock   ioctl()
atomic {
item = procItem(queue);
x_write (file, principal);
x_write (file, item->header);
}
Figure: on Abort, the in-place writes to the file are undone from saved buffers.
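The decision table on this slide can be read as a small function. The sketch below is only an illustration of that table: `enum x_policy`, `choose_policy`, and `policy_name` are hypothetical names and are not part of the xCall API.

```c
#include <assert.h>
#include <stdbool.h>

/* Execution strategies named on the slide (identifiers are hypothetical). */
enum x_policy { X_IN_PLACE, X_DEFER, X_GLOBAL_LOCK };

/* Pick a strategy from the two questions in the table. */
static enum x_policy choose_policy(bool can_reverse, bool need_result) {
    if (can_reverse)
        return X_IN_PLACE;      /* e.g. x_write(): run now, undo on abort */
    if (!need_result)
        return X_DEFER;         /* e.g. x_write_pipe(): run at commit */
    return X_GLOBAL_LOCK;       /* e.g. ioctl(): irreversible, result needed now */
}

/* Human-readable name matching the "Execution" column. */
static const char *policy_name(enum x_policy p) {
    switch (p) {
    case X_IN_PLACE:    return "In-place";
    case X_DEFER:       return "Defer";
    case X_GLOBAL_LOCK: return "Global lock";
    }
    assert(!"unknown policy");
    return "?";
}
```

A reversible action runs in place whether or not its result is needed, which is why the first row of the table leaves "Need result?" blank.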
8
Isolation
• Prevent conflicting changes to kernel data made
within a transaction
• Sentinels
– Revocable user-level locks
– Lock logical kernel state visible through system calls
Thread 1
atomic {
item = procItem(queue);
x_write (file, principal);
x_write (file, item->header);
}
Thread 2
atomic {
item = procItem(queue);
x_write (file, principal);    /* Conflict */
x_write (file, item->header);
}
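To make the sentinel idea concrete, here is a minimal bookkeeping sketch, assuming one sentinel per file descriptor and integer transaction ids. All names (`sentinel_try_acquire`, `sentinel_release_all`, `MAX_FDS`) are hypothetical, and the sketch is single-threaded: a real implementation would need atomic updates and a way to revoke a sentinel from a transaction that is being aborted.

```c
#include <assert.h>
#include <stdbool.h>

#define MAX_FDS  64
#define NO_OWNER (-1)

/* One sentinel per file descriptor; the value is the owning transaction id. */
static int  sentinel_owner[MAX_FDS];
static bool sentinels_ready = false;

static void sentinels_init(void) {
    for (int i = 0; i < MAX_FDS; i++)
        sentinel_owner[i] = NO_OWNER;
    sentinels_ready = true;
}

/* Try to take the sentinel for fd on behalf of transaction tx.
 * Returns true if tx now holds it (or already did), false on conflict. */
static bool sentinel_try_acquire(int fd, int tx) {
    assert(fd >= 0 && fd < MAX_FDS && tx >= 0);
    if (!sentinels_ready)
        sentinels_init();
    if (sentinel_owner[fd] == NO_OWNER || sentinel_owner[fd] == tx) {
        sentinel_owner[fd] = tx;
        return true;
    }
    return false;
}

/* Drop every sentinel held by tx, as would happen at commit or abort. */
static void sentinel_release_all(int tx) {
    if (!sentinels_ready)
        return;
    for (int i = 0; i < MAX_FDS; i++)
        if (sentinel_owner[i] == tx)
            sentinel_owner[i] = NO_OWNER;
}
```

In the slide's scenario, Thread 1's transaction would hold the sentinel for the file, so Thread 2's first x_write would see a conflict until Thread 1 commits or aborts.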
9
Error handling
• Some errors might not happen until transaction
commits or aborts
• Inform programmer when failures happen
– Handle errors after transaction completes
atomic {
item = procItem(queue);
x_write (file, principal, &err1);
x_write (file, item->header, &err2);
x_send (socket, item->body, &err3);
}
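if (err1 || err2 || err3) {
/* CLEANUP: handle error here */
}
Figure: the file writes go over the LAN; the send to the Internet is deferred. At COMMIT the deferred send is performed and FAILS, so err3 = error, and the program handles it after the transaction completes.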
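The flow on this slide, where a deferred action fails at commit and the failure surfaces through a caller-supplied flag, can be sketched as follows. This is an illustrative model under stated assumptions, not the xCall implementation: `struct tx`, `tx_defer`, `tx_commit`, `tx_abort`, and the two stand-in actions are hypothetical names.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

#define MAX_DEFERRED 16

/* A deferred action returns 0 on success or an errno value on failure. */
typedef int (*deferred_fn)(void *arg);

struct deferred { deferred_fn fn; void *arg; int *err; };
struct tx       { struct deferred q[MAX_DEFERRED]; size_t n; };

/* Queue an action to run at commit; *err is cleared now and set at commit. */
static void tx_defer(struct tx *t, deferred_fn fn, void *arg, int *err) {
    assert(t->n < MAX_DEFERRED);
    *err = 0;
    t->q[t->n++] = (struct deferred){ fn, arg, err };
}

/* Commit: run queued actions in order and report each result via its flag. */
static void tx_commit(struct tx *t) {
    for (size_t i = 0; i < t->n; i++)
        *t->q[i].err = t->q[i].fn(t->q[i].arg);
    t->n = 0;
}

/* Abort: drop queued actions without running them. */
static void tx_abort(struct tx *t) {
    t->n = 0;
}

/* Stand-in for a send that fails once it is finally performed. */
static int failing_send(void *arg) { (void)arg; return EPIPE; }

/* Stand-in for an action that succeeds and bumps a counter. */
static int ok_action(void *arg) { *(int *)arg += 1; return 0; }
```

After `tx_commit`, the caller checks the flags exactly as the slide does with `if (err1 || err2 || err3)`, so the cleanup code runs outside the transaction.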
10
Example: file write
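ssize_t x_write (int fd, void *buf,
                 ssize_t nbytes, int *result) {   /* result: error handling */
  void *localbuf;
  ssize_t bytes;
  off_t off;
  get_sentinel(fd);                               /* isolation */
  localbuf = alloc_local_buf(nbytes);
  /* Save the bytes about to be overwritten. pread() leaves the file
     offset alone, so the write below lands where the caller expects. */
  off = lseek(fd, 0, SEEK_CUR);
  pread(fd, localbuf, nbytes, off);               /* atomic execution */
  bytes = write(fd, buf, nbytes);
  if (bytes != -1)
    compensate(x_undo_write, fd, localbuf,
               bytes, result);                    /* atomic execution */
  return bytes;
}
int x_undo_write (int fd, void *buf,
                  ssize_t nbytes, int *result) {  /* atomic execution */
  off_t ret1;
  ssize_t ret2 = 0;
  ret1 = lseek(fd, -nbytes, SEEK_CUR);  /* step back over the written bytes */
  if (ret1 != -1)                       /* offset 0 is a valid position */
    ret2 = pwrite(fd, buf, nbytes, ret1);
  if (ret1 == -1 || ret2 == -1)
    *result = errno;                    /* error handling */
  return (ret1 == -1 || ret2 == -1);
}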
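The slide's code depends on library helpers (get_sentinel, alloc_local_buf, compensate) that are not shown. The sketch below isolates just the save-then-undo step using plain POSIX calls so it can be run on its own. It assumes the write overwrites existing bytes rather than extending the file, and `struct undo_rec`, `logged_write`, `undo_write`, and `undo_demo` are hypothetical names.

```c
#define _XOPEN_SOURCE 700
#include <assert.h>
#include <stdlib.h>
#include <unistd.h>

/* State needed to undo one overwrite. */
struct undo_rec { int fd; off_t off; char old[64]; ssize_t len; };

/* Write n bytes at the current offset, first saving the bytes being replaced. */
static ssize_t logged_write(int fd, const void *buf, size_t n, struct undo_rec *u) {
    assert(n <= sizeof u->old);
    u->fd = fd;
    u->off = lseek(fd, 0, SEEK_CUR);
    if (u->off == (off_t)-1)
        return -1;
    u->len = pread(fd, u->old, n, u->off);   /* does not move the offset */
    if (u->len == -1)
        return -1;
    return write(fd, buf, n);
}

/* Restore the saved bytes and the original file offset. Returns 0 on success. */
static int undo_write(const struct undo_rec *u) {
    if (pwrite(u->fd, u->old, (size_t)u->len, u->off) != u->len)
        return -1;
    return lseek(u->fd, u->off, SEEK_SET) == (off_t)-1 ? -1 : 0;
}

/* Overwrite part of a temp file, undo it, and copy the file contents to out. */
static int undo_demo(char *out, size_t outsz) {
    char path[] = "/tmp/xcall_demo_XXXXXX";
    struct undo_rec u;
    assert(outsz >= 12);
    int fd = mkstemp(path);
    if (fd == -1)
        return -1;
    unlink(path);                            /* file disappears on close */
    int ok = write(fd, "hello world", 11) == 11
          && lseek(fd, 6, SEEK_SET) == 6
          && logged_write(fd, "XXXXX", 5, &u) == 5
          && undo_write(&u) == 0
          && pread(fd, out, 11, 0) == 11;
    out[ok ? 11 : 0] = '\0';
    close(fd);
    return ok ? 0 : -1;
}
```

Handling a write that grows the file would additionally require recording the old file size and truncating on undo, which this sketch leaves out.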
11
Summary
• xCall API exposes transactional semantics
– Atomicity
– Isolation
– Error handling
• Prototype implementation
– Executes as user-mode library
– Relies on Intel STM for transactional memory
– Provides 27 xCalls including file handling,
communication, threading
12
Outline
• Motivation
• xCall Design & Implementation
• Evaluation
– Benefit of xCalls over global lock
– Benefit of TM over locks
• Conclusion
13
Evaluation platform
• Transactified three large multithreaded apps
– Berkeley DB
– BIND
– XMMS
• Configurations
– Native : locks + system calls
– STM : transactions + system calls + global lock
– xCalls : transactions + xCalls
• Run on 4 quad-core 2 GHz AMD Barcelona
14
Performance: Berkeley DB (1/2)
• Workload: TPC-C
• xCalls scales better than STM with global lock
• TM worse than locks due to STM overhead
15
Performance: Berkeley DB (2/2)
• Workload: Lockscale
• xCalls improve concurrency over coarse grain lock
• Global lock kills optimistic concurrency
16
Performance: BIND
• Workload: QueryPerf
• Transactions scale better than coarse grain locks
• xCalls enable additional concurrency
17
Performance summary
• xCalls benefit programs with I/O concurrency
• TM benefits programs with contended but not
conflicting critical sections
18
Conclusion
• xCall programming interface
– Brings common OS services to TM programs
– Requires no kernel modifications
– Improves scalability over state of the art
Questions?
19