
AFix: Automated Atomicity-Violation Fixing
Guoliang Jin, Linhai Song, Wei Zhang, Shan Lu, and Ben Liblit
University of Wisconsin–Madison
Need to Find and Fix Concurrency Bugs
 The multicore era is already here
 Programmers struggle to reason about concurrency
 More and more concurrency bugs
 Many concurrency bugs can be automatically detected
Thread 1
if (ptr != NULL) {
    ptr->field = 1;
}
Thread 2
ptr = NULL;
Segmentation Fault
 But bugs need to be fixed
Bug-fixing
 The bug-fixing process is lengthy and resource-consuming
Understand a bug (bug 1, …, bug n) → Generate a patch → Review & test the patch (correctness, performance, readability)
 Nearly 70% of patches are buggy in their first releases
 Automated fixing is desired, but difficult in general
Automated Concurrency-Bug Fixing
 Fixing concurrency bugs automatically is feasible
 The program is correct in most interleavings.
Interleaving 1 (correct):
Thread 1
if (ptr != NULL) {
    ptr->field = 1;
}
Thread 2
ptr = NULL;
Interleaving 2 (correct):
Thread 2
ptr = NULL;
Thread 1
if (ptr != NULL) {
    ptr->field = 1;
}
 Only need to remove some bad interleavings.
Thread 1
if (ptr != NULL) {
Thread 2
ptr = NULL;
Thread 1
    ptr->field = 1;
}
Segmentation Fault
AFix: Automated Atomicity-Violation Fixing
 Why atomicity-violation bugs?
 One of the most common types of concurrency bug
 Strategy
 Statically adding locks to remove buggy interleavings.
 Goal
 Automate the whole bug-fixing process
 Provide best-effort atomicity-violation patches
• Correctness
• Performance
• Readability
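To make the lock-adding strategy concrete, here is a minimal sketch of what a patch for the running ptr example could look like. It assumes POSIX threads; the names `storage`, `L1`, `thread1`, and `thread2` are hypothetical and chosen only for illustration, and the code is not actual AFix output.

```c
#include <pthread.h>
#include <stddef.h>

/* Hypothetical shared state mirroring the slide's example. */
struct obj { int field; };
static struct obj storage;
static struct obj *ptr = &storage;

/* Hypothetical lock standing in for the one a patch would introduce. */
static pthread_mutex_t L1 = PTHREAD_MUTEX_INITIALIZER;

/* Thread 1: the check and the use now sit in one critical section,
   so the write in thread 2 can no longer land between them. */
void *thread1(void *arg) {
    (void)arg;
    pthread_mutex_lock(&L1);
    if (ptr != NULL) {
        ptr->field = 1;
    }
    pthread_mutex_unlock(&L1);
    return NULL;
}

/* Thread 2: the conflicting write takes the same lock. */
void *thread2(void *arg) {
    (void)arg;
    pthread_mutex_lock(&L1);
    ptr = NULL;
    pthread_mutex_unlock(&L1);
    return NULL;
}
```

Both functions have the signature expected by `pthread_create`, so they can be started as real threads. With the lock in place, the two correct interleavings remain possible and only the crashing one is excluded.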
AFix Overview
Input from CTrigger
Manual bug-fixing progress: bug understanding
CTrigger Bug-Detector Review
 A single-variable atomicity-violation detection & testing tool
 It reports a list of buggy instruction triples
 Abbreviated as {(p1, c1, r1), …, (pn, cn, rn)}
Thread 1
if (ptr != NULL) {      previous access (p)
    ptr->field = 1;     current access (c)
}
Thread 2
ptr = NULL;             remote access (r)
AFix Overview
Input from CTrigger: (p1, c1, r1), …, (pn, cn, rn)
→ patch1, …, patchn
→ merged patch1, …, merged patchm
→ adding runtime support
→ patch testing
Manual bug-fixing progress: bug understanding → patch generation → patch testing
Outline
 Motivation
 Overview
 AFix
   One bug patching
   Patch merging
   Runtime support
   Patch testing
 Evaluation
 Conclusion
One Bug Patching: (p, c, r) Patching
 Make the p-c code region mutually exclusive with r
 Put p and c into a critical section
 Put r into a critical section
 Select or introduce a lock for the two critical sections
Put p and c into a Critical Section: naïve
 A naïve solution
   Add lock on edges reaching p
   Add unlock on edges leaving c
 Potential new bugs
   Could lock without unlock
   Could unlock without lock
   etc.
Put p and c into a Critical Section: AFix
 Assume p and c are in the same function f
 Step 1: find protected nodes in critical section
 In f’s CFG, find nodes on any p → c path
 Step 2: add lock operations
 unprotected node → protected node
 protected node → unprotected node
 Avoids the potential bugs of the naïve solution
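The two steps can be sketched as a small graph computation. This is an illustrative sketch only: the `Cfg` adjacency-matrix type, `MAX_NODES`, and all function names are assumptions, and the edge rule in `edge_action` (acquire when entering the protected set, release when leaving it) is one plausible reading of the slide rather than a confirmed description of AFix.

```c
#include <stdbool.h>
#include <string.h>

#define MAX_NODES 16

/* Hypothetical CFG: edge[u][v] is true when control can flow from u to v. */
typedef struct {
    int n;
    bool edge[MAX_NODES][MAX_NODES];
} Cfg;

typedef enum { ACT_NONE, ACT_LOCK, ACT_UNLOCK } EdgeAction;

/* Mark the nodes reachable from start (forward) or that reach start (backward). */
static void reach(const Cfg *g, int start, bool backward, bool seen[MAX_NODES]) {
    int stack[MAX_NODES], top = 0;
    memset(seen, 0, sizeof(bool) * MAX_NODES);
    seen[start] = true;
    stack[top++] = start;
    while (top > 0) {
        int u = stack[--top];
        for (int v = 0; v < g->n; v++) {
            bool connected = backward ? g->edge[v][u] : g->edge[u][v];
            if (connected && !seen[v]) {
                seen[v] = true;      /* each node is pushed at most once */
                stack[top++] = v;
            }
        }
    }
}

/* Step 1: a node is protected if it lies on some path from p to c,
   i.e. it is reachable from p and can itself reach c. */
void protected_nodes(const Cfg *g, int p, int c, bool prot[MAX_NODES]) {
    bool from_p[MAX_NODES], to_c[MAX_NODES];
    reach(g, p, false, from_p);
    reach(g, c, true, to_c);
    for (int v = 0; v < MAX_NODES; v++)
        prot[v] = v < g->n && from_p[v] && to_c[v];
}

/* Step 2 (assumed rule): decide what to insert on the edge u -> v. */
EdgeAction edge_action(const bool prot[MAX_NODES], int u, int v) {
    if (!prot[u] && prot[v]) return ACT_LOCK;
    if (prot[u] && !prot[v]) return ACT_UNLOCK;
    return ACT_NONE;
}
```

Because every edge that leaves the protected set gets a release, an early return between p and c cannot keep the lock held, which is the failure mode of placing unlock only after c.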
p and c Adjustment
 Adjust p and c when they are in different functions
 Observation: programmers usually put lock and unlock in the same function
 Find the longest common prefix of p’s and c’s stack traces
 Adjust p and c accordingly
void newlog()
{
    …
p:  close();
c:  open();
    …
}
void close() {
    …
p:  log = CLOSE;
    …
}
void open() {
    …
c:  log = OPEN;
}
p’s stack trace: close(), newlog(), …
c’s stack trace: open(), newlog(), …
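The stack-trace step can be sketched as follows. This is a simplification under stated assumptions: each call stack is modelled as an array of function names listed outermost caller first, whereas real stack traces identify call sites; the helper names `common_prefix_len` and `adjusted_callee` are hypothetical.

```c
#include <stddef.h>
#include <string.h>

/* Length of the longest common prefix of two call stacks,
   each listed outermost caller first. */
size_t common_prefix_len(const char *const *a, size_t na,
                         const char *const *b, size_t nb) {
    size_t i = 0;
    while (i < na && i < nb && strcmp(a[i], b[i]) == 0)
        i++;
    return i;
}

/* After the shared prefix of length k, the adjusted access is the call
   to the next frame, made from the last shared function. Returns NULL
   when the access already lives in that shared function. */
const char *adjusted_callee(const char *const *stack, size_t n, size_t k) {
    return k < n ? stack[k] : NULL;
}
```

For the slide's example, the stacks `{"newlog", "close"}` and `{"newlog", "open"}` share the prefix `newlog`, so p is moved to the call to `close()` and c to the call to `open()`, both inside `newlog()`.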
(p, c, r) Patching: put r into a critical section
 Lock-acquisition before r, lock-release after r
 Only if r cannot be reached from the p–c critical section
case 1
fpc() {
    lock(L1)
    p
    ...
    r
    …
    c
    unlock(L1)
}
case 2
fpc() {
    lock(L1)
    p
    ...
    foo() {…r}
    …
    c
    unlock(L1)
}
r’s call stack: … → fpc → foo → … → r
(p, c, r) Patching: select or introduce a lock
 Use the same lock for the critical sections
 Lock type:
   Lock with timeout: in case of potential new deadlock
   Reentrant lock: in case of recursion
   Otherwise: normal lock
 Lock instance:
   Global lock instances are easy to reuse
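As a rough illustration of the first two lock types, the sketch below uses standard POSIX calls (`pthread_mutex_timedlock` and a `PTHREAD_MUTEX_RECURSIVE` mutex). The talk does not say which primitives AFix uses, so the wrapper names `lock_with_timeout` and `init_reentrant_lock` and the millisecond timeout parameter are assumptions.

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <pthread.h>
#include <time.h>

/* Try to take the lock, giving up after timeout_ms milliseconds.
   Returns 0 on success or an error number such as ETIMEDOUT. */
int lock_with_timeout(pthread_mutex_t *lock, long timeout_ms) {
    struct timespec deadline;
    /* pthread_mutex_timedlock expects an absolute CLOCK_REALTIME deadline. */
    clock_gettime(CLOCK_REALTIME, &deadline);
    deadline.tv_sec += timeout_ms / 1000;
    deadline.tv_nsec += (timeout_ms % 1000) * 1000000L;
    if (deadline.tv_nsec >= 1000000000L) {
        deadline.tv_sec += 1;
        deadline.tv_nsec -= 1000000000L;
    }
    return pthread_mutex_timedlock(lock, &deadline);
}

/* Initialise a reentrant lock: the owning thread may acquire it again,
   which matters when the critical section can recurse. */
int init_reentrant_lock(pthread_mutex_t *lock) {
    pthread_mutexattr_t attr;
    int rc = pthread_mutexattr_init(&attr);
    if (rc != 0)
        return rc;
    rc = pthread_mutexattr_settype(&attr, PTHREAD_MUTEX_RECURSIVE);
    if (rc == 0)
        rc = pthread_mutex_init(lock, &attr);
    pthread_mutexattr_destroy(&attr);
    return rc;
}
```

A caller that receives `ETIMEDOUT` can treat it as a hint that the patch may have introduced a deadlock and hand control to whatever deadlock check the runtime provides.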
Patch Merging
 One programming mistake can lead to multiple bug reports
 They should be fixed all together
void buf_write() {
p1:          int tmp = buf_len + str_len;
             if (tmp > MAX)
                 return;
c1, p2:      memcpy(&buf[buf_len], str, str_len);
r1, c2, r2:  buf_len = tmp;
}
 Too many lock/unlock operations
 Potential new deadlocks
 May hurt performance and readability
Patch Merging: redundant patch
 Redundant patch, when the p1–c1 and p2–c2 critical sections
   are in the same function: redundant when one protected region is a subset of the other
   are in different functions: consult the stack trace again
lock(L1)
p1
    lock(L2)
    p2
    c2
    unlock(L2)
c1
unlock(L1)

lock(L1)
r1
unlock(L1)

lock(L1)
lock(L2)
r2
unlock(L2)
unlock(L1)
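For the same-function case, the subset test can be sketched with a bitset. The encoding is an assumption made for illustration: bit i of a `NodeSet` is set when CFG node i is protected by a patch, which limits this sketch to functions with at most 64 nodes.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical encoding: bit i set means CFG node i is protected. */
typedef uint64_t NodeSet;

/* True when every node in inner is also in outer. */
bool is_subset(NodeSet inner, NodeSet outer) {
    return (inner & ~outer) == 0;
}

/* A patch is redundant with respect to another patch in the same
   function when its protected region is covered by the other's. */
bool patch_is_redundant(NodeSet region, NodeSet other_region) {
    return is_subset(region, other_region);
}
```

In the slide's layout, numbering p1, p2, c2, c1 as nodes 0 to 3 gives the p1–c1 region `0xF` and the p2–c2 region `0x6`; the second is a subset of the first, so the L2 patch adds nothing that L1 does not already provide.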
Patch Merging: related patch
 Related patch
 Merge if p, c, or r is in some other patch’s critical sections
lock(L1)
p1
lock(L2)
p2
c1
unlock(L1)
c2
unlock(L2)
unlock(L1)

lock(L1)
r1
unlock(L1)

lock(L1)
lock(L2)
r2
unlock(L2)
unlock(L1)
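The relatedness rule can be sketched in the same bitset style. Everything here is an assumption for illustration: the `Patch` layout, the choice to keep the first patch's lock after merging, and the 64-node limit are not taken from the talk.

```c
#include <stdbool.h>
#include <stdint.h>

typedef uint64_t NodeSet;   /* bit i set means node i is in the set */

/* Hypothetical patch record. */
typedef struct {
    NodeSet pc_region;   /* nodes protected between p and c */
    NodeSet endpoints;   /* the p and c nodes themselves */
    NodeSet r_nodes;     /* remote accesses guarded by the same lock */
    int lock_id;
} Patch;

/* True when some p, c, or r of a falls inside a critical section of b. */
static bool touches(const Patch *a, const Patch *b) {
    NodeSet accesses = a->endpoints | a->r_nodes;
    NodeSet sections = b->pc_region | b->r_nodes;
    return (accesses & sections) != 0;
}

bool patches_related(const Patch *a, const Patch *b) {
    return touches(a, b) || touches(b, a);
}

/* Merge two related patches into one that uses a single lock. */
Patch merge_patches(const Patch *a, const Patch *b) {
    Patch m;
    m.pc_region = a->pc_region | b->pc_region;
    m.endpoints = a->endpoints | b->endpoints;
    m.r_nodes   = a->r_nodes | b->r_nodes;
    m.lock_id   = a->lock_id;   /* assumed: keep the first patch's lock */
    return m;
}
```

With p1, p2, c1, c2 numbered 0 to 3, the regions `0x7` and `0xE` overlap at p2 and c1, so the two patches are related and merge into one region `0xF` guarded by one lock, which matches the single L1 section shown on the slide.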
Runtime Support and Testing
 Runtime support to handle deadlock
   Lightweight patch-oriented deadlock detection
    • Checks whether a timeout is caused by a potential deadlock
    • Only detects deadlocks caused by the patches
    • Has low overhead and is suitable for production runs
    • Helps patch refinement
   Traditional deadlock detection
 In-house patch testing
Outline
 Motivation
 Overview
 AFix
   One bug patching
   Patch merging
   Runtime support
   Patch testing
 Evaluation
 Conclusion
Evaluation: Overall Patch Quality
Bug        Naïve  Unmerged  Merged  Manual
Apache     -      -         ✓       ✓
MySQL 1    -      ✓         ✓       ✓
MySQL 2    -      ✓         ✓       ✓
Mozilla 1  -      ✓         ✓       ✓
Mozilla 2  -      ✓         ✓       ✓
Cherokee   -      ✓         ✓       ✓
FFT        -      -         ✓       ✓
PBZIP2     -      -         -       ✓
 Patched failure rates: 0%
 Patched overheads: <0.6%
(except PBZIP2 and FFT)
(except PBZIP2)
 With timeout-triggered deadlock detection
Conclusion
 Fixing atomicity violations automatically is feasible
   By removing bad interleavings
   Must be careful in the details
 Uses some heuristics, with excellent results in practice
   Completely eliminates detected bugs in targeted class
   Overheads too low to reliably measure
   Produces small, simple, understandable patches
 Future research should co-design detectors and fixers
Questions about AFix*?
*DISCLAIMER FROM AFIX: “I REPRESENT HUMANS’ EFFORTS TOWARDS FIXING THE WORLD AUTOMATICALLY USING TOOLS. HOWEVER, THE WORLD IS SO IMPERFECT THAT I DO NOT KNOW WHETHER THE WORLD IS FULLY FIXABLE, THUS I MAKE NO 100% GUARANTEE.”