Instrumentation and Sampling Strategies for Cooperative Concurrency Bug Isolation
Guoliang Jin, Aditya Thakur, Ben Liblit, Shan Lu
University of Wisconsin–Madison
1
Cooperative Concurrency Bug Isolation
• Concurrency bugs are synchronization mistakes in multi-threaded programs.
• Several types:
– Atomicity violation
– Data race
– Deadlock, etc.
[Diagram: thread 1 and thread 2 interleave read(x) and write(x) accesses to a shared variable; outcomes are marked ☺, ☹, or ☺?]
2
Concurrency bugs are common in the field
• Developers are poor at parallel programming
• Interleaving testing is inefficient
• Applications with concurrency bugs are shipped to users
3
Concurrency bugs lead to failures in the field
• Disasters in the past
– Therac-25, Northeast blackout of 2003
• More threats in multi-core era
4
Failure diagnosis is critical
5
Concurrency Bug Failure Example
☹ (failure run)
Concurrency Bug from Apache HTTP Server
6
Concurrency Bug Failure Example
Correct run ☺ (the two executions of log_writer() do not overlap):
thread 1
    …
    log_writer() {
        …
        memcpy(&buf[idx], s, strlen(s));
        …
        temp = idx;
        idx = temp + strlen(s);
        …
        return SUCCESS;
        …
    }
    …
thread 2
    …
    log_writer() {
        …
        memcpy(&buf[idx], s, strlen(s));
        …
        temp = idx;
        idx = temp + strlen(s);
        …
        return SUCCESS;
        …
    }
    …
Concurrency Bug from Apache HTTP Server
7
Concurrency Bug Failure Example
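Failure run ☹ (thread 1 and thread 2 are inside log_writer() at the same time; time runs top to bottom):
    both threads:     … log_writer() { …
    one thread:       memcpy(&buf[idx], s, strlen(s));
    other thread:     memcpy(&buf[idx], s, strlen(s));
    first to update:  temp = idx; idx = temp + strlen(s); … return SUCCESS; …
    last to update:   temp = idx; idx = temp + strlen(s); … return SUCCESS; …
    both threads:     } …
Both memcpy calls use the same idx, because neither thread has advanced it yet.
Concurrency Bug from Apache HTTP Server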
8
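The two interleavings on these slides can be replayed deterministically in a single thread. The sketch below is only an illustration and is not Apache's code: it splits log_writer() into its copy step and its index update, and the names `log_copy`, `log_advance`, `replay_correct`, and `replay_failure` are hypothetical.

```c
#include <stddef.h>
#include <string.h>

#define BUF_SIZE 64

/* Shared log state, as on the slide: a buffer and the next free index. */
static char buf[BUF_SIZE];
static size_t idx;

/* First half of log_writer(): copy the message to the current index. */
static void log_copy(const char *s) {
    memcpy(&buf[idx], s, strlen(s));
}

/* Second half of log_writer(): advance the index past the message. */
static void log_advance(const char *s) {
    size_t temp = idx;
    idx = temp + strlen(s);
}

static void reset_log(void) {
    memset(buf, 0, sizeof buf);
    idx = 0;
}

/* Correct order: each writer copies and advances before the next one starts. */
const char *replay_correct(const char *a, const char *b) {
    reset_log();
    log_copy(a); log_advance(a);
    log_copy(b); log_advance(b);
    return buf;
}

/* Failing order: both copies run before either advance, so b overwrites a. */
const char *replay_failure(const char *a, const char *b) {
    reset_log();
    log_copy(a);
    log_copy(b);
    log_advance(a);
    log_advance(b);
    return buf;
}
```

In the failing order the first message is lost and idx still advances by both lengths, which leaves a gap in the log after the surviving entry.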
Diagnosing Concurrency Bug Failure is Challenging
• The failure is non-deterministic and rare
– Programmers have trouble reproducing the failure
• The root cause involves more than one thread
9
Existing work and its limitations
• Failure replay
– High runtime overhead
– Developers need to manually locate faults
• Run-time bug detection
– (mostly) High runtime overhead
– Not guided by the failure
– Many false positives
How to achieve low-overhead and accurate failure diagnosis?
10
Our work: CCI
• Goal: diagnosing production-run concurrency bug failures
• Major components:
– predicate instrumentor
– sampler
– statistical debugging
[Diagram: Program Source, Predicates, Sampler → Compiler → Counts & ☺/☹ → Statistical Debugging → Predictors]
Predictors: true in most failure runs, false in most correct runs.
11
CCI Overview
• Three different types of predicates.
• Each predicate has its supporting sampling strategy.
• Same statistical debugging as in CBI.
• Experiments show CCI is effective in diagnosing concurrency failures.
[Chart: Prev, Havoc, and FunRe plotted on Overhead and Capability axes]
12
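The slides do not spell out how predicates are ranked. As a rough sketch, assuming the Increase score from CBI's published statistical debugging (Failure minus Context), a predicate scores high when it is true mostly in failing runs. The struct layout and function names below are hypothetical.

```c
/* Per-predicate run counts gathered from many executions (hypothetical layout). */
struct pred_counts {
    unsigned f_true;      /* failing runs where the predicate was observed true */
    unsigned s_true;      /* successful runs where it was observed true */
    unsigned f_observed;  /* failing runs where the predicate's site was sampled */
    unsigned s_observed;  /* successful runs where the site was sampled */
};

/* Failure(P): fraction of runs with P observed true that failed. */
double failure_score(const struct pred_counts *c) {
    unsigned n = c->f_true + c->s_true;
    return n ? (double)c->f_true / n : 0.0;
}

/* Context(P): fraction of runs that sampled P's site at all and failed. */
double context_score(const struct pred_counts *c) {
    unsigned n = c->f_observed + c->s_observed;
    return n ? (double)c->f_observed / n : 0.0;
}

/* Increase(P) = Failure(P) - Context(P): positive when P being true makes
 * failure more likely than merely reaching the instrumented site. */
double increase_score(const struct pred_counts *c) {
    return failure_score(c) - context_score(c);
}
```

With made-up counts of 9 failing and 1 successful run where a predicate is true, out of 10 failing and 90 successful runs that sampled its site, the score is 0.9 - 0.1 = 0.8.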
Outline
• Motivation
• CCI Overview
• CCI Predicates and Sampling Strategies
– CCI-Prev and its sampling strategy
– CCI-Havoc and its sampling strategy
– CCI-FunRe and its sampling strategy
• Evaluation
• Conclusion
13
CCI-Prev Intuition
[Diagram: four two-thread interleavings of read(x) and write(x), two under "Atomicity Violation" and two under "Data Race"; in each pair one interleaving is marked ☺ and the other ☹]
Just record which thread made the previous access.
14
CCI-Prev Predicate
It tracks whether two successive accesses to
a shared memory location were by two
distinct threads or were by the same thread.
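A minimal sketch of how such a predicate could be evaluated, assuming a fixed-size table that maps each shared address to the ID of the thread that touched it last. The name test_and_insert comes from the instrumentation slide; the table layout, its capacity, and the hashing are assumptions made for illustration.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define TABLE_SIZE 1024  /* hypothetical capacity; must be a power of two */

struct last_access {
    const void *addr;  /* shared location, NULL while the slot is empty */
    int tid;           /* thread that accessed it most recently */
};

static struct last_access table[TABLE_SIZE];

/* Find the slot for addr with linear probing. Assumes the table never fills. */
static struct last_access *slot_for(const void *addr) {
    size_t i = ((uintptr_t)addr >> 3) & (TABLE_SIZE - 1);
    while (table[i].addr != NULL && table[i].addr != addr)
        i = (i + 1) & (TABLE_SIZE - 1);
    return &table[i];
}

/* Returns true ("remote") if the previous access to addr came from a
 * different thread and false ("local") otherwise, then records cur_tid as
 * the latest accessor. A first-ever access counts as local in this sketch.
 * The caller is expected to hold the global lock around this call. */
bool test_and_insert(const void *addr, int cur_tid) {
    struct last_access *e = slot_for(addr);
    bool remote = (e->addr == addr) && (e->tid != cur_tid);
    e->addr = addr;
    e->tid = cur_tid;
    return remote;
}
```

Because the table is shared by all threads, the lookup and the memory access it describes have to happen under one lock, which is where much of this predicate's overhead comes from.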
[Chart: Prev plotted on Overhead and Capability axes]
15
CCI-Prev Predicate on the Correct Run
Correct run ☺ (the two executions of log_writer() do not overlap):
thread 1
    …
    log_writer() {
        …
        memcpy(&buf[idx], s, strlen(s));
        …
        I: temp = idx;
        idx = temp + strlen(s);
        …
        return SUCCESS;
        …
    }
    …
thread 2
    …
    log_writer() {
        …
        memcpy(&buf[idx], s, strlen(s));
        …
        I: temp = idx;
        idx = temp + strlen(s);
        …
        return SUCCESS;
        …
    }
    …
Predicate    ☺    ☹
remote_I     0    0
local_I      2    0
Concurrency Bug from Apache HTTP Server
16
CCI-Prev Predicate on the Failure Run
Failure run ☹ (thread 1 and thread 2 are inside log_writer() at the same time; time runs top to bottom):
    both threads:     … log_writer() { …
    one thread:       memcpy(&buf[idx], s, strlen(s));
    other thread:     memcpy(&buf[idx], s, strlen(s));
    first to update:  I: temp = idx; idx = temp + strlen(s); … return SUCCESS; …
    last to update:   I: temp = idx; idx = temp + strlen(s); … return SUCCESS; …
    both threads:     } …
Predicate    ☺    ☹
remote_I     0    1
local_I      2    1
Concurrency Bug from Apache HTTP Server
17
CCI-Prev Predicate Instrumentation
thread 1
    …
    log_writer() {
        …
        memcpy(&buf[idx], s, strlen(s));
        …
        I: lock(glock);
           remote = test_and_insert(&idx, curTid);
           record(I, remote);
           temp = idx;
           unlock(glock);
        …
        idx = temp + strlen(s);
        return SUCCESS;
        …
    }
    …
thread 2
    …
    log_writer() {
        …
    }
    …
Predicate    ☺    ☹
remote_I     0    1
local_I      2    1
a global hash table
address    ThreadID
…          …
&idx       2, 1
…          …
Concurrency Bug from Apache HTTP Server
18
CCI-Prev Sampling Strategy
Does traditional sampling work? NO.
• Thread-coordinated
• Bursty
thread 1 and thread 2 inside log_writer() at the same time (time runs top to bottom):
    both threads:     … log_writer() { …
    one thread:       memcpy(&buf[idx], s, strlen(s));
    other thread:     memcpy(&buf[idx], s, strlen(s));
    first to update:  temp = idx; idx = temp + strlen(s); … return SUCCESS; …
    last to update:   I: temp = idx; idx = temp + strlen(s); … return SUCCESS; …
    both threads:     } …
19
Outline
• Motivation
• CCI Overview
• CCI Predicates and Sampling Strategies
– CCI-Prev and its sampling strategy
– CCI-Havoc and its sampling strategy
– CCI-FunRe and its sampling strategy
• Evaluation
• Conclusion
20
CCI-Havoc Intuition
thread 1 and thread 2 inside log_writer() at the same time (time runs top to bottom):
    both threads:     … log_writer() { …
    one thread:       memcpy(&buf[idx], s, strlen(s));
    other thread:     memcpy(&buf[idx], s, strlen(s));
    first to update:  temp = idx; idx = temp + strlen(s); … return SUCCESS; …
    last to update:   I: temp = idx; idx = temp + strlen(s); … return SUCCESS; …
    both threads:     } …
Just record what value was observed during the last access.
21
CCI-Havoc Predicate
It tracks whether the value of a given shared location changes between two consecutive accesses by one thread.
Only uses thread-local information.
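A minimal sketch of the idea, assuming each thread keeps its own small table from address to the value it last saw there. The functions `havoc_test` and `havoc_insert` stand in for the test() and insert() calls on the instrumentation slide; the table layout, its capacity, and the use of `long` values are assumptions.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define HAVOC_SLOTS 256  /* hypothetical per-thread capacity; power of two */

struct last_value {
    const void *addr;  /* NULL marks an empty slot */
    long value;        /* value this thread saw at its previous access */
};

/* One table per thread, so no lock is needed. */
static _Thread_local struct last_value seen[HAVOC_SLOTS];

/* Linear probing; assumes the per-thread table never fills. */
static struct last_value *havoc_slot(const void *addr) {
    size_t i = ((uintptr_t)addr >> 3) & (HAVOC_SLOTS - 1);
    while (seen[i].addr != NULL && seen[i].addr != addr)
        i = (i + 1) & (HAVOC_SLOTS - 1);
    return &seen[i];
}

/* True if this thread recorded a value for addr earlier and the value
 * observed now differs from it. */
bool havoc_test(const void *addr, long observed) {
    const struct last_value *e = havoc_slot(addr);
    return e->addr == addr && e->value != observed;
}

/* Remember the value this thread just observed or wrote at addr. */
void havoc_insert(const void *addr, long value) {
    struct last_value *e = havoc_slot(addr);
    e->addr = addr;
    e->value = value;
}
```

Since no other thread ever reads or writes a thread's table, this check avoids the global lock that the Prev sketch requires.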
[Chart: Prev and Havoc plotted on Overhead and Capability axes]
22
CCI-Havoc Predicate on the Correct Run
Correct run ☺ (the two executions of log_writer() do not overlap):
thread 1
    …
    log_writer() {
        …
        memcpy(&buf[idx], s, strlen(s));
        …
        I: temp = idx;
        idx = temp + strlen(s);
        …
        return SUCCESS;
        …
    }
    …
thread 2
    …
    log_writer() {
        …
        memcpy(&buf[idx], s, strlen(s));
        …
        I: temp = idx;
        idx = temp + strlen(s);
        …
        return SUCCESS;
        …
    }
    …
Predicate      ☺    ☹
unchanged_I    2    0
changed_I      0    0
Concurrency Bug from Apache HTTP Server
23
CCI-Havoc Predicate on the Failure Run
Failure run ☹ (thread 1 and thread 2 are inside log_writer() at the same time; time runs top to bottom):
    both threads:     … log_writer() { …
    one thread:       memcpy(&buf[idx], s, strlen(s));
    other thread:     memcpy(&buf[idx], s, strlen(s));
    first to update:  I: temp = idx; idx = temp + strlen(s); … return SUCCESS; …
    last to update:   I: temp = idx; idx = temp + strlen(s); … return SUCCESS; …
    both threads:     } …
Predicate      ☺    ☹
unchanged_I    2    1
changed_I      0    1
Concurrency Bug from Apache HTTP Server
24
CCI-Havoc Predicate Instrumentation
thread 1
    …
    log_writer() {
        …
        memcpy(&buf[idx], s, strlen(s));
        …
        I: temp = idx;
        changed = test(&idx, temp);
        record(I, changed);
        insert(&idx, temp);
        idx = temp + strlen(s);
        …
        return SUCCESS;
        …
    }
    …
thread 2
    …
    log_writer() {
        …
    }
    …
Predicate      ☺    ☹
unchanged_I    2    1
changed_I      0    1
hash table for thread1
address    value
…          …
&idx       idx+len2, idx
…          …
Concurrency Bug from Apache HTTP Server
25
CCI-Havoc Sampling Strategy
• Bursty
• Thread-independent
thread 1 and thread 2 inside log_writer() at the same time (time runs top to bottom):
    both threads:     … log_writer() { …
    one thread:       memcpy(&buf[idx], s, strlen(s));
    other thread:     memcpy(&buf[idx], s, strlen(s));
    first to update:  temp = idx; idx = temp + strlen(s); … return SUCCESS; …
    last to update:   temp = idx; idx = temp + strlen(s); … return SUCCESS; …
    both threads:     } …
26
Outline
• Motivation
• CCI Overview
• CCI Predicates and Sampling Strategies
– CCI-Prev and its sampling strategy
– CCI-Havoc and its sampling strategy
– CCI-FunRe and its sampling strategy
• Evaluation
• Conclusion
27
CCI-FunRe Predicate
It tracks whether the execution of one
function overlaps with the execution of the
same function from a different thread.
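A minimal sketch of the check, assuming one C11 atomic counter per function that holds the number of threads currently inside it. The helper names `funre_enter` and `funre_exit` and the wrapper `log_writer_instrumented` are hypothetical; the slides use atomic_inc and atomic_dec for the same purpose.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Number of threads currently inside log_writer() (one counter per function). */
static atomic_int log_writer_count;

/* Call on entry. Returns true if the counter was already non-zero, meaning
 * another execution of the function was in progress ("Reent"). */
bool funre_enter(atomic_int *count) {
    int old_count = atomic_fetch_add(count, 1);
    return old_count > 0;
}

/* Call on every exit path so the counter stays balanced. */
void funre_exit(atomic_int *count) {
    atomic_fetch_sub(count, 1);
}

/* Shape of an instrumented function; the real body is elided. */
int log_writer_instrumented(bool *reent_out) {
    *reent_out = funre_enter(&log_writer_count);
    /* … original body of log_writer() … */
    funre_exit(&log_writer_count);
    return 0;
}
```

One limitation of this sketch: a recursive call from the same thread also finds the counter non-zero, so it would be counted as an overlap too.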
[Chart: Prev, Havoc, and FunRe plotted on Overhead and Capability axes]
28
CCI-FunRe Predicate Example
Correct run ☺ (the two executions of log_writer() do not overlap):
thread 1
    …
    log_writer() {
        …
        return SUCCESS;
    }
    …
thread 2
    …
    log_writer() {
        …
        return SUCCESS;
    }
    …
Failure run ☹ (the two executions of log_writer() overlap):
    thread 1 and thread 2 are both between "log_writer() {" and "return SUCCESS; }" at the same time.
Predicate               ☺    ☹
NonReent_log_writer     2    1
Reent_log_writer        0    1
29
CCI-FunRe Predicate Instrumentation
thread 1 and thread 2 each run the instrumented function:
    …
    log_writer() {
        oldCount = atomic_inc(Count);
        record("log_writer", oldCount);
        …
        atomic_dec(Count);
        return SUCCESS;
    }
    …
Predicate               ☺    ☹
NonReent_log_writer     2    1
Reent_log_writer        0    1
FuncName      Counter
…             …
log_writer    2, 0, 1
…             …
30
CCI-FunRe Sampling Strategy
thread 1
    …
    log_writer() {
        oldCount = atomic_inc(Count);
        record("log_writer", oldCount);
        …
        atomic_dec(Count);
        return SUCCESS;
    }
    …
thread 2
    …
    log_writer() {
        …
        return SUCCESS;
    }
    …
☹
FuncName      Counter
…             …
log_writer    0
…             …
Function execution accounting is not suitable for sampling, so this part is unconditional.
31
CCI-FunRe Sampling Strategy
• Function execution accounting:
– unconditional
• FunRe predicate recording:
– thread-independent
– non-bursty
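The split above can be sketched as follows: the counter update runs on every call, while the predicate is recorded only when a per-thread sampling decision fires. This is a simplified illustration; the fixed SAMPLE_PERIOD countdown is an assumption standing in for a randomized sampler with a rate of roughly 1/100, and all names are hypothetical.

```c
#include <stdatomic.h>
#include <stdbool.h>

#define SAMPLE_PERIOD 100  /* assumption: fixed period in place of a random ~1/100 rate */

static atomic_int active;                                /* threads inside the function */
static _Thread_local int countdown = SAMPLE_PERIOD;      /* per-thread sampling state */
static _Thread_local unsigned reent_hits, nonreent_hits; /* per-thread predicate counts */

/* Thread-independent, non-bursty decision: each thread counts down on its
 * own and samples exactly one event when the countdown expires. */
static bool should_sample(void) {
    if (--countdown > 0)
        return false;
    countdown = SAMPLE_PERIOD;
    return true;
}

/* Entry hook: the counter update always runs; only the recording is sampled. */
void sampled_enter(void) {
    int old_count = atomic_fetch_add(&active, 1);
    if (should_sample()) {
        if (old_count > 0)
            reent_hits++;
        else
            nonreent_hits++;
    }
}

/* Exit hook: always runs so the counter stays accurate. */
void sampled_exit(void) {
    atomic_fetch_sub(&active, 1);
}

/* Number of predicate observations recorded by the calling thread. */
unsigned recorded_samples(void) {
    return reent_hits + nonreent_hits;
}
```

If the increment itself were skipped on unsampled calls, the counter would miss threads that are inside the function, and a sampled call could then report a non-overlapping execution during a real overlap.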
32
Outline
• Motivation
• CCI Overview
• CCI Predicates and Sampling Strategies
– CCI-Prev and its sampling strategy
– CCI-Havoc and its sampling strategy
– CCI-FunRe and its sampling strategy
• Evaluation
• Conclusion
33
Experimental Evaluation
• Implementation
– Static instrumentor based on the CBI framework
• Real-world concurrency bug failures from:
– Apache HTTP server, Cherokee
– Mozilla-JS, PBZIP2
– SPLASH-2: FFT, LU
• Parameters used
– Roughly 1/100 sampling rate
34
Failure Diagnosis Evaluation
• Methodology
– Using concurrency bug failures that occurred in the real world
– Each application runs 3000 times on a multi-core machine
• Add random sleep to get some failure runs
– Sampling is enabled
– Statistical debugging then returns a list of predictors
• Which predictor in the list can diagnose the failure?
35
Failure Diagnosis Results (with sampling)
Program         CCI-Prev    CCI-Havoc    CCI-FunRe
Apache-1        ✓ top1      ✓ top1       ✓ top1
Apache-2        ✓ top1      ✓ top1       -
Cherokee        -           ✓ top2       -
FFT             ✓ top1      -            -
LU              ✓ top1      -            -
Mozilla-JS-1    -           ✓ top2       ✓ top1
Mozilla-JS-2    ✓ top1      ✓ top1       ✓ top1
Mozilla-JS-3    ✓ top2      ✓ top1       ✓ top1
PBZIP2          ✓ top1      ✓ top1       -
[Chart: FunRe, Havoc, and Prev placed along the Capability axis]
36
Runtime Overhead
              Prev                       Havoc                      FunRe
Program       No Sampling   Sampling     No Sampling   Sampling     No Sampling   Sampling
Apache-1      62.6%         1.9%         27.4%         2.8%         1.1%          1.8%
Apache-2      8.4%          0.5%         4.2%          0.4%         0.2%          0.2%
Cherokee      19.1%         0.3%         2.1%          0.0%         0.3%          0.4%
FFT           169%          24.0%        33.5%         5.5%         72.8%         30.0%
LU            57857%        949%         1693%         8.9%         1682%         926%
Mozilla-JS    11311%        606%         7587%         356%         123%          97.0%
PBZIP2        0.2% in five cells, 0.3% in one (cell order lost in extraction)
[Chart: FunRe, Havoc, and Prev placed along the Overhead axis]
37
Conclusion
• CCI is capable of, and suitable for, diagnosing many production-run concurrency bug failures.
• Future predicates can leverage our effective sampling strategies.
• Experiments confirm the design tradeoff.
[Chart: Prev, Havoc, and FunRe plotted on Overhead and Capability axes]
38
Questions about CCI?
[Chart: Prev, Havoc, and FunRe plotted on Overhead and Capability axes]
39
CBI on Concurrency Bug Failures
Failure run ☹ (thread 1 and thread 2 are inside log_writer() at the same time; time runs top to bottom):
    both threads:     … log_writer() { …
    one thread:       memcpy(&buf[idx], s, strlen(s));
    other thread:     memcpy(&buf[idx], s, strlen(s));
    first to update:  temp = idx; idx = temp + strlen(s); … return SUCCESS; …
    last to update:   temp = idx; idx = temp + strlen(s); … return SUCCESS; …
    both threads:     } …
CBI does not work!
Concurrency Bug from Apache HTTP Server
To diagnose production-run concurrency bug failures, interleaving-related events should be tracked!
41
CCI-Prev Predicate Instrumentation with Sampling
if (gsample) {
    lock(glock);
    changed = test_and_insert(&cnt, curTid);
    record(I, changed);
    temp = cnt;
    unlock(glock);
} else {
    temp = cnt;
    [[ gsample = true; iset = curTid; lLength = gLength = 0; ]]?
}
42
CCI-Prev Predicate Instrumentation with Sampling
if (gsample) {
    lock(glock);
    changed = test_and_insert(&cnt, curTid, &stale);
    record(stale ? P1 : P2, changed);
    temp = cnt;
    unlock(glock);
    gLength++;
    lLength++;
    if ((iset == curTid && lLength > lMAX) || gLength > gMAX) {
        clear();
        iset = unusedTid;
        gsample = false;
    }
} else {
    temp = cnt;
    [[ gsample = true; iset = curTid; lLength = gLength = 0; ]]?
}
43
CCI-Havoc Predicate Instrumentation with Sampling
if (sample) {
    changed = test(&cnt, cnt, &stale);
    record(stale ? P1 : P2, changed);
    temp = cnt;
    insert(&cnt, cnt);
    length++;
    if (length > lMAX) {
        clear();
        sample = false;
    }
} else {
    temp = cnt;
    [[ sample = true; length = 0; ]]?
}
No global lock used!
44
Failure Diagnosis Results (with sampling)
Program         CBI    CCI-Prev    CCI-Havoc    CCI-FunRe
Apache-1        -      ✓ top1      ✓ top1       ✓ top1
Apache-2        -      ✓ top1      ✓ top1       -
Cherokee        -      -           ✓ top2       -
FFT             -      ✓ top1      -            -
LU              -      ✓ top1      -            -
Mozilla-JS-1    -      -           ✓ top2       ✓ top1
Mozilla-JS-2    -      ✓ top1      ✓ top1       ✓ top1
Mozilla-JS-3    -      ✓ top2      ✓ top1       ✓ top1
PBZIP2          -      ✓ top1      ✓ top1       -
[Chart: FunRe, Havoc, and Prev placed along the Capability axis]
45
Failure diagnosis is critical
46