RacerX: Effective, Static Detection of
Race Conditions and Deadlocks
by Dawson Engler & Ken Ashcraft
(published in SOSP03)
Hong,Shin
2015-07-21
Hong,Shin @ PSWLAB
Contents
• Introduction
• Overview
• Lockset Analysis
• Deadlock Checking
• Data Race Checking
• Conclusion
Introduction
1/2
• Finding data races and deadlocks is difficult.
• There have been many approaches to detect these errors.
– Dynamic detection tools (e.g. Eraser)
• These tools can only find errors on executed paths.
– Model checking
• Model checking does not scale (state explosion problem).
– Static tools
• Many static tools make heavy use of annotations to inject knowledge
into the analysis.
Introduction
2/2
• Approach
– Needs no annotations except for an indication of which
functions are used to acquire and release locks.
– Minimizes the impact of false positives (false alarms).
– Must scale to large industrial programs, both in speed and in its
ability to report complex errors.
→ A static tool that uses flow-sensitive, interprocedural analysis to
detect both race conditions and deadlocks.
→ It aggressively infers checking information
(e.g. which locks protect which operations, which code contexts are
multithreaded, which shared accesses are dangerous).
→ The tool sorts errors from most to least severe.
Overview
1/3
• At a high level, checking a system with RacerX involves five
phases:
(1) Retargeting a system to system-specific locking functions
(2) Extracting a control flow graph from the system
(3) Analysis
(4) Ranking errors
(5) Inspection
Overview
2/3
(1) Retargeting a system to system-specific locking functions
– Users supply a table specifying the functions used to
acquire/release locks and to disable/enable interrupts.
– Users may optionally specify that a function is single-threaded,
multithreaded, or an interrupt handler.
(2) Extracting a control flow graph from the system
– The tool extracts a CFG from the system and stores it in a file.
– The CFG contains all function calls, uses of global variables, uses of
parameter pointer variables, optionally uses of all local
variables, and concurrency operations.
– The CFG includes the symbolic information for these objects, such
as their names, types, whether an access is a read or a write, whether
a variable is a parameter or not, whether a function or variable is
static or not, the line number, etc.
Overview
3/3
(3) Analysis
– The tool reads the emitted CFG and constructs a linked whole-system
CFG. It then traverses the whole-system CFG, checking for
deadlocks or data races.
– The traversal is depth-first, flow-sensitive, and interprocedural, and
it tracks the set of locks held at any point.
– At each program statement, the race checker or deadlock checker
is passed the current statement, the current lockset, etc.
(4) Ranking errors
– Compute ranking information for error messages
– Ranking sorts error messages based on two features: the likelihood
of being a false positive, and the difficulty of inspection
(5) Inspection
– Present the ranked error messages to users
Lockset Analysis
1/5
• The tool computes locksets at all program points using a
top-down, flow-sensitive, context-sensitive, interprocedural
analysis.
– Top-down: it starts at the root of each call graph and does a DFS
traversal down the CFG.
– Flow-sensitive: it analyzes the effects of each path rather than
conflating paths at join points.
– Context-sensitive: it analyzes the lockset at each actual call site.
• In the DFS traversal over the CFG, the tool
(1) adds and removes locks as needed, and
(2) calls the race and deadlock checkers on each statement.
Lockset Analysis
2/5
• Caching
– Statement cache:
The tool caches the locksets that have reached each statement in
CFG.
– Summary cache:
The tool caches the effect of each function by recording, for each
lockset l that entered function f, the set of locksets (l1, … , ln) that
were produced on exit.
– Caching works because the analysis is deterministic – two
executions that both start from the same statement with the same
lockset will always produce the same result.
– Since the analysis is flow-sensitive, a function could produce an
exponential number of locksets. In practice, however, the number
produced is far more modest.
Lockset Analysis
3/5
• Pseudo-code for interprocedural lockset algorithm (1/2)
void traverse_cfg(set of nodes roots)
    foreach r in roots
        traverse_fn(r, {}) ;
end
set of locksets traverse_fn(fn, ls)
    // Check summary cache
    foreach edge x in fn->cache
        if (x->entry_lockset == ls) return x->exit_locksets ;
    // Break recursive call
    if (fn->on_stack_p) return {} ;
    fn->on_stack_p = 1 ;
    x = new edge ;
    x->entry_lockset = ls ;
    x->exit_locksets = traverse_stmts(fn->entry, ls, ls) ;
    fn->on_stack_p = 0 ;
    // Cache update
    fn->cache = fn->cache union x ;
    return x->exit_locksets ;
end
Lockset Analysis
4/5
• Pseudo-code for interprocedural lockset algorithm (2/2)
set of locksets traverse_stmts(s, entry_ls, ls)
    // Check statement cache
    if ((entry_ls, ls) in s->cache) return {} ;
    // Cache update
    s->cache = s->cache union (entry_ls, ls) ;
    if (s is end-of-path) return {ls} ;
    // Lockset update
    if (s is lock acquire operation) ls = add_lock(ls, s) ;
    if (s is lock release operation) ls = remove_lock(ls, s) ;
    if (s is not resolved call) worklist = {ls} ;
    else worklist = traverse_fn(s->fn, ls) ;
    // DFS traversal
    summ = {} ;
    foreach l in worklist
        foreach k in s->succ
            summ = summ union traverse_stmts(k, entry_ls, l) ;
    return summ ;
end
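The traversal and its two caches can be sketched in runnable form. This is only an illustration under assumptions, not the tool's implementation: the `Stmt` and `Fn` classes, the statement kinds ("lock", "unlock", "call", "plain", "end"), and the `fns` lookup table are hypothetical names introduced here, and locksets are modeled as frozensets of lock names.

```python
class Stmt:
    def __init__(self, kind, arg=None):
        self.kind = kind      # "lock", "unlock", "call", "plain" or "end"
        self.arg = arg        # lock name, or callee name for "call"
        self.succ = []        # successor statements in the CFG
        self.cache = set()    # statement cache of (entry_ls, ls) pairs


class Fn:
    def __init__(self, entry):
        self.entry = entry
        self.on_stack = False
        self.summary = {}     # summary cache: entry lockset -> exit locksets


def traverse_fn(fn, ls, fns):
    if ls in fn.summary:      # summary cache hit
        return fn.summary[ls]
    if fn.on_stack:           # break recursive calls
        return set()
    fn.on_stack = True
    exits = traverse_stmts(fn.entry, ls, ls, fns)
    fn.on_stack = False
    fn.summary[ls] = exits    # summary cache update
    return exits


def traverse_stmts(s, entry_ls, ls, fns):
    if (entry_ls, ls) in s.cache:   # statement cache hit
        return set()
    s.cache.add((entry_ls, ls))
    if s.kind == "end":
        return {ls}
    if s.kind == "lock":
        ls = ls | {s.arg}
    elif s.kind == "unlock":
        ls = ls - {s.arg}
    if s.kind == "call" and s.arg in fns:
        worklist = traverse_fn(fns[s.arg], ls, fns)   # resolved call
    else:
        worklist = {ls}
    out = set()
    for l in worklist:              # DFS over all successors
        for k in s.succ:
            out |= traverse_stmts(k, entry_ls, l, fns)
    return out


# helper(): lock(a); return      main(): helper(); lock(b); return
helper_entry = Stmt("lock", "a")
helper_entry.succ = [Stmt("end")]
call = Stmt("call", "helper")
lock_b = Stmt("lock", "b")
call.succ = [lock_b]
lock_b.succ = [Stmt("end")]
fns = {"helper": Fn(helper_entry), "main": Fn(call)}

exits = traverse_fn(fns["main"], frozenset(), fns)
print(exits)
```

In this example `main` exits holding both `a` and `b`, because the summary recorded for `helper` (entered with an empty lockset, exiting with `a` held) is applied at the call site.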
Lockset Analysis
5/5
• Limitations
– Does not do alias analysis.
The tool names local and parameter pointer
variables by their type rather than their
variable name.
(e.g. a parameter foo that is a pointer to a structure of
type bar will be named “local:struct bar”)
– Does only simple function pointer resolution.
It records all functions ever assigned to a function
pointer of a given type and, at each call site, assumes that
any of these functions could be invoked.
Deadlock Checking
1/9
(1) Computing locking cycles
(2) Ranking
(3) Increasing analysis accuracy
(4) Handling lockset mistakes
(5) Experimental results
Deadlock Checking
2/9
Computing locking cycles
(1) Constraint extraction
At every lock acquisition, emit the lock ordering constraints
produced by the current lock acquisition.
(e.g. if the current lockset is {l1, l2} and the currently acquired
lock is l3, then emit l1→l3 and l2→l3)
(2) Constraint solving
Reads in the emitted locking constraints and computes the
transitive closure of all dependencies. It records the shortest
path between any cyclic lock dependencies.
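The two steps can be illustrated with a small sketch. It is a simplification under stated assumptions: the function names `emit_constraints`, `transitive_closure`, and `cyclic_locks` are hypothetical, the closure is computed naively, and the shortest-path bookkeeping mentioned above is left out.

```python
def emit_constraints(lockset, acquired):
    """Ordering edges produced by acquiring `acquired` while holding `lockset`."""
    return {(held, acquired) for held in lockset if held != acquired}


def transitive_closure(edges):
    """Naive closure: keep adding a->d whenever a->b and b->d are present."""
    closure = set(edges)
    changed = True
    while changed:
        changed = False
        for a, b in list(closure):
            for c, d in list(closure):
                if b == c and (a, d) not in closure:
                    closure.add((a, d))
                    changed = True
    return closure


def cyclic_locks(edges):
    """Locks that can reach themselves, i.e. that take part in a cycle."""
    return {a for a, b in transitive_closure(edges) if a == b}


# One code path holds {l1, l2} and acquires l3; another holds {l3} and acquires l1.
edges = emit_constraints({"l1", "l2"}, "l3") | emit_constraints({"l3"}, "l1")
print(sorted(edges))
print(sorted(cyclic_locks(edges)))
```

Here `l1` and `l3` form a cycle (l1→l3 and l3→l1), while `l2` only points into the cycle and is not itself reported.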
Deadlock Checking
3/9
Ranking
• Rank error messages based on three criteria:
(1) The number of threads involved
- Errors with fewer threads are preferred to ones with many threads.
(2) Whether the locks involved are local or global
- Global lock errors are preferred over local ones.
(3) The depth of the call chain
- Short call chains are better than longer ones.
• Use these ranking criteria hierarchically to sort error
messages: (1) > (2) > (3)
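Hierarchical ranking of this kind maps naturally onto a lexicographic sort key. The sketch below assumes a hypothetical `DeadlockReport` record with made-up field names; it only shows the ordering rule, not how the tool stores its reports.

```python
from dataclasses import dataclass


@dataclass
class DeadlockReport:
    name: str
    threads: int        # number of threads involved
    global_locks: bool  # True if the locks involved are global
    call_depth: int     # depth of the call chain


def rank(reports):
    # Tuple keys compare left to right, giving (1) > (2) > (3):
    # fewer threads first, then global before local, then shorter chains.
    return sorted(reports, key=lambda r: (r.threads, not r.global_locks, r.call_depth))


reports = [
    DeadlockReport("A", threads=2, global_locks=True, call_depth=5),
    DeadlockReport("B", threads=2, global_locks=False, call_depth=1),
    DeadlockReport("C", threads=3, global_locks=True, call_depth=1),
    DeadlockReport("D", threads=2, global_locks=True, call_depth=2),
]
print([r.name for r in rank(reports)])
```

Report B has the shortest call chain among the two-thread errors but still ranks below A and D, because the local/global criterion is applied before call depth.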
Deadlock Checking
4/9
Example: Error message of simple deadlock between two global locks
Deadlock Checking
5/9
Increasing analysis accuracy (1/2)
• There are two significant sources of false lock dependencies:
(1) Semaphores used to enforce scheduling dependencies
- A semaphore may be used to implement scheduling dependencies
(signal-wait) rather than mutual exclusion.
- Signal-wait semaphores show two behavior patterns:
their operations are almost never paired, and they see more lock than
unlock operations.
- Statistical approach:
(a) Calculate how often true locks satisfy these two behaviors by counting
the number of lock acquisitions, lock releases, and unlock errors.
(b) Discard semaphores below some probability threshold.
Deadlock Checking
6/9
Increasing analysis accuracy (2/2)
(2) “Release-on-block” locks
• Many operating systems such as FreeBSD and Linux use global,
coarse-grained locks (e.g. the big kernel lock) that have
“release-on-block” semantics.
<Thread1>              <Thread2>
lock_kernel() ;        down(sem) ;
down(sem) ;            lock_kernel() ;

down(sem) {
    …
    while ( down(sem) would block ) {
        unlock_kernel() ;
        schedule() ;
        lock_kernel() ;
    }
    …
}

A thread that would block in down(sem) releases the kernel lock first, so
the apparent cycle between the kernel lock and sem cannot deadlock:
<Thread1> lock_kernel() ;
<Thread2> down(sem) ;
<Thread1> down(sem) ;        /* would block: releases the kernel lock */
<Thread2> lock_kernel() ;    /* No deadlock */
Deadlock Checking
7/9
Handling lockset mistakes
• Most deadlock false positives are caused by invalid
locksets.
• Almost all invalid locksets arise from a data-dependent lock
release, or correlated branches.
e.g. void foo(int x) {
    if (x) lock(l) ;
    …
    if (x) unlock(l) ;
}
Without path-sensitive analysis, the tool will believe there are four
paths through foo, although only two of them are feasible.
→ Use simple and novel propagation techniques to minimize the
propagation of invalid locksets.
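To make the four-path claim concrete, the following sketch enumerates every combination of the two `if (x)` branches in `foo`, the way an analysis that ignores the correlation between them would. The function name `enumerate_paths` and the dictionary keys are illustrative assumptions.

```python
from itertools import product


def enumerate_paths():
    """Treat the two `if (x)` tests in foo as independent branches."""
    paths = []
    for take_lock, take_unlock in product([False, True], repeat=2):
        held = set()
        if take_lock:
            held.add("l")
        # Unlocking a lock that is not held is a locking error.
        unlock_error = take_unlock and "l" not in held
        if take_unlock:
            held.discard("l")
        paths.append({
            "lock": take_lock,
            "unlock": take_unlock,
            "exit_lockset": frozenset(held),
            "unlock_error": unlock_error,
            # Both tests read the same x, so only matching outcomes can occur.
            "feasible": take_lock == take_unlock,
        })
    return paths


for p in enumerate_paths():
    print(p)
```

Of the two infeasible paths, one leaves `l` held on exit (an invalid lockset that would propagate to callers) and the other unlocks a lock that was never acquired.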
Deadlock Checking
8/9
• Cutting off lock-error paths
- Cut off the lockset on paths that contain a locking error.
• Downward-only lockset propagation
- A significant source of false positives occurs when the tool falsely
believes that a lock is held on function exit when it actually is not.
- Propagate locksets downward from caller to callee but never upward.
- Causes false negatives for wrapper functions.
• Selecting the right summary
- Majority summary selection: rather than following all locksets that a
function call generates, take the one produced by the largest number of
exit points within the function.
- Minimum-size summary selection
• Unlockset analysis
- At program statement s, remove any lock l in the current lockset if there
exists no successor statement s’ reachable from s that contains an unlock
operation of l.
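Majority summary selection can be sketched as a simple vote over exit points. The function name `majority_summary` and the input shape (one lockset per exit point) are assumptions made for illustration; minimum-size selection is not sketched because the slides give no details for it.

```python
from collections import Counter


def majority_summary(exit_locksets):
    """Pick the lockset produced by the largest number of exit points.

    `exit_locksets` holds one frozenset of lock names per exit point.
    """
    counts = Counter(exit_locksets)
    lockset, _ = counts.most_common(1)[0]
    return lockset


# Three exit points: two release the lock, one (possibly spurious) keeps it.
exits = [frozenset(), frozenset(), frozenset({"l"})]
print(majority_summary(exits))
```

With this rule the single exit point that appears to keep `l` held is outvoted, so callers continue with the empty lockset.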
Deadlock Checking
9/9
Experimental results
Ex. Deadlock: acquired lock is released and then reacquired by the same thread.
[Figure omitted; it names the locks scsiLock and handleArrayLock]
Data Race Checking
1/6
• The data race checker is called by the lockset analysis on each
statement.
• The checker can be run in three modes:
(1) Simple checking
- only flags global accesses that occur without any lock held.
(2) Simple statistical
- infers which non-global variables and functions must be
protected by some lock.
(3) Precise statistical
- infers which specific lock protects an access and flags an
access when the lockset does not contain that lock.
Data Race Checking
2/6
• The tool uses a set of heuristics to rank data race errors with
a scoring function.
• The heuristics answer the following questions:
- Is the lockset valid?
- Is the code multithreaded?
- Does x need to be protected?
- Does x need to be protected by L?
Data Race Checking
3/6
- Is the code multithreaded?
Two methods of determining whether code is multithreaded:
(1) Multithreading inference
– Any concurrency operation (e.g. lock acquire/release, atomic
operations) implies that the programmer believes the surrounding
code is multithreaded.
– The tool marks a function as multithreaded if concurrency
operations occur anywhere within its body, or anywhere above it in
a call chain.
(2) Programmer-written automatic annotators
– Users can mark a function as single-threaded, multithreaded, an
interrupt handler, or a function that should be ignored.
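Multithreading inference amounts to pushing a mark down the call graph from every function that contains a concurrency operation. The sketch below assumes a hypothetical call graph given as a caller-to-callees dictionary and a per-function flag; neither is the tool's actual data structure.

```python
def infer_multithreaded(call_graph, has_concurrency_op):
    """Mark functions that contain a concurrency operation, plus everything
    reachable below them in the call graph.

    call_graph: dict mapping a caller to the list of its callees.
    has_concurrency_op: dict mapping a function to True if its body
    contains a lock operation, atomic operation, etc.
    """
    marked = set()
    stack = [f for f, flag in has_concurrency_op.items() if flag]
    while stack:
        f = stack.pop()
        if f in marked:
            continue
        marked.add(f)
        stack.extend(call_graph.get(f, []))   # callees inherit the mark
    return marked


call_graph = {
    "main": ["worker", "init"],
    "worker": ["update"],
    "init": [],
    "update": [],
}
has_concurrency_op = {"worker": True}
print(sorted(infer_multithreaded(call_graph, has_concurrency_op)))
```

In this example `update` is marked because it sits below `worker` in a call chain, while `main` and `init` stay unmarked since no concurrency operation appears in them or above them.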
Data Race Checking
4/6
- Does x need to be protected?
• There are three approaches to answering this question:
(1) Eliminating accesses unlikely to be dangerous
- Avoid flagging data races on variables that are private to a thread.
- Demote errors where data appears to be written only during
initialization and only read afterwards.
(2) Promoting accesses that have a good chance of being unsafe
- Favor errors that write data over errors that read data.
- Flag unprotected variables that cannot be read or written atomically
(e.g. 64-bit variables on a 32-bit machine).
(3) Inferring which variables programmers believe must not be
accessed without a lock
- Count how many times each variable is accessed with a lock held
versus without one.
- Variables the programmer believes should be protected will have a
relatively high number of locked accesses and few unlocked accesses.
Data Race Checking
5/6
- Does x need to be protected by L?
• The tool infers whether a given lock protects a variable (or a
function) using a statistical approach.
• For each variable (or function), it counts
(1) the number of accesses to the variable (function), and
(2) the number of times these accesses held a specific lock.
• It then picks the single best lock out of all the candidates and
performs interprocedural checking with this information.
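The counting step can be sketched as follows. This is a deliberately simplified stand-in: it picks the best lock by raw counts, whereas the slides only say the choice is statistical, and the names `infer_protecting_locks` and `unprotected_accesses` as well as the (variable, lockset) input format are assumptions.

```python
from collections import Counter, defaultdict


def infer_protecting_locks(accesses):
    """accesses: list of (variable, lockset) pairs, one per access.

    Returns {variable: (best_lock, accesses_holding_it, total_accesses)}.
    """
    total = Counter()
    held = defaultdict(Counter)
    for var, lockset in accesses:
        total[var] += 1
        for lock in lockset:
            held[var][lock] += 1
    best = {}
    for var in total:
        if held[var]:   # skip variables never accessed under any lock
            lock, n = held[var].most_common(1)[0]
            best[var] = (lock, n, total[var])
    return best


def unprotected_accesses(accesses, best):
    """Indices of accesses whose lockset lacks the inferred lock."""
    return [(i, var) for i, (var, ls) in enumerate(accesses)
            if var in best and best[var][0] not in ls]


accesses = [
    ("count", {"a"}),
    ("count", {"a", "b"}),
    ("count", {"a"}),
    ("count", set()),     # suspicious: no lock held
    ("flag", set()),      # never locked anywhere, so nothing is inferred
]
best = infer_protecting_locks(accesses)
print(best)
print(unprotected_accesses(accesses, best))
```

Lock `a` is held on three of the four accesses to `count`, so it is chosen as the protecting lock and the fourth access is flagged; `flag` is never accessed under a lock, so no belief is inferred for it.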
Data Race Checking
6/6
Experimental results
Ex. Data race error
Conclusion
• RacerX is a static tool that uses flow-sensitive,
interprocedural analysis to detect both data races and
deadlocks.
• RacerX found errors in large real-world code bases such as
FreeBSD and Linux.
Further Work
• Chord, by Mayur Naik and Alex Aiken, POPL07
– Static race detection system for Java.
– Flow-insensitive, context-sensitive static analysis tool.
Reference
[1] RacerX: Effective, Static Detection of Race Conditions and
Deadlocks, Dawson Engler & Ken Ashcraft, SOSP03