02-softwarebugs.pptx

15-740/18-740
Oct. 17, 2012
Stefan Muller


Problem: Software is buggy!
More specific problem: we want to make sure
software doesn’t have bad property X.
◦ X could be: double frees, uses of freed memory, race
conditions, buffer overflows, security vulnerabilities…

Two solutions:
◦ Static analysis – analyze the code to see if it can
have property X (last paper, sort of)
◦ Dynamic analysis – watch the code as it runs and
stop it if it shows property X (first 3 papers)
10/17/2012
2





Watchdog
Prevents use-after-free errors by dynamically
adding checking instructions.
For each pointer, stores an identifier in hardware.
When a pointer is dereferenced, checks that
the identifier is valid.
When the memory a pointer refers to is freed, invalidates its
identifier.
Includes optimizations for fast checking of identifiers.
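
The identifier scheme above can be sketched in software. This is only an illustrative model under stated assumptions, not Watchdog's hardware design: the names checked_ptr, checked_malloc, checked_free and can_deref, and the fixed-size identifier table, are all hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical software model of identifier-based checking.
   None of these names come from Watchdog itself. */
#define MAX_IDS 1024

static bool     id_valid[MAX_IDS];  /* is each identifier still valid? */
static uint32_t next_id = 1;        /* identifier 0 is never valid */

typedef struct {
    void    *addr;  /* the raw pointer */
    uint32_t id;    /* identifier that travels with the pointer */
} checked_ptr;

/* Allocate memory and attach a fresh, valid identifier. */
checked_ptr checked_malloc(size_t n) {
    checked_ptr p = { NULL, 0 };
    if (next_id >= MAX_IDS)
        return p;                   /* this sketch simply runs out of ids */
    p.addr = malloc(n);
    if (p.addr != NULL) {
        p.id = next_id++;
        id_valid[p.id] = true;
    }
    return p;
}

/* Free the memory and invalidate the identifier. Every copy of the
   pointer carries the same identifier, so all aliases become invalid.
   A second free of the same pointer is ignored. */
void checked_free(checked_ptr p) {
    assert(p.id < MAX_IDS);         /* ids are always below MAX_IDS */
    if (p.id != 0 && id_valid[p.id]) {
        id_valid[p.id] = false;
        free(p.addr);
    }
}

/* The check performed before every dereference. */
bool can_deref(checked_ptr p) {
    return p.id != 0 && p.id < MAX_IDS && id_valid[p.id];
}
```

Because identifiers are never reused in this sketch, a stale alias stays invalid even if a later allocation returns the same address.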


RADISH: Race Detection in Software and Hardware
Uses vector clocks to track which reads and
writes to memory, in any thread, are guaranteed
to have happened before the current point.
◦ Each core stores a clock for every core including
itself.
◦ Each byte(*) of memory is associated with clocks
showing when it was last read and written.
◦ These clocks are cached along with the contents of memory
and stored in software when the data is evicted.
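
A minimal sketch of the vector-clock check described above, assuming four cores and one metadata record per memory location. The type and function names (vclock, loc_meta, read_races, write_races and so on) are hypothetical, and the sketch leaves out how clocks advance at synchronization releases.

```c
#include <assert.h>
#include <stdbool.h>

#define NCORES 4   /* assumed core count for this sketch */

/* One clock entry per core, as each core would keep. */
typedef struct { unsigned t[NCORES]; } vclock;

/* Per-location metadata: when it was last written and last read. */
typedef struct {
    int      writer;             /* core of the last write, -1 if none */
    unsigned write_time;         /* writer's own clock at that write */
    unsigned read_time[NCORES];  /* each core's clock at its last read, 0 if none */
} loc_meta;

void meta_init(loc_meta *m) {
    m->writer = -1;
    m->write_time = 0;
    for (int c = 0; c < NCORES; c++)
        m->read_time[c] = 0;
}

/* Synchronization (e.g. acquiring a lock another core released):
   take the element-wise maximum of the two clocks. */
void vc_join(vclock *dst, const vclock *src) {
    for (int c = 0; c < NCORES; c++)
        if (src->t[c] > dst->t[c])
            dst->t[c] = src->t[c];
}

/* Is the last write guaranteed to have happened before `now`? */
static bool write_ordered_before(const loc_meta *m, const vclock *now) {
    return m->writer < 0 || m->write_time <= now->t[m->writer];
}

/* A read races if the last write is not ordered before it. */
bool read_races(const loc_meta *m, const vclock *now) {
    return !write_ordered_before(m, now);
}

/* A write races if the last write or any earlier read is unordered. */
bool write_races(const loc_meta *m, const vclock *now) {
    if (!write_ordered_before(m, now))
        return true;
    for (int c = 0; c < NCORES; c++)
        if (m->read_time[c] > now->t[c])
            return true;
    return false;
}

void record_write(loc_meta *m, int core, const vclock *now) {
    assert(core >= 0 && core < NCORES);
    m->writer = core;
    m->write_time = now->t[core];
}

void record_read(loc_meta *m, int core, const vclock *now) {
    assert(core >= 0 && core < NCORES);
    m->read_time[core] = now->t[core];
}
```

In this model a core that has never synchronized with the writer sees the write as a race; after vc_join with the writer's clock the same access is ordered and passes the check.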

ParaLog: associate a lifeguard with each running thread
◦ The lifeguard checks the execution of the thread for bugs
◦ Run the lifeguard in parallel on another core

Running many threads+lifeguards
in parallel causes problems.
◦ Atomicity of accesses to lifeguard
metadata
◦ Out-of-order execution: in some cases,
it matters that events that happen
first are seen first by the lifeguard.
Example:
Thread 1: x = *p
Thread 2: free(p)
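
The ordering problem can be illustrated with a toy lifeguard that replays an event log. This is a sketch under assumptions, not ParaLog's mechanism: the event record, the block ids and count_use_after_free are hypothetical names. If the load really happened before the free but the lifeguard sees the free first, it reports an error that never occurred.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

typedef enum { EV_LOAD, EV_FREE } ev_kind;

/* Hypothetical event record delivered to a lifeguard. */
typedef struct {
    ev_kind kind;
    int     thread;  /* application thread that produced the event */
    int     block;   /* hypothetical id of the heap block involved */
} event;

#define NBLOCKS 8

/* Replay events in the order given and count loads from freed blocks. */
int count_use_after_free(const event *log, size_t n) {
    assert(log != NULL || n == 0);
    bool freed[NBLOCKS] = { false };
    int errors = 0;
    for (size_t i = 0; i < n; i++) {
        int b = log[i].block;
        if (b < 0 || b >= NBLOCKS)
            continue;                 /* ignore ids outside the sketch's table */
        if (log[i].kind == EV_FREE)
            freed[b] = true;
        else if (freed[b])
            errors++;                 /* load from a block already freed */
    }
    return errors;
}
```

Replaying the load followed by the free yields no error, while replaying the same two events in the opposite order yields one, which is why the lifeguard needs to see dependent events in the order they happened.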

How to reduce the penalty of accessing metadata?
◦ Caching!

Use existing architecture features
◦ RADISH uses cache coherence messages to update
clocks.
◦ ParaLog uses cache coherence messages to ascertain
dependences between events.

Modify other features to aid analysis
◦ Watchdog has a separate cache for identifier info.
◦ RADISH adds additional logic and hardware state to store
and compute with per-core clocks.
◦ ParaLog maintains a TLB mapping commonly used
application data to the location of related metadata.
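
As a rough illustration of caching the application-to-metadata mapping, here is a small direct-mapped lookup table in software. Everything in it is an assumption made for the sketch: the 4 KiB page size, the 16-entry table, the fixed metadata base address and the names meta_addr and slow_meta_base do not come from any of the three systems.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define PAGE_SHIFT   12   /* assumed 4 KiB pages */
#define MTLB_ENTRIES 16   /* assumed table size, a power of two */

typedef struct {
    bool      valid;
    uintptr_t app_page;   /* application page number */
    uintptr_t meta_base;  /* where that page's metadata starts */
} mtlb_entry;

static mtlb_entry mtlb[MTLB_ENTRIES];
static unsigned   slow_lookups;   /* counts misses, for illustration */

/* Hypothetical slow path standing in for a full table walk: metadata
   for page N is assumed to live at a fixed base plus N pages. */
static uintptr_t slow_meta_base(uintptr_t app_page) {
    slow_lookups++;
    return (uintptr_t)0x40000000u + (app_page << PAGE_SHIFT);
}

/* Translate an application address to the address of its metadata,
   using the cached mapping when the page was seen recently. */
uintptr_t meta_addr(uintptr_t app_addr) {
    assert((MTLB_ENTRIES & (MTLB_ENTRIES - 1)) == 0);
    uintptr_t page = app_addr >> PAGE_SHIFT;
    uintptr_t off  = app_addr & (((uintptr_t)1 << PAGE_SHIFT) - 1);
    mtlb_entry *e  = &mtlb[page & (MTLB_ENTRIES - 1)];
    if (!e->valid || e->app_page != page) {   /* miss: refill this slot */
        e->valid     = true;
        e->app_page  = page;
        e->meta_base = slow_meta_base(page);
    }
    return e->meta_base + off;
}

unsigned slow_lookup_count(void) {
    return slow_lookups;
}
```

Repeated accesses to the same page pay for the slow lookup once; a page that maps to the same slot evicts the earlier entry, as in any direct-mapped cache.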
                 Watchdog            RADISH              ParaLog
Type of error    Use-after-free      Race condition      Many
Metadata         Identifiers         Clocks              Varies
Where stored     Registers + Memory  Caches + Software   Memory
Where run        Application thread  Application thread  Separate core
Runtime penalty  24%                 0-100%              Varies (51% and 28% for two lifeguards on eight cores)


How much metadata to store
Hardware vs. Software
◦ Hardware is fast, but software is flexible and allows
a reduction in space usage.
◦ We’ve seen ways to store some metadata in
hardware, but use a different system (maybe
software) when that overflows.

Where to run checks
◦ Use a separate core and run the application in near real time, or instrument the application with runtime checks?

ConSeq works backward from (potential) failures to
find the concurrency errors that trigger them.
◦ Static: identify failure sites (e.g. assert failures, bad
outputs…).
◦ Static: identify a critical read that affects the value of local
memory at that failure site.
◦ Dynamic: find alternative interleavings that might result in
different values at the critical read by observing a
(probably correct) run of the program. Vector clocks are used
to identify other writes that may produce such alternate values.
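
The dynamic step can be sketched with vector clocks. This is a simplification under stated assumptions, not ConSeq's actual algorithm: it treats a write as a candidate whenever it is concurrent with the critical read, that is, neither one is ordered before the other in the observed run. The names vclock, happens_before, concurrent and count_alternative_writes are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define NTHREADS 2   /* assumed thread count for this sketch */

typedef struct { unsigned t[NTHREADS]; } vclock;

/* a happens before b if a <= b in every component and a != b. */
bool happens_before(const vclock *a, const vclock *b) {
    bool strictly_less = false;
    for (int i = 0; i < NTHREADS; i++) {
        if (a->t[i] > b->t[i])
            return false;
        if (a->t[i] < b->t[i])
            strictly_less = true;
    }
    return strictly_less;
}

/* Two events are concurrent if neither is ordered before the other. */
bool concurrent(const vclock *a, const vclock *b) {
    return !happens_before(a, b) && !happens_before(b, a);
}

/* Count the writes that are unordered with respect to the critical
   read and so could supply its value under another interleaving. */
size_t count_alternative_writes(const vclock *read,
                                const vclock *writes, size_t n) {
    assert(read != NULL && (writes != NULL || n == 0));
    size_t count = 0;
    for (size_t i = 0; i < n; i++)
        if (concurrent(&writes[i], read))
            count++;
    return count;
}
```

For example, with a read stamped (2,0), a same-thread write stamped (1,0) is ordered before the read and is not counted, while a write stamped (0,1) from an unsynchronized second thread is concurrent with it and is counted.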
Thread 1:
    p = malloc(sizeof(some_t));
    for (int i = 0; i < 5; i++)
        a[i] = 0;
    assert (p != NULL);      (failure)
Thread 2:
    p = NULL;                (trigger: other write to be interleaved)


ConSeq is only run during testing, so there is no
production runtime overhead.
However, it gives up two properties:
◦ Soundness: no false negatives
◦ Completeness: no false positives
◦ A tool can usually only have one of these anyway. ConSeq
instead seeks to balance both with performance.

Is this a good tradeoff?