Cork: Dynamic Memory Leak Detection with Garbage

Download Report

Transcript Cork: Dynamic Memory Leak Detection with Garbage

Department of Computer Sciences
Dynamic Shape Analysis
via Degree Metrics
Maria Jump & Kathryn S. McKinley
Department of Computer Sciences
The University of Texas at Austin
{mjump,mckinley}@cs.utexas.edu
Software is Dynamic
 Software is always changing
 New expectations
 New algorithms
 New applications
 New users
 New notions of what is cool
 Limited only by human ambition
First Law of Software: Software is a gas!
It expands to fit the container it is in!
[Nathan Myhrvold, Former CTO, Microsoft]
20-Jun-2009
Department of Computer Sciences
Jump & McKinley
Software Complexity
Property of a system that is directly proportional to the
difficulty one has in comprehending the system
at a level and detail necessary to make changes
without introducing instability or functional regressions
[Peter Rosser, Microsoft Software Engineer]
 Developed by large teams
 Few, if any, understand all parts
 Need tools to help
20-Jun-2009
Department of Computer Sciences
Jump & McKinley
Program Analysis
Program =
Code
+
Data
Software complexity leads to heap complexity
20-Jun-2009
Department of Computer Sciences
Jump & McKinley
Heap Complexity
 Exacerbated by modern languages
 Objects are smaller and more numerous
 Most objects allocated on the heap
 Objects encode program state
 Heap contains semantic, memory, and
concurrency bugs
Program analysis is incomplete
without heap analysis
20-Jun-2009
Department of Computer Sciences
Jump & McKinley
An Opportunity
 Automatic memory management
examines live data on a regular basis
 Understand heap usage
 Detect heap-based bugs
 Optimize program based on heap usage
Opportunity to perform
heap analysis dynamically
20-Jun-2009
Department of Computer Sciences
Jump & McKinley
Shape Analysis
Characterize the “shape” of heap-allocated
pointer-based data structures and
verify shape-preserving properties
20-Jun-2009
Department of Computer Sciences
Jump & McKinley
Shape Analysis
Static
Dynamic
 Conservatively characterizes
 Discover shape of current
all “possible” shapes at every
program point
 Only works on small
programs
Program =
Code
20-Jun-2009
Department of Computer Sciences
+
data structure
 Generate assertions
for verification
 Monitor shape during
entire execution
Data
Jump & McKinley
Shape Analysis
HomeBrewed
Custom
Library
Library
80
LinkedHashMap: 99.3%
60
BinaryTree: 35.8%
HashMap: 59.5%
40
20
20-Jun-2009
Department of Computer Sciences
Custom: 33%
Library: 58%
average
xalan
pmd
lusearch
luindex
jython
fop
eclipse
chart
bloat
antlr
jack
mpegaudio
javac
raytrace
jess
0
OctTree: 98%
compress
% of Heap
100
Jump & McKinley
ShapeUp
 Desired goals
 Discover shape of current data structure
 Discover dynamic invariants
 Monitor expected shape of current data
structure during execution
 Heap summarization graph [POPL ‘07]
 Class field-wise graph (CFWG)
 Identifies recursive data structures (RDS)
 Summarize dynamic degree invariants
20-Jun-2009
Department of Computer Sciences
Jump & McKinley
Class Field-Wise Graph (CFWG)
Heap
CFWG
Home-brewed data structure from SPECjbb2000 (simplified)
20-Jun-2009
Department of Computer Sciences
Jump & McKinley
Class Field-Wise Graph (CFWG)
Heap
CFWG
Home-brewed data structure from SPECjbb2000 (simplified)
20-Jun-2009
Department of Computer Sciences
Jump & McKinley
Class Field-Wise Graph (CFWG)
Heap
CFWG
Degree Metrics
Home-brewed data structure from SPECjbb2000 (simplified)
20-Jun-2009
Department of Computer Sciences
Jump & McKinley
Degree Metrics
 Degree metrics per instance:
 Out-degree: # of outgoing references
 In-degree: # of incoming references
 Only count edges between objects
of same class
 Use word in header to track in-degree of
backbone objects
 Summarize by class per data structure
instance in a degree profile
20-Jun-2009
Department of Computer Sciences
Jump & McKinley
Calculating Degree Profile
CFWG
20-Jun-2009
Department of Computer Sciences
Jump & McKinley
Calculating Degree Profile
CFWG
20-Jun-2009
Department of Computer Sciences
Jump & McKinley
Calculating Degree Profile
CFWG
20-Jun-2009
Department of Computer Sciences
Jump & McKinley
Calculating Degree Profile
n=1
CFWG
20-Jun-2009
Department of Computer Sciences
Jump & McKinley
Calculating Degree Profile
n=1
CFWG
20-Jun-2009
Department of Computer Sciences
Jump & McKinley
Calculating Degree Profile
n=1
CFWG
20-Jun-2009
Department of Computer Sciences
Jump & McKinley
Calculating Degree Profile
n=2
CFWG
20-Jun-2009
Department of Computer Sciences
Jump & McKinley
Calculating Degree Profile
n=2
CFWG
20-Jun-2009
Department of Computer Sciences
Jump & McKinley
Calculating Degree Profile
n=2
CFWG
20-Jun-2009
Department of Computer Sciences
Jump & McKinley
Calculating Degree Profile
n=3
CFWG
20-Jun-2009
Department of Computer Sciences
Jump & McKinley
Calculating Degree Profile
n=3
CFWG
20-Jun-2009
Department of Computer Sciences
Jump & McKinley
Calculating Degree Profile
n=3
CFWG
20-Jun-2009
Department of Computer Sciences
Jump & McKinley
Calculating Degree Profile
n=4
CFWG
20-Jun-2009
Department of Computer Sciences
Jump & McKinley
Calculating Degree Profile
n=5
CFWG
20-Jun-2009
Department of Computer Sciences
Jump & McKinley
Calculating Degree Profile
n=6
CFWG
20-Jun-2009
Department of Computer Sciences
Jump & McKinley
Calculating Degree Profile
n=6
CFWG
20-Jun-2009
Department of Computer Sciences
Jump & McKinley
Calculating Degree Profile
n=6
CFWG
20-Jun-2009
Department of Computer Sciences
Jump & McKinley
Calculating Degree Profile
n=7
CFWG
20-Jun-2009
Department of Computer Sciences
Jump & McKinley
Calculating Degree Profile
n=8
CFWG
20-Jun-2009
Department of Computer Sciences
Jump & McKinley
Calculating Degree Profile
n=9
CFWG
20-Jun-2009
Department of Computer Sciences
Jump & McKinley
Calculating Degree Profile
n=9
CFWG
20-Jun-2009
Department of Computer Sciences
Jump & McKinley
Calculating Degree Profile
n=9
CFWG
20-Jun-2009
Department of Computer Sciences
Jump & McKinley
Calculating Degree Profile
n=10
CFWG
20-Jun-2009
Department of Computer Sciences
Jump & McKinley
Calculating Degree Profile
n=10
CFWG
20-Jun-2009
Department of Computer Sciences
Jump & McKinley
Calculating Degree Profile
n=10
CFWG
20-Jun-2009
Department of Computer Sciences
Jump & McKinley
Calculating Degree Profile
n=11
CFWG
20-Jun-2009
Department of Computer Sciences
Jump & McKinley
Calculating Degree Profile
n=12
CFWG
20-Jun-2009
Department of Computer Sciences
Jump & McKinley
Calculating Degree Profile
n=13
CFWG
20-Jun-2009
Department of Computer Sciences
Jump & McKinley
Calculating Degree Profile
n=13
CFWG
20-Jun-2009
Department of Computer Sciences
Jump & McKinley
Calculating Degree Profile
n=13
CFWG
20-Jun-2009
Department of Computer Sciences
Jump & McKinley
Calculating Degree Profile
n=14
CFWG
20-Jun-2009
Department of Computer Sciences
Jump & McKinley
Calculating Degree Profile
n=15
CFWG
20-Jun-2009
Department of Computer Sciences
Jump & McKinley
Calculating
Degree Profile
constant
n
CFWG
range
20-Jun-2009
Complete Binary Tree
Department of Computer Sciences
Jump & McKinley
Calculating
Degree Profile
constant
CFWG
range
left
20-Jun-2009
Complete Binary Tree
Department of Computer Sciences
right
Jump & McKinley
Interpreting Degree Profile
 Monitoring degree profile shows errors:
 Error: constant violation
 Warning: range violation [min-,max+]
 Indicate problem with RDS during
development and testing
 Monitor shape after deployment for
dynamically introduced errors
What kind of errors can be detected?
20-Jun-2009
Department of Computer Sciences
Jump & McKinley
Implementation and Methodology
 Jikes RVM
 SPECjvm98, DaCapo, SPECjbb2000
 Heap composition
 Showed degree metric are stable by class
 Overheads
 <8% total time
 <1% space overhead [used bits in header]
 Microbenchmarks
 Single RDS in isolation
 Random error injection
20-Jun-2009
Department of Computer Sciences
Singly-linked list
Doubly-linked list
Binary Tree
Binary Tree w/ PP
Jump & McKinley
Looking for errors
TRAINING
 Ran microbenchmark 100 times
 Merged together dynamic invariants
ERROR DETECTION
 Created RDS with 100,000 nodes
 Errors: 1, 2, 3, 4, 5, 10, 50, and 100
 Detected dynamic invariant violations
20-Jun-2009
Department of Computer Sciences
Jump & McKinley
Doubly-linked List
=0
=1
=2
in
0
2
n-2
out
0
2
n-2
circle
disconnect
20-Jun-2009
Department of Computer Sciences
cyclic
skip
Jump & McKinley
Doubly-linked List
1
2
=0
=1
=2
in
0
2
n-2
out
0
2
n-2
3
4
5
10
50
100
100
% Detection
90
80
70
60
50
40
30
20
10
20-Jun-2009
Department of Computer Sciences
Random
Skip
Cyclic
Disconnect
Circle
0
Jump & McKinley
Binary Tree
=0
=1
=2
Complete Binary Tree
in
out
1
n-1
0
[50.0,50.2]
0
[49.8,50.0]
Full Binary Tree
in
out
1
n-1
0
[50.0,50.1]
0
[49.9,50.0]
Random Binary Tree
in
out
1
n-1
0
[33.6,35.7]
[28.7,35.1]
[32.3,35.6]
link
20-Jun-2009
Department of Computer Sciences
Jump & McKinley
Binary Tree
=0
=1
=2
Complete Binary Tree
in
out
1
n-1
0
[50.0,50.2]
0
[49.8,50.0]
Full Binary Tree
in
1
2
3
4
5
10
50
out
100
100
n-1
0
[50.0,50.1]
0
[49.9,50.0]
Random Binary Tree
90
80
in
70
60
out
50
1
n-1
0
[33.6,35.7]
[28.7,35.1]
[32.3,35.6]
40
30
20
10
BTRandom
BTFull
0
BTComplete
% Detection
1
20-Jun-2009
Department of Computer Sciences
Jump & McKinley
Binary Tree w/ PP
=1
=2
=3
Complete Binary Tree w/ PP
in
[50.0,50.0]
2
[49.9,50.0]
out
[50.0,50.0]
2
[49.9,50.0]
link
Full Binary Tree w/ PP
in
[50.0,50.1]
1
[49.9,50.0]
out
[50.0,50.1]
1
[49.9,50.0]
Random Binary Tree w/ PP
in
[33.8,35.2]
[30.0,32.6]
[33.6,34.8]
out
[33.8,35.2]
[30.0,32.6]
[33.6,34.8]
disconnect
20-Jun-2009
Department of Computer Sciences
Jump & McKinley
20-Jun-2009
Department of Computer Sciences
50
BTPRandom
Random
10
BTPFull
Random
5
BTPComplete
Random
4
BTPRandom
Disconnect
3
BTPFull
Disconnect
2
BTPComplete
Disconnect
1
BTPRandom
Link
BTPFull Link
BTPComplete
Link
% Detection
Microbenchmark: Binary Tree w/ PP
100
100
90
80
70
60
50
40
30
20
10
0
Jump & McKinley
ShapeUp’s Contributions




Performs dynamic heap analysis to determine shape
of current data structure
Summarizes degree metrics using a class field-wise
graph with low overheads
Shows whole-heap degree metrics are not stable, but
class degree metrics are
Introduces degree profiles for RDS


Regular RDS have more invariants than random RDS
Degree profiles can detect some shape errors
Thank You!
20-Jun-2009
Department of Computer Sciences
Jump & McKinley
Thank You!
20-Jun-2009
Department of Computer Sciences
Jump & McKinley
Space Overhead
jess
# of
types
bm+VM
Eclipse
Geomean
1744
3365
avg
318
667
max
319
775
# of
avg
edges max
844
4090
861
7585
1142
0.094%
0.167%
0.233%
0.233%
Increased Alloc %
20-Jun-2009
Department of Computer Sciences
19%
2.7X
1747
334
346
904
Jump & McKinley
Normalized Total Time
Time Overhead
Heap Size Relative to Minimum
20-Jun-2009
Department of Computer Sciences
Jump & McKinley
Singly-linked List
=0
=1
in
1
n-1
out
1
n-1
1 Error
head to tail
 cyclic := creates a
cycle from tail to
random node
90
80
Detection %
 circle := attaches
100
70
60
50
40
30
20
10
20-Jun-2009
Department of Computer Sciences
SLL Cyclic
SLL Circle
0
Jump & McKinley
Microbenchmark: HashMap
=0
=1
=2
in
[31.9,51.6]
[48.4,68.1]
{0}
out
[31.9,51.6]
[48.4,68.1]
{0}
1
2
3
4
5
10
50
100
100
90

link := creates a connection
between buckets from a null
ptr to a random entry
80
70
60
50
40
30
20
10
HM Link
0
20-Jun-2009
Department of Computer Sciences
Jump & McKinley
Data Structures in Benchmarks
Benchmark
RDS Node
Raytrace
OctNode
H
78.0
Jack
LinkedHashMap$LinkedHashEntry
L
44.2
RuntimeNfaState
H
9.4
Antlr
Object[]
H
76.2
Bloat
HashMap$HashEntry
L
56.6
CallMethodExpr
H
3.7
LinkedHashmap$LinkedHashEntry
L
59.0
AND_AND_Expression
H
0.4
HashMap$HashEntry
L
51.4
PropertyList
H
4.4
Jython
Pyframe
H
94.6
Luindex
LinkedHashMap$LinkedHashEntry
L
99.3
Lusearch
WeakHashMap$WeakBucket
L
47.5
HitDoc
H
2.0
ChildIterator
H
34.6
Eclipse
Fop
Xalan
20-Jun-2009
Department of Computer Sciences
L or H?
%
Jump & McKinley