380C
Lecture 17
• Where are we & where we are going
– Managed languages
• Dynamic compilation
• Inlining
• Garbage collection
– Why you need to care about workloads &
experimental methodology
– Alias analysis
– Dependence analysis
– Loop transformations
– EDGE architectures
1
Today
• Garbage Collection
– Why use garbage collection?
– What is garbage?
• Reachable vs live, stack maps, etc.
– Allocators and their collection mechanisms
• Semispace
• Marksweep
• Performance comparisons
– Incremental age based collection
• Write barriers: Friend or foe?
• Generational
• Beltway
– More performance
2
Basic VM Structure
Program/Bytecode
Executing
Program
Class Loader
Verifier, etc.
Heap
Thread
Scheduler
Dynamic Compilation
Subsystem
Garbage
Collector
3
True or False?
• Real programmers use languages with
explicit memory management.
– I can optimize my memory management much
better than any garbage collector
4
True or False?
• Real programmers use languages with
explicit memory management.
– I can optimize my memory management much
better than any garbage collector
– Scope of effort?
5
Why Use
Garbage Collection?
• Software engineering benefits
– Less user code compared to explicit memory management (MM)
– Less user code to get correct
– Protects against some classes of memory errors
• No free(), thus no premature free(), no double free(), or
forgetting to free()
• Not perfect, memory can still leak
– Programmers still need to eliminate all pointers to objects the
program no longer needs
• Performance: space-time tradeoff
– Time proportional to dead objects (explicit mm, reference
counting) or live objects (semispace, marksweep)
– Throughput versus pause time
• Less frequent collection, typically reduces total time but increases
space requirements and pause times
– Hidden locality benefits?
6
What is Garbage?
• In theory, any object the program will never
reference again
– But compiler & runtime system cannot figure that out
• In practice, any object the program cannot reach
is garbage
– Approximate liveness with reachability
• Managed languages couple GC with “safe” pointers
– Programs may not access arbitrary addresses in memory
– The compiler can identify and provide to the garbage
collector all the pointers, thus
– “Once garbage, always garbage”
– Runtime system can move objects by updating pointers
– “Unsafe” languages can do non-moving GC by assuming
anything that looks like a pointer is one.
7
Reachability
• Compiler produces a stack-map at GC safe-points and Type
Information Blocks
• GC safe points: new(), method entry, method exit, & backedges (thread switch points)
• Stack-map: enumerate global variables, stack variables, live
registers -- This code is hard to get right! Why?
• Type Information Blocks: identify reference fields in objects
[Figure: pointers from globals, the stack, and registers (PC at p.f = obj) into heap objects A, B, C]
8
Reachability
• Compiler produces a stack-map at GC safe-points and Type
Information Blocks
• Type Information Blocks: identify reference fields in objects
for each type i (class) in the program, a map
TIBi
[Figure: TIBi records the reference-field offsets (here 0, 2, 3) for objects of type i; roots in globals, stack, and registers point into the heap]
9
Reachability
• Tracing collector (semispace, marksweep)
– Marks the objects reachable from the roots live, and
then performs a transitive closure over them
[Figure: mark phase begins at the roots in globals, stack, and registers]
10
Reachability
• Tracing collector (semispace, marksweep)
– Marks the objects reachable from the roots live, and
then performs a transitive closure over them
• All unmarked objects are dead, and can be
reclaimed
[Figure: mark phase has marked every object reachable from the roots]
13
Reachability
• Tracing collector (semispace, marksweep)
– Marks the objects reachable from the roots live, and
then performs a transitive closure over them
• All unmarked objects are dead, and can be
reclaimed
[Figure: sweep reclaims the unmarked (unreachable) objects]
14
Today
• Garbage Collection
– Why use garbage collection?
– What is garbage?
• Reachable vs live, stack maps, etc.
– Allocators and their collection mechanisms
• Semispace
• Marksweep
• Performance comparisons
– Incremental age based collection
• Write barriers: Friend or foe?
• Generational
• Beltway
– More performance
15
Semispace
• Fast bump pointer allocation
• Requires copying collection
• Cannot incrementally reclaim memory, must
free en masse
• Reserves 1/2 the heap to copy into, in case
all objects are live
[Figure: heap divided into to-space and from-space]
16
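The bump-pointer fast path can be sketched in a few lines. This is an illustrative model only (the class and field names are invented, and the heap is just a byte array indexed by offsets), not Jikes RVM/MMTk code:

```java
// Minimal sketch of bump-pointer allocation over a semispace.
// All names are hypothetical; addresses are offsets into a byte array.
class SemispaceAllocator {
    private final byte[] heap;
    private int cursor;      // bump pointer: next free byte in to-space
    private final int limit; // end of the current semispace

    SemispaceAllocator(int heapSize) {
        heap = new byte[heapSize];
        cursor = 0;
        limit = heapSize / 2;  // reserve half the heap to copy into
    }

    // Returns the offset of the new object, or -1 to signal "collect".
    int alloc(int size) {
        if (cursor + size > limit) return -1; // space exhausted: trigger GC
        int addr = cursor;
        cursor += size;  // fast path: one compare and one add
        return addr;
    }
}
```

The fast path is a single compare and add, which is why bump-pointer allocation is so cheap relative to a free-list lookup.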
Semispace
• Mark phase:
– copies object when collector first encounters it
– installs forwarding pointers
[Figure: live objects being copied from from-space into to-space]
20
Semispace
• Mark phase:
– copies object when collector first encounters it
– installs forwarding pointers
– performs transitive closure, updating pointers
as it goes
[Figure: transitive closure copies reachable objects into to-space, updating pointers]
21
Semispace
• Mark phase:
– copies object when collector first encounters it
– installs forwarding pointers
– performs transitive closure, updating pointers
as it goes
– reclaims “from space” en masse
[Figure: from-space reclaimed; survivors compacted in to-space]
24
Semispace
• Mark phase:
– copies object when collector first encounters it
– installs forwarding pointers
– performs transitive closure, updating pointers
as it goes
– reclaims “from space” en masse
– start allocating again into “to space”
[Figure: the spaces flip roles; allocation resumes in the new to-space]
25
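The copy-with-forwarding-pointers logic above can be sketched as follows. This is a hypothetical model: objects are Java records rather than raw memory, and the recursive scan stands in for a real collector's Cheney-style scan queue. Installing the forwarding pointer before scanning fields is what guarantees each object is copied exactly once, even in cyclic graphs:

```java
import java.util.*;

// Sketch of semispace copying with forwarding pointers (names illustrative).
class Copier {
    static class Obj {
        Obj[] fields;     // reference fields
        Obj forward;      // forwarding pointer, installed on first copy
        Obj(int nFields) { fields = new Obj[nFields]; }
    }

    // Copy o into to-space, performing the transitive closure and
    // updating pointers as it goes.
    static Obj copy(Obj o, List<Obj> toSpace) {
        if (o == null) return null;
        if (o.forward != null) return o.forward;   // already copied: follow forward
        Obj c = new Obj(o.fields.length);
        o.forward = c;          // install forward BEFORE scanning (handles cycles)
        toSpace.add(c);
        for (int i = 0; i < o.fields.length; i++)
            c.fields[i] = copy(o.fields[i], toSpace);
        return c;
    }
}
```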
Semispace
• Notice:
+ fast allocation
+ locality of contemporaneously allocated objects
+ locality of objects connected by pointers
− wasted space
[Figure: heap divided into to-space and from-space]
27
Marksweep
• Free-lists organized by size
– blocks of same size, or
– individual objects of same size
• Most objects are small (< 128 bytes)
[Figure: free lists segregated by size class (4, 8, 12, 16, …, 128 bytes) over the heap]
28
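A size-class free-list allocator in this style might look like the sketch below. The size classes (multiples of 4 up to 128 bytes) and all names are illustrative, and free cells are tracked by offset rather than real addresses:

```java
import java.util.ArrayDeque;

// Sketch of segregated free lists as used by marksweep (names hypothetical).
class FreeListAllocator {
    static final int NUM_CLASSES = 32;   // size classes 4, 8, ..., 128 bytes
    @SuppressWarnings("unchecked")
    final ArrayDeque<Integer>[] freeLists = new ArrayDeque[NUM_CLASSES];

    FreeListAllocator() {
        for (int i = 0; i < NUM_CLASSES; i++) freeLists[i] = new ArrayDeque<>();
    }

    // Round a request up to its size class: 1-4 -> class 0, 5-8 -> class 1, ...
    static int sizeClass(int bytes) { return (bytes + 3) / 4 - 1; }

    // The sweep phase pushes each free cell's address onto the matching list.
    void free(int addr, int bytes) { freeLists[sizeClass(bytes)].push(addr); }

    // Allocation pops the head of the matching list; -1 means "trigger a collection".
    int alloc(int bytes) {
        ArrayDeque<Integer> list = freeLists[sizeClass(bytes)];
        return list.isEmpty() ? -1 : list.pop();
    }
}
```

Compared with the bump pointer, allocation here costs a size-class computation plus a list pop, and contemporaneously allocated objects can land far apart in the heap.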
Marksweep
• Allocation
– Grab a free object off the free list
[Figure: allocation pops the head of the matching size-class free list]
29
Marksweep
• Allocation
– Grab a free object off the free list
– No more memory of the right size triggers a collection
– Mark phase: find the live objects
– Sweep phase: put free ones on the free list
[Figure: segregated free lists over the heap]
32
Marksweep
• Mark phase
– Transitive closure marking all the live objects
• Sweep phase
– sweep the memory for free objects populating free list
[Figure: mark bits set on live objects; sweep rebuilds the free lists from the rest]
33
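The mark phase's transitive closure can be sketched with an explicit worklist. The object graph is modeled as an adjacency list over object ids (all names are illustrative); everything not in the returned set is garbage:

```java
import java.util.*;

// Sketch of the mark phase: transitive closure from the roots.
class Marker {
    // heap: object id -> ids of the objects its reference fields point to
    static Set<Integer> mark(Map<Integer, List<Integer>> heap, List<Integer> roots) {
        Set<Integer> marked = new HashSet<>();
        Deque<Integer> work = new ArrayDeque<>(roots);  // roots: globals, stack, registers
        while (!work.isEmpty()) {
            int obj = work.pop();
            if (marked.add(obj))   // first visit: mark, then scan its fields
                work.addAll(heap.getOrDefault(obj, List.of()));
        }
        return marked;  // unmarked objects are unreachable, i.e. garbage
    }
}
```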
Marksweep
• Mark phase
– Transitive closure marking all the live objects
• Sweep phase
– sweep the memory for free objects populating free list
– can be made incremental by organizing the heap in blocks and
sweeping one block at a time on demand
[Figure: heap organized into blocks, swept lazily onto the free lists]
36
Marksweep
+ space efficiency
+ incremental object reclamation
− relatively slower allocation
− poor locality of contemporaneously allocated objects
[Figure: segregated free lists over the heap]
37
How do these differences
play out in practice?
Marksweep
+ space efficiency
+ incremental object reclamation
− relatively slower allocation
− poor locality of contemporaneously allocated objects
Semispace
+ fast allocation
+ locality of contemporaneously allocated objects
+ locality of objects connected by pointers
− wasted space
38
Methodology
[SIGMETRICS 2004]
• Compare Marksweep (MS) and Semispace (SS)
• Mutator time, GC time, total time
• Jikes RVM & MMTk
• replay compilation
• measure second iteration without compilation
• Platforms
• 1.6GHz G5 (PowerPC 970)
• 1.9GHz AMD Athlon 2600+
• 2.6GHz Intel P4
• Linux 2.6.0 with perfctr patch & libraries
– Separate accounting of GC & Mutator counts
• SPECjvm98 & pseudojbb
39
Allocation Mechanism
• Bump pointer
– ~70 bytes IA32 instructions, 726MB/s
• Free list
– ~140 bytes IA32 instructions, 654MB/s
• Bump pointer 11% faster in tight loop
– < 1% in practical setting
– No significant difference (?)
40
Mutator Time
41
jess
[Chart: jess normalized mutator time vs. normalized heap size (1–6×); MarkSweep vs. SemiSpace]
42
jess
[Chart: jess normalized L1 misses vs. normalized heap size; MarkSweep vs. SemiSpace]
43
jess
[Chart: jess normalized L2 misses vs. normalized heap size; MarkSweep vs. SemiSpace]
44
jess
[Chart: jess normalized TLB misses vs. normalized heap size; MarkSweep vs. SemiSpace]
45
javac
[Charts: javac normalized mutator time, L1 misses, L2 misses, and TLB misses vs. normalized heap size; MarkSweep vs. SemiSpace]
46
pseudojbb
[Charts: pseudojbb normalized mutator time, L1 misses, L2 misses, and TLB misses vs. normalized heap size; MarkSweep vs. SemiSpace]
47
Geometric Mean
Mutator Time
[Chart: geometric-mean mutator time (normalized to best) vs. heap size (relative to min); MarkSweep vs. SemiSpace]
48
Garbage Collection Time
49
Garbage Collection Time
[Charts: GC time (normalized to best) vs. heap size (relative to min) for jess, javac, pseudojbb, and the geometric mean; MarkSweep vs. SemiSpace]
50
Total Time
51
Total Time
[Charts: total time (normalized to best) vs. heap size (relative to min) for jess, javac, pseudojbb, and the geometric mean; MarkSweep vs. SemiSpace]
52
MS/SS Crossover: 1.6GHz PPC
[Chart: normalized total time vs. heap size relative to minimum; SemiSpace vs. MarkSweep on the 1.6GHz PPC]
53
MS/SS Crossover: 1.9GHz AMD
[Chart: normalized total time vs. heap size; 1.9GHz AMD curves added to the 1.6GHz PPC curves]
54
MS/SS Crossover: 2.6GHz P4
[Chart: normalized total time vs. heap size; 2.6GHz P4 curves added]
55
MS/SS Crossover: 3.2GHz P4
[Chart: normalized total time vs. heap size; 3.2GHz P4 curves added]
56
MS/SS Crossover
[Chart: normalized total time vs. heap size for all four platforms; the MS/SS crossover moves right as clock speed rises (1.6GHz → 1.9GHz → 2.6GHz → 3.2GHz): SemiSpace's locality wins in larger heaps, MarkSweep's space efficiency wins in smaller ones]
57
Today
• Garbage Collection
– Why use garbage collection?
– What is garbage?
• Reachable vs live, stack maps, etc.
– Allocators and their collection mechanisms
• Semispace
• Marksweep
• Performance comparisons
– Incremental age based collection
• Enabling mechanisms
– write barrier & remembered sets
• Heap organizations
– Generational
– Beltway
– Performance comparisons
58
One Big Heap?
Pause times
– it takes too long to trace the whole heap at once
Throughput
– the heap contains lots of long-lived objects; why collect
them over and over again?
Incremental collection
– divide the heap into increments and collect one at a time
[Figure: two increments, each with its own to-space and from-space]
59
Incremental Collection
Ideally
• perfect pointer knowledge of live pointers between
increments
• requires scanning whole heap, defeats the purpose
[Figure: pointers crossing between Increment 1 and Increment 2]
60
Incremental Collection
Ideally
• perfect pointer knowledge of live pointers between
increments
• requires scanning whole heap, defeats the purpose
Mechanism: Write barrier
• records pointers between increments when the mutator
installs them, conservative approximation of reachability
[Figure: two increments with cross-increment pointers recorded by the write barrier]
63
Write barrier
compiler inserts code that records pointers between
increments when the mutator installs them
// original program
p.f = o;
// compiler support for incremental collection
if (incr(p) != incr(o)) {
    remset(incr(o)) = remset(incr(o)) ∪ {p.f};
}
p.f = o;
[Figure: Increment 1 holds objects a–g and t, remset1 = {w}; Increment 2 holds u–z, remset2 = {f,g}]
64
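A runnable model of this barrier might look like the following. Here `incrOf` and the integer object ids are stand-ins for the real increment lookup, and slots are named by integer ids; only the recording logic is real:

```java
import java.util.*;

// Sketch of the cross-increment write barrier (all names hypothetical).
class WriteBarrier {
    final Map<Integer, Integer> incrOf = new HashMap<>();        // object id -> increment id
    final Map<Integer, Set<Integer>> remsets = new HashMap<>();  // increment id -> recorded slots

    // Models the store p.f = o, where slot names the field p.f being written.
    void store(int p, int slot, int o) {
        if (!incrOf.get(p).equals(incrOf.get(o)))  // cross-increment pointer: remember it
            remsets.computeIfAbsent(incrOf.get(o), k -> new HashSet<>()).add(slot);
        // ... then perform the actual store
    }
}
```

Note the set insert makes the remembered set duplicate-free; the slides' sequential remset (which records d twice) corresponds to an unfiltered buffer, a common cheaper alternative.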
Write barrier
Install new pointer d -> v
// original program
p.f = o;
// compiler support for incremental collection
if (incr(p) != incr(o)) {
    remset(incr(o)) = remset(incr(o)) ∪ {p.f};
}
p.f = o;
[Figure: new pointer d → v being installed; remset1 = {w}, remset2 = {f,g}]
65
Write barrier
Install new pointer d -> v, then update d-> y
// original program
p.f = o;
// compiler support for incremental collection
if (incr(p) != incr(o)) {
    remset(incr(o)) = remset(incr(o)) ∪ {p.f};
}
p.f = o;
[Figure: the d → v store recorded; remset1 = {w}, remset2 = {f,g,d}]
66
Write barrier
Install new pointer d -> v, then update d-> y
// original program
p.f = o;
// compiler support for incremental collection
if (incr(p) != incr(o)) {
    remset(incr(o)) = remset(incr(o)) ∪ {p.f};
}
p.f = o;
[Figure: the second store (d → y) records d again; remset1 = {w}, remset2 = {f,g,d,d}]
67
Write barrier
At collection time
• collector re-examines all entries in the remset for
the increment, treating them like roots
• Collect Increment 2
[Figure: collecting Increment 2, the entries in remset2 = {f,g,d,d} are treated as additional roots]
68
Summary of the costs of
incremental collection
• write barrier to catch pointer stores crossing
boundaries
• remsets to store crossing pointers
• processing remembered sets at collection time
• excess retention
[Figure: two increments with their remembered sets]
70
Heap Organization
What objects should we put where?
• Generational hypothesis
– young objects die more quickly than older ones [Lieberman &
Hewitt’83, Ungar’84]
– most pointers are from younger to older objects [Appel’89,
Zorn’90]
Organize the heap into young and old spaces; collect young objects
preferentially
[Figure: young space (to-space only) and old space (to- and from-space)]
71
Generational Heap
Organization
• Divide the heap into two spaces: young and old
• Allocate into the young space
• When the young space fills up,
– collect it, copying survivors into the old space
• When the old space fills up,
– collect both spaces, ignoring the remembered sets
– Generalizing to m generations: if generation n < m fills up,
collect generations 1 through n
[Figure: allocation fills the young space; survivors are copied into the old
space; a full-heap collection flips the old space's to- and from-space]
72
Generational
Write Barrier
Unidirectional barrier
– record only older-to-younger pointers
– no need to record younger-to-older pointers, since we
never collect the old space independently
• most pointers are from younger to older objects [Appel’89,
Zorn’90]
• track the boundary address between the young and old spaces
[Figure: barrier address separating the young space from the old space]
86
Generational
Write Barrier
unidirectional boundary barrier
// original program
p.f = o;
// compiler support for generational collection
if (p > barrier && o < barrier) {
    remset_nursery = remset_nursery ∪ {p.f};
}
p.f = o;
[Figure: young space (to-space only) and old space (to- and from-space)]
87
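A model of the boundary test, mirroring the slide's layout assumption (young objects below the barrier address, old objects above); all names and addresses are illustrative:

```java
import java.util.HashSet;
import java.util.Set;

// Sketch of the unidirectional generational barrier (names hypothetical).
class GenBarrier {
    final long barrier;   // young objects live below this address, old above
    final Set<Long> nurseryRemset = new HashSet<>();

    GenBarrier(long barrier) { this.barrier = barrier; }

    // Models the store p.f = o; slotAddr is the address of the field p.f.
    void store(long p, long slotAddr, long o) {
        if (p > barrier && o < barrier)  // old object now points at a young one
            nurseryRemset.add(slotAddr);
        // ... then perform the actual store
    }
}
```

Because young-to-old stores take the no-op path and most mutations are to young objects, the common case is two compares and no remset work.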
Generational
Write Barrier
Unidirectional
record only older to younger pointers
no need to record younger to older pointers, since we
never collect the old space independently
– most pointers are from younger to older objects [Appel’89,
Zorn’90]
– most mutations are to young objects [Stefanovic et al.’99]
[Figure: young space (to-space only) and old space (to- and from-space)]
88
Results
89
Garbage Collection Time
[Chart: GC time (normalized to best, up to ~181×) vs. heap size (relative to min); MarkSweep, SemiSpace, GenMS, GenCopy]
90
Mutator Time
[Chart: mutator time (normalized to best) vs. heap size (relative to min); MarkSweep, SemiSpace, GenMS, GenCopy]
91
Total Time
[Chart: total time (normalized to best) vs. heap size (relative to min); MarkSweep, SemiSpace, GenMS, GenCopy]
92
Recap
• Copying improves locality
• Incrementality improves responsiveness
• Generational hypothesis
– Young objects: Most very short lived
• Infant mortality: ~90% die young (within 4MB of alloc)
– Old objects: most very long lived (bimodal)
• Mature mortality: ~5% die each 4MB of new allocation
• Help from pointer mutations
– In Java, pointers go in both directions, but older to younger
pointers across many objects are rare
• less than 1%
– Most mutations among young objects
• 92 to 98% of pointer mutations
McKinley, UT
380C
• Where are we & where we are going
– Managed languages
• Dynamic compilation
• Inlining
• Garbage collection
– Why you need to care about workloads & methodology
– Alias analysis
– Dependence analysis
– Loop transformations
– EDGE architectures
– Can we get mutator locality, space efficiency, and collector efficiency
all in one collector?
– Read: Blackburn and McKinley, Immix: A Mark-Region Garbage
Collector with Space Efficiency, Fast Collection, and Mutator
Performance, ACM SIGPLAN Conference on Programming Language
Design and Implementation (PLDI), pp. 22–32, Tucson, AZ, June 2008.
94