380C
Lecture 17
• Where are we & where we are going
– Managed languages
• Dynamic compilation
• Inlining
• Garbage collection
– Why you need to care about workloads &
experimental methodology
– Alias analysis
– Dependence analysis
– Loop transformations
– EDGE architectures
1
Today
• Garbage Collection
– Why use garbage collection?
– What is garbage?
• Reachable vs live, stack maps, etc.
– Allocators and their collection mechanisms
• Semispace
• Marksweep
• Performance comparisons
– Incremental age based collection
• Write barriers: Friend or foe?
• Generational
• Beltway
– More performance
2
Basic VM Structure
[Figure: VM block diagram. The program/bytecode enters through the class loader (verifier, etc.) and the dynamic compilation subsystem; the executing program runs against the heap, the thread scheduler, and the garbage collector.]
3
True or False?
• Real programmers use languages with explicit memory management.
– I can optimize my memory management much better than any garbage collector
– Scope of effort?
5
Why Use Garbage Collection?
• Software engineering benefits
– Less user code compared to explicit memory management (MM)
– Less user code to get correct
– Protects against some classes of memory errors
• No free(), thus no premature free(), no double free(), no forgetting to free()
• Not perfect, memory can still leak
– Programmers still need to eliminate all pointers to objects the program no longer needs
• Performance: space-time tradeoff
– Time proportional to dead objects (explicit MM, reference counting) or live objects (semispace, marksweep)
– Throughput versus pause time
• Less frequent collection typically reduces total time but increases space requirements and pause times
– Hidden locality benefits?
6
What is Garbage?
• In theory, any object the program will never reference again
– But the compiler & runtime system cannot figure that out
• In practice, any object the program cannot reach is garbage
– Approximate liveness with reachability
• Managed languages couple GC with “safe” pointers
– Programs may not access arbitrary addresses in memory
– The compiler can identify and provide all the pointers to the garbage collector, thus
– “Once garbage, always garbage”
– The runtime system can move objects by updating pointers
– “Unsafe” languages can do non-moving GC by assuming anything that looks like a pointer is one
7
Reachability
• Compiler produces a stack-map at GC safe-points and Type Information Blocks
• GC safe points: new(), method entry, method exit, & backedges (thread switch points)
• Stack-map: enumerate global variables, stack variables, live registers -- This code is hard to get right! Why?
• Type Information Blocks: identify reference fields in objects
[Figure: roots (globals, stack slots, registers, and the PC at a safe point in code such as "r0 = obj; p.f = obj") point at heap objects A, B, C.]
8
Reachability
• Compiler produces a stack-map at GC safe-points and Type Information Blocks
• Type Information Blocks: for each type i (class) in the program, a map TIBi identifying the reference fields in objects of that type (see the sketch below)
[Figure: the same roots/heap picture; TIBi marks which field indices (here 0, 2, 3) hold references.]
9
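A minimal sketch of the information these maps carry. The class and field names below are illustrative assumptions, not Jikes RVM's actual data structures; real VMs pack this far more compactly.

// Hypothetical representations of a stack map and a type information block.
final class StackMapEntry {
    final int safePointPC;        // the GC safe point this map describes
    final int[] refStackSlots;    // frame slots holding references at that PC
    final int[] refRegisters;     // registers holding references at that PC
    StackMapEntry(int pc, int[] slots, int[] regs) {
        this.safePointPC = pc;
        this.refStackSlots = slots;
        this.refRegisters = regs;
    }
}

final class TypeInformationBlock {
    final int[] refFieldOffsets;  // offsets of the reference fields in objects of this type
    TypeInformationBlock(int[] refFieldOffsets) {
        this.refFieldOffsets = refFieldOffsets;
    }
}

At a safe point the collector looks up the map for the current PC to enumerate the root slots, then uses each object's TIB to find which of its fields to trace.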
Reachability
• Tracing collector (semispace, marksweep)
– Marks the objects reachable from the roots live, and then performs a transitive closure over them (sketch below)
• All unmarked objects are dead, and can be reclaimed
[Figure (animation): the mark phase traces from the roots (globals, stack, registers) through heap objects A, B, C; the sweep then reclaims the unmarked objects.]
14
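A compact sketch of that transitive closure. HeapObject and its methods are assumptions for illustration, not a real VM API.

import java.util.ArrayDeque;
import java.util.Deque;

// Illustrative mark phase: push the roots, then trace until the worklist is empty.
final class Tracer {
    void markFromRoots(Iterable<HeapObject> roots) {
        Deque<HeapObject> worklist = new ArrayDeque<>();
        for (HeapObject root : roots) {
            if (root != null && !root.isMarked()) {
                root.setMarked();
                worklist.push(root);
            }
        }
        while (!worklist.isEmpty()) {
            HeapObject obj = worklist.pop();
            // the object's TIB tells us which fields hold references
            for (HeapObject child : obj.referenceFields()) {
                if (child != null && !child.isMarked()) {
                    child.setMarked();
                    worklist.push(child);
                }
            }
        }
    }
}

// Minimal object interface assumed by the sketch.
interface HeapObject {
    boolean isMarked();
    void setMarked();
    Iterable<HeapObject> referenceFields();
}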
Today
• Garbage Collection
– Why use garbage collection?
– What is garbage?
• Reachable vs live, stack maps, etc.
– Allocators and their collection mechanisms
• Semispace
• Marksweep
• Performance comparisons
– Incremental age based collection
• Write barriers: Friend or foe?
• Generational
• Beltway
– More performance
15
Semispace
• Fast bump pointer allocation (sketch below)
• Requires copying collection
• Cannot incrementally reclaim memory, must free en masse
• Reserves 1/2 the heap to copy into, in case all objects are live
[Figure (animation): the heap is split into to-space and from-space; allocation bumps a pointer through one half while the other half is held in reserve.]
19
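A minimal bump-pointer allocator sketch over a contiguous region addressed by integer offsets; the names and the -1 "please collect" convention are illustrative assumptions.

// Illustrative bump-pointer allocation over a contiguous region [start, end).
final class BumpPointerSpace {
    private long cursor;        // next free address
    private final long limit;   // end of the space

    BumpPointerSpace(long start, long end) {
        this.cursor = start;
        this.limit = end;
    }

    // Returns the address of a new object of size bytes, or -1 to request a collection.
    long allocate(int size) {
        long result = cursor;
        if (result + size > limit) {
            return -1;          // space is full: time to collect
        }
        cursor = result + size;
        return result;
    }
}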
Semispace
• Mark phase (sketch below):
– copies each object when the collector first encounters it
– installs forwarding pointers
– performs the transitive closure, updating pointers as it goes
– reclaims “from space” en masse
– starts allocating again into “to space”
[Figure (animation): live objects are copied from from-space into to-space, leaving forwarding pointers behind; after the closure, from-space is reclaimed wholesale and the spaces flip.]
26
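A sketch of the copy-and-forward step at the heart of that closure. CopyableObject and its forwarding-pointer helpers are assumptions, and the to-space allocator reuses the bump-pointer sketch above.

// Illustrative semispace copy: each object is copied at most once; a forwarding
// pointer left in the old copy redirects every later reference to the new copy.
final class SemispaceCopier {
    private final BumpPointerSpace toSpace;   // the reserved half we copy into

    SemispaceCopier(BumpPointerSpace toSpace) {
        this.toSpace = toSpace;
    }

    long forward(CopyableObject obj) {
        if (obj.isForwarded()) {
            return obj.forwardingAddress();   // already copied: reuse the new address
        }
        long newAddr = toSpace.allocate(obj.sizeInBytes());  // half the heap is reserved, so this succeeds
        obj.copyTo(newAddr);                  // copy the bytes into to-space
        obj.setForwardingAddress(newAddr);    // so other pointers to obj get updated too
        return newAddr;
    }
}

interface CopyableObject {
    int sizeInBytes();
    boolean isForwarded();
    long forwardingAddress();
    void setForwardingAddress(long addr);
    void copyTo(long addr);
}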
Semispace
• Notice:
+ fast allocation
+ locality of contemporaneously allocated objects
+ locality of objects connected by pointers
- wasted space
27
Marksweep
• Free-lists organized by size
– blocks of the same size, or
– individual objects of the same size
• Most objects are small (< 128 bytes)
[Figure: segregated free lists for size classes 4, 8, 12, 16, ..., 128, ... bytes point at free cells in the heap.]
28
Marksweep
• Allocation (sketch below)
– Grab a free object off the free list
– No more memory of the right size triggers a collection
– Mark phase - find the live objects
– Sweep phase - put free ones on the free list
[Figure (animation): allocation pops the first cell from the free list of the requested size class.]
32
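A sketch of segregated free-list allocation; the size classes and the -1 "please collect" convention are illustrative assumptions.

import java.util.ArrayDeque;
import java.util.Deque;

// Illustrative segregated free-list allocator: one list of free cell addresses per size class.
final class FreeListSpace {
    private static final int[] SIZE_CLASSES = {4, 8, 12, 16, 32, 64, 128};
    private final Deque<Long>[] freeLists;

    @SuppressWarnings("unchecked")
    FreeListSpace() {
        freeLists = new Deque[SIZE_CLASSES.length];
        for (int i = 0; i < freeLists.length; i++) {
            freeLists[i] = new ArrayDeque<>();
        }
    }

    // Returns a free cell big enough for size bytes, or -1 to request a collection.
    long allocate(int size) {
        int c = sizeClassFor(size);
        if (c < 0 || freeLists[c].isEmpty()) {
            return -1;                       // no cell of the right size: collect, then retry
        }
        return freeLists[c].pop();
    }

    // Called by the sweep phase to put a dead cell back on its free list.
    void free(long addr, int size) {
        int c = sizeClassFor(size);
        if (c >= 0) {
            freeLists[c].push(addr);
        }
    }

    private int sizeClassFor(int size) {
        for (int i = 0; i < SIZE_CLASSES.length; i++) {
            if (size <= SIZE_CLASSES[i]) {
                return i;
            }
        }
        return -1;                           // larger objects would use a separate large-object space
    }
}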
Marksweep
• Mark phase
– Transitive closure marking all the live objects
• Sweep phase (sketch below)
– sweeps the memory for free objects, populating the free lists
– can be made incremental by organizing the heap in blocks and sweeping one block at a time on demand
[Figure (animation): marked objects stay in place; unmarked cells are threaded back onto the free lists.]
36
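A sketch of sweeping one block of equal-sized cells. MarkBits is an assumed interface over the collector's mark state, and FreeListSpace is the free-list sketch above.

// Illustrative sweep of one block of equal-sized cells: unmarked cells return to the
// free list, marked cells have their mark bit cleared for the next collection.
final class Sweeper {
    void sweepBlock(long blockStart, long blockEnd, int cellSize,
                    FreeListSpace space, MarkBits marks) {
        for (long cell = blockStart; cell + cellSize <= blockEnd; cell += cellSize) {
            if (marks.isMarked(cell)) {
                marks.clear(cell);           // live: keep it, reset the mark for next time
            } else {
                space.free(cell, cellSize);  // dead: back onto its free list
            }
        }
    }
}

interface MarkBits {
    boolean isMarked(long cellAddress);
    void clear(long cellAddress);
}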
Marksweep
+ space efficiency
+ incremental object reclamation
- relatively slower allocation time
- poor locality of contemporaneously allocated objects
37
How do these differences play out in practice?
Marksweep
+ space efficiency
+ incremental object reclamation
- relatively slower allocation time
- poor locality of contemporaneously allocated objects
Semispace
+ fast allocation
+ locality of contemporaneously allocated objects
+ locality of objects connected by pointers
- wasted space
38
Methodology
[SIGMETRICS 2004]
• Compare Marksweep (MS) and Semispace (SS)
• Mutator time, GC time, total time
• Jikes RVM & MMTk
• replay compilation
• measure second iteration without compilation
• Platforms
• 1.6GHz G5 (PowerPC 970)
• 1.9GHz AMD Athlon 2600+
• 2.6GHz Intel P4
• Linux 2.6.0 with perfctr patch & libraries
– Separate accounting of GC & Mutator counts
• SPECjvm98 & pseudojbb
39
Allocation Mechanism
• Bump pointer
– ~70 bytes of IA32 instructions, 726 MB/s
• Free list
– ~140 bytes of IA32 instructions, 654 MB/s
• Bump pointer is 11% faster in a tight loop
– < 1% difference in a practical setting
– No significant difference (?)
40
Mutator Time
41
jess
[Chart: jess mutator time, MarkSweep vs SemiSpace, normalized mutator time vs normalized heap size (1 to 6).]
42
jess
[Chart: jess L1 misses, MarkSweep vs SemiSpace, normalized L1 misses vs normalized heap size.]
43
jess
[Chart: jess L2 misses, MarkSweep vs SemiSpace, normalized L2 misses vs normalized heap size.]
44
jess
[Chart: jess TLB misses, MarkSweep vs SemiSpace, normalized TLB misses vs normalized heap size.]
45
javac
[Charts: javac mutator time, L1 misses, L2 misses, and TLB misses, MarkSweep vs SemiSpace, each normalized and plotted against normalized heap size (1 to 6).]
46
pseudojbb
[Charts: pseudojbb (jbb) mutator time, L1 misses, L2 misses, and TLB misses, MarkSweep vs SemiSpace, each normalized and plotted against normalized heap size (1 to 6).]
47
Geometric Mean Mutator Time
[Chart: geometric mean mutator time (normalized to best), MarkSweep vs SemiSpace, vs heap size relative to minimum (1 to 6).]
48
Garbage Collection Time
49
Garbage Collection Time
[Charts: GC time (normalized to best) for jess, javac, pseudojbb, and the geometric mean, MarkSweep vs SemiSpace, vs heap size relative to minimum (1 to 6).]
50
Total Time
51
Total Time
[Charts: total time (normalized to best) for jess, javac, pseudojbb, and the geometric mean, MarkSweep vs SemiSpace, vs heap size relative to minimum (1 to 6).]
52
MS/SS Crossover: 1.6GHz PPC
[Chart: normalized total time vs heap size relative to minimum, 1.6GHz PPC SemiSpace vs 1.6GHz PPC MarkSweep.]
53
MS/SS Crossover: 1.9GHz AMD
[Chart: adds the 1.9GHz AMD SemiSpace and MarkSweep curves to the PPC curves.]
54
MS/SS Crossover: 2.6GHz P4
[Chart: adds the 2.6GHz P4 SemiSpace and MarkSweep curves.]
55
MS/SS Crossover: 3.2GHz P4
[Chart: adds the 3.2GHz P4 SemiSpace and MarkSweep curves.]
56
MS/SS Crossover
[Chart: normalized total time vs heap size relative to minimum for all four machines (1.6GHz PPC, 1.9GHz AMD, 2.6GHz P4, 3.2GHz P4), SemiSpace vs MarkSweep. MarkSweep's space efficiency wins in small heaps, SemiSpace's locality wins in large heaps, and the crossover point differs by machine.]
57
Today
• Garbage Collection
– Why use garbage collection?
– What is garbage?
• Reachable vs live, stack maps, etc.
– Allocators and their collection mechanisms
• Semispace
• Marksweep
• Performance comparisons
– Incremental age based collection
• Enabling mechanisms
– write barrier & remembered sets
• Heap organizations
– Generational
– Beltway
– Performance comparisons
58
One Big Heap?
Pause times
– it takes too long to trace the whole heap at once
Throughput
– the heap contains lots of long-lived objects; why collect them over and over again?
Incremental collection
– divide the heap into increments and collect one at a time
[Figure: the heap divided into Increment 1 and Increment 2, each with its own to-space and from-space.]
59
Incremental Collection
Ideally
• perfect knowledge of the live pointers between increments
• but that requires scanning the whole heap, which defeats the purpose
Mechanism: Write barrier
• records pointers between increments when the mutator installs them, a conservative approximation of reachability
[Figure (animation): pointers that cross from one increment into another are the ones an independent collection of a single increment would otherwise miss.]
63
Write barrier
The compiler inserts code that records pointers between increments when the mutator installs them (a concrete sketch follows below):

// original program
p.f = o;

// compiler support for incremental collection
if (incr(p) != incr(o)) {
    remset(incr(o)) = remset(incr(o)) U { p.f };
}
p.f = o;

[Figure: Increment 1 holds objects a through g; Increment 2 holds objects t through z; remset1 = {w}, remset2 = {f, g}.]
64
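A Java-flavored sketch of the same barrier, assuming a hypothetical incrementOf mapping. The per-increment buffers are unfiltered, so duplicate entries can appear, exactly as in the example that follows.

import java.util.ArrayList;
import java.util.List;

// Illustrative remembered sets: one buffer of "source" objects per heap increment.
final class RememberedSets {
    private final List<List<Object>> remsets = new ArrayList<>();

    RememberedSets(int increments) {
        for (int i = 0; i < increments; i++) {
            remsets.add(new ArrayList<>());
        }
    }

    // Write barrier: called on every p.f = o that the compiler marks as possibly crossing.
    void recordIfCrossing(Object p, Object o, HeapLayout layout) {
        int srcIncr = layout.incrementOf(p);
        int dstIncr = layout.incrementOf(o);
        if (srcIncr != dstIncr) {
            remsets.get(dstIncr).add(p);   // remember the source so its field is re-examined at collection time
        }
    }

    List<Object> remsetFor(int increment) {
        return remsets.get(increment);
    }
}

// Hypothetical mapping from an object to the increment that holds it.
interface HeapLayout {
    int incrementOf(Object obj);
}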
Write barrier
Install the new pointer d -> v, then update it to d -> y
• installing d -> v crosses from Increment 1 to Increment 2, so the barrier adds d to remset2: {f, g, d}
• updating the field to d -> y crosses again, so d is recorded a second time: remset2 = {f, g, d, d}
[Figure: the same increments as before; remset1 stays {w}.]
67
Write barrier
At collection time
• the collector re-examines all entries in the remset for the increment being collected, treating them like roots
• Collect Increment 2: trace from the usual roots plus remset2 = {f, g, d, d}
[Figure: Increment 2's live objects are copied to its to-space; Increment 1 is not traced.]
69
Summary of the costs of incremental collection
• the write barrier to catch pointer stores that cross increment boundaries
• remsets to store the crossing pointers
• processing the remembered sets at collection time
• excess retention: remset entries are treated as roots, so some unreachable objects survive a collection
70
Heap Organization
What objects should we put where?
• Generational hypothesis
– young objects die more quickly than older ones [Lieberman & Hewitt’83, Ungar’84]
– most pointers are from younger to older objects [Appel’89, Zorn’90]
⇒ Organize the heap into young and old spaces, and collect young objects preferentially
[Figure: the heap split into a Young space and an Old space (with to-space and from-space halves).]
71
Generational Heap Organization
• Divide the heap into two spaces: young and old (policy sketch below)
• Allocate into the young space
• When the young space fills up
– collect it, copying the survivors into the old space
• When the old space fills up
– collect both spaces, ignoring the remembered sets
• Generalizing to m generations
– if space n < m fills up, collect spaces 1 through n
[Figure (animation): allocation fills the Young space; each young collection copies survivors into the Old space; when the Old space fills, both spaces are collected and the old space flips between its to-space and from-space.]
85
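A sketch of that policy as a skeleton. allocateYoung, collectYoung, and collectFull are placeholders for the mechanisms above (bump-pointer allocation, copying survivors, a full trace), not any real collector's API.

// Illustrative generational policy: try the nursery first; a failed allocation
// triggers a nursery collection, and only then a full-heap collection.
abstract class GenerationalPolicy {
    long allocate(int size) {
        long addr = allocateYoung(size);
        if (addr >= 0) {
            return addr;
        }
        collectYoung();                 // copy young survivors into the old space
        addr = allocateYoung(size);
        if (addr >= 0) {
            return addr;
        }
        collectFull();                  // the old space is full too: collect both spaces
        return allocateYoung(size);     // may still fail if the heap is genuinely exhausted
    }

    abstract long allocateYoung(int size);  // e.g., the bump-pointer sketch above
    abstract void collectYoung();           // traces from the roots plus the old-to-young remembered set
    abstract void collectFull();            // traces the whole heap, ignoring the remembered sets
}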
Generational Write Barrier
Unidirectional barrier
– record only older-to-younger pointers
– no need to record younger-to-older pointers, since we never collect the old space independently
• most pointers are from younger to older objects [Appel’89, Zorn’90]
• track the address boundary (the “barrier”) between the young and old spaces
[Figure: an address boundary separates the Young space from the Old space (to-space and from-space).]
86
Generational Write Barrier
Unidirectional boundary barrier:

// original program
p.f = o;

// compiler support for incremental collection
if (p > barrier && o < barrier) {
    remset_nursery = remset_nursery U { p.f };
}
p.f = o;

[Figure: the test compares addresses against the young/old boundary: a store from p above the barrier (old) to o below it (young) is recorded.]
87
Generational Write Barrier
Unidirectional
– record only older-to-younger pointers
– no need to record younger-to-older pointers, since we never collect the old space independently
• most pointers are from younger to older objects [Appel’89, Zorn’90]
• most mutations are to young objects [Stefanovic et al.’99]
88
Results
89
Garbage Collection Time
[Chart: GC time (normalized to best) vs heap size relative to minimum (1 to 6) for MarkSweep, SemiSpace, GenMS, and GenCopy.]
90
Mutator Time
[Chart: mutator time (normalized to best) vs heap size relative to minimum (1 to 6) for MarkSweep, SemiSpace, GenMS, and GenCopy.]
91
Total Time
[Chart: total time (normalized to best) vs heap size relative to minimum (1 to 6) for MarkSweep, SemiSpace, GenMS, and GenCopy.]
92
Recap
• Copying improves locality
• Incrementality improves responsiveness
• Generational hypothesis
– Young objects: most are very short lived
• Infant mortality: ~90% die young (within 4MB of allocation)
– Old objects: most are very long lived (bimodal)
• Mature mortality: ~5% die per 4MB of new allocation
• Help from pointer mutations
– In Java, pointers go in both directions, but older-to-younger pointers are rare
• less than 1%
– Most mutations are among young objects
• 92 to 98% of pointer mutations
380C
• Where are we & where we are going
– Managed languages
• Dynamic compilation
• Inlining
• Garbage collection
– Can we get mutator locality, space efficiency, and collector efficiency all in one collector?
– Read: Blackburn and McKinley, Immix: A Mark-Region Garbage Collector with Space Efficiency, Fast Collection, and Mutator Performance, ACM SIGPLAN Conference on Programming Language Design and Implementation (PLDI), pp. 22-32, Tucson AZ, June 2008.
– Why you need to care about workloads & methodology
– Alias analysis
– Dependence analysis
– Loop transformations
– EDGE architectures
94