Towards JIT compiler for IO language

Optimizing dynamic dispatch with fine-grained state tracking
Salikh Zakirov, Shigeru Chiba and Etsuya Shibayama
Tokyo Institute of Technology
Dept. of Mathematical and Computing Sciences
2010-10-18
2
Mixin
• code composition technique
[Diagram: BaseServer, AdditionalSecurity and Server, shown in two panels: "Mixin use declaration" and "Mixin semantics"]
3
Dynamic mixin
• Temporary change in class hierarchy
• Available in Ruby, Python, JavaScript
[Diagram: the Server / BaseServer class hierarchy shown twice, once with AdditionalSecurity and once without]
4
Dynamic mixin (2)
• Powerful technique of dynamic languages
• Enables
▫ dynamic patching
▫ dynamic monitoring
• Can be used to implement
▫ Aspect-oriented programming
▫ Context-oriented programming
• Widely used in Ruby, Python
▫ e.g. Object-Relational Mapping
5
Dynamic mixin in Ruby
• Ruby has dynamic mixin
▫ but only “install”, no “remove” operation
• “remove” can be implemented easily
▫ 23 lines
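A minimal sketch of the "install" half, assuming the class names used on the Target application slide; the method bodies and return strings are invented for illustration. Standard Ruby provides `Module#include` but no built-in inverse, which is the gap this slide points at.

```ruby
class BaseServer
  def process
    "base"                      # placeholder body (assumption)
  end
end

module AdditionalSecurity
  def process
    "checked+" + super          # stand-in for a security check, then delegate
  end
end

class Server < BaseServer
  def process
    super                       # delegate to whatever is next in the ancestor chain
  end
end

server = Server.new
before = server.process             # mixin not installed yet
Server.include(AdditionalSecurity)  # dynamic install at run time
after = server.process              # same object, new dispatch path

p before, after, Server.ancestors.take(3)
```

The existing `server` object picks up the mixin immediately, because dispatch follows the class's current ancestor chain.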
6
Target application
• Mixin is installed and removed frequently
• Application server with dynamic features
class BaseServer
  def process() … end
end

module AdditionalSecurity
  def process()
    …
    # security check
    super # delegate to superclass
  end
end

class Server < BaseServer
  def process()
    if request.isSensitive()
      Server.class_eval {
        include AdditionalSecurity
      }
    end
    super # delegate to superclass
    …
    # remove mixin
  end
end
7
Overhead is high
Reasons
• Invalidation granularity
▫ clearing whole method cache
▫ invalidating all inline caches
→ subsequent calls require full method lookup
• Inline caching saves just 1 target
▫ which changes with mixin operations
▫ even though mixin operations are mostly
repeated
8
Our research problem
• Improve performance of applications that
frequently use dynamic mixins
▫ Make invalidation granularity smaller
▫ Make dynamic dispatch targets cacheable in the
presence of dynamic mixin operations
9
Proposal
• Reduce granularity of inline cache
invalidation
▫ Fine-grained state tracking
• Cache multiple dispatch targets
▫ Polymorphic inline caching
• Enable cache reuse on repeated mixin
installation and removal
▫ Alternate caching
10
Basics: Inline caching
Consider a call site
    cat.speak()
Dynamic dispatch implementation
    method = lookup(cat, "speak")
    method(cat)
Expensive! But the result is mostly the same
Inline caching
    if (cat has type ic.class) {
        ic.method(cat)
    } else {
        ic.method = lookup(cat, "speak")
        ic.class = cat.class
        ic.method(cat)
    }
[Diagram: Animal implements speak() { … }; Cat is a subclass of Animal and cat is an instance of Cat. The call site cat.speak() in the executable code carries an inline cache ic with fields class (Cat) and method (speak, pointing to the method implementation).]
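The pseudocode above can be simulated in plain Ruby. This is only a sketch: `lookup`, `call_site`, `InlineCache` and the `LOOKUPS` counter are hypothetical names, and a real VM keeps the cache inside the compiled call site rather than in a Ruby object.

```ruby
LOOKUPS = Hash.new(0)   # counts full lookups per method name (illustration only)

# Toy full lookup: walk the ancestor chain, which is the expensive part.
def lookup(obj, name)
  LOOKUPS[name] += 1
  owner = obj.class.ancestors.find { |k| k.instance_methods(false).include?(name) }
  owner.instance_method(name)
end

InlineCache = Struct.new(:klass, :meth)

# One call site with its own cache, mirroring the slide's if/else.
def call_site(ic, obj, name)
  unless obj.class.equal?(ic.klass)   # miss: receiver class differs from cached class
    ic.meth  = lookup(obj, name)
    ic.klass = obj.class
  end
  ic.meth.bind(obj).call              # hit and miss both end with the cached target
end

class Animal
  def speak; "..."; end
end
class Cat < Animal; end

ic  = InlineCache.new
cat = Cat.new
3.times { call_site(ic, cat, :speak) }
p LOOKUPS[:speak]
```

Only the first of the three calls performs a full lookup. Note that nothing here notices a later redefinition of `speak`.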
11
Inline caching: problem
• What if the method has been overridden?
Inline caching
    if (cat has type ic.class) {
        ic.method(cat)
    } else {
        ic.method = lookup(cat, "speak")
        ic.class = cat.class
        ic.method(cat)
    }
[Diagram: Animal defines speak() { … } and Training defines its own speak() { … }; cat is an instance of Cat. The ic at call site cat.speak() still holds class Cat and the previously cached method speak.]
12
Inline caching: invalidation
    if (cat has type ic.class && state == ic.state) {
        ic.method(cat)
    } else {
        ic.method = lookup(cat, "speak")
        ic.class = cat.class; ic.state = state
        ic.method(cat)
    }
Single global state object
• too coarse invalidation granularity
[Diagram: Animal and Training each define speak() { … }; cat is an instance of Cat. The global state is shown as 1, then 2. The ic at call site cat.speak() has fields class (Cat), method (speak) and state, with the state also shown as 1, then 2.]
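A small Ruby simulation of the global-state check, showing why it is too coarse. The names `$global_state`, `$lookups`, `lookup` and `call_site` are hypothetical, and the state bump is done by hand here; an interpreter would do it internally.

```ruby
$global_state = 0   # bumped on every change that might affect dispatch
$lookups = 0        # counts full lookups (illustration only)

def lookup(obj, name)
  $lookups += 1
  obj.class.instance_method(name)   # stands in for the expensive ancestor walk
end

InlineCache = Struct.new(:klass, :state, :meth)

def call_site(ic, obj, name)
  unless obj.class.equal?(ic.klass) && ic.state == $global_state
    ic.meth  = lookup(obj, name)
    ic.klass = obj.class
    ic.state = $global_state
  end
  ic.meth.bind(obj).call
end

class Animal
  def speak; "meow"; end
end
class Cat < Animal; end
class Unrelated; end

ic, cat = InlineCache.new, Cat.new
call_site(ic, cat, :speak)               # miss: fills the cache
call_site(ic, cat, :speak)               # hit
Unrelated.class_eval { def noop; end }   # a change elsewhere in the program
$global_state += 1                       # assumption: done by the VM in practice
call_site(ic, cat, :speak)               # miss again, although speak is unchanged
p $lookups
```

The third call repeats the full lookup even though nothing on the path from Cat to `speak` changed.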
13
Fine-grained state tracking
• Many state objects
▫ small invalidation extent
▫ share as much as possible
• One state object for each family of methods
called from the same call site
• State objects associated with lookup path
▫ links updated during method lookups
• Invariant
▫ Any change that may affect method dispatch
must also trigger change of associated state
object
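A sketch of the invariant in Ruby, under a simplifying assumption: one state object per method name, whereas the slides associate state objects with lookup paths. `StateObject`, `STATES`, `redefine` and `call_site` are hypothetical names.

```ruby
StateObject = Struct.new(:value)
# Assumption: one state object per method name (simpler than per lookup path).
STATES  = Hash.new { |h, name| h[name] = StateObject.new(1) }
LOOKUPS = Hash.new(0)

def lookup(obj, name)
  LOOKUPS[name] += 1
  [obj.class.instance_method(name), STATES[name]]
end

# Invariant: any change that may affect dispatch of `name` bumps its state object.
def redefine(klass, name, &body)
  klass.send(:define_method, name, &body)
  STATES[name].value += 1
end

InlineCache = Struct.new(:klass, :state, :pstate, :meth)

def call_site(ic, obj, name)
  unless obj.class.equal?(ic.klass) && ic.pstate.value == ic.state
    ic.meth, ic.pstate = lookup(obj, name)
    ic.klass, ic.state = obj.class, ic.pstate.value
  end
  ic.meth.bind(obj).call
end

class Animal
  def speak; "meow"; end
  def eat;   "yum";  end
end
class Cat < Animal; end

cat = Cat.new
speak_ic, eat_ic = InlineCache.new, InlineCache.new
call_site(speak_ic, cat, :speak)
call_site(eat_ic, cat, :eat)
redefine(Animal, :speak) { "purr" }         # touches only the speak family
result = call_site(speak_ic, cat, :speak)   # miss: its state object changed
call_site(eat_ic, cat, :eat)                # still a hit
p result, LOOKUPS
```

Redefining `speak` invalidates only the call sites tied to that state object; the `eat` call site keeps its cached target.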
14
State object allocation
    if (cat has type ic.class && ic.pstate.state == ic.state) {
        ic.method(cat)
    } else {
        ic.method, ic.pstate = lookup(cat, "speak", ic.pstate)
        ic.class = cat.class; ic.state = ic.pstate.state
        ic.method(cat)
    }
(inline caching code)
[Diagram: Animal implements speak() { *1* }; Cat has an entry for speak but no implementation. The ic at call site cat.speak() has class Cat, method speak *1*, state 1, and pstate pointing to a state object with value 1.]
15
Mixin installation
    if (cat has type ic.class && ic.pstate.state == ic.state) {
        ic.method(cat)
    } else {
        ic.method, ic.pstate = lookup(cat, "speak", ic.pstate)
        ic.class = cat.class; ic.state = ic.pstate.state
        ic.method(cat)
    }
(inline caching code)
[Diagram: Training, with speak() { *2* }, is installed; Animal still implements speak() { *1* }. The state object value changes from 1 to 2. The ic at call site cat.speak() (class Cat) changes its method from speak *1* to speak *2* and its state from 1 to 2.]
16
Mixin removal
    if (cat has type ic.class && ic.pstate.state == ic.state) {
        ic.method(cat)
    } else {
        ic.method, ic.pstate = lookup(cat, "speak", ic.pstate)
        ic.class = cat.class; ic.state = ic.pstate.state
        ic.method(cat)
    }
(inline caching code)
[Diagram: Training, with speak() { *2* }, is removed; Animal implements speak() { *1* }. The state object value changes from 2 to 3. The ic at call site cat.speak() (class Cat) changes its method from speak *2* back to speak *1* and its state from 2 to 3.]
17
Alternate caching
• Detect repetition
• Conflicts detected by state check
• Inline cache contents oscillate
[Diagram: an alternate cache with entries for super (Training, Animal) and method speak; state values 3 and 4 alternate. Animal implements speak() { *1* } and Training implements speak() { *2* }. The ic at call site cat.speak() (class Cat) alternates between method speak *1* and speak *2*.]
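A toy model of the idea, not the implementation from the talk: assume the state value used for each mixin configuration is remembered, so that reinstalling or removing the mixin restores an earlier value instead of allocating a new one. `Family` and `toggle_mixin` are hypothetical names, and the conflict detection mentioned above is omitted.

```ruby
# Toy model of one method family whose target depends on whether a mixin is installed.
class Family
  attr_reader :state

  def initialize
    @next_value = 1
    @alternate  = {}        # configuration (installed?) => state value used last time
    @installed  = false
    @state      = fresh
    @alternate[@installed] = @state
  end

  # Install or remove the mixin; reuse the remembered state value if there is one.
  def toggle_mixin
    @installed = !@installed
    @state = (@alternate[@installed] ||= fresh)
  end

  private

  # Allocate a state value that has never been used before.
  def fresh
    value = @next_value
    @next_value += 1
    value
  end
end

family  = Family.new
history = [family.state]
4.times { history << family.toggle_mixin }
p history   # the state value alternates between two numbers
```

Because the state value returns to one seen before, a cache entry recorded under that value passes the state check again. A single-entry inline cache still flips between the two targets, which is the oscillation the slide mentions.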
18
Polymorphic caching
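• Use multiple entries in inline cache
[Diagram: an alternate cache with entries for super (Training, Animal) and method speak, with state values 3 and 4. Animal implements speak() { *1* } and Training implements speak() { *2* }. The ic at call site cat.speak() (class Cat) holds both method entries, *1* and *2*, with state values 3 and 4.]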
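A sketch of a polymorphic inline cache in Ruby, assuming entries are keyed by receiver class and state value. The class name, the capacity of 4 and the symbols standing in for the targets *1* and *2* are all illustrative choices.

```ruby
# Polymorphic inline cache: several (class, state) => target entries at one call site.
class PolymorphicInlineCache
  attr_reader :misses

  def initialize(capacity = 4)   # capacity is an arbitrary choice for this sketch
    @entries  = {}
    @capacity = capacity
    @misses   = 0
  end

  # Returns the cached target; runs the block (the full lookup) only on a miss.
  def fetch(klass, state)
    key = [klass, state]
    return @entries[key] if @entries.key?(key)
    @misses += 1
    @entries.shift if @entries.size >= @capacity   # evict the oldest entry
    @entries[key] = yield
  end
end

pic     = PolymorphicInlineCache.new
targets = { 3 => :speak_1, 4 => :speak_2 }   # state value => target (stand-ins)
calls = [3, 4, 3, 4, 3, 4].map do |state|
  pic.fetch(:Cat, state) { targets.fetch(state) }
end
p calls, pic.misses
```

With both entries resident, repeated installation and removal costs two misses in total instead of one miss per switch.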
19
State object merge
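    while(true) {
        cat.speak()
    }
• One-time invalidation
[Diagram: two call sites in the executable code, animal.speak() on instance animal and cat.speak() on instance cat. Animal implements speak() { *1* }, overridden by Training's speak() { *2* }; Cat has an entry for speak. The mixin is removed. Labels S and Q appear on the lookup path.]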
20
Overheads of proposed scheme
• Increased memory use
▫ 1 state object per polymorphic method family
▫ additional method entries
▫ alternate cache
▫ polymorphic inline cache entries
• Some operations become slower
▫ Lookup needs to track and update state objects
▫ Explicit state object checks on method dispatch
21
Generalizations (beyond Ruby)
• Delegation object model
▫ track arbitrary delegation pointer change
• Thread-local delegation
▫ allow for thread-local modification of
delegation pointer
▫ by having thread-local state object values
• Details in the article…
22
Evaluation
• Implementation based on Ruby 1.9.2
• Hardware
▫ Intel Core i7 860 2.8 GHz
23
Evaluation: microbenchmarks
• Single method call overhead
▫ Inline cache hit
▪ state checks: 1%
▪ polymorphic inline caching: 49% overhead
▫ Full lookup
▪ 2x slowdown
24
Dynamic mixin-heavy microbenchmark
Normalized execution time (smaller is better)
  base                        100%
  method cache state checks    23%
  fgst                         17%
  fgst+PIC+altern              15%
25
Evaluation: application
• Application server with dynamic mixin on
each request
Normalized execution time (smaller is better)
  baseline                    100%
  method cache state checks    70%
  fgst                         58%
  fgst + PIC                   60%
  fgst + PIC + altern          52%
26
Evaluation
• Fine-grained state tracking considerably
reduces overhead
• Alternate caching brings only small
improvement
▫ Number of call sites affected by mixin is low
▫ Ratio of lookup cost to inline cache hit cost is low
▪ about 1.6x on Ruby
27
Related work
• Dependency tracking in Self
▫ focused on reducing recompilation, rather
than reducing method lookups
• Inline caching for Objective-C
▫ state object associated with method, no
dynamic mixin support
28
Conclusion
• We proposed a combination of techniques
▫ Fine-grained state tracking
▫ Alternate caching
▫ Polymorphic inline caching
• To increase the efficiency of inline caching
▫ with frequent dynamic mixin installation
and removal
29
Thank you for your attention
30
Method caching in Ruby
• Global hashtable
▫ indexed by method name and class
• On method lookup
▫ gives answer in 1 hash lookup
• On miss
▫ answer obtained by recursive lookup
▫ result stored in method cache
• On method redefinition or mixin operation
▫ method cache cleared completely
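The behaviour listed above can be modelled in a few lines of Ruby. This is a toy model, not the interpreter's code: `GlobalMethodCache`, `find`, `clear` and `slow_lookup` are hypothetical names, and the recursive walk ignores modules for brevity.

```ruby
# Toy model of a global method cache keyed by [class, method name].
class GlobalMethodCache
  attr_reader :full_lookups

  def initialize
    @table = {}
    @full_lookups = 0
  end

  # One hash lookup on a hit; recursive lookup plus a store on a miss.
  def find(klass, name)
    @table.fetch([klass, name]) do |key|
      @full_lookups += 1
      @table[key] = slow_lookup(klass, name)
    end
  end

  # Called on any method redefinition or mixin operation.
  def clear
    @table.clear
  end

  private

  # Recursive walk up the superclass chain (modules ignored for brevity).
  def slow_lookup(klass, name)
    return nil if klass.nil?
    return klass.instance_method(name) if klass.instance_methods(false).include?(name)
    slow_lookup(klass.superclass, name)
  end
end

class Animal
  def speak; "meow"; end
end
class Cat < Animal; end

cache = GlobalMethodCache.new
2.times { cache.find(Cat, :speak) }   # the second call is a single hash lookup
cache.clear                           # e.g. after a mixin operation anywhere
cache.find(Cat, :speak)               # full lookup again
p cache.full_lookups
```

Clearing the whole table is simple but discards entries that the change could not have affected.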