Dataflow Frequency Analysis
based on Whole Program Paths
Bernhard Scholz
Eduard Mehofer
Institute of Computer Languages
Vienna University of Technology
Institute for Software Science
University of Vienna
[email protected]
www.complang.tuwien.ac.at/scholz
[email protected]
www.par.univie.ac.at/~mehofer
Dataflow Frequency Analysis
Goal
– accurately computing frequencies of data flow facts
Problem:
– high costs for computing accurate frequencies
• requires whole program path
• efficient data structures and algorithms?
Approach:
– exploiting algebraic properties of bi-distributive DFA problems
– employing WPPs to capture control flow
– computing frequencies in a bottom-up style on the WPP graph
Page 2
Outline
Motivation
WPP profiling
Properties of bi-distributive DFAs
Algorithm
Experiments
Conclusion
Page 3
Classical Approach
Classical Program Optimization:
[Figure: Program → optimizer (data flow analysis → binary information → transformation) → Optimized program]
Drawback:
– binary information cannot tell heavily, rarely, and never executed code apart
Page 4
Profiling Approach
Probabilistic Program Optimization:
Optimizer based on profiling
[Figure: Program and Profile → optimizer (dataflow freq. analysis → frequency information → transformation) → Optimized program]
Advantage:
– frequency information tells heavily, rarely, and never executed code apart
Page 5
Running Example
– simple code fragment
– 8 times left branch
– terminates via right branch
[Figure: CFG Example with nodes s, 1, 2 (d1: x:=...), 3 (d2: x:=...), 4, 5 and a use ...x...]
Reaching definitions problem
– two definitions: d1, d2
– d1 kills d2 and vice versa
– use of x at the end of loop
Questions
– How often does d1 hold at node 5?
– How often does d2 hold at node 5?
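The two questions can be answered by brute force for this small run. The following is a minimal Python sketch, not the authors' implementation: it assumes that node 2 holds d1, node 3 holds d2, and that the run visits 1,2,4 eight times before leaving via 1,3,4,5. The names `GEN`, `run_trace`, and `reaching_frequencies` are hypothetical.

```python
from collections import Counter

# Assumed CFG semantics: node 2 contains d1, node 3 contains d2,
# every other node leaves the definitions of x untouched.
GEN = {2: "d1", 3: "d2"}

def run_trace(left_iterations=8):
    """Node sequence: `left_iterations` times 1,2,4, then 1,3,4,5."""
    trace = ["s"]
    for _ in range(left_iterations):
        trace += [1, 2, 4]
    trace += [1, 3, 4, 5]
    return trace

def reaching_frequencies(trace):
    """Count, per node, how often each definition reaches the node entry."""
    freq = {}
    reaching = set()              # no definition of x reaches the start node
    for node in trace:
        freq.setdefault(node, Counter()).update(reaching)
        if node in GEN:           # a definition of x kills the other one
            reaching = {GEN[node]}
    return freq

freq = reaching_frequencies(run_trace())
print("node 5:", dict(freq[5]))
print("node 4:", dict(freq[4]))
```

Under these assumptions d1 never holds at node 5 and d2 holds once. Walking the full trace like this is exactly the cost the approach on the following slides tries to avoid.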
Page 6
WPP Profiling
Captures the whole program path
– Larus at PLDI’99
Path profiling techniques for acyclic paths
– minimal insertion of instrumentation code
– keeps executable fast
Sequitur for compression
– builds a grammar
– terminals are acyclic paths
– nonterminals have only one production
– graph representation of grammar
– grammar generates only one sentence
– best case: logarithmic size reduction
Page 7
WPP Example
Program Run
– 8x left branch
– 1x right branch
WPP Graph & Grammar
S → a A A A b c
A → b b
[Figure: CFG Example (nodes s, 1, 2, 3, 4, 5) next to the WPP graph over S, A, a, b, c]
Terminals:
a: [s,1,2,4]
b: [1,2,4]
c: [1,3,4,5]
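Expanding the grammar recovers the whole program path. A small Python sketch of that expansion, using the grammar S → a A A A b c, A → b b and the terminals listed above; the dictionary layout and the function name `expand` are illustrative choices, not part of the slides.

```python
# Grammar and terminals of the WPP example.
GRAMMAR = {"S": ["a", "A", "A", "A", "b", "c"], "A": ["b", "b"]}
TERMINALS = {"a": ["s", 1, 2, 4], "b": [1, 2, 4], "c": [1, 3, 4, 5]}

def expand(symbol):
    """Recursively expand a grammar symbol into its node sequence."""
    if symbol in TERMINALS:
        return list(TERMINALS[symbol])
    nodes = []
    for child in GRAMMAR[symbol]:
        nodes.extend(expand(child))
    return nodes

wpp = expand("S")
print(wpp)
print("left branches:", wpp.count(2), "right branches:", wpp.count(3))
```

Node 2 (left branch) occurs 8 times and node 3 (right branch) once, matching the program run.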
Page 8
Bi-Distributive Dataflow Problems
Properties
– finite lattice $2^D$ (power set of dataflow facts $D$)
– transition functions are monotone
– transition functions distribute over both union and intersection
$f(X \cup Y) = f(X) \cup f(Y)$
$f(X \cap Y) = f(X) \cap f(Y)$
– representation relation
– representation relation
– covers bit-vector problems
Due to properties
– transition functions represented as 0/1-matrices
– states represented as 0/1-vectors
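The distributivity property can be checked exhaustively on a small fact set. This is a hedged Python sketch: it assumes the usual gen/kill form of bit-vector transition functions, and the helper names (`make_gen_kill`, `is_bi_distributive`, `needs_both`) are made up for illustration.

```python
from itertools import combinations

D = ("d1", "d2")

def powerset(items):
    """All subsets of `items` as frozensets."""
    return [frozenset(c) for n in range(len(items) + 1)
            for c in combinations(items, n)]

def make_gen_kill(gen, kill):
    """Bit-vector style transition function f(X) = gen | (X - kill)."""
    gen, kill = frozenset(gen), frozenset(kill)
    return lambda facts: gen | (frozenset(facts) - kill)

def is_bi_distributive(f, items):
    """Check f over union and intersection for every pair of subsets."""
    subsets = powerset(items)
    return all(f(x | y) == f(x) | f(y) and f(x & y) == f(x) & f(y)
               for x in subsets for y in subsets)

# Definition d1 of x generates d1 and kills d2.
f_d1 = make_gen_kill(gen={"d1"}, kill={"d2"})

# Counter-example: yields d1 only if both facts hold on entry.
def needs_both(facts):
    return frozenset({"d1"}) if {"d1", "d2"} <= facts else frozenset()

print(is_bi_distributive(f_d1, D))        # gen/kill function
print(is_bi_distributive(needs_both, D))  # not distributive over union
```

The gen/kill function passes both checks, while `needs_both` fails because its result for {d1, d2} cannot be assembled from its results for {d1} and {d2}.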
Page 9
Representation Relation
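Transition function $f: 2^D \to 2^D$
– represented by $f_r: D \cup \{\Lambda\} \to 2^{D \cup \{\Lambda\}}$
$f_r(\Lambda) = f(\emptyset) \cup \{\Lambda\}$
$f_r(d) = f(\{d\}) \setminus f(\emptyset)$
– $\Lambda$: artificial data fact
Example
[Figure: path 1 → 2 (d1: x:=...) → 4]
$M(2 \to 4)_r$:
$d_1 \mapsto \{\}$
$d_2 \mapsto \{\}$
$\Lambda \mapsto \{d_1, \Lambda\}$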
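A short Python sketch of the representation relation, assuming the construction reads: the artificial fact maps to f(∅) plus itself, and every real fact d maps to f({d}) without f(∅). The symbol name Λ and the function names are choices made for this sketch.

```python
LAMBDA = "Λ"  # name chosen here for the artificial data fact

def representation(f, facts):
    """Return f_r as a dict: Λ -> f({}) plus Λ, d -> f({d}) minus f({})."""
    base = f(frozenset())
    f_r = {LAMBDA: base | {LAMBDA}}
    for d in facts:
        f_r[d] = f(frozenset({d})) - base
    return f_r

def apply_representation(f_r, facts_in):
    """Recover f(X) from f_r: union of the images of X plus Λ, Λ removed."""
    out = set()
    for d in set(facts_in) | {LAMBDA}:
        out |= f_r[d]
    return frozenset(out - {LAMBDA})

# Edge 2 -> 4 of the running example: node 2 defines d1 and kills d2.
def f_24(facts):
    return frozenset({"d1"}) | (frozenset(facts) - {"d2"})

f_r_24 = representation(f_24, ["d1", "d2"])
print(f_r_24)
```

For this edge both real facts map to the empty set and only the artificial fact produces d1. `apply_representation` shows why nothing is lost: the original function can be rebuilt from `f_r` alone.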
Page 10
Matrix Representation
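Matrix representation of function f
$a_{ij} = 1$ if $d_i \in f_r(d_j)$, and $a_{ij} = 0$ otherwise
Example (rows and columns ordered $d_1$, $d_2$, artificial data fact)
$M(2 \to 4)(\{d_1, d_2\}) = \{d_1\}$
$M(2 \to 4)^A \cdot \begin{pmatrix} 1 \\ 1 \\ 1 \end{pmatrix} = \begin{pmatrix} 0 & 0 & 1 \\ 0 & 0 & 0 \\ 0 & 0 & 1 \end{pmatrix} \begin{pmatrix} 1 \\ 1 \\ 1 \end{pmatrix} = \begin{pmatrix} 1 \\ 0 \\ 1 \end{pmatrix}$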
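The matrix can be built mechanically from the representation relation. A minimal Python sketch under two assumptions: the order (d1, d2, Λ) for rows and columns, and Λ as a name for the artificial data fact. `to_matrix`, `to_vector`, and `apply` are hypothetical helpers.

```python
ORDER = ["d1", "d2", "Λ"]   # row/column order; Λ is the artificial fact

def to_matrix(f_r, order=ORDER):
    """a[i][j] = 1 if order[i] is in f_r(order[j]), else 0."""
    return [[1 if di in f_r[dj] else 0 for dj in order] for di in order]

def to_vector(facts, order=ORDER):
    """0/1-vector of a set of data flow facts."""
    return [1 if d in facts else 0 for d in order]

def apply(matrix, vector):
    """Boolean matrix-vector product (OR of ANDs)."""
    return [int(any(a and x for a, x in zip(row, vector))) for row in matrix]

# Representation relation of edge 2 -> 4 (node 2 defines d1).
f_r_24 = {"d1": set(), "d2": set(), "Λ": {"d1", "Λ"}}
M_24 = to_matrix(f_r_24)
print(M_24)
print(apply(M_24, to_vector({"d1", "d2", "Λ"})))
```

Applying the matrix to the vector of {d1, d2, Λ} gives the vector of {d1, Λ}: after node 2 only d1 reaches node 4.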
Page 11
Dataflow Frequencies
Definition of dataflow frequencies for node v
$y_{\pi_r}(v) = \sum_{\pi \in \mathrm{Prefix}(\pi_r, v)} \beta(\mathrm{state}(\pi))$
– $\pi_r$: whole program path
– $\mathrm{Prefix}(\pi_r, v)$: set of all sub-paths from start node to node v
– $\beta$: converts data flow facts to 0/1-vector
– $\mathrm{state}(\pi)$: data flow facts which hold along path $\pi$
– sums up the occurrences of data flow facts which hold in v
Approach for fast computation
– adopt definition for grammar symbols of SEQUITUR
Page 12
Frequency Matrix
Definition of frequency matrices
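$F_\pi(v) = \sum_{[u_1, u_2, \ldots, u_{k-1}, v] \in \mathrm{Prefix}(\pi, v)} M([u_1, u_2, \ldots, u_k])^A$, with $u_k = v$
– sum computation due to matrix calculus
Frequency matrices for eliminating sum
$y_{\pi_r}(v) = \sum_{\pi \in \mathrm{Prefix}(\pi_r, v)} \beta(\mathrm{state}(\pi)) = F_{\pi_r}(v) \cdot \beta(c)$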
Computation of frequency matrices for grammar symbols
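Before turning to grammar symbols, the frequency matrix can be computed directly from its definition by summing the transition matrices of all prefixes that end in v. This Python sketch does that for the running example. It assumes that the edge leaving a node carries that node's effect, that facts are ordered (d1, d2, artificial fact), and that the initial vector is (0, 0, 1). All names are illustrative.

```python
I3 = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]
G1 = [[0, 0, 1], [0, 0, 0], [0, 0, 1]]   # edge leaving node 2 (d1: x:=...)
G2 = [[0, 0, 0], [0, 0, 1], [0, 0, 1]]   # edge leaving node 3 (d2: x:=...)
EDGE_FROM = {2: G1, 3: G2}               # all other nodes: identity

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def matadd(a, b):
    return [[a[i][j] + b[i][j] for j in range(3)] for i in range(3)]

def matvec(a, x):
    return [sum(a[i][j] * x[j] for j in range(3)) for i in range(3)]

def frequency_matrix(path, v):
    """Sum the transition matrices of all prefixes of `path` ending in v."""
    total = [[0] * 3 for _ in range(3)]
    prefix = I3                       # matrix of the prefix ending at `node`
    for node in path:
        if node == v:
            total = matadd(total, prefix)
        prefix = matmul(EDGE_FROM.get(node, I3), prefix)
    return total

path = ["s"] + [1, 2, 4] * 8 + [1, 3, 4, 5]
init = [0, 0, 1]                      # assumed initial vector: no definitions yet
for v in (4, 5):
    print(v, matvec(frequency_matrix(path, v), init))
```

The last vector component counts visits to v. This direct summation still walks the whole path, which is what the grammar-based computation avoids.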
Page 13
Terminals
Transition function
– compose function for acyclic path $t = [u_1, u_2, \ldots, u_k]$
– represent transition function as matrix
$M(t)^A = M([u_1, u_2, \ldots, u_k])^A = M(u_{k-1} \to u_k)^A \cdots M(u_1 \to u_2)^A$
Frequency matrix
$F_t(v) = M([u_1, u_2, \ldots, v])^A$ if $v \in \{u_1, u_2, \ldots, u_k\}$, and $F_t(v) = 0$ otherwise
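A minimal Python sketch of the terminal case, assuming the same conventions as in the running example: the edge leaving node 2 generates d1, the edge leaving node 3 generates d2, and all other edges are the identity. The function `terminal_matrices` is a hypothetical name.

```python
I3 = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]
ZERO = [[0, 0, 0], [0, 0, 0], [0, 0, 0]]
G1 = [[0, 0, 1], [0, 0, 0], [0, 0, 1]]   # edge leaving node 2 (d1: x:=...)
G2 = [[0, 0, 0], [0, 0, 1], [0, 0, 1]]   # edge leaving node 3 (d2: x:=...)
EDGE_FROM = {2: G1, 3: G2}

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def terminal_matrices(path, v):
    """Return (M(t), F_t(v)) for an acyclic path t = [u1, ..., uk]."""
    m = I3                        # matrix of the prefix ending at `node`
    f_v = ZERO                    # stays zero if v is not on the path
    for i, node in enumerate(path):
        if node == v:
            f_v = m               # v occurs at most once on an acyclic path
        if i + 1 < len(path):     # the last node has no outgoing edge in t
            m = matmul(EDGE_FROM.get(node, I3), m)
    return m, f_v

M_b, F_b_4 = terminal_matrices([1, 2, 4], 4)
print(M_b, F_b_4)
```

Later edges are multiplied from the left, so the product has the same order as in the formula for M(t).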
Page 14
Nonterminals
Transition function
– compose transition function for $nt \to X_1 X_2 \ldots X_k$
– represent transition function as matrix
$M(nt)^A = M(X_1 X_2 \ldots X_k)^A = M(X_k)^A \cdots M(X_2)^A \cdot M(X_1)^A$
Frequency matrix
$F_{nt}(v) = F_{X_1}(v) + F_{X_2}(v) \cdot M(X_1)^A + \cdots + F_{X_k}(v) \cdot M(X_1 X_2 \ldots X_{k-1})^A$
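The nonterminal case is a single left-to-right pass over the production. This Python sketch assumes the frequency matrix of each symbol is multiplied by the transition matrix of everything that precedes it in the production; the function name and the pair layout `(M, F)` are choices made here.

```python
I3 = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]
ZERO = [[0, 0, 0], [0, 0, 0], [0, 0, 0]]

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def matadd(a, b):
    return [[a[i][j] + b[i][j] for j in range(3)] for i in range(3)]

def nonterminal_matrices(children):
    """children: list of (M(X_i), F_{X_i}(v)) in production order.

    Returns (M(nt), F_nt(v)).
    """
    m_prefix = I3                 # M(X_1 ... X_{i-1}); identity for i = 1
    f_total = ZERO
    for m_child, f_child in children:
        f_total = matadd(f_total, matmul(f_child, m_prefix))
        m_prefix = matmul(m_child, m_prefix)
    return m_prefix, f_total

# A -> b b at node 4, with M(b) = F_b(4) for the path b: [1,2,4].
M_b = [[0, 0, 1], [0, 0, 0], [0, 0, 1]]
M_A, F_A_4 = nonterminal_matrices([(M_b, M_b), (M_b, M_b)])
print(M_A)
print(F_A_4)
```

Each symbol costs two matrix multiplications and one addition, independent of how long the path it derives is.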
Page 15
Example
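Terminal b: [1,2,4]
[Figure: path 1 → 2 (d1: x:=...) → 4]
$F_b(4) = M([1,2,4])^A = \begin{pmatrix} 0 & 0 & 1 \\ 0 & 0 & 0 \\ 0 & 0 & 1 \end{pmatrix}$
Nonterminal A → b b
$M(A)^A = M(bb)^A = M(b)^A \cdot M(b)^A = M(b)^A$
$F_A(4) = F_b(4) + F_b(4) \cdot M(b)^A = \begin{pmatrix} 0 & 0 & 2 \\ 0 & 0 & 0 \\ 0 & 0 & 2 \end{pmatrix}$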
Page 16
Algorithm
Pseudo-Code
forall v ∈ N do
    forall t ∈ T do
        compute terminal t for node v
    endfor
    forall nt ∈ NT in reverse topological order do
        compute nonterminal nt for node v
    endfor
    y(v) := F_S(v) · β(c)
endfor
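The pseudo-code can be sketched end to end in Python for the running example. This is an illustration under stated assumptions, not the authors' C++/C implementation: grammar S → a A A A b c, A → b b, facts ordered (d1, d2, artificial fact), identity on every edge except those leaving nodes 2 and 3, and initial vector (0, 0, 1). All function names are hypothetical.

```python
GRAMMAR = {"S": ["a", "A", "A", "A", "b", "c"], "A": ["b", "b"]}
TERMINALS = {"a": ["s", 1, 2, 4], "b": [1, 2, 4], "c": [1, 3, 4, 5]}
NODES = ["s", 1, 2, 3, 4, 5]

I3 = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]
ZERO = [[0, 0, 0], [0, 0, 0], [0, 0, 0]]
EDGE_FROM = {2: [[0, 0, 1], [0, 0, 0], [0, 0, 1]],   # node 2 defines d1
             3: [[0, 0, 0], [0, 0, 1], [0, 0, 1]]}   # node 3 defines d2

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def matadd(a, b):
    return [[a[i][j] + b[i][j] for j in range(3)] for i in range(3)]

def matvec(a, x):
    return [sum(a[i][j] * x[j] for j in range(3)) for i in range(3)]

def terminal(path, v):
    """(M(t), F_t(v)) for an acyclic path."""
    m, f_v = I3, ZERO
    for i, node in enumerate(path):
        if node == v:
            f_v = m
        if i + 1 < len(path):
            m = matmul(EDGE_FROM.get(node, I3), m)
    return m, f_v

def bottom_up_order(start):
    """Nonterminals ordered so that every symbol follows its children."""
    order, seen = [], set()
    def visit(sym):
        if sym in seen or sym in TERMINALS:
            return
        seen.add(sym)
        for child in GRAMMAR[sym]:
            visit(child)
        order.append(sym)
    visit(start)
    return order

def dataflow_frequencies(start="S", init=(0, 0, 1)):
    result = {}
    for v in NODES:
        table = {t: terminal(p, v) for t, p in TERMINALS.items()}
        for nt in bottom_up_order(start):
            m_prefix, f_total = I3, ZERO
            for child in GRAMMAR[nt]:
                m_child, f_child = table[child]
                f_total = matadd(f_total, matmul(f_child, m_prefix))
                m_prefix = matmul(m_child, m_prefix)
            table[nt] = (m_prefix, f_total)
        result[v] = matvec(table[start][1], list(init))
    return result

freq = dataflow_frequencies()
print("node 4:", freq[4])
print("node 5:", freq[5])
```

Each grammar symbol is processed once per node, so the work depends on the size of the grammar and not on the length of the expanded path.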
Page 17
Example
Transition matrices and frequency matrices for terminals
[Figure: WPP graph with start symbol S, nonterminal A, and terminals a, b, c]
Page 18
Example
Transition matrices and frequency matrices for nonterminals
[Figure: WPP graph with start symbol S, nonterminal A, and terminals a, b, c]
Frequency matrix of start symbol S contains the dataflow frequency information!
Page 19
Experiments
GCC 2.95.2
– data flow frequency analysis written in C++/C
– implementation of WPP (run time & compile time)
Benchmark
– some programs of SPECint95
– reaching definitions problem
Environment
– Sun Ultra Enterprise 450 (4 x 296 MHz) with 2.5 GB
Page 20
Node Statistics
[Figure: bar chart of node counts (0 to 12000) per benchmark (099.go, 124.m88ksim, 129.compress, 130.li, 132.ijpeg, 134.perl), split into not executed, executed w/o DFA, and analyzed]
about 40% of nodes are executed
no computations for 60% of nodes required
Page 21
WPP Size & Overhead
[Figure: two bar charts per benchmark (099.go, 124.m88ksim, 129.compress, 130.li, 132.ijpeg, 134.perl): WPP Size in Kbytes (axis 0 to 20000) and Compile Overhead in % (axis 0 to 35)]
- Compile time overhead almost proportional to WPP size
Page 22
Conclusion
Novel dataflow frequency analysis
– designed for bi-distributive dataflow analysis problems
– matrix representation of transition functions
– employs SEQUITUR Grammars
Accurate and efficient algorithm
Experiments
– platform: gcc for Ultra 450
– benchmark: reaching definitions problem for SpecInt95
– overhead is proportional to the size of WPP
Page 23
Stop!
Page 24