
Dataflow Frequency Analysis
based on Whole Program Paths
Bernhard Scholz
Institute of Computer Languages
Vienna University of Technology
[email protected]
www.complang.tuwien.ac.at/scholz

Eduard Mehofer
Institute for Software Science
University of Vienna
[email protected]
www.par.univie.ac.at/~mehofer
Dataflow Frequency Analysis

Goal
– accurately computing frequencies of data flow facts

Problem:
– high costs for computing accurate frequencies
• requires the whole program path
• efficient data structures and algorithms?

Approach:
– exploiting algebraic properties of bi-distributive DFA problems
– employing WPPs to capture control flow
– computing frequencies in a bottom-up style on the WPP graph
Page 2
Outline

Motivation

WPP profiling

Properties of bi-distributive DFAs

Algorithm

Experiments

Conclusion
Page 3
Classical Approach

Classical Program Optimization:
[Figure: block diagram of the classical optimizer. Labels: Program, optimizer, data flow analysis, binary information, transformation, Optimized program]

 Drawback:
[Figure labels: Optimizer; heavily, rarely, never]
Page 4
Profiling Approach

Probabilistic Program Optimization:
[Figure: block diagram of the optimizer based on profiling. Labels: Program, Profile, dataflow freq. analysis, frequency information, transformation, Optimized program]

 Advantage:
[Figure labels: Optimizer; heavily, rarely, never]
Page 5
Running Example

– simple code fragment
– 8 times left branch
– terminates via right branch

 Reaching definitions problem
– two definitions: d1, d2
– d1 kills d2 and vice versa
– use of x at the end of loop

CFG Example
[Figure: CFG with nodes s, 1, 2, 3, 4, 5; node 2 holds d1: x:=..., node 3 holds d2: x:=..., and x is used (...x...) at the end of the loop]

Questions
– How often does d1 hold at node 5?
– How often does d2 hold at node 5?
Page 6
WPP Profiling

Captures the whole program path
– Larus at PLDI’99

Path profiling techniques for acyclic paths
– minimal insertion of instrumentation code
– keeps executable fast

Sequitur for compression
– builds a grammar
– terminals are acyclic paths
– nonterminals have only one production
– graph representation of grammar
– grammar generates only one sentence
– best case: grammar size is logarithmic in the path length
Page 7
WPP Example

CFG Example
[Figure: CFG with nodes s, 1, 2, 3, 4, 5]

 WPP Graph & Grammar
S → a A A A b c
A → b b
[Figure: WPP graph over the symbols S, A, a, b, c]

 Program Run
- 8x left branch
- 1x right branch
Terminals:
a: [s,1,2,4]
b: [1,2,4]
c: [1,3,4,5]
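The grammar above can be expanded back into the program run. The following is a minimal sketch; the dictionary layout and function names are assumptions for illustration, only the productions and terminal paths come from the slide.

```python
# Productions and terminals as given on the slide (the encoding is hypothetical).
GRAMMAR = {
    "S": ["a", "A", "A", "A", "b", "c"],
    "A": ["b", "b"],
}
TERMINALS = {
    "a": ["s", 1, 2, 4],
    "b": [1, 2, 4],
    "c": [1, 3, 4, 5],
}


def expand(symbol):
    """Return the list of terminals derived from a grammar symbol."""
    if symbol in TERMINALS:
        return [symbol]
    derived = []
    for child in GRAMMAR[symbol]:
        derived.extend(expand(child))
    return derived


def node_trace(symbol):
    """Concatenate the acyclic paths of all derived terminals."""
    trace = []
    for terminal in expand(symbol):
        trace.extend(TERMINALS[terminal])
    return trace


terminal_string = expand("S")
trace = node_trace("S")
left_branches = trace.count(2)   # node 2 lies on the left branch
right_branches = trace.count(3)  # node 3 lies on the right branch
print(terminal_string, left_branches, right_branches)
```

The expansion yields one `a`, seven `b` and one `c`, which matches the program run of 8 left branches and 1 right branch.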
Page 8
Bi-Distributive Dataflow Problems

Properties
– finite lattice $2^D$ (power set of the dataflow facts $D$)
– transition functions are monotone
– transition functions distribute over both lattice operations:
$f(X \cup Y) = f(X) \cup f(Y)$
$f(X \cap Y) = f(X) \cap f(Y)$
– representation relation
– covers bit-vector problems

Due to properties
– transition functions represented as 0/1-matrices
– states represented as 0/1-vectors
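The distributivity property can be checked by brute force on a small fact set. This is a sketch under assumptions: the gen/kill form `f(X) = (X - kill) | gen` is the usual bit-vector transfer function, and the helper names and the counterexample are hypothetical.

```python
from itertools import combinations

D = frozenset({"d1", "d2"})


def powerset(facts):
    """All subsets of a finite fact set."""
    items = sorted(facts)
    return [frozenset(c) for r in range(len(items) + 1)
            for c in combinations(items, r)]


def make_gen_kill(gen, kill):
    """Bit-vector style transfer function f(X) = (X - kill) | gen."""
    gen, kill = frozenset(gen), frozenset(kill)
    return lambda x: (x - kill) | gen


def is_bi_distributive(f, universe):
    """Check that f distributes over both union and intersection."""
    subsets = powerset(universe)
    return all(
        f(x | y) == f(x) | f(y) and f(x & y) == f(x) & f(y)
        for x in subsets for y in subsets
    )


# Node 2 of the running example: defines d1 and kills d2.
f_node2 = make_gen_kill(gen={"d1"}, kill={"d2"})


def union_only(x):
    """Hypothetical counterexample: distributes over union, not intersection."""
    return x | {"d2"} if "d1" in x else x


print(is_bi_distributive(f_node2, D), is_bi_distributive(union_only, D))
```

The counterexample fails for X = {d1}, Y = {d2}, where the intersection is empty but the images share d2.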
Page 9
Representation Relation

 Transition function $f: 2^D \to 2^D$
– represented by $f_r: D \to 2^D$
$f_r(\Lambda) = f(\emptyset) \cup \{\Lambda\}$
$f_r(d) = f(\{d\}) \setminus f(\emptyset)$
– artificial data fact $\Lambda$

 Example
[Figure: CFG fragment 1 → 2 → 4, node 2 holds d1: x:=...]

D     M(2→4)_r
d1    {}
d2    {}
Λ     {d1, Λ}
Page 10
Matrix Representation

Matrix representation of function f
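$a_{ij} = \begin{cases} 1, & \text{if } d_i \in f_r(d_j) \\ 0, & \text{otherwise} \end{cases}$

 Example
$M(2 \to 4)(\{d_1, d_2\}) = \{d_1\}$
$M(2 \to 4)^A \cdot \beta(\{d_1, d_2\}) = \begin{pmatrix} 0 & 0 & 1 \\ 0 & 0 & 0 \\ 0 & 0 & 1 \end{pmatrix} \begin{pmatrix} 1 \\ 1 \\ 1 \end{pmatrix} = \begin{pmatrix} 1 \\ 0 \\ 1 \end{pmatrix}$
(rows and columns ordered $d_1, d_2, \Lambda$)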
Page 11
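The representation relation and its 0/1-matrix from the two preceding slides can be derived mechanically from a transfer function. A minimal sketch, assuming the edge 2 → 4 carries the effect of node 2 (gen d1, kill d2); the name `LAMBDA` for the artificial fact and all helper names are hypothetical.

```python
FACTS = ["d1", "d2", "LAMBDA"]  # fixed order of rows and columns


def f_node2(x):
    """Assumed transfer function of edge 2 -> 4: gen d1, kill d2."""
    return (frozenset(x) - {"d2"}) | {"d1"}


def representation(f, facts):
    """f_r(LAMBDA) = f({}) + {LAMBDA};  f_r(d) = f({d}) - f({})."""
    base = f(frozenset())
    rel = {"LAMBDA": set(base) | {"LAMBDA"}}
    for d in facts:
        if d != "LAMBDA":
            rel[d] = set(f(frozenset({d}))) - set(base)
    return rel


def to_matrix(rel, facts):
    """a[i][j] = 1 if facts[i] is in f_r(facts[j]), else 0."""
    return [[1 if di in rel[dj] else 0 for dj in facts] for di in facts]


def beta(x, facts):
    """0/1-vector of a fact set; the LAMBDA entry is always 1."""
    return [1 if (d == "LAMBDA" or d in x) else 0 for d in facts]


def mat_vec(m, v):
    """Matrix-vector product on plain lists."""
    return [sum(a * b for a, b in zip(row, v)) for row in m]


rel = representation(f_node2, FACTS)
M_2_4 = to_matrix(rel, FACTS)
result = mat_vec(M_2_4, beta({"d1", "d2"}, FACTS))
print(M_2_4, result)
```

Applying the matrix to the vector of {d1, d2} yields the vector of {d1}, in agreement with the example.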
Dataflow Frequencies

Definition of dataflow frequencies for node v

y r (v) 
–
–
–
–
–

 (state( ))
 Prefix(
s
r ,v )
r whole program path
v
prefix: set of all sub-paths from start node to node v
: converts data flow facts to 0/1-vector
state(): data flow facts which hold along path 
sums up the occurrences of data flow facts which hold in v
Approach for fast computation
– adopt definition for grammar symbols of SEQUITUR
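Before turning to the grammar-based computation, the definition can be evaluated directly by walking the uncompressed path. This is a sketch under assumptions: the trace is the expansion of the running example, facts are counted on entry to a node, a node's gen/kill effect applies when the node is left, and no definition holds at the start node.

```python
from collections import Counter

# Assumed (gen, kill) effect per node; all other nodes leave the state unchanged.
EFFECT = {2: ({"d1"}, {"d2"}), 3: ({"d2"}, {"d1"})}

# Whole program path of the running example: a, seven times b, then c.
TRACE = ["s", 1, 2, 4] + [1, 2, 4] * 7 + [1, 3, 4, 5]


def naive_frequencies(trace, initial=frozenset()):
    """Count, per node, how often each fact holds on entry to that node."""
    freq = {}
    state = set(initial)
    for node in trace:
        counts = freq.setdefault(node, Counter())
        counts["visits"] += 1
        for fact in state:
            counts[fact] += 1
        gen, kill = EFFECT.get(node, (set(), set()))
        state = (state - kill) | gen
    return freq


freq = naive_frequencies(TRACE)
print(freq[4], freq[5])
```

The cost of this baseline grows with the length of the whole program path, which is what the grammar-based approach avoids.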
Page 12
Frequency Matrix

Definition of frequency matrices
F (v) 
M ( 
[u1, u2 ,, uk ])
A
[u1 ,u2 ,,uk 1 ,v ]Prefix( ,v )
– sum computation due matrix calculus

Frequency matrices for eliminating sum

y r (v) 
 (state( ))
 Prefix(
r ,v )
 F r (v)   (c)

Computation of frequency matrices for grammar symbols
Page 13
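The step that replaces the sum over prefixes by a single frequency matrix can be spelled out as follows. This is a sketch of the reasoning, assuming that c denotes the data flow facts at the start node, that β maps a fact set to its 0/1-vector, and that $M(\pi)^A$ is the matrix of the transition function composed along the sub-path π.

```latex
\begin{align*}
\beta(\mathrm{state}(\pi)) &= M(\pi)^A \cdot \beta(c) \\
y_r(v) &= \sum_{\pi \in \mathrm{Prefix}(r,v)} M(\pi)^A \cdot \beta(c) \\
       &= \Bigl( \sum_{\pi \in \mathrm{Prefix}(r,v)} M(\pi)^A \Bigr) \cdot \beta(c)
        = F_r(v) \cdot \beta(c)
\end{align*}
```

The vector β(c) does not depend on π, so it can be factored out of the sum; this is what makes the matrices reusable across grammar symbols.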
Terminals

Transition function
– compose the transition functions along the acyclic path $t = [u_1, u_2, \dots, u_k]$
– represent transition function as matrix
$M(t)^A = M([u_1, u_2, \dots, u_k])^A = M(u_{k-1} \to u_k)^A \cdot \ldots \cdot M(u_1 \to u_2)^A$

 Frequency matrix
$F_t(v) = \begin{cases} M([u_1, u_2, \dots, v])^A, & \text{if } v \in \{u_1, u_2, \dots, u_k\} \\ 0, & \text{otherwise} \end{cases}$
Page 14
Nonterminals

Transition function
– compose transition function for $nt \to X_1 X_2 \dots X_k$
– represent transition function as matrix
$M(nt)^A = M(X_1 X_2 \dots X_k)^A = M(X_k)^A \cdot \ldots \cdot M(X_2)^A \cdot M(X_1)^A$

 Frequency matrix
$F_{nt}(v) = F_{X_1}(v) + F_{X_2}(v) \cdot M(X_1)^A + \dots + F_{X_k}(v) \cdot M(X_1 X_2 \dots X_{k-1})^A$
Page 15
Example

Terminal b: [1,2,4]
[Figure: CFG fragment 1 → 2 → 4, node 2 holds d1: x:=...]
$F_b(4) = M([1,2,4])^A = \begin{pmatrix} 0 & 0 & 1 \\ 0 & 0 & 0 \\ 0 & 0 & 1 \end{pmatrix}$

Nonterminal A → b b
$M(A)^A = M(bb)^A = M(b)^A \cdot M(b)^A = M(b)^A$
$F_A(4) = F_b(4) + F_b(4) \cdot M(b)^A = \begin{pmatrix} 0 & 0 & 2 \\ 0 & 0 & 0 \\ 0 & 0 & 2 \end{pmatrix}$
Page 16
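The nonterminal example for A → b b can be re-computed with a small composition routine. A minimal sketch; the function names are hypothetical, and the matrix for b is the one shown in the example (fact order d1, d2, Λ).

```python
def identity(n):
    return [[1 if i == j else 0 for j in range(n)] for i in range(n)]


def zero(n):
    return [[0] * n for _ in range(n)]


def mat_mul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def mat_add(a, b):
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]


def compose(children):
    """children: (M(X_i), F_{X_i}(v)) for X_1..X_k; returns (M(nt), F_nt(v))."""
    n = len(children[0][0])
    prefix = identity(n)  # M(X_1 ... X_{i-1}), identity for i = 1
    total = zero(n)
    for m_x, f_x in children:
        total = mat_add(total, mat_mul(f_x, prefix))
        prefix = mat_mul(m_x, prefix)
    return prefix, total


M_b = [[0, 0, 1], [0, 0, 0], [0, 0, 1]]
F_b_4 = M_b  # b ends in node 4, so its frequency matrix is its path matrix
M_A, F_A_4 = compose([(M_b, F_b_4), (M_b, F_b_4)])
print(M_A, F_A_4)
```

Each child contributes its own frequency matrix, shifted by the transition matrix of everything that precedes it in the production.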
Algorithm

Pseudo-Code
forall v ∈ N do
    forall t ∈ T do
        compute frequency matrix F_t(v) of terminal t for node v
    endfor
    forall nt ∈ NT in reverse topological order do
        compute frequency matrix F_nt(v) of nonterminal nt for node v
    endfor
    y(v) = F_S(v) · β(c)
endfor
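The pseudo-code can be turned into a small runnable sketch for the running example. Several parts are assumptions rather than the authors' implementation: the edge matrices (fact order d1, d2, Λ, unlisted edges are the identity), the initial vector β(c) = (0, 0, 1) for an empty set of facts at the start node, and the dictionary listing nonterminals so that children precede parents.

```python
IDENTITY = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]
ZERO = [[0, 0, 0], [0, 0, 0], [0, 0, 0]]
EDGE = {  # assumed edge matrices for the definitions d1 and d2
    (2, 4): [[0, 0, 1], [0, 0, 0], [0, 0, 1]],
    (3, 4): [[0, 0, 0], [0, 0, 1], [0, 0, 1]],
}
TERMINALS = {"a": ["s", 1, 2, 4], "b": [1, 2, 4], "c": [1, 3, 4, 5]}
GRAMMAR = {"A": ["b", "b"], "S": ["a", "A", "A", "A", "b", "c"]}  # bottom-up
NODES = ["s", 1, 2, 3, 4, 5]
BETA_C = [0, 0, 1]  # assumed: no definition holds at the start node


def mat_mul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]


def mat_add(a, b):
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]


def path_matrix(path):
    m = IDENTITY
    for u, w in zip(path, path[1:]):
        m = mat_mul(EDGE.get((u, w), IDENTITY), m)
    return m


def frequencies(v):
    """y(v) = F_S(v) * beta(c), computed bottom-up over the grammar."""
    M, F = {}, {}
    for t, path in TERMINALS.items():
        M[t] = path_matrix(path)
        F[t] = path_matrix(path[: path.index(v) + 1]) if v in path else ZERO
    for nt, rhs in GRAMMAR.items():  # children are processed before parents
        prefix, total = IDENTITY, ZERO
        for x in rhs:
            total = mat_add(total, mat_mul(F[x], prefix))
            prefix = mat_mul(M[x], prefix)
        M[nt], F[nt] = prefix, total
    return [sum(a * b for a, b in zip(row, BETA_C)) for row in F["S"]]


y = {v: frequencies(v) for v in NODES}
print(y[4], y[5])
```

Under these assumptions the Λ entry counts the visits of a node, so node 4 is reached 9 times with d1 holding 8 times and d2 once, while at node 5 only d2 holds.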
Page 17
Example

Transition matrices and frequency matrices for terminals
[Figure: WPP graph over S, A, a, b, c, shown in several steps and annotated with the transition and frequency matrices of the terminals a, b, c]
Page 18
Example

Transition matrices and frequency matrices for nonterminals
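[Figure: WPP graph over S, A, a, b, c, shown in several steps and annotated with the transition and frequency matrices of the nonterminals A and S]
Frequency matrix of start symbol S contains the dataflow frequency information!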
Page 19
Experiments

Gcc-Compiler 2.95.2
– dataflow frequency analysis implemented in C/C++
– WPP implementation (run-time and compile-time parts)

Benchmark
– some programs of SPECint95
– reaching definitions problem

Environment
– Sun Ultra Enterprise 450 (4 x 296 MHz) with 2.5 GB of memory
Page 20
Node Statistics
[Figure: bar chart of node counts (y-axis: Nodes, 0 to 12000) for 099.go, 124.m88ksim, 129.compress, 130.li, 132.ijpeg, and 134.perl; legend: not executed, executed w/o DFA, analyzed]
– about 40% of nodes are executed
– no computations required for the remaining 60% of nodes
Page 21
WPP Size & Overhead

WPP Size in Kbytes
 Compile Overhead in %
[Figure: two bar charts, WPP size in Kbytes and compile overhead in %, each over the benchmarks 099.go, 124.m88ksim, 129.compress, 130.li, 132.ijpeg, and 134.perl]
- Compile time overhead almost proportional to WPP size
Page 22
Conclusion

Novel dataflow frequency analysis
– designed for bi-distributive dataflow analysis problems
– matrix representation of transition functions
– employs SEQUITUR Grammars

Accurate and efficient algorithm

Experiments
– platform: GCC on a Sun Ultra Enterprise 450
– benchmark: reaching definitions problem for SPECint95
– overhead is almost proportional to the size of the WPP
Page 23