Code Generation - National Chung Cheng University

Code Generation
• The target machine
• Instruction selection and register allocation
• Basic blocks and flow graphs
• A simple code generator
• Peephole optimization
• Instruction selector generator
• Graph-coloring register allocator
The Target Machine
• A byte addressable machine with four bytes to a word and n general purpose registers
• Two address instructions
– op source, destination
• Six addressing modes

Mode           Form     Address                  Added cost
absolute       M        M                        1
register       R        R                        0
indexed        c(R)     c+content(R)             1
ind register   *R       content(R)               0
ind indexed    *c(R)    content(c+content(R))    1
literal        #c       c                        1
Examples
MOV R0, M
MOV 4(R0), M
MOV *R0, M
MOV *4(R0), M
MOV #1, R0
Instruction Costs
• Cost of an instruction = 1 + costs of source
and destination addressing modes
• This cost corresponds to the length (in
words) of the instruction
• Minimizing instruction length also tends to minimize the instruction execution time
Examples
MOV R0, R1            cost 1
MOV R0, M             cost 2
MOV #1, R0            cost 2
MOV 4(R0), *12(R1)    cost 3
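The cost rule can be checked mechanically. The following is a minimal sketch, assuming registers are written R0, R1, ... and operands use the six forms from the addressing-mode table; the function names mode_cost and instruction_cost are illustrative, not part of the slides.

```python
import re

def mode_cost(operand: str) -> int:
    """Added cost of one operand under the six addressing modes."""
    op = operand.replace(" ", "")
    if op.startswith("#"):                # literal #c
        return 1
    if re.fullmatch(r"\*?R\d+", op):      # register R or indirect register *R
        return 0
    return 1                              # absolute M, indexed c(R), indirect indexed *c(R)

def instruction_cost(source: str, destination: str) -> int:
    """Cost = 1 + added costs of the source and destination modes."""
    return 1 + mode_cost(source) + mode_cost(destination)
```

For instance, instruction_cost("4(R0)", "*12(R1)") adds one word for each of the two constants to the base cost of one.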
An Example
Consider a := b + c
1. MOV b, R0
   ADD c, R0
   MOV R0, a
2. MOV b, a
   ADD c, a
3. R0, R1, R2 contain the addresses of a, b, c
   MOV *R1, *R0
   ADD *R2, *R0
4. R1, R2 contain the values of b, c
   ADD R2, R1
   MOV R1, a
Instruction Selection
• Code skeleton
x := y + z
    MOV y, R0
    ADD z, R0
    MOV R0, x
a := b + c
    MOV b, R0
    ADD c, R0
    MOV R0, a
d := a + e
    MOV a, R0
    ADD e, R0
    MOV R0, d
• Multiple choices
a := a + 1
    MOV a, R0
    ADD #1, R0
    MOV R0, a
or
    INC a
Register Allocation
• Register allocation: select the set of
variables that will reside in registers
• Register assignment: pick the specific
register that a variable will reside in
• The problem is NP-complete
An Example
t := a + b
t := t * c
t := t / d
    MOV a, R1
    ADD b, R1
    MUL c, R0
    DIV d, R0
    MOV R1, t
t := a + b
t := t + c
t := t / d
    MOV a, R0
    ADD b, R0
    ADD c, R0
    SRDA R0, 32
    DIV d, R0
    MOV R1, t
Basic Blocks
• A basic block is a sequence of consecutive
statements in which control enters at the
beginning and leaves at the end without halt
or possibility of branching except at the end
An Example
(1) prod := 0
(2) i := 1
(3) t1 := 4 * i
(4) t2 := a[t1]
(5) t3 := 4 * i
(6) t4 := b[t3]
(7) t5 := t2 * t4
(8) t6 := prod + t5
(9) prod := t6
(10) t7 := i + 1
(11) i := t7
(12) if i <= 20 goto (3)
Flow Graphs
• A flow graph is a directed graph
• The nodes in the graph are basic blocks
• There is an edge from B1 to B2 iff B2
immediately follows B1 in some execution
sequence
– B2 immediately follows B1 in program text
– there is a jump from B1 to B2
• B1 is a predecessor of B2, B2 is a successor
of B1
An Example
B0:
(1) prod := 0
(2) i := 1
B1:
(3) t1 := 4 * i
(4) t2 := a[t1]
(5) t3 := 4 * i
(6) t4 := b[t3]
(7) t5 := t2 * t4
(8) t6 := prod + t5
(9) prod := t6
(10) t7 := i + 1
(11) i := t7
(12) if i <= 20 goto (3)
Construction of Basic Blocks
• Determine the set of leaders
– the first statement is a leader
– the target of a jump is a leader
– any statement immediately following a jump is
a leader
• For each leader, its basic block consists of
the leader and all statements up to but not
including the next leader or the end of the
program
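The leader rules translate almost directly into code. This is a minimal sketch, assuming statements are strings and every jump is written "goto (k)" with a 1-based target k, as in the dot-product example; find_leaders and basic_blocks are illustrative names.

```python
import re

def find_leaders(statements):
    """Return the sorted 1-based statement numbers that are leaders."""
    leaders = {1}                                  # the first statement
    for idx, stmt in enumerate(statements, start=1):
        match = re.search(r"goto \((\d+)\)", stmt)
        if match:
            leaders.add(int(match.group(1)))       # the target of a jump
            if idx < len(statements):
                leaders.add(idx + 1)               # the statement after a jump
    return sorted(leaders)

def basic_blocks(statements):
    """Split the program so each block runs up to the next leader."""
    bounds = find_leaders(statements) + [len(statements) + 1]
    return [statements[a - 1:b - 1] for a, b in zip(bounds, bounds[1:])]

program = [
    "prod := 0", "i := 1", "t1 := 4 * i", "t2 := a[t1]", "t3 := 4 * i",
    "t4 := b[t3]", "t5 := t2 * t4", "t6 := prod + t5", "prod := t6",
    "t7 := i + 1", "i := t7", "if i <= 20 goto (3)",
]
```

On this program the leaders are statements (1) and (3), which gives the two blocks of the flow-graph example.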
Representation of Basic Blocks
• Each basic block is represented by a record consisting of
– a count of the number of statements
– a pointer to the leader
– a list of predecessors
– a list of successors
Define and Use
• A three address statement x := y + z is
said to define x and to use y and z
• A name is live in a basic block at a given
point if its value is used after that point,
perhaps in another basic block
Next-Use Information
i: x := …
   … (no assignment to x)
j: y := … x …
Statement j uses the value of x defined at i
An Example
        b:(1), c:(1,4), d:(2)
(1) a := b + c
        a:(2,3,5), c:(4), d:(2)
(2) e := a + d
        a:(3,5), c:(4), e:(3)
(3) f := e - a
        a:(5), c:(4), f:(4)
(4) e := f + c
        a:(5), e:(5)
(5) g := e - a
        g:(?)
b, c, d are live at the beginning of the block
Computing Next Uses
• Scan statements “i: x := y op z” backward
• Attach to statement i the information
currently found in the symbol table regarding
the next uses and liveness of x, y, and z
• In the symbol table, set x to “not live” and clear the “next uses” of x
• In the symbol table, set y and z to “live” and add i to the “next uses” of y and z
(Slide annotations: within blocks; among blocks)
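The backward scan can be sketched as follows. This is an illustration under stated assumptions: each statement is a triple (x, y, z) standing for x := y op z, the symbol table keeps only the nearest next use rather than the full list, and the set of names live on exit is supplied by the caller. The name next_use_info is hypothetical.

```python
def next_use_info(block, live_on_exit):
    """Scan the block backward; return (info, table).

    info[i] maps x, y, z of statement i+1 to (live, next_use) as recorded
    in the symbol table just before that statement was processed.
    table is the symbol table at the beginning of the block.
    """
    table = {name: (True, None) for name in live_on_exit}
    info = [None] * len(block)
    for i in range(len(block) - 1, -1, -1):
        x, y, z = block[i]
        info[i] = {n: table.get(n, (False, None)) for n in (x, y, z)}
        table[x] = (False, None)      # x is redefined: not live, no next use
        table[y] = (True, i + 1)      # y and z are used by statement i+1
        table[z] = (True, i + 1)
    return info, table
```

Setting x before y and z matters for statements such as x := x + 1, where the use must win over the definition.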
A Simple Code Generator
• Consider each statement in a basic block in
turn, remembering if operands are in registers
• Assume that
– each operator has a corresponding target language
operator
– computed results can be left in registers as long as
possible, unless
• out of registers
• at the end of a basic block
Register and Address
Descriptors
• A register descriptor keeps track of what is
currently in each register
• An address descriptor keeps track of the
location(s) where the current value of the
name can be found at run time
An Example
d := (a - b) + (a - c) + (a - c)

Statement     Code          Register descriptor    Address descriptor
                            []                     []
t := a - b    MOV a, R0     [R0:(t)]               [t:(R0)]
              SUB b, R0
u := a - c    MOV a, R1     [R0:(t), R1:(u)]       [t:(R0), u:(R1)]
              SUB c, R1
v := t + u    ADD R1, R0    [R0:(v), R1:(u)]       [v:(R0), u:(R1)]
d := v + u    ADD R1, R0    [R0:(d)]               [d:(R0)]
              MOV R0, d     []                     []
Code Generation Algorithm
• Consider an instruction of the form “x := y op z”
• Invoke getreg to determine the location L where
the result of “y op z” will be placed
• Determine a current location y’ of y from the
address descriptor (register location preferred).
If y’ is not L, generate “MOV y’, L”
• Generate “op z’, L”, where z’ is a current location
of z from the address descriptor.
• Update the address and register descriptors for x, y,
z, and L
Code Generation Algorithm
• Consider an instruction of the form “x := y”
• If y is in a register, change the register and
address descriptors
• If y is in memory,
– if x has next use in the block, invoke getreg to
find a register r, generate “MOV y, r”, and
make r the location of x
– otherwise, generate “MOV y, x”
Code Generation Algorithm
• Once all statements in the basic block are
processed, we store those names that are
live on exit and not in their memory
locations
The Function getreg
• Consider an instruction of the form “x := y op z”
• If y is in a register r that holds the value of no other names, and y is not live and has no next use after this statement, return r
• Otherwise, return an empty register r if there is one
• Otherwise, if x has a next use in the block, or op is an operator requiring a register, find an occupied register r. Store the value of r, update the address descriptor, and return r
• If x has no next use, or no suitable occupied register can be found, return the memory location of x
Indexing and Pointer Operations

Statement    i in Ri          i in Mi          i in Si(A)
a := b[i]    MOV b(Ri), R     MOV Mi, R        MOV Si(A), R
                              MOV b(R), R      MOV b(R), R
a[i] := b    MOV b, a(Ri)     MOV Mi, R        MOV Si(A), R
                              MOV b, a(R)      MOV b, a(R)

Statement    p in Rp          p in Mp          p in Sp(A)
a := *p      MOV *Rp, R       MOV Mp, R        MOV Sp(A), R
                              MOV *R, R        MOV *R, R
*p := a      MOV a, *Rp       MOV Mp, R        MOV a, R
                              MOV a, *R        MOV R, *Sp(A)
Conditional Statements
• Condition codes
if x < y goto z
    CMP x, y
    CJ< z
• Condition code descriptors
x := y + z
if x < 0 goto z
    MOV y, R0
    ADD z, R0
    MOV R0, x
    CJ< z
Global Register Allocation
• Keep live variables in registers across block
boundaries
• Keep variables frequently used in inner
loops in registers
Loops
• A loop is a collection of nodes such that
– all nodes in the collection are strongly
connected
– the collection of nodes has a unique entry
• An inner loop is one that contains no other
loops
Variable Usage Counts
• Savings
– Count a saving of one for each use of x in loop L
that is not preceded by an assignment to x in the
same block
– Save two units if we can avoid a store of x at the
end of a block
• Costs
– Cost two units if x is live at the entry or exit of
the inner loop
An Example
B1: a := b + c
    d := d - b
    e := a + f
B2: f := a - d
B3: b := d + f
    e := a - c
B4: b := d + c
[Figure: flow graph of B1 to B4 whose edges are labeled with live-variable sets: b,c,d,f; a,c,d,e; a,c,d,f; a,c,d,e,f; c,d,e,f; b,c,d,e,f; b,d,e,f]
An Example
use(a, B1) = 0, use(a, B2) = 1, use(a, B3) = 1, use(a, B4) = 0
live(a, B1) = 1, live(a, B2) = 0, live(a, B3) = 0, live(a, B4) = 0
save(a) = (0+1+1+0) + 2 * (1+0+0+0) = 4
save(b) = 5
save(c) = 3
save(d) = 6
save(e) = 4
save(f) = 4
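The save(a) arithmetic is a sum over the blocks of the loop. A minimal sketch, assuming the per-block use and live values are given as dictionaries keyed by block name (the function name savings is illustrative):

```python
def savings(use, live):
    """save(x) = sum over blocks B of use(x, B) + 2 * live(x, B)."""
    return sum(use[b] + 2 * live[b] for b in use)

# Values for the name a, taken from the example.
use_a = {"B1": 0, "B2": 1, "B3": 1, "B4": 0}
live_a = {"B1": 1, "B2": 0, "B3": 0, "B4": 0}
```

With these inputs, savings(use_a, live_a) reproduces the value 4 computed for a.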
An Example
Before B1: MOV b, R1; MOV d, R2
B1: MOV R1, R0; ADD c, R0
    SUB R1, R2; MOV R0, R3
    ADD f, R3; MOV R3, e
B2: MOV R0, R3; SUB R2, R3
    MOV R3, f
B3: MOV R2, R1; ADD f, R1
    MOV R0, R3; SUB c, R3
    MOV R3, e
After B3: MOV R1, b; MOV R2, d
B4: MOV R2, R1; ADD c, R1
After B4: MOV R1, b; MOV R2, d
Register Assignment for Outer
Loops
• Apply the same idea used for inner loops to progressively larger loops
• If an outer loop L1 contains an inner loop L2, a name allocated a register in L2 need not be allocated a register in L1-L2
• If name x is allocated a register in L1 but not in L2, we need to store x on entrance to L2 and load x on exit from L2
• If name x is allocated a register in L2 but not in L1, we need to load x on entrance to L2 and store x on exit from L2
Peephole Optimization
• Improve the performance of the target
program by examining and transforming a
short sequence of target instructions
• May need repeated passes over the code
• Can also be applied directly after
intermediate code generation
Examples
• Redundant loads and stores
MOV R0, a
MOV a, R0
• Algebraic simplification
x := x + 0
x := x * 1
• Constant folding
x := 2 + 3
y := x + 3
=>
x := 5
y := 8
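The redundant load in the first example can be removed by a one-instruction lookback. The sketch below assumes each instruction is a tuple (label, op, source, destination) with label None when absent, and that operands are plain names or registers without side effects; remove_redundant_loads is an illustrative name.

```python
def remove_redundant_loads(code):
    """Drop a MOV that merely reverses the MOV directly before it.

    After 'MOV R0, a' the value of a is already in R0, so 'MOV a, R0'
    does nothing. A labeled instruction is kept, because a jump could
    reach it without executing the preceding MOV.
    """
    result = []
    for instr in code:
        label, op, src, dst = instr
        if result and label is None and op == "MOV":
            _, prev_op, prev_src, prev_dst = result[-1]
            if prev_op == "MOV" and (prev_src, prev_dst) == (dst, src):
                continue              # both locations already hold the value
        result.append(instr)
    return result
```

As the slide on peephole optimization notes, a pass like this may need to be repeated, since removing one instruction can expose another pattern.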
Examples
• Unreachable code
#define debug 0
if (debug) (print debugging information)
=>
if 0 <> 1 goto L1
print debugging information
L1:
=>
if 1 goto L1
print debugging information
L1:
Examples
• Flow-of-control optimization
goto L1
…
L1: goto L2
=>
goto L2
…
L1: goto L2

goto L1
…
L1: if a < b goto L2
L3:
=>
if a < b goto L2
goto L3
…
L3:
Examples
• Reduction in strength: replace expensive operations by cheaper ones
– x^2 => x * x
– fixed-point multiplication and division by a power of 2 => shift
– floating-point division by a constant => floating-point multiplication by a constant
Examples
• Use of machine idioms: hardware instructions for certain specific operations
– auto-increment and auto-decrement addressing modes (push or pop the stack in parameter passing)
DAG Representation of Blocks
• Easy to determine:
• common subexpressions
• names used in the block but evaluated
outside the block
• names whose values could be used outside
the block
DAG Representation of Blocks
• Leaves labeled by unique identifiers
• Interior nodes labeled by operator symbols
• Nodes optionally given a sequence of
identifiers, having the value represented by
the nodes
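An Example
(1) t1 := 4 * i
(2) t2 := a[t1]
(3) t3 := 4 * i
(4) t4 := b[t3]
(5) t5 := t2 * t4
(6) t6 := prod + t5
(7) prod := t6
(8) t7 := i + 1
(9) i := t7
(10) if i <= 20 goto (1)
[Figure: DAG for the block. Node + (t6, prod) has children prod0 and * (t5); * (t5) has children [] (t2) and [] (t4), which index a and b by the shared node * (t1, t3) over 4 and i0; node + (t7, i) has children i0 and 1; node <= compares + (t7, i) with 20 and targets (1)]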
Constructing a DAG
• Consider x := y op z. Other statements can be
handled similarly
• If node(y) is undefined, create a leaf labeled y
and let node(y) be this leaf. If node(z) is
undefined, create a leaf labeled z and let
node(z) be that leaf
Constructing a DAG
• Determine if there is a node labeled op,
whose left child is node(y) and its right child
is node(z). If not, create such a node. Let n
be the node found or created.
• Delete x from the list of attached identifiers
for node(x). Append x to the list of attached
identifiers for the node n and set node(x) to n
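The construction steps can be sketched as follows. This is an illustration for statements of the form x := y op z only, assuming each is given as a tuple (x, op, y, z); the names build_dag, node_of, and attached are hypothetical.

```python
def build_dag(block):
    """Return (nodes, node_of, attached) for a list of (x, op, y, z).

    nodes[i] is ('leaf', name) or (op, left_id, right_id).
    node_of maps a name to the node currently holding its value.
    attached maps an interior node id to its list of identifiers.
    """
    nodes, node_of, index, attached = [], {}, {}, {}

    def leaf(name):
        if name not in node_of:              # node(name) undefined: create a leaf
            nodes.append(("leaf", name))
            node_of[name] = len(nodes) - 1
        return node_of[name]

    for x, op, y, z in block:
        key = (op, leaf(y), leaf(z))
        if key not in index:                 # no matching node: create one
            nodes.append(key)
            index[key] = len(nodes) - 1
        n = index[key]
        old = node_of.get(x)
        if old is not None and x in attached.get(old, []):
            attached[old].remove(x)          # x no longer names its old node
        attached.setdefault(n, []).append(x)
        node_of[x] = n
    return nodes, node_of, attached
```

Because the lookup key uses the current nodes of y and z, a second 4 * i after an assignment to i creates a new node instead of reusing the old common subexpression.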
Reconstructing Quadruples
• Evaluate the interior nodes in topological order
• Assign the evaluated value to one of the node's attached identifiers x, preferring one whose value is needed outside the block
• If there is no attached identifier, create a new
temp to hold the value
• If there are additional attached identifiers y1,
y2, …, yk whose values are also needed outside
the block, add
y1 := x, y2 := x, …, yk := x
An Example
[Figure: the same DAG with only the identifiers prod and i attached]
(1) t1 := 4 * i
(2) t2 := a[t1]
(3) t3 := b[t1]
(4) t4 := t2 * t3
(5) prod := prod + t4
(6) i := i + 1
(7) if i <= 20 goto (1)
Arrays, Pointers, Procedure Calls
x := a[i]
a[j] := y
z := a[i]
=> range analysis
x := a[i]
z := x
a[j] := y
*p := w
=> aliasing analysis
side effects caused by procedure calls
=> inter-procedural analysis
Ordering Rules
• Any evaluation of or assignment to an
element of array a must follow the previous
assignment of that array if there is one
• Any assignment to an element of array a
must follow any previous evaluation of a
Ordering Rules
• Any use of any identifier must follow the
previous procedure call or indirect
assignment through a pointer if there is one
• Any procedure call or indirect assignment
through a pointer must follow all previous
evaluations of any identifier
Generating Code From DAGs
t1 := a + b
t2 := c + d
t3 := e - t2
t4 := t1 - t3
[Figure: DAG with root - (t4) over + (t1: a0, b0) and - (t3: e0, + (t2: c0, d0))]
(1) MOV a, R0
(2) ADD b, R0
(3) MOV c, R1
(4) ADD d, R1
(5) MOV R0, t1
(6) MOV e, R0
(7) SUB R1, R0
(8) MOV t1, R1
(9) SUB R0, R1
(10) MOV R1, t4
Rearranging the Order
t2 := c + d
t3 := e - t2
t1 := a + b
t4 := t1 - t3
[Figure: DAG with root - (t4) over + (t1: a0, b0) and - (t3: e0, + (t2: c0, d0))]
(1) MOV c, R0
(2) ADD d, R0
(3) MOV e, R1
(4) SUB R0, R1
(5) MOV a, R0
(6) ADD b, R0
(7) SUB R1, R0
(8) MOV R0, t4
A Heuristic Ordering for DAG
• Attempt as far as possible to make the evaluation of a node immediately follow the evaluation of its leftmost argument
Node Listing Algorithm
while unlisted interior nodes remain do begin
    select an unlisted node n, all of whose parents have been listed;
    list n;
    while the leftmost child m of n has no unlisted parents and is not a leaf do begin
        list m;
        n := m;
    end
end
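A runnable version of the node listing loop might look like this. It is a sketch under two assumptions: the DAG is given as a dictionary from each interior node to its ordered list of children (anything that is not a key is a leaf), and when several nodes are eligible the first in dictionary order is selected. The name node_listing is illustrative.

```python
def node_listing(children):
    """Return the interior nodes in listing order.

    Code is then generated for the nodes in the reverse of this order.
    """
    parents = {}
    for node, kids in children.items():
        for kid in kids:
            parents.setdefault(kid, set()).add(node)
    listed = []

    def ready(node):
        # True when every parent of node has already been listed.
        return all(p in listed for p in parents.get(node, ()))

    while len(listed) < len(children):
        n = next(m for m in children if m not in listed and ready(m))
        listed.append(n)
        while True:
            m = children[n][0]                       # leftmost child of n
            if m not in children or m in listed or not ready(m):
                break
            listed.append(m)
            n = m
    return listed
```

The inner loop is what implements the heuristic: a node is listed right after its parent whenever it is that parent's leftmost child and nothing else still needs it.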
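An Example
[Figure: DAG with interior nodes numbered 1 to 7: 1 = * (2, 3), 2 = + (6, 4), 3 = - (4, e0), 4 = * (5, 7), 5 = - (6, c0), 6 = + (a0, b0), 7 = + (d0, e0)]
t7 := d + e
t6 := a + b
t5 := t6 - c
t4 := t5 * t7
t3 := t4 - e
t2 := t6 + t4
t1 := t2 * t3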
Generating Code From Trees
• There exists an algorithm that determines
the optimal order in which to evaluate
statements in a block when the dag
representation of the block is a tree
• Optimal order here means the order that
yields the shortest instruction sequence
Optimal Ordering for Trees
• Label each node of the tree bottom-up with an integer denoting the fewest number of registers required to evaluate the tree with no stores of intermediate results
• Generate code during a tree traversal by first evaluating the operand requiring more registers
The Labeling Algorithm
if n is a leaf then
    if n is the leftmost child of its parent then
        label(n) := 1
    else
        label(n) := 0
else begin
    let n1, n2, …, nk be the children of n ordered by label so that label(n1) >= label(n2) >= … >= label(nk);
    label(n) := max over 1 <= i <= k of (label(ni) + i - 1)
end
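The labeling rule can be written as a short recursive function. This sketch assumes a tree is either a leaf name (a string) or a tuple (op, child1, ..., childk); the function name label mirrors the slides, while the tuple encoding is an assumption.

```python
def label(tree, is_leftmost=True):
    """Fewest registers needed to evaluate tree without storing intermediates."""
    if isinstance(tree, str):                      # leaf
        return 1 if is_leftmost else 0
    # Child labels in non-increasing order: label(n1) >= ... >= label(nk).
    labels = sorted(
        (label(child, i == 0) for i, child in enumerate(tree[1:])),
        reverse=True,
    )
    # With 0-based i this is max(label(n_i) + i - 1) from the slide.
    return max(l + i for i, l in enumerate(labels))

t4 = ("-", ("+", "a", "b"), ("-", "e", ("+", "c", "d")))
```

For the tree t4 = (a + b) - (e - (c + d)) the result is 2, so two registers suffice.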
An Example
For binary interior nodes:
label(n) = max(l1, l2), if l1 != l2
label(n) = l1 + 1, if l1 = l2
[Figure: labeled tree: t4 (2) over t1 (1) and t3 (2); t1 over a (1) and b (0); t3 over e (1) and t2 (1); t2 over c (1) and d (0)]
Code Generation From a
Labeled Tree
• Use a stack rstack to allocate registers R0,
R1, …, R(r-1)
• The value of a tree is always computed in
the top register on rstack
• The function swap(rstack) interchanges the
top two registers on rstack
• Use a stack tstack to allocate temporary
memory locations T0, T1, ...
Cases Analysis
• Case 1: the right child n2 is a leaf (name)
• Case 2: label(n1) < label(n2)
• Case 3: label(n2) <= label(n1)
• Case 4: both labels >= r
[Figure: four trees rooted at op with children n1 and n2, one per case]
The Function gencode
procedure gencode(n);
begin
    if n is a left leaf representing operand name and n is the leftmost child of its parent then
        print 'MOV' || name || ',' || top(rstack)
    else if n is an interior node with operator op, left child n1, and right child n2 then
        if label(n2) = 0 then /* case 1 */
        else if 1 <= label(n1) < label(n2) and label(n1) < r then /* case 2 */
        else if 1 <= label(n2) <= label(n1) and label(n2) < r then /* case 3 */
        else /* case 4, both labels >= r */
end
The Function gencode
/* case 1 */
begin
let name be the operand represented by n2;
gencode(n1);
print op || name || ',' || top(rstack)
end
/* case 2 */
begin
swap(rstack); gencode(n2);
R := pop(rstack); gencode(n1);
print op || R || ',' || top(rstack);
push(rstack, R); swap(rstack);
end
The Function gencode
/* case 3 */
begin
gencode(n1);
R := pop(rstack); gencode(n2);
print op || top(rstack) || ',' || R;
push(rstack, R);
end
/* case 4 */
begin
gencode(n2); T := pop(tstack);
print 'MOV' || top(rstack) || ',' || T;
gencode(n1); push(tstack, T);
print op || T || ',' || top(rstack);
end
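The four cases can be combined into one runnable sketch. Assumptions: trees are leaf names or tuples (op, left, right), the opcode names in OPS and the pool of eight temporaries are illustrative, and case 3 emits op top(rstack), R so that the result lands in R, which is what the worked example for (a + b) - (e - (c + d)) requires (SUB R0, R1).

```python
OPS = {"+": "ADD", "-": "SUB", "*": "MUL", "/": "DIV"}   # assumed opcode names

def label(t, leftmost=True):
    if isinstance(t, str):
        return 1 if leftmost else 0
    l1, l2 = label(t[1], True), label(t[2], False)
    return l1 + 1 if l1 == l2 else max(l1, l2)

def generate(tree, r=2):
    rstack = [f"R{i}" for i in reversed(range(r))]   # top of stack is rstack[-1]
    tstack = [f"T{i}" for i in reversed(range(8))]   # assumed: 8 temporaries suffice
    out = []

    def swap():
        rstack[-1], rstack[-2] = rstack[-2], rstack[-1]

    def gencode(n):
        if isinstance(n, str):                       # case 0: only left leaves get here
            out.append(f"MOV {n}, {rstack[-1]}")
            return
        op, n1, n2 = OPS[n[0]], n[1], n[2]
        l1, l2 = label(n1, True), label(n2, False)
        if l2 == 0:                                  # case 1: right child is a leaf
            gencode(n1)
            out.append(f"{op} {n2}, {rstack[-1]}")
        elif 1 <= l1 < l2 and l1 < r:                # case 2: right subtree first
            swap(); gencode(n2)
            reg = rstack.pop(); gencode(n1)
            out.append(f"{op} {reg}, {rstack[-1]}")
            rstack.append(reg); swap()
        elif 1 <= l2 <= l1 and l2 < r:               # case 3: left subtree first
            gencode(n1)
            reg = rstack.pop(); gencode(n2)
            out.append(f"{op} {rstack[-1]}, {reg}")
            rstack.append(reg)
        else:                                        # case 4: both labels >= r
            gencode(n2); temp = tstack.pop()
            out.append(f"MOV {rstack[-1]}, {temp}")
            gencode(n1); tstack.append(temp)
            out.append(f"{op} {temp}, {rstack[-1]}")

    gencode(tree)
    return out
```

Calling generate on the tree for t4 with r = 2 yields the same seven instructions as the traced example, ending with SUB R1, R0.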
An Example
[Figure: labeled tree t4 = t1 - t3 (label 2), t1 = a + b (label 1), t3 = e - t2 (label 2), t2 = c + d (label 1); leaf labels a 1, b 0, e 1, c 1, d 0]
gencode(t4) [R1, R0] /* 2 */
    gencode(t3) [R0, R1] /* 3 */
        gencode(e) [R0, R1] /* 0 */
            print MOV e, R1
        gencode(t2) [R0] /* 1 */
            gencode(c) [R0] /* 0 */
                print MOV c, R0
            print ADD d, R0
        print SUB R0, R1
    gencode(t1) [R0] /* 1 */
        gencode(a) [R0] /* 0 */
            print MOV a, R0
        print ADD b, R0
    print SUB R1, R0
Multiregister Operations
• Some operations like multiplication,
division, or a function call normally require
more than one register
• The labeling algorithm needs to ensure that
label(n) is always at least the number of
registers required by the operation
label(n) = max(2, l1, l2), if l1 != l2
label(n) = l1 + 1, if l1 = l2
Algebraic Properties
[Figure: trees illustrating how commutativity (swapping the operands of +) and associativity (regrouping a chain T1 + T2 + T3 + T4 as Ti1, Ti2, Ti3, Ti4 with the largest subtree first) can lower the labels]
Common Subexpressions
• Nodes with more than one parent in a dag are
called shared nodes
• Optimal code generation for dags is NP-complete both on a one-register machine and on a machine with an unlimited number of registers
Partitioning a DAG into Trees
• Partition a dag into a set of trees by finding
for each root and shared node n, the
maximal subtree with n as root that includes
no other shared nodes, except as leaves
• Determine a code generation ordering for
the trees
• Generate code for each tree using the
algorithms for generating code from trees
An Example
[Figure: a dag with interior nodes 1 to 7 over the leaves a0, b0, c0, d0, e0, partitioned into trees rooted at the root and at the shared nodes]
Dynamic Programming Code
Generation
• The dynamic programming algorithm applies to
a broad class of register machines with complex
instruction sets
• The machine has r interchangeable registers
• The machine has instructions of the form
Ri = E
where E is any expression containing operators, registers, and memory locations. If E involves registers, then Ri must be one of them
Dynamic Programming
• The dynamic programming algorithm partitions the problem of generating optimal code for an expression into sub-problems of generating optimal code for the subexpressions of the given expression
[Figure: tree + with subtrees T1 and T2]
Contiguous Evaluation
• We say a program P evaluates a tree T
contiguously if
• it first evaluates those subtrees of T that
need to be computed into memory
• it then evaluates the subtrees of the root in
either order
• it finally evaluates the root
Optimally Contiguous Program
• For the machines defined above, given any
program P to evaluate an expression tree T, we
can find an equivalent program P' such that
– P' is of no higher cost than P
– P' uses no more registers than P
– P' evaluates the tree in a contiguous fashion
• This implies that every expression tree can be
evaluated optimally by a contiguous program
Dynamic Programming Algorithm
• Phase 1: compute bottom-up for each node
n of the expression tree T an array C of
costs, in which the ith component C[i] is the
optimal cost of computing the subtree S
rooted at n into a register, assuming i
registers are available for the computation.
C[0] is the optimal cost of computing the
subtree S into memory
Dynamic Programming Algorithm
• To compute C[i] at node n, consider each
machine instruction R := E whose
expression E matches the subexpression
rooted at node n
• Determine the costs of evaluating the
operands of E by examining the cost vectors
at the corresponding descendants of n
Dynamic Programming Algorithm
• For those operands of E that are registers,
consider all possible orders in which the
corresponding subtrees of T can be evaluated
into registers
• In each ordering, the first subtree
corresponding to a register operand can be
evaluated using i available registers, the
second using i-1 registers, and so on
Dynamic Programming Algorithm
• For node n, add in the cost of the instruction
R := E that was used to match node n
• The value C[i] is then the minimum cost
over all possible orders
• At each node, store the instruction used to
achieve the best cost for C[i] for each i
• The smallest cost in the vector gives the
minimum cost of evaluating T
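Phase 1 can be sketched for one concrete machine. The assumptions here are stated plainly: every instruction costs 1, the instructions are Ri := Mj, Mi := Ri, Ri := Ri op Rj, and Ri := Ri op Mj, trees are leaf names or tuples (op, left, right), and the stored best instruction per entry is omitted for brevity. The name cost_vector is illustrative.

```python
def cost_vector(tree, r=2):
    """Return [C[0], ..., C[r]] for the subtree.

    C[0] is the cost of computing the subtree into memory; C[i] is the
    cost of computing it into a register with i registers available.
    """
    if isinstance(tree, str):
        return [0] + [1] * r                      # in memory already; one load otherwise
    _, left, right = tree
    cl, cr = cost_vector(left, r), cost_vector(right, r)
    c = [None] * (r + 1)
    for i in range(1, r + 1):
        options = [cl[i] + cr[0] + 1]             # Ri := Ri op Mj
        if i >= 2:                                # Ri := Ri op Rj, both orders
            options.append(cl[i] + cr[i - 1] + 1) # left first, then right
            options.append(cr[i] + cl[i - 1] + 1) # right first, then left
        c[i] = min(options)
    c[0] = min(c[1:]) + 1                         # compute into a register, then store
    return c

tree = ("+", ("-", "a", "b"), ("*", "c", ("/", "d", "e")))
```

The subtree evaluated second gets one register fewer, which is why the two orders for Ri := Ri op Rj can differ in cost.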
Dynamic Programming Algorithm
• Phase 2: traverse T and use the cost vectors
to determine which subtrees of T must be
computed into memory
• Phase 3: traverse T and use the cost vectors
and associated instructions to generate the
final target code
An Example
Consider a machine with two registers R0 and R1 and instructions
Ri := Mj
Mi := Ri
Ri := Rj
Ri := Ri op Rj
Ri := Ri op Mj
[Figure: expression tree annotated with cost vectors (C[0], C[1], C[2])]
+ (8, 8, 7)
    - (3, 2, 2)
        a (0, 1, 1)
        b (0, 1, 1)
    * (5, 5, 4)
        c (0, 1, 1)
        / (3, 2, 2)
            d (0, 1, 1)
            e (0, 1, 1)
An Example
[Figure: the expression tree (a - b) + c * (d / e) with root cost vector (8, 8, 7)]
R0 := c
R1 := d
R1 := R1 / e
R0 := R0 * R1
R1 := a
R1 := R1 - b
R1 := R1 + R0
Code Generator Generators
• A tool to automatically construct the instruction selection phase of a code generator
• Such tools may use tree grammars or context
free grammars to describe the target machines
• Register allocation will be implemented as a
separate mechanism
• Graph coloring is one of the approaches for
register allocation
Tree Rewriting
a[i] := b + 1
[Figure: tree for the statement; in prefix form: := ind + + consta regsp ind + consti regsp + memb const1]
Tree Rewriting
• The code is generated by reducing the input
tree into a single node using a sequence of
tree-rewriting rules
• Each tree-rewriting rule is of the form
replacement <- template { action }
– replacement is a single node
– template is a tree
– action is a code fragment
• A set of tree-rewriting rules is called a tree-translation scheme
An Example
regi <- +(regi, regj) { ADD Rj, Ri }
Each tree template represents a computation performed by the sequence of machine instructions emitted by the associated action
Tree Rewriting Rules
(1) regi <- constc { MOV #c, Ri }
(2) regi <- mema { MOV a, Ri }
(3) mem <- :=(mema, regi) { MOV Ri, a }
(4) mem <- :=(ind(regi), regj) { MOV Rj, *Ri }
(5) regi <- ind(+(constc, regj)) { MOV c(Rj), Ri }
(6) regi <- +(regi, ind(+(constc, regj))) { ADD c(Rj), Ri }
(7) regi <- +(regi, regj) { ADD Rj, Ri }
(8) regi <- +(regi, const1) { INC Ri }
An Example
Input tree in prefix form: := ind + + consta regsp ind + consti regsp + memb const1
(1) consta => reg0 { MOV #a, R0 }
(7) + reg0 regsp => reg0 { ADD SP, R0 }
(6) + reg0 ind + consti regsp => reg0 { ADD i(SP), R0 }
    (rule (5) also matches ind + consti regsp, giving { MOV i(SP), R1 })
(2) memb => reg1 { MOV b, R1 }
(8) + reg1 const1 => reg1 { INC R1 }
(4) := ind reg0 reg1 => mem { MOV R1, *R0 }
Tree Pattern Matching
• The tree pattern matching algorithm can be implemented by extending the multiple-keyword pattern matching algorithm
• Each tree template is represented by a set of strings, each of which represents a path from the root to a leaf
• Each rule is associated with cost information
• The dynamic programming algorithm can be used to select an optimal sequence of matches
Semantic Predicates
regi <- +(regi, constc)
{ if c = 1 then
      INC Ri
  else
      ADD #c, Ri }
The general use of semantic actions and predicates can provide greater flexibility and ease of description than a purely grammatical specification
Pattern Matching by Parsing
• Use an LR parser to do the pattern matching
• The input tree can be treated as a string by
using its prefix representation
:= ind + + consta regsp ind +
consti regsp + memb const1
• The tree-translation scheme can be
converted into a syntax-directed translation
scheme by replacing the tree templates with
their prefix representations
Syntax-Directed Translation Scheme
(1) regi <- constc { MOV #c, Ri }
(2) regi <- mema { MOV a, Ri }
(3) mem <- := mema regi { MOV Ri, a }
(4) mem <- := ind regi regj { MOV Rj, *Ri }
(5) regi <- ind + constc regj { MOV c(Rj), Ri }
(6) regi <- + regi ind + constc regj { ADD c(Rj), Ri }
(7) regi <- + regi regj { ADD Rj, Ri }
(8) regi <- + regi const1 { INC Ri }
Advantages of Syntax-Directed
Translation Scheme
• The parsing method is efficient and well
understood
• It is relatively easy to retarget the code
generator
• The code generator can be made more
efficient by adding special-case productions
Disadvantages of Syntax-Directed Translation Scheme
• A left-to-right order of evaluation is fixed
• The machine description grammar can become inordinately large
• The context-free grammar is usually highly ambiguous
Graph Coloring
• In the first pass, target machine instructions
are selected as though there were an infinite
number of symbolic registers
• In the second pass, physical registers are
assigned to symbolic registers using graph
coloring algorithms
• During the second pass, if a register is
needed when all available registers are used,
some of the used registers must be spilled
Interference Graph
• For each procedure, a register-interference
graph is constructed
• The nodes in the graph are symbolic
registers
• An edge connects two nodes if one is live at
a point where the other is defined
K-Colorable Graphs
• A graph is said to be k-colorable if each
node can be assigned one of the k colors
such that no two adjacent nodes have the
same color
• A color represents a register
• The problem of determining whether a
graph is k-colorable is NP-complete
A Graph Coloring Algorithm
• Remove a node n and its edges if it has fewer
than k neighbors
• Repeat the removing step above until we end
up with the empty graph or a graph in which
each node has k or more adjacent nodes
• In the latter case, a node is selected and spilled
by deleting that node and its edges, and the
removing step above continues
A Graph Coloring Algorithm
• The nodes in the graph can be colored in the
reverse order in which they are removed
• Each node can be assigned a color not
assigned to any of its neighbors
• Spilled nodes can be assigned any color
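The remove-then-color procedure can be sketched as below. Assumptions: the interference graph is a dictionary from node to a set of neighbors (symmetric), colors are the integers 0 to k-1, the spill candidate is simply the node with the most remaining neighbors, and spilled nodes are reported separately instead of being given a color. The name color_graph is illustrative.

```python
def color_graph(graph, k):
    """Return (coloring, spilled) for a symmetric adjacency dictionary."""
    work = {n: set(adj) for n, adj in graph.items()}
    stack, spilled = [], []
    while work:
        # Removing step: take any node with fewer than k neighbors.
        node = next((n for n in work if len(work[n]) < k), None)
        if node is None:
            # Every remaining node has k or more neighbors: spill one.
            node = max(work, key=lambda n: len(work[n]))
            spilled.append(node)
        else:
            stack.append(node)
        for other in work.pop(node):
            work[other].discard(node)
    coloring = {}
    for node in reversed(stack):                   # reverse order of removal
        used = {coloring[m] for m in graph[node] if m in coloring}
        coloring[node] = next(c for c in range(k) if c not in used)
    return coloring, spilled
```

A free color always exists in the second loop, because each node had fewer than k neighbors left when it was removed and only those neighbors are colored before it.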
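An Example
[Figure: an interference graph with nodes 1 to 5, shown shrinking as nodes are removed one at a time]
An Example
[Figure: the same graph with the nodes colored R, G, B in reverse order of removal]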