Intermediate Code Generation - Welcome ^^! Prof. Won

http://plac.dongguk.ac.kr
Introduction to Compilers
Chapter 10
Intermediate Code Generation
Contents

Introduction

Syntax-Directed Translation

Code Generation

U-Code Translator
Intermediate Code Generation
2
http://plac.dongguk.ac.kr
Introduction

Formal Specification
  lexical structure : regular expressions
  syntactic structure : context-free grammars
  the remaining phases of compilation : no such notation

  Instead, we use a syntax-directed translation scheme (SDTS), a
  method that associates semantic rules (or actions) with productions.
SDTS ::= cfg + semantic actions
  The semantic actions are written directly in terms of the grammar
  symbols that appear in the production rules of the cfg.
  AST generation
  Attribute grammar

Intermediate code generation



the phase that generates an explicit intermediate code for the
source program, after syntax and semantic analysis.
A Model for Intermediate code generation
  Source Program -> Scanner -> Parser -> Intermediate Representation -> Intermediate Code Generation
Our implementations:
  Source program              : Mini C program
  Intermediate Representation : Abstract Syntax Tree (AST)
  Intermediate code           : U-Code
  Execution                   : U-Code Interpreter

Implementation Model
Mini C Compiler:
  Source Program (*.mc) -> Scanner -> Token Stream -> Parser -> SDT -> AST -> ICG -> Ucode (*.uco)
  Ucode (*.uco) + data -> Ucode Interpreter -> result

  scanner : shift action of parser
  parser  : main program (LR parser)
  SDT     : reduce action of parser (AST generation)
  ICG     : intermediate code generation by traversing the AST
※ The design of the AST is very important for handling semantic analysis
  and intermediate code generation efficiently.
Syntax-Directed Translation

Syntax-Directed Translation Scheme (SDTS)
  ::= production rules + semantic actions (no widely accepted formalism)
A Model of SDTS
  Source Program -> Scanner --token--> Parser (main, driven by the parsing table) -> Result
  The parser calls get_token on the scanner and calls the semantic actions.

Semantic Actions
  Whenever a reduction takes place, the semantic rule corresponding to
  the applied syntactic rule is activated.
▶ Advantages of SDT


Provides a way to describe semantic rules that is independent of
any particular implementation.
Easy to modify - new productions and semantic actions can be
added without disturbing the existing ones.
▶ Disadvantages of SDT


If an error occurs during parsing, all semantic actions performed so
far become meaningless.
Because the input is processed in one pass in a syntax-directed manner,
the needed information is sometimes not yet available; when it appears
later, it must be handled with more complex techniques such as
backpatching.
Solution
During syntax-directed translation, build only the information needed
for semantic analysis and code generation, then perform semantic
analysis and code generation in the next phase using that information.
Description of Semantic Actions

SDTS(Syntax-Directed Translation Scheme)
::= production rules + semantic actions

Description of Semantic Actions
(1) Conventional PL.
(2) Meta Language - Formal Semantic Language(FSL)
  Semantic Description -> Preprocessor -> table
  Input -> SDT (driven by the table) -> Output

Semantic Description using the Attributes of Grammar Symbol
::= We associate information with a programming language construct by
attaching attributes to the grammar symbols representing the
construct. Values for attributes are computed by "semantic rules"
associated with the grammar productions.

An attribute of symbol
::= A value associated with a grammar symbol. Each grammar symbol
has an associated set of attributes. An attribute can represent
anything we choose: a string, a number, a type, a memory location,
or whatever.
ex)
Production        Semantic Rules
L → E $           print(E.val)
E → E1 + T        E.val := E1.val + T.val
E → T             E.val := T.val
T → T1 * F        T.val := T1.val * F.val
T → F             T.val := F.val
F → ( E )         F.val := E.val
F → digit         F.val := digit.lexval

Synthesized attribute
::= the value of the attribute of the nonterminal on the left side of
the production is defined as a function of the grammar symbols
on the right side.
ex) A → XYZ    A := F(X,Y,Z)

Inherited attribute
::= the value of the attribute of a nonterminal on the right side of
the production is defined in terms of an attribute of the
nonterminal on the left.
ex) A → XYZ    Y.val := 2 * X.val
※ Synthesized attributes are more natural than inherited attributes for
mapping most programming language constructs into
intermediate code.
Implementation of SDT
▶ Designing steps




Input design - design the grammar for the language constructs as a cfg.
Write the scanner and the parser.
Semantic Specification - conventional PL (SDT).
Complete the translator - interconnection.
▶ Examples : 1. Desk Calculator
2. Conversion infix into postfix
3. Construction of AST
1. Desk Calculator
(1) Input design
0. S -> E $
1. E -> E + E
2. E -> E * E
3. E -> ( E )
4. E -> num
(2) Parsing table
state | num   +     *     (     )     $    | E
  0   | s3                s2               | 1
  1   |       s4    s5                acc  |
  2   | s3                s2               | 6
  3   |       r4    r4          r4    r4   |
  4   | s3                s2               | 7
  5   | s3                s2               | 8
  6   |       s4    s5          s9         |
  7   |       r1    s5          r1    r1   |
  8   |       r2    r2          r2    r2   |
  9   |       r3    r3          r3    r3   |
(3) Semantic Specification
Production        Semantic Rules
L → E $           print E.val
E → E1 + E2       E.val := E1.val + E2.val
E → E1 * E2       E.val := E1.val * E2.val
E → ( E1 )        E.val := E1.val
E → num           E.val := num.lexval
(4) Implementation of Desk Calculator

Parsing stack : Symbol stack + State stack + Value stack
Value stack : holds the values of the corresponding attributes.
Production        Code Fragment
S → E $           print (val[top])
E → E + E         val[top-2] := val[top-2] + val[top]
E → E * E         val[top-2] := val[top-2] * val[top]
E → ( E )         val[top-2] := val[top-1]
E → num           val[top] := num.lexval

The code fragments do not show how the variable top is managed.
lexval : token value
The code fragments are executed before a reduction takes place.
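The code fragments above can be sketched in C as follows. This is only an illustration under stated assumptions: the names shift_slot, reduce, and replay_example are hypothetical, and the way top is adjusted after each fragment is an assumption, since the slide explicitly leaves the management of top open. The sketch replays the reduction sequence of the input 23 * 5 + 4 rather than driving a real LR parser.

```c
#include <assert.h>

#define STACK_MAX 64

static int val[STACK_MAX];  /* value stack, kept parallel to the parsing stack */
static int top = -1;        /* index of the top entry; -1 means empty */

/* Shift action: push one slot. The slot is only a placeholder until a
   reduction writes an attribute value into it. */
void shift_slot(void)
{
    assert(top + 1 < STACK_MAX);
    val[++top] = 0;
}

/* Reduce action for rules 1..4 of the desk-calculator grammar.
   lexval is used only by rule 4 (E -> num). After the code fragment
   runs, the right-hand side is popped so that one slot (E) remains.
   (Assumed handling of top; the slide does not specify it.) */
void reduce(int rule, int lexval)
{
    switch (rule) {
    case 1: /* E -> E + E */
        assert(top >= 2);
        val[top - 2] = val[top - 2] + val[top];
        top -= 2;
        break;
    case 2: /* E -> E * E */
        assert(top >= 2);
        val[top - 2] = val[top - 2] * val[top];
        top -= 2;
        break;
    case 3: /* E -> ( E ) */
        assert(top >= 2);
        val[top - 2] = val[top - 1];
        top -= 2;
        break;
    case 4: /* E -> num */
        assert(top >= 0);
        val[top] = lexval;
        break;
    default:
        assert(0 && "unknown rule");
    }
}

/* Replays the actions s3 r4 s5 s3 r4 r2 s4 s3 r4 r1 for 23 * 5 + 4. */
int replay_example(void)
{
    top = -1;
    shift_slot(); reduce(4, 23);  /* s3, r4 */
    shift_slot();                 /* s5: '*' */
    shift_slot(); reduce(4, 5);   /* s3, r4 */
    reduce(2, 0);                 /* r2: 23 * 5 */
    shift_slot();                 /* s4: '+' */
    shift_slot(); reduce(4, 4);   /* s3, r4 */
    reduce(1, 0);                 /* r1: 115 + 4 */
    return val[top];
}
```

Calling replay_example() leaves a single entry on the value stack, which holds the value of the whole expression.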
ex) 23 * 5 + 4 $
action    state stack   input          symbol stack   value stack   parse
          (0            23 * 5 + 4 $
s3  ===>  (0 3          * 5 + 4 $      num            _
r4  ===>  (0 1          * 5 + 4 $      E              23            4
s5  ===>  (0 1 5        5 + 4 $        E *            23 _          4
s3  ===>  (0 1 5 3      + 4 $          E * num        23 _ _        4
r4  ===>  (0 1 5 8      + 4 $          E * E          23 _ 5        4 4
r2  ===>  (0 1          + 4 $          E              115           4 4 2
s4  ===>  (0 1 4        4 $            E +            115 _         4 4 2
s3  ===>  (0 1 4 3      $              E + num        115 _ _       4 4 2
r4  ===>  (0 1 4 7      $              E + E          115 _ 4       4 4 2 4
r1  ===>  (0 1          $              E              119           4 4 2 4 1
accept
2. Conversion infix into postfix
Production        Code Fragment
E → E + E         print '+'
E → E * E         print '*'
E → E / E         print '/'
E → ( E )
E → a             print 'a'
ex) a + (a + a) * a  ==>  a a a + a * +
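A minimal sketch of the same translation in C is given below. It does not use the LR parser of the slides; as a swapped-in technique it uses a small precedence-climbing recursive-descent parser, and it prints an operator at the point where the corresponding production E → E op E would be reduced. The function names (to_postfix, expr, primary, emit) and the precedence assignment (* and / bind tighter than +, all left-associative) are assumptions chosen to match the example output.

```c
#include <assert.h>
#include <string.h>

static const char *src;  /* remaining input */
static char out[128];    /* postfix output buffer */
static size_t outLen;

static void emit(char c)
{
    assert(outLen + 1 < sizeof out);
    out[outLen++] = c;
    out[outLen] = '\0';
}

static void skipBlanks(void) { while (*src == ' ') src++; }

/* Assumed precedences: 0 means "not a binary operator". */
static int prec(char op)
{
    if (op == '+') return 1;
    if (op == '*' || op == '/') return 2;
    return 0;
}

static void expr(int minPrec);

static void primary(void)
{
    skipBlanks();
    if (*src == '(') {              /* E -> ( E ) : no output */
        src++;
        expr(1);
        skipBlanks();
        assert(*src == ')');
        src++;
    } else {                        /* E -> a : print 'a' */
        assert(*src == 'a');
        emit(*src++);
    }
}

static void expr(int minPrec)
{
    primary();
    for (;;) {
        skipBlanks();
        char op = *src;
        int p = prec(op);
        if (p == 0 || p < minPrec) return;
        src++;
        expr(p + 1);                /* parse the right operand first */
        emit(op);                   /* E -> E op E : print op */
    }
}

const char *to_postfix(const char *infix)
{
    src = infix;
    outLen = 0;
    out[0] = '\0';
    expr(1);
    skipBlanks();
    assert(*src == '\0');
    return out;
}
```

For example, to_postfix("a + (a + a) * a") yields the string shown on the slide.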
3. Construction of AST

AST is a condensed form of parse tree useful for representing
language constructs.
ex) a = b + 1;
      =
     / \
    a   +
       / \
      b   1
ex) if (a > b) x = a; else x = b;
           if
        /   |   \
       >    =    =
      / \  / \  / \
     a   b x  a x  b

Functions to create the nodes of AST for expressions with binary
operators. Each function returns a pointer to a newly created node.
1. mktree(op,left,right) creates an operator node with label op and
two fields containing pointers to left and right.
2. mknode(a) creates a terminal node for a and returns the node
pointer.

Semantic Specification
Production        Semantic Rules
E → E1 + T        E.nptr := mktree('+', E1.nptr, T.nptr)
E → E1 - T        E.nptr := mktree('-', E1.nptr, T.nptr)
E → T             E.nptr := T.nptr
T → ( E )         T.nptr := E.nptr
T → a             T.nptr := mknode(a)
The synthesized attribute nptr for E and T keeps track of the pointers
returned by the function calls.
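As an illustration, mktree and mknode might look like the following in C. The node layout (AstNode with a label and two child pointers) and the helper show are assumptions made for this sketch; they are not the Node structure used later for the Mini C AST. Memory is never freed here, which is acceptable only for a short demonstration.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical binary AST node for this sketch. */
typedef struct AstNode {
    char label[16];
    struct AstNode *left;
    struct AstNode *right;
} AstNode;

/* Creates an operator node labeled op with the given children. */
AstNode *mktree(const char *op, AstNode *left, AstNode *right)
{
    AstNode *n = calloc(1, sizeof *n);   /* zeroed, so label stays terminated */
    assert(n != NULL);
    strncpy(n->label, op, sizeof n->label - 1);
    n->left = left;
    n->right = right;
    return n;
}

/* Creates a terminal (leaf) node for a. */
AstNode *mknode(const char *a)
{
    return mktree(a, NULL, NULL);
}

static void append(char *buf, size_t cap, const char *s)
{
    strncat(buf, s, cap - strlen(buf) - 1);
}

/* Appends a prefix (Lisp-like) rendering of t to buf, for inspection. */
void show(const AstNode *t, char *buf, size_t cap)
{
    if (t->left == NULL && t->right == NULL) {
        append(buf, cap, t->label);
        return;
    }
    append(buf, cap, "(");
    append(buf, cap, t->label);
    append(buf, cap, " ");
    show(t->left, buf, cap);
    append(buf, cap, " ");
    show(t->right, buf, cap);
    append(buf, cap, ")");
}
```

Building the tree for a - 4 + c bottom-up, in the order the reductions would occur, is then mktree("+", mktree("-", mknode("a"), mknode("4")), mknode("c")).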

AST for a - 4 + c
                +
              /   \
             -     id --> to entry for c
           /   \
         id     num 4
          |
          to entry for a
Programming Assignment #4

Implement a syntax-directed translator that produces an
AST for a Mini C program.
  Mini C Program (*.mc) -> Scanner -> Parser -> SDT -> AST
Mini C Program : Perfect.mc (Text p. 421)
The output form of the AST using printtree() : Text pp. 417-418

AST design

Grammar form : production rule [ => node_name ] ;
Note : if "=> node_name" is omitted, no subtree is built.
  A -> α1 α2 … αn => node_name ;
  builds a node labeled node_name whose children are α1, α2, …, αn.
Mini C Grammar (Text pp. 408-411)
mini_c           -> translation_unit                 => PROGRAM;
translation_unit -> external_dcl;
                 -> translation_unit external_dcl;
external_dcl     -> function_def;
                 -> declaration;

Data Structures

A node form of AST
  node = [ token (tokenNumber, tokenValue) | noderep | son | brother ]
  son points to the first child node; brother points to the next sibling node.

Node structure
struct tokenType {
    int tokenNumber;                      // token number
    char * tokenValue;                    // token value
};
typedef struct nodeType {
    struct tokenType token;               // token kind
    enum {terminal, nonterm} noderep;     // node kind
    struct nodeType *son;                 // left link
    struct nodeType *brother;             // right link
} Node;

Production rule name
enum nodeNumber {
    ACTUAL_PARAM, ADD, ADD_ASSIGN, ARRAY_VAR, ASSIGN_OP,
    … , WHILE_ST
};
char *nodeName[] = {
    "ACTUAL_PARAM", "ADD", "ADD_ASSIGN", "ARRAY_VAR", "ASSIGN_OP",
    … "WHILE_ST"
};
int ruleName[] = {
    /*  0   1         2   3   4  */
        0,  PROGRAM,  0,  0,  0,
    …
    /* 95  96  97 */
        0,  0,  0
};

Shift action of parsing :

if the token is meaningful, then call buildNode.
Node *buildNode(struct tokenType token)
{
Node *ptr;
ptr = (Node *) malloc(sizeof(Node));
if (!ptr) { printf("malloc error in buildNode()\n");
exit(1);
}
ptr->token = token;
ptr->noderep = terminal;
ptr->son = ptr->brother = NULL;
return ptr;
}

Reduce action of parsing :

if the production rule is meaningful
1. build subtree
- linking brothers
- making a subtree
else
2. only linking brothers

Node *buildTree(int nodeNumber, int rhsLength)
{ //…
    i = sp - rhsLength + 1;
    // step 1: find a first index with node in value stack
    while (i <= sp && valueStack[i] == NULL) i++;          /* …… ① */
    if (!nodeNumber && i > sp) return NULL;                /* …… ② */
    start = i;
    // step 2: linking brothers
    while (i <= sp-1) {
        j = i + 1;
        while (j <= sp && valueStack[j] == NULL) j++;      /* …… ③ */
        if (j <= sp) {                                     /* …… ④ */
            ptr = valueStack[i];
            while (ptr->brother) ptr = ptr->brother;
            ptr->brother = valueStack[j];
        }
        i = j;                                             /* …… ⑤ */
    }
    first = (start > sp) ? NULL : valueStack[start];       /* …… ⑥ */
    // step 3: making subtree root and linking son
    if (nodeNumber) {                                      /* …… ⑦ */
        //… memory allocation for ptr
        ptr->token.tokenNumber = nodeNumber;
        ptr->token.tokenValue = NULL;
        ptr->noderep = nonterm;
        ptr->son = first;
        ptr->brother = NULL;
        return ptr;
    }
    else return first;
}

buildTree()
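① In the value stack, find the index at which a node hangs on the rhs of
the production currently being reduced. This is the first index of the
nodes to be linked as brothers.
② If the production is not a meaningful one and there is no brother node
to link, simply return.
③ Find the next index of a node to be linked as a brother, in the same
way as in ①.
④ If the next index is found, link the nodes as brothers.
⑤ Move the position forward in order to keep searching for the next index.
⑥ Store in first the pointer to the first of the linked brother nodes.
⑦ If the production is meaningful, create a nonterminal node, attach the
linked brother nodes as its son, and return the pointer to the new node.
If the production is not meaningful, return only the pointer to the
linked brother nodes.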

Parsing Stack and Value Stack

The Parsing Stack and the Value Stack operate in parallel.
  Parsing Stack : … …
  Value Stack   : … …
  SP indexes the top of both stacks.

Printing the information about AST

void printnode(NODE *pt, int indent) : print a node information
void printNode(Node *pt, int indent)
{
extern FILE * astFile; int i;
for (i=1; i<=indent; i++) fprintf(astFile," ");
if (pt->noderep == terminal) {
if (pt->token.number == tident)
fprintf(astFile," Terminal: %s", pt->token.value.id);
else if (pt->token.number == tnumber)
fprintf(astFile," Terminal: %d", pt->token.value.num);
}
else { // nonterminal node
int i;
i = (int) (pt->token.number);
fprintf(astFile," Nonterminal: %s", nodeName[i]);
}
fprintf(astFile,"\n");
}

void printtree(NODE *pt, int indent) : print an AST
void printTree(Node *pt, int indent)
{
Node *p = pt;
while (p != NULL) {
printNode(p, indent);
if (p->noderep == nonterm) printTree(p->son, indent+5);
p = p->brother;
}
}
Code Generation

A Model for ICG
Mini C Compiler:
  Source Program (*.mc) -> Scanner -> Token Stream -> Parser -> SDT -> AST -> ICG -> Ucode (*.uco)
  Ucode (*.uco) + data -> Ucode Interpreter -> result
Source language             : Mini C
Intermediate Representation : Abstract Syntax Tree (AST)
Intermediate code           : Ucode
Execution                   : Ucode Interpreter

Mini C Program




External declaration and function definition
Declaration
Expression
Statement
  return statement
  compound statement
  expression statement
  control statement - if, if-else, while
Function
Production rules : (Mini C)
mini_c           → translation_unit                 => PROGRAM;
translation_unit → external_dcl;
                 → translation_unit external_dcl;
external_dcl     → function_def;
                 → declaration;
function_def     → function_header compound_st      => FUNC_DEF;
declaration      → dcl_spec init_dcl_list ';'       => DCL;
…

Mini C Program AST
PROGRAM (root)
  DCL
    …
  …
  FUNC_DEF
    …
  …

Code Generating Routines
codeGen
processDeclaration
processSimpleVariable
processArrayVariable
processFuncHeader
processFunction
processStatement
processOperator
processCondition

codeGen
1. process external variables and function headers
2. process statements for function
3. generate starting code of U-Code interpreter
void codeGen(Node *ptr)
{
//…
// step 1: process the declaration part
for (p=ptr->son; p; p=p->brother) {
if (p->token.number == DCL) processDeclaration(p->son);
else if (p->token.number == FUNC_DEF) processFuncHeader(p->son);
else icg_error(3);
}
//…
// step 2: process the function part
for (p=ptr->son; p; p=p->brother)
if (p->token.number == FUNC_DEF) processFunction(p);
//…
// step 3: generate codes for starting routine
emit1(bgn, globalSize);
emit0(ldp);
emitJump(call, "main");
emit0(endop);
}

Declaration

Grammar
declaration     → dcl_spec init_dcl_list ';'            => DCL;
dcl_spec        → dcl_specifiers                        => DCL_SPEC;
dcl_specifiers  → dcl_specifier;
                → dcl_specifiers dcl_specifier;
dcl_specifier   → type_qualifier;
                → type_specifier;
type_qualifier  → 'const'                               => CONST_NODE;
type_specifier  → 'int'                                 => INT_NODE;
                → 'void'                                => VOID_NODE;
init_dcl_list   → init_declarator;
                → init_dcl_list ',' init_declarator;
init_declarator → declarator                            => DCL_ITEM;
                → declarator '=' '%number'              => DCL_ITEM;
declarator      → '%ident'                              => SIMPLE_VAR;
                → '%ident' '[' opt_number ']'           => ARRAY_VAR;
opt_number      → '%number';
                → ;

AST
DCL
DCL_SPEC

DCL_ITEM
…
DCL_ITEM
Process function code segment
void processDeclaration(Node *ptr)
{ //…
// step 1: process DCL_SPEC
//…
// step 2: process DCL_ITEM
while (p) {
q = p->son; // SIMPLE_VAR or ARRAY_VAR
switch (q->token.number) {
case SIMPLE_VAR:
// simple variable
processSimpleVariable(q, typeSpecifier, typeQualifier);
break;
case ARRAY_VAR:
// array variable
processArrayVariable(q, typeSpecifier, typeQualifier);
break;
default: printf("error in SIMPLE_VAR or ARRAY_VAR\n"); break;
} // end switch
p = p->brother;
} // end while
}

Expression - Assignment

Grammar
expression     → assignment_exp;
assignment_exp → logical_or_exp;
               → unary_exp '=' assignment_exp       => ASSIGN_OP;
               → unary_exp '+=' assignment_exp      => ADD_ASSIGN;
               → unary_exp '-=' assignment_exp      => SUB_ASSIGN;
               → unary_exp '*=' assignment_exp      => MUL_ASSIGN;
               → unary_exp '/=' assignment_exp      => DIV_ASSIGN;
               → unary_exp '%=' assignment_exp      => MOD_ASSIGN;
AST
  simple variable on the left-hand side:
    ASSIGN_OP
      IDENT
      exp
  array element on the left-hand side:
    ASSIGN_OP
      INDEX
        IDENT
        exp
      exp
 Process
function code segment
void processOperator(Node *ptr)
{
switch (ptr->token.number) {
// assignment operator
case ASSIGN_OP:
// ...
// step 1: generate instructions for left-hand side if
// step 2: generate instructions for right-hand side
// step 3: generate a store instruction
// complex assignment operators
case ADD_ASSIGN: case SUB_ASSIGN: case MUL_ASSIGN:
case DIV_ASSIGN: case MOD_ASSIGN:
// ...
// step 1: code generation for left hand side
// step 2: code generation for repeating part
// step 3: code generation for right hand side
// step 4: emit the corresponding operation code
// step 5: code generation for store code
// ...
} // end switch
}
 Example

program
  sum += 100;

AST
  ADD_ASSIGN
    sum
    100
U-Code
  lod 1 1    /* sum */
  ldc 100
  add
  str 1 1    /* sum */

Expression - Binary (arithmetic/relational/logical) operators
Grammar
logical_or_exp     → logical_and_exp;
                   → logical_or_exp '||' logical_and_exp        => LOGICAL_OR;
logical_and_exp    → equality_exp;
                   → logical_and_exp '&&' equality_exp          => LOGICAL_AND;
equality_exp       → relational_exp;
                   → equality_exp '==' relational_exp           => EQ;
                   → equality_exp '!=' relational_exp           => NE;
relational_exp     → additive_exp;
                   → relational_exp '>' additive_exp            => GT;
                   → relational_exp '<' additive_exp            => LT;
                   → relational_exp '>=' additive_exp           => GE;
                   → relational_exp '<=' additive_exp           => LE;
additive_exp       → multiplicative_exp;
                   → additive_exp '+' multiplicative_exp        => ADD;
                   → additive_exp '-' multiplicative_exp        => SUB;
multiplicative_exp → unary_exp;
                   → multiplicative_exp '*' unary_exp           => MUL;
                   → multiplicative_exp '/' unary_exp           => DIV;
                   → multiplicative_exp '%' unary_exp           => MOD;

Process function code segment
void processOperator(Node *ptr)
{
switch (ptr->token.number) {
//…
// binary(arithmetic/relational/logical) operators
case ADD: case SUB: case MUL: case DIV: case MOD:
case EQ: case NE: case GT: case LT: case GE: case LE:
case LOGICAL_AND: case LOGICAL_OR:
{
// step 1: visit left operand
if (lhs->noderep == nonterm) processOperator(lhs);
else rv_emit(lhs);
// step 2: visit right operand
if (rhs->noderep == nonterm) processOperator(rhs);
else rv_emit(rhs);
// step 3: visit root
switch (ptr->token.number) {
// arithmetic operators
// relational operators
// logical operators
}
}
// ...
} // end switch
}

Example

program
  init = 10;
  value = init + 20 * 2;

AST (the circled numbers give the order in which code is emitted)
  EXP_ST
    ASSIGN_OP ②
      init
      10 ①
  EXP_ST
    ASSIGN_OP ⑧
      value
      ADD ⑦
        init ③
        MUL ⑥
          20 ④
          2 ⑤
U-Code
  ldc 10             ⇐ ①
  str 1 1  ; init    ⇐ ②
  lod 1 1  ; init    ⇐ ③
  ldc 20             ⇐ ④
  ldc 2              ⇐ ⑤
  mult               ⇐ ⑥
  add                ⇐ ⑦
  str 1 2  ; value   ⇐ ⑧
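The emission order in this example is a postorder walk of the expression tree: left operand, right operand, then the operator, followed by a store for the assignment. The sketch below illustrates that walk on a simplified tree. The Expr type, gen_expr, gen_assign, and the string-based emit0/emit1/emit2 helpers are assumptions for illustration; the slides' own routines take opcode constants and work on the son/brother Node structure.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

typedef enum { NUM, VAR, ADD, MUL } Kind;

/* Hypothetical expression node for this sketch. */
typedef struct Expr {
    Kind kind;
    int value;                  /* NUM: literal value */
    int base, offset;           /* VAR: address of the variable */
    const struct Expr *lhs;     /* ADD, MUL: operands */
    const struct Expr *rhs;
} Expr;

static char code[256];          /* generated U-code text */

static void emit0(const char *op)
{
    size_t used = strlen(code);
    snprintf(code + used, sizeof code - used, "%s\n", op);
}

static void emit1(const char *op, int a)
{
    size_t used = strlen(code);
    snprintf(code + used, sizeof code - used, "%s %d\n", op, a);
}

static void emit2(const char *op, int a, int b)
{
    size_t used = strlen(code);
    snprintf(code + used, sizeof code - used, "%s %d %d\n", op, a, b);
}

/* Postorder traversal: operands first, operator last. */
void gen_expr(const Expr *e)
{
    switch (e->kind) {
    case NUM:
        emit1("ldc", e->value);
        break;
    case VAR:
        emit2("lod", e->base, e->offset);
        break;
    case ADD:
    case MUL:
        gen_expr(e->lhs);
        gen_expr(e->rhs);
        emit0(e->kind == ADD ? "add" : "mult");
        break;
    }
}

/* Simple assignment: evaluate the right-hand side, then store. */
void gen_assign(int base, int offset, const Expr *rhs)
{
    gen_expr(rhs);
    emit2("str", base, offset);
}
```

With init at (1, 1) and value at (1, 2), calling gen_assign(1, 2, tree) on the tree for init + 20 * 2 appends the lines corresponding to ③ through ⑧ to code.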

Expression - Unary operators

Grammar
unary_exp   → postfix_exp;
            → '-' unary_exp                            => UNARY_MINUS;
            → '!' unary_exp                            => LOGICAL_NOT;
            → '++' unary_exp                           => PRE_INC;
            → '--' unary_exp                           => PRE_DEC;
postfix_exp → primary_exp;
            → postfix_exp '[' expression ']'           => INDEX;
            → postfix_exp '(' opt_actual_param ')'     => CALL;
            → postfix_exp '++'                         => POST_INC;
            → postfix_exp '--'                         => POST_DEC;
Process function code segment - Unary - , !
// unary operators
case UNARY_MINUS: case LOGICAL_NOT:
{
Node *p = ptr->son;
if (p->noderep == nonterm) processOperator(p);
else rv_emit(p);
switch (ptr->token.number) {
case UNARY_MINUS: emit0(neg); break;
case LOGICAL_NOT: emit0(notop); break;
}
break;
}

Array variable
In a one-dimensional array, the location of the i-th element = Base + (i - Low) * W
where Low  : lower bound of the array
      Base : start address of the array
      W    : size of one element
In the C programming language, Low is always 0.
∴ Address of A[i] = Base + i * W


Assume that the size of an integer is 1, so W = 1.
Location of list[10] = (start address of array) + 10 * 1
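The address formula can be written as a one-line C function. The name element_address and the parameter names are assumptions for this sketch; in the generated U-code the same computation is carried out at run time by the instructions that load the index and the base address and then add them.

```c
#include <assert.h>

/* Address of element `index` of a one-dimensional array:
   base + (index - low) * width. */
int element_address(int base, int index, int low, int width)
{
    assert(width > 0);
    assert(index >= low);
    return base + (index - low) * width;
}
```

With Low = 0 and W = 1, as assumed for Mini C, element_address(base, 10, 0, 1) is simply base + 10.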
Process function code segment - Array
case INDEX:
{
Node *indexExp = ptr->son->brother;
if (indexExp->noderep == nonterm) processOperator(indexExp);
else rv_emit(indexExp);
stIndex = lookup(ptr->son->token.value.id);
if (stIndex == -1) {
printf("undefined variable : %s\n", ptr->son->token.value.id);
return;
}
emit2(lda, symbolTable[stIndex].base, symbolTable[stIndex].offset);
emit0(add);
if (!lvalue) emit0(ldi);
// rvalue
break;
}
[Example 9] The following shows an array declaration and array references in Mini C.
int vector[100];
void main()
{
    int temp;
    // ...
    vector[5] = 10;
    // ...
    temp = vector[20];
    // ...
}
The U-code for vector[5] = 10; in the program above is as follows.
  ldc 5
  lda 1 1    /* load the base address (vector) */
  add
  ldc 10
  sti
The U-code for temp = vector[20]; in the program above is as follows.
  ldc 20
  lda 1 1    /* load the base address (vector) */
  add
  ldi
  str 2 1    /* temp */

Process function code segment - ++, --
// increment/decrement operators
case PRE_INC: case PRE_DEC: case POST_INC: case POST_DEC:
{
//…
// compute operand
//…
switch (ptr->token.number) {
case PRE_INC: emit0(incop);
// if (isOperation(ptr)) emit0(dup);
break;
case PRE_DEC: emit0(decop);
// if (isOperation(ptr)) emit0(dup);
break;
case POST_INC:
// if (isOperation(ptr)) emit0(dup);
emit0(incop); break;
case POST_DEC:
// if (isOperation(ptr)) emit0(dup);
emit0(decop); break;
}
//…
// compute index
//…
}

Example

program
  sum = ++vector[5] + 3;

AST
  ASSIGN_OP
    sum
    ADD
      PRE_INC
        INDEX
          vector
          5
      3
U-Code
  ldc 5
  lda 1 2    /* vector */
  add
  ldi
  inc
  dup
  ldc 5
  lda 1 2    /* vector */
  add
  swp
  sti
  ldc 3
  add
  str 1 1    /* sum */

Statement

Grammar
statement → compound_st;
          → expression_st;
          → if_st;
          → while_st;
          → return_st;
Process function code segment
void processStatement(Node *ptr)
{
switch (ptr->token.number) {
// process COMPOUND_ST ...
// process EXP_ST …
case RETURN_ST:
if (ptr->son != NULL) {
returnWithValue = 1;
p = ptr->son;
if (p->noderep == nonterm) processOperator(p); // return value
else rv_emit(p);
emit0(retv);
} else emit0(ret);
break;
// process IF_ST, IF_ELSE_ST, WHILE_ST …
} //end switch
}

Statement - Compound statement

Grammar
compound_st    → '{' opt_dcl_list opt_stat_list '}'     => COMPOUND_ST;
opt_dcl_list   → declaration_list                       => DCL_LIST;
               →                                        => DCL_LIST;
opt_stat_list  → statement_list                         => STAT_LIST;
               → ;
statement_list → statement;
               → statement_list statement;
※ In Mini C, local variables can be declared within a function but not
within a compound statement. Therefore, even if local variables are
declared inside a compound statement, they are ignored and only the
statements are processed.

Process function code segment
void processStatement(Node *ptr)
{
//…
case COMPOUND_ST:
p = ptr->son->brother; // STAT_LIST
p = p->son;
while (p) {
processStatement(p);
p = p->brother;
}
break;
//…
}

Statement - Expression statement

Grammar
expression_st  → opt_expression ';'     => EXP_ST;
opt_expression → expression;
               → ;
Process function code segment
void processStatement(Node *ptr)
{
//…
case EXP_ST:
if (ptr->son != NULL) processOperator(ptr->son);
break;
//…
}

Statement - Control statement

Control Statement
1. conditional statement - if, case, switch
2. iteration statement - for, while, do-while, repeat-until
3. branch statement - goto

Logical expression
1. used to compute a logical value
2. used as the control expression of a control statement

Representation of logical values
1. Convert true and false to numbers and evaluate in a way similar to arithmetic expressions.
2. Represent the value by the position reached in the program.
[Example 11] The U-code sequence for the logical expression !a || b && c is as follows.
  lod 1 1    // a
  not
  lod 1 2    // b
  lod 1 3    // c
  and
  or
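To see how such a sequence evaluates a logical expression numerically, the following sketch runs the six instructions for !a || b && c on a tiny stack machine. The Instr type, the run function, and the treatment of not/and/or as C's !, &&, and || on integers are assumptions for illustration; this is not the actual U-code interpreter, and only the offset part of each lod address is modeled.

```c
#include <assert.h>

typedef enum { OP_LOD, OP_NOT, OP_AND, OP_OR } Op;

typedef struct {
    Op op;
    int offset;     /* used by OP_LOD only */
} Instr;

/* Executes prog on an operand stack; mem[offset] holds each variable.
   Returns the single value left on the stack. */
int run(const Instr *prog, int n, const int *mem)
{
    int stack[32];
    int sp = -1;
    for (int i = 0; i < n; i++) {
        switch (prog[i].op) {
        case OP_LOD:
            assert(sp + 1 < 32);
            stack[++sp] = mem[prog[i].offset];
            break;
        case OP_NOT:
            assert(sp >= 0);
            stack[sp] = !stack[sp];
            break;
        case OP_AND:
            assert(sp >= 1);
            stack[sp - 1] = stack[sp - 1] && stack[sp];
            sp--;
            break;
        case OP_OR:
            assert(sp >= 1);
            stack[sp - 1] = stack[sp - 1] || stack[sp];
            sp--;
            break;
        }
    }
    assert(sp == 0);
    return stack[0];
}
```

The instruction order reflects operator precedence: b && c is reduced by and before or combines it with !a.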
[Example 12] The AST and U-code for the relational expression a >= b + 1 are as follows.
AST :
  GE
    a
    ADD
      b
      1
U-code :
  lod Ba Oa    // Ba: base of variable a, Oa: offset of variable a
  lod Bb Ob    // Bb: base of variable b, Ob: offset of variable b
  ldc 1
  add
  ge

Generate control statement code

if structure
      CONDITION code
      if false, goto tag
      STATEMENT code
tag:

if-else structure
      CONDITION code
      if false, goto tag1
      STATEMENT1 code
      goto tag2
tag1:
      STATEMENT2 code
tag2:

while structure
tag1:
      CONDITION code
      if false, goto tag2
      STATEMENT code
      goto tag1
tag2:
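The label-and-jump skeletons can be sketched as small C routines that write U-code text into a buffer. The names genLabel, emitJump, and emitLabel echo the routines used in the slides, but their string-based signatures here, the $$n label format, the helper emitLine, and the functions genIf and genWhile (which take the condition and statement code as ready-made text) are assumptions for this illustration.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define LABEL_SIZE 16

static char code[512];       /* generated code text */
static int labelCount = 0;   /* number of labels handed out so far */

static void emitLine(const char *text)
{
    size_t used = strlen(code);
    assert(used + strlen(text) + 2 <= sizeof code);
    snprintf(code + used, sizeof code - used, "%s\n", text);
}

/* Produces a fresh label such as $$1, $$2, ... */
void genLabel(char *label)
{
    snprintf(label, LABEL_SIZE, "$$%d", ++labelCount);
}

void emitJump(const char *op, const char *label)
{
    char line[64];
    snprintf(line, sizeof line, "%s %s", op, label);
    emitLine(line);
}

void emitLabel(const char *label)
{
    char line[64];
    snprintf(line, sizeof line, "%s nop", label);
    emitLine(line);
}

/* if: condition, jump over the statement when false. */
void genIf(const char *cond, const char *stmt)
{
    char label[LABEL_SIZE];
    genLabel(label);
    emitLine(cond);
    emitJump("fjp", label);
    emitLine(stmt);
    emitLabel(label);
}

/* while: test at the top, jump back after the body. */
void genWhile(const char *cond, const char *stmt)
{
    char top[LABEL_SIZE], exit[LABEL_SIZE];
    genLabel(top);
    genLabel(exit);
    emitLabel(top);
    emitLine(cond);
    emitJump("fjp", exit);
    emitLine(stmt);
    emitJump("ujp", top);
    emitLabel(exit);
}
```

Here fjp is the jump taken when the condition is false and ujp is the unconditional jump, matching the "if false, goto" and "goto" lines of the skeletons.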

Grammar
if_st    → 'if' '(' expression ')' statement                      => IF_ST;
         → 'if' '(' expression ')' statement 'else' statement     => IF_ELSE_ST;
while_st → 'while' '(' expression ')' statement                   => WHILE_ST;
AST
Process function code segment

if statement
IF_ST
CONDITION
STATEMENT
void processStatement(Node *ptr)
{
//…
case IF_ST:
{
char label[LABEL_SIZE];
genLabel(label);
processCondition(ptr->son); // condition part
emitJump(fjp, label);
processStatement(ptr->son->brother); // true part
emitLabel(label);
}
//…
}

if-else statement
IF_ELSE_ST
CONDITION
STATEMENT1
STATEMENT2
void processStatement(Node *ptr)
{
//…
case IF_ELSE_ST:
{
    char label1[LABEL_SIZE], label2[LABEL_SIZE];
    genLabel(label1); genLabel(label2);
    processCondition(ptr->son);                      // condition part
    emitJump(fjp, label1);
    processStatement(ptr->son->brother);             // true part
    emitJump(ujp, label2);
    emitLabel(label1);
    processStatement(ptr->son->brother->brother);    // false part
    emitLabel(label2);
}
//…
}

while statement
WHILE_ST
CONDITION
STATEMENT
void processStatement(Node *ptr)
{
//…
case WHILE_ST:
{
    char label1[LABEL_SIZE], label2[LABEL_SIZE];
    genLabel(label1); genLabel(label2);
    emitLabel(label1);
    processCondition(ptr->son);                      // condition part
    emitJump(fjp, label2);
    processStatement(ptr->son->brother);             // loop body
    emitJump(ujp, label1);
    emitLabel(label2);
}
//…
}

Example 1

program
  if (a > max) max = a;

AST
  IF_ST
    GT
      a
      max
    EXP_ST
      ASSIGN_OP
        max
        a
U-Code
     lod 1 1    // a
     lod 1 2    // max
     gt         // a > max
     fjp $$1
     lod 1 1    // max = a
     str 1 2
$$1  nop

Example 2

program
while (i <= 100) {
sum += i;
++i;
}

AST
  WHILE_ST
    LE
      i
      100
    COMPOUND_ST
      EXP_ST
        ADD_ASSIGN
          sum
          i
      EXP_ST
        PRE_INC
          i

U-Code
$$1  nop
     lod 1 1    // i <= 100
     ldc 100
     le
     fjp $$2
     lod sum    // sum += i;
     lod i
     add
     str sum
     lod i      // ++i;
     inc
     str i
     ujp $$1
$$2  nop

Function – Function call

Grammar
postfix_exp       → primary_exp;
                  → postfix_exp '(' opt_actual_param ')'      => CALL;
opt_actual_param  → actual_param;
                  → ;
actual_param      → actual_param_list                         => ACTUAL_PARAM;
actual_param_list → assignment_exp;
                  → actual_param_list ',' assignment_exp;

AST
  CALL
    IDENT
    ACTUAL_PARAM
      EXP
      EXP
      …

Process function code segment
void processStatement(Node *ptr)
{
//…
case CALL:
{
//…
// predefined(Library) functions
//…
// handle for user function
functionName = p->token.value.id;
stIndex = lookup(functionName);
if (stIndex == -1) break; // undefined function !!!
noArguments = symbolTable[stIndex].width;
emit0(ldp);
p = p->brother;
// ACTUAL_PARAM
while (p) {
// processing actual arguments
if (p->noderep == nonterm) processOperator(p);
else rv_emit(p);
noArguments--;
p = p->brother;
}
//…
emitJump(call, ptr->son->token.value.id);
break;
}
//…
}

Function - Function definition

Grammar
function_def      → function_header compound_st              => FUNC_DEF;
function_header   → dcl_spec function_name formal_param      => FUNC_HEAD;
function_name     → '%ident';
formal_param      → '(' opt_formal_param ')'                 => FORMAL_PARA;
opt_formal_param  → formal_param_list;
                  → ;
formal_param_list → param_dcl;
                  → formal_param_list ',' param_dcl;
param_dcl         → dcl_spec declarator                      => PARAM_DCL;

AST

Function definition
  FUNC_DEF
    FUNC_HEAD
    COMPOUND_ST

Function head
  FUNC_HEAD
    DCL_SPEC
    IDENT
    FORMAL_PARA
Formal parameter
  FORMAL_PARA
    PARAM_DCL
    PARAM_DCL
    …

Process function code segment – Function header
void processFuncHeader(Node *ptr)
{
//…
// step 1: process the function return type
p = ptr->son->son;
while (p) {
if (p->token.number == INT_NODE) returnType = INT_TYPE;
else if (p->token.number == VOID_NODE) returnType = VOID_TYPE;
else printf("invalid function return type\n");
p = p->brother;
}
// step 2: count the number of formal parameters
p = ptr->son->brother->brother; // FORMAL_PARA
p = p->son;// PARAM_DCL
noArguments = 0;
while (p) {
noArguments++;
p = p->brother;
}
// step 3: insert the function name
stIndex = insert(ptr->son->brother->token.value.id, returnType, FUNC_TYPE,
1/*base*/, 0/*offset*/, noArguments/*width*/, 0/*initialValue*/);
//if (!strcmp("main", functionName)) mainExist = 1;
}

Process function code segment - Function
void processFunction(Node *ptr)
{
// …
// step 1: process formal parameters
// step 2: process the declaration part in function body
// step 3: emit the function start code
// step 4: process the statement part in function body
// step 5: check if return type and return value
// step 6: generate the ending codes
// …
}
Ucode Translator

Design and Implementation of Ucode Translator

scanner, parser, SDT, ICG
Mini C Compiler:
  Source Program (*.mc) -> Scanner -> Token Stream -> Parser -> SDT -> AST -> ICG -> Ucode (*.uco)
  Ucode (*.uco) + data -> Ucode Interpreter -> result

Execution sequence of perfect.mc
① Mini C program : Text p. 421
② The output form of the AST using printtree() : Text pp. 417-418
③ Ucode generated by the code generator : Text pp. 426-463
④ Execution of the Ucode using the Ucode Interpreter
ucodei perfect.uco
Result filename is perfect.lst
-- Assembling...
-- Executing...
-- Result Data
6 28 496
Programming Assignment #5

Write a Ucode translator for the Mini C language.
Run the generated Ucode using the interpreter.
  AST -> ICG -> Ucode (*.uco)
  Ucode (*.uco) + data -> Ucode Interpreter -> execution result

Example program: perfect.mc
int max = 500;
void main()
{
    int i, j, k;
    int rem, sum;
    i = 2;
    while (i <= max) {
        sum = 0;
        k = i / 2;
        j = 1;
        while (j <= k) {
            rem = i % j;
            if (rem == 0) {
                sum += j;
            }
            ++j;
        }
        if (i == sum) write(i);
        ++i;
    }
}
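As a cross-check of what the translated program should compute, the logic of perfect.mc can be restated in plain C. This is a sketch for comparison only: the function names divisorSum and collectPerfect are hypothetical, and the results are collected into an array instead of being printed with Mini C's write().

```c
#include <assert.h>

/* Sum of the divisors j of i with 1 <= j <= i / 2, as computed by the
   inner while loop of perfect.mc. */
int divisorSum(int i)
{
    int sum = 0;
    int k = i / 2;
    for (int j = 1; j <= k; ++j) {
        if (i % j == 0) sum += j;
    }
    return sum;
}

/* Stores every i in [2, max] with i == divisorSum(i) into out
   (at most cap entries) and returns how many were found. */
int collectPerfect(int max, int *out, int cap)
{
    int n = 0;
    for (int i = 2; i <= max; ++i) {
        if (i == divisorSum(i)) {
            assert(n < cap);
            out[n++] = i;
        }
    }
    return n;
}
```

With max = 500 the collected values are the perfect numbers up to 500, which agrees with the result data 6 28 496 reported for the Ucode interpreter run.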