Intermediate Code Generation - Welcome ^^! Prof. Won
http://plac.dongguk.ac.kr
Introduction to Compilers
Chapter 10
Intermediate Code Generation
Contents
Introduction
Syntax-Directed Translation
Code Generation
U-Code Translator
Introduction
Formal Specification
lexical structure : regular expression
syntactic structure : context-free grammar
the remaining phases of compilation : no such widely accepted notation
Instead, we use a syntax-directed translation scheme (SDTS), a
method that associates semantic rules (or actions) with productions.
SDTS ::= cfg + semantic actions
A method that describes semantic actions directly in terms of the
grammar symbols appearing in the production rules of the cfg.
AST generation
Attribute grammar
Intermediate code generation
the phase that generates an explicit intermediate code of the
source program.
after syntax and semantic analysis.
A Model for Intermediate code generation
  Source Program → Scanner → Parser → Intermediate Representation → Intermediate Code Generation
Our implementations:
  Source program              : Mini C program
  Intermediate Representation : Abstract Syntax Tree (AST)
  Intermediate code           : U-Code
  Execution                   : U-Code Interpreter
Implementation Model
Mini C Compiler:
  Source Program (*.mc) → Scanner → Token Stream → Parser → SDT → AST → ICG → Ucode (*.uco)
  Ucode (*.uco) + data → Ucode Interpreter → result
scanner : shift action of parser
parser  : main program (LR parser)
SDT     : reduce action of parser (AST generation)
ICG     : Intermediate code generation by traversing AST.
※ The design of the AST is very important for handling semantic analysis and
intermediate code generation efficiently.
Syntax-Directed Translation
Syntax-Directed Translation Scheme (SDTS)
::= a production rule + semantic action (no widely accepted formalism)
A Model of SDTS
  Source Program → Scanner --token--> Parser (main) → Result
  The parser calls get_token on the scanner, consults the parsing table,
  and calls the semantic actions.
Semantic Actions
Whenever a reduction takes place, the semantic rule corresponding to
the applied syntactic rule is activated.
▶ Advantages of SDT
Provides a way to describe semantic rules that is independent of
any particular implementation.
Easy to modify - new productions and semantic actions can be
added without disturbing the existing ones.
▶ Disadvantages of SDT
If an error occurs during parsing, all semantic actions performed
so far become meaningless.
Because the input is processed in one pass and in a syntax-directed way,
information is sometimes missing when it is needed, and complicated
techniques such as backpatching are required once it becomes available.
Solution
Use the syntax-directed method only to build the information needed for
semantic analysis and code generation, then perform semantic analysis
and code generation in a later phase using that information.
Description of Semantic Actions
SDTS(Syntax-Directed Translation Scheme)
::= production rules + semantic actions
Semantic actions can be described in:
(1) Conventional PL.
(2) Meta Language - Formal Semantic Language(FSL)
  Semantic Description → Preprocessor → table
  Input → SDT (using the table) → Output
Semantic Description using the Attributes of Grammar Symbol
::= We associate information with a programming language construct by
attaching attributes to the grammar symbols representing the
construct. Values for attributes are computed by "semantic rules"
associated with the grammar productions.
An attribute of symbol
::= A value associated with a grammar symbol. Each grammar symbol
has an associated set of attributes. An attribute can represent
anything we choose: a string, a number, a type, a memory location,
or whatever.
ex)
  Production       Semantic Rules
  L → E$           print(E.val)
  E → E1 + T       E.val := E1.val + T.val
  E → T            E.val := T.val
  T → T1 * F       T.val := T1.val * F.val
  T → F            T.val := F.val
  F → (E)          F.val := E.val
  F → digit        F.val := digit.lexval
Synthesized attribute
::= the value of the attribute of the nonterminal on the left side of
the production is defined as a function of the attributes of the
grammar symbols on the right side.
ex) A → XYZ    A := F(X,Y,Z)
Inherited attribute
::= the value of the attribute of a symbol on the right side of
the production is defined in terms of attributes of the nonterminal
on the left side or of the other symbols on the right side.
ex) A → XYZ    Y.val := 2 * X.val
※ Synthesized attributes are more natural than inherited attributes for
mapping most programming language constructs into
intermediate code.
Implementation of SDT
▶ Designing steps
  Input design - design a grammar for the language constructs using a cfg.
  Write the scanner and parser.
  Semantic specification (SDT) - in a conventional PL.
  Complete the translator - interconnection.
▶ Examples : 1. Desk Calculator
             2. Conversion infix into postfix
             3. Construction of AST
1. Desk Calculator
(1) Input design
  0. S -> E $
  1. E -> E + E
  2. E -> E * E
  3. E -> ( E )
  4. E -> num
(2) Parsing table
  states |  num    +     *     (     )     $   |  E
    0    |  s3                 s2              |  1
    1    |         s4    s5                acc |
    2    |  s3                 s2              |  6
    3    |         r4    r4          r4    r4  |
    4    |  s3                 s2              |  7
    5    |  s3                 s2              |  8
    6    |         s4    s5          s9        |
    7    |         r1    s5          r1    r1  |
    8    |         r2    r2          r2    r2  |
    9    |         r3    r3          r3    r3  |
(3) Semantic Specification
  Production       Semantic Rules
  S → E$           print E.val
  E → E1 + E2      E.val := E1.val + E2.val
  E → E1 * E2      E.val := E1.val * E2.val
  E → (E1)         E.val := E1.val
  E → num          E.val := num.lexval
(4) Implementation of Desk Calculator
Parsing stack : Symbol stack + State stack + Value stack
Value stack : holds the values of the corresponding attributes.
  Production       Code Fragment
  S → E$           print (val[top])
  E → E + E        val[top-2] := val[top-2] + val[top]
  E → E * E        val[top-2] := val[top-2] * val[top]
  E → (E)          val[top-2] := val[top-1]
  E → num          val[top] := num.lexval
The code fragments do not show how the variable top is managed.
lexval : token value
The code fragments are executed before a reduction takes place.
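The parsing table and code fragments above can be combined into a small table-driven desk calculator. The following is a minimal sketch, not the lecture's implementation: the names evaluate, nextToken, action, gotoE and rhsLen are hypothetical, the table assumes that state 6 shifts ")" to state 9, and, unlike the slide, the value of num is stored when the token is shifted, so the reduction by E → num only copies it.

```c
#include <ctype.h>
#include <stdio.h>

enum { NUM, PLUS, STAR, LP, RP, END, NTOK };
#define S(n) (n)       /* shift and go to state n   */
#define R(n) (-(n))    /* reduce by production n    */
#define ACC  99        /* accept                    */

/* LR parsing table for: 0. S->E$  1. E->E+E  2. E->E*E  3. E->(E)  4. E->num */
static const int action[10][NTOK] = {
    /*        num    +     *     (     )     $   */
    /* 0 */ { S(3),  0,    0,   S(2),  0,    0   },
    /* 1 */ { 0,    S(4), S(5),  0,    0,   ACC  },
    /* 2 */ { S(3),  0,    0,   S(2),  0,    0   },
    /* 3 */ { 0,    R(4), R(4),  0,   R(4), R(4) },
    /* 4 */ { S(3),  0,    0,   S(2),  0,    0   },
    /* 5 */ { S(3),  0,    0,   S(2),  0,    0   },
    /* 6 */ { 0,    S(4), S(5),  0,   S(9),  0   },
    /* 7 */ { 0,    R(1), S(5),  0,   R(1), R(1) },
    /* 8 */ { 0,    R(2), R(2),  0,   R(2), R(2) },
    /* 9 */ { 0,    R(3), R(3),  0,   R(3), R(3) },
};
static const int gotoE[10] = { 1, 0, 6, 0, 7, 8, 0, 0, 0, 0 };
static const int rhsLen[5] = { 2, 3, 3, 3, 1 };

static const char *src;   /* remaining input   */
static int lexval;        /* value of last num */

static int nextToken(void) {
    while (*src == ' ') src++;
    if (isdigit((unsigned char)*src)) {
        lexval = 0;
        while (isdigit((unsigned char)*src)) lexval = lexval * 10 + (*src++ - '0');
        return NUM;
    }
    switch (*src++) {
    case '+': return PLUS;
    case '*': return STAR;
    case '(': return LP;
    case ')': return RP;
    default:  src--; return END;   /* '$', end of string, or unknown character */
    }
}

/* Returns 1 and stores the value in *result on accept, 0 on a syntax error. */
int evaluate(const char *text, int *result) {
    int state[64], val[64], top = 0, tok;
    src = text;
    state[0] = 0; val[0] = 0;
    tok = nextToken();
    for (;;) {
        int a = action[state[top]][tok];
        if (a == ACC) { *result = val[top]; return 1; }
        if (a > 0) {                          /* shift: push state and value */
            if (top + 1 >= 64) return 0;
            state[++top] = a;
            val[top] = (tok == NUM) ? lexval : 0;
            tok = nextToken();
        } else if (a < 0) {                   /* reduce: run the code fragment first */
            int rule = -a, v;
            switch (rule) {
            case 1:  v = val[top - 2] + val[top]; break;
            case 2:  v = val[top - 2] * val[top]; break;
            case 3:  v = val[top - 1];            break;
            default: v = val[top];                break;   /* E -> num */
            }
            top -= rhsLen[rule];              /* pop the right-hand side */
            state[top + 1] = gotoE[state[top]];
            val[++top] = v;
        } else {
            return 0;                         /* empty table entry: error */
        }
    }
}
```

A caller would pass an input string such as "23 * 5 + 4 $" together with the address of an int; the state and value stacks move in parallel exactly as in the trace that follows.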
ex) 23 * 5 + 4 $
  action       ( state stack , input        , symbol    , value   , parse )
               ( 0           , 23 * 5 + 4$  ,           ,         ,       )
  s3   ===>    ( 0 3         , * 5 + 4$     , num       , _       ,       )
  r4   ===>    ( 0 1         , * 5 + 4$     , E         , 23      , 4     )
  s5   ===>    ( 0 1 5       , 5 + 4$       , E *       , 23_     , 4     )
  s3   ===>    ( 0 1 5 3     , + 4$         , E * num   , 23__    , 4     )
  r4   ===>    ( 0 1 5 8     , + 4$         , E * E     , 23_5    , 44    )
  r2   ===>    ( 0 1         , + 4$         , E         , 115     , 442   )
  s4   ===>    ( 0 1 4       , 4$           , E +       , 115_    , 442   )
  s3   ===>    ( 0 1 4 3     , $            , E + num   , 115__   , 442   )
  r4   ===>    ( 0 1 4 7     , $            , E + E     , 115_4   , 4424  )
  r1   ===>    ( 0 1         , $            , E         , 119     , 44241 )
       ===>    accept
2. Conversion infix into postfix
  Production       Code Fragment
  E → E + E        print '+'
  E → E * E        print '*'
  E → E / E        print '/'
  E → (E)
  E → a            print 'a'
ex) a + (a + a) * a  ===>  aaa+a*+
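The idea behind the table above is that each operator is printed only after both of its operands have been translated. The sketch below shows the same output order with a hand-written recursive-descent parser rather than the LR parser used in the lecture; toPostfix and the helper names are hypothetical, operator precedence is encoded in the expr/term/factor functions, and malformed input is not diagnosed.

```c
#include <stdio.h>

static const char *in;   /* remaining infix input            */
static char *out;        /* next free position in the output */

static void expr(void);

static void skip(void) { while (*in == ' ') in++; }

/* factor -> 'a' | '(' expr ')' */
static void factor(void) {
    skip();
    if (*in == '(') {
        in++;
        expr();
        skip();
        if (*in == ')') in++;
    } else if (*in == 'a') {
        *out++ = *in++;            /* E -> a : print 'a' */
    }
}

/* term -> factor { ('*' | '/') factor } */
static void term(void) {
    factor();
    for (skip(); *in == '*' || *in == '/'; skip()) {
        char op = *in++;
        factor();
        *out++ = op;               /* print the operator after both operands */
    }
}

/* expr -> term { '+' term } */
static void expr(void) {
    term();
    for (skip(); *in == '+'; skip()) {
        in++;
        term();
        *out++ = '+';
    }
}

/* postfix must have room for strlen(infix) + 1 characters. */
void toPostfix(const char *infix, char *postfix) {
    in = infix;
    out = postfix;
    expr();
    *out = '\0';
}
```

Calling toPostfix with the slide's input "a + (a + a) * a" fills the buffer with the postfix string shown above.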
3. Construction of AST
AST is a condensed form of parse tree useful for representing
language constructs.
ex) a = b + 1;
  =
    a
    +
      b
      1
ex) if (a > b) x = a; else x = b;
  if
    >
      a
      b
    =
      x
      a
    =
      x
      b
Functions to create the nodes of AST for expressions with binary
operators. Each function returns a pointer to a newly created node.
1. mktree(op,left,right) creates an operator node with label op and
   two fields containing pointers to left and right.
2. mknode(a) creates a terminal node for a and returns the node
   pointer.
Semantic Specification
  Production       Semantic Rules
  E → E1 + T       E.nptr := mktree('+', E1.nptr, T.nptr)
  E → E1 - T       E.nptr := mktree('-', E1.nptr, T.nptr)
  E → T            E.nptr := T.nptr
  T → (E)          T.nptr := E.nptr
  T → a            T.nptr := mknode(a)
The synthesized attribute nptr for E and T keeps track of the pointers
returned by the function calls.
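As an illustration of the two functions, the sketch below builds the tree for a - 4 + c bottom-up, in the order in which the reductions would call mktree and mknode. The Ast type and astToString are hypothetical, and leaves hold a single character instead of a pointer to a symbol-table entry.

```c
#include <stdio.h>
#include <stdlib.h>

typedef struct Ast {
    char label;                  /* operator symbol, or the leaf's character */
    struct Ast *left, *right;    /* both NULL for a leaf                     */
} Ast;

/* Creates an operator node with label op and the two given subtrees. */
Ast *mktree(char op, Ast *left, Ast *right) {
    Ast *n = malloc(sizeof *n);
    if (!n) { perror("malloc"); exit(1); }
    n->label = op;
    n->left = left;
    n->right = right;
    return n;
}

/* Creates a terminal node for a. */
Ast *mknode(char a) { return mktree(a, NULL, NULL); }

/* Writes the tree in prefix form, e.g. "(+ (- a 4) c)", and returns the end. */
static char *render(const Ast *n, char *p) {
    if (n->left == NULL) { *p++ = n->label; return p; }   /* leaf */
    *p++ = '('; *p++ = n->label; *p++ = ' ';
    p = render(n->left, p);
    *p++ = ' ';
    p = render(n->right, p);
    *p++ = ')';
    return p;
}

void astToString(const Ast *n, char *buf) { *render(n, buf) = '\0'; }
```

For a - 4 + c the calls nest as mktree('+', mktree('-', mknode('a'), mknode('4')), mknode('c')), which mirrors the left-recursive productions: the subtraction is reduced first and becomes the left operand of the addition.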
AST for a - 4 + c
  +
    -
      id   (to entry for a)
      num  4
    id   (to entry for c)
Programming Assignment #4
Implement a syntax-directed translator that produces an
AST for a Mini C program.
  Mini C Program (*.mc) → Scanner → Parser → SDT → AST
Mini C Program : Perfect.mc (Text p. 421)
The output form of the AST using printtree() : Text pp. 417-418
AST design
Grammar form : production rule [ => node_name ] ;
Note : when "=> node_name" is omitted, no subtree is built.
A -> α1 α2 … αn => node_name ;
  node_name
    α1  α2  …  αn
Mini C Grammar
mini_c           -> translation_unit                  => PROGRAM;
translation_unit -> external_dcl;
                 -> translation_unit external_dcl;
external_dcl     -> function_def;
                 -> declaration;
Text pp. 408-411
Data Structures
A node form of AST
  node : | token (tokenNumber, tokenValue) | noderep | son | brother |
  son     → son node
  brother → next brother node
Node structure
    struct tokenType {
        int tokenNumber;                     // token number
        char *tokenValue;                    // token value
    };
    typedef struct nodeType {
        struct tokenType token;              // token kind
        enum {terminal, nonterm} noderep;    // node kind
        struct nodeType *son;                // left link
        struct nodeType *brother;            // right link
    } Node;
Production rule name
    enum nodeNumber {
        ACTUAL_PARAM, ADD, ADD_ASSIGN, ARRAY_VAR, ASSIGN_OP,
        … , WHILE_ST
    };
    char *nodeName[] = {
        "ACTUAL_PARAM", "ADD", "ADD_ASSIGN", "ARRAY_VAR", "ASSIGN_OP",
        … "WHILE_ST"
    };
    int ruleName[] = {
        /*  0    1          2    3    4   */
            0,   PROGRAM,   0,   0,   0,
        …
        /* 95   96   97   */
            0,   0,   0
    };
Shift action of parsing :
  if the token is meaningful, then call buildNode.
    Node *buildNode(struct tokenType token)
    {
        Node *ptr;
        ptr = (Node *) malloc(sizeof(Node));
        if (!ptr) {
            printf("malloc error in buildNode()\n");
            exit(1);
        }
        ptr->token = token;
        ptr->noderep = terminal;
        ptr->son = ptr->brother = NULL;
        return ptr;
    }
Reduce action of parsing :
  if the production rule is meaningful
    1. build subtree
       - linking brothers
       - making a subtree
  else
    2. only linking brothers
Node *buildTree(int nodeNumber, int rhsLength)
    Node *buildTree(int nodeNumber, int rhsLength)
    {   //…
        i = sp - rhsLength + 1;
        // step 1: find a first index with node in value stack
        while (i <= sp && valueStack[i] == NULL) i++;          /* …… ① */
        if (!nodeNumber && i > sp) return NULL;                /* …… ② */
        start = i;
        // step 2: linking brothers
        while (i <= sp-1) {
            j = i + 1;
            while (j <= sp && valueStack[j] == NULL) j++;      /* …… ③ */
            if (j <= sp) {                                     /* …… ④ */
                ptr = valueStack[i];
                while (ptr->brother) ptr = ptr->brother;
                ptr->brother = valueStack[j];
            }
            i = j;                                             /* …… ⑤ */
        }
        first = (start > sp) ? NULL : valueStack[start];       /* …… ⑥ */
        // step 3: making subtree root and linking son
        if (nodeNumber) {                                      /* …… ⑦ */
            //… memory allocation for ptr
            ptr->token.tokenNumber = nodeNumber;
            ptr->token.tokenValue = NULL;
            ptr->noderep = nonterm;
            ptr->son = first;
            ptr->brother = NULL;
            return ptr;
        }
        else return first;
    }
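buildTree()
① Find, in the value stack, the index at which a node hangs on the rhs of the
   production being reduced. This is the first index of the nodes to be
   linked as brothers.
② If the production is not a meaningful one and there are no brother nodes
   to link, simply return.
③ Find the next index of a node to be linked as a brother, in the same way as ①.
④ If the next index was found, link the node as a brother.
⑤ Move the position forward in order to keep looking for the next index.
⑥ Store in first the pointer to the first node of the linked brothers.
⑦ If the production is meaningful, create a nonterminal node, attach the
   linked brothers as its son, and return the pointer to the new node.
   Otherwise, return only the pointer to the linked brothers.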
Parsing Stack and Value Stack
The parsing stack and the value stack operate in parallel.
  Parsing Stack : …
  Value Stack   : …
                  ↑ SP
Printing the information about AST
printNode(Node *pt, int indent) : prints the information of one node
    void printNode(Node *pt, int indent)
    {
        extern FILE *astFile;
        int i;
        for (i = 1; i <= indent; i++) fprintf(astFile, " ");
        if (pt->noderep == terminal) {
            if (pt->token.number == tident)
                fprintf(astFile, " Terminal: %s", pt->token.value.id);
            else if (pt->token.number == tnumber)
                fprintf(astFile, " Terminal: %d", pt->token.value.num);
        }
        else { // nonterminal node
            i = (int) (pt->token.number);
            fprintf(astFile, " Nonterminal: %s", nodeName[i]);
        }
        fprintf(astFile, "\n");
    }
printTree(Node *pt, int indent) : prints an AST
    void printTree(Node *pt, int indent)
    {
        Node *p = pt;
        while (p != NULL) {
            printNode(p, indent);
            if (p->noderep == nonterm) printTree(p->son, indent+5);
            p = p->brother;
        }
    }
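To see the son/brother representation and the indented traversal working together, here is a simplified, self-contained sketch. It is an assumption-laden variant of the routines above, not the lecture's code: each node carries only a name string, the helpers newNode and addSon are hypothetical, and printTree takes the output stream as a parameter instead of using the global astFile.

```c
#include <stdio.h>
#include <stdlib.h>

typedef struct Node {
    const char *name;        /* node name or token text (simplified) */
    struct Node *son;        /* first child                          */
    struct Node *brother;    /* next sibling                         */
} Node;

Node *newNode(const char *name) {
    Node *n = calloc(1, sizeof *n);      /* son and brother start as NULL */
    if (!n) { perror("calloc"); exit(1); }
    n->name = name;
    return n;
}

/* Appends child as the last son of parent and returns parent. */
Node *addSon(Node *parent, Node *child) {
    if (!parent->son) {
        parent->son = child;
    } else {
        Node *p = parent->son;
        while (p->brother) p = p->brother;   /* walk to the last brother */
        p->brother = child;
    }
    return parent;
}

/* Prints a node, then its sons five columns deeper, then its brothers. */
void printTree(FILE *out, const Node *pt, int indent) {
    for (const Node *p = pt; p != NULL; p = p->brother) {
        fprintf(out, "%*s%s\n", indent, "", p->name);
        printTree(out, p->son, indent + 5);
    }
}
```

Building ASSIGN_OP with sons a and ADD(b, 1) and printing it from indent 0 gives one line per node, with each level indented five more columns than its parent.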
Code Generation
A Model for ICG
  Mini C Compiler:
    Source Program (*.mc) → Scanner → Token Stream → Parser → SDT → AST → ICG → Ucode (*.uco)
    Ucode (*.uco) + data → Ucode Interpreter → result
  Source language             : Mini C
  Intermediate Representation : Abstract Syntax Tree (AST)
  Intermediate code           : Ucode
  Execution                   : Ucode Interpreter
Mini C Program
  External declaration and function definition
  Declaration
  Expression
  Statement
    return statement
    compound statement
    expression statement
    control statement - if, if - else, while
  Function
Production rules : (Mini C)
  mini_c           → translation_unit                  => PROGRAM;
  translation_unit → external_dcl;
                   → translation_unit external_dcl;
  external_dcl     → function_def;
                   → declaration;
  function_def     → function_header compound_st       => FUNC_DEF;
  declaration      → dcl_spec init_dcl_list ';'        => DCL;
  …
Mini C Program AST
  root
    PROGRAM
      DCL
        …
      …
      FUNC_DEF
        …
      …
Code Generating Routines
codeGen
processDeclaration
processSimpleVariable
processArrayVariable
processFuncHeader
processFunction
processStatement
processOperator
processCondition
codeGen
1. process external variables and function headers
2. process statements for function
3. generate starting code of U-Code interpreter
    void codeGen(Node *ptr)
    {
        //…
        // step 1: process the declaration part
        for (p = ptr->son; p; p = p->brother) {
            if (p->token.number == DCL) processDeclaration(p->son);
            else if (p->token.number == FUNC_DEF) processFuncHeader(p->son);
            else icg_error(3);
        }
        //…
        // step 2: process the function part
        for (p = ptr->son; p; p = p->brother)
            if (p->token.number == FUNC_DEF) processFunction(p);
        //…
        // step 3: generate codes for starting routine
        emit1(bgn, globalSize);
        emit0(ldp);
        emitJump(call, "main");
        emit0(endop);
    }
Declaration
Grammar
  declaration     → dcl_spec init_dcl_list ';'            => DCL;
  dcl_spec        → dcl_specifiers                        => DCL_SPEC;
  dcl_specifiers  → dcl_specifier;
                  → dcl_specifiers dcl_specifier;
  dcl_specifier   → type_qualifier;
                  → type_specifier;
  type_qualifier  → 'const'                               => CONST_NODE;
  type_specifier  → 'int'                                 => INT_NODE;
                  → 'void'                                => VOID_NODE;
  init_dcl_list   → init_declarator;
                  → init_dcl_list ',' init_declarator;
  init_declarator → declarator                            => DCL_ITEM;
                  → declarator '=' '%number'              => DCL_ITEM;
  declarator      → '%ident'                              => SIMPLE_VAR;
                  → '%ident' '[' opt_number ']'           => ARRAY_VAR;
  opt_number      → '%number';
                  → ;
AST
  DCL
    DCL_SPEC
    DCL_ITEM
    …
    DCL_ITEM
Process function code segment
    void processDeclaration(Node *ptr)
    {   //…
        // step 1: process DCL_SPEC
        //…
        // step 2: process DCL_ITEM
        while (p) {
            q = p->son;     // SIMPLE_VAR or ARRAY_VAR
            switch (q->token.number) {
            case SIMPLE_VAR:    // simple variable
                processSimpleVariable(q, typeSpecifier, typeQualifier);
                break;
            case ARRAY_VAR:     // array variable
                processArrayVariable(q, typeSpecifier, typeQualifier);
                break;
            default:
                printf("error in SIMPLE_VAR or ARRAY_VAR\n");
                break;
            } // end switch
            p = p->brother;
        } // end while
    }
Expression - Assignment
Grammar
  expression     → assignment_exp;
  assignment_exp → logical_or_exp;
                 → unary_exp '=' assignment_exp     => ASSIGN_OP;
                 → unary_exp '+=' assignment_exp    => ADD_ASSIGN;
                 → unary_exp '-=' assignment_exp    => SUB_ASSIGN;
                 → unary_exp '*=' assignment_exp    => MUL_ASSIGN;
                 → unary_exp '/=' assignment_exp    => DIV_ASSIGN;
                 → unary_exp '%=' assignment_exp    => MOD_ASSIGN;
AST
  Assignment to a simple variable:
    ASSIGN_OP
      IDENT
      exp
  Assignment to an array element:
    ASSIGN_OP
      INDEX
        IDENT
        exp
      exp
Process function code segment
    void processOperator(Node *ptr)
    {
        switch (ptr->token.number) {
        // assignment operator
        case ASSIGN_OP:
            // ...
            // step 1: generate instructions for left-hand side if …
            // step 2: generate instructions for right-hand side
            // step 3: generate a store instruction
        // complex assignment operators
        case ADD_ASSIGN: case SUB_ASSIGN: case MUL_ASSIGN:
        case DIV_ASSIGN: case MOD_ASSIGN:
            // ...
            // step 1: code generation for left hand side
            // step 2: code generation for repeating part
            // step 3: code generation for right hand side
            // step 4: emit the corresponding operation code
            // step 5: code generation for store code
        // ...
        } // end switch
    }
Example
program
    sum += 100;
AST
  ADD_ASSIGN
    sum
    100
U-Code
    lod  1 1     /* sum */
    ldc  100
    add
    str  1 1     /* sum */
Expression - Binary (arithmetic/relational/logical) operators
Grammar
  logical_or_exp     → logical_and_exp;
                     → logical_or_exp '||' logical_and_exp     => LOGICAL_OR;
  logical_and_exp    → equality_exp;
                     → logical_and_exp '&&' equality_exp       => LOGICAL_AND;
  equality_exp       → relational_exp;
                     → equality_exp '==' relational_exp        => EQ;
                     → equality_exp '!=' relational_exp        => NE;
  relational_exp     → additive_exp;
                     → relational_exp '>' additive_exp         => GT;
                     → relational_exp '<' additive_exp         => LT;
                     → relational_exp '>=' additive_exp        => GE;
                     → relational_exp '<=' additive_exp        => LE;
  additive_exp       → multiplicative_exp;
                     → additive_exp '+' multiplicative_exp     => ADD;
                     → additive_exp '-' multiplicative_exp     => SUB;
  multiplicative_exp → unary_exp;
                     → multiplicative_exp '*' unary_exp        => MUL;
                     → multiplicative_exp '/' unary_exp        => DIV;
                     → multiplicative_exp '%' unary_exp        => MOD;
Process function code segment
    void processOperator(Node *ptr)
    {
        switch (ptr->token.number) {
        //…
        // binary(arithmetic/relational/logical) operators
        case ADD: case SUB: case MUL: case DIV: case MOD:
        case EQ: case NE: case GT: case LT: case GE: case LE:
        case LOGICAL_AND: case LOGICAL_OR:
        {
            Node *lhs = ptr->son, *rhs = ptr->son->brother;  // left and right operands
            // step 1: visit left operand
            if (lhs->noderep == nonterm) processOperator(lhs);
            else rv_emit(lhs);
            // step 2: visit right operand
            if (rhs->noderep == nonterm) processOperator(rhs);
            else rv_emit(rhs);
            // step 3: visit root
            switch (ptr->token.number) {
                // arithmetic operators
                // relational operators
                // logical operators
            }
            break;
        }
        // ...
        } // end switch
    }
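The three steps (left operand, right operand, root) amount to a post-order traversal of the expression tree. The sketch below illustrates that order on a reduced, hypothetical node type (Expr, Kind and genExpr are not from the lecture) and appends U-code text to a caller-supplied buffer; only variables, constants, add and mult are handled.

```c
#include <stdio.h>
#include <string.h>

typedef enum { N_VAR, N_CONST, N_ADD, N_MUL } Kind;

typedef struct Expr {
    Kind kind;
    int base, offset;                 /* N_VAR: address of the variable */
    int value;                        /* N_CONST: the constant          */
    const struct Expr *lhs, *rhs;     /* operands of N_ADD / N_MUL      */
} Expr;

/* Appends the U-code for e to out, which must hold a valid C string
 * (possibly empty) in a buffer of cap bytes. */
void genExpr(const Expr *e, char *out, size_t cap) {
    size_t used = strlen(out);
    switch (e->kind) {
    case N_VAR:
        snprintf(out + used, cap - used, "lod %d %d\n", e->base, e->offset);
        break;
    case N_CONST:
        snprintf(out + used, cap - used, "ldc %d\n", e->value);
        break;
    default:
        genExpr(e->lhs, out, cap);    /* step 1: visit left operand  */
        genExpr(e->rhs, out, cap);    /* step 2: visit right operand */
        used = strlen(out);           /* step 3: visit root          */
        snprintf(out + used, cap - used, "%s\n",
                 e->kind == N_ADD ? "add" : "mult");
        break;
    }
}
```

For the tree ADD(init, MUL(20, 2)), with init at base 1 and offset 1, the buffer ends up holding lod 1 1, ldc 20, ldc 2, mult, add, one instruction per line.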
Example
program
    init = 10;
    value = init + 20 * 2;
AST (circled numbers give the order in which code is emitted)
  EXP_ST
    ASSIGN_OP ②
      init
      10 ①
  EXP_ST
    ASSIGN_OP ⑧
      value
      ADD ⑦
        init ③
        MUL ⑥
          20 ④
          2 ⑤
U-Code
    ldc  10                 ⇐ ①
    str  1 1    ; init      ⇐ ②
    lod  1 1    ; init      ⇐ ③
    ldc  20                 ⇐ ④
    ldc  2                  ⇐ ⑤
    mult                    ⇐ ⑥
    add                     ⇐ ⑦
    str  1 2    ; value     ⇐ ⑧
Expression - Unary operators
Grammar
  unary_exp   → postfix_exp;
              → '-' unary_exp                           => UNARY_MINUS;
              → '!' unary_exp                           => LOGICAL_NOT;
              → '++' unary_exp                          => PRE_INC;
              → '--' unary_exp                          => PRE_DEC;
  postfix_exp → primary_exp;
              → postfix_exp '[' expression ']'          => INDEX;
              → postfix_exp '(' opt_actual_param ')'    => CALL;
              → postfix_exp '++'                        => POST_INC;
              → postfix_exp '--'                        => POST_DEC;
Process function code segment - Unary - , !
    // unary operators
    case UNARY_MINUS: case LOGICAL_NOT:
    {
        Node *p = ptr->son;
        if (p->noderep == nonterm) processOperator(p);
        else rv_emit(p);
        switch (ptr->token.number) {
        case UNARY_MINUS: emit0(neg); break;
        case LOGICAL_NOT: emit0(notop); break;
        }
        break;
    }
Array variable
In a one-dimensional array, location of the i-th element = Base + (i - Low) * W
where Low  : lower bound of the array
      Base : start address of the array
      W    : size of one element
In the C programming language, Low is always 0.
∴ Address of A[i] = Base + i * W
Assume that the size of an integer is 1, so W = 1.
Location of list[10] = (start address of array) + 10 * 1
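The address formula can be written directly as a small function. This is only an illustration of the arithmetic; the name elementAddress and its parameter names are hypothetical.

```c
/* Address of element i of a one-dimensional array:
 * base + (i - low) * width, where low is the lower bound
 * and width is the size of one element. */
int elementAddress(int base, int low, int width, int i) {
    return base + (i - low) * width;
}
```

With low = 0 and width = 1, as assumed for Mini C, an array starting at address 100 has list[10] at address 110; with low = 1 and width = 4, element 3 would be at 108.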
Process function code segment - Array
    case INDEX:
    {
        Node *indexExp = ptr->son->brother;
        if (indexExp->noderep == nonterm) processOperator(indexExp);
        else rv_emit(indexExp);
        stIndex = lookup(ptr->son->token.value.id);
        if (stIndex == -1) {
            printf("undefined variable : %s\n", ptr->son->token.value.id);
            return;
        }
        emit2(lda, symbolTable[stIndex].base, symbolTable[stIndex].offset);
        emit0(add);
        if (!lvalue) emit0(ldi);    // rvalue
        break;
    }
【Example 9】 The following shows an array declaration and array references in Mini C.
    int vector[100];
    void main()
    {
        int temp;
        // ...
        vector[5] = 10;
        // ...
        temp = vector[20];
        // ...
    }
The U-code for vector[5] = 10; in the program above is as follows.
    ldc  5
    lda  1 1     /* load base address(vector) */
    add
    ldc  10
    sti
The U-code for temp = vector[20]; in the program above is as follows.
    ldc  20
    lda  1 1     /* load base address(vector) */
    add
    ldi
    str  2 1     /* temp */
Process function code segment - ++, --
    // increment/decrement operators
    case PRE_INC: case PRE_DEC: case POST_INC: case POST_DEC:
    {
        //…
        // compute operand
        //…
        switch (ptr->token.number) {
        case PRE_INC:
            emit0(incop);
            // if (isOperation(ptr)) emit0(dup);
            break;
        case PRE_DEC:
            emit0(decop);
            // if (isOperation(ptr)) emit0(dup);
            break;
        case POST_INC:
            // if (isOperation(ptr)) emit0(dup);
            emit0(incop);
            break;
        case POST_DEC:
            // if (isOperation(ptr)) emit0(dup);
            emit0(decop);
            break;
        }
        //…
        // compute index
        //…
    }
Example
program
    sum = ++vector[5] + 3;
AST
  ASSIGN_OP
    sum
    ADD
      PRE_INC
        INDEX
          vector
          5
      3
U-Code
    ldc  5
    lda  1 2     /* vector */
    add
    ldi
    inc
    dup
    ldc  5
    lda  1 2     /* vector */
    add
    swp
    sti
    ldc  3
    add
    str  1 1     /* sum */
Statement
Grammar
  statement → compound_st;
            → expression_st;
            → if_st;
            → while_st;
            → return_st;
Process function code segment
    void processStatement(Node *ptr)
    {
        switch (ptr->token.number) {
        // process COMPOUND_ST ...
        // process EXP_ST …
        case RETURN_ST:
            if (ptr->son != NULL) {
                returnWithValue = 1;
                p = ptr->son;
                if (p->noderep == nonterm) processOperator(p);  // return value
                else rv_emit(p);
                emit0(retv);
            } else emit0(ret);
            break;
        // process IF_ST, IF_ELSE_ST, WHILE_ST …
        } //end switch
    }
Statement - Compound statement
Grammar
  compound_st    → '{' opt_dcl_list opt_stat_list '}'    => COMPOUND_ST;
  opt_dcl_list   → declaration_list                      => DCL_LIST;
                 →                                       => DCL_LIST;
  opt_stat_list  → statement_list                        => STAT_LIST;
                 → ;
  statement_list → statement;
                 → statement_list statement;
※ In Mini C, local variables can be declared inside a function but not inside a
compound statement. Therefore, even if local variables are declared inside a
compound statement, they are ignored and only the statements are processed.
Process function code segment
    void processStatement(Node *ptr)
    {
        //…
        case COMPOUND_ST:
            p = ptr->son->brother;  // STAT_LIST
            p = p->son;
            while (p) {
                processStatement(p);
                p = p->brother;
            }
            break;
        //…
    }
Statement - Expression statement
Grammar
  expression_st  → opt_expression ';'    => EXP_ST;
  opt_expression → expression;
                 → ;
Process function code segment
    void processStatement(Node *ptr)
    {
        //…
        case EXP_ST:
            if (ptr->son != NULL) processOperator(ptr->son);
            break;
        //…
    }
Statement - Control statement
Control Statement
  1. conditional statement - if, case, switch
  2. iteration statement   - for, while, do-while, repeat-until
  3. branch statement      - goto
Uses of a logical expression
  1. to compute a logical value
  2. as the control expression of a control statement
Representation of a logical value
  1. Convert true and false to numbers and evaluate in a way similar to arithmetic expressions
  2. Represent the value by the position reached in the program
【Example 11】 The U-code sequence for the logical expression !a || b && c is as follows.
    lod  1 1     // a
    not
    lod  1 2     // b
    lod  1 3     // c
    and
    or
【Example 12】 The AST and U-code for the relational expression a >= b + 1 are as follows.
AST form :
  GE
    a
    ADD
      b
      1
U-code :
    lod  Ba Oa   // Ba: base of variable a, Oa: offset of variable a
    lod  Bb Ob   // Bb: base of variable b, Ob: offset of variable b
    ldc  1
    add
    ge
Generate control statement code
if structure
        CONDITION code
        if false, goto tag
        STATEMENT code
  tag:
if - else structure
        CONDITION code
        if false, goto tag1
        STATEMENT1 code
        goto tag2
  tag1:
        STATEMENT2 code
  tag2:
while structure
  tag1:
        CONDITION code
        if false, goto tag2
        STATEMENT code
        goto tag1
  tag2:
Grammar
  if_st    → 'if' '(' expression ')' statement                     => IF_ST;
           → 'if' '(' expression ')' statement 'else' statement    => IF_ELSE_ST;
  while_st → 'while' '(' expression ')' statement                  => WHILE_ST;
AST and process function code segment
if statement
  IF_ST
    CONDITION
    STATEMENT
    void processStatement(Node *ptr)
    {
        //…
        case IF_ST:
        {
            char label[LABEL_SIZE];
            genLabel(label);
            processCondition(ptr->son);             // condition part
            emitJump(fjp, label);
            processStatement(ptr->son->brother);    // true part
            emitLabel(label);
        }
        //…
    }
if-else statement
  IF_ELSE_ST
    CONDITION
    STATEMENT1
    STATEMENT2
    void processStatement(Node *ptr)
    {
        //…
        case IF_ELSE_ST:
        {
            char label1[LABEL_SIZE], label2[LABEL_SIZE];
            genLabel(label1); genLabel(label2);
            processCondition(ptr->son);                     // condition part
            emitJump(fjp, label1);
            processStatement(ptr->son->brother);            // true part
            emitJump(ujp, label2);
            emitLabel(label1);
            processStatement(ptr->son->brother->brother);   // false part
            emitLabel(label2);
        }
        //…
    }
while statement
  WHILE_ST
    CONDITION
    STATEMENT
    void processStatement(Node *ptr)
    {
        //…
        case WHILE_ST:
        {
            char label1[LABEL_SIZE], label2[LABEL_SIZE];
            genLabel(label1); genLabel(label2);
            emitLabel(label1);
            processCondition(ptr->son);             // condition part
            emitJump(fjp, label2);
            processStatement(ptr->son->brother);    // loop body
            emitJump(ujp, label1);
            emitLabel(label2);
        }
        //…
    }
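The label handling in the while case can be tried out in isolation. The sketch below is a simplified stand-in, not the lecture's routines: genLabel here produces names of the form $$n from a hypothetical counter, and genWhile takes the already generated condition and body code as strings and assembles them around the two labels, using fjp to leave the loop and ujp to jump back.

```c
#include <stdio.h>

static int labelCount = 0;            /* hypothetical label counter */

/* Writes the next label name ($$1, $$2, ...) into label. */
void genLabel(char *label) { sprintf(label, "$$%d", ++labelCount); }

/* Assembles the code for "while (cond) body" into out:
 *   label1: cond; fjp label2; body; ujp label1; label2:
 * cond and body are U-code fragments, each ending in a newline. */
void genWhile(const char *cond, const char *body, char *out, size_t cap) {
    char label1[16], label2[16];
    genLabel(label1);
    genLabel(label2);
    snprintf(out, cap, "%s nop\n%sfjp %s\n%sujp %s\n%s nop\n",
             label1, cond, label2, body, label1, label2);
}
```

Both labels are allocated before any code is emitted, as in the WHILE_ST case above, so that the forward jump to the second label can be written before its position is known.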
Example 1
program
    if (a > max) max = a;
AST
  IF_ST
    GT
      a
      max
    EXP_ST
      ASSIGN_OP
        max
        a
U-Code
         lod  1 1     // a
         lod  1 2     // max
         gt           // a > max
         fjp  $$1
         lod  1 1     // max = a
         str  1 2
    $$1  nop
Example 2
program
    while (i <= 100) {
        sum += i;
        ++i;
    }
AST
  WHILE_ST
    LE
      i
      100
    COMPOUND_ST
      EXP_ST
        ADD_ASSIGN
          sum
          i
      EXP_ST
        PRE_INC
          i
U-Code
    $$1  nop
         lod  1 1     // i <= 100
         ldc  100
         le
         fjp  $$2
         lod  sum     // sum += i;
         lod  i
         add
         str  sum
         lod  i       // ++i;
         inc
         str  i
         ujp  $$1
    $$2  nop
Function - Function call
Grammar
  postfix_exp       → primary_exp;
                    → postfix_exp '(' opt_actual_param ')'    => CALL;
  opt_actual_param  → actual_param;
                    → ;
  actual_param      → actual_param_list                       => ACTUAL_PARAM;
  actual_param_list → assignment_exp;
                    → actual_param_list ',' assignment_exp;
AST
  CALL
    IDENT
    ACTUAL_PARAM
      EXP
      EXP
      …
Process function code segment
    void processStatement(Node *ptr)
    {
        //…
        case CALL:
        {
            //…
            // predefined(Library) functions
            //…
            // handle for user function
            functionName = p->token.value.id;
            stIndex = lookup(functionName);
            if (stIndex == -1) break;   // undefined function !!!
            noArguments = symbolTable[stIndex].width;
            emit0(ldp);
            p = p->brother;             // ACTUAL_PARAM
            while (p) {                 // processing actual arguments
                if (p->noderep == nonterm) processOperator(p);
                else rv_emit(p);
                noArguments--;
                p = p->brother;
            }
            //…
            emitJump(call, ptr->son->token.value.id);
            break;
        }
        //…
    }
Function - Function definition
Grammar
  function_def      → function_header compound_st            => FUNC_DEF;
  function_header   → dcl_spec function_name formal_param    => FUNC_HEAD;
  function_name     → '%ident';
  formal_param      → '(' opt_formal_param ')'               => FORMAL_PARA;
  opt_formal_param  → formal_param_list;
                    → ;
  formal_param_list → param_dcl;
                    → formal_param_list ',' param_dcl;
  param_dcl         → dcl_spec declarator                    => PARAM_DCL;
AST
Function definition
  FUNC_DEF
    FUNC_HEAD
    COMPOUND_ST
Function head
  FUNC_HEAD
    DCL_SPEC
    IDENT
    FORMAL_PARA
Formal parameter
  FORMAL_PARA
    PARAM_DCL
    PARAM_DCL
    …
Process function code segment - Function header
    void processFuncHeader(Node *ptr)
    {
        //…
        // step 1: process the function return type
        p = ptr->son->son;
        while (p) {
            if (p->token.number == INT_NODE) returnType = INT_TYPE;
            else if (p->token.number == VOID_NODE) returnType = VOID_TYPE;
            else printf("invalid function return type\n");
            p = p->brother;
        }
        // step 2: count the number of formal parameters
        p = ptr->son->brother->brother;     // FORMAL_PARA
        p = p->son;                         // PARAM_DCL
        noArguments = 0;
        while (p) {
            noArguments++;
            p = p->brother;
        }
        // step 3: insert the function name
        stIndex = insert(ptr->son->brother->token.value.id, returnType, FUNC_TYPE,
                         1/*base*/, 0/*offset*/, noArguments/*width*/, 0/*initialValue*/);
        //if (!strcmp("main", functionName)) mainExist = 1;
    }
Process function code segment - Function
void processFunction(Node *ptr)
{
// …
// step 1: process formal parameters
// step 2: process the declaration part in function body
// step 3: emit the function start code
// step 4: process the statement part in function body
// step 5: check if return type and return value
// step 6: generate the ending codes
// …
}
Ucode Translator
Design and Implementation of Ucode Translator
  scanner, parser, SDT, ICG
  Mini C Compiler:
    Source Program (*.mc) → Scanner → Token Stream → Parser → SDT → AST → ICG → Ucode (*.uco)
    Ucode (*.uco) + data → Ucode Interpreter → result
Execution sequence of perfect.mc
① Mini C program : Text p. 421
② The output form of the AST using printtree() : Text pp. 417-418
③ Ucode generated by the code generator : Text pp. 426-463
④ The execution of the Ucode using the Ucode Interpreter
    ucodei perfect.uco
    Result filename is perfect.lst
    -- Assembling...
    -- Executing...
    -- Result Data
    6 28 496
Programming Assignment #5
Write a Ucode Translator for the Mini C language.
Run the generated Ucode using the Interpreter.
  AST → ICG → Ucode (*.uco)
  Ucode (*.uco) + data → Ucode Interpreter → execution result
Example program: perfect.mc
    int max = 500;
    void main()
    {
        int i, j, k;
        int rem, sum;
        i = 2;
        while (i <= max) {
            sum = 0;
            k = i / 2;
            j = 1;
            while (j <= k) {
                rem = i % j;
                if (rem == 0) {
                    sum += j;
                }
                ++j;
            }
            if (i == sum) write(i);
            ++i;
        }
    }