Introduction to Compilation Technology
Simple One-Pass Compiler
Natawut Nupairoj, Ph.D.
Department of Computer Engineering
Chulalongkorn University
Outline
Translation Scheme.
Annotated Parse Tree.
Parsing Fundamentals.
Top-Down Parsers.
Abstract Stack Machine.
Simple Code Generation.
Simple One-Pass Compiler
Source program (text stream, e.g. the characters m a i n ( ) {) -> Scanner -> Parser -> Object code (text stream)
Sample Grammar
expr ::= expr + term
expr ::= expr - term
expr ::= term
term ::= 0 | 1 | 2 | ... | 9
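Derivation
String: 9 - 5 + 2
expr
=> expr + term
=> expr - term + term
=> term - term + term
=> 9 - term + term
=> 9 - 5 + term
=> 9 - 5 + 2
This is a leftmost derivation: each step replaces the leftmost nonterminal. A rightmost derivation replaces the rightmost nonterminal instead.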
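The derivation of 9 - 5 + 2 can be replayed mechanically. The following is a minimal sketch in Python, not part of the original slides; the names GRAMMAR and leftmost_derivation are hypothetical and chosen only for illustration.

```python
# Sample grammar from the slides, written as nonterminal -> list of right-hand sides.
GRAMMAR = {
    "expr": ["expr + term", "expr - term", "term"],
    "term": [str(d) for d in range(10)],
}

def leftmost_derivation(start, choices):
    """Apply each chosen right-hand side to the leftmost nonterminal in turn."""
    form = [start]
    steps = [" ".join(form)]
    for rhs in choices:
        # Locate the leftmost nonterminal in the current sentential form.
        i = next(k for k, sym in enumerate(form) if sym in GRAMMAR)
        # The chosen right-hand side must be a production of that nonterminal.
        assert rhs in GRAMMAR[form[i]], f"{form[i]} has no production {rhs!r}"
        form[i:i + 1] = rhs.split()
        steps.append(" ".join(form))
    return steps

steps = leftmost_derivation(
    "expr",
    ["expr + term", "expr - term", "term", "9", "5", "2"],
)
print("\n=> ".join(steps))
```

Each printed line corresponds to one step of the derivation above, ending in the string 9 - 5 + 2.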
Parse Tree
Parse tree for 9 - 5 + 2 (children indented under their parent):
expr
  expr
    expr
      term
        9
    -
    term
      5
  +
  term
    2
Translation Scheme
Context-free grammar with embedded semantic actions.
expr ::= expr + term   { print('+'); }
expr ::= expr - term   { print('-'); }
expr ::= term
term ::= 0             { print('0'); }
term ::= 1             { print('1'); }
...
term ::= 9             { print('9'); }
The semantic actions emit the translation.
Parse Tree with Semantic Actions
Depth-first traversal (children indented under their parent; actions run when they are visited):
expr
  expr
    expr
      term
        9
        { print('9') }
    -
    term
      5
      { print('5') }
    { print('-') }
  +
  term
    2
    { print('2') }
  { print('+') }
Input: 9 - 5 + 2
Output: 9 5 - 2 +
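The traversal can be sketched in Python as follows. This is an illustration under assumptions, not the slides' own code: the tree is hand-built for 9 - 5 + 2, and the Action class, leaf_term and traverse are hypothetical names.

```python
class Action:
    """A semantic action attached to the tree; here it only carries text to print."""
    def __init__(self, text):
        self.text = text

def leaf_term(digit):
    # term ::= digit { print(digit) }
    return ("term", [digit, Action(digit)])

def traverse(node, out):
    """Depth-first, left-to-right walk that runs actions as they are met."""
    if isinstance(node, Action):
        out.append(node.text)
    elif isinstance(node, tuple):
        _label, children = node
        for child in children:
            traverse(child, out)
    # Plain strings are terminals (digits, '+', '-') and emit nothing themselves.
    return out

# Hand-built parse tree for 9 - 5 + 2, with actions at the end of each production.
inner = ("expr", [("expr", [leaf_term("9")]), "-", leaf_term("5"), Action("-")])
tree = ("expr", [inner, "+", leaf_term("2"), Action("+")])

print(" ".join(traverse(tree, [])))  # 9 5 - 2 +
```

Because every action sits at the right end of its production, it runs only after the subtrees to its left have been visited, which is why the output comes out in postfix order.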
Location of Semantic Actions
Semantic actions can be placed anywhere on the RHS.
expr ::= {print('+');} expr + term
expr ::= {print('-');} expr - term
expr ::= term
term ::= 0 {print('0');}
term ::= 1 {print('1');}
...
term ::= 9 {print('9');}
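Where an action sits changes when it runs during the depth-first traversal, and so changes the output. The sketch below is an assumption-laden illustration: it uses a simplified tree of (op, left, right) tuples rather than a full parse tree, and translate and actions_first are hypothetical names.

```python
def translate(node, actions_first):
    """Emit symbols for a tree of expr ::= expr op term.

    node is either a digit string (term ::= d {print(d)}) or a tuple
    (op, left, right). actions_first selects where the print action for
    the operator sits: at the start or at the end of the production.
    """
    if isinstance(node, str):
        return [node]
    op, left, right = node
    body = translate(left, actions_first) + translate(right, actions_first)
    return [op] + body if actions_first else body + [op]

tree = ("+", ("-", "9", "5"), "2")  # 9 - 5 + 2

print(" ".join(translate(tree, actions_first=False)))  # 9 5 - 2 +
print(" ".join(translate(tree, actions_first=True)))   # + - 9 5 2
```

With the actions at the start of the expr productions, as in the scheme above, the operators are printed before their operands.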
Parsing Approaches
Top-down parsing
  builds the parse tree from the start symbol
  matches the resulting terminal string against the input stream
  simple but limited in power
Bottom-up parsing
  starts from the input token stream
  builds the parse tree from the terminals until it reaches the start symbol
  complex but powerful
Top Down vs. Bottom Up
Top-down parsing: start at the top of the tree, then match the result against the input token stream.
Bottom-up parsing: start at the input token stream and build up to the result.
Example
type ::= simple
       | ^id
       | array [ simple ] of type
simple ::= integer
         | char
         | num dotdot num
Input token string: array [ num dotdot num ] of integer
Top-Down Parsing with Left-to-Right Scanning of Input Stream
Parse tree so far:
type
  array [ simple ] of type
Input: array [ num dotdot num ] of integer
(lookahead token: the input token currently being examined)
Backtracking (Recursive-Descent Parsing)
Candidate expansions of simple: integer, char, num
Input: array [ num dotdot num ] of integer
(lookahead token: the input token currently being examined)
Predictive Parsing
type ::= simple
       | ^id
       | array [ simple ] of type
simple ::= integer
         | char
         | num dotdot num
Parse tree so far:
type
  array [ simple ] of type
Input: array [ num dotdot num ] of integer
(lookahead token: the input token currently being examined)
The Program for Predictive Parser
Input (text stream, e.g. a r r a y [) -> match (scanner) -> Predictive Parser -> Output
The parser calls match('array'); match answers OK when the lookahead token is the expected one.
The Program for Predictive Parsing
procedure match ( t : token );
begin
  { consume the lookahead token if it is the expected one }
  if lookahead = t then
    lookahead := nexttoken
  else error
end;

procedure type;
  { one procedure per nonterminal; simple is nested inside type }
  procedure simple;
  begin
    if lookahead = integer then
      match ( integer )
    else if lookahead = char then
      match ( char )
    else if lookahead = num then begin
      match ( num ); match ( dotdot ); match ( num )
    end
    else error
  end;
begin
  { the lookahead token selects the production to use }
  if lookahead is in { integer, char, num } then
    simple
  else if lookahead = '^' then begin
    match ( '^' ); match ( id )
  end
  else if lookahead = array then begin
    match ( array ); match ( '[' ); simple; match ( ']' ); match ( of ); type
  end
  else error
end;
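For readers who want to run the parser, here is a minimal Python sketch of the same procedures. It is an illustration, not the slides' code: the Parser class, ParseError, the accepts helper, and the "$" end-of-input marker are assumptions, and tokens are taken to be whitespace-separated words.

```python
class ParseError(Exception):
    pass

class Parser:
    def __init__(self, tokens):
        self.tokens = list(tokens) + ["$"]  # "$" is an assumed end-of-input marker
        self.pos = 0

    @property
    def lookahead(self):
        return self.tokens[self.pos]

    def match(self, t):
        # Consume the lookahead token if it is the expected one.
        if self.lookahead == t:
            self.pos += 1
        else:
            raise ParseError(f"expected {t!r}, got {self.lookahead!r}")

    def simple(self):
        if self.lookahead == "integer":
            self.match("integer")
        elif self.lookahead == "char":
            self.match("char")
        elif self.lookahead == "num":
            self.match("num")
            self.match("dotdot")
            self.match("num")
        else:
            raise ParseError(f"unexpected {self.lookahead!r} in simple")

    def type_(self):  # trailing underscore avoids shadowing Python's built-in type
        if self.lookahead in {"integer", "char", "num"}:
            self.simple()
        elif self.lookahead == "^":
            self.match("^")
            self.match("id")
        elif self.lookahead == "array":
            self.match("array")
            self.match("[")
            self.simple()
            self.match("]")
            self.match("of")
            self.type_()
        else:
            raise ParseError(f"unexpected {self.lookahead!r} in type")

def accepts(text):
    """Return True if the whole token string derives from type."""
    parser = Parser(text.split())
    try:
        parser.type_()
        parser.match("$")
        return True
    except ParseError:
        return False

print(accepts("array [ num dotdot num ] of integer"))  # True
print(accepts("array [ num dotdot ] of integer"))      # False
```

As in the Pascal version, no backtracking is needed: each branch is chosen by looking at a single lookahead token.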
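Mapping Between Production and Parser Code
type -> array [ simple ] of type
match(array); match('['); simple; match(']'); match(of); type
Terminals are checked against the scanner's tokens with match; nonterminals become calls to parser procedures. For example, the call to simple performs the parsing (recognition) of simple.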
Lookahead Symbols
For a production A -> a:
FIRST(a) = the set of first tokens of the strings generated from a
FIRST(simple) = { integer, char, num }
FIRST( ^id ) = { ^ }
FIRST(array [ simple ] of type) = { array }
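The FIRST sets above can be computed mechanically. The sketch below is a simplified illustration that assumes a grammar without e-productions, so FIRST of a string is just FIRST of its first symbol; GRAMMAR and first_of_string are hypothetical names.

```python
# The type grammar from the slides: nonterminal -> list of alternatives,
# each alternative a list of symbols.
GRAMMAR = {
    "type": [
        ["simple"],
        ["^", "id"],
        ["array", "[", "simple", "]", "of", "type"],
    ],
    "simple": [
        ["integer"],
        ["char"],
        ["num", "dotdot", "num"],
    ],
}

def first_of_string(symbols, grammar, seen=()):
    """FIRST of a symbol string, assuming the grammar has no e-productions."""
    head = symbols[0]
    if head not in grammar:
        return {head}        # a terminal is its own FIRST set
    if head in seen:
        return set()         # already being expanded (guards against left recursion)
    result = set()
    for alternative in grammar[head]:
        result |= first_of_string(alternative, grammar, seen + (head,))
    return result

print(sorted(first_of_string(["simple"], GRAMMAR)))   # ['char', 'integer', 'num']
print(sorted(first_of_string(["^", "id"], GRAMMAR)))  # ['^']
print(sorted(first_of_string(["array", "[", "simple", "]", "of", "type"], GRAMMAR)))  # ['array']
```

A full FIRST computation also has to handle e-productions, which this sketch deliberately leaves out.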
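Rules for Predictive Parser
If A -> a and A -> b are both productions, then FIRST(a) and FIRST(b) must be disjoint, so that the lookahead token selects exactly one of them.
e-production (e denotes the empty string): the parser uses it when the lookahead token matches no other alternative.
stmt -> begin opt_stmts end
opt_stmts -> stmt_list opt_stmts | e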
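The disjointness rule can be checked automatically. The following sketch assumes a grammar with no e-productions and no left recursion; first, conflicts, TYPE_GRAMMAR and the if/else grammar BAD are hypothetical names and examples introduced only for illustration.

```python
from itertools import combinations

def first(symbols, grammar):
    """FIRST of a symbol string (no e-productions, no left recursion assumed)."""
    head = symbols[0]
    if head not in grammar:
        return {head}
    return set().union(*(first(alt, grammar) for alt in grammar[head]))

def conflicts(grammar):
    """List (nonterminal, alt1, alt2, shared tokens) for overlapping alternatives."""
    found = []
    for nonterminal, alternatives in grammar.items():
        for a, b in combinations(alternatives, 2):
            shared = first(a, grammar) & first(b, grammar)
            if shared:
                found.append((nonterminal, a, b, shared))
    return found

TYPE_GRAMMAR = {
    "type": [["simple"], ["^", "id"], ["array", "[", "simple", "]", "of", "type"]],
    "simple": [["integer"], ["char"], ["num", "dotdot", "num"]],
}

# Hypothetical grammar whose two alternatives both start with 'if'.
BAD = {
    "stmt": [
        ["if", "cond", "then", "other"],
        ["if", "cond", "then", "other", "else", "other"],
    ],
}

print(conflicts(TYPE_GRAMMAR))   # []
print(conflicts(BAD)[0][3])      # {'if'}
```

An empty result means one lookahead token is enough to pick a production for every nonterminal; a non-empty result names the alternatives a predictive parser could not tell apart.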
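Left Recursion
Left recursion => a top-down parser loops forever, because the procedure for A calls itself before consuming any input.
A -> Aa | b
Example: expr -> expr + term | term
Rewriting to remove the left recursion:
A -> b R
R -> a R | e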
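The rewriting rule can be written as a small function. This is a sketch for immediate left recursion only (a production whose right-hand side starts with its own left-hand side); the function name and the list-of-symbols representation are assumptions made for illustration.

```python
def eliminate_immediate_left_recursion(nonterminal, alternatives, new_nonterminal):
    """Rewrite A -> A a | b as A -> b R and R -> a R | e.

    alternatives is a list of symbol lists; the empty list [] stands for e.
    """
    # The 'a' parts: what follows the leading A in left-recursive alternatives.
    recursive = [alt[1:] for alt in alternatives if alt and alt[0] == nonterminal]
    # The 'b' parts: alternatives that do not start with A.
    others = [alt for alt in alternatives if not alt or alt[0] != nonterminal]
    if not recursive:
        return {nonterminal: alternatives}  # nothing to rewrite
    return {
        nonterminal: [b + [new_nonterminal] for b in others],
        new_nonterminal: [a + [new_nonterminal] for a in recursive] + [[]],
    }

result = eliminate_immediate_left_recursion(
    "expr",
    [["expr", "+", "term"], ["expr", "-", "term"], ["term"]],
    "rest",
)
for lhs, alts in result.items():
    print(lhs, "->", " | ".join(" ".join(alt) if alt else "e" for alt in alts))
```

Applied to the expr productions, this prints expr -> term rest and rest -> + term rest | - term rest | e.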
Example
Before (left-recursive):
expr -> expr + term
expr -> expr - term
expr -> term
term -> 0 | 1 | 2 | ... | 9
After (left recursion removed):
expr -> term rest
rest -> + term rest
      | - term rest
      | e
term -> 0 | 1 | 2 | ... | 9
Semantic Actions
expr -> term rest
rest -> + term {print('+');} rest
      | - term {print('-');} rest
      | e
term -> 0 {print('0');}
      | 1 {print('1');}
      | ...
procedure expr;
begin
  term();
  rest()
end;

procedure rest;
begin
  if lookahead = '+' then
  begin
    match('+');
    term();
    print('+');
    rest()
  end
  else if lookahead = '-' then
  begin
    match('-');
    term();
    print('-');
    rest()
  end
  { otherwise: e-production, nothing to do }
end;
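Putting the pieces together, the translation scheme becomes a small infix-to-postfix translator. The Python sketch below mirrors expr and rest and adds a term procedure for single digits; the Translator class, to_postfix, the "$" end marker and the one-character tokenizer are assumptions made for this illustration.

```python
class Translator:
    def __init__(self, text):
        # One token per non-blank character; "$" is an assumed end-of-input marker.
        self.tokens = [c for c in text if not c.isspace()] + ["$"]
        self.pos = 0
        self.out = []

    @property
    def lookahead(self):
        return self.tokens[self.pos]

    def match(self, t):
        if self.lookahead == t:
            self.pos += 1
        else:
            raise SyntaxError(f"expected {t!r}, got {self.lookahead!r}")

    def expr(self):
        # expr -> term rest
        self.term()
        self.rest()

    def rest(self):
        # rest -> + term {print('+')} rest | - term {print('-')} rest | e
        if self.lookahead in ("+", "-"):
            op = self.lookahead
            self.match(op)
            self.term()
            self.out.append(op)
            self.rest()
        # Otherwise take the e-production and do nothing.

    def term(self):
        # term -> 0 {print('0')} | ... | 9 {print('9')}
        if self.lookahead in "0123456789":
            digit = self.lookahead
            self.match(digit)
            self.out.append(digit)
        else:
            raise SyntaxError(f"digit expected, got {self.lookahead!r}")

def to_postfix(text):
    translator = Translator(text)
    translator.expr()
    translator.match("$")  # the whole input must be consumed
    return " ".join(translator.out)

print(to_postfix("9 - 5 + 2"))  # 9 5 - 2 +
```

The output is collected in a list instead of being printed directly, which makes the result easy to check; the order in which symbols are emitted is the same as with print.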