Introduction to Compilation Technology


Simple One-Pass Compiler
Natawut Nupairoj, Ph.D.
Department of Computer Engineering
Chulalongkorn University
Outline
• Translation Scheme
• Annotated Parse Tree
• Parsing Fundamentals
• Top-Down Parsers
• Abstract Stack Machine
• Simple Code Generation

Simple One-Pass Compiler
Source program (text stream, e.g. "m a i n ( ) {") -> Scanner -> Parser -> Object code (text stream)
Sample Grammar
expr -> expr + term
expr -> expr - term
expr -> term
term -> 0 | 1 | 2 | ... | 9
Derivation
String: 9 - 5 + 2
expr
=> expr + term
=> expr - term + term
=> term - term + term
=> 9 - term + term
=> 9 - 5 + term
=> 9 - 5 + 2

Leftmost vs. rightmost derivation: every step above expands the leftmost nonterminal, so this is a leftmost derivation.
Parse Tree
expr
├─ expr
│  ├─ expr
│  │  └─ term
│  │     └─ 9
│  ├─ -
│  └─ term
│     └─ 5
├─ +
└─ term
   └─ 2
Translation Scheme

Context-free grammar with embedded semantic actions.
expr ::= expr + term   { print('+'); }
expr ::= expr - term   { print('-'); }
expr ::= term
term ::= 0             { print('0'); }
term ::= 1             { print('1'); }
...
term ::= 9             { print('9'); }
The semantic actions emit (print out) the translation.
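Parse Tree with Semantic Actions
expr
├─ expr
│  ├─ expr
│  │  └─ term
│  │     ├─ 9
│  │     └─ { print('9') }
│  ├─ -
│  ├─ term
│  │  ├─ 5
│  │  └─ { print('5') }
│  └─ { print('-') }
├─ +
├─ term
│  ├─ 2
│  └─ { print('2') }
└─ { print('+') }

A depth-first traversal executes the actions in the order they are visited.
Input: 9 - 5 + 2
Output: 9 5 - 2 +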
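The depth-first traversal can be sketched in Python. This is only an illustration: the tuple encoding of tree nodes and the helper names `leaf`, `action`, `term`, and `translate` are assumptions made for this example, not notation from the slides.

```python
# Parse tree for "9 - 5 + 2". Each node is a (label, children) pair.
# A semantic action is modelled as a node labelled "{print}" whose single
# child is the text to emit (an illustrative encoding, not the slides').

def leaf(symbol):
    return (symbol, [])

def action(text):
    return ("{print}", [text])

def term(digit):
    # term ::= digit { print(digit) }
    return ("term", [leaf(digit), action(digit)])

tree = (
    "expr",
    [
        ("expr", [
            ("expr", [term("9")]),
            leaf("-"),
            term("5"),
            action("-"),      # expr ::= expr - term { print('-') }
        ]),
        leaf("+"),
        term("2"),
        action("+"),          # expr ::= expr + term { print('+') }
    ],
)

def translate(node, out):
    """Walk the tree depth-first, left to right, running actions when visited."""
    label, children = node
    if label == "{print}":
        out.append(children[0])
        return
    for child in children:
        translate(child, out)

output = []
translate(tree, output)
print(" ".join(output))  # 9 5 - 2 +
```

Because each action sits at the right end of its production, it runs only after both operand subtrees have been visited, which is why the result is postfix.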
Location of Semantic Actions

Semantic actions can be placed anywhere on the RHS.
expr ::= {print('+');} expr + term
expr ::= {print('-');} expr - term
expr ::= term
term ::= 0 {print('0');}
term ::= 1 {print('1');}
...
term ::= 9 {print('9');}
Parsing Approaches

Top-down parsing
• build the parse tree from the start symbol
• match the resulting terminal string against the input stream
• simple, but limited in power

Bottom-up parsing
• start from the input token stream
• build the parse tree from the terminals until the start symbol is reached
• complex, but powerful
Top Down vs. Bottom Up
[Figure: Top-down parsing starts at the start symbol and matches the result against the input token stream. Bottom-up parsing starts at the input token stream and the result is the start symbol.]
Example
type   ::= simple
         | ^id
         | array [ simple ] of type
simple ::= integer
         | char
         | num dotdot num
Input Token String
array [ num dotdot num ] of integer
Top-Down Parsing with Left-to-right Scanning of Input Stream
Parse tree so far: type, with children array [ simple ] of type
Input: array [ num dotdot num ] of integer
[Figure: an arrow marks the lookahead token in the input.]
Backtracking (Recursive-Descent Parsing)
[Figure: the node simple with its alternatives integer, char, and num, which are tried in turn against the lookahead token.]
Input: array [ num dotdot num ] of integer
Predictive Parsing
type   ::= simple
         | ^id
         | array [ simple ] of type
simple ::= integer
         | char
         | num dotdot num
[Figure: type expanded to array [ simple ] of type; the lookahead token alone selects the production.]
Input: array [ num dotdot num ] of integer
The Program for Predictive Parser
[Figure: the predictive parser calls match('array'); match (the scanner) reads the input text stream "a r r a y [" and answers OK; the parser produces the output.]
The Program for Predictive Parsing
procedure match ( t : token );
begin
  if lookahead = t then
    lookahead := nexttoken   { advance to the next input token }
  else error
end;

procedure type;

  procedure simple;
  begin
    if lookahead = integer then
      match ( integer )
    else if lookahead = char then
      match ( char )
    else if lookahead = num then begin
      match ( num ); match ( dotdot ); match ( num )
    end
    else error
  end;

begin
  if lookahead is in { integer, char, num } then
    simple
  else if lookahead = '^' then begin
    match ( '^' ); match ( id )
  end
  else if lookahead = array then begin
    match ( array ); match ( '[' ); simple; match ( ']' ); match ( of ); type
  end
  else error
end;
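For readers who want to run the parser, the pseudocode can be transcribed into Python roughly as follows. This is a sketch under stated assumptions: tokens are whitespace-separated strings, "$" is an end-of-input marker added here, and the names `ParseError`, `PredictiveParser`, `type_`, and `parse` are hypothetical and chosen only for this example.

```python
class ParseError(Exception):
    """Raised where the pseudocode calls error."""


class PredictiveParser:
    def __init__(self, tokens):
        # "$" is an assumed end-of-input marker, not part of the grammar.
        self.tokens = list(tokens) + ["$"]
        self.pos = 0

    @property
    def lookahead(self):
        return self.tokens[self.pos]

    def match(self, t):
        # Consume the lookahead token if it is the expected one.
        if self.lookahead == t:
            self.pos += 1
        else:
            raise ParseError(f"expected {t!r}, found {self.lookahead!r}")

    def simple(self):
        # simple ::= integer | char | num dotdot num
        if self.lookahead == "integer":
            self.match("integer")
        elif self.lookahead == "char":
            self.match("char")
        elif self.lookahead == "num":
            self.match("num")
            self.match("dotdot")
            self.match("num")
        else:
            raise ParseError(f"unexpected {self.lookahead!r} in simple")

    def type_(self):
        # type ::= simple | ^id | array [ simple ] of type
        if self.lookahead in ("integer", "char", "num"):
            self.simple()
        elif self.lookahead == "^":
            self.match("^")
            self.match("id")
        elif self.lookahead == "array":
            self.match("array")
            self.match("[")
            self.simple()
            self.match("]")
            self.match("of")
            self.type_()
        else:
            raise ParseError(f"unexpected {self.lookahead!r} in type")


def parse(text):
    """Return True if text is a sentence of the grammar, else raise ParseError."""
    parser = PredictiveParser(text.split())
    parser.type_()
    parser.match("$")  # the whole input must be consumed
    return True


print(parse("array [ num dotdot num ] of integer"))  # True
```

Each procedure inspects only the current lookahead token to pick a branch, so the parser never backs up over input it has already consumed.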
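Mapping Between Production and Parser Codes
type -> array [ simple ] of type
match(array); match('['); simple; match(']'); match(of); type
Each terminal on the right-hand side becomes a match call (scanner). Each nonterminal becomes a call to its parsing procedure (parser); for example, the call to simple performs the parsing (recognition) of simple.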
Lookahead Symbols
FIRST( a ) =
A -> a
set of fist token in strings
generated from a
FIRST(simple) = { integer, char, num }
FIRST( ^id ) = { ^ }
FIRST(array [ simple ] of type) = { array }
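The FIRST sets listed above can be reproduced with a few lines of Python. This is a minimal sketch that assumes a grammar with no ε-productions and no left recursion, which holds for the type/simple grammar; the dictionary layout and the names `GRAMMAR`, `first`, and `alternatives_disjoint` are assumptions made for this example.

```python
from itertools import combinations

# Each nonterminal maps to a list of alternatives; each alternative is a
# list of symbols. Symbols that are not keys are treated as terminals.
GRAMMAR = {
    "type": [
        ["simple"],
        ["^", "id"],
        ["array", "[", "simple", "]", "of", "type"],
    ],
    "simple": [
        ["integer"],
        ["char"],
        ["num", "dotdot", "num"],
    ],
}


def first(symbols, grammar=GRAMMAR):
    """FIRST of a symbol string, assuming no ε-productions and no left recursion."""
    head = symbols[0]
    if head not in grammar:
        return {head}  # a terminal starts only with itself
    result = set()
    for alternative in grammar[head]:
        result |= first(alternative, grammar)
    return result


def alternatives_disjoint(nonterminal, grammar=GRAMMAR):
    """True if the FIRST sets of all alternatives are pairwise disjoint."""
    sets = [first(alt, grammar) for alt in grammar[nonterminal]]
    return all(a.isdisjoint(b) for a, b in combinations(sets, 2))


print(sorted(first(["simple"])))        # ['char', 'integer', 'num']
print(first(["^", "id"]))               # {'^'}
print(alternatives_disjoint("type"))    # True
```

When the FIRST sets of a nonterminal's alternatives do not overlap, one lookahead token is enough to choose among them.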
Rules for Predictive Parser

• If A -> α and A -> β, then FIRST(α) and FIRST(β) must be disjoint.

• ε-production: taken by default when the lookahead token matches no other alternative.
stmt -> begin opt_stmts end
opt_stmts -> stmt_list opt_stmts | ε
Left Recursion

• Left recursion => parser loops forever
A -> A α | β
expr -> expr + term | term

• Rewriting...
A -> β R
R -> α R | ε
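The rewriting rule can be sketched as a small Python function. It handles only immediate left recursion of a single nonterminal, as on the slide; the list-of-symbols representation and the names `EPSILON` and `eliminate_left_recursion` are assumptions made for this example.

```python
EPSILON = "ε"


def eliminate_left_recursion(nonterminal, alternatives, new_name):
    """Rewrite A -> A α | β as A -> β R, R -> α R | ε.

    alternatives is a list of symbol lists; new_name is the name of the
    fresh nonterminal R. Only immediate left recursion is handled.
    """
    # α parts: what follows the leading A in each left-recursive alternative.
    alphas = [alt[1:] for alt in alternatives if alt and alt[0] == nonterminal]
    # β parts: the alternatives that do not start with A.
    betas = [alt for alt in alternatives if not alt or alt[0] != nonterminal]
    if not alphas:
        return {nonterminal: alternatives}  # nothing to rewrite
    return {
        nonterminal: [beta + [new_name] for beta in betas],
        new_name: [alpha + [new_name] for alpha in alphas] + [[EPSILON]],
    }


rewritten = eliminate_left_recursion(
    "expr",
    [["expr", "+", "term"], ["expr", "-", "term"], ["term"]],
    "rest",
)
for head, alts in rewritten.items():
    print(head, "->", " | ".join(" ".join(alt) for alt in alts))
```

Running it on the expr productions prints `expr -> term rest` and `rest -> + term rest | - term rest | ε`.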
Example
Left-recursive grammar:
expr -> expr + term
expr -> expr - term
expr -> term
term -> 0 | 1 | 2 | ... | 9
After rewriting:
expr -> term rest
rest -> + term rest
      | - term rest
      | ε
term -> 0 | 1 | 2 | ... | 9
Semantic Actions
expr -> term rest
rest -> + term {print('+');} rest
      | - term {print('-');} rest
      | ε
term -> 0 {print('0');}
      | 1 {print('1');}
      ...
procedure expr;
begin
  term();
  rest();
end;

procedure rest;
begin
  if lookahead = '+' then
  begin
    match('+');
    term();
    print('+');
    rest();
  end
  else if lookahead = '-' then
  begin
    match('-');
    term();
    print('-');
    rest();
  end;
  { otherwise rest -> ε: do nothing }
end;
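Put together, the procedures form a complete infix-to-postfix translator. The Python sketch below follows the same structure; it assumes single-digit operands and single-character tokens, uses "$" as an end-of-input marker added here, and the names `Translator`, `DIGITS`, and `translate` are hypothetical. The `term` procedure is not shown on the slides and is filled in from the production term -> 0 {print('0');} | 1 {print('1');} ...

```python
DIGITS = "0123456789"


class Translator:
    def __init__(self, text):
        # One token per non-blank character; "$" is an assumed end marker.
        self.tokens = [ch for ch in text if not ch.isspace()] + ["$"]
        self.pos = 0
        self.out = []

    @property
    def lookahead(self):
        return self.tokens[self.pos]

    def match(self, t):
        if self.lookahead != t:
            raise ValueError(f"expected {t!r}, found {self.lookahead!r}")
        self.pos += 1

    def expr(self):
        # expr -> term rest
        self.term()
        self.rest()

    def rest(self):
        # rest -> + term {print('+')} rest | - term {print('-')} rest | ε
        if self.lookahead in ("+", "-"):
            op = self.lookahead
            self.match(op)
            self.term()
            self.out.append(op)   # the semantic action
            self.rest()
        # otherwise rest -> ε: consume nothing

    def term(self):
        # term -> digit {print(digit)}
        if self.lookahead in DIGITS:
            digit = self.lookahead
            self.match(digit)
            self.out.append(digit)
        else:
            raise ValueError(f"digit expected, found {self.lookahead!r}")


def translate(text):
    translator = Translator(text)
    translator.expr()
    translator.match("$")  # reject trailing input
    return " ".join(translator.out)


print(translate("9 - 5 + 2"))  # 9 5 - 2 +
```

The operator is appended only after the second operand has been translated, mirroring the position of the print action between term and rest in the grammar.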