Recursive Descent Parser


Week 4
• Questions / Concerns
• Comments about Lab1
• What’s due:
• Lab1 check-off this week (see schedule)
• Homework #3 due Wednesday (Define grammar for your language)
• Homework #4 due Thursday (Grammar modifications)
• Coming up:
• Lab2a & Lab2b posted.
• Test#1 next week
• Grammar Modifications
• Recursive Descent Parser
Lab1
• Data structure for Symbol Table
  • List (dynamic)
  • Dynamic Array
    • std::vector<symbol>
      • Allocates more space automatically when needed; capacity typically grows geometrically (2, 4, 8, 16, etc.)
    • symbol * mySymbolArray
      • You dynamically allocate more space when needed (how many more at a time?)
  • Map
    • Maps a string (name) to more info about the name
    • Sorted map
      • Binary search tree (red/black tree). Tree is always balanced.
    • Unordered map
      • STL’s hashtable
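The two map options above can be sketched as follows. This is a minimal illustration, not the Lab1 solution: the SymbolInfo record, its fields, and the lookup helper are hypothetical names chosen for the example. It assumes C++17 for std::optional.

```cpp
#include <map>
#include <optional>
#include <string>
#include <unordered_map>

// Hypothetical record type; the fields are illustrative, not taken from Lab1.
struct SymbolInfo {
    std::string type;   // e.g. "keyword", "ID", "macro"
    std::string value;  // e.g. the replacement text of a #define
};

// Sorted symbol table: std::map keeps keys in order (typically a red/black tree).
using SortedSymbolTable = std::map<std::string, SymbolInfo>;

// Hashed symbol table: std::unordered_map is the STL's hash table.
using HashedSymbolTable = std::unordered_map<std::string, SymbolInfo>;

// Look up a name; returns std::nullopt when the name is not in the table.
template <typename Table>
std::optional<SymbolInfo> lookup(const Table& table, const std::string& name) {
    auto it = table.find(name);
    if (it == table.end()) return std::nullopt;
    return it->second;
}
```

The same lookup works for both table types, so the choice between sorted and hashed storage can be changed in one place.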
Preprocessor / Symbol Table
• Given the following code snippet:
#define MAX 5
void main()
{
int x = 5;
int y = 6;
……
#if x == 5
//do something
#endif
const int MIN = 0;
• #define MAX 5: add (MAX 5) to the symbol table.
• main, x, and y are not added to the symbol table in the preprocessor.
• #if x == 5 does not see the variable x. Why? There is no preprocessor symbol called x in the symbol table.
• const int MIN = 0: this can be added to the symbol table in the preprocessor because it’s just a constant that’s not going to change.
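As a rough sketch of the first point, the function below scans source text and records only #define lines. The names MacroTable and collectDefines are hypothetical, and the sketch keeps just the first word after the macro name as its value; it does not handle the const int case.

```cpp
#include <sstream>
#include <string>
#include <unordered_map>

// Hypothetical preprocessor symbol table: macro name -> replacement text.
using MacroTable = std::unordered_map<std::string, std::string>;

// Scan the source line by line and record every "#define NAME VALUE".
// Ordinary declarations such as "int x = 5;" are ignored at this stage.
MacroTable collectDefines(const std::string& source) {
    MacroTable table;
    std::istringstream lines(source);
    std::string line;
    while (std::getline(lines, line)) {
        std::istringstream words(line);
        std::string directive, name, value;
        if (words >> directive >> name && directive == "#define") {
            words >> value;       // stays empty for a bare "#define NAME"
            table[name] = value;  // only the first word of the value is kept
        }
    }
    return table;
}
```

Running this over the snippet above would leave MAX in the table and nothing for main, x, or y, which is why a later #if that mentions x finds no preprocessor symbol.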
Lab1 check-off Schedule
• Wednesday: I will be in and out most of the day but can check off labs whenever I am on.
• Thursday: I will be available in the morning for lab check-off and again in the late afternoon.
Structure of Compilers
skeletal source program
-> Preprocessor -> modified source program
-> Lexical Analyzer (scanner) -> tokens
-> Syntax Analysis (parser) -> syntactic structure
-> Semantic Analysis -> intermediate representation
-> Optimizer
-> Code Generator -> target machine code
Symbol Table: consulted and updated by the phases above
Grammar Example
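S -> E
E -> E + E
E -> E * E
E -> id
• This grammar is ambiguous: for example, id + id * id has two different parse trees, one grouping the + first and one grouping the * first.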
Revised & Expanded Grammar Example
S -> id = E ;
E -> E + T |
     E - T |
     T
T -> T * F |
     T / F |
     F
F -> ( E ) |
     id
Parse tree for i = a + b * c;
S
  id (i)
  =
  E
    E
      T
        F
          id (a)
    +
    T
      T
        F
          id (b)
      *
      F
        id (c)
  ;
But…
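S -> id = E ;
E -> E + T |
     E - T |
     T
T -> T * F |
     T / F |
     F
F -> ( E ) |
     id
• This grammar doesn’t work for top-down parsing because of left recursion: a function for E -> E + T would call itself before consuming any token and never terminate.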
In-Class Exercise #6
S -> id = E ;
E -> E + T |
     E - T |
     T
T -> T * F |
     T / F |
     F
F -> ( E ) |
     id
• Remove left-recursion from this grammar
Recursive Descent Parser (RDP)
• Is a top down parser
• Start with grammar modifications
• MUST remove all left recursion from the grammar.
• Try to remove all unit productions.
• Try to left factor so the grammar is one-token look ahead.
• Input to the parser is a list of tokens.
• Output:
• Yes/No: Did the input parse?
• Parse structure: representing all the statements.
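The points above can be sketched for the expression grammar from the earlier slides. The sketch assumes one standard way of removing the left recursion (E -> T E', E' -> + T E' | - T E' | ε, T -> F T', T' -> * F T' | / F T' | ε); the class name Parser and the use of plain strings as tokens, with "id" standing for any identifier, are assumptions made for brevity. It answers only the yes/no question and does not build a parse structure.

```cpp
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

class Parser {
public:
    explicit Parser(std::vector<std::string> tokens) : tokens_(std::move(tokens)) {}

    // S -> id = E ;   (the whole input must be consumed)
    bool parseS() {
        return match("id") && match("=") && E() && match(";") &&
               pos_ == tokens_.size();
    }

private:
    // Consume the current token if it equals `expected` (one-token look ahead).
    bool match(const std::string& expected) {
        if (pos_ < tokens_.size() && tokens_[pos_] == expected) {
            ++pos_;
            return true;
        }
        return false;
    }
    bool E() { return T() && Eprime(); }          // E  -> T E'
    bool Eprime() {                               // E' -> + T E' | - T E' | ε
        if (match("+") || match("-")) return T() && Eprime();
        return true;                              // ε: consume nothing
    }
    bool T() { return F() && Tprime(); }          // T  -> F T'
    bool Tprime() {                               // T' -> * F T' | / F T' | ε
        if (match("*") || match("/")) return F() && Tprime();
        return true;                              // ε: consume nothing
    }
    bool F() {                                    // F  -> ( E ) | id
        if (match("(")) return E() && match(")");
        return match("id");
    }

    std::vector<std::string> tokens_;
    std::size_t pos_ = 0;
};
```

Each nonterminal becomes one bool function, and no function calls itself before consuming a token, which is what removing left recursion buys.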
Grammar HW#2
• Let’s look at the grammar for HW#2
• Identify left recursion
• Identify opportunities for one-token look ahead
• Identify unit productions
HW#2 Grammar
COMPOUND_STAT -> begin OPTIONAL_STAT end
STATEMENT -> VARIABLE := EXPRESSION
           | COMPOUND_STAT
           | PROCEDURE_CALL
           | if EXPRESSION then STATEMENT else STATEMENT
           | while EXPRESSION do STATEMENT
VARIABLE -> id
PROCEDURE_CALL -> id
                | id ( EXPR_LIST )
Recursive Descent Parser
Lab1 -> List of Tokens -> RDP
Each token is a pair (Value, Type):
Value   Type
void    keyword
main    ID
(       symbol
)       symbol
{       symbol
int     keyword
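A token pair like this can be sketched as a small struct. The names Token, classify, and makeTokens are hypothetical, the keyword set is a small illustrative subset, and anything that is neither a keyword nor an identifier is simply labeled a symbol.

```cpp
#include <cctype>
#include <set>
#include <string>
#include <vector>

// Each token is a (Value, Type) pair.
struct Token {
    std::string value;
    std::string type;  // "keyword", "ID", or "symbol"
};

// Classify one lexeme. Numbers and other literals are not handled separately
// in this sketch; they fall through to "symbol".
Token classify(const std::string& lexeme) {
    static const std::set<std::string> keywords = {"void", "int", "if", "while", "return"};
    if (keywords.count(lexeme)) return {lexeme, "keyword"};
    if (!lexeme.empty() &&
        (std::isalpha(static_cast<unsigned char>(lexeme[0])) || lexeme[0] == '_'))
        return {lexeme, "ID"};
    return {lexeme, "symbol"};
}

// Turn already-separated lexemes into the token list handed to the parser.
std::vector<Token> makeTokens(const std::vector<std::string>& lexemes) {
    std::vector<Token> tokens;
    for (const auto& lexeme : lexemes) tokens.push_back(classify(lexeme));
    return tokens;
}
```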
Recursive Descent Parser
• The list of tokens passed from Lab1 to the RDP (void/keyword, main/ID, (/symbol, )/symbol, ...) can be in a separate file.
Recursive Descent Parser
• There are 2 types of rules in the grammar:
• With ε productions
• Without ε productions
• Example:
• Field -> '[' Exp ']' '=' Exp | Name '=' Exp | Exp (without ε)
• Funcname2 -> '.' Name Funcname2 | ε (with ε)
Recursive Descent Parser
• Load tokens into a buffer.
• Need the ability to pull out tokens but also to put them back if you can’t use them in a rule (backtracking).
Buffer: [ void ] [ main (ID) ] [ ( ] [ ) ]
A "current" marker points at the token being examined.
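One possible shape for such a buffer is sketched below. The class name TokenBuffer and its member functions are hypothetical; tokens are plain strings to keep the example short.

```cpp
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

class TokenBuffer {
public:
    explicit TokenBuffer(std::vector<std::string> tokens) : tokens_(std::move(tokens)) {}

    bool atEnd() const { return current_ >= tokens_.size(); }

    // Return the current token and advance; returns "" at end of input.
    std::string next() { return atEnd() ? std::string() : tokens_[current_++]; }

    // Move `current` back by one token (no-op at the start of the buffer).
    void putBack() {
        if (current_ > 0) --current_;
    }

    // Save and restore a position so a rule can undo several tokens at once.
    std::size_t mark() const { return current_; }
    void reset(std::size_t position) { current_ = position; }

private:
    std::vector<std::string> tokens_;
    std::size_t current_ = 0;
};
```

A rule that fails partway through can call reset with the position it saved on entry, which puts back every token it pulled out.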
Recursive Descent Parser
• For every rule in the grammar, generate a bool function.
• This answers the yes/no question first.
• Non-ε rules:
• If tokens match the rule, return true.
• If tokens do not match the rule, return false.
• ε rules:
• The ε alternative means that the rule doesn’t match ANY tokens. It’s not wrong; it just doesn’t match anything at this point in the matching.
• One way to handle it is to just return true and let other rules handle the next token.
Bool Function
S -> a S b |
     c
bool S()
{
    if (currentToken == a) {
        if (S()) {
            if (nextToken == b)
                return true;
            else
                return false;
        }
        else
            return false;
    }
    else if (currentToken == c)
        return true;
    else
        return false;
}
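A compilable version of this bool function might look like the sketch below. It assumes single-character tokens held in a std::string, and the class name ABCParser and the match helper are hypothetical. Unlike the slide pseudocode, it makes token consumption explicit.

```cpp
#include <cstddef>
#include <string>
#include <utility>

// Recognizer for S -> a S b | c over a string of single-character tokens.
class ABCParser {
public:
    explicit ABCParser(std::string input) : input_(std::move(input)) {}

    // True only if the whole input is derived from S.
    bool parse() {
        pos_ = 0;
        return S() && pos_ == input_.size();
    }

private:
    // Consume the current token if it equals `expected`.
    bool match(char expected) {
        if (pos_ < input_.size() && input_[pos_] == expected) {
            ++pos_;
            return true;
        }
        return false;
    }
    bool S() {
        if (match('a')) return S() && match('b');  // S -> a S b
        return match('c');                         // S -> c
    }

    std::string input_;
    std::size_t pos_ = 0;
};
```

For example, ABCParser("aacbb").parse() accepts, while "aacb" is rejected because one b is missing.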
Bool Function
S -> a S b |
     ε
bool S()
{
    if (currentToken == a) {
        if (S()) {
            if (nextToken == b)
                return true;
            else
                return false;
        }
        else
            return false;
    }
    else
        return true; // for ε
}
Bool Function
S -> a S b |
     c
(same bool S() as on the earlier slide for this grammar)
• The parser is a pushdown automaton.
• But where is the “stack”? It is the call stack.
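To make the stack visible, the same language (S -> a S b | c, that is, a^n c b^n) can be recognized with an explicit std::stack instead of recursion. This is an illustrative sketch only; the function name is hypothetical and tokens are single characters.

```cpp
#include <cstddef>
#include <stack>
#include <string>

// Recognize the language of S -> a S b | c with an explicit stack.
bool acceptsWithExplicitStack(const std::string& input) {
    std::stack<char> pending;  // one 'b' owed for every 'a' read so far
    std::size_t i = 0;
    // Each 'a' corresponds to a recursive call to S(): push what is still owed.
    while (i < input.size() && input[i] == 'a') {
        pending.push('b');
        ++i;
    }
    // The innermost call must match S -> c.
    if (i >= input.size() || input[i] != 'c') return false;
    ++i;
    // Each 'b' corresponds to a return from S(): pop the owed symbol.
    while (i < input.size() && !pending.empty() && input[i] == pending.top()) {
        pending.pop();
        ++i;
    }
    return pending.empty() && i == input.size();
}
```

Every push here corresponds to a frame that the recursive bool S() would place on the call stack.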
Bool function
Field -> '[' Exp ']' '=' Exp |
         Name '=' Exp |
         Exp
bool Field()
{
    if (currentToken == '[') {
        if (Exp())
            if (nextToken == ']')
                if (nextToken == '=')
                    if (Exp())
                        return true;
    }
    else if (currentToken == Name) {
        if (nextToken == '=')
            if (Exp())
                return true;
    }
    else if (Exp())   // may need to put tokens back before checking this rule
        return true;
    return false;     // return false for all other conditions
}
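The put-back step can be sketched concretely as below. The slides do not give the Exp rule, so Exp here is a placeholder that accepts a single Name or Number token; the class name FieldParser, the helper functions, and the token type strings are likewise assumptions. Unlike the pseudocode, each alternative is tried in turn after restoring the saved position.

```cpp
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

struct Token {
    std::string value;
    std::string type;  // "Name", "Number", or "symbol" in this sketch
};

class FieldParser {
public:
    explicit FieldParser(std::vector<Token> tokens) : tokens_(std::move(tokens)) {}

    // Field -> '[' Exp ']' '=' Exp | Name '=' Exp | Exp
    bool Field() {
        const std::size_t start = pos_;
        if (symbol("[") && Exp() && symbol("]") && symbol("=") && Exp()) return true;
        pos_ = start;  // put tokens back before trying the next alternative
        if (type("Name") && symbol("=") && Exp()) return true;
        pos_ = start;  // put tokens back again
        if (Exp()) return true;
        pos_ = start;
        return false;  // all other conditions
    }
    bool atEnd() const { return pos_ == tokens_.size(); }

private:
    bool symbol(const std::string& s) {
        if (pos_ < tokens_.size() && tokens_[pos_].type == "symbol" &&
            tokens_[pos_].value == s) {
            ++pos_;
            return true;
        }
        return false;
    }
    bool type(const std::string& t) {
        if (pos_ < tokens_.size() && tokens_[pos_].type == t) {
            ++pos_;
            return true;
        }
        return false;
    }
    // Placeholder only: the real Exp rule is not given in the slides.
    bool Exp() { return type("Name") || type("Number"); }

    std::vector<Token> tokens_;
    std::size_t pos_ = 0;
};
```

With the single token x as input, the second alternative consumes the Name, fails on the missing '=', and the parser has to put x back before the Exp alternative can match it.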
Bool function
Funcname2 -> '.' Name Funcname2 |
             ε
bool Funcname2()
{
    if (currentToken == '.') {
        if (nextToken == Name)
            if (Funcname2())
                return true;
        return false;   // '.' was not followed by Name Funcname2
    }
    else
        return true;    // for ε. Don’t remove any tokens
}
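A compilable sketch of this ε rule follows. Tokens are plain strings, and as a simplifying assumption any token other than "." counts as a Name; the class name FuncnameParser and the consumed helper are hypothetical.

```cpp
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Funcname2 -> '.' Name Funcname2 | ε
class FuncnameParser {
public:
    explicit FuncnameParser(std::vector<std::string> tokens) : tokens_(std::move(tokens)) {}

    bool Funcname2() {
        if (pos_ < tokens_.size() && tokens_[pos_] == ".") {
            ++pos_;  // consume '.'
            if (pos_ < tokens_.size() && tokens_[pos_] != ".") {
                ++pos_;  // consume Name
                return Funcname2();
            }
            return false;  // '.' must be followed by a Name
        }
        return true;  // ε: succeed without removing any tokens
    }

    // Number of tokens consumed so far.
    std::size_t consumed() const { return pos_; }

private:
    std::vector<std::string> tokens_;
    std::size_t pos_ = 0;
};
```

When the next token is not '.', the function returns true with consumed() unchanged, leaving that token for whichever rule comes next.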