Recursive Descent Parsing

26-Jul-16
Abstract Syntax Trees (ASTs)

An AST is a way of representing a computer program

The program is represented as a tree
It is abstract because it throws away unnecessary information
(comments, whitespace, punctuation used only for
disambiguation, etc.)
It represents syntax, not semantics

However, a well-designed grammar can sometimes help with the semantics
 For example,
 <expr> ::= <integer> | <expr> + <expr> | <expr> * <expr>
 is technically correct, but gives us no help in recognizing that
 multiplication has precedence over addition
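To make the idea of an AST concrete, here is a minimal sketch of a generic tree and the AST for 2 + 3 * 4, with the multiplication nested below the addition. SimpleTree and AstDemo are hypothetical names invented for this illustration, and String values stand in for Tokens; the real Tree class may look different.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical minimal tree; not the Tree API used elsewhere in these slides.
class SimpleTree<V> {
    final V value;
    final List<SimpleTree<V>> children = new ArrayList<>();

    SimpleTree(V value) { this.value = value; }

    // Adds a child and returns this node so calls can be chained.
    SimpleTree<V> addChild(SimpleTree<V> child) {
        children.add(child);
        return this;
    }

    // Renders a leaf as "x" and an interior node as "op(child, child)".
    @Override
    public String toString() {
        if (children.isEmpty()) return String.valueOf(value);
        StringBuilder sb = new StringBuilder(String.valueOf(value)).append("(");
        for (int i = 0; i < children.size(); i++) {
            if (i > 0) sb.append(", ");
            sb.append(children.get(i));
        }
        return sb.append(")").toString();
    }
}

class AstDemo {
    // Builds the AST for 2 + 3 * 4, with multiplication binding tighter.
    static SimpleTree<String> example() {
        SimpleTree<String> product = new SimpleTree<>("*")
                .addChild(new SimpleTree<>("3"))
                .addChild(new SimpleTree<>("4"));
        return new SimpleTree<>("+")
                .addChild(new SimpleTree<>("2"))
                .addChild(product);
    }

    public static void main(String[] args) {
        System.out.println(example()); // prints +(2, *(3, 4))
    }
}
```

Note that the parentheses, whitespace, and any comments of the source text are gone; only the structure remains.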
The Stack

One easy way to do recursive descent parsing is to have
each parse method take the tokens it needs, build a parse
tree, and put the parse tree on a global stack

Write a parse method for each nonterminal in the grammar



Each parse method should get the tokens it needs, and only those
tokens
 Those tokens (usually) go on the stack
Each parse method may call other parse methods, and expect those
methods to leave their results on the stack
Each (successful) parse method should leave one result on the stack
From Recognizer to Parser

First, make a Recognizer
 The Recognizer code will form the “skeleton” of your Parser

Create a Stack<Tree<Token>> as a globally available instance
variable
 You will use this to hold the trees as you build them

The methods from the Recognizer all return a boolean; do not
change this
 Your new results will go onto the Stack

Each time you recognize something, also build a Tree to
represent it, and put this Tree onto the Stack
 Most of the time, you will assemble the new Tree from the two Trees you
 most recently put onto the Stack
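As a rough sketch of this conversion, the code below shows a recognizer method next to its parser counterpart. Everything here is an assumption made for illustration: NumberParser, isNumber, and number are hypothetical names, tokens are plain Strings, and a Stack<String> stands in for the Stack<Tree<Token>>.

```java
import java.util.List;
import java.util.Stack;

// Sketch only: tokens are Strings and a "tree" is just a String leaf,
// so the focus stays on the boolean-plus-stack pattern.
class NumberParser {
    private final List<String> tokens;
    private int pos = 0;
    final Stack<String> stack = new Stack<>(); // stands in for Stack<Tree<Token>>

    NumberParser(List<String> tokens) { this.tokens = tokens; }

    // Recognizer version: consumes a number token and reports success.
    boolean isNumber() {
        if (pos < tokens.size() && tokens.get(pos).matches("\\d+")) {
            pos++;
            return true;
        }
        return false;
    }

    // Parser version: same boolean result, but the recognized token
    // is also left on the stack as the method's single result.
    boolean number() {
        int start = pos;
        if (!isNumber()) return false;
        stack.push(tokens.get(start));
        return true;
    }
}
```

The return type stays boolean in both versions; the only new behavior is the push on success.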
Example: Unary minus


A simple example of an <expression> is -5
As the program steps through the code for <expression>,
it puts - (minus) on the stack, then it puts 5 on the stack



Each of these is in the form of a Tree consisting of a single
node with a Token as its value
Note that the most recently parsed item, 5, is on the top of the
stack
These can be combined into a new Tree, with the minus
as the root and the 5 as its child
makeTree

/**
 * Removes two Trees from the stack, makes a new Tree and
 * puts it on the stack. The element on the top of the stack
 * (the most recently seen element) is made the child of the
 * element beneath it (which was seen earlier).
 */
private void makeTree() {
    Tree<Token> child = stack.pop();
    Tree<Token> parent = stack.pop();
    parent.addChild(child);
    stack.push(parent);
}
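The -5 example can be traced through this two-element makeTree with a small runnable sketch. Node and UnaryMinusDemo are hypothetical stand-ins (a String-valued node instead of Tree<Token>), written only to show the stack before and after the call.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Stack;

// Hypothetical String-valued node standing in for Tree<Token>.
class Node {
    final String value;
    final List<Node> children = new ArrayList<>();

    Node(String value) { this.value = value; }

    void addChild(Node child) { children.add(child); }

    // Renders a leaf as "x" and an interior node as "op(child, ...)".
    @Override
    public String toString() {
        if (children.isEmpty()) return value;
        List<String> parts = new ArrayList<>();
        for (Node c : children) parts.add(c.toString());
        return value + "(" + String.join(", ", parts) + ")";
    }
}

class UnaryMinusDemo {
    final Stack<Node> stack = new Stack<>();

    // Same logic as the two-element makeTree: top becomes child of the one beneath.
    void makeTree() {
        Node child = stack.pop();
        Node parent = stack.pop();
        parent.addChild(child);
        stack.push(parent);
    }

    // Pushes "-" then "5", as the parse of -5 would, and combines them.
    static String run() {
        UnaryMinusDemo demo = new UnaryMinusDemo();
        demo.stack.push(new Node("-"));
        demo.stack.push(new Node("5"));
        demo.makeTree();
        return demo.stack.peek().toString(); // "-(5)"
    }
}
```

After the call the stack holds exactly one Tree, which is the single result the parse method is expected to leave behind.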
Example: while statement


<while statement> ::= “while” <condition> <block>
The parse method for a <while statement> does this:

Calls the Tokenizer, which returns a “while” token
Makes the “while” into a Tree, which it puts on the stack
Calls the parser for <condition>, which parses a condition and puts a Tree
representation of that condition on the stack
 Stack now contains: “while” <condition> (stack “top” is on the right),
 where <condition> stands for some created Tree
Calls the parser for <block>, which parses a block and puts a Tree
representation of that block on the stack
 Stack now contains: “while” <condition> <block>
Pops the top three things from the stack, assembles them into a Tree
representing a while statement, and pushes this Tree onto the stack
Sample Java code

public boolean whileCommand() {
    if (keyword("while")) {
        if (condition()) {
            if (block()) {
                makeTree(3, 2, 1);
                return true;
            }
        }
        error("Error in \"while\" statement");
    }
    return false;
}
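[Figure: the stack holding while, <condition>, <block> (top last) is combined by makeTree(3, 2, 1) into a <while stmt> tree with while as the root and condition and block as its children]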
makeTree

private void makeTree(int rootIndex, int... childIndices) {
    // Get root from stack
    Tree<Token> root = getStackItem(rootIndex);
    // Get other trees from stack and add them as children of root
    for (int i = 0; i < childIndices.length; i++) {
        root.addChild(getStackItem(childIndices[i]));
    }
    // Pop root and all children from stack
    for (int i = 0; i <= childIndices.length; i++) {
        stack.pop();
    }
    // Put the root back on the stack
    stack.push(root);
}

// Returns the n-th item from the top of the stack (1 = top)
private Tree<Token> getStackItem(int n) {
    return stack.get(stack.size() - n);
}
Fancier error messages

public boolean whileCommand() {
    if (keyword("while")) {
        if (condition()) {
            if (block()) {
                makeTree(3, 2, 1); // or some such
                return true;
            }
            error("Error in \"while\" block");
        }
        error("Error in \"while\" condition");
    }
    return false;
}
Alternative code

public boolean whileCommand() {
    if (keyword("while") && condition() && block()) {
        makeTree(3, 2, 1); // or some such
        return true;
    }
    return false;
}

No room for an error condition in this code
Alternative code with one message

public boolean whileCommand() {
    if (keyword("while")) {
        if (condition() && block()) {
            makeTree(3, 2, 1); // or some such
            return true;
        }
        error("Error in \"while\" statement");
    }
    return false;
}
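The slides do not show error() itself. One plausible reading, sketched below, is that it throws an unchecked exception. Under that assumption, "return false" means "this is not a while statement at all", while an exception means "this started as a while statement but is malformed". SyntaxException and WhileErrorDemo are hypothetical names, and the three sub-parsers are passed in as stubs so the control flow can be run on its own.

```java
import java.util.function.BooleanSupplier;

// Hypothetical exception type; the real error() may behave differently.
class SyntaxException extends RuntimeException {
    SyntaxException(String message) { super(message); }
}

class WhileErrorDemo {
    // Assumed behavior: error() never returns normally.
    static void error(String message) {
        throw new SyntaxException(message);
    }

    // Same shape as whileCommand() with separate messages; the stubs
    // stand in for keyword("while"), condition() and block().
    static boolean whileCommand(BooleanSupplier keyword,
                                BooleanSupplier condition,
                                BooleanSupplier block) {
        if (keyword.getAsBoolean()) {
            if (condition.getAsBoolean()) {
                if (block.getAsBoolean()) {
                    return true; // makeTree(3, 2, 1) would go here
                }
                error("Error in \"while\" block");
            }
            error("Error in \"while\" condition");
        }
        return false; // no "while" keyword: not an error, just not a while statement
    }
}
```

If error() merely logged and returned, a failed block would fall through and also report a condition error, so some form of non-local exit is the likelier design.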
Tricky problem: Defining <term>

<term> ::= <factor> “*” <term> | <factor>
 (For simplicity, I’m ignoring the “/” and “%” operators)
 This is logically correct, but it defines the wrong tree for expressions such
 as x * y * z: it treats it as x * (y * z)

<term> ::= <term> “*” <factor> | <factor>
 This is equally correct, and correctly defines x * y * z as meaning (x * y) * z
 However, the left recursion can’t be programmed directly: “To recognize a term,
 first recognize a term” never terminates

Solution: <term> ::= <factor> { “*” <factor> }
 This turns the recursion into an iteration
 The result is easy to program correctly
Code for term()


public boolean term() {
    if (!factor()) return false;
    while (multiplyOperator()) {
        if (!factor()) {
            error("No factor after '*' or '/'");
        }
        makeTree(2, 3, 1); // operator, left operand (tree so far), right factor
    }
    return true;
}
Here’s a snippet of code from my JUnit test methods:

    use("x * y * z");
    assertTrue(parser.term());
    assertTree("*(*(x, y), z)");

use(String) and assertTree(String) are helper methods that
I’ve written; you should be able to figure out what they do
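To see the iterative grammar produce a left-associative tree, here is a self-contained sketch of term() over whitespace-separated tokens. TNode and TermParser are hypothetical names, tokens are plain Strings, only "*" is handled, and a toString comparison takes the place of assertTree, whose real implementation is not shown in the slides.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Stack;

// Hypothetical String-valued node standing in for Tree<Token>.
class TNode {
    final String value;
    final List<TNode> children = new ArrayList<>();

    TNode(String value) { this.value = value; }

    void addChild(TNode child) { children.add(child); }

    @Override
    public String toString() {
        if (children.isEmpty()) return value;
        List<String> parts = new ArrayList<>();
        for (TNode c : children) parts.add(c.toString());
        return value + "(" + String.join(", ", parts) + ")";
    }
}

class TermParser {
    private final List<String> tokens;
    private int pos = 0;
    final Stack<TNode> stack = new Stack<>();

    // Assumes tokens are separated by whitespace, e.g. "x * y * z".
    TermParser(String input) {
        tokens = Arrays.asList(input.trim().split("\\s+"));
    }

    // Consumes the next token and pushes it as a leaf if it matches the regex.
    private boolean next(String regex) {
        if (pos < tokens.size() && tokens.get(pos).matches(regex)) {
            stack.push(new TNode(tokens.get(pos++)));
            return true;
        }
        return false;
    }

    boolean factor() { return next("\\w+"); }

    boolean multiplyOperator() { return next("\\*"); }

    // <term> ::= <factor> { "*" <factor> }
    boolean term() {
        if (!factor()) return false;
        while (multiplyOperator()) {
            if (!factor()) throw new IllegalStateException("No factor after '*'");
            makeTree(2, 3, 1); // operator, tree so far, new factor
        }
        return true;
    }

    private void makeTree(int rootIndex, int... childIndices) {
        TNode root = getStackItem(rootIndex);
        for (int i : childIndices) root.addChild(getStackItem(i));
        for (int i = 0; i <= childIndices.length; i++) stack.pop();
        stack.push(root);
    }

    // Returns the n-th item from the top of the stack (1 = top).
    private TNode getStackItem(int n) {
        return stack.get(stack.size() - n);
    }
}
```

Because makeTree runs inside the loop, each new "*" takes the whole tree built so far as its left child, which is what yields (x * y) * z rather than x * (y * z).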
A command() method

public boolean command() {
    if (action()) return true;
    if (thought()) return true;
    return false;
}
My helper methods


I wrote a number of helper methods for the Parser and for the
ParserTest classes

One very useful method is tree, in the ParserTest class
 tree just takes Objects and builds a tree from them
 This method lets me build parse trees for use in assertEquals tests

Another is assertStackTop, which is just

    private void assertStackTop(Tree bt) {
        assertEquals(bt, parser.stack.peek());
    }

Examples:

    Tree condition = tree("=", "2", "2");
    assertStackTop(tree("if", condition, "list"));
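The tree helper itself is not shown. The sketch below is one guess at how such a method could work, assuming the first argument is the root and the rest are children, and that any argument that is not already a tree gets wrapped in a leaf. HNode and TreeHelper are hypothetical names, and the String-valued node is a stand-in for the real Tree class.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical String-valued node standing in for the real Tree class.
class HNode {
    final String value;
    final List<HNode> children = new ArrayList<>();

    HNode(String value) { this.value = value; }

    void addChild(HNode child) { children.add(child); }

    @Override
    public String toString() {
        if (children.isEmpty()) return value;
        List<String> parts = new ArrayList<>();
        for (HNode c : children) parts.add(c.toString());
        return value + "(" + String.join(", ", parts) + ")";
    }
}

class TreeHelper {
    // Assumed contract: first argument is the root, the rest are its children.
    static HNode tree(Object root, Object... children) {
        HNode result = toNode(root);
        for (Object child : children) {
            result.addChild(toNode(child));
        }
        return result;
    }

    // Existing trees are used as-is; anything else becomes a leaf.
    private static HNode toNode(Object o) {
        return (o instanceof HNode) ? (HNode) o : new HNode(String.valueOf(o));
    }
}
```

With this version, tree("if", tree("=", "2", "2"), "list") builds a three-node "if" tree whose first child is the condition subtree. Comparing with assertEquals would additionally require the tree class to define equals.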
The End