
Top-Down Parsing


Relationship between parser types


Recursive descent

• Recursive descent parsers simply try to build a top-down parse tree.
• It would be better if we always knew the correct action to take.
• It would be better if we could avoid recursive procedure calls during parsing.


Predictive parsers

• A predictive parser always knows which production to use, which avoids backtracking.
• Example: for the productions
      stmt -> if ( expr ) stmt else stmt
            | while ( expr ) stmt
            | for ( stmt expr stmt ) stmt
  a recursive descent parser would always know which production to use, depending on the input token, because each alternative begins with a different token (if, while, for).
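The idea can be sketched as a small recursive descent parser that picks a production by looking only at the current token. This is a minimal illustration, not the slides' own code. It assumes that `expr` arrives as a single token, and it adds a hypothetical base-case statement token `other` (not in the grammar above) so that the recursion can terminate. The names `parse_stmt` and `ParseError` are made up for the example.

```python
class ParseError(Exception):
    pass


def parse_stmt(tokens, pos=0):
    """Parse one stmt starting at tokens[pos]; return (tree, next_position)."""

    def expect(p, tok):
        # Match a single terminal or fail; no backtracking is ever needed.
        if p >= len(tokens) or tokens[p] != tok:
            found = tokens[p] if p < len(tokens) else "end of input"
            raise ParseError(f"expected {tok!r} at position {p}, found {found!r}")
        return p + 1

    if pos >= len(tokens):
        raise ParseError("unexpected end of input")
    tok = tokens[pos]  # the single lookahead token selects the production

    if tok == "if":  # stmt -> if ( expr ) stmt else stmt
        p = expect(pos + 1, "(")
        p = expect(p, "expr")
        p = expect(p, ")")
        then_branch, p = parse_stmt(tokens, p)
        p = expect(p, "else")
        else_branch, p = parse_stmt(tokens, p)
        return ("if", then_branch, else_branch), p
    if tok == "while":  # stmt -> while ( expr ) stmt
        p = expect(pos + 1, "(")
        p = expect(p, "expr")
        p = expect(p, ")")
        body, p = parse_stmt(tokens, p)
        return ("while", body), p
    if tok == "for":  # stmt -> for ( stmt expr stmt ) stmt
        p = expect(pos + 1, "(")
        init, p = parse_stmt(tokens, p)
        p = expect(p, "expr")
        step, p = parse_stmt(tokens, p)
        p = expect(p, ")")
        body, p = parse_stmt(tokens, p)
        return ("for", init, step, body), p
    if tok == "other":  # hypothetical base case, assumed for this sketch only
        return ("other",), pos + 1
    raise ParseError(f"no production for stmt starts with {tok!r}")


tokens = "while ( expr ) if ( expr ) other else other".split()
tree, end = parse_stmt(tokens)
```

Each branch is entered only when its first token matches, so the parser never has to undo a choice.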


Transition diagrams

• Transition diagrams can describe recursive descent parsers, just as they can describe lexical analyzers (but the diagrams are slightly different).
• Construction:
  1. Eliminate left recursion from G.
  2. Left factor G.
  3. For each nonterminal A, do:
     1. Create an initial and a final (return) state.
     2. For each production A -> X1 X2 … Xn, create a path from the initial state to the final state with edges labeled X1, X2, …, Xn.
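Step 3 of the construction can be sketched as follows, assuming the grammar has already had left recursion removed and been left factored. The dict-of-lists grammar encoding, the state numbering (0 for the initial state, 1 for the final state), and the name `build_diagrams` are assumptions made for this illustration. An ε-production is drawn as a single edge labeled ε.

```python
def build_diagrams(grammar):
    """Build one transition diagram per nonterminal.

    grammar maps each nonterminal to a list of alternatives; each alternative
    is a list of symbols, and the empty list [] stands for ε.
    """
    diagrams = {}
    for head, alternatives in grammar.items():
        initial, final = 0, 1          # one initial and one final (return) state
        next_state = 2                 # fresh states for the inside of each path
        edges = []
        for alt in alternatives:
            labels = alt or ["ε"]
            src = initial
            for i, label in enumerate(labels):
                if i == len(labels) - 1:
                    dst = final        # the last edge of a path reaches the final state
                else:
                    dst = next_state
                    next_state += 1
                edges.append((src, label, dst))
                src = dst
        diagrams[head] = {"initial": initial, "final": final, "edges": edges}
    return diagrams


diagram = build_diagrams({"E'": [["+", "T", "E'"], []]})["E'"]
```

An edge labeled with a nonterminal corresponds to a call to that nonterminal's diagram; an edge labeled with a terminal consumes one input token.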


Example transition diagrams

• An expression grammar with left recursion:

      E -> E + T | T
      T -> T * F | F
      F -> ( E ) | id

• Eliminating the left recursion:

      E  -> T E'
      E' -> + T E' | ε
      T  -> F T'
      T' -> * F T' | ε
      F  -> ( E ) | id

• Corresponding transition diagrams: (figure not included in the transcript)
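The rewrite of the expression grammar uses the standard rule for immediate left recursion: A -> A α | β becomes A -> β A' and A' -> α A' | ε. A minimal sketch of that rule is below. It handles only immediate left recursion, and it assumes that the primed name A' is not already in use. The grammar encoding and the function name are illustrative.

```python
def eliminate_immediate_left_recursion(grammar):
    """Rewrite A -> A α | β as A -> β A', A' -> α A' | ε.

    grammar maps each nonterminal to a list of alternatives; each alternative
    is a list of symbols, and the empty list [] stands for ε.
    """
    result = {}
    for head, alternatives in grammar.items():
        # α parts: the tails of the alternatives that start with the head itself.
        recursive = [alt[1:] for alt in alternatives if alt and alt[0] == head]
        # β parts: all remaining alternatives.
        others = [alt for alt in alternatives if not alt or alt[0] != head]
        if not recursive:
            result[head] = [list(alt) for alt in alternatives]
            continue
        new_head = head + "'"  # assumed to be an unused name
        result[head] = [alt + [new_head] for alt in others]
        result[new_head] = [alt + [new_head] for alt in recursive] + [[]]
    return result


left_recursive = {
    "E": [["E", "+", "T"], ["T"]],
    "T": [["T", "*", "F"], ["F"]],
    "F": [["(", "E", ")"], ["id"]],
}
rewritten = eliminate_immediate_left_recursion(left_recursive)
```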

The parsing table and parsing program

• The table is a 2D array M[A,a], where A is a nonterminal symbol and a is a terminal or $.
• At each step, the parser considers the top-of-stack symbol X and the input symbol a:
  – If both are $, accept.
  – If they are the same terminal, pop X and advance the input.
  – If X is a nonterminal, consult M[X,a]. If M[X,a] is “ERROR”, call an error recovery routine. Otherwise, if M[X,a] is a production of the grammar X -> UVW, replace X on the stack with WVU (U on top).

Predictive parsing without recursion

• To get rid of the recursive procedure calls, we maintain our own stack.


Example

• Use the table-driven predictive parser to parse id + id * id
• Assuming the parsing table for the grammar
      E  -> T E'
      E' -> + T E' | ε
      T  -> F T'
      T' -> * F T' | ε
      F  -> ( E ) | id
• Initial stack is $E
• Initial input is id + id * id $

Building a predictive parse table

• The construction requires two functions:
  1. FIRST
  2. FOLLOW

For First

• For a string of grammar symbols α, FIRST(α) is the set of terminals that begin the strings derived from α. If α ⇒* ε, then ε is also in FIRST(α).
• Example grammar:
      E  -> T E'
      E' -> + T E' | ε
      T  -> F T'
      T' -> * F T' | ε
      F  -> ( E ) | id
• FIRST(E) = FIRST(T) = FIRST(F) = { (, id }
• FIRST(E') = { +, ε }
• FIRST(T') = { *, ε }

For Follow

• FOLLOW(A), for a nonterminal A, is the set of terminals that can appear immediately to the right of A in some sentential form. If A can be the last symbol in a sentential form, then $ is also in FOLLOW(A).
• Example grammar:
      E  -> T E'
      E' -> + T E' | ε
      T  -> F T'
      T' -> * F T' | ε
      F  -> ( E ) | id
• FOLLOW(E) = { ), $ }
• FOLLOW(E') = FOLLOW(E) = { ), $ }
• FOLLOW(T) = { + } ∪ FOLLOW(E) = { +, ), $ }
• FOLLOW(T') = { +, ), $ }
• FOLLOW(F) = { *, +, ), $ }

How to compute FIRST(α)

1. If X is a terminal, FIRST(X) = {X}.

2. Otherwise (X is a nonterminal):
   1. If X -> ε is a production, add ε to FIRST(X).
   2. If X -> Y1 … Yk is a production, then place a in FIRST(X) if, for some i, a is in FIRST(Yi) and Y1 … Yi-1 ⇒* ε. If every Yj can derive ε, add ε to FIRST(X).

• Given FIRST(X) for all single symbols X, compute FIRST(X1 … Xn) as follows: start with the non-ε symbols of FIRST(X1).
• If ε ∈ FIRST(X1), then add the non-ε symbols of FIRST(X2), and so on …
• If every Xi can derive ε, add ε as well.
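These rules can be applied repeatedly until no FIRST set changes. The sketch below does that for the expression grammar used in these slides. The grammar encoding (empty list for ε) and the names `compute_first` and `first_of_string` are assumptions made for the example.

```python
EPS = "ε"

GRAMMAR = {
    "E": [["T", "E'"]],
    "E'": [["+", "T", "E'"], []],
    "T": [["F", "T'"]],
    "T'": [["*", "F", "T'"], []],
    "F": [["(", "E", ")"], ["id"]],
}


def first_of_string(symbols, first):
    """FIRST(X1 ... Xn), given the FIRST sets of the nonterminals."""
    out = set()
    for sym in symbols:
        # A symbol without an entry in `first` is a terminal: FIRST(a) = {a}.
        sym_first = first[sym] if sym in first else {sym}
        out |= sym_first - {EPS}
        if EPS not in sym_first:
            return out          # this symbol cannot vanish, so stop here
    out.add(EPS)                # every symbol (or none at all) derives ε
    return out


def compute_first(grammar):
    """Iterate the FIRST rules until a fixed point is reached."""
    first = {nonterminal: set() for nonterminal in grammar}
    changed = True
    while changed:
        changed = False
        for head, alternatives in grammar.items():
            for alt in alternatives:
                new = first_of_string(alt, first)
                if not new <= first[head]:
                    first[head] |= new
                    changed = True
    return first


FIRST = compute_first(GRAMMAR)
```

The sets only ever grow, so the loop terminates; the result agrees with the FIRST sets listed for this grammar.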

How to compute FOLLOW(A)

• Place $ in FOLLOW(S) (for S the start symbol).
• If A -> α B β, then FIRST(β) − {ε} is placed in FOLLOW(B).
• If there is a production A -> α B, or a production A -> α B β where β ⇒* ε, then everything in FOLLOW(A) is in FOLLOW(B).

• Repeatedly apply these rules until no FOLLOW set changes.
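The FOLLOW rules can be sketched as a fixed-point loop. To keep the example self-contained, the FIRST sets of the expression grammar are written out by hand rather than computed. The encoding (empty list for ε) and the function names are assumptions made for the illustration.

```python
EPS, END = "ε", "$"

GRAMMAR = {
    "E": [["T", "E'"]],
    "E'": [["+", "T", "E'"], []],
    "T": [["F", "T'"]],
    "T'": [["*", "F", "T'"], []],
    "F": [["(", "E", ")"], ["id"]],
}

# FIRST sets of the nonterminals, entered by hand for this sketch.
FIRST = {
    "E": {"(", "id"}, "E'": {"+", EPS},
    "T": {"(", "id"}, "T'": {"*", EPS},
    "F": {"(", "id"},
}


def first_of_string(symbols):
    """FIRST of a symbol string; terminals have FIRST(a) = {a}."""
    out = set()
    for sym in symbols:
        sym_first = FIRST.get(sym, {sym})
        out |= sym_first - {EPS}
        if EPS not in sym_first:
            return out
    out.add(EPS)
    return out


def compute_follow(grammar, start):
    follow = {nonterminal: set() for nonterminal in grammar}
    follow[start].add(END)                      # rule 1: $ in FOLLOW(S)
    changed = True
    while changed:                              # repeat until nothing changes
        changed = False
        for head, alternatives in grammar.items():
            for alt in alternatives:
                for i, sym in enumerate(alt):
                    if sym not in grammar:      # only nonterminals have FOLLOW
                        continue
                    beta_first = first_of_string(alt[i + 1:])
                    new = beta_first - {EPS}    # rule 2: FIRST(β) − {ε}
                    if EPS in beta_first:       # rule 3: β is empty or β ⇒* ε
                        new |= follow[head]
                    if not new <= follow[sym]:
                        follow[sym] |= new
                        changed = True
    return follow


FOLLOW = compute_follow(GRAMMAR, "E")
```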


Example FIRST and FOLLOW

• For our favorite grammar:
      E  -> T E'
      E' -> + T E' | ε
      T  -> F T'
      T' -> * F T' | ε
      F  -> ( E ) | id
• What are FIRST() and FOLLOW() for all nonterminals?


Parse table construction with FIRST/FOLLOW

• Basic idea: if A -> α and a is in FIRST(α), then we expand A to α any time the current input is a and the top of stack is A.
• Algorithm:
  For each production A -> α in G, do:
    For each terminal a in FIRST(α), add A -> α to M[A,a].
    If ε ∈ FIRST(α), for each terminal b in FOLLOW(A), add A -> α to M[A,b].
    If ε ∈ FIRST(α) and $ is in FOLLOW(A), add A -> α to M[A,$].
  Make each undefined entry in M[ ] an ERROR.
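A sketch of this table-filling algorithm for the expression grammar follows. FIRST and FOLLOW are written out by hand so the example stands alone. Two representation choices are assumptions of the sketch: M is a dict keyed by (nonterminal, terminal), where a missing key plays the role of ERROR, and each entry is a list of productions so that a multiply defined entry would show up as a list with more than one element.

```python
EPS = "ε"

GRAMMAR = {
    "E": [["T", "E'"]],
    "E'": [["+", "T", "E'"], []],
    "T": [["F", "T'"]],
    "T'": [["*", "F", "T'"], []],
    "F": [["(", "E", ")"], ["id"]],
}
FIRST = {
    "E": {"(", "id"}, "E'": {"+", EPS},
    "T": {"(", "id"}, "T'": {"*", EPS},
    "F": {"(", "id"},
}
FOLLOW = {
    "E": {")", "$"}, "E'": {")", "$"},
    "T": {"+", ")", "$"}, "T'": {"+", ")", "$"},
    "F": {"*", "+", ")", "$"},
}


def first_of_string(symbols, first):
    """FIRST of a symbol string; terminals have FIRST(a) = {a}."""
    out = set()
    for sym in symbols:
        sym_first = first.get(sym, {sym})
        out |= sym_first - {EPS}
        if EPS not in sym_first:
            return out
    out.add(EPS)
    return out


def build_table(grammar, first, follow):
    table = {}
    for head, alternatives in grammar.items():
        for alt in alternatives:
            alt_first = first_of_string(alt, first)
            targets = alt_first - {EPS}          # every terminal a in FIRST(α)
            if EPS in alt_first:
                targets |= follow[head]          # every b in FOLLOW(A), including $
            for terminal in targets:
                table.setdefault((head, terminal), []).append(alt)
    return table


M = build_table(GRAMMAR, FIRST, FOLLOW)
conflicts = {key: prods for key, prods in M.items() if len(prods) > 1}
```

Here $ is stored in FOLLOW like any other terminal, so the separate $ step of the algorithm is covered by the same loop.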

Example predictive parse table construction

• For our favorite grammar:
      E  -> T E'
      E' -> + T E' | ε
      T  -> F T'
      T' -> * F T' | ε
      F  -> ( E ) | id
• What is the predictive parsing table?


LL(1) grammars

• The predictive parsing table construction can be applied to ANY grammar.
• But sometimes, M[ ] might have multiply defined entries.
• Example: for if-else statements and left factoring:
      stmt    -> if ( expr ) stmt optelse
      optelse -> else stmt | ε
• When we have “optelse” on the stack and “else” in the input, we have a choice of how to expand optelse (“else” is in FOLLOW(optelse), so either rule is possible).

LL(1) grammars

• If the predictive parsing construction for G leads to a parse table M[ ] WITHOUT multiply defined entries, we say “G is LL(1)”:
  – first L: left-to-right scan of the input
  – second L: leftmost derivation
  – 1: one symbol of lookahead

LL(1) grammars

• Necessary and sufficient conditions for G to be LL(1): if A -> α | β, then
  1. There does not exist a terminal a such that a ∈ FIRST(α) and a ∈ FIRST(β).
  2. At most one of α and β derives ε.
  3. If β ⇒* ε, then FIRST(α) does not intersect FOLLOW(A) (and likewise with α and β swapped).

This is the same as saying the predictive parser always knows what to do!
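The three conditions can be checked directly on the sets involved. The sketch below takes FIRST(α), FIRST(β) and FOLLOW(A) for one pair of alternatives and returns the numbers of the violated conditions. The function name is made up, and the sets in the usage lines are entered by hand; the value used for FOLLOW(optelse) assumes stmt is the start symbol of the if-else grammar.

```python
EPS = "ε"


def violated_ll1_conditions(first_alpha, first_beta, follow_a):
    """Return the set of LL(1) condition numbers violated by A -> α | β."""
    violated = set()
    # Condition 1: no terminal begins strings derived from both α and β.
    if (first_alpha - {EPS}) & (first_beta - {EPS}):
        violated.add(1)
    # Condition 2: at most one of α and β derives ε.
    if EPS in first_alpha and EPS in first_beta:
        violated.add(2)
    # Condition 3: if one alternative derives ε, the other must not begin
    # with a terminal in FOLLOW(A). Checked in both directions.
    if EPS in first_beta and (first_alpha - {EPS}) & follow_a:
        violated.add(3)
    if EPS in first_alpha and (first_beta - {EPS}) & follow_a:
        violated.add(3)
    return violated


# E' -> + T E' | ε with FOLLOW(E') = { ), $ }: no violation.
expression_check = violated_ll1_conditions({"+"}, {EPS}, {")", "$"})

# optelse -> else stmt | ε with FOLLOW(optelse) = { else, $ } (assumed start
# symbol stmt): "else" is in both FIRST(else stmt) and FOLLOW(optelse).
optelse_check = violated_ll1_conditions({"else"}, {EPS}, {"else", "$"})
```

The second check fails on condition 3, which is the multiply defined entry M[optelse, else] from the if-else example.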


Figure: Model of a nonrecursive predictive parser. The input buffer holds a + b $, the stack holds X Y Z $ (with $ at the bottom), and the predictive parsing program (driver) consults the parsing table M.


STACK          INPUT             OUTPUT
$ E            id + id * id $
$ E' T         id + id * id $    E -> T E'
$ E' T' F      id + id * id $    T -> F T'
$ E' T' id     id + id * id $    F -> id
$ E' T'           + id * id $
$ E'              + id * id $    T' -> ε
$ E' T +          + id * id $    E' -> + T E'
$ E' T              id * id $
$ E' T' F           id * id $    T -> F T'
$ E' T' id          id * id $    F -> id
$ E' T'                * id $
$ E' T' F *            * id $    T' -> * F T'
$ E' T' F                id $
$ E' T' id               id $    F -> id
$ E' T'                     $
$ E'                        $    T' -> ε
$                           $    E' -> ε

Moves made by the predictive parser on input id + id * id

Nonrecursive Predictive Parsing

• With X the symbol on top of the stack and a the current input symbol, there are three possibilities:

1. If X = a = $, the parser halts and announces successful completion of parsing.

2. If X = a ≠ $, the parser pops X off the stack and advances the input pointer to the next input symbol.

3. If X is a nonterminal, the program consults entry M[X, a] of the parsing table M. This entry will be either an X-production of the grammar or an error entry. If, for example, M[X, a] = {X -> UVW}, the parser replaces X on top of the stack by WVU (with U on top). As output, we shall assume that the parser just prints the production used; any other code could be executed here. If M[X, a] = error, the parser calls an error recovery routine.
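The three cases can be sketched as a short driver loop. The table below is the expression grammar's parsing table entered by hand, with a missing key standing for an error entry. Raising `SyntaxError` in place of a real error recovery routine, and the name `parse`, are assumptions of this sketch.

```python
# M[X, a] -> right-hand side of the production to apply ([] stands for ε).
TABLE = {
    ("E", "id"): ["T", "E'"], ("E", "("): ["T", "E'"],
    ("E'", "+"): ["+", "T", "E'"], ("E'", ")"): [], ("E'", "$"): [],
    ("T", "id"): ["F", "T'"], ("T", "("): ["F", "T'"],
    ("T'", "+"): [], ("T'", "*"): ["*", "F", "T'"],
    ("T'", ")"): [], ("T'", "$"): [],
    ("F", "id"): ["id"], ("F", "("): ["(", "E", ")"],
}
NONTERMINALS = {"E", "E'", "T", "T'", "F"}


def parse(tokens, start="E"):
    """Return the list of productions used, in order of application."""
    tokens = list(tokens) + ["$"]
    stack = ["$", start]                 # the top of the stack is the list's end
    pos = 0
    output = []
    while True:
        X, a = stack[-1], tokens[pos]
        if X == "$" and a == "$":        # case 1: accept
            return output
        if X not in NONTERMINALS:        # X is a terminal (or $)
            if X != a:
                raise SyntaxError(f"expected {X!r}, found {a!r}")
            stack.pop()                  # case 2: match and advance the input
            pos += 1
        else:                            # case 3: consult M[X, a]
            body = TABLE.get((X, a))
            if body is None:
                raise SyntaxError(f"no entry M[{X}, {a}]")
            stack.pop()
            stack.extend(reversed(body)) # push the body reversed, first symbol on top
            output.append(f"{X} -> {' '.join(body) or 'ε'}")


productions = parse("id + id * id".split())
```

On id + id * id the driver applies the productions E -> T E', T -> F T', F -> id, T' -> ε, E' -> + T E', T -> F T', F -> id, T' -> * F T', F -> id, T' -> ε, E' -> ε, in that order.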

              INPUT SYMBOL
NONTERMINAL   id          +              *              (            )          $
E             E -> T E'                                 E -> T E'
E'                        E' -> + T E'                               E' -> ε    E' -> ε
T             T -> F T'                                 T -> F T'
T'                        T' -> ε        T' -> * F T'                T' -> ε    T' -> ε
F             F -> id                                   F -> ( E )

Parsing table M for the grammar (blank entries are errors)

Top-down parsing recap

• RECURSIVE DESCENT parsers are easy to build, but inefficient, and might require backtracking.
• TRANSITION DIAGRAMS help us build recursive descent parsers.
• For LL(1) grammars, it is possible to build PREDICTIVE PARSERS with no recursion automatically:
  – Compute FIRST() and FOLLOW() for all nonterminals.
  – Fill in the predictive parsing table.
  – Use the table-driven predictive parsing algorithm.