LL(k) and LR(k) Parsers




CS 6800 12/11/12
Matthew Rodgers




What are LL and LR parsers?
What grammars do they parse?
What is the difference between LL and LR?
Why do we care?



Top-Down
LL(k) parsers are top-down parsers
LL(1) parsing is deterministic
This is most likely the way of parsing grammars that you are already familiar with



Bottom-Up
LR(k) parsers are bottom-up parsers
The languages generated by LR(k) grammars are exactly the deterministic context-free languages
Any language with an LR(k) grammar, for some k, also has an LR(1) grammar
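Consider the following grammar:
o (1) S → F
o (2) S → (S+F)
o (3) F → a
Input: (a+a)
The parsing table for this grammar is shown below. Rows are nonterminals, columns are input symbols, and each entry is the number of the rule to apply:
     (    )    a    +    $
S    2         1
F              3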
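The stack is initialized with the end symbol $ and, on top of it, the start symbol S; the top of the stack is compared to the first symbol in the input
Since the top of the stack is the nonterminal S rather than (, the parser looks at the table (row S, column "(") to see which rule to apply: rule 2, S → (S+F)
After applying the rule, which replaces S on the stack with (S+F), it compares again
It now finds ( both at the front of the input string and on top of the stack, so it removes both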
The parser continues in this way until it reaches the end symbol, $, and accepts the string, or until it rejects the string
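The steps above can be sketched as a small table-driven LL(1) parser. This is a minimal illustration, not a general tool: the names `RULES`, `TABLE` and `ll1_parse` are made up for this example, every input token is assumed to be a single character, and the rules are numbered 1 to 3 in the order the grammar lists them.

```python
# Grammar rules, numbered in the order they are listed (an assumption of this sketch).
RULES = {
    1: ("S", ["F"]),
    2: ("S", ["(", "S", "+", "F", ")"]),
    3: ("F", ["a"]),
}

# Parsing table: (nonterminal on the stack, next input symbol) -> rule number.
TABLE = {
    ("S", "("): 2,
    ("S", "a"): 1,
    ("F", "a"): 3,
}

NONTERMINALS = {"S", "F"}


def ll1_parse(text):
    """Return the list of applied rule numbers, or None if the input is rejected."""
    tokens = list(text) + ["$"]   # one character per token; $ marks the end
    stack = ["$", "S"]            # the top of the stack is the end of the list
    applied = []
    pos = 0
    while stack:
        top = stack.pop()
        lookahead = tokens[pos]
        if top in NONTERMINALS:
            rule = TABLE.get((top, lookahead))
            if rule is None:
                return None       # empty table cell: reject
            applied.append(rule)
            # Push the right-hand side reversed so its first symbol ends up on top.
            stack.extend(reversed(RULES[rule][1]))
        elif top == lookahead:
            pos += 1              # terminal (or $) matches the input: remove both
        else:
            return None           # terminal mismatch: reject
    return applied if pos == len(tokens) else None


print(ll1_parse("(a+a)"))  # [2, 1, 3, 3]
```

For the input (a+a) the parser applies rule 2, then rule 1, then rule 3 twice, which is a leftmost derivation of the string.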



Given an LR(1) grammar, we can produce a shift-reduce parser table
Shift: “shifts” an input symbol onto the parser’s stack and builds a node in the parse tree labeled by that symbol
Reduce: “reduces” a string of symbols on top of the stack to a non-terminal symbol using a grammar rule
o When it does this, it builds the corresponding piece of the parse tree


However, many LR(1) grammars produce parse tables that are too large to be practical
Instead we use LALR parsing
S → S + E | E
E → num | (S)

Derivation    Parse Stack   Unparsed Input   Action
(1+(2+3))     ε             (1+(2+3))        Shift
(1+(2+3))     (             1+(2+3))         Shift
(1+(2+3))     (1            +(2+3))          Reduce E → num
(E+(2+3))     (E            +(2+3))          Reduce S → E
(S+(2+3))     (S            +(2+3))          Shift
(S+(2+3))     (S+           (2+3))           Shift
(S+(2+3))     (S+(          2+3))            Shift
(S+(2+3))     (S+(2         +3))             Reduce E → num
(S+(E+3))     (S+(E         +3))             Reduce S → E
(S+(S+3))     (S+(S         +3))             Shift
(S+(S+3))     (S+(S+        3))              Shift
Etc.          Etc.          Etc.             Etc.
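The trace above can be reproduced with a short sketch. Note the assumptions: this is not a table-driven LR parser. The function `shift_reduce` is a hypothetical helper that hard-codes the four rules of this one grammar and always tries a reduction before shifting, a policy that happens to work here but not for grammars in general. Tokens are assumed to be already split, with num tokens being digit strings.

```python
def shift_reduce(tokens):
    """Trace a shift-reduce parse for S -> S+E | E and E -> num | (S).

    Returns (accepted, actions), where actions lists each step taken.
    """
    stack, actions = [], []
    pos = 0
    while True:
        # Try each reduction on the top of the stack before shifting.
        if stack and stack[-1].isdigit():
            stack[-1] = "E"
            actions.append("Reduce E -> num")
        elif stack[-3:] == ["(", "S", ")"]:
            stack[-3:] = ["E"]
            actions.append("Reduce E -> (S)")
        elif stack[-3:] == ["S", "+", "E"]:
            stack[-3:] = ["S"]
            actions.append("Reduce S -> S+E")
        elif stack and stack[-1] == "E":
            stack[-1] = "S"
            actions.append("Reduce S -> E")
        elif pos < len(tokens):
            # Nothing to reduce: shift the next input symbol onto the stack.
            stack.append(tokens[pos])
            pos += 1
            actions.append("Shift")
        else:
            break
    # The input is accepted when everything has been reduced to the start symbol.
    return stack == ["S"], actions


accepted, actions = shift_reduce(list("(1+(2+3))"))
for step in actions:
    print(step)
```

The first eleven printed actions match the Action column of the table, and the remaining steps finish the reductions that the table abbreviates as "Etc.".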



First we shall define a simple grammar, numbering its rules
o (1) E → E * B
o (2) E → E + B
o (3) E → B
o (4) B → 0
o (5) B → 1

We also add a new rule, (0) S → E, which the parser uses as the final accepting rule


To create a parsing table for this grammar we must introduce a special symbol, •, which marks the parser's current position within a rule: the symbols to its left have already been read from the input, and the symbols to its right are what to expect next
E.g. E → E • + B
o This shows that the E has already been processed and the parser is looking for a + symbol next


A rule with a dot placed in its right-hand side, like the one above, is called an item
There is an item for each position the dot can take along the right-hand side of the rule
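As a small illustration of "one item per dot position", the sketch below enumerates the items of a single rule. The tuple representation (left-hand side, right-hand side, dot index) and the helper names `items_of` and `show` are choices made for this example only.

```python
def items_of(lhs, rhs):
    """Return every item of the rule lhs -> rhs, one per dot position."""
    return [(lhs, tuple(rhs), dot) for dot in range(len(rhs) + 1)]


def show(item):
    """Format an item with the dot at its position in the right-hand side."""
    lhs, rhs, dot = item
    symbols = list(rhs[:dot]) + ["•"] + list(rhs[dot:])
    return f"{lhs} → {' '.join(symbols)}"


for item in items_of("E", ["E", "+", "B"]):
    print(show(item))
```

A rule with three symbols on its right-hand side has four items, from the dot before the first symbol to the dot after the last one.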


Since a parser may not know which grammar rule to use in
advance, when creating our table we must use sets of items to
consider all the possibilities
E.g.
o S→•E
o E→•E*B
o E→•E+B
o E→•B
o B→•0
o B→•1

The first line is the initial item of the item set. Since we need to consider all possibilities whenever the dot comes before a non-terminal, we must take the closure of the set: for the non-terminal E, in this case, we add every rule for E with the dot at the start. (By extension, we must do the same for B, as shown by the 5th and 6th items.)
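The closure step can be sketched as follows. Some assumptions: an item is represented as a (rule index, dot position) pair, the rules are indexed in the order they are listed with the added rule S → E as index 0, and `closure` is a name chosen for this example.

```python
GRAMMAR = [
    ("S", ("E",)),           # rule 0, the added accepting rule
    ("E", ("E", "*", "B")),  # rule 1
    ("E", ("E", "+", "B")),  # rule 2
    ("E", ("B",)),           # rule 3
    ("B", ("0",)),           # rule 4
    ("B", ("1",)),           # rule 5
]
NONTERMINALS = {lhs for lhs, _ in GRAMMAR}


def closure(items):
    """Add, for every non-terminal right after a dot, all of its rules with the dot at the start."""
    result = set(items)
    work = list(items)
    while work:
        rule, dot = work.pop()
        rhs = GRAMMAR[rule][1]
        if dot < len(rhs) and rhs[dot] in NONTERMINALS:
            for index, (lhs, _) in enumerate(GRAMMAR):
                if lhs == rhs[dot] and (index, 0) not in result:
                    result.add((index, 0))
                    work.append((index, 0))
    return result


print(sorted(closure({(0, 0)})))
# [(0, 0), (1, 0), (2, 0), (3, 0), (4, 0), (5, 0)]
```

Starting from the single item S → •E, the closure contains all six items listed in the example: the three rules for E are added because E follows the dot, and the two rules for B are added because E → •B puts B after a dot.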
Set 0
o S→•E
o E→•E*B
o E→•E+B
o E→•B
o B→•0
o B→•1
Set 1
o B→0•
Set 2
o B→1•
Set 3
o S→E•
o E→E•*B
o E→E•+B
Set 4
o E→B•
Set 5
o E→E*•B
o B→•0
o B→•1
Set 6
o E→E+•B
o B→•0
o B→•1
Set 7
o E→E*B•
Set 8
o E→E+B•
Item Set   *   +   0   1   E   B
0                  1   2   3   4
1
2
3          5   6
4
5                  1   2       7
6                  1   2       8
7
8

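Each transition is found by taking the items of a set that have a given symbol right after the dot, moving the dot over that symbol, and taking the closure; the result is the item set that the transition leads to
o E.g. item set 7 is created from item set 5 by moving the dot over B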
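The way item sets spawn one another can be sketched in code. The block below is illustrative only: items are (rule index, dot position) pairs, rules are indexed in listing order with S → E as rule 0, and `closure`, `goto` and `item_sets` are names picked for this example. Symbols are tried in the column order of the transition table so that the sets come out with the same numbers.

```python
GRAMMAR = [
    ("S", ("E",)), ("E", ("E", "*", "B")), ("E", ("E", "+", "B")),
    ("E", ("B",)), ("B", ("0",)), ("B", ("1",)),
]
NONTERMINALS = {lhs for lhs, _ in GRAMMAR}
SYMBOLS = ["*", "+", "0", "1", "E", "B"]


def closure(items):
    """Add the rules of every non-terminal that appears right after a dot."""
    result, work = set(items), list(items)
    while work:
        rule, dot = work.pop()
        rhs = GRAMMAR[rule][1]
        if dot < len(rhs) and rhs[dot] in NONTERMINALS:
            for index, (lhs, _) in enumerate(GRAMMAR):
                if lhs == rhs[dot] and (index, 0) not in result:
                    result.add((index, 0))
                    work.append((index, 0))
    return result


def goto(item_set, symbol):
    """Move the dot over `symbol` in every item that allows it, then take the closure."""
    moved = {(rule, dot + 1) for rule, dot in item_set
             if dot < len(GRAMMAR[rule][1]) and GRAMMAR[rule][1][dot] == symbol}
    return frozenset(closure(moved))


def item_sets():
    """Build every item set reachable from S -> •E, together with the transitions."""
    sets = [frozenset(closure({(0, 0)}))]
    transitions = {}
    state = 0
    while state < len(sets):          # `sets` grows as new item sets are discovered
        for symbol in SYMBOLS:
            target = goto(sets[state], symbol)
            if target:
                if target not in sets:
                    sets.append(target)
                transitions[(state, symbol)] = sets.index(target)
        state += 1
    return sets, transitions


sets, transitions = item_sets()
print(len(sets))               # 9
print(transitions[(5, "B")])   # 7
```

The run produces nine item sets, and the transition from set 5 on B leads to set 7, whose only item is E → E*B•.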
After creating the item sets and the transitions, follow the steps below to finish the table
1) The columns for nonterminals are copied to the goto table.
2) The columns for the terminals are copied to the action table as shift actions.
3) An extra column for '$' (end of input) is added to the action table; it contains acc for every item set that contains S → E •.
4) If an item set i contains an item of the form A → w • and A → w is rule m with m > 0, then the row for state i in the action table is completely filled with the reduce action rm.
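The four steps can be sketched directly. In this illustration the transitions are typed in by hand from the transition table, and `COMPLETED` (a name invented here) records which item sets contain a finished item A → w • and the number of that rule, assuming rules are numbered in listing order with S → E as rule 0. The sketch does not detect shift/reduce or reduce/reduce conflicts, which this grammar does not have.

```python
TERMINALS = ["*", "+", "0", "1"]
NONTERMINALS = ["E", "B"]

# Transitions between item sets, copied by hand from the transition table.
TRANSITIONS = {
    (0, "0"): 1, (0, "1"): 2, (0, "E"): 3, (0, "B"): 4,
    (3, "*"): 5, (3, "+"): 6,
    (5, "0"): 1, (5, "1"): 2, (5, "B"): 7,
    (6, "0"): 1, (6, "1"): 2, (6, "B"): 8,
}

# Item set -> number of the rule whose finished item (dot at the end) it contains.
COMPLETED = {1: 4, 2: 5, 3: 0, 4: 3, 7: 1, 8: 2}


def build_tables():
    """Derive the action and goto tables from the transitions and finished items."""
    action, goto = {}, {}
    for (state, symbol), target in TRANSITIONS.items():
        if symbol in NONTERMINALS:
            goto[(state, symbol)] = target              # step 1
        else:
            action[(state, symbol)] = f"s{target}"      # step 2
    for state, rule in COMPLETED.items():
        if rule == 0:
            action[(state, "$")] = "acc"                # step 3
        else:
            for symbol in TERMINALS + ["$"]:            # step 4
                action[(state, symbol)] = f"r{rule}"
    return action, goto


action, goto = build_tables()
print(action[(0, "0")], action[(3, "$")], action[(4, "+")], goto[(5, "B")])
# s1 acc r3 7
```

Cells that are absent from the two dictionaries correspond to empty table cells, which the parser treats as errors.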

         Action                        Goto
State    *     +     0     1     $     E     B
0                    s1    s2          g3    g4
1        r4    r4    r4    r4    r4
2        r5    r5    r5    r5    r5
3        s5    s6                acc
4        r3    r3    r3    r3    r3
5                    s1    s2                g7
6                    s1    s2                g8
7        r1    r1    r1    r1    r1
8        r2    r2    r2    r2    r2
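To show how the finished table is used, here is a minimal sketch of the driver loop of a shift-reduce parser. Assumptions: the table is typed in by hand with every reduce row filled for all five action columns (including $), rule numbers r1 to r5 refer to the grammar rules in the order they are listed, each token is one character, and `lr_parse` is a name chosen for this example.

```python
# Rule number -> (left-hand side, length of the right-hand side).
RULES = {1: ("E", 3), 2: ("E", 3), 3: ("E", 1), 4: ("B", 1), 5: ("B", 1)}

ACTION = {
    0: {"0": "s1", "1": "s2"},
    1: dict.fromkeys("*+01$", "r4"),
    2: dict.fromkeys("*+01$", "r5"),
    3: {"*": "s5", "+": "s6", "$": "acc"},
    4: dict.fromkeys("*+01$", "r3"),
    5: {"0": "s1", "1": "s2"},
    6: {"0": "s1", "1": "s2"},
    7: dict.fromkeys("*+01$", "r1"),
    8: dict.fromkeys("*+01$", "r2"),
}
GOTO = {(0, "E"): 3, (0, "B"): 4, (5, "B"): 7, (6, "B"): 8}


def lr_parse(text):
    """Return the rule numbers used in reductions, or None if the input is rejected."""
    tokens = list(text) + ["$"]
    states = [0]                  # stack of states; the current state is on top
    reductions = []
    pos = 0
    while True:
        act = ACTION[states[-1]].get(tokens[pos])
        if act is None:
            return None           # empty cell: syntax error
        if act == "acc":
            return reductions
        number = int(act[1:])
        if act[0] == "s":
            states.append(number)  # shift: push the new state, consume the token
            pos += 1
        else:
            lhs, length = RULES[number]
            del states[-length:]   # reduce: pop one state per right-hand-side symbol
            states.append(GOTO[(states[-1], lhs)])
            reductions.append(number)


print(lr_parse("1+1"))  # [5, 3, 5, 2]
```

For the input 1+1 the parser reduces by rules 5, 3, 5 and 2 before accepting, which is a rightmost derivation of the string in reverse.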




LALR (“Lookahead” LR) parsing: a deterministic, shift-reduce parser
Most practical (non-natural) languages can be described by an LALR grammar
LALR parser tables are fairly small
Yacc is a parser-generator tool that creates LALR parsers