SLR Parsing
CSE244
Aggelos Kiayias
Computer Science & Engineering Department
The University of Connecticut
371 Fairfield Road, Box U-155
Storrs, CT 06269-1155
[email protected]
http://www.cse.uconn.edu/~akiayias
CH4.1
Items


SLR (Simple LR parsing)
DEF An LR(0) item is a production with a “marker” (written as a dot) somewhere in its right-hand side.
E.g. S → aA.Be
Intuition: the marker indicates how much of the production
we have seen already (everything up to the marker)
CENTRAL IDEA OF SLR PARSING: construct a DFA
that recognizes viable prefixes of the grammar.
Intuition: Shift/Reduce actions can be decided based on
this DFA (what we have seen so far & what are our next
options).
Use “LR(0) Items” for the creation of this DFA.
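A minimal sketch of how LR(0) items could be represented in code. The `Item` class and the `items_of` helper are illustrative names, not part of the slides; the production S → aABe is borrowed from the example grammar used later in the deck.

```python
from typing import NamedTuple, Tuple


class Item(NamedTuple):
    """An LR(0) item: a production plus a marker position."""
    head: str
    body: Tuple[str, ...]
    dot: int  # number of body symbols to the left of the marker

    def __str__(self):
        symbols = list(self.body)
        symbols.insert(self.dot, ".")
        return f"{self.head} -> {' '.join(symbols)}"


def items_of(head, body):
    """All LR(0) items of one production: one per marker position."""
    return [Item(head, tuple(body), d) for d in range(len(body) + 1)]


# A production with four right-hand-side symbols has five items.
for item in items_of("S", ["a", "A", "B", "e"]):
    print(item)
```

The item with `dot == 2` is the one shown in the example above: a and A have been seen, B and e are still expected.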
CH4.2
Basic Operations

Augmented Grammar:
E’ → E
E → E+T | T
T → T*F | F
F → ( E ) | id
CLOSURE OPERATION on a set of items:
Function closure(I)
{
J = I;
repeat
  for each item A → α.Bβ in J and each production
  B → γ of G such that B → .γ is not in J: add B → .γ to J
until no more items can be added to J
return J
}
EXAMPLE consider I = { E’ → .E }
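The closure function can be sketched in Python roughly as follows. The encoding is an assumption made for illustration: an item is a `(head, body, dot)` triple, and `GRAMMAR`, `NONTERMINALS`, and `show` are hypothetical names. The loop mirrors the repeat/until of the pseudocode.

```python
# Augmented expression grammar from the slide, as (head, body) pairs.
GRAMMAR = [
    ("E'", ("E",)),
    ("E", ("E", "+", "T")),
    ("E", ("T",)),
    ("T", ("T", "*", "F")),
    ("T", ("F",)),
    ("F", ("(", "E", ")")),
    ("F", ("id",)),
]
NONTERMINALS = {head for head, _ in GRAMMAR}


def closure(items):
    """Closure of a set of (head, body, dot) items, returned as a frozenset."""
    result = set(items)
    changed = True
    while changed:  # "repeat ... until no more items can be added"
        changed = False
        for _, body, dot in list(result):
            # Marker in front of a nonterminal B: add every B -> .gamma
            if dot < len(body) and body[dot] in NONTERMINALS:
                for head, rhs in GRAMMAR:
                    if head == body[dot] and (head, rhs, 0) not in result:
                        result.add((head, rhs, 0))
                        changed = True
    return frozenset(result)


def show(item):
    head, body, dot = item
    return f"{head} -> {' '.join(body[:dot] + ('.',) + body[dot:])}"


I0 = closure({("E'", ("E",), 0)})
for line in sorted(show(i) for i in I0):
    print(line)
```

For I = { E’ → .E } the result contains seven items: the starting item plus every production of E, T, and F with the marker at the far left.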
CH4.3
GOTO function

Definition.
Goto(I,X) = closure of the set of all items
A → αX.β where A → α.Xβ belongs to I
Intuitively: Goto(I,X) is the set of all items that are
“reachable” from the items of I once X has been
“seen.”
E.g. consider I = { E’ → E. , E → E.+T } and compute
Goto(I,+)
Goto(I,+) = { E → E+.T , T → .T * F , T → .F ,
F → .( E ) , F → .id }
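The Goto example can be checked with a short sketch. As before, items are assumed to be `(head, body, dot)` triples and all names are illustrative; `closure` is written here with a worklist, which is one possible way to implement the repeat/until loop.

```python
GRAMMAR = [
    ("E'", ("E",)),
    ("E", ("E", "+", "T")),
    ("E", ("T",)),
    ("T", ("T", "*", "F")),
    ("T", ("F",)),
    ("F", ("(", "E", ")")),
    ("F", ("id",)),
]
NONTERMINALS = {head for head, _ in GRAMMAR}


def closure(items):
    """Closure of a set of (head, body, dot) items, using a worklist."""
    result, todo = set(items), list(items)
    while todo:
        _, body, dot = todo.pop()
        if dot < len(body) and body[dot] in NONTERMINALS:
            for head, rhs in GRAMMAR:
                new = (head, rhs, 0)
                if head == body[dot] and new not in result:
                    result.add(new)
                    todo.append(new)
    return frozenset(result)


def goto(items, symbol):
    """Move the marker over `symbol` wherever possible, then take the closure."""
    moved = {(h, b, d + 1) for h, b, d in items if d < len(b) and b[d] == symbol}
    return closure(moved)


I = {("E'", ("E",), 1), ("E", ("E", "+", "T"), 1)}
result = goto(I, "+")
print(len(result))  # number of items in Goto(I, +)
```

Only E → E.+T has the marker in front of +, so the kernel of the result is E → E+.T; the closure then adds the T and F productions.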
CH4.4
The Canonical Collections of Items for G
Procedure Items(G’: augmented grammar)
{
C := { closure({ [S’ → .S] }) }
repeat
  for each set of items I in C and each
  grammar symbol X
  such that goto(I,X) is not empty and not in C
  do add goto(I,X) to C
until no more sets of items can be added to C
}
Grammar:
E’ → E
E → E+T | T
T → T*F | F
F → ( E ) | id

I0:
E’ → .E
E → .E + T
E → .T
T → .T * F
T → .F
F → .( E )
F → .id

I1:
E’ → E.
E → E. + T

I2:
E → T.
T → T. * F

… I11
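A sketch of the Items procedure in Python, under the same assumed encoding (items as `(head, body, dot)` triples; all function and variable names are illustrative). The order in which states are discovered depends on the iteration order over grammar symbols, so the indices produced here need not match the I0 … I11 numbering used on the slides.

```python
GRAMMAR = [
    ("E'", ("E",)),
    ("E", ("E", "+", "T")),
    ("E", ("T",)),
    ("T", ("T", "*", "F")),
    ("T", ("F",)),
    ("F", ("(", "E", ")")),
    ("F", ("id",)),
]
NONTERMINALS = {head for head, _ in GRAMMAR}


def closure(items):
    result, todo = set(items), list(items)
    while todo:
        _, body, dot = todo.pop()
        if dot < len(body) and body[dot] in NONTERMINALS:
            for head, rhs in GRAMMAR:
                new = (head, rhs, 0)
                if head == body[dot] and new not in result:
                    result.add(new)
                    todo.append(new)
    return frozenset(result)


def goto(items, symbol):
    moved = {(h, b, d + 1) for h, b, d in items if d < len(b) and b[d] == symbol}
    return closure(moved)


def canonical_collection():
    """All sets of items reachable from closure({S' -> .S})."""
    symbols = sorted({s for _, body in GRAMMAR for s in body})
    start_head, start_body = GRAMMAR[0]
    collection = [closure({(start_head, start_body, 0)})]
    for state in collection:  # the list grows while it is being scanned
        for x in symbols:
            target = goto(state, x)
            if target and target not in collection:
                collection.append(target)
    return collection


C = canonical_collection()
print(len(C))  # number of sets of items
```

For the expression grammar this yields twelve sets of items, in line with the I0 … I11 of the slide.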
CH4.5
The DFA For Viable Prefixes
States = Canonical Collection of Sets of Items
Transitions defined by the Goto Function.
All states final except I0
[Figure: fragment of the DFA, starting at I0, with edges labeled E, T, F, + and * leading to states I1, I2, I3, I7, …]
See p. 226
Intuition: Imagine an NFA whose states are all the items
of the grammar, with transitions of the form:
“A → α.Xβ” goes to “A → αX.β” with an arrow
labeled “X”
(plus ε-moves from “A → α.Bβ” to each “B → .γ”)
Then the closure used in the Goto function
essentially transforms this NFA into the DFA above
CH4.6
Example
S’ → S
S → aABe
A → Abc
A → b
B → d
Start with I0 = closure({ S’ → .S })
CH4.7
2nd Example
E’ → E
E → E+T | T
T → T*F | F
F → ( E ) | id
CH4.8
Relation to Parsing

An item A → β1.β2 is valid for a viable prefix
αβ1 if there is a rightmost derivation that yields
αAw, which in one step yields αβ1β2w.

In general, an item will be valid for many viable prefixes.

Knowing which items are valid for a certain viable
prefix helps us decide whether to shift or
reduce when αβ1 is on the stack.

If β2 ≠ ε, it looks like we still need to shift.

If β2 = ε, it looks like we should reduce by A → β1.
It could be that two valid items
tell us different things.
CH4.9
Valid Items for Viable Prefixes

E+T* is a viable prefix (and the DFA will be in
state I7 after reading it)
Indeed: E’=>E=>E+T=>E+T*F is a rightmost
derivation, T*F is the handle of E+T*F, thus
E+T*F is a viable prefix, thus E+T* is also.
Examine state I7 … it contains
T → T*.F
F → .(E)
F → .id
i.e., precisely the items valid for E+T*:
E’=>E=>E+T=>E+T*F
E’=>E=>E+T=>E+T*F=>E+T*(E)
E’=>E=>E+T=>E+T*F=>E+T*id
There are no other valid items for the viable
prefix E+T*
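The claim about E+T* can be checked by running the DFA by hand: start from closure({E’ → .E}) and apply Goto once per symbol of the prefix. The sketch below assumes items encoded as `(head, body, dot)` triples; the helper names are illustrative.

```python
GRAMMAR = [
    ("E'", ("E",)),
    ("E", ("E", "+", "T")),
    ("E", ("T",)),
    ("T", ("T", "*", "F")),
    ("T", ("F",)),
    ("F", ("(", "E", ")")),
    ("F", ("id",)),
]
NONTERMINALS = {head for head, _ in GRAMMAR}


def closure(items):
    result, todo = set(items), list(items)
    while todo:
        _, body, dot = todo.pop()
        if dot < len(body) and body[dot] in NONTERMINALS:
            for head, rhs in GRAMMAR:
                new = (head, rhs, 0)
                if head == body[dot] and new not in result:
                    result.add(new)
                    todo.append(new)
    return frozenset(result)


def goto(items, symbol):
    moved = {(h, b, d + 1) for h, b, d in items if d < len(b) and b[d] == symbol}
    return closure(moved)


def state_after(prefix):
    """Set of items the DFA reaches after reading `prefix` (empty if stuck)."""
    state = closure({("E'", ("E",), 0)})
    for symbol in prefix:
        state = goto(state, symbol)
    return state


valid = state_after(["E", "+", "T", "*"])
for head, body, dot in sorted(valid):
    print(head, "->", " ".join(body[:dot] + (".",) + body[dot:]))
```

A sequence that is not a viable prefix, such as E + *, drives the DFA into the empty set of items.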
CH4.10
SLR Parsing Table Construction
Input: the augmented grammar G’
Output: The SLR Parsing table functions ACTION & GOTO
1. Construct C={I0,..,In}, the collection of sets of LR(0) items for G’
2. “State i” is constructed from Ii:
   If [A → α.aβ] is in Ii and goto(Ii,a)=Ik then we set
   ACTION[i,a] to be “shift k”
   (a is a terminal)
   If [A → α.] is in Ii then we set ACTION[i,a] to “reduce A → α”
   for all a in Follow(A) (note: A is not S’)
   If [S’ → S.] is in Ii then we set ACTION[i,$] = accept
3. The goto transitions for state i are constructed as follows: for
all nonterminals A, if goto(Ii,A)=Ik then GOTO[i,A]=k
4. All entries not defined by rules (2) and (3) are made “error”
5. The initial state of the parser is the one constructed from the
set of items I0
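The construction can be sketched in Python as follows. Several things here are assumptions for illustration: items are `(head, body, dot)` triples, the Follow sets for the expression grammar are written in by hand rather than computed, “error” entries are simply absent from the dictionaries, and state numbers follow discovery order rather than the textbook numbering.

```python
GRAMMAR = [
    ("E'", ("E",)),
    ("E", ("E", "+", "T")),
    ("E", ("T",)),
    ("T", ("T", "*", "F")),
    ("T", ("F",)),
    ("F", ("(", "E", ")")),
    ("F", ("id",)),
]
NONTERMINALS = {head for head, _ in GRAMMAR}
# Follow sets entered by hand (assumption: not computed in this sketch).
FOLLOW = {
    "E": {"$", "+", ")"},
    "T": {"$", "+", ")", "*"},
    "F": {"$", "+", ")", "*"},
}


def closure(items):
    result, todo = set(items), list(items)
    while todo:
        _, body, dot = todo.pop()
        if dot < len(body) and body[dot] in NONTERMINALS:
            for head, rhs in GRAMMAR:
                new = (head, rhs, 0)
                if head == body[dot] and new not in result:
                    result.add(new)
                    todo.append(new)
    return frozenset(result)


def goto(items, symbol):
    moved = {(h, b, d + 1) for h, b, d in items if d < len(b) and b[d] == symbol}
    return closure(moved)


def canonical_collection():
    symbols = sorted({s for _, body in GRAMMAR for s in body})
    collection = [closure({(GRAMMAR[0][0], GRAMMAR[0][1], 0)})]
    for state in collection:  # the list grows while it is being scanned
        for x in symbols:
            target = goto(state, x)
            if target and target not in collection:
                collection.append(target)
    return collection


def slr_table():
    states = canonical_collection()
    action, goto_table, conflicts = {}, {}, []

    def set_action(i, a, entry):
        # A second, different entry for the same cell is a conflict.
        if action.setdefault((i, a), entry) != entry:
            conflicts.append((i, a, action[(i, a)], entry))

    for i, state in enumerate(states):
        for head, body, dot in state:
            if dot < len(body):                      # marker before a symbol
                x = body[dot]
                k = states.index(goto(state, x))
                if x in NONTERMINALS:
                    goto_table[(i, x)] = k           # rule 3
                else:
                    set_action(i, x, ("shift", k))   # rule 2, shift
            elif head == GRAMMAR[0][0]:
                set_action(i, "$", ("accept",))      # [S' -> S.]
            else:
                for a in FOLLOW[head]:               # rule 2, reduce
                    set_action(i, a, ("reduce", head, body))
    return action, goto_table, conflicts


ACTION, GOTO, CONFLICTS = slr_table()
print(len(CONFLICTS))  # 0 means no multiply defined entries
```

Recording conflicts instead of overwriting entries is a design choice of this sketch: it makes it easy to see afterwards whether a cell was defined more than once.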
CH4.11
Example.
I0:
E’ → .E
E → .E + T
E → .T
T → .T * F
T → .F
F → .( E )
F → .id

I1:
E’ → E.
E → E. + T

I2:
E → T.
T → T. * F

I4:
F → (.E)
E → .E + T
E → .T
T → .T * F
T → .F
F → .( E )
F → .id

Goto(I0,E)=I1
Goto(I0,T)=I2
Goto(I0,( )=I4

Since F → .( E ) is in I0
and Goto(I0,( )=I4
we set ACTION(0, ( )=s4
Since E’ → E. is in I1
we set ACTION(1,$)=acc
Since E → T. is in I2 and
Follow(E)={$,+,) }
we set ACTION(2,$)=r E → T
ACTION(2,+)=r E → T
ACTION(2,))=r E → T
Follow(T)=Follow(F)={ ) , + , * , $ }
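The Follow sets used in this example can be reproduced with a small fixed-point computation. This is a simplified sketch that assumes the grammar has no ε-productions (true for the expression grammar, not in general); `first_sets` and `follow_sets` are illustrative names.

```python
GRAMMAR = [
    ("E'", ("E",)),
    ("E", ("E", "+", "T")),
    ("E", ("T",)),
    ("T", ("T", "*", "F")),
    ("T", ("F",)),
    ("F", ("(", "E", ")")),
    ("F", ("id",)),
]
NONTERMINALS = {head for head, _ in GRAMMAR}


def first_sets():
    """First sets, assuming no production has an empty right-hand side."""
    first = {n: set() for n in NONTERMINALS}
    changed = True
    while changed:
        changed = False
        for head, body in GRAMMAR:
            x = body[0]
            new = first[x] if x in NONTERMINALS else {x}
            if not new <= first[head]:
                first[head] |= new
                changed = True
    return first


def follow_sets():
    """Follow sets under the same no-epsilon assumption."""
    first = first_sets()
    follow = {n: set() for n in NONTERMINALS}
    follow[GRAMMAR[0][0]].add("$")  # end marker follows the start symbol
    changed = True
    while changed:
        changed = False
        for head, body in GRAMMAR:
            for i, x in enumerate(body):
                if x not in NONTERMINALS:
                    continue
                if i + 1 < len(body):        # something comes after x
                    nxt = body[i + 1]
                    new = first[nxt] if nxt in NONTERMINALS else {nxt}
                else:                        # x ends the body
                    new = follow[head]
                if not new <= follow[x]:
                    follow[x] |= new
                    changed = True
    return follow


FOLLOW = follow_sets()
for name in ("E", "T", "F"):
    print(name, sorted(FOLLOW[name]))
```

The computed sets agree with the ones stated above: Follow(E) = {$, +, )} and Follow(T) = Follow(F) = {), +, *, $}.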
CH4.12
3rd example – SLR Table Construction
S → AB | a
A → aA | b
B → a
CH4.13
Conflicts


Shift/Reduce
Reduce/Reduce
Sometimes unambiguous
grammars produce multiply defined
entries (s/r, r/r conflicts) in the SLR
table.
CH4.14
Conflict Example
S’ → S
S → L=R | R
L → * R | id
R → L
CH4.15
Conflict Example
S’ → S
S → L=R | R
L → * R | id
R → L
I0 = {S’ → .S , S → .L = R , S → .R , L → .* R ,
L → .id , R → .L}
I1 = {S’ → S. }
I2 = {S → L . = R , R → L . }
I3 = {S → R. }
I4 = {L → *.R , R → .L , L → .* R , L → .id }
I5 = {L → id. }
I6 = {S → L = . R , R → .L , L → .* R , L → .id }
I7 = {L → *R. }
I8 = {R → L. }
I9 = {S → L = R. }
action[2, = ] ?
s6 (because of S → L . = R )
r R → L (because of R → L . and = follows R)
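The conflict in state I2 can be reproduced with a short sketch. Assumptions: items are `(head, body, dot)` triples, the helper names are illustrative, and the Follow sets (Follow(S) = {$}, Follow(L) = Follow(R) = {=, $}) are worked out by hand rather than computed.

```python
GRAMMAR = [
    ("S'", ("S",)),
    ("S", ("L", "=", "R")),
    ("S", ("R",)),
    ("L", ("*", "R")),
    ("L", ("id",)),
    ("R", ("L",)),
]
NONTERMINALS = {head for head, _ in GRAMMAR}
# Follow sets entered by hand (assumption: not computed in this sketch).
FOLLOW = {"S": {"$"}, "L": {"=", "$"}, "R": {"=", "$"}}


def closure(items):
    result, todo = set(items), list(items)
    while todo:
        _, body, dot = todo.pop()
        if dot < len(body) and body[dot] in NONTERMINALS:
            for head, rhs in GRAMMAR:
                new = (head, rhs, 0)
                if head == body[dot] and new not in result:
                    result.add(new)
                    todo.append(new)
    return frozenset(result)


def goto(items, symbol):
    moved = {(h, b, d + 1) for h, b, d in items if d < len(b) and b[d] == symbol}
    return closure(moved)


I0 = closure({("S'", ("S",), 0)})
I2 = goto(I0, "L")

# Terminals on which some item of I2 asks for a shift.
shifts = {b[d] for _, b, d in I2 if d < len(b) and b[d] not in NONTERMINALS}
# Terminals on which some complete item of I2 asks for a reduce.
reduces = {a for h, b, d in I2 if d == len(b) for a in FOLLOW[h]}

conflict = shifts & reduces
print(sorted(conflict))  # lookaheads with both a shift and a reduce
```

The only lookahead on which I2 asks for both a shift and a reduce is =, which is the action[2, =] entry discussed above.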
CH4.16
But Why?

Let’s consider a string that will exhibit the conflict.
id=id

Stack    Input    Action
$0       id=id$   s5
$0id5    =id$     r L → id
$0L2     =id$     conflict…
What is the correct move? (Recall: the grammar is unambiguous.)
Shift: reducing by R → L would give R=id, which is not a right sentential form!!!
In general = may follow R … but it
does not in this case.
…Actually it does so only when R is preceded by *
SLR finds a conflict because using Follow + LR(0)
items as the guide for deciding when to reduce is too
coarse a method.
CH4.17