Abstract data types


CS 321
Programming Languages and Compilers
Bottom Up Parsing
Bottom-up Parsing: Shift-reduce parsing
Grammar H: L → E ; L | E
           E → a | b
Input: a;a;b
has parse tree
L
├── E
│   └── a
├── ;
└── L
    ├── E
    │   └── a
    ├── ;
    └── L
        └── E
            └── b
Data for Shift-reduce Parser
• Input string: sequence of tokens being checked for
grammatical correctness
• Stack: sentential form representing the input seen
so far
• Trees Constructed: the parse trees that have been
constructed so far at some point in the parsing
Operations of shift-reduce parser
• Shift: move input token to the set of trees as a
singleton tree.
• Reduce: coalesce one or more trees into single
tree (according to some production).
• Accept: terminate and accept the token stream as
grammatically correct.
• Reject: Terminate and reject the token stream.
A parse for grammar H
(Trees are shown in bracketed form: E(a) is the tree with root E and child a.)

Input   Stack    Action             Trees
a;a;b            Shift
;a;b    a        Reduce E → a       a
;a;b    E        Shift              E(a)
a;b     E;       Shift              E(a) ;
;b      E;a      Reduce E → a       E(a) ; a
;b      E;E      Shift              E(a) ; E(a)
b       E;E;     Shift              E(a) ; E(a) ;
$       E;E;b    Reduce E → b       E(a) ; E(a) ; b
$       E;E;E    Reduce L → E       E(a) ; E(a) ; E(b)
$       E;E;L    Reduce L → E;L     E(a) ; E(a) ; L(E(b))
$       E;L      Reduce L → E;L     E(a) ; L(E(a) ; L(E(b)))
$       L        Accept             L(E(a) ; L(E(a) ; L(E(b))))
Bottom-up Parsing
• Characteristic automaton for an “LR”
grammar: tells when to shift, reduce, accept
or reject
• handles
• viable prefixes
A parse for grammar H

Input   Stack    Action             Stack+Input
a;a;b            Shift              a;a;b
;a;b    a        Reduce E → a       a;a;b
;a;b    E        Shift              E;a;b
a;b     E;       Shift              E;a;b
;b      E;a      Reduce E → a       E;a;b
;b      E;E      Shift              E;E;b
b       E;E;     Shift              E;E;b
$       E;E;b    Reduce E → b       E;E;b
$       E;E;E    Reduce L → E       E;E;E
$       E;E;L    Reduce L → E;L     E;E;L
$       E;L      Reduce L → E;L     E;L
$       L        Accept             L
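The trace can be replayed mechanically. The following is a minimal Python sketch, not part of the original slides: the names `PRODUCTIONS`, `replay` and `ACTIONS` are made up for illustration, and tokens are assumed to be single characters. It applies the listed shift and reduce actions to a;a;b and records the stack after each one.

```python
# Grammar H productions, keyed by a label: (left-hand side, right-hand side).
PRODUCTIONS = {
    "E->a": ("E", ["a"]),
    "E->b": ("E", ["b"]),
    "L->E": ("L", ["E"]),
    "L->E;L": ("L", ["E", ";", "L"]),
}

def replay(tokens, actions):
    """Apply shift/reduce actions in order; return the stack after each action."""
    stack, pos, history = [], 0, []
    for action in actions:
        if action == "shift":
            stack.append(tokens[pos])
            pos += 1
        else:
            lhs, rhs = PRODUCTIONS[action]
            # The right-hand side must be on top of the stack before reducing.
            assert stack[-len(rhs):] == rhs, f"cannot reduce by {action}"
            del stack[-len(rhs):]
            stack.append(lhs)
        history.append("".join(stack))
    return history

# The action column of the trace, without the final Accept.
ACTIONS = ["shift", "E->a", "shift", "shift", "E->a", "shift", "shift",
           "E->b", "L->E", "L->E;L", "L->E;L"]
trace = replay(list("a;a;b"), ACTIONS)
print(trace)
```

The printed list should match the Stack column of the table from the second row on, ending with the single symbol L.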
“Bottom-up” parsing?
Stack+Input
L
E;L
E;E;L
E;E;E
E;E;b
E;E;b
E;E;b
E;a;b
E;a;b
E;a;b
a;a;b
a;a;b
[Figure: partial parse trees for a;a;b at successive steps of the parse.]
Handles and viable prefixes
• Stack+remaining input = sentential form
• Handle = the part of the sentential form that is
reduced in each step
• Viable prefix = a prefix of a sentential form in a
right-most derivation that does not extend beyond
the end of the handle
• E.g. viable prefixes for H: (E;)*(E | L | a | b)
• Viable prefixes form a regular set.
Characteristic Finite State Machine (CFSM)
Viable prefixes of H are recognized by this CFSM:
State   Symbol   Next state
0       L        1
0       E        2
0       a        3
0       b        4
2       ;        5
5       L        6
5       E        2
5       a        3
5       b        4
(Start state: 0)
How a Bottom-Up Parser Works
• Run the CFSM on symbols in the stack
• If a transition is possible on the incoming input
symbol, then shift, else reduce.
– Still need to decide which rule to use for the reduction.
Characteristic automaton
Start state: 0
Transitions: 0 -L-> 1, 0 -E-> 2, 0 -a-> 3, 0 -b-> 4,
2 -;-> 5, 5 -L-> 6, 5 -E-> 2, 5 -a-> 3, 5 -b-> 4
Viable prefixes:
a;a;b leads to state 3 after a
E;a;b leads to state 3 after E;a
E;E;b leads to state 4 after E;E;b
E;E;E leads to state 2 after E;E;E
E;E;L leads to state 6 after E;E;L
E;L leads to state 6 after E;L
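These state claims can be checked by running the automaton over each viable prefix. The sketch below is illustrative only: the transition table is read off the CFSM for H, and the names `TRANSITIONS` and `run_cfsm` are made up for this example.

```python
# CFSM for grammar H: (state, grammar symbol) -> next state. Start state is 0.
TRANSITIONS = {
    (0, "L"): 1, (0, "E"): 2, (0, "a"): 3, (0, "b"): 4,
    (2, ";"): 5,
    (5, "L"): 6, (5, "E"): 2, (5, "a"): 3, (5, "b"): 4,
}

def run_cfsm(symbols, start=0):
    """Return the state reached after reading `symbols`, or None if stuck."""
    state = start
    for sym in symbols:
        state = TRANSITIONS.get((state, sym))
        if state is None:
            return None  # not a viable prefix
    return state

# Each symbol of grammar H is a single character, so a string is a symbol list.
for prefix in ["a", "E;a", "E;E;b", "E;E;E", "E;E;L", "E;L"]:
    print(prefix, "->", run_cfsm(prefix))
```

A string such as "a;" gets stuck (the function returns None), which is how the automaton signals that it is not a viable prefix.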
Characteristic automaton
[Figure: the CFSM for H, with start state 0.]
State   Action
0, 5    shift (if possible)
1       accept
2       reduce L → E, if EOF; shift otherwise
3       reduce E → a
4       reduce E → b
6       reduce L → E;L
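Putting the automaton and the state/action table together gives a complete recognizer for H. The following is a rough sketch under some assumptions: tokens are single characters, "$" stands for end of input, the CFSM is rerun over the whole stack at every step (simple rather than efficient), and the names `REDUCE`, `run_cfsm` and `parse` are hypothetical.

```python
# CFSM for grammar H: (state, grammar symbol) -> next state.
TRANSITIONS = {
    (0, "L"): 1, (0, "E"): 2, (0, "a"): 3, (0, "b"): 4,
    (2, ";"): 5,
    (5, "L"): 6, (5, "E"): 2, (5, "a"): 3, (5, "b"): 4,
}
# States whose only action is a reduction: state -> (lhs, rhs).
REDUCE = {3: ("E", "a"), 4: ("E", "b"), 6: ("L", "E;L")}

def run_cfsm(stack):
    """Run the CFSM over the stack symbols; return the state reached."""
    state = 0
    for sym in stack:
        state = TRANSITIONS.get((state, sym))
        if state is None:
            return None
    return state

def parse(text):
    """Return True if `text` is a sentence of grammar H."""
    stack, rest = "", text
    while True:
        state = run_cfsm(stack)
        lookahead = rest[0] if rest else "$"
        if state == 1:                      # state 1: accept at end of input
            return lookahead == "$"
        if state in REDUCE or (state == 2 and lookahead == "$"):
            # State 2 reduces by L -> E only when the input is exhausted.
            lhs, rhs = REDUCE.get(state, ("L", "E"))
            stack = stack[:len(stack) - len(rhs)] + lhs
        elif (state, lookahead) in TRANSITIONS:
            stack, rest = stack + lookahead, rest[1:]   # shift
        else:
            return False                    # no action applies: reject

print(parse("a;a;b"), parse("a;"), parse("ab"))
```

A production parser would keep a stack of states instead of rescanning the symbols, which is what the GOTO table makes possible.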
Example: expression grammar
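E → E + T | T
T → T * P | P
P → id
id+id+id+id
has parse tree:
E
├── E
│   ├── E
│   │   ├── E
│   │   │   └── T
│   │   │       └── P
│   │   │           └── id
│   │   ├── +
│   │   └── T
│   │       └── P
│   │           └── id
│   ├── +
│   └── T
│       └── P
│           └── id
├── +
└── T
    └── P
        └── id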
A parse in this grammar
Input          Stack    Action
id+id+id+id             Shift
+id+id+id      id       Reduce
+id+id+id      P        Reduce
+id+id+id      T        Reduce
+id+id+id      E        Shift
id+id+id       E+       Shift
+id+id         E+id     Reduce
+id+id         E+P      Reduce
+id+id         E+T      Reduce
+id+id         E        Shift
id+id          E+       Shift
+id            E+id     Reduce
+id            E+P      Reduce
+id            E+T      Reduce
+id            E        Shift
id             E+       Shift
$              E+id     Reduce
$              E+P      Reduce
$              E+T      Reduce
$              E        Reduce
$              E        Accept
Characteristic Finite State Machine
The CFSM recognizes viable prefixes (strings of
grammar symbols that can appear on the stack):
[Figure: CFSM transition diagram for the expression grammar, with transitions on E, T, P, id, + and *.]
Definitions
• Rightmost derivation
• Right-sentential form
• Handle
• Viable prefix
• Characteristic automaton
Rightmost Derivation
Definition: A rightmost derivation in G is a derivation
S ⇒ w_1 ⇒ ... ⇒ w_i ⇒ w_{i+1} ⇒ ...
such that for each step i, the rightmost non-terminal in w_i
is replaced to obtain w_{i+1}.
1. L ⇒ L;E ⇒ L;E;E ⇒ E;E;E ⇒ a;E;E ⇒ a;a;E ⇒ a;a;b
is not right-most
2. L ⇒ L;E ⇒ L;b ⇒ L;E;b ⇒ L;a;b ⇒ E;a;b ⇒ a;a;b
is right-most
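Whether a derivation is rightmost can be checked step by step. The sketch below is an illustration, not the slides' own material. It assumes the grammar that the two example derivations appear to use, namely the left-recursive rules L → L;E | E with E → a | b (inferred from the steps, since they do not follow grammar H's L → E;L). The names `GRAMMAR` and `is_rightmost` are made up.

```python
# Assumed grammar, inferred from the example derivations (not grammar H).
# Every symbol is a single character, so sentential forms are plain strings.
GRAMMAR = {"L": ["L;E", "E"], "E": ["a", "b"]}

def is_rightmost(derivation):
    """True if every step rewrites the rightmost nonterminal by a production."""
    for current, following in zip(derivation, derivation[1:]):
        positions = [i for i, sym in enumerate(current) if sym in GRAMMAR]
        if not positions:
            return False          # nothing left to rewrite
        p = positions[-1]         # index of the rightmost nonterminal
        candidates = [current[:p] + rhs + current[p + 1:]
                      for rhs in GRAMMAR[current[p]]]
        if following not in candidates:
            return False
    return True

d1 = ["L", "L;E", "L;E;E", "E;E;E", "a;E;E", "a;a;E", "a;a;b"]
d2 = ["L", "L;E", "L;b", "L;E;b", "L;a;b", "E;a;b", "a;a;b"]
print(is_rightmost(d1), is_rightmost(d2))  # False True
```

Derivation 1 fails at its second step, where L is rewritten although E is the rightmost non-terminal of L;E.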
Right-sentential Forms
Definition: A right-sentential form is any
sentential form that occurs in a right-most
derivation.
E.g., any of the sentential forms in
L ⇒ L;E ⇒ L;b ⇒ L;E;b ⇒ L;a;b ⇒ E;a;b ⇒ a;a;b
Handles
Definition: Assume the i-th step of a rightmost
derivation is:
w_i = u_i A v_i ⇒ u_i β v_i = w_{i+1}
Then (A → β, |u_i β|) is the handle of w_{i+1}.
In an unambiguous grammar, any sentence has a unique rightmost
derivation, and so we can talk about “the” handle rather than “a”
handle.
The Plan
• Construct a parser by first constructing the
CFSM, then constructing GOTO and ACTION
tables from it.
• Construction has two parts:
– “LR(0)” construction of the CFSM
– “SLR(1)” construction of tables
Constructing the CFSM: States
The states in the CFSM are created by
taking the “closure” of “LR(0) items”.
Given a production “L → E ; L”,
these are all induced “LR(0) items”:
L → • E ; L
L → E • ; L
L → E ; • L
L → E ; L •
What is “•”?
The “•” in “L → E • ; L” represents the state of the
parse.
[Figure: tree with root L and children E, “;” and L; the E subtree is labelled “Only this part of the tree is fully developed”.]
A State in the CFSM: Closure of LR(0) Item
For set I of LR(0) items, calculate closure(I):
1. if “A → α • B β” is in closure(I), then
for every production “B → γ”,
“B → • γ” is in closure(I)
2. closure(I) is the smallest set containing I with property (1)
Closure of LR(0) Item: Example
H: L → E ; L | E
   E → a | b
closure({L → E ; • L}) =
{L → E ; • L,
 L → • E ; L,
 L → • E,
 E → • a,
 E → • b}
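The closure computation is a small worklist loop. The following sketch is one possible way to write it, not the course's implementation: an item is represented as a tuple (lhs, rhs, dot position), and the names `GRAMMAR`, `closure` and `show` are chosen for this example.

```python
# Grammar H: nonterminal -> list of right-hand sides (tuples of symbols).
GRAMMAR = {
    "L": [("E", ";", "L"), ("E",)],
    "E": [("a",), ("b",)],
}

def closure(items):
    """Smallest set containing `items` and closed under the closure rule."""
    result = set(items)
    work = list(items)
    while work:
        lhs, rhs, dot = work.pop()
        # Rule (1): the dot stands in front of a nonterminal B.
        if dot < len(rhs) and rhs[dot] in GRAMMAR:
            for production in GRAMMAR[rhs[dot]]:
                item = (rhs[dot], production, 0)   # B -> • γ
                if item not in result:
                    result.add(item)
                    work.append(item)
    return result

def show(item):
    """Render an item in the slide notation, e.g. 'L -> E ; • L'."""
    lhs, rhs, dot = item
    return f"{lhs} -> {' '.join(rhs[:dot] + ('•',) + rhs[dot:])}"

start = ("L", ("E", ";", "L"), 2)          # the item L -> E ; • L
for line in sorted(show(i) for i in closure({start})):
    print(line)
```

For the item L → E ; • L this yields the same five items as the example above.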
LR(0) Machine
• Given grammar G with goal symbol S, augment
the grammar by adding new goal symbol S’ and
production S’ → S.
• States = sets of LR(0) items
• Start state = closure({S’ → • S})
• All states are considered to be final (set of viable
prefixes closed under prefix)
• transition(I, X) =
closure({A → α X • β | A → α • X β ∈ I}).
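The whole machine can be built by exploring transitions from the start state until no new item set appears. This is a hedged sketch for the augmented grammar H. The helper names (`closure`, `transition`, `build_cfsm`) and the item representation (lhs, rhs, dot) are assumptions of this example, and the state numbers it assigns depend on exploration order, so they need not match the numbering used on the slides.

```python
# Augmented grammar H: nonterminal -> list of right-hand sides.
GRAMMAR = {
    "L'": [("L",)],
    "L": [("E", ";", "L"), ("E",)],
    "E": [("a",), ("b",)],
}

def closure(items):
    """Close a set of LR(0) items (lhs, rhs, dot) under the closure rule."""
    result, work = set(items), list(items)
    while work:
        _, rhs, dot = work.pop()
        if dot < len(rhs) and rhs[dot] in GRAMMAR:
            for production in GRAMMAR[rhs[dot]]:
                item = (rhs[dot], production, 0)
                if item not in result:
                    result.add(item)
                    work.append(item)
    return frozenset(result)

def transition(state, symbol):
    """Move the dot over `symbol` in every item that allows it, then close."""
    moved = {(lhs, rhs, dot + 1) for lhs, rhs, dot in state
             if dot < len(rhs) and rhs[dot] == symbol}
    return closure(moved)

def build_cfsm(goal="L'"):
    """Return (list of states, {(state index, symbol): state index})."""
    start = closure({(goal, rhs, 0) for rhs in GRAMMAR[goal]})
    states, edges, work = [start], {}, [start]
    while work:
        state = work.pop(0)
        symbols = {rhs[dot] for _, rhs, dot in state if dot < len(rhs)}
        for sym in sorted(symbols):
            target = transition(state, sym)
            if target not in states:
                states.append(target)
                work.append(target)
            edges[(states.index(state), sym)] = states.index(target)
    return states, edges

states, edges = build_cfsm()
print(len(states), "states,", len(edges), "transitions")
```

For H this finds seven item sets and nine transitions, the same shape as the CFSM shown for H elsewhere in these slides.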
Example: LR(0) CFSM Construction
H:
Augment the grammar:
L’ → L
L → E ; L | E
E → a | b
Initial State is closure of this “augmenting rule”:
closure({L’ → • L}) =
I0: L’ → • L
    L → • E ; L
    L → • E
    E → • a
    E → • b
Example: Transitions from I0
transition(I0, L) = {L’ → L •} = I1
transition(I0, E) = {L → E • ; L, L → E •} = I2
transition(I0, a) = {E → a •} = I3
transition(I0, b) = {E → b •} = I4
There are no other transitions from I0.
There are no transitions possible from I1.
Now consider the transitions from I2.
Transitions from I2
transition(I2, ;) = I5
New state I5:
    L → E ; • L
    L → • E ; L
    L → • E
    E → • a
    E → • b
transition(I5, L) = {L → E ; L •} = I6
transition(I5, E) = I2
transition(I5, a) = I3
transition(I5, b) = I4
The CFSM Transition Diagram for H
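States:
I0: L’ → • L
    L → • E ; L
    L → • E
    E → • a
    E → • b
I1: L’ → L •
I2: L → E • ; L
    L → E •
I3: E → a •
I4: E → b •
I5: L → E ; • L
    L → • E ; L
    L → • E
    E → • a
    E → • b
I6: L → E ; L •
Transitions:
I0 -L-> I1, I0 -E-> I2, I0 -a-> I3, I0 -b-> I4
I2 -;-> I5
I5 -L-> I6, I5 -E-> I2, I5 -a-> I3, I5 -b-> I4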
Characteristic Finite State Machine for H
Start state: 0
Transitions: 0 -L-> 1, 0 -E-> 2, 0 -a-> 3, 0 -b-> 4,
2 -;-> 5, 5 -L-> 6, 5 -E-> 2, 5 -a-> 3, 5 -b-> 4
How LR(1) parsers work
• GOTO table: transition function of characteristic
automaton in tabular form
• ACTION table: State × Σ → Action
• Procedure:
– Use the GOTO table to run the CFSM over the stack. Suppose
the state reached at the top of the stack is σ.
– Take the action given by ACTION(σ, a), where a is the incoming input
symbol.
Action Table for H
State   a       b       ;                $
0       Shift   Shift
1                                        Accept
2                       Shift            Reduce L → E
3                       Reduce E → a     Reduce E → a
4                       Reduce E → b     Reduce E → b
5       Shift   Shift
6                                        Reduce L → E;L
SLR(1) Parser Construction
• GOTO table is the move function from the LR(0)
CFSM.
• ACTION table is constructed from the CFSM as
follows:
– If state i contains A → α • a β, then ACTION(i, a) = Shift.
– If state i contains A → α •, then ACTION(i, a) = Reduce A → α. (But
if A is L’, the action is Accept.)
– Otherwise, reject.
• But,...
SLR(1) Parser Construction
• Rules for the ACTION table can involve
shift/reduce conflicts.
• So the actual rule for reduce actions is:
If state i contains A → α •, then ACTION(i, a) = Reduce A → α, for
all a ∈ FOLLOW(A).
• E.g. state 2 for grammar H yields a shift/reduce
conflict: should you shift the “;” or
reduce by “L → E”? This is resolved by looking at
the “follow set” for L.
• FOLLOW(L) = {$}, so state 2 reduces by L → E only at end of
input and shifts on “;”.
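The SLR(1) rules translate almost directly into a table-filling loop. The following is a sketch under stated assumptions. The item sets are copied by hand from the CFSM construction for H, with right-hand sides as strings of single-character symbols. FOLLOW(L) = {$} is as stated above, while FOLLOW(E) = {;, $} is worked out here from the grammar. The function name `build_action` is hypothetical.

```python
# LR(0) item sets of the CFSM for H; an item is (lhs, rhs, dot position).
STATES = {
    0: [("L'", "L", 0), ("L", "E;L", 0), ("L", "E", 0), ("E", "a", 0), ("E", "b", 0)],
    1: [("L'", "L", 1)],
    2: [("L", "E;L", 1), ("L", "E", 1)],
    3: [("E", "a", 1)],
    4: [("E", "b", 1)],
    5: [("L", "E;L", 2), ("L", "E;L", 0), ("L", "E", 0), ("E", "a", 0), ("E", "b", 0)],
    6: [("L", "E;L", 3)],
}
TERMINALS = {"a", "b", ";"}
FOLLOW = {"L": {"$"}, "E": {";", "$"}}   # FOLLOW(E) computed for this sketch

def build_action(states, follow, terminals, goal="L'"):
    """Fill ACTION[(state, terminal)]; raise if two actions collide."""
    action = {}

    def add(key, act):
        if action.setdefault(key, act) != act:
            raise ValueError(f"conflict at {key}: {action[key]} vs {act}")

    for i, items in states.items():
        for lhs, rhs, dot in items:
            if dot < len(rhs):
                if rhs[dot] in terminals:          # A -> α • a β
                    add((i, rhs[dot]), "shift")
            elif lhs == goal:                      # L' -> L •
                add((i, "$"), "accept")
            else:                                  # A -> α •
                for a in follow[lhs]:
                    add((i, a), f"reduce {lhs} -> {rhs}")
    return action

action = build_action(STATES, FOLLOW, TERMINALS)
for key in sorted(action):
    print(key, action[key])
```

Missing entries mean reject. With FOLLOW restricting the reductions, state 2 gets "shift" on ";" and "reduce L -> E" on "$", so no conflict is raised.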
FIRST and FOLLOW sets
• FIRST(α) = {a ∈ Σ | α ⇒* aβ} ∪ {ε | α ⇒* ε}
• FOLLOW(A) = {a ∈ Σ | S ⇒* αAaβ}
Calculating FIRST sets
• Create table Fi mapping N to subsets of Σ ∪ {ε}; initially,
Fi(A) = ∅ for all A.
• Repeat until Fi does not change:
– For each A → α ∈ P,
Fi(A) := Fi(A) ∪ FIRST(α, Fi)
where FIRST(α, Fi) is defined as follows:
– FIRST(ε, Fi) = {ε}
– FIRST(aβ, Fi) = {a}
– FIRST(Bβ, Fi) = (Fi(B) − {ε}) ∪ FIRST(β, Fi), if ε ∈ Fi(B)
                  Fi(B), otherwise
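The fixed-point iteration can be sketched in a few lines of Python. This is an illustration rather than a reference implementation: it uses the expression grammar from the earlier slides as test data, represents ε by the string "ε", and the names `first_of` and `first_sets` are made up.

```python
EPS = "ε"

# Expression grammar as (lhs, rhs) pairs.
GRAMMAR = [
    ("E", ["E", "+", "T"]), ("E", ["T"]),
    ("T", ["T", "*", "P"]), ("T", ["P"]),
    ("P", ["id"]),
]

def first_of(symbols, fi):
    """FIRST(α, Fi) for a list of symbols α, using the current table Fi."""
    if not symbols:
        return {EPS}                       # FIRST(ε, Fi) = {ε}
    head, rest = symbols[0], symbols[1:]
    if head not in fi:
        return {head}                      # FIRST(aβ, Fi) = {a}
    if EPS in fi[head]:                    # B can vanish: look past it
        return (fi[head] - {EPS}) | first_of(rest, fi)
    return set(fi[head])

def first_sets(grammar):
    """Iterate Fi(A) := Fi(A) ∪ FIRST(α, Fi) until nothing changes."""
    fi = {lhs: set() for lhs, _ in grammar}
    changed = True
    while changed:
        changed = False
        for lhs, rhs in grammar:
            new = first_of(rhs, fi)
            if not new <= fi[lhs]:
                fi[lhs] |= new
                changed = True
    return fi

fi = first_sets(GRAMMAR)
for nonterminal in sorted(fi):
    print(nonterminal, sorted(fi[nonterminal]))
```

For this grammar every FIRST set is {id}; the table stabilises after a few passes because each pass propagates the set one level up, from P to T to E.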
Calculating FOLLOW sets
• Create table Fo mapping N to subsets of Σ ∪ {$}; initially, Fo(A) = ∅ for
all A, except Fo(S) = {$}
• Repeat until Fo does not change:
– For each production A → αBβ,
Fo(B) := Fo(B) ∪ (FIRST(β) − {ε})
– For each production A → αB,
Fo(B) := Fo(B) ∪ Fo(A)
– For each production A → αBβ, if ε ∈ FIRST(β),
Fo(B) := Fo(B) ∪ Fo(A)
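The FOLLOW iteration can be sketched the same way. The code below is an illustrative Python version for grammar H, not the course's implementation. It includes a compact FIRST computation so that it runs on its own, and the names `first_of`, `first_sets` and `follow_sets` are hypothetical. The second and third rules are handled together, since "B at the end" is the case where the tail β is empty and FIRST(β) = {ε}.

```python
EPS, END = "ε", "$"

# Grammar H as (lhs, rhs) pairs; L is the goal symbol.
GRAMMAR = [("L", ["E", ";", "L"]), ("L", ["E"]), ("E", ["a"]), ("E", ["b"])]
NONTERMINALS = {lhs for lhs, _ in GRAMMAR}

def first_of(symbols, fi):
    """FIRST of a symbol string, given the per-nonterminal table fi."""
    result = set()
    for sym in symbols:
        if sym not in NONTERMINALS:
            return result | {sym}
        result |= fi[sym] - {EPS}
        if EPS not in fi[sym]:
            return result
    return result | {EPS}                  # every symbol could vanish

def first_sets():
    fi = {a: set() for a in NONTERMINALS}
    changed = True
    while changed:
        changed = False
        for lhs, rhs in GRAMMAR:
            new = first_of(rhs, fi) - fi[lhs]
            if new:
                fi[lhs] |= new
                changed = True
    return fi

def follow_sets(start):
    fi = first_sets()
    fo = {a: set() for a in NONTERMINALS}
    fo[start].add(END)                     # Fo(S) = {$}
    changed = True
    while changed:
        changed = False
        for lhs, rhs in GRAMMAR:
            for i, sym in enumerate(rhs):
                if sym not in NONTERMINALS:
                    continue
                tail = first_of(rhs[i + 1:], fi)     # FIRST(β)
                new = tail - {EPS}
                if EPS in tail:            # β empty or nullable: add Fo(A)
                    new |= fo[lhs]
                if not new <= fo[sym]:
                    fo[sym] |= new
                    changed = True
    return fo

fo = follow_sets("L")
for nonterminal in sorted(fo):
    print(nonterminal, sorted(fo[nonterminal]))
```

For H this gives FOLLOW(L) = {$} and FOLLOW(E) = {;, $}, consistent with the FOLLOW(L) value used to resolve the conflict in state 2.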