Syntax Analysis

LR parser
an efficient bottom-up parser for a large and useful class of
context-free grammars.
the “L” stands for a left-to-right scan of the input;
the “R” for constructing a rightmost derivation in reverse.
Why LR parsers are attractive
(1) LR parsers can be constructed for most programming languages.
(2) The LR parsing method is more general than the LL parsing method.
(3) LR parsers can detect syntactic errors as soon as it is possible to do so on a left-to-right scan of the input.
But,

it is too much work to implement an LR parser by hand for a typical
programming-language grammar.
=====>  Parser Generator

The techniques for producing LR parsing tables

Simple LR (SLR) - LR(0) items, FOLLOW

Canonical LR (CLR) - LR(1) items

Lookahead LR (LALR) - ① LR(1) items
                      ② LR(0) items, lookahead

[Figure: nested grammar classes, SLR ⊂ LALR ⊂ CLR]

LR parser
[Figure: model of an LR parser. The driver routine reads the input a1 … ai … an $, consults the parsing table, and maintains a stack whose top state is Sm.]

Stack : S0 X1 S1 X2 ••• Xm Sm, where Si : state and Xi ∈ V.

Configuration of an LR parser :
(S0 X1 S1 ••• Xm Sm, ai ai+1 ••• an $)
 stack contents      unscanned input
LR Parsing Table (ACTION table + GOTO table)
[Figure: the rows are states; the ACTION table has one column per terminal and the GOTO table has one column per nonterminal.]

The LR parsing algorithm
::= same as the shift-reduce parsing algorithm.

Four Actions : shift, reduce, accept, error
1. ACTION[Sm,ai] = shift S
::= (S0 X1 S1 ••• Xm Sm, ai ai+1 ••• an $)
 ⇒ (S0 X1 S1 ••• Xm Sm ai S, ai+1 ••• an $)
2. ACTION[Sm,ai] = reduce A → α and |α| = r
::= (S0 X1 S1 ••• Xm Sm, ai ai+1 ••• an $)
 ⇒ (S0 X1 S1 ••• Xm-r Sm-r, ai ai+1 ••• an $), GOTO(Sm-r, A) = S
 ⇒ (S0 X1 S1 ••• Xm-r Sm-r A S, ai ai+1 ••• an $)
3. ACTION[Sm,ai] = accept, parsing is completed.
4. ACTION[Sm,ai] = error, the parser has discovered an error
and calls an error recovery routine.
G:
1. LIST → LIST , ELEMENT
2. LIST → ELEMENT
3. ELEMENT → a

Parsing Table :
          ACTION              GOTO
state     a     ,     $       LIST   ELEMENT
0         s3                  1      2
1               s4    acc
2               r2    r2
3               r3    r3
4         s3                         5
5               r1    r1
where,
sj means shift and stack state j,
ri means reduce by production numbered i,
acc means accept, and blank means error.


Input : ω = a,a
Parsing Configuration :
STACK                      INPUT    ACTION
0                          a,a$     s3          (initial configuration)
0 a 3                      ,a$      r3, GOTO 2
0 ELEMENT 2                ,a$      r2, GOTO 1
0 LIST 1                   ,a$      s4
0 LIST 1 , 4               a$       s3
0 LIST 1 , 4 a 3           $        r3, GOTO 5
0 LIST 1 , 4 ELEMENT 5     $        r1, GOTO 1
0 LIST 1                   $        accept
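
The four actions and the trace above can be replayed with a small driver routine. This is a minimal sketch in Python, not the implementation used in the course: the table entries are copied from the parsing table for G, while the names `PRODUCTIONS`, `ACTION`, `GOTO`, and `lr_parse` are chosen here for illustration only.

```python
# Grammar G: production number -> (left-hand side, length of right-hand side).
PRODUCTIONS = {1: ("LIST", 3), 2: ("LIST", 1), 3: ("ELEMENT", 1)}

# ACTION and GOTO entries copied from the parsing table; a missing entry means error.
ACTION = {
    (0, "a"): "s3",
    (1, ","): "s4", (1, "$"): "acc",
    (2, ","): "r2", (2, "$"): "r2",
    (3, ","): "r3", (3, "$"): "r3",
    (4, "a"): "s3",
    (5, ","): "r1", (5, "$"): "r1",
}
GOTO = {(0, "LIST"): 1, (0, "ELEMENT"): 2, (4, "ELEMENT"): 5}


def lr_parse(tokens):
    """Run the driver routine and return the list of actions it performed."""
    stack = [0]                       # S0 X1 S1 ... Xm Sm stored as a flat list
    tokens = list(tokens) + ["$"]     # append the end marker
    pos = 0
    trace = []
    while True:
        state, symbol = stack[-1], tokens[pos]
        action = ACTION.get((state, symbol))
        if action is None:            # blank table entry: error
            raise SyntaxError(f"unexpected {symbol!r} in state {state}")
        trace.append(action)
        if action == "acc":
            return trace
        number = int(action[1:])
        if action[0] == "s":          # shift: push the symbol and the new state
            stack += [symbol, number]
            pos += 1
        else:                         # reduce A -> alpha with |alpha| = r
            lhs, length = PRODUCTIONS[number]
            del stack[len(stack) - 2 * length:]      # pop r symbols and r states
            stack += [lhs, GOTO[(stack[-1], lhs)]]   # push A and GOTO(Sm-r, A)


print(lr_parse(["a", ",", "a"]))
```

The returned action list follows the ACTION column of the trace row by row; an input such as `[","]` hits a blank entry in state 0 and raises `SyntaxError`, where a real parser would call its error recovery routine.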

The method for constructing an LR parsing table from a grammar
① SLR
② LALR
③ CLR

Definition : an LR(0) item

a production with a dot at some position of the right side.
ex) A → XYZ ∈ P,
[A → .XYZ] [A → X.YZ]
[A → XY.Z] [A → XYZ.]
• mark symbol ::= the symbol after the dot if it exists.
• kernel item ::= [A → α.β] if α ≠ ε or A = S'.
• closure item ::= [A → .α] : the result of performing the CLOSURE operation.
• reduce item ::= [A → α.]

[A → α.β] means that
• an input string derivable from α has just been seen,
• if an input string derivable from β is seen next,
we may be able to reduce by the production A → αβ.
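
One possible way to represent LR(0) items in code is a triple (left-hand side, right-hand side, dot position). The sketch below assumes that representation; the helper names `lr0_items`, `mark_symbol`, `is_reduce_item`, and `show` are hypothetical and only illustrate the definitions above.

```python
def lr0_items(lhs, rhs):
    """Return every LR(0) item of lhs -> rhs as (lhs, rhs, dot) triples."""
    rhs = tuple(rhs)
    return [(lhs, rhs, dot) for dot in range(len(rhs) + 1)]


def mark_symbol(item):
    """The symbol right after the dot, or None if the dot is at the end."""
    _, rhs, dot = item
    return rhs[dot] if dot < len(rhs) else None


def is_reduce_item(item):
    """A reduce item has the dot at the end of the right side."""
    _, rhs, dot = item
    return dot == len(rhs)


def show(item):
    """Format an item in the bracket notation used on the slides."""
    lhs, rhs, dot = item
    return f"[{lhs} → {''.join(rhs[:dot])}.{''.join(rhs[dot:])}]"


items = lr0_items("A", "XYZ")
for item in items:
    print(show(item), "mark symbol:", mark_symbol(item),
          "reduce item:", is_reduce_item(item))
```

With this encoding a production of length n yields n + 1 items, and an ε-production yields the single item `[A → .]`, whose dot is already at the end.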

Definition : Augmented Grammar
G = (VN, VT, P, S)
⇒ G' = (VN ∪ {S'}, VT, P ∪ {S' → S}, S')
where, S' is a new start symbol not in VN.

The purpose of this new starting production is to indicate to
the parser when it should stop parsing and announce acceptance of
the input. That is, acceptance occurs when and only when the
parser is about to reduce by S' → S.
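
The augmentation step is mechanical, as the following sketch suggests. It assumes a grammar stored as the tuple (VN, VT, P, S) with productions as (left side, right side) pairs; the function name `augment` and the choice of appending a prime to the old start symbol are illustrative assumptions.

```python
def augment(nonterminals, terminals, productions, start):
    """Return G' = (VN ∪ {S'}, VT, P ∪ {S' -> S}, S') for G = (VN, VT, P, S)."""
    new_start = start + "'"
    # Keep adding primes until the name is new, so that S' is not already in VN or VT.
    while new_start in nonterminals or new_start in terminals:
        new_start += "'"
    new_productions = [(new_start, (start,))] + list(productions)
    return nonterminals | {new_start}, terminals, new_productions, new_start


# The list grammar used earlier in these slides.
VN = {"LIST", "ELEMENT"}
VT = {"a", ","}
P = [
    ("LIST", ("LIST", ",", "ELEMENT")),
    ("LIST", ("ELEMENT",)),
    ("ELEMENT", ("a",)),
]
VN2, VT2, P2, S2 = augment(VN, VT, P, "LIST")
print(S2, P2[0])
```

The new production is placed first so that it can be numbered 0, which leaves the numbers of the original productions unchanged.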

Definition :
CLOSURE(I)
= I ∪ {[B → .γ] | [A → α.Bβ] ∈ CLOSURE(I), B → γ ∈ P}

Meaning :

[A → α.Bβ] in CLOSURE(I) indicates that, at some point in the
parsing process, we next expect to see a substring derivable from Bβ
as input.
If B → γ is a production, we would also expect to see a substring
derivable from γ at this point. For this reason, we also include
[B → .γ] in CLOSURE(I).

Computing Algorithm:
Algorithm CLOSURE(I) ;
begin
  CLOSURE := I ;
  repeat
    if [A → α.Bβ] ∈ CLOSURE and B → γ ∈ P then
      if [B → .γ] ∉ CLOSURE then
        CLOSURE := CLOSURE ∪ {[B → .γ]}
      fi
    fi
  until no change
end.

Ex 1)
E' → E
E → E+T | T
T → T*F | F
F → (E) | id

CLOSURE({[E' → .E]})
= {[E' → .E], [E → .E+T], [E → .T], [T → .T*F], [T → .F],
[F → .(E)], [F → .id]}.

CLOSURE({[E → E.+T]}) = {[E → E.+T]}.
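
The CLOSURE computation can be sketched in Python as follows. Items are assumed to be (left side, right side, dot position) triples and the grammar a dictionary from nonterminal to right sides; both are representation choices made here. The sketch uses a worklist in place of the repeat-until loop, which reaches the same fixed point.

```python
# Expression grammar from Ex 1); each right side is a tuple of grammar symbols.
GRAMMAR = {
    "E'": [("E",)],
    "E": [("E", "+", "T"), ("T",)],
    "T": [("T", "*", "F"), ("F",)],
    "F": [("(", "E", ")"), ("id",)],
}


def closure(items, grammar):
    """Return CLOSURE(items) as a frozenset of (lhs, rhs, dot) items."""
    result = set(items)
    work = list(items)
    while work:
        _, rhs, dot = work.pop()
        # [A -> alpha . B beta] with B a nonterminal: add every [B -> . gamma].
        if dot < len(rhs) and rhs[dot] in grammar:
            for gamma in grammar[rhs[dot]]:
                new_item = (rhs[dot], gamma, 0)
                if new_item not in result:
                    result.add(new_item)
                    work.append(new_item)
    return frozenset(result)


start_closure = closure({("E'", ("E",), 0)}, GRAMMAR)
print(len(start_closure))
plus_closure = closure({("E", ("E", "+", "T"), 1)}, GRAMMAR)
print(len(plus_closure))
```

In the second call the mark symbol is the terminal +, so nothing is added and the closure is the item itself, as in the second example above.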

Definition : GOTO(I,X)
= CLOSURE({[A → αX.β] | [A → α.Xβ] ∈ I}).

Meaning :
If I is the set of items that are valid for some viable prefix γ, then
GOTO(I,X) is the set of items that are valid for the viable prefix γX.

ex) I = {[E' → E.], [E → E.+T]}
GOTO(I,+) = CLOSURE({[E → E+.T]})
= {[E → E+.T], [T → .T*F], [T → .F], [F → .(E)], [F → .id]}
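
GOTO can be written directly from its definition: move the dot over X in every item whose mark symbol is X, then take the closure. The sketch below assumes items are (left side, right side, dot position) triples and repeats a small `closure` helper so that it runs on its own; the function names are illustrative.

```python
GRAMMAR = {
    "E'": [("E",)],
    "E": [("E", "+", "T"), ("T",)],
    "T": [("T", "*", "F"), ("F",)],
    "F": [("(", "E", ")"), ("id",)],
}


def closure(items, grammar):
    """Worklist version of CLOSURE over (lhs, rhs, dot) items."""
    result, work = set(items), list(items)
    while work:
        _, rhs, dot = work.pop()
        if dot < len(rhs) and rhs[dot] in grammar:
            for gamma in grammar[rhs[dot]]:
                new_item = (rhs[dot], gamma, 0)
                if new_item not in result:
                    result.add(new_item)
                    work.append(new_item)
    return frozenset(result)


def goto(items, symbol, grammar):
    """GOTO(I, X): advance the dot over X, then close the resulting kernel."""
    kernel = {(lhs, rhs, dot + 1)
              for lhs, rhs, dot in items
              if dot < len(rhs) and rhs[dot] == symbol}
    return closure(kernel, grammar)


I = {("E'", ("E",), 1), ("E", ("E", "+", "T"), 1)}
result = goto(I, "+", GRAMMAR)
print(len(result))
```

For a symbol that is not a mark symbol of any item in I, the kernel is empty and `goto` returns the empty set, which corresponds to a blank (error) entry in the table.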


We are now ready to give the algorithm to construct C0, the
canonical collection of sets of LR(0) items for an augmented
grammar:
C0 = {CLOSURE({[S' → .S]})} ∪ {GOTO(I,X) | I ∈ C0, X ∈ V}

Construction algorithm of C0.
Algorithm Canonical_Collection;
begin
  C0 := { CLOSURE({[S' → .S]}) };
  repeat
    for I ∈ C0 do
      Closure := CLOSURE(I);
      for each X ∈ MARK SYMBOL of Closure do
        J := GOTO(I,X);
        if ∃ Ji ∈ C0 such that Ji = J then GOTO[I,X] := Ji
        else GOTO[I,X] := J;
             C0 := C0 ∪ {J}
        fi
      end for
    end for
  until no change
end.
Text p.324

G : LIST → LIST , ELEMENT
    LIST → ELEMENT
    ELEMENT → a

Augmented Grammar

G' : ACCEPT → LIST
     LIST → LIST , ELEMENT
     LIST → ELEMENT
     ELEMENT → a

C0 :

I0 : CLOSURE({[ACCEPT → .LIST]})
= {[ACCEPT → .LIST], [LIST → .LIST,ELEMENT],
[LIST → .ELEMENT], [ELEMENT → .a]}.

GOTO(I0,LIST) = I1 = {[ACCEPT → LIST.], [LIST → LIST.,ELEMENT]}.
GOTO(I0,ELEMENT) = I2 = {[LIST → ELEMENT.]}.
GOTO(I0,a) = I3 = {[ELEMENT → a.]}.
GOTO(I1,,) = I4 = {[LIST → LIST,.ELEMENT], [ELEMENT → .a]}.
GOTO(I4,ELEMENT) = I5 = {[LIST → LIST,ELEMENT.]}.
GOTO(I4,a) = I3.
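
The construction of C0 for G' can be reproduced with a short program. This is a sketch under stated assumptions: items are (left side, right side, dot position) triples, and the symbol order in `SYMBOLS` is chosen here so that the states come out numbered as I0 to I5 above; a different order gives the same sets with different numbers.

```python
GRAMMAR = {
    "ACCEPT": [("LIST",)],
    "LIST": [("LIST", ",", "ELEMENT"), ("ELEMENT",)],
    "ELEMENT": [("a",)],
}
SYMBOLS = ["LIST", "ELEMENT", "a", ","]   # assumed order; it fixes the numbering


def closure(items, grammar):
    result, work = set(items), list(items)
    while work:
        _, rhs, dot = work.pop()
        if dot < len(rhs) and rhs[dot] in grammar:
            for gamma in grammar[rhs[dot]]:
                new_item = (rhs[dot], gamma, 0)
                if new_item not in result:
                    result.add(new_item)
                    work.append(new_item)
    return frozenset(result)


def goto(items, symbol, grammar):
    kernel = {(lhs, rhs, dot + 1) for lhs, rhs, dot in items
              if dot < len(rhs) and rhs[dot] == symbol}
    return closure(kernel, grammar)


def canonical_collection(grammar, start, symbols):
    """Return (states, edges): the sets in C0 and the GOTO function between them."""
    states = [closure({(start, grammar[start][0], 0)}, grammar)]
    edges = {}
    i = 0
    while i < len(states):                 # states grows while it is scanned
        for x in symbols:
            target = goto(states[i], x, grammar)
            if not target:
                continue                   # X is not a mark symbol of this state
            if target not in states:
                states.append(target)      # a new set J joins C0
            edges[(i, x)] = states.index(target)
        i += 1
    return states, edges


states, edges = canonical_collection(GRAMMAR, "ACCEPT", SYMBOLS)
print(len(states))
print(edges)
```

The `edges` dictionary is the GOTO function in tabular form, for example `edges[(4, "a")]` is 3 because GOTO(I4,a) = I3.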

Definition : GOTO graph
::= a directed graph in which the nodes are labeled by the sets of
items and the edges by grammar symbols.
Ex)
I0 --LIST--> I1
I0 --ELEMENT--> I2
I0 --a--> I3
I1 --,--> I4
I4 --a--> I3
I4 --ELEMENT--> I5

Ex 1) G : PR → b DL ; SL e
          DL → d ; DL | d
          SL → s ; SL | s
renaming (PR ⇒ P, DL ⇒ D, SL ⇒ S)
      G : P → bD;Se
          D → d;D | d
          S → s;S | s
Text p.326
[Example 8.8]
• The LR(0) item [A → .] for an ε-production is both a closure item and a reduce item.

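C0 :
I0 : [P' → .P], [P → .bD;Se]
I1 = GOTO(I0,P) : [P' → P.]
I2 = GOTO(I0,b) : [P → b.D;Se], [D → .d;D], [D → .d]
I3 = GOTO(I2,D) : [P → bD.;Se]
I4 = GOTO(I2,d) : [D → d.;D], [D → d.]
I5 = GOTO(I3,;) : [P → bD;.Se], [S → .s;S], [S → .s]
I6 = GOTO(I5,S) : [P → bD;S.e]
I7 = GOTO(I5,s) : [S → s.;S], [S → s.]
I8 = GOTO(I4,;) : [D → d;.D], [D → .d;D], [D → .d]
I9 = GOTO(I8,D) : [D → d;D.]
I10 = GOTO(I6,e) : [P → bD;Se.]
I11 = GOTO(I7,;) : [S → s;.S], [S → .s;S], [S → .s]
I12 = GOTO(I11,S) : [S → s;S.]
GOTO(I8,d) = I4, GOTO(I11,s) = I7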
Three methods

SLR (Simple LR) - C0, FOLLOW
CLR (Canonical LR) - C1
LALR (Lookahead LR) - ① C1
                      ② C0, lookahead
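Parsing Table
[Figure: the rows are states 0, 1, 2, 3, …. The ACTION table has one column per symbol in VT ∪ {$}, with entries shift, reduce, accept, or error. The GOTO table has one column per symbol in VN, with GOTO entries.]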
SLR method
::= the method of constructing the SLR parsing table from C0.

Constructing Algorithm: C0 = {I0,I1,I2,...,In}
1. ACTION[i,a] := "shift j"
if [A → α.aβ] ∈ Ii and GOTO(Ii,a) = Ij.
2. ACTION[i,a] := "reduce A → α", for all a ∈ FOLLOW(A)
if [A → α.] ∈ Ii and A ≠ S'.
3. ACTION[i,$] := "accept" if [S' → S.] ∈ Ii.
4. GOTO[i,A] := j if GOTO(Ii,A) = Ij.
5. "error" for all undefined entries; the initial state is i if [S' → .S] ∈ Ii.
• Conflicts involving reduce items are resolved using FOLLOW.

G : 0. A → L     (A : ACCEPT, L : LIST, E : ELEMENT)
    1. L → L , E
    2. L → E
    3. E → a
FOLLOW(A) = {$}
FOLLOW(L) = {,,$}
FOLLOW(E) = {,,$}

I0 : [A → .L], [L → .L,E], [L → .E], [E → .a]
I1 = GOTO(I0,L) : [A → L.], [L → L.,E]
I2 = GOTO(I0,E) : [L → E.]
I3 = GOTO(I0,a) = GOTO(I4,a) : [E → a.]
I4 = GOTO(I1,,) : [L → L,.E], [E → .a]
I5 = GOTO(I4,E) : [L → L,E.]

Parsing Table :
          ACTION Table        GOTO Table
state     a     ,     $       L     E
I0        s3                  1     2
I1              s4    acc
I2              r2    r2
I3              r3    r3
I4        s3                        5
I5              r1    r1
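
The five rules of the SLR construction can be checked against this table with a short program. The sketch below makes several assumptions: items are (left side, right side, dot position) triples, the FOLLOW sets are copied from the slide instead of being computed, and the symbol order in `SYMBOLS` is chosen so that the state numbers match I0 to I5. All function and variable names are illustrative.

```python
GRAMMAR = {"A": [("L",)], "L": [("L", ",", "E"), ("E",)], "E": [("a",)]}
NUMBER = {("A", ("L",)): 0, ("L", ("L", ",", "E")): 1,
          ("L", ("E",)): 2, ("E", ("a",)): 3}
FOLLOW = {"A": {"$"}, "L": {",", "$"}, "E": {",", "$"}}   # copied from the slide
SYMBOLS = ["L", "E", "a", ","]        # assumed order; it fixes the state numbers


def closure(items, grammar):
    result, work = set(items), list(items)
    while work:
        _, rhs, dot = work.pop()
        if dot < len(rhs) and rhs[dot] in grammar:
            for gamma in grammar[rhs[dot]]:
                new_item = (rhs[dot], gamma, 0)
                if new_item not in result:
                    result.add(new_item)
                    work.append(new_item)
    return frozenset(result)


def goto(items, symbol, grammar):
    kernel = {(lhs, rhs, dot + 1) for lhs, rhs, dot in items
              if dot < len(rhs) and rhs[dot] == symbol}
    return closure(kernel, grammar)


def slr_table(grammar, start):
    """Build (ACTION, GOTO) dictionaries; missing entries mean error (rule 5)."""
    states = [closure({(start, grammar[start][0], 0)}, grammar)]
    action, goto_table = {}, {}

    def set_action(key, value):
        # Two different values for one entry mean the grammar is not SLR.
        if action.setdefault(key, value) != value:
            raise ValueError(f"conflict at {key}: {action[key]} vs {value}")

    i = 0
    while i < len(states):
        for x in SYMBOLS:
            target = goto(states[i], x, grammar)
            if not target:
                continue
            if target not in states:
                states.append(target)
            j = states.index(target)
            if x in grammar:
                goto_table[(i, x)] = j                  # rule 4
            else:
                set_action((i, x), f"s{j}")             # rule 1
        for lhs, rhs, dot in states[i]:
            if dot == len(rhs):                         # reduce item
                if lhs == start:
                    set_action((i, "$"), "acc")         # rule 3
                else:
                    for a in FOLLOW[lhs]:               # rule 2
                        set_action((i, a), f"r{NUMBER[(lhs, rhs)]}")
        i += 1
    return action, goto_table


ACTION, GOTO = slr_table(GRAMMAR, "A")
for key in sorted(ACTION):
    print(key, ACTION[key])
print(GOTO)
```

The `set_action` helper raises an error when two rules claim the same entry, which is how a shift-reduce or reduce-reduce conflict would show up for a grammar that is not SLR.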