Transcript Algorithm
(American Heritage Dict.)
Parse: v.
To break (a sentence) down into its component parts of
speech with an explanation of the form, function, and
syntactical relationship of each part.
the dog loves the cat
× the loves dog the cat
× the cat the dog loves
4/7/2015
IT 327
The most practical Parsers:
Predictive parser:
No backtracking.
1. input (token string)
2. Stacks, parsing table
3. output (syntax tree, intermediate codes)
Two kinds of predictive parsers:
Top-Down
The syntax tree is built up from the root
Example: LL(1) parser
Left to right scanning
Leftmost derivations
1 symbol look-ahead
Bottom-Up
The syntax tree is built up from the leaves
Example: LR(1) parser
Left to right scanning
Rightmost derivations (in reverse)
1 symbol look-ahead
LL(1) Grammar
1. S → ASb
2. S → C
3. A → a
4. C → cC
5. C → ε

A left-most derivation of aaaccbbb:
S ⇒ ASb ⇒ aSb ⇒ aASbb ⇒ aaSbb ⇒ aaASbbb ⇒ aaaSbbb ⇒ aaaCbbb ⇒ aaacCbbb ⇒ aaaccCbbb ⇒ aaaccbbb

LL(1) Parsing Table ($ is the end-of-file symbol)
      a  b  c  $
S     1  2  2  2
A     3
C        5  4  5
Recursive-descent Parser
The switch has one case for every terminal and the end-of-file symbol.
S():
  switch (token) {
    case a: A(); S(); get(b);
            build S → ASb;
            break;
    case b: C();
            build S → C;
            break;
    case c: C();
            build S → C;
            break;
    case $: C();
            build S → C;
            break;
  }
Recursive-descent Parser
A():
  switch (token) {
    case a: get(a);
            build A → a;
            break;
    case b: error;
            break;
    case c: error;
            break;
    case $: error;
            break;
  }
Recursive-descent Parser
C():
  switch (token) {
    case a: error;
            break;
    case b: build C → ε;
            break;
    case c: get(c); C();
            build C → cC;
            break;
    case $: build C → ε;
            break;
  }
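The three procedures S(), A(), and C() can be combined into one runnable sketch. The Python below is only an illustration, not the course's reference code: the names Parser, ParseError, and parse are assumptions made for this example, and instead of building tree nodes it records the number of each production as it is chosen, which yields the left-most derivation.

```python
class ParseError(Exception):
    """Raised when the input does not match the grammar."""


class Parser:
    """Recursive-descent parser for S -> ASb | C, A -> a, C -> cC | epsilon."""

    def __init__(self, text):
        # '$' plays the role of the end-of-file symbol.
        self.tokens = list(text) + ["$"]
        self.pos = 0
        self.rules = []  # production numbers, in the order they are chosen

    @property
    def token(self):
        return self.tokens[self.pos]

    def get(self, expected):
        # Consume one terminal, or fail if the look-ahead differs.
        if self.token != expected:
            raise ParseError(f"expected {expected!r}, found {self.token!r}")
        self.pos += 1

    def S(self):
        if self.token == "a":                    # rule 1: S -> A S b
            self.rules.append(1)
            self.A()
            self.S()
            self.get("b")
        elif self.token in ("b", "c", "$"):      # rule 2: S -> C
            self.rules.append(2)
            self.C()
        else:
            raise ParseError(f"unexpected {self.token!r} in S")

    def A(self):
        if self.token == "a":                    # rule 3: A -> a
            self.rules.append(3)
            self.get("a")
        else:
            raise ParseError(f"unexpected {self.token!r} in A")

    def C(self):
        if self.token == "c":                    # rule 4: C -> c C
            self.rules.append(4)
            self.get("c")
            self.C()
        elif self.token in ("b", "$"):           # rule 5: C -> epsilon
            self.rules.append(5)
        else:
            raise ParseError(f"unexpected {self.token!r} in C")


def parse(text):
    """Return the list of production numbers used to derive `text`."""
    parser = Parser(text)
    parser.S()
    parser.get("$")  # the whole input must be consumed
    return parser.rules


if __name__ == "__main__":
    print(parse("aaaccbbb"))
```

Each branch tests only the current token, so no call is ever undone; that is the "no backtracking" property of a predictive parser.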
LL(1) Parsing
      a  b  c  $
S     1  2  2  2
A     3
C        5  4  5

Input: aaaccbbb. A row that starts with input shows what is still unread; rows without input expand the first pending call without reading anything.

Remaining input   Pending calls
aaaccbbb          S();
                  A();S();get(b);
                  get(a);S();get(b);
aaccbbb           S();get(b);
                  A();S();get(b);get(b);
                  get(a);S();get(b);get(b);
accbbb            S();get(b);get(b);
                  A();S();get(b);get(b);get(b);
                  get(a);S();get(b);get(b);get(b);
ccbbb             S();get(b);get(b);get(b);
                  C();get(b);get(b);get(b);
                  get(c);C();get(b);get(b);get(b);
cbbb              C();get(b);get(b);get(b);
                  get(c);C();get(b);get(b);get(b);
bbb               C();get(b);get(b);get(b);
                  get(b);get(b);get(b);
bb                get(b);get(b);
b                 get(b);
(empty)           (no pending calls: accept)
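The pending calls in the trace behave like a stack, so the same parse can be driven directly by the parsing table. The following Python is a minimal sketch under that reading; the names GRAMMAR, TABLE, and ll1_parse are hypothetical, and the function returns the production numbers it applies rather than a syntax tree.

```python
# Productions, numbered as on the slides. An empty body stands for epsilon.
GRAMMAR = {
    1: ("S", ["A", "S", "b"]),
    2: ("S", ["C"]),
    3: ("A", ["a"]),
    4: ("C", ["c", "C"]),
    5: ("C", []),
}

# LL(1) parsing table: (nonterminal, look-ahead) -> production number.
TABLE = {
    ("S", "a"): 1, ("S", "b"): 2, ("S", "c"): 2, ("S", "$"): 2,
    ("A", "a"): 3,
    ("C", "b"): 5, ("C", "c"): 4, ("C", "$"): 5,
}

NONTERMINALS = {"S", "A", "C"}


def ll1_parse(text):
    """Parse `text` with an explicit stack; return the productions applied."""
    tokens = list(text) + ["$"]
    pos = 0
    stack = ["$", "S"]          # start symbol on top of the end marker
    applied = []
    while stack:
        top = stack.pop()
        look = tokens[pos]
        if top in NONTERMINALS:
            rule = TABLE.get((top, look))
            if rule is None:    # empty table entry: syntax error
                raise ValueError(f"no rule for {top} on {look!r}")
            applied.append(rule)
            # Push the body reversed so its first symbol ends up on top.
            stack.extend(reversed(GRAMMAR[rule][1]))
        else:
            if top != look:
                raise ValueError(f"expected {top!r}, found {look!r}")
            pos += 1            # matched a terminal (or the final '$')
    return applied


if __name__ == "__main__":
    print(ll1_parse("aaaccbbb"))
```

Popping a nonterminal corresponds to a row in the trace without input, and matching a terminal corresponds to a get(...) call that shortens the remaining input.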
LL(1) Grammar
A grammar that has an LL(1) parsing table,
i.e., a parsing table with no conflicts (at most one rule in each entry).
LL(1) grammars allow ε-productions.
Not every CFG is an LL(1) grammar
<stmt> ::= <if-stmt> | s1 | s2
<if-stmt> ::= if <expr> then <stmt> else <stmt>
| if <expr> then <stmt>
<expr> ::= e1 | e2
if e1 then if e2 then s1 else s2
Reading 1 (the else belongs to the inner if):
if (a > 2)
    if (b > 1)
        b++;
    else
        a++;
Reading 2 (the else belongs to the outer if):
if (a > 2)
    if (b > 1)
        b++;
else
    a++;
The recursive-descent parser does not work for
every CFG
E():
  switch (token) {
    case id: E();   // rule 1 (E → E+T) calls E() again before any input is consumed
    ...
    ...
    ...
  }
1. E → E+T
2. E → T
3. T → T*F
4. T → F
5. F → (E)
6. F → id
Input: id+id*id
Left-recursions: rules 1 and 3
A left-recursive grammar
1. A → Aα
2. A → β
Left-recursions: A ⇒ Aα ⇒ Aαα ⇒ ... ⇒ βα...α

Remove left-recursion
1. A → βA’
2. A’ → αA’
3. A’ → ε
Both grammars generate β followed by zero or more α's.
Eliminating left-recursions
Left-recursive grammar:
1. E → E+T
2. E → T
3. T → T*F
4. T → F
5. F → (E)
6. F → id

Equivalent grammar without left-recursion:
1. E → T E’
2. E’ → + T E’
3. E’ → ε
4. T → F T’
5. T’ → * F T’
6. T’ → ε
7. F → (E)
8. F → id
An Algorithm for Eliminating immediate left-recursions
Given a CFG G, let A be one of its non-terminal symbols such that A → Aα
1. Add a new non-terminal symbol A’ to G;
2. For each production A → β such that A is not the 1st symbol in β, add A → βA’ to G (it takes the place of A → β);
3. For each production A → Aα, replace it by A’ → αA’;
4. Add A’ → ε to G;
Result:
1. A → βA’
2. A’ → αA’
3. A’ → ε
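The four steps can be written as a short function. This is a sketch under stated assumptions, not the course's code: a grammar is represented as a dict from a nonterminal to a list of production bodies (each body a list of symbols, the empty list standing for ε), the new symbol is named by appending an apostrophe, and that name is assumed not to be in use already. The function name eliminate_immediate is hypothetical.

```python
def eliminate_immediate(grammar, a):
    """Return a copy of `grammar` without immediate left recursion on `a`.

    `grammar` maps a nonterminal to a list of bodies; a body is a list of
    symbols and [] stands for epsilon.
    """
    # Bodies of the form a alpha: keep only the alpha part.
    alphas = [body[1:] for body in grammar[a] if body and body[0] == a]
    # Bodies beta that do not start with a.
    betas = [body for body in grammar[a] if not body or body[0] != a]

    result = dict(grammar)
    if not alphas:
        return result                      # nothing to eliminate

    a_prime = a + "'"                      # step 1: new nonterminal A'
    # Step 2: A -> beta A'
    result[a] = [beta + [a_prime] for beta in betas]
    # Steps 3 and 4: A' -> alpha A' and A' -> epsilon
    result[a_prime] = [alpha + [a_prime] for alpha in alphas] + [[]]
    return result


if __name__ == "__main__":
    expr = {
        "E": [["E", "+", "T"], ["T"]],
        "T": [["T", "*", "F"], ["F"]],
        "F": [["(", "E", ")"], ["id"]],
    }
    step = eliminate_immediate(expr, "E")
    step = eliminate_immediate(step, "T")
    for head, bodies in step.items():
        for body in bodies:
            print(head, "->", " ".join(body) if body else "ε")
```

Applied to E and then to T, the function reproduces the eight-rule expression grammar obtained by hand.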
Indirect left-recursions
1. S → Aa
2. S → b
3. A → Sd
4. A → e
Example: bdada
S ⇒ Aa ⇒ Sda ⇒ Aada ⇒ Sdada ⇒ bdada
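Indirect left-recursions
find all immediate left recursions
repeat
  if any, remove the last non-terminal symbol Z with rule Z → X…
  find all immediate left recursions

Original grammar:
1. S → Aa
2. S → b
3. A → Ac
4. A → Sd
5. A → e

After removing the immediate left-recursion on A:
1. S → Aa
2. S → b
3. A → SdA’
4. A → eA’
5. A’ → cA’
6. A’ → ε

After substituting A away:
1. S → SdA’a
2. S → eA’a
3. S → b
4. A’ → cA’
5. A’ → ε

After removing the immediate left-recursion on S:
1. S → eA’aS’
2. S → bS’
3. S’ → dA’aS’
4. S’ → ε
5. A’ → cA’
6. A’ → ε

Pattern used: A → Aα and A → β become A → βA’, A’ → αA’, A’ → ε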
An Algorithm for Eliminating left-recursions
Given a CFG G, let A1, A2, ..., An be its nonterminal symbols
for i := n down to 1 do {
  for j := 1 to i-1 do {
    For each production Ai → Aj ω do {   // find one level of indirection
      Remove Ai → Aj ω by
      For each production Aj → β, add Ai → βω to the grammar;
    }
  } // end for j
  Eliminate the immediate left-recursion caused by Ai
} // end for i
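A runnable sketch of the idea follows. Note the assumptions: it processes the nonterminals in ascending order of a caller-supplied list, which is the ordering found in standard textbook presentations and not the loop bounds printed above; grammars are dicts from a nonterminal to a list of bodies (lists of symbols, [] for ε); primed names are assumed to be unused; and the function name eliminate_left_recursion is hypothetical.

```python
def eliminate_left_recursion(grammar, order):
    """Remove direct and indirect left recursion.

    `order` lists the nonterminals A1..An; for each Ai, leading Aj (j < i)
    are substituted away and then immediate left recursion on Ai is removed.
    """
    g = {head: [list(body) for body in bodies]
         for head, bodies in grammar.items()}
    for i, ai in enumerate(order):
        for aj in order[:i]:
            expanded = []
            for body in g[ai]:
                if body and body[0] == aj:
                    # One level of indirection: Ai -> Aj w becomes Ai -> beta w
                    expanded.extend(beta + body[1:] for beta in g[aj])
                else:
                    expanded.append(body)
            g[ai] = expanded
        # Immediate left recursion on Ai.
        alphas = [body[1:] for body in g[ai] if body and body[0] == ai]
        betas = [body for body in g[ai] if not body or body[0] != ai]
        if alphas:
            prime = ai + "'"
            g[ai] = [beta + [prime] for beta in betas]
            g[prime] = [alpha + [prime] for alpha in alphas] + [[]]
    return g


if __name__ == "__main__":
    grammar = {
        "S": [["A", "a"], ["b"]],
        "A": [["A", "c"], ["S", "d"], ["e"]],
    }
    result = eliminate_left_recursion(grammar, ["A", "S"])
    for head, bodies in result.items():
        for body in bodies:
            print(head, "->", " ".join(body) if body else "ε")
```

With the order A, S the rules for S, S' and A' match the hand-worked result for the grammar S → Aa | b, A → Ac | Sd | e; the rules for A are still present in the output but can no longer be reached from S.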
A Grammar for if statements
1. S → iCtSE
2. S → a
3. E → eS
4. E → ε
5. C → b

      a    b    e    i    t    $
S     2              1
E               3,4            4
C          5

Is it an LL(1) grammar?
Is there an LL(1) parsing table for it?
No!
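Whether a table has a conflict can be checked mechanically. The sketch below computes FIRST and FOLLOW sets by fixed-point iteration and fills the table using the usual textbook construction, which the slides do not spell out, so treat it as an assumption; the names first_of and build_table are hypothetical. Rules are (head, body) pairs numbered from 1 in list order, and [] stands for ε.

```python
EPS = "ε"
END = "$"


def first_of(seq, first, nonterminals):
    """FIRST set of a sequence of grammar symbols."""
    out = set()
    for sym in seq:
        f = first[sym] if sym in nonterminals else {sym}
        out |= f - {EPS}
        if EPS not in f:
            return out
    out.add(EPS)          # every symbol in seq can derive epsilon
    return out


def build_table(rules, start):
    """Map (nonterminal, look-ahead) to the list of applicable rule numbers."""
    nts = {head for head, _ in rules}
    first = {nt: set() for nt in nts}
    follow = {nt: set() for nt in nts}
    follow[start].add(END)
    changed = True
    while changed:        # iterate until FIRST and FOLLOW stop growing
        changed = False
        for head, body in rules:
            f = first_of(body, first, nts)
            if not f <= first[head]:
                first[head] |= f
                changed = True
            for i, sym in enumerate(body):
                if sym not in nts:
                    continue
                tail = first_of(body[i + 1:], first, nts)
                add = tail - {EPS}
                if EPS in tail:
                    add |= follow[head]
                if not add <= follow[sym]:
                    follow[sym] |= add
                    changed = True
    table = {}
    for number, (head, body) in enumerate(rules, start=1):
        f = first_of(body, first, nts)
        lookaheads = f - {EPS}
        if EPS in f:
            lookaheads |= follow[head]
        for terminal in lookaheads:
            table.setdefault((head, terminal), []).append(number)
    return table


IF_RULES = [
    ("S", ["i", "C", "t", "S", "E"]),
    ("S", ["a"]),
    ("E", ["e", "S"]),
    ("E", []),
    ("C", ["b"]),
]

if __name__ == "__main__":
    table = build_table(IF_RULES, "S")
    conflicts = {key: nums for key, nums in table.items() if len(nums) > 1}
    print(conflicts)
```

For the if-statement grammar the only entry with more than one rule is (E, e), which holds rules 3 and 4, in agreement with the table above.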
A Grammar for if statements
Why is there a conflict?
Input: ibtibtae……
With e as the next input symbol, both rule 3 and rule 4 can be applied to E, so table entry [E, e] holds 3,4.

Choosing rule 4 (E → ε) for the inner E:
S
...
i b t S E…
i b t ibtSE E…
i b t ibtaE E…
4: i b t ibta E…
i b t ibta eS…

Choosing rule 3 (E → eS) for the inner E:
S
...
i b t S E…
i b t ibtSE E…
i b t ibtaE E…
3: i b t ibtaeS E…
A Grammar for if statements
Can we have an unambiguous equivalent grammar for this grammar?
Yes!
In general, No! Some inherently ambiguous languages exist.
Can we write a program to test whether a given grammar is ambiguous?
No!
Can we write a program to get an unambiguous equivalent grammar from any grammar of a language that is known to be not inherently ambiguous?
No!
Is there an LL(2) Grammar? Yes!
We need to look two symbols ahead in order to determine which rule should be used.
{ a^m b^n c | m ≥ 1 and n ≥ 0 }
1. S → AB
2. A → aA
3. A → a
4. B → bB
5. B → c

LL(2) Parsing Table
S:  a → 1
A:  aa → 2,  ab → 3,  ac → 3
B:  b → 4,  c → 5

Example input: a a a a a b b b b c
LL(2) Parsing
Input: a a a b c

Remaining input   Pending calls
a a a b c         S();
a a a b c         A();B();
a a a b c         get(a);A();B();
a a b c           A();B();
a a b c           get(a);A();B();
a b c             A();B();
a b c             get(a);B();
b c               B();
b c               get(b);B();
c                 B();
c                 get(c);
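The only place where two symbols of look-ahead are needed is inside A(). A minimal Python sketch, with the hypothetical name parse_ll2 and with the input padded by two end markers so that the second look-ahead symbol always exists (an implementation choice made here, not something stated on the slides):

```python
def parse_ll2(text):
    """Recursive-descent parser for S -> AB, A -> aA | a, B -> bB | c.

    Returns the production numbers in the order they are chosen.
    """
    tokens = list(text) + ["$", "$"]   # padding for the 2-symbol look-ahead
    pos = 0
    rules = []

    def get(expected):
        nonlocal pos
        if tokens[pos] != expected:
            raise ValueError(f"expected {expected!r}, found {tokens[pos]!r}")
        pos += 1

    def S():
        if tokens[pos] == "a":          # rule 1: S -> A B
            rules.append(1)
            A()
            B()
        else:
            raise ValueError(f"unexpected {tokens[pos]!r} in S")

    def A():
        look = tokens[pos] + tokens[pos + 1]   # two symbols ahead
        if look == "aa":                # rule 2: A -> a A
            rules.append(2)
            get("a")
            A()
        elif look in ("ab", "ac"):      # rule 3: A -> a
            rules.append(3)
            get("a")
        else:
            raise ValueError(f"unexpected {look!r} in A")

    def B():
        if tokens[pos] == "b":          # rule 4: B -> b B
            rules.append(4)
            get("b")
            B()
        elif tokens[pos] == "c":        # rule 5: B -> c
            rules.append(5)
            get("c")
        else:
            raise ValueError(f"unexpected {tokens[pos]!r} in B")

    S()
    get("$")                            # the whole input must be consumed
    return rules


if __name__ == "__main__":
    print(parse_ll2("aaabc"))
```

With one symbol of look-ahead, A() could not tell rule 2 from rule 3, because both bodies start with a.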
Is there an LL(1) grammar equivalent to the following LL(2) grammar?
Yes
{ a^m b^n c | m ≥ 1 and n ≥ 0 }

LL(2) grammar:
1. S → AB
2. A → aA
3. A → a
4. B → bB
5. B → c

Equivalent LL(1) grammar:
1. S → aAB
2. A → aA
3. A → ε
4. B → bB
5. B → c

Example input: a a a a a b b b b c
No left-recursive grammar is an LL(k) grammar
But we can effectively find an equivalent one

Left-recursive grammar:
1. S → SA
2. S → a
3. A → b
S ⇒ SA ⇒ SAA ⇒ SAAA ⇒ SAAAA ⇒ ... ⇒ aAAAAA ⇒ .... ⇒ abbbbb

Equivalent grammar without left-recursion:
1. S → aS’
2. S’ → AS’
3. S’ → ε
4. A → b

Left-recursive grammar:
1. E → E+T
2. E → T
3. T → T*F
4. T → F
5. F → (E)
6. F → id

Equivalent grammar without left-recursion:
1. E → T E’
2. E’ → + T E’
3. E’ → ε
4. T → F T’
5. T’ → * F T’
6. T’ → ε
7. F → ( E )
8. F → id

Are we happy with this?
Does every LL(2) grammar have an equivalent LL(1) grammar? No

LL(2) grammar
1. S → aSA
2. S → ε
3. A → abS
4. A → c
no equivalent LL(1) grammar

LL(k) grammar, k ≥ 2
1. S → aSA
2. S → ε
3. A → a^(k-1)bS
4. A → c
no equivalent LL(k-1) grammar
Kurki-Suonio [1969]

LL(1) ⊂ LL(2) ⊂ LL(3) ⊂ ..... ⊂ LL(k) ⊂ LL(k+1) ⊂ ...
LL(k) grammar, k ≥ 2
1. S → aSA
2. S → ε
3. A → a^(k-1)bS
4. A → c
(Kurki-Suonio [1969])
This grammar is unambiguous, as every LL(k) grammar is.
Is there an unambiguous CFG that is not an LL(k) grammar?
Yes
There exists a DCFL that is not LL(k) -- Stearns [1970]
{ a^n | n ≥ 0 } ∪ { a^n b^n | n ≥ 0 }
LL(1) Parser Implementation
1. E → T E’
2. E’ → + T E’
3. E’ → ε
4. T → F T’
5. T’ → * F T’
6. T’ → ε
7. F → (E)
8. F → n

      n    (    )    +    *    $
E     1    1
E’              3    2         3
T     4    4
T’              6    6    5    6
F     8    7

p.s. Let n be any positive integer less than 32767
Programming Assignment
Details will be announced later.