Context-Free Languages
Hinrich Schütze
CIS, LMU, 2013-11-25
Slides based on RPI CSCI 2400
Thanks to Costas Busch
Take-away
Definition context-free grammar
Definition context-free language
Derivation, sentential form, sentence
Derivation trees
Ambiguity
Context-free grammars for natural language
2
Terminology
For our purposes in this class:
Context-free grammar = constituency grammar = phrase structure grammar
3
Grammars
Grammars express languages
Example:
the English language
sentence → noun_phrase predicate
noun_phrase → article noun
predicate → verb
4
article → a
article → the
noun → cat
noun → dog
verb → runs
verb → walks
5
A derivation of “the dog walks”:
sentence ⇒ noun_phrase predicate
⇒ noun_phrase verb
⇒ article noun verb
⇒ the noun verb
⇒ the dog verb
⇒ the dog walks
6
A derivation of “a cat runs”:
sentence ⇒ noun_phrase predicate
⇒ noun_phrase verb
⇒ article noun verb
⇒ a noun verb
⇒ a cat verb
⇒ a cat runs
7
Language of the grammar:
L = { “a cat runs”,
“a cat walks”,
“the cat runs”,
“the cat walks”,
“a dog runs”,
“a dog walks”,
“the dog runs”,
“the dog walks” }
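The eight sentences above can be checked mechanically. The following is a minimal sketch, not part of the original slides: the names RULES and expand are illustrative, and the approach only works because this toy grammar has no recursive rules.

```python
from itertools import product

# Toy grammar from the slides: variable -> list of alternative right-hand sides.
RULES = {
    "sentence": [["noun_phrase", "predicate"]],
    "noun_phrase": [["article", "noun"]],
    "predicate": [["verb"]],
    "article": [["a"], ["the"]],
    "noun": [["cat"], ["dog"]],
    "verb": [["runs"], ["walks"]],
}

def expand(symbol):
    """Return every terminal word sequence derivable from `symbol`."""
    if symbol not in RULES:          # a terminal derives only itself
        return [[symbol]]
    results = []
    for rhs in RULES[symbol]:
        # Combine the expansions of each symbol on the right-hand side.
        for parts in product(*(expand(s) for s in rhs)):
            results.append([word for part in parts for word in part])
    return results

language = sorted(" ".join(words) for words in expand("sentence"))
for sentence in language:
    print(sentence)
```

Running it prints the same eight sentences listed in L.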
8
Notation
Production rules:
noun → cat
noun → dog
noun is a variable; cat and dog are terminals
9
Another Example
Grammar:
S → aSb
S → λ
Derivation of sentence ab:
S ⇒ aSb ⇒ ab
(first step: S → aSb; second step: S → λ)
10
Language?
11
Grammar:
S → aSb
S → λ
Derivation of sentence aabb:
S ⇒ aSb ⇒ aaSbb ⇒ aabb
(first two steps: S → aSb; last step: S → λ)
12
Other derivations:
S ⇒ aSb ⇒ aaSbb ⇒ aaaSbbb ⇒ aaabbb
S ⇒ aSb ⇒ aaSbb ⇒ aaaSbbb ⇒ aaaaSbbbb ⇒ aaaabbbb
13
Language of the grammar
S → aSb
S → λ
L = {a^n b^n : n ≥ 0}
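As an illustration of why this grammar yields exactly the strings a^n b^n, here is a small sketch (the function names derive and in_language are assumptions made for this example, and the empty Python string stands for λ):

```python
def derive(n):
    """Apply S -> aSb n times, then S -> λ; return every sentential form."""
    form = "S"
    steps = [form]
    for _ in range(n):
        form = form.replace("S", "aSb")   # the form contains exactly one S
        steps.append(form)
    steps.append(form.replace("S", ""))   # S -> λ removes the variable
    return steps

def in_language(w):
    """True iff w = a^n b^n for some n >= 0."""
    n = len(w) // 2
    return w == "a" * n + "b" * n

print(" => ".join(step or "λ" for step in derive(2)))
```

For n = 2 this reproduces the derivation of aabb shown earlier.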
14
More Notation
Grammar
G = (V, T, S, P)
V: set of variables
T: set of terminal symbols
S: start variable
P: set of production rules
15
Example
Grammar G:
S → aSb
S → λ
G = (V, T, S, P)
V = {S}
T = {a, b}
P = {S → aSb, S → λ}
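The four components map directly onto a small data structure. The sketch below is one possible encoding, not taken from the slides: Grammar, sentences and max_len are hypothetical names, symbols are assumed to be single characters, and the empty string stands for λ. The search is bounded by counting terminals, which terminates for grammars like this one where every recursive rule adds a terminal; a rule such as S → SS would need an additional bound.

```python
from collections import deque
from typing import NamedTuple

class Grammar(NamedTuple):
    V: frozenset   # variables
    T: frozenset   # terminal symbols
    S: str         # start variable
    P: tuple       # productions as (left, right) pairs; "" stands for λ

G = Grammar(
    V=frozenset({"S"}),
    T=frozenset({"a", "b"}),
    S="S",
    P=(("S", "aSb"), ("S", "")),
)

def sentences(g, max_len):
    """Breadth-first search over sentential forms (leftmost expansion only);
    collect the terminal strings of length <= max_len."""
    seen, found = {g.S}, set()
    queue = deque([g.S])
    while queue:
        form = queue.popleft()
        # Index of the leftmost variable, or None if the form is a sentence.
        i = next((k for k, c in enumerate(form) if c in g.V), None)
        if i is None:
            found.add(form)
            continue
        for left, right in g.P:
            if left == form[i]:
                new = form[:i] + right + form[i + 1:]
                # Terminals never disappear, so too many of them is a dead end.
                if sum(c in g.T for c in new) <= max_len and new not in seen:
                    seen.add(new)
                    queue.append(new)
    return sorted(found, key=len)

print(sentences(G, 4))
```

With max_len = 4 the search finds λ (printed as an empty string), ab and aabb.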
16
More Notation
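Sentential form:
A string derived from the start variable that may contain both variables and terminals
Example:
S ⇒ aSb ⇒ aaSbb ⇒ aaaSbbb ⇒ aaabbb
aSb, aaSbb and aaaSbbb are sentential forms; aaabbb contains only terminals and is a sentence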
17
We write:
S ⇒* aaabbb
Instead of:
S ⇒ aSb ⇒ aaSbb ⇒ aaaSbbb ⇒ aaabbb
18
In general we write:
w1 ⇒* wn
if:
w1 ⇒ w2 ⇒ w3 ⇒ ... ⇒ wn
19
By default:
w ⇒* w
20
Example
Grammar
S → aSb
S → λ
Derivations
S ⇒* λ
S ⇒* ab
S ⇒* aabb
S ⇒* aaabbb
21
Example
Grammar
S → aSb
S → λ
Derivations
S ⇒* aaSbb
aaSbb ⇒* aaaaaSbbbbb
22
Another Grammar Example
Grammar G:
S → Ab
A → aAb
A → λ
23
Language?
24
Grammar G:
S → Ab
A → aAb
A → λ
Derivations:
25
More Derivations
S ⇒ Ab ⇒ aAbb ⇒ aaAbbb ⇒ aaaAbbbb ⇒ aaaaAbbbbb ⇒ aaaabbbbb
S ⇒* aaaabbbbb
S ⇒* aaaaaabbbbbbb
S ⇒* a^n b^n b
26
Language of a Grammar
For a grammar G with start variable S:
L(G) = {w : S ⇒* w}
where w is a string of terminals
27
Example
For grammar G:
S → Ab
A → aAb
A → λ
L(G) = {a^n b^n b : n ≥ 0}
Since:
S ⇒* a^n b^n b
28
A Convenient Notation
A → aAb
A → λ
is abbreviated as
A → aAb | λ
Likewise,
article → a
article → the
is abbreviated as
article → a | the
29
Revisit first grammar
A context-free grammar G:
S → aSb
S → λ
A derivation:
S ⇒ aSb ⇒ aaSbb ⇒ aabb
30
A context-free grammar G:
S → aSb
S → λ
Another derivation:
S ⇒ aSb ⇒ aaSbb ⇒ aaaSbbb ⇒ aaabbb
31
S → aSb
S → λ
L(G) = {a^n b^n : n ≥ 0}
Describes parentheses:
(((( ))))
32
Example
A context-free grammar G:
S → aSa
S → bSb
S → λ
33
Language?
34
A context-free grammar G:
S → aSa
S → bSb
S → λ
Another derivation:
S ⇒ aSa ⇒ abSba ⇒ abaSaba ⇒ abaaba
35
S → aSa
S → bSb
S → λ
L(G) = {w w^R : w ∈ {a, b}*}
(w^R is the reverse of w)
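To see how each choice of w fixes a derivation of w followed by its reverse, consider this sketch (derive_wwr is a name invented for the example; the empty string again plays the role of λ):

```python
def derive_wwr(w):
    """Derive w + reversed(w) with S -> aSa | bSb | λ, recording each step."""
    form, steps = "S", ["S"]
    for ch in w:
        if ch not in ("a", "b"):
            raise ValueError("w must be a string over {a, b}")
        # Use S -> aSa for 'a' and S -> bSb for 'b'.
        form = form.replace("S", ch + "S" + ch)
        steps.append(form)
    steps.append(form.replace("S", ""))   # finish with S -> λ
    return steps

print(" => ".join(derive_wwr("aba")))
```

For w = aba the steps match the derivation of abaaba on the previous slide.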
36
Example
A context-free grammar G:
S → aSb
S → SS
S → λ
37
Language?
38
A context-free grammar G:
S → aSb
S → SS
S → λ
Two derivations:
S ⇒ SS ⇒ aSbS ⇒ abS ⇒ abaSb ⇒ abab
S ⇒ SS ⇒ aSbS ⇒ abS ⇒ ab
39
S → aSb
S → SS
S → λ
L(G) = {w : n_a(w) = n_b(w), and n_a(v) ≥ n_b(v) in any prefix v}
Interpretation?
40
S → aSb
S → SS
S → λ
L(G) = {w : n_a(w) = n_b(w), and n_a(v) ≥ n_b(v) in any prefix v}
Describes matched parentheses:
() ((( ))) (( ))
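The two conditions in the set definition translate into a single left-to-right scan. A minimal sketch, with is_matched as an illustrative name and a read as "(" and b as ")":

```python
def is_matched(w):
    """n_a(w) == n_b(w), and n_a(v) >= n_b(v) for every prefix v of w."""
    balance = 0                 # n_a(v) - n_b(v) for the prefix read so far
    for ch in w:
        if ch == "a":
            balance += 1
        elif ch == "b":
            balance -= 1
        else:
            return False        # symbol outside {a, b}
        if balance < 0:         # some prefix has more b's than a's
            return False
    return balance == 0         # equal counts over the whole string

for w in ["abab", "aabb", "ba", "aab"]:
    print(w, is_matched(w))
```

The first two strings are accepted; ba fails the prefix condition and aab fails the count condition.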
41
Definition: Context-Free Grammars
Grammar G = (V, T, S, P)
V: variables
T: terminal symbols
S: start variable
P: productions of the form
A → x
where A is a variable and x is a string of variables and terminals
42
G = (V, T, S, P)
L(G) = {w : S ⇒* w, w ∈ T*}
43
Definition: Context-Free Languages
A language L is context-free if and only if there is a context-free grammar G with L = L(G)
44
Derivation Order
1. S → AB
2. A → aaA
3. A → λ
4. B → Bb
5. B → λ
Leftmost derivation (rules 1, 2, 3, 4, 5):
S ⇒ AB ⇒ aaAB ⇒ aaB ⇒ aaBb ⇒ aab
Rightmost derivation (rules 1, 4, 5, 2, 3):
S ⇒ AB ⇒ ABb ⇒ Ab ⇒ aaAb ⇒ aab
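The difference between the two orders is only which variable gets rewritten at each step. The sketch below replays a sequence of rule numbers either leftmost or rightmost; RULES, VARIABLES and derive are names chosen for this example, and the empty string stands for λ.

```python
# Numbered rules from the slide: number -> (left-hand side, right-hand side).
RULES = {1: ("S", "AB"), 2: ("A", "aaA"), 3: ("A", ""), 4: ("B", "Bb"), 5: ("B", "")}
VARIABLES = {"S", "A", "B"}

def derive(rule_numbers, leftmost=True):
    """Apply the rules in order, always rewriting the leftmost
    (or rightmost) variable of the current sentential form."""
    form, steps = "S", ["S"]
    for n in rule_numbers:
        left, right = RULES[n]
        positions = [i for i, c in enumerate(form) if c in VARIABLES]
        i = positions[0] if leftmost else positions[-1]
        if form[i] != left:
            raise ValueError(f"rule {n} does not apply to {form!r} at position {i}")
        form = form[:i] + right + form[i + 1:]
        steps.append(form)
    return steps

print(" => ".join(derive([1, 2, 3, 4, 5], leftmost=True)))
print(" => ".join(derive([1, 4, 5, 2, 3], leftmost=False)))
```

Both calls end in aab and reproduce the two derivations listed above.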
45
Language?
46
S → aAB
A → bBb
B → A | λ
Leftmost derivation:
S ⇒ aAB ⇒ abBbB ⇒ abAbB ⇒ abbBbbB ⇒ abbbbB ⇒ abbbb
Rightmost derivation:
S ⇒ aAB ⇒ aA ⇒ abBb ⇒ abAb ⇒ abbBbb ⇒ abbbb
47
Language?
48
Derivation Trees
49
S → AB
A → aaA | λ
B → Bb | λ
S ⇒ AB
Derivation tree so far, written as node(children): S(A, B)
50
S → AB
A → aaA | λ
B → Bb | λ
S ⇒ AB ⇒ aaAB
Derivation tree so far, written as node(children): S(A(a, a, A), B)
51
S → AB
A → aaA | λ
B → Bb | λ
S ⇒ AB ⇒ aaAB ⇒ aaABb
Derivation tree so far, written as node(children): S(A(a, a, A), B(B, b))
52
S → AB
A → aaA | λ
B → Bb | λ
S ⇒ AB ⇒ aaAB ⇒ aaABb ⇒ aaBb
Derivation tree so far, written as node(children): S(A(a, a, A(λ)), B(B, b))
53
S → AB
A → aaA | λ
B → Bb | λ
S ⇒ AB ⇒ aaAB ⇒ aaABb ⇒ aaBb ⇒ aab
Derivation tree, written as node(children): S(A(a, a, A(λ)), B(B(λ), b))
54
S → AB
A → aaA | λ
B → Bb | λ
S ⇒ AB ⇒ aaAB ⇒ aaABb ⇒ aaBb ⇒ aab
Derivation tree, written as node(children): S(A(a, a, A(λ)), B(B(λ), b))
Yield (leaves read left to right): aaλλb = aab
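Reading the yield off a tree is a left-to-right traversal of its leaves. The following sketch encodes the tree above as nested tuples; the encoding and the names leaf and tree_yield are assumptions made for illustration.

```python
# A node is (label, children); a leaf has an empty list of children.
def leaf(label):
    return (label, [])

# Derivation tree for S => AB => aaAB => aaABb => aaBb => aab.
tree = ("S", [
    ("A", [leaf("a"), leaf("a"), ("A", [leaf("λ")])]),
    ("B", [("B", [leaf("λ")]), leaf("b")]),
])

def tree_yield(node):
    """Concatenate the leaf labels from left to right; λ contributes nothing."""
    label, children = node
    if not children:
        return "" if label == "λ" else label
    return "".join(tree_yield(child) for child in children)

print(tree_yield(tree))
```

The result is aab, the sentence derived on the slide.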
55
Partial Derivation Trees
S → AB
A → aaA | λ
B → Bb | λ
S ⇒ AB
Partial derivation tree, written as node(children): S(A, B)
56
S ⇒ AB ⇒ aaAB
Partial derivation tree, written as node(children): S(A(a, a, A), B)
57
S ⇒ AB ⇒ aaAB
Partial derivation tree, written as node(children): S(A(a, a, A), B)
Yield: aaAB, a sentential form
58
Sometimes, derivation order doesn’t matter
Leftmost:
S ⇒ AB ⇒ aaAB ⇒ aaB ⇒ aaBb ⇒ aab
Rightmost:
S ⇒ AB ⇒ ABb ⇒ Ab ⇒ aaAb ⇒ aab
Same derivation tree, written as node(children): S(A(a, a, A(λ)), B(B(λ), b))
59
Ambiguity
60
E → E + E | E * E | (E) | a
String: a + a * a
Leftmost derivation:
E ⇒ E + E ⇒ a + E ⇒ a + E * E ⇒ a + a * E ⇒ a + a * a
Derivation tree, written as node(children): E(E(a), +, E(E(a), *, E(a)))
61
E → E + E | E * E | (E) | a
String: a + a * a
Leftmost derivation:
E ⇒ E * E ⇒ E + E * E ⇒ a + E * E ⇒ a + a * E ⇒ a + a * a
Derivation tree, written as node(children): E(E(E(a), +, E(a)), *, E(a))
62
E → E + E | E * E | (E) | a
String: a + a * a
Two derivation trees, written as node(children):
E(E(a), +, E(E(a), *, E(a)))
E(E(E(a), +, E(a)), *, E(a))
63
The grammar E → E + E | E * E | (E) | a is ambiguous:
the string a + a * a has two derivation trees, written as node(children):
E(E(a), +, E(E(a), *, E(a)))
E(E(E(a), +, E(a)), *, E(a))
64
The grammar E → E + E | E * E | (E) | a is ambiguous:
the string a + a * a has two leftmost derivations
E ⇒ E + E ⇒ a + E ⇒ a + E * E ⇒ a + a * E ⇒ a + a * a
E ⇒ E * E ⇒ E + E * E ⇒ a + E * E ⇒ a + a * E ⇒ a + a * a
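One way to confirm the ambiguity is to count derivation trees directly. The sketch below is an illustration written for this grammar only (count_trees is a hypothetical name): it tries every rule on every token span and memoizes the counts.

```python
from functools import lru_cache

def count_trees(tokens):
    """Count derivation trees of E -> E + E | E * E | ( E ) | a for `tokens`."""
    tokens = tuple(tokens)

    @lru_cache(maxsize=None)
    def count(i, j):
        # Number of derivation trees of E whose yield is tokens[i:j].
        if j - i == 1:
            return 1 if tokens[i] == "a" else 0          # E -> a
        total = 0
        if j - i >= 3 and tokens[i] == "(" and tokens[j - 1] == ")":
            total += count(i + 1, j - 1)                 # E -> ( E )
        for k in range(i + 1, j - 1):
            if tokens[k] in ("+", "*"):                  # E -> E + E, E -> E * E
                total += count(i, k) * count(k + 1, j)
        return total

    return count(0, len(tokens))

print(count_trees("a+a*a"))
print(count_trees("(a+a)*a"))
```

The first call reports two trees, matching the two leftmost derivations above; with explicit parentheses only one tree remains.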
65
Definition:
A context-free grammar G is ambiguous if some string w ∈ L(G) has two or more derivation trees
66
In other words:
A context-free grammar G is ambiguous if some string w ∈ L(G) has two or more leftmost derivations (or rightmost)
67
Why do we care about ambiguity?
Take a = 2 in a + a * a
The two derivation trees, written as node(children):
E(E(a), +, E(E(a), *, E(a)))
E(E(E(a), +, E(a)), *, E(a))
68
2 + 2 * 2
The two derivation trees, written as node(children):
E(E(2), +, E(E(2), *, E(2)))
E(E(E(2), +, E(2)), *, E(2))
69
Tree with + at the root: 2 + (2 * 2) = 2 + 4 = 6
Tree with * at the root: (2 + 2) * 2 = 4 * 2 = 8
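Evaluating each tree bottom-up makes the two results concrete. In this sketch the trees are reduced to nested tuples (operator, left, right); that encoding and the name evaluate are assumptions for the example.

```python
def evaluate(node):
    """Evaluate a tree given as an int leaf or a tuple (op, left, right)."""
    if isinstance(node, int):
        return node
    op, left, right = node
    if op == "+":
        return evaluate(left) + evaluate(right)
    return evaluate(left) * evaluate(right)

plus_at_root = ("+", 2, ("*", 2, 2))    # 2 + (2 * 2)
times_at_root = ("*", ("+", 2, 2), 2)   # (2 + 2) * 2

print(evaluate(plus_at_root), evaluate(times_at_root))
```

The same string therefore evaluates to 6 or 8, depending on which tree is chosen.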
70
Correct result:
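2 + 2 * 2 = 6
This is the tree with + at the root, written as node(children): E(E(2), +, E(E(2), *, E(2)))
The subtree for 2 * 2 evaluates to 4, and 2 + 4 = 6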
71
• Ambiguity is bad for programming languages
• We want to remove ambiguity
72
We fix the ambiguous grammar:
E → E + E | E * E | (E) | a
New non-ambiguous grammar:
E → E + T
E → T
T → T * F
T → F
F → (E)
F → a
73
Derivation of a + a * a:
E ⇒ E + T ⇒ T + T ⇒ F + T ⇒ a + T ⇒ a + T * F ⇒ a + F * F ⇒ a + a * F ⇒ a + a * a
Grammar:
E → E + T
E → T
T → T * F
T → F
F → (E)
F → a
Derivation tree, written as node(children): E(E(T(F(a))), +, T(T(F(a)), *, F(a)))
74
Unique derivation tree
a + a * a
Derivation tree, written as node(children): E(E(T(F(a))), +, T(T(F(a)), *, F(a)))
75
The grammar G:
E → E + T
E → T
T → T * F
T → F
F → (E)
F → a
is non-ambiguous:
every string w ∈ L(G) has a unique derivation tree
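Because each string has one tree, a parser for this grammar can be written with one function per variable. The sketch below is a hand-written recursive-descent parser, not something from the slides: the left-recursive rules E → E + T and T → T * F are handled with loops (a standard workaround that keeps + and * left-associative), the result is a simplified operator tree rather than the full derivation tree, and all function names are illustrative.

```python
def parse(tokens):
    """Parse tokens with E -> E+T | T, T -> T*F | F, F -> (E) | a."""
    pos = 0

    def peek():
        return tokens[pos] if pos < len(tokens) else None

    def eat(expected):
        nonlocal pos
        if peek() != expected:
            raise SyntaxError(f"expected {expected!r} at position {pos}")
        pos += 1

    def parse_e():                      # E -> T (+ T)*
        node = parse_t()
        while peek() == "+":
            eat("+")
            node = ("+", node, parse_t())
        return node

    def parse_t():                      # T -> F (* F)*
        node = parse_f()
        while peek() == "*":
            eat("*")
            node = ("*", node, parse_f())
        return node

    def parse_f():                      # F -> ( E ) | a
        if peek() == "(":
            eat("(")
            node = parse_e()
            eat(")")
            return node
        eat("a")
        return "a"

    tree = parse_e()
    if pos != len(tokens):
        raise SyntaxError(f"unexpected token at position {pos}")
    return tree

def evaluate(node, a=2):
    """Evaluate an operator tree, substituting the value `a` for the terminal a."""
    if node == "a":
        return a
    op, left, right = node
    if op == "+":
        return evaluate(left, a) + evaluate(right, a)
    return evaluate(left, a) * evaluate(right, a)

tree = parse(list("a+a*a"))
print(tree)
print(evaluate(tree))
```

For a + a * a the parser puts * below +, so with a = 2 the value is 6, the correct result from the earlier slide.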
76
Another Ambiguous Grammar
IF_STMT → if EXPR then STMT
        | if EXPR then STMT else STMT
Ambiguity?
77
if expr1 then if expr2 then stmt1 else stmt2
Two derivation trees:
else attached to the inner if: if expr1 then (if expr2 then stmt1 else stmt2)
else attached to the outer if: if expr1 then (if expr2 then stmt1) else stmt2
78
Inherent Ambiguity
Some context-free languages have only ambiguous grammars
Example:
L = {a^n b^n c^m} ∪ {a^n b^m c^m}
S → S1 | S2
S1 → S1c | A
S2 → aS2 | B
A → aAb | λ
B → bBc | λ
79
The string a^n b^n c^n has two derivation trees:
one through S1, beginning S(S1(S1, c))
one through S2, beginning S(S2(a, S2))
80
Ambiguity in natural language?
81
Take-away
Definition context-free grammar
Definition context-free language
Derivation, sentential form, sentence
Derivation trees
Ambiguity
Context-free grammars for natural language
82