Context-free Languages - 法政大学 [HOSEI UNIVERSITY]
Context-free Languages
http://cis.k.hosei.ac.jp/~yukita/
Context-free grammar G1
$A \to 0A1$
$A \to B$
$B \to \#$
A grammar consists of substitution rules, which are often called production rules.
A and B are variables.
In particular, A is called the start variable.
0, 1, and # are terminals.
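Parse tree for 000#111 in grammar G1
[Figure: parse tree for 000#111. The root A expands to 0 A 1 three times in nested fashion; the innermost A expands to B, which yields #.]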
The English Language
<SENTENCE> $\to$ <NOUN-PHRASE> <VERB-PHRASE>
<NOUN-PHRASE> $\to$ <CMPLX-NOUN> | <CMPLX-NOUN> <PREP-PHRASE>
<VERB-PHRASE> $\to$ <CMPLX-VERB> | <CMPLX-VERB> <PREP-PHRASE>
<PREP-PHRASE> $\to$ <PREP> <CMPLX-NOUN>
<CMPLX-NOUN> $\to$ <ARTICLE> <NOUN>
<CMPLX-VERB> $\to$ <VERB> | <VERB> <NOUN-PHRASE>
<ARTICLE> $\to$ a | the
<NOUN> $\to$ boy | girl | flower
<VERB> $\to$ touches | likes | sees
<PREP> $\to$ with
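Definition 2.1
A context-free grammar is a 4-tuple $(V, \Sigma, R, S)$, where
1. $V$ is a finite set of variables,
2. $\Sigma$ is a finite set of terminals,
3. $R$ is a finite set of rules, where $R \subseteq \{ A \to w \mid A \in V,\ w \in (V \cup \Sigma)^* \}$, and
4. $S \in V$ is the start symbol.
If $u$, $v$, and $w$ are strings of variables and terminals, and $A \to w$ is a rule of the grammar, we say that $uAv$ yields $uwv$, written $uAv \Rightarrow uwv$. Write $u \Rightarrow^* v$ if $u = v$ or
$u \Rightarrow u_1 \Rightarrow u_2 \Rightarrow \cdots \Rightarrow u_k \Rightarrow v$.
The language of $G$ is $L(G) = \{ w \in \Sigma^* \mid S \Rightarrow^* w \}$.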
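The definition can be tried out on grammar G1. The following is a minimal sketch, assuming every symbol is a single character; the names `yields` and `language_up_to` are illustrative and not part of the slides.

```python
from collections import deque

# Grammar G1 written as the 4-tuple (V, Sigma, R, S).
V = {"A", "B"}
SIGMA = {"0", "1", "#"}
R = {"A": ["0A1", "B"], "B": ["#"]}
S = "A"

def yields(form):
    """Return every string obtainable from `form` in one derivation step."""
    results = []
    for i, symbol in enumerate(form):
        if symbol in V:
            for rhs in R[symbol]:
                results.append(form[:i] + rhs + form[i + 1:])
    return results

def language_up_to(max_len):
    """Collect all terminal strings of length <= max_len derivable from S."""
    seen, found = {S}, set()
    queue = deque([S])
    while queue:
        form = queue.popleft()
        if all(c in SIGMA for c in form):
            found.add(form)
            continue
        for nxt in yields(form):
            # Pruning by length is safe here because no rule of G1 shrinks a form.
            if len(nxt) <= max_len and nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return found
```

Calling `language_up_to(7)` collects the strings of the form $0^n \# 1^n$ for $n \le 3$. The length-based pruning relies on G1 having no rule that shortens a string, so it would need adjusting for grammars with $\varepsilon$-rules.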
Context-free Languages
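If a language is generated by a context-free grammar, we say that the language is a context-free language.
Any regular language turns out to be a context-free language. Given a DFA for the language, build a grammar as follows.
Assign a variable $R_i$ for each state $q_i$ of the DFA.
Add the rule $R_i \to a R_j$ if $\delta(q_i, a) = q_j$.
Add the rule $R_i \to \varepsilon$ if $q_i$ is an accept state.
Make $R_0$ the start variable, where $q_0$ is the start state of the DFA.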
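The DFA-to-grammar construction can be sketched as follows. The example DFA (strings over {0, 1} with an even number of 1s) and all function names are assumptions chosen for illustration; they do not come from the slides.

```python
def dfa_to_cfg(states, alphabet, delta, start, accept):
    """Build rules R_q -> a R_{delta(q, a)}, plus R_q -> epsilon for accepting q."""
    rules = {}
    for q in states:
        var = f"R_{q}"
        rules[var] = [(a, f"R_{delta[(q, a)]}") for a in alphabet]
        if q in accept:
            rules[var].append(())  # the empty tuple stands for epsilon
    return rules, f"R_{start}"

def generates(rules, start, w):
    """Follow the only derivation this kind of grammar allows for w."""
    var = start
    for ch in w:
        nxt = [rhs[1] for rhs in rules[var] if rhs and rhs[0] == ch]
        if not nxt:
            return False
        var = nxt[0]
    return () in rules[var]  # finish with R_q -> epsilon if q accepts

# Hypothetical DFA: accepts strings with an even number of 1s.
STATES = ["even", "odd"]
ALPHABET = ["0", "1"]
DELTA = {("even", "0"): "even", ("even", "1"): "odd",
         ("odd", "0"): "odd", ("odd", "1"): "even"}
RULES, START = dfa_to_cfg(STATES, ALPHABET, DELTA, "even", {"even"})
```

Because each rule has the form $R_i \to a R_j$, a derivation simply traces the DFA's run, which is why `generates` needs no search.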
Context Dependency
Let $A, B, C \in V$ and $w \in (V \cup \Sigma)^*$.
$B \to w$ is context independent, while $ABC \to AwC$ is context dependent.
In the latter rule, $A \_ C$ is considered the context for $B$ to yield $w$.
Example 2.2 G3
$G_3 = (\{S\}, \{a, b\}, R, S)$, where $R$ consists of only one rule
$S \to aSb \mid SS \mid \varepsilon$.
abab, aaabbb, and aababb are in $L(G_3)$.
If you regard a and b as "(" and ")", respectively, the language consists of all strings of properly nested parentheses.
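The parenthesis reading of G3 can be checked mechanically. This is a sketch under the assumption that a direct recursive test mirroring the three alternatives is acceptable; `in_g3` and `balanced` are illustrative names.

```python
from functools import lru_cache

@lru_cache(maxsize=None)
def in_g3(w):
    """Decide whether S =>* w using the rules S -> aSb | SS | epsilon."""
    if w == "":
        return True                                  # S -> epsilon
    if w[0] == "a" and w[-1] == "b" and in_g3(w[1:-1]):
        return True                                  # S -> aSb
    # S -> SS, restricted to non-empty halves so that the recursion terminates
    return any(in_g3(w[:i]) and in_g3(w[i:]) for i in range(1, len(w)))

def balanced(w):
    """Counter-based check for nested parentheses, reading a as '(' and b as ')'."""
    depth = 0
    for ch in w:
        depth += 1 if ch == "a" else -1
        if depth < 0:
            return False
    return depth == 0
```

Restricting $S \to SS$ to non-empty halves loses nothing, because a half that derives $\varepsilon$ leaves the other half deriving the whole string.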
Example 2.3 G4
G4 = (V, $\Sigma$, R, <EXPR>), where
V = {<EXPR>, <TERM>, <FACTOR>} and $\Sigma = \{a, +, \times, (, )\}$.
The rules R are:
<EXPR> $\to$ <EXPR> + <TERM> | <TERM>
<TERM> $\to$ <TERM> $\times$ <FACTOR> | <FACTOR>
<FACTOR> $\to$ ( <EXPR> ) | a
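A small parser makes the precedence built into G4 visible. This is only a sketch: it writes `*` in place of $\times$, replaces the left-recursive rules by loops (a standard recursive-descent adaptation, not something stated on the slides), and collapses unit steps such as <EXPR> $\to$ <TERM> in the tree it returns.

```python
def parse_expr(tokens, i=0):
    """<EXPR> -> <EXPR> + <TERM> | <TERM>, with the left recursion turned into a loop."""
    tree, i = parse_term(tokens, i)
    while i < len(tokens) and tokens[i] == "+":
        right, i = parse_term(tokens, i + 1)
        tree = ("EXPR", tree, "+", right)      # left-associative, like the grammar
    return tree, i

def parse_term(tokens, i):
    """<TERM> -> <TERM> * <FACTOR> | <FACTOR>."""
    tree, i = parse_factor(tokens, i)
    while i < len(tokens) and tokens[i] == "*":
        right, i = parse_factor(tokens, i + 1)
        tree = ("TERM", tree, "*", right)
    return tree, i

def parse_factor(tokens, i):
    """<FACTOR> -> ( <EXPR> ) | a."""
    if i < len(tokens) and tokens[i] == "a":
        return "a", i + 1
    if i < len(tokens) and tokens[i] == "(":
        tree, i = parse_expr(tokens, i + 1)
        if i < len(tokens) and tokens[i] == ")":
            return ("FACTOR", "(", tree, ")"), i + 1
    raise SyntaxError(f"unexpected input at position {i}")

def parse(s):
    """Parse the whole string s or raise SyntaxError."""
    tree, i = parse_expr(list(s))
    if i != len(s):
        raise SyntaxError(f"unexpected input at position {i}")
    return tree
```

For `a+a*a` the multiplication ends up below the addition, while for `(a+a)*a` the parenthesized sum ends up below the multiplication, matching the grouping that G4 enforces.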
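Parse tree for a+a×a
[Figure: parse tree for a+a×a in G4. The root <EXPR> expands to <EXPR> + <TERM>. The left <EXPR> derives a through <TERM> and <FACTOR>. The right <TERM> expands to <TERM> × <FACTOR>, where the <TERM> derives a through <FACTOR> and the <FACTOR> derives a.]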
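Parse tree for (a+a)×a
[Figure: parse tree for (a+a)×a in G4. The root <EXPR> expands to <TERM>, which expands to <TERM> × <FACTOR>. The left <TERM> derives ( <EXPR> ) through <FACTOR>, and the inner <EXPR> expands to <EXPR> + <TERM>, each side deriving a. The right <FACTOR> derives a.]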
Ambiguity in grammar G5
<EXPR> $\to$ <EXPR> + <EXPR>
| <EXPR> $\times$ <EXPR>
| ( <EXPR> )
| a
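A parse tree for a+a×a
[Figure: parse tree for a+a×a in G5 whose root expands to <EXPR> + <EXPR>; the right <EXPR> expands to <EXPR> × <EXPR>. This tree groups the string as a+(a×a).]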
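Another parse tree for a+a×a
[Figure: parse tree for a+a×a in G5 whose root expands to <EXPR> × <EXPR>; the left <EXPR> expands to <EXPR> + <EXPR>. This tree groups the string as (a+a)×a.]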
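Different derivations for the same parse tree
Braces mark the part of the string derived from the left <EXPR>; E abbreviates <EXPR>.
Derivation 1: $E \Rightarrow E \times E \Rightarrow \{E + E\} \times E \Rightarrow \{a + E\} \times E \Rightarrow \{a + a\} \times E \Rightarrow \{a + a\} \times a$
Derivation 2: $E \Rightarrow E \times E \Rightarrow E \times a \Rightarrow \{E + E\} \times a \Rightarrow \{a + E\} \times a \Rightarrow \{a + a\} \times a$
[Figure: the common parse tree for a+a×a, with root <EXPR> $\to$ <EXPR> × <EXPR> and the left <EXPR> expanding to <EXPR> + <EXPR>.]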
Leftmost Derivation
• If a string has two different parse trees, we
say that the grammar is ambiguous.
• A derivation of a string is a leftmost
derivation if at every step the leftmost
remaining variable is the one replaced.
• Every parse tree has a unique leftmost
derivation.
Definition 2.4 Ambiguity
A string $w$ is derived ambiguously by a context-free grammar $G$ if it has two or more different leftmost derivations. Grammar $G$ is ambiguous if it generates some string ambiguously.
Remark: An ambiguous grammar $G_a$ and a non-ambiguous grammar $G_{na}$ can generate the same language. There are languages that cannot be generated by any non-ambiguous grammar, in which case we say that the language is inherently ambiguous.
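Since every parse tree has exactly one leftmost derivation, counting parse trees is one way to detect an ambiguously derived string. The sketch below does this for grammar G5, writing `*` in place of $\times$; the function name `count_trees` is illustrative.

```python
from functools import lru_cache

@lru_cache(maxsize=None)
def count_trees(w):
    """Number of parse trees of w in G5: E -> E+E | E*E | (E) | a."""
    total = 1 if w == "a" else 0                       # E -> a
    if len(w) >= 3 and w[0] == "(" and w[-1] == ")":
        total += count_trees(w[1:-1])                  # E -> (E)
    for i, ch in enumerate(w):
        if ch in "+*":                                 # E -> E+E or E -> E*E
            total += count_trees(w[:i]) * count_trees(w[i + 1:])
    return total
```

A count greater than 1 means the string is derived ambiguously. For instance `count_trees("a+a*a")` is 2, while adding parentheses as in `(a+a)*a` leaves a single tree.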
Definition 2.5 Chomsky normal form
A context-free grammar is in Chomsky normal form if every rule is of one of the following forms:
$A \to BC$ (where $A \in V$ and $B, C \in V - \{S\}$),
$A \to a$ (where $A \in V$ and $a \in \Sigma$), and
$S \to \varepsilon$.
Theorem 2.6
Any context-free language is generated by a context-free grammar in Chomsky normal form.
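Proof of Th. 2.6
1. Add a new start variable $S_0$ and the rule $S_0 \to S$.
The new start symbol never occurs on the right-hand side of any rule.
2. Remove a rule $A \to \varepsilon$ where $A \neq S_0$ (that is, $A$ is not the new start variable).
For such $A$, if there is a rule $R \to uAv$, add the rule $R \to uv$.
If there is a rule $R \to uAvAw$, add the rules $R \to uvAw$, $R \to uAvw$, and $R \to uvw$. And so on.
If we have the rule $R \to A$, we add $R \to \varepsilon$ unless we have previously removed $R \to \varepsilon$.
We repeat this process until we have eliminated all such $\varepsilon$-rules.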
Proof of Th. 2.6
3. We remove a unit rule $A \to B$. For such $A$, $B$, and any rule $B \to u$, we add $A \to u$ unless this was a unit rule previously removed.
We repeat these steps until we eliminate all unit rules.
4. We replace each rule $A \to u_1 u_2 \cdots u_k$ where $k \ge 3$ with the rules
$A \to u_1 A_1$,
$A_1 \to u_2 A_2$,
$A_2 \to u_3 A_3$, $\ldots$,
$A_{k-2} \to u_{k-1} u_k$.
Here, $A_1, A_2, \ldots, A_{k-2}$ are new variables.
If $k \ge 2$, we replace any terminal $u_i$ in the preceding rules with the new variable $U_i$ and add the rule $U_i \to u_i$.
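Example 2.7 G6
1. The original G6 is shown first, followed by the result of applying the first step, which adds a new start symbol.
Original:
$S \to ASA \mid aB$
$A \to B \mid S$
$B \to b \mid \varepsilon$
After step 1:
$S_0 \to S$
$S \to ASA \mid aB$
$A \to B \mid S$
$B \to b \mid \varepsilon$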
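Example 2.7 Step 2
2. Remove the rule $B \to \varepsilon$ and introduce compensations for it.
Before:
$S_0 \to S$
$S \to ASA \mid aB$
$A \to B \mid S$
$B \to b \mid \varepsilon$
After:
$S_0 \to S$
$S \to ASA \mid aB \mid a$
$A \to B \mid S \mid \varepsilon$
$B \to b$
Remove the rule $A \to \varepsilon$ and introduce compensations for it.
Before:
$S_0 \to S$
$S \to ASA \mid aB \mid a$
$A \to B \mid S \mid \varepsilon$
$B \to b$
After:
$S_0 \to S$
$S \to ASA \mid aB \mid a \mid SA \mid AS \mid S$
$A \to B \mid S$
$B \to b$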
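The effect of step 2 on G6 can be reproduced with a short sketch. It uses the nullable-set formulation (compute every variable that derives $\varepsilon$, then emit each rule with nullable symbols kept or dropped) rather than the one-rule-at-a-time removal described on the slides; the two agree on this example. The function name and the tuple encoding of rule bodies are assumptions.

```python
from itertools import product

def remove_epsilon_rules(rules, start):
    """rules maps a variable to a set of bodies; a body is a tuple of symbols."""
    # 1. Find all nullable variables by iterating to a fixed point.
    nullable = set()
    changed = True
    while changed:
        changed = False
        for head, bodies in rules.items():
            if head not in nullable and any(
                all(s in nullable for s in body) for body in bodies
            ):
                nullable.add(head)
                changed = True
    # 2. Re-emit each rule with every combination of nullable symbols dropped.
    new_rules = {}
    for head, bodies in rules.items():
        new_bodies = set()
        for body in bodies:
            options = [(s, None) if s in nullable else (s,) for s in body]
            for choice in product(*options):
                variant = tuple(s for s in choice if s is not None)
                if variant or head == start:   # keep epsilon only for the start variable
                    new_bodies.add(variant)
        new_rules[head] = new_bodies
    return new_rules

# G6 after step 1; () stands for epsilon.
G6 = {
    "S0": {("S",)},
    "S": {("A", "S", "A"), ("a", "B")},
    "A": {("B",), ("S",)},
    "B": {("b",), ()},
}
RESULT = remove_epsilon_rules(G6, "S0")
```

The result still contains the unit rule $S \to S$, which is left for step 3 to remove.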
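Example 2.7 Step 3
3. Remove the unit rule $S \to S$:
$S_0 \to S$
$S \to ASA \mid aB \mid a \mid SA \mid AS$
$A \to B \mid S$
$B \to b$
Then remove the unit rule $S_0 \to S$:
$S_0 \to ASA \mid aB \mid a \mid SA \mid AS$
$S \to ASA \mid aB \mid a \mid SA \mid AS$
$A \to B \mid S$
$B \to b$
Remove the unit rule $A \to B$:
$S_0 \to ASA \mid aB \mid a \mid SA \mid AS$
$S \to ASA \mid aB \mid a \mid SA \mid AS$
$A \to b \mid S$
$B \to b$
Then remove the unit rule $A \to S$:
$S_0 \to ASA \mid aB \mid a \mid SA \mid AS$
$S \to ASA \mid aB \mid a \mid SA \mid AS$
$A \to b \mid ASA \mid aB \mid a \mid SA \mid AS$
$B \to b$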
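Example 2.7 Step 4
Before:
$S_0 \to ASA \mid aB \mid a \mid SA \mid AS$
$S \to ASA \mid aB \mid a \mid SA \mid AS$
$A \to b \mid ASA \mid aB \mid a \mid SA \mid AS$
$B \to b$
After:
$S_0 \to AA_1 \mid UB \mid a \mid SA \mid AS$
$S \to AA_1 \mid UB \mid a \mid SA \mid AS$
$A \to b \mid AA_1 \mid UB \mid a \mid SA \mid AS$
$A_1 \to SA$
$U \to a$
$B \to b$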
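Whether a grammar meets Definition 2.5 can be checked rule by rule. The sketch below is illustrative (the function `is_cnf` and the dictionary encoding are assumptions) and is applied to the original G6 and to the grammar obtained after step 4.

```python
def is_cnf(rules, start):
    """rules maps a variable to a list of bodies; a body is a tuple of symbols."""
    variables = set(rules)
    for head, bodies in rules.items():
        for body in bodies:
            if len(body) == 0:
                if head != start:              # only the start variable may derive epsilon
                    return False
            elif len(body) == 1:
                if body[0] in variables:       # unit rule A -> B is not allowed
                    return False
            elif len(body) == 2:
                # A -> BC needs two variables, neither of them the start variable
                if any(s not in variables or s == start for s in body):
                    return False
            else:
                return False                   # bodies longer than 2 are not allowed
    return True

G6_ORIGINAL = {
    "S": [("A", "S", "A"), ("a", "B")],
    "A": [("B",), ("S",)],
    "B": [("b",), ()],
}
COMMON = [("A", "A1"), ("U", "B"), ("a",), ("S", "A"), ("A", "S")]
G6_CNF = {
    "S0": list(COMMON),
    "S": list(COMMON),
    "A": [("b",)] + COMMON,
    "A1": [("S", "A")],
    "U": [("a",)],
    "B": [("b",)],
}
```

Here `is_cnf(G6_ORIGINAL, "S")` fails on the very first rule, whose body has three symbols, while `is_cnf(G6_CNF, "S0")` passes every check.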
Pushdown Automata
[Figure: schematic of a finite automaton (a state control reading the input a a b b) next to a pushdown automaton (a state control reading the input a a b b and attached to a stack holding x, y, z, ...).]
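Definition 2.8
A pushdown automaton is a 6-tuple $(Q, \Sigma, \Gamma, \delta, q_0, F)$, where $Q$, $\Sigma$, $\Gamma$, and $F$ are all finite sets, and
1. $Q$ is the set of states,
2. $\Sigma$ is the input alphabet,
3. $\Gamma$ is the stack alphabet,
4. $\delta : Q \times \Sigma_\varepsilon \times \Gamma_\varepsilon \to 2^{Q \times \Gamma_\varepsilon}$ is the transition function,
5. $q_0 \in Q$ is the start state, and
6. $F \subseteq Q$ is the set of accept states.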
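Computation
The machine $M = (Q, \Sigma, \Gamma, \delta, q_0, F)$ accepts input $w = w_1 w_2 \cdots w_m$, where each $w_i \in \Sigma_\varepsilon$, if sequences of states $r_0, r_1, \ldots, r_m \in Q$ and strings $s_0, s_1, \ldots, s_m \in \Gamma^*$ satisfy the next three conditions. The strings $s_i$ represent the sequence of stack contents that $M$ has on the accepting branch of the computation.
1. $r_0 = q_0$ and $s_0 = \varepsilon$.
2. For $i = 0, \ldots, m-1$, $(r_{i+1}, b) \in \delta(r_i, w_{i+1}, a)$, where $s_i = at$ and $s_{i+1} = bt$ for some $a, b \in \Gamma_\varepsilon$ and $t \in \Gamma^*$.
3. $r_m \in F$.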
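The acceptance conditions can be explored with a small nondeterministic simulator. This is a sketch only: the example machine for $\{0^n 1^n \mid n \ge 0\}$, the dictionary encoding of $\delta$, and the `max_stack` cut-off (added so the search always terminates) are assumptions rather than part of the slides.

```python
from collections import deque

def accepts(delta, start, accept, w, max_stack=50):
    """Search the configurations (state, input position, stack) reachable on w.

    The stack is a string whose first character is the top, as in s_i = at.
    """
    first = (start, 0, "")                       # condition 1: r_0 = q_0, s_0 = epsilon
    seen, queue = {first}, deque([first])
    while queue:
        state, i, stack = queue.popleft()
        if i == len(w) and state in accept:      # condition 3: r_m in F
            return True
        # Condition 2: read an input symbol or "", pop the stack top or "".
        reads = [""] + ([w[i]] if i < len(w) else [])
        pops = [""] + ([stack[0]] if stack else [])
        for read in reads:
            for pop in pops:
                for nxt_state, push in delta.get((state, read, pop), ()):
                    nxt = (nxt_state, i + len(read), push + stack[len(pop):])
                    if len(nxt[2]) <= max_stack and nxt not in seen:
                        seen.add(nxt)
                        queue.append(nxt)
    return False

# Hypothetical PDA for {0^n 1^n | n >= 0}; "" plays the role of epsilon.
DELTA = {
    ("q1", "", ""): {("q2", "$")},
    ("q2", "0", ""): {("q2", "0")},
    ("q2", "1", "0"): {("q3", "")},
    ("q3", "1", "0"): {("q3", "")},
    ("q3", "", "$"): {("q4", "")},
}
ACCEPT = {"q1", "q4"}
```

Note that acceptance only requires an accept state after the whole input is read; the definition does not require the stack to be empty.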
Theorem 2.12
• A language is context free if and only if
some pushdown automaton recognizes it.
• Lemma 2.13
– If a language is context free, then some
pushdown automaton recognizes it.
• Lemma 2.15
– If a pushdown automaton recognizes some
language, then it is context free.
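Proof of Lemma 2.13
CFL $\Rightarrow$ Recognized by PDA
We construct a PDA $P = (Q, \Sigma, \Gamma, \delta, q_{\text{start}}, F)$.
Let $(r, u) \in \delta(q, a, s)$, where $u = u_1 \cdots u_l$, be shorthand notation for the following moves, which use new states $q_1, \ldots, q_{l-1}$ to push the string $u$ one symbol at a time:
$\delta(q, a, s)$ contains $(q_1, u_l)$,
$\delta(q_1, \varepsilon, \varepsilon) = \{(q_2, u_{l-1})\}$,
$\delta(q_2, \varepsilon, \varepsilon) = \{(q_3, u_{l-2})\}$,
$\vdots$
$\delta(q_{l-1}, \varepsilon, \varepsilon) = \{(r, u_1)\}$.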
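Proof of Lemma 2.13
We put $Q = \{q_{\text{start}}, q_{\text{loop}}, q_{\text{accept}}\} \cup E$, where $E$ is the set of states needed to implement the shorthand.
We define $\delta$ as follows.
$\delta(q_{\text{start}}, \varepsilon, \varepsilon) = \{(q_{\text{loop}}, S\$)\}$
$\delta(q_{\text{loop}}, \varepsilon, A) = \{(q_{\text{loop}}, w) \mid A \to w \text{ is a rule in } R\}$
$\delta(q_{\text{loop}}, a, a) = \{(q_{\text{loop}}, \varepsilon)\}$
$\delta(q_{\text{loop}}, \varepsilon, \$) = \{(q_{\text{accept}}, \varepsilon)\}$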
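State Diagram of P
[Figure: state diagram of $P$. An arrow from $q_{\text{start}}$ to $q_{\text{loop}}$ is labeled $\varepsilon, \varepsilon \to S\$$. Self-loops on $q_{\text{loop}}$ are labeled $\varepsilon, A \to w$ for each rule $A \to w$ and $a, a \to \varepsilon$ for each terminal $a$. An arrow from $q_{\text{loop}}$ to $q_{\text{accept}}$ is labeled $\varepsilon, \$ \to \varepsilon$.]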
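Example 2.14
$S \to aTb \mid b$
$T \to Ta \mid \varepsilon$
[Figure: state diagram of the PDA built from this grammar. $q_{\text{start}}$ moves to $q_{\text{loop}}$ pushing $S\$$. Loops at $q_{\text{loop}}$, passing through intermediate states where several symbols are pushed, replace $S$ by $aTb$ or $b$ and replace $T$ by $Ta$ or $\varepsilon$. Self-loops labeled $a, a \to \varepsilon$ and $b, b \to \varepsilon$ match terminals. $q_{\text{loop}}$ moves to $q_{\text{accept}}$ on $\varepsilon, \$ \to \varepsilon$.]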
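The construction from the proof of Lemma 2.13 can be run on the grammar of Example 2.14. In this sketch the multi-symbol push is kept as a single move (the shorthand) instead of being expanded into intermediate states, symbols are single characters, and the stack-size bound passed to the simulator is an assumption needed to stop the left-recursive rule $T \to Ta$ from growing the stack forever.

```python
from collections import deque

def cfg_to_pda(rules, terminals, start):
    """Return delta for the three-state PDA; a push may be a whole string."""
    delta = {("q_start", "", ""): {("q_loop", start + "$")}}
    for head, bodies in rules.items():
        # Replace variable `head` on top of the stack by any of its rule bodies.
        delta[("q_loop", "", head)] = {("q_loop", body) for body in bodies}
    for a in terminals:
        # Match a terminal on the stack against the next input symbol.
        delta[("q_loop", a, a)] = {("q_loop", "")}
    delta[("q_loop", "", "$")] = {("q_accept", "")}
    return delta

def pda_accepts(delta, w, max_stack):
    """Breadth-first search over configurations (state, position, stack)."""
    first = ("q_start", 0, "")
    seen, queue = {first}, deque([first])
    while queue:
        state, i, stack = queue.popleft()
        if state == "q_accept" and i == len(w):
            return True
        reads = [""] + ([w[i]] if i < len(w) else [])
        pops = [""] + ([stack[0]] if stack else [])
        for read in reads:
            for pop in pops:
                for nxt_state, push in delta.get((state, read, pop), ()):
                    nxt = (nxt_state, i + len(read), push + stack[len(pop):])
                    if len(nxt[2]) <= max_stack and nxt not in seen:
                        seen.add(nxt)
                        queue.append(nxt)
    return False

RULES = {"S": ["aTb", "b"], "T": ["Ta", ""]}   # "" stands for epsilon
DELTA = cfg_to_pda(RULES, "ab", "S")

def in_language(w):
    """Bound the stack a little above the input length (an assumption for this grammar)."""
    return pda_accepts(DELTA, w, max_stack=len(w) + 4)
```

The grammar generates the strings consisting of any number of a's followed by a single b, and the simulated PDA accepts the same strings.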
Proof of Lemma 2.15
Recognized by PDA $\Rightarrow$ CFL
We construct a grammar $G$.
We can assume without loss of generality that machine $P$ satisfies the following conditions.
1. It has a single accept state, $q_{\text{accept}}$.
2. It empties its stack before accepting.
3. Each transition either pushes a symbol onto the stack (a push move) or pops one off the stack (a pop move), but does not do both at the same time.
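Proof of Lemma 2.15
Let $P = (Q, \Sigma, \Gamma, \delta, q_0, \{q_{\text{accept}}\})$ be given. We construct $G$.
Put $V = \{A_{pq} \mid p, q \in Q\}$ and $S = A_{q_0, q_{\text{accept}}}$. The rules are:
(1) For each $p, q, r, s \in Q$, $t \in \Gamma$, and $a, b \in \Sigma_\varepsilon$, if $(r, t) \in \delta(p, a, \varepsilon)$ and $(q, \varepsilon) \in \delta(s, b, t)$, put the rule $A_{pq} \to a A_{rs} b$ in $G$.
(2) For each $p, q, r \in Q$, put the rule $A_{pq} \to A_{pr} A_{rq}$ in $G$.
(3) For each $p \in Q$, put the rule $A_{pp} \to \varepsilon$ in $G$.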
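The three kinds of rules can be generated mechanically. The sketch below assumes a hypothetical PDA for $\{0^n 1^n \mid n \ge 1\}$ that already meets the three conditions (single accept state `q4`, empty stack on acceptance, every move either a push or a pop); the encoding and the helper `derives`, a fixed-point membership test, are illustrative choices.

```python
from itertools import product

def pda_to_cfg(states, delta):
    """Return rules as (head, body) pairs; the pair (p, q) stands for A_pq."""
    rules = set()
    pushes = [(p, a, r, t) for (p, a, pop), outs in delta.items() if pop == ""
              for (r, t) in outs]
    pops = [(s, b, t, q) for (s, b, t), outs in delta.items() if t != ""
            for (q, _push) in outs]
    for (p, a, r, t), (s, b, t2, q) in product(pushes, pops):
        if t == t2:
            rules.add(((p, q), (a, (r, s), b)))        # (1) A_pq -> a A_rs b
    for p, r, q in product(states, repeat=3):
        rules.add(((p, q), ((p, r), (r, q))))          # (2) A_pq -> A_pr A_rq
    for p in states:
        rules.add(((p, p), ()))                        # (3) A_pp -> epsilon
    return rules

def derives(rules, start, w):
    """Fixed point of the facts (variable, i, j), meaning variable =>* w[i:j]."""
    n, table, changed = len(w), set(), True
    while changed:
        changed = False
        for head, body in rules:
            for i in range(n + 1):
                for j in range(i, n + 1):
                    if (head, i, j) in table:
                        continue
                    if len(body) == 0:
                        ok = i == j
                    elif len(body) == 3:
                        a, var, b = body
                        ok = (w[i:j].startswith(a) and w[i:j].endswith(b)
                              and len(a) + len(b) <= j - i
                              and (var, i + len(a), j - len(b)) in table)
                    else:
                        left, right = body
                        ok = any((left, i, k) in table and (right, k, j) in table
                                 for k in range(i, j + 1))
                    if ok:
                        table.add((head, i, j))
                        changed = True
    return (start, 0, n) in table

STATES = ["q1", "q2", "q3", "q4"]
DELTA = {
    ("q1", "", ""): {("q2", "$")},      # push $
    ("q2", "0", ""): {("q2", "0")},     # push 0 for each 0 read
    ("q2", "1", "0"): {("q3", "")},     # pop 0 for each 1 read
    ("q3", "1", "0"): {("q3", "")},
    ("q3", "", "$"): {("q4", "")},      # pop $ and accept
}
RULES = pda_to_cfg(STATES, DELTA)
START = ("q1", "q4")
```

Only three rules of type (1) arise for this machine; the remaining rules come from types (2) and (3), and most of them are useless but harmless.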
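[Figure: stack height plotted against the input string for the rule $A_{pq} \to A_{pr} A_{rq}$. The stack is empty at states $p$, $r$, and $q$. The input read between $p$ and $r$ is generated by $A_{pr}$, the input read between $r$ and $q$ is generated by $A_{rq}$, and the whole string is generated by $A_{pq}$.]
[Figure: stack height plotted against the input string for the rule $A_{pq} \to a A_{rs} b$. The first move reads $a$ and goes from $p$ to $r$; the last move reads $b$ and goes from $s$ to $q$. The input read between $r$ and $s$ is generated by $A_{rs}$, and the whole string is generated by $A_{pq}$.]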
Claim 2.16 If $A_{pq}$ generates $x$, then $x$ can bring $P$ from $p$ with empty stack to $q$ with empty stack.
Proof. Induction on the number of steps in the derivation of $x$ from $A_{pq}$.
Basis: A derivation with a single step must use a rule whose right-hand side contains no variables. The only rules of this kind in $G$ are $A_{pp} \to \varepsilon$. Input $\varepsilon$ takes $P$ from $p$ with empty stack to $p$ with empty stack.
Induction step (assume the claim for derivations of at most $k$ steps and prove it for $k+1$):
Assume that $A_{pq} \Rightarrow^* x$ with $k+1$ steps. The first step in this derivation is either $A_{pq} \Rightarrow a A_{rs} b$ or $A_{pq} \Rightarrow A_{pr} A_{rq}$.
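Proof (continued)
Case $A_{pq} \Rightarrow a A_{rs} b$:
Let $A_{rs}$ generate $y$, which should complete within $k$ steps. We have $x = ayb$.
The induction hypothesis tells us that $y$ can bring $P$ from $r$ on empty stack to $s$ on empty stack. Because $A_{pq} \to a A_{rs} b$ is a rule of $G$, $(r, t) \in \delta(p, a, \varepsilon)$ and $(q, \varepsilon) \in \delta(s, b, t)$ for some stack symbol $t$.
So $P$ can read $a$ and push $t$ to go from $p$ to $r$, read $y$ to reach $s$ with only $t$ on the stack, and then read $b$ and pop $t$ to reach $q$.
Therefore, $x$ can bring $P$ from $p$ with empty stack to $q$ with empty stack.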
Proof (continued)
Case $A_{pq} \Rightarrow A_{pr} A_{rq}$:
Let $x = yz$, $A_{pr} \Rightarrow^* y$, and $A_{rq} \Rightarrow^* z$. Both derivations should complete within $k$ steps.
The induction hypothesis tells us that $y$ can bring $P$ from $p$ to $r$, and $z$ can bring $P$ from $r$ to $q$, with empty stacks at the beginning and end.
Hence, $x$ can bring it from $p$ to $q$ with empty stacks at the beginning and end.
Claim 2.17 If $x$ can bring $P$ from $p$ with empty stack to $q$ with empty stack, $A_{pq}$ generates $x$.
Proof. Induction on the number of steps in the computation of $P$ that goes from $p$ to $q$ with empty stacks on input $x$.
Basis: The computation has 0 steps. It starts and ends at the same state, say $p$. In 0 steps, $P$ only has time to read the empty string, so $x = \varepsilon$.
By construction, $G$ has the rule $A_{pp} \to \varepsilon$.
Induction step: Assume the claim is true for computations of length at most $k \ge 0$, and prove it for computations of length $k+1$.
Suppose that $x$ brings $P$ from $p$ to $q$ in $k+1$ steps with empty stacks at the beginning and end.
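Proof (continued)
Case: the stack is empty only at the beginning and end.
Then the symbol $t$ pushed in the first move is the same symbol that is popped in the last move, so $(r, t) \in \delta(p, a, \varepsilon)$ and $(q, \varepsilon) \in \delta(s, b, t)$ for some $r, s \in Q$, $a, b \in \Sigma_\varepsilon$, and $t \in \Gamma$.
So $A_{pq} \to a A_{rs} b$ is in $G$.
Let $x = ayb$. Then input $y$ brings $P$ from $r$ to $s$ within $k-1$ steps without touching the symbol $t$ on the stack, which amounts to a computation with empty stacks at the beginning and end.
The induction hypothesis tells us that $A_{rs} \Rightarrow^* y$. Hence, $A_{pq} \Rightarrow^* x$.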
Proof (continued)
Case: the stack becomes empty at some state $r$ other than at the beginning or end.
Let $x = yz$, where $y$ brings $P$ from $p$ to $r$, and $z$ brings $P$ from $r$ to $q$.
The two computations complete within $k$ steps.
The induction hypothesis tells us that $A_{pr} \Rightarrow^* y$ and $A_{rq} \Rightarrow^* z$.
Because $A_{pq} \to A_{pr} A_{rq}$ is in $G$, $A_{pq} \Rightarrow^* yz = x$.
Corollary 2.18 Every regular language is context free.
[Figure: Venn diagram showing the regular languages contained in the context-free languages.]
Theorem 2.19 [Pumping Lemma]
If $A$ is a context-free language, then there is a number $p$ (the pumping length) where, if $s$ is any string in $A$ of length at least $p$, then $s$ may be divided into $s = uvxyz$ satisfying the following conditions:
1. For each $i \ge 0$, $u v^i x y^i z \in A$,
2. $|vy| > 0$, and
3. $|vxy| \le p$.
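Proof
[Figure: a parse tree $T$ for $s = uvxyz$ in which a variable $R$ occurs twice on one path. The upper $R$ generates $vxy$ and the lower $R$ generates $x$. Replacing the lower subtree by a copy of the upper one gives a parse tree for $uvvxyyz$; replacing the upper subtree by the lower one gives a parse tree for $uxz$.]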
Proof
Let $b \ge 2$ be the maximum number of symbols in the right-hand side of a rule.
In any parse tree, no node can have more than $b$ children.
If the height of the parse tree is at most $h$, the length of the string generated is at most $b^h$.
We set $p = b^{|V|+2} > b^{|V|+1}$. Then a parse tree (having the smallest number of nodes) for any string $s$ of length at least $p$ requires height at least $|V|+2$.
The longest path must have length at least $|V|+2$, so it must contain at least $|V|+1$ variables, since only the leaves are terminals.
Thus some variable, say $R$, repeats on this path.
For later convenience, let $R$ be a variable that repeats among the lowest $|V|+1$ variables on the path.
Condition 1 holds because the subtree rooted at the upper occurrence of $R$ and the subtree rooted at the lower occurrence can be substituted for each other, and the result is still a valid parse tree.
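Proof
Condition 2 requires that $v$ and $y$ are not both $\varepsilon$. If they were, replacing the subtree at the upper $R$ by the subtree at the lower $R$ would give a parse tree for $s$ with fewer nodes, which contradicts the minimality of the parse tree. See the figure on the previous slide.
Condition 3:
We chose $R$ so that it repeats within the bottom $|V|+1$ variables on the path.
So the subtree where $R$ generates $vxy$ is at most $|V|+2$ high. A tree of this height can generate a string of length at most $b^{|V|+2} = p$.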
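Example 2.20 $B = \{a^n b^n c^n \mid n \ge 0\}$. Let $s = a^p b^p c^p = uvxyz$.
Case $v$ and $y$ are homogeneous (each contains only one type of symbol):
[Figure: possible positions of $v$ and $y$ within the blocks $a^p$, $b^p$, $c^p$.]
At least one of the three symbols does not occur in $v$ or $y$, so $uv^2xy^2z$ cannot contain equal numbers of a's, b's, and c's.
Case $v$ or $y$ is heterogeneous (contains more than one type of symbol):
[Figure: $v$ or $y$ straddling the boundary between two blocks.]
Then $uv^2xy^2z$ contains symbols in the wrong order.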
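For a small candidate pumping length, the case analysis can be double-checked by brute force. The sketch below enumerates every division $s = uvxyz$ allowed by conditions 2 and 3 and keeps those that survive pumping with exponents 0 and 2; surviving these two exponents does not prove condition 1, but failing either one refutes it. The function names and the small values of $p$ are assumptions for illustration.

```python
def in_b(w):
    """Membership in B = { a^n b^n c^n | n >= 0 }."""
    n = len(w) // 3
    return w == "a" * n + "b" * n + "c" * n

def in_anbn(w):
    """Membership in the context-free language { a^n b^n | n >= 0 }."""
    n = len(w) // 2
    return w == "a" * n + "b" * n

def pumpable_splits(s, p, member, exponents=(0, 2)):
    """Return the splits s = uvxyz with |vy| > 0 and |vxy| <= p that survive pumping."""
    good = []
    n = len(s)
    for start in range(n + 1):
        for end in range(start, min(start + p, n) + 1):    # vxy = s[start:end]
            for i in range(start, end + 1):                # v = s[start:i]
                for j in range(i, end + 1):                # y = s[j:end]
                    u, v, x, y, z = s[:start], s[start:i], s[i:j], s[j:end], s[end:]
                    if len(v) + len(y) == 0:
                        continue
                    if all(member(u + v * k + x + y * k + z) for k in exponents):
                        good.append((u, v, x, y, z))
    return good
```

For $s = a^p b^p c^p$ with small $p$, `pumpable_splits(s, p, in_b)` comes back empty, in line with the two cases above. By contrast, the string `aabb` in the context-free language $\{a^n b^n\}$ does have a surviving split with $v = a$ and $y = b$.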
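Example 2.21 $C = \{a^i b^j c^k \mid 0 \le i \le j \le k\}$.
Let $s = a^p b^p c^p$.
Case $v$ and $y$ are homogeneous:
[Figure: possible positions of $v$ and $y$ within the blocks $a^p$, $b^p$, $c^p$.]
For some positions, see if $uv^2xy^2z$ breaks the balance.
For the other positions, see if $uv^0xy^0z = uxz$ breaks the balance.
Case $v$ or $y$ is heterogeneous:
[Figure: $v$ or $y$ straddling the boundary between two blocks.]
See if $uv^2xy^2z$ or $uxz$ destroys the order.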
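Example 2.22 $D = \{ww \mid w \in \{0,1\}^*\}$. Let $s = 0^p 1^p 0^p 1^p$.
[Figure: three placements of $vxy$ within $0^p 1^p 0^p 1^p$.]
Case $vxy$ lies in the first half: see if the first half of $uv^2xy^2z$ begins with 0 while the latter half begins with 1.
Case $vxy$ lies in the second half: see if the first half of $uv^2xy^2z$ ends with 0 while the latter half ends with 1.
Case $vxy$ straddles the midpoint: see if $uv^0xy^0z = uxz = 0^p 1^i 0^j 1^p$, where $i$ and $j$ cannot both be $p$.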