Pumping Lemma for CFL's

Chapter 6 Properties of Context-free Languages
6.1 Pumping Lemma for CFL's
Lemma 1 (The pumping lemma for context-free languages).
Let L be any infinite CFL.
Then there exists a constant n, depending on L, such that if z
is in L and |z| ≧ n, then we can write z = uvwxy such that
|vx| ≧ 1,
|vwx| ≦ n, and
for all i ≧ 0, u v^i w x^i y is in L.
Proof (sketch of the proof)
Let G be a Chomsky normal-form grammar
generating L - {ε}.
Since the number of variables of G is fixed and L is infinite,
there exists a long sentence z in L such that any parse tree
for z must contain a long path of variables, and at least one
variable must appear twice on that path.
A Chomsky normal-form grammar has only two types of
production rules: A → BC and A → a.
Hence if a parse tree of a sentence z in L has no path of
length greater than i, then the length of z is no more than 2^(i-1).
Here the length of a path is the number of internal nodes
(variables) on the path, not including the leaf.
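[Figure: parse trees illustrating the bound, by induction on the path length.]
Basis: the tree S → a has |path| = k = 1 and |word| = 1 ≦ 2^(k-1).
The tree S → AB with A → a and B → b has |path| = k = 2 and |word| = 2 ≦ 2^(k-1).
Induction: if the root S has sub-trees T1 and T2 with
|path of T1| ≦ i, |word of T1| ≦ 2^(i-1),
|path of T2| ≦ i, |word of T2| ≦ 2^(i-1),
then the whole tree has |path| ≦ i+1 and |word| ≦ 2^i.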
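The induction above amounts to a simple recurrence, which can be checked numerically. The sketch below is only an illustration; the function name max_yield is hypothetical and not part of the original slides.

```python
def max_yield(path_len: int) -> int:
    """Longest word a CNF parse tree can yield when its longest path
    has path_len internal nodes (variables)."""
    if path_len < 1:
        raise ValueError("a parse tree has at least one variable on every path")
    if path_len == 1:
        return 1  # basis: the only possible tree is A -> a
    # induction: the root uses A -> BC, and each sub-tree has
    # longest path at most path_len - 1
    return 2 * max_yield(path_len - 1)


# the recurrence agrees with the closed form 2^(i-1)
for i in range(1, 11):
    assert max_yield(i) == 2 ** (i - 1)
```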
Suppose that G has k variables, and let n = 2^k.
If z is in L and |z| ≧ n, then any parse tree of z must have a path
of length > k, since a tree with no path longer than k yields a word
of length at most 2^(k-1) < n.
Hence at least one variable appears at least
twice on a longest path of the parse tree.
Let v1 and v2 be two vertices on that path
labeled with the same variable A, with v1 closer to
the root than v2. Choose them among the last k+1 variables of the
path, i.e., as close to the leaves of the tree as possible.
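[Figure: parse tree with root S and yield z = uvwxy. The path passes through v1 and v2, both labeled A. The sub-tree rooted at v2 yields w and the sub-tree rooted at v1 yields vwx. The path from v1 down has length ≦ k+1, so |vwx| ≦ 2^k = n.]
Since the production used at v1 has the form A → BC, we have that |vx| ≧ 1.
Now we have that A ⇒* vAx and A ⇒* w, where |vwx| ≦ n.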
A ⇒* v^i w x^i, for i = 0, 1, 2, … .
Hence S ⇒* u v^i w x^i y, which is in L, for i = 0, 1, 2, … .
End of lemma 1.
Example 1 L = {0^n 1^n 0^n | n = 0, 1, 2, …} is not context-free.
Pf Suppose that L is context-free. Let n be the constant
in the pumping lemma for CFL.
Choose the sentence z = 0^n 1^n 0^n in L, so that |z| ≧ n. Consider any
way to write z = uvwxy with |vx| ≧ 1 and |vwx| ≦ n.
Since |vwx| ≦ n, the substring vwx cannot touch both blocks of 0’s.
If the substring vwx contains only one kind of symbol, then
for i = 0, uwy is not in L, since one block is shortened while the
other two blocks still have length n.
If the substring vwx contains two kinds of symbols, then it lies across
one boundary between a block of 0’s and the block of 1’s. For
i = 0, uwy is not in L, since the block of 0’s that vwx does not touch
still has length n while at least one of the other two blocks is shorter.
Contradiction.
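For small values of n the case analysis can be confirmed by brute force. The sketch below enumerates every decomposition z = uvwxy of z = 0^n 1^n 0^n with |vx| ≧ 1 and |vwx| ≦ n, and checks that some i takes the pumped string out of L. The helper names (in_L, decompositions, refuted) are hypothetical, and the check only covers the small n it is run on; it illustrates the proof and does not replace it.

```python
def in_L(s: str) -> bool:
    """Membership in L = {0^n 1^n 0^n | n >= 0}."""
    n = len(s) // 3
    return s == "0" * n + "1" * n + "0" * n


def decompositions(z: str, n: int):
    """Yield (u, v, w, x, y) with z = uvwxy, |vx| >= 1 and |vwx| <= n."""
    size = len(z)
    for a in range(size + 1):                # u = z[:a]
        for b in range(a, size + 1):         # v = z[a:b]
            for c in range(b, size + 1):     # w = z[b:c]
                # x = z[c:d]; d - a = |vwx| must not exceed n
                for d in range(c, min(a + n, size) + 1):
                    if (b - a) + (d - c) >= 1:
                        yield z[:a], z[a:b], z[b:c], z[c:d], z[d:]


def refuted(z: str, n: int, max_i: int = 3) -> bool:
    """True if every legal decomposition leaves L for some i in 0..max_i."""
    return all(
        any(not in_L(u + v * i + w + x * i + y) for i in range(max_i + 1))
        for u, v, w, x, y in decompositions(z, n)
    )


for n in range(1, 5):
    z = "0" * n + "1" * n + "0" * n
    assert refuted(z, n)
```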
Example 2 L = {0^n | n is a prime number} is not context-free.
Pf Suppose that L is context-free. Let n be the constant
in the pumping lemma for CFL.
Choose a sentence z in L with |z| = p ≧ n, where p is a prime.
Consider any way to write z = uvwxy with
|vx| ≧ 1 and |vwx| ≦ n.
Let |vx| = k ≧ 1. Choose i = p + 1.
We have that |u v^i w x^i y| = (p – k) + i*k = (p – k) + (p + 1)k =
p(k + 1), a composite number since p ≧ 2 and k + 1 ≧ 2. Hence u v^i w x^i y is not in L.
Contradiction.
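The length computation can be checked for a few primes. In the sketch below the pumping exponent is taken to be i = p + 1, which is the value that makes (p – k) + i*k equal to p(k + 1); the function names are hypothetical.

```python
def is_prime(m: int) -> bool:
    """Trial-division primality test, adequate for small m."""
    if m < 2:
        return False
    d = 2
    while d * d <= m:
        if m % d == 0:
            return False
        d += 1
    return True


def pumped_length(p: int, k: int) -> int:
    """|u v^i w x^i y| when |z| = p, |vx| = k and i = p + 1."""
    i = p + 1
    return (p - k) + i * k


for p in [2, 3, 5, 7, 11, 13]:
    for k in range(1, p + 1):          # every possible value of |vx|
        m = pumped_length(p, k)
        assert m == p * (k + 1)
        assert not is_prime(m)
```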
Example 3 L = {a^m b^r c^s d^t | m = 0 or r = s = t} is not context-free,
but the pumping lemma for CFL fails to show this.
For z = b^r c^s d^t = uvwxy: when v and x consist of the same
type of symbol, say b, the strings u v^i w x^i y are all
in L for i = 0, 1, 2, … .
For z = a^m b^r c^r d^r = uvwxy: when v and x consist of the same
type of symbol, say a, the strings u v^i w x^i y are all
in L for i = 0, 1, 2, … .
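A small experiment illustrates why no contradiction can be reached here. The sketch below uses one particular decomposition, chosen for illustration only: u, w and x are empty and v is the first symbol of z. It checks, for all members of L up to length 6 and for i up to 5, that pumping stays inside L. The names in_L and pumps_with_first_symbol are hypothetical.

```python
import re
from itertools import product


def in_L(s: str) -> bool:
    """Membership in L = {a^m b^r c^s d^t | m = 0 or r = s = t}."""
    m = re.fullmatch(r"(a*)(b*)(c*)(d*)", s)
    if m is None:
        return False
    na, nb, nc, nd = (len(g) for g in m.groups())
    return na == 0 or nb == nc == nd


def pumps_with_first_symbol(z: str, max_i: int = 5) -> bool:
    """Decompose z with u = w = x = empty, v = z[0], y = z[1:]."""
    v, y = z[0], z[1:]
    return all(in_L(v * i + y) for i in range(max_i + 1))


# all non-empty words over {a, b, c, d} of length at most 6 that are in L
words = ("".join(t) for k in range(1, 7) for t in product("abcd", repeat=k))
members = [z for z in words if in_L(z)]
assert all(pumps_with_first_symbol(z) for z in members)
```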
Ogden’s lemma for CFL works.
Lemma 2 (Ogden’s lemma).
Let L be any infinite CFL.
Then there exists a constant n, depending on L, such that if z
is in L and |z| ≧ n, and we mark any n or more positions of z
“distinguished,” then we can write
z = uvwxy, such that:
• v and x together have at least one distinguished position,
• vwx has at most n distinguished positions, and
• for all i ≧ 0, u v^i w x^i y is in L.
Proof (sketch of the proof)
Let G be a Chomsky normal-form grammar
generating L - {ε}.
Suppose that G has k variables, and let n = 2^k + 1.
Suppose that z is in L and |z| ≧ n, and that n or more
positions of z are marked “distinguished”.
Select a path in a parse tree of z. A vertex of the path
is called a branch point if both of its children have
distinguished descendants.
The selection is as follows:
Step 1: path = <>, empty.
Step 2: path ← S; P := S; // path = <S>
Step 3: If P is a leaf, done.
Step 4: If P has one child, say A,
then path ← A; P := A;
If P has two children, say A and B:
if the sub-tree of B has fewer
distinguished descendants than the sub-tree of A,
then path ← A; P := A; // path = <S, …, A>
else path ← B; P := B; // path = <S, …, B>
goto step 3.
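The selection steps can be sketched in Python as follows. The Node class, the function names and the example tree are all hypothetical and only serve as an illustration. The sketch additionally assumes that a vertex with a single child (a production A → a) simply moves to that child, and it records which vertices on the path are branch points.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class Node:
    """Parse-tree vertex; a vertex without children is a leaf (terminal)."""
    label: str
    children: List["Node"] = field(default_factory=list)
    marked: bool = False  # "distinguished"; only meaningful for leaves


def count_marked(node: Node) -> int:
    """Number of distinguished leaves at or below this vertex."""
    if not node.children:
        return int(node.marked)
    return sum(count_marked(c) for c in node.children)


def select_path(root: Node) -> Tuple[List[Node], List[Node]]:
    """Follow the selection steps; also collect the branch points met."""
    path = [root]                          # Steps 1-2: path = <S>
    branch_points = []
    p = root
    while p.children:                      # Step 3: stop at a leaf
        if len(p.children) == 1:           # production A -> a
            p = p.children[0]
        else:                              # production A -> BC
            a, b = p.children
            na, nb = count_marked(a), count_marked(b)
            if na > 0 and nb > 0:
                branch_points.append(p)
            p = a if nb < na else b        # Step 4
        path.append(p)
    return path, branch_points


# Hypothetical tree for z = "abc": S -> X Y, X -> A B, A -> a, B -> b, Y -> c,
# with the positions of a and b distinguished.
tree = Node("S", [
    Node("X", [
        Node("A", [Node("a", marked=True)]),
        Node("B", [Node("b", marked=True)]),
    ]),
    Node("Y", [Node("c")]),
])
path, branch_points = select_path(tree)
print([v.label for v in path])           # ['S', 'X', 'B', 'b']
print([v.label for v in branch_points])  # ['X']
```

Counting the distinguished leaves from scratch at every step keeps the sketch short; caching the counts per vertex would avoid the repeated traversals on large trees.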
Since there are at least n = 2^k + 1 distinguished positions on the leaves, the selected path
must have at least k+1 branch points. Hence at least
one variable appears twice or more among the branch points on the path.
The rest of the proof is similar to the proof of the pumping lemma
for context-free languages.
End of lemma 2.
Example 4 L = {b^r c^s d^t | r ≠ s, s ≠ t, t ≠ r} is not context-free.
Proof (by Ogden’s lemma)
Suppose that L is CF. Let n be the constant in Ogden’s lemma.
Choose z = b^n c^(n+n!) d^(n+2n!).
Let the positions of the b’s be distinguished and z = uvwxy.
v and x together have at least one b.
vwx has at most n b’s.
If either v or x contains two different kinds of symbols, then
u v^2 w x^2 y is not in L.
If each of v and x contains only one kind of symbol, then one
of v and x must be in b+.
1. If x is in c* or d*, then v must be in b+.
Assume that x is in c* and let |v| = s, i.e., v = b^s.
Then 1 ≦ s ≦ n, so s | n!.
Let t = n!/s. Choose i = 2t+1.
By Ogden’s lemma, z’ = u v^(2t+1) w x^(2t+1) y is in L. But v^(2t+1) = b^(s+2st) = b^(s+2n!), and
uwy has (n – s) b’s, so z’ has (n – s) + s + 2n! = n+2n! b’s, which equals
the number of d’s. Hence z’ is not in L. Contradiction.
Assume that x is in d* and let |v| = s, i.e., v = b^s.
Let t = n!/s. Choose i = t+1. By Ogden’s lemma, z’ = u v^(t+1) w x^(t+1) y is in L. But
v^(t+1) = b^(s+st) = b^(s+n!), and uwy has (n – s) b’s.
Then z’ has (n – s) + s + n! = n+n! b’s, which equals the number of c’s, so z’ is not in L.
Contradiction.
2. If x is in b+, then v must be in b*, and w is in b*.
Let |vx| = s, i.e., vx = b^s.
Then 1 ≦ s ≦ n, so s | n!.
Let t = n!/s. Choose i = 2t+1. By Ogden’s lemma, z’ = u v^(2t+1) w x^(2t+1) y is in L.
But v^(2t+1) and x^(2t+1) together contain s+2st = s+2n! b’s.
The string uwy has (n – s) b’s.
Then z’ has n+2n! b’s, which equals the number of d’s, and hence z’ is not in L.
Contradiction.
End of example 4.
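The arithmetic behind both choices of i can be verified for small n. In the sketch below, s stands for the number of b’s that v and x contribute per copy, as in the proof; the function name is hypothetical.

```python
from math import factorial


def b_count_after_pumping(n: int, s: int, i: int) -> int:
    """Number of b's in u v^i w x^i y when z has n b's and vx has s b's."""
    return (n - s) + s * i


for n in range(1, 8):
    f = factorial(n)
    for s in range(1, n + 1):
        assert f % s == 0                  # s | n! because 1 <= s <= n
        t = f // s
        # i = 2t+1 gives as many b's as there are d's in z
        assert b_count_after_pumping(n, s, 2 * t + 1) == n + 2 * f
        # i = t+1 gives as many b's as there are c's in z
        assert b_count_after_pumping(n, s, t + 1) == n + f
```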
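Example 5 L = {a^m b^r c^s d^t | m = 0 or r = s = t} is not context-free.
Proof (by Ogden’s lemma)
Suppose that L is CF. Let n be the constant in Ogden’s lemma.
Choose z = a^n b^n c^n d^n.
Mark all positions of b’s “distinguished”.
Write z = uvwxy, where vx contains at least one b, and vwx
contains at most n b’s.
Choose i = 2. By Ogden’s lemma, z’ = u v^2 w x^2 y is in L.
If v or x contains two different kinds of symbols, then z’ is not of
the form a^m b^r c^s d^t.
Otherwise z’ still contains a’s and has more than n b’s, while v and x
cannot contain both c’s and d’s, so the number of c’s or the number
of d’s in z’ is still n.
Hence the number of b’s in z’ ≠ the number of c’s in z’, or ≠ the number
of d’s in z’, and z’ is not in L.
Contradiction.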
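For small n the argument can be checked by brute force. The sketch below takes z = a^n b^n c^n d^n, marks the b’s, enumerates every decomposition z = uvwxy in which vx contains a marked position (vwx can never contain more than n marked positions, since z has only n b’s), and tests the pumped string u v^2 w x^2 y. The helper names are hypothetical, and the check only covers the values of n it is run on.

```python
import re


def in_L(s: str) -> bool:
    """Membership in L = {a^m b^r c^s d^t | m = 0 or r = s = t}."""
    m = re.fullmatch(r"(a*)(b*)(c*)(d*)", s)
    if m is None:
        return False
    na, nb, nc, nd = (len(g) for g in m.groups())
    return na == 0 or nb == nc == nd


def ogden_decompositions(z: str):
    """Yield (u, v, w, x, y) with z = uvwxy and at least one b in vx."""
    size = len(z)
    for a in range(size + 1):
        for b in range(a, size + 1):
            for c in range(b, size + 1):
                for d in range(c, size + 1):
                    v, x = z[a:b], z[c:d]
                    if "b" in v + x:       # a distinguished position in vx
                        yield z[:a], v, z[b:c], x, z[d:]


def pumping_twice_leaves_L(n: int) -> bool:
    """True if u v^2 w x^2 y is outside L for every decomposition."""
    z = "a" * n + "b" * n + "c" * n + "d" * n
    return all(
        not in_L(u + v * 2 + w + x * 2 + y)
        for u, v, w, x, y in ogden_decompositions(z)
    )


for n in range(1, 4):
    assert pumping_twice_leaves_L(n)
```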