Chapter 12: Context-Free Languages and Pushdown Automata

Section 12.4 Context-Free Language Topics
James L. Hein - Discrete
Structures, Logic, and
1
Remove Λ-productions
Algorithm. Remove Λ-productions from
grammars for languages without Λ.
1. Find nonterminals that derive Λ.
2. For each production A → w construct all
productions A → w’ where w’ is obtained from
w by removing one or more occurrences of the
nonterminals from Step 1.
3. Combine the original productions with those of
step 2 and eliminate any Λ-productions.
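The three steps above can be sketched in Python. This is a minimal sketch under assumptions that are not in the slides: a grammar is a dict from single-character nonterminals to sets of right sides, right sides are strings, and the empty string "" stands for Λ. The function names are illustrative.

```python
from itertools import product

def nullable_nonterminals(grammar):
    """Step 1: find the nonterminals that derive the empty string."""
    nullable = set()
    changed = True
    while changed:
        changed = False
        for head, bodies in grammar.items():
            if head in nullable:
                continue
            # A body derives "" when every symbol in it is nullable
            # (all() is True for the empty body "" itself).
            if any(all(sym in nullable for sym in body) for body in bodies):
                nullable.add(head)
                changed = True
    return nullable

def remove_lambda_productions(grammar):
    """Steps 2 and 3: add the shortened right sides, then drop the empty ones."""
    nullable = nullable_nonterminals(grammar)
    result = {}
    for head, bodies in grammar.items():
        new_bodies = set()
        for body in bodies:
            # Each nullable symbol may be kept or dropped; other symbols are kept.
            options = [(sym, "") if sym in nullable else (sym,) for sym in body]
            for choice in product(*options):
                new_bodies.add("".join(choice))
        new_bodies.discard("")  # Step 3: eliminate the Λ-production
        result[head] = new_bodies
    return result

# Hypothetical usage on the grammar S → ABc, A → aA | Λ, B → bB | Λ.
example = {"S": {"ABc"}, "A": {"aA", ""}, "B": {"bB", ""}}
result = remove_lambda_productions(example)
# result["S"] == {"ABc", "Bc", "Ac", "c"}
```

Using sets for the right sides means duplicate productions created in Step 2 collapse automatically.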
Example
Remove Λ-productions from the grammar
S → ABc
A → aA | Λ
B → bB | Λ.
Solution.
• Step 1: The nonterminals A and B derive Λ.
• Step 2: From the production S → ABc we construct S → Bc | Ac | c.
From the production A → aA we construct A → a.
From the production B → bB we construct B → b.
• Step 3:
S → ABc | Bc | Ac | c
A → aA | a
B → bB | b.
Quiz
Remove Λ-productions from
S → ABc | Ab | c
A → ABa | Λ
B → Bbc | Λ.
Solution.
S → ABc | Ab | c | Bc | Ac | b
A → ABa | Ba | Aa | a
B → Bbc | bc.
Chomsky Normal Form
Productions have one of the following forms
• A → b (b a terminal)
• A → BC
• S → Λ (if Λ is in the language).
Advantages: Parse trees are binary, which makes them easy to represent.
Any string of length n > 0 is derived in exactly 2n – 1 steps
(n – 1 steps of the form A → BC and n steps of the form A → b).
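The three allowed forms can be checked mechanically. The following is a minimal sketch, assuming a grammar is a dict from single-character nonterminals to sets of right-side strings, with "" standing for Λ and every symbol that is not a dict key treated as a terminal. The function name is illustrative.

```python
def is_chomsky_normal_form(grammar, start="S"):
    """Return True if every production is A -> b, A -> BC, or start -> Λ."""
    nonterminals = set(grammar)
    for head, bodies in grammar.items():
        for body in bodies:
            if body == "":
                # Only the start symbol may have a Λ-production.
                if head != start:
                    return False
            elif len(body) == 1:
                # A single symbol must be a terminal (no unit productions).
                if body in nonterminals:
                    return False
            elif len(body) == 2:
                # Two symbols must both be nonterminals.
                if not all(sym in nonterminals for sym in body):
                    return False
            else:
                return False
    return True
```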
Transform context-free grammar to Chomsky normal form
Algorithm. Transform a context-free grammar to Chomsky normal form.
1. Remove A → Λ (if A ≠ S) by the previous algorithm. (If S → Λ is removed,
add it back.)
2. Remove unit productions (i.e., A → B): If A → B or A ➾+ B, then
construct productions A → w where B → w is not a unit production.
Now remove all unit productions.
3. For each production whose right side has two or more symbols,
replace all occurrences of each terminal a with a new nonterminal A
and also add the new production A → a.
4. Replace each production B → C_1…C_n with n > 2 by B → C_1 D,
where D is a new nonterminal with D → C_2…C_n.
Repeat this step until all right sides have length at most two.
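Steps 2 through 4 can be sketched in Python as three small functions. This is a sketch under assumptions that are not in the slides: a grammar is a dict from single-character nonterminals to sets of right-side strings, "" stands for Λ, Step 1 has already been done, the uppercase form of each lifted terminal is not already a nonterminal, and new nonterminals for Step 4 are taken from the unused uppercase letters in alphabetical order.

```python
from string import ascii_uppercase

def remove_unit_productions(grammar):
    """Step 2: give A every non-unit right side reachable through unit productions."""
    nonterminals = set(grammar)

    def is_unit(body):
        return len(body) == 1 and body in nonterminals

    result = {}
    for head in grammar:
        # Collect the nonterminals reachable from head by unit productions only.
        reachable, stack = {head}, [head]
        while stack:
            for body in grammar[stack.pop()]:
                if is_unit(body) and body not in reachable:
                    reachable.add(body)
                    stack.append(body)
        result[head] = {b for nt in reachable for b in grammar[nt] if not is_unit(b)}
    return result

def lift_terminals(grammar):
    """Step 3: in right sides of length >= 2, replace terminal a by A and add A -> a."""
    nonterminals = set(grammar)
    result = {head: set() for head in grammar}
    for head, bodies in grammar.items():
        for body in bodies:
            if len(body) >= 2:
                for sym in body:
                    if sym not in nonterminals:
                        # Assumes sym.upper() is not an existing nonterminal.
                        result.setdefault(sym.upper(), set()).add(sym)
                body = "".join(s if s in nonterminals else s.upper() for s in body)
            result[head].add(body)
    return result

def binarize(grammar):
    """Step 4: split B -> C1...Cn (n > 2) into B -> C1 D and D -> C2...Cn."""
    result = {head: set(bodies) for head, bodies in grammar.items()}
    fresh = (c for c in ascii_uppercase if c not in grammar)
    work = list(result)
    for head in work:  # work grows as new nonterminals are created
        for body in list(result[head]):
            if len(body) > 2:
                new = next(fresh)
                result[head].remove(body)
                result[head].add(body[0] + new)
                result[new] = {body[1:]}
                work.append(new)
    return result

# Hypothetical usage, starting from a grammar whose Λ-productions are already handled.
after_step_1 = {"S": {"aSb", "ab", "D", ""}, "D": {"Dc", "c"}}
cnf = binarize(lift_terminals(remove_unit_productions(after_step_1)))
# cnf["S"] == {"AE", "AB", "DC", "c", ""} and cnf["E"] == {"SB"}
```

Keeping each step as its own function makes it possible to compare the intermediate grammar with a hand calculation after every step.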
Example
Construct a Chomsky normal form for the grammar
S → aSb | D
D → Dc | Λ.
Solution.
Step 1: S → aSb | ab | D | Λ
        D → Dc | c.
Step 2: S → aSb | ab | Dc | c | Λ
        D → Dc | c.
Step 3: S → ASB | AB | DC | c | Λ
        D → DC | c
        A → a
        B → b
        C → c.
Step 4: Replace S → ASB by
        S → AE and E → SB.
Quiz
Construct a Chomsky normal form for the grammar
S → aSbb | T | Λ
T → cT | d.
Solution.
Step 1: No change in Λ-productions.
Step 2: Remove unit production S → T to obtain
        S → aSbb | cT | d | Λ
        T → cT | d.
Step 3: Transform right sides of length at least two into strings of nonterminals.
        S → ASBB | CT | d | Λ
        T → CT | d
        A → a
        B → b
        C → c.
Step 4: Transform right sides into strings of length at most two.
        S → AD | CT | d | Λ
        D → SE
        E → BB
        T → CT | d
        A → a
        B → b
        C → c.
Greibach Normal Form
Productions have one of the following forms
• A → b (b a terminal)
• A → b D_1…D_k (b a terminal, each D_i a nonterminal)
• S → Λ (if Λ is in the language).
Advantage: Any string of length n > 0 is
derived in exactly n steps.
Transform context-free grammar to Greibach normal form
Algorithm (idea). Transform a context-free
grammar to Greibach normal form.
1. Remove all left recursion.
2. Remove Λ-productions. (If S → Λ is
removed, add it back.)
3. Make substitutions to transform the
grammar into the proper form.
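Step 3 can be sketched in Python for grammars where Steps 1 and 2 are already done. This is a sketch under several assumptions that are not in the slides: a grammar is a dict from single-character nonterminals to sets of right-side strings, there is no left recursion (direct or indirect) and no Λ-production, and the caller supplies `names`, a hypothetical dict that maps each terminal to the new nonterminal that should replace it after the first position.

```python
def to_greibach_by_substitution(grammar, names):
    """Step 3 only: substitute for leading nonterminals, then rename inner terminals."""
    nonterminals = set(grammar)
    g = {head: set(bodies) for head, bodies in grammar.items()}

    # Replace a leading nonterminal by each of its right sides until every
    # right side starts with a terminal. Terminates only without left recursion.
    changed = True
    while changed:
        changed = False
        for head in g:
            for body in list(g[head]):
                if body[0] in nonterminals:
                    g[head].remove(body)
                    g[head] |= {alt + body[1:] for alt in g[body[0]]}
                    changed = True

    # After the first symbol, replace each terminal by its new nonterminal.
    used = {}
    for head in g:
        renamed = set()
        for body in g[head]:
            tail = ""
            for sym in body[1:]:
                if sym not in nonterminals:
                    used[sym] = names[sym]
                    sym = names[sym]
                tail += sym
            renamed.add(body[0] + tail)
        g[head] = renamed

    # Add the productions for the new nonterminals that were actually used.
    for terminal, name in used.items():
        g[name] = {terminal}
    return g

# Hypothetical usage on S → AB | Ac | d, A → aA | a, B → Ab | c.
example = {"S": {"AB", "Ac", "d"}, "A": {"aA", "a"}, "B": {"Ab", "c"}}
gnf = to_greibach_by_substitution(example, {"c": "C", "b": "D"})
# gnf["S"] == {"aAB", "aB", "aAC", "aC", "d"}
```

The sketch keeps productions for nonterminals that become unreachable after substitution; removing them would be a separate cleanup pass.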
Example
Put the following grammar into Greibach normal form.
S → AB | Ac | d
A → aA | a
B → Ab | c.
Solution: Steps 1 and 2 are unnecessary for this grammar.
Step 3: Replace A in S → AB | Ac | d with aA | a to obtain
S → aAB | aB | aAc | ac | d.
Replace A in B → Ab | c with aA | a to obtain
B → aAb | ab | c.
Add the new productions C → c and D → b to obtain the proper form:
S → aAB | aB | aAC | aC | d
A → aA | a
B → aAD | aD | c
C → c
D → b.
Quiz
Put the following grammar into Greibach normal form.
S → BaS | B
B → cSd | a.
Solution: Steps 1 and 2 are unnecessary for this grammar.
Step 3: Replace B in S → BaS | B with cSd | a to obtain:
S → cSdaS | aaS | cSd | a.
Add the new productions A → a and D → d to obtain
the proper form:
S → cSDAS | aAS | cSD | a
A → a
D → d
B → cSD | a (not needed in this example, since B is no longer reachable from S).
Properties of Context-Free Languages
Knowing some properties of context-free languages helps us argue,
by way of contradiction (BWOC), that certain languages are not
context-free.
The Pumping Lemma
If L is an infinite context-free language, then any grammar for L must be
recursive, so there must be derivations of the following form, where
u, v, w, x, and y are terminal strings:
S ➾+ uNy
N ➾+ vNx (where v and x are not both Λ)
N ➾+ w.
These derivations lead to derivations like
S ➾+ uNy ➾+ uvNxy ➾+ u v^2 N x^2 y ➾+ u v^k N x^k y ➾+ u v^k w x^k y ∊ L for all k ∊ N.
This is the basis for the Pumping Lemma:
There is an integer m > 0 such that if z ∊ L and | z | ≥ m, then z has the
form z = uvwxy where
1 ≤ | vx | ≤ | vwx | ≤ m and u v^k w x^k y ∊ L for all k ∊ N.
Note: The number m depends on the grammar, as we’ll see in the
following example.
Example
Suppose we have the following grammar for {Λ, bbc} ⋃ {a b c^n d | n ∊ N}.
S → aNd | bbc | Λ
N → Nc | b.
Here are a few derivations:
S ➾ aNd ➾ abd
S ➾ aNd ➾ aNcd ➾ abcd
S ➾ aNd ➾ aNcd ➾ aNccd ➾ abccd
S ➾+ a b c^k d for any k in N.
For this grammar m = 4 can be used in the pumping lemma because
any derivation of a string z with | z | ≥ 4 must use the recursive production N → Nc.
For example, if | z | = 8 and z = abcccccd,
then the pumping lemma factors z = abcccccd = uvwxy where 1 ≤ | vx | ≤ | vwx | ≤ 4 and
u v^k w x^k y ∊ L for all k ∊ N. In this case let u = a, v = Λ, w = b, x = c, and
y = ccccd.
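The factorization in this example can be checked by pumping it for a few values of k. The following is a small sketch in Python; the helper names `in_language` and `pump` are illustrative, and the membership test is written directly from the description of the language rather than from the grammar.

```python
import re

def in_language(z):
    """Membership in {Λ, bbc} ∪ {a b c^n d | n ∊ N}; "" stands for Λ."""
    return z in ("", "bbc") or re.fullmatch(r"abc*d", z) is not None

def pump(u, v, w, x, y, k):
    """Build the string u v^k w x^k y."""
    return u + v * k + w + x * k + y

# The factorization from the example: z = abcccccd = uvwxy.
u, v, w, x, y = "a", "", "b", "c", "ccccd"
pumped = [pump(u, v, w, x, y, k) for k in range(6)]
all_in_language = all(in_language(s) for s in pumped)
```

Checking finitely many values of k does not prove the claim for all k, but it catches a wrong factorization quickly.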
Example
The language L = {a^n b^n c^(n+k) | k, n ∊ N} is not context-free.
Proof: Assume, BWOC, that L is context-free. L is infinite, so the pumping lemma applies.
Choose z = a^m b^m c^m where m is the positive integer from the lemma.
Then z = a^m b^m c^m = uvwxy where 1 ≤ | vx | ≤ | vwx | ≤ m and u v^k w x^k y ∊ L for all k ∊ N.
Observe that neither v nor x can contain distinct letters. For example, if v = …a…b…, then
v^2 = …a…b……a…b…, which can’t appear as a substring of any string in L.
So v and x must be strings of repeated occurrences of a single letter.
Now since | vwx | ≤ m, there are two possible places in a^m b^m c^m where v and x must occur:
(1) v and x occur in a^m b^m.
(2) v and x occur in b^m c^m.
But we obtain the following contradictions because v and x are not both Λ.
(1) Let k = 2 to obtain u v^2 w x^2 y = a^(m+i) b^(m+j) c^m, where i > 0 or j > 0. Either i ≠ j, or i = j > 0 and there are fewer c’s than a’s. So u v^2 w x^2 y ∉ L.
(2) Let k = 0 to obtain uwy = a^m b^(m–i) c^(m–j), where i > 0 or j > 0. Either there are fewer b’s than a’s, or there are fewer c’s than a’s. So uwy ∉ L.
These contradictions imply that L is not context-free. QED.
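For small values of m the case analysis can be confirmed by brute force: enumerate every factorization z = uvwxy allowed by the lemma and check that pumping with k = 0 or k = 2 always leaves the language. This is only a sanity check in Python for a few fixed m, not a replacement for the proof, and the function names are illustrative.

```python
import re

def in_language(z):
    """Membership in {a^n b^n c^(n+k) | k, n ∊ N}: equal a's and b's, at least as many c's."""
    match = re.fullmatch(r"(a*)(b*)(c*)", z)
    if match is None:
        return False
    a, b, c = (len(group) for group in match.groups())
    return a == b and c >= a

def splits(z, m):
    """Yield every factorization z = uvwxy with 1 <= |vx| <= |vwx| <= m."""
    n = len(z)
    for start in range(n):
        for end in range(start + 1, min(start + m, n) + 1):
            for p in range(start, end + 1):
                for q in range(p, end + 1):
                    v, x = z[start:p], z[q:end]
                    if v or x:  # v and x are not both empty
                        yield z[:start], v, z[p:q], x, z[end:]

def survives_pumping(z, m, ks=(0, 2)):
    """True if some factorization keeps u v^k w x^k y in the language for every k in ks."""
    return any(
        all(in_language(u + v * k + w + x * k + y) for k in ks)
        for u, v, w, x, y in splits(z, m)
    )

# Hypothetical usage: no factorization of a^3 b^3 c^3 survives both k = 0 and k = 2.
m = 3
z = "a" * m + "b" * m + "c" * m
has_survivor = survives_pumping(z, m)  # False
```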
Example/Quiz
Prove that the language L = {ss | s ∊ {a, b}*} is not context-free.
Proof: Assume, BWOC, that L is context-free. L is infinite, so the pumping lemma
applies.
Choose z = a^m b^m a^m b^m where m is the positive integer from the lemma.
Then z = a^m b^m a^m b^m = uvwxy where 1 ≤ | vx | ≤ | vwx | ≤ m and u v^k w x^k y ∊ L
for all k ∊ N.
Now since | vwx | ≤ m, there are three possible places in a^m b^m a^m b^m where v
and x must occur:
(1) v and x occur in a^m b^m (on the left of z).
(2) v and x occur in b^m a^m (in the center of z).
(3) v and x occur in a^m b^m (on the right of z).
Notice that v and x can consist only of repetitions of a single letter. For
example, in case (1) suppose v = a^i b^j for some i > 0 and j > 0 and x = b^n for
some n ≥ 0. Then, letting k = 0, we would obtain uwy = a^(m–i) b^(m–j–n) a^m b^m, which
cannot be in L. The argument is similar for the other cases. So v and x must
consist only of repetitions of a single letter.
Example/Quiz Cont’d
We need to find a contradiction in each of the three cases. We’ll do it by
using k = 0.
The lemma tells us that uwy ∊ L. But we obtain the following contradictions
because v and x are not both Λ.
(1) uwy = a^(m–i) b^(m–j) a^m b^m where either i > 0 or j > 0. So uwy ∉ L.
(2) uwy = a^m b^(m–i) a^(m–j) b^m where either i > 0 or j > 0. So uwy ∉ L.
(3) uwy = a^m b^m a^(m–i) b^(m–j) where either i > 0 or j > 0. So uwy ∉ L.
Therefore L is not context-free. QED.
Remark: Be careful that the choice of z is not in a context-free
sublanguage of L. For example, if we chose z = (ab)^m (ab)^m in the
preceding example, we would not get any contradictions.
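The remark can be illustrated with a short Python check. The sketch below fixes m = 4 as an assumed value of the pumping constant and picks one particular factorization of z = (ab)^m (ab)^m, with v = abab and x = Λ; both choices are illustrative and not from the slides. Every pumped string stays in L, so this z yields no contradiction.

```python
def in_language(z):
    """Membership in {ss | s ∊ {a, b}*}."""
    half, odd = divmod(len(z), 2)
    return odd == 0 and z[:half] == z[half:]

def pump(u, v, w, x, y, k):
    """Build the string u v^k w x^k y."""
    return u + v * k + w + x * k + y

m = 4  # assumed pumping constant, for illustration only
z = "ab" * m + "ab" * m

# One factorization with 1 <= |vx| <= |vwx| <= m: pump two copies of "ab" at once.
u, v, w, x, y = "", "abab", "", "", "ab" * (2 * m - 2)
stays_in_language = all(in_language(pump(u, v, w, x, y, k)) for k in range(6))

# For comparison, one factorization of a^m b^m a^m b^m from case (1), with k = 0.
bad = pump("a" * (m - 1), "a", "", "b", "b" * (m - 1) + "a" * m + "b" * m, 0)
```

Pumping v = abab changes the number of copies of ab by an even amount, so the result is always of the form ss with s a power of ab.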
The End of Chapter 12 - 4