Transcript Chapter 3

Chapter 3 - 2
Construction Techniques
Section 3.3 Grammars
• A grammar is a finite set of rules, called productions, that
are used to describe the strings of a language.
• Notational Example. The productions take the form α →
β, where α and β are strings over an alphabet of
terminals and nonterminals. Read α → β as “α produces
β,” “α derives β,” or “α is replaced by β.” The following
four expressions are productions for a grammar.
S → aSB
S → Λ
B → bB
B → b.
Alternative short form:
S → aSB | Λ
B → bB | b.
Terms
• Terminals: {a, b}, the alphabet of the
language.
• Nonterminals: {S, B}, the grammar
symbols (uppercase letters), disjoint from
terminals.
• Start symbol: S, a specified nonterminal
alone on the left side of some production.
• Sentential form: any string of terminals
and/or nonterminals.
Derivation
• A derivation is a transformation of sentential forms by means of
productions as follows: If xαy is a sentential form and
α → β is a production, then the replacement of α by β in
xαy to obtain xβy is a derivation step, which we denote
by xαy ➯ xβy.
• Example Derivation: S ➯ aSB ➯ aaSBB ➯ aaBB ➯
aabBB ➯ aabbB ➯ aabbb.
This is a leftmost derivation, where each step replaces
the leftmost nonterminal.
• The symbol ➯+ means one or more steps and ➯* means
zero or more steps. So we could write S ➯+ aabbb or S
➯* aabbb or aSB ➯* aSB, and so on.
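A derivation step can be sketched in a few lines of Python. This is only an illustration under assumptions that are not in the slides: the grammar is stored as a dict from each nonterminal to its right sides, Λ is written as the empty string "", and the names GRAMMAR, leftmost_step, and derive are hypothetical.

```python
# Grammar from the slides: S -> aSB | Λ and B -> bB | b ("" stands for Λ).
GRAMMAR = {"S": ["aSB", ""], "B": ["bB", "b"]}

def leftmost_step(form, rhs, grammar=GRAMMAR):
    """Replace the leftmost nonterminal of `form` by `rhs` (one derivation step)."""
    for i, symbol in enumerate(form):
        if symbol in grammar:
            if rhs not in grammar[symbol]:
                raise ValueError(f"{symbol} -> {rhs!r} is not a production")
            return form[:i] + rhs + form[i + 1:]
    raise ValueError("no nonterminal left to replace")

def derive(start, choices):
    """Apply the chosen right sides one after another, collecting every form."""
    forms = [start]
    for rhs in choices:
        forms.append(leftmost_step(forms[-1], rhs))
    return forms

# The example derivation: S => aSB => aaSBB => aaBB => aabBB => aabbB => aabbb.
steps = derive("S", ["aSB", "aSB", "", "bB", "b", "b"])
print(" => ".join(form or "Λ" for form in steps))
```

Each entry of `choices` names the right side used at that step, so the list of choices is one way to record a leftmost derivation.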
The Language of a Grammar
• The language of a grammar is the set of terminal strings derived
from the start symbol.
• Example. Can we find the language of the grammar S → aSB | Λ
and B → bB | b?
• Solution: Examine some derivations to see if a pattern emerges.
S ➯ Λ
S ➯ aSB ➯ aB ➯ ab
S ➯ aSB ➯ aB ➯ abB ➯ abbB ➯ abbb
S ➯ aSB ➯ aaSBB ➯ aaBB ➯ aabB ➯ aabb
S ➯ aSB ➯ aaSBB ➯ aaBB ➯ aabBB ➯ aabbBB ➯ aabbbB ➯
aabbbb.
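So we have a pretty good idea that the language of the grammar is
{Λ} ∪ {a^n b^(n+k) | n ≥ 1 and k ∊ N}: each of the n copies of B derives
at least one b, and Λ is the only string derived without an a.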
• Quiz (1 minute). Describe the language of the grammar S → a | bcS.
• Solution: {(bc)^n a | n ∊ N}.
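The "examine some derivations" step can also be done mechanically. The sketch below is a hypothetical helper, not part of the slides: it explores leftmost derivations breadth first and collects every terminal string up to a length bound. It assumes single-character symbols, writes Λ as "", and relies on the fact that the grammars here add a terminal or remove a nonterminal at every step, so the search stops. Note that the case n = 0 contributes only Λ, so the comparison set treats it separately.

```python
from collections import deque

def language_up_to(grammar, start, max_len):
    """Terminal strings with at most max_len symbols derivable from start.

    Only leftmost derivations are explored. A sentential form is dropped
    once it holds more than max_len terminals, which is safe because
    productions never remove terminals.
    """
    found, seen, queue = set(), {start}, deque([start])
    while queue:
        form = queue.popleft()
        pos = next((i for i, s in enumerate(form) if s in grammar), None)
        if pos is None:                      # no nonterminal left
            found.add(form)
            continue
        for rhs in grammar[form[pos]]:
            new = form[:pos] + rhs + form[pos + 1:]
            terminals = sum(1 for s in new if s not in grammar)
            if terminals <= max_len and new not in seen:
                seen.add(new)
                queue.append(new)
    return found

G1 = {"S": ["aSB", ""], "B": ["bB", "b"]}   # "" stands for Λ
G2 = {"S": ["a", "bcS"]}                     # the quiz grammar

derived = language_up_to(G1, "S", 6)
# Strings a^n b^(n+k) with n >= 1 and total length at most 6, plus Λ.
pattern = {""} | {"a" * n + "b" * (n + k)
                  for n in range(1, 4) for k in range(5) if 2 * n + k <= 6}
print(sorted(derived, key=lambda w: (len(w), w)))
print(derived == pattern)
print(sorted(language_up_to(G2, "S", 7), key=len))
```

A bounded check like this can only support a guess about the language; it is not a proof.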
Construction of Grammars
• Example. Find a grammar for {a^n b | n ∊ N}.
• Solution: We need to derive any string of
a’s followed by b. The production S → aS
can be used to derive strings of a’s. The
production S → b will stop the derivation
and produce the desired string ending with
b. So a grammar for the language is
S → aS | b.
Quizzes
• Quiz (1 minute). Find a grammar for {b a^n |
n ∊ N}.
• Solution: S → Sa | b.
• Quiz (1 minute). Find a grammar for {(ab)^n
| n ∊ N}.
• Solution: S → Sab | Λ or S → abS | Λ.
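The construction idea (use the recursive production n times, then stop with the other one) can be tried out directly. The helper name unroll is hypothetical, and the sketch assumes a grammar with the single nonterminal S, as in the example and the quizzes above.

```python
def unroll(recursive_rhs, stop_rhs, n, nonterminal="S"):
    """Apply S -> recursive_rhs n times, then S -> stop_rhs once."""
    form = nonterminal
    for _ in range(n):
        # Each sentential form holds exactly one S, so replace that one.
        form = form.replace(nonterminal, recursive_rhs, 1)
    return form.replace(nonterminal, stop_rhs, 1)

print(unroll("aS", "b", 3))    # S -> aS | b
print(unroll("Sa", "b", 3))    # S -> Sa | b
print(unroll("Sab", "", 2))    # S -> Sab | Λ, with "" standing for Λ
print(unroll("abS", "", 2))    # S -> abS | Λ
```

With n = 0 only the stopping production is used, which gives the shortest string of each language.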
Rules for Combining Grammars
• Let L and M be two languages with grammars
that have start symbols A and B, respectively,
and with disjoint sets of nonterminals. Then the
following rules apply.
• L ∪ M has a grammar starting with S → A | B.
• LM has a grammar starting with S → AB.
• L* has a grammar starting with S → AS | Λ.
Example
• Find a grammar for {a^m b^m c^n | m, n ∊ N}.
• Solution: The language is the product LM, where
L = {a^m b^m | m ∊ N} and M = {c^n | n ∊ N}.
So a grammar for LM can be written in terms of
grammars for L and M as follows.
S → AB
A → aAb | Λ
B → cB | Λ.
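The three combining rules and the example above can be sketched as functions on grammars. This is an illustration under assumptions: grammars are dicts from nonterminals to lists of right sides, Λ is "", the new start symbol defaults to S, and the names union, product, closure, and _merge are hypothetical.

```python
def _merge(start, start_rhs, *grammars):
    """Put the new start productions in front of the given grammars."""
    merged = {start: list(start_rhs)}
    for grammar in grammars:
        clash = merged.keys() & grammar.keys()
        if clash:
            raise ValueError(f"nonterminals must be disjoint: {sorted(clash)}")
        merged.update(grammar)
    return merged

def union(g1, a, g2, b, start="S"):
    return _merge(start, [a, b], g1, g2)          # S -> A | B

def product(g1, a, g2, b, start="S"):
    return _merge(start, [a + b], g1, g2)         # S -> AB

def closure(g1, a, start="S"):
    return _merge(start, [a + start, ""], g1)     # S -> AS | Λ

L = {"A": ["aAb", ""]}   # grammar with start symbol A for the language L
M = {"B": ["cB", ""]}    # grammar with start symbol B for the language M

print(product(L, "A", M, "B"))
print(union(L, "A", M, "B"))
print(closure(L, "A"))
```

The disjointness check mirrors the condition stated in the rules: if the two grammars shared a nonterminal, productions of one could leak into derivations of the other.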
Example
• Find a grammar for the set Odd, of odd decimal numerals with no
leading zeros, where, for example, 305 ∊ Odd, but 0305 ∉ Odd.
• Solution: Notice that Odd can be written in the form Odd = (PD*)*O,
where
O = {1, 3, 5, 7, 9}, P = {1, 2, 3, 4, 5, 6, 7, 8, 9}, and D = {0} ∪ P.
Grammars for O, P, and D can be written with start symbols A, B,
and C as:
A → 1 | 3 | 5 | 7 | 9, B → A | 2 | 4 | 6 | 8, and C → B | 0.
Grammars for D* and PD* and (PD*)* can be written with start
symbols E, F, and G as:
E → CE | Λ, F → BE, and G → FG | Λ.
So a grammar for Odd with start symbol S is
S → GA.
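One way to gain some confidence in the grammar for Odd is to sample strings from it and test each one. The sketch below makes assumptions that are not in the slides: the dict encoding, the hypothetical function sample, a fixed random seed, and a regular expression of our own used as the cross-check. Sampling only shows that the generated strings look right; it does not show that every odd numeral can be generated.

```python
import random
import re

ODD = {
    "S": ["GA"],
    "G": ["FG", ""],                    # (PD*)*, with "" standing for Λ
    "F": ["BE"],                        # PD*
    "E": ["CE", ""],                    # D*
    "A": ["1", "3", "5", "7", "9"],     # O
    "B": ["A", "2", "4", "6", "8"],     # P
    "C": ["B", "0"],                    # D
}

def sample(grammar, start, rng):
    """Expand the leftmost nonterminal at random until only terminals remain."""
    out, stack = [], [start]
    while stack:
        symbol = stack.pop()
        if symbol in grammar:
            # Push the right side reversed so its first symbol is popped next.
            stack.extend(reversed(rng.choice(grammar[symbol])))
        else:
            out.append(symbol)
    return "".join(out)

rng = random.Random(0)
samples = [sample(ODD, "S", rng) for _ in range(200)]

# Cross-check (an assumption of this sketch): optional nonzero-led prefix, odd last digit.
odd_numeral = re.compile(r"(?:[1-9][0-9]*)?[13579]")
print(samples[:10])
print(all(odd_numeral.fullmatch(w) for w in samples))
```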
Example
• Find a grammar for the language L defined inductively by,
– Basis: a, b, c ∊ L.
– Induction: If x, y ∊ L then ƒ(x), g(x, y) ∊ L.
• Solution: We can get some idea about L by listing some of its
strings.
a, b, c, ƒ(a), ƒ(b), …, g(a, a), …, g(ƒ(a), ƒ(a)), …, ƒ(g(b, c)), …,
g(ƒ(a), g(b, ƒ(c))), …
So L is the set of all algebraic expressions made up from the letters
a, b, c, and the function symbols ƒ and g of arities 1 and 2,
respectively. A grammar for L can be written as
S → a | b | c | ƒ(S) | g(S, S).
For example, a leftmost derivation of g(ƒ(a), g(b, ƒ(c))) can be
written as
S ➯ g(S, S) ➯ g(ƒ(S), S) ➯ g(ƒ(a), S) ➯ g(ƒ(a), g(S, S))
➯ g(ƒ(a), g(b, S)) ➯ g(ƒ(a), g(b, ƒ(S))) ➯ g(ƒ(a), g(b, ƒ(c))).
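Because each production of this grammar starts with a different terminal, membership in L can be checked by a small recursive function with one case per production. The sketch below is an assumption-laden illustration: the names parse, _expr, and _expect are hypothetical, spaces are ignored, and ƒ is normalized to the ASCII letter f.

```python
def parse(text):
    """Parse a string of L into a nested tuple, or raise ValueError."""
    # The slides write "g(S, S)" with a space and use the symbol ƒ.
    s = text.replace(" ", "").replace("ƒ", "f")
    tree, end = _expr(s, 0)
    if end != len(s):
        raise ValueError(f"unexpected trailing input at position {end}")
    return tree

def _expect(s, i, ch):
    """Check that character ch is at position i and step past it."""
    if not s.startswith(ch, i):
        raise ValueError(f"expected {ch!r} at position {i}")
    return i + 1

def _expr(s, i):
    # S -> a | b | c
    if i < len(s) and s[i] in ("a", "b", "c"):
        return s[i], i + 1
    # S -> f(S)
    if s.startswith("f(", i):
        arg, i = _expr(s, i + 2)
        return ("f", arg), _expect(s, i, ")")
    # S -> g(S, S)
    if s.startswith("g(", i):
        left, i = _expr(s, i + 2)
        right, i = _expr(s, _expect(s, i, ","))
        return ("g", left, right), _expect(s, i, ")")
    raise ValueError(f"no production applies at position {i}")

print(parse("g(ƒ(a), g(b, ƒ(c)))"))
```

The nested tuple mirrors the inductive definition: a letter for the basis case and a tagged tuple for each use of ƒ or g.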
Parse Tree
• A Parse Tree is a tree that represents
a derivation. The root is the start
symbol and the children of a
nonterminal node are the
symbols (terminals, nonterminals, or
Λ) on the right side of the production
used in the derivation step that
replaces that node.
• Example. The tree shown in the
picture is the parse tree for the
following derivation:
S ➯ g(S, S) ➯ g(ƒ(S), S) ➯ g(ƒ(a), S)
➯ g(ƒ(a), b).
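[Figure: parse tree. The root S has the children g, (, S, “,”, S, and ). The first S child has the children ƒ, (, S, and ), and that inner S has the single child a. The second S child has the single child b.]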
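The link between a parse tree and its leftmost derivation can be sketched in code. The representation is an assumption of this sketch: a node is a pair (symbol, children), a leaf has no children, and the names leaf, TREE, tree_yield, and leftmost_derivation are hypothetical. The comma is stored without the following space, so the forms print as g(S,S) rather than g(S, S).

```python
def leaf(symbol):
    """A node without children, used for terminals."""
    return (symbol, [])

# Parse tree for the derivation of g(ƒ(a), b) given above.
TREE = ("S", [
    leaf("g"), leaf("("),
    ("S", [leaf("ƒ"), leaf("("), ("S", [leaf("a")]), leaf(")")]),
    leaf(","),
    ("S", [leaf("b")]),
    leaf(")"),
])

def tree_yield(node):
    """Concatenate the leaves of the tree from left to right."""
    symbol, children = node
    if not children:
        return symbol
    return "".join(tree_yield(child) for child in children)

def leftmost_derivation(tree):
    """List the sentential forms of the leftmost derivation encoded by tree."""
    forms = []
    frontier = [tree]                 # current sentential form, as a list of nodes
    while True:
        forms.append("".join(symbol for symbol, _ in frontier))
        pos = next((i for i, (_, kids) in enumerate(frontier) if kids), None)
        if pos is None:               # only leaves remain
            return forms
        frontier[pos:pos + 1] = frontier[pos][1]   # expand the leftmost inner node

print(tree_yield(TREE))
print(" => ".join(leftmost_derivation(TREE)))
```

Expanding the rightmost inner node instead would give the rightmost derivation of the same tree, which is why one tree corresponds to exactly one derivation of each kind.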
Ambiguous Grammar
• Means there is at least one string with two distinct parse
trees, or equivalently, two distinct leftmost derivations or
two distinct rightmost derivations.
• Example. Is the grammar S → SaS | b ambiguous?
• Solution: Yes. For example, the string babab has two
distinct leftmost derivations:
S ⇒ SaS ⇒ SaSaS ⇒ baSaS ⇒ babaS ⇒ babab.
S ⇒ SaS ⇒ baS ⇒ baSaS ⇒ babaS ⇒ babab.
The parse trees for the derivations are pictured in the
next slide.
parse trees
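[Figure: the two parse trees for babab. In the first tree, the root S has the children S, a, S; the left S child expands to S a S with each of those S nodes deriving b, and the right S child derives b. In the second tree, the root S has the children S, a, S; the left S child derives b, and the right S child expands to S a S with each of those S nodes deriving b.]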
Quiz (2 minutes)
• Show that the grammar S → abS | Sab | c
is ambiguous.
• Solution: The string abcab has two distinct
leftmost derivations:
S ⇒ abS ⇒ abSab ⇒ abcab
S ⇒ Sab ⇒ abSab ⇒ abcab.
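Ambiguity of a particular string can be checked by counting its parse trees. The sketch below is one possible way to do that, not a method taken from the slides: the name count_parse_trees is hypothetical, Λ is written "", and the recursion is only guaranteed to stop under the assumption that every right side other than Λ contains at least one terminal, which holds for the two grammars above.

```python
from functools import lru_cache

def count_parse_trees(grammar, start, word):
    """Count the parse trees of word.

    Assumes every right side other than "" (Λ) contains a terminal.
    """
    @lru_cache(maxsize=None)
    def ways(symbols, i, j):
        # Number of ways the symbol string `symbols` derives word[i:j].
        if not symbols:
            return 1 if i == j else 0
        head, rest = symbols[0], symbols[1:]
        if head not in grammar:                    # terminal: must match word[i]
            if i < j and word[i] == head:
                return ways(rest, i + 1, j)
            return 0
        total = 0
        for k in range(i, j + 1):                  # head derives word[i:k]
            right = ways(rest, k, j)               # check the rest first
            if right:
                left = sum(ways(rhs, i, k) for rhs in grammar[head])
                total += left * right
        return total

    return ways(start, 0, len(word))

print(count_parse_trees({"S": ["SaS", "b"]}, "S", "babab"))
print(count_parse_trees({"S": ["abS", "Sab", "c"]}, "S", "abcab"))
```

A count of 2 or more for some string shows the grammar is ambiguous; a count of 1 for the strings tried does not by itself show that a grammar is unambiguous.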
Unambiguous Grammar
• Sometimes one can find a grammar that is not
ambiguous for the language of an ambiguous
grammar.
• Example. The previous example showed
S → SaS | b is ambiguous. The language of the
grammar is {b, bab, babab, …}. Another
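grammar for the language is S → baS | b. It is
unambiguous because S produces either baS or
b, which can’t derive the same string: every string
derived from baS has length at least 3, while b has
length 1. So each step of a derivation is determined
by the string being derived.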
Example
• The previous quiz showed S → c | abS | Sab is
ambiguous. Its language is
{(ab)^m c (ab)^n | m, n ∈ N}.
Another grammar for the language is
S → abS | cT and T → abT | Λ.
• It is unambiguous because S produces either
abS or cT, which can’t derive the same string.
• Similarly, T produces either abT or Λ, which can’t
derive the same string.
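The two pairs of grammars can be compared by counting leftmost derivations for every short string. This is a bounded check under assumptions: the name derivation_counts is hypothetical, Λ is written "", and the search stops only because each production here adds a terminal or removes a nonterminal. It supports the claims for short strings only and does not replace the argument above.

```python
from collections import Counter

def derivation_counts(grammar, start, max_len):
    """Map each terminal string of length <= max_len to its number of leftmost derivations."""
    counts = Counter()
    stack = [start]
    while stack:
        form = stack.pop()
        pos = next((i for i, s in enumerate(form) if s in grammar), None)
        if pos is None:                            # a finished derivation
            counts[form] += 1
            continue
        for rhs in grammar[form[pos]]:
            new = form[:pos] + rhs + form[pos + 1:]
            if sum(1 for s in new if s not in grammar) <= max_len:
                stack.append(new)
    return counts

amb1 = derivation_counts({"S": ["SaS", "b"]}, "S", 7)
unamb1 = derivation_counts({"S": ["baS", "b"]}, "S", 7)
amb2 = derivation_counts({"S": ["c", "abS", "Sab"]}, "S", 5)
unamb2 = derivation_counts({"S": ["abS", "cT"], "T": ["abT", ""]}, "S", 5)

print(set(amb1) == set(unamb1), set(amb2) == set(unamb2))   # same short strings?
print(max(unamb1.values()), max(unamb2.values()))           # most derivations per string
print(amb1["babab"], amb2["abcab"])
```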
The End of Chapter 3 - 2