Transcript Chapter 5

Chapter 7
Pushdown Automata
Context Free Languages
• A context-free grammar is a simple recursive way of specifying grammar rules by which strings of a language can be generated.
• All regular languages, and some non-regular languages, can be generated by context-free grammars.
Context Free Languages
• Regular languages are represented by regular expressions.
• Context-free languages are represented by context-free grammars.
Context Free Languages
• Regular languages are accepted by deterministic finite automata (DFAs).
• Context-free languages are accepted by pushdown automata, which are nondeterministic finite-state automata with a stack as auxiliary memory.
• Note that deterministic pushdown automata can accept some, but not all, of the context-free languages.
Definition
A context-free grammar (CFG) is a 4-tuple G = (V, T, S, P) where V and T are disjoint sets, S ∈ V, and P is a finite set of rules of the form A → x, where A ∈ V and x ∈ (V ∪ T)*.
V = non-terminals or variables
T = terminals
S = start symbol
P = productions or grammar rules
Example
Let G be the CFG having productions:
S → aSa | bSb | c
Then G will generate the language
L = {xcx^R | x ∈ {a, b}*}
This is the language of odd-length palindromes: palindromes with a single isolated character in the middle.
Memory
• What kind of memory do we need to be able to recognize strings in L, using a single left-to-right pass?
• Example: aaabbcbbaaa
• We need to remember what we saw before the c.
• We could push the first part of the string onto a stack and, when the c is encountered, start popping characters off of the stack and matching them with each character from the center of the string on to the end.
• If everything matches, this string is an odd palindrome.
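To make the idea on this slide concrete, here is a minimal Python sketch of the single-pass, stack-based check (the function name and test strings are illustrative, not from the slides):

```python
def is_odd_palindrome(s):
    """Single left-to-right pass over s, using a list as a stack.

    Accepts strings of the form x c x^R with x over {a, b}.
    """
    stack = []
    i = 0
    # Push everything we see until the center marker c.
    while i < len(s) and s[i] != 'c':
        stack.append(s[i])
        i += 1
    if i == len(s):          # no center marker at all
        return False
    i += 1                   # skip the c
    # Match the second half against the stack, popping as we go.
    while i < len(s):
        if not stack or stack.pop() != s[i]:
            return False
        i += 1
    return not stack         # everything matched, stack is empty


print(is_odd_palindrome("aaabbcbbaaa"))  # True
print(is_odd_palindrome("abcba"))        # True
print(is_odd_palindrome("abcab"))        # False
```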
Counting
• We can use a stack for counting out equal numbers of a's and b's on different sides of a center marker.
• Example: L = {a^n c b^n}, e.g. aaaacbbbb
• Push the a's onto the stack until you see a c, then pop an a off and match it with a b whenever you see a b.
• If we finish processing the string successfully (and there are no more a's on our stack), then the string belongs to L.
Definition 7.1: Pushdown Automaton
A nondeterministic pushdown automaton (NPDA) is a 7-tuple
M = (Q, Σ, Γ, q0, δ, z, F), where
Q is a finite set of states
Σ is the input alphabet (a finite set)
Γ is the stack alphabet (a finite set)
δ : Q × (Σ ∪ {λ}) × Γ → (finite subsets of Q × Γ*) is the transition function
q0 ∈ Q is the start state
z ∈ Γ is the initial stack symbol
F ⊆ Q is the set of accepting states
Production rules
So we can fully specify any NPDA like this:
Q = {q0, q1, q2, q3}
Σ = {a, b}
Γ = {1, #}
q0 is the start state
z = # (the empty-stack marker)
F = {q3}
δ is the transition function:
Production rules
δ(q0, a, #) = {(q1, 1#), (q3, λ)}
δ(q0, λ, #) = {(q3, λ)}
δ(q1, a, 1) = {(q1, 11)}
δ(q1, b, 1) = {(q2, λ)}
δ(q2, b, 1) = {(q2, λ)}
δ(q2, λ, #) = {(q3, λ)}
This PDA is nondeterministic. Why?
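One way to see how such a transition function behaves is to run it. Below is a small, generic NPDA simulator in Python (acceptance by final state), together with the transition function just listed (as a later slide notes, this machine accepts L = {a^n b^n : n ≥ 0} ∪ {a}). The dict encoding, with "" standing for λ, is my own choice and not notation from the slides:

```python
# A minimal NPDA simulator (acceptance by final state).
# delta maps (state, input symbol or "" for a λ-move, stack top)
# to a set of (new state, string to push in place of the popped top).

def accepts(delta, start, stack_start, finals, w, depth=1000):
    def run(state, i, stack, depth):
        if depth == 0:                       # crude guard against λ-loops
            return False
        if i == len(w) and state in finals:  # all input read, accepting state
            return True
        if not stack:
            return False
        top, rest = stack[0], stack[1:]      # top of stack is the leftmost symbol
        moves = []
        if i < len(w):                       # ordinary move: consume one input symbol
            moves += [(i + 1, m) for m in delta.get((state, w[i], top), ())]
        moves += [(i, m) for m in delta.get((state, "", top), ())]   # λ-moves
        return any(run(q, j, push + rest, depth - 1) for j, (q, push) in moves)
    return run(start, 0, stack_start, depth)


# The transition function above, for L = {a^n b^n : n >= 0} ∪ {a}:
delta = {
    ("q0", "a", "#"): {("q1", "1#"), ("q3", "")},
    ("q0", "",  "#"): {("q3", "")},
    ("q1", "a", "1"): {("q1", "11")},
    ("q1", "b", "1"): {("q2", "")},
    ("q2", "b", "1"): {("q2", "")},
    ("q2", "",  "#"): {("q3", "")},
}

for s in ["", "a", "ab", "aabb", "aab", "ba"]:
    print(repr(s), accepts(delta, "q0", "#", {"q3"}, s))
```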
Production rules
Note that in an FSA, each rule told us
that when we were in a given state and
saw a specific character, we moved to
a specified state.
In a PDA, we also need to know what is
on the stack before we can decide what
new state to move to. After moving to
the new state, we also need to decide
what to do with the stack.
Working with a stack:
• You can only access the top element of the stack.
• To access the top element of the stack, you have to POP it off the stack.
• Once the top element of the stack has been POPped, if you want to save it, you need to PUSH it back onto the stack immediately.
• Characters from the input string must be read one character at a time. You cannot back up.
• The current configuration of the machine includes: the current state, the remaining characters left in the input string, and the entire contents of the stack.
L = {a^n b^n : n ≥ 0} ∪ {a}
• In the previous example we had two key transitions: δ(q1, a, 1) = {(q1, 11)}, which adds a 1 to the stack when an a is read, and δ(q1, b, 1) = {(q2, λ)}, which removes a 1 when a b is encountered.
• We also have the rule δ(q0, a, #) = {(q1, 1#), (q3, λ)}, which allows us to transition directly to the accepting state, q3, if we initially see an a.
Instantaneous description
Given the transition function
δ : Q × (Σ ∪ {λ}) × Γ → (finite subsets of Q × Γ*),
a configuration, or instantaneous description, of M is a snapshot of the current status of the PDA. It consists of a triple:
(q, w, u)
where:
q ∈ Q (q is the current state of the control unit),
w ∈ Σ* (w is the remaining unread part of the input string), and
u ∈ Γ* (u is the current stack contents, with the leftmost symbol indicating the top of the stack).
Instantaneous description
To indicate that the application of a single transition rule has caused our PDA to move from one configuration to another, we use the following notation:
(q1, aw, bx) |- (q2, w, yx)
(here the rule applied was (q2, y) ∈ δ(q1, a, b)).
To indicate that we have moved from one configuration to another via the application of several rules, we use:
(q1, aw, bx) |-* (q2, w, yx)
or
(q1, aw, bx) |-M* (q2, w, yx) to indicate a specific PDA M.
Definition 7.2: Acceptance
If M = (Q, Σ, Γ, δ, q0, z, F) is a pushdown automaton and w ∈ Σ*, the string w is accepted by M if:
(q0, w, #) |-M* (qf, λ, u)
for some u ∈ Γ* and some qf ∈ F.
This means that we start at the start state, with the stack empty, and after processing the string w, we end up in an accepting state, with no more characters left to process in the original string. We don't care what is left on the stack.
This is called acceptance by final state.
2 types of acceptance
An alternate type of acceptance is acceptance by
empty stack.
This means that we start at the start state, with the
stack empty, and after processing the string w,
we end up with no more characters left to process
in the original string, and no more characters
(except the empty-stack character) left on the
stack.
2 types of acceptance
The two types of acceptance are equivalent; if we
can build a PDA to accept language L via
acceptance by final state, we can also build a
PDA to accept L via acceptance by empty stack.
Definition 7.2: Acceptance
A language L ⊆ Σ* is said to be accepted by M if L is precisely the set of strings accepted by M. In this case, we say that L = L(M).
Determinism/non-determinism:
• A deterministic PDA has at most one transition for any given combination of state, input symbol, and stack symbol.
• A non-deterministic PDA (NPDA) may have no transition, or several transitions, defined for a particular input symbol/stack symbol pair.
• In an NPDA, there may be several paths to follow to process a given input string. Some of the paths may result in accepting the string. Other paths may end in a non-accepting state. An NPDA can "guess" which path to follow through the machine in order to accept a string.
Example: a^n b c^n
L = {a^n b c^n | n > 0}
[State diagram: q0 loops on a, #/a# and a, a/aa; q0 → q1 on b, a/a; q1 loops on c, a/λ; q1 → q2 on λ, #/#; q2 is the accepting state]
Production rules for a^n b c^n
Rule #  State  Input  Top of Stack  Move(s)
1       q0     a      #             (q0, a#)
2       q0     a      a             (q0, aa)
3       q0     b      a             (q1, a)
4       q1     c      a             (q1, λ)
5       q1     λ      #             (q2, #)
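This small table can also be executed directly: no two rules force a choice while reading input, so an ordinary loop suffices. A minimal Python sketch (names are illustrative, not from the slides):

```python
def accepts_anbcn(w):
    """Directly simulates the 5-rule PDA above for L = {a^n b c^n | n > 0}."""
    state, stack = "q0", ["#"]          # top of stack is the end of the list
    for ch in w:
        top = stack[-1]
        if state == "q0" and ch == "a":                    # rules 1 and 2: push an a
            stack.append("a")
        elif state == "q0" and ch == "b" and top == "a":   # rule 3: move to q1
            state = "q1"
        elif state == "q1" and ch == "c" and top == "a":   # rule 4: pop an a
            stack.pop()
        else:
            return False                # no applicable rule: crash
    # rule 5: free move to the accepting state q2 when only # remains
    return state == "q1" and stack == ["#"]


for s in ["aabcc", "abc", "b", "aabc", "aabccc"]:
    print(s, accepts_anbcn(s))
```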
Example: processing aabcc
Tracing the PDA above on the input aabcc (stack top at the left):
(q0, aabcc, #) |- (q0, abcc, a#)    rule 1
               |- (q0, bcc, aa#)    rule 2
               |- (q1, cc, aa#)     rule 3
               |- (q1, c, a#)       rule 4
               |- (q1, λ, #)        rule 4
               |- (q2, λ, #)        rule 5  (accept)
Example: Odd palindrome
L = {xcx^R | x ∈ {a, b}*}
[State diagram: q0 loops on a, #/a#; b, #/b#; a, a/aa; b, a/ba; a, b/ab; b, b/bb (pushing the first half); q0 → q1 on c, #/#; c, a/a; c, b/b (reading the middle c); q1 loops on a, a/λ and b, b/λ (matching and popping the second half); q1 → q2 on λ, #/#; q2 is the accepting state]
Production rules for Odd palindromes
Rule #  State  Input  Top of Stack  Move(s)
1       q0     a      #             (q0, a#)
2       q0     b      #             (q0, b#)
3       q0     a      a             (q0, aa)
4       q0     b      a             (q0, ba)
5       q0     a      b             (q0, ab)
6       q0     b      b             (q0, bb)
7       q0     c      #             (q1, #)
8       q0     c      a             (q1, a)
9       q0     c      b             (q1, b)
10      q1     a      a             (q1, λ)
11      q1     b      b             (q1, λ)
12      q1     λ      #             (q2, #)
Processing abcba
Rule #       Resulting state  Unread input  Stack
(initially)  q0               abcba         #
1            q0               bcba          a#
4            q0               cba           ba#
9            q1               ba            ba#
11           q1               a             a#
10           q1               -             #
12           q2               -             #    accept
Processing ab
Rule #       Resulting state  Unread input  Stack
(initially)  q0               ab            #
1            q0               b             a#
4            q0               -             ba#
crash
Processing acaa
Rule #       Resulting state  Unread input  Stack
(initially)  q0               acaa          #
1            q0               caa           a#
8            q1               aa            a#
10           q1               a             #
12           q2               a             #
crash
Crashing:
What is happening in the last example? We process the first 3 letters of acaa and are in state q1. We have an a left to process in our input string. We have the empty-stack marker as the top character on our stack. Rule 12 says that if we are in state q1 and have # on the stack, then we can make a free move (a λ-move) to q2, pushing # back onto the stack. So this is legal. So far, the automaton is saying that it would accept aca. But note that we are in state q2 and we still have the last a in our input string left to process, and there are no rules whose precondition involves state q2. On the next move, when we try to process that a, the automaton will crash, rejecting acaa.
Example: Even palindromes
Consider the following context-free language:
L = {ww^R | w ∈ {a, b}*}
This is the language of all even-length palindromes over {a, b}.
Production rules for Even palindromes
Rule #  State  Input  Top of Stack  Move(s)
1       q0     a      #             (q0, a#)
2       q0     b      #             (q0, b#)
3       q0     a      a             (q0, aa)
4       q0     b      a             (q0, ba)
5       q0     a      b             (q0, ab)
6       q0     b      b             (q0, bb)
7       q0     λ      #             (q1, #)
8       q0     λ      a             (q1, a)
9       q0     λ      b             (q1, b)
10      q1     a      a             (q1, λ)
11      q1     b      b             (q1, λ)
12      q1     λ      #             (q2, #)
Example: Even palindromes
This PDA is non-deterministic.
Note moves 7, 8, and 9. Here the PDA is
“guessing” where the middle of the string
occurs. If it guesses correctly (and if the PDA
doesn’t accept any strings that aren’t actually in
the language), this is OK.
Example: Even palindromes
(q0, baab, #) |- (q0, aab, b#)
              |- (q0, ab, ab#)
              |- (q1, ab, ab#)
              |- (q1, b, b#)
              |- (q1, λ, #)
              |- (q2, λ, #)   (accept)
Example: All palindromes
Consider the following context-free language:
L = pal = {x ∈ {a, b}* | x = x^R}
This is the language of all palindromes, both odd and even, over {a, b}.
Production rules for All palindromes
Rule #  State  Input  Top of Stack  Move(s)
1       q0     a      #             (q0, a#), (q1, #)
2       q0     b      #             (q0, b#), (q1, #)
3       q0     a      a             (q0, aa), (q1, a)
4       q0     b      a             (q0, ba), (q1, a)
5       q0     a      b             (q0, ab), (q1, b)
6       q0     b      b             (q0, bb), (q1, b)
7       q0     λ      #             (q1, #)
8       q0     λ      a             (q1, a)
9       q0     λ      b             (q1, b)
10      q1     a      a             (q1, λ)
11      q1     b      b             (q1, λ)
12      q1     λ      #             (q2, #)
Production rules for All palindromes
At each point before we start processing the second half of the string, there are three possibilities:
1. The next input character is still in the first half of the string and needs to be pushed onto the stack to save it.
2. The next input character is the middle symbol of an odd-length string and should be read and thrown away (because we don't need to save it to match it up with anything).
3. The next input character is the first character of the second half of an even-length string.
Production rules for All palindromes
Why is this PDA non-deterministic? Note
the first 6 rules of this NPDA. This PDA is
obviously non-deterministic, because in each of
these rules, there are two moves that may be
chosen.
Production rules for All palindromes
Each move in a PDA has three pre-conditions: the
current state you are in, the next character to be processed
from the input string, and the top character on the stack.
In rule 1, our current state is q0, the next character in
the input string is a, and the top character on the stack is the
empty-stack marker. But there are two possible moves for
this one set of preconditions:
1) move back to state q0 and push a# onto the stack
or
2) move to state q1 and push # onto the stack
Whenever we have multiple moves possible from a given
set of preconditions, we have nondeterminism.
Definition 7.3:
Let M = (Q, Σ, Γ, q0, z, A, δ) be a pushdown automaton. M is deterministic if there is no configuration for which M has a choice of more than one move. In other words, M is deterministic if it satisfies both of the following:
1. For any q ∈ Q, a ∈ Σ ∪ {λ}, and X ∈ Γ, the set δ(q, a, X) has at most one element.
2. For any q ∈ Q and X ∈ Γ, if δ(q, λ, X) ≠ ∅, then δ(q, a, X) = ∅ for every a ∈ Σ.
A language L is a deterministic context-free language if there is a deterministic PDA (DPDA) accepting L.
Definition 7.3:
If M is deterministic, then multiple moves for a single input/stack configuration are not allowed. That is:
• Given a particular stack symbol and input symbol, there cannot exist more than one move with that same stack symbol and input from the same state.
• There may be λ-moves, BUT if a λ-move is defined for stack symbol X, there cannot exist any other move with stack symbol X from the same state.
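These two conditions are easy to check mechanically. Here is a small Python sketch over a transition table encoded as a dict from (state, input-or-λ, stack top) to a set of moves; the encoding matches the simulator sketch given earlier and is my own, not notation from the slides:

```python
def is_deterministic(delta, sigma):
    """Checks the two conditions of Definition 7.3 for a transition table.

    delta maps (state, a, X) -> set of (state, push string), with a == ""
    standing for λ.  sigma is the input alphabet.
    """
    for (q, a, X), moves in delta.items():
        # Condition 1: at most one move per (state, input, stack symbol).
        if len(moves) > 1:
            return False
        # Condition 2: a λ-move and an input-reading move must not coexist
        # for the same state and stack symbol.
        if a == "" and moves and any(delta.get((q, b, X)) for b in sigma):
            return False
    return True


npda = {  # rules 6 and 7 of the NPDA for na(w) > nb(w), shown later
    ("q0", "b", "a"): {("q0", "")},
    ("q0", "",  "a"): {("q1", "a")},
}
print(is_deterministic(npda, {"a", "b"}))   # False: the λ-move clashes with rule 6
```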
Non-determinism
Some PDA’s which are initially described
in a non-deterministic way can also be
described as deterministic PDA’s.
 However, some CFLs are inherently nondeterministic, e.g.:

L = pal = {x  {a, b}* | x = xR} cannot be
accepted by any DPDA.
7/16/2015 5:19 PM
Example:
L = {w ∈ {a, b}* | na(w) > nb(w)}
This is the set of all strings over the alphabet {a, b} in which the number of a's is greater than the number of b's. This language can be accepted by either an NPDA or a DPDA.
Example (NPDA):
L = {w ∈ {a, b}* | na(w) > nb(w)}
Rule #  State  Input  Top of Stack  Move(s)
1       q0     a      #             (q0, a#)
2       q0     b      #             (q0, b#)
3       q0     a      a             (q0, aa)
4       q0     b      b             (q0, bb)
5       q0     a      b             (q0, λ)
6       q0     b      a             (q0, λ)
7       q0     λ      a             (q1, a)
Example (NPDA):
What is happening in this PDA?
We start, as usual, in state q0. If the stack is
empty, we read the first character and push it
onto the stack. Thereafter, if the stack character
matches the input character, we push both
characters onto the stack. If the input character
differs from the stack character, we throw both
away. When we run out of characters in the
input string, then if the stack still has an a on
top, we make a free move to q1 and halt; q1 is the
accepting state.
Example (NPDA):
Why is it non-deterministic?
Rules 6 and 7 both have preconditions of: the starting state is q0 and the stack character is a. But we have two possible moves from here, one of them if the input is a b, and one of them any time we want (a λ-move), including if the input is a b. So we have two different moves allowed under the same preconditions, which means this PDA is non-deterministic.
Example (DPDA):
L = {w ∈ {a, b}* | na(w) > nb(w)}
Rule #  State  Input  Top of Stack  Move(s)
1       q0     a      #             (q1, #)
2       q0     b      #             (q0, b#)
3       q0     a      b             (q0, λ)
4       q0     b      b             (q0, bb)
5       q1     a      #             (q1, a#)
6       q1     b      #             (q0, #)
7       q1     a      a             (q1, aa)
8       q1     b      a             (q1, λ)
Example (DPDA):
What is happening in this PDA?
Here being in state q1 means we have seen more a’s than
b’s. Being in state q0 means we have not seen more
a’s than b’s. We start in state q0.
If we are in state q0 and read a b, we push it onto the
stack. If we are in state q1 and read an a, we push it
onto the stack. Otherwise we don’t push a’s or b’s
onto the stack. Any time we read an a from the input
string and pop a b from the stack, or vice versa, we
throw the pair away and stay in the same state.
When we run out of characters in the input string, then
we halt; q1 is the accepting state.
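Because this PDA is deterministic, its table can be run with a plain loop. Here is a minimal Python sketch of the machine just described (acceptance by final state q1; the names and encoding are illustrative):

```python
def more_as_than_bs(w):
    """Simulates the 8-rule DPDA above: accept iff na(w) > nb(w)."""
    state, stack = "q0", ["#"]           # top of stack is the end of the list
    for ch in w:
        top = stack[-1]
        if state == "q0":                # not more a's than b's so far
            if ch == "a" and top == "#":   state = "q1"          # rule 1
            elif ch == "b" and top == "#": stack.append("b")     # rule 2
            elif ch == "a" and top == "b": stack.pop()           # rule 3
            elif ch == "b" and top == "b": stack.append("b")     # rule 4
        else:                            # state q1: more a's than b's so far
            if ch == "a" and top == "#":   stack.append("a")     # rule 5
            elif ch == "b" and top == "#": state = "q0"          # rule 6
            elif ch == "a" and top == "a": stack.append("a")     # rule 7
            elif ch == "b" and top == "a": stack.pop()           # rule 8
    return state == "q1"                 # q1 is the accepting state


for s in ["a", "ab", "aab", "ba", "baaab", "bbaa"]:
    print(s, more_as_than_bs(s), s.count("a") > s.count("b"))
```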
7.2: PDAs and CFLs:
Theorem 7.1: For any context-free
language L, there exists an NPDA M such
that L = L(M).
7.2: PDAs and CFLs:
Proof:
If L is a context-free language (without λ),
there exists a context-free grammar G that
generates it.
We can always convert a context-free
grammar into Greibach Normal Form.
We can always construct an NPDA which
simulates leftmost derivations in the GNF
grammar.
QED
Greibach Normal Form:
Greibach Normal Form (GNF) for context-free grammars requires the grammar to have only productions of the following form:
A → ax
where a ∈ T and x ∈ V*. That is,
Nonterminal → one Terminal concatenated with a string of 0 or more Nonterminals
Convert the following context-free grammar to GNF:
S → abSb | aa
Greibach Normal Form:
S → abSb | aa
Let's fix S → aa. Get rid of the terminal at the end by changing this to S → aA and creating a new rule, A → a.
Now let's fix S → abSb. Get rid of bSb by replacing the original rule with S → aX and creating a new rule, X → bSb.
Unfortunately, this new rule itself needs fixing. Replace it with X → bSB and create a new rule, B → b.
Greibach Normal Form:
So, starting with this set of production rules:
S → abSb | aa
we now have:
S → aA | aX
X → bSB
A → a
B → b
(other solutions are possible)
7.2: CFG to PDA
To convert a context-free grammar to an equivalent pushdown
automaton:
1. Convert the grammar to Greibach Normal Form (GNF).
2. Write a transition rule for the PDA that pushes S (the Start symbol
in the grammar) onto the stack.
3. For each production rule in the grammar, write an equivalent
transition rule.
4. Write a transition rule that takes the automaton to the accepting
state when you run out of characters in the input string and the stack
is empty.
5. If the empty string is a legitimate string in the language described
by the grammar, write a transition rule that takes the automaton to the
accepting state directly from the start state.
7.2: CFG to PDA
How do you write the transition rules? It's really simple:
1. Every rule in the GNF grammar has the following form:
One variable → one terminal + 0 or more variables
Example: A → bB
7.2: CFG to PDA
2. The left side of each transition rule is the precondition, a triple that specifies what conditions must be true before you can move to the next state. The precondition consists of the current state, the character just read in from the input string, and the symbol just popped off the top of the stack. So write a transition rule that has as its precondition the current state, the terminal from the grammar rule, and the left-hand variable from the grammar rule.
Our grammar rule: A → bB
The left side of the transition rule: δ(q1, b, A)
(What about the B? See the next slide.)
7.2: CFG to PDA
3. The right side of the transition rule is the post-condition. The post-condition consists of the state to move to, and the symbol(s) to push onto the stack. So the post-condition for this transition rule will be the state to move to, and the variable (or variables) on the right-hand side of the grammar rule.
Example: δ(q1, b, A) = {(q1, B)}
If there are no variables on the right-hand side of the grammar rule, don't push anything onto the stack. In the transition rule, put a λ where you would show what symbol to push onto the stack.
Example: A → a   [no variables here]
would be represented in transition rule form as:
δ(q1, a, A) = {(q1, λ)}
7.2: CFG to PDA
How do you know which state to move to? It's really simple:
1. We always start off with this special transition rule:
δ(q0, λ, #) = {(q1, S#)}
This rule says:
a. begin in state q0
b. pop the top of the stack. If it is # (the empty-stack symbol), then
c. take a free move to q1 without reading anything from the input string, push # back onto the stack, and then push S (the Start symbol in the grammar) onto the stack.
7.2: CFG to PDA
2. We always end up with this special transition rule:
δ(q1, λ, #) = {(q2, #)}
This rule says:
a. begin in state q1
b. pop the top of the stack. If it is # (the empty-stack symbol), then
c. take a free move to q2 without reading anything from the input string, and push # back onto the stack.
In order to be in state q1 we previously must have pushed something onto the stack. If we now pop the stack and find the empty-stack symbol, it tells us that we have finished processing the string, so we can move on to state q2.
7.2: CFG to PDA
3. Every other transition rule leaves us in state q1.
7.2: CFG to PDA
Here is a grammar in GNF:
G = (V, T, S, P), where
V = {S, A, B, C},
T = {a, b, c},
S = S,
and P =
S → aA
A → aABC | bB | a
B → b
C → c
Let's convert this grammar to a PDA.
7.2: CFG to PDA
Grammar rule:        PDA transition rule:
(none)               δ(q0, λ, #) = {(q1, S#)}
S → aA               δ(q1, a, S) = {(q1, A)}
A → aABC             δ(q1, a, A) = {(q1, ABC)}
A → bB               δ(q1, b, A) = {(q1, B)}
A → a                δ(q1, a, A) = {(q1, λ)}
B → b                δ(q1, b, B) = {(q1, λ)}
C → c                δ(q1, c, C) = {(q1, λ)}
(none)               δ(q1, λ, #) = {(q2, #)}
So the equivalent PDA can be defined as: M = ({q0, q1, q2}, T, V ∪ {#}, δ, q0, #, {q2}), where δ is the set of transition rules given above.
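The construction is entirely mechanical, so it can be automated. The sketch below (my own illustration, not from the slides) builds the transition table from a GNF grammar given as (variable, right-hand side) pairs, using the same dict encoding as the earlier simulator sketch:

```python
def gnf_to_pda(productions, start="S"):
    """Builds NPDA transitions from a GNF grammar.

    Each production is (A, rhs), where rhs is one terminal followed by
    zero or more variables, e.g. ("A", "aABC").
    Returns a dict (state, input or "" for λ, stack top) -> set of moves.
    """
    delta = {("q0", "", "#"): {("q1", start + "#")},   # push the start symbol
             ("q1", "", "#"): {("q2", "#")}}           # accept once S is used up
    for A, rhs in productions:
        a, variables = rhs[0], rhs[1:]                 # the terminal, then the variables
        delta.setdefault(("q1", a, A), set()).add(("q1", variables))
    return delta


grammar = [("S", "aA"), ("A", "aABC"), ("A", "bB"), ("A", "a"),
           ("B", "b"), ("C", "c")]
for rule, moves in gnf_to_pda(grammar).items():
    print(rule, "->", sorted(moves))
```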
7.2: CFG to PDA
Is this PDA deterministic? Let's group the transition rules together so that all rules with the same precondition are described in a single rule.
1. δ(q0, λ, #) = {(q1, S#)}
2. δ(q1, a, S) = {(q1, A)}
3. δ(q1, a, A) = {(q1, ABC), (q1, λ)}
4. δ(q1, b, A) = {(q1, B)}
5. δ(q1, b, B) = {(q1, λ)}
6. δ(q1, c, C) = {(q1, λ)}
7. δ(q1, λ, #) = {(q2, #)}
Here we see that rule 3 has the same precondition but two different possible post-conditions. Thus this PDA is nondeterministic.
7.2: CFG to PDA
Let's follow the steps that the PDA would go through to process the string aaabc, starting with the initial configuration:
(q0, aaabc, #) |- (q1, aaabc, S#)    rule 1
               |- (q1, aabc, A#)     rule 2
               |- (q1, abc, ABC#)    rule 3, first alternative
               |- (q1, bc, BC#)      rule 3, second alternative
               |- (q1, c, C#)        rule 5
               |- (q1, λ, #)         rule 6
               |- (q2, λ, #)         rule 7
Notice that this corresponds to the following leftmost derivation in the grammar:
S ⇒ aA ⇒ aaABC ⇒ aaaBC ⇒ aaabC ⇒ aaabc
7.2: CFG to PDA
In fact, this is exactly what our set of PDA transition rules does. It carries out a leftmost derivation of any string in the language described by the CFG. After each step, the remaining unprocessed sentential form (the as-yet unprocessed variables) is on the stack, as can be seen by looking at the stack contents in the trace above. This corresponds precisely to the left-to-right sequence of unprocessed variables in each step of the leftmost derivation given above.
7.2: Alternative Approach to Constructing
a PDA from a CFG
Let G = (V, T, S, P) be a context-free grammar. Then there is a pushdown automaton M such that L(M) = L(G).
Can we generate an NPDA from a CFG without converting to GNF first? Yes.
7.2: Alternative Approach to Constructing
a PDA from a CFG
In this approach, the plan is to let the production rules directly reflect the manipulation of the stack implied by the grammar rules. With this method, you do not need to convert to GNF first, but the technique is harder to understand.
The beginning and ending production rules are the same as in the GNF method.
7.2: Alternative Approach to Constructing
a PDA from a CFG
So we will always need to have the following 2 production rules in our PDA:
δ(q0, λ, #) = {(q1, S#)}
and
δ(q1, λ, #) = {(q2, #)}
7.2: Alternative Approach to Constructing
a PDA from a CFG
The other production rules are derived from the grammar rules:
• If you pop the top of the stack and it is a variable, don't read anything from the input string. Push the right side of a grammar rule involving this variable onto the stack.
• If you pop the top of the stack and it is a terminal, read the next character in the input string; it must match the popped terminal. Don't push anything onto the stack.
7.2: Constructing a PDA from a CFG
Given G = (V, T, S, P),
construct M = (Q, Σ, Γ, δ, q0, #, F), with
Q = {q0, q1, q2}
Σ = T
Γ = V ∪ Σ ∪ {#}, where # ∉ V ∪ Σ
F = {q2}
δ(q0, λ, #) = {(q1, S#)}
For each A ∈ V, δ(q1, λ, A) = {(q1, α) | A → α is a production in P}
For each a ∈ Σ, δ(q1, a, a) = {(q1, λ)}
δ(q1, λ, #) = {(q2, #)}
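A minimal Python sketch of this alternative construction, using the same dict encoding as before (the names are illustrative):

```python
def cfg_to_pda(productions, terminals, start="S"):
    """Builds NPDA transitions from an arbitrary CFG, no GNF needed.

    productions is a list of (A, rhs) pairs, with rhs a string over V ∪ T
    (use "" for a λ right-hand side).
    """
    delta = {("q0", "", "#"): {("q1", start + "#")},
             ("q1", "", "#"): {("q2", "#")}}
    for A, rhs in productions:
        # variable on top of the stack: expand it without reading input
        delta.setdefault(("q1", "", A), set()).add(("q1", rhs))
    for a in terminals:
        # terminal on top of the stack: it must match the next input symbol
        delta[("q1", a, a)] = {("q1", "")}
    return delta


grammar = [("S", "a"), ("S", "aS"), ("S", "bSS"), ("S", "SSb"), ("S", "SbS")]
for rule, moves in sorted(cfg_to_pda(grammar, {"a", "b"}).items()):
    print(rule, "->", sorted(moves))
```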
7.2: Constructing a PDA from a CFG
Language:
L = {x ∈ {a, b}* | na(x) > nb(x)}
Context-free grammar:
S → a | aS | bSS | SSb | SbS
7.2: Constructing a PDA from a CFG
S → a | aS | bSS | SSb | SbS
Let M = (Q, Σ, Γ, q0, z, A, δ) be a pushdown automaton as previously described. The production rules will be:
Rule #  State  Input  Top of Stack  Move(s)
1       q0     λ      #             (q1, S#)
2       q1     λ      S             (q1, a), (q1, aS), (q1, bSS), (q1, SSb), (q1, SbS)
3       q1     a      a             (q1, λ)
4       q1     b      b             (q1, λ)
5       q1     λ      #             (q2, #)
7.2: CFG to PDA
Let's follow the steps that the PDA would go through to process the string baaba, starting with the initial configuration:
(q0, baaba, #) |- (q1, baaba, S#)     rule 1
               |- (q1, baaba, bSS#)   rule 2, 3rd alternative
               |- (q1, aaba, SS#)     rule 4
               |- (q1, aaba, aS#)     rule 2, 1st alternative
               |- (q1, aba, S#)       rule 3
               |- (q1, aba, SbS#)     rule 2, 5th alternative
               |- (q1, aba, abS#)     rule 2, 1st alternative
               |- (q1, ba, bS#)       rule 3
               |- (q1, a, S#)         rule 4
               |- (q1, a, a#)         rule 2, 1st alternative
               |- (q1, λ, #)          rule 3
               |- (q2, λ, #)          rule 5
7.2: PDA to CFG
Theorem 7.2: If L = L(M) for some NPDA, then
L is a context-free language.
Proof:
• Convert the NPDA into a particular form (if needed).
• From the NPDA, generate a corresponding context-free grammar, G, where the language generated by G is L(M).
• Since any language generated by a CFG is a context-free language, L must be a CFL.
7.2: PDA to CFG
It is possible to convert any PDA into a CFG. In order to do this, we need to convert our PDA into a form in which:
1. there is just one final state, which is entered iff the stack is empty, and
2. each transition rule must either increase or decrease the stack contents by one symbol.
This means that all transition rules must be of the form:
a. δ(qi, a, A) = (qj, λ), or
b. δ(qi, a, A) = (qj, BC)
7.2: PDA to CFG
• For transition rules that delete a variable from the stack, we will have production rules in the grammar that correspond to:
(qi, A, qj) → a
• For transition rules that add a variable to the stack (replacing A with BC while moving to qj), we will have production rules in the grammar that correspond to:
(qi, A, qk) → a(qj, B, ql)(ql, C, qk)
• The start variable in the grammar will correspond to:
(q0, #, qf)
7.2: PDA to CFG
We will not go into the details of this process, as it is tedious; the grammar rules derived are often somewhat complicated and don't look much like the rules we are used to seeing. Just remember that it can be done.
7.4: Parsing
Starting with a CFG G and a string x in L(G), we would like to be able to parse x, or find a derivation for x.
There are two basic approaches to parsing: top-down parsing and bottom-up parsing.
7.4: Parsing
Remember that Chomsky Normal Form (CNF) requires that every production be one of these two types:
A → BC
A → a
If G is in Chomsky Normal Form, we can bound the length of a derivation. Every rule in a CNF grammar replaces a variable with either two variables or a single terminal. We always start off with a single variable, S. Therefore, every CNF derivation must have 2n − 1 rule applications, where n is the number of characters in the input string.
Parsing:
Example:
S → SA | AA
A → AA | a
Starting with the S symbol, to derive the string aaa we would need 5 rule applications:
S ⇒ SA ⇒ AAA ⇒ aAA ⇒ aaA ⇒ aaa
If we want to automate this process, using a nondeterministic PDA may require following many alternatives; a deterministic PDA (if available for this grammar) is more efficient.
LL(k) grammars
A grammar is an LL(k) grammar if, while trying to generate a specific string, we can determine the (unique) correct production rule to apply, given the current character from the input string plus a "look-ahead" of the next k−1 characters.
A simple example is the following:
S → aSb | ab
LL(k) grammars
S → aSb | ab
Assume that we want to generate the string ab. We look at the first character, which is an a, plus a look-ahead of one more (a b), for a total of 2 characters. We MUST use the second rule to produce this string.
LL(k) grammars
S → aSb | ab
Now assume that we want to generate the string aabb. We look at our current character, the first symbol (an a), plus one more (another a), and we immediately know that we must use the first rule.
But we still have more letters to produce, so we make the second character our current character and look ahead one more character (the first b). Now we have ab, so we know we must use the second rule.
LL(k) grammars
S → aSb | ab
This is an LL(2) grammar. All LL(k) grammars are deterministic context-free grammars, but not all deterministic context-free grammars are LL(k) grammars. LL(k) grammars are often used to define programming languages. If you take Compilers, you will study this in more depth.
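Here is a small Python sketch of that look-ahead at work for S → aSb | ab: at each step the parser has exactly one applicable rule, chosen from the current character plus one character of look-ahead (the function names are my own illustration):

```python
def parse_S(w, i=0):
    """LL(2) parse of S -> aSb | ab; returns the index just past this S."""
    if w[i:i + 1] != "a":
        raise ValueError("expected 'a' at position %d" % i)
    look = w[i + 1:i + 2]            # one symbol of look-ahead beyond the a
    if look == "b":                  # choose S -> ab
        return i + 2
    if look == "a":                  # choose S -> aSb
        j = parse_S(w, i + 1)        # parse the nested S
        if w[j:j + 1] != "b":
            raise ValueError("expected 'b' at position %d" % j)
        return j + 1
    raise ValueError("expected 'a' or 'b' at position %d" % (i + 1))


def in_language(w):
    try:
        return parse_S(w) == len(w)
    except ValueError:
        return False


for s in ["ab", "aabb", "aaabbb", "aab", "abb", ""]:
    print(repr(s), in_language(s))
```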
Top Down Parsing
S → T$
T → [T]T | λ
This is the language of balanced strings of square brackets. The $ is a special end-marker added to the end of each string.
This CFG is non-deterministic, since there are two rules for T, and it is not in CNF.
We can convert this to a DCFG by using look-ahead.
Top Down Parsing
Here is the derivation of []$ (the corresponding leftmost derivation, S ⇒ T$ ⇒ [T]T$ ⇒ []T$ ⇒ []$, is built up alongside):
(q0, []$, #) |- (q1, []$, S#)
             |- (q1, []$, T$#)
             |- (q1, []$, [T]T$#)
             |- (q1, ]$, T]T$#)
             |- (q1, ]$, ]T$#)
             |- (q1, $, T$#)
             |- (q1, $, $#)
             |- (q1, λ, #)
             |- (q2, λ, #)
Top Down Parsing
Top-down parsing involves finding the left-hand side of a production rule on top of the stack and replacing it with the right-hand side.
In a way, the PDA is saving information so that it can backtrack if, during parsing, it finds that it has made the wrong choice of how to process a string.
Left recursion
Example:
T → T[T]
Here the first symbol on the right side is the same as the variable on the left side. With left recursion, a top-down parser can keep expanding T forever without consuming any input, so the PDA never crashes and never gets a chance to backtrack.
There is an easy method for eliminating left recursion.
Recursive Descent
• LL(k) grammars perform a left-to-right scan, generating a leftmost derivation with a look-ahead of k characters.
• Recursive descent means that the parser contains a collection of mutually recursive procedures corresponding to the variables in the grammar. LL(1) grammars permit recursive-descent parsing.
• Recursive descent is deterministic.
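As an illustration of recursive descent, here is a minimal Python recognizer for the bracket grammar used above (S → T$, T → [T]T | λ): one procedure per variable, with a single symbol of look-ahead deciding which rule for T to use, so the grammar is LL(1). The names are my own.

```python
def parse(w):
    """Recursive-descent recognizer for  S -> T$,  T -> [T]T | λ."""
    pos = 0

    def peek():
        return w[pos] if pos < len(w) else None

    def expect(ch):
        nonlocal pos
        if peek() != ch:
            raise ValueError("expected %r at position %d" % (ch, pos))
        pos += 1

    def T():
        # one symbol of look-ahead: only '[' starts the rule T -> [T]T;
        # anything else means we take T -> λ and match nothing
        if peek() == "[":
            expect("[")
            T()
            expect("]")
            T()

    def S():
        T()
        expect("$")

    try:
        S()
        return pos == len(w)
    except ValueError:
        return False


for s in ["$", "[]$", "[[]][]$", "[]]$", "[[$"]:
    print(repr(s), parse(s))
```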
Bottom Up Parsers
• Input symbols are read in and pushed ("shifted") onto the stack until the top of the stack matches the right-hand side of a production rule; then that string is popped off the stack ("reduced") and replaced by the variable on the left side of that production rule.
• Bottom-up parsers trace out a rightmost derivation (in reverse).
• Bottom-up parsers can be deterministic under some conditions.
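To make the shift/reduce idea concrete, here is a naive Python sketch for the balanced-bracket language, using the λ-free grammar S → SS | [S] | [] (an assumed grammar chosen for illustration, not one from the slides). Real bottom-up parsers drive the reductions from a parse table; the greedy loop below merely happens to be safe for this particular grammar.

```python
def shift_reduce(w):
    """Naive shift-reduce recognizer for S -> SS | [S] | []."""
    stack = []
    for ch in w:
        stack.append(ch)                          # shift the next input symbol
        reduced = True
        while reduced:                            # reduce as long as possible
            reduced = False
            if stack[-2:] == ["[", "]"]:          # S -> []
                stack[-2:] = ["S"]
                reduced = True
            elif stack[-3:] == ["[", "S", "]"]:   # S -> [S]
                stack[-3:] = ["S"]
                reduced = True
            elif stack[-2:] == ["S", "S"]:        # S -> SS
                stack[-2:] = ["S"]
                reduced = True
    return stack == ["S"]                         # accept iff everything reduced to S


for s in ["[]", "[[]]", "[][]", "[[][]]", "[]][", "[[]"]:
    print(repr(s), shift_reduce(s))
```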