Pushdown Automata
Chapter 12
Recognizing Context-Free Languages
Two notions of recognition:
(1) Say yes or no, just like with FSMs
(2) Say yes or no, AND
if yes, describe the structure
Example: a + b * c
Just Recognizing
We need a device similar to an FSM except that it
needs more power.
The insight: Precisely what it needs is a stack, which
gives it an unlimited amount of memory with a
restricted structure.
Example: Bal (the balanced parentheses language)
(((()))
Definition of a Pushdown Automaton
M = (K, Σ, Γ, Δ, s, A), where:
K is a finite set of states
Σ is the input alphabet
Γ is the stack alphabet
s ∈ K is the initial state
A ⊆ K is the set of accepting states, and
Δ is the transition relation. It is a finite subset of
(K × (Σ ∪ {ε}) × Γ*) × (K × Γ*)
In the first triple: state, input symbol or ε, string of symbols to pop from top of stack.
In the second pair: state, string of symbols to push on top of stack.
Definition of a Pushdown Automaton
A configuration of M is an element of K × Σ* × Γ*.
The initial configuration of M is (s, w, ε).
Manipulating the Stack
A stack holding c on top, then a, then b will be written as cab.
If c1c2…cn is pushed onto that stack, c1 ends up on top and the result is written c1c2…cncab.
Yields
Let c be any element of Σ ∪ {ε},
Let γ1, γ2 and γ be any elements of Γ*, and
Let w be any element of Σ*.
Then:
(q1, cw, γ1γ) |-M (q2, w, γ2γ) iff ((q1, c, γ1), (q2, γ2)) ∈ Δ.
Let |-M* be the reflexive, transitive closure of |-M.
C1 yields configuration C2 iff C1 |-M* C2
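The one-step yields relation can be sketched in Python as a function from a configuration to the set of configurations it yields. This is only an illustrative sketch: the encoding (strings for stacks, written top-first, with "" standing for ε) and the names `step` and `sample_delta` are assumptions, not part of the slides.

```python
from typing import List, Set, Tuple

# A transition is ((state, input_symbol, pop_string), (new_state, push_string)).
# "" stands for epsilon; stack strings are written with the top symbol first.
Transition = Tuple[Tuple[str, str, str], Tuple[str, str]]
Config = Tuple[str, str, str]  # (state, unread input, stack)


def step(delta: List[Transition], config: Config) -> Set[Config]:
    """Return every configuration that `config` yields in exactly one move."""
    state, unread, stack = config
    successors = set()
    for (q1, c, pop), (q2, push) in delta:
        if q1 != state:
            continue
        if not unread.startswith(c):    # c == "" matches any unread input
            continue
        if not stack.startswith(pop):   # pop == "" matches any stack
            continue
        successors.add((q2, unread[len(c):], push + stack[len(pop):]))
    return successors


# Sample data: two transitions for balanced parentheses, single state "s".
sample_delta = [
    (("s", "(", ""), ("s", "(")),   # read "(", pop nothing, push "("
    (("s", ")", "("), ("s", "")),   # read ")", pop "(", push nothing
]
```

Applying `step` repeatedly, starting from (s, w, ε), enumerates the configurations reachable under |-M*.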
Computations
A computation by M is a finite sequence of configurations
C0, C1, …, Cn for some n ≥ 0 such that:
● C0 is an initial configuration,
● Cn is of the form (q, ε, γ), for some state q ∈ KM and
some string γ in Γ*, and
● C0 |-M C1 |-M C2 |-M … |-M Cn.
Nondeterminism
If M is in some configuration (q1, s, γ) it is possible that:
● Δ contains exactly one transition that matches.
● Δ contains more than one transition that matches.
● Δ contains no transition that matches.
Accepting
A computation C of M is an accepting computation iff:
● C = (s, w, ε) |-M* (q, ε, ε), and
● q ∈ A.
M accepts a string w iff at least one of its computations accepts.
Other paths may:
● Read all the input and halt in a nonaccepting state,
● Read all the input and halt in an accepting state with the stack not
empty,
● Loop forever and never finish reading the input, or
● Reach a dead end where no more input can be read.
The language accepted by M, denoted L(M), is the set of all strings
accepted by M.
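Because M accepts if at least one computation accepts, a simulator has to explore all paths. The sketch below does this with a breadth-first search over configurations. The function name `accepts`, the `max_configs` budget, and the sample machine (a guess at a nondeterministic machine for even-length palindromes) are assumptions made for illustration.

```python
from collections import deque


def accepts(delta, start, accepting, w, max_configs=10_000):
    """Search all computations of a PDA on input w, breadth first.

    Returns True if some computation reaches (q, "", "") with q accepting.
    Returns False if the reachable configurations are exhausted or the
    (hypothetical) max_configs budget runs out, so False is not a proof of
    rejection for a machine that can loop forever.
    """
    initial = (start, w, "")
    seen = {initial}
    queue = deque([initial])
    while queue and len(seen) <= max_configs:
        state, unread, stack = queue.popleft()
        if unread == "" and stack == "" and state in accepting:
            return True
        for (q1, c, pop), (q2, push) in delta:
            if q1 == state and unread.startswith(c) and stack.startswith(pop):
                nxt = (q2, unread[len(c):], push + stack[len(pop):])
                if nxt not in seen:
                    seen.add(nxt)
                    queue.append(nxt)
    return False


# Assumed sample machine: push in state s, guess the middle with an
# epsilon-move, then match and pop in state f.
pal_delta = [
    (("s", "a", ""), ("s", "a")),
    (("s", "b", ""), ("s", "b")),
    (("s", "", ""), ("f", "")),
    (("f", "a", "a"), ("f", "")),
    (("f", "b", "b"), ("f", "")),
]
```

For example, `accepts(pal_delta, "s", {"f"}, "abba")` explores the paths that guess the middle too early or too late and still finds the one path that accepts.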
Rejecting
A computation C of M is a rejecting computation iff:
● C = (s, w, ε) |-M* (q, w′, α),
● C is not an accepting computation, and
● M has no moves that it can make from (q, w′, α).
M rejects a string w iff all of its computations reject.
So note that it is possible that, on input w, M neither
accepts nor rejects.
A PDA for Balanced Parentheses
M = (K, Σ, Γ, Δ, s, A), where:
K = {s}          the states
Σ = {(, )}       the input alphabet
Γ = {(}          the stack alphabet
A = {s}
Δ contains:
((s, (, ε**), (s, ( ))
((s, ), ( ), (s, ε))
**Important: This does not mean that the stack is empty.
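Since this machine has one state and its transitions never compete, it can be simulated directly with an explicit stack. The following is a minimal sketch; the function name `is_balanced` is an assumption.

```python
def is_balanced(w: str) -> bool:
    """Deterministic simulation of the one-state balanced-parentheses PDA."""
    stack = []
    for ch in w:
        if ch == "(":
            stack.append("(")   # read "(", pop nothing, push "("
        elif ch == ")":
            if not stack:
                return False    # no transition applies: dead end
            stack.pop()         # read ")", pop "(", push nothing
        else:
            return False        # symbol outside the input alphabet
    # Accept only if all input is read and the stack is empty.
    return not stack
```

Note that the first transition pops ε, so it applies whatever is on the stack, which is the point of the "Important" remark above.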
A PDA for AnBn = {a^n b^n : n ≥ 0}
A PDA for {wcw^R : w ∈ {a, b}*}
M = (K, Σ, Γ, Δ, s, A), where:
K = {s, f}       the states
Σ = {a, b, c}    the input alphabet
Γ = {a, b}       the stack alphabet
A = {f}          the accepting states
Δ contains:
((s, a, ε), (s, a))
((s, b, ε), (s, b))
((s, c, ε), (f, ε))
((f, a, a), (f, ε))
((f, b, b), (f, ε))
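The five transitions can be run as a table-driven simulation that also records the configurations visited. This sketch relies on the fact that at most one transition of this particular machine applies at each step; the names `DELTA` and `run` are assumptions.

```python
EPS = ""  # the empty string plays the role of epsilon

# ((state, input, pop), (new state, push)); stacks are written top-first.
DELTA = [
    (("s", "a", EPS), ("s", "a")),
    (("s", "b", EPS), ("s", "b")),
    (("s", "c", EPS), ("f", EPS)),
    (("f", "a", "a"), ("f", EPS)),
    (("f", "b", "b"), ("f", EPS)),
]


def run(w):
    """Return (accepted, trace) where trace lists the configurations visited."""
    state, unread, stack = "s", w, ""
    trace = [(state, unread, stack)]
    while unread:
        moves = [
            (q2, pop, push)
            for (q1, c, pop), (q2, push) in DELTA
            if q1 == state and c == unread[0] and stack.startswith(pop)
        ]
        if not moves:
            return False, trace  # dead end: no transition matches
        q2, pop, push = moves[0]
        state, unread, stack = q2, unread[1:], push + stack[len(pop):]
        trace.append((state, unread, stack))
    return state == "f" and stack == "", trace
```

On the input abcba the trace shows a and b being pushed in state s, the switch to f on c, and the two matching pops.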
A PDA for {a^n b^2n : n ≥ 0}
Exploiting Nondeterminism
A PDA M is deterministic iff:
● M contains no pairs of transitions that compete with each other, and
● Whenever M is in an accepting configuration it has no available moves.
But many useful PDAs are not deterministic.
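One way to read "compete" is that two transitions could both apply in the same configuration. Under that reading (an assumption, as are the names `compete` and `competing_pairs`), the first condition can be checked mechanically. The sketch does not check the second condition about accepting configurations.

```python
from itertools import combinations


def compete(t1, t2):
    """True if both transitions could apply in some single configuration."""
    (q1, c1, pop1), _ = t1
    (q2, c2, pop2), _ = t2
    same_state = q1 == q2
    # Inputs overlap if they are equal or either one reads nothing.
    input_overlap = c1 == c2 or c1 == "" or c2 == ""
    # Pops overlap if one pop string is a prefix of the other.
    stack_overlap = pop1.startswith(pop2) or pop2.startswith(pop1)
    return t1 != t2 and same_state and input_overlap and stack_overlap


def competing_pairs(delta):
    """List every unordered pair of competing transitions in delta."""
    return [(a, b) for a, b in combinations(delta, 2) if compete(a, b)]


# Sample data: a machine whose middle is marked by c, and an assumed variant
# that guesses the middle with an epsilon-move instead.
marked = [
    (("s", "a", ""), ("s", "a")),
    (("s", "b", ""), ("s", "b")),
    (("s", "c", ""), ("f", "")),
    (("f", "a", "a"), ("f", "")),
    (("f", "b", "b"), ("f", "")),
]
guessing = [t for t in marked if t[0][1] != "c"] + [(("s", "", ""), ("f", ""))]
```

In the sample data the machine with the explicit c marker has no competing pairs, while replacing the c-move with an ε-move creates two.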
A PDA for PalEven = {ww^R : w ∈ {a, b}*}
S → ε
S → aSa
S → bSb
A PDA:
A PDA for {w ∈ {a, b}* : #a(w) = #b(w)}
More on Nondeterminism
Accepting Mismatches
L = {a^m b^n : m ≠ n; m, n > 0}
Start with the case where n = m:
[Diagram: states 1 and 2 with transitions a/ε/a and b/a/ε]
● If stack and input are empty, halt and reject.
● If input is empty but stack is not (m > n) (accept):
● If stack is empty but input is not (m < n) (accept):
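More on Nondeterminism
Accepting Mismatches
L = {a^m b^n : m ≠ n; m, n > 0}
[Diagram: states 1 and 2 with transitions a/ε/a and b/a/ε]
● If input is empty but stack is not (m > n) (accept):
[Diagram: states 1, 2 and 3 with transitions a/ε/a, b/a/ε and ε/a/ε; state 3 clears the stack]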
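More on Nondeterminism
Accepting Mismatches
L = {a^m b^n : m ≠ n; m, n > 0}
[Diagram: states 1 and 2 with transitions a/ε/a and b/a/ε]
● If stack is empty but input is not (m < n) (accept):
[Diagram: states 1, 2 and 4 with transitions a/ε/a, b/a/ε and b/ε/ε; state 4 clears the input]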
Putting It Together
L = {a^m b^n : m ≠ n; m, n > 0}
● Jumping to the input clearing state 4:
Need to detect bottom of stack.
● Jumping to the stack clearing state 3:
Need to detect end of input.
The Power of Nondeterminism
Consider AnBnCn = {a^n b^n c^n : n ≥ 0}.
PDA for it?
Now consider L = ¬AnBnCn. L is the union of two
languages:
1. {w ∈ {a, b, c}* : the letters are out of order}, and
2. {a^i b^j c^k : i, j, k ≥ 0 and (i ≠ j or j ≠ k)} (in other words,
unequal numbers of a’s, b’s, and c’s).
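The claim that the complement of AnBnCn is exactly the union of these two languages can be checked by brute force on short strings. This is a sanity check rather than a proof, and the function names are assumptions.

```python
import re
from itertools import product


def in_anbncn(w):
    """Direct membership test for {a^n b^n c^n : n >= 0}."""
    n = len(w) // 3
    return len(w) % 3 == 0 and w == "a" * n + "b" * n + "c" * n


def out_of_order(w):
    """Language 1: the letters are not of the form a*b*c*."""
    return re.fullmatch(r"a*b*c*", w) is None


def unequal_counts(w):
    """Language 2: a^i b^j c^k with i != j or j != k."""
    m = re.fullmatch(r"(a*)(b*)(c*)", w)
    if m is None:
        return False
    i, j, k = (len(g) for g in m.groups())
    return i != j or j != k


def in_complement(w):
    """Union of the two languages."""
    return out_of_order(w) or unequal_counts(w)


def check_up_to(max_len):
    """Compare the union with the complement on all strings up to max_len."""
    return all(
        in_complement("".join(t)) == (not in_anbncn("".join(t)))
        for n in range(max_len + 1)
        for t in product("abc", repeat=n)
    )
```

Calling `check_up_to(6)` compares the two definitions on every string over {a, b, c} of length at most 6.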
A PDA for L = ¬AnBnCn
Are the Context-Free Languages Closed Under Complement?
¬AnBnCn is context free.
If the CF languages were closed under complement,
then
¬(¬AnBnCn) = AnBnCn
would also be context-free.
But we will prove that it is not.
L = {a^n b^m c^p : n, m, p ≥ 0 and n ≠ m or m ≠ p}
S → NC          /* n ≠ m, then arbitrary c's
S → QP          /* arbitrary a's, then p ≠ m
N → A           /* more a's than b's
N → B           /* more b's than a's
A → a
A → aA
A → aAb
B → b
B → Bb
B → aBb
C → ε | cC      /* add any number of c's
P → B'          /* more b's than c's
P → C'          /* more c's than b's
B' → b
B' → bB'
B' → bB'c
C' → c | C'c
C' → C'c
C' → bC'c
Q → ε | aQ      /* prefix with any number of a's
Reducing Nondeterminism
● Jumping to the input clearing state 4:
Need to detect bottom of stack, so push # onto the
stack before we start.
● Jumping to the stack clearing state 3:
Need to detect end of input. Add to L a termination
character (e.g., $)
Reducing Nondeterminism
● Jumping to the input clearing state 4:
Reducing Nondeterminism
● Jumping to the stack clearing state 3:
More on PDAs
A PDA for {ww^R : w ∈ {a, b}*}:
What about a PDA to accept {ww : w ∈ {a, b}*}?
PDAs and Context-Free Grammars
Theorem: The class of languages accepted by PDAs is
exactly the class of context-free languages.
Recall: context-free languages are languages that
can be defined with context-free grammars.
Restate theorem:
Can describe with context-free grammar ⇔ Can accept by PDA
Going One Way
Lemma: Each context-free language is accepted by
some PDA.
Proof (by construction):
The idea: Let the stack do the work.
Two approaches:
• Top down
• Bottom up
Top Down
The idea: Let the stack keep track of expectations.
Example: Arithmetic expressions
(1) E → E + T      (q, ε, E), (q, E+T)
(2) E → T          (q, ε, E), (q, T)
(3) T → T * F      (q, ε, T), (q, T*F)
(4) T → F          (q, ε, T), (q, F)
(5) F → (E)        (q, ε, F), (q, (E))
(6) F → id         (q, ε, F), (q, id)
(7) (q, id, id), (q, ε)
(8) (q, (, ( ), (q, ε)
(9) (q, ), ) ), (q, ε)
(10) (q, +, +), (q, ε)
(11) (q, *, *), (q, ε)
A Top-Down Parser
The outline of M is:
M = ({p, q}, Σ, V, Δ, p, {q}), where Δ contains:
● The start-up transition ((p, ε, ε), (q, S)).
● For each rule X → s1s2…sn in R, the transition:
((q, ε, X), (q, s1s2…sn)).
● For each character c ∈ Σ, the transition:
((q, c, c), (q, ε)).
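The three kinds of transitions can be generated mechanically from a grammar. Below is a minimal sketch of that construction; the function name `top_down_pda` and the encoding (single-character symbols, "" for ε, stacks written top-first) are assumptions. The sample grammar is S → ε | B | aSa, B → ε | bB.

```python
def top_down_pda(rules, terminals, start="S"):
    """Build the transition list of a top-down PDA.

    rules:     list of (X, rhs) pairs, rhs a string of single-character
               symbols, "" for an epsilon rule
    terminals: iterable of terminal characters
    """
    # Start-up transition: push the start symbol and move to q.
    delta = [(("p", "", ""), ("q", start))]
    # One transition per rule: replace X on top of the stack by its rhs.
    for lhs, rhs in rules:
        delta.append((("q", "", lhs), ("q", rhs)))
    # One transition per terminal: match input against the top of the stack.
    for c in terminals:
        delta.append((("q", c, c), ("q", "")))
    return delta


sample_rules = [("S", ""), ("S", "B"), ("S", "aSa"), ("B", ""), ("B", "bB")]
sample_delta = top_down_pda(sample_rules, "ab")
```

For this five-rule grammar the result has eight transitions: one start-up, five rule expansions, and two terminal matches.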
Example of the Construction
L = {a^n b* a^n}
(1) S → ε
(2) S → B
(3) S → aSa
(4) B → ε
(5) B → bB
input = a a b b a a

0 (p, ε, ε), (q, S)
1 (q, ε, S), (q, ε)
2 (q, ε, S), (q, B)
3 (q, ε, S), (q, aSa)
4 (q, ε, B), (q, ε)
5 (q, ε, B), (q, bB)
6 (q, a, a), (q, ε)
7 (q, b, b), (q, ε)

trans   state   unread input    stack
        p       a a b b a a     ε
0       q       a a b b a a     S
3       q       a a b b a a     aSa
6       q       a b b a a       Sa
3       q       a b b a a       aSaa
6       q       b b a a         Saa
2       q       b b a a         Baa
5       q       b b a a         bBaa
7       q       b a a           Baa
5       q       b a a           bBaa
7       q       a a             Baa
4       q       a a             aa
6       q       a               a
6       q       ε               ε
Another Example
L = {a^n b^m c^p d^q : m + n = p + q}
(1) S → aSd
(2) S → T
(3) S → U
(4) T → aTc
(5) T → V
(6) U → bUd
(7) U → V
(8) V → bVc
(9) V → ε
input = a a b c d d
Another Example
L = {a^n b^m c^p d^q : m + n = p + q}
                 0 (p, ε, ε), (q, S)
(1) S → aSd      1 (q, ε, S), (q, aSd)
(2) S → T        2 (q, ε, S), (q, T)
(3) S → U        3 (q, ε, S), (q, U)
(4) T → aTc      4 (q, ε, T), (q, aTc)
(5) T → V        5 (q, ε, T), (q, V)
(6) U → bUd      6 (q, ε, U), (q, bUd)
(7) U → V        7 (q, ε, U), (q, V)
(8) V → bVc      8 (q, ε, V), (q, bVc)
(9) V → ε        9 (q, ε, V), (q, ε)
                 10 (q, a, a), (q, ε)
                 11 (q, b, b), (q, ε)
                 12 (q, c, c), (q, ε)
                 13 (q, d, d), (q, ε)
input = a a b c d d
trans   state   unread input    stack
The Other Way to Build a PDA - Directly
L = {a^n b^m c^p d^q : m + n = p + q}
(1) S → aSd
(2) S → T
(3) S → U
(4) T → aTc
(5) T → V
(6) U → bUd
(7) U → V
(8) V → bVc
(9) V → ε
input = a a b c d d
The Other Way to Build a PDA - Directly
L = {a^n b^m c^p d^q : m + n = p + q}
[Diagram: a four-state PDA (states 1 to 4) with transitions labeled a/ε/a, b/ε/a, c/a/ε and d/a/ε]
input = a a b c d d
Notice Nondeterminism
Machines constructed with the algorithm are often nondeterministic,
even when they needn't be. This happens even with trivial
languages.
Example: AnBn = {a^n b^n : n ≥ 0}
A grammar for AnBn is:
[1] S → aSb
[2] S → ε
A PDA M for AnBn is:
(0) ((p, ε, ε), (q, S))
(1) ((q, ε, S), (q, aSb))
(2) ((q, ε, S), (q, ε))
(3) ((q, a, a), (q, ε))
(4) ((q, b, b), (q, ε))
But transitions 1 and 2 make M nondeterministic.
A directly constructed machine for AnBn:
Bottom-Up
The idea: Let the stack keep track of what has been found.
(1) E → E + T
(2) E → T
(3) T → T * F
(4) T → F
(5) F → (E)
(6) F → id
Reduce Transitions:
(1) (p, ε, T + E), (p, E)
(2) (p, ε, T), (p, E)
(3) (p, ε, F * T), (p, T)
(4) (p, ε, F), (p, T)
(5) (p, ε, )E( ), (p, F)
(6) (p, ε, id), (p, F)
Shift Transitions:
(7) (p, id, ε), (p, id)
(8) (p, (, ε), (p, ()
(9) (p, ), ε), (p, ))
(10) (p, +, ε), (p, +)
(11) (p, *, ε), (p, *)
A Bottom-Up Parser
The outline of M is:
M = ({p, q}, Σ, V, Δ, p, {q}), where Δ contains:
● The shift transitions: ((p, c, ε), (p, c)), for each c ∈ Σ.
● The reduce transitions: ((p, ε, (s1s2…sn)^R), (p, X)), for each rule
X → s1s2…sn in G.
● The finish up transition: ((p, ε, S), (q, ε)).
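The bottom-up outline can be sketched the same way. Because the expression grammar has the multi-character symbol id, this sketch represents stack strings as tuples of symbols (top first) and uses `None` for an ε input; those choices and the name `bottom_up_pda` are assumptions.

```python
def bottom_up_pda(rules, terminals, start):
    """Build the transition list of a bottom-up (shift-reduce) PDA.

    rules:     list of (X, rhs) pairs, rhs a tuple of grammar symbols
    terminals: iterable of terminal symbols
    start:     the start symbol of the grammar
    """
    delta = []
    # Shift: read a terminal and push it.
    for c in terminals:
        delta.append((("p", c, ()), ("p", (c,))))
    # Reduce: the right-hand side sits on the stack reversed; replace it by X.
    for lhs, rhs in rules:
        delta.append((("p", None, tuple(reversed(rhs))), ("p", (lhs,))))
    # Finish up: pop the start symbol and move to the accepting state.
    delta.append((("p", None, (start,)), ("q", ())))
    return delta


expr_rules = [
    ("E", ("E", "+", "T")),
    ("E", ("T",)),
    ("T", ("T", "*", "F")),
    ("T", ("F",)),
    ("F", ("(", "E", ")")),
    ("F", ("id",)),
]
expr_delta = bottom_up_pda(expr_rules, ["id", "(", ")", "+", "*"], "E")
```

The reversal in the reduce transitions reflects that the last symbol shifted is on top of the stack, so E + T appears as T + E when read from the top.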
Going The Other Way
Lemma: If a language is accepted by a pushdown automaton M, it is
context-free (i.e., it can be described by a context-free grammar).
Proof (by construction):
Step 1: Convert M to restricted normal form:
● M has a start state s′ that does nothing except push a special
symbol # onto the stack and then transfer to a state s from which
the rest of the computation begins. There must be no transitions
back to s′.
● M has a single accepting state a. All transitions into a pop # and
read no input.
● Every transition in M, except the one from s′, pops exactly one
symbol from the stack.
Converting to Restricted Normal Form
Example:
{wcw^R : w ∈ {a, b}*}
Add s′ and a:
Pop no more than one symbol:
M in Restricted Normal Form
Pop exactly one symbol:
Replace [1], [2] and [3] with:
[1] ((s, a, #), (s, a#)),
    ((s, a, a), (s, aa)),
    ((s, a, b), (s, ab)),
[2] ((s, b, #), (s, b#)),
    ((s, b, a), (s, ba)),
    ((s, b, b), (s, bb)),
[3] ((s, c, #), (f, #)),
    ((s, c, a), (f, a)),
    ((s, c, b), (f, b))
Must have one transition for everything that could have been on
the top of the stack so it can be popped and then pushed back on.
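The replacement step can be sketched as a function that expands every transition popping nothing into one transition per possible top-of-stack symbol. The sketch assumes the input transitions already pop at most one symbol; the name `pop_exactly_one` is an assumption.

```python
def pop_exactly_one(delta, stack_symbols):
    """Rewrite transitions so that each one pops exactly one symbol.

    delta:         list of ((state, input, pop), (new_state, push)),
                   where every pop string has length 0 or 1
    stack_symbols: every symbol that could be on top of the stack,
                   including the bottom marker
    """
    result = []
    for (q1, c, pop), (q2, push) in delta:
        if pop == "":
            # Pop each possible top symbol x and push it straight back
            # underneath whatever the original transition pushed.
            for x in stack_symbols:
                result.append(((q1, c, x), (q2, push + x)))
        else:
            result.append(((q1, c, pop), (q2, push)))
    return result


original = [
    (("s", "a", ""), ("s", "a")),   # [1]
    (("s", "b", ""), ("s", "b")),   # [2]
    (("s", "c", ""), ("f", "")),    # [3]
]
expanded = pop_exactly_one(original, "#ab")
```

With stack symbols #, a and b, each of the three original transitions expands into three, giving the nine listed above.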
Second Step - Creating the Productions
Example:
WcW^R
M=
The basic idea –
simulate a leftmost derivation of M on any input string.
Second Step - Creating the Productions
Example:
abcba
Nondeterminism and Halting
1. There are context-free languages for which no
deterministic PDA exists.
2. It is possible that a PDA may
● not halt,
● not ever finish reading its input.
3. There exists no algorithm to minimize a PDA. It is
undecidable whether a PDA is minimal.
Nondeterminism and Halting
It is possible that a PDA may
● not halt,
● not ever finish reading its input.
Let Σ = {a} and consider M =
L(M) = {a}: (1, a, ε) |- (2, a, a) |- (3, ε, ε)
On any other input except a:
● M will never halt.
● M will never finish reading its input unless its input is ε.
Solutions to the Problem
● For NDFSMs:
● Convert to deterministic, or
● Simulate all paths in parallel.
● For NDPDAs:
● Formal solutions that usually involve changing the
grammar.
● Practical solutions that:
● Preserve the structure of the grammar, but
● Only work on a subset of the CFLs.
Alternative Equivalent Definitions of a
PDA
Accept by accepting state at end of string (i.e., we don't
care about the stack).
From M (in our definition) we build M′ (in this one):
1. Initially, let M′ = M.
2. Create a new start state s′. Add the transition:
((s′, ε, ε), (s, #)).
3. Create a new accepting state qa.
4. For each accepting state a in M do,
4.1 Add the transition ((a, ε, #), (qa, ε)).
5. Make qa the only accepting state in M′.
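The five steps translate almost line for line into code. In this sketch the new state names and the bottom marker are passed as parameters and are assumed not to clash with anything already in the machine; the function name `to_accept_by_state` is also an assumption.

```python
def to_accept_by_state(delta, start, accepting,
                       new_start="s'", new_accept="qa", bottom="#"):
    """Convert a PDA that accepts with an empty stack in an accepting state
    into one that accepts by accepting state alone.

    Returns (new_delta, new_start, new_accepting).
    """
    # Step 1: start from a copy of the original transitions.
    new_delta = list(delta)
    # Step 2: the new start state pushes the bottom marker, then enters start.
    new_delta.append(((new_start, "", ""), (start, bottom)))
    # Steps 3 and 4: from each old accepting state, popping the marker
    # (so the original stack was empty) leads to the new accepting state.
    for a in sorted(accepting):
        new_delta.append(((a, "", bottom), (new_accept, "")))
    # Step 5: the new accepting state is the only accepting state.
    return new_delta, new_start, {new_accept}


# Sample data: the one-state balanced-parentheses machine.
bal_delta = [
    (("s", "(", ""), ("s", "(")),
    (("s", ")", "("), ("s", "")),
]
new_delta, new_start, new_accepting = to_accept_by_state(bal_delta, "s", {"s"})
```

The marker # is what lets the new machine see that the original stack is empty before it moves to qa.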
Example
The balanced parentheses language
What About These?
● FSM plus FIFO queue (instead of stack)?
● FSM plus two stacks?
Comparing Regular and Context-Free Languages

Regular Languages            Context-Free Languages
● regular exprs. or          ● context-free grammars
  regular grammars
● recognize                  ● parse
● = DFSMs                    ● = NDPDAs