What is a First


FSA Lecture 1
Finite State Machines
Creating an Automaton
• Given a language L over an alphabet Σ,
design a deterministic finite automaton
(DFA) M such that L(M) = L.
Example 1
• L1 = { w | w is a string over {0, 1} that contains an even number of 0s
and an odd number of 1s }
• Method:
Define nodes to represent when
a) an even number of 0s and an even number of 1s have been seen in the input
b) an odd number of 0s and an odd number of 1s have been seen in the input
c) an even number of 0s and an odd number of 1s have been seen in the input
d) an odd number of 0s and an even number of 1s have been seen in the input
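Example 1
[Figure: state diagram of the DFA for L1 with states qee, qeo, qoe, and qoo; each state has one outgoing edge labeled 0 and one labeled 1.]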
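The four-node method of Example 1 can be sketched in code. This is a minimal illustration, not the lecture's own implementation: the function name accepts_l1 is hypothetical, and the two booleans together play the role of the four states (both even, both odd, and the two mixed cases).

```python
def accepts_l1(w: str) -> bool:
    """Simulate the four-state parity DFA for L1 on the input string w."""
    zeros_even, ones_even = True, True   # start: nothing read, both counts even
    for symbol in w:
        if symbol == "0":
            zeros_even = not zeros_even  # reading a 0 flips the parity of 0s
        elif symbol == "1":
            ones_even = not ones_even    # reading a 1 flips the parity of 1s
        else:
            raise ValueError(f"symbol {symbol!r} is not in the alphabet {{0, 1}}")
    # Accept exactly in the state "even number of 0s, odd number of 1s".
    return zeros_even and not ones_even
```

Negating the return value gives a recognizer for the complement language of Example 2, since the states and transitions stay the same and only the accepting states change.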
Example 2
• L2 = { w | w is a string over {0, 1} that does not contain an even
number of 0s and an odd number of 1s }
= the complement of L1
Example 2
[Figure: state diagram of the DFA for L2 over the same four states qee, qeo, qoe, and qoo.]
Example 3
• L3 = { w | w is a string over {0, 1} such that |w| ≤ 3}
= {ε, 0, 1, 00, 01, 10, 11, 000, 001, 010, 011, 100, 101, 110, 111}
Example 3
[Figure: state diagram of the DFA for L3 with states q0, q1, q2, q3, and q4; every edge is labeled 0, 1.]
Example 4
• L4 = { w | w is a string over {0, 1} such that w contains the substring 11}
= { w | w = x11y, where x and y are strings over {0, 1}}
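Example 4
[Figure: state diagram of the DFA for L4 with states q0, q1, and q2.]
Transition table (columns are the current state, rows are the input symbol):
       q0   q1   q2
0      q0   q0   q2
1      q1   q2   q2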
Machine M accepts string w
• If there exists a sequence of states r0, r1, …,
rn in Q such that
1) r0 = q0
2) δ(ri, wi+1) = ri+1, for i = 0, …, n-1
3) rn in F
Note: w = w1w2…wn
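The three acceptance conditions can be traced with a small table-driven simulator. The sketch below uses the transition table of Example 4; the names DELTA and run_dfa are hypothetical, and it assumes q0 is the start state and q2 the only accept state, which matches the language L4 but is not stated explicitly in the table.

```python
# Transition table of the Example 4 DFA (strings that contain 11).
# Assumption: q0 is the start state and q2 is the only accept state.
DELTA = {
    ("q0", "0"): "q0", ("q0", "1"): "q1",
    ("q1", "0"): "q0", ("q1", "1"): "q2",
    ("q2", "0"): "q2", ("q2", "1"): "q2",
}

def run_dfa(delta, start, accepting, w):
    """Return (accepted, [r0, ..., rn]) for the input string w."""
    states = [start]                                  # condition 1: r0 = q0
    for symbol in w:
        # condition 2: the next state is delta(current state, next symbol)
        states.append(delta[(states[-1], symbol)])
    return states[-1] in accepting, states            # condition 3: rn in F
```

For instance, run_dfa(DELTA, "q0", {"q2"}, "0110") returns the state sequence together with the accept/reject verdict, so the sequence r0, …, rn from the definition can be inspected directly.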
Regular Languages
• Machine M recognizes language A if
A = {w | M accepts w}
• A language is called regular if some finite
automaton recognizes it.
Regular Operations
• Let A and B be languages.
Union
A ∪ B = { x | x in A or x in B}
Concatenation
A ∘ B = {xy | x in A and y in B}
Star
A* = {x1x2…xk | k ≥ 0 and each xj in A}
Note: ε is always a member of A*.
Regular languages are closed under union
• Let A1 and A2 be regular languages. We want to show
A1 ∪ A2 is a regular language. Since A1 and A2 are regular
languages there exists a finite automaton M1 and there exists
a finite automaton M2 such that M1 recognizes A1 and M2
recognizes A2.
Assume M1 = (Q1, Σ, δ1, q1, F1) and M2 = (Q2, Σ, δ2, q2, F2).
It suffices to create a finite automaton M that recognizes
A1 ∪ A2.
Continued …
• Let a be a symbol in Σ and states r1 in Q1 and r2 in Q2.
Define M = (Q, Σ, δ, q0, F) where
Q = Q1 × Q2 (states)
δ((r1, r2), a) = (δ1(r1, a), δ2(r2, a)) (transition function)
q0 = (q1, q2) (start state)
F = (F1 × Q2) ∪ (Q1 × F2) (final states)
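The product construction above can be written out directly. This is a sketch under assumptions: each DFA is represented as a tuple (states, delta, start, accepting) with delta a dictionary, and the two sample machines EVEN_ZEROS and ENDS_IN_ONE are hypothetical inputs chosen only for illustration.

```python
from itertools import product

def union_dfa(m1, m2, alphabet):
    """Product construction for the union of two DFAs."""
    states1, delta1, start1, accept1 = m1
    states2, delta2, start2, accept2 = m2
    states = set(product(states1, states2))              # Q = Q1 x Q2
    delta = {((r1, r2), a): (delta1[(r1, a)], delta2[(r2, a)])
             for (r1, r2) in states for a in alphabet}   # run both machines in step
    start = (start1, start2)                             # q0 = (q1, q2)
    accepting = {(r1, r2) for (r1, r2) in states
                 if r1 in accept1 or r2 in accept2}      # F = (F1 x Q2) U (Q1 x F2)
    return states, delta, start, accepting

def accepts(m, w):
    """Run a DFA given as (states, delta, start, accepting) on w."""
    _, delta, state, accepting = m
    for a in w:
        state = delta[(state, a)]
    return state in accepting

# Hypothetical example machines over {0, 1}.
EVEN_ZEROS = ({"e", "o"},
              {("e", "0"): "o", ("e", "1"): "e", ("o", "0"): "e", ("o", "1"): "o"},
              "e", {"e"})
ENDS_IN_ONE = ({"n", "y"},
               {("n", "0"): "n", ("n", "1"): "y", ("y", "0"): "n", ("y", "1"): "y"},
               "n", {"y"})
UNION = union_dfa(EVEN_ZEROS, ENDS_IN_ONE, "01")
```

Replacing "or" with "and" in the definition of accepting would give the intersection instead, which is one way to see why the same pair-of-states idea is reused for other closure results.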
Regular languages are closed under
concatenation
• Let A1 and A2 be regular languages. We want to show
A1 ∘ A2 is a regular language. Since A1 and A2 are regular languages
there exists a finite automaton M1 and there exists a finite automaton M2
such that M1 recognizes A1 and M2 recognizes A2.
Assume M1 = (Q1, Σ, δ1, q1, F1) and M2 = (Q2, Σ, δ2, q2, F2). It suffices to
create a finite automaton M that recognizes
A1 ∘ A2. There is a problem since M doesn’t know where to subdivide
the input string into the part accepted by M1 and the remaining part that
will be accepted by M2. We will return to this later.
Non-Deterministic Automaton
• NFAs generalize DFAs.
• In a DFA, each state has exactly one transition for each
symbol in the alphabet.
• In an NFA, at any state there may be zero or more
transitions for a symbol in the alphabet.
• In a DFA, a label on a transition arrow is a symbol in the
alphabet.
• In an NFA, a label on a transition arrow is a symbol in
the alphabet or ε.
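Example
[Figure: state diagram of an NFA with states q1, q2, q3, and q4. Edges: q1 to q1 on 0, 1; q1 to q2 on 1; q2 to q3 on 0, ε; q3 to q4 on 1; q4 to q4 on 0, 1.]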
Input: 010110
[Figure: tree of all computation branches of the example NFA on input 010110, starting from q1 and branching at each nondeterministic choice.]
Non-Deterministic Finite Automaton
• N = (Q, Σ, δ, q0, F)
Q is a finite set of states
Σ is a finite alphabet
δ: Q × (Σ ∪ {ε}) → P(Q)
F ⊆ Q is a set of accept states
Note: P(Q) is the power set of Q = {X | X ⊆ Q}
Machine N accepts string w
• If there exists a sequence of states r0, r1, …,
rn in Q such that
1) r0 = q0
2) ri+1 in δ(ri, wi+1) for i = 0, …, n-1
3) rn in F
Note: w = w1w2…wn, where each wi is a symbol in Σ or ε
Note: δ(ri, wi+1) is a set of states
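One way to decide NFA acceptance is to track the set of all states the machine could be in, following ε edges after every step. The sketch below assumes the example NFA has the edges q1 to q1 on 0, 1; q1 to q2 on 1; q2 to q3 on 0, ε; q3 to q4 on 1; q4 to q4 on 0, 1, with q1 as start state and q4 as accept state. That reading of the diagram, and the names NFA_DELTA, eps_closure, and nfa_accepts, are assumptions made for illustration.

```python
# Assumed transitions of the example NFA; "" stands for an ε label.
# A missing entry means the empty set of next states.
NFA_DELTA = {
    ("q1", "0"): {"q1"}, ("q1", "1"): {"q1", "q2"},
    ("q2", "0"): {"q3"}, ("q2", ""): {"q3"},
    ("q3", "1"): {"q4"},
    ("q4", "0"): {"q4"}, ("q4", "1"): {"q4"},
}

def eps_closure(delta, states):
    """All states reachable from `states` along zero or more ε edges."""
    closure, stack = set(states), list(states)
    while stack:
        q = stack.pop()
        for r in delta.get((q, ""), set()):
            if r not in closure:
                closure.add(r)
                stack.append(r)
    return closure

def nfa_accepts(delta, start, accepting, w):
    """True if some computation branch on w ends in an accept state."""
    current = eps_closure(delta, {start})
    for symbol in w:
        moved = set()
        for q in current:
            moved |= delta.get((q, symbol), set())   # delta(q, symbol) is a set
        current = eps_closure(delta, moved)
    return bool(current & accepting)
```

Under these assumptions the call nfa_accepts(NFA_DELTA, "q1", {"q4"}, "010110") follows every branch of the computation for the input 010110 at once, instead of exploring the branches one by one.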
Are NFAs more powerful than DFAs?
• Every deterministic finite automaton has an
equivalent non-deterministic finite
automaton. (see next slide)
• Every non-deterministic finite automaton has
an equivalent deterministic finite automaton.
Non-deterministic?
[Figure: state diagram of a machine with states q0, q1, and q2 and edges labeled 0, 1, and 0, 1.]
Deterministic interpretation
       q0   q1   q2
0      q0   q0   q2
1      q1   q2   q2
Non-deterministic interpretation
       {q0}   {q1}   {q2}
0      {q0}   {q0}   {q2}
1      {q1}   {q2}   {q2}
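Deterministic Equivalent?
[Figure: state diagram of an NFA with states 1, 2, and 3. Edges: 1 to 2 on b; 1 to 3 on ε; 2 to 2 on a; 2 to 3 on a, b; 3 to 1 on a.]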
DFA from NFA Construction
• Assume no ε edges.
Let N = (Q, Σ, δ, q0, F)
be an NFA that recognizes language A. We construct a DFA called
M = (Q’, Σ, δ’, q0’, F’)
1) Q’ = P(Q)
Example: Q’ = {{}, {1}, {2}, {3}, {1,2}, {1,3}, {2,3}, {1,2,3}}
2) For R in Q’ and a in Σ let
δ’(R,a) = {q in Q | q in δ(r,a) for some r in R}
= ∪ δ(r,a), taken over all r in R
Example: δ’({1,2},b) = δ(1,b) ∪ δ(2,b) = {2} ∪ {3} = {2,3}
Continued …
3) q0’ = {q0}
4) F’ = {R in Q’ | R contains an accept state of N}
Assume ε edges, then we need these modifications.
Let R be a state of M. Define
E(R) = {q | q can be reached from R traveling along 0 or more ε edges}
Modify the transition function:
δ’(R,a) = {q in Q | q in E(δ(r,a)) for some r in R}
= ∪ E(δ(r,a)), taken over all r in R
Examples:
δ’({1,2},b) = E(δ(1,b)) ∪ E(δ(2,b)) = E({2}) ∪ E({3}) = {2,3}
δ’({3},a) = E(δ(3,a)) = E({1}) = {1,3}
Modify the start state:
q0’ = E({q0})
Deterministic Equivalent?
Start state q0’= E({1}) = {1,3}
Final states F’ = {{1}, {1,2}, {1,3}, {1,2,3}}
Deterministic Equivalent
R:         {}   {1}   {2}     {3}     {1,2}   {1,3}   {2,3}     {1,2,3}
δ’(R,a):   {}   {}    {2,3}   {1,3}   {2,3}   {1,3}   {1,2,3}   {1,2,3}
δ’(R,b):   {}   {2}   {3}     {}      {2,3}   {2}     {3}       {2,3}
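The table can be recomputed mechanically with the subset construction. The sketch below is an illustration under assumptions: the NFA transitions in DELTA are read back from the worked values on these slides (for example δ(1,b) = {2} and E({1}) = {1,3}), state 1 is taken as both start and accept state, and all names are hypothetical.

```python
from itertools import combinations

# Assumed NFA of the example: states 1, 2, 3; "" marks an ε edge.
DELTA = {
    (1, "b"): {2}, (1, ""): {3},
    (2, "a"): {2, 3}, (2, "b"): {3},
    (3, "a"): {1},
}
STATES, ALPHABET, START, ACCEPT = {1, 2, 3}, "ab", 1, {1}

def E(states):
    """States reachable from `states` along zero or more ε edges."""
    closure, stack = set(states), list(states)
    while stack:
        q = stack.pop()
        for r in DELTA.get((q, ""), set()):
            if r not in closure:
                closure.add(r)
                stack.append(r)
    return frozenset(closure)

def subset_construction():
    """Build the DFA (Q', delta', q0', F') from the NFA above."""
    subsets = [frozenset(c) for n in range(len(STATES) + 1)
               for c in combinations(sorted(STATES), n)]      # Q' = P(Q)
    delta = {}
    for R in subsets:
        for a in ALPHABET:
            moved = set()
            for r in R:
                moved |= DELTA.get((r, a), set())
            delta[(R, a)] = E(moved)                          # union of E(delta(r, a))
    start = E({START})                                        # q0' = E({q0})
    accepting = {R for R in subsets if R & ACCEPT}            # contains an accept state
    return subsets, delta, start, accepting
```

This version builds all eight subsets, as the slides do. A common variation explores only the subsets reachable from the start state, which often yields a much smaller DFA.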
Final Solution
[Figure: state diagram of the resulting DFA over the eight subset states {}, {1}, {2}, {3}, {1,2}, {1,3}, {2,3}, and {1,2,3}, with edges labeled a and b.]
Regular languages are closed under union
Let A1 and A2 be regular languages. We want to show
A1 ∪ A2 is a regular language. Since A1 and A2 are regular
languages there exists an NFA N1 and there exists an
NFA N2 such that N1 recognizes A1 and N2 recognizes A2.
Assume N1 = (Q1, Σ, δ1, q1, F1) and N2 = (Q2, Σ, δ2, q2,
F2). It suffices to create an NFA N that recognizes A1 ∪ A2.
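Construction of NFA
[Figure: N consists of a new start state q0 with ε edges to q1, the start state of N1, and to q2, the start state of N2.]
N = (Q, Σ, δ, q0, F)
Q = {q0} ∪ Q1 ∪ Q2
F = F1 ∪ F2
δ(q,a) =
δ1(q,a) if q in Q1
δ2(q,a) if q in Q2
{q1, q2} if q = q0 and a = ε
{} if q = q0 and a ≠ ε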
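The union construction amounts to merging the two transition functions and adding one ε entry for the new start state. The following sketch assumes NFAs are tuples (states, delta, start, accepting) with "" standing for ε, and that the two state sets are disjoint and do not already contain the new start state; the machines ONLY_A and ALL_BS are hypothetical examples.

```python
def eps_closure(delta, states):
    """States reachable along zero or more ε edges ("" is the ε label)."""
    closure, stack = set(states), list(states)
    while stack:
        q = stack.pop()
        for r in delta.get((q, ""), set()):
            if r not in closure:
                closure.add(r)
                stack.append(r)
    return closure

def accepts(nfa, w):
    """Subset simulation of an NFA given as (states, delta, start, accepting)."""
    _, delta, start, accepting = nfa
    current = eps_closure(delta, {start})
    for symbol in w:
        moved = set()
        for q in current:
            moved |= delta.get((q, symbol), set())
        current = eps_closure(delta, moved)
    return bool(current & accepting)

def nfa_union(n1, n2, new_start="q0"):
    """New start state with ε edges to both old start states."""
    states1, delta1, start1, accept1 = n1
    states2, delta2, start2, accept2 = n2
    delta = {**delta1, **delta2}                  # delta1 on Q1, delta2 on Q2
    delta[(new_start, "")] = {start1, start2}     # delta(q0, ε) = {q1, q2}
    return {new_start} | states1 | states2, delta, new_start, accept1 | accept2

# Hypothetical inputs: ONLY_A recognizes {a}, ALL_BS recognizes b*.
ONLY_A = ({"p0", "p1"}, {("p0", "a"): {"p1"}}, "p0", {"p1"})
ALL_BS = ({"r0"}, {("r0", "b"): {"r0"}}, "r0", {"r0"})
EITHER = nfa_union(ONLY_A, ALL_BS)
```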
Regular languages are closed under
concatenation
Let A1 and A2 be regular languages. We want to show
A1 ∘ A2 is a regular language. Since A1 and A2 are regular
languages there exists an NFA N1 and there exists an
NFA N2 such that N1 recognizes A1 and N2 recognizes A2.
Assume N1 = (Q1, Σ, δ1, q1, F1) and N2 = (Q2, Σ, δ2, q2,
F2). It suffices to create an NFA N that recognizes A1 ∘ A2.
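Construction of NFA
[Figure: N consists of N1 followed by N2, with an ε edge from each accept state of N1 to q2, the start state of N2.]
N = (Q, Σ, δ, q1, F2)
Q = Q1 ∪ Q2
F = F2
δ(q,a) =
δ1(q,a) if q in Q1 and q not in F1
δ2(q,a) if q in Q2
δ1(q,a) ∪ {q2} if q in F1 and a = ε
δ1(q,a) if q in F1 and a ≠ ε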
Regular languages are closed under the star
operation
Let A be a regular language. We want to show A* is a
regular language. Since A is a regular language there exists
an NFA N1 such that N1 recognizes A.
Assume N1 = (Q1, Σ, δ1, q1, F1). It suffices to create an NFA
N that recognizes A*.
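Construct NFA
[Figure: N consists of a new start state q0, which is also an accept state, with an ε edge to q1, the start state of N1, and an ε edge from each accept state of N1 back to q1.]
N = (Q, Σ, δ, q0, F)
Q = {q0} ∪ Q1
F = F1 ∪ {q0}
δ(q,a) =
δ1(q,a) if q in Q1 and q not in F1
δ1(q,a) if q in F1 and a ≠ ε
δ1(q,a) ∪ {q1} if q in F1 and a = ε
{q1} if q = q0 and a = ε
{} if q = q0 and a ≠ ε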
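The star construction can be sketched in the same style. As before, this is an illustration under assumptions: NFAs are tuples (states, delta, start, accepting), "" stands for ε, the new start state name must not clash with an existing state, and the machine AB (recognizing only the string ab) is a hypothetical input.

```python
def eps_closure(delta, states):
    """States reachable along zero or more ε edges ("" is the ε label)."""
    closure, stack = set(states), list(states)
    while stack:
        q = stack.pop()
        for r in delta.get((q, ""), set()):
            if r not in closure:
                closure.add(r)
                stack.append(r)
    return closure

def accepts(nfa, w):
    """Subset simulation of an NFA given as (states, delta, start, accepting)."""
    _, delta, start, accepting = nfa
    current = eps_closure(delta, {start})
    for symbol in w:
        moved = set()
        for q in current:
            moved |= delta.get((q, symbol), set())
        current = eps_closure(delta, moved)
    return bool(current & accepting)

def nfa_star(n1, new_start="q0"):
    """New accepting start state, plus ε edges from old accept states to q1."""
    states1, delta1, start1, accept1 = n1
    delta = {key: set(targets) for key, targets in delta1.items()}  # copy delta1
    for q in accept1:
        delta.setdefault((q, ""), set()).add(start1)   # delta1(q, ε) ∪ {q1}
    delta[(new_start, "")] = {start1}                  # delta(q0, ε) = {q1}
    return states1 | {new_start}, delta, new_start, accept1 | {new_start}

# Hypothetical input: AB recognizes exactly the string "ab".
AB = ({"p0", "p1", "p2"},
      {("p0", "a"): {"p1"}, ("p1", "b"): {"p2"}},
      "p0", {"p2"})
AB_STAR = nfa_star(AB)
```

The new start state is what lets the machine accept ε. Simply marking the old start state q1 as accepting can be wrong when N1 has edges leading back into q1, which is a design point worth keeping in mind when reading the construction.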

Regular Expressions
• R is a regular expression if R is
1) x for some x in Σ (note: regular expression x represents language {x})
2) ε (empty string) (note: regular expression ε represents language {ε})
3) ∅ (empty set)
4) (R1 ∪ R2) where R1 and R2 are regular expressions
5) (R1 ∘ R2) where R1 and R2 are regular expressions
6) (R1*) where R1 is a regular expression
If R is a regular expression then L(R) is the language of R.
Examples
• 0*0 = {w | w consists of one or more 0s}
• ∅* = {ε}
• 11 ∪ 00 = {11, 00}
• 0Σ*1 = {w | w begins with a 0 and ends in a 1}
• (01)* = {ε, 01, 0101, 010101, 01010101, …}
• 1*0 = {w | w consists of any number of 1s followed by exactly one 0}
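Several of these examples can be checked with Python's standard re module. This is only a sketch: the dictionary EXAMPLES and the helper in_language are hypothetical names, union is written as "|", and Σ over the alphabet {0, 1} is written as the character class [01]. Python patterns also offer features beyond the six rules above, so only the basic operators are used here.

```python
import re

# Slide notation mapped to Python re syntax (assumed alphabet: {0, 1}).
EXAMPLES = {
    "11 ∪ 00": r"11|00",
    "0Σ*1": r"0[01]*1",
    "(01)*": r"(01)*",
    "1*0": r"1*0",
}

def in_language(pattern: str, w: str) -> bool:
    """True if the whole string w matches the regular expression."""
    return re.fullmatch(pattern, w) is not None
```

re.fullmatch is used rather than re.search because membership in L(R) means the entire string is described by R, not just some substring of it.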
Using Regular Expressions
Beginning or End?
Regular Expressions vs. Regular Languages
• A language is regular if and only if some
regular expression describes it.
Part a) If a regular expression describes a language
then it is regular.
Part b) If a language is regular then a regular
expression describes it.
x → NFA that recognizes {x}
ε → NFA that recognizes {ε}
∅ → NFA that recognizes ∅
R1 ∪ R2, R1 ∘ R2, or R1* → Construct a machine the same way we did to show regular
languages are closed under ∪, ∘, or *.
NFA to recognize (0 ∪ 11)*
[Figure: NFA for (0 ∪ 11)* built in two steps from NFAs for 0 and 11; edges are labeled 0, 1, and ε.]
Part b) If a language is regular then a regular
expression describes it.
• Properties of GNFA
1) The start state has transition arrows going to every other
state but no arrows coming in from any other state.
2) There is one accept state, and it has arrows coming in
from every other state but no arrows going to any other
state. The accept state is not the same as the start state.
3) Except for the start and accept states, one arrow goes
from every state to every other state and also from each state
to itself.
4) The label on each edge is a regular expression.
Example GNFA
[Figure: a GNFA with a start state and an accept state whose edges carry the labels aa, ab*, ab ∪ ba, a*, (aa)*, ∅, b*, ab, and b.]
Generalized Non-deterministic Finite Automaton
• GNFA is a 5-tuple (Q, Σ, δ, qstart, qaccept)
δ: (Q – {qaccept}) × (Q – {qstart}) → ℛ (the set of all regular expressions over Σ)
δ(qs, qt) = R means that the arrow from state qs to state qt has the regular expression R as its label.
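Example
[Figure: a two-state DFA (state 1 with a loop on a and an edge to state 2 on b; state 2 with a loop on a, b) is converted to a GNFA by adding qstart and qaccept with ε edges. Removing state 2 leaves the label b(a ∪ b)* from 1 to qaccept; removing state 1 leaves the label a*b(a ∪ b)* from qstart to qaccept.]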
Example
Remove vertex 2:
new(1, qaccept) = old(1, qaccept) ∪ old(1,2) old(2,2)* old(2, qaccept)
= ∅ ∪ b (a ∪ b)* ε
= b(a ∪ b)*
Remove vertex 1:
new(qstart, qaccept) = old(qstart, qaccept) ∪ old(qstart,1) old(1,1)* old(1, qaccept)
= ∅ ∪ ε a* b(a ∪ b)*
= a*b(a ∪ b)*
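The update rule new(i,j) = old(i,j) ∪ old(i,rip) old(rip,rip)* old(rip,j) can be run mechanically. The sketch below is a rough illustration, not the lecture's algorithm verbatim: labels are stored as Python re patterns, None stands for ∅, the empty string stands for ε, and the names GNFA, eliminate, and to_regex as well as the state names "s" and "t" (for qstart and qaccept) are hypothetical.

```python
import re
from itertools import product

EMPTY = None  # the regular expression ∅; the empty string "" stands for ε

def union(r, s):
    if r is EMPTY:
        return s
    if s is EMPTY:
        return r
    return r if r == s else f"(?:{r}|{s})"

def concat(*parts):
    if any(p is EMPTY for p in parts):   # anything concatenated with ∅ is ∅
        return EMPTY
    return "".join(parts)

def star(r):
    return "" if r in (EMPTY, "") else f"(?:{r})*"   # ∅* = ε* = ε

def eliminate(delta, states, rip):
    """Remove state `rip` and relabel the remaining edges."""
    rest = [q for q in states if q != rip]
    loop = star(delta.get((rip, rip)))
    new = {}
    for i, j in product(rest, rest):
        through = concat(delta.get((i, rip)), loop, delta.get((rip, j)))
        label = union(delta.get((i, j)), through)
        if label is not EMPTY:
            new[(i, j)] = label
    return new, rest

def to_regex(delta, states, start, accept, order):
    for rip in order:
        delta, states = eliminate(delta, states, rip)
    return delta.get((start, accept))

# GNFA of the two-state example; missing edges are ∅.
GNFA = {("s", "1"): "", ("1", "1"): "a", ("1", "2"): "b",
        ("2", "2"): "(?:a|b)", ("2", "t"): ""}
REGEX = to_regex(GNFA, ["s", "1", "2", "t"], "s", "t", order=["2", "1"])
```

The resulting pattern carries extra grouping parentheses, so it will not look identical to a*b(a ∪ b)*, but it can be compared against that expression by testing both on all short strings over {a, b}.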
Example 1.36
[Figure: stages (a) through (e) of converting a three-state DFA with states 1, 2, and 3 into a regular expression: a start state s and an accept state a are added, then vertices 1, 2, and 3 are removed in turn.]
Example 1.36 (b to c): remove vertex 1
new(s,2) = old(s,2) ∪ old(s,1) old(1,1)* old(1,2)
= ∅ ∪ ε ∅* a
= a
new(s,3) = old(s,3) ∪ old(s,1) old(1,1)* old(1,3)
= ∅ ∪ ε ∅* b
= b
new(2,2) = old(2,2) ∪ old(2,1) old(1,1)* old(1,2)
= b ∪ a ∅* a
= b ∪ aa
new(3,3) = old(3,3) ∪ old(3,1) old(1,1)* old(1,3)
= ∅ ∪ b ∅* b
= bb
new(2,3) = old(2,3) ∪ old(2,1) old(1,1)* old(1,3)
= ∅ ∪ a ∅* b
= ab
new(3,2) = old(3,2) ∪ old(3,1) old(1,1)* old(1,2)
= a ∪ b ∅* a
= a ∪ ba
Example 1.36 (c to d): remove vertex 2
new(s,a) = old(s,a) ∪ old(s,2) old(2,2)* old(2,a)
= ∅ ∪ a (aa ∪ b)* ε
= a(aa ∪ b)*
new(s,3) = old(s,3) ∪ old(s,2) old(2,2)* old(2,3)
= b ∪ a (aa ∪ b)* ab
= b ∪ a(aa ∪ b)* ab
new(3,a) = old(3,a) ∪ old(3,2) old(2,2)* old(2,a)
= ε ∪ (ba ∪ a) (aa ∪ b)* ε
= (ba ∪ a) (aa ∪ b)* ∪ ε
new(3,3) = old(3,3) ∪ old(3,2) old(2,2)* old(2,3)
= bb ∪ (ba ∪ a) (aa ∪ b)* ab
Example 1.36 (d to e): remove vertex 3
new(s,a) = old(s,a) ∪ old(s,3) old(3,3)* old(3,a)
= a(aa ∪ b)* ∪ (b ∪ a(aa ∪ b)* ab)(bb ∪ (ba ∪ a) (aa ∪ b)* ab)* ((ba ∪ a) (aa ∪ b)* ∪ ε)
Result: the single remaining edge from s to a is labeled
a(aa ∪ b)* ∪ (b ∪ a(aa ∪ b)* ab)(bb ∪ (ba ∪ a) (aa ∪ b)* ab)* ((ba ∪ a) (aa ∪ b)* ∪ ε)
Pumping Lemma
• Purpose: Used to prove a language is not regular.
• What does it say?
All strings in a regular language can be “pumped” if
they are at least as long as the pumping length p.
Suppose xyz represents a string in the language
whose length is at least p. There is a
section of the string (say y) that can be repeated,
i.e., xy^kz, where k ≥ 0, is also a member of the
language.
Pumping Lemma
• If A is a regular language, then there is a number p
(the pumping length) where, if s is any string in A
of length at least p, then s can be divided into three
pieces, s = xyz, satisfying the following conditions:
1) for each k ≥ 0, the string xy^kz is in A
2) |y| > 0
3) |xy| ≤ p
(Note: the form of this theorem is R → S)
Sketch Proof
• Since A is a regular language there exists a DFA
M = (Q, Σ, δ, q1, F) with p states that recognizes A.
• Either A has strings of length at least p or it doesn’t.
Case 1: Suppose no string in A has length at least p.
Then the Pumping Lemma (R → S) is vacuously
true since the antecedent R is false.
Case 2: See next page.
Sketch Proof
• Let s be a string in A of length n, where n is
at least p. Starting in q1, M processes the
string s by visiting n+1 states (namely, r1 = q1,
r2, …, rn+1).
By the Pigeonhole Principle (n+1 pigeons
and p nests), some state must have been
visited more than once (say qx).
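Sketch Proof
• s = s1 s2 s3 … sn is read along the state sequence
q1, r2, r3, …, qx, …, qx, …, rn+1
x is the part of s read from q1 up to the first visit to qx
y is the part of s read between the first and second visits to qx
z is the rest of s, read from qx to rn+1
The first repetition occurs no later than the (p+1)st state visited, so |xy| ≤ p.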
Nonregular Languages
• Consider the language L = {0^n 1^n | n ≥ 0}.
Assume L is a regular language. Because of
this, there exists a DFA with p states that
recognizes L. Consider the string s = 0^p 1^p
from L (choose s carefully). Since its length is at least
p, it follows from the Pumping Lemma that
1) s = xyz and xy^kz in L for each k ≥ 0
2) |y| > 0
3) |xy| ≤ p
Nonregular Languages
• By (3) xy consists of all 0s.
Case 1: |xy| = p
xy = 0^(p-u) 0^u where u > 0 and z = 1^p
Consider xz = 0^(p-u) 1^p. Pumping Lemma says xz in L.
This is a contradiction, since xz has fewer 0s than 1s.
Case 2: |xy| < p
xy = 0^t where t < p and z = 0^(p-t) 1^p
xy = 0^(t-v) 0^v where v > 0
Consider xy^2z = 0^(t-v) 0^v 0^v 0^(p-t) 1^p = 0^(p+v) 1^p. Pumping Lemma says
xy^2z in L. This is a contradiction, since xy^2z has more 0s than 1s.
• Therefore L is not regular.
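The case analysis can be cross-checked by brute force for small values of p. The sketch below enumerates every split s = xyz of s = 0^p 1^p with |y| > 0 and |xy| ≤ p and looks for one that survives pumping. The names in_L and pumpable_splits and the bound max_k are hypothetical. Finding a failing k for a split is conclusive, so an empty result means no split of that particular s can be pumped.

```python
def in_L(w: str) -> bool:
    """Membership in L = {0^n 1^n | n >= 0}."""
    n = len(w) // 2
    return w == "0" * n + "1" * n

def pumpable_splits(s: str, p: int, max_k: int = 3):
    """Splits s = xyz with |y| > 0, |xy| <= p and x y^k z in L for k = 0..max_k."""
    good = []
    for i in range(0, p + 1):            # i = |x|
        for j in range(i + 1, p + 1):    # j = |xy|, so |y| = j - i > 0
            x, y, z = s[:i], s[i:j], s[j:]
            if all(in_L(x + y * k + z) for k in range(max_k + 1)):
                good.append((x, y, z))
    return good
```

For every p tried, pumpable_splits("0" * p + "1" * p, p) comes back empty, which agrees with the two cases of the proof: y lies entirely inside the block of 0s, so pumping changes the number of 0s but not the number of 1s.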
Another s
• What if s = (01)^p was chosen instead?
s = (01)^p = xyz
Regardless of how y is chosen, y can always be
pumped. For example,
if y = (01)^k then it can be pumped.
Are Regular Languages Closed Under Other Operations?
• Closed under union
• Closed under intersection (see page 46)
• Closed under complement (see exercise 1.10)
• Closed under concatenation
• Closed under star
Nonregular Languages
• Consider the language
L2 = {w in Σ* | w has the same number of 0s as 1s}.
Show L2 is not regular.
We know the language A = 0*1* is regular
since it can be represented by a regular
expression. Suppose L2 is regular. Then
A ∩ L2 = {0^n 1^n | n ≥ 0} is regular, because regular languages are closed under intersection. This is a contradiction.
Therefore L2 is not regular.
Minimum Pumping Length
• The minimum pumping length for a regular
language A is the smallest p that is a
pumping length of A.
Minimum Pumping Length
• What is the minimum pumping length for 01*? Answer: 2
p = 1 fails: s = 0 = xyz ⇒ x = ε, y = 0, z = ε is the only split allowed by the Pumping Lemma.
Can’t pump y!
Let p be greater than zero.
s = 01^p = xyz ⇒
x = ε, y = 0, z = 1^p: can’t pump y
x = 0, y = 1, z = 1^(p-1): can pump y, |xy| = 2
x = ε, y = 01, z = 1^(p-1): can’t pump y
Minimum Pumping Length
• What is the minimum pumping length for 11? Answer: 3
s = 11 = xyz ⇒
x = ε, y = 1, z = 1: can’t pump y
x = 1, y = 1, z = ε: can’t pump y
x = ε, y = 11, z = ε: can’t pump y
The minimum pumping length is 3 (vacuously true, since the language has no string of length at least 3).
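Both answers can be sanity-checked by a bounded search. The sketch below is only evidence, not a proof: it tests strings up to a chosen length max_len and pumping exponents up to max_k, so a value it rejects is definitely not a pumping length, while a value it accepts merely passed every test tried. All function names and bounds are hypothetical.

```python
import re
from itertools import product

def can_pump(s, p, in_lang, max_k=4):
    """True if some split s = xyz with |y| > 0, |xy| <= p survives k = 0..max_k."""
    limit = min(p, len(s))
    for i in range(0, limit + 1):            # i = |x|
        for j in range(i + 1, limit + 1):    # j = |xy|
            x, y, z = s[:i], s[i:j], s[j:]
            if all(in_lang(x + y * k + z) for k in range(max_k + 1)):
                return True
    return False

def looks_like_pumping_length(p, in_lang, alphabet, max_len):
    """Check every string of the language with p <= length <= max_len."""
    for n in range(p, max_len + 1):
        for letters in product(alphabet, repeat=n):
            s = "".join(letters)
            if in_lang(s) and not can_pump(s, p, in_lang):
                return False
    return True   # vacuously true when no such string exists

def min_pumping_length_estimate(in_lang, alphabet, max_len):
    for p in range(1, max_len + 1):
        if looks_like_pumping_length(p, in_lang, alphabet, max_len):
            return p
    return None

def lang_01_star(s):
    return re.fullmatch(r"01*", s) is not None

def lang_11(s):
    return s == "11"
```

With max_len = 8 the search agrees with the slides for both languages: p = 1 is rejected for 01* because the string 0 cannot be pumped, and for the language {11} every p below 3 is rejected by the string 11 itself.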
