15-453 FORMAL LANGUAGES, AUTOMATA AND COMPUTABILITY (For next time: Read Chapter 1.3 of the book)

[Diagram: example NFA with states q1, q2, q3, q4 and arrows labeled 0, 1, ε]
A non-deterministic finite automaton (NFA) is a 5-tuple N = (Q, Σ, δ, Q0, F)
Q is the set of states
Σ is the alphabet
δ : Q × Σε → 2^Q is the transition function
Q0 ⊆ Q is the set of start states
F ⊆ Q is the set of accept states
2^Q is the set of subsets of Q and Σε = Σ ∪ {ε}
N = (Q, Σ, δ, Q0, F)
Q = {q1, q2, q3, q4}
Σ = {0,1}
Q0 = {q1, q2}
F = {q4} ⊆ Q
[Diagram of N: arrows q1 to q3 on 0, q2 to q4 on 1, q3 to q4 on 0, q3 to q2 on ε]

δ       0        1        ε
q1      {q3}     ∅        ∅
q2      ∅        {q4}     ∅
q3      {q4}     ∅        {q2}
q4      ∅        ∅        ∅

N = (Q, Σ, δ, Q0, F)
Q = {q1, q2, q3, q4}
Σ = {0,1}
Q0 = {q1, q2}
F = {q4} ⊆ Q
[Diagram of N: as before, except that the arrow from q3 to q2 is now labeled ε, 0]

δ       0          1        ε
q1      {q3}       ∅        ∅
q2      ∅          {q4}     ∅
q3      {q2,q4}    ∅        {q2}
q4      ∅          ∅        ∅

00 ∈ L(N)?
01 ∈ L(N)?
Let w ∈ Σ* and suppose w can be written as w1 ... wn where wi ∈ Σε (ε is viewed as representing the empty string)
Then N accepts w if there are r0, r1, ..., rn ∈ Q such that
1. r0 ∈ Q0
2. ri+1 ∈ δ(ri, wi+1) for i = 0, ..., n-1, and
3. rn ∈ F
L(N) = the language of machine N = set of all strings machine N accepts
A language L is recognized by an NFA N if L = L(N).
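The acceptance condition can be checked mechanically by tracking every state N could be in. The following is a minimal Python sketch, assuming the transition table of the preceding example (the one where q3 goes to {q2, q4} on 0). The dictionary encoding, the use of "" for ε, and the names DELTA, eps_closure, and accepts are illustrative choices, not part of the slides.

```python
# Hypothetical encoding: "" stands for ε; a missing key means the empty set.
DELTA = {
    ("q1", "0"): {"q3"},
    ("q2", "1"): {"q4"},
    ("q3", "0"): {"q2", "q4"},
    ("q3", ""): {"q2"},
}
START = {"q1", "q2"}   # Q0
ACCEPT = {"q4"}        # F


def eps_closure(states):
    """All states reachable from `states` along zero or more ε arrows."""
    seen, stack = set(states), list(states)
    while stack:
        q = stack.pop()
        for r in DELTA.get((q, ""), ()):
            if r not in seen:
                seen.add(r)
                stack.append(r)
    return seen


def accepts(w):
    """True if some sequence r0, ..., rn as in the definition exists for w."""
    current = eps_closure(START)
    for a in w:
        step = set()
        for q in current:
            step |= DELTA.get((q, a), set())
        current = eps_closure(step)
    return bool(current & ACCEPT)


print(accepts("00"), accepts("01"))
```

Under this encoding both 00 and 01 are accepted: 00 via q1, q3, q4, and 01 via q1, q3, then the ε arrow to q2, then q4.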
FROM NFA TO DFA
Input: N = (Q, Σ, δ, Q0, F)
Output: M = (Q′, Σ, δ′, q0′, F′)
Q′ = 2^Q
δ′ : Q′ × Σ → Q′
δ′(R, σ) = ∪_{r ∈ R} ε( δ(r, σ) )  *
q0′ = ε(Q0)  *
F′ = { R ∈ Q′ | f ∈ R for some f ∈ F }
* For R ⊆ Q, the ε-closure of R, ε(R) = {q that can be reached from some r ∈ R by traveling along zero or more ε arrows}
Given: NFA N = ( {1,2,3}, {a,b}, δ, {1}, {1} )
Construct: equivalent DFA M = (Q′, Σ, δ′, q0′, F′)
[Diagram of N: states 1, 2, 3 with arrows labeled a, a, b, ε, and a,b]
q0′ = ε({1}) = {1,3}

δ′          a           b
∅           ∅           ∅
{1}         ∅           {2}
{2}         {2,3}       {3}
{3}         {1,3}       ∅
{1,2}       {2,3}       {2,3}
{1,3}       {1,3}       {2}
{2,3}       {1,2,3}     {3}
{1,2,3}     {1,2,3}     {2,3}

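The subset construction for this example can be sketched in Python. The NFA arrows below are an assumption: they are read off from the DFA table (for instance, δ′({3}, a) = {1,3} suggests an a arrow from 3 to 1), since the diagram itself is not reproduced in the text. The encoding ("" for ε, frozensets as DFA states) and all function names are illustrative.

```python
from itertools import chain, combinations

# Assumed NFA arrows, inferred from the subset table; "" stands for ε.
DELTA = {
    (1, "b"): {2},
    (1, ""): {3},
    (2, "a"): {2, 3},
    (2, "b"): {3},
    (3, "a"): {1},
}
STATES, SIGMA, START, ACCEPT = {1, 2, 3}, "ab", {1}, {1}


def eps_closure(states):
    """ε(R): states reachable from R along zero or more ε arrows."""
    seen, stack = set(states), list(states)
    while stack:
        q = stack.pop()
        for r in DELTA.get((q, ""), ()):
            if r not in seen:
                seen.add(r)
                stack.append(r)
    return frozenset(seen)


def subsets(s):
    """All subsets of s, i.e. the DFA state set Q' = 2^Q."""
    items = sorted(s)
    return [frozenset(c) for c in chain.from_iterable(
        combinations(items, k) for k in range(len(items) + 1))]


def to_dfa():
    dfa_states = subsets(STATES)
    dfa_delta = {}
    for R in dfa_states:
        for a in SIGMA:
            moved = set()
            for r in R:
                moved |= DELTA.get((r, a), set())
            # The closure of the union equals the union of the closures.
            dfa_delta[(R, a)] = eps_closure(moved)
    dfa_start = eps_closure(START)
    dfa_accept = {R for R in dfa_states if R & ACCEPT}
    return dfa_states, dfa_delta, dfa_start, dfa_accept


states, delta, start, accept = to_dfa()
for R in states:
    print(sorted(R), sorted(delta[(R, "a")]), sorted(delta[(R, "b")]))
```

Each printed row gives a subset R followed by δ′(R, a) and δ′(R, b), which can be compared against the table row by row.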
REGULAR LANGUAGES CLOSED UNDER STAR
Let L be a regular language and M be a DFA for L
We construct an NFA N that recognizes L*
[Diagram: a DFA over {0,1} extended with a new accepting start state and ε arrows]
Formally:
Input: M = (Q, Σ, δ, q1, F)   (DFA)
Output: N = (Q′, Σ, δ′, {q0}, F′)   (NFA)
Q′ = Q ∪ {q0}
F′ = F ∪ {q0}
δ′(q,a) =
    {δ(q,a)}   if q ∈ Q and a ≠ ε
    {q1}       if q ∈ F and a = ε
    {q1}       if q = q0 and a = ε
    ∅          if q = q0 and a ≠ ε
    ∅          else
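The case analysis for δ′ translates almost line by line into code. This is a minimal Python sketch under stated assumptions: the DFA for L = {01} used to exercise it is a made-up example, "" stands for ε, and the names star_nfa, eps_closure, and accepts are hypothetical.

```python
def star_nfa(dfa_states, sigma, dfa_delta, q1, dfa_accept, q0="q0"):
    """Build N = (Q', Σ, δ', {q0}, F') for L* from a DFA M for L."""
    assert q0 not in dfa_states            # q0 must be a fresh state
    delta = {}
    for q in dfa_states:
        for a in sigma:
            delta[(q, a)] = {dfa_delta[(q, a)]}   # q ∈ Q and a ≠ ε
    for q in dfa_accept:
        delta[(q, "")] = {q1}                     # q ∈ F and a = ε
    delta[(q0, "")] = {q1}                        # q = q0 and a = ε
    # All remaining pairs map to ∅, represented here by a missing key.
    return set(dfa_states) | {q0}, delta, {q0}, set(dfa_accept) | {q0}


def eps_closure(states, delta):
    seen, stack = set(states), list(states)
    while stack:
        q = stack.pop()
        for r in delta.get((q, ""), ()):
            if r not in seen:
                seen.add(r)
                stack.append(r)
    return seen


def accepts(nfa, w):
    _, delta, start, accept = nfa
    current = eps_closure(start, delta)
    for a in w:
        step = set()
        for q in current:
            step |= delta.get((q, a), set())
        current = eps_closure(step, delta)
    return bool(current & accept)


# Made-up DFA M recognizing L = {01}; "dead" is a trap state.
M_STATES = {"s0", "s1", "s2", "dead"}
M_DELTA = {
    ("s0", "0"): "s1", ("s0", "1"): "dead",
    ("s1", "0"): "dead", ("s1", "1"): "s2",
    ("s2", "0"): "dead", ("s2", "1"): "dead",
    ("dead", "0"): "dead", ("dead", "1"): "dead",
}
N = star_nfa(M_STATES, "01", M_DELTA, "s0", {"s2"})
print([w for w in ["", "01", "0101", "0", "010", "10"] if accepts(N, w)])
```

With this M, the resulting N should accept exactly the strings in (01)*, so only the first three test strings are printed.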
Show: L(N) = L*
1. L(N) ⊇ L*
2. L(N) ⊆ L*
1. L(N) ⊇ L*
Assume w = w1…wk is in L*, where w1,…,wk ∈ L
We show N accepts w by induction on k
Base Cases:
k=0 (w = ε)
k=1 (w ∈ L)
Inductive Step:
Assume N accepts all strings v = v1…vk ∈ L*, vi ∈ L
and let u = u1…uk uk+1 ∈ L*, uj ∈ L
Since N accepts u1…uk (by induction) and M accepts uk+1, N must accept u
2. L(N) ⊆ L*
Assume w is accepted by N, we show w ∈ L*
If w = ε, then w ∈ L*
If w ≠ ε
[Diagram: the accepting computation of N on w, split at its ε arrows; the pieces are marked "∈ L*" and "By induction", ending in "accept"]
REGULAR LANGUAGES ARE CLOSED UNDER REGULAR OPERATIONS
Union: A ∪ B = { w | w ∈ A or w ∈ B }
Intersection: A ∩ B = { w | w ∈ A and w ∈ B }
Negation: ¬A = { w ∈ Σ* | w ∉ A }
Reverse: A^R = { w1 …wk | wk …w1 ∈ A }
Concatenation: A · B = { vw | v ∈ A and w ∈ B }
Star: A* = { w1 …wk | k ≥ 0 and each wi ∈ A }
The PUMPING LEMMA and REGULAR EXPRESSIONS
SOME LANGUAGES ARE NOT REGULAR
B = {0^n 1^n | n ≥ 0} is NOT regular!
WHICH OF THESE ARE REGULAR
C = { w | w has equal number of 1s and 0s}
NOT REGULAR
D = { w | w has equal number of
occurrences of 01 and 10}
REGULAR!!!
THE PUMPING LEMMA
Let L be a regular language with |L| = ∞
Then there exists a positive integer P such that
if w ∈ L and |w| ≥ P
then w = xyz, where:
1. |y| > 0
2. |xy| ≤ P
3. x y^i z ∈ L for any i ≥ 0
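Let M be a DFA that recognizes L
Let P be the number of states in M
Assume w ∈ L is such that |w| ≥ P
We show w = xyz with
1. |y| > 0
2. |xy| ≤ P
3. x y^i z ∈ L for any i ≥ 0
[Diagram: the states q0, …, qi, …, qj, …, q|w| that M passes through while reading w; x labels the segment from q0 to qi]
There must be j > i such that qi = qj (pigeonhole: q0, …, qP are P+1 states, but M has only P)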
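The proof is constructive: the first repeated state in the run of M on w gives the split. Here is a small Python sketch of that idea, assuming a made-up three-state DFA (number of 1s divisible by 3). The names run, accepts, and pump_split are illustrative only.

```python
def run(delta, start, w):
    """The state sequence q0, q1, ..., q|w| of the DFA on w."""
    states = [start]
    for a in w:
        states.append(delta[(states[-1], a)])
    return states


def accepts(delta, start, accept, w):
    return run(delta, start, w)[-1] in accept


def pump_split(delta, start, w):
    """Split w = xyz at the first repeated state qi = qj with i < j."""
    first_seen = {}
    for j, q in enumerate(run(delta, start, w)):
        if q in first_seen:
            i = first_seen[q]
            return w[:i], w[i:j], w[j:]
        first_seen[q] = j
    raise ValueError("no repeated state: w is shorter than the number of states")


# Made-up DFA: accepts strings whose number of 1s is divisible by 3 (P = 3).
DELTA = {}
for q in range(3):
    DELTA[(q, "0")] = q
    DELTA[(q, "1")] = (q + 1) % 3
START, ACCEPT, P = 0, {0}, 3

x, y, z = pump_split(DELTA, START, "10110")
print(repr(x), repr(y), repr(z))
print(all(accepts(DELTA, START, ACCEPT, x + y * i + z) for i in range(6)))
```

Because the repeat is found among the first P+1 states of the run, the split satisfies |xy| ≤ P, and y is nonempty because j > i.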
USING THE PUMPING LEMMA
Use the pumping lemma to prove that B = {0^n 1^n | n ≥ 0} is not regular
Hint: Assume B is regular
Let B = L(M), for DFA M, and let P be larger than the number of states in M
Try pumping s = 0^P 1^P
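For small P, the hint can be checked exhaustively: no split of s = 0^P 1^P with |y| > 0 and |xy| ≤ P survives pumping. The following Python sketch is only a sanity check for a few values of P, not a proof. The helper names in_B and pumpable_splits are hypothetical.

```python
def in_B(s):
    """Membership in B = {0^n 1^n | n >= 0}."""
    n = len(s) // 2
    return s == "0" * n + "1" * n


def pumpable_splits(s, P, in_lang, max_i=3):
    """Splits s = xyz with |y| > 0 and |xy| <= P that stay in the language
    for i = 0..max_i. Testing finitely many i can only rule splits out."""
    for xy_len in range(1, P + 1):
        for x_len in range(xy_len):
            x, y, z = s[:x_len], s[x_len:xy_len], s[xy_len:]
            if all(in_lang(x + y * i + z) for i in range(max_i + 1)):
                yield x, y, z


for P in range(1, 9):
    s = "0" * P + "1" * P
    print(P, list(pumpable_splits(s, P, in_B)))
```

Every list comes out empty, because y consists only of 0s, so already i = 0 removes 0s without removing any 1s. The same function can be reused with a different membership predicate.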
Use the pumping lemma to prove that C = { w | w has an equal number of 0s and 1s} is not regular
Hint: Try pumping s = 0^P 1^P
If C is regular, s can be split into s = xyz, where for any i ≥ 0, x y^i z is also in C and |xy| ≤ P
WHAT DOES D LOOK LIKE?
D = { w | w has equal number of occurrences of 01 and 10}
= { w | w = 1, w = 0, w = ε or
w starts with a 0 and ends with a 0 or
w starts with a 1 and ends with a 1 }
(0(0∪1)*0) ∪ (1(0∪1)*1) ∪ 1 ∪ 0 ∪ ε
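The claimed description of D can be spot-checked by brute force over all short binary strings. A minimal Python sketch follows, assuming the expression is translated to Python's re syntax (| for ∪, an empty alternative for ε). The names D_REGEX, in_D, and mismatches are illustrative.

```python
import re
from itertools import product

# (0(0∪1)*0) ∪ (1(0∪1)*1) ∪ 1 ∪ 0 ∪ ε in Python's regex syntax.
D_REGEX = re.compile(r"0[01]*0|1[01]*1|1|0|")


def in_D(w):
    """Equal number of occurrences of 01 and 10 (such occurrences never overlap
    with themselves, so str.count is exact here)."""
    return w.count("01") == w.count("10")


def mismatches(max_len):
    """Strings up to max_len on which the regex and the definition disagree."""
    bad = []
    for n in range(max_len + 1):
        for chars in product("01", repeat=n):
            w = "".join(chars)
            if (D_REGEX.fullmatch(w) is not None) != in_D(w):
                bad.append(w)
    return bad


print(mismatches(10))
```

An empty list means the two descriptions agree on every string of length at most 10. This is evidence for the equality, not a proof of it.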
REGULAR EXPRESSIONS
σ is a regular expression representing {σ}
ε is a regular expression representing {ε}
∅ is a regular expression representing ∅
If R1 and R2 are regular expressions representing L1 and L2 then:
(R1R2) represents L1 · L2
(R1 ∪ R2) represents L1 ∪ L2
(R1)* represents L1*
PRECEDENCE
Tightest: Star (“*”)
          Concatenation (“.”, “·”)
Loosest:  Union (“∪”, “+”, “|”)
EXAMPLE
R1*R2 ∪ R3 = ( ( R1* ) R2 ) ∪ R3
{ w | w has exactly a single 1 }
0*10*
What language does ∅* represent?
{ε}
{ w | w has length ≥ 3 and its 3rd symbol is 0 }
000(0∪1)* ∪ 010(0∪1)* ∪ 100(0∪1)* ∪ 110(0∪1)*
= (0∪1)(0∪1)0(0∪1)*
{ w | every odd position of w is a 1 }
1((0∪1)1)*(0∪1∪ε) ∪ ε
Also
(1(0∪1))*(1∪ε)
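These example expressions can be compared against their set descriptions on all short strings. The sketch below assumes a direct translation into Python's re syntax (| for ∪, an empty alternative for ε). EXAMPLES, all_strings, and agree are made-up names.

```python
import re
from itertools import product

# (pattern in Python syntax, membership predicate from the set description)
EXAMPLES = [
    (r"0*10*", lambda w: w.count("1") == 1),
    (r"(0|1)(0|1)0(0|1)*", lambda w: len(w) >= 3 and w[2] == "0"),
    # Odd positions are counted from 1, i.e. Python indices 0, 2, 4, ...
    (r"1((0|1)1)*(0|1|)|", lambda w: all(c == "1" for c in w[0::2])),
    (r"(1(0|1))*(1|)", lambda w: all(c == "1" for c in w[0::2])),
]


def all_strings(max_len):
    for n in range(max_len + 1):
        for chars in product("01", repeat=n):
            yield "".join(chars)


def agree(pattern, predicate, max_len=8):
    """True if the regex and the predicate agree on every string up to max_len."""
    rx = re.compile(pattern)
    return all((rx.fullmatch(w) is not None) == predicate(w)
               for w in all_strings(max_len))


print([agree(p, f) for p, f in EXAMPLES])
```

The last two entries describe the same language, which is why they share a predicate.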
EQUIVALENCE
L can be represented by a regexp
⇔
L is a regular language
L can be represented by a regexp
⇒
L is a regular language
Given regular expression R, we show there exists NFA N such that R represents L(N)
Induction on the length of R:
Base Cases (R has length 1):
R = σ (matches a single symbol)
R = ε (matches the empty string)
R = ∅ (matches nothing)
Inductive Step:
Assume R has length k > 1 and that any regular expression of length < k represents a language that can be recognized by an NFA
Three possibilities for R:
R = R1 ∪ R2 (Union Theorem!)
R = R1 R2
R = (R1)*
Have Shown
L can be represented by a regexp
⇒
L is a regular language
Transform (1(0 ∪ 1))* to an NFA
[Diagram: NFA with arrows labeled ε, 1, and 1,0, plus an ε arrow leading back]
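The inductive proof doubles as a recipe: build small NFAs for symbols and glue them together with ε arrows. The Python sketch below is one possible way to do this, not necessarily the construction drawn on the slide. It assumes a single start state per NFA, "" for ε, and hypothetical helper names (symbol, union, concat, star). The star helper adds a new accepting start state, and the other two use the common ε-arrow gluing.

```python
from itertools import count

_ids = count()  # source of fresh state names


def _merged(*deltas):
    """Combine transition dicts without sharing target sets."""
    out = {}
    for d in deltas:
        for key, targets in d.items():
            out.setdefault(key, set()).update(targets)
    return out


# An NFA here is a triple (start state, accept states, transitions).
def symbol(a):
    s, t = next(_ids), next(_ids)
    return s, {t}, {(s, a): {t}}


def union(n1, n2):
    s = next(_ids)  # new start with ε arrows to both old starts
    return s, n1[1] | n2[1], _merged(n1[2], n2[2], {(s, ""): {n1[0], n2[0]}})


def concat(n1, n2):
    glue = {(f, ""): {n2[0]} for f in n1[1]}  # accept states of n1 feed n2
    return n1[0], set(n2[1]), _merged(n1[2], n2[2], glue)


def star(n):
    s = next(_ids)  # new accepting start; accept states loop back by ε
    back = {(q, ""): {n[0]} for q in n[1] | {s}}
    return s, n[1] | {s}, _merged(n[2], back)


def accepts(nfa, w):
    start, finals, delta = nfa

    def closure(states):
        seen, stack = set(states), list(states)
        while stack:
            q = stack.pop()
            for r in delta.get((q, ""), ()):
                if r not in seen:
                    seen.add(r)
                    stack.append(r)
        return seen

    current = closure({start})
    for a in w:
        step = set()
        for q in current:
            step |= delta.get((q, a), set())
        current = closure(step)
    return bool(current & finals)


NFA = star(concat(symbol("1"), union(symbol("0"), symbol("1"))))
print([w for w in ["", "10", "11", "1011", "1", "0", "01", "101"] if accepts(NFA, w)])
```

The NFA built this way has more states than a hand-drawn one, but it recognizes the same language, (1(0 ∪ 1))*.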
L can be represented by a regexp
⇐
L is a regular language
Proof idea: Transform an NFA for L into a regular expression by removing states and relabeling the arrows with regular expressions
[Diagram: an NFA with a new start state and a new accept state attached by ε arrows]
Add unique and distinct start and accept states
While machine has more than 2 states:
Pick an internal state, rip it out and re-label the arrows with regexps, to account for the missing state
[Diagram: an internal state entered on 0, looping on 1, and left on 0 is replaced by a single arrow labeled 01*0]
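[Diagram: machine with states q0, q1, q2, q3 and arrows labeled a, ε, b, (a,b), ε]
[Diagram, after ripping out q1: states q0, q2, q3 with arrows labeled a*b, (a,b), ε]
[Diagram, after ripping out q2: a single arrow from q0 to q3 labeled (a*b)(a∪b)*]
R(q0,q3) = (a*b)(a∪b)*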
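[Diagram sequence: an NFA with states q1, q2, q3 over {a, b} is given ε arrows to new start and accept states, then internal states are ripped out one at a time; intermediate arrow labels include bb and a ∪ ba]
bb ∪ (a ∪ ba)b*a = R(q1,q1)
b ∪ (a ∪ ba)b*ε
(bb ∪ (a ∪ ba)b*a)* (b ∪ (a ∪ ba)b*)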
Convert the NFA to a regular expression
[Diagram sequence: NFA with states q1, q2, q3 and arrows labeled a, b, ε; arrow labels appearing as states are ripped out: (a ∪ b)b*b, bb*b, (a ∪ b)b*b(bb*b)*, (a ∪ b)b*b(bb*b)*a]
((a ∪ b)b*b(bb*b)*a)* ∪ ((a ∪ b)b*b(bb*b)*a)*(a ∪ b)b*b(bb*b)*
Formally: Add qstart and qaccept to create G
Run CONVERT(G): (return regexp)
If #states = 2
    return the expression on the arrow going from qstart to qaccept
If #states > 2
    select qrip ∈ Q different from qstart and qaccept
    define Q′ = Q – {qrip}
    define R′ as:
        R′(qi,qj) = R(qi,qrip)R(qrip,qrip)*R(qrip,qj) ∪ R(qi,qj)
    (this defines G′, a GNFA)
    return CONVERT(G′)
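CONVERT can be sketched directly from the rule for R′. The Python version below is a hedged illustration: arrow labels are strings in Python's re syntax, None stands for ∅, "" stands for ε, and labels passed in are assumed to be single symbols, character classes, or already parenthesized. The four-state machine used to try it out is an assumed reading of the earlier q0..q3 example, and the helper names are hypothetical.

```python
import re
from itertools import product


def union(r, s):
    if r is None:
        return s
    if s is None:
        return r
    return f"(?:{r}|{s})"


def concat(*parts):
    # Concatenating with the empty language gives the empty language.
    return None if any(p is None for p in parts) else "".join(parts)


def star(r):
    # Both ∅* and ε* denote {ε}.
    return "" if r in (None, "") else f"(?:{r})*"


def convert(states, R, q_start, q_accept):
    """Rip out internal states until only q_start and q_accept remain."""
    states, R = list(states), dict(R)
    while len(states) > 2:
        q_rip = next(q for q in states if q not in (q_start, q_accept))
        states.remove(q_rip)
        loop = star(R.get((q_rip, q_rip)))
        new_R = {}
        for qi in states:
            if qi == q_accept:
                continue
            for qj in states:
                if qj == q_start:
                    continue
                via = concat(R.get((qi, q_rip)), loop, R.get((q_rip, qj)))
                new_R[(qi, qj)] = union(via, R.get((qi, qj)))
        R = new_R
    return R.get((q_start, q_accept))


# Assumed arrows: q0 -ε-> q1, q1 loops on a, q1 -b-> q2, q2 loops on a,b, q2 -ε-> q3.
ARROWS = {
    ("q0", "q1"): "",
    ("q1", "q1"): "a",
    ("q1", "q2"): "b",
    ("q2", "q2"): "[ab]",
    ("q2", "q3"): "",
}
result = convert(["q0", "q1", "q2", "q3"], ARROWS, "q0", "q3")
print(result)
```

The printed expression is a more heavily parenthesized form of (a*b)(a∪b)*. The loop replaces the recursion of CONVERT, which does not change the result.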
CONVERT(G) is “equivalent” to G
Proof by induction on k (number of states in G)
Base Case: k=2
Inductive Step:
Assume claim is true for k-1 states
We first note that G and G′ are “equivalent”
But, by the induction hypothesis, G′ is “equivalent” to CONVERT(G′)
And CONVERT(G) is equivalent to CONVERT(G′)
QED
[Summary diagram relating DFA, NFA, Regular Language (DEF), and Regular Expression]
Read Chapter 1.3 of the book for next time