Finite Automata & Regular Languages
Download
Report
Transcript Finite Automata & Regular Languages
Finite Automata &
Regular Languages
Sipser, Chapter 1
Deterministic Finite Automata
A DFA or deterministic finite automaton M is a
5-tuple, M = (Q, , , q0, F), where:
Q is a finite set of states of M
is the finite input alphabet of M
: Q Q is the state transition function
q0 is the start state of M
F Q is the set of accepting states or final states
of M
DFA Example
State diagram
Q = { q0, q1 }
= { 0, 1 }
F = { q1 }
0
M
q0
1
0
q1
1
q0
0
q0
1
q1
q1
q1
q0
State
Table
State table &
state transition function
State table
q0
0
q0
1
q1
q1
q1
q0
State transition function
(q0, 0) = q0, (q0, 1) = q1
(q1, 0) = q1, (q1, 1) = q0
State transitions
If q, q’ Q, s , and (q, s) = q’,
then we say that q’ is an s-successor of
q, or there is a transition from q to q’
on input s, and we write
q s q’
Example: since (q0, 1) = q1, then there
is a transition from q0 to q1 on input 1,
and we write q0 1 q1.
State sequences
If a string of input symbols
w = s0s1s2 … sk-1 takes M from initial
state q0 to state qk, namely
q0 s0 q1 s1 q2 s2 q3 … s[k-1] qk
then we say that qk is a w-successor of
q0, and write q0 w qk. Also q0q1q2 … qk
is called an admissible state sequence
for w.
Strings accepted by a DFA
Let M = (Q, , , q0, F) be a DFA, and
w = s0s1s2 … sk-1 * be a string over
alphabet . Then M accepts w if there
exists an admissible state sequence
q0q1q2 … qk for w, starting at initial
state q0 and ending with state qk,
where qk F. That is, M accepts input
string w if M ends up in one of the final
states.
Language recognized by a DFA
The language L(M) that is recognized
by a DFA, M = (Q, , , q0, F), is the set
of all strings accepted by M. That is,
L(M) = { w * | M accepts w }
= { w * | q0 w qk, qk F }.
Example: For the previous DFA, L(M) is
the set of all strings of 0s and 1s with
odd parity, that is, odd number of 1s.
DFA Example 2
Recognizer for 11*01*
1
B
0
1
1
C
A
0
0
Trap
D
0,1
DFA Example 2
M = (Q, , , q0, F), L(M) = 11*01*
Q = { q0=A, B, C, D }
= { 0, 1 }
0
1
F={C}
A
D
B
B
C
B
C
D
C
D
D
D
DFA Example 3
0
Modulo 3 counter
B
1
0,R
2,R
A
1
2
2
1,R
C
0
DFA Example 3
M = (Q, , , q0, F)
Q = { q0=A, B, C }
= { 0, 1, 2, R }
F={A}
0
1
2
R
A
A
B
C
A
B
B
C
A
A
C
C
A
B
A
Regular Languages
A language L * is called regular if
there exists a DFA M such that L(M)=L.
Earlier, we defined a language L *
as regular if there exists a T3 or regular
(left-linear or right-linear) grammar G
such that L(G)=L. We shall prove that
these two definitions are equivalent.
Operations on Regular Languages
Let A and B be regular languages:
Union:
A B = { x | x A or x B }
Concatenation:
AB = { xy | x A and y B }.
Kleene Closure (A-star)
A* = {x1x2x3 ... xk | k 0 and xi A }
Examples of regular operations
A = { good, bad }, B = { boy, girl }
A B = { good, bad, boy, girl }
AB = { goodboy, goodgirl, badboy,
badgirl }
A* = { , good, bad, goodgood,
goodbad, badgood, badbad, … }
Closure under Union
If A and B are regular languages, then
their union, A B, is a regular
language
Union Machine M(A B)
q1F
q0
M(A)
q2F
r0
p1F
p0
M(B)
p2F
Closure under Concatenation
If A and B are regular languages, then
their concatenation, AB, is a regular
language.
Concatenation Machine M(AB)
Closure under Kleene Star
If A is a regular language, then the
Kleene closure of A, A*, is also a
regular language
Kleene Closure Machine M(A*)
NFAs:
Nondeterministic Finite Automata
Presence of lambda transtitions.
May have more than one initial state.
On input a, state q may have no
transition out.
On input a, state q may have more than
one transition out.
NFAs
A nondeterministic finite automaton M
is a five-tuple M = ( Q, , R, I, F ),
where
Q is a finite set of states
is the (finite) input alphabet
R is the transition relation, R QQ
I Q is the set of initial states
F Q is the set of final states
Example NFAs
NFA that recognizes the language
0*1 1*0
NFA that recognizes the language
(0 1)*11 (0 1)*
Converting NFAs to DFAs
Given a NFA, M = (Q, , R, I, F), build a
DFA, M’ = (Q’, , , S0, F’) as follows.
The states S0, S1, S2, … of M’ are sets of
states of M.
The initial state of M’ is obtained by putting
together all the initial states of M and all
states reachable from those by
transitions, and calling this set S0, the
initial state of M’
Converting NFAs to DFAs
For each state Sk already in Q’ in M’, and
for each input symbol a , put together
into a set Sj all states of M reachable from
each state in Sk on input a. This set Sj may
or may not yet already be in Q’. Also it may
be the empty set . Add to the transition
from Sk to Sj on input a.
Since there can only be a finite number of
subsets of states of M, this procedure will
stop after a finite number of steps.
Example conversions
Convert the NFA for the language
(0 1)*00 (0 1)*11 to a DFA
0,1
0
A
0
0,1
1
D
C
B
1
E
F
State transition table of NFA
0
1
A
A,B
A
-
B
C
-
-
C
-
-
-
D
D
D,E
-
E
-
F
-
F
-
-
-
State table of DFA
0
1
A,D
A,B,D
A,D,E
A,B,D
A,B,C,D
A,D,E
A,D,E
A,B,D
A,D,E,F
A,B,C,D
A,B,C,D
A,D,E
A,D,E,F
A,B,D
A,D,E,F
State diagram of DFA
ABD
0
ABCD
0
1
0
1
0
AD
0
1
ADE
1
ADEF
1
Regular Expressions (r.e.)
If a , then the set a = {a} is a r.e.
The set = {} is a r.e.
The set = { } is a r.e.
If R and S are r.e., then (R S) is a r.e.
If R and S are r.e., then (RS) is a r.e.
If R is a r.e., then ( R )* is a r.e.
Any r.e. is obtained by a finite application of
the above rules.
REs and Regular Languages
R.E.s are shorthand notation for regular
languages.
Regex: REs in Unix
[a-f], [^a-f]
R*, R+, R?
{R}
RS
R|S
Minimization of DFAs
Subset construction
(Myhill-Nerode Theorem)
NFAs, DFAs, & Lexical Analyzer
Generators
Sec 3.6: Finite Automata, Aho, Sethi,
Ullman, “Compilers: P.T.T”
Sec 3.7: From REs to NFAs (Thompson’s
Construction)
Sec 3.8: Design of a Lexical Analyzer
generator
Sec 3.9: Optimization of DFA-based
Lexical Analyzers