Giorgi Japaridze
Theory of Computability
Regular Languages
Chapter 1
1.1.a
How a finite automaton works
[Figure: state diagram of an automaton with states q0, q1, q2 and transitions labeled 0 and 1, shown processing the input string 01100.]
1.1.b
The language of a machine
[Figure: state diagram of the automaton A, with states q0, q1, q2 and transitions labeled 0 and 1.]
L(M), “the language of M” or “the language recognized by M”, is the set of all strings that the machine M accepts.
What is the language recognized by our automaton A?
1.1.c
Formal definition of a finite automaton
A (deterministic) finite automaton (DFA) is a 5-tuple
(Q, Σ, δ, s, F), where:
Q is a finite set whose elements are called the states
Σ is a finite set called the alphabet
δ is a function of the type Q × Σ → Q called the transition function
s is an element of Q called the start state
F is a subset of Q called the set of accept states
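The five components map directly onto a small data structure. The following is a minimal Python sketch, not part of the original slides: the class name `DFA`, its field names, the `check` method, and the two-state example machine are all assumptions made for illustration.

```python
from dataclasses import dataclass
from itertools import product


@dataclass(frozen=True)
class DFA:
    states: frozenset      # Q
    alphabet: frozenset    # Σ
    delta: dict            # δ, stored as {(state, symbol): state}
    start: str             # s
    accept: frozenset      # F

    def check(self) -> None:
        """Raise ValueError unless the five components fit the definition."""
        if self.start not in self.states:
            raise ValueError("s must be an element of Q")
        if not self.accept <= self.states:
            raise ValueError("F must be a subset of Q")
        # δ must be a total function of type Q × Σ → Q.
        for pair in product(self.states, self.alphabet):
            if self.delta.get(pair) not in self.states:
                raise ValueError(f"δ is missing or ill-typed at {pair}")
        if set(self.delta) - set(product(self.states, self.alphabet)):
            raise ValueError("δ is defined outside Q × Σ")


# Hypothetical example (not the automaton A from the slides):
# a two-state machine over {0, 1} that remembers the last symbol read.
ends_in_one = DFA(
    states=frozenset({"p", "q"}),
    alphabet=frozenset({"0", "1"}),
    delta={("p", "0"): "p", ("p", "1"): "q",
           ("q", "0"): "p", ("q", "1"): "q"},
    start="p",
    accept=frozenset({"q"}),
)
ends_in_one.check()
```

Storing δ as a dictionary keyed by (state, symbol) pairs makes the requirement that δ be defined on all of Q × Σ easy to verify mechanically.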
1.1.d
Our automaton formalized
[Figure: state diagram of the automaton A, with states q0, q1, q2 and transitions labeled 0 and 1.]
A = (Q, Σ, δ, s, F)
Q:
Σ:
δ: [table with rows q0, q1, q2 and columns 0, 1]
s:
F:
1.1.e
Formal definition of computation
M = (Q, Σ, δ, s, F)
M accepts the string u₁ u₂ … uₙ iff there is a sequence r₁, r₂, …, rₙ, rₙ₊₁ of states such that:
• r₁ = s
• rᵢ₊₁ = δ(rᵢ, uᵢ), for each i with 1 ≤ i ≤ n
• rₙ₊₁ ∈ F
[Figure: state diagram of the automaton with states q0, q1, q2, with the example input 0 1 1 0 0 aligned under u₁ u₂ … uₙ and the corresponding sequence of states aligned under r₁, r₂, …, rₙ, rₙ₊₁.]
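The three conditions of this definition translate almost line by line into code. The sketch below is an illustration under assumptions: the function names `state_sequence` and `accepts` and the two-state example machine are hypothetical and do not reproduce the automaton drawn on the slide.

```python
def state_sequence(delta, start, word):
    """Return [r1, ..., r(n+1)] with r1 = s and r(i+1) = δ(r(i), u(i))."""
    run = [start]
    for symbol in word:
        run.append(delta[(run[-1], symbol)])
    return run


def accepts(delta, start, accept, word):
    """M accepts the word iff the last state of the run is in F."""
    return state_sequence(delta, start, word)[-1] in accept


# Hypothetical two-state machine over {0, 1}; state "q" means
# "the last symbol read was 1". It is not the automaton on the slide.
delta = {("p", "0"): "p", ("p", "1"): "q",
         ("q", "0"): "p", ("q", "1"): "q"}

run = state_sequence(delta, "p", "01100")
print(run)                                  # ['p', 'p', 'q', 'q', 'p', 'p']
print(accepts(delta, "p", {"q"}, "01100"))  # False
```

Because δ is a function, the sequence of states is unique for a DFA, so checking "there is a sequence" amounts to computing that one sequence.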
1.1.f
Designing finite automata
Task:
Design an automaton that accepts a bit string iff it contains an even number of “1”s.
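One possible design, given here as a sketch and not as the slide's intended answer, uses two states that record the parity of the number of 1s read so far. The state names `even` and `odd` and the brute-force comparison with the specification are assumptions added for illustration.

```python
from itertools import product

# Two states suffice: the machine only has to remember the parity of the
# number of 1s seen so far. The state names are our own choice.
DELTA = {
    ("even", "0"): "even", ("even", "1"): "odd",
    ("odd", "0"): "odd",   ("odd", "1"): "even",
}
START = "even"
ACCEPT = {"even"}


def accepts(word: str) -> bool:
    state = START
    for bit in word:
        state = DELTA[(state, bit)]
    return state in ACCEPT


# Compare the automaton with the specification on every bit string of
# length 0 through 8 (a sanity check, not a proof).
for n in range(9):
    for bits in product("01", repeat=n):
        word = "".join(bits)
        assert accepts(word) == (word.count("1") % 2 == 0)
```

The general design idea is to decide what the machine must remember about the input seen so far and to create one state per possible value of that memory.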
1.2.a
NFAs (Nondeterministic Finite Automata)
[Figure: state diagram of an NFA with states q1, q2, q3 and transitions labeled 1 and 0,1, together with the tree of its possible computations on the input 01010.]
1.2.a
NFAs (Nondeterministic Finite Automata)
[Figure: state diagram of an NFA with states q1, q2, q3 and transitions labeled 1 and 0,1.]
What language does this NFA recognize?
1.2.b
Formal definition of a nondeterministic finite automaton
An NFA is a 5-tuple (Q, Σ, δ, s, F), where:
Q is a finite set whose elements are called the states
Σ is a finite set called the alphabet
δ is a function of the type Q × Σ → P(Q) called the transition function
s is an element of Q called the start state
F is a subset of Q called the set of accept states
1.2.c
Example
[Figure: state diagram of an NFA with states 1, 2, 3 and transitions labeled a, b, and a,b.]
A = (Q, Σ, δ, s, F)
Q:
Σ:
δ:
s:
F:
1.2.d
Formal definition of accepting
M = (Q, Σ, δ, s, F)

When M is a DFA
M accepts the string u₁ u₂ … uₙ iff there is a sequence r₁, r₂, …, rₙ, rₙ₊₁ of states such that:
• r₁ = s
• rᵢ₊₁ = δ(rᵢ, uᵢ), for each i with 1 ≤ i ≤ n
• rₙ₊₁ ∈ F

When M is an NFA
M accepts the string u₁ u₂ … uₙ iff there is a sequence r₁, r₂, …, rₙ, rₙ₊₁ of states such that:
• r₁ = s
• rᵢ₊₁ ∈ δ(rᵢ, uᵢ), for each i with 1 ≤ i ≤ n
• rₙ₊₁ ∈ F
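The only difference between the two definitions is "=" versus "∈" in the second condition. For an NFA, one way to decide whether a suitable sequence exists is to track the set of all states that some sequence could have reached so far. The sketch below assumes that approach; the function name `nfa_accepts` and the example NFA are hypothetical and are not read off the slides.

```python
def nfa_accepts(delta, start, accept, word):
    """Track the set of states r with a run s = r1, ..., r ending in r.

    delta maps (state, symbol) to a set of states; a missing key is
    treated as the empty set.
    """
    current = {start}
    for symbol in word:
        current = {q for p in current for q in delta.get((p, symbol), set())}
    # Some sequence r1, ..., r(n+1) ends in F iff the final set meets F.
    return bool(current & accept)


# Hypothetical NFA (an assumption, not read off the slides): it guesses
# that the symbol it is reading is the second-to-last one and is a 1.
delta = {
    ("q1", "0"): {"q1"},
    ("q1", "1"): {"q1", "q2"},
    ("q2", "0"): {"q3"},
    ("q2", "1"): {"q3"},
}

print(nfa_accepts(delta, "q1", {"q3"}, "01010"))  # True
print(nfa_accepts(delta, "q1", {"q3"}, "0100"))   # False
```

A branch that reaches a state with no transition on the next symbol simply disappears from the tracked set, which matches the absence of any valid continuation of the sequence.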
1.2.e
What language does this NFA recognize?
[Figure: state diagram of an NFA whose transitions are all labeled 0.]
1.2.f
What language does this DFA recognize?
[Figure: state diagram of a DFA; the surviving labels are the numbers 1 to 5 and 0.]
1.2.g
Equivalence of NFAs and DFAs
Two machines are said to be equivalent if they recognize the same language.
Theorem 1.39 Every NFA has an equivalent DFA.
Proof. Consider an NFA
N = (Q, Σ, δ, s, F)
We need to construct an equivalent DFA
D = (Q′, Σ, δ′, s′, F′)
using a procedure called the subset construction, described on the next slide.
1.2.h
The subset construction
Constructing DFA D = (Q′, Σ, δ′, s′, F′) from NFA N = (Q, Σ, δ, s, F)
• Q′ = P(Q)
• δ′(R, a) = {q | q ∈ δ(p, a) for some p ∈ R}
• s′ = {s}
• F′ = {R | R is a subset of Q containing an accept state of N}
D works correctly:
at every step in the computation, it enters the state that
corresponds to the subset of states that N could be in at that point.
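The four rules of the construction can be written out directly. The following Python sketch follows them literally and builds all of P(Q); the function names `powerset` and `subset_construction` and the small example NFA are assumptions for illustration, and the sketch, like the definition used in these slides, does not handle ε-transitions.

```python
from itertools import chain, combinations


def powerset(states):
    """All subsets of `states`, each as a frozenset."""
    items = list(states)
    subsets = chain.from_iterable(
        combinations(items, k) for k in range(len(items) + 1))
    return [frozenset(subset) for subset in subsets]


def subset_construction(states, alphabet, delta, start, accept):
    """Build D = (Q', Σ, δ', s', F') from N = (Q, Σ, δ, s, F).

    delta maps (state, symbol) to a set of states; missing keys mean ∅.
    """
    q_prime = powerset(states)                                # Q' = P(Q)
    delta_prime = {
        (R, a): frozenset(q for p in R for q in delta.get((p, a), ()))
        for R in q_prime for a in alphabet
    }
    s_prime = frozenset({start})                              # s' = {s}
    f_prime = {R for R in q_prime if R & set(accept)}         # F'
    return q_prime, delta_prime, s_prime, f_prime


# Hypothetical NFA over {a, b}, chosen only to exercise the code.
n_delta = {(1, "a"): {1, 2}, (1, "b"): {1}, (2, "b"): {3}}
Q2, d2, s2, F2 = subset_construction({1, 2, 3}, {"a", "b"}, n_delta, 1, {3})

print(len(Q2))                               # 8
print(sorted(d2[(s2, "a")]))                 # [1, 2]
print(sorted(d2[(frozenset({1, 2}), "b")]))  # [1, 3]
```

Using `frozenset` for the states of D matters because sets of NFA states must serve as dictionary keys. The construction always produces 2^|Q| states, many of which may be unreachable from s′.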
1.2.i
Example of applying the subset construction
N = (Q, Σ, δ, s, F)
[Figure: state diagram of the NFA N with states 1, 2, 3 and transitions labeled a, b, and a,b.]
Q′:
Σ:
δ′: [table with one row for each of ∅, {1}, {2}, {3}, {1,2}, {1,3}, {2,3}, {1,2,3}]
s′:
F′:
• Q′ = P(Q)
• δ′(R, a) = {q | q ∈ δ(p, a) for some p ∈ R}
• s′ = {s}
• F′ = {R | R is a subset of Q containing an accept state of N}
1.2.j
The resulting DFA
[Figure: state diagram of the DFA D with states ∅, {1}, {2}, {3}, {1,2}, {1,3}, {2,3}, {1,2,3} and transitions labeled a, b, and a,b.]
1.2.k
Removing unreachable states
[Figure: state diagram of the DFA D after removing unreachable states; the remaining states are ∅, {1}, {3}, {2,3}, {1,2,3}.]
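Unreachable states can be found with a breadth-first search from the start state. The sketch below is one way to do this and is not taken from the slides; the function name `reachable_part` and the four-state example DFA are hypothetical.

```python
from collections import deque


def reachable_part(alphabet, delta, start):
    """Return the states reachable from `start` and δ restricted to them."""
    seen = {start}
    queue = deque([start])
    while queue:
        state = queue.popleft()
        for a in alphabet:
            target = delta[(state, a)]
            if target not in seen:
                seen.add(target)
                queue.append(target)
    trimmed = {(q, a): t for (q, a), t in delta.items() if q in seen}
    return seen, trimmed


# Hypothetical four-state DFA: nothing leads into "u", so it can be dropped.
delta = {
    ("s", "a"): "t", ("s", "b"): "s",
    ("t", "a"): "t", ("t", "b"): "d",
    ("d", "a"): "d", ("d", "b"): "d",
    ("u", "a"): "s", ("u", "b"): "t",
}
states, trimmed = reachable_part({"a", "b"}, delta, "s")
print(sorted(states))   # ['d', 's', 't']
print(len(trimmed))     # 6
```

Removing unreachable states does not change the recognized language, because no computation ever enters them; the set of accept states should be intersected with the reachable states as well.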
1.2.l
Testing in action
[Figure: the NFA N (states 1, 2, 3) and the DFA D (states ∅, {1}, {3}, {2,3}, {1,2,3}) shown side by side, both run on the input baa.]
1.3.a
Regular operations
Union:
L₁ ∪ L₂ = {x | x ∈ L₁ or x ∈ L₂}
{Good,Bad} ∪ {Boy,Girl} =
{0,00,000,…} ∪ {1,11,111,…} =
L ∪ ∅ =
Concatenation:
L₁ ∘ L₂ = {xy | x ∈ L₁ and y ∈ L₂}
{Good,Bad} ∘ {Boy,Girl} =
{0,00,000,…} ∘ {1,11,111,…} =
L ∘ ∅ =
Star:
L* = {x₁…xₖ | k ≥ 0 and each xᵢ is in L}
{Boy,Girl}* =
{0,00,000,…}* =
∅* =
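For finite languages the three operations can be tried out directly on Python sets. This is a sketch under assumptions: the function names are our own, and since L* is usually infinite, `star_up_to` returns only the strings built from at most `k_max` pieces.

```python
from itertools import product


def union(l1, l2):
    return l1 | l2


def concat(l1, l2):
    return {x + y for x in l1 for y in l2}


def star_up_to(lang, k_max):
    """Finite slice of L*: concatenations of k strings of L, 0 <= k <= k_max."""
    result = set()
    for k in range(k_max + 1):
        for parts in product(sorted(lang), repeat=k):
            result.add("".join(parts))
    return result


A = {"Good", "Bad"}
B = {"Boy", "Girl"}

print(sorted(union(A, B)))    # ['Bad', 'Boy', 'Girl', 'Good']
print(sorted(concat(A, B)))   # ['BadBoy', 'BadGirl', 'GoodBoy', 'GoodGirl']
print(concat(A, set()))       # set()
print(star_up_to(set(), 3))   # {''}
print(sorted(star_up_to(B, 2), key=len))
```

The case k = 0 contributes the empty string to the star of every language, including the empty language, which is why the fourth line prints a set containing the empty string.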
1.3.b
Regular expressions
We say that R is a regular expression (RE) iff R is one of the following. The language represented by each expression is given after the colon.
1. a, where a is a symbol of the alphabet: {a}
2. ε: {ε}
3. ∅: ∅
4. (R₁) ∪ (R₂), where R₁ and R₂ are RE: the union of the languages represented by R₁ and R₂
5. (R₁) ∘ (R₂), where R₁ and R₂ are RE: the concatenation of the languages represented by R₁ and R₂
6. (R₁)*, where R₁ is a RE: the star of the language represented by R₁
Conventions:
• The symbol ∘ is often omitted in RE
• Some parentheses can be omitted.
The precedence order for the operators is:
* (highest), ∘ (medium), ∪ (lowest)
1.3.c
Regular languages
A language is said to be regular iff it can be represented by a regular expression.
Language: {11}    Expression:
Language: {Boy, Girl, Good, Bad}    Expression:
Language: {ε,0,00,000,0000,…}    Expression:
Language: {0,00,000,0000,…}    Expression:
Language: {ε,01,0101,010101,01010101,…}    Expression:
Language: {x | x = 0ᵏ where k is a multiple of 2 or 3}    Expression:
Language: {x | x is divisible by 8}    Expression:
Language: {x | x MOD 4 = 3}    Expression:
1.3.d
Exercising reading regular expressions
Expression: (Good ∪ Bad)(Boy ∪ Girl)    Language:
Expression: (Tom ∪ Bob)_is_(good ∪ bad)    Language:
Expression:    Language: {Name_is_adjective | Name is an uppercase letter followed by zero or more lowercase letters, and adjective is a lowercase letter followed by zero or more lowercase letters}
Expression: 0*10*    Language:
Expression: (0 ∪ 1)*101(0 ∪ 1)*    Language:
Expression: ((0 ∪ 1)(0 ∪ 1))*    Language:
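The last three expressions can be explored with Python's `re` module, which writes union as `|`, concatenation as juxtaposition, and star as `*`. The sketch below is an illustration, not part of the slides: the dictionary `EXPRESSIONS` and the helper `in_language` are hypothetical names, and `re` supports many features beyond the three regular operations that are deliberately not used here.

```python
import re

# Slide notation -> Python `re` syntax (∪ becomes |, ∘ is juxtaposition).
EXPRESSIONS = {
    "0*10*": r"0*10*",
    "(0∪1)*101(0∪1)*": r"(0|1)*101(0|1)*",
    "((0∪1)(0∪1))*": r"((0|1)(0|1))*",
}


def in_language(pattern: str, word: str) -> bool:
    """True iff the whole word belongs to the language of the pattern."""
    return re.fullmatch(pattern, word) is not None


for word in ["", "1", "0010", "0110", "11010"]:
    members = [name for name, pat in EXPRESSIONS.items()
               if in_language(pat, word)]
    print(repr(word), members)
```

`re.fullmatch` is used instead of `re.search` because membership in the language of an expression concerns the entire string, not a substring of it.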
1.3.e
Regular languages and DFA-recognizable languages are the same
Theorem 1.54* A language is regular if and only if some NFA
(DFA) recognizes it.
Proof – omitted (but given in the textbook).
The textbook describes an algorithm for converting any given regular
expression to an equivalent NFA, and an algorithm for converting any
given NFA to an equivalent regular expression.
1.4.a
The limitations of the power of DFAs
The computing power of finite automata is severely limited by the fact
that their memory (= set of states) is small (= of a fixed size) while inputs
can be arbitrarily large.
While the memories of real computers are also finite, they are not fixed,
in the sense that we assume one can always supply additional memory if
needed.
To summarize, DFAs are not as powerful as computers can generally
be.
The next slide gives several examples of non-regular languages, i.e.
languages that no DFA can recognize. The non-regularity of
those languages can be rigorously proven using a tool called the pumping
lemma. We omit the pumping lemma in this course (but it is in the
textbook). Instead, we will simply rely on intuitive arguments.
Warning: Generally one cannot safely rely on intuition when making
important conclusions, because intuition can sometimes be deceptive.
Only strict mathematical proofs can be trusted.
1.4.b
Non-regular languages
Do the following languages look regular to you?
A = { ww | w ∈ {0,1}* } is not regular.
Intuitively, this is so because a DFA processing a long input will have forgotten much
of the previously seen part of the input when it gets to the middle of the string. But
without fully remembering the first half of the string, it is impossible to tell whether
the second half coincides with it or not.
B = { 0ⁿ1ⁿ | n ≥ 0 } is not regular.
Intuitively, this is so because a DFA processing a long input 0ⁿ1ⁿ will be unable to
remember exactly how many 0s it has seen by the time the 1s start. But without
that information it is impossible to tell whether the remaining 1* part of the input has the
same length as the already seen 0* part.
C = {w | w contains the same number of “0”s as “1”s} is not regular.
An intuitive reason here is similar to the one for language B.
D = {w | w contains the same number of “01”s as “10”s} is regular.
Intuitively, it may appear to you that if C is non-regular, “even more so” should be D. But
you’ve been warned about the deceptiveness of intuition. The following slide shows
a DFA that recognizes D, so D is regular!
1.4.c
A DFA recognizing D
D = {w | w contains the same number of “01”s as “10”s}
[Figure: state diagram of a DFA recognizing D, with transitions labeled 0 and 1.]
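One way to see why a DFA for D exists: occurrences of “01” and “10” alternate along any bit string, so the two counts are equal exactly when the string is empty or its first and last symbols coincide. The five-state machine below is a sketch built on that observation. It is our own construction with our own state names and is not claimed to be the automaton drawn on the slide; the loop at the end compares it with the definition of D on all short strings, which is a sanity check and not a proof.

```python
from itertools import product


def count(word: str, pattern: str) -> int:
    """Number of (possibly overlapping) occurrences of pattern in word."""
    return sum(word.startswith(pattern, i) for i in range(len(word)))


# State names are our own: "xy" means "first symbol x, last symbol y".
DELTA = {
    ("start", "0"): "00", ("start", "1"): "11",
    ("00", "0"): "00", ("00", "1"): "01",
    ("01", "0"): "00", ("01", "1"): "01",
    ("11", "0"): "10", ("11", "1"): "11",
    ("10", "0"): "10", ("10", "1"): "11",
}
ACCEPT = {"start", "00", "11"}


def accepts(word: str) -> bool:
    state = "start"
    for bit in word:
        state = DELTA[(state, bit)]
    return state in ACCEPT


# Sanity check against the definition of D on all strings up to length 10.
for n in range(11):
    for bits in product("01", repeat=n):
        word = "".join(bits)
        assert accepts(word) == (count(word, "01") == count(word, "10"))
```

The machine never counts anything: it only remembers the first symbol and the most recent one, which is a fixed amount of information regardless of the input length.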