Topic 3-Automata Theory
Topic 3:
Automata Theory
1
Outline
Finite state machine, Regular
expressions, DFA,
NDFA, and their equivalence,
Grammars and Chomsky
hierarchy.
2
What is Automata Theory?
Study of abstract computing devices, or “machines”
Automaton = an abstract computing device
Note: A “device” need not even be physical hardware!
A fundamental question in computer science:
Find out what different models of machines can do and
cannot do
The theory of computation
Computability vs. Complexity
3
Alan Turing (1912-1954)
(A pioneer of automata theory)
Father of Modern Computer
Science
English mathematician
Studied abstract machines
called Turing machines even
before computers existed
Heard of the Turing test?
4
Languages & Grammars
Languages: “A language is a
collection of sentences of finite
length all constructed from a
finite alphabet of symbols”
(“sentences” are also called “words”)
Grammars: “A grammar can be
regarded as a device that
enumerates the sentences of a
language” - nothing more,
nothing less
N. Chomsky, Information and
Control, Vol 2, 1959
Image source: Nowak et al. Nature, vol 417, 2002
5
The Chomsky Hierarchy
• A containment hierarchy of classes of formal languages
Regular (DFA) ⊂ Context-free (PDA) ⊂ Context-sensitive (LBA) ⊂ Recursively enumerable (TM)
6
The Central Concepts
of Automata Theory
7
Alphabet
An alphabet is a finite, non-empty set of symbols
We use the symbol ∑ (sigma) to denote an alphabet
Examples:
Binary: ∑ = {0,1}
All lower case letters: ∑ = {a,b,c,..z}
Alphanumeric: ∑ = {a-z, A-Z, 0-9}
DNA molecule letters: ∑ = {a,c,g,t}
…
8
Strings
A string or word is a finite sequence of symbols
chosen from ∑
Empty string is ε (or “epsilon”)
Length of a string w, denoted by “|w|”, is equal to the
number of (non-ε) characters in the string
E.g., x = 010100, |x| = 6
x = 01 ε 0 ε 1 ε 00, |x| = ?
xy = concatenation of two strings x and y
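A minimal sketch of these definitions, using Python strings as stand-ins for strings over ∑. The helper name `is_string_over` is made up for illustration, and the empty Python string plays the role of ε:

```python
# Hypothetical helper, not from the slides: True when w uses only symbols of sigma.
def is_string_over(w: str, sigma: set) -> bool:
    return all(symbol in sigma for symbol in w)

SIGMA = {"0", "1"}      # the binary alphabet
EPSILON = ""            # the empty string

x = "010100"
y = "11"

print(is_string_over(x, SIGMA))   # True
print(len(x))                     # |x| = 6
print(len(EPSILON))               # the empty string has length 0
print(x + y)                      # concatenation xy = 01010011
print(x + EPSILON == x)           # concatenating the empty string changes nothing
```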
9
Languages
10
The Membership Problem
11
Languages
Let ∑ be a set of
characters. ∑ is called the
alphabet.
A language over ∑ is a set of
strings of characters drawn
from ∑.
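One way to picture this, as a sketch only: a finite language can be listed as a set of strings, while an infinite one is described by a membership test. The example language (binary strings ending in 0) and the names `in_language` and `strings_up_to` are illustrative choices, not taken from the slides.

```python
from itertools import product

SIGMA = ("0", "1")

# A finite language can be written out as a set of strings.
FINITE_LANGUAGE = {"0", "10", "110"}

# An infinite language is described by a membership test instead.
# Illustrative choice: all binary strings that end in 0.
def in_language(w: str) -> bool:
    return all(c in SIGMA for c in w) and w.endswith("0")

def strings_up_to(n: int):
    """Yield every string over SIGMA of length 0..n, shortest first."""
    for length in range(n + 1):
        for symbols in product(SIGMA, repeat=length):
            yield "".join(symbols)

members = [w for w in strings_up_to(2) if in_language(w)]
print(members)                     # ['0', '00', '10']
print("110" in FINITE_LANGUAGE)    # True
```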
12
Example of Languages
Alphabet = English characters
Language = English sentences
Alphabet = ASCII
Language = C++ programs,
Java, C#
13
Notation
Languages are sets of
strings (finite sequences of
characters)
Need some notation for
specifying which sets we
want
14
Regular Languages
Each regular expression is a
notation for a regular
language (a set of words).
If A is a regular expression,
we write L(A) to refer to
the language denoted by A.
15
Regular Expression
A regular expression (RE) is
defined inductively
a = an ordinary character from ∑
ε = the empty string
16
Regular Expression
R|S = either R or S
RS = R followed by S (concatenation)
R* = concatenation of R zero or more times
(R* = ε|R|RR|RRR...)
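The three operators can be read as operations on sets of strings. The sketch below is one possible encoding, with function names chosen here for illustration; because R* is infinite whenever R contains a non-empty string, `star` only collects strings up to a chosen length.

```python
EPSILON = ""

def union(r: set, s: set) -> set:
    """L(R|S): strings in either language."""
    return r | s

def concat(r: set, s: set) -> set:
    """L(RS): a string of R followed by a string of S."""
    return {x + y for x in r for y in s}

def star(r: set, max_len: int) -> set:
    """Strings of L(R*) no longer than max_len."""
    result = {EPSILON}
    frontier = {EPSILON}
    while frontier:
        longer = {w for w in concat(frontier, r) if len(w) <= max_len}
        frontier = longer - result      # keep only strings not seen before
        result |= frontier
    return result

a, b = {"a"}, {"b"}
print(sorted(union(a, b)))                       # ['a', 'b']
print(sorted(concat(a, b)))                      # ['ab']
print(sorted(star(concat(a, b), 4), key=len))    # ['', 'ab', 'abab']
print(sorted(concat(union(a, {EPSILON}), b)))    # ['ab', 'b']
```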
17
RE Extensions
R? = ε | R (zero or one R)
R+ = RR* (one or more R)
(R) = R (grouping)
18
RE Extensions
[abc] = a|b|c (any of listed)
[a-z] = a|b|....|z (range)
[^ab] = c|d|... (anything but ‘a’ or ‘b’)
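These extensions use the same syntax as Python's standard `re` module, so they can be tried out directly. The sketch below is only a quick check, not part of the slides; the helper `matches` and the `CHECKS` list are made up here, and `re.fullmatch` is used so that the whole string must belong to the language.

```python
import re

def matches(pattern: str, w: str) -> bool:
    """True when the whole string w is in the language of pattern."""
    return re.fullmatch(pattern, w) is not None

# (pattern, input string, expected membership)
CHECKS = [
    (r"ab?", "a", True),       # R?: zero or one b
    (r"ab?", "abb", False),
    (r"a+", "", False),        # R+: at least one a
    (r"a+", "aaa", True),
    (r"(ab)+", "abab", True),  # grouping
    (r"[abc]", "b", True),     # any of the listed characters
    (r"[a-z]", "q", True),     # range
    (r"[a-z]", "Q", False),
    (r"[^ab]", "c", True),     # anything but a or b
    (r"[^ab]", "a", False),
]

for pattern, w, expected in CHECKS:
    print(pattern, repr(w), matches(pattern, w) == expected)

# R+ and RR* denote the same language.
print(all(matches("a+", w) == matches("aa*", w) for w in ["", "a", "aaa", "ab"]))
```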
19
Regular Expression
RE        Strings in L(R)
a         “a”
ab        “ab”
a|b       “a” “b”
(ab)*     “” “ab” “abab” ...
(a|ε)b    “ab” “b”
20
Example: integers
integer: a non-empty string of digits
digit = ‘0’|‘1’|‘2’|‘3’|‘4’|‘5’|‘6’|‘7’|‘8’|‘9’
integer = digit digit*
21
Example: identifiers
identifier:
a string of letters or digits
starting with a letter
C identifier:
[a-zA-Z_][a-zA-Z0-9_]*
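As a rough sketch, the integer and C identifier definitions can be tested with Python's `re` module. The function `classify` and the sample inputs are illustrative assumptions; the digit alternation is written as the equivalent class `[0-9]`.

```python
import re

DIGIT = "[0-9]"                                   # same language as 0|1|...|9
INTEGER = re.compile(DIGIT + DIGIT + "*")         # integer = digit digit*
IDENTIFIER = re.compile(r"[a-zA-Z_][a-zA-Z0-9_]*")

def classify(w: str):
    """Return 'integer', 'identifier', or None for anything else."""
    if INTEGER.fullmatch(w):
        return "integer"
    if IDENTIFIER.fullmatch(w):
        return "identifier"
    return None

for w in ["42", "007", "count", "_tmp1", "9lives", "my-var", ""]:
    print(repr(w), classify(w))
```

Both patterns reject the empty string, since each requires at least one leading character.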
22
Recap
Language L(R):
set of strings represented
by a regular expression R.
L(R) is the language
denoted by regular
expression R.
23
How to Use REs
We need a mechanism to
determine if an input string
w belongs to L(R), the
language denoted by
regular expression R.
24
Acceptor
Such a mechanism is called
an acceptor.
input string w → [acceptor for language L] →
yes, if w ∈ L
no, if w ∉ L
25
Finite Automata (FA)
Specification:
Regular Expressions
Implementation:
Finite Automata
26
Finite Automata
A finite automaton consists of
An input alphabet (∑)
A set of states
A start (initial) state
A set of transitions
A set of accepting (final)
states
27
Finite Automaton
State Graphs
A state
The start state
An accepting
state
28
Finite Automaton
State Graphs
A transition: an arrow between two states, labelled with an input character (e.g. a)
29
Finite Automata
A finite automaton accepts a
string if we can follow
transitions labelled with
characters in the string from
start state to some
accepting state.
30
FA Example
An FA that accepts only “1”
[State graph: start state --1--> accepting state]
31
FA Example
An FA that accepts any number
of 1’s followed by a single 0
[State graph: start state with a self-loop on 1, then --0--> accepting state]
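The acceptance rule can be sketched in a few lines. This is one possible encoding, not the slides' own: transitions are a dictionary keyed by (state, character), and the state names `q0` and `q1` are invented because the slides draw unlabelled circles.

```python
def accepts(transitions, start, accepting, w):
    """Follow one transition per character; reject if none exists."""
    state = start
    for ch in w:
        if (state, ch) not in transitions:
            return False
        state = transitions[(state, ch)]
    return state in accepting

# FA that accepts only "1".
ONLY_ONE = {("q0", "1"): "q1"}

# FA that accepts any number of 1's followed by a single 0.
ONES_THEN_ZERO = {("q0", "1"): "q0", ("q0", "0"): "q1"}

print(accepts(ONLY_ONE, "q0", {"q1"}, "1"))           # True
print(accepts(ONLY_ONE, "q0", {"q1"}, "11"))          # False
print(accepts(ONES_THEN_ZERO, "q0", {"q1"}, "0"))     # True
print(accepts(ONES_THEN_ZERO, "q0", {"q1"}, "1110"))  # True
print(accepts(ONES_THEN_ZERO, "q0", {"q1"}, "1101"))  # False
```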
32
FA Example
An FA that accepts ab*a
Alphabet: {a,b}
[State graph: start state --a--> middle state with a self-loop on b --a--> accepting state]
33
Table Encoding of FA
Transition table
[State graph: 0 --a--> 1, 1 --b--> 1, 1 --a--> 2]
state   a     b
0       1     err
1       2     1
2       err   err
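A table-driven recognizer for this machine might look like the sketch below. It assumes the table reads as: state 0 goes to 1 on a, state 1 goes to 2 on a and stays in 1 on b, and every other entry is err; it also assumes state 0 is the start state and state 2 the only accepting state, consistent with ab*a. The names `TABLE`, `COLUMN`, and `run` are illustrative.

```python
ERR = None                     # marks a missing transition ("err" in the table)
COLUMN = {"a": 0, "b": 1}      # which table column each character selects

TABLE = [
    # a    b
    [1,   ERR],   # state 0
    [2,   1],     # state 1
    [ERR, ERR],   # state 2
]
START, ACCEPTING = 0, {2}      # assumed start and accepting states

def run(w: str) -> bool:
    state = START
    for ch in w:
        if ch not in COLUMN:   # character outside the alphabet
            return False
        state = TABLE[state][COLUMN[ch]]
        if state is ERR:
            return False
    return state in ACCEPTING

for w in ["aa", "aba", "abbba", "ab", "aab", ""]:
    print(repr(w), run(w))
```

The loop does one table lookup per input character, which is the sense in which DFAs are easy to implement.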
34
RE → Finite Automata
Can we build a finite
automaton for every regular
expression?
Yes: build the FA inductively,
based on the definition of
regular expressions
35
NFA
Nondeterministic Finite
Automaton (NFA)
Can have multiple
transitions for one input
in a given state
Can have ε-moves
36
Epsilon Moves
ε – moves
machine can move from state
A to state B without consuming
input
[State graph: A --ε--> B]
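A common way to work with ε-moves in code is to compute the set of states reachable without consuming input, often called the ε-closure. The sketch below is an assumption about how one might encode this; the extra move from B to C is hypothetical and added only to show chaining.

```python
def epsilon_closure(states, eps_moves):
    """All states reachable from `states` using ε-moves only."""
    closure = set(states)
    stack = list(states)
    while stack:
        s = stack.pop()
        for t in eps_moves.get(s, ()):
            if t not in closure:
                closure.add(t)
                stack.append(t)
    return closure

# A --ε--> B as on the slide; B --ε--> C is a made-up extra move.
EPS = {"A": {"B"}, "B": {"C"}}

print(sorted(epsilon_closure({"A"}, EPS)))   # ['A', 'B', 'C']
print(sorted(epsilon_closure({"B"}, EPS)))   # ['B', 'C']
print(sorted(epsilon_closure({"C"}, EPS)))   # ['C']
```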
37
NFA
operation of the automaton is not
completely defined by input
[State graph with states A, B, C and transitions labelled 1, 0, 1]
On input “11”, automaton could be
in either state
38
Execution of FA
An NFA can choose
Whether to make ε-moves.
Which of multiple
transitions to take for a
single input.
39
Acceptance of NFA
NFA can get into multiple states
Rule: an NFA accepts if it can get
into a final state
[State graph with states A, B, C and transitions labelled 1, 0, 1, 0]
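The rule can be simulated by tracking every state the NFA could be in after each character. The sketch below uses a made-up three-state NFA (it accepts binary strings ending in "10"), not the slide's graph, and leaves out ε-moves to keep it short.

```python
def nfa_accepts(delta, start, accepting, w):
    """Track the set of possible states; accept if any of them is final."""
    current = {start}
    for ch in w:
        current = {t for s in current for t in delta.get((s, ch), ())}
    return bool(current & accepting)

# Hypothetical NFA: A loops on 0 and 1, guesses on 1 that "10" ends the input.
DELTA = {
    ("A", "0"): {"A"},
    ("A", "1"): {"A", "B"},    # two choices for the same input
    ("B", "0"): {"C"},
}

for w in ["10", "0110", "11", "101", ""]:
    print(repr(w), nfa_accepts(DELTA, "A", {"C"}, w))
```

On input "11" the tracked set is {A, B}: the automaton could be in either state, and neither is accepting.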
40
DFA and NFA
Deterministic Finite Automata
(DFA)
One transition per input per
state.
No ε-moves
41
Execution of FA
A DFA
can take only one path
through the state graph.
Completely determined by
input.
42
NFA vs DFA
NFAs and DFAs recognize
the same set of languages
(regular languages)
DFAs are easier to
implement – table driven.
43
NFA vs DFA
For a given language, the
NFA can be simpler than
the DFA.
DFA can be exponentially
larger than NFA.
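A standard illustration of the blow-up, not taken from the slides, is the language of binary strings whose k-th symbol from the end is 1. An NFA needs k+1 states, while tracking the sets of states it could be in yields 2^k DFA states. The sketch below counts them; the function names are illustrative.

```python
def nfa_delta(k):
    """NFA with states 0..k: state 0 guesses which 1 is k-th from the end."""
    delta = {(0, "0"): {0}, (0, "1"): {0, 1}}
    for i in range(1, k):
        for ch in "01":
            delta[(i, ch)] = {i + 1}
    return delta

def reachable_dfa_states(delta, start):
    """Sets of NFA states reachable from {start}; each one is a DFA state."""
    start_set = frozenset({start})
    seen = {start_set}
    worklist = [start_set]
    while worklist:
        subset = worklist.pop()
        for ch in "01":
            target = frozenset(t for s in subset for t in delta.get((s, ch), ()))
            if target not in seen:
                seen.add(target)
                worklist.append(target)
    return seen

for k in range(1, 6):
    dfa_states = reachable_dfa_states(nfa_delta(k), 0)
    print(f"k={k}: NFA states={k + 1}, DFA states={len(dfa_states)}")
```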
44
NFA vs DFA
NFAs are the key to
automating RE → DFA
construction.
45
RE → NFA Construction
Thompson’s construction
(CACM 1968)
Build an NFA for each RE
term.
Combine NFAs with
ε-moves.
46
RE → NFA Construction
Subset construction
NFA → DFA
Build the simulation.
Minimize number of states
in DFA (Hopcroft’s
algorithm)
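The subset construction step could be sketched as follows. Each DFA state is the set of NFA states reachable on the input read so far. The example NFA (for ab, with one ε-move joining the two parts) and all function names are assumptions made for illustration; minimization is not shown.

```python
def eps_closure(states, eps):
    """States reachable from `states` via ε-moves only, as a frozenset."""
    closure, stack = set(states), list(states)
    while stack:
        for t in eps.get(stack.pop(), ()):
            if t not in closure:
                closure.add(t)
                stack.append(t)
    return frozenset(closure)

def subset_construction(delta, eps, start, accepting, alphabet):
    d0 = eps_closure({start}, eps)
    dfa_delta, seen, worklist = {}, {d0}, [d0]
    while worklist:
        subset = worklist.pop()
        for ch in alphabet:
            moved = {t for s in subset for t in delta.get((s, ch), ())}
            if not moved:
                continue                  # no transition: the DFA rejects here
            target = eps_closure(moved, eps)
            dfa_delta[(subset, ch)] = target
            if target not in seen:
                seen.add(target)
                worklist.append(target)
    dfa_accepting = {s for s in seen if s & accepting}
    return d0, dfa_delta, dfa_accepting

def dfa_accepts(dfa, w):
    state, delta, accepting = dfa
    for ch in w:
        state = delta.get((state, ch))
        if state is None:
            return False
    return state in accepting

# Assumed NFA for ab: s0 --a--> s1 --ε--> s3 --b--> s4
DELTA = {("s0", "a"): {"s1"}, ("s3", "b"): {"s4"}}
EPS = {"s1": {"s3"}}
dfa = subset_construction(DELTA, EPS, "s0", {"s4"}, "ab")

print(len({dfa[0]} | set(dfa[1].values())))    # 3 DFA states
print(dfa_accepts(dfa, "ab"), dfa_accepts(dfa, "a"), dfa_accepts(dfa, "abb"))
```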
47
RE → NFA Construction
Key idea:
NFA pattern for each
symbol and each operator.
Join them with ε-moves in
precedence order.
48
RE → NFA Construction
NFA for a: s0 --a--> s1
NFA for ab: s0 --a--> s1 --ε--> s3 --b--> s4
49
RE → NFA Construction
NFA for a: s0 --a--> s1
50
RE → NFA Construction
NFA for a: s0 --a--> s1
NFA for b: s3 --b--> s4
51
RE → NFA Construction
NFA for a: s0 --a--> s1
NFA for b: s3 --b--> s4
[The two state graphs are placed side by side]
52
RE → NFA Construction
NFA for a: s0 --a--> s1
NFA for b: s3 --b--> s4
NFA for ab: s0 --a--> s1 --ε--> s3 --b--> s4
53
RE → NFA Construction
NFA for a | b:
s0 --ε--> s1 --a--> s2 --ε--> s5
s0 --ε--> s3 --b--> s4 --ε--> s5
54
RE → NFA Construction
NFA for a: s1 --a--> s2
55
RE → NFA Construction
NFA for a and b:
s1 --a--> s2
s3 --b--> s4
56
RE → NFA Construction
NFA for a*:
s0 --ε--> s1 --a--> s2 --ε--> s4
s2 --ε--> s1 (repeat a)
s0 --ε--> s4 (zero occurrences of a)
58
RE → NFA Construction
NFA for a: s1 --a--> s2
59
Example RE → NFA
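NFA for a ( b|c )*:
s0 --a--> s1 --ε--> s2
s2 --ε--> s3, s2 --ε--> s9
s3 --ε--> s4 --b--> s5 --ε--> s8
s3 --ε--> s6 --c--> s7 --ε--> s8
s8 --ε--> s3, s8 --ε--> s9 (accepting)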
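The construction for this example can be sketched in code. The version below is a simplified take on Thompson's construction, not the slides' own implementation: each helper returns a fragment (start, accept, edges), new states get fresh integer ids (so numbering differs from s0..s9), and `None` stands for an ε label. All names are illustrative.

```python
from itertools import count

EPS = None                      # label used for ε-moves
_ids = count()                  # source of fresh state numbers

def symbol(ch):
    s, t = next(_ids), next(_ids)
    return s, t, [(s, ch, t)]

def concat(f, g):
    # Join f's accepting state to g's start state with an ε-move.
    return f[0], g[1], f[2] + g[2] + [(f[1], EPS, g[0])]

def union(f, g):
    s, t = next(_ids), next(_ids)
    joins = [(s, EPS, f[0]), (s, EPS, g[0]), (f[1], EPS, t), (g[1], EPS, t)]
    return s, t, f[2] + g[2] + joins

def star(f):
    s, t = next(_ids), next(_ids)
    joins = [(s, EPS, f[0]), (f[1], EPS, t), (f[1], EPS, f[0]), (s, EPS, t)]
    return s, t, f[2] + joins

def closure(states, edges):
    """States reachable from `states` through ε-moves only."""
    result, stack = set(states), list(states)
    while stack:
        u = stack.pop()
        for src, label, dst in edges:
            if src == u and label is EPS and dst not in result:
                result.add(dst)
                stack.append(dst)
    return result

def accepts(fragment, w):
    start, accept, edges = fragment
    current = closure({start}, edges)
    for ch in w:
        moved = {dst for src, label, dst in edges if src in current and label == ch}
        current = closure(moved, edges)
    return accept in current

nfa = concat(symbol("a"), star(union(symbol("b"), symbol("c"))))
print(len({s for src, _, dst in nfa[2] for s in (src, dst)}))   # 10 states
for w in ["a", "ab", "abcb", "", "ba", "abd"]:
    print(repr(w), accepts(nfa, w))
```

The resulting NFA has ten states, the same count as s0 to s9 in the example.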
61
Example RE → NFA
Building the NFA for a ( b|c )*
NFA for a: s0 --a--> s1
62
Example RE → NFA
NFA for a, b and c:
s0 --a--> s1
s4 --b--> s5
s6 --c--> s7
63
Example RE → NFA
NFA for a and b|c:
s0 --a--> s1
s3 --ε--> s4 --b--> s5 --ε--> s8
s3 --ε--> s6 --c--> s7 --ε--> s8
64
Example RE → NFA
NFA for a and ( b|c )*:
s0 --a--> s1
s2 --ε--> s3, s2 --ε--> s9
s3 --ε--> s4 --b--> s5 --ε--> s8
s3 --ε--> s6 --c--> s7 --ε--> s8
s8 --ε--> s3, s8 --ε--> s9
65