Transcript ppt

COMS 3261, Lecture 2
Strings, Languages, Automata
September 6, 2001
Agenda
Today



Strings
Languages
Deterministic Finite Automata
For next time:


Read up to 1.2
Check out ABCEZ in lecture 2-4 directory.
Due 9/13:

HW 1
Vending Machine Example
Vending machine dispenses soda for
$0.45
Accepts only dimes and quarters
Eats your money if you don’t have
correct change
Model this by Java-method pseudocode:
Vending Machine Example
Soda vend(){
int total = 0, coin;
while (total != 45){
receive(coin);
if ((coin==10 && total==40)
||(coin==25 && total>=25))
reject(coin);
else
total += coin;
}
return new Soda();
}
Vending Machine Example
Why was this over-kill?
Vending Machine Example
Why was this over-kill?
1) Vending machines have been around long
before computers or Java!
2) Don’t really need int’s. Each int
introduces 232 possibilities multiplicatively!!!
3) Don’t need to know how to add integers to
model venting machine (total += coin)
4) if/else, Java grammar, all really artifices that
just complicate the essence
Vending Machine Example
Vending Machine Example
Input:
DQQD
About to put in a dime
Vending Machine Example
Input:
DQQD
About to put in a quarter
Vending Machine Example
Input:
DQQD
About to put in a quarter
Vending Machine Example
Input:
DQQD
About to put in a dime
Vending Machine Example
Input:
DQQD
DONE! MONEY ACCEPTED.
Vending Machine Example
What made this example simpler than the
Java pseudocode?
Vending Machine Example
1. Only needed two coin types “D” and
“Q” – symbols/letters in alphabet
2. Only needed 7 possible current total
amounts – states/nodes/vertices
3. Much cleaner and aesthetically
pleasing than Java lingo
Now generalize and abstractify…
Alphabets, Strings, Languages
DEF: An alphabet S is a set of symbols
(characters, letters). A string (or word) over
S is a sequence of symbols. The empty
string is the string containing no symbols at
all, and is denoted by e.
Q1: What is S in our vending machine
example?
Q2: What are some good/bad strings in our
example?
Q3: What does e signify in our example?
Alphabets, Strings, Languages
A1: S = { D, Q }
A2:
Good: QDD, DQD, DDQ, QQQQDD, etc.
Bad: Q, D, DD, etc.
Ugly: DDD …now you’re screwed!
A3: e signifies trying to get something for
nothing (putting no money in at all).
Alphabets, Strings, Languages
DEF: The length of a string is the
number of symbols that it contains
(repetitions allowed).
Absolute values are used to denote
length.
EG: Lengths of the above good (QDD,
DQD, DDQ, QQQQDD) are: 3, 3, 3, 6
Q: What’s the length of e ?
Alphabets, Strings, Languages
A: |e| = 0
Alphabets, Strings, Languages
DEF: The concatenation of two strings is the
string resulting from putting them together
from left to right. Given strings u and v,
denote the concatenation by u v, or just
uv.
EG: ire  land = ireland, QQ  DD = QQDD,
DDD  u is still bad, no-matter what u is!
Q1: Why the last claim?
Q2: What’s the Java equivalent of
concatenation?
Q3: Find a formula for |u v |.
Alphabets, Strings, Languages
A1: You are still screwed no matter what
combination of coins you put in.
A2: The + operator on strings.
A3: |u v | = |u |+|v |
Alphabets, Strings, Languages
DEF: The reversal of a string u is
denoted by u R.
EG: (banana)R = ananab
DEF: If S is an alphabet, S* denotes the
set of all strings over S. A language
over S is a subset of S*, i.e. a set of
strings each consisting of sequences of
letters in S.
Alphabets, Strings, Languages
EG: S = { D, Q }
S* = {e,
D, Q,
DD, DQ, QD, QQ,
DDD, DDQ, DQD, DQQ, QDD, QDQ, QQD, QQQ,
DDDD, DDDQ, … }
Define L = { uS* | u successfully vends }
Classroom exercise: What are all the strings in L
of length 1? Of length 2? 3? 4? 5?
Finite Deterministic Automata
More computer-like example:
sourceless
arrow
denotes
start
input put
on tape
read left
to right
double
circle
denotes
accept
0
0
1
0
1
1
1
1
0
0 1
Finite Deterministic Automata
0
0
1
0
1
1
1
1
0
0 1
Finite Deterministic Automata
0
0
1
0
1
1
1
1
0
0 1
Finite Deterministic Automata
0
0
1
0
1
1
1
1
0
0 1
Finite Deterministic Automata
0
0
1
0
1
1
1
1
0
0 1
Finite Deterministic Automata
0
0
1
0
1
1
1
1
0
0 1
Finite Deterministic Automata
0
0
1
0
1
1
1
1
0
0 1
REJECT!
Finite Deterministic Automata
0
0
1
0
1
1
Q: What kinds of bitstrings
are accepted?
Finite Deterministic Automata
0
0
1
0
1
1
A: Bitstrings that represent binary
even numbers.
Finite Deterministic Automata
Exercise: Design with a friend a machine
that tells us when a base-10 number is
divisible by 3.
What should your alphabet be?
How can you tell when a number is
divisible by 3?
Finite Deterministic Automata
0,3,6,9
Solution:
(except for e )
1 mod 3
1,4,7
2,5,8
0 mod 3
2,5,8
0,3,6,9
1,4,7
2,5,8
2 mod 3
0,3,6,9
1,4,7
Formal Definition of FA
DEF: A (deterministic ) finite
automaton (FA) consists of a set of
states Q, an alphabet S, labeled
transitions between states d, a start
state q0 Q, and a set of accept
states F. This is all encapsulated by
the “5-tuple”
M = (Q, S, d, q0, F )
Formal Definition of FA
Notice that the input string, as well as the
tape containing the input string, are
implicit in the definition of an FA. I.e.,
definition only deals with static view.
Further explaining needed for
understanding how FA’s interact with
their input.
Why Deterministic?
Deterministic means that there is enough
information to always determine which
state the Automaton goes into next,
when reading a particular symbol. Our
Vending Machine Example was actually
not deterministic because after $.45
have been deposited, the effects of an
additional coin deposit are undefined.
0,3,6,9
0 mod 3
1,4,7
2,5,8
0,3,6,9
2,5,8
1 mod 3
2,5,8
1,4,7
Exercise: Find the
formal description of
this automaton.
2 mod 3
0,3,6,9
1,4,7
Definition of FA, example
Q = { 0 mod 3, 1 mod 3, 2 mod 3 }
( rename: {q0, q1, q2} )
S = { 0, 1, 2, 3, 4, 5, 6, 7, 8, 9 }
q0 = 0 mod 3
F = { 0 mod 3 }
d -- further explanation required
The labeling function d
d tells us which state to go to, if machine
reads a given symbol. I.e., given a
source state in Q and a letter in S, d
defines a unique target state in Q. In
other words, d is a function from the
Cartesian product Q x S to Q :
δ :QS  Q
The labeling function d
δ:QS  Q
δ(q0 ,2) = q2 , δ(q0 ,9) = q0 , δ(q1 ,2) = q0 ,
δ(q1 ,7) = q2 , δ(q2 ,3) = q2 , δ(q2 ,5) = q1.
Question : δ(qi , j ) = ?
The labeling function d
δ(qi , j ) = q(i  j ) mod 3
Usually don’t have such neat and
Simple formulas.
Formal Definition of an FA:
Dynamic
How does an FA operate on strings?
Implicitly, there is some notion of an
auxiliary tape containing the string.
The FA reads the tape from left to right
with each new character causing the FA
to go into another state. When the
string is completely read, the string is
accepted depending on whether the
FA’s final state was an accept state.
Formal Definition of an FA:
Dynamic
DEF: A string u is accepted by an
automaton iff (IF and only iF ) the path
starting at q0 which is labeled by u ends
in an accept state.
Note: To really define what it means for
string to label a path, you need to break
u up into its sequence of characters and
apply d repeatedly, keeping track of
states. See Sipser for further details.
Language Accepted by an FA
DEF: The language accepted by an FA M is
the set of all strings which are accepted by M
and is denoted by L (M ).
Intuitively, think of all the possible ways of
getting from the start state to any accept
state. Then think of all the possible ways of
labeling those paths (if there are multiple
labels on some edges).
Regular Languages
We will eventually see that not all languages
can be described as the accepted language of
some FA. Languages which do belong to
some FA, exhibit a high degree of regularity
and fit the pattern defined by the FA. In fact,
DEF: A language L is called a regular
language if some FA M exists such that
L = L (M ).
Blackboard exercises