# Finite-state Recognizers

Zvi Kohavi and Niraj K. Jha

### Deterministic Recognizers

Treat an FSM as a recognizer that classifies input strings into two classes: strings it accepts and strings it rejects.

[Figure: a finite-state recognizer, with a tape holding 1 0 0 1 0 0 1 1, a head, and a finite control.]

Finite-state recognizer:
• Tape: equivalent to a string of input symbols that enter the machine at successive times
• Finite-state control: Moore FSM
• States in which the output symbol is 1 (0): accepting (rejecting) states
• A string is accepted by an FSM: if and only if the state the FSM enters after having read the rightmost symbol is an accepting state
• Set of strings recognized by an FSM: all input strings that take the FSM from its starting state to an accepting state
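The acceptance rule above can be sketched in a few lines of Python. The machine below is a hypothetical two-state Moore FSM (not one from the slides) whose output is 1 exactly when the last symbol read was a 1; the names `TRANSITIONS`, `OUTPUT`, `START`, and `accepts` are illustrative assumptions.

```python
# Hypothetical Moore-type recognizer over {0, 1}: it accepts strings ending in 1.
TRANSITIONS = {
    ("S0", "0"): "S0", ("S0", "1"): "S1",
    ("S1", "0"): "S0", ("S1", "1"): "S1",
}
OUTPUT = {"S0": 0, "S1": 1}  # Moore output: 1 marks an accepting state
START = "S0"


def accepts(tape: str) -> bool:
    """Read the tape left to right and test the output of the final state."""
    state = START
    for symbol in tape:
        state = TRANSITIONS[(state, symbol)]
    return OUTPUT[state] == 1


print(accepts("10010011"))  # True: the tape ends in 1
```

The string is judged only by the state reached after the rightmost symbol, which is why the output values seen along the way play no role.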

### Transition Graph

Example: a machine that accepts a string if and only if the string begins and ends with a 1, and every 0 in the string is preceded and followed by at least a single 1

[Figure: (a) Deterministic state diagram, with states A, B, and C. (b) Transition graph, with vertices A and B.]

Transition graph: consists of a set of vertices and various directed arcs connecting them
• At least one of the vertices is specified as a starting vertex
• Arcs are labeled with symbols from the input alphabet
• A vertex may have one or more $I_i$-successors or none
• It accepts a string if the string is described by at least one path emanating from a starting vertex and terminating at an accepting vertex
• It may be deterministic or non-deterministic
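A minimal sketch of path-based acceptance follows. The arcs are an assumption: a two-vertex graph chosen to match the language stated in the example (strings that begin and end with a 1, with every 0 preceded and followed by a 1), not a transcription of the figure. The simulation tracks the set of vertices reachable on the input read so far, so a missing successor simply empties that set.

```python
from itertools import product

# Assumed arcs: A --1--> B, B --1--> B, B --0--> A (A has no 0-successor).
ARCS = {("A", "1"): {"B"}, ("B", "1"): {"B"}, ("B", "0"): {"A"}}
STARTS = {"A"}
ACCEPTING = {"B"}


def graph_accepts(string: str) -> bool:
    """True if at least one path from a starting vertex ends in an accepting vertex."""
    current = set(STARTS)
    for symbol in string:
        current = {nxt for v in current for nxt in ARCS.get((v, symbol), set())}
    return bool(current & ACCEPTING)


def in_language(string: str) -> bool:
    """Direct statement of the example language, used as a cross-check."""
    return string.startswith("1") and string.endswith("1") and "00" not in string


# Compare the graph with the direct definition on all strings up to length 8.
mismatches = [
    "".join(chars)
    for n in range(9)
    for chars in product("01", repeat=n)
    if graph_accepts("".join(chars)) != in_language("".join(chars))
]
print(mismatches)  # [] if the assumed graph matches the stated language
```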

### Example

Example: 1110 and 11011 are accepted by the transition graph below, but 100 is rejected

[Figure: transition graph with vertices A, B, C, and D.]

Equivalent transition graphs: two or more graphs that recognize the same set of strings
• Each graph below accepts a string: if and only if each 1 is preceded by at least two 0’s

[Figure: two equivalent transition graphs, each with vertices A, B, and C.]

$\lambda$-transitions: when no input symbol is used to make the transition

Example: Graph that recognizes a set of strings that start with an even number of 1’s, followed by an even number of 0’s, and end with substring 101

[Figure: (a) A graph containing a $\lambda$-transition. (b) An equivalent graph without $\lambda$-transitions.]

### Converting Nondeterministic into Deterministic Graphs

Example: Transition graph and its transition table

[Figure: (a) Transition graph, with vertices A, B, and C. (b) Transition table.]

Successor table and deterministic graph:

[Figure: (a) Successor table. (b) State diagram of an equivalent deterministic machine.]
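The conversion can be sketched as the usual subset construction: each state of the deterministic machine is a set of vertices of the nondeterministic graph, and its successor under a symbol is the union of the successors of its members. The graph used here is a hypothetical three-vertex example (strings ending in 01), not the one in the figure, and the function names are illustrative.

```python
from itertools import product


def determinize(arcs, starts, accepting, alphabet):
    """Build a successor table whose rows are subsets of the graph's vertices."""
    start = frozenset(starts)
    table, pending = {}, [start]
    while pending:
        subset = pending.pop()
        if subset in table:
            continue
        table[subset] = {}
        for symbol in alphabet:
            successor = frozenset(
                nxt for v in subset for nxt in arcs.get((v, symbol), ())
            )
            table[subset][symbol] = successor
            pending.append(successor)
    final = {subset for subset in table if subset & accepting}
    return table, start, final


def dfa_accepts(table, start, final, string):
    state = start
    for symbol in string:
        state = table[state][symbol]
    return state in final


# Hypothetical nondeterministic graph recognizing strings that end in 01.
ARCS = {("A", "0"): {"A", "B"}, ("A", "1"): {"A"}, ("B", "1"): {"C"}}
TABLE, START, FINAL = determinize(ARCS, {"A"}, {"C"}, "01")
print(len(TABLE))  # number of reachable subsets; never more than 2**3
```

Only subsets reachable from the starting subset are generated, so the table is often much smaller than the worst case.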

### Theorem

Theorem: Let $S$ be a set of strings that can be recognized by a nondeterministic transition graph $G_n$. Then $S$ can also be recognized by an equivalent deterministic graph $G_d$. Moreover, if $G_n$ has $p$ vertices, $G_d$ will have at most $2^p$ vertices.

### Regular Expressions

Example: Sets of strings and the corresponding expressions
• Graph (a) recognizes set {101}: expression denoted as $101$
• Graph (b) recognizes set {01, 10}: expression = $01 + 10$
• Graph (c) recognizes {0111, 1011}: expression = $0111 + 1011$
 – Concatenation of $01 + 10$ and $11$
• Graph (d): expression = $1^*$

[Figure: transition graphs (a), (b), (c), and (d).]

### Regular Expressions (Contd.)

• $R^* = \lambda + R + R^2 + R^3 + \dots$

Example: $01(01)^* = 01 + 0101 + 010101 + 01010101 + \dots$

Example: Set of strings on {0,1} beginning with a 0 and followed only by 1’s: $01^*$

Example: Set of strings on {0,1} containing exactly two 1’s: $0^*10^*10^*$

Example: Set of all strings on {0,1}: $(0 + 1)^* = \lambda + 0 + 1 + 00 + 01 + 10 + 11 + 000 + \dots$

Example: Set of strings on {0,1} that begin with substring 11: $11(0 + 1)^*$

Example: Transition graphs and the sets of strings they recognize

[Figure: (a) $(01 + 10)^*11$. (b) $(10^*)^*$.]
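These expressions can be tried out with Python's `re` module, assuming the straightforward translation in which union `+` becomes `|` while concatenation and `*` carry over unchanged. The helper names below are illustrative, and the translation does not handle $\lambda$ or $\phi$.

```python
import re
from itertools import product


def to_python_regex(expression: str) -> str:
    """Translate the slide notation: drop spaces and write union '+' as '|'."""
    return expression.replace(" ", "").replace("+", "|")


def described_by(expression: str, string: str) -> bool:
    """True if the whole string belongs to the set the expression describes."""
    return re.fullmatch(to_python_regex(expression), string) is not None


def all_strings(max_len: int):
    """All strings on {0,1} of length at most max_len."""
    for n in range(max_len + 1):
        for chars in product("01", repeat=n):
            yield "".join(chars)


# Strings of length at most 4 that contain exactly two 1's.
print([s for s in all_strings(4) if described_by("0*10*10*", s)])
```

`re.fullmatch` is used rather than `re.search` because a regular expression here describes entire strings, not substrings.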

### Definition and Basic Properties

Let $A = \{a_1, a_2, \dots, a_p\}$ be a finite alphabet: then the class of regular expressions over alphabet $A$ is defined recursively as follows:
• Any symbol, $a_1, a_2, \dots, a_p$ alone is a regular expression: as are the null string $\lambda$ and the empty set $\phi$
• If $P$ and $Q$ are regular expressions: then so is their concatenation $PQ$ and their union $P + Q$
 – If $P$ is a regular expression: then so is its closure $P^*$
• No other expressions are regular: unless they can be generated in a finite number of applications of the above rules

[Figure: panels (a) and (b), graphs accepting $\lambda$ and $\phi$.]

### Identities

• $\phi + R = R$
• $\phi R = R\phi = \phi$
• $\lambda R = R\lambda = R$
• $\lambda^* = \lambda$
• $\phi^* = \lambda$

Set of strings that can be described by a regular expression: regular set
• Not every set of strings is regular
• Set over {0,1}, which consists of $k$ 0’s (for all $k$), followed by a 1, followed in turn by $k$ 0’s, is not regular: $010 + 00100 + 0001000 + \dots + 0^k10^k + \dots$
 – Requires an infinite number of applications of the union operation
• However, certain infinite sums are regular
 – Set consisting of alternating 0’s and 1’s, starting and ending with a 1: $1(01)^*$

### Manipulating Regular Expressions

A regular set may be described by more than one regular expression
• Such expressions are called equivalent

Example: Alternating 0’s and 1’s, starting and ending with 1
• $1(01)^*$ or $(10)^*1$

Let $P$, $Q$, and $R$ be regular expressions: then
• $R + R = R$
• $PQ + PR = P(Q + R)$; $PQ + RQ = (P + R)Q$
• $R^*R^* = R^*$
• $RR^* = R^*R$
• $(R^*)^* = R^*$
• $\lambda + RR^* = R^*$
• $(PQ)^*P = P(QP)^*$
• $(P + Q)^* = (P^*Q^*)^* = (P^* + Q^*)^* = P^*(QP^*)^* = (P^*Q)^*P^*$
• $\lambda + (P + Q)^*Q = (P^*Q)^*$
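Identities of this kind can be sanity-checked (not proved) by comparing both sides as finite sets of strings, truncated at some maximum length. The sketch below assumes languages are represented as Python sets and that `P` and `Q` are arbitrary small sample sets; the helpers `concat` and `star` are illustrative names.

```python
def concat(X, Y, n):
    """Concatenation XY, keeping only strings of length at most n."""
    return {x + y for x in X for y in Y if len(x) + len(y) <= n}


def star(X, n):
    """Closure X*, keeping only strings of length at most n."""
    result, frontier = {""}, {""}
    while frontier:
        frontier = concat(frontier, X, n) - result
        result |= frontier
    return result


N = 6
P = {"0", "01"}  # arbitrary sample sets, chosen only for illustration
Q = {"1"}

lhs = star(P | Q, N)                                              # (P + Q)*
form_a = star(concat(star(P, N), star(Q, N), N), N)               # (P*Q*)*
form_b = concat(star(P, N), star(concat(Q, star(P, N), N), N), N)  # P*(QP*)*
form_c = concat(star(concat(star(P, N), Q, N), N), star(P, N), N)  # (P*Q)*P*

print(lhs == form_a == form_b == form_c)  # True
```

Truncation is safe here because every string of length at most `N` in a concatenation or closure is built from pieces that are themselves no longer than `N`. Agreement up to a bound is evidence, not a proof; the identities themselves hold for all lengths.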

### Examples

Example: Prove that the set of strings in which every 0 is immediately followed by at least two 1’s can be described by both $R_1$ and $R_2$, where

$R_1 = \lambda + 1^*(011)^*(1^*(011)^*)^*$

$R_2 = (1 + 011)^*$

Proof:

$R_1 = \lambda + 1^*(011)^*(1^*(011)^*)^* = (1^*(011)^*)^* = (1 + 011)^* = R_2$

Example: Prove the identity

$(1 + 00^*1) + (1 + 00^*1)(0 + 10^*1)^*(0 + 10^*1) = 0^*1(0 + 10^*1)^*$

Proof:

LHS $= (1 + 00^*1)[\lambda + (0 + 10^*1)^*(0 + 10^*1)]$

$= (1 + 00^*1)(0 + 10^*1)^*$

$= (\lambda + 00^*)1(0 + 10^*1)^*$

$= 0^*1(0 + 10^*1)^*$

### Transition Graphs Recognizing Regular Sets

Theorem: Every regular expression $R$ can be recognized by a transition graph

Proof:

[Figure: panels (a), (b), and (c), graphs recognizing the elementary expressions $\lambda$, $\phi$, and $a_i$.]

[Figure: graphs $G$ and $H$. (a) Graphs recognizing $P$ and $Q$. (b) A graph recognizing $P + Q$. (c) A graph recognizing $PQ$. (d) A graph recognizing $P^*$.]
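The inductive proof can be mirrored in code: start from graphs for single symbols and combine them with $\lambda$-transitions for union, concatenation, and closure. The sketch below is a Thompson-style construction in which every fragment has one start and one accepting vertex; this is an assumption made for simplicity and may differ in detail from the graphs in the figure. Each fragment is a tuple `(start, accept, arcs)`, and a fragment must not be reused twice in one expression, since its vertices would then be shared.

```python
import re
from itertools import count, product

LAMBDA = None      # label used for lambda-transitions
_fresh = count()   # supplies fresh vertex names


def symbol(a):
    s, f = next(_fresh), next(_fresh)
    return s, f, [(s, a, f)]


def union(g, h):
    s, f = next(_fresh), next(_fresh)
    links = [(s, LAMBDA, g[0]), (s, LAMBDA, h[0]), (g[1], LAMBDA, f), (h[1], LAMBDA, f)]
    return s, f, g[2] + h[2] + links


def concat(g, h):
    return g[0], h[1], g[2] + h[2] + [(g[1], LAMBDA, h[0])]


def star(g):
    s = next(_fresh)  # the new vertex is both starting and accepting
    return s, s, g[2] + [(s, LAMBDA, g[0]), (g[1], LAMBDA, s)]


def closure(vertices, arcs):
    """All vertices reachable from the given ones using only lambda-transitions."""
    seen, stack = set(vertices), list(vertices)
    while stack:
        v = stack.pop()
        for u, label, w in arcs:
            if u == v and label is LAMBDA and w not in seen:
                seen.add(w)
                stack.append(w)
    return seen


def accepts(graph, string):
    start, accept, arcs = graph
    current = closure({start}, arcs)
    for ch in string:
        step = {w for u, label, w in arcs if u in current and label == ch}
        current = closure(step, arcs)
    return accept in current


# R = (0 + 1(01)*)*0, built bottom-up from its subexpressions.
P = union(symbol("0"), concat(symbol("1"), star(concat(symbol("0"), symbol("1")))))
R = concat(star(P), symbol("0"))
print(accepts(R, "1010"), accepts(R, "11"))
```

The graph produced this way is usually larger than a hand-drawn one, since it introduces a $\lambda$-transition at every composition step.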

### Example

Example: Construct a transition graph recognizing $R = (0 + 1(01)^*)^*0$

[Figure: step-by-step construction. (a) $R = P^*0$; $P = 0 + 1(01)^*$. (b) $P = 0 + Q$; $Q = 1(01)^*$. (c) $Q = 1T$; $T = (01)^*$. (d) Final step.]

### Example (Contd.)

Example: Prove that $(P + Q)^* = P^*(QP^*)^*$

[Figure: (a) Graph recognizing $P^*(QP^*)^*$. (b) Equivalent graph with no $\lambda$-transitions. (c) Equivalent deterministic graph recognizing $(P + Q)^*$.]

### Informal Techniques

Example: Construct a graph that recognizes $P = (01 + (11 + 0)1^*0)^*11$

[Figure: graph for $Q = (11 + 0)1^*0$, with vertices A, B, C, and D.]

[Figure: graph for $P$, with vertices A through F.]

### Example

Example: Construct a graph that recognizes $R = (1(00)^*1 + 01^*0)^*$

[Figure: (a) Partial graph. (b) Complete graph.]

### Regular Sets Corresponding to Transition Graphs

The set of strings that can be recognized by a transition graph (hence, an FSM) is a regular set

Theorem: Let $Q$, $P$, and $R$ be regular expressions on a finite alphabet. Then, if $P$ does not contain $\lambda$:
• Equation $R = Q + RP$ has a unique solution given by $R = QP^*$
• Equation $R = Q + PR$ has a unique solution given by $R = P^*Q$
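That $R = QP^*$ does satisfy the first equation can be checked by direct substitution, using the identities $\lambda + RR^* = R^*$ and $RR^* = R^*R$ listed earlier (this verifies the solution; it is not the uniqueness argument):

$$Q + (QP^*)P = Q(\lambda + P^*P) = QP^*$$

The second equation is checked the same way: $Q + P(P^*Q) = (\lambda + PP^*)Q = P^*Q$. The condition on $P$ matters for uniqueness: if, for instance, $P = \lambda$, the equation becomes $R = Q + R$, which is satisfied by every set that contains $Q$.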

### Systems of Equations

Example: Derive the set of strings recognized by the following transition graph

[Figure: transition graph with vertices A, B, and C.]

$A = \lambda + A0 + B1$ (1)

$B = A0 + B1 + C0$ (2)

$C = B0$ (3)

Substituting (3) into (2):

$B = A0 + B1 + B00 = A0 + B(1 + 00)$ (4)

From the theorem:

$B = A0(1 + 00)^*$ (5)

Substituting (5) into (1):

$A = \lambda + A0 + A0(1 + 00)^*1 = \lambda + A(0 + 0(1 + 00)^*1)$ (6)

From the theorem:

$A = \lambda(0 + 0(1 + 00)^*1)^* = (0 + 0(1 + 00)^*1)^*$ (7)

Hence, solution $C$ from (7), (5) and (3):

$C = (0 + 0(1 + 00)^*1)^*0(1 + 00)^*0$
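The derived expression can be cross-checked against a direct simulation. The arcs below are an assumption read off equations (1) to (3): a term such as B1 in the equation for A is taken to mean an arc from B to A labeled 1, with A as the starting vertex and C as the accepting vertex. The expression for C is translated to Python `re` syntax by writing union as `|`.

```python
import re
from itertools import product

# Arcs implied by the equations: A = A0 + B1 (+ lambda), B = A0 + B1 + C0, C = B0.
ARCS = {
    ("A", "0"): {"A", "B"},
    ("B", "1"): {"A", "B"},
    ("B", "0"): {"C"},
    ("C", "0"): {"B"},
}
START, ACCEPTING = "A", {"C"}

# C = (0 + 0(1 + 00)*1)*0(1 + 00)*0 in Python regex syntax.
DERIVED = re.compile(r"(0|0(1|00)*1)*0(1|00)*0")


def graph_accepts(string: str) -> bool:
    current = {START}
    for symbol in string:
        current = {nxt for v in current for nxt in ARCS.get((v, symbol), set())}
    return bool(current & ACCEPTING)


# Compare graph and expression on every string up to length 9.
disagreements = [
    "".join(chars)
    for n in range(10)
    for chars in product("01", repeat=n)
    if graph_accepts("".join(chars)) != bool(DERIVED.fullmatch("".join(chars)))
]
print(disagreements)  # [] when the expression and the assumed graph agree
```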

### Theorem

Theorem: The set of strings that take an FSM $M$ from an arbitrary state $S_i$ to another state $S_j$ is a regular set
• Combining the two theorems:
 – An FSM recognizes a set of strings if and only if it is a regular set

Applications: the correspondence between regular sets and FSMs enables us to determine whether certain sets are regular

Example: Let $R$ denote a regular set on alphabet $A$ that can be recognized by machine $M_1$
• Complement $R'$: set containing all the strings on $A$ that are not contained in $R$
• $R'$ describes a regular set: since it can be recognized by a machine $M_2$, which is obtained from $M_1$ by complementing the output values associated with the states of $M_1$

### Examples

Example: Let $P\&Q$ represent the intersection of sets $P$ and $Q$
• Prove $P\&Q$ is regular
 – Since $P'$ and $Q'$ are regular: $P' + Q'$ is regular
 – Hence, $(P' + Q')'$ is regular
 – Since $P\&Q = (P' + Q')'$: $P\&Q$ is regular

Regular expressions containing complementation, intersection, union, concatenation, closure: extended regular expressions

Example: Consider the set of strings on {0,1} s.t. no string in the set contains three consecutive 0’s
• Set can be described by: $[(0 + 1)^*000(0 + 1)^*]'$
• More complicated expression if complementation not used: $(1 + 01 + 001)^*(\lambda + 0 + 00)$
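The two descriptions of this set can be compared mechanically. The sketch assumes a hypothetical four-state Moore machine $M_1$ that outputs 1 once it has seen 000; complementing its outputs gives $M_2$, which should agree with the complement-free expression. In the Python pattern, `(0|00)?` stands in for $\lambda + 0 + 00$.

```python
import re
from itertools import product


def step(state: int, symbol: str) -> int:
    """State = length of the current run of 0's, capped at 3 (3 is a trap state)."""
    if state == 3:
        return 3
    return min(state + 1, 3) if symbol == "0" else 0


OUTPUT_M1 = {0: 0, 1: 0, 2: 0, 3: 1}                         # recognizes (0+1)*000(0+1)*
OUTPUT_M2 = {state: 1 - z for state, z in OUTPUT_M1.items()}  # complemented outputs


def recognized(output, string: str) -> bool:
    state = 0
    for symbol in string:
        state = step(state, symbol)
    return output[state] == 1


# (1 + 01 + 001)*(lambda + 0 + 00), written without complementation.
WITHOUT_COMPLEMENT = re.compile(r"(1|01|001)*(0|00)?")

agree = all(
    recognized(OUTPUT_M2, "".join(chars))
    == bool(WITHOUT_COMPLEMENT.fullmatch("".join(chars)))
    for n in range(11)
    for chars in product("01", repeat=n)
)
print(agree)  # True
```

Complementing outputs works because the machine is deterministic and completely specified: every string leads to exactly one state, so flipping that state's output flips the verdict.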

### Example

Example: Let $M$ be an FSM whose input/output alphabet is {0,1}. Assume the machine has a designated starting state. Let $z_1z_2 \dots z_n$ denote the output sequence produced by $M$ in response to input sequence $x_1x_2 \dots x_n$. Define a set $S_M$, which consists of all the strings $w$ s.t. $w = z_1x_1z_2x_2 \dots z_nx_n$ for any $x_1x_2 \dots x_n$ in $(0 + 1)^*$. Prove that $S_M$ is regular.
• Given the state diagram of $M$: replace each directed arc with two directed arcs and a new state, as shown in the figure
• Retain the original starting state: designate all the original states as accepting states
• The resulting nondeterministic graph recognizes $S_M$: thus $S_M$ is regular

[Figure: replace the arc from A to B labeled $x/z$ with a path from A through a new state to B, whose two arcs are labeled $z$ and then $x$.]
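The arc-splitting construction can be sketched as follows. The Mealy machine `MEALY` is a hypothetical two-state example, not machine $N$ from the slides; each arc `x/z` from `s` to `t` becomes an arc labeled `z` into a new vertex followed by an arc labeled `x` into `t`. New vertices are named by the pair `(s, x)`, which is an illustrative choice.

```python
from itertools import product

# Hypothetical Mealy machine: (state, input) -> (next state, output).
MEALY = {
    ("A", "0"): ("A", "0"), ("A", "1"): ("B", "0"),
    ("B", "0"): ("A", "1"), ("B", "1"): ("B", "1"),
}
START = "A"


def build_graph(mealy):
    """Split every arc x/z into a z-arc to a new vertex and an x-arc out of it."""
    edges = {}
    for (s, x), (t, z) in mealy.items():
        middle = (s, x)  # the new vertex introduced for this arc
        edges.setdefault((s, z), set()).add(middle)
        edges.setdefault((middle, x), set()).add(t)
    originals = {s for (s, _) in mealy} | {t for (t, _) in mealy.values()}
    return edges, originals  # all original states are accepting


EDGES, ACCEPTING = build_graph(MEALY)


def graph_accepts(w: str) -> bool:
    current = {START}
    for symbol in w:
        current = {nxt for v in current for nxt in EDGES.get((v, symbol), set())}
    return bool(current & ACCEPTING)


def interleaved_output(x: str) -> str:
    """The string z1 x1 z2 x2 ... zn xn produced by the machine on input x."""
    state, w = START, ""
    for symbol in x:
        state, z = MEALY[(state, symbol)]
        w += z + symbol
    return w


print(interleaved_output("101"), graph_accepts(interleaved_output("101")))
```

The graph is nondeterministic because, on reading an output symbol, it has to guess which input symbol will follow; a wrong guess leaves no successor and that path dies.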

### Example (Contd.)

Example (contd.): Derive $S_N$ for machine $N$ shown below

[Figure: machine $N$, with states A and B and arcs labeled 1/0, 0/1, 0/0, and 1/1.]

[Figure: (a) Transition graph, with new states C, D, E, and F. (b) Equivalent deterministic form.]

### Two-way Recognizers

Two-way recognizer (or two-way machine): consists of a finite-state control coupled through a head to a tape
• Initially: the finite-state control is in its designated starting state, with its head scanning the leftmost square of the tape
• The machine then proceeds to read the symbols of the tape: one at a time
• In each cycle of computation: the machine examines the symbol currently scanned by the head, shifts the head one square to the right or left, and then enters a new (not necessarily distinct) state
• If the machine eventually moves off the tape on the right end entering an accepting state: the tape is accepted by the machine
• A machine can reject a tape: either by moving off its right end while entering a rejecting state or by looping within the tape
• Null string $\lambda$: represented by the absence of an input tape or by a completely blank tape
• $\lambda$ is accepted: if and only if the starting state is an accepting state

### Example

Example: A two-way machine recognizing set $100^*$

[Figure: computations of the machine. (a) A loop. (b) Rejection of a tape.]

### Convenience of Using Two-way Machines

Two-way machines are as powerful as one-way machines w.r.t. the class of tapes they can recognize
• However, for some computations: it is convenient to use two-way machines since they may require fewer states

Example: Consider the two-way machine shown in the table, which accepts a tape if and only if it contains at least three 1’s and at least two 0’s
• The minimal one-way machine that is equivalent to the two-way machine has 12 states: since it must examine the tapes for the appropriate number of 0’s and 1’s simultaneously

[Figure: two computations of the machine, whose states are labeled A through G. (a) Rejecting a tape. (b) Accepting a tape.]
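A two-way machine for this example can be simulated as below. The state table is a hypothetical reconstruction, not the table from the slides: it assumes a left-end marker `¢` on the tape, counts three 1's on a first pass, walks back to the marker, and then counts two 0's on a second pass. State names such as `P0` and `BACK` are illustrative. Looping is detected by noticing a repeated (state, head position) pair.

```python
from itertools import product

LEFT_END = "¢"   # assumed left-end marker written before the input
R, L = +1, -1

# (state, scanned symbol) -> (next state, head move)
TABLE = {
    ("P0", LEFT_END): ("P0", R),
    ("P0", "0"): ("P0", R), ("P0", "1"): ("P1", R),
    ("P1", "0"): ("P1", R), ("P1", "1"): ("P2", R),
    ("P2", "0"): ("P2", R), ("P2", "1"): ("BACK", L),   # third 1 found
    ("BACK", "0"): ("BACK", L), ("BACK", "1"): ("BACK", L),
    ("BACK", LEFT_END): ("Z0", R),                       # start the second pass
    ("Z0", "1"): ("Z0", R), ("Z0", "0"): ("Z1", R),
    ("Z1", "1"): ("Z1", R), ("Z1", "0"): ("Z2", R),
    ("Z2", "0"): ("Z2", R), ("Z2", "1"): ("Z2", R),
}
START, ACCEPTING = "P0", {"Z2"}


def accepts(string: str) -> bool:
    tape = LEFT_END + string
    state, pos = START, 0
    seen = set()
    while 0 <= pos < len(tape):
        if (state, pos) in seen:
            return False              # same configuration twice: the machine loops
        seen.add((state, pos))
        move = TABLE.get((state, tape[pos]))
        if move is None:
            return False              # no entry in the table: treat as rejection
        state, direction = move
        pos += direction
    # Accept only when the head leaves the right end in an accepting state.
    return pos == len(tape) and state in ACCEPTING


print(accepts("0011010"), accepts("1110"))
```

This sketch uses seven states, compared with the 12 states quoted above for the minimal one-way machine, because the two counts are made one after the other rather than simultaneously.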