Defining the language
Welcome to Theory of Automata!
1
Text and Reference Material
1. Introduction to Computer Theory, by Daniel
I. Cohen, John Wiley and Sons, Inc., 1991,
Second Edition
2. Introduction to Languages and Theory of
Computation, by J. C. Martin, McGraw Hill
Book Co., 1997, Second Edition
2
What does automata mean?
Automata is the plural of automaton, which means
“something that works automatically”.
Automata theory is the study of abstract computing devices or
machines.
3
History
Turing studied an abstract machine that had
all the capabilities of today's computers.
His goal was to describe precisely the boundary between what a
computing machine could do and what it could not.
In the 1940s and 1950s, simpler kinds of machines called “finite
automata” were studied by researchers.
These were meant to model brain functions.
In the 1950s, linguists began to study formal
grammars.
These serve as the basis of some important software components,
including parts of compilers.
Finite automata and formal grammars are used in
the design and construction of software.
4
Why Study Automata Theory
Finite automata are a useful model for many
kinds of hardware and software:
Software for designing and checking the
behavior of digital circuits.
The lexical analyzer of a typical compiler.
Software for scanning large bodies of text.
Software for verifying systems of all types that
have a finite number of distinct states, such as
communication protocols.
5
Example
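A simple nontrivial finite automaton is an on/off switch.
The device remembers whether it is in the on state
or the off state.
[Figure: finite automaton with states ON and OFF, a Start arrow, and Push arcs between the two states]
States are represented by circles.
Arcs are labeled with inputs.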
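The switch can be sketched as a transition table. This is only an illustration: the names TRANSITIONS and run_switch are hypothetical, and the sketch assumes that the device starts in the OFF state.

```python
# Hypothetical model of the on/off switch. State and input names are
# illustrative; the device is assumed to start in the OFF state.
TRANSITIONS = {
    ("OFF", "Push"): "ON",
    ("ON", "Push"): "OFF",
}

def run_switch(inputs, start="OFF"):
    """Return the state reached after feeding each input to the switch."""
    state = start
    for symbol in inputs:
        state = TRANSITIONS[(state, symbol)]  # follow the labeled arc
    return state

print(run_switch(["Push", "Push", "Push"]))
```

Each key of the table is one arc of the diagram: a state together with the input that labels the arc leaving it.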
6
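Example (Recognizing “then”)
[Figure: a Start arrow into the state for the empty string, followed by states T, TH, THE and THEN, joined by arcs labeled T, H, E, N]
Inputs are letters.
The lexical analyzer examines one character of the program at a time.
The start state corresponds to the empty string.
Each state has a transition on the next letter of “then” to the state for the next longer prefix.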
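A minimal sketch of this recognizer, assuming each state is named by the prefix of “then” read so far. The function name recognizes_then is hypothetical, and the sketch rejects any input that has no arc in the diagram.

```python
KEYWORD = "then"  # the slide draws the states in capitals: T, TH, THE, THEN

def recognizes_then(text):
    """Return True if text drives the automaton from the start state to THEN."""
    state = ""  # start state: the empty string
    for ch in text:
        # The only arc leaving a state is labeled with the next letter of KEYWORD.
        expected = KEYWORD[len(state)] if len(state) < len(KEYWORD) else None
        if ch != expected:
            return False  # no transition is defined for this letter
        state += ch  # move to the state for the longer prefix
    return state == KEYWORD

print(recognizes_then("then"), recognizes_then("the"))
```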
7
Introduction to languages
A set of symbols that expresses ideas and
allows people to think and communicate
with each other.
Language is a system at many levels.
Not just a collection of words, language
consists of rules and patterns that relate
the words to one another
8
Introduction to languages
There are two types of languages
Formal Languages (Syntactic languages)
Informal Languages (Semantic
languages)
9
English Language
There are three different entities
1. Letters
2. Words
3. Sentences
A group of letters makes a word.
A group of words makes a sentence.
Not every collection of letters forms a valid word.
Not every collection of words forms a valid sentence.
If the analogy is continued:
Collections of sentences make paragraphs.
Collections of paragraphs make stories.
10
English Language
Humans agree on which sequences are valid and
which are not.
The same situation exists with computer languages: certain
character strings are recognizable
words (DO, IF, END).
Certain strings of words are recognizable
commands, and certain sets of commands become a program
that can then be compiled into machine code.
11
English Language
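We are concerned with whether an input is a valid communication rather than with
rules for decoding exactly what the
communication means.
The rules of a language must be able to tell which strings are in and which
are out.
For a language such as English, it is very hard to state all the rules.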
12
Theory of Formal Language
“Formal” refers to the fact that all the rules for the language
are explicitly stated in terms of what strings of symbols can
occur.
No liberties are tolerated.
It is a game of symbols with formal rules,
not an expression of ideas in the minds of humans.
13
Formal Languages (Alphabets)
Definition:
A finite non-empty set of symbols (letters), is
called an alphabet. It is denoted by Σ ( Greek
letter sigma).
Example:
Σ={a,b}
Σ={0,1} //important as this is the language
//which the computer understands.
Σ={i,j,k}
14
NOTE:
A certain version of language ALGOL has
113 letters
Σ (alphabet) includes letters, digits and a
variety of operators including sequential
operators such as GOTO and IF
15
Strings
Definition:
A concatenation of finitely many letters
from the alphabet is called a string.
Example:
If Σ= {a,b} then
a, abab, aaabb, ababababababababab are strings over Σ
16
NOTE:
EMPTY STRING or NULL STRING
Sometimes a string with no symbols or letters
at all is used. It is denoted by λ (small Greek letter
lambda) or Λ (capital Greek letter lambda)
and is called the empty string or null string.
Whatever alphabet is being considered, the null string is
always Λ.
The capital lambda will mostly be used to
denote the empty string in further discussion.
17
Words
Definition:
Words are strings belonging to some
language.
Example:
If Σ= {x} then a language L can be
defined as
$L = \{x^n : n = 1, 2, 3, \ldots\}$ or L={x,xx,xxx,….}
Here x,xx,… are the words of L
18
NOTE:
A finite set of fundamental units out of
which we build structures is called an alphabet.
A specified set of strings of characters from the
alphabet is called a language.
The strings that are permissible in the language
are called words.
A string can contain only
finitely many letters or symbols.
All words are strings, but not all strings
are words.
19
Note
Two words are considered the same if all their letters are the same and
in the same order.
There is only one word with no letters.
The symbol Λ is not allowed to be part of the
alphabet of any language.
The language that has no words is denoted by
the symbol Ф.
It is not true that Λ is a word in the language Ф.
20
Note
The language Ф does not contain Λ.
If a language L does not contain Λ and we want to add it, we use the union of sets
operator ‘+’ to form L + { Λ }
This language is not the same as L
But L + Ф =L
If we have a method for producing a language and
in a certain instance the method produces nothing,
we can say either that the method failed or that it produced the language Ф.
21
English
The whole alphabet is represented as
Σ= {a, b, c, d, ……}
Sometimes the elements are separated by
commas or spaces, and sometimes uppercase
letters are used.
The valid strings over this alphabet form the language
English-word={all words in a standard
dictionary}
22
English
This language still has no grammar. To give a formal definition of
English sentences, we use an alphabet denoted by capital gamma:
Γ ={entries in standard dictionary, blank space, usual punctuation
marks}
It produces sentences such as
I am teaching
You are listening
If we follow only the rules of grammar, we also get
I ate three Tuesdays
You ate cloths
These are grammatically correct but meaningless.
In a formal language these sentences are correct.
We are interested in syntax alone, not semantics or diction.
The set of rules defining English is a grammar.
23
Example
My_subject
The alphabet for this language is
{E A P T S W}
There is only one word in this language, which I wish to specify as follows:
If the earth and the moon ever collide, then
My_subject={SE}
If the earth and the moon never collide, then
My_subject={AT}
24
Example
It is impossible to be certain whether the word
AT is in the language My_subject or not, so this is not an adequate definition.
A set of rules must enable us to decide, in a finite
amount of time, whether a given string of alphabet
letters is a word in the language or not.
It is not required that all the letters
of the alphabet appear in the words
selected for the language.
25
Defining Languages
There are two kinds of rules for defining languages:
How to test whether a string is a valid word
OR
How to construct all the words in the language
Example:
If Σ= {a} then a language L1 can be defined as
L1={a, aa, aaa, aaaa, …}
$L_1 = \{a^n : n = 1, 2, 3, \ldots\}$. Here a, aa,… are the words of L1.
In this language, concatenation behaves like addition of exponents:
aa concatenated with aaa gives aaaaa, and in general
$a^n$ concatenated with $a^m$ is the word $a^{n+m}$.
A convenient notation is x=aaa and y=aa, so that
xy=aaaaa
26
Defining Languages
It is not always true that when two words are concatenated
they produce another word in the language.
L2={a, aaa, aaaaa, aaaaaaa, …}
$= \{a^{\text{odd}}\}$
$= \{a^{2n+1} : n = 0, 1, 2, 3, \ldots\}$
If x=aaa and y=aaaaa then
xy=aaaaaaaa is not in L2, although L1 and L2 have the
same alphabet.
In L1 and L2, xy=yx, but in other languages that is not always true:
x=house and y=boat give
xy=houseboat and yx=boathouse, so xy ≠ yx
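The two observations about concatenation can be checked with a short sketch. The helper names in_L1 and in_L2 are hypothetical; they test membership in the language of all non-empty strings of a's and in the language of odd-length strings of a's.

```python
def in_L1(word):
    """L1: one or more a's."""
    return len(word) >= 1 and set(word) == {"a"}

def in_L2(word):
    """L2: an odd number of a's."""
    return in_L1(word) and len(word) % 2 == 1

x, y = "aaa", "aaaaa"
xy = x + y  # Python's + on strings is concatenation

print(in_L2(x), in_L2(y), in_L2(xy))  # the last result is False: 3 + 5 = 8 is even
print("house" + "boat", "boat" + "house")
```

The lengths add under concatenation, which is why two words of odd length always give a word of even length.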
27
Valid/In-valid alphabets
While defining an alphabet, the alphabet may
contain letters consisting of a group of symbols,
for example Σ1= {B, aB, bab, d}.
Now consider an alphabet
Σ2= {B, Ba, bab, d}
and a string BababB.
28
Valid/In-valid alphabets
The string BababB can be grouped in two
different ways
(Ba), (bab), (B)
(B), (abab), (B)
In the second grouping, (abab) is not a letter of Σ2, so that
grouping cannot be identified as a string defined over
Σ2.
29
Valid/In-valid alphabets
When this string is scanned by the
compiler (lexical analyzer), the first symbol B is
identified as a letter belonging to Σ2, but
the lexical analyzer is then
unable to identify the second letter. So while defining an
alphabet it should be kept in mind that
no ambiguity is created.
30
Remarks:
When defining an alphabet whose letters
consist of more than one symbol,
no letter should start with another letter of
the same alphabet, i.e. one letter should not
be a prefix of another. However, a letter
may end with another letter of the same alphabet,
i.e. one letter may be a suffix of another.
31
Conclusion
Σ1= {B, aB, bab, d}
Σ2= {B, Ba, bab, d}
Σ1 is a valid alphabet while Σ2 is an in-valid
alphabet.
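The prefix condition can be tested mechanically. The following is a minimal sketch; the function name is_valid_alphabet is hypothetical and the check is the plain pairwise comparison, not an optimized one.

```python
def is_valid_alphabet(letters):
    """Return True if no letter is a prefix of a different letter."""
    for u in letters:
        for v in letters:
            if u != v and v.startswith(u):
                return False  # u is a prefix of v, so scanning is ambiguous
    return True

sigma1 = ["B", "aB", "bab", "d"]
sigma2 = ["B", "Ba", "bab", "d"]

print(is_valid_alphabet(sigma1), is_valid_alphabet(sigma2))
```

In Σ2 the letter B is a prefix of the letter Ba, which is exactly the ambiguity met when scanning BababB.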
32
Length of Strings
Definition:
The length of string s, denoted by |s|, is the
number of letters in the string.
Example:
Σ={a,b}
s=ababa
|s|=5
33
Length of Strings
Example:
Σ= {B, aB, bab, d}
s=BaBbabBd
Tokenizing=(B), (aB), (bab), (B) , (d)
|s|=5
length(Λ)=0, and if length(w)=0 then w=Λ
34
Reverse of a String
Definition:
The reverse of a string s, denoted by Rev(s)
or $s^r$, is obtained by writing the letters of s
in reverse order.
Example:
If s=abc is a string defined over Σ={a,b,c}
then Rev(s) or $s^r$ = cba
35
Example:
Σ= {B, aB, bab, d}
s=BaBbabBd
Rev(s)=dBbabaBB
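Length and reverse over an alphabet with multi-symbol letters both depend on tokenizing first. The sketch below assumes a valid (prefix-free) alphabet, so a left-to-right scan can match at most one letter at each position; the names tokenize, length and rev are hypothetical helpers.

```python
def tokenize(s, alphabet):
    """Split s into letters of a prefix-free alphabet, scanning left to right."""
    tokens, i = [], 0
    while i < len(s):
        for letter in alphabet:
            if s.startswith(letter, i):
                tokens.append(letter)
                i += len(letter)
                break
        else:
            raise ValueError(f"cannot tokenize {s!r} at position {i}")
    return tokens

def length(s, alphabet):
    """Number of letters (not characters) in s."""
    return len(tokenize(s, alphabet))

def rev(s, alphabet):
    """Write the letters of s in reverse order, keeping each letter intact."""
    return "".join(reversed(tokenize(s, alphabet)))

sigma = ["B", "aB", "bab", "d"]
print(tokenize("BaBbabBd", sigma))
print(length("BaBbabBd", sigma), rev("BaBbabBd", sigma))
```

Note that rev reverses the order of the letters, not of the individual characters, so the letter bab stays bab and aB stays aB.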
36
Defining Languages
Languages can be defined in different
ways, such as a descriptive definition, a
recursive definition, Regular
Expressions (RE) and Finite
Automata (FA).
Descriptive definition of language:
The language is defined by describing the
conditions imposed on its words.
37
Defining Languages
Example:
The language L of strings of odd length,
defined over Σ={a}, can be written as
L={a, aaa, aaaaa,…..}
Example:
The language L of strings that do not start
with a, defined over Σ={a,b,c}, can be written
as
L={b, c, ba, bb, bc, ca, cb, cc, …}
38
Defining Languages
Example:
The language L of strings of length 2,
defined over Σ={0,1,2}, can be written as
L={00, 01, 02,10, 11,12,20,21,22}
Example:
The language L of strings ending in 0,
defined over Σ ={0,1}, can be written as
L={0,00,10,000,010,100,110,…}
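A descriptive definition can be read as a condition that filters all strings over Σ. The sketch below is one way to illustrate this; the names all_strings and language are hypothetical, and Λ is represented by the empty Python string.

```python
from itertools import islice, product

def all_strings(sigma):
    """Yield every string over sigma, shortest first (Λ is the empty str)."""
    n = 0
    while True:
        for letters in product(sigma, repeat=n):
            yield "".join(letters)
        n += 1

def language(sigma, condition):
    """Yield the strings over sigma that satisfy the descriptive condition."""
    return (w for w in all_strings(sigma) if condition(w))

# Strings ending in 0 over {0,1}: take the first seven words.
first_seven = list(islice(language("01", lambda w: w.endswith("0")), 7))
# Strings of length 2 over {0,1,2}: 1 + 3 + 9 strings have length <= 2.
length_2 = [w for w in islice(all_strings("012"), 13) if len(w) == 2]

print(first_seven)
print(length_2)
```

Because the language may be infinite, the sketch uses generators and only ever takes a finite prefix of the listing.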
39
Defining Languages
Example: The language EQUAL, of strings with
number of a’s equal to number of b’s, defined
over Σ={a,b}, can be written as
{Λ ,ab,aabb,abab,baba,abba,…}
Example: The language EVEN-EVEN, of strings
with even number of a’s and even number of
b’s, defined over Σ={a,b}, can be written as
{Λ, aa, bb, aaaa,aabb,abab, abba, baab, baba,
bbaa, bbbb,…}
40
Defining Languages
Example: The language INTEGER, of strings
defined over Σ={-,0,1,2,3,4,5,6,7,8,9}, can
be written as
INTEGER = {…,-2,-1,0,1,2,…}
Example: The language EVEN, of strings
defined over Σ={-,0,1,2,3,4,5,6,7,8,9}, can
be written as
EVEN = { …,-4,-2,0,2,4,…}
41
Defining Languages
Example: The language $\{a^n b^n\}$, of strings
defined over Σ={a,b}, as
$\{a^n b^n : n = 1, 2, 3, \ldots\}$, can be written as
{ab, aabb, aaabbb,aaaabbbb,…}
Example: The language $\{a^n b^n a^n\}$, of strings
defined over Σ={a,b}, as
$\{a^n b^n a^n : n = 1, 2, 3, \ldots\}$, can be written as
{aba, aabbaa, aaabbbaaa,aaaabbbbaaaa,…}
42
Defining Languages
Example: The language factorial, of strings
defined over Σ={0,1,2,3,4,5,6,7,8,9}, i.e.
{1,2,6,24,120,…}
Example: The language FACTORIAL, of
strings defined over Σ={a}, as
$\{a^{n!} : n = 1, 2, 3, \ldots\}$, can be written as
{a,aa,aaaaaa,…}. It is to be noted that the
language FACTORIAL can be defined over
any single letter alphabet.
43
Defining Languages
Example: The language DOUBLEFACTORIAL,
of strings defined over Σ={a, b}, as
$\{a^{n!} b^{n!} : n = 1, 2, 3, \ldots\}$, can be written as
{ab, aabb, aaaaaabbbbbb,…}
Example: The language SQUARE, of strings
defined over Σ={a}, as
$\{a^{n^2} : n = 1, 2, 3, \ldots\}$, can be written as
{a, aaaa, aaaaaaaaa,…}
44
Defining Languages
Example: The language
DOUBLESQUARE, of strings defined
over Σ={a,b}, as
$\{a^{n^2} b^{n^2} : n = 1, 2, 3, \ldots\}$, can be written as
{ab, aaaabbbb, aaaaaaaaabbbbbbbbb,…}
45
Defining Languages
Example: The language PRIME, of
strings defined over Σ={a}, as
$\{a^p : p \text{ is prime}\}$, can be written as
{aa,aaa,aaaaa,aaaaaaa,aaaaaaaaaaa…}
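The first few words of FACTORIAL, SQUARE and PRIME over Σ={a} can be generated directly from their exponents. This is a small sketch; is_prime is a hypothetical helper using trial division.

```python
from math import factorial

def is_prime(p):
    """Trial division, adequate for the small exponents used here."""
    return p >= 2 and all(p % d for d in range(2, int(p ** 0.5) + 1))

FACTORIAL = ["a" * factorial(n) for n in range(1, 4)]   # exponents 1!, 2!, 3!
SQUARE = ["a" * (n * n) for n in range(1, 4)]           # exponents 1, 4, 9
PRIME = ["a" * p for p in range(2, 12) if is_prime(p)]  # exponents 2, 3, 5, 7, 11

print(FACTORIAL)
print(SQUARE)
print(PRIME)
```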
46
An Important language
PALINDROME:
The language consisting of Λ and the
strings s defined over Σ such that
Rev(s)=s.
It is to be noted that the words of
PALINDROME are called palindromes.
Example:For Σ={a,b},
PALINDROME={Λ , a, b, aa, bb, aaa, aba,
bab, bbb, ...}
47
Note
The number of strings of length m defined over
an alphabet of n letters is $n^m$.
Examples:
The language of strings of length 2, defined
over Σ={a,b} is L={aa, ab, ba, bb} i.e.
number of strings = $2^2 = 4$
The language of strings of length 3, defined
over Σ={a,b} is L={aaa, aab, aba, baa, abb,
bab, bba, bbb} i.e. number of strings = $2^3 = 8$
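The count can be confirmed by listing the strings. A minimal sketch, where strings_of_length is a hypothetical helper built on itertools.product:

```python
from itertools import product

def strings_of_length(sigma, m):
    """List every string of exactly m letters over sigma."""
    return ["".join(t) for t in product(sigma, repeat=m)]

for m in (2, 3):
    words = strings_of_length("ab", m)
    print(m, len(words), words)
```

Each of the m positions can be filled independently by any of the n letters, which gives n multiplied by itself m times.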
48
Exercise
Q) Prove that there are as many palindromes
of length 2n, defined over Σ = {a,b,c}, as
there are of length 2n-1. Determine the
number of palindromes of length 2n defined
over the same alphabet as well.
49
KLEENE STAR Closure
Given Σ, then the KLEENE STAR Closure of
the alphabet Σ, denoted by Σ*, is the
collection of all strings defined over Σ,
including Λ.
It is to be noted that KLEENE STAR Closure
can be defined over any set of strings.
50
Examples
If Σ = {x}
Then Σ* = {Λ, x, xx, xxx, xxxx, ….}
If Σ = {0,1}
Then Σ* = {Λ, 0, 1, 00, 01, 10, 11, ….}
If Σ = {aaB, c}
Then Σ* = {Λ, aaB, c, aaBaaB, aaBc, caaB,
cc, ….}
51
Note
The languages generated by the Kleene Star Closure
of a set of strings are infinite languages, provided the set
contains at least one non-null string. (By
infinite language, it is meant that the
language contains infinitely many words, each
of finite length.)
The words are listed in lexicographic order:
shorter words first, then words of the
same length in alphabetical order.
52
Example
Let S={aa, b} then
S* ={Λ Plus any word composed of factors
of aa and b }
S* ={Λ Plus all strings of a’s and b’s in which
a’s occur in even clumps}
={Λ, b, aa, bb, aab, baa, bbb, aaaa, aabb, baab, bbaa, bbbb, …}
NOTE: the string aabaaab is not in S*, since it contains a clump of three a's
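Membership in S* can be decided by checking whether the word splits into factors from S. The sketch below uses a simple table over prefixes; the function name in_star is hypothetical.

```python
def in_star(word, S):
    """Return True if word is a concatenation of zero or more strings from S."""
    # reachable[i] is True when word[:i] can be built from factors in S.
    reachable = [True] + [False] * len(word)
    for i in range(len(word)):
        if reachable[i]:
            for factor in S:
                if factor and word.startswith(factor, i):
                    reachable[i + len(factor)] = True
    return reachable[len(word)]

S = {"aa", "b"}
print(in_star("baab", S), in_star("aabaaab", S))
```

The empty word is always accepted, because Λ is the concatenation of zero factors.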
53
Example
Let S={a, ab} then
S* ={Λ Plus any word composed of factors
of a and ab }
S* ={Λ Plus all strings of a’s and b’s except
those that start with b and those that contain
a double b}
={Λ a aa ab aaa aab …….}
54
Example
Parentheses can be letters of the alphabet
If Σ = {x ( ) }
Then Σ* = {Λ, x, (, ), xx, x(, x), (x, ((, (), …}
Length(xxxxx)=5
Length( (xx)(xxx) )=9
55
Note
If an alphabet has no letters then its closure is a
language with the null string as its only word.
If Σ = Ф
Then Σ* = { Λ }
This is not the same as
if S={ Λ } then
S* ={ Λ }
which is also true, but for a different reason: ΛΛ = Λ
56
Task
Q)
1) Let S={ab, bb} and T={ab, bb, bbbb} Show that S*
= T*
2) Let S={ab, bb} and T={ab, bb, bbb} Show that S*
≠ T* but S* ⊂ T*
3) Let S={a, bb, bab, abaab} be a set of strings. Are
abbabaabab and baabbbabbaabb in S*? Does any
word in S* have odd number of b’s?
57
PLUS Operation (+)
Plus Operation is same as Kleene Star Closure
except that it does not generate Λ (null string),
automatically.
Example:
If Σ = {0,1}
Then Σ+ = {0, 1, 00, 01, 10, 11, ….}
If Σ = {aab, c}
Then Σ+ = {aab, c, aabaab, aabc, caab, cc, ….}
58
Remark
It is to be noted that Kleene Star can also be
operated on any string i.e. a* can be considered
to be all possible strings defined over {a}, which
shows that a* generates
Λ, a, aa, aaa, …
It may also be noted that a+ can be considered
to be all possible non empty strings defined over
{a}, which shows that a+ generates
a, aa, aaa, aaaa, …
59
Theorem1
i.
For any set S of strings we have
S*=S**
Proof:
Every word in S** is made up of factors from S*.
Every factor from S* is made up of factors from S, so every
word in S** is made up of factors from S.
Therefore every word in S** is also a word in S*, that is,
S** is contained in S*:
S** ⊂ S* --------------------------1
We also know that for any set A, A ⊂ A*.
Taking A=S* gives S* ⊂ S** --------------------------2
By 1 and 2
S*=S**
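The theorem can be illustrated, though of course not proved, by comparing both closures up to a bounded word length. The helper star_up_to is hypothetical; the set S={aa, b} and the bound 6 are arbitrary choices for the illustration.

```python
def star_up_to(S, max_len):
    """Return all words of S* whose length is at most max_len."""
    words = {""}      # Λ is always in the closure
    frontier = {""}
    while frontier:
        # Extend the newest words by one more factor, keeping only new words.
        frontier = {w + s for w in frontier for s in S
                    if s and len(w) + len(s) <= max_len} - words
        words |= frontier
    return words

S = {"aa", "b"}
S_star = star_up_to(S, 6)
S_star_star = star_up_to(S_star, 6)  # closure of the closure

print(S_star == S_star_star)
```

Truncating S* at length 6 loses nothing here, because a word of length at most 6 can only be built from factors of length at most 6.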
60
TASK
Q1)Is there any case when S+ contains Λ? If
yes then justify your answer.
Q2) Prove that for any set of strings S
i. (S+)*=(S*)*
ii. (S+)+=S+
iii. Is (S*)+=(S+)*?
61
Defining Languages Continued…
Recursive definition of languages
The following three steps are used in recursive
definition
1. Some basic objects (words) are specified in the
language.
2. Rules for constructing more objects (words)
are defined in the language.
3. No objects (strings) except those constructed
in above, are allowed to be in the language.
62
Example
Defining language of POSITIVE
INTEGER
Rule 1:
1 is in INTEGER.
Rule 2:
If x is in INTEGER then x+1 is
also in INTEGER.
Rule 3:
No strings except those constructed in
above, are allowed to be in INTEGER.
63
Example
Defining language of EVEN
Even is the set of the all positive whole
numbers divisible by 2
Even is the set of all 2n where
n=1,2,3,4,5,…..
64
Example
Defining language of EVEN
Rule 1:
2 is in EVEN.
Rule 2:
If x is in EVEN then x+2 is also in EVEN.
Rule 3:
No strings except those constructed in above, are
allowed to be in EVEN.
Assignment: state and prove two more recursive definitions
of EVEN
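The way the rules build EVEN can be sketched as follows, applying Rule 2 in the form "if x is in EVEN then x+2 is in EVEN" so that every word stays positive. The function name even_up_to and the stopping bound are assumptions made for the illustration, since the language itself is infinite.

```python
def even_up_to(limit):
    """Build EVEN from the recursive rules, stopping at limit."""
    even = {2}                 # Rule 1: 2 is in EVEN
    frontier = [2]
    while frontier:
        x = frontier.pop()
        if x + 2 <= limit and x + 2 not in even:
            even.add(x + 2)    # Rule 2: x + 2 is in EVEN
            frontier.append(x + 2)
    return even                # Rule 3: nothing else is in EVEN

print(sorted(even_up_to(10)))
```

The result agrees with the descriptive definition, the set of all 2n for n=1,2,3,…, restricted to the same bound.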
65
Example
Defining language of POSITIVE and
NEGATIVE INTEGER
Rule 1:
1 is in INTEGER.
Rule 2:
If both x and y are in INTEGER then x+y and
x-y are also in INTEGER.
Rule 3:
No strings except those constructed in
above, are allowed to be in INTEGER.
66
Example
Defining the language factorial
Rule 1:
As 0!=1, so 1 is in factorial.
Rule 2:
If (n-1)! is in factorial then n!=n*(n-1)! is also in factorial.
Rule 3:
No strings except those constructed in above,
are allowed to be in factorial.
67
Example
Defining the language PALINDROME, defined
over Σ = {a,b}
Rule 1:
a and b are in PALINDROME
Rule 2:
if x is in PALINDROME, then s(x)Rev(s) and xx are also
in PALINDROME, where s belongs to Σ*
Rule 3:
No strings except those constructed in above,
are allowed to be in palindrome
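The recursive rules can be compared with the descriptive test Rev(s)=s on all short strings. This is a bounded check under stated assumptions, not a proof: the names are hypothetical, words are limited to length 5, and Λ is left out of the comparison because Rules 1 and 2 as stated never produce it.

```python
from itertools import product

SIGMA = "ab"

def strings_up_to(max_len):
    """All strings over SIGMA of length 0..max_len."""
    return ["".join(t) for n in range(max_len + 1)
            for t in product(SIGMA, repeat=n)]

def palindromes_by_rules(max_len):
    """Apply Rules 1 and 2 repeatedly, keeping words of length <= max_len."""
    found = set(SIGMA)                          # Rule 1: a and b
    changed = True
    while changed:
        changed = False
        for x in list(found):
            candidates = [x + x]                # Rule 2: xx
            candidates += [s + x + s[::-1]      # Rule 2: s(x)Rev(s)
                           for s in strings_up_to((max_len - len(x)) // 2)]
            for w in candidates:
                if len(w) <= max_len and w not in found:
                    found.add(w)
                    changed = True
    return found

by_rules = palindromes_by_rules(5)
by_test = {w for w in strings_up_to(5) if w and w == w[::-1]}
print(by_rules == by_test)
```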
68
Example
Defining the language $\{a^n b^n\}$, n=1,2,3,… ,
of strings defined over Σ={a,b}
Rule 1:
ab is in $\{a^n b^n\}$
Rule 2:
if x is in $\{a^n b^n\}$, then axb is in $\{a^n b^n\}$
Rule 3:
No strings except those constructed
above are allowed to be in $\{a^n b^n\}$
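The three rules translate almost literally into code. In the sketch below, anbn_words applies Rule 2 forwards to generate words, and in_anbn reads it backwards to test membership; both names are hypothetical.

```python
def anbn_words(count):
    """Generate the first count words: start from ab, then wrap in a...b."""
    words, x = [], "ab"                  # Rule 1: ab is in the language
    for _ in range(count):
        words.append(x)
        x = "a" + x + "b"                # Rule 2: if x is in, so is axb
    return words

def in_anbn(word):
    """Recursive membership test that mirrors Rules 1 to 3."""
    if word == "ab":                     # Rule 1
        return True
    if len(word) > 2 and word[0] == "a" and word[-1] == "b":
        return in_anbn(word[1:-1])       # Rule 2 read backwards
    return False                         # Rule 3: nothing else is allowed

print(anbn_words(4))
print(in_anbn("aaabbb"), in_anbn("abab"))
```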
69
Example
Defining the language L, of strings ending in a ,
defined over Σ={a,b}
Rule 1:
a is in L
Rule 2:
if x is in L then s(x) is also in L, where s belongs to Σ*
Rule 3:
No strings except those constructed in above, are
allowed to be in L
70
Example
Defining the language L, of strings beginning and
ending in same letters , defined over Σ={a, b}
Rule 1:
a and b are in L
Rule 2:
(a)s(a) and (b)s(b) are also in L, where s belongs to Σ*
Rule 3:
No strings except those constructed in above, are
allowed to be in L
71
Example
Defining the language L, of strings containing aa
or bb , defined over
Σ={a, b}
Rule 1:
aa and bb are in L
Rule 2:
s(aa)t and s(bb)t are also in L, where s and t belong to Σ*
Rule 3:
No strings except those constructed in above, are
allowed to be in L
72
Example
Defining the language L, of strings containing
exactly aa, defined over
Σ={a, b}
Rule 1:
aa is in L
Rule 2:
s(aa)t is also in L, where s and t belong to b*
Rule 3:
No strings except those constructed in above, are
allowed to be in L
73
Example
An Important Language ARITHMETIC EXPRESSION (A.E)
Rule 1:
Any number (positive, negative or zero) is in A.E
Rule 2: if x is in A.E, then so are
(x)
-x
(provided x does not already start with a minus sign)
Rule 3: if x and y are in A.E, then so are
x+y
x-y
x*y
x/y
x**y
No strings except those constructed above are allowed to be in A.E
Example:
(2+4)*(7*(9-3)/4)*(2+8)-1
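The recursive definition can be turned into a membership test by trying each rule on the string and on its parts. The sketch below makes several assumptions: numbers are written as optional minus sign, digits and an optional decimal part; expressions contain no spaces; and the names NUMBER, OPERATORS and is_ae are hypothetical. Caching keeps the repeated substring checks cheap for short inputs.

```python
import re
from functools import lru_cache

NUMBER = re.compile(r"-?\d+(\.\d+)?")    # assumed notation for numbers
OPERATORS = ("**", "+", "-", "*", "/")

@lru_cache(maxsize=None)
def is_ae(s):
    """Return True if s can be built by the recursive rules for A.E."""
    if NUMBER.fullmatch(s):                              # Rule 1: a number
        return True
    if len(s) >= 2 and s[0] == "(" and s[-1] == ")" and is_ae(s[1:-1]):
        return True                                      # Rule 2: (x)
    if s.startswith("-") and not s.startswith("--") and is_ae(s[1:]):
        return True                                      # Rule 2: -x
    for i in range(1, len(s)):                           # Rule 3: x op y
        for op in OPERATORS:
            if (s.startswith(op, i) and is_ae(s[:i])
                    and is_ae(s[i + len(op):])):
                return True
    return False                                         # nothing else is in A.E

print(is_ae("(2+4)*(9-3)/4-1"))
print(is_ae("2$"), is_ae("/2"), is_ae("2//3"))
```

The rejected inputs in the last line are small instances of the kind of string that the theorems on A.E rule out.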
74
Theorem-2
An arithmetic expression cannot contain the
character $
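Proof
Rule 1 cannot introduce $, since $ is not part of any number.
Rule 2 cannot introduce $: if x does not contain $, then neither do (x) and -x.
Rule 3 cannot introduce $: if x and y do not contain $, then neither do x+y, x-y, x*y, x/y and x**y.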
75
Theorem-3
No A.E can begin or end with symbol /
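Proof
Rule 1 cannot produce it, since no number begins or ends with /.
Rule 2 cannot produce it: (x) begins and ends with a parenthesis, and -x begins with a minus sign and ends as x ends.
Rule 3 cannot produce it: if x does not begin with / and y does not end with /, then x+y, x-y, x*y, x/y and x**y do not begin or end with /.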
76
Theorem-4
No A.E contains the substring //
77
Summing Up
Recursive definition of languages, INTEGER,
EVEN, factorial, PALINDROME, $\{a^n b^n\}$,
languages of strings (i) ending in a, (ii)
beginning and ending in same letters, (iii)
containing aa or bb (iv)containing exactly aa,
78