Cse321, Programming Languages and Compilers
Lecture #3b, Jan. 22, 2007
•References
•While Loops
•Accumulating parameter functions
•Regular Expressions as programs
•RE to NFA
•Patterns for RE’s
11/7/2015
Libraries
• There are lots of predefined types and
functions in SML when it starts up.
• You can find out about them at:
http://www.smlnj.org//basis/pages/top-level-chapter.html
• Many more can be found in the other
libraries.
– http://www.standardml.org/Basis/manpages.html
• Libraries are encapsulated in Structures
which are classified by Signatures (a list of
what is in the structure).
Peeking inside a Library
• To see what is inside a Structure you can open it.
• This is somewhat of a hack, but it is useful.
Standard ML of New Jersey v110.57 [built: Mon Nov 21 21:46:28 2005]
- open Int;
opening Int
type int = ?.int
val precision : Int31.int option
val minInt : int option
val maxInt : int option
val toLarge : int -> IntInf.int
val fromLarge : IntInf.int -> int
val toInt : int -> Int31.int
val fromInt : Int31.int -> int
val div : int * int -> int
val mod : int * int -> int
val quot : int * int -> int
val rem : int * int -> int
val min : int * int -> int
val max : int * int -> int
The List library
- open List;
opening List
datatype 'a list = :: of 'a * 'a list | nil
exception Empty
val null : 'a list -> bool
val hd : 'a list -> 'a
val tl : 'a list -> 'a list
val last : 'a list -> 'a
val getItem : 'a list -> ('a * 'a list) option
val nth : 'a list * int -> 'a
val take : 'a list * int -> 'a list
val drop : 'a list * int -> 'a list
val length : 'a list -> int
val rev : 'a list -> 'a list
val @ : 'a list * 'a list -> 'a list
val concat : 'a list list -> 'a list
val revAppend : 'a list * 'a list -> 'a list
val app : ('a -> unit) -> 'a list -> unit
val map : ('a -> 'b) -> 'a list -> 'b list
val mapPartial : ('a -> 'b option) -> 'a list -> 'b list
val find : ('a -> bool) -> 'a list -> 'a option
val filter : ('a -> bool) -> 'a list -> 'a list
val partition : ('a -> bool) -> 'a list -> 'a list * 'a list
val foldr : ('a * 'b -> 'b) -> 'b -> 'a list -> 'b
val foldl : ('a * 'b -> 'b) -> 'b -> 'a list -> 'b
val exists : ('a -> bool) -> 'a list -> bool
val all : ('a -> bool) -> 'a list -> bool
val tabulate : int * (int -> 'a) -> 'a list
val collate : ('a * 'a -> order) -> 'a list * 'a list -> order
-
References
• References allow one to write programs with mutable variables.
• The interface to assignment and update is slightly different
from other languages
– val r = (ref 5)      Create a new reference that can be updated
– !r                   Get the value stored in the reference
– (ref n) => …         Pattern match to get value
   » fun ! (ref n) = n
– r := 6 + z           Update a reference with a new value
• Later today we will use the following:
val next = ref 0;
fun new () = let val ref n = next
in (next := n+1; n) end;
Alternatively
fun new () = let val n = !next
in (next := n+1; n) end;
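A minimal sketch of how this counter behaves when it is called repeatedly. The definitions of next and new are repeated from above so the fragment runs on its own; the names a, b, c, and now are only illustrative.

```sml
(* Same counter as above, repeated so the fragment is self-contained. *)
val next = ref 0;
fun new () = let val n = !next
             in (next := n+1; n) end;

(* Each call returns the current count and then bumps the reference. *)
val a = new ();    (* 0 *)
val b = new ();    (* 1 *)
val c = new ();    (* 2 *)
val now = !next;   (* 3: the value left in the reference after three calls *)
```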
While Loops
• While loops are similar to those in other languages; they
usually require references (so that the condition
eventually changes).
• Statements are inside ()’s and separated by “;”
val n = ref 4;
val w1 =
  while (!n > 0)
  do (print (Int.toString (!n) ^ "\n");   (* semicolon to separate statements *)
      n := (!n) - 1 );                    (* semicolon to end the val w1 = ... declaration *)
Factorial as a while loop
• Factorial
fun fact3 n =
  let val ans = ref 1
      val count = ref n
  in while (!count > 0)
     do (ans := !ans * !count
        ; count := !count - 1);
     !ans                  (* return the value stored in the reference ans *)
  end;
• Inside a let between “in” and “end” we don’t need to surround
statements with ()s, but we still separate with “;”
• Compare with the recursive versions
fun fact1 n = if n=0 then 1 else n * (fact1 (n-1));
fun fact2 0 = 1
  | fact2 n = n * (fact2 (n-1));
Accumulating parameters
• Many loops look like this
{ ans = init;
While test do stuff;
Return ans}
• There is a pattern in ML that mimics this: a function
with an accumulating parameter.
• The pattern consists of a recursive function with two
(or more) parameters. The first parameter drives the
loop (usually by pattern matching), the second
accumulates an answer (like ans in example above).
• We call the function with init as the value of the
second argument to get started.
• We return the second argument when the function is
done recurring.
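As a small illustration of the pattern just described, here is a sketch that sums a list. The name sumList is hypothetical and not from the lecture: the first parameter of loop drives the recursion by pattern matching, and the second accumulates the answer.

```sml
(* sumList is a hypothetical example name, not from the lecture. *)
fun sumList xs =
  let fun loop [] ans = ans                      (* done recurring: return the accumulator *)
        | loop (y::ys) ans = loop ys (y + ans)   (* consume one element, update the accumulator *)
  in loop xs 0 end;                              (* 0 plays the role of init *)

val six = sumList [1,2,3];   (* 6 *)
```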
Fact as an accumulating function
fun fact4 n =
  let fun loop 0 ans = ans
        | loop n ans = loop (n-1) (n*ans)
  in loop n 1 end;
{ ans = init;
While test do stuff;
Return ans}
Flat as an accumulating function
datatype Tree =
Tip | Node of Tree * int * Tree;
fun flat3 x =
  let fun help Tip ans = ans
        | help (Node(x,y,z)) ans =
            help x (y::(help z ans))
  in help x [] end;
{ ans = init;
While test do stuff;
Return ans}
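A short usage sketch for flat3, with the datatype and function repeated so the fragment stands alone. The tree t is a made-up example value. The right subtree is flattened into the accumulator first, so the elements come out in left-to-right order.

```sml
datatype Tree =
  Tip | Node of Tree * int * Tree;

fun flat3 x =
  let fun help Tip ans = ans
        | help (Node(x,y,z)) ans =
            help x (y::(help z ans))
  in help x [] end;

(* t is a hypothetical sample tree: 2 at the root, 1 on the left, 3 on the right. *)
val t = Node(Node(Tip,1,Tip), 2, Node(Tip,3,Tip));
val xs = flat3 t;   (* [1,2,3] *)
```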
Regular Expressions
• Regular languages and regular expressions are
used to describe the patterns that define
lexemes.
• Regular expressions are composed of empty-string,
concatenation, union, and closure.
• Examples:
   A(A | D)*             where A is alphabetic and D is a digit
                         (the * is closure)
   (+ | - | ε ) D D*     (the | is union, ε is the empty string)
• Concatenation is implicit
Meaning of Regular Expressions
Let A,B be sets of strings:
The empty string: ""
   ε = { "" }    (sometimes <empty> )
Concatenation by juxtaposition:
   AB = { a^b | a in A and b in B }
   If A = {"x", "qw"} and B = {"v", "A"}
   then AB = { "xv", "xA", "qwv", "qwA"}
Meaning of Regular Expressions (cont.)
Union by |    (or other symbols like U etc)
   A = {"x", "qw"} and B = {"v", "A"}
   then A|B = {"x", "qw", "v", "A"}
Closure by *
   Thus A* = {""} | A | AA | AAA | ...
           = A^0 | A^1 | A^2 | A^3 | ...
   A = {"x", "qw"}
   then A* = { "" } | {"x", "qw"}
           | {"xqw", "qwx", "xx", "qwqw"} | ...
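The set operations above can be tried out directly. This is a rough sketch that models a set of strings as a list; the helper names concatSets, unionSets, and power are assumptions made for this example, not part of the lecture. Closure itself is an infinite union, so only the finite pieces A^n are computed.

```sml
(* Sets of strings modeled as lists. concatSets, unionSets, and power are
   hypothetical helper names chosen for this sketch. *)

(* AB = { a^b | a in A and b in B } *)
fun concatSets (xs, ys) =
  List.concat (map (fn a => map (fn b => a ^ b) ys) xs);

(* A|B: everything in A, plus the elements of B not already in A *)
fun unionSets (xs, ys) =
  xs @ List.filter (fn y => not (List.exists (fn x => x = y) xs)) ys;

(* power (a, n) is A^n: n-fold concatenation, with A^0 = {""} *)
fun power (a, 0) = [""]
  | power (a, n) = concatSets (a, power (a, n-1));

val setA = ["x", "qw"];
val setB = ["v", "A"];
val ab   = concatSets (setA, setB);   (* ["xv","xA","qwv","qwA"] *)
val aOrB = unionSets (setA, setB);    (* ["x","qw","v","A"] *)
val a2   = power (setA, 2);           (* ["xx","xqw","qwx","qwqw"] *)
```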
Regular Expressions as a language
• We can treat regular expressions as a programming
language.
• Each expression is a new program.
• Programs can be compiled.
• How do we represent the regular expression
language? By using a datatype.
datatype RE
= Empty
| Union of RE * RE
| Concat of RE * RE
| Star of RE
| C of char;
Example RE program
(+ | - | ε ) D D*
val re1 =
  Concat(Union(C #"+",Union(C #"-",Empty))
        ,Concat(C #"D",Star (C #"D")))
R.E.’s and FSA’s
• Algorithm that constructs a FSA from a regular
expression.
• FSA
  – alphabet, A
  – set of states, S
  – a transition function, A x S -> S
  – a start state, S0
  – a set of accepting states, SF subset of S
• Defined by cases over the structure of regular
expressions
• Let A,B be R.E.’s, “x” in A, then
  – ε is a R.E.
  – “x” is a R.E.
  – AB is a R.E.
  – A|B is a R.E.
  – A* is a R.E.
• 1 Rule for each case
Rules
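[Figure: one NFA fragment per case; only the edge labels survive from the diagrams]
• ε : a single edge labeled ε
• “x” : a single edge labeled x
• AB : the fragment for A followed by the fragment for B
• A|B : the fragments for A and B side by side, joined by four ε edges
• A* : the fragment for A with four ε edges around it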
Example: (a|b)*abb
[Figure: NFA for (a|b)*abb with states 0 through 10 and edges labeled ε, a, and b]
•Note the many ε transitions
•Loops caused by the *
•Non-Determinism, many paths out of a state on “a”
Building an NFA from a RE
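datatype Label
  = Epsilon
  | Char of char;
type Start = int;
type Finish = int;
datatype Edge
  = Edge of Start * Label * Finish;
(* The nfa function annotates its result with Nfa. This definition is
   inferred from that use: start state, final state, list of edges. *)
type Nfa = Start * Finish * Edge list;
(* ref makes a mutable variable *)
val next = ref 0;
(* semicolon separates commands (inside parentheses) *)
fun new () = let val ref n = next
             in (next := n+1; n) end;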
fun nfa Empty =                        (* ε *)
      let val s = new()
          val f = new()
      in (s,f,[Edge(s,Epsilon,f)]):Nfa end
  | nfa (C x) =                        (* “x” *)
      let val s = new()
          val f = new()
      in (s,f,[Edge(s,Char x,f)]) end
  | nfa (Union(x,y)) =                 (* A|B: four new ε edges around A and B *)
      let val (sx,fx,xes) = nfa x
          val (sy,fy,yes) = nfa y
          val s = new()
          val f = new()
          val newes =
            [Edge(s,Epsilon,sx)
            ,Edge(s,Epsilon,sy)
            ,Edge(fx,Epsilon,f)
            ,Edge(fy,Epsilon,f)]
      in (s,f,newes @ xes @ yes) end
  | nfa (Concat(x,y)) =                (* AB: A followed by B *)
      let val (sx,fx,xes) = nfa x
          val (sy,fy,yes) = nfa y
      in (sx,fy,(Edge(fx,Epsilon,sy))::
                (xes @ yes))
      end
  | nfa (Star r) =                     (* A*: four new ε edges around A *)
      let val (sr,fr,res) = nfa r
          val s = new()
          val f = new()
          val newes = [Edge(s,Epsilon,sr)
                      ,Edge(fr,Epsilon,f)
                      ,Edge(s,Epsilon,f)
                      ,Edge(f,Epsilon,s)]
      in (s,f,newes @ res) end;
Example use
val re1 =
  Concat(Union(C #"+",Union(C #"-",Empty))
        ,Concat(C #"D",Star (C #"D")))
val ex6 = nfa re1;
val ex6 =
  (8,15,
   [Edge (9,Epsilon,10),Edge (8,Epsilon,0)
   ,Edge (8,Epsilon,6),Edge (1,Epsilon,9)
   ,Edge (7,Epsilon,9),Edge (0,Char #"+",1)
   ,Edge (6,Epsilon,2),Edge (6,Epsilon,4)
   ,Edge (3,Epsilon,7),Edge (5,Epsilon,7)
   ,Edge (2,Char #"-",3),Edge (4,Epsilon,5),...]) : Nfa
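One way to check such an edge list is to run it on an input string. The following is a sketch only and not part of the lecture: member, closure, step, and accepts are hypothetical helper names, and small is a hand-built NFA for ab* in the same (start, finish, edges) shape, not the output of nfa. Start and Finish are written as plain int so the fragment stands alone.

```sml
datatype Label = Epsilon | Char of char;
datatype Edge = Edge of int * Label * int;

(* All helper names below are hypothetical, chosen for this sketch. *)
fun member (x:int, xs) = List.exists (fn y => y = x) xs;

(* all states reachable from the given states using only Epsilon edges *)
fun closure edges states =
  let fun visit ([], seen) = seen
        | visit (s::rest, seen) =
            if member (s, seen) then visit (rest, seen)
            else
              let val nexts =
                    List.mapPartial
                      (fn (Edge (a, Epsilon, b)) => if a = s then SOME b else NONE
                        | _ => NONE)
                      edges
              in visit (nexts @ rest, s :: seen) end
  in visit (states, []) end;

(* states reachable from the given states by one edge labeled c *)
fun step edges c states =
  List.mapPartial
    (fn (Edge (a, Char d, b)) =>
          if d = c andalso member (a, states) then SOME b else NONE
      | _ => NONE)
    edges;

(* run the NFA (start, finish, edges) over the characters of str *)
fun accepts (start, finish, edges) str =
  let fun run (states, []) = member (finish, states)
        | run (states, c::cs) = run (closure edges (step edges c states), cs)
  in run (closure edges [start], String.explode str) end;

(* hand-built NFA for ab*: start state 0, final state 2 *)
val small = (0, 2, [Edge (0, Char #"a", 1), Edge (1, Epsilon, 2),
                    Edge (2, Char #"b", 3), Edge (3, Epsilon, 2)]);

val yes1 = accepts small "a";     (* true *)
val yes2 = accepts small "abb";   (* true *)
val no1  = accepts small "b";     (* false *)
```

The sketch tracks a set of current states rather than a single state, which is how it copes with the many ε transitions and the non-determinism noted earlier.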
Assignment #3
CS321 Prog Lang & Compilers
Assignment # 3
Assigned: Jan 22, 2007
Due: Wed. Jan 24, 2007
Turn in a listing, and a transcript that shows you have tested your code. A minimum of 3 tests is necessary.
Some functions may require more than 3 tests to receive full credit.
1) Write the following functions over lists. You must use pattern matching and recursion.
A. Reverse a list so that its elements appear in the opposite order.
   reverse [1,2,3,4] ----> [4,3,2,1]
B. Count the number of occurrences of an element in a list.
   count 4 [1,2,3,4,5,4] ---> 2
   count 4 [1,2,3,2,1] ---> 0
C. Concatenate together a list of lists.
   concat [[1,2],[],[5,6]] ----> [1,2,5,6]
2) Using the datatype for Regular Expressions we defined in class
datatype RE
= Empty
| Union of RE * RE
| Concat of RE * RE
| Star of RE
| C of char;
Write a function that turns a RE into a string, so that it can be
printed. Minimize the number of parentheses, but keep the string
unambiguous by using the following rules.
1) Star has highest precedence so: ab* means a(b*)
2) Concat has the next highest precedence so: a+bc means a+(bc)
3) Union has lowest precedence so: a+bc+c* means a+(bc)+(c*)
4) Use the hash mark (#) as the empty string.
5) Special characters *+()\ should be escaped by using a
preceding backslash.
So (Concat (C #"+", C #"a")) should be
"\+a"
Hints:
1) The string concatenation operator is useful:
"abc" ^ "zx" -----> "abczx"
2) Write this in two steps.
First, fully parenthesize every RE.
Second, change the function to not add the parentheses which
the rules don't require.