Cse321, Programming Languages and Compilers
Lecture #12, Feb. 21, 2007
• Basic Types
• Constructed Types
• Representing types as data
• Describing type systems as rules
• Type rules for ML
• Type equality
• Type coercions
• Sub typing
• Purpose of type systems
• Kinds of type systems
• Primitive types
• Constructed types
• Type checking
• Attribute grammars
• Inherited attributes
• Synthesized attributes
• Adding attributes to trees
• Programs for computing attribute computations.

Assignments
• Reading Chapter 4,
  – Sections 4.3 (Attribute Grammar)
  – 4.4 (Adhoc syntax directed translation).
  – Pages 171-200
• Quiz on all of chapter 4 read so far (4.1-4.4) on Monday.

Type Checking
• Type checking assigns a consistent type to every expression and statement.
• Generally there are two kinds of types:
  – Basic types: int, real, bool, string, etc.
  – Constructed types: array, list, products (pairs, triples, etc.), pointers, records, functions. These contain instances of other types.
• In ML, types are quite simple. Constructed types include things like functions (int -> string), tuples (int * bool * string), lists (int list), etc.
• In mini Java, types are more complex because of classes and inheritance.

Representing Types
datatype MLtype
  = Unit | Int | Char | Bool . . .
  | Product of MLtype list
  | Arrow of (MLtype * MLtype);
• Product and arrow types are the only constructed types in this example.
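
To make the representation concrete, here is a minimal sketch that writes the ML type int * bool -> char as a value of MLtype and prints it back. The constructors elided by ". . ." on the slide are simply left out, and showType is a hypothetical helper that is not part of the lecture.

```sml
(* Assumption: only the constructors named on the slide are included. *)
datatype MLtype
  = Unit | Int | Char | Bool
  | Product of MLtype list
  | Arrow of (MLtype * MLtype);

(* The ML type  int * bool -> char  written as data. *)
val example = Arrow (Product [Int, Bool], Char);

(* showType is a hypothetical pretty-printer; it parenthesizes every
   constructed type so that no precedence rules are needed. *)
fun showType Unit = "unit"
  | showType Int = "int"
  | showType Char = "char"
  | showType Bool = "bool"
  | showType (Product ts) =
      "(" ^ String.concatWith " * " (map showType ts) ^ ")"
  | showType (Arrow (d, r)) =
      "(" ^ showType d ^ " -> " ^ showType r ^ ")";

val shown = showType example;   (* "((int * bool) -> char)" *)
```

Because a constructed type contains other MLtype values, any function over types (printing, equality, checking) ends up being recursive in the same way.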

Describing Type Systems
• A type system gives rules for assigning types to expressions and statements based upon the types of their sub-expressions and sub-statements.
• A standard way to talk about type systems is to use an inference notation.
  – Let S (or some other symbol) stand for a mapping from names to types.
  – Then rules are of the form:

        S |- x    S |- w
        ------------------    (usually x & w are "sub-pieces" of y)
             S |- y

• Which is read as: "To show S derives y, show S derives x, and S derives w".

Simple Rules
• We always have simple rules, such as:

        (S x) = t
        ------------
        S |- x : t

  Think of S as a table. In this table we look up the types of simple objects like variables. So (S x) = t means that when we look up the type of x we find the type t.

        S |- 5 : int

  For primitives, like constants, we just know their types.
• To show S derives x (x a variable) has type t, show that the mapping S applied to x is t.
• And the integer 5 has type int (regardless of what's derivable from S; that's why there is nothing above the line).

Complex Rules
• To implement rules like this we would use an attribute computation.
• Note that the mapping, S, might change as we move around the program.
  – Declarations add to S, adding types to new variables.
  – Exiting a local scope means removing things from S.
• S is usually implemented as an inherited attribute. The types that we derive annotate the tree and are synthesized attributes.

Rules for ML expressions

        (S x) = t
        ------------      (where x is a variable)
        S |- x : t

        S |- n : int      (where n is an integer constant like 5 or 23)

        S |- c : char     (where c is a character constant like #"a")

        S |- b : bool     (where b is a boolean like true or false)

ML expression types (cont)

        S |- x : a    S |- f : a -> t
        -------------------------------
        S |- f x : t

Note how the domain of the function must have the same type as the actual argument.

        S |- x : t1    S |- y : t2    (S <+>) = t1 * t2 -> t3
        --------------------------------------------------------
        S |- x <+> y : t3

where <+> is a binary operator like + or *

ML statement types
• Statements in ML are semi-colon separated expressions inside of parentheses.
  – E.g. (print x; x + 1)
• The expressions before the last are executed only for their side effects. They can have any type. The type of the last expression is the type of the statement.

        S |- e_i : a_i    S |- e_n : t
        ---------------------------------
        S |- (e_1; … ; e_n) : t

Assignment and If expressions

        S |- x : t ref    S |- y : t
        ------------------------------
        S |- x := y : unit

        S |- x : bool    S |- s1 : t    S |- s2 : t
        ----------------------------------------------
        S |- if x then s1 else s2 : t

ML while stmt

        S |- e : bool    S |- s : a
        -----------------------------
        S |- while e do s : unit

ML anonymous function types

        S+(x,a) |- e : b
        -----------------------------
        S |- (fn x => e) : a -> b

S+(x,a) means add the mapping of variable x to the type a to the table S. If S already has a mapping for x, then overwrite it with a.

Implementing the Rules
• To implement the rules, use an inductive function which takes an expression and returns an MLtype.
• Any error that occurs indicates the expression can't be well typed.
• The mapping S is an inherited attribute.
  – That means it changes as we move around the program.
• If a type appears more than once in any rule, it must be the same for all occurrences.
• This requires that we check that two types are equal.
• In a language without polymorphic types, this is simple. The function below illustrates structural equality.

Type Equality
fun typeeq (x,y) =
  case (x,y) of
    (Void,Void) => true
  | (Int,Int) => true
  | (Char,Char) => true
  | (Bool,Bool) => true
  | (Arrow(d1,r1),Arrow(d2,r2)) => typeeq(d1,d2) andalso typeeq(r1,r2)
  | (Product(ss),Product(ts)) => (listeq ss ts)
  | (_,_) => false
and listeq (x::xs) (y::ys) = typeeq(x,y) andalso listeq xs ys
  | listeq [] [] = true
  | listeq _ _ = false
Note we need mutually recursive functions.
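
The inference rules and the equality test can be combined into one inductive function. What follows is a sketch under stated assumptions, not the course's own checker: the exp datatype, the TypeError exception, lookup, and typeOf are all hypothetical names, Unit is used where the slide's typeeq says Void, and Fn carries the type of its parameter because the sketch checks types rather than inferring them.

```sml
datatype MLtype
  = Unit | Int | Char | Bool
  | Product of MLtype list
  | Arrow of (MLtype * MLtype);

(* Structural equality, following the slide. *)
fun typeeq (x, y) =
  case (x, y) of
      (Unit, Unit) => true
    | (Int, Int) => true
    | (Char, Char) => true
    | (Bool, Bool) => true
    | (Arrow (d1, r1), Arrow (d2, r2)) =>
        typeeq (d1, d2) andalso typeeq (r1, r2)
    | (Product ss, Product ts) => listeq ss ts
    | _ => false
and listeq (x :: xs) (y :: ys) = typeeq (x, y) andalso listeq xs ys
  | listeq [] [] = true
  | listeq _ _ = false;

(* Hypothetical abstract syntax for a few of the expression forms. *)
datatype exp
  = Var of string
  | IntC of int
  | BoolC of bool
  | App of exp * exp
  | If of exp * exp * exp
  | Fn of string * MLtype * exp;

exception TypeError of string;

(* S is an association list; the newest binding shadows older ones,
   which gives the "overwrite" behaviour of S+(x,a). *)
fun lookup x [] = raise TypeError ("unbound variable " ^ x)
  | lookup x ((y, t) :: rest) = if x = y then t else lookup x rest;

(* S is the inherited attribute; the returned type is synthesized. *)
fun typeOf S e =
  case e of
      Var x => lookup x S
    | IntC _ => Int
    | BoolC _ => Bool
    | App (f, x) =>
        (case typeOf S f of
             Arrow (a, t) =>
               if typeeq (a, typeOf S x) then t
               else raise TypeError "argument type mismatch"
           | _ => raise TypeError "applying a non-function")
    | If (c, s1, s2) =>
        let val t1 = typeOf S s1
            val t2 = typeOf S s2
        in
          if typeeq (typeOf S c, Bool) andalso typeeq (t1, t2) then t1
          else raise TypeError "ill-typed if"
        end
    | Fn (x, a, body) => Arrow (a, typeOf ((x, a) :: S) body);

(* fn (x : int) => if true then x else 0   has type   int -> int *)
val t = typeOf [] (Fn ("x", Int, If (BoolC true, Var "x", IntC 0)));
```

Each case of typeOf corresponds to one rule: the premises become recursive calls, and every place where a rule repeats a type variable becomes a call to typeeq.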

Type Equality in more complicated type systems
• For more complicated type systems, type equality can be quite difficult without some simplifying rules.
• For example, any type system that allows names for types, or recursive type definitions, may not be able to use structural equality.
  – Why? A system with names says: before comparing for equality, substitute out each name.
  – What if a name has a recursive definition?
• Names are important in real systems because they allow recursive definitions, but they make equality hard to test. Such systems often have two ways of declaring types, and use name equality.
  – i.e., don't substitute a definition for a name.
  – Two types with different names could have identical definitions, but not be equal.

Example
type intlist = pointer(variant record tag nil {},
                                      tag cons { car : int; cdr : intlist });
datatype 'a list = nil | cons of 'a * 'a list;
• If we tried to compare two recursive things for equality using structural equality we could get into an infinite loop.
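
Name equality avoids the loop because a name is never replaced by its definition. The sketch below uses a hypothetical type representation (ty, Named, Pointer, Record) and a hypothetical function nameeq; none of these names come from the lecture.

```sml
(* Hypothetical representation: a declared type is referred to by name. *)
datatype ty
  = IntT
  | Named of string
  | Pointer of ty
  | Record of (string * ty) list;

(* Name equality: two Named types are equal exactly when the names match.
   Definitions are never unfolded, so the comparison always terminates,
   even when a name is defined in terms of itself. *)
fun nameeq (IntT, IntT) = true
  | nameeq (Named a, Named b) = (a = b)
  | nameeq (Pointer s, Pointer t) = nameeq (s, t)
  | nameeq (Record fs, Record gs) =
      length fs = length gs andalso
      ListPair.all (fn ((f, s), (g, t)) => f = g andalso nameeq (s, t))
                   (fs, gs)
  | nameeq _ = false;

(* The body of a recursive list cell mentions intlist only by name. *)
val cell = Pointer (Record [("car", IntT), ("cdr", Named "intlist")]);

val same = nameeq (Named "intlist", Named "intlist");       (* true *)
val different = nameeq (Named "intlist", Named "numbers");  (* false *)
```

The second comparison is false even if intlist and numbers were declared with identical definitions, which is exactly the trade-off described above.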

Type Coercions
• Using type systems to infer coercions.
• Sometimes we would like operators to be overloaded, so we have to infer which one to use.
• Type checking not only annotates the tree, it might insert things as well.

typecheck: (string -> Pascaltype) -> (string Exp) -> (string Exp')

where Exp' is an annotated type similar to Exp but with type annotations.

Example Algorithm
fun typecheck S x =
  case x of
    ...
  | (Binop(oper,x,y)) =>
      let val x' = typecheck S x
          val y' = typecheck S y
      in case (tagof x', tagof y') of
           (Int,Int) => Binop'(Int, intversion oper, x', y')
         | (Real,Real) => Binop'(Real, realversion oper, x', y')
         | (Int,Real) => Binop'(Real, realversion oper, int2real x', y')
         | (Real,Int) => Binop'(Real, realversion oper, x', int2real y')
      end
    ...

Sub Typing
• In some languages there is subtyping.
• For example the type 3 .. 12 is a subtype of the type Int.
• A function that expects an Int could take an element which was a subtype of Int. This is called subsumption.
• Such rules might be expressed as:

        S |- x_i : s_i    S |- P : (t_1 * ... * t_n) -> Void    s_i <= t_i
        ---------------------------------------------------------------------
        S |- P(x_1, ... , x_n) : Void

• The type system would have to be able to check the <= relationship between types, just as we computed type equality.
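
A check for the <= relation has the same shape as typeeq. Here is a minimal sketch for a hypothetical type language with Int, Real, and integer subranges; the names ty, Range, subtype, and argsOk are illustrative assumptions.

```sml
(* Hypothetical types: Range (lo, hi) stands for the subrange lo .. hi. *)
datatype ty = IntT | RealT | Range of int * int;

(* subtype (s, t) decides the relation s <= t. *)
fun subtype (IntT, IntT) = true
  | subtype (RealT, RealT) = true
  | subtype (Range _, IntT) = true
  | subtype (Range (a, b), Range (c, d)) = c <= a andalso b <= d
  | subtype _ = false;

(* Premise of the call rule: each actual type s_i must be a subtype of
   the corresponding formal type t_i. *)
fun argsOk (actuals, formals) =
  length actuals = length formals andalso
  ListPair.all subtype (actuals, formals);

val ok = argsOk ([Range (3, 12), IntT], [IntT, IntT]);   (* true *)
```

Unlike equality, the relation is not symmetric: subtype (IntT, Range (3, 12)) is false, since an arbitrary Int need not lie in 3 .. 12.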

Types
• Purpose
  – Types describe both the form and behavior of valid programs
  – Safety
    » Bad things do not happen
  – Expressiveness
    » Operator overloading
      • Context sensitive meaning
      • Can't be expressed in the syntax of the language without richer types of grammars.
  – Efficiency
    » By choosing implementations that depend on type information, the most efficient ones can be used
  – Representation information
    » Types often express knowledge about how a value is represented
      • How much space it takes up
      • Whether it is a pointer

Kinds of Type Systems
• Untyped
  – No type information is carried by the data
  – No type checking is done; any kind of type mistake causes a run-time error. The error is often mysterious
    » Core dump
  – Examples: parts of C, assembly language
• Dynamically typed
  – Data carries type information
    » Tags
    » Pointers into discrete ranges
  – Operations test the type information before running
  – Type errors are caught at run-time, and often tell exactly what went wrong
  – E.g. Lisp, Basic, certain parts of the class hierarchy in Java
• Statically typed
  – Data may or may not have type information
  – Errors are caught by disallowing programs that don't type check
  – E.g. ML, Haskell, parts of Java.

Basic Types
• Numbers
  – Int
  – Int64
  – Unsigned
  – . . .
• Characters
  – Traditionally 8 bit
  – Usually 16 or 32 bit with the advent of Unicode and extended character sets.
• Booleans
  – Sometimes 1 bit
  – Often the same representation as int

Constructed Types
• Arrays
  – Homogeneous
  – Contiguous
• Strings
  – Usually special syntax for string constants
  – Sometimes arrays
  – Sometimes linked list representation
• Enumerated types
  – Finite number of elements
  – In ML
    » datatype Color = Red | Blue | Green | Yellow | Purple | Orange

Products
• Products are heterogeneous aggregates
  • Structures
  • Tuples
  • Records
• Sometimes have named fields, sometimes use pattern matching or integer indexing
  • person.age
  • fun f (name,age,address) = . . .
  • person.#1

Pointers
• Pointers are a low level (implementation level) mechanism to describe indirection.
• In some languages, notably C, one can do pointer arithmetic.
• Advantages
  – Uniform representation size. All pointers have the same size.
  – Sharing (change what is pointed to, all other pointers observe the change).
• Disadvantages
  – Null or dangling pointers
  – Sometimes hard to know the type of the thing pointed to

Unions
• When an element of a type can take on one of a number of different forms
  • Variants
  • Unions
  • Datatypes in ML
  • Classes in Java

Type Checking
• Tries to catch errors where data is used in a manner inconsistent with its definition
  • Operators take specific types as operands
  • Functions take specific types as arguments
  • Only pointer types can be de-referenced
  • Only union types can be "cased" over

Type Checking
• Typing is directed by the syntax, or the structure of the program.
• Usually performed by walking the abstract syntax tree.
• Type checking attaches a type to every sub expression (or piece of syntax)
  – This type is always given in terms of the type of the sub expressions
• Often we think of this type being an attribute of each syntax node.
• Attribute grammars provide a natural way of describing this.

Examples
• Grammar
    E -> E < E
    E -> E andalso E
    E -> number
• Abstract Type
    datatype exp = And of exp * exp | Less of exp * exp | Num of int
[Figure: syntax tree with the nodes andalso, true, <, 3 and 2, each annotated with its type: andalso : bool, true : bool, < : bool, 3 : int, 2 : int]
• Attributes flowing up the tree are "synthesized".
• Attributes flowing down the tree are "inherited".

Example 2 synthesized attributes
• Meaning = Code to compute E (represented as a (string * string) )
fun mean x =
  case x of
    Plus(x,y) =>
      let val (namex,codex) = (mean x)
          val (namey,codey) = (mean y)
          val new = newtemp()
      in (new, codex ^ codey ^ (new ^ " = " ^ namex ^ " + " ^ namey)) end
  | Times(x,y) =>
      let val (namex,codex) = (mean x)
          val (namey,codey) = (mean y)
          val new = newtemp()
      in (new, codex ^ codey ^ (new ^ " = " ^ namex ^ " * " ^ namey)) end
  | Num n =>
      let val new = newtemp()
      in (new, new ^ " = " ^ (int2str n)) end

Example 2 (cont)
+   ("T5", "T1 = 5 T2 = 3 T3 = 2 T4 = T2 * T3 T5 = T1 + T4")
  5   ("T1", "T1 = 5")
  *   ("T4", "T2 = 3 T3 = 2 T4 = T2 * T3")
    3   ("T2", "T2 = 3")
    2   ("T3", "T3 = 2")
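
The slides use newtemp and int2str without defining them. A runnable sketch of the same synthesized computation follows, under these assumptions: temporaries are named T1, T2, ... from a global counter, Int.toString stands in for int2str, each instruction ends in a newline (the slide shows them separated by spaces), and the shared helper binop is a hypothetical refactoring of the two nearly identical cases.

```sml
datatype exp = Num of int | Plus of exp * exp | Times of exp * exp;

(* Assumption: a global counter supplies fresh names T1, T2, ... *)
val counter = ref 0;
fun newtemp () = (counter := !counter + 1; "T" ^ Int.toString (!counter));

(* The synthesized attribute is a pair (name holding the result, code).
   Children are translated first, so information flows up the tree. *)
fun binop (oper, x, y) =
  let val (namex, codex) = mean x
      val (namey, codey) = mean y
      val new = newtemp ()
  in (new, codex ^ codey ^ new ^ " = " ^ namex ^ oper ^ namey ^ "\n") end
and mean e =
  case e of
      Plus (x, y) => binop (" + ", x, y)
    | Times (x, y) => binop (" * ", x, y)
    | Num n =>
        let val new = newtemp ()
        in (new, new ^ " = " ^ Int.toString n ^ "\n") end;

(* 5 + 3 * 2, the tree from the slide *)
val (name, code) = mean (Plus (Num 5, Times (Num 3, Num 2)));
(* name = "T5"
   code = "T1 = 5\nT2 = 3\nT3 = 2\nT4 = T2 * T3\nT5 = T1 + T4\n" *)
```

The numbering of the temporaries depends on the left-to-right order in which the children are translated, which is why the left operand 5 receives T1.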

Inherited attributes
int f(int s, bool t)
{ int temp;
  temp = 0;
  if (t) return s;
  return (temp+3);
}
[Figure: syntax tree of the Fdecl, with Type, Name, Args (two Decl nodes) and Body children. The table of declarations is passed down the tree and grows as it goes: (int,f), then (int,f)(int,s)(bool,t) for the body, then (int,f)(int,s)(bool,t)(int,temp) for the statements that follow the declaration of temp.]

Attribute Grammar Computations
• Decorating the syntax tree.
• Computing synthesized attributes proceeds from the leaves to the root.
  – Synthesized computations are implemented by an inductive function, where the value computed (returned) is the synthesized attribute.
• Computing inherited attributes passes information from the root to the leaves.
  – Inherited computations are implemented by an inductive function with an extra parameter.

Example: Synthesized
datatype exp
  = Int of int
  | Real of real
  | Op of exp * string * exp;
datatype value = I of int | R of real;
exception mix_matched_type
fun operate x s y =
  case (x,s,y) of
    (I n,"+",I m) => I (n+m)
  | (I n,"*",I m) => I (n*m)
  | (R n,"+",R m) => R (n+m)
  | (R n,"*",R m) => R (n*m)
  | _ => raise mix_matched_type

Example (cont.)
fun translate e =
  case e of
    Int n => I n
  | Real r => R r
  | Op(x,s,y) =>
      let val xv = translate x
          val yv = translate y
      in operate xv s yv end
• Note that information flows up the tree.
• We recursively translate the children before we translate a node.
• This causes a bottom up flow.

Explicitly Annotating the tree
• If we want to build a tree which has explicit annotations we need to define a type which has "room" for the annotations.
• Use polymorphism to encode a type with "room" for an annotation at each node (the 'a in the types below).
datatype 'a Exp
  = Int' of 'a * int
  | Real' of 'a * real
  | Op' of 'a * ('a Exp) * string * ('a Exp);
fun getattr e =
  case e of
    Int'(x,n) => x
  | Real'(x,r) => x
  | Op'(x,a,s,b) => x;

Explicit Annotation Example
fun translate e =
  case e of
    Int'( _ ,n) => Int'(I n,n)
  | Real'( _ ,r) => Real'(R r,r)
  | Op'( _ ,x,s,y) =>
      let val xv = translate x
          val yv = translate y
      in Op'(operate (getattr xv) s (getattr yv), xv,s,yv) end
• Note we ignore the attribute (use the wild card pattern) on the way down, and rebuild the tree with the correct attribute on the way up.

Inherited Attributes
• Consider an expression language with declarations and implicit coercions.
• Grammar:
    E -> id
    E -> ( E )
    E -> id : Type in E
    E -> E op E
  Example:
    var x : int in
    var y : real in (x + y) * x
  (x needs to be coerced to a real)
• datatype:
    datatype exp
      = Var of string
      | I2R of exp                 (explicit coercion)
      | DeclI of string * exp
      | DeclR of string * exp
      | Op of exp * string * exp;

Inherited Attribute Tree
var x : int in var y : real in (x + y) * x
[Figure: two syntax trees side by side. "We have": var x int, containing var y real, containing * whose children are (+ x y) and x. "We want": the same tree with each use of x wrapped in a Real coercion node.]

Inherited Computation
[Figure: the same tree, each node labeled with the table it inherits: [ ] at the outer var, [ (x, int) ] at the inner var, and [ (y, real), (x, int) ] at *, + and the leaves x and y.]

Simultaneous Synthesized Computation
[Figure: the same tree, each node labeled with its inherited table [ (y, real), (x, int) ] and its synthesized (type, tree) pair:
  x : int, Var(x)
  y : real, Var(y)
  + : real, Op(I2R(Var(x)), "+", Var(y))
  * : real, Op(I2R(Var(x)), "*", Op(I2R(Var(x)), "+", Var(y)))]

The Computation
• The computation uses an extra parameter of type (string * type) list as the inherited attribute.
• It returns a type * exp as the synthesized attribute.
• We need a modified Op constructor that uses the type info to add the explicit I2R annotations.
(* IntT and RealT are assumed constructors of the type representation;
   lowercase int and real in a pattern would be variables that match anything. *)
fun AnnOp ( (t1,e1),oper,(t2,e2) ) =
  case (t1,t2) of
    (IntT,IntT) => (IntT, Op(e1,oper,e2))
  | (RealT,IntT) => (RealT, Op(e1,oper,I2R e2))
  | (IntT,RealT) => (RealT, Op(I2R e1,oper, e2))
  | (RealT,RealT) => (RealT, Op(e1,oper,e2))

Algorithm
(* translate returns a (type, exp) pair, so the Decl cases unpack the pair
   before rebuilding the node. IntT and RealT are assumed constructors of
   the type representation. *)
fun translate e types =
  case e of
    Var s => (lookup s types,Var s)
  | DeclI(s,e) =>
      let val (t,e') = translate e ( (s,IntT) :: types )
      in (t, DeclI(s,e')) end
  | DeclR(s,e) =>
      let val (t,e') = translate e ( (s,RealT) :: types )
      in (t, DeclR(s,e')) end
  | Op(x,s,y) =>
      let val xv = translate x types
          val yv = translate y types
      in AnnOp(xv,s,yv) end
• Note how the extra parameter types is used to add information and pass it "down" the tree.
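
Putting the pieces together, the following is a self-contained sketch of the whole computation run on the example var x : int in var y : real in (x + y) * x. The type representation (ty with IntT and RealT), the Unbound exception, and lookup are assumptions that the slides leave undefined; annOp is the slide's AnnOp in lower case.

```sml
datatype ty = IntT | RealT;        (* assumed representation of types *)

datatype exp
  = Var of string
  | I2R of exp                     (* explicit int-to-real coercion *)
  | DeclI of string * exp          (* var s : int in e *)
  | DeclR of string * exp          (* var s : real in e *)
  | Op of exp * string * exp;

exception Unbound of string;

(* The inherited attribute: an association list, newest binding first. *)
fun lookup s [] = raise Unbound s
  | lookup s ((x, t) :: rest) = if s = x then t else lookup s rest;

(* Builds the Op node, inserting I2R wherever an int meets a real. *)
fun annOp ((t1, e1), oper, (t2, e2)) =
  case (t1, t2) of
      (IntT, IntT) => (IntT, Op (e1, oper, e2))
    | (RealT, IntT) => (RealT, Op (e1, oper, I2R e2))
    | (IntT, RealT) => (RealT, Op (I2R e1, oper, e2))
    | (RealT, RealT) => (RealT, Op (e1, oper, e2));

(* types flows down the tree; the (type, exp) pair flows back up. *)
fun translate e types =
  case e of
      Var s => (lookup s types, Var s)
    | I2R x =>
        let val (_, x') = translate x types in (RealT, I2R x') end
    | DeclI (s, body) =>
        let val (t, body') = translate body ((s, IntT) :: types)
        in (t, DeclI (s, body')) end
    | DeclR (s, body) =>
        let val (t, body') = translate body ((s, RealT) :: types)
        in (t, DeclR (s, body')) end
    | Op (x, s, y) => annOp (translate x types, s, translate y types);

val prog =
  DeclI ("x", DeclR ("y", Op (Op (Var "x", "+", Var "y"), "*", Var "x")));

val (resultType, annotated) = translate prog [];
(* resultType = RealT, and both uses of x are now wrapped in I2R *)
```

Starting from the empty table, each declaration conses a binding onto types before the body is translated, so the two uses of x are looked up as IntT and coerced when they meet a real operand.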

Overview
• The key to writing successful attribute computations is thinking ahead.
• First identify what the synthesized attributes are. Think of a type that will represent these. This is the return type of the computation.
• Second identify what the inherited attributes are. For each one there will be an extra parameter.
• The final type will be something like:
    syntaxtree -> inh1 -> inh2 -> (syn1 * syn2)

Tagging the tree
• Sometimes the synthesized attribute needs to be added to the tree rather than be returned.
• Again thinking ahead is important.
  – The tree needs room for the extra attribute.
  – The tree used as input doesn't contain any interesting values as the attribute. The attributes in the input are usually ignored.
  – Rather than return the synthesized attribute, the program returns a new tree.
• The final type will be something like:
    'a syntaxtree -> inh1 -> inh2 -> (syn1 * syn2) syntaxtree
  Note the tag is originally some type that we ignore. But the output is a new tree with a filled-in attribute (or attributes as a tuple).