Cse321, Programming Languages and Compilers Lecture #12, Feb. 21, 2007 •Basic Types •Constructed Types •Representing types as data •Describing type systems as rules •Type rules for.


Cse321, Programming Languages and Compilers

Lecture #12, Feb. 21, 2007

5/2/2020

Basic Types

Constructed Types

Representing types as data

Describing type systems as rules

Type rules for ML

Type equality

Type coercions

Sub typing

Purpose of type systems

Kinds of type systems

Primitive types

Constructed types

Type checking

Attribute grammars

Inherited attributes

Synthesized attributes

Adding attributes to trees

Programs for computing attribute computations.


Assignments

• Reading: Chapter 4
  – Section 4.3 (Attribute Grammars)
  – Section 4.4 (Ad hoc syntax-directed translation)
  – Pages 171-200
• Quiz on Monday on all of Chapter 4 read so far (4.1-4.4).

Type Checking

Type checking assigns a consistent type to every expression and statement.

Generally there are two kinds of types:
– Basic types: int, real, bool, string, etc.
– Constructed types: array, list, products (pairs, triples, etc.), pointers, records, functions. These contain instances of other types.

In ML types are quite simple. Constructed types include things like functions ( int -> string ), tuples ( int * bool * string ), lists ( int list ), etc.

In mini Java types are more complex because of classes and inheritance.

Representing Types

datatype MLtype
  = Unit | Int | Char | Bool . . .
  | Product of MLtype list
  | Arrow of (MLtype * MLtype);

Product and arrow types are the only constructed types in this example.
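As a small illustration of types as data, the sketch below builds two type values and prints them. It assumes the constructors hidden behind the ". . ." on the slide are simply left out, and the printer show is a hypothetical helper that is not part of the lecture.

```sml
(* Assumption: the constructors elided by ". . ." on the slide are omitted. *)
datatype MLtype
  = Unit | Int | Char | Bool
  | Product of MLtype list
  | Arrow of MLtype * MLtype

(* int * bool -> char *)
val t1 = Arrow (Product [Int, Bool], Char)

(* int -> (bool -> unit); the ML arrow associates to the right *)
val t2 = Arrow (Int, Arrow (Bool, Unit))

(* Hypothetical pretty-printer: fully parenthesizes constructed types. *)
fun show Unit = "unit"
  | show Int = "int"
  | show Char = "char"
  | show Bool = "bool"
  | show (Product ts) = "(" ^ String.concatWith " * " (map show ts) ^ ")"
  | show (Arrow (d, r)) = "(" ^ show d ^ " -> " ^ show r ^ ")"
```

Each constructed type is a tree whose children are themselves MLtype values, which is why functions over types are naturally recursive.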

Describing Type Systems

• A type system gives rules for assigning types to expressions and statements based upon the types of their sub-expressions and sub-statements.
• A standard way to talk about type systems is to use an inference notation.

Let S (or some other symbol) stand for a mapping from names to types.

Then rules are of the form:

    S |- x     S |- w
    -------------------
          S |- y

(usually x and w are "sub-pieces" of y)

Which is read as: "To show S derives y, show S derives x, and S derives w".

Simple Rules

We always have simple rules, such as:

    (S x) = t
    --------------------
    S |- x : t

Think of S as a table. In this table we look up the types of simple objects like variables. So (S x) = t means that when we look up the type of x we find the type t.

    --------------------
    S |- 5 : int

For primitives, like constants, we just know their types.

• To show S derives that x (x a variable) has type t, show that the mapping S applied to x is t.
• The integer 5 has type int regardless of what is derivable from S; that is why there is nothing above the line.
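The table view of S can be sketched as an association list. This is only one possible representation; the names env, lookup, Unbound and env0 are assumptions made for the example.

```sml
datatype MLtype = Unit | Int | Char | Bool

exception Unbound of string

(* S as an association list from names to types.
   The most recent binding comes first. *)
type env = (string * MLtype) list

(* (S x) = t : search the table for x. *)
fun lookup (x : string) ([] : env) = raise Unbound x
  | lookup x ((y, t) :: rest) = if x = y then t else lookup x rest

(* A sample table with two variables (hypothetical names). *)
val env0 : env = [("x", Int), ("flag", Bool)]
```

Looking up a name that is not in the table raises Unbound, which corresponds to an expression that cannot be typed.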

Complex Rules

To implement rules like this we would use an attribute computation.

Note that the mapping, S, might change as we move around the program.

– Declarations add to S, adding types for new variables.
– Exiting a local scope means removing things from S.

S is usually implemented as an inherited attribute. The types that we derive annotate the tree and are synthesized attributes.

Rules for ML expressions

    (S x) = t
    --------------------      (where x is a variable)
    S |- x : t

    S |- n : int              (where n is an integer constant like 5 or 23)

    S |- c : char             (where c is a character constant like #"a")

    S |- b : bool             (where b is a boolean like true or false)

ML expression types (cont)

    S |- x : a     S |- f : a -> t
    -----------------------------------
    S |- f x : t

Note how the domain of the function must have the same type as the actual argument.

    S |- x : t1     S |- y : t2     (S <+>) = t1 * t2 -> t3
    ---------------------------------------------------------
    S |- x <+> y : t3

where <+> is a binary operator like + or *

ML statement types

• Statements in ML are semicolon-separated expressions inside parentheses.
• E.g. (print x; x + 1)

The expressions before the last are executed only for their side effects. They can have any type. The type of the last expression is the type of the statement.

    S |- e_i : a_i  (for each i < n)     S |- e_n : t
    ---------------------------------------------------
    S |- (e_1; … ; e_n) : t

Assignment and If expressions

    S |- x : t ref     S |- y : t
    ------------------------------------
    S |- x := y : unit

    S |- x : bool     S |- s1 : t     S |- s2 : t
    ----------------------------------------------
    S |- if x then s1 else s2 : t

ML while stmt

    S |- e : bool     S |- s : a
    -----------------------------
    S |- while e do s : unit

ML anonymous function types

    S+(x,a) |- e : b
    -----------------------------
    S |- (fn x => e) : a -> b

S+(x,a) means add the mapping of variable x to the type a to the table S. If S already has a mapping for x, then overwrite it with a.

Implementing the Rules

To implement the rules, use an inductive function which takes an expression and returns a MLtype.

Any error that occurs indicates the expression can’t be well typed.

The mapping S is an inherited attribute.

That means it changes as we move around the program

If a type appears more than once in any rule, it must be the same for all occurrences.

This requires that we check that two types are equal.

In a language without polymorphic types, this is simple. The function below illustrates structural equality.

Type Equality

fun typeeq (x,y) =
  case (x,y) of
    (Unit,Unit) => true
  | (Int,Int) => true
  | (Char,Char) => true
  | (Bool,Bool) => true
  | (Arrow(d1,r1),Arrow(d2,r2)) => typeeq(d1,d2) andalso typeeq(r1,r2)
  | (Product(ss),Product(ts)) => (listeq ss ts)
  | (_,_) => false
and listeq (x::xs) (y::ys) = typeeq(x,y) andalso listeq xs ys
  | listeq [] [] = true
  | listeq _ _ = false

Note we need mutually recursive functions.
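Putting the rules and type equality together, a checker can be sketched as one inductive function over expressions. The expression datatype below is hypothetical (the slides do not define one for ML), Fn carries an explicit parameter type so that nothing has to be inferred, and the built-in = is used as structural equality because this MLtype contains only equality types.

```sml
datatype MLtype
  = Unit | Int | Char | Bool
  | Product of MLtype list
  | Arrow of MLtype * MLtype

(* Hypothetical expression syntax, not taken from the slides. *)
datatype exp
  = Var of string
  | IntC of int
  | BoolC of bool
  | App of exp * exp
  | If of exp * exp * exp
  | Fn of string * MLtype * exp

exception TypeError of string

fun lookup x [] = raise TypeError ("unbound variable " ^ x)
  | lookup x ((y, t) :: rest) = if x = y then t else lookup x rest

(* env plays the role of S: it is the inherited attribute.
   The returned MLtype is the synthesized attribute. *)
fun check env e =
  case e of
    Var x => lookup x env
  | IntC _ => Int
  | BoolC _ => Bool
  | App (f, x) =>
      (case check env f of
         Arrow (a, t) =>
           if check env x = a then t
           else raise TypeError "argument type differs from function domain"
       | _ => raise TypeError "applying a non-function")
  | If (c, s1, s2) =>
      let val t = check env s1
      in
        if check env c <> Bool then raise TypeError "condition is not bool"
        else if check env s2 <> t then raise TypeError "branches differ"
        else t
      end
    (* S+(x,a): the new binding shadows any older binding for x. *)
  | Fn (x, a, body) => Arrow (a, check ((x, a) :: env) body)
```

Any TypeError raised during the walk means the expression cannot be well typed, matching the statement on the "Implementing the Rules" slide.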

Type Equality in more complicated type systems

For more complicated type systems type equality can be quite difficult without some simplifying rules.

For example any type system that allows names for types, or recursive type definitions may not be able to use structural equality.

Why? A system with names says: before comparing for equality, substitute out each name.

What if a name has a recursive definition?

Names are important in real systems because they allow recursive definitions, but they make equality hard to test. Such systems often have two ways of declaring types, and use name equality.
– I.e. don't substitute a definition for a name.
– Two types with different names could have identical definitions, but not be equal.

Example

type intlist = pointer(variant record
                         tag nil  {},
                         tag cons { car : int; cdr : intlist });

datatype 'a list = nil | cons of 'a * 'a list;

If we tried to compare two recursive types for equality using structural equality, we could get into an infinite loop.
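Name equality avoids the loop because a name is never expanded. A minimal sketch, assuming a hypothetical representation in which a declared type is referred to only by its name:

```sml
(* Hypothetical representation: Named carries only the declared name,
   never the (possibly recursive) definition. *)
datatype ty
  = Int
  | Pointer of ty
  | Named of string

(* Name equality: two named types are equal exactly when the names match.
   No definition is substituted, so the comparison always terminates. *)
fun nameeq (Int, Int) = true
  | nameeq (Pointer a, Pointer b) = nameeq (a, b)
  | nameeq (Named a, Named b) = (a = b)
  | nameeq _ = false
```

Under this scheme two differently named types compare as unequal even when their definitions are identical, which is the trade-off the previous slide describes.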

Type Coercions

• Using type systems to infer coercions.
• Sometimes we would like operators to be overloaded, so we have to infer which one to use.
• Type checking not only annotates the tree, it might insert things as well.

typecheck: (string -> Pascaltype) -> (string Exp) -> (string Exp')

where Exp' is an annotated type similar to Exp but with type annotations.

Example Algorithm

fun typecheck S x =
  case x of
    ...
  | (Binop(oper,x,y)) =>
      let val x' = typecheck S x
          val y' = typecheck S y
      in case (tagof x', tagof y') of
           (Int,Int)   => Binop'(Int, intversion oper, x', y')
         | (Real,Real) => Binop'(Real, realversion oper, x', y')
         | (Int,Real)  => Binop'(Real, realversion oper, int2real x', y')
         | (Real,Int)  => Binop'(Real, realversion oper, x', int2real y')
      end
    ...

Sub Typing

• In some languages there is subtyping.
• For example the type 3 .. 12 is a subtype of the type Int.
• A function that expects an Int could take an element which was a subtype of Int. This is called subsumption.
• Such rules might be expressed as:

    S |- x_i : s_i     S |- P : (t_1 * ... * t_n) -> Void     s_i <= t_i
    ----------------------------------------------------------------------
    S |- P(x_1, ... ,x_n) : Void

The type system would have to be able to check the <= relationship between types, just as we computed type equality.
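Checking the <= relation can be sketched in the same style as type equality. The representation below is an assumption made for the example: Range (lo, hi) models a subrange type such as 3 .. 12, and argsOk is a hypothetical helper that checks the actual argument types of a call against the formal parameter types.

```sml
(* Hypothetical types: Range (lo, hi) models a subrange such as 3 .. 12. *)
datatype ty
  = Int
  | Range of int * int
  | Void

(* subtype (s, t) is true when s <= t. *)
fun subtype (Int, Int) = true
  | subtype (Void, Void) = true
  | subtype (Range _, Int) = true
  | subtype (Range (a, b), Range (c, d)) = c <= a andalso b <= d
  | subtype _ = false

(* Each actual type s_i must be a subtype of the formal type t_i,
   and the two lists must have the same length. *)
fun argsOk ([], []) = true
  | argsOk (s :: ss, t :: ts) = subtype (s, t) andalso argsOk (ss, ts)
  | argsOk _ = false
```

Note that the relation is not symmetric: a 3 .. 12 value can be passed where an Int is expected, but not the other way around.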

Types

Purpose
– Types describe both the form and behavior of valid programs
– Safety
  » Bad things do not happen
– Expressiveness
  » Operator overloading
    • Context-sensitive meaning
    • Can't be expressed in the syntax of the language without richer types of grammars.
– Efficiency
  » By choosing implementations that depend on type information, the most efficient ones can be used
– Representation information
  » Types often express knowledge about how a value is represented
    • How much space it takes up
    • Whether it is a pointer

Kinds of Type Systems

• Untyped
  – No type information is carried by the data
  – No type checking is done; any kind of type mistake causes a run-time error. The error is often mysterious
    » Core dump
  – Examples: parts of C, assembly language
• Dynamically typed
  – Data carries type information
    » Tags
    » Pointers into discrete ranges
  – Operations test the type information before running
  – Type errors are caught at run-time, and often tell exactly what went wrong
  – E.g. Lisp, Basic, certain parts of the class hierarchy in Java
• Statically typed
  – Data may or may not have type information
  – Errors are caught by disallowing programs that don't type check
  – E.g. ML, Haskell, parts of Java.

Basic Types

• Numbers
  – Int
  – Int64
  – Unsigned
  – . . .
• Characters
  – Traditionally 8 bit
  – Usually 16 or 32 bit with the advent of Unicode and extended character sets.
• Booleans
  – Sometimes 1 bit
  – Often the same representation as int

Constructed Types

• Arrays
  – Homogeneous
  – Contiguous
• Strings
  – Usually special syntax for string constants
  – Sometimes arrays
  – Sometimes linked list representation
• Enumerated types
  – Finite number of elements
  – In ML
    » datatype Color = Red | Blue | Green | Yellow | Purple | Orange

Products

Products are heterogeneous aggregates

• Structures
• Tuples
• Records

Sometimes have named fields, sometimes use pattern matching or integer indexing
• person.age
• fun f (name,age,address) = . . .
• person.#1
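In Standard ML itself the three access styles look as follows. The record, the tuple and all field names are made up for illustration; note that SML writes the selector before the value (#age person) rather than after it.

```sml
(* Hypothetical data, used only for illustration. *)
val person = {name = "Ann", age = 30, address = "Portland"}
val triple = ("Ann", 30, "Portland")

(* Named field selection on a record. *)
val byField = #age person

(* Pattern matching on a tuple. *)
fun ageOf (name, age, address) = age
val byPattern = ageOf triple

(* Integer indexing on a tuple. *)
val byIndex = #2 triple
```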

Pointers

Pointers are a low level (implementation level) mechanism to describe indirection.

In some languages, notably C, one can do pointer arithmetic.

Advantages
– Uniform representation size. All pointers have the same size.
– Sharing (change what is pointed to, all other pointers observe the change).

Disadvantages
– Null or dangling pointers
– Sometimes hard to know the type of the thing pointed to

Unions

When an element of a type can take on one of a number of different forms

• Variants
• Unions
• Datatypes in ML
• Classes in Java

Type Checking

Tries to catch errors where data is used in a manner inconsistent with its definition
• Operators take specific types as operands
• Functions take specific types as arguments
• Only pointer types can be de-referenced
• Only union types can be "cased" over

Type Checking

• Typing is directed by the syntax, or the structure of the program.
• Usually performed by walking the abstract syntax tree.
• Type checking attaches a type to every sub-expression (or piece of syntax).
• This type is always given in terms of the types of the sub-expressions.
• Often we think of this type as being an attribute of each syntax node.

Attribute grammars provide a natural way of describing this.

Examples

Grammar
    E -> E < E
    E -> E andalso E
    E -> number

Abstract Type
    datatype exp = And of exp * exp
                 | Less of exp * exp
                 | Num of int

[Figure: syntax tree of an expression built from andalso, true, <, 2 and 3. Each node is annotated with its type: bool for andalso, true and <; int for 2 and 3.]

Attributes flowing up the tree are "synthesized"; attributes flowing down the tree are "inherited".

Example 2 synthesized attributes

Meaning = Code to compute E (represented as a (string * string))

fun mean x =
  case x of
    Plus(x,y) =>
      let val (namex,codex) = (mean x)
          val (namey,codey) = (mean y)
          val new = newtemp()
      in (new, codex ^ codey ^ (new ^ " = " ^ namex ^ " + " ^ namey)) end
  | Times(x,y) =>
      let val (namex,codex) = (mean x)
          val (namey,codey) = (mean y)
          val new = newtemp()
      in (new, codex ^ codey ^ (new ^ " = " ^ namex ^ " * " ^ namey)) end
  | Num n =>
      let val new = newtemp()
      in (new, new ^ " = " ^ (int2str n)) end

Example 2 (cont)

+   ("T5", "T1 = 5 T2 = 3 T3 = 2 T4 = T2 * T3 T5 = T1 + T4")
    5   ("T1", "T1 = 5")
    *   ("T4", "T2 = 3 T3 = 2 T4 = T2 * T3")
        3   ("T2", "T2 = 3")
        2   ("T3", "T3 = 2")
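The annotated tree for 5 + 3 * 2 can be reproduced with a self-contained version of mean. Two things here are assumptions rather than part of the slides: newtemp is implemented with a counter, and every generated instruction ends with a newline so the pieces of code stay separated. The helper binary is also hypothetical; it only factors out the code shared by the Plus and Times cases.

```sml
datatype exp = Num of int | Plus of exp * exp | Times of exp * exp

(* Hypothetical helper: the slides use newtemp without defining it. *)
val counter = ref 0
fun newtemp () = (counter := !counter + 1; "T" ^ Int.toString (!counter))

(* mean returns (name of the temporary holding the result, code).
   Assumption: each instruction is terminated by a newline. *)
fun mean e =
  case e of
    Num n =>
      let val new = newtemp ()
      in (new, new ^ " = " ^ Int.toString n ^ "\n") end
  | Plus (x, y) => binary "+" x y
  | Times (x, y) => binary "*" x y
and binary oper x y =
  let
    (* The children are translated first: information flows up the tree. *)
    val (namex, codex) = mean x
    val (namey, codey) = mean y
    val new = newtemp ()
  in
    (new, codex ^ codey ^ new ^ " = " ^ namex ^ " " ^ oper ^ " " ^ namey ^ "\n")
  end

(* 5 + 3 * 2 *)
val (result, code) = mean (Plus (Num 5, Times (Num 3, Num 2)))
```

With a fresh counter, result is "T5" and code lists the five assignments in the same order as the root annotation of the tree.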

Inherited attributes

int f(int s, bool t)
{ int temp;
  temp = 0;
  if (t) return s;
  return (temp+3);
}

[Figure: syntax tree for f, with an Fdecl node (Type, Name, Args, Body), Decl nodes (Type, Name) and Stmt nodes. The symbol table flows down the tree as an inherited attribute: (int,f) at the function declaration, (int,f)(int,s)(bool,t) after the arguments, and (int,f)(int,s)(bool,t)(int,temp) after the declaration of temp.]

Attribute Grammar Computations

Decorating the syntax tree.

Computing Synthesized attributes proceeds from the leaves to the root.

Synthesized computations are implemented by an inductive function, where the value computed (returned) is the synthesized attribute.

Computing Inherited attributes passes information from the root to the leaves.

Inherited computations are implemented by an inductive function with an extra parameter.


Example: Synthesized

datatype exp
  = Int of int
  | Real of real
  | Op of exp * string * exp;

datatype value = I of int | R of real;

exception mix_matched_type

fun operate x s y =
  case (x,s,y) of
    (I n,"+",I m) => I (n+m)
  | (I n,"*",I m) => I (n*m)
  | (R n,"+",R m) => R (n+m)
  | (R n,"*",R m) => R (n*m)
  | _ => raise mix_matched_type

Example (cont.)

fun translate e =
  case e of
    Int n => I n
  | Real r => R r
  | Op(x,s,y) =>
      let val xv = translate x
          val yv = translate y
      in operate xv s yv end

• Note that information flows up the tree.
• We recursively translate the leaves before we translate a node. This causes a bottom-up flow.

Explicitly Annotating the tree

• If we want to build a tree which has explicit annotations, we need to define a type which has "room" for the annotations.
• Use polymorphism to encode a type with "room" for an annotation at each node (the 'a in the types below).

datatype 'a Exp
  = Int' of 'a * int
  | Real' of 'a * real
  | Op' of 'a * ('a Exp) * string * ('a Exp);

fun getattr e =
  case e of
    Int'(x,n) => x
  | Real'(x,r) => x
  | Op'(x,a,s,b) => x;

Explicit Annotation Example

fun translate e =
  case e of
    Int'( _ ,n) => Int'(I n,n)
  | Real'( _ ,r) => Real'(R r,r)
  | Op'( _ ,x,s,y) =>
      let val xv = translate x
          val yv = translate y
      in Op'(operate (getattr xv) s (getattr yv), xv,s,yv) end

Note we ignore the attribute (use the wild card pattern) on the way down, and rebuild the tree with the correct attribute on the way up.
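A self-contained run of the annotating translation may make the flow clearer. The definitions repeat the ones on the slides (written with a type variable 'a for the annotation); the sample tree and the choice of unit as the ignored input annotation are assumptions made for the example.

```sml
datatype value = I of int | R of real
exception mix_matched_type

datatype 'a Exp
  = Int' of 'a * int
  | Real' of 'a * real
  | Op' of 'a * 'a Exp * string * 'a Exp

fun operate x s y =
  case (x, s, y) of
    (I n, "+", I m) => I (n + m)
  | (I n, "*", I m) => I (n * m)
  | (R n, "+", R m) => R (n + m)
  | (R n, "*", R m) => R (n * m)
  | _ => raise mix_matched_type

fun getattr e =
  case e of
    Int' (x, _) => x
  | Real' (x, _) => x
  | Op' (x, _, _, _) => x

(* Ignore the old annotation on the way down,
   rebuild the node with the computed value on the way up. *)
fun translate e =
  case e of
    Int' (_, n) => Int' (I n, n)
  | Real' (_, r) => Real' (R r, r)
  | Op' (_, x, s, y) =>
      let
        val xv = translate x
        val yv = translate y
      in
        Op' (operate (getattr xv) s (getattr yv), xv, s, yv)
      end

(* 2 + 3 * 4, with unit as the ignored input annotation *)
val input = Op' ((), Int' ((), 2), "+", Op' ((), Int' ((), 3), "*", Int' ((), 4)))
val output = translate input
```

The input has type unit Exp and the output has type value Exp, so the type itself records that the annotations were filled in; the root of output carries I 14.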

Inherited Attributes

• Consider an expression language with declarations and implicit coercions.

Grammar:
    E -> id
    E -> ( E )
    E -> var id : Type in E
    E -> E op E

Example:
    var x : int in
    var y : real in (x + y) * x

    (x needs to be coerced to a real)

datatype:
    datatype exp
      = Id of string
      | ItoR of exp            (explicit coercion)
      | DeclI of string * exp
      | DeclR of string * exp
      | Op of exp * string * exp;

Inherited Attribute Tree

var x : int in var y : real in (x + y) * x

[Figure: two syntax trees for this expression. "We have": var x : int, then var y : real, then * with children + (over x and y) and x. "We want": the same tree with an explicit Real coercion node inserted above each use of x.]

Inherited Computation

[Figure: the same tree with the inherited type environment shown at each node: [ ] at the outer var x, [ (x, int) ] at the inner var y, and [ (y, real), (x, int) ] at *, + and at each leaf x and y.]

Simultaneous Synthesized Computation

[Figure: the same tree with both attributes at each node. Every node below the declarations inherits [ (y, real), (x, int) ]. The leaves synthesize (int, Var(x)) and (real, Var(y)); + synthesizes (real, Op(I2R(Var(x)), "+", Var(y))); * synthesizes (real, Op(I2R(Var(x)), "*", Op(I2R(Var(x)), "+", Var(y)))).]

The Computation

The computation uses an extra parameter of type list(string * type) as the inherited attribute.

It returns a type * exp as the synthesized attribute.

We need a modified Op constructor that uses the type info to add the explicit I2R annotations.

fun AnnOp ( (t1,e1),oper,(t2,e2) ) =
  case (t1,t2) of
    (int,int)   => (int, Op(e1,oper,e2))
  | (real,int)  => (real, Op(e1,oper,I2R e2))
  | (int,real)  => (real, Op(I2R e1,oper, e2))
  | (real,real) => (real, Op(e1,oper,e2))

Here int and real stand for the two values of the type representation. In real ML code they would have to be datatype constructors, since a lowercase name in a pattern is a variable.

Algorithm

fun translate e types =
  case e of
    Var s => (lookup s types,Var s)
  | DeclI(s,e) =>
      let val (t,e') = translate e ( (s,int) :: types )
      in (t, DeclI(s,e')) end
  | DeclR(s,e) =>
      let val (t,e') = translate e ( (s,real) :: types )
      in (t, DeclR(s,e')) end
  | Op(x,s,y) =>
      let val xv = translate x types
          val yv = translate y types
      in AnnOp(xv,s,yv) end

Note how the extra parameter types is used to add information and pass it "down" the tree. Each case returns a (type * exp) pair, so the declaration cases rebuild the declaration around the translated body.
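The whole computation can be sketched as a runnable program. Several names are assumptions: TInt and TReal are hypothetical constructors standing in for int and real, lookup and Unbound are helpers the slides use without defining, and the syntax uses the Var and I2R constructors that appear in the algorithm.

```sml
(* Assumed names: TInt and TReal stand in for the slide's int and real. *)
datatype ty = TInt | TReal

datatype exp
  = Var of string
  | I2R of exp
  | DeclI of string * exp
  | DeclR of string * exp
  | Op of exp * string * exp

exception Unbound of string

(* Hypothetical helper: find the type of a name in the environment. *)
fun lookup s [] = raise Unbound s
  | lookup s ((x, t) :: rest) = if s = x then t else lookup s rest

(* Use the operand types to insert explicit I2R coercions. *)
fun annOp ((t1, e1), oper, (t2, e2)) =
  case (t1, t2) of
    (TInt, TInt) => (TInt, Op (e1, oper, e2))
  | (TReal, TInt) => (TReal, Op (e1, oper, I2R e2))
  | (TInt, TReal) => (TReal, Op (I2R e1, oper, e2))
  | (TReal, TReal) => (TReal, Op (e1, oper, e2))

(* types is the inherited attribute; the (ty * exp) pair is synthesized. *)
fun translate e types =
  case e of
    Var s => (lookup s types, Var s)
  | I2R x =>
      let val (_, x') = translate x types
      in (TReal, I2R x') end
  | DeclI (s, body) =>
      let val (t, body') = translate body ((s, TInt) :: types)
      in (t, DeclI (s, body')) end
  | DeclR (s, body) =>
      let val (t, body') = translate body ((s, TReal) :: types)
      in (t, DeclR (s, body')) end
  | Op (x, s, y) => annOp (translate x types, s, translate y types)

(* var x : int in var y : real in (x + y) * x *)
val source =
  DeclI ("x", DeclR ("y", Op (Op (Var "x", "+", Var "y"), "*", Var "x")))
val (resultType, annotated) = translate source []
```

For this source, both uses of x end up wrapped in I2R and the whole expression has type TReal.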

Overview

The key to writing successful attribute computations is thinking ahead.

First identify what the synthesized attributes are. Think of a type that will represent these. This is the return type of the computation.

Second identify what the inherited attributes are. For each one there will be an extra parameter.

The final type will be something like:

    syntaxtree -> inh1 -> inh2 -> (syn1 * syn2)

Tagging the tree

• Sometimes the synthesized attribute needs to be added to the tree rather than be returned.
• Again, thinking ahead is important.
  – The tree needs room for the extra attribute.
  – The tree used as input doesn't contain any interesting values as the attribute. The attributes in the input are usually ignored.

Rather than return the synthesized attribute, the program returns a new tree.

The final type will be something like:

    'a syntaxtree -> inh1 -> inh2 -> (syn1 * syn2) syntaxtree

Note the tag is originally some type that we ignore, but the output is a new tree with filled-in attribute (or attributes as a tuple).
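As a small instance of this shape, the sketch below tags every node of a binary tree with a pair (depth, sum). The tree type, the two attributes and all names are assumptions chosen for illustration: depth is passed down as an extra parameter (inherited), while sum is computed from the children (synthesized).

```sml
(* Hypothetical tree with room for an annotation 'a at every node. *)
datatype 'a tree
  = Leaf of 'a * int
  | Node of 'a * 'a tree * 'a tree

fun tagOf (Leaf (a, _)) = a
  | tagOf (Node (a, _, _)) = a

(* tag : 'a tree -> int -> (int * int) tree
   depth is the inherited attribute; the sum of the leaves is synthesized. *)
fun tag t depth =
  case t of
    Leaf (_, n) => Leaf ((depth, n), n)
  | Node (_, l, r) =>
      let
        val l' = tag l (depth + 1)
        val r' = tag r (depth + 1)
        val (_, sl) = tagOf l'
        val (_, sr) = tagOf r'
      in
        Node ((depth, sl + sr), l', r')
      end

(* The input annotations are unit values that are simply ignored. *)
val tagged = tag (Node ((), Leaf ((), 1), Node ((), Leaf ((), 2), Leaf ((), 3)))) 0
```

The root of tagged carries (0, 6) and its right child carries (1, 5).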