
A Brief Introduction to CLI/CLR. Massimo Ancona, DISI, Università di Genova

Texts: J. Gough, Compiling for the .NET Common Language Runtime (CLR), .NET Series, B. Meyer editor; J. Richter, CLR via C#, Microsoft Press

CLR - Common Language Runtime

The CLR has been designed with three objectives:

1. portability (write once, run anywhere),
2. reliability (make operations predictable),
3. reusability (object-orientation and parametric code [generics]).

GenCLI has the objective of meeting all three objectives above.

CLR 2

The CLR machine is composed of the CTS (the .NET Common Type System specification) and the CLR instructions. The CTS defines all possible data types and constructs supported by the .NET Run Time Environment (RTE), while the CLR instructions define a virtual stack-based machine.
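To make the idea of a virtual stack-based machine concrete, here is a minimal Python sketch that evaluates a short instruction sequence on an operand stack. The opcode names loosely imitate CIL mnemonics, but the tuple encoding and the `run` function are assumptions made for this illustration; they are not the CLR's actual instruction format.

```python
def run(program):
    """Evaluate a list of (opcode, operand) pairs on an operand stack."""
    stack = []
    for op, arg in program:
        if op == "ldc":        # push a constant onto the stack
            stack.append(arg)
        elif op == "add":      # pop two operands, push their sum
            b, a = stack.pop(), stack.pop()
            stack.append(a + b)
        elif op == "mul":      # pop two operands, push their product
            b, a = stack.pop(), stack.pop()
            stack.append(a * b)
        else:
            raise ValueError(f"unknown opcode: {op}")
    return stack.pop()

# (2 + 3) * 4 expressed as stack code
program = [("ldc", 2), ("ldc", 3), ("add", None), ("ldc", 4), ("mul", None)]
result = run(program)
```

Operands never name registers: every instruction takes its inputs from the top of the stack and leaves its result there.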

Execution Model CLR 3

Code generators for .NET emit CIL (IL for short), either in the form of a text file for subsequent assembly, or directly into a file or memory buffer.

CIL code consists of instructions for a virtual machine and is always executed indirectly by means of a Just-In-Time compiler (JIT).

The JIT translates the IL instructions into machine code for the specific computer on which the program has to be executed.

Program executable modules, called assemblies, are usually demand-loaded and are just-in-time compiled (JIT-ed) at the time of loading.

Execution Model CLR 4

At load time each assembly is subject to some form of checking.

The execution engine is able to ensure that the assembly is memory-safe.

Programs that are intended to pass the checks [of verification] are said to be written in verifiable code.

Verifiable Code CLR 5

Verifiable code must conform to several requirements. First of all, dynamically allocated memory must be managed data. This means that all objects must be allocated from the garbage-collected heap, and must be self-describing.

The GC must be able to discern the exact type of the object from inspection of the object encoding.

Verifiable Code CLR 6

Operations on data must be performed in such a way that the verifier is able to statically prove that the operation is safe for the type of object. Method calls must pass arguments that conform to the statically specified method signature.

For most programming languages, not all programs can be translated into verifiable code. In such cases programmers who wish their program to pass verification must restrict themselves to a subset of the language.
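The kind of static proof described above can be sketched in Python as a checker that tracks types, not values, on a simulated operand stack. This is only an illustration for straight-line code: the opcode names, the `signatures` table, and the `verify` function are assumptions of this sketch and do not reproduce the CLR verifier's real rules.

```python
def verify(program, signatures):
    """Return True if every operation sees operands of the expected types."""
    stack = []  # holds type names, never runtime values
    for op, arg in program:
        if op == "ldc.i4":
            stack.append("int32")
        elif op == "ldstr":
            stack.append("string")
        elif op == "add":
            if len(stack) < 2 or stack[-1] != "int32" or stack[-2] != "int32":
                return False
            stack.pop()  # two int32 operands are replaced by one int32 result
        elif op == "call":
            params, ret = signatures[arg]
            n = len(params)
            # the top n stack entries must match the declared parameter types
            if len(stack) < n or stack[len(stack) - n:] != list(params):
                return False
            del stack[len(stack) - n:]
            if ret is not None:
                stack.append(ret)
        else:
            return False  # unknown instructions are rejected
    return True

# Hypothetical method table: name -> (parameter types, return type)
signatures = {"print_int": (("int32",), None)}
ok = verify([("ldc.i4", 1), ("ldc.i4", 2), ("add", None), ("call", "print_int")], signatures)
bad = verify([("ldstr", "hi"), ("call", "print_int")], signatures)
```

The second program is rejected before it ever runs, because a string would be passed where the signature demands an int32.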

Verifiable Code CLR 7

Programming constructs that can cause problems are, for example, union types (variant types) and pointer arithmetic.

As well as speaking of managed data we speak of managed code. Managed code is code that is executed by the CLR, as opposed to ordinary native code execution.

An erroneous address computation allows an arbitrary memory location to be overwritten.

Verifiable Code CLR 8

An erroneous address computation can be generated by:

• accessing a deallocated memory location
• accessing a non-existing array element
• treating a pointer of one type as another
• sending wrongly typed arguments to a function

Memory Safety by Design 0

How to design languages and RTEs for which every semantically correct source program may be compiled into a memory-safe executable program. One approach is to define a statically typed (or strongly typed) programming language, e.g. Modula-2. The .NET system provides a framework for memory-safe programming. There are a number of different aspects of .NET that contribute toward this outcome:

1. Dynamically allocated data in verifiable code is garbage collected.
2. Every datum is of known type at runtime.

Memory Safety by Design 1

Objects of reference type are allocated from a heap called the managed heap.

The managed heap is garbage collected, and the CLR provides instructions for managing it in a safe way. Value types are not allocated on the managed heap. However, an object of value type can be converted to a reference type by using the boxing mechanism: a copy of the object value is allocated on the managed heap and its address is returned as a reference type.
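The copy semantics of boxing can be modeled with a small Python sketch. Python has no value types, so `Point`, `Box`, `box`, and `unbox` are hypothetical names that only mimic the behavior described above: boxing copies the value into a separate heap object, and later changes to the original do not reach that copy.

```python
import copy
from dataclasses import dataclass

@dataclass
class Point:
    """Stands in for a value type in this model."""
    x: int
    y: int

class Box:
    """Heap object that holds its own private copy of a value."""
    def __init__(self, value):
        self.value = copy.copy(value)

def box(value):
    # a copy of the value is stored in a new object; the object is the reference
    return Box(value)

def unbox(boxed):
    # unboxing hands back a copy of the stored value
    return copy.copy(boxed.value)

p = Point(1, 2)
b = box(p)
p.x = 10        # modifying the original leaves the boxed copy untouched
q = unbox(b)
```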

Memory Safety by Design 2

How to design languages and run-times for which every semantically correct source program may be compiled into a memory-safe executable program.

The .NET execution engine is able to ensure that the generated code is safe by performing a verification process. It checks that every method is called with the correct number of parameters, and that each parameter passed is of the correct type.

Memory Safety by Design 3

In order to be safe, the generated code must allocate dynamic objects only as managed data on the managed heap, by means of specific CLR instructions. The code generated for .NET is always executed indirectly via a JIT (Just In Time Translator) that translates the code generated by a .NET compiler into native machine code, while safety checks are performed at load time, just before the JIT translation.

Memory Safety by Design 4

.NET resolves these problems by a combination of load-time and runtime checking.

The load-time verifier computes the types of all data used by the IL code of a program. This involves significant computations based on the control flow graph: the verifier checks that all data [...]. This involves access to multiple assemblies, because consistency of argument types between method caller and callee may cut across PEM boundaries.

CTS

CTS provides three sets of types:

• primitive types, managed by the compiler,
• reference types, allocated on the managed heap, and
• value types.

CTS Types Hierarchy


CTS 3

The CLS (Common Language Specification, a subset of the CTS) defines the requirements to be met by a language in order to be classified as a safe .NET language. Programs generated by such a compiler, in order to pass the verification process, must be written in verifiable code. Example: GenCLI generates only verifiable high-level IL, making the RPython compiler a de facto .NET compiler.

CTS Generics 1

The CTS allows the creation of generic reference types as well as generic value types. In addition, the CLR allows the creation of generic classes, interfaces, and delegates. Moreover, the CLR allows the creation of generic methods that are defined in a reference type, value type, or interface.

CTS Generics 2

Adding generics to the CLR made it necessary to:

• create new IL instructions that are aware of type arguments
• insert type names and methods with generic parameters in metadata tables
• modify languages, compilers and the JIT compiler to process the new type-argument-aware IL instructions.

CLR Assemblies 1

Combining managed modules into assemblies (p. 6). The CLR does not actually work with modules; it works with assemblies. An assembly is a logical grouping of one or more modules or resource files.

An assembly is the smallest unit of reuse, security and versioning. It supports the separation of types and resources into separate files used by users of the assembly.


Mapping Oberon-2 to CLR

The record types of Oberon-2 need to be mapped in some way to the class constructs of the CTS. Oberon-2 does not make a declarative distinction between value and reference aggregate types. Record types always have value semantics, and pointer types always have a reference semantics. Our choice is the following.


Mapping Oberon

One of the most relevant features of the CLR (.NET 2.0) is generics. With generics, it is now possible for the .NET languages to easily create type-safe, reusable code. The term generics means parameterized types.

A parameterized type is a class, interface, method, or delegate in which the type of data upon which it operates is specified as a parameter. A class, interface, method, or delegate that operates on a parameterized type is called a generic class, interface, method, or delegate.
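As a rough illustration of a parameterized type and a generic method, the following Python sketch uses the standard `typing` module. The `Stack` class and the `first` function are hypothetical examples. Note the difference from the CLR: Python type parameters are only hints for static checkers, while CLR generics are known to the runtime and to the JIT.

```python
from typing import Generic, List, TypeVar

T = TypeVar("T")  # the type parameter

class Stack(Generic[T]):
    """A generic class: the element type is specified as a parameter."""
    def __init__(self) -> None:
        self._items: List[T] = []

    def push(self, item: T) -> None:
        self._items.append(item)

    def pop(self) -> T:
        return self._items.pop()

def first(items: List[T]) -> T:
    """A generic function: works for a list of any element type."""
    return items[0]

ints: Stack[int] = Stack()   # Stack instantiated with T = int
ints.push(1)
ints.push(2)
top = ints.pop()
head = first(["a", "b"])     # T is inferred as str
```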

Mapping Oberon-2 to CLR

Record types that are neither extensible [i.e., heirless] nor extensions of another type are implemented as value classes. If a program declares a pointer to such a record type, the pointer type is implemented as a reference class with a single field of the type of the value class. This reference class is an explicit boxed occurrence of the embedded value class. It has at least one advantage over the automatically boxed classes manipulated by the “box” and “unbox” instructions: in this case we may access the fields of the boxed value without unboxing.

CTS X+1

Procedures that are bound to such a record type [equivalent to a method in Oberon-2] are implemented as (non-virtual) instance methods of the value class.

Procedures bound to a type that is a pointer to the record are implemented as (non-virtual) instance methods of the explicitly boxed class:

MODULE ValCls;
  IMPORT CPmain;
  TYPE
    RecTyp = RECORD c: CHAR END;
    PtrTyp = POINTER TO RecTyp;

  PROCEDURE (IN r: RecTyp) Foo(), NEW;
  END Foo;

  PROCEDURE (r: PtrTyp) Bar(), NEW;
  END Bar;

END ValCls.

There is an interesting artefact of this design. Procedures [methods] bound to the record type and to the pointer-to-record type are bound to the same underlying type in the source semantics, but are bound to separate types in the implementation. It seems curious, but no ambiguity can arise [example].

PL0 29

(* table of single-character symbols *)
ssym['+']:=plus;    ssym['-']:=minus;
ssym['*']:=times;   ssym['/']:=slash;
ssym['(']:=lparen;  ssym[')']:=rparen;
ssym['=']:=eql;     ssym[',']:=comma;
ssym['.']:=period;  ssym['#']:=neq;
ssym['<']:=lss;     ssym['>']:=gtr;
ssym['%']:=leq;     ssym['@']:=geq;
ssym['<']:=lss;     ssym['>']:=gtr;
ssym[';']:=semicolon;

PL0 30

(* mnemonics of the target instructions *)
mnemonic[lit]:='LIT '; mnemonic[opr]:='OPR ';
mnemonic[lod]:='LOD '; mnemonic[sto]:='STO ';
mnemonic[cal]:='CAL '; mnemonic[int]:='INT ';
mnemonic[jmp]:='JMP '; mnemonic[jpc]:='JPC ';
(* symbol sets that can begin a declaration, a statement, a factor *)
declbegsys:=[constsym,varsym,procsym];
statbegsys:=[beginsym,callsym,ifsym,whilesym];
facbegsys:=[ident,number,lparen];
RESET(in,'pl0','pgm'); err:=0;
cc:=0; ll:=0; ch:=' '; kk:=al;
REWRITE(cout,'PL0','asm');
getsym;
mysys:=[period]+declbegsys+statbegsys;
block(0,0,mysys(*[period]+declbegsys+statbegsys*));
WRITELN('END COMPILATION');
IF sym<>period THEN error(9) FI;
WRITECODE; CLOSE(cout);
IF err = 0 THEN WRITE('CICCIO'); interpret
ELSE WRITE('Errors IN PL/0 PROGRAM') FI;
WRITELN
END.

Hendren93register

Interference graph G=(V,E) (Chaitin). Each vertex in G corresponds to a live range of a program variable.

An edge joins two vertices of the graph if they interfere, that is, if the corresponding live ranges overlap in time. More precisely, one is live at a definition point of the other.

Hen 93


Hen 93

Definition. An interval graph (intersection graph) $G = (V, E)$ (IG = Interval Graph) is defined by a set of intervals on the line as follows:

• to each interval $I_v$ a vertex $v$ of $V$ is associated;
• there is an edge $e = (v, w) \in E$ exactly when the intervals $I_v$ and $I_w$, associated with $v$ and $w$ respectively, have non-empty intersection: $I_v \cap I_w \neq \emptyset$.
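The definition translates almost directly into code. The sketch below builds the edge set of an interval graph from named closed intervals. The function name `interval_graph`, the dictionary encoding, and the sample live ranges are assumptions chosen for this illustration.

```python
from itertools import combinations

def interval_graph(intervals):
    """intervals: dict name -> (start, end), closed intervals with start <= end.
    Returns the set of edges (v, w), with v < w, whose intervals intersect."""
    edges = set()
    for (v, (a1, b1)), (w, (a2, b2)) in combinations(sorted(intervals.items()), 2):
        # two closed intervals intersect when the larger start
        # does not exceed the smaller end
        if max(a1, a2) <= min(b1, b2):
            edges.add((v, w))
    return edges

# Hypothetical live ranges of four variables
live = {"a": (0, 4), "b": (2, 6), "c": (5, 8), "d": (9, 10)}
edges = interval_graph(live)
```

Here `a` overlaps `b` and `b` overlaps `c`, while `d` overlaps nothing, so `d` is an isolated vertex.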

Hen 93

A vertex of the graph has degree k if it has k neighboring vertices (directly connected to it). Chaitin's method colors the graph with m colors with the property that two adjacent vertices have different colors.

A coloring of the interference graph with k colors defines a feasible solution with k registers.
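A coloring of this kind can be sketched in Python in the spirit of Chaitin's simplify-and-select scheme: repeatedly remove a vertex with fewer than k neighbors, then color the vertices in reverse order of removal. The function `color_graph` and the sample graph are assumptions of this sketch. Spilling is deliberately left out, so the function just returns `None` when no vertex can be removed.

```python
def color_graph(adj, k):
    """adj: dict vertex -> set of neighbors (symmetric).
    Returns dict vertex -> color in range(k), or None if simplification gets stuck."""
    remaining = {v: set(ns) for v, ns in adj.items()}
    order = []
    while remaining:
        # simplify: pick a vertex with fewer than k remaining neighbors
        candidate = next((v for v in sorted(remaining) if len(remaining[v]) < k), None)
        if candidate is None:
            return None  # a real allocator would choose a spill candidate here
        for n in remaining.pop(candidate):
            remaining[n].discard(candidate)
        order.append(candidate)
    colors = {}
    for v in reversed(order):
        # select: fewer than k neighbors are already colored, so a color is free
        used = {colors[n] for n in adj[v] if n in colors}
        colors[v] = min(c for c in range(k) if c not in used)
    return colors

# Triangle a-b-c plus a vertex d attached to c
adj = {"a": {"b", "c"}, "b": {"a", "c"}, "c": {"a", "b", "d"}, "d": {"c"}}
three = color_graph(adj, 3)
two = color_graph(adj, 2)
```

With three colors (registers) the graph is colorable; with two, the triangle blocks simplification and the sketch gives up.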