A New CC Course –
Based on Cooperation with Linz
Vladimir Kurbalija, Mirjana Ivanović
Department of Mathematics and Informatics
University of Novi Sad
Serbia and Montenegro
Agenda
Current CC Course in Novi Sad
  Lectures
  Exercises
CC Course in Linz
  Lectures
  Exercises
Conclusion
V. Kurbalija, M. Ivanović
A New CC Course - Based on
Cooperation with Linz
CC Course in Novi Sad
One of the core software courses in the Computer Science study programmes
7th semester: CC1 (obligatory); 8th semester: CC2 (elective); for students of
  Computer Science,
  Business Computer Science,
  Teaching of Computer Science

CC Course in Novi Sad - Lectures
Practical approach
Development of a Pascal- compiler
Subset of the Pascal language:
  data types: Boolean and integer standard types, arrays and fixed records as structured types;
  basic statements: assignment statement, procedure call, if and while statements;
  standard input/output (read and write) procedures, user-defined procedures including recursion.

CC Course in Novi Sad - Lectures
The implementation:
  recursive-descent syntax analysis
  code generation for an abstract P-machine
This approach was interesting 10 years ago
Now that new languages and tools have appeared, we need modernisation!

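The implementation technique named on this slide (recursive-descent syntax analysis with code generation for a stack machine) can be illustrated with a minimal sketch. The toy expression grammar and the mnemonics LIT, ADD and MUL are assumptions made for illustration; they are neither the course's Pascal subset nor the actual P-machine instruction set.

```java
import java.util.ArrayList;
import java.util.List;

// Toy grammar (an assumption for illustration):
//   Expr   = Term   { "+" Term }.
//   Term   = Factor { "*" Factor }.
//   Factor = number | "(" Expr ")".
class ExprCompiler {
    private final String src;
    private int pos = 0;
    private final List<String> code = new ArrayList<>();

    private ExprCompiler(String src) { this.src = src.replace(" ", ""); }

    // Compiles an expression into instructions for a hypothetical stack machine.
    static List<String> compile(String src) {
        ExprCompiler c = new ExprCompiler(src);
        c.expr();
        if (c.pos != c.src.length())
            throw new IllegalArgumentException("unexpected input at " + c.pos);
        return c.code;
    }

    // Look-ahead character, or '\0' at the end of the input.
    private char peek() { return pos < src.length() ? src.charAt(pos) : '\0'; }

    // One parsing method per nonterminal; code is emitted after both operands.
    private void expr() {
        term();
        while (peek() == '+') { pos++; term(); code.add("ADD"); }
    }

    private void term() {
        factor();
        while (peek() == '*') { pos++; factor(); code.add("MUL"); }
    }

    private void factor() {
        if (peek() == '(') {
            pos++;
            expr();
            if (peek() != ')') throw new IllegalArgumentException("')' expected at " + pos);
            pos++;
        } else if (Character.isDigit(peek())) {
            int start = pos;
            while (Character.isDigit(peek())) pos++;
            code.add("LIT " + src.substring(start, pos));
        } else {
            throw new IllegalArgumentException("number or '(' expected at " + pos);
        }
    }
}
```

Each grammar rule maps to one method, so operator precedence falls out of the call structure: in `1 + 2 * 3` the multiplication is emitted before the addition.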
CC Course in Novi Sad - Exercises
Students repeat and practise the practical skills gained during lectures
Several independent tasks on small grammars
The implementation language is Modula-2
The compiler generator is Coco/R

CC Course in Novi Sad - Exercises
Tasks:
  Lexical and syntax analysis and some parts of semantic analysis using Coco/R
  “Hand-written” parsers (lexical and syntax analysis)
  “Hand-written” parsers with semantic analysis and, rarely, with code generation or interpretation
  Some algorithms on grammars (memory organisation, consistency checking, computing first and follow sets, ...)

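One of the grammar algorithms listed above, computing first sets, can be sketched as a fixed-point iteration. The grammar representation (a map from nonterminal to alternatives), the class name `FirstSets` and the marker string `eps` for the empty word are assumptions for illustration, not the form used in the course exercises.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

class FirstSets {
    static final String EPS = "eps"; // marks the empty word (assumed notation)

    // grammar: nonterminal -> alternatives; each alternative is a list of symbols.
    // A symbol is a nonterminal iff it is a key of the map; an empty list is an epsilon alternative.
    static Map<String, Set<String>> compute(Map<String, List<List<String>>> grammar) {
        Map<String, Set<String>> first = new HashMap<>();
        for (String nt : grammar.keySet()) first.put(nt, new TreeSet<>());

        boolean changed = true;
        while (changed) { // repeat until no set grows any more
            changed = false;
            for (Map.Entry<String, List<List<String>>> rule : grammar.entrySet()) {
                Set<String> target = first.get(rule.getKey());
                for (List<String> alt : rule.getValue()) {
                    boolean allNullable = true;
                    for (String sym : alt) {
                        if (!grammar.containsKey(sym)) { // terminal: it starts the alternative
                            changed |= target.add(sym);
                            allNullable = false;
                            break;
                        }
                        // nonterminal: copy its first set (without eps)
                        for (String t : new ArrayList<>(first.get(sym)))
                            if (!t.equals(EPS)) changed |= target.add(t);
                        if (!first.get(sym).contains(EPS)) { allNullable = false; break; }
                    }
                    if (allNullable) changed |= target.add(EPS);
                }
            }
        }
        return first;
    }
}
```

For the grammar E = T E', E' = "+" T E' | eps, T = id | "(" E ")", this yields First(E) = First(T) = {(, id} and First(E') = {+, eps}.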
CC Course in Linz - Lectures
One-semester course held in English
The same course is also taught at the University of Oxford
Goals of the course:
  acquire the practical skills to write a simple compiler for an imperative programming language
  understand the concepts of scanning, parsing, name management in nested scopes, and code generation
  learn to transfer these skills to general software engineering tasks

CC Course in Linz - Lectures
More theoretical approach
The course goes through all phases of compiler writing
It shows the theoretical concepts underlying each phase as well as how to implement it efficiently
Examples: a MicroJava compiler written in Java; the target language is a subset of Java bytecode
Example (05.SymbolTable.ppt, slides 27-36)

5. Symbol Table
5.1 Overview
5.2 Objects
5.3 Scopes
5.4 Types
5.5 Universe

Types
Every object has a type with the following properties
• size (in MicroJava always 4 bytes)
• structure (fields for classes, element type for arrays, ...)
Kinds of types in MicroJava?
• primitive types (int, char)
• arrays
• classes
Types are represented by structure nodes
class Struct {
    static final int  // type kinds
        None = 0, Int = 1, Char = 2, Arr = 3, Class = 4;
    int    kind;      // None, Int, Char, Arr, Class
    Struct elemType;  // Arr: element type
    int    nFields;   // Class: number of fields
    Obj    fields;    // Class: list of fields
}

Structure Nodes for Primitive Types
int a, b;
char c;
[Figure: three object nodes (fields kind, name, type, next, val, adr, level, nPars, locals) for Var "a", Var "b" and Var "c", and two structure nodes (fields kind, elemType, nFields, fields) for Int and Char]
There is just one structure node for int in the whole symbol table.
It is referenced by all objects of type int.
The same is true for structure nodes of type char.
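A minimal sketch of how object nodes can share one structure node per primitive type. The class names follow the slide, but the wrapper class `SymbolTableSketch`, the method `declareVar`, the numeric value of `Obj.Var` and the address assignment are assumptions made for illustration, not the course's actual implementation.

```java
class SymbolTableSketch {
    // Structure node: describes a type (only the kind is needed here).
    static class Struct {
        static final int None = 0, Int = 1, Char = 2, Arr = 3, Class = 4; // as on the slide
        final int kind;
        Struct(int kind) { this.kind = kind; }
    }

    // Object node: describes a declared name.
    static class Obj {
        static final int Var = 1; // numeric value is an assumption
        final int kind;
        final String name;
        final Struct type;
        Obj next;   // next object in the same scope
        int adr;    // address of the variable
        int level;  // 0 = global
        Obj(int kind, String name, Struct type) {
            this.kind = kind;
            this.name = name;
            this.type = type;
        }
    }

    // Exactly one structure node per primitive type, shared by all objects of that type.
    static final Struct intType = new Struct(Struct.Int);
    static final Struct charType = new Struct(Struct.Char);

    private Obj first, last;
    private int nextAdr = 0;

    // Appends a global variable to the object list and assigns the next free address.
    Obj declareVar(String name, Struct type) {
        Obj o = new Obj(Obj.Var, name, type);
        o.adr = nextAdr++;
        o.level = 0;
        if (first == null) first = o; else last.next = o;
        last = o;
        return o;
    }
}
```

After declaring `a` and `b` with `intType` and `c` with `charType`, `a.type == b.type` holds, which is what makes a simple reference comparison sufficient for checking primitive types.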
Structure Nodes for Arrays
int[] a;
int b;
[Figure: object nodes (fields kind, name, type, next, val, adr, level, nPars, locals) for Var "a" and Var "b"; structure nodes (fields kind, elemType, nFields, fields) for Arr and Int]
The length of an array is statically unknown.
It is stored in the array at run time.
Structure Nodes for Classes
class C {
    int x;
    int y;
    int z;
}
C v;
[Figure: object nodes for Type "C" and Var "v"; a Class structure node with nFields = 3 whose fields list holds the object nodes Var "x", Var "y" and Var "z"; the Int structure node]

Type Compatibility: Name Equivalence
Two types are the same if they are represented by the same type node
(i.e. if they are denoted by the same type name)
class T {...}
T a;
T b;
[Figure: object nodes for Type "T", Var "a" and Var "b", and one Class structure node]
The types of a and b are the same
Name equivalence is used in Java, C/C++/C#, Pascal, ..., MicroJava
Exception
In Java (and MicroJava) two array types are the same if they have the same element types!
Type Compatibility: Structural Equivalence
Two types are the same if they have the same structure
(i.e. the same fields of the same types, the same element type, ...)
class T1 { int a, b; }
class T2 { int c, d; }
T1 x;
T2 y;
[Figure: object nodes for Type "T1", Var "x", Type "T2" and Var "y"; two Class structure nodes, each with nFields = 2, with the field objects Var "a", Var "b" and Var "c", Var "d"; the Int structure node]
The types of x and y are the same (but not in MicroJava!)
Structural equivalence is used in Modula-3, but not in MicroJava or in most other languages!
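The difference between the two notions can be sketched in a few lines. The simplified `Type` class below (a name plus a list of field types) and the class name `EquivalenceSketch` are assumptions for illustration; the sketch also assumes non-recursive types, since a recursive class would make the structural check loop forever.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

class EquivalenceSketch {
    // Simplified type node: a primitive, or a class with an ordered list of field types.
    static final class Type {
        final String name;
        final boolean isClass;
        final List<Type> fieldTypes;

        private Type(String name, boolean isClass, List<Type> fieldTypes) {
            this.name = name;
            this.isClass = isClass;
            this.fieldTypes = fieldTypes;
        }
        static Type primitive(String name) {
            return new Type(name, false, Collections.<Type>emptyList());
        }
        static Type classOf(String name, Type... fieldTypes) {
            return new Type(name, true, Arrays.asList(fieldTypes));
        }
    }

    // Name equivalence: the same type node, i.e. the same declaration.
    static boolean nameEquivalent(Type a, Type b) {
        return a == b;
    }

    // Structural equivalence: same shape, field by field (field names are ignored).
    static boolean structurallyEquivalent(Type a, Type b) {
        if (a == b) return true;
        if (!a.isClass || !b.isClass) return false; // two distinct primitives
        if (a.fieldTypes.size() != b.fieldTypes.size()) return false;
        for (int i = 0; i < a.fieldTypes.size(); i++)
            if (!structurallyEquivalent(a.fieldTypes.get(i), b.fieldTypes.get(i))) return false;
        return true;
    }
}
```

With `T1 = classOf("T1", intT, intT)` and `T2 = classOf("T2", intT, intT)`, the structural check succeeds while the name check fails, matching the T1/T2 example on the slide.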
Methods for Checking Type Compatibility
class Struct {
    ...
    // checks if two types are compatible (e.g. in compare operations)
    public boolean compatibleWith (Struct other) {
        return this.equals(other)
            || (this == Tab.nullType && other.isRefType())
            || (other == Tab.nullType && this.isRefType());
    }
    // checks if "this" is assignable to "other"
    public boolean assignableTo (Struct other) {
        return this.equals(other)
            || (this == Tab.nullType && other.isRefType());
    }
    // checks if two types are the same (structural equivalence for arrays, name equivalence otherwise)
    public boolean equals (Struct other) {
        if (kind == Arr)
            // the Tab.noType case is necessary because of the standard function len(arr)
            return other.kind == Arr && (other.elemType == elemType || other.elemType == Tab.noType);
        else
            return other == this;
    }
    public boolean isRefType() {
        return kind == Class || kind == Arr;
    }
}

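The methods above can be tried out in a self-contained sketch. The wrapper class `TypeCheckSketch`, the constructors, and the way `noType` and `nullType` are created are assumptions for illustration (on the slide they live in a class `Tab` that is not shown).

```java
class TypeCheckSketch {
    static class Struct {
        static final int None = 0, Int = 1, Char = 2, Arr = 3, Class = 4;
        final int kind;
        final Struct elemType; // only used for arrays

        Struct(int kind) { this(kind, null); }
        Struct(int kind, Struct elemType) { this.kind = kind; this.elemType = elemType; }

        boolean isRefType() { return kind == Class || kind == Arr; }

        // Structural equivalence for arrays, name (node identity) equivalence otherwise.
        boolean equals(Struct other) {
            if (kind == Arr)
                return other.kind == Arr
                    && (other.elemType == elemType || other.elemType == noType);
            return other == this;
        }

        // "this" may be assigned to a variable of type "other".
        boolean assignableTo(Struct other) {
            return this.equals(other) || (this == nullType && other.isRefType());
        }

        // The two types may be compared with each other.
        boolean compatibleWith(Struct other) {
            return this.equals(other)
                || (this == nullType && other.isRefType())
                || (other == nullType && this.isRefType());
        }
    }

    // Assumed stand-ins for Tab.noType and Tab.nullType.
    static final Struct noType = new Struct(Struct.None);
    static final Struct nullType = new Struct(Struct.Class);
    static final Struct intType = new Struct(Struct.Int);
}
```

Two separately created `int[]` nodes compare as equal, two separately created class nodes do not, and `nullType` is assignable to any array or class type but not to `intType`.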
Solving LL(1) Conflicts with the Symbol Table
Method syntax in MicroJava
void foo()
    int a;
{ a = 0; ...
}
Actually we would like to write it like this
void foo() {
    int a;
    a = 0; ...
}
But this would result in an LL(1) conflict
First(VarDecl) ∩ First(Statement) = {ident}
Block      = "{" { VarDecl | Statement } "}".
VarDecl    = Type ident {"," ident}.
Type       = ident ["[" "]"].
Statement  = Designator "=" Expr ";"
           | ... .
Designator = ident {"." ident | "[" Expr "]"}.

Solving the Conflict With Semantic Information
Block = "{" { VarDecl | Statement } "}".
private static void Block() {
    check(lbrace);
    for (;;) {
        if (NextTokenIsType()) VarDecl();
        else if (sym ∈ First(Statement)) Statement();
        else if (sym ∈ {rbrace, eof}) break;
        else {
            error("..."); ... recover ...
        }
    }
    check(rbrace);
}
private static boolean NextTokenIsType() {
    if (sym != ident) return false;
    Obj obj = Tab.find(la.str);
    return obj.kind == Obj.Type;
}

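The idea of consulting the symbol table to choose between VarDecl and Statement can be made runnable in a reduced form. In this sketch the token stream is a list of strings, the symbol table is a plain map, and VarDecl and Statement are simplified to `Type ident ";"` and `ident "=" value ";"`; all of these, including the class name `BlockParserSketch`, are assumptions for illustration, not MicroJava's real grammar or the course framework.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class BlockParserSketch {
    enum Kind { TYPE, VAR }

    private final List<String> tokens;
    private int pos = 0;
    private final Map<String, Kind> table = new HashMap<>();
    int varDecls = 0, statements = 0;

    BlockParserSketch(List<String> tokens) {
        this.tokens = tokens;
        table.put("int", Kind.TYPE);  // predeclared type names
        table.put("char", Kind.TYPE);
    }

    // Look-ahead token, or "<eof>" at the end of the input.
    private String la() { return pos < tokens.size() ? tokens.get(pos) : "<eof>"; }

    private void check(String expected) {
        if (!la().equals(expected))
            throw new IllegalStateException(expected + " expected, found " + la());
        pos++;
    }

    private static boolean isIdent(String t) {
        return !t.isEmpty() && Character.isLetter(t.charAt(0));
    }

    // Semantic predicate: an identifier starts a VarDecl only if it names a type.
    private boolean nextTokenIsType() {
        return isIdent(la()) && table.get(la()) == Kind.TYPE;
    }

    // Block = "{" { VarDecl | Statement } "}".
    void block() {
        check("{");
        while (!la().equals("}")) {
            if (nextTokenIsType()) varDecl();
            else if (isIdent(la())) statement();
            else throw new IllegalStateException("unexpected token " + la());
        }
        check("}");
    }

    // Simplified: VarDecl = Type ident ";".
    private void varDecl() {
        pos++; // the type name, already verified by the predicate
        String name = la();
        if (!isIdent(name)) throw new IllegalStateException("identifier expected, found " + name);
        pos++;
        table.put(name, Kind.VAR);
        check(";");
        varDecls++;
    }

    // Simplified: Statement = ident "=" value ";" where value is a single token.
    private void statement() {
        String name = la();
        if (table.get(name) != Kind.VAR)
            throw new IllegalStateException("undeclared variable " + name);
        pos++;
        check("=");
        pos++; // the value
        check(";");
        statements++;
    }
}
```

Parsing the tokens `{ int a ; a = 0 ; }` takes the VarDecl branch for `int` and the Statement branch for `a`, although both alternatives start with an identifier.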
CC Course in Linz – Exercises
Students should acquire practical skills in compiler writing
One (big) project divided into smaller subtasks
Students write a small compiler for a Java-like language, MicroJava
The implementation language is also Java

CC Course in Linz – Exercises
The project consists of three levels:
  Level 1: implementation of a scanner and a parser for the language MicroJava, including error handling
  Level 2: symbol table handling and type checking
  Level 3: code generation for the MicroJava VM

CC Course in Linz – Exercises
Study material:
  Teaching material (slides)
  Description of the project
  Specification of the MicroJava language (tokens, language grammar, semantic and context constraints)
  Specification of the MicroJava virtual machine (similar to, but simpler than, the Java VM): memory layout and instruction set
  Specification of the object file format

CC Course in Linz – Exercises
Sources:
  Frameworks of all classes needed for the compiler
  Complete implementation of the MicroJava VM
  Samples of MicroJava programs (for testing)

program P
{
    void main()
        int i;
    {
        i = 0;
        while (i < 5) {
            print(i);
            i = i + 1;
        }
    }
}

program Eratos
    char[] sieve;
    int max;    // maximum prime to be found
    int npp;    // numbers per page
{
    void put(int x)
    {
        if (npp == 10) {print(chr(13)); print(chr(10)); npp = 0;}
        print(x, 5);
        npp++;
    }

    void found(int x)
        int i;
    {
        put(x);
        i = x;
        while (i <= max) {sieve[i] = 'o'; i = i + x;}
    }

    void main()
        int i;
    {
        read(max);
        npp = 0;
        sieve = new char[max+1];
        i = 0;
        while (i <= max) {sieve[i] = 'x'; i++;}
        i = 2;
        while (i <= max) {
            found(i);
            while(i <= max && sieve[i] == 'o') i++;
        }
    }
}//test

Conclusion
The content of both courses is similar
The approach is slightly different. In the new course:
  Lectures are purely theoretical
  Exercises are more practical
  A newer programming language (Java) is used instead of Modula-2
  The target language (for code generation) is a subset of Java bytecode instead of P-code

Conclusion
Advantages of the new course:
  Concepts of compiler construction are presented in a more formal way
  A modern, object-oriented language is used (Java and Java bytecode)
  Students autonomously write (almost) the whole compiler
Disadvantages of the new course:
  Several language extensions have to be defined for each student, because students like to cooperate when solving tasks ☺

Thank you for your attention