Motivation - OIT CSET: Welcome to CSET!

Structure of Compilers
[Figure: compiler pipeline]
Modified Source Program -> Lexical Analyzer (scanner) -> Tokens -> Parser -> Syntactic Structure -> Semantic Analysis -> Intermediate Representation -> Optimizer -> Code Generator -> Target machine code
Symbol Table (consulted by the phases)
Binding
• Binding - an association, such as between
an operation and a symbol.
• Binding time - when does the
association take place?
– Design time
– Compile time
– Link time
– Runtime
Binding time
• Example:
int count;
count = count + 5;
• Type of count is bound at compile time.
• Value of count is bound at execution time.
• Meaning of + is bound at compile time.
• Possible types for count and the set of possible meanings
for + are bound at language design time; the internal
representation of 5 is bound at compiler design time.
Binding
• Static
– It occurs before runtime and remains
unchanged throughout program execution.
• Dynamic
– It occurs at runtime and can change in the
course of program execution.
Type Binding
• Before a variable can be referenced in a
program, it must be bound to a data type.
• Two important questions to ask:
1. How is the type specified?
• Explicit Declaration
• Implicit Declaration
– Fortran: variable names that start with the letters I through N are
integer, real otherwise
– Perl: @name is an array, %something is a hash structure
• Determined by context and value
Type Binding
2. When does the binding take place?
– Explicit declaration (static)
– Implicit declaration (static)
– Determined by context and value (dynamic)
• Dynamic Type Binding
– When a variable gets a value, the type of the
variable is determined right there and then.
Dynamic Type Binding
• Specified through an assignment statement
(setq x '(1 2 3)) <== x becomes a list
(setq x 'a) <== x becomes an atom
• Advantage:
– flexibility (generic program units)
• Disadvantages:
1. High cost (dynamic type checking and interpretation)
2. Type error detection by the compiler is difficult
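The cost side of dynamic type binding can be made concrete with a minimal C sketch. C itself binds types statically, so the sketch simulates a dynamically typed variable with a tagged record; the names DynVar, set_atom, set_list, and list_length are hypothetical and chosen only for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical tags for the two kinds of value used on the slide. */
typedef enum { TYPE_UNBOUND, TYPE_ATOM, TYPE_LIST } TypeTag;

/* A dynamically typed variable: the type tag travels with the value. */
typedef struct {
    TypeTag tag;
    const char *atom;   /* valid when tag == TYPE_ATOM */
    const int *items;   /* valid when tag == TYPE_LIST */
    size_t count;       /* number of items when tag == TYPE_LIST */
} DynVar;

/* Assignment rebinds both the value and the type. */
void set_atom(DynVar *v, const char *name) {
    assert(v != NULL);
    v->tag = TYPE_ATOM;
    v->atom = name;
    v->items = NULL;
    v->count = 0;
}

void set_list(DynVar *v, const int *items, size_t count) {
    assert(v != NULL);
    v->tag = TYPE_LIST;
    v->atom = NULL;
    v->items = items;
    v->count = count;
}

/* Every use pays for a run-time type check.
   Returns -1 when the variable is not currently bound to a list. */
long list_length(const DynVar *v) {
    assert(v != NULL);
    if (v->tag != TYPE_LIST) {
        return -1;  /* type error, detectable only at run time */
    }
    return (long)v->count;
}
```

Calling set_list and then set_atom on the same DynVar mirrors the two assignments on the slide: the same variable is first a list and then an atom, and list_length can only report the mismatch while the program runs.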
Type Inference
• Rather than by assignment statement, types are
determined from the context of the reference.
– Some languages don’t support assignment
statements
• Example:
(define (someFunction m n) .... )
(someFunction 'a 'b) <== m becomes an atom
(someFunction '(a b c) '(1 2 3)) <== m becomes a list
Storage Bindings
• Allocation
– getting a cell from some pool of available cells
• Deallocation
– putting a cell back into the pool
• The lifetime of a variable is the time during
which it is bound to a particular memory
cell.
Variable lifetime
• Static
– bound to memory cells before execution
begins and remains bound to the same
memory cell throughout execution.
– C’s static variables
• Advantage:
– efficiency (direct addressing),
– history-sensitive subprogram support
• Disadvantage:
– lack of flexibility (no recursion)
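History sensitivity is easiest to see in a small C sketch. The function name next_ticket is hypothetical; the point is only that a static local keeps its memory cell, and therefore its value, between calls.

```c
#include <assert.h>

/* Hypothetical example: hands out consecutive ticket numbers. */
int next_ticket(void) {
    static int issued = 0;  /* bound to one memory cell before execution begins */
    issued += 1;            /* the value survives from one call to the next */
    assert(issued > 0);     /* invariant: at least one ticket has been issued */
    return issued;
}
```

Three successive calls return 1, 2, and 3, because all of them share the single cell bound to issued.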
Variable lifetime
• Stack-dynamic
– Storage bindings are created for variables when
their declaration statements are elaborated.
– If scalar, all attributes except address are
statically bound.
– e.g. local variables in Pascal and C subprograms
• Advantage:
– allows recursion; conserves storage
• Disadvantages:
– Overhead of allocation and deallocation
– Subprograms cannot be history sensitive
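Both properties of stack-dynamic variables can be sketched in C. The function names factorial and count_calls are hypothetical: the first relies on each activation getting its own locals, the second shows why such locals cannot remember anything between calls.

```c
#include <assert.h>

/* Each activation of factorial gets fresh stack-dynamic cells for n and result. */
unsigned long factorial(unsigned int n) {
    unsigned long result;   /* storage bound when this activation starts */
    assert(n <= 12);        /* 12! still fits in 32 bits, the minimum width of unsigned long */
    if (n <= 1) {
        result = 1;
    } else {
        result = n * factorial(n - 1);  /* the callee's n is a different cell */
    }
    return result;          /* storage is released when the activation ends */
}

/* Not history sensitive: calls is recreated and reset on every call. */
int count_calls(void) {
    int calls = 0;
    calls += 1;
    return calls;
}
```

However many times count_calls is invoked, it returns 1, which is the loss of history sensitivity named in the list of disadvantages.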
Variable Lifetime
• Explicit heap-dynamic
– Allocated and deallocated by explicit directives,
specified by the programmer, which take effect
during execution.
– Referenced only through pointers or references
– e.g. dynamic objects in C++ (via new and delete)
– all objects in Java
• Advantage:
– provides for dynamic storage management
• Disadvantage:
– inefficient and unreliable
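In C the explicit directives are malloc and free. The sketch below uses hypothetical helper names (make_sequence, sum_and_release) to show storage that exists only between an explicit allocation and an explicit deallocation and is reachable only through a pointer.

```c
#include <assert.h>
#include <stdlib.h>

/* Allocates n ints on the heap and fills them with 0..n-1.
   Returns NULL if n is 0 or the allocation fails; the caller owns the result. */
int *make_sequence(size_t n) {
    if (n == 0) {
        return NULL;
    }
    int *cells = malloc(n * sizeof *cells);  /* explicit allocation at run time */
    if (cells == NULL) {
        return NULL;
    }
    for (size_t i = 0; i < n; i++) {
        cells[i] = (int)i;
    }
    return cells;                            /* reachable only through this pointer */
}

/* Sums the n cells and then releases them. */
int sum_and_release(int *cells, size_t n) {
    assert(cells != NULL);
    int total = 0;
    for (size_t i = 0; i < n; i++) {
        total += cells[i];
    }
    free(cells);  /* explicit deallocation; any later use of cells is an error */
    return total;
}
```

The unreliability mentioned above comes from exactly this protocol: forgetting the free leaks the cells, and using the pointer after the free is undefined behavior that the compiler does not catch.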
Variable lifetime
• Implicit heap-dynamic
– Allocation and deallocation caused by
assignment statements
– e.g. Lisp variables
• Advantage:
– flexibility
• Disadvantages:
– Inefficient, because all attributes are dynamic
– Loss of error detection
Type Checking
• Generalize the concept of operands and
operators to include subprograms and
assignments
• Type checking is the activity of ensuring that the
operands of an operator are of compatible types
• A compatible type is one that is either legal for
the operator, or is allowed under language rules
to be implicitly converted, by compiler-generated
code, to a legal type. This automatic conversion
is called a coercion.
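A short C sketch of coercion, using hypothetical function names. C's usual arithmetic conversions coerce an int operand to double when the other operand is double; when both operands are int, no coercion happens until the result is converted for the return.

```c
#include <assert.h>

/* The int operand is coerced to double before the addition. */
double add_mixed(int whole, double fraction) {
    return whole + fraction;
}

/* Both operands are int, so the division truncates first;
   only the already truncated result is coerced to double. */
double int_divide(int a, int b) {
    assert(b != 0);
    return a / b;
}
```

add_mixed(2, 0.5) yields 2.5, while int_divide(7, 2) yields 3.0 rather than 3.5. Both calls are type correct under C's rules, which illustrates how coercions can hide a mistake that a stricter language would report.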
Type Checking
• A type error is the application of an
operator to an operand of an inappropriate
type.
• If all type bindings are static, nearly all
type checking can be static
• If type bindings are dynamic, type
checking must be dynamic
• A programming language is strongly typed
if type errors are always detected.
Type Checking
• Advantage of strong typing:
– Allows the detection of the misuses of variables that result
in type errors.
• Languages:
– FORTRAN 77 is not: parameters, EQUIVALENCE
– Pascal is not: variant records
– Modula-2 is not: variant records, WORD type
– C and C++ are not: parameter type checking can be
avoided; unions are not type checked.
– Ada is, almost (UNCHECKED CONVERSION is a
loophole) (Java is similar)
• Coercion rules strongly affect strong typing
– they can weaken it considerably (C++ versus Ada)
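The union loophole in C can be sketched as follows. The type name Untyped and the function float_bits_via_union are hypothetical, and the expected bit pattern assumes the common IEEE 754 single-precision representation of float.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption: float occupies exactly 32 bits on this platform. */
static_assert(sizeof(float) == sizeof(uint32_t), "float must be 32 bits wide");

/* Two members share the same storage; C does not track which one was stored. */
typedef union {
    float as_float;
    uint32_t as_bits;
} Untyped;

/* Stores a float and reads the same bytes back as an integer.
   No type check is performed, at compile time or at run time. */
uint32_t float_bits_via_union(float f) {
    Untyped u;
    u.as_float = f;
    return u.as_bits;
}
```

On an IEEE 754 platform float_bits_via_union(1.0f) returns 0x3F800000, not 1: the value was never converted, only reinterpreted, and nothing in the language flags it as a type error.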
Name Type Compatibility
• Type compatibility by name means the two
variables have compatible types if they are
in either the same declaration or in
declarations that use the same type name
– Easy to implement but highly restrictive:
• Subranges of integer types are not compatible with
integer types
• Formal parameters must be the same type as their
corresponding actual parameters
Name Type Compatibility
type indextype = 1..100;
var
count : integer;
index : indextype; {subrange}
Are count and index name type compatible?
Structure Type Compatibility
• Type compatibility by structure means that
two variables have compatible types if
their types have identical structures
– More flexible, but harder to implement
type myType1 = double;
type myType2 = double;
myType1 data1;
myType2 data2;
data1 = data2; // is this assignment legal?
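C gives one concrete answer to this question, sketched below. A typedef only introduces another name for an existing type, so myType1 and myType2 are the same type and the assignment is accepted. Struct types, by contrast, are compared by tag, so two structs with identical members are still incompatible. The struct names Celsius and Fahrenheit are hypothetical.

```c
#include <assert.h>

typedef double myType1;
typedef double myType2;

/* Both names denote double, so they necessarily have the same size. */
static_assert(sizeof(myType1) == sizeof(myType2), "aliases of the same type");

/* Identical structure, different names. */
struct Celsius    { double degrees; };
struct Fahrenheit { double degrees; };

double copy_alias(myType2 data2) {
    myType1 data1;
    data1 = data2;  /* accepted: typedef names are aliases, not new types */
    return data1;
}

struct Celsius to_celsius(struct Fahrenheit f) {
    struct Celsius c;
    /* c = f; would be rejected by the compiler: distinct struct tags
       are distinct types even though the members are identical. */
    c.degrees = (f.degrees - 32.0) * 5.0 / 9.0;
    return c;
}
```

Other languages draw the line elsewhere, so this is one language's choice and not a general rule.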
Type Compatibility
type
type1 = array [1..10] of integer;
type2 = array [1..10] of integer;
type3 = type2;
type2 is compatible with type3 by name
type1 is compatible with type2 by structure
Type Compatibility
• Anonymous types
A: array [1..10] of INTEGER;
B: array [1..10] of INTEGER;
C, D: array [1..10] of INTEGER;
Scope
• The scope of a variable is the range of
statements over which it is visible.
• The nonlocal variables of a program unit
are those that are visible but not declared
there.
• The scope rules of a language determine
how references to names are associated
with variables.
Static scope
• Based on program text
– To connect a name reference to a variable, you (or the
compiler) must find the declaration.
• Search process:
– search declarations, first locally, then in increasingly larger
enclosing scopes, until one is found for the given name.
• Enclosing static scopes (to a specific scope) are
called its static ancestors; the nearest static
ancestor is called a static parent.
Static Scope
• Blocks - a method of creating static scopes inside
program units--from ALGOL 60
C and C++:
for (...) {
    int index;
    ...
}
Ada:
declare LCL : FLOAT;
begin
    ...
end
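A minimal C sketch of a block creating its own static scope. The function name sum_below is hypothetical. The index declared inside the loop body is a new variable that hides the outer index for the extent of the block only.

```c
#include <assert.h>

/* Sums the integers 0 .. limit-1. */
int sum_below(int limit) {
    int total = 0;
    int index = -1;             /* outer index, visible in the whole function body */
    for (int i = 0; i < limit; i++) {
        int index = i;          /* block-local index; hides the outer one here */
        total += index;
    }
    assert(index == -1);        /* the outer index was never touched by the block */
    return total;
}
```

The search rule for static scope explains the result: inside the block the nearest declaration of index is the local one, and outside the block that declaration is no longer visible.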
Static Scope: Procedures
Evaluation of Static Scoping (for nested procedures)
Consider the example:
MAIN
  Define Sub A
    Define Sub C
    Define Sub D
    Call Sub C
  Define Sub B
    Define Sub E
    Call Sub E
  Call Sub A
Nesting tree:
MAIN
  A
    C
    D
  B
    E
Static Scope: Variables
program main;
  var x: integer;
  procedure sub1;
  begin
    print(x);
  end; {sub1}
  procedure sub2;
    var x: integer;
  begin
    x := 10;
    sub1;
  end; {sub2}
begin
  x := 5;
  sub2
end.
Output: 10 or 5?
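C is statically scoped, so a C rendering of this program can be used to check the answer. This is a sketch: the function run_main stands in for the main program, and sub1 returns the value it would have printed so that the result can be inspected.

```c
#include <assert.h>

static int x;                 /* plays the role of main's x */

static int sub1(void) {
    return x;                 /* bound by the program text: the outer x */
}

static int sub2(void) {
    int x = 10;               /* sub2's local x; sub1 cannot see it */
    (void)x;                  /* silence the unused-variable warning */
    return sub1();
}

int run_main(void) {
    x = 5;
    int seen = sub2();
    assert(seen == 5);        /* static scoping: sub1 saw main's x */
    return seen;
}
```

Under static scoping the reference to x in sub1 is resolved by searching the scopes that textually enclose sub1, and sub2 is not one of them, so the value is 5.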
Static Scope
• Suppose the spec is changed so that D must now
access some data in B
• Solutions:
1. Put D in B (but then C can no longer call it and D cannot
access A's variables)
2. Move the data from B that D needs to MAIN (but then all
procedures can access them)
• Same problem for procedure access!
• Overall: static scoping often encourages many
globals
Dynamic Scope
• Based on calling sequences of program units, not
their textual layout (temporal versus spatial)
• References to variables are connected to
declarations by searching back through the chain of
subprogram calls that forced execution to this point.
• Evaluation of Dynamic Scoping:
– Advantage: convenience
– Disadvantage: hard to debug, hard to
understand the code
Dynamic Scope: Variables
program main;
  var x: integer;
  procedure sub1;
  begin
    print(x);
  end; {sub1}
  procedure sub2;
    var x: integer;
  begin
    x := 10;
    sub1;
  end; {sub2}
begin
  x := 5;
  sub2
end.
Output: 10 or 5?
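C has no dynamic scoping, so the sketch below simulates it with an explicit stack of name bindings that is searched from the most recent activation backwards. All helper names (Binding, push_binding, pop_binding, lookup, run_main_dynamic) are hypothetical, and sub1 returns the value it would have printed.

```c
#include <assert.h>
#include <string.h>

#define MAX_BINDINGS 16

typedef struct { const char *name; int value; } Binding;

static Binding bindings[MAX_BINDINGS];  /* one entry per live declaration */
static int depth = 0;

static void push_binding(const char *name, int value) {
    assert(depth < MAX_BINDINGS);
    bindings[depth].name = name;
    bindings[depth].value = value;
    depth++;
}

static void pop_binding(void) {
    assert(depth > 0);
    depth--;
}

/* Searches back through the chain of calls, newest declaration first. */
static int lookup(const char *name) {
    for (int i = depth - 1; i >= 0; i--) {
        if (strcmp(bindings[i].name, name) == 0) {
            return bindings[i].value;
        }
    }
    assert(!"unbound name");
    return 0;
}

static int sub1(void) {
    return lookup("x");        /* resolved by the call chain, not by the text */
}

static int sub2(void) {
    push_binding("x", 10);     /* var x: integer; x := 10 */
    int seen = sub1();
    pop_binding();             /* sub2's x disappears when sub2 returns */
    return seen;
}

int run_main_dynamic(void) {
    push_binding("x", 5);      /* main's x := 5 */
    int seen = sub2();
    pop_binding();
    return seen;
}
```

With this lookup rule the most recent declaration of x on the call chain is the one in sub2, so sub1 sees 10 rather than 5.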
Scope vs. Lifetime
• Scope and lifetime are sometimes closely
related, but are different concepts!!
• Consider a static variable in a C or C++
function
void someFunction()
{
    static int x;
    ...
}
Intermediate Representation
• Almost no compiler produces code
without first converting a program into
some intermediate representation that is
used just inside the compiler.
• This intermediate representation is
called by various names:
– Internal Representation
– Intermediate representation
– Intermediate language
Intermediate Representation
• Intermediate Representations are also
called by the form the intermediate
language takes:
– tuples
– abstract syntax trees
– triples
– simplified language
Intermediate Form
• In general, an intermediate form is kept
around only until the compiler generates
code; then it can be discarded.
• Another difference in compilers is how
much of the program is kept in
intermediate form; this is related to the
question of how much of the program
the compiler looks at before it starts to
generate code.
• There is a wide spectrum of choices.
Abstract Syntax Tree
x = y + 3;
    =
   / \
  x   +
     / \
    y   3
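One way a compiler might hold this tree in memory is sketched below in C. The node layout and the names Node, eval, and assign_example are assumptions made for illustration, and the tree is evaluated directly instead of being passed on to a code generator.

```c
#include <assert.h>
#include <stddef.h>

typedef enum { NODE_NUM, NODE_VAR, NODE_ADD, NODE_ASSIGN } NodeKind;

typedef struct Node {
    NodeKind kind;
    int value;                        /* NODE_NUM: the literal */
    int *cell;                        /* NODE_VAR: storage of the variable */
    const struct Node *left, *right;  /* children of NODE_ADD and NODE_ASSIGN */
} Node;

/* Walks the tree bottom-up and returns the value of the expression. */
int eval(const Node *n) {
    assert(n != NULL);
    switch (n->kind) {
    case NODE_NUM:
        return n->value;
    case NODE_VAR:
        return *n->cell;
    case NODE_ADD:
        return eval(n->left) + eval(n->right);
    case NODE_ASSIGN: {
        assert(n->left->kind == NODE_VAR);  /* target must be a variable */
        int v = eval(n->right);
        *n->left->cell = v;
        return v;
    }
    }
    return 0;
}

/* Builds the tree for x = y + 3 and returns the resulting x. */
int assign_example(int y_value) {
    int x = 0;
    int y = y_value;
    Node three  = { NODE_NUM, 3, NULL, NULL, NULL };
    Node var_y  = { NODE_VAR, 0, &y, NULL, NULL };
    Node var_x  = { NODE_VAR, 0, &x, NULL, NULL };
    Node plus   = { NODE_ADD, 0, NULL, &var_y, &three };
    Node assign = { NODE_ASSIGN, 0, NULL, &var_x, &plus };
    eval(&assign);
    return x;
}
```

The root is the assignment, its left child is the target x, and its right child is the + node with children y and 3, which matches the shape of the tree on the slide.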
Quadruples
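y = a*(x+b)/(x-c);
Three-address form:
T1 = x+b;
T2 = a*T1;
T3 = x-c;
T4 = T2/T3;
y = T4;
Symbol table (index, name):
1 y
2 a
3 x
4 b
5 T1
6 T2
7 c
8 T3
9 T4
Quadruples (operator, operand 1, operand 2, result), with operands
given as symbol table indices and 0 marking an unused operand:
(+, 3, 4, 5)
(*, 2, 5, 6)
(-, 3, 7, 8)
(/, 6, 8, 9)
(=, 0, 9, 1)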
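The quadruples can be checked by interpreting them. The C sketch below assumes the format (operator, operand 1, operand 2, result), with every operand an index into the symbol table in the order y, a, x, b, T1, T2, c, T3, T4 (indices 1 to 9) and index 0 unused. The names Quad, run_quads, and evaluate_y are hypothetical, and the use of double for all values is an assumption, since the slide does not state the types.

```c
#include <assert.h>
#include <stddef.h>

typedef struct { char op; int arg1, arg2, result; } Quad;

/* The five quadruples for y = a*(x+b)/(x-c). */
static const Quad program[] = {
    { '+', 3, 4, 5 },   /* T1 = x + b   */
    { '*', 2, 5, 6 },   /* T2 = a * T1  */
    { '-', 3, 7, 8 },   /* T3 = x - c   */
    { '/', 6, 8, 9 },   /* T4 = T2 / T3 */
    { '=', 0, 9, 1 },   /* y  = T4      */
};

/* Slots: 1=y 2=a 3=x 4=b 5=T1 6=T2 7=c 8=T3 9=T4; slot 0 is unused. */
enum { SLOT_COUNT = 10 };

/* Executes n quadruples against the value slots. */
void run_quads(const Quad *code, size_t n, double *slots) {
    for (size_t i = 0; i < n; i++) {
        const Quad *q = &code[i];
        double lhs = slots[q->arg1];
        double rhs = slots[q->arg2];
        switch (q->op) {
        case '+': slots[q->result] = lhs + rhs; break;
        case '*': slots[q->result] = lhs * rhs; break;
        case '-': slots[q->result] = lhs - rhs; break;
        case '/': assert(rhs != 0.0); slots[q->result] = lhs / rhs; break;
        case '=': slots[q->result] = rhs; break;  /* copy; operand 1 is unused */
        default:  assert(!"unknown operator"); break;
        }
    }
}

/* Loads the inputs into their slots, runs the program, and returns y. */
double evaluate_y(double a, double x, double b, double c) {
    double slots[SLOT_COUNT] = { 0 };
    slots[2] = a;
    slots[3] = x;
    slots[4] = b;
    slots[7] = c;
    run_quads(program, sizeof program / sizeof program[0], slots);
    return slots[1];
}
```

With a = 2, x = 5, b = 1, c = 3 the temporaries are T1 = 6, T2 = 12, T3 = 2, T4 = 6, so evaluate_y returns 6, the same value as evaluating a*(x+b)/(x-c) directly.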