Semantic Analysis
Aaron Bloomfield
CS 415
Fall 2005
Compilation in a Nutshell 1
Source code (character stream): if (b == 0) a = b;
Lexical analysis
Token stream: if ( b == 0 ) a = b ;
Parsing
Abstract syntax tree (AST): an if node whose children are == (operands b and 0) and = (operands a and b)
Semantic Analysis
Decorated AST: == is boolean with operands int b and int 0; = is int with operands int a (lvalue) and int b
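Compilation in a Nutshell 2
Decorated AST: == is boolean with operands int b and int 0; = is int with operands int a (lvalue) and int b
Intermediate Code Generation
Intermediate code: CJUMP == (MEM(fp + 8), CONST 0) selecting between MOVE (MEM(fp + 4), MEM(fp + 8)) and NOP
Optimization
Optimized code: CJUMP == (CX, CONST 0) selecting between MOVE (DX, CX) and NOP
Code generation
CMP CX, 0
CMOVZ DX,CX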
Questions to Answer:
• Is x a scalar, an array, or a function?
• Is x declared before it is used?
• Which declaration of x does this reference?
• Does the dimension of a reference match the declaration?
• Is an array reference in bounds?
• Type errors (that can be caught statically)
Context-Sensitive Analysis
• Two solutions:
– Attribute grammars: augment the CFG with rules that calculate attributes for grammar symbols
– Ad hoc techniques: augment the grammar with arbitrary code that executes at the corresponding reduction and stores information in attributes and symbol tables
Attribute Grammars
• Generalization of context-free grammars
• Each grammar symbol has an associated set of
attributes
• Augment the grammar with rules that define attribute values
– Rules are not allowed to refer to any variables or attributes outside the current production
Attribute Types
• Values are computed from constants and other attributes:
– Synthesized attribute – value computed from
children
– Inherited attribute – value computed from siblings,
parent, and own attributes
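The two attribute kinds can be sketched on a small example. This is a minimal illustration, assuming a hypothetical declaration "int a, b, c" and a tuple-based parse tree; the node layout and the names eval_decl and eval_id_list are not from the slides. The declared type flows down to each name as an inherited attribute, and the list of (name, type) pairs flows back up as a synthesized attribute.

```python
# Hypothetical parse tree for the declaration "int a, b, c", using the grammar
#   Decl   -> Type IdList
#   IdList -> id | id , IdList
decl = ("Decl", "int", ("IdList", "a", ("IdList", "b", ("IdList", "c", None))))


def eval_id_list(node, inherited_type):
    """Return the synthesized attribute of an IdList: a list of (name, type).

    inherited_type flows DOWN from the parent (inherited attribute);
    the returned list flows UP from the children (synthesized attribute).
    """
    if node is None:
        return []
    _, name, rest = node
    return [(name, inherited_type)] + eval_id_list(rest, inherited_type)


def eval_decl(node):
    """Decl hands the type named by its Type child to its IdList child."""
    _, type_name, id_list = node
    return eval_id_list(id_list, inherited_type=type_name)


declarations = eval_decl(decl)
print(declarations)
```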
Attribute Flow
• S-attributed grammar
– Uses only synthesized attributes
– Bottom-up attribute flow
• L-attributed grammar
– Attributes can be evaluated in a single left-to-right pass over the input
– Each synthesized attribute of the LHS symbol depends only on that symbol’s own inherited attributes or on attributes (synthesized or inherited) of the production’s RHS symbols
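Bottom-up flow in an S-attributed grammar can be sketched as a post-order walk in which each node's attribute is computed only from its children. The tuple-based tree and the attribute name val below are assumptions made for illustration.

```python
# Parse tree for (1 + 3) * 2 as nested tuples: (operator, left, right).
# Leaves are plain integers. This layout is assumed for illustration only.
tree = ("*", ("+", 1, 3), 2)


def val(node):
    """Synthesized attribute 'val': computed purely from the children's val."""
    if isinstance(node, int):
        return node  # leaf: the value comes straight from the token
    op, left, right = node
    left_val, right_val = val(left), val(right)  # children first (bottom-up)
    if op == "+":
        return left_val + right_val
    if op == "*":
        return left_val * right_val
    raise ValueError(f"unknown operator {op!r}")


result = val(tree)
print(result)  # 8
```

Because no value ever flows from a parent or sibling, the same computation could be carried out during a bottom-up parse, at each reduction.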
Action Routines
• We need a translation scheme: an algorithm that invokes the attribute rules in an order that respects attribute flow.
• Action routines = an ad hoc translation scheme that is interleaved with parsing
• An action routine = a semantic function that the programmer instructs the compiler to execute at a particular point in the parse.
• This is what most production compilers use.
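As a rough sketch of action routines interleaved with parsing, the hand-written recursive-descent parser below runs a small piece of semantic code at fixed points in each production. The grammar, the Parser class, and the choice to compute a value (rather than build a tree or fill a symbol table) are assumptions for illustration.

```python
import re


def tokenize(text):
    """Split the input into integer literals, +, *, and parentheses."""
    return re.findall(r"\d+|[+*()]", text)


class Parser:
    def __init__(self, tokens):
        self.tokens = tokens
        self.pos = 0

    def peek(self):
        return self.tokens[self.pos] if self.pos < len(self.tokens) else None

    def advance(self):
        tok = self.peek()
        self.pos += 1
        return tok

    def expr(self):
        # expr -> term { "+" term }
        value = self.term()
        while self.peek() == "+":
            self.advance()
            value = value + self.term()  # action: runs once "+ term" is parsed
        return value

    def term(self):
        # term -> factor { "*" factor }
        value = self.factor()
        while self.peek() == "*":
            self.advance()
            value = value * self.factor()  # action: runs once "* factor" is parsed
        return value

    def factor(self):
        # factor -> number | "(" expr ")"
        tok = self.advance()
        if tok == "(":
            value = self.expr()
            if self.advance() != ")":
                raise SyntaxError("expected ')'")
            return value
        if tok is not None and tok.isdigit():
            return int(tok)  # action: convert the token text to a value
        raise SyntaxError(f"unexpected token {tok!r}")


def evaluate(text):
    parser = Parser(tokenize(text))
    value = parser.expr()
    if parser.peek() is not None:
        raise SyntaxError("trailing input")
    return value


print(evaluate("2 + 3 * (4 + 1)"))  # 17
```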
Abstract Syntax Tree
• An abstract syntax tree is the procedure’s parse
tree with the nodes for most non-terminal
symbols removed
E.g., “a + 3 * b” becomes a + node whose children are a and a * node with children 3 and b
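One possible in-memory form of this tree is sketched below. The node classes Var, Num, and BinOp and the helper show are hypothetical names chosen for illustration; the nonterminal nodes of the parse tree (expression, term, factor) have no counterpart here.

```python
from dataclasses import dataclass


@dataclass
class Var:
    name: str


@dataclass
class Num:
    value: int


@dataclass
class BinOp:
    op: str
    left: object
    right: object


# AST for "a + 3 * b": "*" binds tighter than "+", so it sits lower in the tree.
ast = BinOp("+", Var("a"), BinOp("*", Num(3), Var("b")))


def show(node):
    """Render the tree as a fully parenthesized prefix expression."""
    if isinstance(node, Var):
        return node.name
    if isinstance(node, Num):
        return str(node.value)
    return f"({node.op} {show(node.left)} {show(node.right)})"


print(show(ast))  # (+ a (* 3 b))
```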
Symbol Table
• A “Dictionary” that maps names to info the compiler
knows about that name.
• What names?
– Variable and procedure names
– Literal constants and strings
• What info?
– Textual name
– Data type
– Declaring procedure
– Lexical level of declaration
– If array, number and size of dimensions
– If procedure, number and type of parameters
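The fields listed above could be gathered into one record per name. The sketch below is only one possible layout: SymbolInfo, its field names, and the sample entries (count, grid, max) are assumptions for illustration, and a plain dictionary stands in for the table.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class SymbolInfo:
    name: str                                  # textual name
    data_type: str                             # e.g. "integer"
    declaring_procedure: Optional[str] = None  # None for global names
    lexical_level: int = 0                     # nesting depth of the declaration
    dimensions: List[int] = field(default_factory=list)   # sizes, if an array
    param_types: List[str] = field(default_factory=list)  # types, if a procedure


# The "dictionary" itself: maps each name to what the compiler knows about it.
symbol_table: Dict[str, SymbolInfo] = {}


def declare(info: SymbolInfo) -> None:
    symbol_table[info.name] = info


declare(SymbolInfo("count", "integer", declaring_procedure="main", lexical_level=1))
declare(SymbolInfo("grid", "real", declaring_procedure="main", lexical_level=1,
                   dimensions=[10, 20]))
declare(SymbolInfo("max", "integer", param_types=["integer", "integer"]))

print(symbol_table["grid"].dimensions)        # [10, 20]
print(len(symbol_table["max"].param_types))   # 2
```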
Sample program
program gcd (input, output);
var i, j : integer;
begin
  read (i, j);
  while i <> j do
    if i > j then i := i - j
    else j := j - i;
  writeln (i)
end.
Symbol Table Implementation
• Usually implemented as hash tables
• Lookup returns the closest enclosing lexical declaration, to handle nested lexical scoping
• How to handle multiple scopes?
One option
• Use one symbol table per scope
• Chain tables to enclosing scope
• Insert names in tables for current scope
• Start name lookup in current table, checking
enclosing scopes in order if needed
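The four steps above can be sketched with one dictionary per scope and a link to the enclosing scope. The class name Scope and its methods are hypothetical, and a Python dict stands in for the hash table.

```python
class Scope:
    """One symbol table per scope, chained to the enclosing scope."""

    def __init__(self, parent=None):
        self.names = {}       # names declared in this scope only
        self.parent = parent  # enclosing scope, or None for the outermost

    def insert(self, name, info):
        if name in self.names:
            raise KeyError(f"{name!r} is already declared in this scope")
        self.names[name] = info

    def lookup(self, name):
        # Start in the current table, then check enclosing scopes in order.
        scope = self
        while scope is not None:
            if name in scope.names:
                return scope.names[name]
            scope = scope.parent
        return None  # undeclared


outer = Scope()
outer.insert("x", "integer")
outer.insert("y", "integer")

inner = Scope(parent=outer)
inner.insert("x", "real")  # hides the outer x

print(inner.lookup("x"))  # real
print(inner.lookup("y"))  # integer
print(outer.lookup("x"))  # integer
print(inner.lookup("z"))  # None
```

Leaving a scope is then just a matter of making the parent table current again; the inner table can be discarded or kept for later phases.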
LeBlanc-Cook symbol tables
• Give each scope a number
• Keep all names in a single hash table, keyed by name
• Also keep a scope stack to show the current referencing environment
• As the analyzer looks at the program, it pushes and pops this stack as it enters and leaves scopes
• To look up a name, scan down the appropriate hash chain. For each matching entry, scan down the scope stack to see whether the scope of that entry is visible. We look no deeper in the stack than the top-most closed scope.
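A simplified sketch of this lookup follows. It assumes every scope on the stack is open (closed scopes are not modeled), uses a Python dict of lists in place of hash chains, and the class and method names are hypothetical.

```python
class LeBlancCookTable:
    """Single table keyed by name, plus a stack of visible scope numbers."""

    def __init__(self):
        self.entries = {}      # name -> list of (scope_number, info)
        self.scope_stack = []  # scope numbers, innermost scope last
        self.next_scope = 0

    def enter_scope(self):
        number = self.next_scope
        self.next_scope += 1
        self.scope_stack.append(number)
        return number

    def leave_scope(self):
        # Entries stay in the table; they simply stop being visible.
        self.scope_stack.pop()

    def insert(self, name, info):
        current = self.scope_stack[-1]
        self.entries.setdefault(name, []).append((current, info))

    def lookup(self, name):
        best_depth, best_info = -1, None
        for scope_number, info in self.entries.get(name, []):
            # Scan the scope stack from the top down for this entry's scope.
            for depth in range(len(self.scope_stack) - 1, -1, -1):
                if self.scope_stack[depth] == scope_number:
                    if depth > best_depth:  # prefer the innermost visible scope
                        best_depth, best_info = depth, info
                    break
        return best_info


table = LeBlancCookTable()
table.enter_scope()            # scope 0
table.insert("x", "integer")
table.insert("y", "integer")
table.enter_scope()            # scope 1, nested in scope 0
table.insert("x", "real")

inner_x = table.lookup("x")    # entry from scope 1
inner_y = table.lookup("y")    # entry from scope 0
table.leave_scope()
outer_x = table.lookup("x")    # scope 1 is no longer on the stack

print(inner_x, inner_y, outer_x)  # real integer integer
```

Unlike the one-table-per-scope design, nothing is removed when a scope is left, so the entries remain available to later phases or a debugger.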