Compiler: AST

ST — Introduction
The Smalltalk Compiler
© Oscar Nierstrasz
ST 1.1
Compiler
> Fully reified compilation process:
— Scanner/Parser (built with SmaCC)
  – builds AST (from Refactoring Browser)
— Semantic Analysis: ASTChecker
  – annotates the AST (e.g., var bindings)
— Translation to IR: ASTTranslator
  – uses IRBuilder to build IR (Intermediate Representation)
— Bytecode generation: IRTranslator
  – uses BytecodeBuilder to emit bytecodes
Compiler: Overview
code → Scanner/Parser → AST → Semantic Analysis → AST → Code Generation → Bytecode
Code generation in detail
AST → Build IR (ASTTranslator, IRBuilder) → IR → Bytecode Generation (IRTranslator, BytecodeBuilder) → Bytecode
Compiler: Syntax
> SmaCC: Smalltalk Compiler Compiler
— Like Lex/Yacc
— Input:
  – Scanner definition: regular expressions
  – Parser: BNF-like grammar
  – Code that builds AST as annotation
— SmaCC can build LALR(1) or LR(1) parsers
— Output:
  – class for Scanner (subclass SmaCCScanner)
  – class for Parser (subclass SmaCCParser)
Scanner
Parser
Calling Parser code
Compiler: AST
> AST: Abstract Syntax Tree
— Encodes the Syntax as a Tree
— No semantics yet!
— Uses the RB Tree:
  – visitors
  – backward pointers in ParseNodes
  – transformation (replace/add/delete)
  – pattern directed TreeRewriter
  – PrettyPrinter
RBProgramNode
    RBDoItNode
    RBMethodNode
    RBReturnNode
    RBSequenceNode
    RBValueNode
        RBArrayNode
        RBAssignmentNode
        RBBlockNode
        RBCascadeNode
        RBLiteralNode
        RBMessageNode
        RBOptimizedNode
        RBVariableNode
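To make the RB tree concrete, here is a minimal workspace sketch that parses source text and walks the resulting nodes by hand. It assumes an image where the Refactoring Browser classes are loaded under the names used on these slides; the temporaries `tree` and `method` and the sample method source are illustrative only.

```smalltalk
| tree method |
"Parse a single expression; the result is the root node of its AST."
tree := RBParser parseExpression: '3 + 4'.
tree isMessage.                  "true: the root is a message send"
tree selector.                   "#+"
tree receiver value.             "3"
tree arguments first value.      "4"

"Backward pointer: each child node knows its parent."
tree receiver parent == tree.    "true"

"Pretty printing regenerates source text from the tree."
tree formattedCode.

"Parsing a whole method yields a method node."
method := RBParser parseMethod: 'answer ^ 3 + 4'.
method selector.                         "#answer"
method body statements first isReturn.   "true"
```

Evaluate each line with "print it" to see the values noted in the comments. Nothing here has semantics yet: the tree records only that `+` is sent to `3` with the argument `4`.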
Compiler: Semantics
> We need to analyse the AST
— Names need to be linked to the variables according to the scoping rules
> ASTChecker implemented as a Visitor
— Subclass of RBProgramNodeVisitor
— Visits the nodes
— Grows and shrinks scope chain
— Methods/Blocks are linked with the scope
— Variable definitions and references are linked with objects describing the variables
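The idea of growing and shrinking a scope chain during a visit can be sketched with a small visitor. This is not the ASTChecker implementation; it is a simplified illustration under stated assumptions: the class name `ScopeSketchVisitor` and its instance variables `scopes` and `unbound` are hypothetical, only block arguments are treated as definitions (temporaries, instance variables and globals are ignored), and the `accept...Node:` visitor protocol is assumed to match the one used elsewhere on these slides.

```smalltalk
RBProgramNodeVisitor subclass: #ScopeSketchVisitor
    instanceVariableNames: 'scopes unbound'
    classVariableNames: ''
    poolDictionaries: ''
    category: 'Compiler-AST-Visitors'

ScopeSketchVisitor>>initialize
    super initialize.
    scopes := OrderedCollection new.   "innermost scope is last"
    unbound := Set new.

ScopeSketchVisitor>>acceptBlockNode: aBlockNode
    "Grow the chain on entry, visit the body, shrink on exit."
    scopes addLast: (aBlockNode arguments collect: [:arg | arg name]).
    self visitNode: aBlockNode body.
    scopes removeLast.

ScopeSketchVisitor>>acceptVariableNode: aVariableNode
    "A reference is bound if any enclosing scope defines its name."
    (scopes anySatisfy: [:names | names includes: aVariableNode name])
        ifFalse: [unbound add: aVariableNode name].

ScopeSketchVisitor>>unbound
    ^unbound
```

Visiting the tree for `[:x | x + y]` with this sketch should leave only `y` in `unbound`, since `x` is found in the scope pushed for the block. A real checker would instead attach an object describing the variable to each definition and reference.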
A Simple Tree
A Simple Visitor
RBProgramNodeVisitor new visitNode: tree
Does nothing except walk through the tree
TestVisitor
RBProgramNodeVisitor subclass: #TestVisitor
    instanceVariableNames: 'literals'
    classVariableNames: ''
    poolDictionaries: ''
    category: 'Compiler-AST-Visitors'

TestVisitor>>acceptLiteralNode: aLiteralNode
    literals add: aLiteralNode value.

TestVisitor>>initialize
    literals := Set new.

TestVisitor>>literals
    ^literals

tree := RBParser parseExpression: '3 + 4'.
(TestVisitor new visitNode: tree) literals
    a Set(3 4)
Compiler: Intermediate Representation
> IR: Intermediate Representation
— Semantics like bytecode, but more abstract
— Independent of the bytecode set
— IR is a tree
— IR nodes allow easy transformation
— Decompilation to RB AST
> IR built from AST using ASTTranslator:
— AST Visitor
— Uses IRBuilder
Compiler: Bytecode Generation
(Note: CHECK the example!)
> IR needs to be converted to Bytecode
— IRTranslator: Visitor for IR tree
— Uses BytecodeBuilder to generate Bytecode
— Builds a compiledMethod
testReturn1
    | iRMethod aCompiledMethod |
    iRMethod := IRBuilder new
        numRargs: 1;
        addTemps: #(self);    "receiver and args declarations"
        pushLiteral: 1;
        returnTop;
        ir.
    aCompiledMethod := iRMethod compiledMethod.
    self should:
        [(aCompiledMethod
            valueWithReceiver: nil
            arguments: #() ) = 1].