Compilers - Tunghai University
Compilers
Book: Crafting a Compiler with C
Author: Charles N. Fischer and Richard J. LeBlanc, Jr.
The Benjamin/Cumming Publishing Company, Inc
Grading
Homework: 10%
Project: 40% (Two members in a team)
Lexical analysis: 10%
Syntax analysis: 20%
Code generation: 10%
Mid Exam: 25%
Final Exam: 25%
Contents
Introduction
A Simple Compiler
Scanning – Theory and Practice
Grammars and Parsing
LL(1) Parsing
LR Parsing
Semantic Processing
Symbol Tables
Run-time Storage Organization
Contents (Cont’d.)
Processing Declarations
Processing Expressions and Data Structure References
Translating Control Structure
Translating Procedures and Functions
Attribute Grammars and Multipass Translation
Code Generation and Local Code Optimization
Global Optimization
Parsing in the Real World
Chapter 1 Introduction
Contents
Overview and History
What Do Compilers Do?
The Structure of a Compiler
The Syntax and Semantics of Programming Languages
Compiler Design and Programming Language Design
Compiler Classifications
Influences on Computer Design
Overview and History
Compilers are fundamental to modern computing.
They act as translators, transforming human-oriented programming languages into computer-oriented machine languages.
[Figure: Programming Language (Source) -> Compiler -> Machine Language (Target)]
Overview and History (Cont’d.)
The first real compiler
FORTRAN compilers of the late 1950s
18 person-years to build
Today, we can build a simple compiler in a few months.
Crafting an efficient and reliable compiler is still challenging.
Overview and History (Cont’d.)
Compiler technology is more broadly applicable and has been employed in rather unexpected areas.
Text-formatting languages, like nroff and troff; preprocessor packages like eqn, tbl, pic
Silicon compiler for the creation of VLSI circuits
Command languages of OS
Query languages of Database systems
What Do Compilers Do?
Compilers may be distinguished according to the kind of target code they generate:
Pure Machine Code
Assume there is no run-time OS support.
For systems implementation or embedded systems
Run on bare machines
Augmented Machine Code
For hardware + OS + language-specific support routines, e.g., I/O, math functions, storage allocation, and data transfer.
Virtual Machine Code
JVM, P-code
Portable
About 4 times slower
Code is interpreted.
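To make "virtual machine code is interpreted" concrete, here is a minimal sketch of a stack machine. The opcodes PUSH, ADD, and MUL are a hypothetical instruction set chosen for illustration; they are not JVM bytecode or P-code.

```python
def run(program):
    """Interpret a list of (opcode, operand) pairs on a tiny stack machine."""
    stack = []
    for op, arg in program:
        if op == "PUSH":
            stack.append(arg)
        elif op == "ADD":
            right = stack.pop()
            left = stack.pop()
            stack.append(left + right)
        elif op == "MUL":
            right = stack.pop()
            left = stack.pop()
            stack.append(left * right)
        else:
            raise ValueError(f"unknown opcode: {op}")
    return stack.pop()

# Hypothetical virtual machine code for (2 + 3) * 4
program = [("PUSH", 2), ("PUSH", 3), ("ADD", None), ("PUSH", 4), ("MUL", None)]
result = run(program)
```

The same program list would run unchanged on any host that has the `run` loop, which is the portability argument; the per-instruction dispatch in the loop is where the slowdown comes from.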
What Do Compilers Do? (Cont’d.)
Another way that compilers differ from one another is in the format of the target machine code they generate
Assembly Language Format
Simplifies compilation
Uses symbolic labels rather than calculating addresses
Pro: good for smaller machines
Con: needs an additional pass (the assembler)
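The extra pass exists because something still has to turn symbolic labels into addresses. A minimal sketch of that step, assuming a made-up assembly syntax in which a label is a line ending in a colon and every instruction occupies one address:

```python
def assemble(lines):
    """Resolve symbolic labels in two passes over a hypothetical assembly listing."""
    # Pass 1: record the address of each label and collect the instructions.
    labels = {}
    instructions = []
    for line in lines:
        line = line.strip()
        if line.endswith(":"):
            labels[line[:-1]] = len(instructions)
        elif line:
            instructions.append(line.split())
    # Pass 2: replace label operands with numeric addresses.
    resolved = []
    for instr in instructions:
        op, operands = instr[0], instr[1:]
        resolved.append((op, *[labels.get(x, x) for x in operands]))
    return resolved

source = [
    "start:",
    "  LOAD x",
    "  JZ done",
    "  DEC x",
    "  JMP start",
    "done:",
    "  HALT",
]
code = assemble(source)
```

The forward reference to `done` is why one pass is not enough here: its address is unknown when `JZ done` is first read.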
What Do Compilers Do? (Cont’d.)
Relocatable Binary Format
Similar to the output of an assembler
A linking step is required before execution
Good for modular compilation, cross-language references, and libraries
Memory-Image (Load-and-Go) Format
Fast
Very limited linking capabilities
Good for debugging (frequent changes)
What Do Compilers Do? (Cont’d.)
Another kind of language processor, called an interpreter, differs from a compiler in that it executes programs without explicitly performing a translation
[Figure: Source Program Encoding and Data -> Interpreter -> Output]
Advantages and disadvantages of an interpreter: see pages 6 and 7
What Do Compilers Do? (Cont’d.)
Advantage
Modification to program during execution
Dynamically typed languages
Variable types may change at run time, e.g., LISP.
Difficult to compile
Better diagnostics
Interactive debugging
Not for every language, e.g., Basic, Pascal
Source code is available.
Machine independence
However, the interpreter itself must be portable.
What Do Compilers Do? (Cont’d.)
Disadvantage
Slower execution due to repeated examination of the program
Dynamic languages (LISP): slowdown of about 100:1
Static languages (BASIC): slowdown of about 10:1
Substantial space overhead
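The cost of "repeated examination" can be sketched as follows. `interpret` walks the expression tree and re-decides what each node means on every evaluation; `translate` walks it once and returns a Python closure. The tuple-based tree format is an assumption made for this example, and a closure is only a stand-in for generated machine code.

```python
# Trees are nested tuples: ("num", 3), ("var", "x"), ("add", l, r), ("mul", l, r)

def interpret(node, env):
    """Re-examine the tree on every evaluation, as an interpreter does."""
    kind = node[0]
    if kind == "num":
        return node[1]
    if kind == "var":
        return env[node[1]]
    left, right = interpret(node[1], env), interpret(node[2], env)
    return left + right if kind == "add" else left * right

def translate(node):
    """Examine the tree once; the returned closure does no further dispatch on node kinds."""
    kind = node[0]
    if kind == "num":
        value = node[1]
        return lambda env: value
    if kind == "var":
        name = node[1]
        return lambda env: env[name]
    left, right = translate(node[1]), translate(node[2])
    if kind == "add":
        return lambda env: left(env) + right(env)
    return lambda env: left(env) * right(env)

tree = ("add", ("mul", ("var", "x"), ("num", 2)), ("num", 1))   # x * 2 + 1
compiled = translate(tree)
interpreted = [interpret(tree, {"x": x}) for x in range(3)]
translated = [compiled({"x": x}) for x in range(3)]
```

Both lists hold the same values; the difference is that the `kind` tests run once in `translate` but on every call in `interpret`.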
The Structure of a Compiler
Modern compilers are syntax-directed
Compilation is driven by the syntactic structure of programs; i.e., actions are associated with the structures.
Any compiler must perform two major tasks
Analysis of the source program
Synthesis of a machine-language program
The Structure of a Compiler (Cont’d.)
[Figure: The structure of a syntax-directed compiler. Source Program (Character Stream) -> Scanner -> Tokens -> Parser -> Syntactic Structure -> Semantic Routines -> Intermediate Representation -> Optimizer -> Code Generator -> Target Machine Code. Symbol and Attribute Tables are used by all phases of the compiler.]
The Structure of a Compiler (Cont’d.)
Scanner
The scanner begins the analysis of the source program by reading the input, character by character, and grouping characters into individual words and symbols (tokens)
The tokens are encoded and then are fed to the parser for syntactic analysis
For details, see the bottom of page 8.
Scanner generators
[Figure: regular expressions for tokens -> lex or scangen -> finite automata as programs]
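As an illustration of grouping characters into tokens from regular expressions, here is a small hand-written scanner built on Python's `re` module. The token classes (NUMBER, ID, ASSIGN, OP) belong to a hypothetical toy language, and this is a sketch of the idea rather than what lex or scangen would generate.

```python
import re

# Hypothetical token classes; each is described by a regular expression.
TOKEN_SPEC = [
    ("NUMBER", r"\d+"),
    ("ID",     r"[A-Za-z_]\w*"),
    ("ASSIGN", r":="),
    ("OP",     r"[+\-*/()]"),
    ("SKIP",   r"\s+"),
]
MASTER = re.compile("|".join(f"(?P<{name}>{pattern})" for name, pattern in TOKEN_SPEC))

def scan(text):
    """Group characters into (kind, lexeme) tokens, rejecting unknown characters."""
    tokens = []
    pos = 0
    while pos < len(text):
        match = MASTER.match(text, pos)
        if match is None:
            raise SyntaxError(f"illegal character {text[pos]!r} at position {pos}")
        if match.lastgroup != "SKIP":      # white space separates tokens but is not one
            tokens.append((match.lastgroup, match.group()))
        pos = match.end()
    return tokens

tokens = scan("total := price * 12")
```

The `(kind, lexeme)` pairs are the encoded tokens that would be handed to the parser.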
The Structure of a Compiler (Cont’d.)
Parser
Given a formal syntax specification (typically as a context-free grammar [CFG]), the parser reads tokens and groups them into units as specified by the productions of the CFG being used.
While parsing, the parser verifies correct syntax, and if a syntax error is found, it issues a suitable diagnostic.
As syntactic structure is recognized, the parser either calls corresponding semantic routines directly or builds a syntax tree.
[Figure: grammar -> yacc or llgen -> parser]
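A minimal sketch of a parser that groups tokens according to grammar productions and builds a syntax tree. It is a hand-written recursive-descent parser, not yacc or llgen output, and the three-production expression grammar in the comment is an assumption made for the example.

```python
# Hypothetical grammar:
#   expr   -> term   { "+" term }
#   term   -> factor { "*" factor }
#   factor -> NUMBER | "(" expr ")"

class Parser:
    def __init__(self, tokens):
        self.tokens = tokens
        self.pos = 0

    def peek(self):
        return self.tokens[self.pos] if self.pos < len(self.tokens) else None

    def expect(self, token):
        if self.peek() != token:
            raise SyntaxError(f"expected {token!r}, found {self.peek()!r}")
        self.pos += 1

    def expr(self):
        node = self.term()
        while self.peek() == "+":
            self.pos += 1
            node = ("+", node, self.term())
        return node

    def term(self):
        node = self.factor()
        while self.peek() == "*":
            self.pos += 1
            node = ("*", node, self.factor())
        return node

    def factor(self):
        token = self.peek()
        if token == "(":
            self.pos += 1
            node = self.expr()
            self.expect(")")
            return node
        if token is not None and token.isdigit():
            self.pos += 1
            return int(token)
        raise SyntaxError(f"unexpected token {token!r}")   # the diagnostic

def parse(tokens):
    parser = Parser(tokens)
    tree = parser.expr()
    if parser.peek() is not None:
        raise SyntaxError(f"unexpected token {parser.peek()!r}")
    return tree

tree = parse(["2", "+", "3", "*", "(", "4", "+", "1", ")"])
```

Each method corresponds to one production, so the nesting of the returned tuples mirrors the syntactic structure the parser recognized.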
The Structure of a Compiler (Cont’d.)
Semantic Routines
Perform two functions
Check the static semantics of each construct
Do the actual translation for generating IR
The heart of a compiler
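The two functions of a semantic routine can be sketched together: check the static semantics of a construct, then emit IR for it. The tuple tree format, the dictionary used as a symbol table, and the three-address text IR are all assumptions made for this example.

```python
def gen_expr(node, symbols, code):
    """Check operand types and append three-address IR for node; return (place, type)."""
    kind = node[0]
    if kind == "num":
        return str(node[1]), "int"
    if kind == "var":
        name = node[1]
        if name not in symbols:
            raise TypeError(f"undeclared identifier {name!r}")
        return name, symbols[name]
    # Binary operator node: ("+", left, right) or ("*", left, right)
    left_place, left_type = gen_expr(node[1], symbols, code)
    right_place, right_type = gen_expr(node[2], symbols, code)
    if left_type != right_type:                      # static semantic check
        raise TypeError(f"operands of {kind!r} have types {left_type} and {right_type}")
    temp = f"t{len(code) + 1}"                       # fresh temporary for the result
    code.append(f"{temp} = {left_place} {kind} {right_place}")   # translation to IR
    return temp, left_type

symbols = {"a": "int", "b": "int", "flag": "bool"}   # hypothetical symbol table
ir = []
gen_expr(("+", ("var", "a"), ("*", ("var", "b"), ("num", 2))), symbols, ir)
```

After the call, `ir` holds one line per operator, in the order the operands become available.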
Optimizer
The IR code generated by the semantic routines is analyzed and transformed into functionally equivalent but improved IR code.
This phase can be very complex and slow
Peephole optimization
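Peephole optimization looks at a few adjacent instructions at a time and replaces them with a cheaper equivalent. A minimal sketch, assuming a hypothetical stack-machine IR of tuples and only two rewrite rules:

```python
def peephole(code):
    """Repeatedly scan short instruction windows and replace them with cheaper equivalents."""
    changed = True
    while changed:
        changed = False
        out = []
        i = 0
        while i < len(code):
            window = code[i:i + 3]
            # Constant folding: PUSH a, PUSH b, ADD  ->  PUSH a+b
            if (len(window) == 3 and window[0][0] == "PUSH" and window[1][0] == "PUSH"
                    and window[2] == ("ADD",)):
                out.append(("PUSH", window[0][1] + window[1][1]))
                i += 3
                changed = True
            # Algebraic identity: PUSH 0, ADD  ->  nothing
            elif window[:2] == [("PUSH", 0), ("ADD",)]:
                i += 2
                changed = True
            else:
                out.append(code[i])
                i += 1
        code = out
    return code

before = [("LOAD", "x"), ("PUSH", 0), ("ADD",), ("PUSH", 2), ("PUSH", 3), ("ADD",), ("ADD",)]
after = peephole(before)
```

The outer loop repeats until no rule fires, because one rewrite can expose another.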
The Structure of a Compiler (Cont’d.)
One-pass compiler
No optimization is required
Merges code generation with semantic routines and eliminates the use of an IR
Retargetable compiler
Many machine description files, e.g., gcc
Match IR against target machine patterns.
Compiler writing tools
Compiler generators or compiler-compilers
E.g., scanner and parser generators such as Lex and Yacc
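Matching IR against target machine patterns can be sketched with one table per target. The two machine descriptions and their instruction mnemonics below are invented for illustration and are far simpler than gcc's machine description files.

```python
# Hypothetical machine descriptions: IR operator -> list of instruction templates.
MACHINE_DESCRIPTIONS = {
    "two_address": {
        "add": ["mov {dst}, {a}", "add {dst}, {b}"],
        "mul": ["mov {dst}, {a}", "mul {dst}, {b}"],
    },
    "three_address": {
        "add": ["add {dst}, {a}, {b}"],
        "mul": ["mul {dst}, {a}, {b}"],
    },
}

def generate(ir, machine):
    """Match each IR tuple against the chosen machine's patterns and emit its instructions."""
    patterns = MACHINE_DESCRIPTIONS[machine]
    output = []
    for op, dst, a, b in ir:
        if op not in patterns:
            raise ValueError(f"{machine} has no pattern for {op!r}")
        for template in patterns[op]:
            output.append(template.format(dst=dst, a=a, b=b))
    return output

ir = [("mul", "t1", "b", "2"), ("add", "t2", "a", "t1")]
two = generate(ir, "two_address")
three = generate(ir, "three_address")
```

Retargeting here means adding a table, not changing `generate`; the machine dependence is localized in the description.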
Compiler Design and Programming Language Design
An interesting aspect is how programming language design and compiler design influence one another.
Programming languages that are easy to compile have many advantages
See the 2nd paragraph of page 16.
Compiler Design and Programming Language Design (Cont’d.)
Languages such as Snobol and APL are usually considered noncompilable
What attributes must be found in a programming language to allow compilation?
Can the scope and binding of each identifier reference be determined before execution begins?
Can the type of object be determined before execution begins?
Can existing program text be changed or added to during execution?
Compiler Classifications
Diagnostic compilers
Optimizing compilers
Re-targetable compiler
Report and repair compile-time errors.
Add run-time checks, e.g., array subscripts.
should be used in real world.
vs. production compiler
Localize machine dependence.
difficult to implement
less efficient object code
Integrated programming environments
integrated E-C-D
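The run-time checks a diagnostic compiler adds can be sketched as extra IR emitted around an array access. The IR text and the `subscript_error` label are hypothetical names chosen for the example.

```python
def emit_load(array, index, length, check_bounds):
    """Emit hypothetical IR for array[index], optionally guarded by a subscript check."""
    code = []
    if check_bounds:
        # Extra instructions a diagnostic compiler would insert.
        code.append(f"if {index} < 0 goto subscript_error")
        code.append(f"if {index} >= {length} goto subscript_error")
    code.append(f"t = {array}[{index}]")
    return code

diagnostic = emit_load("a", "i", 10, check_bounds=True)
production = emit_load("a", "i", 10, check_bounds=False)
```

The checked version costs two extra tests per access, which is the trade-off between a diagnostic and a production compiler.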