Motivation

Why study compilers?

• Ties lots of things you know together:
  – Theory (finite automata, grammars)
  – Data structures
  – Modularization
  – Utilization of software tools
• You might build a parser.
• The theory of computation/formal languages still applies today.
  – As long as we still program with 1-D text.
• Helps you to be a better programmer.

CST320 - Lec 1

One-dimensional Text

int x;
cin >> x;
if (x > 5)
    cout << "Hello";
else
    cout << "BOO";

The formatting has no impact on the meaning of the program.

int x;cin >> x;if(x>5) cout << "Hello"; else …


What is a translator?

• Takes input (SOURCE) and produces output (TARGET)

[Diagram labels: SOURCE, TARGET, ERROR]

Types of Target Code:

• “Pure” machine code
  » No operating system required.
  » No library routines.
  » Good for developing software for new hardware.
• “Augmented” code
  » More common
  » Executable code relies on O/S-provided support and library routines loaded as the program is prepared to execute.


Conventional Translator

skeletal source program → preprocessor → source program → compiler → target assembly program → assembler → relocatable machine code → loader / linker (with library, relocatable object files) → absolute machine code

Types of Target Code (cont.)

• Virtual code
  » Code consists entirely of “virtual” instructions.
  » Used by “Re-Targetable” compilers
    – Transporting to a new platform only requires implementing a virtual machine on the new hardware.
  » Similar to interpreters

Translator for Java

Java source code → Java compiler → Java bytecode
Java bytecode → Java interpreter
Java bytecode → Bytecode compiler → absolute machine code

Types of Translators

• Compilers
  – Conventional (textual source code)
    » Imperative, ALGOL-like languages
    » Other paradigms
• Interpreters
• Macro processors
• Text formatters
• Silicon compilers

Types of Translators (cont.)

• Visual programming language
• Interface
  – Database
  – User interface
  – Operating System


Structure of Compilers

Source Program → Lexical Analyzer (scanner) → Tokens → Syntax Analysis (Parser) → Syntactic Structure → Semantic Analysis → Intermediate Representation → Optimizer → Code Generator → Target machine code
Symbol Table

Structure of Compilers

Source Program → Lexical Analyzer (scanner) → Tokens

int x; cin >> x; if(x>5) cout << "Hello"; else cout << "BOO";

What about whitespace? Does it matter?

int x ; cin >> x ; if ( x > 5 ) cout << "Hello" ; else cout << "BOO" ;


Tokenize First or as needed?

int x; cin >> x; if(x>5) cout << "Hello"; else cout << "BOO";

Tokens = Meaningful units in a program
Value/Type pairs:
  int    datatype
  x      ID
  ;      symbol
  cin
  >>
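The value/type pairing above can be sketched as a small scanner. This is a minimal illustration, not a real C++ lexer: the function name `tokenize`, the `Token` alias, and the category names "keyword" and "number" are assumptions made for this example, and string literals such as "Hello" are not handled.

```cpp
#include <cctype>
#include <string>
#include <utility>
#include <vector>

// A token is a (value, type) pair, e.g. ("int", "datatype") or ("x", "ID").
using Token = std::pair<std::string, std::string>;

// Hypothetical scanner for this sketch; the categories are illustrative only.
std::vector<Token> tokenize(const std::string& src) {
    std::vector<Token> tokens;
    std::size_t i = 0;
    while (i < src.size()) {
        unsigned char c = static_cast<unsigned char>(src[i]);
        if (std::isspace(c)) {            // whitespace only separates tokens
            ++i;
        } else if (std::isalpha(c) || c == '_') {
            std::size_t start = i;
            while (i < src.size() &&
                   (std::isalnum(static_cast<unsigned char>(src[i])) || src[i] == '_')) {
                ++i;
            }
            std::string word = src.substr(start, i - start);
            std::string type = "ID";
            if (word == "int") type = "datatype";
            else if (word == "if" || word == "else") type = "keyword";
            tokens.emplace_back(word, type);
        } else if (std::isdigit(c)) {
            std::size_t start = i;
            while (i < src.size() && std::isdigit(static_cast<unsigned char>(src[i]))) ++i;
            tokens.emplace_back(src.substr(start, i - start), "number");
        } else if ((c == '>' || c == '<') && i + 1 < src.size() && src[i + 1] == src[i]) {
            tokens.emplace_back(src.substr(i, 2), "symbol");   // ">>" or "<<"
            i += 2;
        } else {
            tokens.emplace_back(std::string(1, src[i]), "symbol");
            ++i;
        }
    }
    return tokens;
}
```

Calling `tokenize("int x; cin >> x;")` yields seven pairs, beginning with ("int", "datatype"), ("x", "ID"), (";", "symbol"). Because whitespace is skipped, `"int x;cin>>x;"` produces the same token list.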


Tokenize First or as needed?

Array<Array<int>> someArray;
Array < Array < int > > someArray;

int >>   vs.   < int > >
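One way to see why this matters: a scanner that tokenizes everything up front usually takes the longest operator it can ("maximal munch"), so two adjacent closing angle brackets come out as a single `>>` token. The sketch below assumes a hypothetical `scanGreedy` function that handles only identifiers, `>>`, and single-character symbols. (C++ before C++11 required the space in `> >` for this reason; C++11 changed the rule so that `>>` can close two template argument lists.)

```cpp
#include <cctype>
#include <string>
#include <vector>

// Greedy ("maximal munch") scanner: always takes the longest operator it can.
// Hypothetical helper for illustration; it knows nothing about templates.
std::vector<std::string> scanGreedy(const std::string& src) {
    std::vector<std::string> out;
    std::size_t i = 0;
    while (i < src.size()) {
        unsigned char c = static_cast<unsigned char>(src[i]);
        if (std::isspace(c)) {
            ++i;
        } else if (std::isalpha(c)) {
            std::size_t start = i;
            while (i < src.size() && std::isalnum(static_cast<unsigned char>(src[i]))) ++i;
            out.push_back(src.substr(start, i - start));
        } else if (src.compare(i, 2, ">>") == 0) {
            out.push_back(">>");          // two characters, one token
            i += 2;
        } else {
            out.push_back(std::string(1, src[i]));
            ++i;
        }
    }
    return out;
}
```

With this scanner, `Array<Array<int>> someArray;` ends its type with the single token `>>`, while `Array<Array<int> > someArray;` produces two separate `>` tokens. A parser that asks for tokens as needed can instead tell the scanner which reading it expects.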


Structure of Compilers

Source Program → Lexical Analyzer (scanner) → Tokens → Syntax Analysis (Parser) → Syntactic Structure (Parse Tree)

Parse Tree (Parser)

[Parse tree figure: Program → Data Declaration → datatype, ID; leaves: int x ; cin >>]


Who is responsible for errors?

• int x$y;
• int 32xy;
• 45b
• 45ab
• x = x @ y;

Lexical Errors / Token Errors?


Who is responsible for errors?

• X = ;
• Y = x +;
• Z = [;

Syntax errors

Who is responsible for errors?

• 45ab
  – One wrong token?
  – Two tokens (45 & ab)? Is whitespace needed?
• Either way is okay.
  – Lexical analyzer can catch the illegal token (45ab).
  – Parser can catch the syntax error. Most likely 45 followed by ab will not be syntactically correct.

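The two options for 45ab can be sketched side by side. The function `scanLexeme`, its `strict` flag, and the type names "illegal" and "number" are assumptions made for this illustration; neither policy is presented as the required one.

```cpp
#include <cctype>
#include <string>
#include <utility>
#include <vector>

using Token = std::pair<std::string, std::string>;  // (value, type)

// strict == true : digits immediately followed by letters form ONE illegal token,
//                  so the lexical analyzer reports the error.
// strict == false: the scanner emits a number and then an identifier and leaves
//                  it to the parser to reject the sequence.
std::vector<Token> scanLexeme(const std::string& s, bool strict) {
    std::vector<Token> out;
    std::size_t i = 0;
    while (i < s.size()) {
        unsigned char c = static_cast<unsigned char>(s[i]);
        std::size_t start = i;
        if (std::isspace(c)) {
            ++i;
        } else if (std::isdigit(c)) {
            while (i < s.size() && std::isdigit(static_cast<unsigned char>(s[i]))) ++i;
            if (strict && i < s.size() && std::isalpha(static_cast<unsigned char>(s[i]))) {
                while (i < s.size() && std::isalnum(static_cast<unsigned char>(s[i]))) ++i;
                out.emplace_back(s.substr(start, i - start), "illegal");
            } else {
                out.emplace_back(s.substr(start, i - start), "number");
            }
        } else if (std::isalpha(c)) {
            while (i < s.size() && std::isalnum(static_cast<unsigned char>(s[i]))) ++i;
            out.emplace_back(s.substr(start, i - start), "ID");
        } else {
            out.emplace_back(std::string(1, s[i]), "symbol");
            ++i;
        }
    }
    return out;
}
```

`scanLexeme("45ab", true)` returns the single pair ("45ab", "illegal"), while `scanLexeme("45ab", false)` returns ("45", "number") followed by ("ab", "ID"), which a parser would most likely reject.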

Structure of Compilers

Source Program → Lexical Analyzer (scanner) → Tokens → Syntax Analysis (Parser) → Syntactic Structure → Semantic Analysis
Symbol Table

int x; cin >> x; if(x>5) x = "SHERRY"; else cout << "BOO";
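The assignment of "SHERRY" to x is lexically and syntactically fine; it is the semantic analyzer, consulting the symbol table, that can object. A minimal sketch of that check follows. The `SymbolTable` alias, the `checkAssignment` function, and the message wording are hypothetical, and types are compared by name only (no conversions).

```cpp
#include <map>
#include <string>

// Hypothetical symbol table: variable name -> declared type.
using SymbolTable = std::map<std::string, std::string>;

// Returns an empty string if assigning a value of `valueType` to `name`
// is acceptable, otherwise a diagnostic message.
std::string checkAssignment(const SymbolTable& table,
                            const std::string& name,
                            const std::string& valueType) {
    auto it = table.find(name);
    if (it == table.end()) {
        return "error: '" + name + "' was not declared";
    }
    if (it->second != valueType) {
        return "error: cannot assign " + valueType + " to '" + name +
               "' of type " + it->second;
    }
    return "";
}
```

After `int x;` the table holds x with type int, so `checkAssignment(table, "x", "string")` returns a diagnostic, while `checkAssignment(table, "x", "int")` returns an empty string.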


Structure of Compilers

Source Program → Lexical Analyzer (scanner) → Tokens → Syntax Analysis (Parser) → Syntactic Structure → Semantic Analysis → Intermediate Representation → Optimizer → Code Generator → Target machine code
Symbol Table


Translation Steps:

• Recognize when input is available.
• Break input into individual components.
• Merge individual pieces into meaningful structures.
• Process structures.
• Produce output.


Translation (Compilers) Steps:

• Break input into individual components. (lexical analysis)
• Merge individual pieces into meaningful structures. (parsing)
• Process structures. (semantic analysis)
• Produce output. (code generation)
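The steps above can be sketched end to end on a toy language. Everything here is an assumption made for illustration: the grammar (sums and products of integers), the `lex` function, the `Compiler` class, and the target, a hypothetical stack machine with PUSH, ADD, and MUL instructions. The semantic-analysis step is trivial in this sketch because the toy language has only one type.

```cpp
#include <cctype>
#include <stdexcept>
#include <string>
#include <utility>
#include <vector>

// Lexical analysis: split the input into number and operator tokens.
std::vector<std::string> lex(const std::string& src) {
    std::vector<std::string> tokens;
    std::size_t i = 0;
    while (i < src.size()) {
        unsigned char c = static_cast<unsigned char>(src[i]);
        if (std::isspace(c)) {
            ++i;
        } else if (std::isdigit(c)) {
            std::size_t start = i;
            while (i < src.size() && std::isdigit(static_cast<unsigned char>(src[i]))) ++i;
            tokens.push_back(src.substr(start, i - start));
        } else if (c == '+' || c == '*') {
            tokens.push_back(std::string(1, src[i]));
            ++i;
        } else {
            throw std::runtime_error("lexical error: unexpected character");
        }
    }
    return tokens;
}

// Parsing and code generation for the grammar
//   expr -> term ('+' term)*      term -> NUMBER ('*' NUMBER)*
// Instructions are emitted as each structure is recognized.
class Compiler {
public:
    explicit Compiler(std::vector<std::string> tokens) : tokens_(std::move(tokens)) {}

    std::vector<std::string> compile() {
        expr();
        if (pos_ != tokens_.size()) throw std::runtime_error("syntax error: trailing input");
        return code_;
    }

private:
    void expr() {
        term();
        while (pos_ < tokens_.size() && tokens_[pos_] == "+") {
            ++pos_;
            term();
            code_.push_back("ADD");
        }
    }
    void term() {
        number();
        while (pos_ < tokens_.size() && tokens_[pos_] == "*") {
            ++pos_;
            number();
            code_.push_back("MUL");
        }
    }
    void number() {
        if (pos_ >= tokens_.size() ||
            !std::isdigit(static_cast<unsigned char>(tokens_[pos_][0]))) {
            throw std::runtime_error("syntax error: number expected");
        }
        code_.push_back("PUSH " + tokens_[pos_]);
        ++pos_;
    }

    std::vector<std::string> tokens_;
    std::size_t pos_ = 0;
    std::vector<std::string> code_;
};
```

For the input `2 + 3 * 4`, `Compiler(lex("2 + 3 * 4")).compile()` returns PUSH 2, PUSH 3, PUSH 4, MUL, ADD, so the multiplication is performed before the addition.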

Compilers

• Two major tasks:
  – Analysis of source
  – Synthesis of target
• Syntax-directed translation
  – Compilation process driven by syntactic structure of the source being translated

Interpreters

• Execute the source program without explicitly translating it to target code.
• Control and memory management reside in the interpreter, not the user program.
• Allow:
  – Modification of program as it executes.
  – Dynamic typing of variables
  – Portability
• Huge overhead (time & space)
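To make the contrast with a compiler concrete, here is a minimal sketch of an interpreter for a toy language of integer sums and products. It analyzes the source and computes the result in the same pass, never producing target code. The `Interpreter` class and its grammar are hypothetical choices for this illustration.

```cpp
#include <cctype>
#include <stdexcept>
#include <string>
#include <utility>

// Hypothetical mini-interpreter for
//   expr -> term ('+' term)*      term -> NUMBER ('*' NUMBER)*
// Values are computed while the source is being analyzed.
class Interpreter {
public:
    explicit Interpreter(std::string src) : src_(std::move(src)) {}

    long run() {
        long value = expr();
        skipSpaces();
        if (pos_ != src_.size()) throw std::runtime_error("syntax error: trailing input");
        return value;
    }

private:
    void skipSpaces() {
        while (pos_ < src_.size() && std::isspace(static_cast<unsigned char>(src_[pos_]))) ++pos_;
    }
    bool accept(char op) {
        skipSpaces();
        if (pos_ < src_.size() && src_[pos_] == op) {
            ++pos_;
            return true;
        }
        return false;
    }
    long number() {
        skipSpaces();
        if (pos_ >= src_.size() || !std::isdigit(static_cast<unsigned char>(src_[pos_]))) {
            throw std::runtime_error("syntax error: number expected");
        }
        long value = 0;
        while (pos_ < src_.size() && std::isdigit(static_cast<unsigned char>(src_[pos_]))) {
            value = value * 10 + (src_[pos_] - '0');
            ++pos_;
        }
        return value;
    }
    long term() {
        long value = number();
        while (accept('*')) value *= number();
        return value;
    }
    long expr() {
        long value = term();
        while (accept('+')) value += term();
        return value;
    }

    std::string src_;
    std::size_t pos_ = 0;
};
```

`Interpreter("2 + 3 * 4").run()` returns 14. The source text must be re-analyzed on every run, which is one source of the overhead listed above.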

Structure of Interpreters

[Diagram: Source Program and Data → Interpreter → Program Output]

Misc. Compiler Discussions

• History of Modern Compilers
• Front and Back ends
• One pass vs. Multiple passes
• Compiler Construction Tools
  – Compiler-Compilers, Compiler-generators, Translator writing Systems
    » Scanner generator
    » Parser generator
    » Syntax-directed engines
    » Automatic code generator
    » Dataflow engines