
Compiler Principles
Prof. Lu Dong-ming
Textbook: Modern Compiler Implementation in C
Author: Andrew W. Appel
Review
1. Introduction
• The phases of a typical compiler
– From Source Program to Machine language
– Description of each compilation phase
• Concepts of module, interface and tools
– Phase – Module
– Abstract Syntax, IR tree, Tokens
– Context free grammar, regular expression
• Data structures for Tree Language
– Intermediate representations of the program
2. Lexical Analysis
• Lexical Tokens
– Concept of Token
– Token type, Reserved word
– Semantic values attached to Token
• Regular Expression
– Language, string, symbol, alphabet
– The notation of regular expression
– Alternation, concatenation, epsilon, repetition
• Finite Automata
– Definition: states (start, final), edges, symbol
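The idea of a token carrying a type plus an attached semantic value can be sketched in C as follows. This is a minimal illustration, not the textbook's lexer: the names `Token`, `TokenType`, and `next_token`, the single reserved word `if`, and the 32-byte identifier buffer are all assumptions made for the example, and integer overflow is not handled.

```c
#include <assert.h>
#include <ctype.h>
#include <string.h>

/* Token types for a toy language: one reserved word, identifiers, integers. */
typedef enum { TOK_IF, TOK_ID, TOK_NUM, TOK_EOF, TOK_ERROR } TokenType;

typedef struct {
    TokenType type;
    char text[32];  /* semantic value of TOK_ID: the identifier's spelling */
    int  num;       /* semantic value of TOK_NUM: the integer denoted */
} Token;

/* Scan one token starting at *src and advance *src past it. */
Token next_token(const char **src) {
    assert(src != NULL && *src != NULL);
    Token t = { TOK_EOF, "", 0 };
    const char *p = *src;

    while (isspace((unsigned char)*p)) p++;

    if (*p == '\0') {
        t.type = TOK_EOF;
    } else if (isalpha((unsigned char)*p)) {
        size_t n = 0;
        while (isalnum((unsigned char)*p)) {
            if (n + 1 < sizeof t.text) t.text[n++] = *p;  /* truncate long names */
            p++;
        }
        t.text[n] = '\0';
        /* A reserved word matches the identifier pattern, so it is checked here. */
        t.type = (strcmp(t.text, "if") == 0) ? TOK_IF : TOK_ID;
    } else if (isdigit((unsigned char)*p)) {
        while (isdigit((unsigned char)*p)) {
            t.num = t.num * 10 + (*p - '0');
            p++;
        }
        t.type = TOK_NUM;
    } else {
        t.type = TOK_ERROR;
        p++;  /* skip the offending character */
    }
    *src = p;
    return t;
}
```

Calling `next_token` repeatedly on the string `if x1 42` would yield a reserved-word token, an identifier whose semantic value is the spelling `x1`, and a number whose semantic value is 42.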
• DFA (deterministic finite automaton)
– Definition
– Language of a DFA
• NFA (nondeterministic finite automaton)
– Definition
• Conversion
– Regular expression to an NFA
– NFA to DFA
– Algorithm to minimize the DFA
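The definition of a DFA and of the language it accepts can be made concrete with a small table-driven recognizer. The sketch below is an assumed example, not taken from the slides: it accepts identifiers (a letter followed by letters or digits) and unsigned integers, and the state names, character classes, and function names are hypothetical.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

enum { S_START, S_ID, S_NUM, S_DEAD, N_STATES };   /* DFA states */
enum { C_LETTER, C_DIGIT, C_OTHER, N_CLASSES };    /* input symbol classes */

/* Transition table: exactly one edge per (state, symbol class) pair,
 * which is what makes the automaton deterministic. */
static const int delta[N_STATES][N_CLASSES] = {
    /* S_START */ { S_ID,   S_NUM,  S_DEAD },
    /* S_ID    */ { S_ID,   S_ID,   S_DEAD },
    /* S_NUM   */ { S_DEAD, S_NUM,  S_DEAD },
    /* S_DEAD  */ { S_DEAD, S_DEAD, S_DEAD },
};

static int char_class(char c) {
    if (isalpha((unsigned char)c)) return C_LETTER;
    if (isdigit((unsigned char)c)) return C_DIGIT;
    return C_OTHER;
}

/* Run the DFA over the whole string and return the state it ends in. */
int dfa_run(const char *s) {
    assert(s != NULL);
    int state = S_START;
    for (; *s != '\0'; s++)
        state = delta[state][char_class(*s)];
    return state;
}

/* A string is in the language if the run ends in a final state. */
int dfa_accepts(const char *s) {
    int state = dfa_run(s);
    return state == S_ID || state == S_NUM;
}
```

Here `S_ID` and `S_NUM` are the final states; the start state is not final, so the empty string is rejected.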
3. Parsing
• Context-Free Grammars
– Rules, terminal, non-terminal, start symbol
• Derivation
– Definition
– Leftmost, rightmost
– Parse Trees
• Ambiguous Grammar
– Definition
• Predictive Parsing
– Recursive descent parser
– Conflict – first terminal symbol
• First and Follow Sets
– Definition
– The computation
– nullable
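The computation of nullable, FIRST, and FOLLOW is usually presented as an iteration to a fixed point. The sketch below applies it to a small grammar assumed for illustration (Z → d | X Y Z, Y → ε | c, X → Y | a). Sets of terminals are stored as bit masks; the symbol encoding and all identifiers are hypothetical, and no end-of-file marker is added to the start symbol's FOLLOW set.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Terminals come first so that a terminal's number is also its bit index. */
enum { T_a, T_c, T_d, N_X, N_Y, N_Z, N_SYMS };
#define N_TERMS 3
#define MAX_RHS 3
#define BIT(t) (1u << (t))

typedef struct { int lhs; int len; int rhs[MAX_RHS]; } Prod;

static const Prod prods[] = {
    { N_Z, 1, { T_d } },
    { N_Z, 3, { N_X, N_Y, N_Z } },
    { N_Y, 0, { 0 } },            /* Y -> epsilon (rhs unused) */
    { N_Y, 1, { T_c } },
    { N_X, 1, { N_Y } },
    { N_X, 1, { T_a } },
};
#define N_PRODS (sizeof prods / sizeof prods[0])

bool     nullable[N_SYMS];
unsigned first_set[N_SYMS];
unsigned follow_set[N_SYMS];

/* True if every symbol in rhs[from .. to-1] is nullable (vacuously true if empty). */
static bool all_nullable(const Prod *p, int from, int to) {
    assert(from >= 0 && to <= p->len);
    for (int i = from; i < to; i++)
        if (!nullable[p->rhs[i]]) return false;
    return true;
}

/* Union bits into *set; report whether the set grew. */
static bool add(unsigned *set, unsigned bits) {
    unsigned old = *set;
    *set |= bits;
    return *set != old;
}

void compute_sets(void) {
    for (int s = 0; s < N_SYMS; s++) {
        nullable[s] = false;
        first_set[s] = (s < N_TERMS) ? BIT(s) : 0;  /* FIRST(t) = {t} */
        follow_set[s] = 0;
    }
    bool changed = true;
    while (changed) {                                /* iterate to a fixed point */
        changed = false;
        for (size_t k = 0; k < N_PRODS; k++) {
            const Prod *p = &prods[k];
            if (!nullable[p->lhs] && all_nullable(p, 0, p->len)) {
                nullable[p->lhs] = true;
                changed = true;
            }
            for (int i = 0; i < p->len; i++) {
                int yi = p->rhs[i];
                if (all_nullable(p, 0, i) && add(&first_set[p->lhs], first_set[yi]))
                    changed = true;
                if (all_nullable(p, i + 1, p->len) && add(&follow_set[yi], follow_set[p->lhs]))
                    changed = true;
                for (int j = i + 1; j < p->len; j++)
                    if (all_nullable(p, i + 1, j) && add(&follow_set[yi], first_set[p->rhs[j]]))
                        changed = true;
            }
        }
    }
}
```

For this grammar the iteration finds that X and Y are nullable and Z is not, that FIRST(Z) contains all three terminals, and that FOLLOW(Z) stays empty.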
• Constructing a predictive parser
– Predictive parsing table
– Concept of LL(1) grammar
– Eliminating Left Recursion
– Left factoring
– Error Recovery
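A recursive descent parser for a grammar whose left recursion has been eliminated might look like the sketch below. It assumes the usual expression grammar rewritten as E → T E', E' → + T E' | - T E' | ε, T → F T', T' → * F T' | ε, F → num | ( E ), and it evaluates the expression while parsing. The function names, the global cursor, and the simple error flag are assumptions for the example; there is no error recovery beyond reporting failure.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

static const char *cur;  /* next unread input character */
static int ok;           /* cleared when a syntax error is seen */

static int parse_E(void);

static void skip_blanks(void) { while (*cur == ' ') cur++; }

/* F -> num | ( E ) */
static int parse_F(void) {
    skip_blanks();
    if (isdigit((unsigned char)*cur)) {
        int v = 0;
        while (isdigit((unsigned char)*cur)) { v = v * 10 + (*cur - '0'); cur++; }
        return v;
    }
    if (*cur == '(') {
        cur++;
        int v = parse_E();
        skip_blanks();
        if (*cur == ')') cur++; else ok = 0;
        return v;
    }
    ok = 0;  /* no production of F starts with this token */
    return 0;
}

/* T' -> * F T' | epsilon.  `left` carries the value parsed so far,
 * which keeps the operator left-associative. */
static int parse_Tprime(int left) {
    skip_blanks();
    if (*cur == '*') { cur++; int r = parse_F(); return parse_Tprime(left * r); }
    return left;  /* epsilon production */
}

/* T -> F T' */
static int parse_T(void) { int f = parse_F(); return parse_Tprime(f); }

/* E' -> + T E' | - T E' | epsilon */
static int parse_Eprime(int left) {
    skip_blanks();
    if (*cur == '+') { cur++; int r = parse_T(); return parse_Eprime(left + r); }
    if (*cur == '-') { cur++; int r = parse_T(); return parse_Eprime(left - r); }
    return left;  /* epsilon production */
}

/* E -> T E' */
static int parse_E(void) { int t = parse_T(); return parse_Eprime(t); }

/* Parse the whole string; return 1 and store the value on success, else 0. */
int parse_expr(const char *s, int *value) {
    assert(s != NULL && value != NULL);
    cur = s;
    ok = 1;
    int v = parse_E();
    skip_blanks();
    if (*cur != '\0') ok = 0;  /* trailing input */
    *value = v;
    return ok;
}
```

Each parsing function chooses its production by looking only at the next input character, which is the predictive (one-token lookahead) property the LL(1) grammar guarantees.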
• LR Parsing
– Definition of LR(k): stack, input, first k tokens of lookahead
– Contents of the stack
– Two kinds of actions: shift, reduce
• LR Parsing Engine
– Using states
– Transition table
– The elements in the transition table
• sn (shift); gn (goto); rk (reduce); a (accept); error
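The LR parsing engine, with its state stack and its table of shift, goto, reduce, and accept entries, can be sketched for a tiny grammar. The grammar assumed here is S' → S $, rule 1: S → ( S ), rule 2: S → x, and the table was built by hand as an LR(0) table with states numbered from 0. The encoding, the names, and the fixed stack size are assumptions for the example.

```c
#include <assert.h>
#include <stddef.h>

enum { T_LP, T_RP, T_X, T_EOF, N_TOKENS };   /* terminals: ( ) x $ */
enum { N_STATES = 6, STACK_MAX = 64 };

typedef enum { ERR = 0, SHIFT, REDUCE, ACCEPT } Kind;
typedef struct { Kind kind; int n; } Action;  /* n: target state or rule number */

/* Action table: entries left out are ERR because ERR is 0. */
static const Action action[N_STATES][N_TOKENS] = {
    [0] = { [T_LP] = { SHIFT, 2 }, [T_X] = { SHIFT, 3 } },
    [1] = { [T_EOF] = { ACCEPT, 0 } },
    [2] = { [T_LP] = { SHIFT, 2 }, [T_X] = { SHIFT, 3 } },
    [3] = { { REDUCE, 2 }, { REDUCE, 2 }, { REDUCE, 2 }, { REDUCE, 2 } },
    [4] = { [T_RP] = { SHIFT, 5 } },
    [5] = { { REDUCE, 1 }, { REDUCE, 1 }, { REDUCE, 1 }, { REDUCE, 1 } },
};

/* Goto table for the only nonterminal, S; -1 means no entry. */
static const int goto_S[N_STATES] = { 1, -1, 4, -1, -1, -1 };

/* Right-hand-side length of each rule (index 0 unused). */
static const int rule_len[] = { 0, 3, 1 };

static int token_of(char c) {
    switch (c) {
    case '(':  return T_LP;
    case ')':  return T_RP;
    case 'x':  return T_X;
    case '\0': return T_EOF;
    default:   return -1;
    }
}

/* Return 1 if the input is a sentence of the grammar, 0 otherwise. */
int lr_parse(const char *input) {
    assert(input != NULL);
    int stack[STACK_MAX];
    int top = 0;
    stack[0] = 0;                               /* start state */
    const char *p = input;
    for (;;) {
        int tok = token_of(*p);
        if (tok < 0) return 0;
        Action act = action[stack[top]][tok];
        switch (act.kind) {
        case SHIFT:                             /* sn: push state n, consume token */
            if (top + 1 >= STACK_MAX) return 0;
            stack[++top] = act.n;
            p++;
            break;
        case REDUCE: {                          /* rk: pop the rhs, then goto on S */
            top -= rule_len[act.n];
            int next = goto_S[stack[top]];
            if (next < 0) return 0;
            stack[++top] = next;
            break;
        }
        case ACCEPT:
            return 1;
        default:
            return 0;                           /* empty table entry: error */
        }
    }
}
```

On the input `(x)` the engine shifts `(` and `x`, reduces by rule 2, shifts `)`, reduces by rule 1, and accepts when it sees the end of input in state 1.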
• LR(0) Parser Generation
– LR(0) item – A state
– The basic operations
• Closure (I); goto(I, X)
– LR(0) parsing table construction
• Shift action; Goto action; Reduce action
• SLR Parser Generation
– SLR stands for Simple LR
– Reduce actions – Follow set
• LR(1) items and LR Parsing table
– Components of LR(1) item
– Closure(I), Goto(I, X)
– LR(1) parsing table and LALR(1) parsing table
• Hierarchy of Grammar Classes
• Using Parser Generator
– Yacc specification; Resolving conflict
• Recover using the error symbol
4. Abstract Syntax
• Concepts of semantic actions
– Recursive Descent
• Return by Parsing functions
– Yacc-generated parsers
• Rule – annotated with a semantic action
• Abstract Parse Trees
– Concrete parse tree – concrete syntax
– Abstract syntax tree
– Position
– Abstract syntax constructor
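An abstract syntax tree with constructor functions and a source position in every node can be sketched as follows. The naming loosely follows the `A_` prefix convention of the textbook, but the node kinds, the integer position, and the `A_eval` helper are simplifications assumed for this example.

```c
#include <assert.h>
#include <stdlib.h>

typedef struct A_exp_ *A_exp;
typedef enum { A_plus, A_times } A_oper;

struct A_exp_ {
    enum { A_numExp, A_opExp } kind;
    int pos;                       /* source position, kept for later error messages */
    union {
        int num;
        struct { A_exp left; A_oper oper; A_exp right; } op;
    } u;
};

/* Allocate a node or abort; a sketch-level policy for out-of-memory. */
static A_exp new_exp(void) {
    A_exp e = malloc(sizeof *e);
    if (e == NULL) abort();
    return e;
}

/* Constructor for a number leaf. */
A_exp A_NumExp(int pos, int num) {
    A_exp e = new_exp();
    e->kind = A_numExp;
    e->pos = pos;
    e->u.num = num;
    return e;
}

/* Constructor for a binary operator node. */
A_exp A_OpExp(int pos, A_exp left, A_oper oper, A_exp right) {
    assert(left != NULL && right != NULL);
    A_exp e = new_exp();
    e->kind = A_opExp;
    e->pos = pos;
    e->u.op.left = left;
    e->u.op.oper = oper;
    e->u.op.right = right;
    return e;
}

/* A later phase walks the abstract tree; evaluation stands in for that here. */
int A_eval(A_exp e) {
    assert(e != NULL);
    if (e->kind == A_numExp) return e->u.num;
    int l = A_eval(e->u.op.left);
    int r = A_eval(e->u.op.right);
    return (e->u.op.oper == A_plus) ? l + r : l * r;
}
```

A semantic action for a rule such as exp → exp + exp would call `A_OpExp` with the position of the operator and the two subtrees, so parentheses and other concrete syntax never appear in the tree.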
5. Semantic Analysis
• Symbol Table
– Concepts: scope, bindings
– Functional and imperative styles
– Multiple symbol tables
• Efficient imperative symbol tables
– Hash tables with external chaining
• Efficient functional symbol tables
• Bindings for the Tiger compiler
– Type environment
– Value environment
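An imperative symbol table built on a hash table with external chaining can be sketched like this. New bindings go at the head of a bucket's chain, so an inner-scope binding hides an outer one and removing it restores the old binding. The table size, the hash function, and the `sym_` names are assumptions for the example, and `sym_pop` assumes bindings are removed in the reverse of the order they were inserted.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define TABLE_SIZE 109

struct bucket {
    const char *key;
    void *binding;
    struct bucket *next;   /* external chain of entries with the same hash index */
};

static struct bucket *table[TABLE_SIZE];

static unsigned hash(const char *s) {
    unsigned h = 0;
    for (; *s != '\0'; s++)
        h = h * 65599u + (unsigned char)*s;
    return h;
}

/* Enter a binding; it shadows any earlier binding for the same key. */
void sym_insert(const char *key, void *binding) {
    unsigned index = hash(key) % TABLE_SIZE;
    struct bucket *b = malloc(sizeof *b);
    if (b == NULL) abort();
    b->key = key;
    b->binding = binding;
    b->next = table[index];   /* push on the front of the chain */
    table[index] = b;
}

/* Return the most recent binding for key, or NULL if there is none. */
void *sym_lookup(const char *key) {
    unsigned index = hash(key) % TABLE_SIZE;
    for (struct bucket *b = table[index]; b != NULL; b = b->next)
        if (strcmp(b->key, key) == 0)
            return b->binding;
    return NULL;
}

/* Undo the most recent insertion of key (stack-like scope exit). */
void sym_pop(const char *key) {
    unsigned index = hash(key) % TABLE_SIZE;
    struct bucket *b = table[index];
    assert(b != NULL && strcmp(b->key, key) == 0);
    table[index] = b->next;
    free(b);
}
```

Entering a scope corresponds to a series of `sym_insert` calls and leaving it to the matching `sym_pop` calls in reverse order.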
• Type-Checking expressions
– Variables
– Subscripts
– Fields
• Type-Checking Declarations
– Variable
– Type
– Function
– Recursive declarations
6. Activation Record
• Stack Frame
– The typical stack frame layout
• Frame pointer and Stack pointer
• Parameter Passing
– In registers or in the stack frame
– Incoming and outgoing arguments
• Nested function declaration
– Static Links
– Display
• Frames in The Tiger Compiler
– Abstraction : frame interface
• Representation of frame descriptions
– Information held in the frame data structure
• Local Variables
– In the frame, or in a register when escape is false
– Calculating escapes
• Temporaries and Labels
– Determine which register
– Determine location of the procedure body
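The choice between placing a local variable in the frame or in a register, driven by its escape flag, can be sketched as an allocation function on an abstract frame. The function name `F_allocLocal` follows the textbook's frame interface, but the struct layouts, the 4-byte word size, the downward-growing offsets, and the temporary numbering starting at 100 are assumptions for this example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

enum { WORD_SIZE = 4 };          /* assumed machine word size in bytes */

/* Where a variable lives: at an offset from the frame pointer, or in a temporary. */
typedef struct {
    enum { IN_FRAME, IN_REG } kind;
    union { int offset; int temp; } u;
} F_access;

/* Only the part of the frame description needed here: bytes of locals so far. */
typedef struct { int local_bytes; } F_frame;

static int next_temp = 100;      /* hypothetical numbering of fresh temporaries */

/* Allocate a new local in frame f.  An escaping variable must live in memory;
 * a non-escaping one may be kept in a temporary that later becomes a register. */
F_access F_allocLocal(F_frame *f, bool escape) {
    assert(f != NULL);
    F_access a;
    if (escape) {
        f->local_bytes += WORD_SIZE;
        a.kind = IN_FRAME;
        a.u.offset = -f->local_bytes;   /* locals grow downward from the frame pointer */
    } else {
        a.kind = IN_REG;
        a.u.temp = next_temp++;
    }
    return a;
}
```

With these assumptions, the first two escaping locals of a frame land at offsets -4 and -8, while a non-escaping local consumes no frame space.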
• Two Layers of Abstraction
– translate.h : semant – translate
– frame.h : translate – μframe
– temp.h : translate – temp
• Managing Static Links
– The static link is an “extra parameter” that does escape
– Passed in register and stored in frame
• Keeping track of levels
7. Translation to Intermediate Code
• Intermediate representation
– Several qualities
– Tree representation
– Description of the tree operators
• Translation into trees
– Kinds of expressions
• Ex : expression; Nx : no result; Cx : conditional
– Patchlist
– Conversion functions
• Simple variables
– Declared in the current procedure’s stack frame
– Following static links
• Array Variables
– Treatment differs between languages
– Pascal: an array variable stands for the array's contents
– C: an array variable stands for a pointer
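Access to a simple variable, either in the current frame or reached by following static links, can be sketched as the construction of a small IR tree. The node kinds below (CONST, TEMP fp, +, MEM) are a cut-down subset assumed for the example; `simple_var`, `T_show`, and the convention that the static link sits at a fixed offset `sl_offset` from the frame pointer are hypothetical.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

typedef struct T_exp_ *T_exp;
struct T_exp_ {
    enum { K_CONST, K_FP, K_PLUS, K_MEM } kind;
    int value;           /* K_CONST only */
    T_exp left, right;   /* K_PLUS uses both; K_MEM uses left */
};

static T_exp node(int kind, int value, T_exp left, T_exp right) {
    T_exp e = malloc(sizeof *e);
    if (e == NULL) abort();
    e->kind = kind; e->value = value; e->left = left; e->right = right;
    return e;
}

T_exp T_Const(int v)          { return node(K_CONST, v, NULL, NULL); }
T_exp T_Fp(void)              { return node(K_FP, 0, NULL, NULL); }
T_exp T_Plus(T_exp l, T_exp r){ return node(K_PLUS, 0, l, r); }
T_exp T_Mem(T_exp addr)       { return node(K_MEM, 0, addr, NULL); }

/* Build the tree for a variable at `offset` in a frame `levels_up` static
 * links away from the current one (0 means the current frame). */
T_exp simple_var(int offset, int levels_up, int sl_offset) {
    assert(levels_up >= 0);
    T_exp frame = T_Fp();
    for (int i = 0; i < levels_up; i++)
        frame = T_Mem(T_Plus(frame, T_Const(sl_offset)));   /* follow one static link */
    return T_Mem(T_Plus(frame, T_Const(offset)));
}

/* Append s to the NUL-terminated string in buf, truncating at size. */
static void append(char *buf, size_t size, const char *s) {
    size_t n = strlen(buf);
    snprintf(buf + n, size - n, "%s", s);
}

/* Append a textual form of the tree to buf (buf must start NUL-terminated). */
void T_show(T_exp e, char *buf, size_t size) {
    char num[32];
    switch (e->kind) {
    case K_CONST:
        snprintf(num, sizeof num, "CONST %d", e->value);
        append(buf, size, num);
        break;
    case K_FP:
        append(buf, size, "TEMP fp");
        break;
    case K_PLUS:
        append(buf, size, "+(");
        T_show(e->left, buf, size);
        append(buf, size, ", ");
        T_show(e->right, buf, size);
        append(buf, size, ")");
        break;
    case K_MEM:
        append(buf, size, "MEM(");
        T_show(e->left, buf, size);
        append(buf, size, ")");
        break;
    }
}
```

A variable in the current frame yields a single MEM node, and each enclosing level adds one more MEM that fetches the static link before the final offset is applied.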
• Structured L-values
– Concept of l-value, r-value
– Scalar
• Subscripting and Field Selection
– Computing address
– L-value and MEM nodes
– Safety checking
• Conditionals
– If-expression
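The address computation for a subscript, together with its safety check, comes down to simple arithmetic. The sketch below assumes zero-based arrays whose elements each occupy one word; the function name, the use of plain `long` values for addresses, and the out-parameter are assumptions for the example.

```c
#include <assert.h>
#include <stddef.h>

/* Compute the address of a[index] for an array starting at `base` with
 * `length` elements of `word_size` bytes each.  Returns 1 and stores the
 * address if the index is in bounds, 0 otherwise (the safety check). */
int subscript_address(long base, long index, long length, long word_size, long *addr) {
    assert(addr != NULL && length >= 0 && word_size > 0);
    if (index < 0 || index >= length)
        return 0;                         /* a compiler would branch to an error routine */
    *addr = base + index * word_size;     /* the l-value: MEM of this address holds a[index] */
    return 1;
}
```

Field selection works the same way, except that the offset is a compile-time constant determined by the field's position in the record.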
• String, Record and Array Creation
– Allocated on the Heap
• While loops, for loops, and function calls
• Function Definition
– Prologue
– Body
– Epilogue
– Fragments
The End of Review