Compiler Principle
Prof. Lu Dong-ming
Text Book: Modern Compiler Implementation in C
Author: Andrew W. Appel

Review

1. Introduction
• The phases of a typical compiler
  – From source program to machine language
  – Description of each compiling phase
• Concepts of module, interface and tools
  – Phase
  – Module
  – Abstract syntax, IR tree, tokens
  – Context-free grammar, regular expression
• Data structures for tree languages
  – Intermediate representations of the program

2. Lexical Analysis
• Lexical Tokens
  – Concept of token
  – Token type, reserved word
  – Semantic values attached to tokens
• Regular Expression
  – Language, string, symbol, alphabet
  – The notation of regular expressions
  – Alternation, concatenation, epsilon, repetition
• Finite Automata
  – Definition: states (start, final), edges, symbols
• DFA (deterministic finite automaton)
  – Definition
  – Language of a DFA
• NFA (nondeterministic finite automaton)
  – Definition
• Conversion
  – Regular expression to an NFA
  – NFA to DFA
  – Algorithm to minimize the DFA

3. Parsing
• Context-Free Grammars
  – Rules, terminal, non-terminal, start symbol
• Derivation
  – Definition
  – Leftmost, rightmost
  – Parse trees
• Ambiguous Grammar
  – Definition
• Predictive Parsing
  – Recursive descent parser
  – Conflict
  – First terminal symbol
• First and Follow Sets
  – Definition
  – The computation
  – Nullable
• Constructing a predictive parser
  – Predictive parsing table
  – Concept of LL(1) grammar
  – Eliminating left recursion
  – Left factoring
  – Error recovery
• LR Parsing
  – Definition of LR(k): stack, input, first k tokens
  – Contents of the stack
  – Two kinds of actions: shift, reduce
• LR Parsing Engine
  – Using states
  – Transition table
  – The elements in the transition table
    • sn; gn; rk; a; Error
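
The LR parsing engine items above (a stack of states, a transition table whose entries are sn, gn, rk, a, or error) can be sketched as a small table-driven parser. This is only an illustrative sketch, written in Python for brevity rather than the textbook's C. The toy grammar E → E + n | n, the hand-built SLR table, and the names RULES, TABLE, and parse are assumptions made for this example and do not come from the course material.

```python
# Toy grammar (assumed for illustration):
#   rule 1: E -> E + n
#   rule 2: E -> n
# rule number -> (left-hand side, length of right-hand side)
RULES = {1: ("E", 3), 2: ("E", 1)}

# Hand-built SLR table: (state, symbol) -> entry.
#   ("s", n): shift and go to state n      ("g", n): goto state n
#   ("r", k): reduce by rule k             ("a",):   accept
# A missing entry means "error".
TABLE = {
    (1, "n"): ("s", 2), (1, "E"): ("g", 3),
    (2, "+"): ("r", 2), (2, "$"): ("r", 2),
    (3, "+"): ("s", 4), (3, "$"): ("a",),
    (4, "n"): ("s", 5),
    (5, "+"): ("r", 1), (5, "$"): ("r", 1),
}


def parse(tokens):
    """Return True if the token list is accepted, False on a table error."""
    stack = [1]                      # stack of states; state 1 is the start state
    tokens = list(tokens) + ["$"]    # "$" marks end of input
    pos = 0
    while True:
        entry = TABLE.get((stack[-1], tokens[pos]))
        if entry is None or entry[0] == "g":
            return False             # error entry (goto entries are not token actions)
        if entry[0] == "s":
            stack.append(entry[1])   # shift: push the new state, consume the token
            pos += 1
        elif entry[0] == "r":
            lhs, length = RULES[entry[1]]
            del stack[-length:]      # reduce: pop one state per right-hand-side symbol
            stack.append(TABLE[(stack[-1], lhs)][1])  # then follow the goto entry
        else:
            return True              # accept
```

With this table, `parse(["n", "+", "n"])` returns True and `parse(["n", "+"])` returns False, because state 4 has no entry for the end-of-input marker.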
• LR(0) Parser Generation
  – LR(0) item
  – A state
  – The basic operations
    • Closure(I); Goto(I, X)
  – LR(0) parsing table construction
    • Shift action; Goto action; Reduce action
• SLR Parser Generation
  – SLR stands for Simple LR
  – Reduce actions
  – Follow set
• LR(1) items and LR Parsing table
  – Components of LR(1) item
  – Closure(I), Goto(I, X)
  – LR(1) parsing table and LALR(1) parsing table
• Hierarchy of Grammar Classes
• Using Parser Generator
  – Yacc specification; resolving conflicts
• Recovery using the error symbol

4. Abstract Syntax
• Concepts of semantic actions
  – Recursive Descent
    • Returned by parsing functions
  – Yacc-Generated Parsers
    • Rule annotated with a semantic action
• Abstract Parse Trees
  – Concrete parse tree: concrete syntax
  – Abstract syntax tree
  – Position
  – Abstract syntax constructor

5. Semantic Analysis
• Symbol Table
  – Concepts: scope, bindings
  – The functional and imperative styles
  – Multiple symbol tables
• Efficient imperative symbol tables
  – Hash tables with external chaining
• Efficient functional symbol tables
• Bindings for the Tiger compiler
  – Type environment
  – Value environment
• Type-Checking Expressions
  – Variables
  – Subscripts
  – Fields
• Type-Checking Declarations
  – Variable
  – Type
  – Function
  – Recursive

6. Activation Record
• Stack Frame
  – The typical stack frame layout
• Frame pointer and Stack pointer
• Parameter Passing
  – In registers or in the stack frame
  – Incoming and outgoing arguments
• Nested function declaration
  – Static Links
  – Display
• Frames in the Tiger Compiler
  – Abstraction: frame interface
• Representation of frame descriptions
  – Information held in the frame data structure
• Local Variables
  – In the frame, or in a register (escape = false)
  – Calculating escapes
• Temporaries and Labels
  – Determine which register
  – Determine location of the procedure body
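
The "Calculating escapes" item under Local Variables can be illustrated with a small traversal: a variable escapes when it is used inside a function nested more deeply than the one that declares it, so it must live in the frame rather than in a register. The sketch below is a simplified illustration in Python, not the Tiger compiler's own code. The tuple-based syntax tree, the node kinds ("num", "var", "let", "fun", "seq"), the function name find_escapes, and the assumption that all variable names are distinct are hypothetical choices made for this example.

```python
def find_escapes(expr):
    """Map each declared variable name to True if it escapes, else False.

    Assumed tree shapes (hypothetical, for illustration only):
      ("num", value)
      ("var", name)                        use of a variable
      ("let", name, init, body)            declare name, visible in body
      ("fun", fname, params, fbody, rest)  nested function, then rest
      ("seq", [expr, ...])
    Variable names are assumed to be distinct across the program.
    """
    escapes = {}

    def walk(e, env, depth):
        # env maps a variable name to the function-nesting depth of its declaration
        kind = e[0]
        if kind == "num":
            return
        if kind == "var":
            if depth > env[e[1]]:
                escapes[e[1]] = True     # used from a more deeply nested function
        elif kind == "let":
            _, name, init, body = e
            walk(init, env, depth)
            escapes[name] = False        # assume "does not escape" until shown otherwise
            walk(body, {**env, name: depth}, depth)
        elif kind == "fun":
            _, _fname, params, fbody, rest = e
            inner = dict(env)
            for p in params:
                escapes[p] = False
                inner[p] = depth + 1     # parameters belong to the nested function
            walk(fbody, inner, depth + 1)
            walk(rest, env, depth)
        elif kind == "seq":
            for sub in e[1]:
                walk(sub, env, depth)
        else:
            raise ValueError(f"unknown node kind: {kind!r}")

    walk(expr, {}, 0)
    return escapes


# a is used inside f (one level deeper), so it escapes; b and x do not.
program = ("let", "a", ("num", 1),
           ("let", "b", ("num", 2),
            ("fun", "f", ["x"],
             ("seq", [("var", "a"), ("var", "x")]),
             ("var", "b"))))
result = find_escapes(program)
```

Here `result` is `{"a": True, "b": False, "x": False}`: only `a` needs a slot in the frame, while `b` and `x` could be kept in registers.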
• Two Layers of Abstraction
  – translate.h: interface between semant and translate
  – frame.h: interface between translate and μframe
  – temp.h: interface between translate and temp
• Managing Static Links
  – “extra parameter” does escape
  – Passed in a register and stored in the frame
• Keeping track of levels

7. Translation to Intermediate Code
• Intermediate representation
  – Several qualities
  – Representation tree
  – Description of tree operators
• Translation into trees
  – Kinds of expression
    • Ex: expression; Nx: no result; Cx: conditional
  – Patch list
  – Conversion functions
• Simple variables
  – Declared in the current procedure’s stack frame
  – Following static links
• Array Variables
  – Different for different languages
  – Pascal: stands for contents
  – C: stands for pointer
• Structured L-values
  – Concept of l-value, r-value
  – Scalar
• Subscripting and Field Selection
  – Computing address
  – L-value and MEM nodes
  – Safety checking
• Conditionals
  – If-expression
• String, Record and Array Creation
  – Allocated on the heap
• While, for loop and Function call
• Function Definition
  – Prologue
  – Body
  – Epilogue
  – Fragments

The End of Review