IA64 Compiler Optimizations
Alex Bobrek
Jonathan Bradbury

Outline
• EPIC Style ISA
• Predication
• Register Model
• Speculative Control Flow
• Data Speculation

Explicitly Parallel Instruction Computing (EPIC) Style ISA
• IA-64 is just one implementation of EPIC
• Main idea behind EPIC:
  – Make Hardware Simpler
  – Make Compiler Smarter
• Similar to VLIW
  – Three Instructions per “bundle”
  – Each bundle includes a template that describes dependencies

Predication
• Conditional execution of an instruction based on a predicate register
• Reduces branching
  – Increases ILP
• Allows instructions to be moved across branches

Predication (Cont.)
if(a<5) then b=c+d; else b=c-d;

Traditional:
        cmp  ra,#5
        bge  L1
        add  rc,rd,rb
        jmp  L2
    L1: sub  rc,rd,rb
    L2:

Predicated:
         cmp.lt p1,p2 = ra,#5
    (p1) add rc,rd,rb
    (p2) sub rc,rd,rb

Register Model
• IA-64 has 128 integer registers
  – R0-R31 are always program visible
  – R32-R127 are stacked
• Each procedure can have its own variable sized stack frame
• Register Stack Engine (RSE) handles spills in hardware
• Compiler doesn’t have to worry about managing fills/spills
  – Done dynamically in hardware
  – Allows for shorter critical path length through code
• OS has to flush the register stack on a context switch

Speculative Control Flow
• Allows loads to be moved outside of a basic block even if the address is not known to be safe
• Speculative Load (ld.s) instruction
• Speculation Check (chk.s) instruction
[Figure: Traditional vs. IA-64 code]

Speculative Control Flow (cont.)
• Every register has a NaT bit, which is set if a speculative load raises an exception
• NaT bits are propagated through all instructions using the speculated value
• The chk.s instruction will branch to recovery code if speculation fails
• Deferral models allow for tradeoffs between OS and hardware exception handling

Data Speculation
• Allows loads to be moved ahead of stores if the compiler is unsure whether the addresses are the same
• Advanced load (ld.a)
• Alias check (chk.a)
[Figure: Traditional vs. IA-64 code]

Data Speculation (cont.)
• Implemented using an Advanced Load Address Table (ALAT)
  – All speculative load addresses stored in table
  – All stores remove entries with same address
  – On chk.a, if address is still in the table, speculation is successful
• The chk.a will branch to recovery code if speculation has not succeeded

Interesting Research Questions
• How do the new IA-64 instructions impact existing compiler optimizations (partial redundancy elimination, liveness analysis...)?
• Does the EPIC approach outperform the runtime optimizations of traditional superscalar processors?
• What other hardware aspects can be exposed to the compiler?

Additional Papers Used
• J. Huck, et al., Introducing the IA-64 Architecture. IEEE Micro, Sept./Oct. 2000.
• R. Krishnaiyer, et al., An Advanced Optimizer for the IA-64 Architecture. IEEE Micro, Nov./Dec. 2000.
• M. Schlansker, B. Ramakrishna Rau. EPIC: Explicitly Parallel Instruction Computing. IEEE Computer, Feb. 2000.
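
Sketch: ALAT behavior
The ALAT rules from the Data Speculation slides (an advanced load records its address, a store to the same address removes the entry, chk.a succeeds only if the entry survived) can be modeled in a few lines of Python. This is a toy illustration, not how the hardware is built: the class name `ALAT`, the helper `run`, the dictionary-based memory, keying entries by target register, and the register name "r8" are all assumptions made for the example.

```python
class ALAT:
    """Toy model of an Advanced Load Address Table (illustrative only)."""

    def __init__(self):
        # Assumption: one entry per target register, mapping to the loaded address.
        self.entries = {}

    def advanced_load(self, reg, addr, memory):
        """Model ld.a: perform the load early and record its address."""
        self.entries[reg] = addr
        return memory.get(addr, 0)

    def store(self, addr, value, memory):
        """Model a store: write memory and remove entries with the same address."""
        memory[addr] = value
        self.entries = {r: a for r, a in self.entries.items() if a != addr}

    def check(self, reg):
        """Model chk.a: True if the entry survived, i.e. speculation succeeded."""
        return reg in self.entries


def run(store_addr):
    """Hoist a load of 0x100 above a store to store_addr, then check it."""
    memory = {0x100: 7, 0x200: 9}
    alat = ALAT()
    value = alat.advanced_load("r8", 0x100, memory)  # ld.a moved above the store
    alat.store(store_addr, 42, memory)               # store that may alias
    if not alat.check("r8"):                         # chk.a
        value = memory[0x100]                        # recovery code: redo the load
    return value
```

When the store goes to a different address (`run(0x200)`), the early-loaded value is kept. When the store aliases the load (`run(0x100)`), the check fails and the recovery path reloads the stored value.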