
IA-64 Compiler Optimizations
Alex Bobrek
Jonathan Bradbury
Outline
• EPIC Style ISA
• Predication
• Register Model
• Speculative Control Flow
• Data Speculation
Explicitly Parallel Instruction Computing (EPIC) Style ISA
• IA-64 is just one implementation of EPIC
• Main idea behind EPIC:
– Make Hardware Simpler
– Make Compiler Smarter
• Similar to VLIW
– Three Instructions per “bundle”
– Includes template that describes dependencies
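The bundle layout can be sketched in a few lines of Python. This is an illustrative model only: it assumes the published IA-64 bundle format of 128 bits, with a 5-bit template in the low bits followed by three 41-bit instruction slots. The function names and the slot values used below are hypothetical placeholders, not real instruction encodings.

```python
TEMPLATE_BITS = 5   # template field occupies the low 5 bits
SLOT_BITS = 41      # each of the three instruction slots is 41 bits wide


def pack_bundle(template, slots):
    """Pack a template and three instruction slots into one 128-bit integer."""
    if not 0 <= template < (1 << TEMPLATE_BITS):
        raise ValueError("template must fit in 5 bits")
    if len(slots) != 3:
        raise ValueError("a bundle holds exactly three instructions")
    bundle = template
    for i, slot in enumerate(slots):
        if not 0 <= slot < (1 << SLOT_BITS):
            raise ValueError("each slot must fit in 41 bits")
        # Slot i starts right after the template and the i earlier slots
        bundle |= slot << (TEMPLATE_BITS + i * SLOT_BITS)
    return bundle


def unpack_bundle(bundle):
    """Split a 128-bit bundle back into (template, [slot0, slot1, slot2])."""
    template = bundle & ((1 << TEMPLATE_BITS) - 1)
    slot_mask = (1 << SLOT_BITS) - 1
    slots = [(bundle >> (TEMPLATE_BITS + i * SLOT_BITS)) & slot_mask
             for i in range(3)]
    return template, slots
```

Since 5 + 3 × 41 = 128, the template and the three slots fill the bundle exactly, with no padding bits left over.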
Predication
• Conditional execution of an instruction
based on predicate register
• Reduces branching
– Increases ILP
• Allows instructions to be moved across
branches
Predication (Cont.)
if (a < 5)
then
  b = c + d;
else
  b = c - d;
Traditional:
      cmp  ra,#5
      bge  L1
      add  rc,rd,rb
      jmp  L2
L1:   sub  rc,rd,rb
L2:
Predicated:
      cmp.lt p1,p2 = ra,#5
(p1)  add  rc,rd,rb
(p2)  sub  rc,rd,rb
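A rough Python model of the predicated sequence may help show why no branch is needed. It assumes the compare writes a predicate and its complement (called p1 and p2 here), and it follows the slide's destination-last operand order. The function name and the dictionary-based register file are hypothetical.

```python
def run_predicated(a, c, d):
    """Model `if (a < 5) b = c + d; else b = c - d;` as predicated code."""
    regs = {"ra": a, "rc": c, "rd": d, "rb": None}
    preds = {"p1": False, "p2": False}

    # cmp.lt: write the comparison result to p1 and its complement to p2
    preds["p1"] = regs["ra"] < 5
    preds["p2"] = not preds["p1"]

    # Both instructions are issued; there is no jump over either of them.
    program = [
        ("p1", lambda r: r["rc"] + r["rd"]),   # (p1) add rc,rd,rb
        ("p2", lambda r: r["rc"] - r["rd"]),   # (p2) sub rc,rd,rb
    ]
    for pred, op in program:
        result = op(regs)        # the instruction executes either way
        if preds[pred]:          # the result is kept only if the predicate is true
            regs["rb"] = result
    return regs["rb"]
```

The `if` inside the loop stands for the hardware nullifying an instruction whose predicate is false. It is not a branch in the modeled program, which is a straight-line sequence.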
Register Model
• IA-64 has 128 integer registers
– R0-R31 are always program visible
– R32-R127 are stacked
• Each procedure can have its own variable-sized stack frame
• Register Stack Engine (RSE) handles spills in hardware
• Compiler doesn’t have to worry about managing fills/spills
– Done dynamically in hardware
– Allows for shorter critical path length through code
• OS has to flush stack on context switch
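The spill/fill behavior can be sketched with a toy model. It is a simplification under stated assumptions: 96 stacked physical registers (R32-R127), frames that do not overlap (real frames share output and input registers between caller and callee), and spills counted rather than actually copied. The class and method names are hypothetical.

```python
class RegisterStack:
    """Toy model of the stacked registers and the Register Stack Engine."""

    def __init__(self, physical=96):
        self.physical = physical  # stacked physical registers available
        self.frames = []          # frame sizes, oldest first
        self.resident = 0         # registers currently held in the register file
        self.spilled = 0          # registers the RSE has moved to the backing store

    def call(self, frame_size):
        """Allocate a frame for a new procedure, spilling old registers if needed."""
        if frame_size > self.physical:
            raise ValueError("frame larger than the stacked register file")
        self.frames.append(frame_size)
        self.resident += frame_size
        overflow = self.resident - self.physical
        if overflow > 0:
            # The RSE spills the oldest registers; the program issues no stores
            self.spilled += overflow
            self.resident -= overflow

    def ret(self):
        """Drop the current frame and make the caller's frame fully resident."""
        size = self.frames.pop()
        self.resident -= size
        if self.frames:
            missing = self.frames[-1] - self.resident
            if missing > 0:
                # The RSE fills the caller's spilled registers back in
                self.spilled -= missing
                self.resident += missing
```

For example, frames of 50, 40, and 30 registers need 120 in total, so the third call forces the model to spill the 24 oldest registers. They are filled again once the two inner procedures have returned.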
Speculative Control Flow
• Allows loads to be moved outside of a basic block even if the address is not known to be safe
• Speculative Load (ld.s) instruction
• Speculation Check (chk.s) instruction
[Figure: "Traditional" vs. "IA-64"]
Speculative Control Flow (cont.)
• Every general register has a NaT (Not a Thing) bit, which is set if a speculative load defers an exception
• NaT bits are propagated through all instructions that use the speculated value
• The chk.s instruction branches to recovery code if speculation failed
• Deferral models allow for tradeoffs between OS and hardware exception handling
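The ld.s / chk.s protocol can be mimicked in Python as a rough sketch. The NaT bit is modeled as a sentinel object, memory as a dictionary in which any missing address counts as a faulting access, and recovery code as a callback. All names here (`MEMORY`, `ld_s`, `chk_s`, `guarded_sum`) are hypothetical.

```python
NAT = object()          # stands in for a register whose NaT bit is set
MEMORY = {0x100: 7}     # hypothetical memory; any other address faults


def load(addr):
    """Ordinary load: an invalid address faults immediately."""
    if addr not in MEMORY:
        raise MemoryError(f"fault at {addr:#x}")
    return MEMORY[addr]


def ld_s(addr):
    """Speculative load: a fault is deferred by returning NaT instead."""
    return MEMORY.get(addr, NAT)


def add(x, y):
    """An instruction with a NaT source produces a NaT result."""
    return NAT if x is NAT or y is NAT else x + y


def chk_s(value, recovery):
    """Keep the speculated value, or run recovery code if it carries NaT."""
    return recovery() if value is NAT else value


def guarded_sum(ptr, use_it):
    """Models `if (use_it) return *ptr + 1; return 0;` with the load hoisted."""
    t = add(ld_s(ptr), 1)            # load and dependent add moved above the branch
    if use_it:
        # chk.s sits where the load used to be; recovery redoes the work
        return chk_s(t, lambda: add(load(ptr), 1))
    return 0
```

With a bad pointer and `use_it` false, the hoisted load raises nothing. With `use_it` true, the recovery code repeats the load non-speculatively and the fault surfaces at the point where the original program would have raised it.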
Data Speculation
• Allows loads to be moved ahead of stores when the compiler cannot tell whether the addresses are the same
• Advanced load (ld.a)
• Alias check (chk.a)
[Figure: "Traditional" vs. "IA-64"]
Data Speculation (cont.)
• Implemented using an Advanced Load Address Table (ALAT)
– Each advanced load records its address in the table
– Each store removes entries with a matching address
– On chk.a, if the load's entry is still in the table, speculation is successful
• The chk.a instruction branches to recovery code if speculation has not succeeded
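The ALAT mechanism can be sketched as a small Python model. It is deliberately simplified: entries are keyed by target register and hold one exact address, whereas a real ALAT has finite capacity and tracks access sizes. The class, method, and function names are hypothetical.

```python
class ALAT:
    """Toy Advanced Load Address Table: target register -> loaded address."""

    def __init__(self):
        self.entries = {}

    def ld_a(self, memory, reg, addr):
        """Advanced load: do the load and record its address."""
        self.entries[reg] = addr
        return memory[addr]

    def store(self, memory, addr, value):
        """Store: write memory and remove any entry with the same address."""
        memory[addr] = value
        self.entries = {r: a for r, a in self.entries.items() if a != addr}

    def chk_a(self, reg):
        """Alias check: True if the entry survived, i.e. speculation succeeded."""
        return reg in self.entries


def load_after_store(memory, p, q, value):
    """Models `*p = value; return *q + 1;` with the load of *q hoisted."""
    alat = ALAT()
    r1 = alat.ld_a(memory, "r1", q)   # load moved ahead of the store
    result = r1 + 1                   # dependent use moved with it
    alat.store(memory, p, value)      # may or may not alias q
    if not alat.chk_a("r1"):
        # Recovery code: redo the load and everything that depended on it
        r1 = memory[q]
        result = r1 + 1
    return result
```

When `p` and `q` differ, the early result is kept. When they alias, the store removes the entry, the check fails, and the recovery path recomputes the result from the freshly stored value.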
Interesting Research Questions
• How do the new IA-64 instructions impact existing compiler optimizations (partial redundancy elimination, liveness analysis, ...)?
• Does the EPIC approach outperform the
runtime optimizations of traditional superscalar
processors?
• What other hardware aspects can be exposed
to the compiler?
Additional Papers Used
• J. Huck et al., "Introducing the IA-64 Architecture," IEEE Micro, Sept./Oct. 2000.
• R. Krishnaiyer et al., "An Advanced Optimizer for the IA-64 Architecture," IEEE Micro, Nov./Dec. 2000.
• M. Schlansker and B. Ramakrishna Rau, "EPIC: Explicitly Parallel Instruction Computing," IEEE Computer, Feb. 2000.