A Unified Model for Timing Speculation: Evaluating the Impact of Technology Scaling, CMOS Design Style, and Fault Recovery Mechanism
Marc de Kruijf, Shuou Nomura, Karu Sankaralingam
UW-Madison Computer Sciences, Vertical Research Group
DSN 2010
© 2010

From Hard to Harder
[Figure: technology nodes (10000nm, 720nm, 4000um, 360nm, 1500um, 180nm, 90nm, 45nm & beyond), progressing from "Hard" to "Harder"]

What is the Problem?
• Non-ideal transistor scaling
• Transistor wear-out
• Process, voltage, and temperature (PVT) variations
• Errors due to particle interference
• Noise coupling & crosstalk

What is the Problem?
[Figure: Performance Toolbox and Reliability Toolbox. Label: Dynamic verification]
NEED HIGH-LEVEL ANALYSIS TOOLS

Our Contribution
A model for timing speculation
• Unifies hardware + system
• Small set of high-level inputs
Also… processor designer
Q. What is the impact of technology scaling?
A. Further benefits are small to none.
Q. What is the impact of CMOS design style?
A. Very low power designs benefit most.
Q. What is the impact of the fault recovery mechanism?
A. Fine-grained recovery is key to high efficiencies.

Outline
• Timing Speculation
• Model Overview
• Hardware Efficiency Model
• System Recovery Model
• Results
• Conclusion

Timing Speculation
[Figure: clock waveforms. Labels: clock period (= 1/frequency); circuit delay variations; "Timing failure!" with detect & recover; slower clock, "OK!"]

Model Overview
Model Inputs
1. A hardware path delay distribution
2. Effect of variations on path delay as N(μ,σ)
3. The time between recovery checkpoints
4. The time to restore a checkpoint
[Figure: three panels plotted against Error rate. Hardware Efficiency (Energy), System Recovery (Time), Overall Efficiency (Energy)]

Hardware Efficiency Model
Input 1: Path delay distribution
Input 2: Path delay variation (σ)
[Figure: series of plots. Axis labels: # Paths, Path delay, Clock period, Error prob., Energy, Error rate. Annotation: e.g. frequency scaling]

System Recovery Model
(applies to all backward error recovery systems)
overhead(rate) = failures(rate) x ( waste(rate) + restore )
System Recovery Model Inputs
1. The time between recovery checkpoints (cycles)
2. The time to restore a checkpoint (restore)
[Figure: Time vs. Error rate]

Results
Is the model useful? What can we learn?
Technology Node: 45nm, 11nm
CMOS Design Style: High Performance CMOS, Low Power CMOS, Ultra-low Power CMOS
Recovery System: Razor, Reunion, Paceline

Results
[Figure: three panels plotted against Error rate. Hardware Efficiency (Energy), System Recovery (Time), Overall Efficiency (Energy)]

Hardware Model Inputs
1. Path delay distribution
   Application: H.264 decoding
   Hardware: OpenRISC processor
2. Effect of process variations as N(μ,σ) using ITRS data
   High Performance CMOS: 45nm σ = 0.046μ, 11nm σ = 0.051μ
   Low Power CMOS: 45nm σ = 0.029μ, 11nm σ = 0.042μ
   Ultra-low Power CMOS: 45nm σ = 0.196μ

Hardware Efficiency
Energy = Power x Time
EDP = Power x Time^2
[Figure: Energy and Normalized EDP vs. Error rate. Results for High Performance CMOS]

Recovery Model Inputs
1. The time between recovery checkpoints &
2. The time to restore a checkpoint
Razor: Latch-level detection + pipeline rollback. 1 cycle checkpoint size & 5 cycle recovery cost
Reunion: DMR detection + checkpoint. 100 cycle checkpoint size & 100 cycle recovery cost
Paceline: DMR detection + checkpoint + flush. 100 cycle checkpoint size & 1000 cycle recovery cost

System Recovery
[Figure: Normalized Time vs. Error rate]

Overall Efficiency
[Figure: EDP vs. Error rate]
1. High Performance CMOS
2. Low Power CMOS
3. Ultra-low Power CMOS

Overall Efficiency
[Figure: Normalized EDP vs. Error rate, High Performance CMOS]

Overall Efficiency
[Figure: Normalized EDP vs. Error rate, Low Power CMOS]

Overall Efficiency
[Figure: Normalized EDP vs. Error rate, Ultra-low Power CMOS]

Conclusions
A High-level Model
Results
• Efficiency gains improve only minimally with scaling
• Ultra-low power (sub-threshold) CMOS benefits most
• Fine-grained recovery is key
Future Work
• Incorporate more sources of variation
• A tool for processor designers? Under development at http://www.cs.wisc.edu/vertical

Questions?

Timing Speculation
[Figure: Source of Timing Variation. Labels: Manufacturing, Runtime, Application, Process, Speed Binning, Online Timing Analysis, Timing Speculation]
Figure adapted from Greskamp et al., Paceline: [...]. In PACT ’07.

System Recovery Model
System Recovery Model Inputs
1. The time between recovery checkpoints (cycles)
2. The time to restore a checkpoint (restore)
[Equation not recovered. Term labels: expected # failures; expected # cycles before success executed upon failure]

Overall Inputs
1. Path delay distribution
   Application: H.264 decoding
   Hardware: OpenRISC processor
2. Effect of process variations on path delay as N(μ,σ) using ITRS data
   High Performance CMOS @45nm: σ = 0.046μ
   Low Power CMOS @45nm: σ = 0.029μ
   Ultra-low Power CMOS @45nm: σ = 0.196μ
3. The time between recovery checkpoints &
4. The time to restore a checkpoint
   Razor: Latch-level detection + pipeline rollback (1 & 5 cycles)
   Reunion: DMR detection + checkpoint (100 & 100 cycles)
   Paceline: DMR detection + checkpoint + flush (100 & 1000 cycles)
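
Sketch: evaluating the system recovery relation
The slides state overhead(rate) = failures(rate) x ( waste(rate) + restore ) but do not spell out failures(rate) or waste(rate). The Python sketch below is one possible reading, not the authors' implementation. It assumes errors occur independently each cycle with probability `rate`, that an error is detected in the cycle where it occurs, and that a failed checkpoint interval is retried from its start. The function names are hypothetical; the (cycles, restore) pairs are the ones listed for Razor, Reunion, and Paceline.

```python
import math


def recovery_overhead(rate, cycles, restore):
    """Expected extra cycles per checkpoint interval.

    Implements overhead = failures * (waste + restore) under the assumptions
    stated above (independent per-cycle errors, immediate detection, retry of
    the whole interval). These forms are assumptions, not taken from the slides.
    """
    if not 0.0 <= rate < 1.0:
        raise ValueError("rate must be in [0, 1)")
    if rate == 0.0:
        return 0.0
    log_ok = cycles * math.log1p(-rate)   # log of P(interval has no error)
    p_ok = math.exp(log_ok)
    p_fail = -math.expm1(log_ok)          # 1 - p_ok, computed accurately
    if p_ok == 0.0:
        return math.inf                   # an interval practically never succeeds
    # Expected number of failed attempts before one succeeds (geometric).
    failures = p_fail / p_ok
    # Expected cycles executed in a failed attempt: the first error lands at
    # cycle k with probability rate * (1 - rate)**(k - 1), given k <= cycles.
    waste = sum(
        k * rate * (1.0 - rate) ** (k - 1) for k in range(1, cycles + 1)
    ) / p_fail
    return failures * (waste + restore)


def normalized_time(rate, cycles, restore):
    """Execution time relative to an error-free run of the same interval."""
    return (cycles + recovery_overhead(rate, cycles, restore)) / cycles


# (cycles between checkpoints, cycles to restore), as listed in the slides.
SYSTEMS = {"Razor": (1, 5), "Reunion": (100, 100), "Paceline": (100, 1000)}

if __name__ == "__main__":
    for name, (cycles, restore) in SYSTEMS.items():
        for rate in (1e-6, 1e-4, 1e-2):
            t = normalized_time(rate, cycles, restore)
            print(f"{name:9s} rate={rate:g} normalized time={t:.4f}")
```

Under these assumptions the coarse-grained systems pay for each failure with both a longer replay and a larger restore cost, which is consistent with the talk's conclusion that fine-grained recovery is key.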