Transcript slides
Probabilistic Soft Error Rate Estimation from Statistical SEU Parameters
  Fan Wang*
  Vishwani D. Agrawal
  Department of Electrical and Computer Engineering, Auburn University, AL 36849 USA
  *Presently with Juniper Networks, Sunnyvale, CA
  17th IEEE North Atlantic Test Workshop (NATW'2008), May 14-16, 2008

Outline
  Background
  Problem Statement
  Analysis
  Results and Discussion
  Conclusion

Motivation for This Work
  With the continuous downscaling of CMOS technologies, the device reliability has become a major bottleneck.
  Sensitivity of electronic systems can potentially become a major cause of soft (non-permanent) failures.
  There is no comprehensive work that considers all factors that influence soft error rate.

Strike Changes State of a Single Bit
  [Figure: an α-particle or high-energy neutron striking a logic or memory device]
  Definition from NASA Thesaurus: “Single Event Upset (SEU): Radiation-induced errors in microelectronic circuits caused when charged particles [also, high energy particles] (usually from the radiation belts or from cosmic rays) lose energy by ionizing the medium through which they pass, leaving behind a wake of electron-hole pairs.”

Impact of Neutron Strike on a Silicon Transistor
  [Figure: neutron strike on a transistor device, showing the source, the drain, and the released positive and negative charges]
  Strikes release electron & hole pairs that can be absorbed by source & drain to alter the state of the device.
  Neutron is a major cause of electronic failures at ground level.
  Another source of upsets: alpha particles from impurities in packaging materials.

Cosmic Rays
  [Figure: cascade of protons (p) and neutrons (n) reaching the Earth’s surface. Source: Ziegler et al.]
  Neutron flux is dependent on altitude, longitude, solar activity etc.

Problem Statement
  Given background environment data:
    Neutron flux
    Background energy (LET*) distribution
  These two factors are location-dependent.
  Given circuit characteristics:
    Technology
    Circuit netlist
    Circuit node sensitive region data
  These three factors are circuit-dependent.
  Estimate soft error rate in standard FIT** units.
  *Linear Energy Transfer (LET) is a measure of the energy transferred to the device per unit length as an ionizing particle travels through material. Unit: MeV-cm²/mg.
  **Failures In Time (FIT): number of failures per 10⁹ device hours.

Measured Environmental Data
  Typical ground-level neutron flux: 56.5 cm⁻² s⁻¹.
    J. F. Ziegler, “Terrestrial cosmic rays,” IBM Journal of Research and Development, vol. 40, no. 1, pp. 19-39, 1996.
  Particle energy distribution at ground level:
    “For both 0.5 μm and 0.35 μm CMOS technology at ground level, the largest population has an LET of 20 MeV-cm²/mg or less. Particles with energy greater than 30 MeV-cm²/mg are exceedingly rare.”
    K. J. Hass and J. W. Ambler, “Single Event Transients in Deep Submicron CMOS,” Proc. 42nd Midwest Symposium on Circuits and Systems, vol. 1, 1999.
  [Figure: probability density versus linear energy transfer (LET) in MeV-cm²/mg, with LET axis ticks at 0, 15, and 30]

Proposed Soft Error Model
  [Figure: proposed soft error model, labeled with the occurrence rate]

Pulse Widths Probability Density Propagation
  [Figure: a gate with delay τp maps the input pulse width X, with density fX(x), to the output pulse width Y, with density fY(y); the accompanying plot shows Dout versus Din with Din axis ticks at 0, τp, and 2τp]
  We use a “3-interval piecewise linear” propagation model:
  1) Non-propagation, if Din ≤ τp.
  2) Propagation with attenuation, if τp < Din < 2τp.
  3) Propagation with no attenuation, if Din ≥ 2τp.
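
The three intervals of the propagation model can be sketched in a few lines of Python. This is only an illustration, not the authors' implementation: the function names are hypothetical, and the slide does not state the attenuation formula, so the sketch assumes the output width rises linearly from 0 at Din = τp to 2τp at Din = 2τp, which makes the curve continuous at both breakpoints.

```python
def propagate_pulse_width(d_in: float, tau_p: float) -> float:
    """Output pulse width of one gate for an input pulse of width d_in.

    d_in and tau_p (the gate input-to-output delay) share the same time unit.
    ASSUMPTION: in the attenuation interval the output is 2 * (d_in - tau_p),
    the straight line joining (tau_p, 0) and (2 * tau_p, 2 * tau_p).
    """
    if tau_p <= 0:
        raise ValueError("tau_p must be positive")
    if d_in <= tau_p:
        return 0.0                      # 1) non-propagation
    if d_in < 2 * tau_p:
        return 2.0 * (d_in - tau_p)     # 2) propagation with attenuation
    return float(d_in)                  # 3) propagation with no attenuation


def propagate_through_chain(d_in: float, delays: list[float]) -> float:
    """Apply the single-gate model to each gate delay along a path, in order."""
    width = d_in
    for tau_p in delays:
        width = propagate_pulse_width(width, tau_p)
    return width


if __name__ == "__main__":
    # A 150 ps pulse is attenuated to 100 ps by a gate with a 100 ps delay,
    # and a second identical gate then filters it out completely.
    print(propagate_pulse_width(150.0, 100.0))
    print(propagate_through_chain(150.0, [100.0, 100.0]))
```

Under this assumption a pulse narrower than twice the gate delay shrinks at every stage, which is how electrical masking removes narrow transients along a path.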
  In the propagation model, Din is the input pulse width, Dout is the output pulse width, and τp is the gate input-to-output delay.

Validating Propagation Model Using HSPICE Simulation
  Simulation of a CMOS inverter in TSMC035 technology with a load capacitance of 10 fF.

Pulse Width Density Propagation Through a CMOS Inverter

Soft Error Occurrence Rate Calculation for Generic Gate
  [Equation: the soft error occurrence rate P_SEU at the output of a generic gate, expressed in terms of the input rates P_SEU(i), an EMR factor labeled “electrical masking”, and a P_noncontrolling factor labeled “logic masking”]

Comparing Methods of Analysis
  Factors considered:
  Method                LET spec.  Reconv. fanout  Sens. region  Occurrence rate  Vectors?  Altitude  Ckt tech.  SET degrad.
  Our work              Yes        No              Yes           Yes              No        Yes       Yes        Yes
  Rao et al. [1]        Yes        No              No            No               Yes       Yes       Yes        Yes
  Rajaraman et al. [2]  No         No              No            No               Yes       No        No         Yes
  Asadi-Tahoori [3]     No         No              No            Yes              No        No        No         No
  Zhang-Shanbhag [4]    Yes        No              Yes           Yes              Yes       Yes       Yes        No
  Rejimon-Bhanja [5]    No         No              No            Yes              Yes       No        No         No

Experimental Result Comparison
                               Our approach      Rao et al. [1]     Rajaraman et al. [2]
  Ckt     # PI  # PO  # Gates  CPU s  FIT        CPU s  FIT         CPU min  Error prob.
  C432    36    7     160      0.04   1.18×10³   <0.01  1.75×10⁻⁵   108      0.0725
  C499    41    32    202      0.14   1.41×10³   0.01   6.26×10⁻⁵   216      0.0041
  C880    60    26    383      0.08   3.86×10³   0.01   6.07×10⁻⁵   102      0.0188
  C1908   33    25    880      1.14   1.63×10⁴   0.01   7.50×10⁻⁵   1073     0.0011
  Computing platform           Sun Fire 280R     Pentium 2.4 GHz    Sun Fire v210
  Circuit technology           TSMC035           Std. 0.13 µm       70 nm BPTM*
  Altitude                     Ground            Ground             N/A
  *BPTM: Berkeley Predictive Technology Model

More Result Comparison
                                Devices                         SER* (FIT/Mbit)
  Ground-level measured data    0.13 µm SRAMs [6]               10,000 to 100,000
                                SRAMs, 0.25 µm and below [7]    10,000 to 100,000
                                1 Gbit memory in 0.25 µm [8]    4,200
  Logic circuit SER estimation  Our work                        1,000 to 10,000
                                Rao et al. [1]                  1×10⁻⁵ to 8×10⁻⁵
  *The altitude is not mentioned for these data.

Discussion
  We take the energy of the neutron to be the key factor in inducing an SEU.
  In real cases, there can also be secondary particles generated through interaction with neutrons.
  Estimating sensitive regions in silicon is a hard task. Also, the polarity of the SET should be taken into account.
  Because typical error rates at the earth's surface are very small, their measurement is time-consuming and can produce large discrepancies. This motivates the use of analytical methods.
  For example, a circuit may experience 1 SEU in 6 months (4320 hours), which corresponds to about 231,480 FIT. It is also likely that the circuit has 0 SEUs in these 6 months, in which case the measured SER is 0 FIT.

Discussion Continued
  Fan-out stems should be considered. Two situations can arise:
    When an SET goes through a large fan-out, the large load capacitance can eliminate the SET, or
    if it is not canceled at the fan-out node, it propagates along multiple fan-out paths and increases the SER.
  More field tests for logic circuits are highly recommended.
  None of these SER approaches considers the effects of process variation on SER.
    Without consideration of electrical masking, SER will be overestimated by 138% for a small 5-stage circuit [Wang et al., VLSID’07].
    Intra-die threshold voltage variation can result in a peak-to-peak SER variation of 41% in a small circuit [Ramakrishnan et al., ISQED’07].

Conclusion
  SER in logic and memory chips will continue to increase as devices become more sensitive to soft errors at sea level.
  By modeling soft errors with two parameters, the occurrence rate and the single event transient pulse width density, we are able to effectively account for the electrical masking of the circuit.
  Our approach considers more factors and thus gives a more realistic soft error rate estimation.

References
  [1] R. R. Rao, K. Chopra, D. Blaauw, and D. Sylvester, “An Efficient Static Algorithm for Computing the Soft Error Rates of Combinational Circuits,” Proc. Design Automation and Test in Europe Conf., 2006, pp. 164-169.
  [2] R. Rajaraman, J. S. Kim, N. Vijaykrishnan, Y. Xie, and M. J. Irwin, “SEAT-LA: A Soft Error Analysis Tool for Combinational Logic,” Proc. 19th International Conference on VLSI Design, 2006, pp. 499-502.
  [3] G. Asadi and M. B. Tahoori, “An Accurate SER Estimation Method Based on Propagation Probability,” Proc. Design Automation and Test in Europe Conf., 2005, pp. 306-307.
  [4] M. Zhang and N. R. Shanbhag, “A Soft Error Rate Analysis (SERA) Methodology,” Proc. IEEE/ACM International Conference on Computer Aided Design, ICCAD 2004, 2004, pp. 111-118.
  [5] T. Rejimon and S. Bhanja, “An Accurate Probabilistic Model for Error Detection,” Proc. 18th International Conference on VLSI Design, 2005, pp. 717-722.
  [6] J. Graham, “Soft Errors a Problem as SRAM Geometries Shrink,” http://www.ebnews.com/story/OEG20020128S0079, ebn, 28 Jan 2002.
  [7] W. Leung, F.-C. Hsu, and M. E. Jones, “The Ideal SoC Memory: 1T-SRAM(TM),” Proc. 13th Annual IEEE International ASIC/SOC Conference, 2000, pp. 32-36.
  [8] Report, “Soft Errors in Electronic Memory-A White Paper,” Technical report, Tezzaron Semiconductor, 2004.
  [9] F. Wang and V. D. Agrawal, “Single Event Upset: An Embedded Tutorial,” Proc. 21st International Conf. VLSI Design, 2008, pp. 429-434.
  [10] F. Wang and V. D. Agrawal, “Soft Error Rate Determination for Nanometer CMOS VLSI Logic,” Proc. 40th Southeastern Symp. System Theory, 2008, pp. 324-328.
  [11] F. Wang, “Soft Error Rate Determination for Nanometer CMOS VLSI Circuits,” Master’s Thesis, Auburn University, May 2008.

Thank You . . .
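
The FIT arithmetic used in the Discussion slide (1 SEU observed in 4320 hours) can be checked with a short sketch. It relies only on the definition given in the Problem Statement slide, failures per 10⁹ device hours; the function name and its parameters are hypothetical and chosen here for illustration.

```python
def fit_rate(failures: int, devices: int, hours: float) -> float:
    """Convert an observed failure count into FIT (failures per 1e9 device hours).

    failures: number of soft errors observed
    devices:  number of devices under observation (hypothetical parameter)
    hours:    observation time per device, in hours
    """
    device_hours = devices * hours
    if device_hours <= 0:
        raise ValueError("device-hours must be positive")
    return failures / device_hours * 1e9


if __name__ == "__main__":
    # One upset on one circuit in 6 months (4320 hours): about 231,481 FIT,
    # which the slide rounds to 231,480 FIT.
    print(round(fit_rate(1, 1, 4320.0)))
    # No upset observed in the same window gives a measured rate of 0 FIT.
    print(fit_rate(0, 1, 4320.0))
```

The gap between these two outcomes for a single short observation window is the measurement discrepancy that the slide uses to motivate analytical estimation.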