
Probabilistic Soft Error Rate Estimation
from Statistical SEU Parameters
Fan Wang*
Vishwani D. Agrawal
Department of Electrical and Computer Engineering
Auburn University, AL 36849 USA
*Presently with Juniper Networks, Sunnyvale, CA
17th IEEE North Atlantic Test Workshop
May 14-16, 2008
NATW'2008
Outline

- Background
- Problem Statement
- Analysis
- Results and Discussion
- Conclusion
Motivation for This Work



- With the continuous downscaling of CMOS technologies,
  device reliability has become a major bottleneck.
- Sensitivity of electronic systems can potentially become
  a major cause of soft (non-permanent) failures.
- There is no comprehensive work that considers all
  factors that influence soft error rate.
Strike Changes State of a Single Bit
[Figure: an α-particle or high-energy neutron strikes a logic or memory device and changes a stored bit.]
Definition from NASA Thesaurus:
“Single Event Upset (SEU): Radiation-induced errors in
microelectronic circuits caused when charged particles [also,
high energy particles] (usually from the radiation belts or
from cosmic rays) lose energy by ionizing the medium
through which they pass, leaving behind a wake of electron-hole pairs.”
Impact of Neutron Strike on a Silicon Transistor
[Figure: a neutron strike on a transistor device, between source and drain.
Strikes release electron-hole pairs that can be absorbed by the source and
drain to alter the state of the device.]


- Neutrons are a major cause of electronic failures at ground level.
- Another source of upsets: alpha particles from impurities in
  packaging materials.
Cosmic Rays
[Figure: cascade of cosmic-ray protons (p) and neutrons (n) reaching the Earth’s surface. Source: Ziegler et al.]
- Neutron flux is dependent on altitude, longitude, solar activity, etc.
Problem Statement
- Given background environment data
    - Neutron flux
    - Background energy (LET*) distribution
  (These two factors are location-dependent.)
- Given circuit characteristics
    - Technology
    - Circuit netlist
    - Circuit node sensitive region data
  (These three factors are circuit-dependent.)
- Estimate soft error rate in standard FIT** units.

*Linear Energy Transfer (LET) is a measure of the energy transferred to the
device per unit length as an ionizing particle travels through material.
Unit: MeV-cm²/mg.
**Failures In Time (FIT): number of failures per 10⁹ device hours.
Measured Environmental Data

- Typical ground-level neutron flux: 56.5 cm⁻² s⁻¹.
    J. F. Ziegler, “Terrestrial cosmic rays,” IBM Journal of Research
    and Development, vol. 40, no. 1, pp. 19-39, 1996.
- Particle energy distribution at ground level:
    “For both 0.5 μm and 0.35 μm CMOS technology at ground
    level, the largest population has an LET of 20 MeV-cm²/mg or
    less. Particles with energy greater than 30 MeV-cm²/mg are
    exceedingly rare.”
    K. J. Hass and J. W. Ambles, “Single Event Transients in Deep
    Submicron CMOS,” Proc. 42nd Midwest Symposium on Circuits
    and Systems, vol. 1, 1999.
[Figure: probability density versus linear energy transfer (LET), MeV-cm²/mg; axis ticks at 0, 15, and 30.]
Proposed Soft Error Model
[Figure: proposed soft error model; the only surviving label is “Occurrence rate”.]
Pulse Widths Probability Density Propagation
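[Figure: a gate with delay τp maps the input pulse-width density fX(x) to the
output density fY(y); plot of Dout versus Din with axis ticks at 0, τp, and 2τp.]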
We use a “3-interval piecewise linear” propagation model:
1) Non-propagation, if Din ≤ τp.
2) Propagation with attenuation, if τp < Din < 2τp.
3) Propagation with no attenuation, if Din ≥ 2τp.
where
- Din: input pulse width
- Dout: output pulse width
- τp: gate input-output delay
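
The three intervals above can be sketched as a small function. This is only an illustration: the slide text does not give the formula for the attenuation interval, so the sketch assumes the straight line joining (τp, 0) to (2τp, 2τp), which keeps the curve continuous. The function name and the 100 ps delay are hypothetical.

```python
def propagate_pulse_width(d_in: float, tau_p: float) -> float:
    """Output pulse width Dout for an input pulse of width Din (same time unit)."""
    if tau_p <= 0:
        raise ValueError("tau_p must be positive")
    if d_in <= tau_p:
        # Interval 1: the pulse is too narrow and does not propagate.
        return 0.0
    if d_in < 2 * tau_p:
        # Interval 2 (ASSUMED form): linear attenuation from (tau_p, 0)
        # up to (2*tau_p, 2*tau_p).
        return 2.0 * (d_in - tau_p)
    # Interval 3: wide pulses pass through with no attenuation.
    return d_in


if __name__ == "__main__":
    tau_p = 100.0  # hypothetical gate delay in ps
    for d_in in (50.0, 150.0, 250.0):
        print(d_in, "->", propagate_pulse_width(d_in, tau_p))
```

With these placeholder values, a 50 ps pulse is filtered out, a 150 ps pulse is narrowed to 100 ps, and a 250 ps pulse passes unchanged.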
Validating Propagation Model Using HSPICE
Simulation

- Simulation of a CMOS inverter in TSMC035 technology
  with a load capacitance of 10 fF
Pulse Width Density Propagation Through
a CMOS Inverter
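
One way to picture density propagation through a single gate is to work with the cumulative distribution of the pulse width. The sketch below assumes the 3-interval model with a linear attenuation segment from (τp, 0) to (2τp, 2τp); that segment, the helper names, and the uniform input distribution are illustrative assumptions, not data from the talk.

```python
from typing import Callable

Cdf = Callable[[float], float]


def uniform_cdf(lo: float, hi: float) -> Cdf:
    """CDF of a pulse width spread uniformly over [lo, hi] (placeholder input)."""
    def cdf(x: float) -> float:
        return min(1.0, max(0.0, (x - lo) / (hi - lo)))
    return cdf


def propagate_cdf(cdf_in: Cdf, tau_p: float) -> Cdf:
    """CDF of the output width Y, given the CDF of the input width X."""
    def cdf_out(y: float) -> float:
        if y < 0:
            return 0.0
        if y < 2 * tau_p:
            # Y <= y exactly when X <= tau_p + y/2 under the assumed
            # attenuation line; at y = 0 this is the masked probability.
            return cdf_in(tau_p + y / 2.0)
        # Pulses at least 2*tau_p wide are passed through unchanged.
        return cdf_in(y)
    return cdf_out


if __name__ == "__main__":
    tau_p = 1.0                                  # hypothetical gate delay
    after_one = propagate_cdf(uniform_cdf(0.0, 3.0), tau_p)
    after_two = propagate_cdf(after_one, tau_p)  # second identical gate
    print("masked after 1 gate: ", after_one(0.0))
    print("masked after 2 gates:", after_two(0.0))
```

The value of the output CDF at zero is the fraction of transients that the gate masks electrically, so chaining the function shows how masking accumulates along a path.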
Soft Error Occurrence Rate Calculation
for Generic Gate
\[
P_{SEU} = P_{SEU}(1) \times \underbrace{EMR_j}_{\text{electrical masking}} \times \underbrace{\prod_i \left[ P_{noncontrolling}(i) \right]}_{\text{logic masking}}
\]
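
A minimal sketch of the generic-gate calculation, assuming the slide's expression reads as the SEU probability at input 1, times an electrical masking ratio (EMR) for the gate, times the probability that each remaining input i holds a non-controlling value. The function and argument names are hypothetical, and the numbers are placeholders rather than results from the talk.

```python
from math import prod
from typing import Sequence


def gate_seu_probability(p_seu_in: float, emr: float,
                         p_noncontrolling: Sequence[float]) -> float:
    """SEU probability at a gate output for a transient arriving on one input.

    p_seu_in         -- SEU probability at the struck input
    emr              -- electrical masking ratio, taken here as the fraction
                        of transients that survive the gate (assumption)
    p_noncontrolling -- for every other input, the probability that it holds
                        a non-controlling value (logic masking term)
    """
    for p in (p_seu_in, emr, *p_noncontrolling):
        if not 0.0 <= p <= 1.0:
            raise ValueError("all factors must lie in [0, 1]")
    return p_seu_in * emr * prod(p_noncontrolling)


if __name__ == "__main__":
    # Hypothetical 3-input NAND: the other two inputs are 1 with probability 0.5.
    print(gate_seu_probability(1e-6, 0.8, [0.5, 0.5]))
```

For a gate with no side inputs (an inverter), the empty product is 1, so only the electrical masking term scales the input probability.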
Comparing Methods of Analysis
Factors considered:

Method                | LET spec. | Reconv. fanout | Sens. region | Occurrence rate | Vectors? | Altitude | Ckt tech. | SET degrad.
Our work              | Yes       | No             | Yes          | Yes             | No       | Yes      | Yes       | Yes
Rao et al. [1]        | Yes       | No             | No           | No              | Yes      | Yes      | Yes       | Yes
Rajaraman et al. [2]  | No        | No             | No           | No              | Yes      | No       | No        | Yes
Asadi-Tahoori [3]     | No        | No             | No           | Yes             | No       | No       | No        | No
Zhang-Shanbhag [4]    | Yes       | No             | Yes          | Yes             | Yes      | Yes      | Yes       | No
Rejimon-Bhanja [5]    | No        | No             | No           | Yes             | Yes      | No       | No        | No
Experimental Result Comparison
       |      |      |         | Our approach        | Rao et al. [1]        | Rajaraman et al. [2]
Ckt    | # PI | # PO | # Gates | CPU (s) | FIT       | CPU (s) | FIT         | CPU (min) | Error prob.
C432   | 36   | 7    | 160     | 0.04    | 1.18×10³  | <0.01   | 1.75×10⁻⁵   | 108       | 0.0725
C499   | 41   | 32   | 202     | 0.14    | 1.41×10³  | 0.01    | 6.26×10⁻⁵   | 216       | 0.0041
C880   | 60   | 26   | 383     | 0.08    | 3.86×10³  | 0.01    | 6.07×10⁻⁵   | 102       | 0.0188
C1908  | 33   | 25   | 880     | 1.14    | 1.63×10⁴  | 0.01    | 7.50×10⁻⁵   | 1073      | 0.0011

                    | Our approach  | Rao et al. [1]  | Rajaraman et al. [2]
Computing platform  | Sun Fire 280R | Pentium 2.4 GHz | Sun Fire v210
Circuit technology  | TSMC035       | Std. 0.13 µm    | 70nm BPTM*
Altitude            | Ground        | Ground          | N/A

*BPTM: Berkeley Predictive Technology Model
More Result Comparison
Measured data
Devices                        | SER* (FIT/Mbit)
0.13µ SRAMs [6]                | 10,000 to 100,000
SRAMs, 0.25µ and below [7]     | 10,000 to 100,000
1 Gbit memory in 0.25µ [8]     | 4,200

Logic circuit SER estimation, ground level
Our work                       | 1,000 to 10,000
Rao et al. [1]                 | 1×10⁻⁵ to 8×10⁻⁵

* The altitude is not mentioned for these data.
Discussion



- We take the neutron energy to be the key factor in inducing
  an SEU. In real cases, there can also be secondary particles
  generated through interaction with neutrons.
- Estimating sensitive regions in silicon is a hard task.
  Also, the polarity of the SET should be taken into account.
- Because typical error rates at the Earth's surface are very
  small, measuring them is time consuming and can produce
  large discrepancies. This motivates the use of analytical methods.
- For example, a circuit may experience 1 SEU in 6 months
  (4320 hours), which corresponds to about 231,480 FIT. It is
  just as plausible that the circuit has 0 SEUs in those 6 months,
  in which case the measured SER is 0 FIT.
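
The arithmetic behind this example follows directly from the FIT definition (failures per 10⁹ device hours). The helper name below is hypothetical; the inputs are the ones quoted on the slide.

```python
def fit_rate(failures: float, device_hours: float) -> float:
    """Convert an observed failure count into FIT (failures per 1e9 device hours)."""
    if device_hours <= 0:
        raise ValueError("device_hours must be positive")
    return failures / device_hours * 1e9


if __name__ == "__main__":
    hours = 6 * 30 * 24              # 6 months taken as 180 days = 4320 hours
    print(fit_rate(1, hours))        # one upset observed in the window
    print(fit_rate(0, hours))        # no upset observed in the same window
```

One observed upset gives 10⁹ / 4320 ≈ 231,481 FIT, which the slide rounds to 231,480, while zero observed upsets give 0 FIT. A single event in the observation window therefore swings the measured rate across its whole range.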
Discussion Continued

- Fan-out stems should be considered. Two situations can arise:
    - When an SET goes through a large fan-out, the large load
      capacitance can eliminate the SET, or
    - If it is not canceled by the fan-out node, it will go through
      multiple fan-out paths to increase the SER.
- It is highly recommended to have more field tests for logic circuits.
- None of these SER approaches consider the process variation
  effects on SER.
    - Without consideration of electrical masking, SER will be
      overestimated by 138% for a small 5-stage circuit
      [Wang et al., VLSID’07]
    - Intra-die threshold voltage variation can result in a
      peak-to-peak SER variation of 41% in a small circuit
      [Ramakrishnan et al., ISQED’07]
Conclusion



- SER in logic and memory chips will continue to increase as
  devices become more sensitive to soft errors at sea level.
- By modeling soft errors with two parameters, the occurrence
  rate and the single event transient pulse width density, we are
  able to effectively account for the electrical masking of the circuit.
- Our approach considers more factors and thus gives a more
  realistic soft error rate estimate.
References
[1] R. R. Rao, K. Chopra, D. Blaauw, and D. Sylvester, “An Efficient Static Algorithm
for Computing the Soft Error Rates of Combinational Circuits,” Proc. Design
Automation and Test in Europe Conf., 2006, pp. 164-169.
[2] R. Rajaraman, J. S. Kim, N. Vijaykrishnan, Y. Xie, and M. J. Irwin, “SEAT-LA: A
Soft Error Analysis Tool for Combinational Logic,” Proc. 19th International
Conference on VLSI Design, 2006, pp. 499-502.
[3] G. Asadi and M. B. Tahoori, “An Accurate SER Estimation Method Based on
Propagation Probability,” Proc. Design Automation and Test in Europe
Conf., 2005, pp. 306-307.
[4] M. Zhang and N. R. Shanbhag, “A Soft Error Rate Analysis (SERA) Methodology,”
Proc. IEEE/ACM International Conference on Computer Aided Design (ICCAD), 2004, pp. 111-118.
[5] T. Rejimon and S. Bhanja, “An Accurate Probabilistic Model for Error Detection,”
Proc. 18th International Conference on VLSI Design, 2005, pp. 717-722.
[6] J. Graham, “Soft Errors a Problem as SRAM Geometries Shrink,”
http://www.ebnews.com/story/OEG20020128S0079, ebn, 28 Jan 2002.
[7] W. Leung, F.-C. Hsu, and M. E. Jones, “The Ideal SoC Memory: 1T-SRAM™,” Proc.
13th Annual IEEE International ASIC/SOC Conference, 2000, pp. 32-36.
[8] Report, “Soft Errors in Electronic Memory-A White Paper,” Technical report,
Tezzaron Semiconductor, 2004.
[9] F. Wang and V. D. Agrawal, “Single Event Upset: An Embedded Tutorial,” Proc.
21st International Conf. VLSI Design, 2008, pp. 429-434.
[10] F. Wang and V. D. Agrawal, “Soft Error Rate Determination for Nanometer CMOS
VLSI Logic,” Proc. 40th Southeastern Symp. System Theory, 2008, pp. 324-328.
[11] F. Wang, “Soft Error Rate Determination for Nanometer CMOS VLSI Circuits,”
Master’s Thesis, Auburn University, May 2008.
Thank You . . .