Transcript Slide 1
TMR Schemes Voting Matrix
Melanie Berg
MEI Technologies/NASA GSFC
[email protected]

Slide 2
Overview
• Premise: Why do various FPGAs require separate mitigation strategies?
• Radiation Effects in FPGA devices
• Mitigation and Actel Anti-fuse Devices
• Mitigation and Xilinx Virtex Devices
• Tools
European Space Agency FPGA Tool Workshop. Noordwijk, NL; Melanie Berg

Slide 3
Radiation Effects in FPGA devices
• Single Event Transients (SETs)
• Single Event Upsets (SEUs)
• Single Event Functional Interrupts (SEFIs)

Slide 4
Single Event Effects (SEEs) and IC System Error
• SEUs or SETs can occur in:
    • Combinatorial Logic
    • Sequential Logic
    • Configuration Memory Cells
• Depending on the device and the design, each fault type will:
    • Have a probability of occurrence
    • Make either a significant or an insignificant contribution to system error
• Every device has different error responses. We must understand the differences and design appropriately.

Slide 5
Combinatorial Logic Blocks and Potential Upsets: SETs in Anti-fuse FPGAs

Slide 6
Basic Combinatorial Logic Blocks and Potential Upsets
• TRANSIENT: $P_{SET}$
• STUCK UNTIL OVERWRITTEN: probability of configuration fault, $P_{Configuration}$

Slide 7
DFFs: SEUs and SEFIs
• Strike caught in loop: probability of SEU, $P_{DFFSEU}$
• Probability of SEFI: $P_{SEFI}$
[Figure: DFF with reset, D, Q, and CLK pins]

Slide 8
Transient Capture on a DFF Data Input Pin (SET→SEU)
[Figure: DFF with D, clock, SET, CLR, and Q pins; clock period $t_p = 1/f_s$]
• $f_s$: System frequency
• $T(f_s)_{pulse}$: SET pulse width
• $P(f_s)_{SETgen}$: Probability an SET is generated with sufficient amplitude
• $P(f_s)_{SETprop}$: Probability an SET can propagate with sufficient amplitude
• $P_{DFFEn}$: Probability the DFF is enabled (active)
• $P(f_s)_{SET \to SEU}$: Probability an SET can be caught by a clock edge
$P(f_s)_{SET \to SEU} \propto T(f_s)_{pulse} \cdot P(f_s)_{SETgen} \cdot P(f_s)_{SETprop} \cdot P_{DFFEn} \cdot \tfrac{1}{2} f_s$

Slide 9
Frequency Effects and Conventional DFF Upset Theory
[Figure: composite cross section $\sigma_{DFFerror}$ versus frequency; labels: $P_{DFFSEU}$ & $P_{DFFMBU}$, ~0]
$P(f_s)_{DFFerror} \propto P_{DFFSEU} + P(f_s)_{SET \to SEU} + P_{DFFMBU}$

Slide 10
Summary: Most Significant Factors of System Error Probability $P(f_s)_{error}$
• Configuration, $P_{Configuration}$: SRAM-based FPGAs
• DFFs:
    • Static: SEU, $P_{DFFSEU}$
    • Dynamic: SET→SEU, $P(f_s)_{SET \to SEU}$
• SEFIs, $P_{SEFI}$:
    • Clocks & resets
    • Inaccessible control circuitry
$P(f_s)_{error} \propto P_{Configuration} + P_{DFFSEU} + P(f_s)_{SET \to SEU} + P_{SEFI}$

Slide 11
Reducing System Error: Common Mitigation Techniques
• Mitigation can be:
    • Embedded: built into the device library cells. The user does not verify the mitigation; the manufacturer does.
    • User inserted: part of the actual design process. The user must verify the mitigation. Complexity is a RISK!
• Common mitigation types:
    • Local Triple Modular Redundancy (LTMR)
    • Global Triple Modular Redundancy (GTMR)
$P(f_s)_{error} \propto P_{Configuration} + P_{DFFSEU} + P(f_s)_{SET \to SEU} + P_{SEFI}$

Slide 12
Example: Mitigation Schemes Will Use Majority Voting
$MajorityVoter = I_1 \cdot I_2 + I_0 \cdot I_2 + I_0 \cdot I_1$

I0   I1   I2   Majority Voter
0    0    0    0
0    0    1    0
0    1    0    0
0    1    1    1
1    0    0    0
1    0    1    1
1    1    0    1
1    1    1    1
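
The majority voter on Slide 12 can be sketched in a few lines of Python. This is only an illustration of the voting equation, not the hardware voter used in any device library; the function names `majority_vote` and `truth_table` are made up for this sketch.

```python
from itertools import product


def majority_vote(i0: int, i1: int, i2: int) -> int:
    """Bitwise 2-of-3 majority: (I1 AND I2) OR (I0 AND I2) OR (I0 AND I1)."""
    return (i1 & i2) | (i0 & i2) | (i0 & i1)


def truth_table():
    """Return rows (I0, I1, I2, vote) in the same order as the slide's table."""
    return [(i0, i1, i2, majority_vote(i0, i1, i2))
            for i0, i1, i2 in product((0, 1), repeat=3)]


if __name__ == "__main__":
    print("I0 I1 I2 Vote")
    for row in truth_table():
        print("  ".join(str(bit) for bit in row))

    # A single upset in one redundant copy is masked by the other two.
    good = 0b1010
    upset = good ^ 0b0100  # one flipped bit in one copy (illustrative value)
    print(majority_vote(good, upset, good) == good)
```

Because the operators are bitwise, the same function votes whole words as well as single bits, which is how a triplicated register would be compared.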

Slide 13
Mitigation and Actel Antifuse Devices

Slide 14
ACTEL RTAX-S Architecture Basics
Super Cluster:
• Combinatorial Cells: C-CELLs
• DFF Cells: R-CELLs
Source: RTAX-S/SL RadTolerant FPGAs 2009 Actel.com
Embedded RHBD:
• Hardened global clocks and resets
• Antifuse configuration is SEU immune
• Embedded Localized TMR (LTMR) at each DFF (R-CELL)

Slide 15
Local Triple Modular Redundancy (LTMR): Smallest Area & Power
[Figure: non-mitigated versus mitigated DFF]
• Triple each DFF + vote
• Data paths are not redundant; can only have one voter
• Unprotected:
    • Clocks and resets: SEFI
    • Transients (SET→SEU)
    • Internal/hidden device logic: SEFI
$P(f_s)_{error} \propto P_{Configuration} + P_{DFFSEU} + P(f_s)_{SET \to SEU} + P_{SEFI}$
Annotation on the terms: Low

Slide 16
ACTEL RTAX-S Embedded Mitigation: LTMR and SETs
[Figure: Super Cluster with combinatorial logic C-CELLs (C), sequential logic R-CELLs (R), and TX, RX, and B modules]

Slide 17
RTAX Example: Probability of Error Reduction
$P(f_s)_{error} \propto P_{Configuration} + P_{DFFSEU} + P(f_s)_{SET \to SEU} + P_{SEFI}$
Annotations on the terms: 0, Low, ~0
• Error probability is per DFF bit
• Error rate must reflect frequency of operation

Slide 18
Upper-Bound Error Prediction: RHBD Anti-fuse FPGA DFF
• (Near) static error bit rate, no C-CELLs, $P_{DFFSEU}$ (Source: Actel):
  $\frac{dE_{bit}}{dt} < 1 \times 10^{-10}\ \frac{Errors}{bit \cdot day}$
• 15 MHz to 120 MHz: dynamic error bit rate with 8 levels of C-CELLs, $P(f_s)_{SET \to SEU}$ (Source: NASA Goddard):
  $1 \times 10^{-9} < \frac{dE_{bit}(f_s)}{dt} < 6 \times 10^{-8}\ \frac{Errors}{bit \cdot day}$
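
Since the error rates on Slide 18 are quoted per DFF bit, a design-level figure scales with the number of DFFs the design uses. The sketch below shows that arithmetic only. The per-bit bounds are the values as read from the slide, and `n_used_dffs = 50_000` is a hypothetical design size chosen for illustration, not a figure from the talk.

```python
# Per-bit upper bounds as read from the slide, in errors per bit-day.
STATIC_BIT_RATE_MAX = 1e-10    # near-static DFF rate, no C-CELLs
DYNAMIC_BIT_RATE_MIN = 1e-9    # dynamic rate at the low end of 15 MHz to 120 MHz
DYNAMIC_BIT_RATE_MAX = 6e-8    # dynamic rate at the high end of 15 MHz to 120 MHz


def design_error_rate(bit_rate: float, used_dffs: int) -> float:
    """Scale a per-bit error rate to a per-design rate (errors per design-day)."""
    if used_dffs < 0:
        raise ValueError("used_dffs must be non-negative")
    return bit_rate * used_dffs


if __name__ == "__main__":
    n_used_dffs = 50_000  # hypothetical design size (assumption)

    static_bound = design_error_rate(STATIC_BIT_RATE_MAX, n_used_dffs)
    dynamic_low = design_error_rate(DYNAMIC_BIT_RATE_MIN, n_used_dffs)
    dynamic_high = design_error_rate(DYNAMIC_BIT_RATE_MAX, n_used_dffs)

    print(f"static bound:  {static_bound:.1e} errors/design-day")
    print(f"dynamic range: {dynamic_low:.1e} to {dynamic_high:.1e} errors/design-day")
```

The dynamic term dominates the static one by one to two orders of magnitude here, which is why the frequency of operation has to be reflected in the error rate.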

Slide 19
Upper-Bound Error Prediction: Actel RHBD Anti-fuse FPGA
With embedded LTMR mitigation + hardened clocks:
$P(f_s)_{error} \propto P(f_s)_{SET \to SEU}$
$\frac{dE}{dt} \le \frac{dE_{bit}(f_s)}{dt} \times \#UsedDFFs$
$\frac{dE}{dt} < 6 \times 10^{-8}\ \frac{Errors}{bit \cdot day} \times n\ \frac{bits}{design}$
$\frac{dE}{dt} < 3 \times 10^{-3}\ \frac{Errors}{design \cdot day}$
Thousands of years in LEO!

Slide 20
Mitigation and Xilinx Virtex Devices

Slide 21
Xilinx XQR4VSX55: Radiation Test Data
Xilinx Consortium: VIRTEX-4VQ STATIC SEU CHARACTERIZATION SUMMARY: April/2008

                                   Probability          Error Rate                      LEO (Upsets/device-day)    GEO (Upsets/device-day)
Configuration Memory: XQR4VSX55    $P_{Configuration}$  $\frac{dE_{configuration}}{dt}$ 7.43                       4.2
Combined SEFIs per device          $P_{SEFI}$           $\frac{dE_{SEFI}}{dt}$          $7.5 \times 10^{-5}$       $2.7 \times 10^{-5}$

For non-mitigated designs the most significant upset factor is: $P_{Configuration}$
M. Berg, Trading ASIC and FPGA Considerations for System Insertion; IEEE Nuclear Science Radiation Effects Conference 2009

Slide 22
Global Triple Modular Redundancy (GTMR): Largest Area → Greatest Complexity
[Figure: non-mitigated versus mitigated design]
• Triple entire design
• Triple I/O and voters
• Unprotected: hidden device logic SEFIs
• Cannot be an embedded strategy: complex to verify
• Xilinx offers XTMR
$P(f_s)_{error} \propto P_{Configuration} + P_{DFFSEU} + P(f_s)_{SET \to SEU} + P_{SEFI}$
Annotations on the terms: Low, Low, Low

Slide 23
XTMR: Capturing Asynchronous Input Data
Dynamic Analysis:
• One domain leads the other two
[Figure: inputs Async_data_tr0, Async_data_tr1, and Async_data_tr2 each pass through a metastability filter and an edge detect circuit built from DFFs, then into a voter]
[Figure: EDGE DETECT TIMING WAVEFORM over clock cycles n to n+5, showing INPUT SKEW across INPUT: Async_DATA_tr0, Async_DATA_tr1, and Async_DATA_tr2, followed by Edge_detect_tr0, Edge_detect_tr1, Edge_detect_tr2, and the voted rising edge detect]

Slide 24
Time Domain Considerations: XTMR Single Bit Failures Not Detected by Static Node Analysis
[Figure: timing waveform over clock cycles n to n+5 for INPUT: Async_DATA_tr0, Async_DATA_tr1, and Async_DATA_tr2 with a CONFIGURATION BIT HIT; Edge_detect_tr0, Edge_detect_tr1, and Edge_detect_tr2; NO EDGE DETECTION on the voted rising edge detect]

Slide 25
Voters and Asynchronous Signal Capture
• Place voter after metastability filters
• It satisfies skew constraints because the voter is anchored at DFF control points
[Figure: INPUT: Async_DATA_tr0, Async_DATA_tr1, and Async_DATA_tr2 pass through metastability filters into voters, then through edge detect circuits into voters; waveform over clock cycles n+1 to n+5]

Slide 26
Upper-Bound Error Prediction: Xilinx FPGA XTMR
$P_{Configuration}$: ???
• SEUs are insignificant
• MBUs may be insignificant (still under investigation)
• Assumes proper scrubbing
Assumes unmitigated SEFIs are the most predominant source:
$P(f_s)_{error} \propto P_{SEFI}$
$\frac{dE_{SEFI}}{dt} < 3 \times 10^{-5}\ \frac{Errors}{Device \cdot day} \times n\ Devices$
$\frac{dE}{dt} \approx \frac{dE_{SEFI}}{dt} < 3 \times 10^{-5} \cdot n\ \frac{Errors}{day}$

Slide 27
Tools

Slide 28
Mitigation and Actel Tools
• Mentor Graphics has offered LTMR for anti-fuse devices
• There is a desire to apply LTMR to Actel Flash-based products
• DTMR is another approach (GTMR with no clock redundancy)
    • Flash
    • Assist with SETs in anti-fuse devices

Slide 29
Mitigation and Xilinx Tools
• Currently XTMR is commercially available from Xilinx
• NASA REAG has identified some issues:
    • Asynchronous domain crossings
    • Verification of XTMR insertion
• Mentor is now evaluating GTMR with Formal Checking
• NASA REAG is expecting to use Mentor GTMR (preliminary version) for V5 radiation testing