
Post-Manufacturing ECC Customization Based on Orthogonal Latin Square Codes and Its Application to Ultra-Low Power Caches

Rudrajit Datta and Nur A. Touba

Computer Engineering Research Center Dept. of Electrical and Computer Engineering University of Texas at Austin

Motivation

For memories with high defect rates

Reduce check-bit overhead

Increase reliability

Applicable to low voltage caches

Agenda

Introduction

Proposed Approach

Application

Related Work

Orthogonal Latin Square (OLS) Codes

Customization

Results

Conclusion

Introduction

Tolerate high defect rates for memories

Occurs in memories operating at ultra-low voltages

Expected in future nanoscale technologies

E.g., nanoscale crossbar architectures

Conventional method

ECC selected based on

Maximum expected number of defects per word

Introduction

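[Figure: conventional ECC memory. The data feeds a check bit generator that produces c_full check bits; the memory stores the information bits together with the c_full check bits; the decoder reads both and outputs the corrected data.]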

Observations

A priori information available for location of defects

Through post-manufacturing memory tests

Obtain a defect map

Use information to customize code

Reduce check bit storage in memory/caches

Proposed Approach

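[Figure: proposed ECC memory. The check bit generator produces c_full check bits; a switch network, driven by the configuration bits, reduces them to c_used check bits, which are stored in memory alongside the information bits; on a read, a second switch network expands the c_used check bits back to c_full for the decoder, which outputs the corrected data.]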

Proposed Approach

Customize code by disabling rows of the H-matrix

Possible if modular code used for ECC

Current work looks at OLS codes

Configuration bits: 1 0 1 0
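A minimal sketch of how configuration bits could pick the check bits that are actually stored. It assumes one configuration bit per H-matrix row and that a 1 keeps the row; the polarity and the name `select_check_bits` are illustrative assumptions, not taken from the slides.

```python
def select_check_bits(c_full, config_bits):
    """Keep only the check bits whose H-matrix row is enabled (config bit == 1)."""
    if len(c_full) != len(config_bits):
        raise ValueError("one configuration bit is needed per check bit")
    return [c for c, keep in zip(c_full, config_bits) if keep]


# Four check bits from the full code, rows 0 and 2 enabled.
c_full = [1, 1, 0, 1]
c_used = select_check_bits(c_full, [1, 0, 1, 0])
print(c_used)
```

Only the enabled rows consume storage, which is where the check-bit saving would come from under this model.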

Application - Low-voltage Caches

Microprocessor voltage lowered while idle

Reduces power

Caches and memories susceptible at lower voltages

Unreliable below Vccmin

Enable reliable cache operation at lower voltages

At lower voltages use part of cache to store extra check bits

Related Work

Word-disable and Bit-fix [Wilkerson 08]

Defect map

Identify vulnerable bits

Mitigates only persistent errors

Uses up half of the cache to store extra check-bits

Two-dimensional ECC [Kim 07]

Slow

Complicated decoding

Multi-bit segmented ECC [Chishti 09]

Orthogonal Latin Square (OLS) code

Single step decodable

High redundancy

Key Takeaways

Have full ECC on chip

Can handle all defect maps

Generate defect map

Disable part of the original code

Reduces check bit redundancy

Retain capability of original code w.r.t. the defect map

One Step Majority Decoding

t-error correctable: information bit copied over 2t+1 times, each an independent copy

One copy: the bit itself

Rest: 2t independent parity equations

[Figure: majority voter for d_i. Its inputs are d_i itself and the parity estimates d_p + c_p, d_q + c_q, and d_s + c_s; its output is the corrected d_i.]
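The voting scheme above can be sketched in a few lines. This is an illustrative model, not the authors' hardware: the function name `majority_decode` and the tiny t = 1 code (4 data bits in a 2 x 2 grid with row and column parities) are assumptions chosen to keep the example short.

```python
def majority_decode(data, checks, equations):
    """One-step majority decoding.

    data:      received data bits (0/1), possibly corrupted
    checks:    received check bits, one per parity equation
    equations: equations[k] lists the data-bit indices XORed into checks[k]
    """
    decoded = []
    for i in range(len(data)):
        votes = [data[i]]  # first copy: the bit itself
        for k, members in enumerate(equations):
            if i in members:
                estimate = checks[k]
                for j in members:
                    if j != i:
                        estimate ^= data[j]  # re-derive d_i from the other bits
                votes.append(estimate)
        # Strict majority of the 2t + 1 votes decides the corrected bit.
        decoded.append(1 if 2 * sum(votes) > len(votes) else 0)
    return decoded


# Hypothetical t = 1 example: row parities, then column parities.
equations = [[0, 1], [2, 3], [0, 2], [1, 3]]
sent = [1, 0, 1, 1]
checks = [sent[a] ^ sent[b] for a, b in equations]
received = [0, 0, 1, 1]  # d_0 flipped in storage
print(majority_decode(received, checks, equations))
```

Each data bit here has three votes (itself plus two parity estimates), and a single error can corrupt at most one of them, so the majority still returns the stored value.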

Orthogonal Latin Square Codes

Latin Square

m x m array

Each row and each column is a permutation of the digits 0, 1, ..., m-1

Orthogonal Latin Squares

Ordered pair of elements (r, c, s) appear only once

m^2 data bits, 2tm check bits, t-error correctable [Hsiao 70]

Single step decodable
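A sketch of how the parity equations of an OLS code can be generated, under the assumption that m is prime so that the squares L_k(r, c) = (k*r + c) mod m are mutually orthogonal. The function name `ols_parity_groups` is hypothetical. The first 2m equations are the row and column parities of the m x m data array; each additional Latin square contributes m more, for 2tm in total.

```python
from itertools import combinations


def ols_parity_groups(m, t):
    """Return the 2*t*m parity equations (lists of data-bit indices) over m*m data bits.

    Assumes m is prime; L_k(r, c) = (k*r + c) % m for k = 1..m-1 are then
    mutually orthogonal Latin squares.
    """
    if t < 1 or 2 * t - 2 > m - 1:
        raise ValueError("not enough orthogonal Latin squares for this t")
    groups = [[r * m + c for c in range(m)] for r in range(m)]   # row parities
    groups += [[r * m + c for r in range(m)] for c in range(m)]  # column parities
    for k in range(1, 2 * t - 1):                                # 2t - 2 Latin squares
        for symbol in range(m):
            groups.append([r * m + c for r in range(m) for c in range(m)
                           if (k * r + c) % m == symbol])
    return groups


groups = ols_parity_groups(5, 2)  # 25 data bits, t = 2
print(len(groups))                # number of check bits, 2tm

# Orthogonality: any two data bits meet in at most one parity equation.
shared = max(sum(1 for g in groups if a in g and b in g)
             for a, b in combinations(range(25), 2))
print(shared)
```

The "at most one shared equation" property is what makes the 2t parity estimates of each bit independent for majority voting.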

Proposed Scheme

Implement full OLS code on chip

Run memory tests

Generate defect map

At manufacturing time or at boot-time

Identify vulnerable bits

Disable rows in OLS H-matrix

On chip-by-chip basis, based on defect map

Correct all erasures PLUS ‘e’ random errors in each cache line

Reduce redundancy while providing same reliability

Definitions

“good row” – for information bit d_i

Row of OLS H-matrix

No ‘1’ in any erasure position other than bit d_i

Holds true for all lines in cache

“bad row” – for information bit d_i

Row of OLS H-matrix

‘1’ in one or more erasure positions apart from bit d_i

Holds for at least one line of cache

“Good Rows” & “Bad Rows”

[Figure: example with information bits d0 to d7, erasure positions (E) marked for two cache lines (line1, line2), and H-row1 to H-row3 labeled good (G) or bad (B) for each line.]
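The two definitions can be expressed as a small classifier. This is a sketch under stated assumptions: each H-matrix row is represented as the set of column positions holding a ‘1’, each cache line as the set of its erasure positions, and only rows that contain d_i are classified. The name `classify_rows` and the example data are hypothetical.

```python
def classify_rows(h_rows, bit, erasures_per_line):
    """Split the H-matrix rows covering `bit` into (good, bad) row indices.

    h_rows:            list of sets, positions with a '1' in each row
    erasures_per_line: list of sets, erasure positions of each cache line
    """
    good, bad = [], []
    for idx, row in enumerate(h_rows):
        if bit not in row:
            continue  # the row does not take part in decoding this bit
        # Bad if, in at least one line, the row touches an erasure other than `bit`.
        touches_other = any((row & erasures) - {bit} for erasures in erasures_per_line)
        (bad if touches_other else good).append(idx)
    return good, bad


# Hypothetical 4-bit example: two row parities and two column parities.
h_rows = [{0, 1}, {2, 3}, {0, 2}, {1, 3}]
lines = [{1}, {3}]  # line1 has an erasure at d1, line2 at d3
print(classify_rows(h_rows, 0, lines))
```

For d0, the row {0, 1} is bad because line1 has an erasure at d1, while the row {0, 2} is good because no line has an erasure at d2.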

Necessary and Sufficient Conditions

Tolerate ‘e’ random errors

(number of “good rows”) − (number of “bad rows”) ≥ 2(e + 1)

Original code: t-error correcting

(Max vulnerable bits in any line) + e ≤ t

Row Selection

Covering problem

Select enough good rows for each information bit d_i

Until constraint is satisfied

NP-complete problem

Apply heuristics

[Figure: H-row1 to H-row3 marked good (G) or bad (B) for each information bit, with the resulting “good rows” – “bad rows” count.]

Covering Problem

Solve for cache line with maximum erasures first

Apply solution to all other cache lines

If unsatisfactory, add erasures from one of unsolved lines

Repeat until solution fits entire cache
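The four steps above can be sketched as follows. This is not the authors' heuristic: the greedy row choice, the helper names (`ols_rows`, `margin`, `fits`, `greedy_select`, `customize`), and the small m = 5, t = 3 code are all assumptions made for illustration. A line "fits" here when every data bit satisfies (good rows) − (bad rows) ≥ 2(e + 1) among the selected rows.

```python
def ols_rows(m, t):
    """Parity equations of an OLS code as sets of data-bit indices (assumes m prime)."""
    rows = [{r * m + c for c in range(m)} for r in range(m)]
    rows += [{r * m + c for r in range(m)} for c in range(m)]
    for k in range(1, 2 * t - 1):
        rows += [{r * m + c for r in range(m) for c in range(m) if (k * r + c) % m == s}
                 for s in range(m)]
    return rows


def margin(rows, selected, bit, erasures):
    """(good rows) - (bad rows) for `bit`, counted over the selected rows only."""
    good = bad = 0
    for idx in selected:
        if bit in rows[idx]:
            if (rows[idx] & erasures) - {bit}:
                bad += 1
            else:
                good += 1
    return good - bad


def fits(rows, selected, n_bits, erasures, e):
    return all(margin(rows, selected, b, erasures) >= 2 * (e + 1) for b in range(n_bits))


def greedy_select(rows, n_bits, erasures, e):
    """Add rows one at a time, preferring rows that are good for unsatisfied bits."""
    selected, remaining = [], list(range(len(rows)))
    while not fits(rows, selected, n_bits, erasures, e):
        if not remaining:
            return None  # even the full code cannot meet the constraint
        unsat = [b for b in range(n_bits)
                 if margin(rows, selected, b, erasures) < 2 * (e + 1)]

        def gain(idx):
            return sum(-1 if (rows[idx] & erasures) - {b} else 1
                       for b in unsat if b in rows[idx])

        best = max(remaining, key=gain)
        selected.append(best)
        remaining.remove(best)
    return sorted(selected)


def customize(rows, n_bits, lines, e):
    """Solve for the worst line first, then merge in erasures of lines that do not fit."""
    order = sorted(lines, key=len, reverse=True)
    merged = set(order[0])
    while True:
        selected = greedy_select(rows, n_bits, merged, e)
        if selected is None:
            return None
        failing = [ln for ln in order if not fits(rows, selected, n_bits, set(ln), e)]
        if not failing:
            return selected
        merged |= set(failing[0])


rows = ols_rows(5, 3)             # 25 data bits, 30 check bits in the full code
lines = [{0, 1}, {0}, set()]      # hypothetical defect map: erasures per cache line
selected = customize(rows, 25, lines, e=0)
print(len(selected), "of", len(rows), "rows kept")
```

Because a failing line always contributes at least one new erasure position, the merged set grows on every pass and the loop terminates. A production heuristic would likely use a smarter selection order than this greedy gain.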

Implementation

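[Figure: adjustable majority voter producing the corrected d_i. Its inputs are d_i and the parity terms d_p + c_p, d_q + c_q, and d_s + c_s, each gated (&) by its control signal ctl_p, ctl_q, or ctl_s; a ctl input also adjusts the voter.]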

Experimental Results

Results for Word Size of 256 Bits and Bit-Error Rate of 10^-3

Cache Size (Bytes)                         16 KB   32 KB   64 KB   128 KB
Check bits for conventional OLS    Avg     155     166     175     208
                                   Max     224     256     256     256
Check bits for customized OLS      Avg     117     125     134     163
                                   Max     145     148     156     177
Percentage reduction in Max. Check Bits    35.27   42.19   39.06   30.86
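The percentage-reduction figures follow directly from the two Max rows. As a worked check for the 16 KB and 32 KB columns:

```latex
\frac{224 - 145}{224} \times 100\% \approx 35.27\%,
\qquad
\frac{256 - 148}{256} \times 100\% \approx 42.19\%
```

The 64 KB and 128 KB entries are obtained the same way from (256 − 156)/256 and (256 − 177)/256.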

Experimental Results

Results for Constant Cache Size of 64 KB

Word Size (Bits)                           256                       484
Bit-Error Rate                             10^-3   10^-4   10^-5     10^-3   10^-4   10^-5
Check bits for conventional OLS    Avg     175     98      66        295     143     92
                                   Max     256     128     102       396     176     132
Check bits for customized OLS      Avg     138     84      64        198     117     89
                                   Max     156     107     68        230     139     115

Experimental Results

[Figure: results for 64 KB cache, 484-bit word, 10^-3 bit-error rate]

Conclusion

Post-manufacturing customization

Reduces large check-bit overhead

Provides requisite reliability

Applicable to systems with high defect rate