Designing a Fast and Adaptive Error Correction Scheme for Increasing the Lifetime of Phase Change Memories
Rudrajit Datta and Nur A. Touba
Computer Engineering Research Center
Dept. of Electrical and Computer Engineering
University of Texas at Austin
Introduction
 Challenges for traditional memories
• Scalability
• Device leakage
• Retention time
 Phase Change Memories (PCM) – a possible substitute
• Non-volatile
• Amenable to process scaling
• High density – 4x DRAM [Seznec 10]
Phase Change Memories
 Crystalline state
• Low resistance – ‘1’
 Amorphous state
• High resistance – ‘0’
 Thermally induced state changes
 Scalable
 Disadvantages
• Relatively quick degradation [Fantini 06]
– ~10^7 writes [Ferreira 10]
• Slow writes
 PCM in place of DRAM – fix PCM reliability
Previous Work
 Hybrid PCM/DRAM [Zhang 09]
• OS level paging scheme
• BCH code correcting up to 7 errors
– Slow
 Spread/minimize PCM writes
• [Ferreira 10] – minimize PCM writes
• [Lee 09] – buffer reorganization and partial writes
Previous Work
 Architectural solutions so far
 None using novel error correction code (ECC)
• PCM errors are an increasing function of time
– Function of writes/cell
• Very different from traditional DRAM
– Increasing permanent errors
Proposed Scheme
 Adaptive Error Correction
• OS monitors errors corrected
• Signals memory controller
– Increase number of check bits
 Physical line size of memory unchanged
• More check bits, fewer data bits
 Main memory to cache bandwidth affected
• Gradually decreasing cache line size
• Minimal performance impact
 Orthogonal Latin Square (OLS) codes used
• Fast – single step decode
• Modular
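The monitoring loop on this slide (the OS counts corrected errors and signals the memory controller to add check bits) could be sketched roughly as below. This is only an illustration: the class name `EccMonitor`, the `threshold` parameter, and the one-step-at-a-time rule are assumptions, not details given in the slides. The fractions in `LEVELS` are the ones that appear as column headings in the results table of this deck.

```python
from dataclasses import dataclass

# Fractions of memory given over to extra check bits (column headings of the
# results table); stepping through them one at a time is an assumption.
LEVELS = (0.0, 0.25, 0.5, 0.75)


@dataclass
class EccMonitor:
    """Hypothetical OS-side monitor that decides when to strengthen the ECC."""

    threshold: int  # assumed: corrected errors in one read that trigger a step-up
    level: int = 0  # index into LEVELS

    def report(self, corrected_errors: int) -> float:
        """Record the errors corrected on a read; return the fraction in effect."""
        at_max = self.level == len(LEVELS) - 1
        if corrected_errors >= self.threshold and not at_max:
            # Stands in for "signal the memory controller to add check bits".
            self.level += 1
        return LEVELS[self.level]


monitor = EccMonitor(threshold=3)
print(monitor.report(1))  # 0.0  (below the threshold, nothing changes)
print(monitor.report(3))  # 0.25 (threshold reached, one step up)
```

A real policy would presumably look at error counts over time rather than a single read; the sketch only shows the direction of the signalling.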
Proposed Scheme
[Figure: a memory line holding Word 1 to Word 4 plus OLS check bits; under Enhanced ECC the line holds Word 1 to Word 3 plus OLS check bits, and Word 1 to Word 3 are transferred into cache.]
Proposed Scheme
[Figure: block diagram of the write and read paths. Labels: Data; Regular Check-bit Generator; Enhanced Check-bit Generator; Signal from OS; Main Memory; Information Bits; Check Bits; Corrected Data.]
Orthogonal Latin Square Codes
 Latin Square
• m x m array
• Each row and each column is a permutation of the digits 0, 1, …, m-1
 Orthogonal Latin Squares
• When two squares are superimposed, each ordered pair of elements appears only once
 m^2 data bits, 2tm check bits, t-error correctable [Hsiao 70]
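To make the OLS construction and its single-step decode concrete, here is a minimal sketch. It assumes m is prime so that the squares can be built as L_k[i][j] = (k*i + j) mod m, which is a standard construction but not necessarily the one used by the authors. The function names are illustrative. The decoder flips a data bit when more than t of its 2t parity equations fail (one-step majority voting).

```python
def latin_squares(m, count):
    """Return `count` mutually orthogonal Latin squares of order m.

    Assumes m is prime: L_k[i][j] = (k*i + j) mod m for k = 1..count.
    """
    if count > m - 1:
        raise ValueError("at most m - 1 squares with this construction")
    return [[[(k * i + j) % m for j in range(m)] for i in range(m)]
            for k in range(1, count + 1)]


def parity_groups(m, t):
    """Build 2t groups of m parity equations each (2*t*m check bits in total).

    Data bits are indexed row-major in an m x m array. The groups are the
    rows, the columns, and 2t - 2 Latin squares (cells sharing a symbol).
    """
    cells = [(i, j) for i in range(m) for j in range(m)]
    groups = [
        [[i * m + j for j in range(m)] for i in range(m)],  # rows
        [[i * m + j for i in range(m)] for j in range(m)],  # columns
    ]
    for sq in latin_squares(m, 2 * t - 2):
        groups.append([[i * m + j for i, j in cells if sq[i][j] == s]
                       for s in range(m)])
    return groups


def encode(data, groups):
    """Return the check bits: one XOR parity per equation, group by group."""
    return [sum(data[i] for i in eq) % 2 for grp in groups for eq in grp]


def decode(data, checks, groups, t):
    """One-step majority decode of the data bits (corrects up to t errors)."""
    m = len(groups[0])
    # An equation "fails" if its recomputed parity differs from the stored one.
    failed = [p ^ c for p, c in zip(encode(data, groups), checks)]
    corrected = list(data)
    for bit in range(len(data)):
        votes = sum(failed[g * m + e]
                    for g, grp in enumerate(groups)
                    for e, eq in enumerate(grp) if bit in eq)
        if votes > t:  # majority of the 2t equations containing this bit
            corrected[bit] ^= 1
    return corrected


groups = parity_groups(5, 2)   # 25 data bits, 2*2*5 = 20 check bits
word = [1, 0, 1, 1, 0] * 5
checks = encode(word, groups)
noisy = list(word)
noisy[3] ^= 1                  # inject two data-bit errors
noisy[17] ^= 1
print(decode(noisy, checks, groups, 2) == word)  # True
```

Every vote is computed from the same received word, with no iteration, which is what makes the decode a single step. Increasing t only adds more groups built the same way, which is the modularity the deck refers to.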
Adaptive ECC
 Increase number of check bits per line
 Break up line into small segments
• Based on number of data bits
 Implement ECC separately on each segment
• Constraint – original line size unchanged
• (Data + ECC)_Original = ∑_Segments (Data_Segment + ECC_Segment)
 Overall error tolerance goes up
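If every segment is itself protected by an OLS code, the line-size constraint above can be written out using the m^2 data bits / 2tm check bits relation from [Hsiao 70]. The symbols k (number of segments), m_i and t_i (size parameter and correction capability of segment i) are notation introduced here for illustration and do not appear in the slides.

```latex
\[
  \underbrace{m^2 + 2tm}_{\text{original line}}
  \;=\;
  \sum_{i=1}^{k} \bigl( \underbrace{m_i^2}_{\text{data}} + \underbrace{2\,t_i\,m_i}_{\text{check bits}} \bigr),
  \qquad
  \sum_{i=1}^{k} m_i^2 \;<\; m^2 .
\]
```

The second relation states that the segments together hold fewer data bits than the original line, since part of the line has been handed over to check bits.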
Adaptive ECC
[Figure: line layouts. Original: Word 1, Word 2, Word 3, Word 4, ECC_OLS. Enhanced ECC: Word 1, Word 2, Word 3, ECC_OLS. Enhanced Adaptive ECC: Segment 1, ECC1, Segment 2, ECC2, Segment 3, ECC3, Segment 4, ECC4.]
Adaptive ECC – Numerical example
 Original configuration
• 3-error-correcting OLS code on a 256-bit line – 256 data bits + 96 check bits = 352 bits total
• Corrects all patterns of 3 or fewer errors
 Increased check bits
• 25% of the data bits are reassigned to ECC, leaving 192 data bits
– Two 64-bit data segments
– Four 16-bit data segments
• Check bits – (352 – 192) = 160
– 3-error-correcting OLS on each 64-bit segment
– 2-error-correcting OLS on each 16-bit segment
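The arithmetic in this example can be checked with a short script. It relies only on the relation quoted earlier in the deck (m^2 data bits need 2tm check bits for t-error correction); the function and variable names are illustrative.

```python
import math


def ols_check_bits(data_bits, t):
    """Check bits of a t-error-correcting OLS code on m*m data bits: 2*t*m."""
    m = math.isqrt(data_bits)
    if m * m != data_bits:
        raise ValueError("OLS codes need a square number of data bits")
    return 2 * t * m


def line_size(segments):
    """Total bits (data + check) for a list of (data_bits, t) segments."""
    return sum(d + ols_check_bits(d, t) for d, t in segments)


original = [(256, 3)]                      # one 256-bit segment, t = 3
enhanced = [(64, 3)] * 2 + [(16, 2)] * 4   # two 64-bit and four 16-bit segments

print(line_size(original))                              # 352
print(line_size(enhanced))                              # 352
print(sum(d for d, _ in enhanced))                      # 192 data bits
print(sum(ols_check_bits(d, t) for d, t in enhanced))   # 160 check bits
```

Both configurations come to 352 bits, so the physical line size is unchanged while the share of check bits grows from 96 to 160.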
Adaptive ECC – Numerical example
 Enhanced ECC configuration corrects
• 99.97% of 3-bit errors
• 99.73% of 4-bit errors
• …..
• A small fraction of 14-bit errors
 Segmented ECC implementation boosts error tolerance
Results
Error tolerance (no. of errors / no. of bits * 100) for varying memory sizes

Memory   Fraction of memory used for storing extra check bits
size     0.0      0.25     0.5      0.75
128MB    0.008    0.015    0.213    1.190
256MB    0.006    0.042    0.205    1.117
1GB      0.005    0.026    0.154    0.989
4GB      0.003    0.020    0.125    0.916
Results
[Figure: percentage of operational memory lines versus number of errors injected, out of 100,000 experiments.]
Results
[Figure: bit-error rate tolerance (%) versus time, comparing Proposed_Scheme with 7-error BCH [Zhang 09].]
Results
SPEC2006 Benchmarks
Results
SPEC2006 Benchmark – bzip2
Conclusion
 Novel error correction scheme for PCM
• Fast
• Adaptive
– Graceful decrease in memory capacity
• Increases PCM lifetime
– Switching period (to enhanced ECC) on the order of years