Die-Hard SRAM Design
Using Per-Column Timing Tracking
Shi-Yu Huang and Ya-Chun Lai
Feb. 10, 2007 @ Las Vegas (IC-DFN)
Design Technology Center (DTC)
National Tsing-Hua University, HsinChu, Taiwan
Outline
• Introduction
• Timing Tracking Scheme
– Traditional Replica-Based Scheme
– Our Scheme
• Experimental Results
• Conclusion
2/25
Nanometer Effects on SRAMs
• Worse Device Mismatch
• Larger Leakage Current
• Wider Variations of R and C
• Lower VDD (smaller noise margins)
• Worse Supply & Coupling Noise
• Uncertain Delay
Could trigger a yield crisis!
3/25
SRAM Memory Architecture
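[Figure: SRAM block diagram. Control inputs CS, WE, and OE; address bits A0 to A9 feed the Row Decoder, which drives the word lines; address bits A10 to A19 feed the Column Decoder; the bit lines connect to the Sense Amplifier / Drivers; Input-Output is M bits wide.]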
4/25
Reading An SRAM Cell
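[Figure: an SRAM cell holding Q = 1 and Q' = 0 during a read. A pulsed wordline turns the cell on, and the cell current discharges one bitline of the complementary pair. A second panel shows the bitlines' waveforms.]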
5/25
Two Types of Sense Amplifiers
[Figure: a sense amplifier in two styles, Continuous Type and Latch Type. Labeled signals: VDD, the sa_in input pair, the saout output, and se (Sense Enable).]
6/25
Three Major Problems for SRAM
• Mismatch in Bit Cells and Sense Amplifiers
– Vt mismatch shrinks the noise margin
• Bitline Leakage Current
– Could cause failure for READ operations
• Timing Tracking
– When to turn on sense amplifiers?
– When to turn off wordline? (pulsed wordline)
7/25
X-Calibration for Leakage Tolerance
(Presented in Last IC-DFN)
Leakage is calibrated in two steps:
• Step 1: Transform the effects of the bitline leakage into a Voffset between the two bitlines of a pair
• Step 2: Deduct Voffset from the input of the sense amplifier when performing sense amplification
[Figure: a bitline pair with eight cells, seven storing one value and one storing the opposite value; the leakage current flows on the bitline into the X-calibration circuit and the S.A.; voltage labels 1.5V and 1.8V.]
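The two steps can be illustrated with a purely behavioral sketch. This is not the X-calibration circuit itself: the function name `sense`, the sign convention (positive differential reads as 1), and all voltage values below are assumptions chosen only to show why deducting the offset matters.

```python
def sense(v_bl: float, v_blb: float, v_offset: float = 0.0) -> int:
    """Return the bit a sense amplifier would resolve from a bitline pair.

    v_offset is the calibrated leakage-induced offset (step 1), deducted
    from the differential input before the sign decision (step 2).
    """
    differential = (v_bl - v_blb) - v_offset
    return 1 if differential > 0 else 0


# Hypothetical numbers, for illustration only.
cell_signal = 0.100      # the accessed cell alone would give +100 mV
leakage_offset = -0.120  # leakage of unselected cells drags BL down by 120 mV

v_blb = 1.8
v_bl = v_blb + cell_signal + leakage_offset

uncalibrated = sense(v_bl, v_blb)                # leakage flips the decision
calibrated = sense(v_bl, v_blb, leakage_offset)  # offset deducted first

print(uncalibrated, calibrated)
```

With these assumed values the uncalibrated read resolves the wrong bit, while the calibrated read recovers the cell's value.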
8/25
Die Photo of Test Chip
SRAM Type            Conventional            Our X-Calibration
Array Organization   1Kb cells (32 rows × 32 columns), both designs
Technology           TSMC 0.18um CMOS 1P6M, both designs
Area                 486um × 265um (100%)    486um × 285um (107.6%)
Access Time (1.8V)   1.89 ns (100%)          1.93 ns (102%)
Supply Current       3.7 mA (100%)           4.15 mA (112%)
[Figure: die photo with the X-Calibration SRAM, the Conventional SRAM, and the BIST block labeled; die dimensions 1.373mm and 1.108mm.]
9/25
Shmoo Plots
Target speed: 150MHz @ 25°C
[Figure: two shmoo plots, Conventional and Ours with X-Calibration, of Supply Voltage (V) versus Injected Leakage Current (uA), with Pass and Fail regions. Marked pass/fail limits: Ileak=76.6uA and Ileak=320uA.]
Measurement result: Leakage tolerance improved by 317%
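The 317% figure can be cross-checked against the two marked leakage limits. The sketch below assumes that 76.6uA is the limit of the conventional SRAM and 320uA the limit with X-calibration; this attribution is inferred from the reported improvement, not stated explicitly on the slide. The function name is illustrative.

```python
def relative_improvement(baseline: float, improved: float) -> float:
    """Return the relative gain of `improved` over `baseline`, in percent."""
    return (improved - baseline) / baseline * 100.0


# Assumed attribution: 76.6 uA conventional, 320 uA with X-calibration.
gain = relative_improvement(76.6, 320.0)
print(f"{gain:.1f}%")
```

Under that assumption the gain works out to just under 318%, consistent with the 317% quoted on the slide if the value is truncated rather than rounded.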
10/25
Outline
• Introduction
• Timing Tracking Scheme
– Traditional Replica-Based Scheme
– Per-Column Timing Tracking Scheme
• Experimental Results
• Conclusion
11/25
Traditional Scheme – Replica Bitline
Property: the replica bitline pair develops a logic-level signal (i.e., sense enable) when an accessed bitline pair has built up a 100mV signal
[Figure: block diagram with CLK, decoder, active wordline, replica bitline pair, accessed bitline pair, logic, and sense amps.]
Ref: B. S. Amrutur et al., “A replica technique for wordline and sense control in low-power SRAMs,”
IEEE Journal of Solid-State Circuits, Vol. 33, No. 8, pp. 1208-1219, Aug. 1998.
12/25
Problems of Replica Bitline Based Timing Control
Factors affecting the speed of a bitline pair: leakage, RC, and the drive strength of the cell
→ Each column could have its own bitline development speed
→ A single sense enable control is susceptible to sensing errors
[Figure: Voltage (V) waveforms of the bitline pair and SE over two read cycles.]
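A first-order model makes the problem concrete. The sketch below assumes a constant cell current discharging a lumped bitline capacitance, so the differential grows linearly in time; the column names, the 50uA cell current, and the choice of a replica matching the nominal column are all hypothetical. Only the 100mV target comes from the slides.

```python
V_TARGET = 0.100  # 100 mV bitline differential needed before sensing


def develop_time(c_bl: float, i_cell: float, v_target: float = V_TARGET) -> float:
    """Time for a bitline pair to build up v_target, assuming the cell
    discharges the bitline capacitance c_bl with a constant current i_cell."""
    return c_bl * v_target / i_cell


def differential_at(t: float, c_bl: float, i_cell: float) -> float:
    """Bitline differential after time t under the same linear model."""
    return i_cell * t / c_bl


# Hypothetical columns: (bitline capacitance in F, cell current in A).
COLUMNS = {
    "light": (100e-15, 50e-6),
    "nominal": (300e-15, 50e-6),
    "heavy": (500e-15, 50e-6),
}
REPLICA = (300e-15, 50e-6)  # assumed to match the nominal column

# Replica-based: one shared sense-enable time for every column.
t_se = develop_time(*REPLICA)
shared = {name: differential_at(t_se, c, i) for name, (c, i) in COLUMNS.items()}

# Per-column tracking: each column fires at its own 100 mV crossing.
per_column = {name: develop_time(c, i) for name, (c, i) in COLUMNS.items()}

for name in COLUMNS:
    print(f"{name:8s} shared SE: {shared[name] * 1e3:6.1f} mV   "
          f"own SE time: {per_column[name] * 1e9:.2f} ns")
```

In this toy setup the heavily loaded column has developed only about 60mV when the shared sense enable fires, which is the kind of shortfall that leads to sensing errors, while the lightly loaded column has been waiting with far more signal than it needs.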
13/25
Adaptive Sensing Control
Each sense amp adapts to the bitline pair currently driving it!
[Figure: Voltage (V) waveforms of the bitline pair and SE over two read cycles.]
14/25
Operating Flow
Typical READ control steps:
– Row address decoding
– Wordline activation
– Bitline discharging
– S.E. active? (N: wait; Y: proceed)
– Sense amplification
Added timing tracking steps:
– Timing tracker start-up
– Timing tracker monitoring
– ΔVBL > 100mV? (N: keep monitoring; Y: proceed)
– Sense enable generation
– Timing tracker disabling
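The flow above can be sketched as a small time-stepped loop for a single column. This is a behavioral illustration under assumptions, not the tracker circuit: the function name `tracked_read`, the linear discharge model, the 10ps step, and the ordering of the log entries are one reading of the flowchart.

```python
def tracked_read(c_bl: float, i_cell: float, dt: float = 10e-12,
                 v_target: float = 0.100, max_steps: int = 10_000):
    """Simulate one READ with per-column timing tracking.

    Returns (time at which sense enable fires, bitline differential at
    that time, log of the control steps in the order they occur).
    """
    log = [
        "row address decoding",
        "timing tracker start-up",
        "wordline activation",
        "bitline discharging / timing tracker monitoring",
    ]
    dv = 0.0
    n = 0
    # Decision "dV_BL > 100mV?": stay in the loop on the N branch.
    while dv <= v_target:
        if n >= max_steps:
            raise TimeoutError("bitline never reached the sensing threshold")
        n += 1
        dv = i_cell * (n * dt) / c_bl  # assumed constant-current discharge
    # Y branch: the tracker raises sense enable, then switches itself off.
    log += [
        "sense enable generation",
        "sense amplification",
        "timing tracker disabling",
    ]
    return n * dt, dv, log


# Hypothetical columns: 300 fF and 500 fF bitlines, 50 uA cell current.
t_nom, dv_nom, steps = tracked_read(300e-15, 50e-6)
t_heavy, dv_heavy, _ = tracked_read(500e-15, 50e-6)
print(f"nominal: SE at {t_nom * 1e9:.2f} ns, heavy: SE at {t_heavy * 1e9:.2f} ns")
```

Because the loop exits only once the differential exceeds the threshold, the slower column simply fires its sense enable later instead of being sensed with too little signal.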
15/25
Overall Architecture
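[Figure: overall architecture. Row Decoder and WL Driver drive the wordlines (WL) of the Cell Array of memory cells (MC); each bitline pair passes through a MUX2 to a Timing Tracker, enabled by det_en, which produces se for the sense amplifier (SA), followed by a Latch & Buffer in the I/O Circuitry. Controller, Input Buffer, and Address Buffer sit alongside.]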
16/25
Transient Waveforms for Read
[Figure: read waveforms of CLK, WL, the bitline pair, det_en, and se, shown beside the per-column read path (Row Decoder, WL Driver, MC, MUX2, Timing Tracker, SA, Latch & Buffer).]
Desired property: SE goes high when the bitline pair has developed a 100mV difference!
17/25
Outline
• Introduction
• Timing Tracking Scheme
– Traditional Replica-Based Scheme
– Per-Column Timing Tracking Scheme
• Experimental Results
• Conclusion
18/25
Effect of Variation on Sense Amp. Vt
• As Vt mismatch in sense amplifier becomes excessive,
the probability of read failure increases.
[Figure: Pass Rate (axis 0.4 to 1.2) versus local standard deviation of Vt for transistors in SA (0 to 60 mV), comparing the proposed scheme with the replica-based (dummy bitline) scheme.]
19/25
Effect of Variation on Bitline Capacitance
• Ours is insensitive to bitline capacitance variation.
• By contrast, the replica-based method is vulnerable.
[Figure: Pass Rate (axis 0.4 to 1.2) versus local standard deviation of Vt for transistors in SA (0 to 60 mV), for the proposed scheme and the replica-based (dummy bitline) scheme at bitline capacitances of 100fF, 300fF, and 500fF.]
20/25
Layout of Test Chip
• Technology: TSMC 0.18um CMOS 1P6M
• Creating nanometer effects: we used different loadings on different bitlines so as to mimic the different operating speeds in deeper nanometer technologies
[Figure: test chip layout with the Proposed SRAM, the Compared SRAM, and the loading capacitors labeled; chip dimensions 1.208mm and 1.108mm.]
21/25
Layout of Compared SRAM
• Row decoder
• Cell array
• IO circuitry
• Column decoder & Output buffer
• Control & Input buffer & Row address buffer
• Column address buffer
22/25
Layout of Proposed SRAM
• Row decoder
• Cell array
• IO circuitry & Timing tracker
• Column decoder & Output buffer
• Control & Input buffer & Row address buffer
• Column address buffer
23/25
Test Chip Characteristics
Technology: TSMC 0.18um CMOS 1P6M
Package: 40-pin S/B
SRAM macro organization: 32 rows x 64 columns
Test chip area: 1.108 mm x 1.208 mm
Power supply voltage: 1.8 V
Operating clock frequency: 200 MHz
Power dissipation for compared SRAM: 13.185 mW (100%)
Power dissipation for proposed SRAM: 17.930 mW (136.8%)
Access time for compared SRAM: 1.969 ns
Access time for proposed SRAM: 2.301 ns (116.8%)
24/25
Conclusion
• Why Timing Control in an SRAM?
– (1) for latch-based sense amplifier enabling
– (2) for pulsed wordline control
– So as to achieve lower power dissipation
• Drawback of Existing Replica-Based Scheme
– Replica simply cannot track every bitline pair
• Proposed Per-Column Timing Tracking
– Adaptive on-the-fly
– More tolerant to process variation
– Suitable for deeper nanometer technologies
25/25