A Survey of DDR4 SDRAM Design Improvement Methods

16 January 2014
Edmund Leong 梁文禎
0260814
NCTU Memory Systems IEE5011 FALL 2013
Overview
• Introduction - DDR4 Specifications
• A Far End Cross Talk Cancellation Method
• Driver Design
• A Low Jitter DLL Design
• Fast Parallel CRC and DBI Calculation Method
• Conclusion
DDR4 Specifications (1/3)
• Dynamic power: $P = \alpha C_L V_{DD}^2 f$
• DDR → DDR4
• f ↑ 8x
• VDD ↓ 2.75x
• Based on this simplified equation, power consumption is still increasing.
• Other methods are introduced to reduce power consumption.
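As a rough check of the scaling argument above, the simplified equation can be evaluated with the two ratios quoted on this slide. This is a minimal sketch: it assumes the activity factor α and the load capacitance C_L are unchanged between generations, and the function name is hypothetical.

```python
def relative_dynamic_power(freq_scale_up, vdd_scale_down):
    """Ratio P_new / P_old from P = alpha * C_L * VDD^2 * f.

    Assumes alpha and C_L stay constant (an assumption, not from the slide).
    """
    return freq_scale_up / vdd_scale_down ** 2

# Ratios quoted on the slide: f up 8x, VDD down 2.75x.
ratio = relative_dynamic_power(8.0, 2.75)
print(f"P_DDR4 / P_DDR = {ratio:.3f}")
```

Under these assumptions the ratio stays slightly above 1, which is consistent with the slide's point that voltage scaling alone does not offset the frequency increase.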
DDR4 Specifications (2/3)
• Change from center-tapped termination (CTT)/SSTL to pseudo open drain (POD)
• POD terminates to VDDQ, which reduces current through the VDD-to-GND path: no static termination current flows when DQ is at logic high.
DDR4 Specifications (3/3)
• Data Bus Inversion (DBI)
• 2.5V Vpp for word lines
• CRC protection
• CA parity error detection
• Point to point topology
Far End Crosstalk Cancellation
Method (1/4)
• Crosstalk cancellation methods:
• Circuit implementation
• Wider spacing between signal traces
• Via stub capacitance
Far End Crosstalk Cancellation
Method (2/4)
• Far end crosstalk can be reduced by using via stubs
• Inter Symbol Interference (ISI) is not affected
• Resonant frequency is over 10GHz
$$V_{FEXT}(t) = \frac{t_{flight}}{2}\left(\frac{C_m}{C} - \frac{L_m}{L}\right)\frac{dV_{agg}(t - t_{flight})}{dt}$$
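The far-end crosstalk expression can be explored numerically. The sketch below assumes a linear-ramp aggressor edge (so dV_agg/dt = swing / rise time) and uses made-up coupling values; only the formula itself comes from the slide. It illustrates why adding mutual capacitance, for example from via stubs, can cancel FEXT when inductive coupling dominates: the term (C_m/C − L_m/L) is driven toward zero.

```python
def fext_peak(t_flight, c_m, c_self, l_m, l_self, v_swing, t_rise):
    """Peak far-end crosstalk for a linear-ramp aggressor edge.

    V_FEXT = (t_flight / 2) * (C_m/C - L_m/L) * dV_agg/dt,
    with dV_agg/dt approximated as v_swing / t_rise (an assumption).
    """
    coupling = c_m / c_self - l_m / l_self
    return 0.5 * t_flight * coupling * (v_swing / t_rise)

# Hypothetical values, chosen only for illustration.
T_FLIGHT = 0.5e-9   # s
V_SWING = 1.2       # V
T_RISE = 100e-12    # s

# Inductive coupling dominates: L_m/L = 0.10, C_m/C = 0.05.
unbalanced = fext_peak(T_FLIGHT, 5.0, 100.0, 30.0, 300.0, V_SWING, T_RISE)
# Extra mutual capacitance raises C_m/C to 0.10, matching L_m/L.
balanced = fext_peak(T_FLIGHT, 10.0, 100.0, 30.0, 300.0, V_SWING, T_RISE)

print(f"unbalanced FEXT peak: {unbalanced * 1e3:.1f} mV")
print(f"balanced FEXT peak:   {balanced * 1e3:.1f} mV")
```

The sign of the unbalanced result is negative because the inductive term is larger than the capacitive one in this example.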
Far End Crosstalk Cancellation
Method (3/4)
• 8-port S-parameters are measured up to 20 GHz with a vector network analyzer
• Resonance caused by the stub starts around 15 GHz
Far End Crosstalk Cancellation
Method (4/4)
Driver Design (1/5)
• Type 0 – standard termination
• Type I – switched termination pre-emphasis
• Type II – constant termination de-emphasis
Driver Design (2/5)
• Type 0 – standard termination
• R2 is always open
• Always driving with RS termination
• No boost to high frequency content
Driver Design (3/5)
• Type I – switched termination pre-emphasis
• R2 termination is active only during a transition bit
• Termination during a transition bit is Rs || R2.
• Termination during a non-transition bit is Rs only.
• The level of pre-emphasis is controlled by Rs and R2
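The two Type I termination states can be compared with a short calculation. This is only a sketch: Rs = 40 Ω matches the best Rs value reported later in the deck, while R2 = 120 Ω is a hypothetical value chosen for illustration.

```python
def parallel(r_a, r_b):
    """Equivalent resistance of two resistors in parallel."""
    return r_a * r_b / (r_a + r_b)

RS_OHM = 40.0    # series termination (value taken from the results slide)
R2_OHM = 120.0   # hypothetical pre-emphasis leg

transition_ohm = parallel(RS_OHM, R2_OHM)   # Rs || R2 during a transition bit
non_transition_ohm = RS_OHM                 # Rs only otherwise

print(f"transition bit:     {transition_ohm:.1f} ohm")
print(f"non-transition bit: {non_transition_ohm:.1f} ohm")
```

A smaller R2 lowers the transition-bit impedance further, which is one way to read the statement that Rs and R2 together set the pre-emphasis level.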
Driver Design (4/5)
• Type II – constant termination de-emphasis
• R1 and R2 are connected in series, forming a network whose Thevenin equivalent is Rs.
• A transition bit is driven by Rs.
• A non-transition bit is driven by the R1-R2 network
Driver Design (5/5)
Simulation of DDR4 2400 MT/s, 1 DIMM per channel

Driver and Termination | Best termination value, Rs – Rt (Ω) | Best eye width (ps)
Type 0, VDDQT          | 40 – 60                             | 187
Type I, VDDQT          | 40 – 60                             | 187
Type II, VDDQT         | 40 – 120                            | 232
Type II, CTT           | 40 – 120                            | 228

• With optimized resistor values, the choice between VDDQT and CTT termination has minimal effect on the performance of the net
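To put the eye widths in context, they can be expressed as a fraction of the unit interval (UI) at 2400 MT/s. The sketch below uses only the numbers from the results table; the variable names are illustrative.

```python
DATA_RATE_MTPS = 2400.0
UI_PS = 1e6 / DATA_RATE_MTPS   # one unit interval in picoseconds

eye_width_ps = {
    "Type 0, VDDQT": 187.0,
    "Type I, VDDQT": 187.0,
    "Type II, VDDQT": 232.0,
    "Type II, CTT": 228.0,
}

# Eye width as a fraction of the unit interval.
eye_fraction = {name: width / UI_PS for name, width in eye_width_ps.items()}
# Relative gain of Type II (VDDQT) over Type 0.
gain = (eye_width_ps["Type II, VDDQT"] - eye_width_ps["Type 0, VDDQT"]) / eye_width_ps["Type 0, VDDQT"]

print(f"UI = {UI_PS:.1f} ps")
for name, frac in eye_fraction.items():
    print(f"{name}: {frac:.1%} of UI")
print(f"Type II vs Type 0 eye-width gain: {gain:.1%}")
```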
Low Jitter DLL Design (1/4)
Low Jitter DLL Design (2/4)
• Conventional Charge Pump
$$I_D = \frac{1}{2}\mu_{n,p} C_{ox}\frac{W}{L}\left(V_{GS} - V_{TH}\right)^2\left(1 + \lambda V_{DS}\right)$$
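The (1 + λV_DS) term makes each charge-pump current depend on the output (control) voltage. The sketch below shows one common reading of why this matters: the UP and DOWN currents only match at one control voltage. This interpretation, the device parameters, and the assumption that both devices share the same gain factor and overdrive are illustrative and not taken from the slide.

```python
def drain_current(k, v_ov, lam, v_ds):
    """Saturation current with channel-length modulation.

    k stands for mu * C_ox * W / L and v_ov for V_GS - V_TH.
    """
    return 0.5 * k * v_ov ** 2 * (1.0 + lam * v_ds)

def up_down_mismatch(v_ctrl, vdd=1.2, k=2e-3, v_ov=0.2, lam=0.1):
    """I_up - I_down for a simple charge pump (all values hypothetical)."""
    i_down = drain_current(k, v_ov, lam, v_ctrl)        # NMOS sink: V_DS = V_ctrl
    i_up = drain_current(k, v_ov, lam, vdd - v_ctrl)    # PMOS source: |V_DS| = VDD - V_ctrl
    return i_up - i_down

for v_ctrl in (0.3, 0.6, 0.9):
    print(f"V_ctrl = {v_ctrl:.1f} V -> mismatch = {up_down_mismatch(v_ctrl) * 1e6:+.2f} uA")
```

In this simplified model the mismatch is zero only at mid-rail and changes sign on either side of it.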
Low Jitter DLL Design (3/4)
• New Charge Pump design
Low Jitter DLL Design (4/4)
                          | TVLSI’10      | JSSC’11         | Proposed’13
Process (nm)              | 54            | 130             | 90
DRAM Interface            | GDDR3         | DDR             | DDR4
Supply (V)                | 1.8           | 1.2             | 1.2
DLL Type                  | ADDLL         | ADDLL           | ADDLL
Frequency (GHz)           | 1.4           | 0.11-1.4        | 1.6
Peak-to-peak jitter (ps)  | 29 @ 1.4 GHz  | 15.11 @ 1.4 GHz | 12.33 @ 1.6 GHz
Power (mW)                | 29.5 @ 1 GHz  | 74.4 @ 1.4 GHz  | 33.6 @ 1.6 GHz
Area (mm²)                | 0.11          | 0.387           | 0.047

Proposed – Design and Diagnostics of Electronic Circuits & Systems (DDECS), IEEE International Symposium 2013
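Because the three designs are measured at different clock frequencies, it can help to normalize the peak-to-peak jitter to the clock period. The sketch below uses only the jitter and frequency figures from the comparison table; the normalization itself is an added illustration, not something reported in the deck.

```python
def jitter_fraction(jitter_ps, freq_ghz):
    """Peak-to-peak jitter as a fraction of one clock period."""
    period_ps = 1000.0 / freq_ghz
    return jitter_ps / period_ps

# (label, peak-to-peak jitter in ps, measurement frequency in GHz)
designs = [
    ("TVLSI'10", 29.0, 1.4),
    ("JSSC'11", 15.11, 1.4),
    ("Proposed'13", 12.33, 1.6),
]

normalized = {name: jitter_fraction(j, f) for name, j, f in designs}
for name, frac in normalized.items():
    print(f"{name}: {frac:.2%} of the clock period")
```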
Fast Parallel CRC and DBI
Calculation Method (1/7)
• DDR4 introduces a CRC based on the ATM-8 HEC polynomial
• The CRC calculation is based on DBI-inverted data
• DDR4 adds the CRC value at the end of the data burst
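For reference, the ATM-8 HEC polynomial is x^8 + x^2 + x + 1 (0x07). The sketch below is a generic bit-serial, MSB-first CRC-8 with that polynomial. It does not reproduce the DDR4 bit-to-burst mapping, which the standard defines separately, so treat it as an illustration of the polynomial only; the function name is hypothetical.

```python
def crc8_atm(data: bytes, poly: int = 0x07, init: int = 0x00) -> int:
    """Bit-serial CRC-8 using x^8 + x^2 + x + 1, MSB first, no reflection."""
    crc = init
    for byte in data:
        crc ^= byte
        for _ in range(8):
            if crc & 0x80:
                # Top bit set: shift out and subtract (XOR) the polynomial.
                crc = ((crc << 1) ^ poly) & 0xFF
            else:
                crc = (crc << 1) & 0xFF
    return crc

print(hex(crc8_atm(b"123456789")))
```

A serial loop like this processes one bit per step, which is the kind of latency the parallel XOR-tree approach in this section is meant to avoid.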
Fast Parallel CRC and DBI
Calculation Method (2/7)
• CLmin = tCore + Max(0, tCRC – tPrep) + tAlign
• tCalc + Flight time 1 + Flight time 2 < 4nCK
• In 3.2 Gbps DDR4, the calculation time constraint is about 1.2 ns
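The timing budget above can be worked through numerically. The sketch assumes a double-data-rate interface (two transfers per clock), so 3.2 Gbps per pin corresponds to a 0.625 ns clock period. The split between calculation time and flight time is inferred from the slide's 1.2 ns figure, and the values passed to `cl_min` are hypothetical.

```python
def tck_ns(data_rate_gbps):
    """Clock period in ns for a double-data-rate pin."""
    return 2.0 / data_rate_gbps

def cl_min(t_core, t_crc, t_prep, t_align):
    """CLmin = tCore + Max(0, tCRC - tPrep) + tAlign (all in the same unit)."""
    return t_core + max(0.0, t_crc - t_prep) + t_align

t_ck = tck_ns(3.2)
window_ns = 4 * t_ck                  # tCalc + flight time 1 + flight time 2 < 4 nCK
flight_budget_ns = window_ns - 1.2    # what remains if tCalc is about 1.2 ns (inferred)

print(f"tCK = {t_ck:.3f} ns, 4 nCK = {window_ns:.2f} ns")
print(f"implied flight-time budget = {flight_budget_ns:.2f} ns")

# Hypothetical timings: CRC only adds latency when it is slower than tPrep.
print(cl_min(t_core=10.0, t_crc=1.2, t_prep=2.0, t_align=1.0))
print(cl_min(t_core=10.0, t_crc=3.0, t_prep=2.0, t_align=1.0))
```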
Fast Parallel CRC and DBI
Calculation Method (3/7)
• (a) has internal nodes that do not swing full rail.
• Swing is limited to Vdd − Vth
• (c) has internal nodes with full rail swing
• An inverter is inserted to prevent a long chain of transmission gates
Fast Parallel CRC and DBI
Calculation Method (4/7)
• DBI is activated when more than half of the DQ bits are 0.
• The inputs to each CRC calculation are determined by bit mapping (e.g. the gray boxes).
• Serial DBI and CRC calculations are too inefficient
• A parallel method is needed
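The DBI rule in the first bullet can be sketched for one byte lane as follows. The function name is hypothetical, and the sketch assumes 8 DQ bits per lane with an active-low DBI# flag (0 means the data were inverted).

```python
def dbi_encode(byte):
    """Apply data bus inversion to one 8-bit DQ group.

    Returns (transmitted_byte, dbi_n), where dbi_n == 0 means inverted.
    """
    byte &= 0xFF
    zeros = 8 - bin(byte).count("1")
    if zeros > 4:                    # more than half of the DQ bits are 0
        return (~byte) & 0xFF, 0
    return byte, 1

for value in (0x00, 0x07, 0x0F, 0xFF):
    sent, dbi_n = dbi_encode(value)
    print(f"data={value:#04x} -> sent={sent:#04x}, DBI#={dbi_n}")
```

After encoding, a transmitted byte never contains more than four zeros, which is the point of the scheme for a POD interface where driving a 0 costs termination current.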
Fast Parallel CRC and DBI
Calculation Method (5/7)
• CRC starts with all DBI bits = 0
• For each CRC[i], information needed for post-processing the CRC+DBI correction:
• Inclusion of DBI#[k] in CRC[i] (self)
• Oddness of the DQ bits associated with burst k and CRC[i] (Odd/Even)
• Actual DBI#[k]
• D[k] = self’ * Odd * DBI#[k]’ + self * Even * DBI#[k] + self * Odd
• CRC_new[i] = CRC[i] xor D[0] xor … xor D[7]
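The correction term D[k] can be checked exhaustively for a single burst position. The sketch below reads the first term of the formula with DBI#[k]' and models one CRC bit as the XOR of a chosen subset of the 8 DQ bits (a hypothetical `mask`), optionally including the DBI# bit itself (`self_in`). Both modelling choices are assumptions made for illustration. It compares the direct result on DBI-encoded data against the precomputed parity on raw data plus the correction.

```python
from itertools import product

def parity(x):
    """XOR of all bits of x."""
    return bin(x).count("1") & 1

def correction(self_in, odd, dbi_n):
    """D[k] = self' * Odd * DBI#[k]' + self * Even * DBI#[k] + self * Odd."""
    even = 1 - odd
    return ((1 - self_in) & odd & (1 - dbi_n)) | (self_in & even & dbi_n) | (self_in & odd)

def formula_matches_direct():
    for mask, self_in, data in product(range(256), (0, 1), range(256)):
        zeros = 8 - bin(data).count("1")
        dbi_n = 0 if zeros > 4 else 1              # active-low DBI flag
        sent = data ^ 0xFF if dbi_n == 0 else data
        # Direct: XOR over the transmitted bits, plus DBI# if it is an input.
        direct = parity(sent & mask) ^ (self_in & dbi_n)
        # Parallel: XOR over raw data with DBI# assumed 0, then corrected.
        pre = parity(data & mask)
        odd = parity(mask)
        if direct != pre ^ correction(self_in, odd, dbi_n):
            return False
    return True

print(formula_matches_direct())
```

The check works because inverting a burst flips the parity only when an odd number of its DQ bits feed CRC[i], and a DBI# value of 1 flips it only when DBI#[k] is itself an input.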
Fast Parallel CRC and DBI
Calculation Method (6/7)
• DBI#[k] inputs into the third stage of the XOR tree
• Critical path is one tXOR more than the XOR tree

Stage | Input Slots | CRC[i] input | Empty Slots
1     | 64          | 37           | 27
2     | 32          | 19           | 13
3     | 16          | 10           | 6
4     | 8           | 5            | 3
5     | 4           | 3            | 1
6     | 2           | 2            | 0

[Figure: XOR tree producing CRC_new[i]]
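The slot counts in the stage table follow from a balanced binary XOR tree with 64 leaf slots, of which 37 carry CRC[i] inputs. The sketch below reproduces the occupancy per stage under that assumption; the function name is hypothetical.

```python
def stage_occupancy(n_inputs, n_slots=64):
    """Per-stage (stage, slots, used, empty) for a binary XOR tree.

    Each stage halves the slot count; used inputs are paired, so the
    next stage carries ceil(used / 2) signals.
    """
    rows = []
    stage = 1
    while n_slots >= 2:
        rows.append((stage, n_slots, n_inputs, n_slots - n_inputs))
        n_inputs = (n_inputs + 1) // 2
        n_slots //= 2
        stage += 1
    return rows

table = stage_occupancy(37)
for stage, slots, used, empty in table:
    print(f"stage {stage}: slots={slots:2d} used={used:2d} empty={empty:2d}")
```

The empty slots are where additional signals such as the DBI#[k] corrections can be merged without adding tree depth, which fits the statement that the critical path grows by only one tXOR.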
Fast Parallel CRC and DBI
Calculation Method (7/7)
Conclusion
• The DDR4 specifications require very high speeds, which places importance on signal integrity
• Transmission line theory is important for impedance matching in termination to reduce reflections and crosstalk.
• Crosstalk can be minimized with closely placed via stubs
• Driver design with constant termination de-emphasis can widen the eye diagram
• A good DLL design is needed to reduce jitter
• Parallel CRC + DBI calculations can relax speed constraints
References
• E. Desjardins (2012, Sept. 12). JEDEC Announces Publication of DDR4 Standard [Online]. Available: http://www.jedec.org/news/pressreleases/jedec-announces-publication-ddr4-standard
• DDR4 SDRAM, JEDEC standard JESD79-4, Sept. 2012.
• D. Wang (2013, Dec. 3). Why migrate to DDR4? [Online]. Available: http://www.eetimes.com/document.asp?doc_id=1280577
• H. Goto (2010, Aug. 16). Towards Next-Generation 4Gbps DDR4 Memory [Online]. Available: http://pc.watch.impress.co.jp/docs/column/kaigai/20100816_387444.html
• C-M. Nieh, J. Park, “Far-end Crosstalk Cancellation using Via Stub for DDR4 Memory Channel,” in IEEE 63rd Electronic Components and Technology Conference (ECTC), pp. 2035-2040, 2013.
• N. Pham, D. Dreps, R. Mandrekar, N. Na, “Driver Design for DDR4 Memory Subsystems,” in IEEE 19th Electrical Performance of Electronic Packaging and Systems (EPEPS), pp. 297-300, 2010.
• Y-H. Tu, K-H. Cheng, H-Y. Wei, H-Y. Huang, “A Low Jitter Delay-Locked-Loop Applied for DDR4,” in IEEE 16th Design and Diagnostics of Electronic Circuits and Systems (DDECS), pp. 98-101, 2013.
• H-H. Chang, J-Y. Chang, C-Y. Kuo, S-I. Liu, “A 0.7-2GHz Self-Calibrated Multiphase Delay-Locked Loop,” IEEE Journal of Solid-State Circuits, vol. 41, no. 5, May 2006.
• W-J. Yun, H-W. Lee, D. Shin, S. Kim, “A 3.57 Gbps Low Jitter All Digital DLL with Dual DCC Circuit for GDDR3 DRAM in 54nm CMOS Technology,” in IEEE Trans. on VLSI, 2010.
• Y-S. Kim, S-K. Lee, H-J. Park, J-Y. Sim, “A 110MHz to 1.4GHz locking 40-phase all-digital DLL,” in IEEE Journal of Solid-State Circuits, vol. 46, no. 2, pp. 435-444, Feb. 2011.
• J. Moon, J. S. Kih, “Fast Parallel CRC & DBI Calculation for High-speed Memories: GDDR5 and DDR4,” in IEEE International Symposium on Circuits and Systems (ISCAS), pp. 317-320, 2011.
• K. Lin, C. Wu, “A Low-cost Realization of Multiple-input Exclusive-OR gates,” ASIC Conference and Exhibit, Proceedings of the 8th Annual IEEE International, pp. 307-310, Sept. 1995.