ISLPED_presentation - Robust Low Power VLSI

A Charge Pump Based Receiver Circuit to Reduce Interconnect Power Dissipation
Aatmesh Shrivastava, John Lach, and Benton H. Calhoun
University of Virginia, Charlottesville
International Symposium on Low Power Electronics and Design

Interconnect Power Dissipation
[1] Magen et al., SLIP 2004
• Interconnect consumes >50% of dynamic power in a microprocessor.
• 90% of interconnect power is in 10% of the interconnect.

Interconnect Power in the Context of Next-Generation Computing
• An exabyte of data must be transmitted per second to enable exascale computing.
[2] P. Kogge et al., DARPA/IPTO 2008
• State-of-the-art interconnect consumes 1-3 pJ/bit/mm. An exabyte/s will need 10-30 MW of power. [2]

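The power figure comes from multiplying the data rate by the energy per bit per millimetre. The sketch below reproduces that arithmetic under two assumptions that are not stated on the slide: 8 bits per byte, and an average wire length of 1 mm per bit (`wire_mm` is a hypothetical parameter).

```python
def interconnect_power_watts(bytes_per_s, pj_per_bit_mm, wire_mm=1.0):
    """Power needed to move bytes_per_s over wire_mm of interconnect.

    wire_mm is an assumed average distance per bit; the slide does not give it.
    """
    bits_per_s = bytes_per_s * 8
    joules_per_bit = pj_per_bit_mm * 1e-12 * wire_mm
    return bits_per_s * joules_per_bit

EXABYTE_PER_S = 1e18
for pj in (1.0, 3.0):
    megawatts = interconnect_power_watts(EXABYTE_PER_S, pj) / 1e6
    print(f"{pj} pJ/bit/mm -> {megawatts:.0f} MW")
```

Under these assumptions the result is 8 to 24 MW, the same order of magnitude as the 10-30 MW quoted from [2]. A longer assumed wire length scales the result linearly.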
Outline
• Voltage Scaling for Interconnect
– Driver
– Receiver
• Literature Review
• Proposed Interconnect Receiver
– Charge Pump
– Complete circuit diagram
– Simulation
• Implementing the interconnect in a 4-core Alpha
• Results
• Design Comparison

Voltage Scaling for Interconnects
• Voltage scaling has been used to reduce interconnect power [4-10].
• Logic runs at the rated VDD and wires at a reduced VDDI, so interconnect driver circuits are needed.
• Key question: performance overhead vs. power.

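As a first-order illustration of why a reduced wire supply saves power, the sketch below uses the textbook estimate that charging and discharging a capacitance C from a supply V draws C·V² per cycle. The 1 pF wire capacitance is an assumed value for illustration. The estimate ignores driver and receiver overhead, so it is not the normalized energy reported in the comparison tables of this talk.

```python
def wire_energy_joules(c_farads, v_supply):
    """Energy drawn from v_supply for one charge/discharge cycle of C."""
    return c_farads * v_supply ** 2

C_WIRE = 1e-12        # assumed 1 pF wire capacitance (illustrative only)
VDD, VDDI = 1.0, 0.3  # logic supply and reduced wire supply from the slides

full = wire_energy_joules(C_WIRE, VDD)
low = wire_energy_joules(C_WIRE, VDDI)
print(f"full swing: {full * 1e15:.0f} fJ")
print(f"low swing:  {low * 1e15:.0f} fJ")
print(f"ratio:      {low / full:.2f}")
```

The wire-only ratio is the square of the swing ratio, which is why the performance and power cost of the driver and receiver becomes the key question.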
Interconnect Driver
[4] H. Zhang et al., TVLSI 2000
• Two NMOS transistors are used at the output stage.
• A signal at logic level (1 V) is converted to a signal at interconnect level (0.3 V).
• We use this driver in our proposed interconnect circuit.

Interconnect Receiver
[Figure: receiver schematic; labels: VDDI, 0, ON, OFF]
• Restores the signal to the logic level. Performance is poor and requires VDDI > VT.
• Differential amplifiers [8-10] can be used for better performance but have higher power overhead.
• We propose an improved single-ended receiver.

Approx. Power-Performance-Area: Prior Art
Schemes                B/W (GHz)   Swing (V)   Normalized Energy
Basic (no scaling)     >1          1           1
Single-ended [4,5,7]   <0.25       0.6         0.6
Differential [8-10]    >1          0.05        0.8
Capacitive [6]         <0.25       0.05        0.2
• In prior art, either the energy saving is small or the performance is poor.

Delay vs. Energy/bit: Prior Art
• Existing solutions do not address power and performance in conjunction.

Proposed Receiver Circuit
• A charge pump is used.
• It boosts the signal to three times the interconnect swing.
• Good performance and much lower power.

Charge Pump in the Receiver
• When IN is at 0 V, A is precharged to 0.3 V, so when IN goes high, A goes to 0.6 V (ideal case).
• Similarly, when IN is at 0.3 V, A is precharged to 0 V, so when IN goes low, A goes to -0.3 V (ideal case).
• Total swing at A is 0.9 V. C swings from VT to VDD-VT.

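The ideal node voltages above follow from simple bookkeeping: node A is precharged to the rail opposite to IN, and the input step is then coupled onto it. The sketch below works through both edges. The `coupling` factor is a hypothetical parameter (1.0 reproduces the ideal case; a value below 1.0 would model attenuation from parasitic capacitance on A, which the slides do not quantify).

```python
VDDI = 0.3  # interconnect swing from the slides

def node_a_after_edge(in_before, in_after, coupling=1.0):
    """Voltage at A after IN switches from in_before to in_after.

    A is precharged to the rail opposite to IN (0.3 V when IN is 0 V,
    0 V when IN is 0.3 V); the input step is then added, scaled by coupling.
    """
    precharge = VDDI - in_before
    return precharge + coupling * (in_after - in_before)

a_high = node_a_after_edge(0.0, VDDI)  # rising edge on IN
a_low = node_a_after_edge(VDDI, 0.0)   # falling edge on IN
print(f"A high: {a_high:.1f} V, A low: {a_low:.1f} V, swing: {a_high - a_low:.1f} V")
```

With ideal coupling the swing at A is 0.9 V, three times the 0.3 V interconnect swing, which matches the boost claimed for the proposed receiver.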
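Complete Circuit Diagram
[Figure: complete receiver schematic with a charge pump and a pulse generator. Recoverable labels: supplies VDD=1V and VDDI=0.3V; nodes IN, A, B, C, and OUT; signals φ1 and φ2 with Delay blocks; capacitors CCH and CCL; transistors MP1, MP2, MP3, MN1, MN3, MN4, MN5, and MNX; a weak keeper; HVT and LVT device markings; annotated levels VTL, VDD-VTL, VTL+0.3, 1V, 0.6, 0.3, 0, and -0.3.]
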
Simulation Results
[Figure: simulated waveforms of IN and OUT]
• The reduced-swing interconnect signal is reconstructed with good performance.

Delay vs. Energy/bit
• The proposed solution gives very good performance and very low energy.

Energy Savings in a Processor
• The data bus of the Alpha was implemented using the differential, basic, and proposed interconnect circuits.
• Over the set of SPLASH benchmarks, the proposed interconnect saves up to 70% of energy.

PPA: Power-Performance-Area
Schemes                B/W (GHz)   Swing (V)   Norm. Energy   Area of 1 repeater
Basic                  >1          1           1              2X
Single Ended [4,5,7]   <0.25       0.6         0.6            15-24X
Differential [8-10]    >1          0.05        0.8            100-250X
Capacitive [6]         <0.25       0.05        0.2            NA
This Work              >1          0.3         0.3            22X
• The novel interconnect circuit has best-in-class PPA.

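To read the PPA comparison as savings relative to the basic full-swing scheme, the sketch below converts each normalized energy into a percentage reduction and keeps only the schemes listed with bandwidth above 1 GHz. The dictionary layout and the boolean bandwidth flag are illustrative choices, not part of the slides.

```python
# (normalized energy, bandwidth listed as >1 GHz), values from the PPA comparison
schemes = {
    "Basic": (1.0, True),
    "Single Ended": (0.6, False),
    "Differential": (0.8, True),
    "Capacitive": (0.2, False),
    "This Work": (0.3, True),
}

def saving_percent(norm_energy):
    """Energy saving relative to the basic scheme, in whole percent."""
    return round((1.0 - norm_energy) * 100)

fast = {name: saving_percent(e) for name, (e, is_fast) in schemes.items() if is_fast}
best = max(fast, key=fast.get)
print(fast, "->", best)
```

Among the schemes that sustain more than 1 GHz, the proposed receiver shows the largest saving, and its normalized energy of 0.3 is in line with the "up to 70%" saving reported for the Alpha data bus.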
Thank You
References
1. Nir Magen et al., “Interconnect-Power Dissipation in a Microprocessor,” Workshop on System Level Interconnect Prediction, 2004.
2. P. Kogge, K. Bergman, S. Borkar, et al., “ExaScale Computing Study: Technology Challenges in Achieving Exascale Systems,” DARPA/IPTO, September 2008.
3. E. Kusse and J.M. Rabaey, “Low-Energy Embedded FPGA Structures,” IEEE International Symposium on Low Power Electronics and Design, August 1998.
4. H. Zhang, V. George and J.M. Rabaey, “Low-Swing On-Chip Signalling Techniques: Effectiveness and Robustness,” IEEE Transactions on Very Large Scale Integration (VLSI) Systems, Vol. 8, No. 3, June 2000.
5. J.C.G. Montesdeoca, J.A. Montiel-Nelson and S. Nooshabadi, “CMOS Driver Receiver Pair for Low Swing Signalling for Low Energy On-chip Interconnects,” IEEE Transactions on Very Large Scale Integration (VLSI) Systems, Vol. 17, No. 2, February 2009.
6. R. Ho, I. Ono, F. Liu, A. Chow, J. Schauer and R. Drost, “High Speed and Low Energy capacitively driven wires,” IEEE International Solid State Circuits Conference, February 2007.
7. M. Ferretti and P.A. Beerel, “Low Swing Signaling Using a Dynamic Diode-Connected Driver,” European Solid-State Circuits Conference, September 2001.
8. A. Narasimhan, M. Kasotiya and R. Sridhar, “A Low-Swing Differential Signaling Scheme for On-Chip Global Interconnects,” International Conference on VLSI Design, January 2005.
9. N. Tzartzanis and W.W. Walker, “Differential Current Mode Sensing for Efficient On-Chip Global Signaling,” IEEE Journal of Solid State Circuits, Vol. 40, No. 11, November 2005.
10. H. Ito, M. Kimura, K. Miyashita, T. Ishii, K. Okada and K. Masu, “A Bidirectional and Multidrop Transmission Line Interconnect for Multipoint to Multipoint On-Chip Communication,” IEEE Journal of Solid State Circuits, Vol. 43, No. 4, April 2008.
11. V. Adler and E.G. Friedman, “Repeater Design to Reduce Delay and Power in Resistive Interconnects,” IEEE Transactions on Circuits and Systems-II, Vol. 45, No. 5, May 1998.
12. P.E. Allen and D.R. Holberg, “CMOS Analog Circuit Design,” Oxford Press, 2002.
13. R.E. Kessler, E.J. McLellan and D.A. Webb, “The Alpha 21264 Microprocessor Architecture,” International Conference on Computer Design, October 1998.
14. N.L. Binkert, R.G. Dreslinski, L.R. Hsu, K.T. Lim, A.G. Saidi and S.K. Reinhardt, “The M5 Simulator: Modeling Networked Systems,” IEEE Micro, July 2006.

Back Up
Complete Circuit Diagram
[Figure: timing waveforms for IN, A, B, OUT, C, φ1, and φ2. Annotated levels: -0.3V, 0V, 0.3V, 0.6V, 1V, VT, and 1V-VT. The interval TCRIT is marked.]
• When IN goes high, A goes to 0.6V, bringing B to ground.
• OUT goes high, completing the transition.
• It also brings C to VDD-VT and precharges A to 0.6V.

Graph of A
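Initial Condition
[Figure: a) RESET implementation in the receiver circuit of Figure 9; b) RESET implementation in the bus. Recoverable labels: VDD=1V and VDDI=0.3V; nodes A, B, and C; transistors MP1, MP2, MN1, and MNR; HVT and LVT device markings; RESET signal; receivers (Rx) on BUS<0>, BUS<1>, and BUS<N>.]
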
Static Current
[Figure: a) First stage of the receiver, which has high leakage; b) Monte Carlo result of leakage for the B=0 and B=1 cases. Recoverable labels: VDD=1V; nodes A, B, and C; transistors MP1, MP2, MN1, and MNX with HVT markings; a weak keeper; levels VTL, VDD-VTL, 0.6V, 0.3V, 0V, and -0.3V; axis ticks at 100nA and 1µA; distribution means of 66nA and 125nA.]

Voltage Sensitivity
[Figure: a) Leakage at VDDI=0.35V (mean = 316 nA; axis ticks at 100nA and 1µA); b) Propagation delay (IN→OUT) at VDDI=0.25V (mean = 165 ps).]