Advanced VLSI Design - lecture_18


EE 587 SoC Design & Test

Partha Pande, School of EECS, Washington State University


Power & Low Power Design Physical Design Methodologies


Metric 1 : Power

• Ref. 5.9 of HJS
• If we improve a design's power but slow the circuit down, the change might not be acceptable
• Comparing the power of two designs can be misleading
  – the lower-power design might just be slower

Metric 2 : Energy / Operation

• Rather than looking at power, look at the total energy needed to complete some operation.
• This fixes the obvious problem with the power metric, since changing the operating frequency does not change the answer

Metric 3 : EDP

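The three metrics can be compared on a small numerical example. The sketch below takes EDP to be the energy-delay product (energy per operation multiplied by the delay of one operation). The two designs, their names and their numbers are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass
class Design:
    """A hypothetical design described by its average power and operation delay."""
    name: str
    power_w: float   # average power while computing (watts)
    delay_s: float   # time to complete one operation (seconds)

    @property
    def energy_per_op(self) -> float:
        # Metric 2: energy needed to complete one operation.
        return self.power_w * self.delay_s

    @property
    def edp(self) -> float:
        # Metric 3: energy-delay product.
        return self.energy_per_op * self.delay_s


# Illustrative numbers only: "slow" uses half the power but takes twice as long.
fast = Design("fast", power_w=2.0e-3, delay_s=1.0e-9)
slow = Design("slow", power_w=1.0e-3, delay_s=2.0e-9)

for d in (fast, slow):
    print(f"{d.name}: P={d.power_w:.1e} W  E/op={d.energy_per_op:.1e} J  EDP={d.edp:.1e} J*s")
```

Under these assumed numbers the slow design looks better on power alone, both designs tie on energy per operation, and the fast design wins on EDP.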

Energy vs. Delay


Technology Optimization

• Energy per transition is proportional to $V_{dd}^2$
• When the supply voltage approaches the threshold voltage, delay increases significantly
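A rough numerical sketch of this trade-off follows. The quadratic energy dependence is the one stated above. The delay expression is an assumption that the slides do not give: the commonly used alpha-power-law approximation, delay ∝ Vdd / (Vdd − Vt)^α. The values Vt = 0.4 V and α = 1.3 are illustrative only.

```python
def relative_energy(vdd: float, vdd_ref: float) -> float:
    """Energy per transition relative to the reference supply (proportional to Vdd^2)."""
    return (vdd / vdd_ref) ** 2


def relative_delay(vdd: float, vdd_ref: float, vt: float = 0.4, alpha: float = 1.3) -> float:
    """Delay relative to the reference supply.

    Assumed model (not from the slides): delay ~ Vdd / (Vdd - Vt)^alpha.
    """
    if vdd <= vt or vdd_ref <= vt:
        raise ValueError("model is only meaningful for supplies above the threshold voltage")

    def delay(v: float) -> float:
        return v / (v - vt) ** alpha

    return delay(vdd) / delay(vdd_ref)


VDD_REF = 2.5  # illustrative nominal supply (volts)
for vdd in (2.5, 1.25, 0.8, 0.5):
    print(f"Vdd={vdd:4.2f} V  energy x{relative_energy(vdd, VDD_REF):.3f}  "
          f"delay x{relative_delay(vdd, VDD_REF):.2f}")
```

With these assumed parameters, halving the supply cuts the energy to one quarter at a moderate delay penalty, while a supply close to the threshold makes the delay grow sharply.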

Technology Optimization

• Modification of the threshold voltage
• The benefit of reducing the threshold voltage together with the supply voltage is offset by an increase in leakage current

Transistor Sizing

• Optimum transistor sizing
• The first stage drives the gate capacitance of the second stage and the parasitic capacitance
• The input gate capacitance of both stages is given by $N C_{ref}$, where $C_{ref}$ represents the gate capacitance of a MOS device with the smallest allowable (W/L)

Transistor Sizing

• When there is no parasitic capacitance contribution (i.e., α = 0), the energy increases linearly with respect to N, and using devices with the smallest (W/L) ratios results in the lowest power.
• At high values of α, when parasitic capacitances begin to dominate over the gate capacitances, the power decreases temporarily with increasing device sizes and then starts to increase, resulting in an optimal value for N.
• The initial decrease in supply voltage achieved from the reduction in delays more than compensates for the increase in capacitance due to increasing N.
• After some point the increase in capacitance dominates the achievable reduction in voltage, since the incremental speed increase with transistor sizing is very small.
• Minimum-sized devices should be used when the total load capacitance is not dominated by the interconnect.

Power Dissipation in Interconnects

• In the deep-submicron era, interconnect wires (and the associated driver and receiver circuits) are responsible for an ever increasing fraction of the energy consumption of an integrated circuit.

• Most of this increase is due to global wires, such as busses and clock and timing signals.

• More than 90% of the power dissipation of traditional FPGA components (over a wide range of applications) is due to the interconnect

• For gate array and cell library based designs it has been found that the power consumption of wires and clock signals can be up to 40% and 50% of the total on-chip power consumption, respectively.

Energy Metric

$$E_{dyn} = (C_w + C_L)\, V_{DD}\, V_{swing}$$
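The energy metric $E_{dyn} = (C_w + C_L) V_{DD} V_{swing}$ can be evaluated directly. The sketch below uses hypothetical capacitance and voltage values. It treats the supply that delivers the charge and the swing on the wire as separate parameters, so a reduced swing drawn from the full supply can be compared with a reduced swing drawn from a separate low-voltage rail (an assumption made here for illustration).

```python
def dynamic_energy(c_wire: float, c_load: float, v_supply: float, v_swing: float) -> float:
    """Energy drawn from the supply to charge (c_wire + c_load) through v_swing."""
    return (c_wire + c_load) * v_supply * v_swing


# Hypothetical values for illustration only.
C_W = 1.0e-12     # wire capacitance (F)
C_L = 0.1e-12     # load capacitance (F)
VDD = 2.5         # full supply (V)
V_LOW = 0.5       # reduced swing (V)

e_full = dynamic_energy(C_W, C_L, VDD, VDD)            # full swing from VDD
e_low_from_vdd = dynamic_energy(C_W, C_L, VDD, V_LOW)  # reduced swing, charge taken from VDD
e_low_rail = dynamic_energy(C_W, C_L, V_LOW, V_LOW)    # reduced swing from a V_LOW rail

print(f"full swing:             {e_full:.3e} J")
print(f"low swing from VDD:     {e_low_from_vdd:.3e} J ({e_low_from_vdd / e_full:.2f}x)")
print(f"low swing from low rail:{e_low_rail:.3e} J ({e_low_rail / e_full:.2f}x)")
```

The saving is linear in the swing when the charge still comes from the full supply, and quadratic when the charge comes from a rail equal to the swing.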

Low-swing Circuits

Conventional Level Converter

• Extra power rail
• Special low-Vt device needed

Dynamically-Enabled Drivers

[Figure: dynamically-enabled driver circuit; signal labels in, EN, EN2, PRE, REF, SA, VDD, CL, out]

• The basic idea is to control the charging/discharging time of the drivers so that a desirable swing on the interconnect is obtained.

• The wire is floating when the driver is disabled

Low Swing Bus

• Power dissipated in an n-bit bus

$$P = n\, f\, C_w\, V_{DD}^2$$

• Increasing the number of switching bits n causes a proportional increase in power dissipation

Low Swing Bus

• The voltage swing can be reduced by using an additional bus wire, called the dummy ground

• This dummy ground is initially discharged to the real ground level and then immediately isolated from the ground.

• The charge of the bus wiring capacitance is discharged to the dummy ground instead of the real ground.

• When n bits of the bus signals switch from "1" to "0", the voltage swing is reduced to

$$V_{swing} = \frac{V_{dd}}{n + 1}$$

[Figure: n switching bus wires (total capacitance $nC_w$) discharging into the dummy ground wire (capacitance $C_w$)]

Low Swing Bus

• The bus power dissipation required to switch n bits of the bus is given as

$$P = n\, f\, C_w\, V_{swing}\, V_{dd} = \frac{n}{n + 1}\, f\, C_w\, V_{dd}^2$$

• The voltage swing is further reduced as the number of switching bits increases
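The conventional and dummy-ground bus can be compared numerically. The sketch below assumes the conventional bus dissipates P = n f C_w V_dd², and that charge sharing between the n switching wires and a dummy ground wire of capacitance C_w gives V_swing = V_dd / (n + 1). The frequency, capacitance and supply values are hypothetical.

```python
def conventional_bus_power(n: int, f: float, c_w: float, vdd: float) -> float:
    """Full-swing bus: power grows in proportion to the number of switching bits."""
    return n * f * c_w * vdd ** 2


def dummy_ground_swing(n: int, vdd: float) -> float:
    """Swing after n wires (n * C_w) share charge with a dummy ground wire (C_w)."""
    return vdd / (n + 1)


def dummy_ground_bus_power(n: int, f: float, c_w: float, vdd: float) -> float:
    """Low-swing bus: P = n * f * C_w * V_swing * V_dd."""
    return n * f * c_w * dummy_ground_swing(n, vdd) * vdd


# Hypothetical operating point for illustration only.
F = 100e6       # switching frequency (Hz)
C_W = 1.0e-12   # capacitance of one bus wire (F)
VDD = 2.5       # supply voltage (V)

for n in (1, 4, 16, 32):
    p_conv = conventional_bus_power(n, F, C_W, VDD)
    p_low = dummy_ground_bus_power(n, F, C_W, VDD)
    print(f"n={n:2d}  swing={dummy_ground_swing(n, VDD):.3f} V  "
          f"conventional={p_conv * 1e3:.3f} mW  dummy ground={p_low * 1e3:.3f} mW")
```

Under these assumptions the low-swing bus power never exceeds f C_w V_dd², the power of a single full-swing wire, no matter how many bits switch.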

SSDLC

[Figure: SSDLC circuit; labels in, in2, out, VDD, CL, nodes A and B, transistors N1, N2, N3, P1, P2, P3]

• Symmetric Source-Follower Driver with Level Converter
• The driver limits the interconnect swing to the range from Vtn to Vdd-Vtn
• Assume that node in2 goes from low to high, i.e., from Vtn to Vdd-Vtn.
• Initially, node A sits at Vtn and node B sits at ground.
• During the transition period, with both N3 and P3 conducting, A and B rise to Vdd-Vtn
• Consequently, N2 is turned on, and out goes low. The feedback transistor P1 pulls A further up to Vdd to cut off P2 completely. in2 and B stay at Vdd-Vtn.

Level Converter with Low-Vt Device

$$\frac{E_{new}}{E_{full}} = \left(\frac{REF}{V_{dd}}\right)^2$$

Gated Clocks

[Figure: A > B comparator, shown without and with clock gating. Labels: Logic Block, CLK, REG, MSB, Gated Clock, Comparator A>B, "For Bits 0-(N-2)", "For Bits 0-(N-1)", Conditionally Switched]
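One plausible reading of the gated-clock comparator is sketched below. It is an interpretation of the figure labels, not a transcription of the circuit: the most significant bits are compared first, and the registers holding the lower bits are clocked only when the MSBs are equal, because otherwise the result A > B is already decided. The function name and the unsigned-integer encoding are assumptions.

```python
def gated_compare(a: int, b: int, n_bits: int) -> tuple[bool, bool]:
    """Return (a > b, lower_bits_clocked) for unsigned n_bits-wide operands.

    Assumed behavior: if the MSBs differ, the result is known from the MSBs
    alone and the lower-bit registers keep their clock gated off.
    """
    msb_a = (a >> (n_bits - 1)) & 1
    msb_b = (b >> (n_bits - 1)) & 1
    if msb_a != msb_b:
        return msb_a > msb_b, False

    # MSBs are equal: the lower bits must be clocked in and compared.
    low_mask = (1 << (n_bits - 1)) - 1
    return (a & low_mask) > (b & low_mask), True


N_BITS = 4
pairs = [(a, b) for a in range(1 << N_BITS) for b in range(1 << N_BITS)]
clocked = sum(1 for a, b in pairs if gated_compare(a, b, N_BITS)[1])
print(f"lower-bit registers clocked in {clocked} of {len(pairs)} input pairs")
```

For uniformly distributed inputs the lower-bit registers and comparator are active in only half of the cycles under this assumed scheme; real savings depend on the input statistics.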

Low Power Through Circuit Design

• Low-Power Logic Styles: CMOS Versus Pass-Transistor Logic, by Zimmermann and Fichtner
• Power savings through proper choice of logic styles
  – Switching capacitance
  – Transition activity
  – Short-circuit currents
• Power dissipation of various logic styles needs to be analyzed

Circuit Design Styles

• Nonclocked Logic
  – CMOS, Pseudo-NMOS, Differential Cascode Voltage Switch (DCVS), Pass-Transistor
• Clocked Logic
  – Domino, Differential Current Switch Logic (DCSL)

Complementary CMOS - Advantages

• Simple monotonic gates can be realized very efficiently with only a few transistors, one signal inversion level, and few circuit nodes
  – Area, power and delay are reduced
• Robustness against voltage scaling and transistor sizing
• Input signals are connected to gate inputs only

Complementary CMOS- Disadvantages

• Large PMOS transistors
  – Area, power and delay increase
• Series transistors in the output stage
  – Weak output driving capability
  – Delay increases

Pseudo-NMOS Logic

• Reduced logic complexity and hence lower capacitance and higher speed
• Ratioed logic, better suited for large fan-in designs
• Static current → power dissipation is high

Performance of Pseudo-nMOS

Size, (W/L)p   Logic 0 voltage   Logic 0 static power   Delay 0 → 1
4              0.693 V           564 μW                 14 ps
2              0.273 V           298 μW                 56 ps
1              0.133 V           160 μW                 123 ps
0.5            0.064 V           80 μW                  268 ps
0.25           0.031 V           41 μW                  569 ps

Source: J. M. Rabaey, A. Chandrakasan and B. Nikolić, Digital Integrated Circuits, Upper Saddle River, New Jersey: Pearson Education, 2003.

Negative Aspects of Pseudo-nMOS

• Output 0 state is ratioed logic.

• Faster gates mean higher static power.

• Low static power means slow gates.
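The speed versus static power trade-off can be checked against the pseudo-nMOS figures quoted from Rabaey et al. The sketch below only restates those five data points (pMOS W/L, logic-0 static power, 0 → 1 delay) and tests the two trends; the helper names are made up for this example.

```python
# (pMOS W/L, logic-0 static power in watts, 0 -> 1 delay in seconds)
PSEUDO_NMOS_ROWS = [
    (4.0, 564e-6, 14e-12),
    (2.0, 298e-6, 56e-12),
    (1.0, 160e-6, 123e-12),
    (0.5, 80e-6, 268e-12),
    (0.25, 41e-6, 569e-12),
]


def strictly_decreasing(values: list[float]) -> bool:
    return all(a > b for a, b in zip(values, values[1:]))


def strictly_increasing(values: list[float]) -> bool:
    return all(a < b for a, b in zip(values, values[1:]))


static_power = [row[1] for row in PSEUDO_NMOS_ROWS]
delay = [row[2] for row in PSEUDO_NMOS_ROWS]

for ratio, power, t in PSEUDO_NMOS_ROWS:
    print(f"(W/L)p={ratio:<5} static power={power * 1e6:4.0f} uW  delay={t * 1e12:4.0f} ps")

# Shrinking the pMOS load lowers static power but slows the 0 -> 1 transition.
print("power falls as the load shrinks:", strictly_decreasing(static_power))
print("delay rises as the load shrinks:", strictly_increasing(delay))
```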


DCVS Logic

• No static power dissipation
• Speed advantage of ratioed logic
• Has larger area and switched capacitance

Pass-Transistor Logic Styles

• One pass-transistor network is sufficient to perform the logic operation
  – Smaller number of transistors, smaller input loads
• Threshold voltage drop
  – Swing restoration circuit required
• Multiplexer structure
  – Dual-rail logic required

Complementary Pass-Transistor Logic (CPL)

• Small input loads → power and delay are reduced
• Efficient XOR and MUX implementation
• Good output drive
• Cross-coupled pull-up → large short-circuit current
• Substantial number of nodes
• Inefficient realization of simple gates

Double Pass-Transistor Logic (DPL)

• Both PMOS and NMOS logic networks are used in parallel → full swing on the output signals
• The number of transistors and the number of nodes are quite high → substantial capacitive load

Swing Restored Pass-Transistor Logic (SRPL)

• Derived from CPL; output inverters are cross-coupled to a latch structure → swing restoration and output buffering at the same time
• Transistor sizing is difficult, poor output driving capability
• Slow switching
• Large short-circuit current

Single-Rail Pass-Transistor Logic (LEAP)

• Single NMOS networks are required → area, power and delay decrease
• Swing restoration only works for $V_{dd} > V_{tn} + |V_{tp}|$ → robustness at low voltages is not guaranteed

Comparisons between CMOS and Pass Transistor

• Pass-transistor logic is claimed to be the low-power logic style
  – All the comparisons were based on the full adder implementation
• Not representative
• Full adders have limited importance even in arithmetic circuits

Comparisons between CMOS and PL

• CPL gives higher performance than CMOS for the full adder implementation
• For multiplexers and other monotonic gates, CMOS outperforms the other styles
• For XOR, CPL is faster, but its power-delay product is higher
• CPL provides the best performance among all pass-transistor design styles

Domino Logic

• Nonratioed logic – sizing of the pMOS transistor is not important for output levels
• Higher speed
• Only implements noninverting logic gates
• Best suited for large fan-in gates
• Switching activity is high
• Lower noise immunity
• Large clock load

Logic Activity

• Probability of 0 → 1 transition:
  – Static CMOS: p0 p1 = p0(1 – p0)
  – Dynamic CMOS: p0
• Example: 2-input NOR gate (each input p1 = 0.5; output p1 = 0.25, p0 = 0.75)
  – Static CMOS: $P_{dyn} = 0.1875\, C_L V_{DD}^2 f_{CK}$
  – Dynamic CMOS: $P_{dyn} = 0.75\, C_L V_{DD}^2 f_{CK}$
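The NOR example can be reproduced by enumerating the truth table. The sketch below assumes independent inputs that are 1 with probability 0.5, a static gate whose output makes a 0 → 1 transition with probability p0 · p1, and a dynamic gate that makes one whenever its output evaluated to 0 (probability p0). The function names are illustrative.

```python
from itertools import product


def output_one_probability(gate, n_inputs: int, p_in_one: float = 0.5) -> float:
    """Probability that the gate output is 1, assuming independent inputs."""
    p_out = 0.0
    for bits in product((0, 1), repeat=n_inputs):
        weight = 1.0
        for b in bits:
            weight *= p_in_one if b else 1.0 - p_in_one
        if gate(bits):
            p_out += weight
    return p_out


def nor(bits) -> int:
    return int(not any(bits))


def static_activity(p1: float) -> float:
    """0 -> 1 transition probability of a static CMOS output: p0 * p1."""
    return (1.0 - p1) * p1


def dynamic_activity(p1: float) -> float:
    """0 -> 1 transition probability of a dynamic output: p0."""
    return 1.0 - p1


p1_nor = output_one_probability(nor, 2)
print(f"NOR2 output: p1={p1_nor}, p0={1 - p1_nor}")
print(f"static CMOS activity factor:  {static_activity(p1_nor)}")   # 0.1875
print(f"dynamic CMOS activity factor: {dynamic_activity(p1_nor)}")  # 0.75
```

Multiplying either activity factor by C_L V_DD² f_CK gives the corresponding dynamic power, which is why the dynamic NOR gate dissipates four times as much as the static one in this example.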

Selecting a Logic Style

• Static CMOS: most reliable and predictable, reasonable in power and speed, voltage scaling and device sizing are well understood.

• Pass-transistor logic: beneficial for multiplexer and XOR dominated circuits like adders, etc.

• For large fan-in gates, static CMOS is inefficient; a choice can be made between pseudo-nMOS, dynamic CMOS and domino CMOS.