ELEC 301
Volkan Kursun
CMOS Inverter
EE141
ELEC
301 SPRING 2009
1
VOLKAN KURSUN
Announcements
• Extra lecture
  – Venue: Room 4619 (Lift 31-32)
  – Date: April 22 2009 (next Wednesday)
  – Time: 7PM to 9PM
Announcements
• HW1, HW2, and HW3 solutions are on the web
• Final exam
  – Venue: LG4204
  – Date: 29 May 2009 (Fri)
  – Time: 12:30-15:30
  – Closed book exam
  – No copy sheets
  – Bring a calculator
Design Optimization: Endless Cycle
Quote from the beginning of Part 2 in the book:
“A design is what the designer has when time and money run out”
CMOS Inverter
[Figure: CMOS inverter layout and schematic. Labels: N well, VDD, PMOS, NMOS, contacts, polysilicon, Metal 1, In, Out, GND; dimension marker 2λ.]
Two Inverters
Share power and ground
Abut cells
VDD
Connect in Metal
CMOS Inverter
First-Order DC Analysis
VOL = 0
VOH = VDD
VM = f(Rn, Rp)
CUT-OFF transistor: open switch (infinite resistance assumption)
[Figure: switch models of the CMOS inverter. Vin = 0 V: PMOS ON (Rp connects Vout to VDD), NMOS CUT-OFF. Vin = VDD: NMOS ON (Rn connects Vout to GND), PMOS CUT-OFF.]
Static CMOS Properties
• The high and low output levels are VDD and GND, respectively
  – Voltage swing is equal to the supply voltage
  – High noise margins
• The logic levels are not dependent on the relative device sizes
  – Ratioless logic
  – In ratioed logic (such as NMOS), logic levels are determined by the relative transistor sizes
Static CMOS Properties
• In steady state, there is always a low-resistance path between the output node and either VDD or GND
  – Low output impedance
  – Less sensitive to noise: fast recovery of node voltages from disturbances induced by noise
• No direct current path exists between VDD and GND at steady state
  – No static DC power consumption
  – Primarily because of this, CMOS was a low-power technology when it was first proposed in the 1960s
  – Leakage is considered separately; it is not included in static DC power
Static CMOS Properties
• Input resistance of a CMOS gate is very high
  – Small DC current flows through the gate insulator (ideally zero)
  – Steady-state output current is negligible (ideally zero)
  – A CMOS gate can drive a large number of CMOS gates (ideally infinite fanout): functionality is maintained even with very high fanout
• Increasing fanout also increases the propagation delay
  – Although the steady-state behavior is not affected, the dynamic (transient) response is degraded with increased fanout
Voltage Transfer Characteristics
PMOS Load Lines
IDSp = -IDSn
VGSn = Vin, VDSn = Vout
VGSp = Vin - VDD, VDSp = Vout - VDD
Vin = VDD + VGSp, IDn = -IDp, Vout = VDD + VDSp
[Figure: PMOS I-V curves (IDp versus VDSp, for VGSp = -1 and VGSp = -2.5) mapped onto the NMOS axes (IDn versus Vout = VDSn) for Vin = 0 and Vin = 1.5.]
CMOS Inverter Load Characteristics
[Figure: IDn versus Vout. NMOS curves and PMOS load lines for Vin = 0, 0.5, 1, 1.5, 2, and 2.5 V.]
CMOS Inverter Voltage Transfer Characteristics (VTC)
[Figure: Vout versus Vin, both from 0 to 2.5 V, divided into five operating regions.]
Operating regions, in order of increasing Vin:
1) NMOS off, PMOS res
   NMOS: VGS_NMOS < VT_NMOS
   PMOS: VGS_PMOS < VT_PMOS and VGD_PMOS < VT_PMOS
2) NMOS sat, PMOS res
   NMOS: VGS_NMOS > VT_NMOS and VGD_NMOS < VT_NMOS
   PMOS: VGS_PMOS < VT_PMOS and VGD_PMOS < VT_PMOS
3) NMOS sat, PMOS sat
   NMOS: VGS_NMOS > VT_NMOS and VGD_NMOS < VT_NMOS
   PMOS: VGS_PMOS < VT_PMOS and VGD_PMOS > VT_PMOS
4) NMOS res, PMOS sat
5) NMOS res, PMOS off
   NMOS: VGS_NMOS > VT_NMOS and VGD_NMOS > VT_NMOS
   PMOS: VGS_PMOS > VT_PMOS
Switching threshold voltage: VM = Vin = Vout
*res: resistive (linear) region
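The region conditions on the VTC slide can be checked mechanically. The following is a minimal sketch, not part of the lecture: the function name `classify_regions` is hypothetical, and the default thresholds (0.43 V and -0.4 V) and VDD = 2.5 V are assumptions borrowed from the sizing example used later in the course. It only applies the VGS/VGD inequalities listed for each region.

```python
def classify_regions(vin, vout, vdd=2.5, vtn=0.43, vtp=-0.4):
    """Return (nmos_region, pmos_region) for one inverter operating point.

    Regions are "off", "sat" (saturation) or "res" (resistive/linear).
    Threshold defaults are illustrative assumptions, not fitted values.
    """
    # NMOS: source at GND, so VGS = Vin and VGD = Vin - Vout.
    if vin < vtn:
        nmos = "off"
    elif vin - vout < vtn:
        nmos = "sat"
    else:
        nmos = "res"

    # PMOS: source at VDD, so VGS = Vin - VDD and VGD = Vin - Vout.
    if vin - vdd > vtp:
        pmos = "off"
    elif vin - vout > vtp:
        pmos = "sat"
    else:
        pmos = "res"

    return nmos, pmos


# At the switching threshold (Vin = Vout) both devices are saturated.
print(classify_regions(1.25, 1.25))
```

Sweeping `vin` along a simulated or measured VTC with this helper would reproduce the five-region labeling of the plot.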
Switching Threshold Voltage
• At the switching threshold voltage point both NMOS and PMOS are in saturation (VGD = 0 V)
• Obtain an analytical expression for VM by equating the currents through the transistors
$$I_{DSAT} = \mu C_{ox}\frac{W}{L}\left((V_{GS}-V_T)V_{min} - \frac{V_{min}^2}{2}\right)(1+\lambda V_{DS}) = k\,V_{min}\left(V_{GS}-V_T-\frac{V_{min}}{2}\right)(1+\lambda V_{DS})$$
$$V_{min} = V_{DSAT} = \frac{L\,v_{sat}}{\mu} \quad\text{or}\quad V_{min} = V_{GS}-V_T$$
• Assume velocity saturation (Vmin = VDSAT) and ignore channel length modulation
• Assuming identical oxide thickness and channel length for NMOS and PMOS
Sizing for VM = VDD/2
$$\frac{(W/L)_p}{(W/L)_n} = \frac{\mu_n C_{oxn} V_{DSATn}\,(V_M - V_{Tn} - V_{DSATn}/2)}{\mu_p C_{oxp} V_{DSATp}\,(V_M - V_{DD} - V_{Tp} - V_{DSATp}/2)}$$
To achieve VM = VDD/2 with long-channel devices
$$\frac{(W/L)_p}{(W/L)_n} = \frac{\mu_n}{\mu_p}$$
To achieve VM = VDD/2 with short-channel devices
• Due to velocity saturation ($V_{DSAT} = L\,v_{sat}/\mu$)
$$\frac{(W/L)_p}{(W/L)_n} \neq \frac{\mu_n}{\mu_p}$$
Sizing Example
• In our generic 0.25 micron CMOS process, using the process parameters given in the book, a VDD = 2.5 V, and a minimum size NMOS device ((W/L)n of 1.5)

              NMOS           PMOS
VT0 (V)       0.43           -0.4
γ (V^0.5)     0.4            -0.4
VDSAT (V)     0.63           -1
λ (V^-1)      0.06           -0.1
k' (A/V^2)    115 x 10^-6    -30 x 10^-6

Problem? Where??
$$\frac{(W/L)_p}{(W/L)_n} = \frac{115\times10^{-6}}{-30\times10^{-6}} \times \frac{0.63}{-1.0} \times \frac{1.25 - 0.43 - 0.63/2}{1.25 - 0.4 - 1.0/2} = 3.5$$
(W/L)p = 3.5 x 1.5 = 5.25 for a VM of 1.25 V
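The arithmetic of the sizing example can be reproduced with a few lines of Python. This is a sketch only: the function name `pmos_to_nmos_ratio` is hypothetical, and the denominator is written with the signed PMOS parameters exactly as they are plugged into the worked numbers, (VDD - VM + VTp + VDSATp/2), which is an assumption about the intended sign convention.

```python
def pmos_to_nmos_ratio(vm, vdd, kn, vtn, vdsatn, kp, vtp, vdsatp):
    """(W/L)p / (W/L)n that places the switching threshold at vm.

    PMOS parameters (kp, vtp, vdsatp) are passed as signed, negative values,
    mirroring the process table of the example.
    """
    numerator = kn * vdsatn * (vm - vtn - vdsatn / 2)
    denominator = kp * vdsatp * (vdd - vm + vtp + vdsatp / 2)
    return numerator / denominator


ratio = pmos_to_nmos_ratio(
    vm=1.25, vdd=2.5,
    kn=115e-6, vtn=0.43, vdsatn=0.63,
    kp=-30e-6, vtp=-0.4, vdsatp=-1.0,
)
wl_p = ratio * 1.5  # minimum-size NMOS has (W/L)n = 1.5

print(f"ratio = {ratio:.2f}, (W/L)p = {wl_p:.2f}")
```

The unrounded ratio comes out slightly below 3.5, so the 5.25 quoted on the slide reflects rounding the ratio before multiplying.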
VTC Shifts with the Transistor Ratio
• Changing the Wp/Wn ratio shifts the VTC
• Increasing PMOS size
  – VTC shifts to the right
  – VM moves closer to VDD
• Increasing NMOS size
  – VTC shifts left
  – VM moves closer to GND
[Figure: Vout (V) versus Vin (V), 0 to 2.5 V, for VDD = 2.5 V, showing the VTC for increased and decreased Wp/Wn ratios.]
Switching Threshold Voltage as a Function of Transistor Ratio
• VM is relatively insensitive to variations in device ratio
  – Size PMOS smaller than that required for perfect symmetry
  – Ratio = 3, VM = 1.22
  – Ratio = 2, VM = 1.13
• Wp/Wn = 3.4 for VM = VDD/2
[Figure: VM (V), 0.8 to 1.8, versus Wp/Wn on a logarithmic axis from 10^0 to 10^1.]
Determining VIH and VIL
• Piece-wise linear approximation of VTC (a simplified approach)
  – Transition region approximated with a straight line whose gain is equal to the actual gain at Vin = VM
  – Intersection with VOH and VOL approximates VIL and VIH, respectively
• Higher gain in the transition region is desirable for a higher noise immunity
[Figure: piece-wise linear VTC, Vout versus Vin, marking VOH, VOL, VM, VIL, and VIH.]
Inverter Gain
Determined by technology parameters such as the channel length modulation coefficient (λ) and VT. Designer influence through supply voltage and VM (transistor sizing).
$$r = \frac{k_p V_{DSATp}}{k_n V_{DSATn}} = \frac{\upsilon_{satp} W_p}{\upsilon_{satn} W_n}$$
[Figure: gain (0 to -18) versus Vin (V), 0 to 2.5.]
Gain as a Function of VDD
• Gain in the transition region is enhanced with VDD scaling for VDD > VT
  – Transition region becomes narrower
  – VTC becomes more similar to the ideal VTC
• Alternatively, in subthreshold circuits, gain is degraded with smaller VDD
  – At around VDD = 2 to 4 kT/q, the gain in the transition region approaches 1
  – For sufficient gain and full-voltage-rail functionality, VDD must be larger than 2kT/q
[Figure: two VTC plots, Vout (V) versus Vin (V): one over 0 to 2.5 V and one over 0 to 0.2 V with the gain = -1 point marked.]
Impact of Process Variations
• Fast transistor
  – ∆Tox: -3nm
  – ∆L: -25nm
  – ∆W: +30nm
  – ∆Vt: -60mV
• Slow transistor
  – Opposite
[Figure: Vout (V) versus Vin (V), 0 to 2.5 V, for three cases: nominal; fast PMOS with slow NMOS; fast NMOS with slow PMOS.]
Propagation Delay
CMOS Inverter Propagation Delay: Approach 1
• Represent the transistor with a current source whose value is equal to the average (dis)charge current over the time interval of interest
$$t_{pHL} = \frac{C_L\,V_{swing}/2}{I_{av}} \sim \frac{C_L}{k_n V_{DD}}$$
[Figure: NMOS with Vin = VDD discharging CL at Vout with the average current Iav.]
CMOS Inverter Propagation Delay
Approach 2: Switch Model
• Voltage dependence of Ron and CL: represent Ron and CL with constant linear elements whose values are averaged over the time interval of interest
tpHL = f(Ron·CL) = 0.69 Ron CL
[Figure: Ron discharging CL with Vin = VDD. Vout/VDD decays exponentially from 1, crossing 0.5 at t = 0.69 RonCL (from ln(0.5)) and 0.36 at t = RonCL.]
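The 0.69 factor in the switch model is ln(2): the time for an RC discharge to fall to half of its starting value. The short sketch below illustrates this with hypothetical helper names (`vout_discharge`, `t_half`) and arbitrary example values for Ron and CL; it assumes an ideal step input and constant Ron and CL, as the switch model does.

```python
import math


def vout_discharge(t, ron, cl, vdd):
    """Output voltage of CL discharging through a constant Ron from VDD."""
    return vdd * math.exp(-t / (ron * cl))


def t_half(ron, cl):
    """Time to reach 50% of VDD: ln(2) * Ron * CL, about 0.69 * Ron * CL."""
    return math.log(2) * ron * cl


ron, cl, vdd = 10e3, 5e-15, 2.5  # illustrative values only
t50 = t_half(ron, cl)
print(t50 / (ron * cl))                          # factor close to 0.69
print(vout_discharge(ron * cl, ron, cl, vdd) / vdd)  # fraction left after one time constant
```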
The Transistor as a Switch
Modeled as a switch (between S and D, closed for VGS ≥ VT) with infinite off resistance and a finite on resistance, Ron
How to find equivalent Ron (Req)?
1) By integration
2) Or by averaging the values at the end points of the transition region that determines the propagation delay
[Figure: ID versus VDS for VGS = VDD, marking the resistances Rmid at VDS = VDD/2 and R0 at VDS = VDD.]
The Transistor as a Switch
* Equivalent resistance extracted for
VGS = VDD and VDS = VDD → VDD/2
Ron extracted for W/L = 1
For larger devices divide Req by W/L
LH and HL Delays with Switch Model
Turned-ON transistor: represent with an equivalent channel resistance
(a) Output low-to-high transition: Vin = 0 V, PMOS ON, NMOS CUT-OFF
tpLH = f(Rp, CL) = 0.69*Rp*CL
(b) Output high-to-low transition: Vin = VDD, NMOS ON, PMOS CUT-OFF
tpHL = f(Rn, CL) = 0.69*Rn*CL
[Figure: inverter schematics and equivalent switch models for both transitions, driving CL at Vout.]
How to Equalize HL and LH Propagation Delays?
• Propagation delay is proportional to the time constant of the network formed by the pull-down (or pull-up) resistor and the load capacitance
tpHL = f(Reqn, CL)
tpHL = ln(2) Reqn CL = 0.69 Reqn CL
tpLH = ln(2) Reqp CL = 0.69 Reqp CL
tp = (tpHL + tpLH)/2 = 0.69 CL(Reqn + Reqp)/2
• To equalize the HL and LH propagation delays, equalize the resistances of the NMOS and PMOS transistors (the same condition as for a symmetrical VTC)
[Figure: pull-down model with Vin = VDD: Reqn discharging CL toward Vout = 0.]
Average Propagation Delay
tp = (tpHL + tpLH)/2 = 0.69 CL (Reqn + Reqp)/2
[Figure: Vout (V) versus t (sec), time axis 0 to 2.5 x 10^-10, with tpHL and tpLH marked.]
Components of Load Capacitance
[Figure: two cascaded inverters (M1/M2 driving M3/M4) with Vin, Vout1, and Vout2, annotated with CGD_M1, CGD_M2, CDB1, CDB2, Cw, CG3, and CG4.]
Load capacitance (CL) has 3 components:
1) intrinsic MOS transistor capacitances
2) wiring (interconnect) capacitance
3) extrinsic MOS transistor fanout capacitances
MOS Capacitance Model
CGS = CGCS + CGSO
CGD = CGCD + CGDO
CGB = CGCB
CSB = CSdiff
CDB = CDdiff
[Figure: MOS transistor with terminals G, S, D, and B and the capacitances CGS, CGD, CGB, CSB, and CDB.]
Caps Affecting the Delay of the First Gate
[Figure: two cascaded inverters (P1/N1 driving P2/N2) between Input1, Output1, and Output2, annotated with Cgdo_P1, Cgdo_N1, Cdb_P1, Cdb_N1, Cwire, Cg_P2, and Cg_N2.]
Gate-Drain Capacitance: The Miller Effect
• M1 and M2 are either in cut-off or in saturation
  – Cgd is due to gate overlap only
  – Oxide capacitance is either between gate and body or gate and source
• The floating gate-drain capacitor is replaced by a capacitor from drain-to-ground
• A capacitor experiencing identical but opposite voltage swings at both its terminals can be replaced by a capacitor to ground whose value is two times the original value
[Figure: CGD1 between Vin and Vout of M1, with the voltage swings marked at both terminals, replaced by 2CGD1 from Vout to ground.]
Junction Cap: Equivalence Coefficient
• The drain-to-body junction capacitances experience reverse body bias during the output critical transitions that determine the delay
• A coefficient can be used to relate the actual average (equivalent effective) junction capacitance to the zero-body-bias junction cap (Cj0)
Ceq = Keq * Cj0
$$K_{eq} = \frac{-\Phi_b^{\,m}}{(V_1 - V_2)(1-m)}\left[(\Phi_b - V_1)^{1-m} - (\Phi_b - V_2)^{1-m}\right]$$
Assume the voltage across a p-n junction transitions from V1 to V2. m is the junction grading coefficient and Φb is the built-in junction potential. V1 and V2 are the initial and final voltages across the junction, respectively. V1 and V2 are negative for reverse bias and positive for forward bias.
Junction Bias Voltages
The initial and final junction voltages experienced by the
transistors of the first gate are V1 and V2, respectively.
While determining the initial and final junction voltages,
consider only the junctions and the junction voltages that
are necessary for calculating the propagation delays.
Output low to high transition:
N1-drain: V1 = 0V and V2 = -1.25V
P1-drain: V1 = -2.5V and V2 = -1.25V
Output high to low transition:
N1-drain: V1 = -2.5V and V2 = -1.25V
P1-drain: V1 = 0V and V2 = -1.25V
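The equivalence coefficient can be evaluated directly for the junction voltages listed above. The sketch below is illustrative: the function name `keq` is hypothetical, and the built-in potential (0.9 V) and grading coefficient (0.5) are assumed values for an NMOS bottom-plate junction, not parameters stated on these slides.

```python
def keq(v1, v2, phi_b, m):
    """Equivalence coefficient relating the average junction cap to Cj0.

    v1, v2: initial and final junction voltages (negative for reverse bias).
    phi_b:  built-in junction potential (V); m: junction grading coefficient.
    """
    prefactor = -(phi_b ** m) / ((v1 - v2) * (1 - m))
    return prefactor * ((phi_b - v1) ** (1 - m) - (phi_b - v2) ** (1 - m))


PHI_B = 0.9  # assumed built-in potential (V)
M = 0.5      # assumed grading coefficient (abrupt junction)

# N1 drain, output high-to-low: V1 = -2.5 V, V2 = -1.25 V
print(round(keq(-2.5, -1.25, PHI_B, M), 2))  # roughly 0.57
# N1 drain, output low-to-high: V1 = 0 V, V2 = -1.25 V
print(round(keq(0.0, -1.25, PHI_B, M), 2))   # roughly 0.79
```

The expression is symmetric in V1 and V2, so swapping the initial and final voltages gives the same coefficient.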
Junction Cap: Equivalence Coefficient
We can simplify the diffusion capacitance calculations even further by using a Keq to relate the linearized capacitor to the value of the junction capacitance under zero-bias:
Ceq = Keq Cj0

          high-to-low          low-to-high
          Keqbp    Keqsw       Keqbp    Keqsw
NMOS      0.57     0.61        0.79     0.81
PMOS      0.79     0.86        0.59     0.7
Extrinsic Fan-out Capacitance
[Figure: the two cascaded inverters from "Components of Load Capacitance" repeated, with the fan-out gate capacitances CG3 and CG4 of M3 and M4.]
Extrinsic (Fan-Out) Capacitance
• The extrinsic, or fan-out, capacitance is the total gate capacitance of the loading gates M3 and M4.
Cfan-out = Cgate (NMOS) + Cgate (PMOS)
         = (CGSOn + CGDOn + WnLnCox) + (CGSOp + CGDOp + WpLpCox)
• Simplification of the actual situation
  – Assumes all the components of Cgate are between Vout and GND (or VDD)
  – Assumes the channel capacitances of the loading gates are constant
Pre-Layout Look: Load Capacitance
[Figure: the two cascaded inverters from "Components of Load Capacitance" repeated, with CGD_M1, CGD_M2, CDB1, CDB2, Cw, CG3, and CG4.]
Layout of Two Chained Inverters
[Figure: layout of two chained inverters between VDD and GND. PMOS: 9λ/2λ (1.125/0.25); NMOS: 3λ/2λ (0.375/0.25). Labels: In, Out, Metal1, polysilicon; dimension markers 1.2 μm, = 2λ, and 0.125.]
The physical dimensions (area and perimeter) of the drain and source diffusion areas must be determined to be able to estimate the diffusion (junction) capacitances
Drains of NMOS Transistors
[Figure: layout of the two chained inverters (PMOS 9λ/2λ = 1.125/0.25, NMOS 3λ/2λ = 0.375/0.25).]
NMOS Transistors’ Drains
AD = 4λ*4λ + 1λ*3λ = 19λ²
PD = 5λ + 4λ + 4λ + λ + λ = 15λ

        W/L           AD (μm²)      PD (μm)        AS (μm²)      PS (μm)
NMOS    0.375/0.25    0.3 (19λ²)    1.875 (15λ)    0.3 (19λ²)    1.875 (15λ)
PMOS    1.125/0.25    0.7 (45λ²)    2.375 (19λ)    0.7 (45λ²)    2.375 (19λ)
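The table entries follow from converting the λ-based areas and perimeters to micrometers. The sketch below assumes λ = 0.125 μm, which is inferred from the layout labels (3λ/2λ = 0.375/0.25); the helper names are hypothetical.

```python
LAMBDA_UM = 0.125  # assumed: 2 lambda = 0.25 um in this 0.25 um process


def area_um2(area_in_lambda_sq):
    """Convert an area expressed in lambda^2 to square micrometers."""
    return area_in_lambda_sq * LAMBDA_UM ** 2


def perimeter_um(perimeter_in_lambda):
    """Convert a perimeter expressed in lambda to micrometers."""
    return perimeter_in_lambda * LAMBDA_UM


# NMOS drain: AD = 4*4 + 1*3 = 19 lambda^2, PD = 5+4+4+1+1 = 15 lambda
nmos_ad = 4 * 4 + 1 * 3
nmos_pd = 5 + 4 + 4 + 1 + 1
# PMOS drain: AD = 5*9 = 45 lambda^2, PD = 5+9+5 = 19 lambda
pmos_ad = 5 * 9
pmos_pd = 5 + 9 + 5

print(area_um2(nmos_ad), perimeter_um(nmos_pd))  # about 0.3 um^2 and 1.875 um
print(area_um2(pmos_ad), perimeter_um(pmos_pd))  # about 0.7 um^2 and 2.375 um
```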
Sources of NMOS Transistors
NMOS Transistors’ Sources
AS = 4λ*4λ + 1λ*3λ = 19λ²
PS = 5λ + 4λ + 4λ + λ + λ = 15λ
Sources of PMOS Transistors
PMOS Transistors’ Sources
AS = 5λ*9λ = 45λ²
PS = 5λ + 9λ + 5λ = 19λ
Drains of PMOS Transistors
PMOS Transistors’ Drains
AD = 5λ*9λ = 45λ²
PD = 5λ + 9λ + 5λ = 19λ
Components of CL (0.25 μm)

Capacitor   Expression                            Value (fF) H→L   Value (fF) L→H
Intrinsic capacitors
CGD1        2 Con Wn (Miller)                     0.23             0.23
CGD2        2 Cop Wp (Miller)                     0.61             0.61
CDB1        Keqbpn ADn Cj + Keqswn PDn Cjsw       0.66             0.90
CDB2        Keqbpp ADp Cj + Keqswp PDp Cjsw       1.5              1.15
Extrinsic fan-out capacitors
CG3         (2 Con)Wn + Cox Wn Ln                 0.76             0.76
CG4         (2 Cop)Wp + Cox Wp Lp                 2.28             2.28
Wire
Cw          from extraction                       0.12             0.12
Total
CL          Σ                                     6.1              6.0

In CG3 and CG4, (2 Con)Wn = CGDOn + CGSOn and (2 Cop)Wp = CGDOp + CGSOp.
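As a quick consistency check, the totals in the table can be recomputed by summing the listed components. This sketch simply copies the tabulated values in fF; the dictionary and function names are hypothetical.

```python
# Component values in fF, copied from the table: (high-to-low, low-to-high).
COMPONENTS_FF = {
    "CGD1": (0.23, 0.23),
    "CGD2": (0.61, 0.61),
    "CDB1": (0.66, 0.90),
    "CDB2": (1.5, 1.15),
    "CG3": (0.76, 0.76),
    "CG4": (2.28, 2.28),
    "Cw": (0.12, 0.12),
}


def total_load(components):
    """Sum the per-component capacitances for each output transition."""
    hl = sum(values[0] for values in components.values())
    lh = sum(values[1] for values in components.values())
    return hl, lh


cl_hl, cl_lh = total_load(COMPONENTS_FF)
print(f"CL (H->L) = {cl_hl:.2f} fF, CL (L->H) = {cl_lh:.2f} fF")
```

The sums come out a few hundredths of a femtofarad above the 6.1 fF and 6.0 fF quoted in the table, so the slide values appear to be truncated to one decimal place.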
Inverter Transient Response
VDD = 2.5 V, 0.25 μm
W/Ln = 1.5, W/Lp = 4.5
Reqn = 13 kΩ (÷ 1.5)
Reqp = 31 kΩ (÷ 4.5)
tpHL = 0.69 * (13 kΩ / 1.5) * 6.1 fF = 36 ps
tpLH = 0.69 * (31 kΩ / 4.5) * 6.0 fF = 29 ps
so tp = (36 + 29)/2 = 32.5 ps
[Figure: Vin and Vout (V) versus t (sec), time axis 0 to 2.5 x 10^-10, with tpHL, tpLH, tf, and tr marked.]
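The RC-model delays quoted above can be reproduced numerically. This is a minimal sketch with a hypothetical helper name (`rc_delay`); it uses only the numbers given on the slide (unit-size equivalent resistances, W/L ratios, and load capacitances).

```python
def rc_delay(req_unit_ohm, w_over_l, cl_farad):
    """Switch-model delay: 0.69 * (Req for W/L = 1, divided by W/L) * CL."""
    return 0.69 * (req_unit_ohm / w_over_l) * cl_farad


tphl = rc_delay(13e3, 1.5, 6.1e-15)  # pull-down through the NMOS
tplh = rc_delay(31e3, 4.5, 6.0e-15)  # pull-up through the PMOS
tp = (tphl + tplh) / 2

print(f"tpHL = {tphl * 1e12:.1f} ps")
print(f"tpLH = {tplh * 1e12:.1f} ps")
print(f"tp   = {tp * 1e12:.1f} ps")
```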
Inverter Delay: Simulation versus RC Model
From simulation: tPHL = 39 ps and tPLH = 31.7 ps
From the RC model (VDD = 2.5 V, 0.25 μm, W/Ln = 1.5, W/Lp = 4.5, Reqn = 13 kΩ (÷ 1.5), Reqp = 31 kΩ (÷ 4.5)):
tpHL = 0.69 * (13 kΩ / 1.5) * 6.1 fF = 36 ps
tpLH = 0.69 * (31 kΩ / 4.5) * 6.0 fF = 29 ps
so tp = (36 + 29)/2 = 32.5 ps
Voltage overshoot and undershoot observed at the output during the input transitions (due to the gate-to-drain oxide capacitances) contribute to the higher delay measured by simulation. The effect of the overshoots on the output voltage levels is ignored by the simple RC model.
[Figure: Vin and Vout (V) versus t (sec), time axis 0 to 2.5 x 10^-10, with tpHL, tpLH, tf, and tr marked.]

Delay as a Function of VDD
• Propagation delay increases significantly at lower supply voltages (and the noise margins are reduced)
  – The two primary reasons why VDD cannot be scaled aggressively to lower the power consumption
[Figure: tp (normalized), 1 to 5.5, versus VDD (V), 0.8 to 2.4. Black line: drawn by simulation. Similar in shape to the Ron versus VDD curve.]


Lower VDD: Stronger Delay Dependence
• For high VDD, delay is only weakly dependent on VDD due to the channel length modulation effect
• With the given simple formula that ignores the channel length modulation effect, delay becomes virtually independent of VDD for sufficiently high supply voltages that satisfy VDD >> VTn + VDSATn/2 (the VDD terms cancel out)
$$t_{PHL} \approx 0.52\,\frac{C_L}{\frac{W}{L}\,\mu_n C_{ox} V_{DSATn}}$$
• For very high supply voltages delay is weakly dependent on VDD: increasing VDD too much provides only a small improvement in speed. Device reliability concerns and the power consumption limits (set by cooling and battery lifetime requirements) enforce firm upper boundaries on the highest supply voltage that can be used in a CMOS circuit.
[Figure: tp (normalized), 1 to 5.5, versus VDD (V), 0.8 to 2.4. The red data are calculated using the formula.]
Design for High Speed
Strategy 1: Reduce the load capacitance CL
• Remember that CL has 3 components
  – Lower the intrinsic capacitances of the driver
    • Do not oversize the driver
      » A larger driver has both higher gate overlap capacitance and higher source/drain diffusion capacitance
      » Self-loading (always remember that every driver must drive its own capacitances as well)
    • Draw diffusion capacitance aware layouts
  – Lower the wire capacitance
    • Try to minimize by careful physical design and layout (more on wire design later)
  – Lower the extrinsic capacitance (fanout capacitance)
    • Do not oversize the fanout gates
      » Larger gates are more difficult to drive
      » More fanout capacitance → Longer driver delay
Design for High Speed
Strategy 2: Strengthen the driver by increasing the W/L ratio of the driver transistors
• Increasing W/L ratio increases the current produced by a transistor
• However, a higher W/L ratio (larger transistor) also increases the gate-to-drain oxide capacitance and the source/drain diffusion capacitances of a transistor
  – Self-loading
  – Once the intrinsic capacitances start to dominate the extrinsic fan-out and wire capacitances, increasing the transistor sizes does not reduce the delay
    » All the extra current produced by a larger transistor is used to drive its own capacitances, with negligible effect on the gate delay beyond this point
• Furthermore, increasing the W/L ratio of the transistors inside the driver causes the input capacitance of the driver to increase
  – The driver itself becomes more difficult to drive
  – The delay of the driver of the driver increases
  – Overall delay of a multi-stage circuit could degrade due to the oversizing of the intermediate gates in a chain
Design for High Speed
Strategy 3: Increase the supply voltage VDD
• Increasing VDD lowers the resistance of the transistors
• Delay is reduced (speed is enhanced) at a higher supply voltage
• The prices paid for this higher speed are
  – Higher power consumption
  – Degraded device reliability
    » Enhanced hot-carrier effects
    » Punch-through
    » Gate-oxide breakdown
• Increasing VDD above a certain level yields only small enhancements in speed
  – Avoid using very high supply voltage to avoid device reliability issues and to lower the power consumption
Design for Symmetry versus Design for Speed
• We have seen that tPHL and tPLH can be equalized by matching the resistances of the PMOS and NMOS transistors
• However, a gate with symmetric output responses is not necessarily the fastest gate
• Average propagation delay of a CMOS inverter can be reduced by reducing the size of the PMOS transistor
• When the width of the PMOS transistor is increased
  – tPLH is reduced by increasing the pull-up current (by lowering the resistance of the PMOS transistor)
  – tPHL is increased due to the higher parasitic capacitance of the larger PMOS transistor
  – A larger PMOS transistor increases CL [note that there is no change in the pull-down (NMOS) resistance]
• There is an optimum PMOS size that minimizes the average propagation delay
How to Design for Lowest Average Delay
Consider 2 identical cascaded inverters
[Figure: two cascaded inverters (P1/N1 driving P2/N2) between Input1, Output1, and Output2, annotated with Cgdo_P1, Cgdo_N1, Cdb_P1, Cdb_N1, Cwire, Cg_P2, and Cg_N2.]
The load capacitance of the first gate is:
$$C_L = C_{db\_P1} + C_{db\_N1} + C_{gdo\_P1} + C_{gdo\_N1} + C_{wire} + C_{g\_P2} + C_{g\_N2}$$
Assume the PMOS transistors are β times the size of the NMOS transistors
Assume all the capacitances scale linearly with the size, and assume that the unit capacitances (overlap, diffusion, and gate) of the NMOS and PMOS transistors are equal
$$\beta = \frac{W_{PMOS}}{W_{NMOS}},\qquad C_{db\_P1} = \beta C_{db\_N1},\qquad C_{g\_P2} = \beta C_{g\_N2},\qquad C_{gdo\_P1} = \beta C_{gdo\_N1}$$
Design for Lowest Average Delay
$$\beta = \frac{W_{PMOS}}{W_{NMOS}},\qquad C_{gdo\_P1} = \beta C_{gdo\_N1},\qquad C_{db\_P1} = \beta C_{db\_N1},\qquad C_{g\_P2} = \beta C_{g\_N2}$$
The load capacitance of the first gate becomes:
$$C_L = (1+\beta)(C_{db\_N1} + C_{gdo\_N1} + C_{g\_N2}) + C_{wire}$$
Let the resistance of the NMOS transistor be Reqn
Let the resistance of a PMOS transistor sized the same as the NMOS transistor be Reqp
The average propagation delay is
$$t_p = \frac{\ln(2)}{2}\left[(1+\beta)(C_{db\_N1} + C_{gdo\_N1} + C_{g\_N2}) + C_{wire}\right]\left(R_{eqn} + \frac{R_{eqp}}{\beta}\right)$$
Design for Lowest Average Delay
$$\beta = \frac{W_{PMOS}}{W_{NMOS}},\qquad C_{gdo\_P1} = \beta C_{gdo\_N1},\qquad C_{db\_P1} = \beta C_{db\_N1},\qquad C_{g\_P2} = \beta C_{g\_N2}$$
The resistance ratio of identically sized PMOS and NMOS transistors is
$$r = \frac{R_{eqp}}{R_{eqn}}$$
$$t_p = \frac{\ln(2)}{2}\left[(1+\beta)(C_{db\_N1} + C_{gdo\_N1} + C_{g\_N2}) + C_{wire}\right]\left(R_{eqn} + \frac{R_{eqp}}{\beta}\right)$$
The optimum PMOS to NMOS transistor size ratio can be found by differentiating the tp equation with respect to β and setting the result to zero
$$\frac{\partial t_p}{\partial \beta} = 0 \;\Rightarrow\; \beta_{optimum} = \sqrt{r\left(1 + \frac{C_{wire}}{C_{db\_N1} + C_{gdo\_N1} + C_{g\_N2}}\right)}$$
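The closed-form optimum can be sanity-checked against a brute-force search over β. The sketch below is illustrative only: the function names are hypothetical, capacitances are in arbitrary normalized units (`c_n` stands for Cdb_N1 + Cgdo_N1 + Cg_N2), and r = 31/13 is borrowed from the equivalent resistances used elsewhere in these slides.

```python
import math


def avg_delay(beta, r, c_n, c_wire, reqn=1.0):
    """Average delay tp(beta) in normalized units.

    c_n is the NMOS-side capacitance sum (Cdb_N1 + Cgdo_N1 + Cg_N2);
    the PMOS resistance is r * reqn / beta for a PMOS beta times wider.
    """
    c_load = (1 + beta) * c_n + c_wire
    return (math.log(2) / 2) * c_load * reqn * (1 + r / beta)


def beta_optimum(r, c_n, c_wire):
    """Closed-form minimizer of avg_delay with respect to beta."""
    return math.sqrt(r * (1 + c_wire / c_n))


r = 31 / 13
for c_wire in (0.0, 0.5):
    closed_form = beta_optimum(r, 1.0, c_wire)
    # Brute-force search on a fine grid as an independent check.
    grid = [0.5 + 0.001 * i for i in range(5000)]
    numeric = min(grid, key=lambda b: avg_delay(b, r, 1.0, c_wire))
    print(f"Cwire = {c_wire}: closed form {closed_form:.3f}, grid search {numeric:.3f}")
```

With negligible wire capacitance the optimum reduces to the square root of r, which is smaller than the β = r needed for symmetric delays.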
Optimum Size for Lowest Average Delay
• We have seen that by matching the resistances of the PMOS and NMOS transistors
  – tPHL and tPLH are equalized
  – Symmetrical VTC is achieved (VM = VDD/2)
$$\beta_{symmetry} = \frac{W_{PMOS}/L_{PMOS}}{W_{NMOS}/L_{NMOS}} = \frac{R_{eqp}}{R_{eqn}} = r$$
• However, the optimum PMOS to NMOS ratio that minimizes the average propagation delay is
$$\beta_{optimum} = \sqrt{r\left(1 + \frac{C_{wire}}{C_{db\_N1} + C_{gdo\_N1} + C_{g\_N2}}\right)}$$
For small (negligible) wire capacitance as compared to the intrinsic and extrinsic fan-out capacitances, the optimum beta can be approximated by
$$\beta_{optimum} \approx \sqrt{r}$$
Design for Symmetry versus Design for Speed
• Compare the sizing requirements for achieving different goals: symmetry versus maximum average speed
$$\beta_{symmetry} = \frac{W_{PMOS}/L_{PMOS}}{W_{NMOS}/L_{NMOS}} = \frac{R_{eqp}}{R_{eqn}} = r$$
$$\beta_{optimum} = \sqrt{r\left(1 + \frac{C_{wire}}{C_{db\_N1} + C_{gdo\_N1} + C_{g\_N2}}\right)}$$
For small (negligible) wire capacitance:
$$\beta_{optimum} \approx \sqrt{r}$$
PMOS/NMOS Ratio: Smaller Can be Faster
• Smaller transistor sizes and smaller area can result in a faster design at the expense of symmetric VTC, symmetric output voltage transitions, and noise margins
• β of 2.4 (= 31 kΩ/13 kΩ) provides a symmetrical response
• β of 1.6 to 1.9 provides the optimum performance
[Figure: tpHL, tpLH, and tp (sec), 3 to 5 x 10^-11, versus β = Wp/Wn from 1 to 5.]
Why Worry About Average Delay?
• Why not only consider the worst (longest) of the LH and HL delays as the primary performance metric?
• Answer:
  – To form a complex network with some useful functionality, several CMOS gates are cascaded
  – CMOS circuits are inverting
  – An HL transition at the output of one gate triggers an LH transition at the output of the fan-out gate
  – Critical path delays are composed of accumulated HL and LH delays of the individual gates along the path
  – Therefore using the average of the LH and HL delays as the performance metric makes sense
[Figure: tpHL, tpLH, and tp (sec), 3 to 5 x 10^-11, versus β from 1 to 5.]
Device Sizing
• Divide capacitive load, CL, into
  – Cint: intrinsic capacitance - diffusion and Miller effect
  – Cext: extrinsic capacitance - wire and fanout
tp = 0.69 Req Cint (1 + Cext/Cint) = tp0 (1 + Cext/Cint)
where tp0 = 0.69 Req Cint is the intrinsic (unloaded) delay of the gate
• Widening both PMOS and NMOS by a factor of S reduces Req by an identical factor (Req = Rref/S) but also raises the intrinsic capacitance by the same factor (Cint = SCiref)
tp = 0.69 Rref Ciref (1 + Cext/(SCiref)) = tp0(1 + Cext/(SCiref))
  – tp0 is independent of the sizing of the gate; with no load the enhanced current of the gate is totally offset by the increased intrinsic capacitance when the gate size is increased
• Maximum attainable (ultimate) speed is observed for infinitely large S
  – Impact of external load is eliminated; the delay is reduced to the intrinsic delay
  – Any S sufficiently larger than (Cext/Cint) yields good performance (close to the ultimate speed limit) with smaller area penalty
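The diminishing return of upsizing follows directly from tp = tp0(1 + Cext/(S Ciref)). The sketch below tabulates it with a hypothetical helper (`sized_delay`) and normalized, made-up values (tp0 = 1, Cext/Ciref = 4); these numbers are assumptions chosen only to show the trend.

```python
def sized_delay(s, tp0, cext, ciref):
    """Delay of a gate upsized by factor s, driving a fixed external load."""
    return tp0 * (1 + cext / (s * ciref))


TP0 = 1.0    # intrinsic delay, normalized (assumption)
CEXT = 4.0   # external load relative to Ciref (assumption)
CIREF = 1.0  # intrinsic capacitance of the reference gate, normalized

for s in (1, 2, 5, 10, 100):
    print(f"S = {s:>3}: tp / tp0 = {sized_delay(s, TP0, CEXT, CIREF):.2f}")
```

Each doubling of S removes half of the remaining extrinsic term, so the delay approaches tp0 but never drops below it.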
Sizing for the Ultimate Speed
• The majority of the improvement is already obtained for S = 5. Sizing factors larger than 10 barely yield any extra gain while causing significantly larger area and higher power consumption.
• The ultimate speed of a CMOS circuit is determined by the intrinsic delay of the circuit
[Figure: tp (sec), 2 to 3.8 x 10^-11, versus S from 1 to 15, for a fixed load. At large S the self-loading effect takes over (intrinsic capacitance dominates).]
Impact of Input Rise/Fall Times on Delay
• In reality, the input signal changes gradually (and both PMOS and NMOS conduct for some time). This affects the actual (net) current available for charging/discharging CL and impacts the propagation delay.
• tp increases linearly with the increasing input slope, ts, for ts > tp
• ts is nonzero due to the limited driving capability of the preceding gate
[Figure: tp (sec), 3.6 to 5.4 x 10^-11, versus ts (sec), 0 to 8 x 10^-11, for a minimum-size inverter with a fan-out of a single gate.]
Impact of Input Rise/Fall Times on Delay
• A gate is never designed in isolation: its performance is affected by both the fan-out and the driving strength of the gate(s) providing the input(s)
$$t_p^{i} = t_{step}^{i} + \eta\, t_{step}^{i-1} \qquad (\eta \approx 0.25)$$
  – $t_p^{i}$: propagation delay of gate i
  – $t_{step}^{i}$: propagation delay of gate i for a step input
  – $\eta\, t_{step}^{i-1}$: fraction of the propagation delay of gate i-1 (the driver gate) for a step input, that is, the driver’s delay scaled by a factor η
  – η: an empirical parameter
• Keep the signal rise times smaller than or equal to the gate propagation delays
  – Desirable for higher speed
  – Desirable for lower (short-circuit) power consumption
• Keeping rise and fall times of the signals small and of approximately equal values is one of the major challenges in high-performance designs - slope engineering.
Sizing a Chain of Gates for Maximum Speed
Optimum Sizing of a Gate Embedded in a Real Environment
• Increasing the size of a gate reduces the extrinsic delay
• However, increasing the size of a gate also increases the input capacitance of the gate
• Gate sizing should be done carefully, considering the effect of sizing on the delays of the preceding gates
• Next, we will look at how to determine the optimum size of a gate embedded in a real environment
  – Example: a chain of CMOS inverters
Optimum Sizing of a Gate Embedded in a Real Environment
• To determine the effect of sizing on the input gate capacitance, we need to establish a relationship between the intrinsic output capacitance and the input gate capacitance of a CMOS inverter
• Let us assume
$$C_{int} = \gamma C_g$$
  – Both Cint and Cg linearly scale with the size of the gate
  – Therefore this relationship is independent of the sizing
  – γ is a function of technology
  – Close to 1 in a typical CMOS technology
Optimum Sizing of a Gate Embedded in a Real Environment
• Let us rewrite the delay equation derived in the previous lecture:
$$t_p = t_{p0}\left(1 + \frac{C_{ext}}{S C_{iref}}\right)$$
With $C_{int} = \gamma C_g$:
$$t_p = t_{p0}\left(1 + \frac{C_{ext}}{\gamma C_g}\right) = t_{p0}\left(1 + \frac{f}{\gamma}\right) \quad\text{where}\quad f = \frac{C_{ext}}{C_g}$$
• f: effective fan-out (the ratio of the external load capacitance to the input capacitance of a gate)
• Delay is a function of the effective fan-out
Optimum Design of an Inverter Chain for Maximum Speed
[Figure: chain of inverters from In to Out driving the load capacitance CL.]
If the load capacitance CL and the size of the first inverter are given:
- How many stages are needed to minimize the delay?
- How to size the inverters for maximum speed?
Minimize the Delay of an Inverter Chain
[Figure: chain of N inverters (1, 2, ..., N) from In to Out, with input capacitance Cg1 and load CL.]
tp = tp1 + tp2 + … + tpN
Assume Cext is composed of only the input capacitance of the fan-out gate; ignore the wire capacitance for simplicity
$$t_{p,j} = 0.69 R_{ref} C_{iref}\left(1 + \frac{C_{g,j+1}}{\gamma C_{g,j}}\right)$$
$$t_p = \sum_{j=1}^{N} t_{p,j} = t_{p0}\sum_{j=1}^{N}\left(1 + \frac{C_{g,j+1}}{\gamma C_{g,j}}\right),\qquad C_{g,N+1} = C_L$$
Optimum Sizing Constraint
[Figure: chain of N inverters (stages 1, 2, ..., N) from In to Out; input capacitance Cg1, load capacitance CL]
The delay equation has N - 1 unknowns, Cg,2 to Cg,N
To minimize the delay, take the N - 1 partial derivatives and set each of them equal to zero
The result is a set of constraints for the optimum design that provides the maximum speed:
$$\frac{C_{g,j+1}}{C_{g,j}} = \frac{C_{g,j}}{C_{g,j-1}}$$
Optimum size of each stage is the geometric mean of the two neighbors:
$$C_{g,j} = \sqrt{C_{g,j-1}\,C_{g,j+1}}$$
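The step from the partial derivatives to the geometric-mean constraint can be written out as follows. This is a sketch based on the chain delay expression of the previous slide; only the two terms of the sum that contain $C_{g,j}$ contribute to the derivative.

```latex
\frac{\partial t_p}{\partial C_{g,j}}
  = \frac{t_{p0}}{\gamma}\,\frac{\partial}{\partial C_{g,j}}
    \left(\frac{C_{g,j}}{C_{g,j-1}} + \frac{C_{g,j+1}}{C_{g,j}}\right)
  = \frac{t_{p0}}{\gamma}
    \left(\frac{1}{C_{g,j-1}} - \frac{C_{g,j+1}}{C_{g,j}^{2}}\right) = 0
\;\Rightarrow\;
C_{g,j}^{2} = C_{g,j-1}\,C_{g,j+1}
```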
Optimum Design Observations
[Figure: chain of N inverters (stages 1, 2, ..., N) from In to Out; input capacitance Cg1, load capacitance CL]
Constraint for the optimum (maximum speed) design:
$$\frac{C_{g,j+1}}{C_{g,j}} = \frac{C_{g,j}}{C_{g,j-1}}$$
Optimum size of each stage is the geometric mean of the two neighbors:
$$C_{g,j} = \sqrt{C_{g,j-1}\,C_{g,j+1}}$$
- In an optimum design, each stage has the same effective fan-out (f = Cext/Cg = Cout/Cin)
- In an optimum design, each stage has the same delay
Optimum Tapering Factor for Given N
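[Figure: chain of N inverters (stages 1, 2, ..., N) from In to Out; input capacitance Cg1, load capacitance CL]
Multiply the fan-outs of all stages:
$$f_1 f_2 f_3 \cdots f_N = \frac{C_{g2}}{C_{g1}}\cdot\frac{C_{g3}}{C_{g2}}\cdot\frac{C_{g4}}{C_{g3}}\cdots\frac{C_L}{C_{gN}} = \frac{C_L}{C_{g1}}$$
Since in an optimum design each stage has the same effective fan-out (f1 = f2 = f3 = … = fN = foptimum):
$$f_{optimum}^{N} = \frac{C_L}{C_{g1}} \;\Rightarrow\; f_{optimum} = \sqrt[N]{\frac{C_L}{C_{g1}}} = \sqrt[N]{F}$$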
Minimum Achievable Delay
[Figure: chain of N inverters from In to Out; input capacitance Cg1, load capacitance CL]
The minimum delay of each stage is then:
$$t_{p,optimum} = t_{p0}\left(1 + \frac{f_{optimum}}{\gamma}\right) = t_{p0}\left(1 + \frac{\sqrt[N]{F}}{\gamma}\right)$$
Since in an optimum design each stage has the same delay, the minimum delay through the chain is
$$t_{chain,optimum} = N t_{p0}\left(1 + \frac{\sqrt[N]{F}}{\gamma}\right)$$
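The sizing rule and the minimum chain delay can be checked numerically. The following is a minimal sketch, not part of the original slides: the function name and the normalization (tp0 = 1, Cg1 = 1) are assumptions made for illustration.

```python
def size_inverter_chain(c_g1, c_load, n_stages, gamma=1.0, t_p0=1.0):
    """Size an N-stage chain with equal fan-out per stage (hypothetical helper).

    Returns the per-stage fan-out, the input capacitance of each stage,
    and the chain delay N * tp0 * (1 + f / gamma).
    """
    overall_fanout = c_load / c_g1                 # F = CL / Cg1
    f = overall_fanout ** (1.0 / n_stages)         # f = F^(1/N)
    stage_caps = [c_g1 * f ** j for j in range(n_stages)]
    delay = n_stages * t_p0 * (1.0 + f / gamma)
    return f, stage_caps, delay


# Load of 8 * Cg1 driven through 3 stages, gamma = 1
f, stage_caps, delay = size_inverter_chain(c_g1=1.0, c_load=8.0, n_stages=3)
print(f, stage_caps, delay)
```

Each entry of `stage_caps` is the geometric mean of its two neighbors (with CL acting as the neighbor after the last stage), which is the optimum sizing constraint.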
Example: Sizing an Inverter
Chain for Minimum Delay
[Figure: three-inverter chain with sizes 1, f = 2, and f^2 = 4; input capacitance Cg,1, load CL = 8 Cg,1]
CL/Cg,1 has to be evenly distributed over N = 3 inverters
CL/Cg,1 = 8/1
$$f_{optimum} = \sqrt[3]{8} = 2$$
Assuming γ = 1,
Delay of each stage:
$$t_{p,optimum} = t_{p0}\left(1 + \frac{f_{optimum}}{\gamma}\right) = t_{p0}\left(1 + \frac{\sqrt[N]{F}}{\gamma}\right) = t_{p0}(1 + 2) = 3t_{p0}$$
Minimum delay of the chain:
$$t_{chain,optimum} = 3t_{p0}\left(1 + \frac{f_{optimum}}{\gamma}\right) = 9t_{p0}$$
3 inverter chain with equalized delay (condition for optimum design) at each stage
Choosing the Optimum Number of
Stages for Maximum Speed
$$t_p = N t_{p0} + N t_{p0}\frac{\sqrt[N]{F}}{\gamma}$$
For large N, the first component of the delay equation, due to the intrinsic delay of the buffers, becomes dominant
For small N, the optimum fan-out increases and the sizes of the buffers become very large; the second component of the delay, due to the extrinsic capacitances, starts to dominate
Determining N: Optimum Number of Stages

What is the optimal value for N given F (= f^N)?
 if the number of stages is too large, the intrinsic delay of the stages becomes dominant
 if the number of stages is too small, the effective fan-out of each stage becomes dominant
The optimum N is found by differentiating the minimum delay expression with respect to the number of stages and setting the result to 0, giving
$$\gamma + \sqrt[N]{F} - \frac{\sqrt[N]{F}\,\ln F}{N} = 0$$
For γ = 0 (ignoring self-loading), N = ln(F) and the effective fan-out becomes f = e = 2.71828
For γ = 1 (the typical case), the optimum effective fan-out (tapering factor) turns out to be close to 3.6
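The condition for the optimum number of stages has no closed-form solution for γ > 0, but it is easy to solve numerically. Substituting f = F^(1/N), so that ln(F)/N = ln(f), turns it into γ + f - f ln(f) = 0, or equivalently f = exp(1 + γ/f). The sketch below iterates that fixed-point form; the function names and the iteration count are assumptions chosen for illustration.

```python
import math


def optimum_fanout(gamma, iterations=60):
    """Solve f = exp(1 + gamma / f) by fixed-point iteration (hypothetical helper)."""
    f = math.e  # exact solution for gamma = 0, reasonable start otherwise
    for _ in range(iterations):
        f = math.exp(1.0 + gamma / f)
    return f


def optimum_stage_count(overall_fanout, gamma):
    """Non-integer N that satisfies F = f^N at the optimum fan-out."""
    return math.log(overall_fanout) / math.log(optimum_fanout(gamma))


print(optimum_fanout(0.0))            # self-loading ignored
print(optimum_fanout(1.0))            # typical case
print(optimum_stage_count(64.0, 1.0))
```

In practice N must be an integer, so the neighboring integer values are compared using the chain delay expression.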
Buffer Design for Maximum Speed
Question: Given a specific load capacitance and a specific input
capacitance (which means a given specific first stage buffer size)
design a driver circuit for minimum delay (assume γ = 1)
[Figure: four candidate driver circuits between an input capacitance of 1 and a load of 64, with stage sizes 1; 1, 8; 1, 4, 16; and 1, 2.8, 8, 22.6]
N    foptimum    tchain_optimum
1    64          65 (x tp0)
2    8           18 (x tp0)
3    4           15 (x tp0)
4    2.8         15.3 (x tp0)
Optimized 3-stage circuit is the fastest for this load
$$t_p = N t_{p0}\left(1 + \frac{\sqrt[N]{F}}{\gamma}\right)$$
Buffer Design for Various Loads
Delay in units of tp0 (γ = 1):
F         Unbuffered    Two Stage Chain    Opt. Inverter Chain
10        11            8.3                8.3
100       101           22                 16.5
1,000     1001          65                 24.8
10,000    10,001        202                33.1
Impressive speed-ups with optimized cascaded
inverter chains for very large capacitive loads
Assignment: Find the optimum number of stages and the
optimum fan-out for each load at home.
Verify the data in the table.
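One possible starting point for the assignment is sketched below. It is not the official solution: the function names are hypothetical, delays are in units of tp0 with γ = 1, and the optimum-chain column is computed with a non-integer number of stages, which is an inference from the table values rather than something the slides state.

```python
import math


def optimum_fanout(gamma=1.0):
    """Bisection on gamma + f - f*ln(f) = 0, which decreases monotonically for f > 1."""
    lo, hi = 1.0, 20.0
    for _ in range(100):
        mid = 0.5 * (lo + hi)
        if gamma + mid - mid * math.log(mid) > 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def unbuffered(overall_fanout, gamma=1.0):
    """Single inverter driving the load directly."""
    return 1.0 + overall_fanout / gamma


def two_stage(overall_fanout, gamma=1.0):
    """Two stages, each with fan-out sqrt(F)."""
    return 2.0 * (1.0 + math.sqrt(overall_fanout) / gamma)


def optimum_chain(overall_fanout, gamma=1.0):
    """Chain at the optimum fan-out; N is left non-integer (assumption)."""
    f = optimum_fanout(gamma)
    n = math.log(overall_fanout) / math.log(f)
    return n * (1.0 + f / gamma)


for big_f in (10, 100, 1000, 10000):
    print(big_f, unbuffered(big_f),
          round(two_stage(big_f), 1), round(optimum_chain(big_f), 1))
```

A real design needs an integer N, so the delay for the integers on either side of the computed value should also be compared.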
Delay with Long Interconnects (RC)

When gates are farther apart, wire capacitance and
resistance can no longer be ignored.
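[Figure: driver (intrinsic capacitance cint) connected through a wire (rw, cw, L) from Vin to Vout, loaded by the fan-out capacitance cfan]
tp = 0.69 Rdr Cint + (0.69 Rdr + 0.38 Rw) Cw + 0.69 (Rdr + Rw) Cfan
   = 0.69 Rdr (Cint + Cfan) + 0.69 (Rdr cw + rw Cfan) L + 0.38 rw cw L^2
where Rdr = (Reqn + Reqp)/2, Rw = rw L, and Cw = cw L
The second term is linear with length; the third term is quadratic with length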
 Wire delay rapidly becomes the dominant factor (due to
the quadratic term) in the delay budget for longer wires.
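To see how the quadratic term takes over, the delay expression can be evaluated for increasing wire lengths. This is an illustrative sketch: the function names are hypothetical and the numeric values in the example are placeholders, not data from the slides.

```python
def wire_delay(r_dr, c_int, c_fan, r_w, c_w, length):
    """Delay of a driver plus RC wire, lumped form (r_w, c_w are per unit length)."""
    total_rw = r_w * length
    total_cw = c_w * length
    return (0.69 * r_dr * c_int
            + (0.69 * r_dr + 0.38 * total_rw) * total_cw
            + 0.69 * (r_dr + total_rw) * c_fan)


def wire_delay_terms(r_dr, c_int, c_fan, r_w, c_w, length):
    """Same delay split into constant, linear, and quadratic terms in length."""
    constant = 0.69 * r_dr * (c_int + c_fan)
    linear = 0.69 * (r_dr * c_w + r_w * c_fan) * length
    quadratic = 0.38 * r_w * c_w * length ** 2
    return constant, linear, quadratic


# Placeholder values: ohms, farads, ohms per um, farads per um, um
params = dict(r_dr=10e3, c_int=2e-15, c_fan=4e-15, r_w=0.1, c_w=0.1e-15)
for length in (100.0, 1000.0, 10000.0):
    print(length, wire_delay_terms(length=length, **params))
```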
Power
Dissipation
Why Power Matters
 Packaging costs
 Power distribution network design
 Chip and system cooling costs
 Noise immunity and system reliability
 Battery life (in portable systems)
 Environmental concerns
 Office equipment accounted for 5% of total US
commercial energy usage in 1993
 Energy Star compliant systems
Why worry about power? -- Power Dissipation
Lead microprocessor power continues to increase
[Figure: power (Watts, log scale, 0.1 to 100) versus year (1971 to 2000) for Intel microprocessors 4004, 8008, 8080, 8085, 8086, 286, 386, 486, Pentium®, and P6]
Power delivery and dissipation will be prohibitive
Source: Borkar, De Intel
Why worry about power? -- Chip Power Density
[Figure: power density (W/cm2, log scale, 1 to 10000) versus year (1970 to 2010) for Intel microprocessors 4004, 8008, 8080, 8085, 8086, 286, 386, 486, Pentium®, and P6, with reference levels for a hot plate, a nuclear reactor, a rocket nozzle, and the Sun's surface]
…chips become hot…
Source: Borkar, De Intel
Chip Power Density Distribution
[Figure: Willamette power map (heat flux, 0 to 250 W/cm2) and on-die temperature map (40 to 110 C); Al-SiC + epoxy die attach]
Power is not uniformly dissipated across the chip
 Silicon is not a good heat conductor
 Max junction temperature is determined by hot-spots
 Impact on packaging, w.r.t. cooling

Why worry about power ? -- Battery Size/Weight
[Figure: nominal battery capacity (W-hr/lb, 0 to 50) versus year (1965 to 1995) for nickel-cadmium, Ni-metal hydride, and rechargeable lithium batteries; annotation: Battery (40+ lbs)]
Expected battery lifetime increase over the next 5 years: 30 to 40%
From Rabaey, 1995
Why worry about power? -- Standby Power
Year                    2002   2005   2008   2011   2014
Power supply Vdd (V)    1.5    1.2    0.9    0.7    0.6
Threshold VT (V)        0.4    0.4    0.35   0.3    0.25

Subthreshold leakage increases as VT decreases to meet frequency demands, leading to excessive standby power consumption that drains the battery.
[Figure: standby power (0% to 50%) versus year (2000 to 2008), with labeled values 12W, 88W, 400W, 1.7KW, and 8KW]
…phones leak!
Source: Borkar, De Intel
Power and Energy Figures of Merit




Power consumption in Watts
 determines battery life in hours
Peak power
 determines power ground wiring designs
 sets packaging limits
 impacts signal noise margin and reliability analysis
Energy in Joules
 power integrated over time (power is the rate at which energy is consumed)
Energy = power * delay
 Joules = Watts * seconds
 Lower energy number means less power to perform a computation at the same frequency
Power versus Energy
Power is height of curve
 Lower power design could simply be slower
Energy is area under curve
 Two approaches require the same energy
[Figure: two plots of power (Watts) versus time comparing Approach 1 and Approach 2]
PDP and EDP
Power-delay product (PDP) = Pav * tp = (CL VDD^2)/2
 PDP is the average energy consumed per switching event (Watts * sec = Joule)
 lower power design could simply be a slower design
Energy-delay product (EDP) = PDP * tp = Pav * tp^2
 EDP is the average energy consumed multiplied by the computation time required
 takes into account that one can trade increased delay for lower energy/operation (e.g., via supply voltage scaling that increases delay, but decreases energy consumption)
 allows one to understand tradeoffs better
[Figure: normalized energy, delay, and energy-delay (0 to 15) versus Vdd (0.5 to 2.5 V)]
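The shape of the energy-delay curve can be reproduced with a simple model. The sketch below is illustrative only: it takes energy proportional to VDD^2 (from the PDP expression) and delay proportional to VDD/(VDD - VTE), the first-order delay model used later in these slides. The value of VTE and the function names are assumptions.

```python
def energy_delay_product(vdd, v_te):
    """Normalized EDP: energy ~ VDD^2, delay ~ VDD / (VDD - VTE)."""
    energy = vdd ** 2
    delay = vdd / (vdd - v_te)
    return energy * delay


def best_vdd_for_edp(v_te, v_max=2.5, step=0.001):
    """Grid search for the supply voltage that minimizes the normalized EDP."""
    n = int(round((v_max - v_te) / step))
    candidates = [v_te + k * step for k in range(10, n + 1)]
    return min(candidates, key=lambda v: energy_delay_product(v, v_te))


v_te = 0.5  # hypothetical value in volts
print(best_vdd_for_edp(v_te))
```

Under this model the minimum falls at VDD = 1.5 VTE, which follows from setting the derivative of VDD^3/(VDD - VTE) to zero.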
Understanding Tradeoffs

Which design is the “best” (fastest, coolest, both) ?
[Figure: design points a, b, c, and d plotted against 1/Delay; arrows mark the direction of lower EDP and of better designs]
Where Does Power Go in CMOS?
• Dynamic Power Consumption
Charging and Discharging Capacitors
• Short Circuit Currents
Short Circuit Path between Supply Rails during Switching
• Leakage
Leaking diodes and transistors
CMOS Energy & Power Equations
E = CL VDD^2 P0→1 + tsc VDD Ipeak P0→1 + VDD Ileakage
P0→1: probability of energy consuming transitions at the output of a gate (activity factor)
f0→1 = P0→1 * fclock
P = CL VDD^2 f0→1 + tsc VDD Ipeak f0→1 + VDD Ileakage
(first term: dynamic power; second term: short-circuit power; third term: leakage power)
f0→1: frequency of energy consuming transitions (switching activity)
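The three terms of the power equation can be evaluated separately to see which one dominates for a given operating point. A minimal sketch follows; the function name and all numeric values are placeholders chosen for illustration, not data from the slides.

```python
def power_components(c_load, vdd, p_01, f_clock, t_sc, i_peak, i_leak):
    """Return (dynamic, short-circuit, leakage) power in watts."""
    f_01 = p_01 * f_clock                      # switching activity
    dynamic = c_load * vdd ** 2 * f_01
    short_circuit = t_sc * vdd * i_peak * f_01
    leakage = vdd * i_leak
    return dynamic, short_circuit, leakage


# Placeholder operating point: 30 fF load, 2.5 V supply, 100 MHz clock
dynamic, short_circuit, leakage = power_components(
    c_load=30e-15, vdd=2.5, p_01=0.25, f_clock=100e6,
    t_sc=100e-12, i_peak=100e-6, i_leak=1e-9)
total = dynamic + short_circuit + leakage
print(dynamic / total, short_circuit / total, leakage / total)
```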
Dynamic Power Consumption
[Figure: CMOS inverter with supply Vdd, input Vin, output Vout, and load capacitance CL]
Energy/transition = CL * VDD^2 * P0→1
Pdyn = Energy/transition * f = CL * VDD^2 * P0→1 * f
Pdyn = CEFF * VDD^2 * f
where CEFF = P0→1 CL
Not a function of transistor sizes?
Data dependent - a function of switching activity!
Lowering Dynamic Power
Pdyn = CL VDD^2 P0→1 f
Capacitance: Function of fan-out, wire length, transistor sizes
Supply Voltage: Has been dropping with successive generations
Activity factor: How often, on average, do wires switch?
Clock frequency: Increasing…
Modification for Circuits with Reduced Swing
[Figure: circuit supplied from Vdd whose output charges CL only up to Vdd - Vt]
$$E_{0\rightarrow 1} = C_L \cdot V_{dd} \cdot (V_{dd} - V_t)$$
Can exploit reduced swing to lower power
(e.g., reduced bit-line swing in memory)
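The energy saved by a reduced swing follows directly from the expression above. A small sketch, with hypothetical function names and placeholder values:

```python
def energy_full_swing(c_load, vdd):
    """Energy drawn from the supply for a 0 -> Vdd output transition."""
    return c_load * vdd ** 2


def energy_reduced_swing(c_load, vdd, vt):
    """Energy drawn from the supply when the output only rises to Vdd - Vt."""
    return c_load * vdd * (vdd - vt)


c_load, vdd, vt = 100e-15, 2.5, 0.5   # placeholder values
saving = 1.0 - energy_reduced_swing(c_load, vdd, vt) / energy_full_swing(c_load, vdd)
print(saving)   # fraction of energy saved, equal to vt / vdd
```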
Transistor Sizing and VDD Scaling for Minimum Energy
 Problem: minimize the energy dissipation of a
circuit with a specified lower boundary on speed
 Lower the supply voltage to reduce energy
 Compensate for the loss in speed by increasing the
transistor sizes
 Tradeoff: increasing transistor sizes increases the
capacitance
 At a specific lower supply voltage, the increasing
capacitance of transistors begins to dominate the
power and energy consumption
– Energy increases with further scaling of VDD
Transistor Sizing and VDD Scaling for Minimum Energy
[Figure: two-stage inverter chain with sizes 1 and f; input capacitance Cg1, load capacitance Cext]
Goal: Minimize the energy of the whole circuit
 Design parameters are f and VDD
 tp ≤ tpref: delay of the reference circuit sets the upper boundary of delay (lower boundary of speed)
– Assume a reference circuit with f = 1 and VDD = Vref
$$t_p = t_{p0}\left[\left(1 + \frac{f}{\gamma}\right) + \left(1 + \frac{C_{ext}}{\gamma C_{g2}}\right)\right] = t_{p0}\left[\left(1 + \frac{f}{\gamma}\right) + \left(1 + \frac{C_{ext}}{\gamma f C_{g1}}\right)\right] = t_{p0}\left[\left(1 + \frac{f}{\gamma}\right) + \left(1 + \frac{F}{f\gamma}\right)\right]$$
where F = Cext/Cg1

Transistor Sizing and VDD Scaling for Minimum Energy
[Figure: two-stage inverter chain with sizes 1 and f; input capacitance Cg1, load capacitance Cext]
Goal: Minimize the energy of the whole circuit
 Design parameters: f and VDD
 tp ≤ tpref: delay of the reference circuit sets the upper boundary of delay (lower boundary of speed)
– Reference circuit with f = 1 and VDD = Vref
$$R_{eq} = \frac{1}{V_{DD}/2}\int_{V_{DD}/2}^{V_{DD}} \frac{V}{I_{DSAT}(1 + \lambda V)}\,dV \approx \frac{3}{4}\frac{V_{DD}}{I_{DSAT}}\left(1 - \frac{7}{9}\lambda V_{DD}\right)$$
Ignoring channel length modulation (assume λ = 0):
$$R_{eq} \approx \frac{3}{4}\frac{V_{DD}}{I_{DSAT}}$$
Intrinsic Delay As a Function of VDD
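[Figure: two-stage inverter chain with sizes 1 and f; input capacitance Cg1, load capacitance Cext]
$$R_{eq} = \frac{1}{V_{DD}/2}\int_{V_{DD}/2}^{V_{DD}} \frac{V}{I_{DSAT}(1 + \lambda V)}\,dV \approx \frac{3}{4}\frac{V_{DD}}{I_{DSAT}}\left(1 - \frac{7}{9}\lambda V_{DD}\right)$$
Ignoring channel length modulation (assume λ = 0):
$$R_{eq} \approx \frac{3}{4}\frac{V_{DD}}{I_{DSAT}}$$
$$t_{p0} = \ln(2)\,C_{int} R_{eq} = 0.69\,C_{int}\,\frac{3}{4}\frac{V_{DD}}{I_{DSAT}} = 0.52\,\frac{C_{int} V_{DD}}{\mu C_{ox}\frac{W}{L} V_{DSAT}\left(V_{DD} - V_T - \frac{V_{DSAT}}{2}\right)}$$
$$t_{p0} \propto \frac{V_{DD}}{V_{DD} - V_T - \frac{V_{DSAT}}{2}}$$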
Transistor Sizing and VDD Scaling for Minimum Energy
[Figure: two-stage inverter chain with sizes 1 and f; input capacitance Cg1, load capacitance Cext]
Goal: Minimize the energy of the whole circuit
 Design parameters: f and VDD
 tp ≤ tpref: delay of the reference circuit sets the upper boundary of delay (lower boundary of performance)
– Reference circuit with f = 1 and VDD = Vref
$$t_p = t_{p0}\left[\left(1 + \frac{f}{\gamma}\right) + \left(1 + \frac{F}{f\gamma}\right)\right]$$
$$t_{p0} \propto \frac{V_{DD}}{V_{DD} - V_{TE}}, \quad V_{TE} = V_T + \frac{V_{DSAT}}{2}$$
Reference Circuit Delay
[Figure: two-stage inverter chain with sizes 1 and f; input capacitance Cg1, load capacitance Cext]
Reference circuit delay (with f = 1 and VDD = Vref), taking γ = 1:
$$t_{pref} = t_{p0ref}\left[\left(1 + \frac{f}{\gamma}\right) + \left(1 + \frac{F}{f\gamma}\right)\right] = t_{p0ref}\left[\left(1 + \frac{1}{1}\right) + \left(1 + \frac{F}{1}\right)\right]$$
$$t_{pref} = t_{p0ref}\,(3 + F)$$
$$t_{p0ref} \propto \frac{V_{ref}}{V_{ref} - V_{TE}}, \quad V_{TE} = V_T + \frac{V_{DSAT}}{2}$$
Optimization Circuit Delay
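[Figure: two-stage inverter chain with sizes 1 and f; input capacitance Cg1, load capacitance Cext]
Design parameters: f and VDD
Taking γ = 1:
$$t_p = t_{p0}\left[\left(1 + \frac{f}{\gamma}\right) + \left(1 + \frac{F}{f\gamma}\right)\right] = t_{p0}\left[\left(1 + \frac{f}{1}\right) + \left(1 + \frac{F}{f}\right)\right]$$
$$t_p = t_{p0}\left(2 + f + \frac{F}{f}\right)$$
$$t_{p0} \propto \frac{V_{DD}}{V_{DD} - V_{TE}}, \quad V_{TE} = V_T + \frac{V_{DSAT}}{2}$$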
Transistor Sizing And VDD for Speed Criterion
 Speed constraint (assuming γ = 1)
$$\frac{t_p}{t_{pref}} = \frac{t_{p0}}{t_{p0ref}}\cdot\frac{2 + f + \frac{F}{f}}{3 + F} = \frac{V_{DD}}{V_{ref}}\cdot\frac{V_{ref} - V_{TE}}{V_{DD} - V_{TE}}\cdot\frac{2 + f + \frac{F}{f}}{3 + F} = 1$$
 This equation establishes a relationship between the sizing factor f and the supply voltage based on the speed criterion
Transistor Sizing And VDD for Performance Criterion
 Speed constraint (assuming γ = 1)
$$\frac{t_p}{t_{pref}} = \frac{t_{p0}}{t_{p0ref}}\cdot\frac{2 + f + \frac{F}{f}}{3 + F} = \frac{V_{DD}}{V_{ref}}\cdot\frac{V_{ref} - V_{TE}}{V_{DD} - V_{TE}}\cdot\frac{2 + f + \frac{F}{f}}{3 + F} = 1$$
[Figure: required VDD (V, 0 to 4) to satisfy the speed criterion versus f (1 to 7), for F = 1, 2, 5, 10, and 20]
Transistor Sizing, VDD, and Energy
[Figure: two-stage inverter chain with sizes 1 and f; input capacitance Cg1, load capacitance Cext]
 Energy for single transition
$$C_{g\_total} = C_g(inv1) + C_g(inv2) = C_{g1}(1 + f)$$
$$C_{int\_total} = C_{int}(inv1) + C_{int}(inv2) = C_{int1}(1 + f) = \gamma C_{g1}(1 + f)$$
$$C_{total} = C_{g\_total} + C_{int\_total} + C_{ext} = C_{g1}(1 + f)(1 + \gamma) + F C_{g1}$$
$$E = V_{DD}^{2}\,C_{total} = V_{DD}^{2}\,C_{g1}\left[(1 + \gamma)(1 + f) + F\right]$$
Assuming γ = 1:
$$\frac{E}{E_{ref}} = \frac{V_{DD}^{2}\,C_{g1}\left[(1 + 1)(1 + f) + F\right]}{V_{ref}^{2}\,C_{g1}\left[(1 + 1)(1 + 1) + F\right]} = \left(\frac{V_{DD}}{V_{ref}}\right)^{2}\frac{2 + 2f + F}{4 + F}$$
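The speed constraint and the energy ratio can be combined numerically. Solving the speed constraint for the supply voltage gives VDD = K VTE / (K - 1), with K = [Vref / (Vref - VTE)] (3 + F) / (2 + f + F/f). The sketch below assumes γ = 1 and uses Vref = 2.5 V and VTE = 0.5 V purely as illustrative values; these numbers and the function names are assumptions, not taken from the slides.

```python
def required_vdd(f, overall_fanout, v_ref=2.5, v_te=0.5):
    """Supply voltage that keeps tp equal to tpref, or None if no VDD can."""
    k = (v_ref / (v_ref - v_te)) * (3.0 + overall_fanout) \
        / (2.0 + f + overall_fanout / f)
    if k <= 1.0:
        return None          # circuit is too slow at any supply voltage
    return k * v_te / (k - 1.0)


def normalized_energy(f, overall_fanout, v_ref=2.5, v_te=0.5):
    """E / Eref at the supply voltage that just meets the speed constraint."""
    vdd = required_vdd(f, overall_fanout, v_ref, v_te)
    if vdd is None:
        return None
    return (vdd / v_ref) ** 2 * (2.0 + 2.0 * f + overall_fanout) \
        / (4.0 + overall_fanout)


for f in (1.0, 2.0, 3.0, 4.0, 5.0):
    print(f, required_vdd(f, 20.0), normalized_energy(f, 20.0))
```

The required VDD is lowest where 2 + f + F/f is smallest, that is at f = sqrt(F); the energy minimum sits slightly below that point because the total capacitance keeps growing with f.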
Simultaneous Transistor Sizing And VDD Scaling



Increasing the size of the second inverter increases the speed, permitting
the reduction of the supply voltage
 As the supply voltage is reduced the energy is reduced
Increasing size is effective for lowering energy (enhancing speed) until the
optimum tapering factor is reached at f = sqrt (F)
Further increases in device size (and f) degrade the speed and require an
increase in VDD to satisfy the performance criterion
[Figure: left, required VDD (V, 0 to 4) to maintain speed versus f (1 to 7) for F = 1, 2, 5, 10, and 20; right, normalized energy (0 to 1.5) versus f (1 to 7) for the same values of F]
Simultaneous Transistor Sizing And VDD Scaling - Conclusions


Device sizing combined with supply voltage scaling can be very effective to
lower energy consumption
 Particularly true for large fan-outs
 Note that for F = 1, the reference case is the fastest solution
Oversizing the transistors causes waste of energy (and valuable silicon real
estate)
 Can also degrade speed in a multi-stage circuit (remember optimum f)
[Figure: left, required VDD (V, 0 to 4) versus f (1 to 7) for F = 1, 2, 5, 10, and 20; right, normalized energy (0 to 1.5) versus f (1 to 7) for the same values of F]
Short Circuit Power Consumption
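[Figure: CMOS inverter with input Vin, output Vout, load capacitance CL, and short-circuit current Isc flowing from VDD to GND]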
Finite slope of the input signal causes a direct current
path between VDD and GND for a short period of time
during input switching when both the NMOS and PMOS
transistors are simultaneously conducting
Short Circuit Current
Esc = tsc VDD Ipeak P0→1
Psc = tsc VDD Ipeak f0→1
 tsc: determined by the duration and slope of the input signal
 Ipeak: determined by
– The saturation current of the PMOS and NMOS transistors, which depends on their sizes, process technology, supply voltage, and temperature
– Strong function of the ratio of the input and output slopes (a function of CL)
Impact of CL on Psc
Large capacitive load: Isc ≈ 0
 Output fall time significantly larger than input rise time.
Small capacitive load: Isc ≈ Imax
 Output fall time substantially smaller than the input rise time.
[Figure: two inverters with input Vin, output Vout, and load CL, one for each case]
Ipeak as a Function of CL
[Figure: Ipeak (A, x 10^-4, -0.5 to 2.5) versus time (sec, x 10^-10, 0 to 6) for CL = 20 fF, 100 fF, and 500 fF; 500 psec input slope]
When load capacitance is small, Ipeak is large.
Short circuit dissipation is minimized by increasing the rise/fall times of the output signal
Ptotal as a Function of Rise/Fall Times
[Figure: power (0 to 8), normalized wrt the step input (zero rise/fall time) power dissipation, versus tsin/tsout (0 to 4), the ratio of input and output rise and fall times, for VDD = 3.3 V, 2.5 V, and 1.5 V; W/Lp = 1.125 µm/0.25 µm, W/Ln = 0.375 µm/0.25 µm, CL = 30 fF]
When load capacitance is small (tsin/tsout > 2 for VDD > 2V) the power is dominated by Psc
For sharp input rise/fall, power is higher due to coupling
If VDD < VTn + |VTp|, Psc is eliminated since both devices are never on at the same time
Short-Circuit Power Summary

In a multi-stage circuit, short-circuit power is
minimized by matching the rise/fall times of the input
and output signals at each stage
 When the load capacitance is too small, the total power can be
dominated by short-circuit power
 For very large load capacitance, almost all the dynamic power is
consumed for charging/discharging the load capacitance
 Note that for large load the output transition would be slow, hence the
short-circuit power consumption in the following stage would be higher

Short-circuit current is reduced by lowering VDD
 For VDD < Vtn + |Vtp|, short-circuit current is completely eliminated
– Both devices are never on simultaneously

Since VT scaling lags VDD scaling (subthreshold
leakage constraints), short-circuit power dissipation is
becoming less important
Leakage (Static) Power Consumption
[Figure: inverter with Vout = 0 showing the sub-threshold leakage current, drain junction leakage, and gate leakage paths; static power = VDD Ileakage]
Sub-threshold leakage is the dominant component at high
temperature
Gate oxide leakage dominates at low temperature!
All increase exponentially with temperature!
Leakage as a Function of VT

Continued scaling of supply voltage and the subsequent scaling
of threshold voltage will make subthreshold leakage the
dominant component of power dissipation even in active circuits
[Figure: ID (A, log scale, 10^-12 to 10^-2) versus VGS (0 to 1 V) for VT = 0.4V and VT = 0.1V]
A 90 mV/decade VT roll-off, so each 90 mV increase in VT gives an order of magnitude reduction in leakage (but adversely affects performance)
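The 90 mV/decade figure translates directly into a leakage ratio between two threshold voltages. A small sketch, with a hypothetical function name:

```python
def leakage_ratio(vt_high, vt_low, slope_mv_per_decade=90.0):
    """Factor by which subthreshold leakage grows when VT drops from vt_high to vt_low."""
    decades = (vt_high - vt_low) * 1000.0 / slope_mv_per_decade
    return 10.0 ** decades


# The two threshold voltages shown in the plot
print(leakage_ratio(0.4, 0.1))
```

Lowering VT from 0.4 V to 0.1 V spans about 3.3 decades under this rule, so the leakage current grows by roughly three orders of magnitude.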
Exponential Increase in Leakage Currents
[Figure: Ileakage (nA/µm, log scale, 1 to 10000) versus temperature (30 to 110 C) for curves labeled 0.25, 0.18, 0.13, and 0.1]
From De, 1999
Static Power Consumption
[Figure: gate drawing a static current Istat from Vdd, with Vin = 5V, output Vout, and load CL]
Pstat = P(In=1) · Vdd · Istat
• Dominates over dynamic consumption
• Not a function of switching frequency
Wasted energy … should be avoided
Review: Energy & Power Equations
E = CL VDD^2 P0→1 + tsc VDD Ipeak P0→1 + VDD Ileakage
f0→1 = P0→1 * fclock
P = CL VDD^2 f0→1 + tsc VDD Ipeak f0→1 + VDD Ileakage
First term: dynamic power (~50-60% today and decreasing relatively)
Second term: short-circuit power (~10% today and decreasing relatively)
Third term: leakage power (~30-40% today and increasing)
Power and Energy Design Space
Constant Throughput/Latency
  Design Time
    Active energy: Logic Design, Reduced Vdd, Sizing, Multi-Vdd
    Leakage energy: + Multi-VT
  Non-active Modules
    Active energy: Clock Gating
    Leakage energy: Sleep Transistors, Multi-Vdd, Variable VT
Variable Throughput/Latency
  Run Time
    Active energy: DFS, DVS (Dynamic Freq, Voltage Scaling)
    Leakage energy: + Variable VT