Introduction to CMOS VLSI Design
Sequential Circuits
Outline
• Sequencing
• Sequencing Element Design
• Max and Min-Delay
• Clock Skew
• Time Borrowing
• Two-Phase Clocking
CMOS VLSI Design
Sequencing
• Combinational logic
– Output depends on current inputs
• Sequential logic
– Output depends on current and previous inputs
– Requires separating previous, current, and future
– Called state or tokens
– Ex: FSM, pipeline
[Figure: Finite State Machine (in, out, CL, clk) and Pipeline (CL blocks separated by clk-driven sequencing elements)]
Sequencing Cont.
• If tokens moved through the pipeline at constant speed, no sequencing elements would be necessary
• Ex: fiber-optic cable
– Light pulses (tokens) are sent down the cable
– Next pulse is sent before the first reaches the end of the cable
– No need for hardware to separate pulses
– But dispersion sets the minimum time between pulses
• This is called wave pipelining in circuits
• In most circuits, dispersion is high
– Delay fast tokens so they don't catch slow ones
Sequencing Overhead
• Use flip-flops to delay fast tokens so they move through exactly one stage each cycle
• Inevitably adds some delay to the slow tokens
• Makes the circuit slower than just the logic delay
– Called sequencing overhead
• Some people call this clocking overhead
– But it applies to asynchronous circuits too
– Inevitable side effect of maintaining sequence
Sequencing Elements
• Latch: level sensitive
– a.k.a. transparent latch, D latch
• Flip-flop: edge triggered
– a.k.a. master-slave flip-flop, D flip-flop, D register
• Timing diagrams
– Transparent
– Opaque
– Edge-trigger
[Figure: latch and flop symbols (D, clk, Q) and a timing diagram of clk, D, Q (latch), and Q (flop)]
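The difference between the two elements can be sketched with a small behavioral model. This is only an illustration, not a circuit-accurate simulation: the function name `simulate`, the sampled 0/1 waveforms, the initial state of 0, and the choice that the flop captures the value of D at the sample where clk rises are all assumptions made for this example.

```python
def simulate(clk, d):
    """Behavioral sketch of a D latch and a D flip-flop on sampled waveforms.

    clk and d are equal-length lists of 0/1 samples (an assumed encoding).
    Returns the Q trace of the latch and the Q trace of the flop.
    """
    q_latch, q_flop = [], []
    latch_q = flop_q = 0  # assumed initial state
    prev_clk = 0          # assume the clock was low before the first sample
    for c, x in zip(clk, d):
        if c == 1:
            latch_q = x   # transparent: Q follows D while clk is high
        # when clk is low the latch is opaque and holds its value
        if prev_clk == 0 and c == 1:
            flop_q = x    # edge-triggered: copy D only on the rising edge
        q_latch.append(latch_q)
        q_flop.append(flop_q)
        prev_clk = c
    return q_latch, q_flop


clk = [0, 1, 1, 0, 0, 1, 1, 0]
d = [1, 1, 0, 0, 1, 0, 1, 0]
q_latch, q_flop = simulate(clk, d)
print(q_latch)  # [0, 1, 0, 0, 0, 0, 1, 1]
print(q_flop)   # [0, 1, 1, 1, 1, 0, 0, 0]
```

The latch output tracks every change of D while clk is high, whereas the flop output changes only at the two rising edges.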
Latch Design
• Pass Transistor Latch
• Pros
+ Tiny
+ Low clock load
• Cons
– Vt drop
– Nonrestoring
– Backdriving
– Output noise sensitivity
– Dynamic
– Diffusion input
• Used in the 1970s
[Figure: pass transistor latch schematic (D, Q)]
Latch Design
• Transmission gate
+ No Vt drop
– Requires inverted clock
[Figure: transmission gate latch schematic (D, Q)]
Latch Design
• Inverting buffer
+ Restoring
+ No backdriving
+ Fixes either
    • Output noise sensitivity
    • Or diffusion input
– Inverted output
[Figure: inverting-buffer latch schematics (D, X, Q)]
Latch Design
• Tristate feedback
+ Static
– Backdriving risk
• Static latches are now essential
[Figure: latch schematic with tristate feedback (D, X, Q)]
Latch Design
• Buffered input
+ Fixes diffusion input
+ Noninverting
[Figure: latch schematic with buffered input (D, X, Q)]
Latch Design
• Buffered output
+ No backdriving
• Widely used in standard cells
+ Very robust (most important)
– Rather large
– Rather slow (1.5 – 2 FO4 delays)
– High clock loading
[Figure: latch schematic with buffered output (D, X, Q)]
Latch Design
• Datapath latch
+ Smaller, faster
– Unbuffered input
[Figure: datapath latch schematic (D, X, Q)]
Flip-Flop Design
• A flip-flop is built as a pair of back-to-back latches
[Figure: flip-flop schematics built from two back-to-back latches (D, X, Q)]
Enable
• Enable: ignore clock when en = 0
– Mux: increases latch D-to-Q delay
– Clock gating: increases en setup time and skew
[Figure: symbol, multiplexer design, and clock gating design for a latch and a flop with enable (D, Q, en)]
Reset
• Force output low when reset is asserted
• Synchronous vs. asynchronous
[Figure: symbols for a latch and a flop with reset, and schematics of the synchronous reset and asynchronous reset versions (D, Q, reset)]
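The two reset styles can be contrasted with a next-state sketch of a flip-flop. The function names, the 0/1 encoding, and the `rising_edge` flag are hypothetical and only serve this illustration.

```python
def flop_sync_reset(q, d, reset, rising_edge):
    """Synchronous reset: reset is only observed on a rising clock edge."""
    if rising_edge:
        return 0 if reset else d
    return q  # between edges the flop holds, even if reset is asserted


def flop_async_reset(q, d, reset, rising_edge):
    """Asynchronous reset: reset forces Q low regardless of the clock."""
    if reset:
        return 0
    if rising_edge:
        return d
    return q


# Reset asserted between clock edges while Q is currently 1:
print(flop_sync_reset(q=1, d=1, reset=1, rising_edge=False))   # 1
print(flop_async_reset(q=1, d=1, reset=1, rising_edge=False))  # 0
```

With reset asserted away from a clock edge, the synchronous version keeps its old value until the next edge, while the asynchronous version clears immediately.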
Set / Reset
• Set forces output high when enabled
• Flip-flop with asynchronous set and reset
[Figure: schematic of a flip-flop with asynchronous set and reset (D, Q, set, reset)]
Sequencing Methods
• Flip-flops
• 2-Phase Latches
• Pulsed Latches
[Figure: the three sequencing methods with their clock waveforms. Flip-Flops: clk with period Tc and combinational logic between flops. 2-Phase Transparent Latches: φ1 and φ2 with tnonoverlap and Tc/2, with a half-cycle of combinational logic between latches. Pulsed Latches: φp with pulse width tpw and combinational logic between latches.]
Review Timing Definitions
Timing Diagrams
Contamination and Propagation Delays
tpd      Logic Prop. Delay
tcd      Logic Cont. Delay
tpcq     Latch/Flop Clk-Q Prop. Delay
tccq     Latch/Flop Clk-Q Cont. Delay
tpdq     Latch D-Q Prop. Delay
tcdq     Latch D-Q Cont. Delay
tsetup   Latch/Flop Setup Time
thold    Latch/Flop Hold Time
[Figure: timing diagrams for combinational logic (A to Y: tpd, tcd), a flop (clk, D, Q: tsetup, thold, tpcq, tccq), and a latch (clk, D, Q: tsetup, thold, tpcq, tccq, tpdq, tcdq)]
Max-Delay: Flip-Flops
$t_{pd} \le T_c - (t_{setup} + t_{pcq})$
The term $t_{setup} + t_{pcq}$ is the sequencing overhead.
1. The rising edge of clk triggers F1
2. Data appears at Q1 after the clk-to-Q delay tpcq
3. Data propagates through the combinational logic to D2 in tpd
4. D2 must arrive a setup time tsetup before the next rising edge of clk
tpd is the time allowed for the combinational logic; design the CL block to satisfy this constraint
[Figure: F1 drives combinational logic into F2, both clocked by clk; timing diagram of clk, Q1, and D2 showing tpcq, tpd, and tsetup within Tc]
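As a quick numeric check of the flip-flop max-delay constraint, the sketch below computes the logic delay budget. The function name and the picosecond values are hypothetical, chosen only for illustration.

```python
def max_logic_delay_flop(tc, tsetup, tpcq):
    """Largest t_pd allowed by t_pd <= Tc - (t_setup + t_pcq)."""
    return tc - (tsetup + tpcq)


# Hypothetical values in picoseconds
print(max_logic_delay_flop(tc=1000, tsetup=65, tpcq=50))  # 885
```

Here 115 ps of the 1000 ps cycle is sequencing overhead, leaving 885 ps for the combinational logic.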
Max Delay: 2-Phase Latches
$t_{pd} = t_{pd1} + t_{pd2} \le T_c - 2t_{pdq}$
The term $2t_{pdq}$ is the sequencing overhead.
Assume that tpdq1 = tpdq2 = tpdq, the D-to-Q propagation delays of L1 (D1 to Q1) and L2 (D2 to Q2)
[Figure: latches L1 (φ1), L2 (φ2), and L3 (φ1) separated by Combinational Logic 1 and Combinational Logic 2; timing diagram of φ1, φ2, D1, Q1, D2, Q2, and D3 showing tpdq1, tpd1, tpdq2, and tpd2 within Tc]
Max Delay: Pulsed Latches
$t_{pd} \le T_c - \max(t_{pdq},\ t_{pcq} + t_{setup} - t_{pw})$
The max term is the sequencing overhead.
tpdq: D-to-Q propagation delay
tcdq: D-to-Q contamination delay
tpcq: clk-to-Q propagation delay
(a) tpw > tsetup: if the pulse is wide enough, the max-delay constraint is similar to that of two-phase latches, except that only one latch is in the critical path
$t_{pd} \le T_c - t_{pdq}$
(b) tpw < tsetup: if the pulse is narrower than the setup time, data must set up before the pulse rises
$t_{pd} \le T_c + t_{pw} - t_{pcq} - t_{setup}$
[Figure: L1 drives combinational logic into L2, both clocked by φp; timing diagrams of φp, D1, Q1, and D2 for cases (a) and (b), showing tpdq, tpcq, tpd, tpw, and tsetup within Tc]
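The two cases of the pulsed-latch constraint can be tried numerically. This is a sketch with a hypothetical function name and hypothetical picosecond values.

```python
def max_logic_delay_pulsed(tc, tpdq, tpcq, tsetup, tpw):
    """Largest t_pd allowed by t_pd <= Tc - max(t_pdq, t_pcq + t_setup - t_pw)."""
    overhead = max(tpdq, tpcq + tsetup - tpw)
    return tc - overhead


# Wide pulse: the D-to-Q delay dominates the overhead
print(max_logic_delay_pulsed(tc=1000, tpdq=40, tpcq=50, tsetup=25, tpw=150))  # 960
# Narrow pulse: data must set up before the pulse, so the second term dominates
print(max_logic_delay_pulsed(tc=1000, tpdq=40, tpcq=50, tsetup=25, tpw=10))   # 935
```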
Min-Delay: Flip-Flops
tcd is the minimum logic contamination delay of the CL block
1. The rising edge of clk triggers F1
2. Q1 begins to change after the clk-to-Q contamination delay tccq
3. D2 begins to change after the CL contamination delay tcd
4. D2 must not change for at least thold after the rising edge of clk; if D2 changes earlier, it corrupts F2
So $t_{ccq} + t_{cd} \ge t_{hold}$, which gives
$t_{cd} \ge t_{hold} - t_{ccq}$
If this is violated (thold > tccq + tcd), data can incorrectly propagate through two successive flip-flops, F1 and F2, on one clock edge, resulting in system failure
[Figure: F1 drives CL into F2, both clocked by clk; timing diagram of clk, Q1, and D2 showing tccq, tcd, and thold]
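A hold check for a flop-to-flop path can be sketched as a slack calculation. The function name and the picosecond values are hypothetical.

```python
def hold_margin_flop(tcd, tccq, thold):
    """Slack of the hold constraint t_ccq + t_cd >= t_hold.

    A negative result means the path violates the hold time.
    """
    return tccq + tcd - thold


# Back-to-back flops with no logic between them (t_cd = 0), values in ps
print(hold_margin_flop(tcd=0, tccq=35, thold=30))  # 5
print(hold_margin_flop(tcd=0, tccq=35, thold=60))  # -25
```

In the first case the flops can be placed back to back safely; in the second, at least 25 ps of contamination delay must be added to the path.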
Min-Delay: 2-Phase Latches
$t_{cd1},\ t_{cd2} \ge t_{hold} - t_{ccq} - t_{nonoverlap}$
1. Data passes through L1 from the rising edge of φ1
2. Data must not reach L2 until a hold time after the previous falling edge of φ2, i.e., until L2 has become safely opaque
tcd must be large enough for correct operation, that is, to meet the thold requirement of L2
Hold time is reduced by the nonoverlap
Paradox: hold applies twice each cycle, vs. only once for flops. But a flop is made of two latches!
The contamination delay constraint applies to each phase of logic for latch-based systems, but to the entire cycle of logic for flip-flops.
[Figure: L1 (φ1) drives CL into L2 (φ2); timing diagram of φ1, φ2, Q1, and D2 showing tnonoverlap, tccq, tcd, and thold]
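The effect of the nonoverlap on the required contamination delay can be seen with a short calculation. The function name and the picosecond values below are hypothetical.

```python
def min_cd_two_phase(thold, tccq, tnonoverlap):
    """Smallest t_cd allowed by t_cd >= t_hold - t_ccq - t_nonoverlap.

    A result of zero or less means any logic, even a wire, meets hold.
    """
    return thold - tccq - tnonoverlap


# Values in ps: without nonoverlap some logic delay is required
print(min_cd_two_phase(thold=60, tccq=35, tnonoverlap=0))   # 25
# With 50 ps of nonoverlap the hold requirement is met for any t_cd
print(min_cd_two_phase(thold=60, tccq=35, tnonoverlap=50))  # -25
```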
Min-Delay: Pulsed Latches
$t_{cd} \ge t_{hold} - t_{ccq} + t_{pw}$
Hold time is increased by the pulse width
Equivalently, $t_{ccq} + t_{cd} \ge t_{pw} + t_{hold}$
[Figure: L1 drives CL into L2, both clocked by φp; timing diagram of φp, Q1, and D2 showing tpw, thold, tccq, and tcd]
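The same kind of calculation shows how the pulse width raises the minimum contamination delay. The function name and the picosecond values are hypothetical.

```python
def min_cd_pulsed(thold, tccq, tpw):
    """Smallest t_cd allowed by t_cd >= t_hold - t_ccq + t_pw."""
    return thold - tccq + tpw


# Values in ps: a wider pulse demands more contamination delay
print(min_cd_pulsed(thold=30, tccq=35, tpw=150))  # 145
print(min_cd_pulsed(thold=30, tccq=35, tpw=50))   # 45
```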
Time Borrowing
• In a flop-based system:
– Data launches on one rising edge
– Must set up before the next rising edge
– If it arrives late, the system fails
– If it arrives early, time is wasted
– Flops have hard edges
• In a latch-based system:
– Data can pass through the latch while it is transparent
– A long cycle of logic can borrow time into the next cycle
– As long as each loop completes in one cycle
Time Borrowing Example
[Figure: (a) and (b), latch-based paths clocked by φ1 and φ2 with combinational logic between the latches, illustrating borrowing time across a half-cycle boundary and across a pipeline stage boundary]
Loops may borrow time internally but must complete within the cycle
How Much Borrowing?
2-Phase Latches
$t_{borrow} \le \frac{T_c}{2} - (t_{setup} + t_{nonoverlap})$
Data can depart the first latch on the rising edge of the clock and does not have to set up until the falling edge of the clock on the receiving latch
Pulsed Latches
$t_{borrow} \le t_{pw} - t_{setup}$
[Figure: L1 (φ1) drives Combinational Logic 1 into L2 (φ2); timing diagram of φ1, φ2, and D2 showing tnonoverlap, Tc, Tc/2, the nominal half-cycle 1 delay, tborrow, and tsetup]
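The two borrowing limits can be compared with a short sketch. Function names and picosecond values are hypothetical.

```python
def max_borrow_two_phase(tc, tsetup, tnonoverlap):
    """Upper bound t_borrow <= Tc/2 - (t_setup + t_nonoverlap)."""
    return tc / 2 - (tsetup + tnonoverlap)


def max_borrow_pulsed(tpw, tsetup):
    """Upper bound t_borrow <= t_pw - t_setup."""
    return tpw - tsetup


# Values in ps
print(max_borrow_two_phase(tc=1000, tsetup=25, tnonoverlap=0))  # 475.0
print(max_borrow_pulsed(tpw=150, tsetup=25))                    # 125
```

With these numbers a two-phase latch can borrow almost half a cycle, while a pulsed latch can borrow only as much as its pulse width allows.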
Clock Skew
• We have assumed zero clock skew
• Clocks really have uncertainty in arrival time
– Decreases maximum propagation delay
– Increases minimum contamination delay
– Decreases time borrowing
Skew: Flip-Flops
$t_{pd} \le T_c - (t_{pcq} + t_{setup} + t_{skew})$
tpd is the propagation delay of CL; the term in parentheses is the sequencing overhead
$t_{cd} \ge t_{hold} - t_{ccq} + t_{skew}$
tcd is the contamination delay of CL; equivalently, $t_{ccq} + t_{cd} \ge t_{skew} + t_{hold}$
If the launching flop receives its clock early and the receiving flop receives its clock late, clock skew effectively increases the hold time
[Figure: F1 drives CL into F2, both clocked by clk; timing diagrams of clk, Q1, and D2 showing tskew with tpcq, tpd, and tsetup within Tc, and tskew with tccq, tcd, and thold]
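The sketch below applies both skewed flip-flop constraints to one set of numbers. Function names and picosecond values are hypothetical.

```python
def max_pd_flop_skew(tc, tpcq, tsetup, tskew):
    """Largest t_pd allowed by t_pd <= Tc - (t_pcq + t_setup + t_skew)."""
    return tc - (tpcq + tsetup + tskew)


def min_cd_flop_skew(thold, tccq, tskew):
    """Smallest t_cd allowed by t_cd >= t_hold - t_ccq + t_skew."""
    return thold - tccq + tskew


# Values in ps: 50 ps of skew costs 50 ps on both constraints
print(max_pd_flop_skew(tc=1000, tpcq=50, tsetup=65, tskew=50))  # 835
print(min_cd_flop_skew(thold=30, tccq=35, tskew=50))            # 45
```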
Skew: Latches
2-Phase Latches
$t_{pd} \le T_c - 2t_{pdq}$
The term $2t_{pdq}$ is the sequencing overhead.
$t_{cd1},\ t_{cd2} \ge t_{hold} - t_{ccq} - t_{nonoverlap} + t_{skew}$
$t_{borrow} \le \frac{T_c}{2} - (t_{setup} + t_{nonoverlap} + t_{skew})$
In latch-based design, clock skew does not degrade performance (the max-delay constraint has no skew term): data arrives at the latches while they are transparent, even when the clocks are skewed.
Latch-based systems are skew-tolerant.
[Figure: latches L1 (φ1), L2 (φ2), and L3 (φ1) separated by Combinational Logic 1 and Combinational Logic 2 (D1, Q1, D2, Q2, D3, Q3)]
Skew: Pulsed Latches
$t_{pd} \le T_c - \max(t_{pdq},\ t_{pcq} + t_{setup} - t_{pw} + t_{skew})$
The max term is the sequencing overhead.
$t_{cd} \ge t_{hold} + t_{pw} - t_{ccq} + t_{skew}$
$t_{borrow} \le t_{pw} - (t_{setup} + t_{skew})$
If the pulse is wide enough, skew does not increase the overhead
If the pulse is narrow, skew can degrade performance
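Whether skew costs performance depends on which term of the max dominates, which a short calculation makes visible. The function name and the picosecond values are hypothetical.

```python
def max_pd_pulsed_skew(tc, tpdq, tpcq, tsetup, tpw, tskew):
    """Largest t_pd allowed by
    t_pd <= Tc - max(t_pdq, t_pcq + t_setup - t_pw + t_skew)."""
    return tc - max(tpdq, tpcq + tsetup - tpw + tskew)


common = dict(tc=1000, tpdq=40, tpcq=50, tsetup=25)

# Wide pulse (150 ps): the budget is the same with and without 50 ps of skew
print(max_pd_pulsed_skew(tpw=150, tskew=0, **common))   # 960
print(max_pd_pulsed_skew(tpw=150, tskew=50, **common))  # 960
# Narrow pulse (10 ps): the same skew now reduces the budget
print(max_pd_pulsed_skew(tpw=10, tskew=0, **common))    # 935
print(max_pd_pulsed_skew(tpw=10, tskew=50, **common))   # 885
```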
Two-Phase Clocking
• If setup times are violated, reduce clock speed
• If hold times are violated, chip fails at any speed
• An easy way to guarantee hold times is to use 2-phase latches with big nonoverlap times
• Call these clocks φ1, φ2 (ph1, ph2)
Safe Flip-Flop
• In industry, use a better timing analyzer
– Add buffers to slow signals if hold time is at risk
• The PowerPC 603 datapath used this flip-flop
[Figure: flip-flop schematic (D, X, Q)]
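Adding buffers to fix a hold risk can be sketched as a simple count, using the skewed flip-flop hold constraint $t_{ccq} + t_{cd} \ge t_{hold} + t_{skew}$. The function name, the per-buffer contamination delay `buf_cd`, and all picosecond values are hypothetical; a real timing analyzer works from extracted path delays rather than a single number.

```python
def hold_buffers_needed(tcd, tccq, thold, tskew, buf_cd):
    """Number of buffers (each adding buf_cd of contamination delay) needed
    so that t_ccq + t_cd >= t_hold + t_skew. Integer delays are assumed."""
    shortfall = thold + tskew - tccq - tcd
    if shortfall <= 0:
        return 0  # the path already meets hold
    return -(-shortfall // buf_cd)  # ceiling division on integers


# Values in ps
print(hold_buffers_needed(tcd=10, tccq=35, thold=30, tskew=50, buf_cd=20))  # 2
print(hold_buffers_needed(tcd=60, tccq=35, thold=30, tskew=50, buf_cd=20))  # 0
```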
Differential Flip-flops
• Accept true and complementary inputs
• Produce true and complementary outputs
• Work well for low-swing inputs such as register file bitlines and low-swing busses
• When φ is low, X and X' are precharged
• When φ is high, either X or X' is pulled down; the cross-coupled pMOS transistors work as a keeper
• Cross-coupled NAND gates work as an SR latch, capturing and holding the data
Differential Flip-flops
Replace the cross-coupled NAND gates with a faster latch
Summary
• Flip-Flops:
– Very easy to use, supported by all tools
• 2-Phase Transparent Latches:
– Lots of skew tolerance and time borrowing
• Pulsed Latches:
– Fast, lowest sequencing overhead, susceptible to min-delay problems