Sequential Circuits
Outline
Floorplanning
Sequencing
Sequencing Element Design
Max and Min-Delay
Clock Skew
Time Borrowing
Two-Phase Clocking
Project Strategy
Proposal
– Specifies inputs, outputs, relation between them
Floorplan
– Begins with block diagram
– Annotate dimensions and location of each block
– Requires detailed paper design
Schematic
– Make paper design simulate correctly
Layout
– Physical design, DRC, NCC, ERC
Floorplan
How do you estimate block areas?
– Begin with block diagram
– Each block has
• Inputs
• Outputs
• Function (draw schematic)
• Type: array, datapath, random logic
Estimation depends on type of logic
MIPS Floorplan
[Figure: MIPS floorplan, dimensions in λ]
– Die: 5000 x 5000, 10 I/O pads per side
– Core: 3500 x 3500
– mips: 2700 x 1690 (4.6 Mλ²)
– control: 1500 x 400 (0.6 Mλ²)
– alucontrol: 200 x 100 (20 kλ²)
– zipper: 2700 x 250
– datapath: 2700 x 1050 (2.8 Mλ²)
– bitslice: 2700 x 100
– wiring channel: 30 tracks = 240
Area Estimation
Arrays:
– Layout basic cell
– Calculate core area from # of cells
– Allow area for decoders, column circuitry
Datapaths
– Sketch slice plan
– Count area of cells from cell library
– Ensure wiring is possible
Random logic
– Compare complexity to a design you have done
MIPS Slice Plan
[Figure: datapath bitslice plan with cell widths annotated in λ]
– Buses: srcA, srcB, aluresult, aluout, pc, immediate, memdata, writedata, adr, bitlines
– Blocks: ALU, aluout, PC, srcA, srcB, MDR, IR3...0, readmux, writemux, adrmux, register file ramslice
– Cells: mux4, mux2, zerodetect, fulladder, or2, and2, inv, flop, srampullup, dualsram, dualsrambit0, writedriver
Typical Layout Densities
Typical numbers of high-quality layout
Derate by 2 for class projects to allow routing and some sloppy layout.
Allocate space for big wiring channels
Element                          Area
Random logic (2 metal layers)    1000 – 1500 λ² / transistor
Datapath                         250 – 750 λ² / transistor, or 6WL + 360 λ² / transistor
SRAM                             1000 λ² / bit
DRAM                             100 λ² / bit
ROM                              100 λ² / bit
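A rough sketch of how the densities listed above could be turned into a block-area estimate. The function name, the dictionary layout, and the choice of midpoint densities for random logic and datapath are illustrative assumptions, not part of the original slides; the derating factor of 2 follows the class-project guideline.

```python
# Densities in lambda^2. Midpoints of the quoted ranges are an assumption
# made for illustration; pick values that match your own cell library.
DENSITY_PER_TRANSISTOR = {"random_logic": 1250.0, "datapath": 500.0}
DENSITY_PER_BIT = {"sram": 1000.0, "dram": 100.0, "rom": 100.0}


def estimate_area(kind, count, derate=2.0):
    """Estimate block area in lambda^2.

    kind   -- one of the keys in the two density tables above
    count  -- number of transistors (logic) or bits (memories)
    derate -- multiplier for routing and sloppy layout (2 for class projects)
    """
    if kind in DENSITY_PER_TRANSISTOR:
        density = DENSITY_PER_TRANSISTOR[kind]
    elif kind in DENSITY_PER_BIT:
        density = DENSITY_PER_BIT[kind]
    else:
        raise ValueError(f"unknown block type: {kind}")
    return density * count * derate


if __name__ == "__main__":
    # A hypothetical 1 Kbit SRAM array, derated by 2.
    print(estimate_area("sram", 1024))
```

The estimate covers only the cells themselves; wiring channels, decoders, and column circuitry still need their own allowance.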
Sequencing
Combinational logic
– output depends on current inputs
Sequential logic
– output depends on current and previous inputs
– Requires separating previous, current, future
– Called state or tokens
– Ex: FSM, pipeline
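[Figure: a finite state machine (combinational logic CL with a clocked register feeding state back) and a pipeline (CL blocks separated by clocked registers between in and out)]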
Sequencing Cont.
If tokens moved through pipeline at constant speed,
no sequencing elements would be necessary
Ex: fiber-optic cable
– Light pulses (tokens) are sent down cable
– Next pulse sent before first reaches end of cable
– No need for hardware to separate pulses
– But dispersion sets min time between pulses
This is called wave pipelining in circuits
In most circuits, dispersion is high
– Delay fast tokens so they don’t catch slow ones.
Sequencing Overhead
Use flip-flops to delay fast tokens so they move
through exactly one stage each cycle.
Inevitably adds some delay to the slow tokens
Makes circuit slower than just the logic delay
– Called sequencing overhead
Some people call this clocking overhead
– But it applies to asynchronous circuits too
– Inevitable side effect of maintaining sequence
Sequencing Elements
Latch: Level sensitive
– a.k.a. transparent latch, D latch
Flip-flop: edge triggered
– a.k.a. master-slave flip-flop, D flip-flop, D register
Timing Diagrams
– Transparent
– Opaque
– Edge-trigger
[Figure: latch and flop symbols, with timing diagrams of clk, D, Q (latch), and Q (flop)]
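The difference between level-sensitive and edge-triggered behavior can be illustrated with a small behavioral model. This is a sketch under stated assumptions: the latch is transparent while clk is high, the flop samples on the rising edge, both start at 0, and all delays, setup, and hold times are ignored. The function name `simulate` is hypothetical.

```python
def simulate(clk, d):
    """Return (q_latch, q_flop) traces for sampled clk and d waveforms.

    clk, d -- equal-length sequences of 0/1 samples
    """
    q_latch, q_flop = [], []
    latch_state = 0   # assumed initial state
    flop_state = 0    # assumed initial state
    prev_clk = 0
    for c, x in zip(clk, d):
        if c == 1:
            # Transparent: the latch output follows D while clk is high.
            latch_state = x
        # Opaque when clk is low: latch_state simply holds its value.
        if prev_clk == 0 and c == 1:
            # Edge-triggered: the flop copies D only on the rising edge.
            flop_state = x
        q_latch.append(latch_state)
        q_flop.append(flop_state)
        prev_clk = c
    return q_latch, q_flop


if __name__ == "__main__":
    clk = [0, 1, 1, 0, 0, 1, 1, 0]
    d   = [1, 1, 0, 0, 1, 0, 1, 1]
    q_latch, q_flop = simulate(clk, d)
    print(q_latch)  # [0, 1, 0, 0, 0, 0, 1, 1]
    print(q_flop)   # [0, 1, 1, 1, 1, 0, 0, 0]
```

In the example, D changes while clk is high at the third and seventh samples; the latch output follows those changes, while the flop output does not.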
Latch Design
Pass Transistor Latch
Pros
+ Tiny
+ Low clock load
Cons
– Vt drop
– Nonrestoring
– Backdriving
– Output noise sensitivity
– Dynamic
– Diffusion input
Used in the 1970s
Latch Design
Transmission gate
+ No Vt drop
– Requires inverted clock
Latch Design
Inverting buffer
+ Restoring
+ No backdriving
+ Fixes either
• Output noise sensitivity
• Or diffusion input
– Inverted output
Latch Design
Tristate feedback
+ Static
– Backdriving risk
Static latches are now essential
Latch Design
Buffered input
+ Fixes diffusion input
+ Noninverting
Latch Design
Buffered output
+ No backdriving
Widely used in standard cells
+ Very robust (most important)
– Rather large
– Rather slow (1.5 – 2 FO4 delays)
– High clock loading
Latch Design
Datapath latch
+ Smaller, faster
– Unbuffered input
Flip-Flop Design
Flip-flop is built as pair of back-to-back latches
Enable
Enable: ignore clock when en = 0
– Mux: increases latch D-Q delay
– Clock gating: increases en setup time and skew
[Figure: symbol, multiplexer design, and clock gating design for an enabled latch and an enabled flop]
Reset
Force output low when reset asserted
Synchronous vs. asynchronous
[Figure: symbols and circuits for latches and flops with synchronous reset and with asynchronous reset]
Set / Reset
Set forces output high when enabled
Flip-flop with asynchronous set and reset
Sequencing Methods
– Flip-flops
– 2-phase transparent latches
– Pulsed latches
[Figure: pipeline structure and clock waveforms for each method, annotated with Tc, Tc/2, tnonoverlap, and tpw. Flip-flops are clocked by clk; 2-phase latches by φ1 and φ2, with combinational logic split into half-cycles; pulsed latches by φp.]
Timing Diagrams
Contamination and Propagation Delays
Symbol    Meaning
tpd       Logic Prop. Delay
tcd       Logic Cont. Delay
tpcq      Latch/Flop Clk-Q Prop Delay
tccq      Latch/Flop Clk-Q Cont. Delay
tpdq      Latch D-Q Prop Delay
tcdq      Latch D-Q Cont. Delay
tsetup    Latch/Flop Setup Time
thold     Latch/Flop Hold Time
[Figure: timing diagrams for combinational logic (A to Y), a flop, and a latch, showing the delays above]
Max-Delay: Flip-Flops
$t_{pd} \le T_c - \underbrace{(t_{setup} + t_{pcq})}_{\text{sequencing overhead}}$
[Figure: flops F1 and F2 around combinational logic, with a timing diagram of clk, Q1, and D2 showing tpcq, tpd, and tsetup within Tc]
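As a quick numerical check of the flip-flop max-delay constraint (logic delay must fit in the cycle time minus setup time and clock-to-Q propagation delay), here is a minimal sketch. The function names and the picosecond values in the example are hypothetical, chosen only for illustration.

```python
def max_logic_delay_flop(t_c, t_setup, t_pcq):
    """Largest logic propagation delay t_pd that still meets setup.

    Implements t_pd <= T_c - (t_setup + t_pcq).
    """
    return t_c - (t_setup + t_pcq)


def min_cycle_time_flop(t_pd, t_setup, t_pcq):
    """Smallest cycle time T_c for a given logic delay (same inequality,
    solved for T_c)."""
    return t_pd + t_setup + t_pcq


if __name__ == "__main__":
    # Hypothetical values in picoseconds.
    print(max_logic_delay_flop(t_c=1000, t_setup=65, t_pcq=50))  # 885
    print(min_cycle_time_flop(t_pd=885, t_setup=65, t_pcq=50))   # 1000
```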
Max Delay: 2-Phase Latches
$t_{pd} = t_{pd1} + t_{pd2} \le T_c - \underbrace{2t_{pdq}}_{\text{sequencing overhead}}$
[Figure: latches L1, L2, L3 clocked by φ1, φ2, φ1 around Combinational Logic 1 and 2, with a timing diagram of D1, Q1, D2, Q2, D3 showing tpdq1, tpd1, tpdq2, tpd2 within Tc]
Max Delay: Pulsed Latches
$t_{pd} \le T_c - \underbrace{\max(t_{pdq},\; t_{pcq} + t_{setup} - t_{pw})}_{\text{sequencing overhead}}$
[Figure: pulsed latches L1 and L2 clocked by φp around combinational logic, with timing diagrams of D1, Q1, D2 for (a) tpw > tsetup and (b) tpw < tsetup]
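The pulsed-latch constraint has two regimes: with a wide pulse the data passes through a transparent latch and the overhead is the D-to-Q delay, while with a narrow pulse the data must set up before the pulse ends. The sketch below evaluates the max() term for both cases; function names and the picosecond values are hypothetical.

```python
def pulsed_latch_overhead(t_pdq, t_pcq, t_setup, t_pw):
    """Sequencing overhead max(t_pdq, t_pcq + t_setup - t_pw)."""
    return max(t_pdq, t_pcq + t_setup - t_pw)


def max_logic_delay_pulsed(t_c, t_pdq, t_pcq, t_setup, t_pw):
    """Largest t_pd allowed: T_c minus the sequencing overhead."""
    return t_c - pulsed_latch_overhead(t_pdq, t_pcq, t_setup, t_pw)


if __name__ == "__main__":
    # Hypothetical values in picoseconds.
    # Wide pulse (t_pw > t_setup): overhead is the D-to-Q delay.
    print(pulsed_latch_overhead(t_pdq=40, t_pcq=50, t_setup=25, t_pw=150))  # 40
    # Narrow pulse (t_pw < t_setup): the latch behaves more like a flop.
    print(pulsed_latch_overhead(t_pdq=40, t_pcq=50, t_setup=25, t_pw=20))   # 55
```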
Min-Delay: Flip-Flops
$t_{cd} \ge t_{hold} - t_{ccq}$
[Figure: flops F1 and F2 around combinational logic CL, with a timing diagram of clk, Q1, and D2 showing tccq, tcd, and thold]
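A hold check for a flop-to-flop path follows directly from the min-delay constraint: the logic contamination delay must be at least the hold time minus the clock-to-Q contamination delay. The sketch below is illustrative; the function name and the picosecond values are assumptions.

```python
def hold_ok_flop(t_cd, t_hold, t_ccq):
    """True if t_cd >= t_hold - t_ccq, i.e. the hold time is met."""
    return t_cd >= t_hold - t_ccq


if __name__ == "__main__":
    # Hypothetical values in picoseconds. With t_ccq > t_hold, even
    # back-to-back flops with no logic between them (t_cd = 0) are safe.
    print(hold_ok_flop(t_cd=0, t_hold=30, t_ccq=35))  # True
    # If the hold time exceeds t_ccq, some logic delay is required.
    print(hold_ok_flop(t_cd=0, t_hold=40, t_ccq=35))  # False
```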
Min-Delay: 2-Phase Latches
$t_{cd1},\, t_{cd2} \ge t_{hold} - t_{ccq} - t_{nonoverlap}$
Hold time reduced by nonoverlap
Paradox: hold applies twice each cycle, vs. only once for flops.
But a flop is made of two latches!
[Figure: latches L1 and L2 clocked by φ1 and φ2 around CL, with a timing diagram of φ1, φ2, Q1, and D2 showing tnonoverlap, tccq, tcd, and thold]
Min-Delay: Pulsed Latches
$t_{cd} \ge t_{hold} - t_{ccq} + t_{pw}$
Hold time increased by pulse width
[Figure: pulsed latches L1 and L2 clocked by φp around CL, with a timing diagram of φp, Q1, and D2 showing tpw, thold, tccq, and tcd]
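The two latch-based min-delay constraints differ only in one term: nonoverlap relaxes the requirement for 2-phase latches, while pulse width tightens it for pulsed latches. A small sketch makes the contrast concrete; function names and the picosecond values are hypothetical.

```python
def min_cd_two_phase(t_hold, t_ccq, t_nonoverlap):
    """Required contamination delay per half-cycle:
    t_cd >= t_hold - t_ccq - t_nonoverlap."""
    return t_hold - t_ccq - t_nonoverlap


def min_cd_pulsed(t_hold, t_ccq, t_pw):
    """Required contamination delay: t_cd >= t_hold - t_ccq + t_pw."""
    return t_hold - t_ccq + t_pw


if __name__ == "__main__":
    # Hypothetical values in picoseconds.
    # A negative result means the constraint is met with no logic at all.
    print(min_cd_two_phase(t_hold=30, t_ccq=35, t_nonoverlap=60))  # -65
    print(min_cd_pulsed(t_hold=30, t_ccq=35, t_pw=150))            # 145
```

With these numbers the 2-phase design has no hold risk, whereas the pulsed-latch design needs at least 145 ps of contamination delay between latches.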
Time Borrowing
In a flop-based system:
– Data launches on one rising edge
– Must set up before next rising edge
– If it arrives late, system fails
– If it arrives early, time is wasted
– Flops have hard edges
In a latch-based system
– Data can pass through latch while transparent
– Long cycle of logic can borrow time into next
– As long as each loop completes in one cycle
Time Borrowing Example
[Figure: (a) and (b) show latch-based paths clocked by φ1 and φ2, annotated "Borrowing time across half-cycle boundary" and "Borrowing time across pipeline stage boundary"]
Loops may borrow time internally but must complete within the cycle
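How Much Borrowing?
2-Phase Latches
$t_{borrow} \le \frac{T_c}{2} - (t_{setup} + t_{nonoverlap})$
Pulsed Latches
$t_{borrow} \le t_{pw} - t_{setup}$
[Figure: latches L1 and L2 clocked by φ1 and φ2 around Combinational Logic 1, with a timing diagram showing the nominal half-cycle 1 delay, Tc, Tc/2, tnonoverlap, tborrow, and tsetup at D2]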
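The borrowing limits can be evaluated directly: a 2-phase latch can borrow up to half a cycle minus setup and nonoverlap, and a pulsed latch up to the pulse width minus setup. This is a minimal sketch with hypothetical function names and picosecond values.

```python
def max_borrow_two_phase(t_c, t_setup, t_nonoverlap):
    """t_borrow <= T_c / 2 - (t_setup + t_nonoverlap)."""
    return t_c / 2 - (t_setup + t_nonoverlap)


def max_borrow_pulsed(t_pw, t_setup):
    """t_borrow <= t_pw - t_setup."""
    return t_pw - t_setup


if __name__ == "__main__":
    # Hypothetical values in picoseconds.
    print(max_borrow_two_phase(t_c=1000, t_setup=25, t_nonoverlap=0))  # 475.0
    print(max_borrow_pulsed(t_pw=150, t_setup=25))                     # 125
```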
Clock Skew
We have assumed zero clock skew
Clocks really have uncertainty in arrival time
– Decreases maximum propagation delay
– Increases minimum contamination delay
– Decreases time borrowing
Skew: Flip-Flops
$t_{pd} \le T_c - \underbrace{(t_{pcq} + t_{setup} + t_{skew})}_{\text{sequencing overhead}}$
$t_{cd} \ge t_{hold} - t_{ccq} + t_{skew}$
[Figure: flops F1 and F2 around combinational logic, with timing diagrams of clk, Q1, and D2 showing tskew added to the setup and hold constraints]
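Skew: Latches
2-Phase Latches
$t_{pd} \le T_c - \underbrace{2t_{pdq}}_{\text{sequencing overhead}}$
$t_{cd1},\, t_{cd2} \ge t_{hold} - t_{ccq} - t_{nonoverlap} + t_{skew}$
$t_{borrow} \le \frac{T_c}{2} - (t_{setup} + t_{nonoverlap} + t_{skew})$
Pulsed Latches
$t_{pd} \le T_c - \underbrace{\max(t_{pdq},\; t_{pcq} + t_{setup} - t_{pw} + t_{skew})}_{\text{sequencing overhead}}$
$t_{cd} \ge t_{hold} + t_{pw} - t_{ccq} + t_{skew}$
$t_{borrow} \le t_{pw} - (t_{setup} + t_{skew})$
[Figure: latches L1, L2, L3 clocked by φ1, φ2, φ1 around Combinational Logic 1 and 2]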
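To compare how skew affects the three sequencing methods, the constraints with skew can be collected in one place. The sketch below is illustrative only: the `Timing` class, its field names, and the picosecond values are assumptions, and each function simply evaluates the corresponding inequalities (max t_pd, min t_cd, and max t_borrow where applicable).

```python
from dataclasses import dataclass


@dataclass
class Timing:
    """Hypothetical container for timing parameters (any consistent unit)."""
    t_c: float
    t_setup: float
    t_hold: float
    t_pcq: float
    t_ccq: float
    t_pdq: float
    t_skew: float
    t_nonoverlap: float = 0.0
    t_pw: float = 0.0


def flop_limits(t):
    """Return (max t_pd, min t_cd) for flip-flops with skew."""
    return (t.t_c - (t.t_pcq + t.t_setup + t.t_skew),
            t.t_hold - t.t_ccq + t.t_skew)


def two_phase_limits(t):
    """Return (max t_pd, min t_cd, max t_borrow) for 2-phase latches."""
    return (t.t_c - 2 * t.t_pdq,
            t.t_hold - t.t_ccq - t.t_nonoverlap + t.t_skew,
            t.t_c / 2 - (t.t_setup + t.t_nonoverlap + t.t_skew))


def pulsed_limits(t):
    """Return (max t_pd, min t_cd, max t_borrow) for pulsed latches."""
    overhead = max(t.t_pdq, t.t_pcq + t.t_setup - t.t_pw + t.t_skew)
    return (t.t_c - overhead,
            t.t_hold + t.t_pw - t.t_ccq + t.t_skew,
            t.t_pw - (t.t_setup + t.t_skew))


if __name__ == "__main__":
    # Hypothetical values in picoseconds.
    t = Timing(t_c=1000, t_setup=25, t_hold=30, t_pcq=50, t_ccq=35,
               t_pdq=40, t_skew=50, t_nonoverlap=0, t_pw=150)
    print(flop_limits(t))       # (875, 45)
    print(two_phase_limits(t))  # (920, 45, 425.0)
    print(pulsed_limits(t))     # (960, 195, 75)
```

With these example numbers, skew costs the flop design logic time directly, leaves the 2-phase max-delay unchanged, and is hidden by the wide pulse in the pulsed-latch max-delay, at the price of a much larger hold requirement.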
Two-Phase Clocking
If setup times are violated, reduce clock speed
If hold times are violated, chip fails at any speed
In this class, working chips are most important
– No tools to analyze clock skew
An easy way to guarantee hold times is to use 2-phase latches with big nonoverlap times
Call these clocks φ1, φ2 (ph1, ph2)
Safe Flip-Flop
In class, use flip-flop with nonoverlapping clocks
– Very slow – nonoverlap adds to setup time
– But no hold times
In industry, use a better timing analyzer
– Add buffers to slow signals if hold time is at risk
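Adding buffers to fix a hold violation amounts to padding the short path until its contamination delay meets the requirement. A rough sketch of that arithmetic follows; the function name, the per-buffer contamination delay, and the example values are hypothetical, and a real flow would rely on the timing analyzer's own numbers.

```python
import math


def buffers_needed(t_cd, t_cd_required, t_cd_buffer):
    """Number of buffers to insert so the path meets its hold requirement.

    t_cd          -- current contamination delay of the path
    t_cd_required -- minimum contamination delay from the hold constraint
    t_cd_buffer   -- contamination delay added by one buffer (assumed fixed)
    """
    if t_cd_buffer <= 0:
        raise ValueError("buffer delay must be positive")
    shortfall = t_cd_required - t_cd
    if shortfall <= 0:
        return 0  # hold time already met
    return math.ceil(shortfall / t_cd_buffer)


if __name__ == "__main__":
    # Hypothetical values in picoseconds.
    print(buffers_needed(t_cd=10, t_cd_required=45, t_cd_buffer=20))  # 2
    print(buffers_needed(t_cd=50, t_cd_required=45, t_cd_buffer=20))  # 0
```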
Summary
Flip-Flops:
– Very easy to use, supported by all tools
2-Phase Transparent Latches:
– Lots of skew tolerance and time borrowing
Pulsed Latches:
– Fast, some skew tolerance and time borrowing, hold time risk