Transcript Document
Introduction to CMOS VLSI Design
Sequential Circuits

Outline
Sequencing
Sequencing Element Design
Max and Min-Delay
Clock Skew
Time Borrowing
Two-Phase Clocking

Sequencing
Combinational logic – output depends on current inputs
Sequential logic – output depends on current and previous inputs
– Requires separating previous, current, future
– Called state or tokens
– Ex: FSM, pipeline
[Figure: Finite State Machine and Pipeline, each drawn as clocked (clk) elements around combinational logic (CL) blocks]

Sequencing Cont.
If tokens moved through pipeline at constant speed, no sequencing elements would be necessary
Ex: fiber-optic cable
– Light pulses (tokens) are sent down cable
– Next pulse sent before first reaches end of cable
– No need for hardware to separate pulses
– But dispersion sets min time between pulses
This is called wave pipelining in circuits
In most circuits, dispersion is high
– Delay fast tokens so they don't catch slow ones

Sequencing Overhead
Use flip-flops to delay fast tokens so they move through exactly one stage each cycle
Inevitably adds some delay to the slow tokens
Makes circuit slower than just the logic delay
– Called sequencing overhead
Some people call this clocking overhead
– But it applies to asynchronous circuits too
– Inevitable side effect of maintaining sequence

Sequencing Elements
Latch: level sensitive
– a.k.a. transparent latch, D latch
Flip-flop: edge triggered
– a.k.a. master-slave flip-flop, D flip-flop, D register
Timing Diagrams
– Transparent
– Opaque
– Edge-trigger
[Figure: latch and flop symbols (D, clk, Q) with waveforms for clk, D, Q (latch), Q (flop)]

Latch Design
Pass Transistor Latch
Pros
+ Tiny
+ Low clock load
Cons
– Vt drop
– Nonrestoring
– Backdriving
– Output noise sensitivity
– Dynamic
– Diffusion input
Used in the 1970s
[Figure: schematic with nodes D, Q]

Latch Design
Transmission gate
+ No Vt drop
– Requires inverted clock
[Figure: schematic with nodes D, Q]

Latch Design
Inverting buffer
+ Restoring
+ No backdriving
+ Fixes either
  • Output noise sensitivity
  • Or diffusion input
– Inverted output
[Figure: two schematics with nodes D, X, Q]

Latch Design
Tristate feedback
+ Static
– Backdriving risk
Static latches are now essential
[Figure: schematic with nodes D, X, Q]

Latch Design
Buffered input
+ Fixes diffusion input
+ Noninverting
[Figure: schematic with nodes D, X, Q]

Latch Design
Buffered output
+ No backdriving
Widely used in standard cells
+ Very robust (most important)
– Rather large
– Rather slow (1.5 – 2 FO4 delays)
– High clock loading
[Figure: schematic with nodes D, X, Q]

Latch Design
Datapath latch
+ Smaller, faster
– Unbuffered input
[Figure: schematic with nodes D, X, Q]

Flip-Flop Design
Flip-flop is built as a pair of back-to-back latches
[Figure: two flip-flop schematics with nodes D, X, Q]

Enable
Enable: ignore clock when en = 0
– Mux: increases latch D-Q delay
– Clock Gating: increases en setup time, skew
[Figure: Symbol, Multiplexer Design, and Clock Gating Design for an enabled latch and an enabled flop]

Reset
Force output low when reset asserted
Synchronous vs. asynchronous
[Figure: Symbol, Synchronous Reset, and Asynchronous Reset circuits for a latch and a flop]

Set / Reset
Set forces output high when enabled
Flip-flop with asynchronous set and reset
[Figure: flip-flop schematic with D, Q, set, reset]

Sequencing Methods
Flip-Flops
2-Phase Transparent Latches
Pulsed Latches
[Figure: clock waveforms and logic partitioning for the three methods: flip-flops (clk, period Tc), 2-phase transparent latches (φ1, φ2, half-cycles of Tc/2, tnonoverlap), pulsed latches (φp, pulse width tpw)]

Review Timing Definitions

Timing Diagrams
Contamination and Propagation Delays
tpd      Logic Prop. Delay
tcd      Logic Cont. Delay
tpcq     Latch/Flop Clk-Q Prop. Delay
tccq     Latch/Flop Clk-Q Cont. Delay
tpdq     Latch D-Q Prop. Delay
tcdq     Latch D-Q Cont. Delay
tsetup   Latch/Flop Setup Time
thold    Latch/Flop Hold Time
[Figure: timing diagrams for combinational logic (A to Y), a flop, and a latch, annotated with the delays above]

Max-Delay: Flip-Flops
[Figure: F1, Combinational Logic, F2 on clk, with waveforms for clk, Q1, D2 showing tpcq, tpd, tsetup within Tc]
$t_{pd} \le T_c - \underbrace{(t_{setup} + t_{pcq})}_{\text{sequencing overhead}}$
1. Rising edge of clk triggers F1
2. Data at Q1 after clk-to-Q delay tpcq
3. Combinational logic propagation delay tpd to D2
4. Setup time tsetup for F2 before the next rising edge of clk
tpd is the time allowed for combinational logic; design the CL block to satisfy the constraint

Max Delay: 2-Phase Latches
[Figure: L1 (φ1), Combinational Logic 1, L2 (φ2), Combinational Logic 2, L3 (φ1), with waveforms for φ1, φ2, D1, Q1, D2, Q2, D3 showing tpdq1, tpd1, tpdq2, tpd2 within Tc]
$t_{pd} = t_{pd1} + t_{pd2} \le T_c - \underbrace{2t_{pdq}}_{\text{sequencing overhead}}$
Assume tpdq1 = tpdq2 (propagation delays D1 to Q1 and D2 to Q2)

Max Delay: Pulsed Latches
[Figure: L1 (φp), Combinational Logic, L2 (φp), with waveforms for case (a) tpw > tsetup and case (b) tpw < tsetup]
tpdq: D to Q propagation delay
tcdq: D to Q contamination delay
tpcq: clk to Q propagation delay
$t_{pd} \le T_c - \underbrace{\max(t_{pdq},\; t_{pcq} + t_{setup} - t_{pw})}_{\text{sequencing overhead}}$
(a) tpw > tsetup: if the pulse is wide enough, the max-delay constraint is similar to that of two-phase latches, except only one latch is in the critical path
$t_{pd} < T_c - t_{pdq}$
(b) tpw < tsetup: if the pulse width is narrower than the setup time, data must set up before the pulse rises
$t_{pd} < T_c + t_{pw} - t_{pcq} - t_{setup}$

Min-Delay: Flip-Flops
[Figure: F1, CL, F2 on clk, with waveforms for clk, Q1, D2 showing tccq, tcd, thold]
tcd: minimum logic contamination delay of the CL block
If thold > tccq + tcd, data can incorrectly propagate through two successive flip-flops (F1 and F2) on one clock edge, resulting in system failure
1. Rising edge of clk triggers F1
2. After clk-to-Q contamination delay tccq, Q1 begins to change
3. D2 begins to change after the CL contamination delay tcd
4. D2 should not change for at least thold after the rising edge of clk; if D2 changes earlier, it corrupts F2
So
$t_{cd} \ge t_{hold} - t_{ccq}$

Min-Delay: 2-Phase Latches
[Figure: L1 (φ1), CL, L2 (φ2), with waveforms for φ1, φ2, Q1, D2 showing tnonoverlap, tccq, tcd, thold]
$t_{cd1}, t_{cd2} \ge t_{hold} - t_{ccq} - t_{nonoverlap}$
1. Data passes through L1 from the rising edge of φ1
2. Data should not reach L2 until a hold time after the previous falling edge of φ2, i.e. until L2 becomes safely opaque
We need tcd large enough for correct operation, i.e. to meet the thold requirement of L2
Hold time reduced by nonoverlap
Paradox: hold applies twice each cycle, vs. only once for flops. But a flop is made of two latches!
Contamination delay constraint applies to each phase of logic for latch-based systems, but to the entire cycle of logic for flip-flops.
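As a rough numerical check of the flip-flop and 2-phase latch constraints stated so far, here is a minimal Python sketch. The function names and the example timing values (integers, in picoseconds) are illustrative assumptions and do not come from the slides.

```python
# Sketch of the max-delay and min-delay constraints for flip-flops and
# 2-phase latches. All names and numbers are illustrative assumptions.

def flop_max_logic_delay(t_c, t_setup, t_pcq):
    """Largest allowed t_pd: t_pd <= Tc - (t_setup + t_pcq)."""
    return t_c - (t_setup + t_pcq)

def two_phase_max_logic_delay(t_c, t_pdq):
    """Largest allowed t_pd1 + t_pd2: t_pd <= Tc - 2*t_pdq."""
    return t_c - 2 * t_pdq

def flop_min_contamination_delay(t_hold, t_ccq):
    """Smallest allowed t_cd: t_cd >= t_hold - t_ccq."""
    return t_hold - t_ccq

def two_phase_min_contamination_delay(t_hold, t_ccq, t_nonoverlap):
    """Smallest allowed t_cd1, t_cd2: t_cd >= t_hold - t_ccq - t_nonoverlap."""
    return t_hold - t_ccq - t_nonoverlap

# Hypothetical element timing, in picoseconds.
T_C = 1000
T_SETUP, T_HOLD = 65, 60
T_PCQ, T_CCQ = 50, 35
T_PDQ = 40
T_NONOVERLAP = 20

print(flop_max_logic_delay(T_C, T_SETUP, T_PCQ))                        # 885
print(two_phase_max_logic_delay(T_C, T_PDQ))                            # 920
print(flop_min_contamination_delay(T_HOLD, T_CCQ))                      # 25
print(two_phase_min_contamination_delay(T_HOLD, T_CCQ, T_NONOVERLAP))   # 5
```

A zero or negative minimum contamination delay would mean the hold constraint is met by any logic, including a direct wire between the two elements.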
Min-Delay: Pulsed Latches
[Figure: L1 (φp), CL, L2 (φp), with waveforms for φp, Q1, D2 showing tpw, thold, tccq, tcd]
Hold time increased by pulse width
$t_{cd} \ge t_{hold} - t_{ccq} + t_{pw}$
Equivalently, $t_{ccq} + t_{cd} \ge t_{pw} + t_{hold}$

Time Borrowing
In a flop-based system:
– Data launches on one rising edge
– Must set up before next rising edge
– If it arrives late, system fails
– If it arrives early, time is wasted
– Flops have hard edges
In a latch-based system:
– Data can pass through latch while transparent
– Long cycle of logic can borrow time into next
– As long as each loop completes in one cycle

Time Borrowing Example
[Figure: latch pipelines clocked by φ1 and φ2: (a) borrowing time across half-cycle boundary, (b) borrowing time across pipeline stage boundary]
Loops may borrow time internally but must complete within the cycle

How Much Borrowing?
2-Phase Latches
$t_{borrow} \le \frac{T_c}{2} - (t_{setup} + t_{nonoverlap})$
Pulsed Latches
$t_{borrow} \le t_{pw} - t_{setup}$
[Figure: L1 (φ1), Combinational Logic 1, L2 (φ2), with waveforms for φ1, φ2, D2 showing Tc, Tc/2, the nominal half-cycle 1 delay, tborrow, tsetup, tnonoverlap]
Data can depart the first latch on the rising edge of the clock and does not have to set up until the falling edge of the clock on the receiving latch

Clock Skew
We have assumed zero clock skew
Clocks really have uncertainty in arrival time
– Decreases maximum propagation delay
– Increases minimum contamination delay
– Decreases time borrowing

Skew: Flip-Flops
[Figure: F1, Combinational Logic, F2 on clk, with max-delay and min-delay waveforms for clk, Q1, D2 showing tskew, tpcq, tpd, tsetup, tccq, tcd, thold]
tpd: propagation delay of CL
tcd: contamination delay of CL
$t_{pd} \le T_c - \underbrace{(t_{pcq} + t_{setup} + t_{skew})}_{\text{sequencing overhead}}$
$t_{cd} \ge t_{hold} - t_{ccq} + t_{skew}$
Equivalently, $t_{ccq} + t_{cd} \ge t_{skew} + t_{hold}$
Worst case for hold: the launching flop receives its clock early and the receiving flop receives its clock late
Clock skew effectively increases the hold time

Skew: Latches
2-Phase Latches
[Figure: L1 (φ1), Combinational Logic 1, L2 (φ2), Combinational Logic 2, L3 (φ1)]
$t_{pd} \le T_c - \underbrace{2t_{pdq}}_{\text{sequencing overhead}}$
$t_{cd1}, t_{cd2} \ge t_{hold} - t_{ccq} - t_{nonoverlap} + t_{skew}$
$t_{borrow} \le \frac{T_c}{2} - (t_{setup} + t_{nonoverlap} + t_{skew})$
In latch-based design, clock skew does not degrade performance
Data arrives at the latches while they are transparent, even if the clocks are skewed
Latch-based systems are skew-tolerant.
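To see how skew tightens the constraints above, here is a small Python sketch of the time-borrowing and skew formulas for flip-flops and 2-phase latches. The function names and timing values (in picoseconds) are hypothetical, chosen only for illustration.

```python
# Sketch of time borrowing and clock-skew effects. Names and numbers are
# illustrative assumptions; all times are in picoseconds.

def two_phase_max_borrow(t_c, t_setup, t_nonoverlap, t_skew=0):
    """t_borrow <= Tc/2 - (t_setup + t_nonoverlap + t_skew)."""
    return t_c / 2 - (t_setup + t_nonoverlap + t_skew)

def flop_max_logic_delay(t_c, t_setup, t_pcq, t_skew=0):
    """t_pd <= Tc - (t_pcq + t_setup + t_skew)."""
    return t_c - (t_pcq + t_setup + t_skew)

def flop_min_contamination_delay(t_hold, t_ccq, t_skew=0):
    """t_cd >= t_hold - t_ccq + t_skew."""
    return t_hold - t_ccq + t_skew

def two_phase_min_contamination_delay(t_hold, t_ccq, t_nonoverlap, t_skew=0):
    """t_cd1, t_cd2 >= t_hold - t_ccq - t_nonoverlap + t_skew."""
    return t_hold - t_ccq - t_nonoverlap + t_skew

# Hypothetical timing values.
T_C, T_SETUP, T_HOLD = 1000, 65, 60
T_PCQ, T_CCQ, T_NONOVERLAP, T_SKEW = 50, 35, 20, 50

# Skew shrinks the borrowing window of 2-phase latches.
print(two_phase_max_borrow(T_C, T_SETUP, T_NONOVERLAP))           # 415.0
print(two_phase_max_borrow(T_C, T_SETUP, T_NONOVERLAP, T_SKEW))   # 365.0

# Skew costs flip-flops logic time and raises the required contamination delay.
print(flop_max_logic_delay(T_C, T_SETUP, T_PCQ, T_SKEW))          # 835
print(flop_min_contamination_delay(T_HOLD, T_CCQ, T_SKEW))        # 75
print(two_phase_min_contamination_delay(T_HOLD, T_CCQ, T_NONOVERLAP, T_SKEW))  # 55
```

With these made-up numbers, the flip-flop loses 50 ps of logic time to skew, while the 2-phase latch max-delay bound (Tc minus 2 tpdq) is unchanged and only the borrowing window shrinks.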
Skew: Pulsed Latches
Pulsed Latches
$t_{pd} \le T_c - \underbrace{\max(t_{pdq},\; t_{pcq} + t_{setup} - t_{pw} + t_{skew})}_{\text{sequencing overhead}}$
$t_{cd} \ge t_{hold} + t_{pw} - t_{ccq} + t_{skew}$
$t_{borrow} \le t_{pw} - (t_{setup} + t_{skew})$
If the pulse width is wide enough, the skew will not increase overhead
If the pulse width is narrow, skew can degrade the performance

Two-Phase Clocking
If setup times are violated, reduce clock speed
If hold times are violated, chip fails at any speed
An easy way to guarantee hold times is to use 2-phase latches with big nonoverlap times
Call these clocks φ1, φ2 (ph1, ph2)

Safe Flip-Flop
In industry, use a better timing analyzer
– Add buffers to slow signals if hold time is at risk
PowerPC 603 datapath used this flip-flop
[Figure: flip-flop schematic with nodes D, X, Q]

Differential Flip-flops
Accept true and complementary inputs
Produce true and complementary outputs
Work well for low-swing inputs such as register file bitlines and low-swing busses
When φ is low, precharge X, X'
When φ is high, either X or X' is pulled down; cross-coupled pMOS work as a keeper
Cross-coupled NAND gates work as an SR latch, capturing and holding the data

Differential Flip-flops
Replace the cross-coupled NAND gates by a faster latch

Summary
Flip-Flops:
– Very easy to use, supported by all tools
2-Phase Transparent Latches:
– Lots of skew tolerance and time borrowing
Pulsed Latches:
– Fast, lowest sequencing overhead, susceptible to min-delay problem
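The pulsed-latch trade-off in the summary (low overhead, but a min-delay risk) can be illustrated with a short Python sketch of the pulsed-latch constraints with skew given earlier. The function names and timing values (in picoseconds) are assumptions made for this example only.

```python
# Sketch of the pulsed-latch constraints with clock skew. Names and numbers
# are illustrative assumptions; all times are in picoseconds.

def pulsed_max_logic_delay(t_c, t_pdq, t_pcq, t_setup, t_pw, t_skew=0):
    """t_pd <= Tc - max(t_pdq, t_pcq + t_setup - t_pw + t_skew)."""
    overhead = max(t_pdq, t_pcq + t_setup - t_pw + t_skew)
    return t_c - overhead

def pulsed_min_contamination_delay(t_hold, t_ccq, t_pw, t_skew=0):
    """t_cd >= t_hold + t_pw - t_ccq + t_skew."""
    return t_hold + t_pw - t_ccq + t_skew

def pulsed_max_borrow(t_pw, t_setup, t_skew=0):
    """t_borrow <= t_pw - (t_setup + t_skew)."""
    return t_pw - (t_setup + t_skew)

# Hypothetical timing values.
T_C, T_SETUP, T_HOLD = 1000, 65, 60
T_PDQ, T_PCQ, T_CCQ, T_SKEW = 40, 50, 35, 50
WIDE_PW, NARROW_PW = 150, 50

# Wide pulse: skew is hidden, the overhead stays at t_pdq.
print(pulsed_max_logic_delay(T_C, T_PDQ, T_PCQ, T_SETUP, WIDE_PW))            # 960
print(pulsed_max_logic_delay(T_C, T_PDQ, T_PCQ, T_SETUP, WIDE_PW, T_SKEW))    # 960

# Narrow pulse: skew adds directly to the overhead.
print(pulsed_max_logic_delay(T_C, T_PDQ, T_PCQ, T_SETUP, NARROW_PW))          # 935
print(pulsed_max_logic_delay(T_C, T_PDQ, T_PCQ, T_SETUP, NARROW_PW, T_SKEW))  # 885

# The wide pulse pays for this with a much larger required contamination delay.
print(pulsed_min_contamination_delay(T_HOLD, T_CCQ, WIDE_PW, T_SKEW))         # 225
print(pulsed_max_borrow(WIDE_PW, T_SETUP, T_SKEW))                            # 35
```

In this made-up case the wide pulse absorbs all 50 ps of skew on the max-delay side, but every logic path between pulsed latches must then have at least 225 ps of contamination delay.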