Sequential Circuits

Outline
Floorplanning
Sequencing
Sequencing Element Design
Max and Min-Delay
Clock Skew
Time Borrowing
Two-Phase Clocking

Project Strategy
Proposal
– Specifies inputs, outputs, relation between them
Floorplan
– Begins with block diagram
– Annotate dimensions and location of each block
– Requires detailed paper design
Schematic
– Make paper design simulate correctly
Layout
– Physical design, DRC, NCC, ERC

Floorplan
How do you estimate block areas?
– Begin with block diagram
– Each block has
  • Inputs
  • Outputs
  • Function (draw schematic)
  • Type: array, datapath, random logic
Estimation depends on type of logic

MIPS Floorplan
[Figure: MIPS chip floorplan, dimensions in λ. Axis labels on each side: 2700 / 3500 / 5000 horizontally and 1690 / 3500 / 5000 vertically, with 10 I/O pads on each of the four sides. Labeled blocks:
  mips (4.6 Mλ²)
  control 1500 x 400 (0.6 Mλ²)
  wiring channel: 30 tracks = 240 λ
  datapath 2700 x 1050 (2.8 Mλ²)
  zipper 2700 x 250
  bitslice 2700 x 100
  alucontrol 200 x 100 (20 kλ²)]

Area Estimation
Arrays:
– Layout basic cell
– Calculate core area from # of cells
– Allow area for decoders, column circuitry
Datapaths
– Sketch slice plan
– Count area of cells from cell library
– Ensure wiring is possible
Random logic
– Compare complexity to a design you have done

MIPS Slice Plan
[Figure: slice plan of the MIPS datapath with cell widths in λ. Signals: srcB, writedata, memdata, adr, bitlines, srcA, aluresult, immediate, pc, aluout. Blocks: register file, adrmux, MDR, writemux, readmux, IR3...0, srcA, srcB, ALU, aluout, PC. Cells: ramslice, srampullup, dualsrambit0, dualsram, writedriver, flop, mux2, mux4, inv, and2, or2, fulladder, zerodetect.]

Typical Layout Densities
Typical numbers of high-quality layout
Derate by 2 for class projects to allow routing and some sloppy layout.
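The block areas quoted on the MIPS floorplan can be cross-checked with simple arithmetic. The sketch below is illustrative only: the dimensions are copied from the floorplan labels, while `N_BITS = 8` and the stacking order (datapath, wiring channel, control) are assumptions inferred from the numbers, not stated on the slides. The helper names are hypothetical.

```python
# Block dimensions (width, height) in lambda, as labeled on the MIPS floorplan.
BLOCKS = {
    "mips": (2700, 1690),
    "datapath": (2700, 1050),
    "control": (1500, 400),
    "alucontrol": (200, 100),
}

BITSLICE_HEIGHT = 100        # from "bitslice 2700 x 100"
ZIPPER_HEIGHT = 250          # from "zipper 2700 x 250"
WIRING_CHANNEL_HEIGHT = 240  # from "wiring channel: 30 tracks = 240"
WIRING_TRACKS = 30
N_BITS = 8                   # ASSUMPTION: datapath width in bits, not stated on the slide


def block_area(name):
    """Return the area of a named block in square lambda."""
    width, height = BLOCKS[name]
    return width * height


def stacked_height(*heights):
    """Height of blocks stacked vertically with no extra spacing."""
    return sum(heights)


def derated_area(area, derate=2):
    """Scale a high-quality-layout estimate by the class-project derating factor."""
    return area * derate


if __name__ == "__main__":
    for name in BLOCKS:
        print(f"{name}: {block_area(name) / 1e6:.3f} Mlambda^2")
    # Assumed composition: N_BITS bit slices plus the zipper make up the datapath.
    print("datapath height:", stacked_height(N_BITS * BITSLICE_HEIGHT, ZIPPER_HEIGHT))
    # Assumed composition: datapath + wiring channel + control make up the core.
    print("core height:", stacked_height(BLOCKS["datapath"][1],
                                         WIRING_CHANNEL_HEIGHT,
                                         BLOCKS["control"][1]))
    print("track pitch:", WIRING_CHANNEL_HEIGHT / WIRING_TRACKS, "lambda")
```

Under these assumptions the heights add up to the labeled 1050 λ and 1690 λ, and the wiring channel works out to 8 λ per track. `derated_area` simply applies the factor of 2 suggested for class projects to an estimate made from high-quality layout densities.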
Allocate space for big wiring channels

Element                          Area
Random logic (2 metal layers)    1000–1500 λ² / transistor
Datapath                         250–750 λ² / transistor, or 6 WL + 360 λ² / transistor
SRAM                             1000 λ² / bit
DRAM                             100 λ² / bit
ROM                              100 λ² / bit

Sequencing
Combinational logic
– output depends on current inputs
Sequential logic
– output depends on current and previous inputs
– Requires separating previous, current, future
– Called state or tokens
– Ex: FSM, pipeline
[Figure: a finite state machine (combinational logic CL with clocked state feedback, inputs in and outputs out) and a pipeline (CL stages separated by clocked elements).]

Sequencing Cont.
If tokens moved through pipeline at constant speed, no sequencing elements would be necessary
Ex: fiber-optic cable
– Light pulses (tokens) are sent down cable
– Next pulse sent before first reaches end of cable
– No need for hardware to separate pulses
– But dispersion sets min time between pulses
This is called wave pipelining in circuits
In most circuits, dispersion is high
– Delay fast tokens so they don't catch slow ones.

Sequencing Overhead
Use flip-flops to delay fast tokens so they move through exactly one stage each cycle.
Inevitably adds some delay to the slow tokens
Makes circuit slower than just the logic delay
– Called sequencing overhead
Some people call this clocking overhead
– But it applies to asynchronous circuits too
– Inevitable side effect of maintaining sequence

Sequencing Elements
Latch: level sensitive
– a.k.a. transparent latch, D latch
Flip-flop: edge triggered, a.k.a.
master-slave flip-flop, D flip-flop, D register
Timing Diagrams
– Transparent
– Opaque
– Edge-trigger
[Figure: latch and flop symbols with a timing diagram of clk, D, Q (latch), and Q (flop).]

Latch Design: Pass Transistor Latch
Pros
+ Tiny
+ Low clock load
Cons
– Vt drop
– Nonrestoring
– Backdriving
– Output noise sensitivity
– Dynamic
– Diffusion input
Used in 1970's

Latch Design: Transmission gate
+ No Vt drop
– Requires inverted clock

Latch Design: Inverting buffer
+ Restoring
+ No backdriving
+ Fixes either
  • Output noise sensitivity
  • Or diffusion input
– Inverted output

Latch Design: Tristate feedback
+ Static
– Backdriving risk
Static latches are now essential

Latch Design: Buffered input
+ Fixes diffusion input
+ Noninverting

Latch Design: Buffered output
+ No backdriving
Widely used in standard cells
+ Very robust (most important)
– Rather large
– Rather slow (1.5 – 2 FO4 delays)
– High clock loading

Latch Design: Datapath latch
+ Smaller, faster
– Unbuffered input

Flip-Flop Design
Flip-flop is built as pair of back-to-back latches
[Figure: flip-flop schematic built from two latches, with D, internal node X, and Q.]

Enable
Enable: ignore clock when en = 0
– Mux: increase latch D-Q delay
– Clock Gating: increase en setup time, skew
[Figure: symbol, multiplexer design, and clock gating design for a latch with enable and a flop with enable.]

Reset
Force output low when reset asserted
Synchronous vs.
asynchronous
[Figure: symbols and circuits for a latch and a flop with synchronous reset and with asynchronous reset.]

Set / Reset
Set forces output high when enabled
Flip-flop with asynchronous set and reset
[Figure: flip-flop schematic with set and reset inputs.]

Sequencing Methods
Flip-flops
2-Phase Latches
Pulsed Latches
[Figure: the three methods over one cycle Tc. Flip-flops: flops on clk with combinational logic between them. 2-phase transparent latches: latches on φ1 and φ2 separated by t_nonoverlap, with a half-cycle (Tc/2) of combinational logic after each latch. Pulsed latches: latches on φp with pulse width t_pw.]

Timing Diagrams
Contamination and Propagation Delays

tpd       Logic Prop. Delay
tcd       Logic Cont. Delay
tpcq      Latch/Flop Clk-Q Prop Delay
tccq      Latch/Flop Clk-Q Cont. Delay
tpdq      Latch D-Q Prop Delay
tcdq      Latch D-Q Cont. Delay
tsetup    Latch/Flop Setup Time
thold     Latch/Flop Hold Time

[Figure: timing diagrams marking these delays for combinational logic (A to Y), a flop (clk, D, Q), and a latch (clk, D, Q).]

Max-Delay: Flip-Flops
$t_{pd} \le T_c - (t_{setup} + t_{pcq})$
The term $t_{setup} + t_{pcq}$ is the sequencing overhead.
[Figure: flops F1 and F2 on clk with combinational logic between Q1 and D2; timing of clk, Q1, and D2 over one cycle Tc, marking tpcq, tpd, and tsetup.]

Max Delay: 2-Phase Latches
$t_{pd} = t_{pd1} + t_{pd2} \le T_c - 2t_{pdq}$
The term $2t_{pdq}$ is the sequencing overhead.
[Figure: latches L1 (φ1), L2 (φ2), and L3 (φ1) with Combinational Logic 1 between Q1 and D2 and Combinational Logic 2 between Q2 and D3; timing of D1, Q1, D2, Q2, D3 over one cycle Tc, marking tpdq1, tpd1, tpdq2, tpd2.]

Max Delay: Pulsed Latches
$t_{pd} \le T_c - \max(t_{pdq},\; t_{pcq} + t_{setup} - t_{pw})$
The max term is the sequencing overhead.
[Figure: pulsed latches L1 and L2 on φp with combinational logic between Q1 and D2; timing of D1, Q1, D2 over one cycle Tc for (a) tpw > tsetup and (b) tpw < tsetup.]
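The max-delay limits for the three sequencing methods can be compared numerically. The following is a minimal sketch, assuming the standard forms of the constraints (flip-flops lose setup plus clk-to-Q, 2-phase latches lose two D-to-Q delays, pulsed latches lose the larger of D-to-Q and clk-to-Q plus setup minus pulse width). The function names and the picosecond values are hypothetical and chosen only for illustration.

```python
def max_tpd_flop(tc, tsetup, tpcq):
    """Max logic delay between flip-flops: Tc - (tsetup + tpcq)."""
    return tc - (tsetup + tpcq)


def max_tpd_two_phase(tc, tpdq):
    """Max total logic delay (tpd1 + tpd2) per cycle with 2-phase latches: Tc - 2*tpdq."""
    return tc - 2 * tpdq


def max_tpd_pulsed(tc, tpdq, tpcq, tsetup, tpw):
    """Max logic delay between pulsed latches: Tc - max(tpdq, tpcq + tsetup - tpw)."""
    return tc - max(tpdq, tpcq + tsetup - tpw)


if __name__ == "__main__":
    # Hypothetical element parameters in picoseconds (not from the slides).
    tc, tsetup, tpcq, tpdq = 1000, 65, 50, 40

    print("flip-flops:      ", max_tpd_flop(tc, tsetup, tpcq))
    print("2-phase latches: ", max_tpd_two_phase(tc, tpdq))
    # Wide pulse (tpw > tsetup): overhead is just tpdq.
    print("pulsed, tpw=150: ", max_tpd_pulsed(tc, tpdq, tpcq, tsetup, tpw=150))
    # Narrow pulse (tpw < tsetup): data must set up before the pulse.
    print("pulsed, tpw=30:  ", max_tpd_pulsed(tc, tpdq, tpcq, tsetup, tpw=30))
```

With these illustrative numbers, the wide-pulse case has the least overhead, and narrowing the pulse below the setup time moves the pulsed latch toward flip-flop-like overhead.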
Min-Delay: Flip-Flops
$t_{cd} \ge t_{hold} - t_{ccq}$
[Figure: flops F1 and F2 on clk with logic CL between Q1 and D2; timing of clk, Q1, and D2, marking tccq, tcd, and thold.]

Min-Delay: 2-Phase Latches
$t_{cd1},\, t_{cd2} \ge t_{hold} - t_{ccq} - t_{nonoverlap}$
Hold time reduced by nonoverlap
Paradox: hold applies twice each cycle, vs. only once for flops.
But a flop is made of two latches!
[Figure: latch L1 (φ1), logic CL, and latch L2 (φ2); timing of φ1, φ2, Q1, and D2, marking tnonoverlap, tccq, tcd, and thold.]

Min-Delay: Pulsed Latches
$t_{cd} \ge t_{hold} - t_{ccq} + t_{pw}$
Hold time increased by pulse width
[Figure: pulsed latches L1 and L2 on φp with logic CL between Q1 and D2; timing of φp, Q1, and D2, marking tpw, thold, tccq, and tcd.]

Time Borrowing
In a flop-based system:
– Data launches on one rising edge
– Must setup before next rising edge
– If it arrives late, system fails
– If it arrives early, time is wasted
– Flops have hard edges
In a latch-based system
– Data can pass through latch while transparent
– Long cycle of logic can borrow time into next
– As long as each loop completes in one cycle

Time Borrowing Example
[Figure: (a) latches on φ1, φ2, φ1 with combinational logic between them, borrowing time across a half-cycle boundary and across a pipeline stage boundary; (b) a loop of two latches (φ1, φ2) and combinational logic. Loops may borrow time internally but must complete within the cycle.]

How Much Borrowing?
2-Phase Latches
$t_{borrow} \le \frac{T_c}{2} - (t_{setup} + t_{nonoverlap})$
Pulsed Latches
$t_{borrow} \le t_{pw} - t_{setup}$
[Figure: latches L1 (φ1) and L2 (φ2) with Combinational Logic 1 between Q1 and D2; timing of φ1, φ2, and D2 over Tc, marking the nominal half-cycle 1 delay (Tc/2), tborrow, tsetup, and tnonoverlap.]

Clock Skew
We have assumed zero clock skew
Clocks really have uncertainty in arrival time
– Decreases maximum propagation delay
– Increases minimum contamination delay
– Decreases time borrowing

Skew: Flip-Flops
$t_{pd} \le T_c - (t_{pcq} + t_{setup} + t_{skew})$
The term $t_{pcq} + t_{setup} + t_{skew}$ is the sequencing overhead.
$t_{cd} \ge t_{hold} - t_{ccq} + t_{skew}$
[Figure: flops F1 and F2 with combinational logic between Q1 and D2; max-delay and min-delay timing of clk, Q1, and D2 with tskew marked on the clock edges.]

Skew: Latches
2-Phase Latches
$t_{pd} \le T_c - 2t_{pdq}$
The term $2t_{pdq}$ is the sequencing overhead.
$t_{cd1},\, t_{cd2} \ge t_{hold} - t_{ccq} - t_{nonoverlap} + t_{skew}$
$t_{borrow} \le \frac{T_c}{2} - (t_{setup} + t_{nonoverlap} + t_{skew})$
Pulsed Latches
$t_{pd} \le T_c - \max(t_{pdq},\; t_{pcq} + t_{setup} - t_{pw} + t_{skew})$
The max term is the sequencing overhead.
$t_{cd} \ge t_{hold} + t_{pw} - t_{ccq} + t_{skew}$
$t_{borrow} \le t_{pw} - (t_{setup} + t_{skew})$
[Figure: latches L1 (φ1), L2 (φ2), and L3 (φ1) with Combinational Logic 1 between Q1 and D2 and Combinational Logic 2 between Q2 and D3.]

Two-Phase Clocking
If setup times are violated, reduce clock speed
If hold times are violated, chip fails at any speed
In this class, working chips are most important
– No tools to analyze clock skew
An easy way to guarantee hold times is to use 2-phase latches with big nonoverlap times
Call these clocks φ1, φ2 (ph1, ph2)

Safe Flip-Flop
In class, use flip-flop with nonoverlapping clocks
– Very slow: nonoverlap adds to setup time
– But no hold times
In industry, use a better timing analyzer
– Add buffers to slow signals if hold time is at risk
[Figure: safe flip-flop schematic with D, internal node X, and Q.]

Summary
Flip-Flops:
– Very easy to use, supported by all tools
2-Phase Transparent Latches:
– Lots of skew tolerance and time borrowing
Pulsed Latches:
– Fast, some skew tolerance and time borrowing, hold time risk
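The trade-offs in the summary can be made concrete by evaluating the min-delay and time-borrowing limits with clock skew included. This is a minimal sketch under the standard constraint forms for each method; the function names and all picosecond values are hypothetical, chosen only to illustrate the comparison.

```python
def min_tcd_flop(thold, tccq, tskew=0):
    """Min logic contamination delay between flops: thold - tccq + tskew."""
    return thold - tccq + tskew


def min_tcd_two_phase(thold, tccq, tnonoverlap, tskew=0):
    """Min contamination delay per half-cycle: thold - tccq - tnonoverlap + tskew."""
    return thold - tccq - tnonoverlap + tskew


def min_tcd_pulsed(thold, tccq, tpw, tskew=0):
    """Min contamination delay between pulsed latches: thold - tccq + tpw + tskew."""
    return thold - tccq + tpw + tskew


def max_borrow_two_phase(tc, tsetup, tnonoverlap, tskew=0):
    """Max time borrowed across a half-cycle: Tc/2 - (tsetup + tnonoverlap + tskew)."""
    return tc / 2 - (tsetup + tnonoverlap + tskew)


def max_borrow_pulsed(tpw, tsetup, tskew=0):
    """Max time borrowed through a pulsed latch: tpw - (tsetup + tskew)."""
    return tpw - (tsetup + tskew)


if __name__ == "__main__":
    # Hypothetical parameters in picoseconds (not from the slides).
    tc, tsetup, thold, tccq = 1000, 65, 30, 35
    tnonoverlap, tpw, tskew = 60, 150, 50

    # A non-positive requirement means the hold constraint is met even with no logic.
    print("min tcd, flops:   ", min_tcd_flop(thold, tccq, tskew))
    print("min tcd, 2-phase: ", min_tcd_two_phase(thold, tccq, tnonoverlap, tskew))
    print("min tcd, pulsed:  ", min_tcd_pulsed(thold, tccq, tpw, tskew))
    print("borrow, 2-phase:  ", max_borrow_two_phase(tc, tsetup, tnonoverlap, tskew))
    print("borrow, pulsed:   ", max_borrow_pulsed(tpw, tsetup, tskew))
```

With these illustrative values, the nonoverlap makes the 2-phase hold requirement negative (always satisfied), while the pulse width pushes the pulsed-latch requirement well above the flip-flop one, which matches the "hold time risk" noted in the summary.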