ECE 368 --- CAD-Based Logic Design
Lecture # 14: High-Level Digital Design Strategies -- For Ease of Design and for High-Performance Designs
Shantanu Dutt, ECE Dept., UIC

Strategy 1: Divide and Conquer (D&C)
• Think of the design problem as a computation
• Divide the main problem into 2 or more subproblems which, when solved, will lead to the solution to the main problem
• Some “stitching up” of subproblem solutions may be required
• Do this recursively, until each small problem can be solved in an obvious way, e.g. using truth tables (TTs)
[Figure: D&C tree. Main Problem → Subprob. → Subsub prob.]

Strategy 1: D&C (contd.)
• Example: Ripple-Carry Adder (RCA)
– Stitching up: Carry from LS n/2 bits is input to carry-in of MS n/2 bits at each level of the D&C tree.
– Leaf subproblem: Full Adder (FA)
• Example: Carry-Lookahead Adder (CLA)
– Division: 4 subproblems per level
– Stitching up: A more complex stitching up process (generation of “super” P,G’s to connect up the subproblems)
– Leaf subproblem: 4-bit basic CLA with small p, g bits.
• More intricate techniques (like P,G generation in CLA) for complex stitching up for fast designs may need to be devised that are not directly suggested by D&C. But D&C is a good starting point.
[Figure (a): D&C for Ripple-Carry Adder. Add n-bit #s X, Y → Add MS n/2 bits of X,Y and Add LS n/2 bits of X,Y → ... → FA leaves]
[Figure (b): D&C for Carry-Lookahead Adder. Add n-bit #s X, Y → four n/4-bit additions → ... → 4-bit CLA leaves]

Strategy 2: Fast Tree Designs for Associative Operations
• An associative operation op is one for which the grouping of the operands does not change the result. Thus, A op B op C op D = (A op B) op (C op D).
• This means that (A op B) and (C op D) can be done simultaneously to speed up the operation, and the results op’ed to get the final result.
• Thus associative operations can be performed using tree-like designs to get the result in Theta(log n) time
• At each level of the tree the op operations are performed simultaneously and their results are op’ed at the next higher level, and so forth
• E.g. of assoc. oper: +, *, and, or, xor; each satisfies A op B op C = (A op B) op C = A op (B op C)
• E.g. of non-assoc. oper: -, /
• E.g. designs: AND-tree, Wallace-tree multiplier
[Figure (a): “Linear” AND’ing of n bits into output z. Time = (n-1)d, d = AND gate delay]
[Figure (b): Tree-based AND’ing of n bits into output z. Time = d log2(n)]

Strategy 3: Speculative Computations -- Faster Designs
• If there is a data dependency between two or more portions of a computation (which may be obtained using D&C), don’t wait for the “previous” computation to finish before starting the next one
• Assume all possible input values for the next computation/stage B (e.g., if it has 2 inputs from the prev. stage there will be 4 possible input value combinations) and perform it using a copy of the design for each possible input value combination.
• All the different o/p’s of the diff. copies of B are Mux’ed using prev. stage A’s o/p
• E.g. design: Carry-Select Adder (each stage performs two additions, one for carry-in of 0 and another for carry-in of 1 from the previous stage)
• Works well when T(A) ≈ T(B) and T(A) >> T(Mux)
[Figure (a): Original design, A feeding B (inputs x, y; output z). Time = T(A)+T(B)]
[Figure (b): Speculative computation. Copies B(0,0), B(0,1), B(1,0), B(1,1) run alongside A; a 4:1 Mux driven by A’s o/p selects output z. Time = max(T(A),T(B)) + T(Mux)]

Strategy 4: Get the Best of Both Worlds (Average and Worst Case Delays)!
Figure labels: Registers (inputs); start; Unary Division Ckt (good ave case, bad worst case); done1; output; select; Ext. FSM; done2; Mux; Non-Restoring Div.
Ckt (bad ave case, good worst case); output Register.
• Use 2 circuits with different worst-case and average-case behaviors
• Use the first available output
• Get the best of both (ave-case, worst-case) worlds
• In the above schematic, we get the good ave case performance of unary division (assuming uniformly distributed inputs) w/o the disadvantage of its bad worst-case performance

Strategy 5: Pipeline It!
[Figure: Original ckt or datapath → Conversion to a simple level-partitioned pipeline with Stage 1, Stage 2, ..., Stage k (level partition may not always be possible but other pipelineable partitions may be)]
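
The carry-select adder named under Strategy 3 can be sketched in software as follows. This is only a behavioral model under assumptions not in the slides: the function names, the 8-bit width and the 4-bit block size are illustrative choices. The two speculative additions per block are computed one after the other here, whereas in hardware they would run in parallel with the previous block.

```python
def ripple_add(x_bits, y_bits, carry_in):
    """Add two equal-length, LSB-first bit lists using a chain of full adders."""
    out = []
    c = carry_in
    for a, b in zip(x_bits, y_bits):
        out.append(a ^ b ^ c)             # full-adder sum bit
        c = (a & b) | (c & (a ^ b))       # full-adder carry-out
    return out, c


def carry_select_add(x, y, n_bits=8, block=4):
    """Behavioral carry-select adder; n_bits and block are illustrative values."""
    x_bits = [(x >> i) & 1 for i in range(n_bits)]
    y_bits = [(y >> i) & 1 for i in range(n_bits)]
    result, carry = [], 0
    for lo in range(0, n_bits, block):
        xb, yb = x_bits[lo:lo + block], y_bits[lo:lo + block]
        # Speculate: compute the block for both possible carry-in values.
        sum0, cout0 = ripple_add(xb, yb, 0)
        sum1, cout1 = ripple_add(xb, yb, 1)
        # "Mux": the real carry from the previous block picks one result.
        s, carry = (sum1, cout1) if carry else (sum0, cout0)
        result.extend(s)
    value = sum(bit << i for i, bit in enumerate(result))
    return value, carry


if __name__ == "__main__":
    print(carry_select_add(200, 100))  # n-bit sum and carry-out
```

In this model each block's delay is hidden behind the previous block's, so only the mux selection lies on the carry path, which matches the max(T(A),T(B)) + T(Mux) figure quoted for Strategy 3.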
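
Strategy 4's "use the first available output" idea can be illustrated with a small cycle-count model. The cost figures are assumptions made for this sketch, not taken from the slides: unary division is modeled as repeated subtraction costing one cycle per subtraction plus one final compare, and the non-restoring divider is modeled only by an assumed fixed latency of n_bits cycles (its internal algorithm is not implemented). All function names are hypothetical.

```python
def unary_divide(x, y):
    """Repeated-subtraction division; assumed cost: one cycle per subtraction + 1."""
    q, cycles = 0, 0
    while x >= y:
        x -= y
        q += 1
        cycles += 1
    return q, x, cycles + 1


def fixed_latency_divide(x, y, n_bits):
    """Stand-in for a non-restoring divider: assumed to take n_bits cycles always."""
    q, r = divmod(x, y)
    return q, r, n_bits


def race_divide(x, y, n_bits=8):
    """Run both models and keep whichever would signal 'done' first."""
    q1, r1, c1 = unary_divide(x, y)
    q2, r2, c2 = fixed_latency_divide(x, y, n_bits)
    assert (q1, r1) == (q2, r2)           # both circuits compute the same result
    if c1 <= c2:
        return q1, r1, c1, "unary"
    return q2, r2, c2, "non-restoring"


if __name__ == "__main__":
    print(race_divide(7, 3))     # small quotient: unary circuit finishes first
    print(race_divide(200, 1))   # large quotient: fixed-latency circuit wins
```

Under these assumptions the combined latency is the minimum of the two, so small quotients finish in a few cycles while the worst case stays bounded by n_bits. The Ext. FSM and Mux overhead is ignored in this model.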