ECE 368 --- CAD-Based Logic Design
Lecture # 14 : High-Level Digital Design Strategies -- For Ease of Design and for High-Performance Designs
Shantanu Dutt
ECE Dept., UIC
Strategy 1: Divide and Conquer (D&C)
• Think of the design problem as a computation
• Divide the main problem into 2 or more subproblems which, when solved, will lead to the solution to the main problem
• Some “stitching up” of subproblem solutions may be required
• Do this recursively, until each small problem can be solved in an obvious way, e.g. using truth tables (TTs)
[Figure: D&C tree. The Main Problem splits into two Subprobs., and each Subprob. splits into two Subsubprobs.]
Strategy 1: D&C (contd.)
• Example: Ripple-Carry Adder (RCA)
– Stitching up: Carry from LS n/2 bits is input to carry-in of MS n/2 bits at each level of the D&C tree.
– Leaf subproblem: Full Adder (FA)
• Example: Carry-Lookahead Adder (CLA)
– Division: 4 subproblems per level
– Stitching up: A more complex stitching up process (generation of “super” P,G’s to connect up the subproblems)
– Leaf subproblem: 4-bit basic CLA with small p, g bits.
• More intricate techniques (like P,G generation in CLA) for complex stitching up for fast designs may need to be devised that are not directly suggested by D&C. But D&C is a good starting point.
[Figure (a): D&C for Ripple-Carry Adder. "Add n-bit #s X, Y" splits into "Add MS n/2 bits of X,Y" and "Add LS n/2 bits of X,Y"; the leaves are FAs.]
[Figure (b): D&C for Carry-Lookahead Adder. "Add n-bit #s X, Y" splits into four "Add n/4 bits" subproblems; the leaves are 4-bit CLAs.]
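The two adder examples can be sketched in software as a rough illustration. This is a behavioral model, not a hardware description, and the names dc_add, group_pg, to_bits and from_bits are hypothetical helpers introduced here. dc_add follows the RCA recursion (the carry-out of the LS half is stitched into the carry-in of the MS half). group_pg builds block-level ("super") P,G values; it assumes a 2-way split for brevity, whereas the slide's CLA divides 4 ways.

```python
def full_adder(x, y, cin):
    """Leaf subproblem: 1-bit full adder. Returns (sum_bit, carry_out)."""
    s = x ^ y ^ cin
    cout = (x & y) | (cin & (x ^ y))
    return s, cout


def dc_add(xs, ys, cin=0):
    """Add two equal-length bit lists (LSB first) by divide and conquer.

    Returns (sum_bits, carry_out); sum_bits is also LSB first.
    """
    n = len(xs)
    if n == 1:
        s, cout = full_adder(xs[0], ys[0], cin)
        return [s], cout
    half = n // 2
    # Stitching up: the carry out of the LS half feeds the MS half.
    lo_sum, lo_carry = dc_add(xs[:half], ys[:half], cin)
    hi_sum, hi_carry = dc_add(xs[half:], ys[half:], lo_carry)
    return lo_sum + hi_sum, hi_carry


def group_pg(xs, ys):
    """Block-level (P, G) for equal-length bit lists (LSB first).

    P: the block propagates an incoming carry. G: the block generates a carry.
    Assumption: 2-way split here; the slide's CLA uses 4 subproblems per level.
    """
    n = len(xs)
    if n == 1:
        return xs[0] ^ ys[0], xs[0] & ys[0]  # bit-level p, g
    half = n // 2
    p_lo, g_lo = group_pg(xs[:half], ys[:half])
    p_hi, g_hi = group_pg(xs[half:], ys[half:])
    # Both halves are independent, so hardware can evaluate them in parallel.
    return p_lo & p_hi, g_hi | (p_hi & g_lo)


def to_bits(value, n):
    """Integer -> list of n bits, LSB first."""
    return [(value >> i) & 1 for i in range(n)]


def from_bits(bits):
    """List of bits (LSB first) -> integer."""
    return sum(b << i for i, b in enumerate(bits))
```

In dc_add the MS half must wait for lo_carry, which is why the RCA stays slow. In group_pg neither half depends on the other, and the block's carry-out is then G | (P & cin), which is the kind of stitching that D&C alone does not suggest.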
Strategy 2: Fast Tree Designs for Associative Operations
• An associative operation op is defined as one for which: A op B op C = (A op B) op C = A op (B op C). Thus, A op B op C op D = (A op B) op (C op D).
• This means that (A op B) and (C op D) can be done simultaneously to speed up the operation, and the results op’ed to get the final result.
• Thus associative operations can be performed using tree-like designs to get the result in Theta(log n) time
• At each level of the tree the op operations are performed simultaneously and their results are op’ed at the next higher level, and so forth
• E.g. of assoc. oper: +, *, and, or, xor
• E.g. of non-assoc. oper: -, /
• E.g. designs: AND-tree, Wallace-tree multiplier
[Figure (a): “Linear” AND’ing of n bits: a chain of & gates from the inputs to output z. Time = (n-1)d, d = & gate delay.]
[Figure (b): Tree-based AND’ing of n bits: a balanced tree of & gates from the inputs to output z. Time = d log(n).]
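The difference between the linear and tree-based evaluations can be sketched as follows. This is a software model that counts levels of op delay, not a gate-level design, and linear_reduce and tree_reduce are hypothetical names introduced here.

```python
from operator import and_, sub


def linear_reduce(op, values):
    """Chain evaluation, as in figure (a). Returns (result, depth in op delays)."""
    result = values[0]
    depth = 0
    for v in values[1:]:
        result = op(result, v)
        depth += 1
    return result, depth


def tree_reduce(op, values):
    """Balanced-tree evaluation, as in figure (b). Returns (result, depth)."""
    level = list(values)
    depth = 0
    while len(level) > 1:
        # All pairs on one level are independent, so they cost one op delay.
        nxt = [op(level[i], level[i + 1]) for i in range(0, len(level) - 1, 2)]
        if len(level) % 2:
            nxt.append(level[-1])  # odd element passes through unchanged
        level = nxt
        depth += 1
    return level[0], depth


bits = [1] * 16
linear_result, linear_depth = linear_reduce(and_, bits)  # depth n-1
tree_result, tree_depth = tree_reduce(and_, bits)        # depth log2(n)

# Subtraction is not associative, so the two groupings disagree.
chain_diff, _ = linear_reduce(sub, [8, 4, 2, 1])  # ((8-4)-2)-1
tree_diff, _ = tree_reduce(sub, [8, 4, 2, 1])     # (8-4)-(2-1)
```

For the 16 input bits the chain needs 15 levels and the tree needs 4, with the same result. For subtraction the chain gives 1 and the tree gives 3, which is why the tree regrouping is only valid for associative operations.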
Strategy 3: Speculative Computations -- Faster Designs
• If there is a data dependency between two or more portions of a computation (which may be obtained using D&C), don’t wait for the “previous” computation to finish before starting the next one
• Assume all possible input values for the next computation/stage B (e.g., if it has 2 inputs from the prev. stage there will be 4 possible input value combinations) and perform it using a copy of the design for each possible input value combination.
• All the different outputs of the different copies of B are Mux’ed using prev. stage A’s output
• E.g. design: Carry-Select Adder (at each stage performs two additions, one for carry-in of 0 and another for carry-in of 1 from the previous stage)
• Works well when T(A) is approx. equal to T(B) and T(A) >> T(Mux)
[Figure (a): Original design: x -> A -> y -> B -> z. Time = T(A)+T(B).]
[Figure (b): Speculative computation: copies B(0,0), B(0,1), B(1,0), B(1,1) run in parallel with A, and a 4:1 Mux selected by A's output y produces z. Time = max(T(A),T(B)) + T(Mux).]
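The Carry-Select Adder example can be sketched behaviorally as below. The block size of 4 and the names ripple_add, carry_select_add, to_bits and from_bits are assumptions made for illustration. In software the two speculative additions run one after the other; in hardware they would be separate copies running in parallel with the previous stage.

```python
def ripple_add(xs, ys, cin):
    """Add equal-length bit lists (LSB first). Returns (sum_bits, carry_out)."""
    out, carry = [], cin
    for x, y in zip(xs, ys):
        out.append(x ^ y ^ carry)
        carry = (x & y) | (carry & (x ^ y))
    return out, carry


def carry_select_add(xs, ys, block=4, cin=0):
    """Carry-select addition of equal-length bit lists (LSB first).

    Assumption: fixed block size; the last block may be shorter.
    """
    out, carry = [], cin
    for i in range(0, len(xs), block):
        bx, by = xs[i:i + block], ys[i:i + block]
        # Speculate: compute the block for both possible carry-in values.
        sum0, cout0 = ripple_add(bx, by, 0)
        sum1, cout1 = ripple_add(bx, by, 1)
        # Mux: the actual carry from the previous block selects one result.
        sum_sel, carry = (sum1, cout1) if carry else (sum0, cout0)
        out.extend(sum_sel)
    return out, carry


def to_bits(value, n):
    """Integer -> list of n bits, LSB first."""
    return [(value >> i) & 1 for i in range(n)]


def from_bits(bits):
    """List of bits (LSB first) -> integer."""
    return sum(b << i for i, b in enumerate(bits))
```

Each block here plays the role of stage B with one input from the previous stage, so two copies and a 2:1 selection are enough. The only work left on the carry path per block is the Mux.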
Strategy 4: Get the Best of Both Worlds (Average and Worst Case Delays)!
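[Figure: Registers supply the inputs, with a start signal, to two dividers in parallel: a Unary Division Ckt (good ave case, bad worst case; signals done1) and a NonRestoring Div. Ckt (bad ave case, good worst case; signals done2). An Ext. FSM uses done1 and done2 to drive the select of a Mux, which passes one of the two outputs to the output Register.]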
• Use 2 circuits with different worst-case and average-case behaviors
• Use the first available output
• Get the best of both (ave-case, worst-case) worlds
• In the above schematic, we get the good ave case performance of unary division (assuming uniformly distributed inputs) w/o the disadvantage of its bad worst-case performance
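A rough cycle-count model shows why taking the first available output helps. All cycle formulas below are assumptions made for illustration and are not taken from the slides: the unary divider is modeled as repeated subtraction (one cycle per subtraction plus a final compare), the non-restoring divider as one cycle per quotient bit, and FSM and Mux overhead are ignored. The names N_BITS, unary_cycles, nonrestoring_cycles, raced_cycles and stats are hypothetical.

```python
N_BITS = 8  # assumed operand width


def unary_cycles(x, y):
    """Assumed model: repeated subtraction, x // y subtractions + 1 compare."""
    return x // y + 1


def nonrestoring_cycles(x, y):
    """Assumed model: one cycle per quotient bit, independent of the data."""
    return N_BITS


def raced_cycles(x, y):
    """Both circuits start together; the first 'done' signal wins."""
    return min(unary_cycles(x, y), nonrestoring_cycles(x, y))


def stats(cycle_fn):
    """(average, worst) cycles over all N_BITS-wide x and all nonzero y."""
    samples = [cycle_fn(x, y)
               for x in range(2 ** N_BITS)
               for y in range(1, 2 ** N_BITS)]
    return sum(samples) / len(samples), max(samples)
```

Calling stats(raced_cycles) under these assumptions gives a worst case equal to that of the non-restoring divider and an average no worse than that of either circuit, since the minimum of the two delays is taken for every input pair.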
Strategy 5: Pipeline It!
[Figure: Original ckt or datapath, and its conversion to a simple level-partitioned pipeline with Stage 1, Stage 2, ..., Stage k (level partition may not always be possible, but other pipelineable partitions may be).]