Logical Effort of Higher Valency Adders
David Harris
Harvey Mudd College
November 2004
Outline
Are higher valency adders worth the hassle?
Prefix networks
Building blocks
Critical paths
Results
Conclusions
Prefix Networks
[Figure: block diagram of a 4-bit adder. Precomputation turns the inputs A4..A1, B4..B1, and Cin into bitwise signals G4..G0 and P4..P0 (Gi:i, Pi:i). The prefix network produces the group generates G3:0, G2:0, G1:0, and G0:0, which are the carries C3..C0. Postcomputation produces the sums S4..S1 and Cout (C4).]
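The three stages of the figure can be sketched in Python as follows. This is a minimal illustration and not code from the slides: the function name `prefix_add` is hypothetical, Cin is treated as bit 0 (G0 = Cin, P0 = 0) as the figure labels suggest, and the prefix network is a plain serial chain rather than any of the parallel networks discussed later.

```python
def prefix_add(a, b, cin, n):
    """Add two n-bit integers via precomputation, a prefix network, and postcomputation."""
    # Precomputation: bitwise generate/propagate for bits 1..n; bit 0 carries Cin.
    g = [cin & 1] + [((a >> i) & (b >> i)) & 1 for i in range(n)]
    p = [0] + [((a >> i) ^ (b >> i)) & 1 for i in range(n)]

    # Prefix network (serial chain here): G[i] holds the group generate G_{i:0}.
    G = [g[0]]
    for i in range(1, n + 1):
        G.append(g[i] | (p[i] & G[i - 1]))

    # Postcomputation: S_i = P_i XOR G_{i-1:0}, and Cout = G_{n:0}.
    s = 0
    for i in range(1, n + 1):
        s |= (p[i] ^ G[i - 1]) << (i - 1)
    return s, G[n]


if __name__ == "__main__":
    total, cout = prefix_add(0b1011, 0b0110, 1, 4)
    print(bin(total), cout)
```

Only the middle loop changes between adder architectures; the precomputation and postcomputation stay the same.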
Pre/post computation
[Figure: transistor-level schematics, annotated with transistor widths, of the bitwise cell (inputs Ai, Bi; outputs Gi, Pi, and Ki in domino) and the sum XOR (inputs Pi, Gi-1:0; output Si), shown for inverting static CMOS and for footless domino. The domino circuits use dual-rail signals (Ai_h, Ai_l, Bi_h, Bi_l, Si_h, Si_l) and gates marked H.]
Cell     Term   Noninverting CMOS  Inverting CMOS  Footed Domino    Footless Domino
Bitwise  LEbit  9/3                9/3             6/3 * 5/6        4/3 * 5/6
Bitwise  PDbit  6/3 + 1            6/3             7/3 + 5/6        5/3 + 5/6
Sum XOR  LExor  9/3                9/3             3/3 * 5/6        2/3 * 5/6
Sum XOR  PDxor  9/3 + 12/3         9/3 + 12/3      7/3 + 5/6        5/3 + 5/6
Buffer   LEbuf  1 * 1/2            1 * 1/2         2/3 * 5/6 * 1/2  1/3 * 5/6 * 1/2
Valency 2 Networks
[Figure: taxonomy of 16-bit valency-2 prefix networks placed on three axes: l (logic levels), f (fanout), and t (wire tracks). Axis labels: l = 0 (4), 1 (5), 2 (6), 3 (7); f = 0 (2), 1 (3), 2 (5), 3 (9); t = 0 (1), 1 (2), 2 (4), 3 (8). Panels show each network for bits 15..0 with outputs 15:0..0:0:
(a) Brent-Kung (3, 0, 0)
(b) Sklansky (0, 3, 0)
(c) Kogge-Stone (0, 0, 3)
(d) Han-Carlson (1, 0, 2)
(e) Knowles [2,1,1,1] (0, 1, 2)
(f) Ladner-Fischer (1, 2, 0)
The taxonomy also marks Knowles [4,2,1,1] and a new network at (1, 1, 1).]
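As a rough reading of the figure's axis labels, a 16-bit valency-2 network at coordinates (l, f, t) has log2(16) + l logic levels, a maximum fanout of 2^f + 1, and 2^t wire tracks. The sketch below tabulates that reading. The function name `network_costs` and the dictionary layout are illustrative and do not come from the slides.

```python
import math

# (l, f, t) coordinates as labeled in the figure for 16-bit valency-2 networks.
NETWORKS = {
    "Brent-Kung": (3, 0, 0),
    "Sklansky": (0, 3, 0),
    "Kogge-Stone": (0, 0, 3),
    "Han-Carlson": (1, 0, 2),
    "Knowles [2,1,1,1]": (0, 1, 2),
    "Ladner-Fischer": (1, 2, 0),
}


def network_costs(l, f, t, n_bits=16):
    """Return (logic levels, max fanout, wire tracks) for a point in the taxonomy."""
    levels = int(math.log2(n_bits)) + l
    fanout = 2 ** f + 1
    tracks = 2 ** t
    return levels, fanout, tracks


if __name__ == "__main__":
    for name, (l, f, t) in NETWORKS.items():
        levels, fanout, tracks = network_costs(l, f, t)
        print(f"{name:18s} levels={levels} fanout={fanout} tracks={tracks}")
```

Every network listed satisfies l + f + t = 3 for 16 bits, so improving one cost pushes a network along another axis.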
Valency 3: BK
[Figure: 27-bit valency-3 Brent-Kung prefix network, inputs 26..0, outputs 26:0..0:0.]
Valency 3: LF
[Figure: 27-bit valency-3 Ladner-Fischer prefix network, inputs 26..0, outputs 26:0..0:0.]
Valency 3: Sklansky
[Figure: 27-bit valency-3 Sklansky prefix network, inputs 26..0, outputs 26:0..0:0.]
Valency 3: KS
[Figure: 27-bit valency-3 Kogge-Stone prefix network, inputs 26..0, outputs 26:0..0:0.]
Valency 3: HC
[Figure: 27-bit valency-3 Han-Carlson prefix network, inputs 26..0, outputs 26:0..0:0.]
Building Blocks
Black Cells, Gray Cells, Buffers
– Black cells compute G and P
– Gray cells compute G
Circuit families
– Inverting Static CMOS
– Noninverting Static CMOS
– Footed Domino
– Footless Domino
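The black and gray cells can be sketched in Python using the standard valency-2 prefix relations G = G_hi + P_hi G_lo and P = P_hi P_lo. The function names are hypothetical, and the Kogge-Stone arrangement below is only one way to wire the cells together, chosen because its indexing is simple.

```python
def black_cell(hi, lo):
    """Combine (G, P) of an upper and a lower group into (G, P) of their union."""
    g_hi, p_hi = hi
    g_lo, p_lo = lo
    return (g_hi | (p_hi & g_lo), p_hi & p_lo)


def gray_cell(hi, g_lo):
    """Like a black cell, but produce only the group generate."""
    g_hi, p_hi = hi
    return g_hi | (p_hi & g_lo)


def kogge_stone_generates(gp):
    """Return G_{i:0} for every bit i, given bitwise (G, P) pairs with bit 0 first."""
    n = len(gp)
    cur = list(gp)
    dist = 1
    while dist < n:
        nxt = list(cur)
        for i in range(dist, n):
            if i < 2 * dist:
                # The result reaches down to bit 0, so only G is needed (gray cell).
                nxt[i] = (gray_cell(cur[i], cur[i - dist][0]), 0)
            else:
                nxt[i] = black_cell(cur[i], cur[i - dist])
        cur = nxt
        dist *= 2
    return [g for g, _ in cur]


if __name__ == "__main__":
    # Bit 0 carries Cin = 1; bits 1..3 are bitwise (G, P) pairs.
    print(kogge_stone_generates([(1, 0), (0, 1), (0, 1), (1, 0)]))
```

Gray cells appear wherever a group already extends to bit 0, because no later cell needs the group propagate of such a group.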
Valency 2
[Figure: transistor-level schematics of the valency-2 cell, annotated with transistor widths. The static CMOS version combines G1, P1, G0, and P0 into Gi:j and P1:0. The domino version also takes K1 and K0 and drives G1:0, P1:0, and K1:0 through gates marked H.]
Valency 4
[Figure: transistor-level schematics of the valency-4 cell, annotated with transistor widths, combining G3..G0 and P3..P0 (plus K3..K0 in the domino version) into G3:0, P3:0, and K3:0.]
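A valency-4 cell computes the same function as three chained valency-2 cells, but in a single gate: G3:0 = G3 + P3 G2 + P3 P2 G1 + P3 P2 P1 G0 and P3:0 = P3 P2 P1 P0. The sketch below generalizes this to any valency. The name `black_cell_v` is hypothetical, and the code models only the logic function, not the transistor sizing shown in the schematic.

```python
def black_cell_v(groups):
    """Combine v adjacent (G, P) groups, most significant first, in one cell."""
    g_out, p_out = 0, 1
    for g, p in groups:
        # A lower group's generate reaches the output only if
        # every group above it propagates.
        g_out |= p_out & g
        p_out &= p
    return g_out, p_out


if __name__ == "__main__":
    # Groups listed as (G3, P3), (G2, P2), (G1, P1), (G0, P0).
    print(black_cell_v([(0, 1), (0, 1), (1, 0), (0, 1)]))
```

With v inputs per cell, an N-bit network needs about log_v(N) levels instead of log_2(N), which is the potential benefit that the delay parameters on the following slides weigh against the larger logical effort and parasitic delay per cell.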
Delay Parameters (v=2)
Term  Cell   Inverting CMOS  Noninverting CMOS  Footed Domino  Footless Domino
PDg          7/3             7/3 + 1            6/3 + 5/6      4/3 + 5/6
PDp          2               2 + 1              3/3 + 5/6      2/3 + 5/6
LEg1         5/3             (same)             1/2 * 5/6      1/3 * 5/6
LEg0         6/3             (same)             3/3 * 5/6      2/3 * 5/6
LEp1  Gray   6/3             (same)             3/3 * 5/6      2/3 * 5/6
LEp1  Black  10/3            (same)             6/3 * 5/6      4/3 * 5/6
LEp0  Black  4/3             (same)             3/3 * 5/6      2/3 * 5/6
LE rows: the single static CMOS value applies to both the inverting and noninverting columns.
Delay Parameters (v=3)
Term  Cell   Inverting CMOS  Noninverting CMOS  Footed Domino  Footless Domino
PDg          13/3            13/3 + 1           10/3 + 5/6     7/3 + 5/6
PDp          3               3 + 1              4/3 + 5/6      3/3 + 5/6
LEg2         7/3             (same)             4/9 * 5/6      1/3 * 5/6
LEg1         8/3             (same)             2/3 * 5/6      1/2 * 5/6
LEg0         9/3             (same)             4/3 * 5/6      3/3 * 5/6
LEp2  Gray   5/3             (same)             4/3 * 5/6      3/3 * 5/6
LEp1  Gray   9/3             (same)             4/3 * 5/6      3/3 * 5/6
LEp2  Black  10/3            (same)             8/3 * 5/6      6/3 * 5/6
LEp1  Black  14/3            (same)             8/3 * 5/6      6/3 * 5/6
LEp0  Black  5/3             (same)             4/3 * 5/6      3/3 * 5/6
LE rows: the single static CMOS value applies to both the inverting and noninverting columns.
Delay Parameters (v=4)
Term  Cell   Inverting CMOS  Noninverting CMOS  Footed Domino  Footless Domino
PDg          17/3            17/3 + 1           13/3 + 5/6     10/3 + 5/6
PDp          4               4 + 1              5/3 + 5/6      4/3 + 5/6
LEg3         9/3             (same)             5/12 * 5/6     1/3 * 5/6
LEg2         10/3            (same)             5/9 * 5/6      4/9 * 5/6
LEg1         10/3            (same)             5/6 * 5/6      2/3 * 5/6
LEg0         12/3            (same)             5/3 * 5/6      4/3 * 5/6
LEp3  Gray   8/3             (same)             5/3 * 5/6      4/3 * 5/6
LEp2  Gray   6/3             (same)             5/3 * 5/6      4/3 * 5/6
LEp1  Gray   12/3            (same)             5/3 * 5/6      4/3 * 5/6
LEp3  Black  14/3            (same)             10/3 * 5/6     8/3 * 5/6
LEp2  Black  12/3            (same)             10/3 * 5/6     8/3 * 5/6
LEp1  Black  18/3            (same)             10/3 * 5/6     8/3 * 5/6
LEp0  Black  6/3             (same)             5/3 * 5/6      4/3 * 5/6
LE rows: the single static CMOS value applies to both the inverting and noninverting columns.
Wire Capacitance
Wires contribute capacitance per length
– Count tracks t spanned by each wire
– $w$ = (wire capacitance per track) / (unit inverter capacitance)
– Wire capacitance of stage $i$: $w_i = w t_i$
– Total wire capacitance: $W = \sum_i w_i$
Delay Estimation
Ideally choose best size for each gate
$D = D_F + P = N F^{1/N} + P$
– Only valid if wire parasitics are negligible
Alternatively, make each stage have unit drive
$D = D_F + P + W = \sum_{i=1}^{N} f_i + P + W$ (inverting static CMOS)
$D = D_F + P + W = \sum_{i=1}^{N/2} 2\sqrt{f_{2i} f_{2i-1}} + P + W$ (others, 2 gates/stage)
How much performance does this cost?
– Very little unless stage efforts are nonuniform
– Overestimates delay of Sklansky / LF networks
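The two estimates can be compared numerically with a small Python sketch. It makes two assumptions the slides do not state explicitly: the path effort F is taken as the product of the per-stage efforts f_i, and the two-gates-per-stage formula is read as a sum of 2*sqrt(f_{2i-1} * f_{2i}) terms. All function names are hypothetical.

```python
import math


def ideal_delay(f, P):
    """D = N * F**(1/N) + P, with F assumed to be the product of the efforts in f."""
    N = len(f)
    F = math.prod(f)
    return N * F ** (1.0 / N) + P


def unit_drive_delay(f, P, W=0.0):
    """D = sum(f_i) + P + W: one gate per stage (inverting static CMOS)."""
    return sum(f) + P + W


def unit_drive_delay_two_gate(f, P, W=0.0):
    """D = sum(2 * sqrt(f_{2i-1} * f_{2i})) + P + W: two gates per stage."""
    if len(f) % 2:
        raise ValueError("expected an even number of gate efforts")
    pairs = zip(f[0::2], f[1::2])
    return sum(2.0 * math.sqrt(a * b) for a, b in pairs) + P + W


if __name__ == "__main__":
    P = 10.0  # illustrative parasitic delay, not from the slides
    for efforts in ([4.0, 4.0, 4.0, 4.0], [1.0, 16.0, 1.0, 16.0]):
        print(efforts, ideal_delay(efforts, P), unit_drive_delay(efforts, P))
```

With uniform efforts the two estimates agree. With nonuniform efforts the unit-drive sum is larger, since an arithmetic mean is never below the geometric mean. This is consistent with the remark that unit drive costs little unless stage efforts are nonuniform.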
Method
Create MATLAB models of adders
– For each stage, list f, p, t
• Depends on architecture, valency, circuit family
– Use MATLAB to calculate total delays
Compare ideal and unit drive delays for w = 0
– Verifies unit drive simplification
Plot D vs. # of bits
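The slides use MATLAB for this step. Purely as an illustration, the per-stage bookkeeping could look like the Python sketch below, which models an adder as a list of stages with effort f, parasitic delay p, and track count t, and then applies the unit-drive estimate for one gate per stage. The `Stage` class, the `adder_delay` function, and the numbers in `example` are placeholders and are not taken from the slides.

```python
from dataclasses import dataclass


@dataclass
class Stage:
    f: float  # stage effort with unit drive
    p: float  # parasitic delay of the stage
    t: int    # wire tracks spanned by the stage's output wire


def adder_delay(stages, w=0.0):
    """Unit-drive delay, one gate per stage: sum(f_i) + sum(p_i) + w * sum(t_i)."""
    D_F = sum(s.f for s in stages)
    P = sum(s.p for s in stages)
    W = sum(w * s.t for s in stages)
    return D_F + P + W


if __name__ == "__main__":
    # Placeholder numbers for illustration only.
    example = [Stage(f=3.0, p=2.0, t=1), Stage(f=3.0, p=2.0, t=2), Stage(f=3.0, p=2.0, t=4)]
    for w in (0.0, 0.1, 0.5):
        print(f"w = {w}: D = {adder_delay(example, w):.2f}")
```

Setting w = 0 removes the wire term, which is the case the slide uses to compare the unit-drive estimate against the ideal one.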
Results
[Figure: grid of plots of delay (FO4) versus # of bits for the Brent-Kung, Sklansky, Kogge-Stone, Han-Carlson, and Ladner-Fischer networks in inverting CMOS, noninverting CMOS, footed domino, and footless domino, with curves for valency 2, valency 3, and valency 4.]
Conclusions
For fast prefix networks, the logical effort model predicts that valency barely affects delay
– Valency 2 designs are simpler
– But most commercial designs use valency 4
Weaknesses of logical effort model
– Overpredicts g for higher valency
– Underpredicts p for higher valency
– Calibrate model through simulation
– Or simulate entire designs