Logical Effort of Higher Valency Adders
David Harris
Harvey Mudd College
November 2004
Outline
• Are higher valency adders worth the hassle?
• Prefix networks
• Building blocks
• Critical paths
• Results
• Conclusions
Prefix Networks
Figure: block diagram of a 4-bit prefix adder with three stages.
Precomputation: inputs A4..A1, B4..B1, and Cin produce the bitwise signals Gi:i and Pi:i for i = 0..4.
Prefix network: produces the group generate signals G3:0, G2:0, G1:0, G0:0, which are the carries C3, C2, C1, C0.
Postcomputation: produces the sums S4..S1 and Cout = C4.
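The three stages named on this slide (precomputation, prefix network, postcomputation) can be sketched in code. This is a minimal illustration rather than the circuit from the slides: it assumes position 0 stands for the carry-in (G0:0 = Cin, P0:0 = 0), it uses a simple serial scan in place of any particular prefix network, and the function name prefix_add is hypothetical.

```python
def prefix_add(a_bits, b_bits, cin):
    """Add two equal-length bit lists (least significant bit first).

    Returns (sum_bits, cout).
    """
    n = len(a_bits)
    # Precomputation: bitwise generate and propagate.
    # Assumption: position 0 represents Cin, with G0 = Cin and P0 = 0.
    g = [cin] + [a & b for a, b in zip(a_bits, b_bits)]
    p = [0] + [a ^ b for a, b in zip(a_bits, b_bits)]
    # Prefix network (a serial scan here, not a parallel network):
    # G[i] is the group generate G_{i:0}, i.e. the carry out of position i.
    G = [g[0]]
    for i in range(1, n + 1):
        G.append(g[i] | (p[i] & G[i - 1]))
    # Postcomputation: S_i = P_i XOR G_{i-1:0}.
    s = [p[i] ^ G[i - 1] for i in range(1, n + 1)]
    return s, G[n]
```

Any parallel prefix network computes the same G values as the serial loop; the networks differ only in how many logic levels, how much fanout, and how many wire tracks they use to get there.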
Pre/post computation
Figure: transistor-level schematics, annotated with transistor widths, of the bitwise cell and the sum XOR in inverting static CMOS and in footless domino. Signals shown include Ai, Bi, Gi, Pi, Ki, Gi-1:0, Ki-1:0, and Si, with dual-rail _h/_l versions.
Cell      Term    Noninverting CMOS   Inverting CMOS   Footed Domino     Footless Domino
Bitwise   LEbit   9/3                 9/3              6/3 * 5/6         4/3 * 5/6
Bitwise   PDbit   6/3 + 1             6/3              7/3 + 5/6         5/3 + 5/6
Sum XOR   LExor   9/3                 9/3              3/3 * 5/6         2/3 * 5/6
Sum XOR   PDxor   9/3 + 12/3          9/3 + 12/3       7/3 + 5/6         5/3 + 5/6
Buffer    LEbuf   1 * 1/2             1 * 1/2          2/3 * 5/6 * 1/2   1/3 * 5/6 * 1/2
Valency 2 Networks
Figure: taxonomy of 16-bit valency-2 prefix networks, arranged on three axes: l (logic levels), f (fanout), and t (wire tracks). Each network is labeled with its (l, f, t) coordinates:
(a) Brent-Kung (3, 0, 0)
(b) Sklansky (0, 3, 0)
(c) Kogge-Stone (0, 0, 3)
(d) Han-Carlson (1, 0, 2)
(e) Knowles [2,1,1,1] (0, 1, 2)
(f) Ladner-Fischer (1, 2, 0)
The figure also marks Knowles [4,2,1,1] and a new network at (1, 1, 1).
Valency 3: BK
Figure: 27-bit Brent-Kung (BK) prefix network built from valency-3 cells, with inputs 26..0 and outputs 26:0..0:0.
Valency 3: LF
Figure: 27-bit Ladner-Fischer (LF) prefix network built from valency-3 cells, with inputs 26..0 and outputs 26:0..0:0.
Valency 3: Sklansky
Figure: 27-bit Sklansky prefix network built from valency-3 cells, with inputs 26..0 and outputs 26:0..0:0.
Valency 3: KS
Figure: 27-bit Kogge-Stone (KS) prefix network built from valency-3 cells, with inputs 26..0 and outputs 26:0..0:0.
Valency 3: HC
Figure: 27-bit Han-Carlson (HC) prefix network built from valency-3 cells, with inputs 26..0 and outputs 26:0..0:0.
Building Blocks
• Black Cells, Gray Cells, Buffers
  – Black cells compute G and P
  – Gray cells compute G
• Circuit families
  – Inverting Static CMOS
  – Noninverting Static CMOS
  – Footed Domino
  – Footless Domino
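The black and gray cells can be written as small functions. This is a sketch that assumes the standard prefix equations, G_{i:j} = G_{i:k} + P_{i:k} G_{k-1:j} and P_{i:j} = P_{i:k} P_{k-1:j}, which the slides do not spell out. The function names are hypothetical, and buffers are omitted because they pass signals through unchanged.

```python
from functools import reduce


def black_cell(upper, lower):
    """Combine (G, P) of the upper group with (G, P) of the lower group."""
    g_hi, p_hi = upper
    g_lo, p_lo = lower
    return (g_hi | (p_hi & g_lo), p_hi & p_lo)


def gray_cell(upper, g_lo):
    """Like a black cell, but only the group generate is produced."""
    g_hi, p_hi = upper
    return g_hi | (p_hi & g_lo)


def black_cell_v(groups):
    """Higher-valency black cell: combine v groups, most significant first.

    The combine operation is associative, so folding the valency-2 cell
    over the list gives the same logic function as a single wide cell.
    """
    return reduce(black_cell, groups)
```

A valency-3 or valency-4 cell computes the same function as a chain of valency-2 cells; what changes is that it does so in one gate, with the larger logical effort and parasitic delay that implies.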
Valency 2
Figure: transistor-level schematics of valency-2 cells, annotated with transistor widths. Inputs are G1, P1, G0, P0 (and K1, K0); outputs are the group signals Gi:j (G1:0), P1:0, and K1:0.
Valency 4
Figure: transistor-level schematics of valency-4 cells, annotated with transistor widths. Inputs are G3..G0, P3..P0 (and K3..K0); outputs are the group signals G3:0, P3:0, and K3:0.
Delay Parameters (v=2)
Term   Cell   Inverting CMOS   Noninverting CMOS   Footed Domino   Footless Domino
PDg           7/3              7/3 + 1             6/3 + 5/6       4/3 + 5/6
PDp           2                2 + 1               3/3 + 5/6       2/3 + 5/6
LEg1          5/3 (both CMOS)                      1/2 * 5/6       1/3 * 5/6
LEg0          6/3 (both CMOS)                      3/3 * 5/6       2/3 * 5/6
LEp1   Gray   6/3 (both CMOS)                      3/3 * 5/6       2/3 * 5/6
LEp1   Black  10/3 (both CMOS)                     6/3 * 5/6       4/3 * 5/6
LEp0   Black  4/3 (both CMOS)                      3/3 * 5/6       2/3 * 5/6
Delay Parameters (v=3)
Term   Cell   Inverting CMOS   Noninverting CMOS   Footed Domino   Footless Domino
PDg           13/3             13/3 + 1            10/3 + 5/6      7/3 + 5/6
PDp           3                3 + 1               4/3 + 5/6       3/3 + 5/6
LEg2          7/3 (both CMOS)                      4/9 * 5/6       1/3 * 5/6
LEg1          8/3 (both CMOS)                      2/3 * 5/6       1/2 * 5/6
LEg0          9/3 (both CMOS)                      4/3 * 5/6       3/3 * 5/6
LEp2   Gray   5/3 (both CMOS)                      4/3 * 5/6       3/3 * 5/6
LEp1   Gray   9/3 (both CMOS)                      4/3 * 5/6       3/3 * 5/6
LEp2   Black  10/3 (both CMOS)                     8/3 * 5/6       6/3 * 5/6
LEp1   Black  14/3 (both CMOS)                     8/3 * 5/6       6/3 * 5/6
LEp0   Black  5/3 (both CMOS)                      4/3 * 5/6       3/3 * 5/6
Delay Parameters (v=4)
Term   Cell   Inverting CMOS   Noninverting CMOS   Footed Domino   Footless Domino
PDg           17/3             17/3 + 1            13/3 + 5/6      10/3 + 5/6
PDp           4                4 + 1               5/3 + 5/6       4/3 + 5/6
LEg3          9/3 (both CMOS)                      5/12 * 5/6      1/3 * 5/6
LEg2          10/3 (both CMOS)                     5/9 * 5/6       4/9 * 5/6
LEg1          10/3 (both CMOS)                     5/6 * 5/6       2/3 * 5/6
LEg0          12/3 (both CMOS)                     5/3 * 5/6       4/3 * 5/6
LEp3   Gray   8/3 (both CMOS)                      5/3 * 5/6       4/3 * 5/6
LEp2   Gray   6/3 (both CMOS)                      5/3 * 5/6       4/3 * 5/6
LEp1   Gray   12/3 (both CMOS)                     5/3 * 5/6       4/3 * 5/6
LEp3   Black  14/3 (both CMOS)                     10/3 * 5/6      8/3 * 5/6
LEp2   Black  12/3 (both CMOS)                     10/3 * 5/6      8/3 * 5/6
LEp1   Black  18/3 (both CMOS)                     10/3 * 5/6      8/3 * 5/6
LEp0   Black  6/3 (both CMOS)                      5/3 * 5/6       4/3 * 5/6
Wire Capacitance
• Wires contribute capacitance per length
  – Count tracks t spanned by each wire
• $\omega$ = (wire capacitance per track) / (unit inverter capacitance)
• $w = \omega t$
• $W = \sum_i w_i$
Delay Estimation
• Ideally choose best size for each gate
$D = D_F + P = N F^{1/N} + P$
  – Only valid if wire parasitics are negligible
• Alternatively, make each stage have unit drive
$D = D_F + P + W = \sum_{i=1}^{N} f_i + P + W$ (inverting static CMOS)
$D = D_F + P + W = 2 \sum_{i=1}^{N/2} \sqrt{f_{2i} f_{2i-1}} + P + W$ (others, 2 gates/stage)
• How much performance does this cost?
  – Very little unless stage efforts are nonuniform
  – Overestimates delay of Sklansky / LF networks
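The cost of the unit drive simplification can be checked numerically. The sketch below assumes the standard logical effort result that N stages with path effort F have a minimum effort delay of N F^{1/N}, and that the path effort equals the product of the stage efforts when wire capacitance is ignored. For families with two gates per stage it assumes each consecutive pair of gates is sized for equal effort, which contributes 2 sqrt(f_a f_b) per pair. The function names and the stage efforts used in the note below are made up for illustration.

```python
import math


def ideal_delay(stage_efforts, parasitic):
    """Best-case delay: every stage bears the same effort F**(1/N)."""
    n = len(stage_efforts)
    path_effort = math.prod(stage_efforts)
    return n * path_effort ** (1.0 / n) + parasitic


def unit_drive_delay(stage_efforts, parasitic, wire_cap=0.0):
    """Unit drive, one gate per stage (inverting static CMOS)."""
    return sum(stage_efforts) + parasitic + wire_cap


def unit_drive_delay_paired(stage_efforts, parasitic, wire_cap=0.0):
    """Unit drive with two gates per stage.

    Expects an even number of efforts; gates (1, 2), (3, 4), ... are paired.
    """
    pairs = zip(stage_efforts[0::2], stage_efforts[1::2])
    return 2 * sum(math.sqrt(a * b) for a, b in pairs) + parasitic + wire_cap
```

With uniform efforts such as [4, 4, 4, 4] the ideal and unit drive estimates coincide. With the nonuniform efforts [2, 8, 2, 8] the unit drive sum is 20 against an ideal effort delay of 16, before parasitics, which illustrates why the simplification costs little only when stage efforts are close to uniform.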
Method
• Create MATLAB models of adders
  – For each stage, list f, p, t
    • Depends on architecture, valency, circuit family
  – Use MATLAB to calculate total delays
• Compare ideal and unit drive delays for w = 0
  – Verifies unit drive simplification
• Plot D vs. # of bits
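The modeling flow above can be sketched in a few lines. The slides use MATLAB; Python is used here only for illustration, and this is not the author's model. Each stage is an (f, p, t) tuple, the unit drive delay is the sum of stage efforts, parasitics, and wire capacitance, and the number of prefix levels is taken as the smallest L with valency^L >= number of bits. Only the prefix levels are covered; pre- and postcomputation stages would be added as further tuples. The stage model toy_stage and its numbers are hypothetical placeholders, not data from the slides.

```python
def prefix_levels(n_bits, valency):
    """Smallest number of levels such that valency**levels >= n_bits."""
    levels, span = 0, 1
    while span < n_bits:
        span *= valency
        levels += 1
    return levels


def path_delay(stages, cap_per_track=0.0):
    """Unit drive delay for a list of (f, p, t) stage tuples."""
    return sum(f + p + cap_per_track * t for f, p, t in stages)


def sweep(bit_widths, valency, stage_model, cap_per_track=0.0):
    """Return [(n_bits, delay)], using stage_model(level) -> (f, p, t)."""
    results = []
    for n in bit_widths:
        levels = prefix_levels(n, valency)
        stages = [stage_model(k) for k in range(levels)]
        results.append((n, path_delay(stages, cap_per_track)))
    return results


def toy_stage(level):
    # Hypothetical placeholder values: constant effort and parasitic delay,
    # wire tracks doubling with each level.
    return (2.0, 1.0, 2 ** level)


curve = sweep([4, 8, 16, 32], valency=2, stage_model=toy_stage)
```

The resulting list of (bits, delay) pairs is what would be plotted as D versus number of bits. A real model would replace toy_stage with per-architecture, per-family values such as those in the delay parameter tables.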
Results
Figure: plots of delay (FO4) versus number of bits for each architecture (Brent-Kung, Ladner-Fischer, Sklansky, Kogge-Stone, Han-Carlson) and each circuit family (Inverting CMOS, Noninverting CMOS, Footed Domino, Footless Domino), with curves for valency 2, valency 3, and valency 4.
Conclusions
• For fast prefix networks, the logical effort model predicts that valency barely affects delay
  – Valency 2 designs are simpler
  – But most commercial designs use valency 4
• Weaknesses of logical effort model
  – Overpredicts g for higher valency
  – Underpredicts p for higher valency
  – Calibrate model through simulation
  – Or simulate entire designs