ROAD : An Order-Impervious Optimal Detailed Router Hasan Arslan, Shantanu Dutt

Download Report

Transcript ROAD : An Order-Impervious Optimal Detailed Router Hasan Arslan, Shantanu Dutt

ROAD : An Order-Impervious Optimal
Detailed Router
Hasan Arslan, Shantanu Dutt
Electrical & Computer Eng.
University of Illinois at Chicago
ICCD 2003
1
OUTLINE
• Introduction
• Standard Single-Net Routing Mode
• Ripup-Reroute Detailed Routing Alg.
• Bump &Refit (B&R) Concepts
• B&R Paradigm
• Applying B&R to Complete Detailed Routing
• Optimality-Preserving Speedup Methods
• Lookahead TC functions
• Learning-Based Search Space Pruning
• Clique-Based Search Space Pruning
• Experimental Results
• Conclusion
2
INTRODUCTION
• Efficient Routing:
– Reducing total wiring area
– Lengths of critical path nets for performance
opt.
• Detailed Routing:
– Net ordering problem (Std. Single-Net Routing )
– Ripup-and-Reroute (R&R)
• New approach (Bump & Refit strategy)
3
Prior work on Detailed Routing
(Cont.)
2
1
1
S B
C
T
E
• Standard single-net routing
mode
2
1
0
E
– during detail routing
• Does not perturb or move existing
nets
D
A
2
1
0
01 2
}
Cells
01 2
}
01 2
Switchboxes
Tracks #s
Tracks
4
Prior work on Detailed Routing(Cont.)
(RIPUP-and-REROUTE)
1
1
B
C
E
E
1
0
D
A
1
0
01
01
01
1) Single Net Routing : Route new
nets without removing any
existing nets.
2) Rip & Reroute : If some nets
cannot be routed, rip-up the
existing nets which occupy the
resources of new nets. Reroute
the ripped up nets
– Changes net topology
– Net length can be increased
– Because of ripup-reroute solution
time can be increased
5
Complete Detailed Router
by using Bump-&-Refit strategy
• Basic B&R developed for incremental routing
(Dutt etal. ICCAD’99, ICCAD’01, TODAES’02)
• Incremental Routing
– Existing routed nets set R
– Some new nets set S (timing violation, noise…)
– Route the nets in S by doing min. changes on nets in R.
• Bump-&-Refit Approach
– Does not rip-up and reroute
– Shift them (or their subnets) to other track positions --Bump-&-Refit (B&R)
– No change in topology, length of existing nets
– Optimal: Finds a detailed route if exists
6
Experimental Results
(Dutt etal., TODAES 2002)
Comparisons of STD, R&R, B&R Inc. Routing Alg.
800
700
600
500
400
300
R&R=(85x)
200
100
STD
vg
A
i4
i
ex 5
m
p2
rd
73
pm
a
cs
e
sa
o2
m
m
4a
te
rm
1
s7
1
s8 3
38
.1
ex
1
s8
20
m
lt3
2
cl
ip
0
ss
e
Unrouted Nets over B&R
10% new net, 10% unused tracks
STD=(20x)
R&R
7
Complete Detailed Router (B&R)
• Each step in complete detailed routing is an incremental
routing problem
B&R
B&R
B&R
B&R
Complete_Detailed_Routing()
input: unrouted nets
N: number of nets
output: Routed nets.
R0=
for (i=1 to N)
Ri=Do_Incremental_Routing(ni,Ri-1)
B&R
8
B&R Concepts
Some definitions and representation (cont.)
n5
n4
n2
3
2
1
0
n3
The overlap graph (OG)
Nodes: existing nets
Edges: a channel is shared by two
nets
Tj  Tk, bumps to some nets
3
2
1
0
3
2
1
n6
T0
012 3
012 3
T3
T1
0
n6
T0
012 3
n2
T1
T1
T2
T2
T3
T3
n5
T3
n3
T0
T1
T1
T0
SP
T2
T0
T0
T2
T0
T3
n4
T2
9
B&R Concepts
(i) TC1sum (ni TjTk ) =  l(nj)  nj  adjTk (ni)total-length-of
bumped nets
E
 l(nj)  nj  adjTk (ni)
(ii) TC1sqrt (ni TjTk ) = ----------------------------sqrt(| adjTk (ni)|
l(nj) is the total length of nj in terms of the track segments
New_Net
1
1
C
adjT1(E)={B}
B
O_Net
0
D
1
0
01
01
01
adjT0(E)={C}
O_Net
B
T1
1
A
E
X
A
T0
C
T0
D
T1
10
B&R Concepts
1st Level TC Functions
n1
n3
n2
n4
n5
0
1
2
TC1=Total-length-of-bumped-nets
T1
n2
12
n3
n1
T2
n4
n5
6
11
Example: B&R for Detailed Routing
E
New_Net
1
1
C
C
O_Net
3
B
E
X0
T
B
T1
O_Net
C
T0
X
2
1
0
A
T0
D
T1
D
A
1
0
01
01
01
D_Sp
T0
12
Example: B&R for Detailed Routing(Cont.)
New_Net
E
T0
1
1
B
B
T1
C
E
O_Net
C
X1
T
1
0
A
T0
D
A
D
X01
T
1
0
D_Sp
T0
01
01
01
13
B&R Concepts
Navigating the OG
New-net
T2
• If there is a solution,
T2
1
T1
2
T1
T1
4
T3
5
T2
9
this process will find it.
• it optimizes the number of tracks
3
T3
6
T3
T3
8
10
11
12
Time Complexity:
•L= # of paths in OG
•(m=# of nodes, b:branching facto
• In worst case L=O((b-1) m)
• If OG is tree L is linear
Sp
Sp
Sp
Sp
14
Optimality-Preserving Speedup Methods
• Regular incremental routing: B&R applied
to 1-10% of the nets---speed is good
• Complete detailed routing: B&R applied to
100% of nets--speed drops significantly
• Developed three optimality-preserving
speedup methods to increase the speed of
B&R
15
Optimality-Preserving Speedup Methods (Cont.)
(1)
Lookahead TC Functions
n4
n5
T1
7 n2
T2
n6
3
T0
Sp
0
n1
n1
n3
n2
n6
0
1
2
T2
3 n
3
T0
T1
n4
3
(i) TC1=Total-length-of-bumped-nets
(i) TC2sum-sum (ni TjTk ) =   nj  adjTk (ni) min Tt (TC1sum (ni TkTt ))
n5
5
16
Optimality-Preserving Speedup Methods (Cont.)
(2)
A
T1
B
T2
Y
T3
C
T0
D
T3
Clique-Based Search Space Pruning
C
A
X
B
Y
D
T0
T1
T2
T3
C
Y A
X
B
D
T0
T1
T2
T3
• Dynamically determines the presence of cliques in the OG among
the longer nets
• CLIQUE: is a completely connected subgraph of OG
• Min. # of distinct tracks for succ. Routing (m=4)
• For each clique, maintain the number of common unused (CUT)
track. (k)
• After a net in clique is bumped, If (k+m) > t , there is no solution to
bump that net.
• (1+4) > 4
17
Optimality-Preserving Speedup Methods (Cont.)
(3) Learning-Based Search Space Pruning
D
A
T0
T1
T2
T3
C
K
B
1
A
C
D
D
A
C
K
A
B
T0
T1
X T2
Z T3
2
C
D
X
Z
K
K
B
B
P
•
Q
Theorem: if no soln. for bumping net B, and obstacle pattern OP1 is obtained
– If in another search path, B is bumped again and OP1  AP2
18
– Then no solution exists
Optimality-Preserving Speedup Methods (Cont.)
(3) Learning-Based Search Space Pruning
• Isomorphic Function:
– f: is a one-to-one and onto functions between tracks that maps
Ti  Tk where Tk=Tf(i)
D
A
C
K
B
T0
T1
T2
T3
C
K
A
B
D
T
X T0
Z T1
2
T3
Pattern Isomorphism: Let P1 and P2 be obstacle patterns,
If all nets on each track of P1 appears on a unique, possibly
different, track of P2, then P1 is isomorphic to P2.
19
Optimality-Preserving Speedup Methods (Cont.)
(3) Learning-Based Search Space Pruning
D
A
C
K
B
T0
T1
T2
T3
1
A
C
D
C
K
A
B
D
A
T
X T0
Z T1
2
T3
2
C
D
X
Z
K
K
B
B
P
•
Q
Theorem: if no soln. for bumping net B, and obstacle pattern OP1 is obtained
– If in another search path, B is bumped again
• OP2  AP2 & OP1 isomorphic to OP2
– Then no solution exists
20
Experimental Results
(Characteristics of VPR Benchmark Circuits)
Circuit
Nam e
C499
Opt. Circuit
Nam e
FPGA Size # net # pins track
10x10
147 443
7
vda
Opt.
FPGA Size # net # pins track
18x18
399 1511
10
m m 9a
13x13
205
787
6
frg2
36x36
558
2177
6
alu2
15x15
236
1001
7
apex6
30x30
575
2199
6
s1
14x14
248
990
7
ex4p
22x22
586
2337
6
s1423
15x15
272
1133
6
m m 30a
23x23
651
2634
6
t481
15x15
280
1087
8
m isex3c
24x24
663
2773
9
sand
16x16
285
1236
7
ex5p
33x33
1072
5391
15
m m 9b
15x15
292
1107
6
tseng
33x33
1248
5409
7
planet
17x17
307
1357
6
m isex3
38x38
1658
7013
12
planet1
17x17
308
1357
7
alu4
40x40
1748
7632
12
x4
21x21
310
1136
6
diffeq
39x39
1786
7588
8
s1196
17x17
325
1354
7
des
63x63
1847
8456
8
i6
26x26
328
1115
4
apex2
44x44
2284
9431
13
duke2
16x16
328
1306
8
elliptic
61x61
4175 15565
11
s1488
18x18
340
1508
6
spla
61x61
4286 18512
14 21
Experimental Results
(Speedup Method Results)
• ROAD-1: Basic B&R with 1st TC Function
• ROAD0: 1st TC Function  2nd TC Function
• ROAD: LBS + Clique-Based pruning mtd.
Added to ROAD0
ROAD (bumped & Refit based OptimAl Detailed
router)
22
Experimental Results
(Internal Comparisons)
Circuit
Name
planet1
mm9a
x4
alu2
i6
s1
s1488
s1423
C499
sand
t481
planet
mm9b
planet1
s1196
x4
duke2
i6
vda
s1488
Time in secs
ROAD-1
ROAD-0
1.06
3299.22
18.06
4015.23
4.60
3163.87
12.21
537.34
131.76
2010.56
1.06
18.06
4.60
12.21
ROAD
2.24
130.03
3.41
1.44
4.56
1.40
2.97
4.65
51.52
4.90
37.80
2.29
10.05
2.24
87.70
3.41
238.20
4.56
1670.47
2.97
2.37
1.27
2.72
1.41
3.93
1.14
2.31
2.56
0.54
2.08
1.80
2.05
2.45
2.37
3.10
2.72
2.11
3.93
4.99
2.31
total(10 Ckt)
13193.91
157.89
Speed up over ROAD-1
83.56
total ( 16) ckt
2253.63
Speed up over ROAD-0
21.84
604.12
36.83
61.19
Extrapolated Speed up of ROAD over ROAD-1
83 x 61 = 5763
23
Experimental Results
(Extracting VPR Global Routing Topology)
VPR Placement
2
1
1
2
1
Flat-routing
2
1
ROAD SEGA
0
0
012
01 2
VPR Router
01 2
Comparisons of
ROAD, VPR and SEGA
24
Experimental Results
(Comparing with VPR-SEGA combine)
120%
100%
80%
60%
40%
20%
m
pl 9b
an
et
1
s1
19
du 6
ke
2
vd
a
ap
e
m x6
m
30
a
ex
5p
m
ise
x
di
ffe
q
ap
ex
2
To
ta
l
81
m
t4
s1
m
9a
0%
m
Improvement over SEGA
Comparisions of SEGA, ROAD track numbers
25
Experimental Results
(Comparing with VPR-SEGA combine)
Speedup factor over SEGA
Comparisions of VPR, ROAD, SEGA runtimes
(SEGA runtime taken as baseline)
25
20
15
VPR
ROAD
10
ROAD=(7x)
5
0
To
ta
l
de
s
pl
an
et
s1
19
6
s1
48
8
ap
ex
6
m
ise
x3
c
m
ise
x
t4
81
al
u2
VPR=(4x)
26
Approx. VPR detailed-runtime=(VPR flat-runtime) - (VPR global-runtime)
Empirical Avg. Case Time Complexity
of ROAD
800.00
700.00
y = 3E-05x2 + 0.0347x - 11.105
600.00
500.00
Time
y = 0.1409x - 56.939
400.00
300.00
200.00
y = 2E-11x 4 - 1E-07x3 + 0.0003x2 - 0.1827x + 32.085
100.00
0.00
0
500
1000
1500
2000
2500
3000
3500
4000
4500
5000
-100.00
number of nets
ROAD
Linear (ROAD)
Poly. (ROAD)
Poly. (ROAD)
27
CONCLUSION
• ROAD: uses the bump-and-refit (B&R)
– Overcomes net-ordering problem
– Optimality-preserving search space pruning methods that
give us orders of magnitude speedup
• Large circuits: VPR’s flat routing  time consuming
• Hence need two-stage routing (global followed by
detailed routing) possibly interleaved across the
nets
• ROAD  prime candidate for detailed routing
phase in this two-stage framework
28
THANK YOU
29