Example: Traveling Salesman Problem (TSP)
High-Performance Global Routing
with Fast Overflow Reduction
Huang-Yu Chen, Chin-Hsiung Hsu, and Yao-Wen Chang
National Taiwan University
Taiwan
Outline
Introduction
Preliminary
Routing flow of NTUgr
Multiple forbidden-regions expansion
Critical nets rerouting selection
Look-ahead historical cost increment
Experimental results
Conclusions
Global Routing Problem
Global routing is the first stage to tackle modern VLSI
routing challenges
Connect pins of each net in the global routing graph:
A global tile node represents a tile (global cell)
A global edge models the relationship between adjacent tiles
Overflow of a global edge: the amount of routing demand
that exceeds the given capacity
[Figure: global routing graph, with labels for tile, tile boundary, global tile node, and global edge]
Objectives of Global Routing
Major objectives:
minimize the total overflow
minimize the maximum overflow
Minor objectives:
minimize the total wirelength
minimize running time
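The overflow definition and the two major objectives can be made concrete with a small sketch. It is only an illustration: the edge representation (pairs of tile coordinates) and the names `edge_overflow` and `overflow_metrics` are assumptions, not part of NTUgr.

```python
from typing import Dict, Tuple

Tile = Tuple[int, int]
Edge = Tuple[Tile, Tile]  # hypothetical representation: a pair of adjacent tiles


def edge_overflow(demand: int, capacity: int) -> int:
    """Overflow of one global edge: demand in excess of capacity, never negative."""
    return max(0, demand - capacity)


def overflow_metrics(demand: Dict[Edge, int], capacity: Dict[Edge, int]) -> Tuple[int, int]:
    """Return (total overflow, maximum overflow) over all global edges."""
    overflows = [edge_overflow(demand.get(e, 0), cap) for e, cap in capacity.items()]
    return sum(overflows), max(overflows, default=0)


# Hypothetical row of three tiles connected by two global edges.
capacity = {((0, 0), (1, 0)): 2, ((1, 0), (2, 0)): 2}
demand = {((0, 0), (1, 0)): 5, ((1, 0), (2, 0)): 1}
total, worst = overflow_metrics(demand, capacity)
```

Here the first edge carries three units more than its capacity and the second edge has spare capacity, so both the total and the maximum overflow come from the first edge.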
State-of-the-art Global Routers
Archer [ICCAD’07]
BoxRouter [ICCAD’07]
FastRoute [ICCAD’06, ASPDAC’07, TCAD’08]
FGR [ICCAD’07]
NTHU-Route [ASPDAC’08, ICCAD’08]
These routers adopt INR (iterative negotiation-based rip-up/rerouting) to reduce overflows effectively
INR (Iterative Negotiation-based Rip-up/rerouting)
Proposed in PathFinder [McMurchie and Ebeling, FPGA’95]
Spreads the congested wires iteratively
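At the i-th iteration, the cost of a global edge e:
$$\mathrm{cost}(e) = (b_e + h_e^{(i)}) \times p_e$$
$b_e$: base cost of using e
$p_e$: # of nets passing e
$h_e^{(i)}$: historical cost on e,
$$h_e^{(i)} = \begin{cases} h_e^{(i-1)} + 1 & \text{if } e \text{ has overflow} \\ h_e^{(i-1)} & \text{otherwise} \end{cases}$$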
INR may get stuck as the number of iterations increases
[Ozdal, ICCAD’07] [Gao et al., ASPDAC’08]
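A minimal sketch of the negotiation step, assuming the edge cost has the PathFinder form (base cost plus historical cost, multiplied by the number of nets on the edge) and that the historical cost grows by one per iteration on overflowed edges. The function names are hypothetical.

```python
def inr_edge_cost(base_cost: float, history: float, num_nets: int) -> float:
    """PathFinder-style cost: (base cost + historical cost) * number of nets on the edge."""
    return (base_cost + history) * num_nets


def update_history(history: float, demand: int, capacity: int) -> float:
    """Raise the historical cost by 1 only when the edge currently overflows."""
    return history + 1 if demand > capacity else history


# An edge that stays overflowed for three iterations becomes steadily more expensive.
history = 0.0
costs = []
for _ in range(3):
    history = update_history(history, demand=5, capacity=4)
    costs.append(inr_edge_cost(base_cost=1.0, history=history, num_nets=5))
```

Because the historical cost never decreases, persistently congested edges keep repelling nets, which is what spreads the wires over the iterations.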
Contributions
NTUgr: a high-quality global router
2nd place in the ISPD 2008 Global Routing Contest
[Figure: routing flow: 3D benchmark → 3D-to-2D capacity mapping → enhanced 2D routing (prerouting, initial routing, iterative forbidden-region rip-up/rerouting (IFR)) → 2D-to-3D layer assignment → 3D routing result]
The Routing Flow
1. 3D benchmark
2. 3D → 2D capacity mapping
3. Enhanced 2D routing
   Prerouting
   Initial routing
   Iterative forbidden-region rip-up/rerouting (IFR)
4. 2D → 3D layer assignment
5. 3D routing result
Prerouting
1. Congestion-hotspot historical cost pre-increment
Identify the high-pin-density tiles (# of pins exceeds the total tile capacity)
Increase the historical cost of the edges around these tiles by 10
This discourages other nets from passing through these congested tiles
2. Small bounding-box area routing
Route the less flexible nets, i.e., those with smaller bounding-box area
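The two prerouting steps can be sketched as follows. This is an illustration under assumptions: "around these tiles" is read as "every edge incident to a hotspot tile", nets are ordered by the tile area of their pin bounding box, and all function names and data layouts are hypothetical.

```python
from typing import Dict, List, Set, Tuple

Tile = Tuple[int, int]
Edge = Tuple[Tile, Tile]

HOTSPOT_INCREMENT = 10  # increment stated on the slide


def preincrement_hotspots(pins_per_tile: Dict[Tile, int],
                          tile_capacity: Dict[Tile, int],
                          history: Dict[Edge, float]) -> Set[Tile]:
    """Add HOTSPOT_INCREMENT to the history of every edge touching a hotspot tile."""
    hotspots = {t for t, n in pins_per_tile.items() if n > tile_capacity[t]}
    for edge in history:
        if edge[0] in hotspots or edge[1] in hotspots:
            history[edge] += HOTSPOT_INCREMENT
    return hotspots


def bounding_box_area(pins: List[Tile]) -> int:
    """Area, in tiles, of the smallest rectangle covering all pins of a net."""
    xs = [x for x, _ in pins]
    ys = [y for _, y in pins]
    return (max(xs) - min(xs) + 1) * (max(ys) - min(ys) + 1)


def nets_by_flexibility(nets: Dict[str, List[Tile]]) -> List[str]:
    """Order nets from least flexible (smallest bounding box) to most flexible."""
    return sorted(nets, key=lambda name: bounding_box_area(nets[name]))


# Hypothetical data: tile (0, 0) holds more pins than its capacity.
history = {((0, 0), (1, 0)): 0.0, ((1, 0), (2, 0)): 0.0}
hotspots = preincrement_hotspots({(0, 0): 7, (1, 0): 1}, {(0, 0): 4, (1, 0): 4}, history)
order = nets_by_flexibility({"a": [(0, 0), (3, 3)], "b": [(1, 1), (1, 2)]})
```

Which nets count as "small" enough to be prerouted is not stated on the slide, so the sketch only produces the ordering and leaves the cut-off open.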
[Figure: prerouting of newblue3 (49.22% routed nets, 74374 overflows)]
Initial Routing
The first stage that routes all nets on the whole chip
Apply iterative monotonic routing until the overflow improvement over the previous iteration falls below 5%
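A minimal sketch of monotonic routing for a single two-pin connection, together with the 5% stopping rule. It assumes a monotonic route is one in which every step moves one tile closer to the target in x or y, and finds the cheapest such route by dynamic programming. The names `monotonic_route`, `should_stop`, and the `edge_cost` callback are hypothetical, and decomposing multi-pin nets is outside the sketch.

```python
from typing import Callable, Dict, List, Optional, Tuple

Tile = Tuple[int, int]


def monotonic_route(src: Tile, dst: Tile,
                    edge_cost: Callable[[Tile, Tile], float]) -> List[Tile]:
    """Cheapest path from src to dst in which every step moves toward dst."""
    (sx, sy), (tx, ty) = src, dst
    dx = 1 if tx >= sx else -1
    dy = 1 if ty >= sy else -1
    # best[tile] = (cost of cheapest monotonic path from src, predecessor tile)
    best: Dict[Tile, Tuple[float, Optional[Tile]]] = {src: (0.0, None)}
    for x in range(sx, tx + dx, dx):
        for y in range(sy, ty + dy, dy):
            if (x, y) == src:
                continue
            candidates = []
            if x != sx:  # arrive with a horizontal step
                prev = (x - dx, y)
                candidates.append((best[prev][0] + edge_cost(prev, (x, y)), prev))
            if y != sy:  # arrive with a vertical step
                prev = (x, y - dy)
                candidates.append((best[prev][0] + edge_cost(prev, (x, y)), prev))
            best[(x, y)] = min(candidates)
    path = [dst]
    while path[-1] != src:
        path.append(best[path[-1]][1])
    return path[::-1]


def should_stop(prev_overflow: int, cur_overflow: int, min_improvement: float = 0.05) -> bool:
    """Stop when the relative overflow improvement drops below min_improvement."""
    if prev_overflow == 0:
        return True
    return (prev_overflow - cur_overflow) / prev_overflow < min_improvement


def demo_cost(a: Tile, b: Tile) -> float:
    """Hypothetical costs: the edge between (0, 0) and (1, 0) is congested."""
    return 10.0 if {a, b} == {(0, 0), (1, 0)} else 1.0


path = monotonic_route((0, 0), (1, 1), demo_cost)
```

In the example the route goes up first and then right, avoiding the expensive edge without leaving the bounding box of the two pins.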
[Figure: initial routing of newblue3 (100% routed nets, 306082 overflows)]
Iterative Forbidden-region Rip-up/rerouting (IFR)
An enhanced flow over the traditional INR
Iterate until no overflow remains or the time limit is reached
IFR loop:
1. Multiple forbidden-regions expansion
2. Critical nets rerouting selection
3. Look-ahead historical cost increment
4. No overflow or timeout? N: go back to step 1; Y: stop
Multiple Forbidden-Regions Construction
At each iteration of IFR, new forbidden regions are
constructed from the most congested regions
Each region initially contains the two tiles adjacent to the most congested edge
Expand the region until the average congestion of each boundary is smaller than a threshold (regions may overlap)
Apply a special cost metric for nets in forbidden regions
A large penalty makes introducing new overflows within these regions almost forbidden
[Figure: forbidden-region routing of adaptec5]
Cost Considering Forbidden Regions
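The cost function of a global edge e:
$$\mathrm{cost}(e) = \begin{cases} P_n \times (d_e - c_e) & \text{if } e \in \text{forbidden regions and is congested} \\ b_e \times (1 + h_e^{(i)}) & \text{otherwise} \end{cases}$$
$P_n$: forbidden region penalty (= 1000 in NTUgr)
$d_e$: routing demand of e
$c_e$: routing capacity of e
$h_e^{(i)}$: historical cost of e
$b_e$: penalized base cost of e, defined piecewise:
$b_e = 1 / (c_e - d_e)$ if $d_e < c_e$
(value for $d_e = c_e$ not recoverable from the transcript)
(expression in $d_e$ and $c_e$ for $d_e > c_e$, with a parameter set to 3 in NTUgr; not fully recoverable from the transcript)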
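The two-case cost can be sketched as below. Several points are assumptions rather than facts from the slide: the otherwise-case is read as the base cost times one plus the historical cost, "congested" is read as demand exceeding capacity (with the demand counted after adding the candidate net), and the penalized base cost is passed in as a plain argument because its full definition is not recoverable here. The function name is hypothetical.

```python
FORBIDDEN_PENALTY = 1000  # P_n, the value given on the slide for NTUgr


def forbidden_region_edge_cost(demand: int, capacity: int, base_cost: float,
                               history: float, in_forbidden_region: bool,
                               penalty: float = FORBIDDEN_PENALTY) -> float:
    """Edge cost with a large penalty for overflow inside forbidden regions."""
    if in_forbidden_region and demand > capacity:
        # Assumed reading of "congested": each unit of overflow costs `penalty`.
        return penalty * (demand - capacity)
    # Assumed reading of the otherwise-case: base cost scaled by (1 + history).
    return base_cost * (1 + history)


inside = forbidden_region_edge_cost(5, 3, base_cost=1.0, history=2.0, in_forbidden_region=True)
outside = forbidden_region_edge_cost(5, 3, base_cost=1.0, history=2.0, in_forbidden_region=False)
```

With these inputs the same overflowed edge costs orders of magnitude more inside a forbidden region than outside, which is how new overflow there becomes "almost forbidden" without being strictly disallowed.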
Region Propagation Leveling
Applied when the # of overflows stops decreasing (the router is stuck at a local optimum)
Stop creating new forbidden regions
Expand all forbidden regions from the previous iteration simultaneously
[Figure: forbidden-region routing of bigblue3 at the (i)-th, (i+1)-th, (i+2)-th, and final iterations]
Final Expansion of Forbidden Regions
Applied when # of overflows < 0.5% of initial overflow
Expand the forbidden region to the whole routing graph to
quickly reduce the remaining overflows
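The two triggers (region propagation leveling when overflow stops decreasing, final expansion when overflow falls below 0.5% of the initial overflow) can be sketched as one decision function. Checking the final-expansion condition first is an assumption, as are the function name and the mode labels.

```python
def choose_expansion_mode(initial_overflow: int, previous_overflow: int,
                          current_overflow: int, final_fraction: float = 0.005) -> str:
    """Pick how forbidden regions are handled in the next IFR iteration."""
    if current_overflow < final_fraction * initial_overflow:
        # Few overflows remain: expand the forbidden region to the whole graph.
        return "final_expansion"
    if current_overflow >= previous_overflow:
        # No progress: stop creating regions and expand the existing ones together.
        return "propagation_leveling"
    # Overflow is still decreasing: keep constructing new forbidden regions.
    return "new_regions"


modes = [
    choose_expansion_mode(1000, 500, 400),
    choose_expansion_mode(1000, 400, 400),
    choose_expansion_mode(1000, 10, 4),
]
```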
[Figure: overflow reduction of adaptec5, comparing IFR w/ final expansion, IFR w/o final expansion, and traditional INR]
Comparisons of Congested Regions
                        BoxRouter                   NTHU-Route 1.0            NTUgr (Ours)
Terminology             Box                         Congested region          Forbidden region
Shape                   Rectangular                 Rectangular               Rectilinear
# of regions            Single box                  Single region             Multiple regions
Objective               Performing progressive ILP  Selecting rerouting nets  Performing different cost functions
Simultaneous expansion  No                          No                        Yes
Critical Nets Rerouting Selection
To speed up the rip-up/rerouting process
Only rip-up/reroute the critical nets in each iteration
The critical nets are those nets with overflows or small
remaining capacity:
$$\min_{e \in n} \{c_e - d_e\} \le 1$$
ce : routing capacity of e
de : routing demand of e
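A minimal sketch of the selection, assuming a net is critical when the smallest remaining capacity over its edges is at most a threshold of 1. The comparison operator is not legible in the transcript, so "at most" is an assumption and the threshold is kept as a parameter. Names and data layout are hypothetical.

```python
from typing import Dict, List


def is_critical(net_edges: List[str], demand: Dict[str, int],
                capacity: Dict[str, int], slack_threshold: int = 1) -> bool:
    """True if some edge of the net overflows or has little remaining capacity."""
    # A net with no edges is treated as non-critical.
    min_slack = min((capacity[e] - demand[e] for e in net_edges),
                    default=slack_threshold + 1)
    return min_slack <= slack_threshold


def select_critical_nets(nets: Dict[str, List[str]], demand: Dict[str, int],
                         capacity: Dict[str, int], slack_threshold: int = 1) -> List[str]:
    """Names of the nets to rip up and reroute in this iteration."""
    return [name for name, edges in nets.items()
            if is_critical(edges, demand, capacity, slack_threshold)]


# Hypothetical edges: e1 overflows, e2 has plenty of room, e3 has one track left.
capacity = {"e1": 4, "e2": 4, "e3": 4}
demand = {"e1": 5, "e2": 1, "e3": 3}
nets = {"a": ["e1", "e2"], "b": ["e2"], "c": ["e3"]}
critical = select_critical_nets(nets, demand, capacity)
```

Only nets "a" and "c" are selected, so net "b" keeps its route and costs no rerouting time in this iteration.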
Look-Ahead Historical Cost Increment
For near-overflow global edges (edges that would overflow if N more units of demand were added), increase the historical cost in advance:
$$h_e^{(i)} = \begin{cases} h_e^{(i-1)} + 1 & \text{if } d_e + N > c_e \ (e \text{ is near-overflow}) \\ h_e^{(i-1)} & \text{otherwise} \end{cases}$$
$N \ge 1$: the look-ahead historical-cost update scheme
Setting N = 1 in NTUgr yields better quality and about a 2x runtime speedup
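A minimal sketch of the look-ahead update, assuming the near-overflow test is "demand plus N exceeds capacity", so that N = 0 falls back to the traditional update that only penalizes edges already in overflow. The function name and dictionary layout are hypothetical.

```python
from typing import Dict


def update_history_lookahead(history: Dict[str, float], demand: Dict[str, int],
                             capacity: Dict[str, int], lookahead: int = 1) -> Dict[str, float]:
    """Return new historical costs, incremented on edges that would overflow
    if `lookahead` more units of demand were added."""
    return {
        e: h + 1 if demand[e] + lookahead > capacity[e] else h
        for e, h in history.items()
    }


# Hypothetical edges: e1 overflows, e2 is exactly full, e3 has spare capacity.
history = {"e1": 0.0, "e2": 0.0, "e3": 0.0}
demand = {"e1": 5, "e2": 4, "e3": 2}
capacity = {"e1": 4, "e2": 4, "e3": 4}
with_lookahead = update_history_lookahead(history, demand, capacity, lookahead=1)
traditional = update_history_lookahead(history, demand, capacity, lookahead=0)
```

With N = 1 the full edge e2 is penalized one iteration before it would overflow, which is the "in advance" effect described above.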
Results on ISPD’08 Benchmarks
Compared with the winners of ISPD’08 global routing contest
On average, runtime is comparable to NTHU-Route 2.0 (over the ten cases that all three routers route without overflow)
Total overflow is lower than that of FastRoute 3.0
Ovf: total overflow; Max: maximum overflow; WL: ISPD'08 wirelength (e5); CPU: runtime (min)

          | NTHU-Route 2.0         | FastRoute 3.0          | NTUgr (Ours)
Circuit   |   Ovf Max    WL    CPU |   Ovf Max    WL    CPU |   Ovf Max    WL    CPU
adaptec1  |     0   0  53.5    7.9 |     0   0  55.5    1.8 |     0   0  57.4    4.5
adaptec2  |     0   0  52.3    1.7 |     0   0  53.1    0.4 |     0   0  53.7    1.1
adaptec3  |     0   0 131.1    8.0 |     0   0 133.3    1.7 |     0   0 135.0    4.4
adaptec4  |     0   0 121.7    2.3 |     0   0 122.2    0.6 |     0   0 123.7    1.2
adaptec5  |     0   0 155.6   17.2 |     0   0 160.9    4.7 |     0   0 159.9   15.3
bigblue1  |     0   0  56.3    9.8 |     0   0  58.3    3.9 |     0   0  60.0   18.1
bigblue2  |     0   0  90.6    9.9 |   142   4  98.2   11.2 |     0   0  91.2  248.3
bigblue3  |     0   0 130.8    4.3 |     0   0 131.7    3.1 |     0   0 133.5    4.0
bigblue4  |   182   2 230.8  125.6 |   206   2 243.4   41.3 |   188   8 242.8  413.1
newblue1  |     0   0  46.5    5.1 |    76   2  49.0   12.9 |     6   2  49.3  977.5
newblue2  |     0   0  75.7    1.1 |     0   0  76.3    0.3 |     0   0  76.9    0.6
newblue3  | 31454 204 106.5  129.1 | 31650 734 109.3   31.5 | 31024 408 188.3  884.0
newblue4  |   152   2 129.9   67.1 |   226   4 135.7    9.9 |   142   2 143.8 1118.1
newblue5  |     0   0 231.7   14.2 |     0   0 241.2    5.5 |     0   0 244.9   20.5
newblue6  |     0   0 177.0   13.6 |     0   0 186.6    4.0 |     0   0 186.6   21.3
newblue7  |    68   2 353.6  140.6 |   588   6 358.6  189.9 |   310   2 372.2 1445.5
Comp.     |     -   -  1.00   3.48 |     -   -  1.03   1.00 |     -   -  1.04   3.01

Slide callout: The best solution in the literature!
Effects of Look-Ahead Historical Cost Increment
Achieves a 1.94x speedup and better overflow reduction with similar total wirelength
Ovf: total overflow; Max: maximum overflow; WL: ISPD'08 wirelength (e5); CPU: runtime (min)

          | w/o look-ahead he increment | w/ look-ahead he increment
Circuit   |   Ovf Max    WL    CPU      |   Ovf Max    WL    CPU
adaptec1  |     0   0  55.8    5.4      |     0   0  57.4    4.5
adaptec2  |     0   0  53.2    1.4      |     0   0  53.7    1.1
adaptec3  |     0   0 133.4    5.4      |     0   0 135.0    4.4
adaptec4  |     0   0 122.6    2.3      |     0   0 123.7    1.2
adaptec5  |     0   0 158.5   22.0      |     0   0 159.9   15.3
bigblue1  |     0   0  59.6   16.4      |     0   0  60.0   18.1
bigblue2  |     4   2  97.6  176.6      |     0   0  91.2  248.3
bigblue3  |     0   0 135.4    5.2      |     0   0 133.5    4.0
bigblue4  |   224   4 242.4   93.8      |   188   8 242.8  413.1
newblue1  |    16   2  49.6  212.8      |     6   2  49.3  977.5
newblue2  |     0   0  77.5    0.9      |     0   0  76.9    0.6
newblue3  | 31404 426 132.2  108.6      | 31024 408 188.3  884.0
newblue4  |   152   2 142.6  730.8      |   142   2 143.8 1118.1
newblue5  |     0   0 243.5   27.2      |     0   0 244.9   20.5
newblue6  |     0   0 191.3  154.8      |     0   0 186.6   21.3
newblue7  |   424   2 371.9 1423.1      |   310   2 372.2 1445.5
Comp.     |     -   -  1.00   1.94      |     -   -  1.00   1.00
Conclusions
NTUgr: a high-quality global router for overflow reduction
1. Prerouting
Congestion-hotspot historical cost pre-increment
Small bounding-box area routing
2. Initial iterative monotonic routing
3. Iterative forbidden-region rip-up/rerouting (IFR)
Multiple forbidden-regions expansion
Look-ahead historical cost increment
Critical nets rerouting selection
Achieves good results in terms of both overflow and runtime on the new ISPD’08 benchmarks
Conclusions and Future Work
A dummy fill algorithm considering both gradient minimization and coupling constraints
Achieve more balanced metal density distribution with fewer dummy features and an acceptable timing overhead
Future work: integration of gradient minimization and coupling constraints
Simultaneously minimize the gradient and the coupling capacitance
Thank You!
Huang-Yu Chen