Performance And RLC Crosstalk Driven Global Routing


Performance and RLC Crosstalk Driven
Global Routing
Ling Zhang, Tong Jing,
Xianlong Hong, Jingyu Xu
Jinjun Xiong, Lei He
Dept. of CST, Tsinghua Univ
Dept. of EE, UC, Los Angeles
Speaker: Xianlong Hong
Outline
Introduction & Previous Work
Problem Formulations
Our Algorithm: PO-GR
Experimental Results & Discussions
Conclusions

2015/7/21
ISCAS 2004, Vancouver, Canada
Introduction
Device sizes are shrinking and clock frequencies are increasing.
Coupling capacitance and inductance can no longer be ignored.
Coupling effects cause longer delay and crosstalk.
Global routing with performance optimization becomes more important.

Previous Work(1)
Noise modeling

Sakurai model
(T. Sakurai, C. Kobayashi, M. Node, 1993)

LSK model for calculating coupling inductance
(L. He, K. M. Lepak, 2000)

Model for calculating noise voltage
(K. M. Lepak, I. Luwandi, L. He, 2001)
Previous Work(2)

Noise minimization

Spacing in detailed routing phase
(K. Chaudhary, A. Onozawa et al, 1993)

Track permutation in detailed routing phase
(T. Gao, C. L. Liu, 1996)

Wire perturbation in detailed routing phase
(P. Saxena, C. L. Liu, 1999)

Crosstalk reduction after global routing phase
(T. X. Xue, E. S. Kuh, D. F. Wang, 1997)
(J. J. Xiong, J. Chen, J. Ma, L. He, 2002)

Coupling capacitance crosstalk reduction in global routing phase
(J. Y. Xu, T. Jing, X. L. Hong, L. Zhang, 2004, ASP-DAC)
Major Contributions
An efficient crosstalk elimination algorithm based on Tabu search and shielding technology is proposed.
Timing performance and routability are considered simultaneously at the global routing level.
By using the LSK model, coupling inductance is taken into consideration.

Problem Formulations(1): Global Routing Problem
Fig.1 Global Routing Graph (GRG). Figure labels: Cells, GRC1, GRCi, v1, v2, e, GRG.
Problem Formulations(2): LSK Model
Accurate calculation:
$$k_{ij} = \frac{L_{i,j}}{\sqrt{L_i \cdot L_j}}$$
Simplified calculation in LSK model:
$$k_{ij} = \frac{f(i) + g(j)}{2}$$
K_it for the segment of net i in region t:
$$K_{it} = \sum_{j \neq i} k_{it,jt}$$
(for all j sensitive to i)
LSK, the total K value for net i:
$$LSK = \sum_{t} l_t \cdot K_{it}$$
(for all t occupied by net i)
Fig.2 LSK Model. Figure labels: K axis (tick at 1) against wire order; f(i), g(j), kij; Ni, Nj; gl, gr.
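The two sums above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the pairwise coefficients k are taken as already computed for each region, l_t is assumed to be the length of net i's segment in region t, and the names `region_k`, `lsk` and all sample numbers are hypothetical rather than taken from the slides.

```python
from typing import Dict, List, Set, Tuple

# One region t as seen by the net: (l_t, nets present in t, pairwise k values in t).
Region = Tuple[float, List[str], Dict[Tuple[str, str], float]]

def region_k(net: str, nets_in_region: List[str],
             sensitive: Dict[str, Set[str]],
             k: Dict[Tuple[str, str], float]) -> float:
    """K_it: sum of k_{it,jt} over nets j in region t that are sensitive to net i."""
    return sum(k[(net, j)] for j in nets_in_region
               if j != net and j in sensitive.get(net, set()))

def lsk(net: str, regions: List[Region], sensitive: Dict[str, Set[str]]) -> float:
    """LSK: sum of l_t * K_it over all regions t occupied by the net."""
    total = 0.0
    for length, nets_in_region, k in regions:
        if net in nets_in_region:
            total += length * region_k(net, nets_in_region, sensitive, k)
    return total

# Hypothetical data: net "a" crosses two regions; only "b" and "c" are sensitive to it.
sensitive = {"a": {"b", "c"}}
regions = [
    (100.0, ["a", "b", "d"], {("a", "b"): 0.5, ("a", "d"): 0.9}),
    (50.0, ["a", "c"], {("a", "c"): 0.2}),
]
total = lsk("a", regions, sensitive)  # 100 * 0.5 + 50 * 0.2
```

Net "d" shares a region with "a" but is not sensitive to it, so its coefficient does not enter K_it.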
Problem Formulations(3): Tabu Search
Outline:
Step1. Select an initial solution xnow, and set Tabu list H=empty;
Step2. While the stop conditions are not met do
    Generate a candidate list Can_N(xnow) from the neighborhood
    N(xnow,H) of xnow that doesn't conflict with H;
    Select the best solution xnext from Can_N(xnow);
    xnow=xnext;
    Update Tabu list H;
End While
Key factors:
neighborhood: how to search efficiently
Tabu object & Tabu length: how to choose properly
aspiration rule: how to set it reasonably
Our Algorithm: PO-GR (1)

Part 1: Timing performance and routability
Part 2: Crosstalk estimation and elimination (CEE)

Part 1 first generates an initial routing solution
considering congestion and timing optimization.

Then, Part 2 eliminates the crosstalk from the
solution by inserting shields and produces an intermediate result.

Finally, the intermediate result is sent back to Part 1
as input for further iterations.
Our Algorithm: PO-GR (2)
Pseudo code of PO-GR:
1. Call Part 1 to generate a minimum wire length initial solution X0 without congestion and timing violation;
2. Call Part 2 to obtain X1 = CEE(X0);
3. If no edge overflow in X1 then go to 4.;
   Else go back to 1. to generate a new solution;
4. Call Part 1 again to obtain congestion and timing optimized solution X2 from X1;
Fig.3 flow chart of PO-GR. Figure labels: Part 1; Part 2 (CEE); Subtract tracks used by shields; Iterations.
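The control flow of the four steps can be sketched as a small driver loop. This is a hedged illustration only: Part 1 and CEE are passed in as callables, and the names `po_gr`, `part1_initial`, `part1_optimize`, `has_overflow` and the `max_rounds` cap are hypothetical (the slides do not say how many times the flow may return to step 1).

```python
from typing import Callable, TypeVar

S = TypeVar("S")  # type of a routing solution, left abstract in this sketch

def po_gr(part1_initial: Callable[[], S],
          cee: Callable[[S], S],
          has_overflow: Callable[[S], bool],
          part1_optimize: Callable[[S], S],
          max_rounds: int = 10) -> S:
    """Run steps 1 to 4; max_rounds is an assumed safety cap, not from the slides."""
    for _ in range(max_rounds):
        x0 = part1_initial()           # step 1: initial solution from Part 1
        x1 = cee(x0)                   # step 2: shield insertion by Part 2
        if not has_overflow(x1):       # step 3: accept only if no edge overflows
            return part1_optimize(x1)  # step 4: Part 1 again on X1
    raise RuntimeError("no overflow-free solution within max_rounds")

# Toy stand-ins: solutions are plain integers, and the first attempt "overflows".
calls = {"n": 0}

def initial() -> int:
    calls["n"] += 1
    return calls["n"]

result = po_gr(initial,
               cee=lambda x: x + 100,
               has_overflow=lambda x: x == 101,
               part1_optimize=lambda x: x * 2)
```

In the toy run the first round is rejected at step 3, so Part 1 is asked for a new initial solution before step 4 is reached.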
Part 2: CEE
Get LSK bound
Step 1: Crosstalk bound budgeting
    Partition the LSK bound at each sink of a net into the GRG edges belonging to the source-sink paths.
Step 2: Eliminate crosstalk in each region
    Insert shields with a specific method.
Step 3: Local refinement
    Check each net to eliminate possible remnant crosstalk and delete unnecessary shields to minimize total area.
Fig.4 flow chart of CEE
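Because LSK is a length-weighted sum of per-region K values, one simple way to carry out Step 1 is to give every GRG edge on a source-sink path the same K threshold, Kth = bound / path length, and to keep the tightest threshold when an edge serves several sinks. The sketch below assumes exactly that uniform rule; the slides do not state which partitioning rule CEE actually uses, and the function name, edge names and lengths are hypothetical.

```python
from typing import Dict, List, Tuple

Edge = Tuple[str, str]

def budget_k_thresholds(paths: Dict[str, List[Edge]],
                        edge_length: Dict[Edge, float],
                        lsk_bound: float) -> Dict[Edge, float]:
    """Assumed uniform budgeting: Kth = lsk_bound / path length, min over sinks."""
    kth: Dict[Edge, float] = {}
    for sink, path in paths.items():
        path_len = sum(edge_length[e] for e in path)
        k = lsk_bound / path_len
        for e in path:
            # An edge shared by several source-sink paths keeps the tightest value.
            kth[e] = min(k, kth.get(e, float("inf")))
    return kth

# Hypothetical net: source s, branch point u, sinks t1 and t2.
edge_length = {("s", "u"): 200.0, ("u", "t1"): 300.0, ("u", "t2"): 800.0}
paths = {
    "t1": [("s", "u"), ("u", "t1")],
    "t2": [("s", "u"), ("u", "t2")],
}
kth = budget_k_thresholds(paths, edge_length, lsk_bound=1000.0)

# Length-weighted sum of thresholds along each path, to compare against the bound.
path_lsk = {sink: sum(edge_length[e] * kth[e] for e in path)
            for sink, path in paths.items()}
```

With these thresholds, any net whose K value stays at or below Kth on every edge also stays within the LSK bound at each sink.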
Crosstalk Elimination Based on Tabu Search(1)
Step 2 of CEE (eliminate crosstalk in each region): Simulated Annealing, or Tabu search?
The runtime of Simulated Annealing can be very long, while Tabu search is much faster with similar performance.
Set the global solution in one GRG edge as initial solution xcur; xmin = xcur;
Set Tabu list H=empty; a=0; c=0;
While (a < Na)
    tmpcost = ∞;
    b = 0;
    While (b < Nb)
        xnew = xcur;
        randommove(xnew);
        If cost(xnew) is in H
            c++;
            If c < Nc, then continue;
            Else c = 0;
        If cost(xnew) < tmpcost, then
            xtmp = xnew;
            tmpcost = cost(xnew);
        b++;
    End While
    Insert xcur into H;
    xcur = xtmp;
    If cost(xcur) < cost(xmin), then xmin = xcur; a = 0;
    Else a++;
    Update H;
End While

xcur: current solution;
xnew: candidate in the neighborhood of xcur;
xtmp: best candidate;
xmin: best solution ever reached;
Na: maximum number of iterations with no improvement;
Nb: number of candidates selected from the neighborhood;
Nc: maximum number of tries when searching for one candidate;
randommove(x): method that generates a candidate in the neighborhood of x;
cost(x): evaluation of solution x;
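The loop above can be sketched in Python as follows. This is a minimal illustration, not the authors' implementation: the Tabu list is assumed to hold the costs of recently visited solutions with a fixed length `tabu_len`, "Update H" is assumed to mean dropping the oldest entries, and the toy problem (`swap_two`, `misplaced`) is hypothetical.

```python
import math
import random
from collections import deque

def tabu_search(x0, random_move, cost, Na=20, Nb=10, Nc=5, tabu_len=7, seed=0):
    rng = random.Random(seed)
    x_cur = x_min = x0
    H = deque(maxlen=tabu_len)  # assumed: fixed-length list of recent costs
    a = c = 0
    while a < Na:                      # stop after Na rounds without improvement
        tmp_cost, x_tmp = math.inf, x_cur
        b = 0
        while b < Nb:                  # collect Nb candidates from the neighborhood
            x_new = random_move(x_cur, rng)
            if cost(x_new) in H:
                c += 1
                if c < Nc:
                    continue           # candidate is tabu: try another move
                c = 0                  # after Nc tries, accept a tabu candidate
            if cost(x_new) < tmp_cost:
                x_tmp, tmp_cost = x_new, cost(x_new)
            b += 1
        H.append(cost(x_cur))          # oldest entry is dropped automatically
        x_cur = x_tmp
        if cost(x_cur) < cost(x_min):
            x_min, a = x_cur, 0
        else:
            a += 1
    return x_min

# Hypothetical toy problem: reorder a tuple so that value v sits at index v.
def swap_two(x, rng):
    i, j = rng.sample(range(len(x)), 2)
    y = list(x)
    y[i], y[j] = y[j], y[i]
    return tuple(y)

def misplaced(x):
    return sum(1 for i, v in enumerate(x) if v != i)

start = (3, 0, 4, 1, 2)
best = tabu_search(start, swap_two, misplaced)
```

Since xmin is replaced only by strictly cheaper solutions, the returned solution is never worse than the initial one.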
Crosstalk Elimination Based on Tabu Search(2)
randommove(x):
{
    swap two nets randomly,
    move one net randomly,
    insert one shield randomly,
    remove one shield randomly
}
cost(x) = w1*c1 + w2*c2 + w3*c3 + w4*c4
c1: total number of nets that are adjacent to their sensitive nets;
c2: total number of shields in a GRG edge;
c3: summation of (Keff - Kth) for all nets with Keff > Kth in a GRG edge;
c4: total number of nets with (Keff > Kth) in a GRG edge.
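The four cost terms can be written out as a short Python sketch. The assumptions are loud ones: a solution in one GRG edge is taken to be a list of tracks holding net names or a shield marker, the weights default to 1, and Keff comes from a caller-supplied function. The `toy_keff` below is a stand-in for demonstration, not the LSK model, and all names are hypothetical.

```python
from typing import Callable, Dict, List, Sequence, Set

SHIELD = "S"  # hypothetical marker for a shield track

def cost(order: List[str],
         sensitive: Dict[str, Set[str]],
         keff: Callable[[List[str], int], float],
         kth: Dict[str, float],
         w: Sequence[float] = (1.0, 1.0, 1.0, 1.0)) -> float:
    nets = [(i, n) for i, n in enumerate(order) if n != SHIELD]

    def adjacent_to_sensitive(i: int, n: str) -> bool:
        neighbours = [order[j] for j in (i - 1, i + 1) if 0 <= j < len(order)]
        return any(m in sensitive.get(n, set()) for m in neighbours)

    c1 = sum(1 for i, n in nets if adjacent_to_sensitive(i, n))  # nets next to a sensitive net
    c2 = sum(1 for n in order if n == SHIELD)                    # shields in the edge
    excess = [keff(order, i) - kth[n] for i, n in nets]
    c3 = sum(e for e in excess if e > 0)                         # total Keff - Kth over violations
    c4 = sum(1 for e in excess if e > 0)                         # number of violating nets
    return w[0] * c1 + w[1] * c2 + w[2] * c3 + w[3] * c4

sensitive = {"a": {"b"}, "b": {"a"}}
kth = {"a": 0.5, "b": 0.5, "c": 0.5}

def toy_keff(order: List[str], i: int) -> float:
    # Stand-in only: 1.0 for each sensitive net on an adjacent track.
    n = order[i]
    return float(sum(1 for j in (i - 1, i + 1)
                     if 0 <= j < len(order) and order[j] in sensitive.get(n, set())))

unshielded = cost(["a", "b", "c"], sensitive, toy_keff, kth)
shielded = cost(["a", "S", "b", "c"], sensitive, toy_keff, kth)
```

In this toy case a single shield between the two mutually sensitive nets removes the c1, c3 and c4 penalties at the price of one unit of c2, which is the trade-off the weights w1 to w4 control.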
Benchmark Data
Circuits    Number of nets    Grids
C2          745               9 × 11
C5          1764              16 × 18
C7          2356              16 × 18
Technology: 0.2um
Sensitivity rate: 0.5 for all nets; the sensitivity matrix is random.
LSK bound: 1000 at each sink
Experimental Results(1)
Circuits                                                     C2         C5         C7
Step 2 in CEE                  Simulated Annealing (SA)      901.97     2140.36    3748.78
                               Tabu search                   45.75      112.87     237.80
                               Runtime reduction             856.22     2027.49    3510.98
Step 3 in CEE                  SA                            153.53     56.36      453.70
                               Tabu search                   91.44      34.08      227.50
                               Runtime reduction             62.09      22.28      226.20
Total runtime (Step2+Step3)    SA                            1055.50    2196.72    4202.48
                               Tabu search                   137.19     146.95     465.30
                               Total runtime reduction       918.31     2049.77    3737.18
Comparison of runtime (s) between Tabu search and Simulated Annealing
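The runtimes above can be turned into speedup ratios with a short calculation. The numbers are copied from the table; the function name `speedups` is only an illustrative choice.

```python
def speedups(sa, tabu):
    """Ratio of SA runtime to Tabu search runtime for each circuit."""
    return {circuit: sa[circuit] / tabu[circuit] for circuit in sa}

sa_step2 = {"C2": 901.97, "C5": 2140.36, "C7": 3748.78}
tabu_step2 = {"C2": 45.75, "C5": 112.87, "C7": 237.80}
sa_total = {"C2": 1055.50, "C5": 2196.72, "C7": 4202.48}
tabu_total = {"C2": 137.19, "C5": 146.95, "C7": 465.30}

step2 = speedups(sa_step2, tabu_step2)
total = speedups(sa_total, tabu_total)

for circuit in sa_step2:
    print(f"{circuit}: step 2 {step2[circuit]:.1f}x, total {total[circuit]:.1f}x")
# C2: step 2 19.7x, total 7.7x
# C5: step 2 19.0x, total 14.9x
# C7: step 2 15.8x, total 9.0x
```

The Step 2 ratio is close to 20x on C2 and C5 and nearer 16x on C7; the total ratio is smaller because Step 3 shrinks by less than a factor of two.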
Experimental Results(2)
Circuits                             C2        C5        C7
Area                 SA              149196    271301    342395
                     Tabu search     149202    273307    346393
Shield number (Sn)   SA              158       460       589
                     Tabu search     165       501       621
                     Sn increment    7         41        32
Comparison of results between Tabu search and Simulated Annealing
Experimental Results(3)
Circuits                                C2           C5           C7
W mode (um)   (Part 1) Wire length      480350       1307456      1552916
              (PO-GR) Wire length       477326       1368198      1575922
              Wire length increment     -0.63%       4.65%        1.48%
T mode        (Part 1) Wire length      476424       1346876      1569366
              (PO-GR) Wire length       479100       1280352      1567818
              Wire length increment     0.56%        -4.94%       -0.10%
              (Part 1) Min-R            -0.009243    0.012124     0.000034
              (PO-GR) Min-R             -0.007195    0.003439     0.001243
Comparison of results between P1 and PO-GR
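The wire length increments follow from the Part 1 and PO-GR wire lengths as a relative change. A brief check, with values copied from the results above and `increment_percent` as an illustrative helper name:

```python
def increment_percent(part1: int, po_gr: int) -> float:
    """Relative change of the PO-GR wire length with respect to Part 1, in percent."""
    return (po_gr - part1) / part1 * 100.0

w_mode = {"C2": (480350, 477326), "C5": (1307456, 1368198), "C7": (1552916, 1575922)}
t_mode = {"C2": (476424, 479100), "C5": (1346876, 1280352), "C7": (1569366, 1567818)}

w_inc = {c: round(increment_percent(*pair), 2) for c, pair in w_mode.items()}
t_inc = {c: round(increment_percent(*pair), 2) for c, pair in t_mode.items()}
```

A negative value means PO-GR ended with a shorter total wire length than Part 1 alone, as on C2 in W mode and on C5 and C7 in T mode.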
Discussions

Tabu search sharply decreases the runtime of Step 2 in
CEE (about 20x speedup) and has no adverse effect on
Step 3 in CEE (its runtime decreases too).

Tabu search obtains similar results in routing area
compared with the SA method, while the shield number
increases only a little.

Tabu search achieves a 2.5x wire length reduction compared
with SA.

PO-GR keeps its effectiveness in timing optimization.
Conclusions
PO-GR is able to:

Take coupling inductance into consideration.

Tackle coupling noise, timing performance, and routability
simultaneously.

Efficiently eliminate crosstalk in the global routing phase
by inserting shields, with little influence on wire length
and timing performance.

Preserve the good routing result and greatly decrease
the running time.
THANK YOU