Timing-Driven Routing for FPGAs Based on Lagrangian Relaxation


Timing-Driven Routing for FPGAs
Based on Lagrangian Relaxation
Seokjin Lee*, D. F. Wong+
*Dept. of Electrical and Computer Engineering
+Dept. of Computer Sciences
The University of Texas at Austin

Outline
Overview
Introduction
    FPGA Architecture, Routing resources
    FPGA routing problem
Problem Formulation
    Routing graphs and Timing graphs
Algorithm Description
    Lagrangian Relaxation
    LR_ROUTE, NET_ROUTE
Experimental Results
Conclusion

Overview
A new timing-driven routing algorithm for FPGAs
Finds a routing with minimum critical path delay for a given placed circuit.
Handles the timing constraints in a mathematical programming framework.
Routing results are compared with those of the VPR router.

FPGA Architecture
Logic modules
    Implement logic functions
    LUTs, flip-flops
Routing resources
    Wire segments
    Programmable switches
I/O modules
[Figure: <A typical FPGA architecture>, an array of blocks marked L and S, with labels "logic module", "programmable switch", "I/O module", and "wire segments"]

FPGA Routing Resources
Prefabricated wire segments
Routing constraints: a wire segment cannot be shared by different nets
Limited routability
High RC delays and large area of switches
[Figure: logic modules L1-L4 with labels a-h]

FPGA Routing
[Figure: routing example with logic modules L1-L4, labels a-h, and elements numbered 1-16]

Routing Graph $G_r(V_r, E_r)$
[Figure: the routing example (logic modules L1-L4, labels a-h, elements numbered 1-16) shown next to its routing graph]
$V_r$: I/O pins of logic modules, wire segments
$E_r$: feasible connections between the nodes
Routing problem: find vertex-disjoint trees $T = \{T_1, \ldots, T_n\}$

Timing Constraints
Source-to-sink delays of nets
    Delay of wire-switch chains
    Calculated from architecture-specific RC values based on the Elmore delay model
Timing constraints
    Specified by arrival times at primary inputs (outputs of storage elements) or required times at primary outputs (inputs of storage elements)

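As a rough illustration of the Elmore model for a wire-switch chain, the sketch below treats each stage as a series resistance followed by a capacitance to ground. The function name, the lumped-stage approximation, and the R/C values are assumptions for illustration only; they are not taken from the slides.

```python
def elmore_chain_delay(stages):
    """Elmore delay from the driver to the far end of an RC chain.

    stages: list of (resistance, capacitance) pairs ordered from source to
    sink. Each stage is modelled as a series resistance followed by a
    capacitance to ground (a lumped approximation, assumed here).
    """
    delay = 0.0
    upstream_resistance = 0.0
    for resistance, capacitance in stages:
        upstream_resistance += resistance
        # Each capacitor charges through all the resistance upstream of it.
        delay += upstream_resistance * capacitance
    return delay


# Hypothetical chain: switch, wire segment, switch (ohms, farads).
chain = [(100.0, 1e-15), (50.0, 2e-15), (100.0, 1e-15)]
delay_seconds = elmore_chain_delay(chain)
```

Because every downstream capacitance is weighted by the total upstream resistance, the delay of a chain grows faster than linearly with the number of switches it passes through.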
Timing Graph $G_t(V_t, E_t)$
Constructed from input netlist
Captures timing constraints
$V_t$: inputs, outputs, logic module pins
$E_t$: source-sink pairs of nets, input-output pairs of logic modules
Fictitious nodes
    s: connects primary inputs
    t: connects primary outputs
[Figure: timing graph running from fictitious node s through primary inputs, logic modules, and primary outputs to fictitious node t]

Timing-Driven FPGA Routing
Minimization of critical path delay under timing and routing constraints
Find vertex-disjoint routing trees $T = \{T_1, \ldots, T_n\}$ for all the nets such that
Minimize $a_t$
subject to $a_u + D_{uv} \le a_v \quad \forall (u, v) \in E_t$
where
    if $u = s$: $a_s = 0$ and $D_{sv}$ = arrival time of input $v$
    if $v = t$: $D_{ut} = 0$ and $a_u$ = arrival time of output $u$
    otherwise: $a_u$ = arrival time at node $u$, $a_v$ = arrival time at node $v$, $D_{uv}$ = delay along path $(u, v)$

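For a fixed routing, the smallest arrival times that satisfy every constraint $a_u + D_{uv} \le a_v$ are longest-path distances from s, and the value at t is the critical path delay. The sketch below computes them in topological order. It assumes the timing graph is acyclic, that delays are non-negative, and that the fictitious nodes are named "s" and "t"; the function name and the example graph are hypothetical.

```python
from collections import defaultdict, deque


def arrival_times(edges):
    """Longest-path arrival times on an acyclic timing graph.

    edges: dict mapping (u, v) -> D_uv. Nodes with no fan-in (normally
    only the fictitious source "s") start at arrival time 0.
    """
    successors = defaultdict(list)
    indegree = defaultdict(int)
    nodes = set()
    for (u, v), delay in edges.items():
        successors[u].append((v, delay))
        indegree[v] += 1
        nodes.update((u, v))

    arrival = {node: 0.0 for node in nodes}
    ready = deque(node for node in nodes if indegree[node] == 0)
    while ready:
        u = ready.popleft()
        for v, delay in successors[u]:
            # Tightest a_v that satisfies a_u + D_uv <= a_v on every fan-in edge.
            arrival[v] = max(arrival[v], arrival[u] + delay)
            indegree[v] -= 1
            if indegree[v] == 0:
                ready.append(v)
    return arrival


# Hypothetical timing graph with two paths from s to t.
example_edges = {
    ("s", "a"): 1.0, ("s", "b"): 2.0,
    ("a", "c"): 3.0, ("b", "c"): 1.0,
    ("c", "t"): 0.0,
}
critical_path_delay = arrival_times(example_edges)["t"]
```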
Lagrangian Relaxation
General technique for solving optimization problems with difficult constraints
Lagrangian subproblems
    New objective function: the constraints are multiplied by constants (Lagrangian multipliers) and added to the original objective function
Iteratively update the Lagrangian multipliers and solve the Lagrangian subproblems

Lagrangian Relaxation
Original problem
$$\min f(x) \quad \text{s.t.} \quad g_1(x) \le b_1, \; g_2(x) \le b_2, \; \ldots, \; g_k(x) \le b_k$$
Lagrangian subproblem
$$\min f(x) + \lambda_1 (g_1(x) - b_1) + \lambda_2 (g_2(x) - b_2) + \cdots + \lambda_k (g_k(x) - b_k)$$
update $\lambda_1, \lambda_2, \ldots, \lambda_k$

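To make the subproblem concrete, here is a minimal sketch that solves it by enumeration for fixed multipliers. The finite candidate set, the function name, and the toy instance are assumptions chosen only so the example is small enough to check by hand.

```python
def solve_lagrangian_subproblem(candidates, f, constraints, multipliers):
    """Return the candidate minimising f(x) + sum_i lambda_i * (g_i(x) - b_i).

    candidates: finite iterable of points x (enumeration is an assumption
        made to keep the sketch tiny).
    constraints: list of (g_i, b_i) pairs representing g_i(x) <= b_i.
    multipliers: list of non-negative lambda_i, one per constraint.
    """
    def lagrangian(x):
        penalty = sum(lam * (g(x) - b)
                      for lam, (g, b) in zip(multipliers, constraints))
        return f(x) + penalty

    return min(candidates, key=lagrangian)


# Toy instance: minimise x^2 subject to -x <= -2 (that is, x >= 2).
candidates = range(0, 6)
objective = lambda x: x * x
constraints = [(lambda x: -x, -2)]

x_without_penalty = solve_lagrangian_subproblem(candidates, objective, constraints, [0.0])
x_with_penalty = solve_lagrangian_subproblem(candidates, objective, constraints, [4.0])
```

With a zero multiplier the relaxed problem ignores the constraint and picks x = 0; with the multiplier set to 4 the penalised objective is x^2 - 4x + 8, whose minimiser x = 2 is also the constrained optimum.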
LR for Our Problem
Original problem
$$\min a_t \quad \text{s.t.} \quad a_u + D_{uv} \le a_v \quad \forall (u, v) \in E_t$$
Lagrangian subproblem
$$\min L_\lambda(a, T) = a_t + \sum_{(u,v) \in E_t} \lambda_{uv} (a_u + D_{uv} - a_v)$$
update $\lambda$

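Optimality Conditions
By rearranging terms,
$$L_\lambda(a, T) = \Bigl(1 - \sum_{(u,t) \in E_t} \lambda_{ut}\Bigr) a_t + \sum_{w \in V_t - \{s,t\}} \Bigl(\sum_{(w,u) \in E_t} \lambda_{wu} - \sum_{(u,w) \in E_t} \lambda_{uw}\Bigr) a_w + \sum_{(u,v) \in E_t} \lambda_{uv} D_{uv}$$
Optimality conditions on $\lambda$: $\partial L / \partial a_u = 0 \quad \forall u \in V_t$
$$1 = \sum_{(u,t) \in E_t} \lambda_{ut}$$
$$\sum_{(w,v) \in E_t} \lambda_{wv} = \sum_{(u,w) \in E_t} \lambda_{uw} \quad \forall w \in V_t - \{s, t\}$$
Example:
$\lambda_{dt} + \lambda_{et} = 1$
$\lambda_{ac} + \lambda_{bc} = \lambda_{cd}$
[Figure: timing graph fragment with nodes a, b, c, d, e, t and edges labeled $\lambda_{ac}$, $\lambda_{bc}$, $\lambda_{cd}$, $\lambda_{dt}$, $\lambda_{et}$]
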
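The two conditions say that the multipliers behave like a unit flow arriving at t: what enters t sums to 1, and at every other node except s the incoming and outgoing multipliers balance. A minimal checker is sketched below. The function name, the tolerance, and the edges from s in the example are assumptions added to complete the small graph from the slide.

```python
import math
from collections import defaultdict


def satisfies_optimality_conditions(multipliers, source="s", sink="t"):
    """Check the flow-like conditions on the multipliers.

    multipliers: dict mapping timing-graph edge (u, v) -> lambda_uv.
    Returns True when all multipliers are non-negative, those entering the
    sink sum to 1, and every node other than source and sink is balanced.
    """
    inflow = defaultdict(float)
    outflow = defaultdict(float)
    for (u, v), lam in multipliers.items():
        if lam < 0:
            return False
        outflow[u] += lam
        inflow[v] += lam

    if not math.isclose(inflow[sink], 1.0, abs_tol=1e-9):
        return False
    internal = (set(inflow) | set(outflow)) - {source, sink}
    return all(math.isclose(inflow[n], outflow[n], abs_tol=1e-9)
               for n in internal)


# Edges ac, bc, cd, dt, et follow the slide; the edges out of s are hypothetical.
balanced = {
    ("s", "a"): 0.25, ("s", "b"): 0.25, ("s", "e"): 0.5,
    ("a", "c"): 0.25, ("b", "c"): 0.25, ("c", "d"): 0.5,
    ("d", "t"): 0.5, ("e", "t"): 0.5,
}
unbalanced = dict(balanced)
unbalanced[("d", "t")] = 0.75
```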
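Simplified Lagrangian Subproblem
$$L_\lambda(a, T) = \underbrace{\Bigl(1 - \sum_{(u,t)} \lambda_{ut}\Bigr)}_{0} a_t + \sum_{w} \underbrace{\Bigl(\sum_{(w,u)} \lambda_{wu} - \sum_{(u,w)} \lambda_{uw}\Bigr)}_{0} a_w + \sum_{(u,v)} \lambda_{uv} D_{uv}$$
Under the optimality conditions on $\lambda$, the coefficients of $a_t$ and $a_w$ vanish:
$$L_\lambda(T) = \sum_{(u,v) \in E_t} \lambda_{uv} D_{uv}$$
Lagrangian subproblem becomes
$$LS(\lambda): \min L_\lambda(T) = \sum_{(u,v) \in E_t} \lambda_{uv} D_{uv}$$
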
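Updating Lagrangian Multipliers
Subgradient Method
$$\lambda_{uv}^{r+1} = \max\{0, \; \lambda_{uv}^{r} + \rho_r (a_u + D_{uv} - a_v)\}$$
$\rho_r$: step size
$$\lim_{r \to \infty} \rho_r = 0 \;\wedge\; \lim_{r \to \infty} \sum_{i=1}^{r} \rho_i = \infty \;\Rightarrow\; \text{convergence}$$
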
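One subgradient step can be sketched as follows, using the amount $a_u + D_{uv} - a_v$ by which each timing constraint $a_u + D_{uv} \le a_v$ is violated. The function names and the harmonic step-size schedule are assumptions; the harmonic schedule is just one simple sequence that tends to zero while its partial sums diverge.

```python
def step_size(iteration, initial=1.0):
    """Diminishing step: tends to 0, but the partial sums grow without bound.

    The harmonic schedule initial / iteration is an assumed example.
    """
    return initial / iteration


def update_multipliers(multipliers, arrival, delay, step):
    """One projected subgradient step on the timing-constraint multipliers.

    multipliers: dict (u, v) -> lambda_uv
    arrival:     dict node -> arrival time a_u
    delay:       dict (u, v) -> D_uv
    """
    updated = {}
    for (u, v), lam in multipliers.items():
        violation = arrival[u] + delay[(u, v)] - arrival[v]
        # Projection onto lambda >= 0 keeps the multipliers valid.
        updated[(u, v)] = max(0.0, lam + step * violation)
    return updated


# Hypothetical single-edge example.
new_multipliers = update_multipliers(
    {("u", "v"): 0.5}, {"u": 1.0, "v": 2.0}, {("u", "v"): 2.0}, step=0.5)
```

Edges whose constraint is violated gain weight, so the next routing pass pays more for delay on them; edges with slack lose weight until they reach zero.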
LR_ROUTE
1. Initialize $\lambda$
2. Call NET_ROUTE to solve $LS(\lambda)$
3. Compute $a_u$ for each $u \in V_t$
4. Update $\lambda_{uv}$ for each $(u, v) \in E_t$
5. Repeat Steps 2-4 until no shared resource exists.

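The five steps can be sketched as a driver loop. Everything routing-specific is passed in as a callback, so the parameter names, the callback signatures, and the iteration cap are hypothetical choices made for this sketch rather than details from the slides.

```python
def lr_route(initial_multipliers, net_route, compute_arrival, update,
             has_shared_resource, max_iterations=50):
    """Skeleton of the LR_ROUTE steps with caller-supplied callbacks.

    net_route(multipliers) -> routing
    compute_arrival(routing) -> arrival times
    update(multipliers, arrival, routing, iteration) -> new multipliers
    has_shared_resource(routing) -> bool
    max_iterations is an assumed safety cap, not part of the listed steps.
    """
    multipliers = dict(initial_multipliers)                 # Step 1
    routing = None
    for iteration in range(1, max_iterations + 1):
        routing = net_route(multipliers)                    # Step 2
        arrival = compute_arrival(routing)                  # Step 3
        multipliers = update(multipliers, arrival,
                             routing, iteration)            # Step 4
        if not has_shared_resource(routing):                # Step 5
            break
    return routing, multipliers
```

Passing the iteration index to `update` lets the caller apply a diminishing step size without the driver knowing the schedule.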
Solving Lagrangian Subproblem
NET_ROUTE
Find routing trees $T$ for a set of given multipliers $\lambda$ such that
Minimize $\sum_{(u,v) \in E_t} \lambda_{uv} D_{uv}$
subject to $\sum_{k} x_{ik} \le 1 \quad \forall i \in V_r$
where $x_{ik} = 1$ if $T_k$ for net $k$ uses node $i$, and $0$ otherwise

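Solving Lagrangian Subproblem
$$LS(\lambda): \min \sum_{(u,v) \in E_t} \lambda_{uv} D_{uv} \quad \text{s.t.} \quad \sum_{k} x_{ik} \le 1$$
$$\min L_\mu(x) = \sum_{(u,v) \in E_t} \lambda_{uv} D_{uv} + \sum_{i \in V_r} \mu_i \Bigl(\sum_{k} x_{ik} - 1\Bigr)$$
$$= \sum_{\text{net } k} \Bigl\{ \sum_{(u,v) \in \text{net } k} \lambda_{uv} D_{uv} + \sum_{i \in V_r} \mu_i x_{ik} \Bigr\} - \sum_{i \in V_r} \mu_i$$
First term in braces: weighted sink delay for net $k$
Second term in braces: routing congestion cost for net $k$
Last term: constant
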
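Routing Nets
For net $k$, minimize
$$\sum_{(u,v) \in \text{net } k} \lambda_{uv} D_{uv} + \sum_{i \in V_r} \mu_i x_{ik} = \sum_{(u,v) \in \text{net } k} \lambda_{uv} \sum_{i \in \text{path}(u,v)} d_i + \sum_{i \in V_r} \mu_i x_{ik}$$
Cost for each node:
$$c_i = \lambda_{uv} d_i + \mu_i$$
$\lambda_{uv} d_i$: delay cost
$\mu_i$: congestion cost
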
NET_ROUTE
1. For each net $k$
2.     Rip up routing for net $k$
3.     For each sink $v$ of net $k$
4.         Maze route from source to sink with cost $c_i = \lambda_{uv} d_i + \mu_i$
5.         Update $\mu_i$ for all nodes in path$(u, v)$

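The maze-routing step for a single source-sink pair can be sketched as a shortest-path search in which entering node i costs $c_i = \lambda_{uv} d_i + \mu_i$. Dijkstra's algorithm is used here as one common way to implement maze routing; the slides do not say which search is used. The graph representation, function name, and example values are hypothetical, and the update rule for $\mu_i$ is left out because the slides do not specify it.

```python
import heapq
from itertools import count


def maze_route(neighbors, source, sink, node_delay, congestion, lam):
    """Cheapest source-to-sink path under node costs c_i = lam * d_i + mu_i.

    neighbors:  dict node -> iterable of adjacent routing-graph nodes
    node_delay: dict node -> d_i
    congestion: dict node -> mu_i (missing nodes count as 0)
    lam:        multiplier lambda_uv of the connection being routed
    Returns the path as a list of nodes, or None if the sink is unreachable.
    Assumes all costs are non-negative, as Dijkstra's algorithm requires.
    """
    def cost(i):
        return lam * node_delay[i] + congestion.get(i, 0.0)

    tie_breaker = count()  # avoids comparing node objects on equal cost
    best = {source: cost(source)}
    parent = {source: None}
    heap = [(best[source], next(tie_breaker), source)]
    while heap:
        dist, _, node = heapq.heappop(heap)
        if node == sink:
            break
        if dist > best[node]:
            continue  # stale heap entry
        for nxt in neighbors.get(node, ()):
            candidate = dist + cost(nxt)
            if candidate < best.get(nxt, float("inf")):
                best[nxt] = candidate
                parent[nxt] = node
                heapq.heappush(heap, (candidate, next(tie_breaker), nxt))

    if sink not in parent:
        return None
    path = []
    node = sink
    while node is not None:
        path.append(node)
        node = parent[node]
    return path[::-1]


# Hypothetical routing graph: two alternative wire segments w1 and w2.
graph = {"src": ["w1", "w2"], "w1": ["dst"], "w2": ["dst"]}
delays = {"src": 0.0, "w1": 1.0, "w2": 2.0, "dst": 0.0}
uncongested_path = maze_route(graph, "src", "dst", delays, {}, lam=1.0)
congested_path = maze_route(graph, "src", "dst", delays, {"w1": 5.0}, lam=1.0)
```

In the example, the faster segment w1 is chosen when nothing is congested, and the slower w2 is chosen once w1 carries a congestion cost of 5, which shows how the two cost terms trade off.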
Experimental Results
FPGA model used
    Symmetrical-array-based FPGA
    Each logic block contains four 4-input LUTs and flip-flops
    Switch connections: Fs = 3, Fc = W
        Fs: number of connections per wire entering the switch box
        Fc: number of tracks to which each logic block pin can connect
        W: number of tracks in a channel

Experimental Results
Tested on large circuits from the MCNC benchmark
Routing with fixed channel width
    Minimum channel width obtained by running VPR in timing-driven mode
Better results for 13 circuits (out of 17)
Critical path delay improved by up to 33% with comparable runtime

Experimental Results
Critical path delay and runtime comparison
Circuit   LUTs/FFs  Tracks  VPR delay (ns)  LR_ROUTE delay (ns)  VPR runtime (s)  LR_ROUTE runtime (s)
Alu4      1522      33      46.6            46.2                 58               57
Apex2     1878      43      61.5            49.3                 61               46
Apex4     1262      41      45.4            48.9                 29               41
Bigkey    1707      24      41.7            27.8                 53               62
Clma      8383      51      125.0           96.4                 531              464
Des       1591      24      43.5            48.1                 44               42
Diffeq    1497      29      48.8            48.6                 32               31
Dsip      1370      25      29.6            27.6                 53               78
Elliptic  3604      40      77.1            71.3                 151              256

Experimental Results
Circuit   LUTs/FFs  Tracks  VPR delay (ns)  LR_ROUTE delay (ns)  VPR runtime (s)  LR_ROUTE runtime (s)
Ex1010    4598      44      83.5            75.2                 248              351
Ex5p      1064      43      44.8            43.7                 22               34
frisc     3556      43      81.5            84.3                 121              171
misex3    1397      37      42.5            49.4                 50               49
pdc       4575      61      96.5            95.0                 304              465
s298      1931      28      98.7            91.5                 71               85
seq       1750      35      55.9            47.0                 55               67
spla      3690      56      94.7            74.0                 203              234

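The summary figures quoted earlier (better results for 13 of 17 circuits, up to 33% improvement) can be checked directly against the delay columns of the two tables. The short script below does that; the delay values are copied from the tables, while the variable names are just illustrative.

```python
# (VPR delay, LR_ROUTE delay) in ns, copied from the two result tables.
delays = {
    "Alu4": (46.6, 46.2), "Apex2": (61.5, 49.3), "Apex4": (45.4, 48.9),
    "Bigkey": (41.7, 27.8), "Clma": (125.0, 96.4), "Des": (43.5, 48.1),
    "Diffeq": (48.8, 48.6), "Dsip": (29.6, 27.6), "Elliptic": (77.1, 71.3),
    "Ex1010": (83.5, 75.2), "Ex5p": (44.8, 43.7), "frisc": (81.5, 84.3),
    "misex3": (42.5, 49.4), "pdc": (96.5, 95.0), "s298": (98.7, 91.5),
    "seq": (55.9, 47.0), "spla": (94.7, 74.0),
}

# Relative reduction in critical path delay; positive means LR_ROUTE is faster.
improvement = {name: (vpr - lr) / vpr for name, (vpr, lr) in delays.items()}

better = sorted(name for name, gain in improvement.items() if gain > 0)
best_circuit = max(improvement, key=improvement.get)
best_percent = round(100 * improvement[best_circuit])
```

The four circuits where VPR has the lower delay are Apex4, Des, frisc, and misex3; the largest improvement is on Bigkey (41.7 ns down to 27.8 ns).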
Conclusion
A new timing-driven routing algorithm for FPGAs
Finds a routing with minimum critical path delay for a given placed circuit.
Handles the timing constraints by Lagrangian relaxation.
Routing results are better than those of the VPR router on 13 of the 17 benchmark circuits.