
CSE 494: Electronic Design Automation
Lecture 4: Partitioning
Organization
Partitioning
• Kernighan-Lin (KL) Heuristic
• Fiduccia-Mattheyses (FM) Heuristic
• Simulated annealing

Partitioning
• Division of a graph (or hypergraph) into multiple sub-graphs is known as partitioning.
• Partitioning should
  - Maintain functionality
  - Minimize interconnections between subgraphs
  - Have low run-time complexity
Problem Formulation
• Given
  - A hypergraph G(V,E)
  - V = {v_1, v_2, …, v_n}, the set of vertices
  - E = {e_1, e_2, …, e_m}, the set of hyperedges, where e_i = {v_i, v_j, …, v_k}
  - Area of each vertex, a(v_i)
• Partition V into {V_1, V_2, V_3, …, V_k} where
  - V_i ∩ V_j = ∅ for i ≠ j
  - V_1 ∪ V_2 ∪ … ∪ V_k = V
  - Size of each partition < constraint
  - Cut-set is minimized
• Partitioning is an NP-complete problem.
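The formulation above can be checked mechanically on a small instance. The following is a minimal sketch, assuming hyperedges are stored as Python sets and the partition as a list of vertex sets; the names `cut_size`, `is_valid_partition`, and `max_area` are illustrative and do not come from the lecture.

```python
def cut_size(hyperedges, blocks):
    """Count hyperedges whose vertices fall in more than one block."""
    block_of = {v: i for i, block in enumerate(blocks) for v in block}
    return sum(1 for e in hyperedges if len({block_of[v] for v in e}) > 1)


def is_valid_partition(vertices, blocks, area, max_area):
    """Check disjointness, coverage of V, and the per-block size constraint."""
    seen = set()
    for block in blocks:
        if seen & block:                      # V_i and V_j must not overlap
            return False
        seen |= block
        if sum(area[v] for v in block) >= max_area:   # size must stay below the constraint
            return False
    return seen == set(vertices)              # the union of all V_i must equal V


# Small example: six unit-area vertices, four hyperedges, two blocks.
vertices = [1, 2, 3, 4, 5, 6]
hyperedges = [{1, 2, 3}, {3, 4}, {4, 5, 6}, {1, 6}]
blocks = [{1, 2, 3}, {4, 5, 6}]
area = {v: 1 for v in vertices}

valid = is_valid_partition(vertices, blocks, area, max_area=4)
cut = cut_size(hyperedges, blocks)   # hyperedges {3, 4} and {1, 6} are cut
```

A hyperedge counts once toward the cut no matter how many blocks it spans, which is the usual cut-set convention for hypergraphs.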
Objective and Constraints
• Objective
  - Obj1: Minimize interconnection between various partitions
  - Obj2: Minimize delay due to partition
• Constraints
  - Const1: Number of terminals or pins
  - Const2: Area of each partition
  - Const3: Number of partitions
Partitioning and Design Styles
• Full Custom
  - Area and terminal count constraints
  - Minimize nets crossing a partition, delay
• Standard Cell
  - At RTL, Circuit
  - Partition RTL specification into disjoint sub-circuits, such that each sub-circuit corresponds to a standard cell
  - Minimize nets, delay
• Gate array
  - At RTL
  - Partition RTL specification recursively such that each partition corresponds to a gate.
  - Minimize delay
Classification of Partitioning Algorithms
• Constructive algorithms versus iterative improvement algorithms
• Deterministic versus probabilistic algorithms
Bi-partitioning problem
• Also known as min-cut partitioning
• Number of partitions = 2
• Minimize the nets crossing the partitions
• Size of the two partitions is equal
• Given a graph with N nodes, calculate the number of different bi-partitions!
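One way to work the counting exercise, assuming N is even, the two partitions must be the same size, and the two partitions are interchangeable (swapping their names does not give a new bi-partition): choose N/2 of the N nodes for one side and divide by 2. The helper name below is illustrative.

```python
from math import comb


def count_equal_bipartitions(n):
    """Number of ways to split n labeled nodes into two unordered halves."""
    if n <= 0 or n % 2 != 0:
        raise ValueError("n must be a positive even number")
    # C(n, n/2) picks one half; each split is counted twice, once per half.
    return comb(n, n // 2) // 2


for n in (4, 8, 20):
    print(n, count_equal_bipartitions(n))
```

The count grows roughly as 2^N, which is why exhaustive enumeration is not practical and heuristics such as KL are used.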
Kernighan-Lin (KL) Heuristic
• Bi-partitioning algorithm
• Input specified as a graph G(V,E)
  - Obj: Divide V into two equal halves
  - Minimize cut-set
• Iterative improvement
  - Starts with a random initial partition.
KL: Input and Output
[Figure: an example graph on vertices 1-8, drawn twice (input and output).]
KL: Gain Calculation
• For each vertex a
  - I(a) = number of edges incident on a that do not cross the cut
  - E(a) = number of edges incident on a that cross the cut
  - Gain(a) = E(a) - I(a)
• If two vertices a in A and b in B are exchanged
  - Gain(a,b) = Gain(a) + Gain(b) - 2c(a,b), where c(a,b) is the weight of the edge between a and b (0 if they are not connected)
• Cutcost' = Cutcost - Gain(a,b)
• For the remaining vertices x in A and y in B
  - Gain'(x) = Gain(x) + 2c(x,a) - 2c(x,b)
  - Gain'(y) = Gain(y) + 2c(y,b) - 2c(y,a)
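The gain formulas can be checked on a tiny graph. This is a minimal sketch, assuming an unweighted graph stored as a dict of neighbor sets (so c(a,b) is 1 if a and b are adjacent and 0 otherwise); the function names are illustrative.

```python
def cut_cost(adj, side):
    """Number of edges whose endpoints are in different partitions."""
    return sum(1 for u in adj for v in adj[u] if u < v and side[u] != side[v])


def gain(v, adj, side):
    """Gain(v) = E(v) - I(v) for a single vertex."""
    external = sum(1 for u in adj[v] if side[u] != side[v])
    internal = len(adj[v]) - external
    return external - internal


def swap_gain(a, b, adj, side):
    """Gain(a,b) = Gain(a) + Gain(b) - 2c(a,b)."""
    c_ab = 1 if b in adj[a] else 0
    return gain(a, adj, side) + gain(b, adj, side) - 2 * c_ab


# A = {1, 2}, B = {3, 4}; edges 1-3, 1-4, 2-3.
adj = {1: {3, 4}, 2: {3}, 3: {1, 2}, 4: {1}}
side = {1: "A", 2: "A", 3: "B", 4: "B"}

before = cut_cost(adj, side)
predicted = swap_gain(2, 4, adj, side)

# Perform the exchange and confirm Cutcost' = Cutcost - Gain(a,b).
swapped = dict(side)
swapped[2], swapped[4] = "B", "A"
after = cut_cost(adj, swapped)
```

The `- 2c(a,b)` term matters when a and b are adjacent: their shared edge still crosses the cut after the exchange, so it must not be counted as a saving for either vertex.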
KL: Strategy
• Find a node from each partition whose exchange results in the largest gain.
• Exchange the nodes, and lock them in the new partitions.
• Maintain a table that records and updates the cumulative gain after every move.
• Continue exchanging nodes until all nodes are locked.
• Based on the table, implement the first "k" moves that result in the largest cumulative gain.
KL: Table
Iteration   Vertex pair   Gain(i,j)   Sum of Gain(i,j)   Cutsize
0           -             -           -                  9
1           (3,5)         3           3                  6
2           (4,6)         5           8                  1
3           (1,7)         -6          2                  7
4           (2,8)         -2          0                  9
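Choosing "k" from the table amounts to finding the prefix of moves with the largest cumulative gain. A minimal sketch using the gain column of the table above; the helper name `best_prefix` is illustrative.

```python
from itertools import accumulate


def best_prefix(gains):
    """Return (k, cumulative_gain) for the best first-k moves; k = 0 means no move helps."""
    best_k, best_sum = 0, 0
    for k, total in enumerate(accumulate(gains), start=1):
        if total > best_sum:
            best_k, best_sum = k, total
    return best_k, best_sum


initial_cutsize = 9
pair_gains = [3, 5, -6, -2]          # Gain(i,j) column of the table

k, total_gain = best_prefix(pair_gains)
final_cutsize = initial_cutsize - total_gain
print(k, total_gain, final_cutsize)
```

For this table only the first two exchanges, (3,5) and (4,6), are kept, and the cutsize drops from 9 to 1.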
KL: Algorithm
Complexity = O(n^3)
begin
    initialize();
    while (improve == TRUE)
        while (UNLOCK(A) == TRUE)
            min = infinity
            for all unlocked (a) in A
                for all unlocked (b) in B
                    if (cutcost - gain(a,b) < min)
                        min = cutcost - gain(a,b)
                        sel_a = a, sel_b = b
            cutcost = min, lock(sel_a), lock(sel_b), update(T)
        implement first k moves that achieve the lowest cutset
        set improve
end
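The pseudocode can be turned into a short runnable version. This is a sketch under stated assumptions, not a reference implementation: the graph is unweighted and stored as a symmetric dict of neighbor sets, the two partitions are labeled 0 and 1, and the names `kl_pass` and `kernighan_lin` are illustrative.

```python
def cut_cost(adj, side):
    """Number of edges whose endpoints lie in different partitions."""
    return sum(1 for u in adj for v in adj[u] if u < v and side[u] != side[v])


def kl_pass(adj, side):
    """Run one KL pass; return (new_side, realized_gain)."""
    block_a = [v for v in adj if side[v] == 0]
    block_b = [v for v in adj if side[v] == 1]
    # Gain(v) = E(v) - I(v): +1 per crossing edge, -1 per internal edge.
    gain = {v: sum(1 if side[u] != side[v] else -1 for u in adj[v]) for v in adj}
    locked, swaps, gains = set(), [], []

    for _ in range(min(len(block_a), len(block_b))):
        best = None
        for a in block_a:
            if a in locked:
                continue
            for b in block_b:
                if b in locked:
                    continue
                g = gain[a] + gain[b] - 2 * (b in adj[a])
                if best is None or g > best[0]:
                    best = (g, a, b)
        g, a, b = best
        locked.update((a, b))
        swaps.append((a, b))
        gains.append(g)
        # Update the unlocked vertices as if a and b had been exchanged.
        for x in adj:
            if x in locked:
                continue
            delta = 2 * (a in adj[x]) - 2 * (b in adj[x])
            gain[x] += delta if side[x] == 0 else -delta

    # Keep only the first k swaps with the largest cumulative gain.
    best_k, best_sum, total = 0, 0, 0
    for k, g in enumerate(gains, start=1):
        total += g
        if total > best_sum:
            best_k, best_sum = k, total
    new_side = dict(side)
    for a, b in swaps[:best_k]:
        new_side[a], new_side[b] = 1, 0
    return new_side, best_sum


def kernighan_lin(adj, side):
    """Repeat passes until a pass yields no positive gain."""
    while True:
        side, improvement = kl_pass(adj, side)
        if improvement <= 0:
            return side


# Two triangles {0,1,2} and {3,4,5} joined by the edge 2-3.
adj = {0: {1, 2}, 1: {0, 2}, 2: {0, 1, 3}, 3: {2, 4, 5}, 4: {3, 5}, 5: {3, 4}}
start = {0: 0, 1: 1, 2: 0, 3: 1, 4: 0, 5: 1}
result = kernighan_lin(adj, start)
```

The pair search is the naive double loop from the pseudocode, so each pass costs O(n^3) in this sketch.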
KL Drawbacks
• Handles only vertices of unit weight.
• Addresses only exact bisections.
• Cannot handle hypergraphs.
• Time complexity is high.
Fiduccia-Mattheyses (FM) Problem Definition
Given
• A hypergraph G(C, N) where C is the set of cells, and N is the set of nets.
  - Each cell i has a size s(i).
• A fraction r = |A|/(|A| + |B|)
• Partition G into two blocks A and B such that
  - the resulting cutset is minimized, and
  - the fraction r is satisfied.
FM Definitions
• Total number of nets: N
• Total number of cells: C
• Size of each cell: s(i)
• Number of cells in a net: n(i)
• Number of pins in a cell: p(i)
• Total number of pins: P = p(1) + p(2) + … + p(C) = n(1) + n(2) + … + n(N)
FM Definition
• The cut state of a net is '1' if the net has cells in both partitions.
• A net is considered critical if it has a cell which, if moved, will change its cut state:
  - It has no cell in one partition (all its cells are in the other partition), or
  - It has only one cell in one partition (say A), and the remaining cells are in the other partition (B).
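The two definitions translate directly into code. A minimal sketch, assuming a net is a list of cell names and `side` maps each cell to "A" or "B"; the function names are illustrative, and the critical test is applied symmetrically to both partitions.

```python
def cut_state(net, side):
    """1 if the net has cells in both partitions, else 0."""
    return 1 if len({side[c] for c in net}) == 2 else 0


def is_critical(net, side):
    """True if moving some single cell of the net would change its cut state."""
    in_a = sum(1 for c in net if side[c] == "A")
    in_b = len(net) - in_a
    # Critical when one partition holds zero cells or exactly one cell of the net.
    return min(in_a, in_b) <= 1


side = {"c1": "A", "c2": "B", "c3": "B", "c4": "A"}

one_vs_two = ["c1", "c2", "c3"]          # one cell in A, two in B
two_vs_two = ["c1", "c2", "c3", "c4"]    # two cells on each side
all_in_b = ["c2", "c3"]                  # no cell in A

print(cut_state(one_vs_two, side), is_critical(one_vs_two, side))
print(cut_state(two_vs_two, side), is_critical(two_vs_two, side))
print(cut_state(all_in_b, side), is_critical(all_in_b, side))
```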
FM Strategy
• Overall strategy is similar to KL.
  - Iterative improvement.
• However, some modifications.
  - Support for hypergraphs.
• Only one cell moved at a time.
  - Max gain
  - Maintains the ratio: r*W - smax <= |A| <= r*W + smax, where W is the total size of all cells and smax is the size of the largest cell
• Efficient data structures for:
  - Accessing cells and nets
  - Obtaining cells with max gain
  - Calculating and updating gain
Cell and Net Data Structures
• An array of cell nodes
  - Each node has a linked list of nets
• An array of nets
  - Each position has a linked list of cells
• Constructed in O(P).
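A minimal sketch of the two arrays, assuming cells and nets are numbered from 0 and the netlist is given as a list of nets, each a list of cell indices. Python lists stand in for the linked lists of the slides, and `build_structures` is an illustrative name.

```python
def build_structures(nets, num_cells):
    """Build the cell array (cell -> nets) and the net array (net -> cells)."""
    cell_nets = [[] for _ in range(num_cells)]
    net_cells = [list(net) for net in nets]
    # Every pin is visited exactly once, so construction takes O(P) time.
    for n, net in enumerate(nets):
        for c in net:
            cell_nets[c].append(n)
    return cell_nets, net_cells


nets = [[0, 1], [1, 2, 3], [0, 3]]
cell_nets, net_cells = build_structures(nets, num_cells=4)

total_pins = sum(len(cells) for cells in net_cells)
print(cell_nets)
print(total_pins)
```

Summing the list lengths in either array gives the same total P, matching the identity p(1) + … + p(C) = n(1) + … + n(N).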
Bucket Structure
• The gain when a cell is moved can vary from -pmax to +pmax, where pmax is the maximum number of pins on any cell.
• Each partition has an array of pointers called the bucket array.
• Size of the array is given by 2*pmax + 1.
• Each array location "i" has a linked list of the cells with gain "-pmax + i".
• The bucket structure is utilized for bucket sort.
• A pointer MAXGAIN points to the location with the maxgain cell.
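A minimal sketch of one partition's bucket array. It swaps the linked lists of the slides for Python sets (removal by cell is still constant time on average), and the class and method names are illustrative.

```python
class GainBuckets:
    """Bucket array for one partition: index i holds the cells with gain -pmax + i."""

    def __init__(self, pmax):
        self.pmax = pmax
        self.buckets = [set() for _ in range(2 * pmax + 1)]
        self.gain_of = {}
        self.max_index = -1          # plays the role of MAXGAIN; -1 means empty

    def insert(self, cell, gain):
        index = gain + self.pmax
        self.buckets[index].add(cell)
        self.gain_of[cell] = gain
        self.max_index = max(self.max_index, index)

    def remove(self, cell):
        index = self.gain_of.pop(cell) + self.pmax
        self.buckets[index].remove(cell)
        # Walk MAXGAIN down past any buckets that have become empty.
        while self.max_index >= 0 and not self.buckets[self.max_index]:
            self.max_index -= 1

    def update(self, cell, delta):
        new_gain = self.gain_of[cell] + delta
        self.remove(cell)
        self.insert(cell, new_gain)

    def max_gain_cell(self):
        if self.max_index < 0:
            return None
        return next(iter(self.buckets[self.max_index]))


b = GainBuckets(pmax=2)
b.insert("x", 1)
b.insert("y", -2)
b.insert("z", 2)
first = b.max_gain_cell()
b.remove("z")
second = b.max_gain_cell()
b.update("y", +4)
third = b.max_gain_cell()
```

Because gains are small integers in a known range, no comparison sort is needed: finding the best cell is a lookup at MAXGAIN.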
Free List
• Once a cell has been moved and locked, it is
  - Removed from the bucket structure.
  - Placed in the free cell list.
• Reduces the number of entries in the bucket structure.
Selection of base cell
• Consider the cell of the highest gain from each of the bucket structures.
  - Must satisfy the r "inequality" on the move.
• Break ties by selecting the one that gives the best r.
• Selected cell is called the base cell.
• Remove it from the bucket structure, lock it, and place it in the free list.
Initial Computation of Cell Gains
• F => current or "from" block of cell i.
• T => target or "to" block of cell i.
• Gain determined by only critical nets.
• FS(i) => number of nets that have cell i as their only F cell.
• TE(i) => number of nets that contain cell i and have an empty T.
• G(i) = FS(i) - TE(i)
• Can be calculated in O(P).
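A minimal sketch of the initial gain computation, assuming nets are lists of cell indices and `side` maps each cell to block 0 or 1; `initial_gains` is an illustrative name. Each pin is visited a constant number of times, so the cost is O(P).

```python
def initial_gains(nets, side):
    """Compute G(i) = FS(i) - TE(i) for every cell."""
    gains = {cell: 0 for cell in side}
    for net in nets:
        count = {0: 0, 1: 0}
        for cell in net:
            count[side[cell]] += 1
        for cell in net:
            from_block = side[cell]
            to_block = 1 - from_block
            if count[from_block] == 1:   # cell is the net's only F cell: counts toward FS
                gains[cell] += 1
            if count[to_block] == 0:     # net has an empty T: counts toward TE
                gains[cell] -= 1
    return gains


nets = [[0, 1], [1, 2, 3], [0, 3]]
side = {0: 0, 1: 0, 2: 1, 3: 1}
gains = initial_gains(nets, side)
print(gains)
```

In this example only cell 3 has positive gain: moving it to block 0 uncuts the net [0, 3] while the net [1, 2, 3] stays cut, so the cutset shrinks by one.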
Updating Cell Gains
• Base cell is moved from one partition to another.
• Only nets that are critical before or after the move should be considered.
• Cells that are not locked and belong to such critical nets are updated.
Updating Cell Gains: Cases 1-4
[Figure: four cases, one slide each, showing the cells of a net in the F and T blocks with their gain changes (+1, 0, or -1).]
Gain Update Algorithm
Complexity is O(P)
/* F(n), T(n) = number of cells of net n in the F and T blocks */
For each net n on the base cell
    /* critical before move */
    If T(n) = 0 then incr gain of all free cells on n
    Else if T(n) = 1 then decr gain of the only T cell, if it is free
    /* change net distribution */
    decr F(n), incr T(n)
    /* critical after move */
    If F(n) = 0 then decr gain of all free cells on n
    Else if F(n) = 1 then incr gain of the only F cell, if it is free
End
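The update rules can be sketched as follows, assuming blocks are labeled 0 and 1, `nets_of` maps a cell to its net indices, and `cells_of` maps a net index to its cells; all names are illustrative. For brevity the sketch recounts F(n) and T(n) for each net instead of maintaining them incrementally, so it is simpler but not as efficient as the bookkeeping the slides describe.

```python
def move_and_update(base, nets_of, cells_of, side, gains, locked):
    """Move the base cell to the other block and update gains of free cells."""
    f_block = side[base]
    t_block = 1 - f_block
    locked.add(base)                      # the base cell is no longer free
    for n in nets_of[base]:
        cells = cells_of[n]
        f_count = sum(1 for c in cells if side[c] == f_block)
        t_count = len(cells) - f_count
        # Critical before the move.
        if t_count == 0:
            for c in cells:
                if c not in locked:
                    gains[c] += 1
        elif t_count == 1:
            for c in cells:
                if side[c] == t_block and c not in locked:
                    gains[c] -= 1
        # Change the net distribution to reflect the move.
        f_count -= 1
        t_count += 1
        # Critical after the move.
        if f_count == 0:
            for c in cells:
                if c not in locked:
                    gains[c] -= 1
        elif f_count == 1:
            for c in cells:
                if side[c] == f_block and c not in locked:
                    gains[c] += 1
    side[base] = t_block


cells_of = [[0, 1], [1, 2, 3], [0, 3]]
nets_of = {0: [0, 2], 1: [0, 1], 2: [1], 3: [1, 2]}
side = {0: 0, 1: 0, 2: 1, 3: 1}
gains = {0: 0, 1: 0, 2: 0, 3: 1}          # initial gains for this partition
locked = set()

move_and_update(3, nets_of, cells_of, side, gains, locked)
```

After cell 3 moves to block 0, the free cells 0, 1, and 2 end up with gains -2, -1, and +1, which matches recomputing FS - TE from scratch on the new partition.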
KL and FM are Deterministic algorithms
• Every invocation of the algorithm with identical inputs generates the same solution (hence, deterministic).
• Fast, but inherently greedy in nature.
[Figure: cost versus successive solutions, with the search settling in a local minimum.]
Non-deterministic algorithms
• Also known as probabilistic or stochastic algorithms.
• Invocations of the algorithm with identical inputs can generate different solutions.
• Slower than deterministic algorithms, but demonstrate non-greedy behavior.
[Figure: cost versus successive solutions, showing hill-climbing behavior.]
Simulated Annealing
• Simulated annealing is a generic optimization technique.
• In physical design automation (PDA), it has been applied to partitioning and placement.
• Maintains a temperature variable that is reduced from a high value to a low value.
• A number of solutions are explored at each temperature by modifying the existing solution.
• A solution that decreases cost is always accepted.
• At high temperatures, solutions that increase cost are accepted with greater probability.
• At low temperatures, solutions that increase cost are accepted with very low probability.
Partitioning by Simulated Annealing
Algorithm SA
Begin
    T = T_initial; P = initial partition; C = cutsize(P);
    repeat
        repeat
            P' = neighbourhood(P); C' = cutsize(P');
            D = C' - C; r = random(0,1);
            if (D < 0 OR r < exp(-D/T)) then P = P'; C = C';  /* accept P' */
        until (equilibrium at T is reached)
        T = alpha * T; /* 0 < alpha < 1 */
    until (T <= T_final);
End.
Partitioning by Simulated Annealing
• A neighbourhood solution could be generated by exchanging two nodes.
• Equilibrium at T
  - Apply a fixed number of moves.
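A minimal runnable sketch of the SA loop, assuming a graph given as an edge list, partitions labeled 0 and 1, a neighbourhood move that exchanges one random node from each side, and a fixed number of moves per temperature as the equilibrium test. The parameter values and the extra step of remembering the best partition seen are choices made for this example, not part of the lecture's algorithm.

```python
import math
import random


def cutsize(edges, side):
    """Number of edges whose endpoints are in different partitions."""
    return sum(1 for u, v in edges if side[u] != side[v])


def anneal(edges, side, t_initial=10.0, t_final=0.01, alpha=0.9,
           moves_per_t=50, seed=0):
    rng = random.Random(seed)            # seeded so that runs are repeatable
    side = dict(side)
    cost = cutsize(edges, side)
    best, best_cost = dict(side), cost
    t = t_initial
    while t > t_final:
        for _ in range(moves_per_t):     # "equilibrium": a fixed number of moves
            a = rng.choice([v for v in side if side[v] == 0])
            b = rng.choice([v for v in side if side[v] == 1])
            side[a], side[b] = 1, 0      # neighbourhood move: exchange a and b
            new_cost = cutsize(edges, side)
            d = new_cost - cost
            if d < 0 or rng.random() < math.exp(-d / t):
                cost = new_cost          # accept the new partition
                if cost < best_cost:
                    best, best_cost = dict(side), cost
            else:
                side[a], side[b] = 0, 1  # reject: undo the exchange
        t *= alpha                       # cool down, 0 < alpha < 1
    return best, best_cost


# Two triangles {0,1,2} and {3,4,5} joined by the edge 2-3.
edges = [(0, 1), (0, 2), (1, 2), (2, 3), (3, 4), (3, 5), (4, 5)]
start = {0: 0, 1: 1, 2: 0, 3: 1, 4: 0, 5: 1}
best, best_cost = anneal(edges, start)
print(best_cost)
```

Exchanging one node from each side keeps the two partitions the same size throughout, so the result is always a bisection.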
Ratio Cut
• KL aims to generate equally sized bipartitions.
• FM gives the possibility for unequal bipartitions.
• Neither considers the graph structure itself.
• Ratio cut overcomes this limitation.
Ratio Cut
• Ratio cut is a cost function.
• Utilized instead of just cut set.
• R = C / (|A| × |B|), where C is the cut size and |A|, |B| are the sizes of the two partitions.