IAY 0600
Digital Systems Design (Digitaalsüsteemide disain)
FSM Decomposition Synthesis
Lab. 6
Alexander Sudnitson
Tallinn University of Technology
Decomposition motivation
• It is often convenient to realize a sequential circuit as an interconnection of sub-circuits that together realize the same terminal behavior.
• A large hardware behavioral description is decomposed into several smaller ones.
• First, decide how the overall circuit is to be broken up and what function each sub-circuit must serve.
• Then, treat each sub-circuit as a separate and independent design problem. One goal is to make the synthesis problem more tractable by providing smaller sub-problems that can be solved efficiently. Another goal is to create descriptions that can be synthesized into a structure that meets the design constraints.

FSM decomposition problem
The FSM decomposition problem is commonly stated as the task of replacing a given prototype FSM with a network of interconnected and interacting component machines that has the same terminal behaviour. The hardware behaviour description is decomposed into a network of interconnected FSMs in order to optimize it by various criteria (performance, size, power consumption).
[Figure: an FSM with inputs and outputs, and a network of FSMs built from sub-FSMs 1, 2, 3, …, n with the same inputs and outputs.]

FSM decomposition approaches
• In the past, synthesis focused on quality measures based on area and performance. The continuing decrease in feature size and increase in chip density in recent years have made decomposition for low power a new dimension of the design process.
• A range of decomposition techniques has been proposed for register-transfer-level optimization of circuits for low power. FSM decomposition techniques fall broadly into two categories: additive decomposition and multiplicative decomposition.

Dynamic power management
Systems and components are:
• designed to deliver peak performance, but …
• not in need of peak performance most of the time
Dynamic power management (DPM):
• shut down idle components
Dynamic voltage scaling (DVS):
• slow down components by scaling down frequency and voltage
(c) Giovanni De Micheli

Additive decomposition
• The main idea of additive decomposition is that a special “idle” state is added to each sub-FSM. Only one sub-FSM in the decomposed network is active at any time, while all the others are suspended (they stay in their “idle”, or sleeping, states).
• The network of interacting sub-FSMs corresponds to a given partition on the set of states of the prototype FSM.
• The number of blocks in the partition defines the number of sub-FSMs in the network.
• The number of states of each sub-FSM equals the number of states in the corresponding block of the partition plus one “idle” state.

Applet
Example FSM
As an example, we use the Mealy FSM presented in the table below, with
S = { s1, s2, …, s8 } - set of states,
X = { x1, x2, …, x8 } - set of binary input variables (channels),
Y = { y1, y2, …, y14 } - set of binary output variables (channels).

[Figure: FSM block with inputs x1, x2, …, x8 and outputs y1, y2, …, y14.]

Present  Next   Input        Output       h
state    state  condition    signals
s1       s1     x1           y7           1
s1       s3     ^x1          -            2
s2       s3     1            y10 y11      3
s3       s6     x7           y10 y11      4
s3       s8     ^x7 & x8     -            5
s3       s2     ^x7 & ^x8    y2 y5 y10    6
s4       s1     x1 & x3      y3 y4        7
s4       s3     x1 & ^x3     y1 y3 y4     8
s4       s7     ^x1 & x2     y3 y4        9
s4       s4     ^x1 & ^x2    y1           10
s5       s4     x4 & x6      y6 y13       11
s5       s5     x4 & ^x6     y6 y13       12
s5       s8     ^x4          y6 y8        13
s6       s2     x5           y10 y11      14
s6       s3     ^x5 & x7     y12          15
s6       s8     ^x5 & ^x7    y10 y11      16
s7       s5     1            y1           17
s8       s8     x6           -            18
s8       s5     ^x6          y9 y14       19

Here ^x denotes the negation of x, an input condition of 1 marks an unconditional transition, - means that no output signal is produced, and h is the transition number.

FSM description
The search for the next state means the evaluation of Boolean functions. It is necessary to determine which of these functions has the value “true” for a given input combination from {0, 1}^8.

[Figure: FSM block with inputs x1, x2, x3, …, x8 and outputs y1, y2, y3, y4, …, y14.]

Here the input conditions (Boolean functions) are presented using cubes. Each cube corresponds to a set of input patterns.

Present  Next   Input        Output      h
state    state  condition    signals
s4       s1     x1 & x3      y3 y4       7
s4       s3     x1 & ^x3     y1 y3 y4    8
s4       s7     ^x1 & x2     y3 y4       9
s4       s4     ^x1 & ^x2    y1          10

Present  Next   x1 x2 x3 x4 x5 x6 x7 x8   y1 y2 y3 y4 … y13 y14   h
state    state
s4       s1     1  -  1  -  -  -  -  -    00110000000000          7
s4       s3     1  -  0  -  -  -  -  -    10110000000000          8
s4       s7     0  1  -  -  -  -  -  -    00110000000000          9
s4       s4     0  0  -  -  -  -  -  -    10000000000000          10

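The cube matching described on this slide can be sketched in Python. This is a minimal illustration, not the applet's implementation: the names `S4_TRANSITIONS`, `cube_covers`, `step` and `active_outputs` are hypothetical, and the cubes are the four rows for present state s4 (h = 7 … 10) written as strings over x1 … x8.

```python
# Cube rows for present state s4 (transitions h = 7..10).
# Each cube covers inputs x1..x8; '-' marks a don't-care position.
S4_TRANSITIONS = [
    # (h, cube over x1..x8, next state, output vector y1..y14)
    (7, "1-1-----", "s1", "00110000000000"),
    (8, "1-0-----", "s3", "10110000000000"),
    (9, "01------", "s7", "00110000000000"),
    (10, "00------", "s4", "10000000000000"),
]


def cube_covers(cube: str, inputs: str) -> bool:
    """Return True if the input pattern lies inside the cube."""
    return len(cube) == len(inputs) and all(
        c == "-" or c == bit for c, bit in zip(cube, inputs)
    )


def step(transitions, inputs: str):
    """Find the single transition whose cube covers the input pattern."""
    matches = [t for t in transitions if cube_covers(t[1], inputs)]
    if len(matches) != 1:
        raise ValueError(f"expected exactly one matching cube, got {len(matches)}")
    h, _cube, next_state, outputs = matches[0]
    return h, next_state, outputs


def active_outputs(vector: str):
    """List the names of the outputs set to 1 in a y1..y14 vector."""
    return [f"y{i}" for i, bit in enumerate(vector, start=1) if bit == "1"]


# Example: x1 = 1, x3 = 0 selects transition h = 8 (s4 -> s3).
h, next_state, outputs = step(S4_TRANSITIONS, "10000000")
print(h, next_state, active_outputs(outputs))
```

Because the cubes of one present state are mutually exclusive and cover all input patterns, exactly one of them matches any input from {0, 1}^8; the sketch raises an error otherwise.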
Partition
A partition of a set S is a collection of nonempty, pairwise disjoint subsets of S whose union is S.
We can give a diagrammatic representation of partitions. If the set S is represented by an enclosed area on paper, we can draw lines to divide the area into nonoverlapping regions. Each region of the resulting diagram corresponds to a block of the partition.

[Figure: the set S drawn as an enclosed area divided into three regions B1, B2 and B3.]

Example:
S = { s1, s2, …, s8 }
π = { { s1, s4, s7 }, { s2, s3, s6 }, { s5, s8 } }
Distinct partitions of S induce distinct equivalence relations on S.

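The definition of a partition can be checked mechanically. The following Python sketch (the function names `is_partition` and `same_block` are illustrative, not taken from the lecture) tests the three conditions of the definition and shows the equivalence relation induced by the example partition.

```python
def is_partition(blocks, universe):
    """Check that blocks are nonempty, pairwise disjoint and exhaust the universe."""
    blocks = [set(b) for b in blocks]
    if any(not b for b in blocks):
        return False  # a block is empty
    union = set().union(*blocks)
    # If no element is shared, the block sizes add up to the size of the union.
    pairwise_disjoint = sum(len(b) for b in blocks) == len(union)
    return pairwise_disjoint and union == set(universe)


def same_block(blocks, a, b):
    """Equivalence relation induced by the partition: a ~ b iff they share a block."""
    return any(a in block and b in block for block in blocks)


S = {f"s{i}" for i in range(1, 9)}
PI = [{"s1", "s4", "s7"}, {"s2", "s3", "s6"}, {"s5", "s8"}]

print(is_partition(PI, S))          # the example partition is valid
print(same_block(PI, "s1", "s7"))   # s1 and s7 are both in B1
print(same_block(PI, "s1", "s5"))   # s1 is in B1, s5 is in B3
```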
Decomposition procedure
1) Generation of the initial partition
2) Definition of the states of the component FSMs
3) Definition of the set of external input variables of each component FSM
4) Generation of the set of internal (additional) input variables
5) Definition of the set of output variables of each component FSM
6) Generation of the transition and output functions of the component FSMs
7) Realization of the FSM network

Additive decomposition basics
Additive decomposition associates a network N of n component FSMs with a pair (A, π), where A is the given FSM and π is a partition on its set of states. The number of component FSMs equals the number of blocks in the partition π.
As an example, we use the partition
π = { { s1, s4, s7 }, { s2, s3, s6 }, { s5, s8 } }
In this case there are three component FSMs, B1, B2 and B3, since the partition π has three blocks (B1, B2 and B3).

Decomposition example
[Table: the transition table of the prototype FSM (transitions h = 1 … 19), shown together with the partition
{ { s1, s4, s7 }, { s2, s3, s6 }, { s5, s8 } }]

The set of states of a (component) sub-FSM
The initial partition in the example FSM decomposition is
π = { { s1, s4, s7 }, { s2, s3, s6 }, { s5, s8 } }
The set of states of the m-th component FSM is defined as
Sm = Bm ∪ { am }
where Bm is the m-th block of the partition π and am is the additional state that exists in each component FSM.
So, in our example:
S1 = { s1, s4, s7, a1 }
S2 = { s2, s3, s6, a2 }
S3 = { s5, s8, a3 }
That is, the set of states of a component FSM contains the corresponding block of the partition π plus one additional state a.

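As a small illustration of the rule Sm = Bm ∪ { am }, the Python sketch below derives the state sets of the component FSMs from the example partition. The names `PARTITION` and `component_states` are hypothetical.

```python
# Example partition; block m is PARTITION[m - 1].
PARTITION = [["s1", "s4", "s7"], ["s2", "s3", "s6"], ["s5", "s8"]]


def component_states(partition):
    """Return {m: states of the m-th component FSM}, i.e. block Bm plus idle state am."""
    return {m: list(block) + [f"a{m}"] for m, block in enumerate(partition, start=1)}


for m, states in component_states(PARTITION).items():
    print(f"S{m} =", states)
```

The number of entries equals the number of blocks, and each component FSM has one state more than its block.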
External input variables
The set of external input variables of the m-th component FSM is the set of input variables that appear on the transitions from the states of block Bm in the transition table of the prototype FSM A.

Present state   Input conditions                              h
s1              x1; ^x1                                       1-2
s2              1                                             3
s3              x7; ^x7 & x8; ^x7 & ^x8                       4-6
s4              x1 & x3; x1 & ^x3; ^x1 & x2; ^x1 & ^x2        7-10
s5              x4 & x6; x4 & ^x6; ^x4                        11-13
s6              x5; ^x5 & x7; ^x5 & ^x7                       14-16
s7              1                                             17
s8              x6; ^x6                                       18-19

In our example, with the partition
{ { s1, s4, s7 }, { s2, s3, s6 }, { s5, s8 } }
the corresponding sets of external input variables are
X(B1) = { x1, x2, x3 }
X(B2) = { x5, x7, x8 }
X(B3) = { x4, x6 }

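The sets X(Bm) can be derived directly from the prototype transition table. The sketch below is one possible way to do it in Python; `CONDITIONS`, `PARTITION` and `external_inputs` are illustrative names, and the input conditions are stored as the strings used in the table.

```python
import re

# (present state, input condition) for transitions h = 1..19 of the prototype FSM.
CONDITIONS = [
    ("s1", "x1"), ("s1", "^x1"),
    ("s2", "1"),
    ("s3", "x7"), ("s3", "^x7 & x8"), ("s3", "^x7 & ^x8"),
    ("s4", "x1 & x3"), ("s4", "x1 & ^x3"), ("s4", "^x1 & x2"), ("s4", "^x1 & ^x2"),
    ("s5", "x4 & x6"), ("s5", "x4 & ^x6"), ("s5", "^x4"),
    ("s6", "x5"), ("s6", "^x5 & x7"), ("s6", "^x5 & ^x7"),
    ("s7", "1"),
    ("s8", "x6"), ("s8", "^x6"),
]
PARTITION = [{"s1", "s4", "s7"}, {"s2", "s3", "s6"}, {"s5", "s8"}]


def external_inputs(block):
    """Collect the input variables used on transitions leaving the states of a block."""
    found = set()
    for present, condition in CONDITIONS:
        if present in block:
            found.update(re.findall(r"x\d+", condition))
    return sorted(found, key=lambda name: int(name[1:]))


X = {m: external_inputs(block) for m, block in enumerate(PARTITION, start=1)}
print(X)
```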
External output variables
The set of external output variables of the m-th component FSM is the set of output variables that appear on the transitions from the states of block Bm in the transition table of the prototype FSM A.

Present state   Output signals                        h
s1              y7; -                                 1-2
s2              y10 y11                               3
s3              y10 y11; -; y2 y5 y10                 4-6
s4              y3 y4; y1 y3 y4; y3 y4; y1            7-10
s5              y6 y13; y6 y13; y6 y8                 11-13
s6              y10 y11; y12; y10 y11                 14-16
s7              y1                                    17
s8              -; y9 y14                             18-19

In our example, with the partition
{ { s1, s4, s7 }, { s2, s3, s6 }, { s5, s8 } }
the corresponding sets of external output variables are
Y(B1) = { y1, y3, y4, y7 }
Y(B2) = { y2, y5, y10, y11, y12 }
Y(B3) = { y6, y8, y9, y13, y14 }

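The sets Y(Bm) follow from the output column of the prototype transition table in the same way. A minimal Python sketch, with the hypothetical names `OUTPUTS`, `PARTITION` and `external_outputs`; an empty string stands for a transition without output signals.

```python
# (present state, output signals) for transitions h = 1..19 of the prototype FSM.
OUTPUTS = [
    ("s1", "y7"), ("s1", ""),
    ("s2", "y10 y11"),
    ("s3", "y10 y11"), ("s3", ""), ("s3", "y2 y5 y10"),
    ("s4", "y3 y4"), ("s4", "y1 y3 y4"), ("s4", "y3 y4"), ("s4", "y1"),
    ("s5", "y6 y13"), ("s5", "y6 y13"), ("s5", "y6 y8"),
    ("s6", "y10 y11"), ("s6", "y12"), ("s6", "y10 y11"),
    ("s7", "y1"),
    ("s8", ""), ("s8", "y9 y14"),
]
PARTITION = [{"s1", "s4", "s7"}, {"s2", "s3", "s6"}, {"s5", "s8"}]


def external_outputs(block):
    """Collect the output variables produced on transitions leaving the states of a block."""
    found = set()
    for present, signals in OUTPUTS:
        if present in block:
            found.update(signals.split())
    return sorted(found, key=lambda name: int(name[1:]))


Y = {m: external_outputs(block) for m, block in enumerate(PARTITION, start=1)}
print(Y)
```

In this example the three sets are disjoint, so each external output is driven by exactly one component FSM.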
Generation of transitions of component FSMs (the first sub-FSM)
{ { s1, s4, s7 }, { s2, s3, s6 }, { s5, s8 } }

[Table: the transition table of the prototype FSM (h = 1 … 19), with the transitions that leave or enter block B1 marked by the internal variables z3, z4 and z5.]

Transition table of the first sub-FSM:

Present  Next   Input        Output
state    state  condition    signals
s1       s1     x1           y7
s1       a1     ^x1          z3
s4       s1     x1 & x3      y3 y4
s4       a1     x1 & ^x3     y1 y3 y4 z3
s4       s7     ^x1 & x2     y3 y4
s4       s4     ^x1 & ^x2    y1
s7       a1     1            y1 z5
a1       a1     ^z4          -
a1       s4     z4           -

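The generation of a component FSM's transitions can be sketched in Python as follows. This is an illustration under assumptions inferred from the example tables, not the applet's algorithm: a transition inside the block is copied, a transition leaving the block is redirected to the idle state am and additionally emits an internal variable zk named after the entered state sk, and the idle state waits for the zk signals that target the block. The names `PROTOTYPE`, `PARTITION` and `sub_fsm` are hypothetical.

```python
# Prototype FSM: (present, next, input condition, output signals), h = 1..19.
PROTOTYPE = [
    ("s1", "s1", "x1", "y7"),
    ("s1", "s3", "^x1", ""),
    ("s2", "s3", "1", "y10 y11"),
    ("s3", "s6", "x7", "y10 y11"),
    ("s3", "s8", "^x7 & x8", ""),
    ("s3", "s2", "^x7 & ^x8", "y2 y5 y10"),
    ("s4", "s1", "x1 & x3", "y3 y4"),
    ("s4", "s3", "x1 & ^x3", "y1 y3 y4"),
    ("s4", "s7", "^x1 & x2", "y3 y4"),
    ("s4", "s4", "^x1 & ^x2", "y1"),
    ("s5", "s4", "x4 & x6", "y6 y13"),
    ("s5", "s5", "x4 & ^x6", "y6 y13"),
    ("s5", "s8", "^x4", "y6 y8"),
    ("s6", "s2", "x5", "y10 y11"),
    ("s6", "s3", "^x5 & x7", "y12"),
    ("s6", "s8", "^x5 & ^x7", "y10 y11"),
    ("s7", "s5", "1", "y1"),
    ("s8", "s8", "x6", ""),
    ("s8", "s5", "^x6", "y9 y14"),
]
PARTITION = [["s1", "s4", "s7"], ["s2", "s3", "s6"], ["s5", "s8"]]


def sub_fsm(m):
    """Build the transition rows of the m-th component FSM (assumed zk naming)."""
    block = PARTITION[m - 1]
    idle = f"a{m}"
    rows = []
    entries = set()  # indices k of states sk in this block entered from other blocks
    for present, nxt, cond, outs in PROTOTYPE:
        k = nxt[1:]
        if present in block and nxt in block:
            rows.append((present, nxt, cond, outs))  # internal transition, copied
        elif present in block:
            # Leaving the block: go idle and signal the target sub-FSM with zk.
            rows.append((present, idle, cond, f"{outs} z{k}".strip()))
        elif nxt in block:
            entries.add(int(k))  # another sub-FSM wakes this one with zk
    wake = sorted(entries)
    # Idle state: stay idle while no wake-up signal is present.
    rows.append((idle, idle, " & ".join(f"^z{k}" for k in wake), ""))
    rows.extend((idle, f"s{k}", f"z{k}", "") for k in wake)
    return rows


for row in sub_fsm(1):
    print(row)
```

Calling `sub_fsm(2)` and `sub_fsm(3)` gives the rows of the other two component FSMs under the same assumptions.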
The first sub-FSM
[Table: the transition table of the first sub-FSM, with present states s1, s4, s7 and a1.]
[Figure: block B1 with inputs x1, x2, x3 and z4 and outputs y1, y3, y4, y7, z3 and z5.]

Transition table of the first sub-FSM
{ { s1, s4, s7 }, { s2, s3, s6 }, { s5, s8 } }
[Figure: block B1 with states s1, s4 and s7, and its transitions to and from state s3 in block B2 and state s5 in block B3.]
[Table: the transition table of the first sub-FSM, with present states s1, s4, s7 and a1.]

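Transformation of the transition from s7 to s5
Prototype_FSM: s7 → s5 on input condition 1, with output y1.
Sub-FSM_1: s7 → a1 on input condition 1, with outputs y1 and z5.
Sub-FSM_3: a3 → s5 on input condition z5, with no output (-).
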
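FSM Network
[Figure: network of the three component FSMs.
B1: inputs x1, x2, x3 and z4; outputs y1, y3, y4, y7, z3 and z5.
B2: inputs x5, x7, x8 and z3; outputs y2, y5, y10, y11, y12 and z8.
B3: inputs x4, x6, z5 and z8; outputs y6, y8, y9, y13, y14 and z4.]
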
Component FSM B2

Present  Next   Input        Output
state    state  condition    signals
s2       s3     1            y10 y11
s3       s6     x7           y10 y11
s3       a2     ^x7 & x8     z8
s3       s2     ^x7 & ^x8    y2 y5 y10
s6       s2     x5           y10 y11
s6       s3     ^x5 & x7     y12
s6       a2     ^x5 & ^x7    y10 y11 z8
a2       a2     ^z3          -
a2       s3     z3           -

Component FSM B3

Present  Next   Input        Output
state    state  condition    signals
s5       a3     x4 & x6      y6 y13 z4
s5       s5     x4 & ^x6     y6 y13
s5       s8     ^x4          y6 y8
s8       s8     x6           -
s8       s5     ^x6          y9 y14
a3       s8     z8           -
a3       s5     z5           -
a3       a3     ^z5 & ^z8    -

FSM Stochastic Analysis
• Given the FSM description and the input probabilities, the probabilistic behavior of an FSM can be studied by regarding its transition structure as a Markov chain.
• A Markov process is a stochastic process in which, given the present, the past has no influence on the future. In other words, the future behavior depends only on the current state of the process (the “Markov property”). A Markov process is called a Markov chain (MC) if its state space is discrete (either finite or countable).
• One example of an MC is a board game in which the player's next move is determined entirely by rolling a die. To make a move, one takes into account only the current state of the board; it does not matter how the game progressed to that state. In a card game, by contrast, a player's move is motivated not only by the cards he or she currently holds, but also by the cards that have already been played during the game.
• The steady-state probabilities obtained from such analysis make it possible to build various quantitative estimates of the FSM's stochastic behavior.

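As an illustration of this analysis, the Python sketch below computes steady-state probabilities for the eight-state example FSM by power iteration. It rests on a hypothetical assumption that is not part of the lecture: every input x1 … x8 is independent and equal to 1 with probability 0.5, so a condition such as ^x7 & x8 has probability 0.25. The names `P`, `step` and `steady_state` are illustrative.

```python
STATES = [f"s{i}" for i in range(1, 9)]

# P[src][dst]: probability of moving from src to dst in one clock cycle,
# derived from the example transition table under the assumed input
# probabilities of 0.5 (hypothetical).
P = {
    "s1": {"s1": 0.5, "s3": 0.5},
    "s2": {"s3": 1.0},
    "s3": {"s6": 0.5, "s8": 0.25, "s2": 0.25},
    "s4": {"s1": 0.25, "s3": 0.25, "s7": 0.25, "s4": 0.25},
    "s5": {"s4": 0.25, "s5": 0.25, "s8": 0.5},
    "s6": {"s2": 0.5, "s3": 0.25, "s8": 0.25},
    "s7": {"s5": 1.0},
    "s8": {"s8": 0.5, "s5": 0.5},
}


def step(dist):
    """Advance a state probability distribution by one transition of the chain."""
    nxt = dict.fromkeys(STATES, 0.0)
    for src, row in P.items():
        for dst, p in row.items():
            nxt[dst] += dist[src] * p
    return nxt


def steady_state(tol=1e-12, max_iter=100_000):
    """Iterate from the uniform distribution until it stops changing."""
    dist = dict.fromkeys(STATES, 1.0 / len(STATES))
    for _ in range(max_iter):
        nxt = step(dist)
        if max(abs(nxt[s] - dist[s]) for s in STATES) < tol:
            return nxt
        dist = nxt
    raise RuntimeError("power iteration did not converge")


STEADY = steady_state()
for state in STATES:
    print(state, round(STEADY[state], 4))
```

Power iteration is used here only because it needs no external libraries; it converges for this chain because every state can reach every other state and several states have self-loops.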
A Case Study: Low-Power Design
• To demonstrate the use of applets in conjunction with FPGA-based development boards, the procedure of computational kernel extraction and implementation will be considered in the lab.
• Sequential circuits may have an extremely large number of reachable states, but probabilistic analysis shows that during normal operation only a relatively small subset is actually visited. One power optimization paradigm is based on the concept of a computational kernel: a highly optimized logic block that mimics the steady-state behaviour of the original specification.

Probability distribution of the FSM
The first step of the computational kernel extraction procedure is probabilistic analysis of the FSM.

State     Steady-state probability
init0     0.5000001408
init1     0.3346775136
init2     0.0877016376
init4     0.0584677584
IOwait    0.0161290368
read0     0.0006720432
write0    0.0006720432
RMACK     0.0006720432
WMACK     0.0006720432
read1     0.0003360216

The FSM of the “opus” benchmark spends 83% of its operation time in the states “init0” and “init1”.

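The 83% figure is the sum of the two largest steady-state probabilities. The Python sketch below reproduces it and shows one simple way to pick kernel states: take the most probable states until a coverage threshold is reached. The threshold of 0.8 and the names `STEADY` and `kernel` are assumptions made for illustration; the lecture does not state how the kernel states were selected.

```python
# Steady-state probabilities from the table for the "opus" benchmark.
STEADY = {
    "init0": 0.5000001408,
    "init1": 0.3346775136,
    "init2": 0.0877016376,
    "init4": 0.0584677584,
    "IOwait": 0.0161290368,
    "read0": 0.0006720432,
    "write0": 0.0006720432,
    "RMACK": 0.0006720432,
    "WMACK": 0.0006720432,
    "read1": 0.0003360216,
}


def kernel(probabilities, coverage):
    """Pick the most probable states until their total probability reaches coverage."""
    chosen, total = [], 0.0
    ranked = sorted(probabilities.items(), key=lambda item: item[1], reverse=True)
    for state, p in ranked:
        if total >= coverage:
            break
        chosen.append(state)
        total += p
    return chosen, total


states, total = kernel(STEADY, coverage=0.8)  # 0.8 is a hypothetical threshold
print(states, f"{total:.1%}")
```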
Decomposed FSM network
• After the computational kernel is identified, it should be separated from the rest of the circuit.
• The additive decomposition applet is used to divide the original circuit into two alternately working sub-FSMs.

Implementation summary
• VHDL descriptions of the prototype FSM and of the decomposed network can be generated by the decomposition applet. These descriptions are used to implement and verify both designs on an FPGA-based development board.
• XPower Analyzer is a power consumption estimation tool included in Xilinx ISE. It is used to evaluate the quality of the decomposed design in comparison with the original.

Design       Area (LUTs)   Power consumption (mW)
Original     25            4.65
Decomposed   36            1.85

As the table shows, the dynamic power consumption has been reduced by a factor of 2.5, while the area overhead is 44%.
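The two figures quoted from the table can be recomputed with a few lines of Python. The dictionary layout and variable names are illustrative; the numbers are those of the table.

```python
DESIGNS = {
    "Original": {"luts": 25, "power_mw": 4.65},
    "Decomposed": {"luts": 36, "power_mw": 1.85},
}

original, decomposed = DESIGNS["Original"], DESIGNS["Decomposed"]

# Power reduction factor: original power divided by decomposed power.
power_factor = original["power_mw"] / decomposed["power_mw"]

# Area overhead: extra LUTs relative to the original design.
area_overhead = (decomposed["luts"] - original["luts"]) / original["luts"]

print(f"power reduced by a factor of {power_factor:.2f}")
print(f"area overhead {area_overhead:.0%}")
```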