Asynchronous Interface Specification, Analysis and Synthesis

M. Kishinevsky, Intel Corporation
J. Cortadella, Technical University of Catalonia
Steps in Design Flow
• Specification
• Synthesis
– Next-state functions
– State encoding
– Decomposition and technology mapping
• Timing optimization
• Verification
Signal Transition Graph (STG)
[Figure: timing diagram of the signals x, y, z and the corresponding STG over the events x+, x-, y+, y-, z+, z-.]
[Figure: state graph derived from the STG. States are labelled with the code xyz (000, 100, 101, 110, 111, 001, 011, 010) and arcs with the events x+, x-, y+, y-, z+, z-.]
Next-state functions
[Figure: the state graph over xyz, starting at 000, next to the functions derived from it and the resulting gate-level circuit with signals x, y, z.]
$x = \bar{z}\,(x + \bar{y})$
$y = z + x$
$z = x + \bar{y}\,z$
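The step from the state graph to the next-state functions can be checked mechanically. The sketch below is only an illustration: the transition table is an assumed reading of the state-graph figure, and the names `TRANSITIONS`, `next_state_value`, `candidate` and `check` are hypothetical. In every state, a signal's next value is its current value, flipped if one of its events is enabled; the three equations are then compared against that table.

```python
# State graph of the x, y, z example; each state is its code "xyz".
# ASSUMPTION: this transition table is read off the figure, not given as text.
TRANSITIONS = {
    "000": {"x+": "100"},
    "100": {"y+": "110", "z+": "101"},
    "110": {"z+": "111"},
    "101": {"y+": "111", "x-": "001"},
    "111": {"x-": "011"},
    "001": {"y+": "011"},
    "011": {"z-": "010"},
    "010": {"y-": "000"},
}
SIGNALS = "xyz"


def next_state_value(state, signal):
    """Value the signal is heading to: flipped if excited, unchanged otherwise."""
    current = int(state[SIGNALS.index(signal)])
    excited = any(event[0] == signal for event in TRANSITIONS[state])
    return 1 - current if excited else current


def candidate(state):
    """Evaluate the three next-state equations in the given state."""
    x, y, z = (int(bit) for bit in state)
    return {
        "x": (1 - z) & (x | (1 - y)),   # x = z'(x + y')
        "y": z | x,                     # y = z + x
        "z": x | ((1 - y) & z),         # z = x + y'z
    }


def check():
    """True if the equations agree with the state graph in every reachable state."""
    return all(
        candidate(state)[signal] == next_state_value(state, signal)
        for state in TRANSITIONS
        for signal in SIGNALS
    )


print(check())
```

All eight codes are reachable in this example, so the functions are completely specified; in larger designs the unreachable codes become don't cares.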
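VME bus
[Figure: block diagram of the VME bus controller. The controller exchanges DSr, DSw and DTACK with the bus, LDS and LDTACK with the device, and drives D to the data transceiver that sits between bus and device. A timing diagram of the read cycle shows DSr, LDS, LDTACK, D and DTACK.]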
STG for the READ cycle
[Figure: STG over the events DSr+, LDS+, LDTACK+, D+, DTACK+, DSr-, D-, DTACK-, LDS-, LDTACK-.]
Choice: Read and Write cycles
[Figure: STG with a choice between the read cycle (started by DSr+, ended by DSr-) and the write cycle (started by DSw+, ended by DSw-); both branches use LDS, LDTACK, D and DTACK.]
Speed independence
• Delay model:
– Unbounded gate delays
– Wire delays after a fork are less than gate delays
• Conditions for implementability:
– Consistent and Complete State Coding
– Determinism
– Output persistency
– Commutativity
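Several of these conditions are simple properties of the state graph and can be tested directly. The following is a minimal sketch, not the checks of any particular tool. It reuses the small x, y, z state graph as an assumed example, and all function names are hypothetical. Determinism holds by construction here, because each state maps an event to a single successor.

```python
SIGNALS = "xyz"
# ASSUMPTION: example state graph (codes are "xyz"), read off the earlier figure.
SG = {
    "000": {"x+": "100"},
    "100": {"y+": "110", "z+": "101"},
    "110": {"z+": "111"},
    "101": {"y+": "111", "x-": "001"},
    "111": {"x-": "011"},
    "001": {"y+": "011"},
    "011": {"z-": "010"},
    "010": {"y-": "000"},
}


def fire_code(state, event):
    """Code reached by firing `event`, or None if the event contradicts the code."""
    i = SIGNALS.index(event[0])
    old, new = ("0", "1") if event[1] == "+" else ("1", "0")
    if state[i] != old:
        return None
    return state[:i] + new + state[i + 1:]


def is_consistent(sg):
    """Every arc s+ / s- flips exactly the bit of s, in the right direction."""
    return all(fire_code(s, ev) == nxt for s, arcs in sg.items() for ev, nxt in arcs.items())


def is_output_persistent(sg, outputs):
    """An enabled output event is never disabled by firing another event."""
    for arcs in sg.values():
        for ev in arcs:
            if ev[0] not in outputs:
                continue
            for other, nxt in arcs.items():
                if other != ev and ev not in sg[nxt]:
                    return False
    return True


def is_commutative(sg):
    """Firing a then b leads to the same state as firing b then a."""
    for arcs in sg.values():
        for a, sa in arcs.items():
            for b, sb in arcs.items():
                if a != b and b in sg[sa] and a in sg[sb] and sg[sa][b] != sg[sb][a]:
                    return False
    return True


# A deliberately broken variant: after y+ the enabled event z+ disappears.
BAD = dict(SG)
BAD["110"] = {"x-": "010"}

print(is_consistent(SG), is_output_persistent(SG, "xyz"), is_commutative(SG))
print(is_output_persistent(BAD, "xyz"))
```

In the broken variant, z+ is enabled in state 100 but no longer enabled after y+ fires, which is exactly the situation that can produce a glitch on z.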
State Graph (Read cycle)
[Figure: state graph of the READ cycle, with arcs labelled DSr+, DSr-, LDS+, LDS-, LDTACK+, LDTACK-, D+, D-, DTACK+, DTACK-.]
Binary encoding of signals
[Figure: the same state graph with each state labelled by its binary code over (DSr, DTACK, LDTACK, LDS, D), for example 10000, 10010, 10110, 01110, 01100, 00110. The code 10110 appears on two different states.]
Excitation / Quiescent Regions
[Figure: the state graph partitioned, for signal LDS, into the excitation regions ER(LDS+) and ER(LDS-) and the quiescent regions QR(LDS+) and QR(LDS-).]
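Next-state function
[Figure: the four regions labelled with the current and next value of LDS: 0→1 in ER(LDS+), 1→1 in QR(LDS+), 1→0 in ER(LDS-), 0→0 in QR(LDS-).]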
Karnaugh map for LDS
Columns: DSr DTACK. Rows: LDTACK D. A "-" marks an unreachable code, "?" marks a conflict.

LDS = 0
        00   01   11   10
  00     0    0    -    1
  01     -    -    -    -
  11     -    -    -    -
  10     0    0    -    0

LDS = 1
        00   01   11   10
  00     -    -    -    1
  01     -    -    -    -
  11     -    1    1    1
  10     0    0    -    ?
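State encoding conflicts
[Figure: state graph fragment in which two different states carry the code 10110. In the state reached by LDTACK+, LDS stays at 1; in the other, LDS- is enabled. The next-state function of LDS is therefore undefined for this code.]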
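The conflict can be reproduced by playing the token game on the READ-cycle STG and tabulating the next value of LDS per binary code. This is a sketch under stated assumptions: the arc list and initial marking are an assumed reading of the STG figure, every arc is treated as a place of a safe marked graph, and the names `ARCS`, `MARKED`, `state_graph` and `next_values` are hypothetical.

```python
from collections import defaultdict

SIGNALS = ("DSr", "DTACK", "LDTACK", "LDS", "D")

# ASSUMPTION: arcs of the READ-cycle STG as read from the figure; each arc is a place.
ARCS = [
    ("DSr+", "LDS+"), ("LDS+", "LDTACK+"), ("LDTACK+", "D+"), ("D+", "DTACK+"),
    ("DTACK+", "DSr-"), ("DSr-", "D-"), ("D-", "DTACK-"), ("D-", "LDS-"),
    ("DTACK-", "DSr+"), ("LDS-", "LDTACK-"), ("LDTACK-", "LDS+"),
]
# ASSUMPTION: initial tokens for the state in which all signals are 0.
MARKED = {("DTACK-", "DSr+"), ("LDTACK-", "LDS+")}


def enabled(marking):
    """Events whose incoming arcs all carry a token."""
    events = {dst for _, dst in ARCS}
    return sorted(e for e in events if all(a in marking for a in ARCS if a[1] == e))


def fire(marking, code, event):
    """Move tokens across `event` and update the bit of its signal."""
    consumed = {a for a in ARCS if a[1] == event}
    produced = {a for a in ARCS if a[0] == event}
    i = SIGNALS.index(event[:-1])
    bit = "1" if event[-1] == "+" else "0"
    return frozenset((marking - consumed) | produced), code[:i] + bit + code[i + 1:]


def state_graph():
    """All reachable (marking, code) pairs."""
    start = (frozenset(MARKED), "0" * len(SIGNALS))
    seen, todo = {start}, [start]
    while todo:
        marking, code = todo.pop()
        for event in enabled(marking):
            nxt = fire(marking, code, event)
            if nxt not in seen:
                seen.add(nxt)
                todo.append(nxt)
    return seen


def next_values(signal):
    """Map each code to the set of next values of `signal` over states with that code."""
    i = SIGNALS.index(signal)
    table = defaultdict(set)
    for marking, code in state_graph():
        excited = any(e[:-1] == signal for e in enabled(marking))
        table[code].add(1 - int(code[i]) if excited else int(code[i]))
    return table


table = next_values("LDS")
conflicts = sorted(code for code, values in table.items() if len(values) > 1)
print(len(state_graph()), conflicts)
```

Codes with a single next value fill the 0 and 1 cells of the Karnaugh map, codes that never occur are the don't cares, and a code with both values is the cell marked "?".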
Concurrency reduction
[Figure: state graph fragment around the two states coded 10110, where DSr+ is concurrent with LDS-, followed by the STG of the READ cycle on which the concurrency is reduced.]
State encoding conflicts
[Figure: fragment with the events LDS+, LDTACK+, LDTACK-, LDS- and the two states coded 10110.]
Signal Insertion
[Figure: a new internal signal CSC is inserted, with events CSC+ and CSC-. The two conflicting states now carry the distinct codes 101101 and 101100.]
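Whether an inserted signal really removes the conflict can be checked by exploring the new STG and comparing, for every code, the output events enabled in the states that share it. The sketch below is only one possible model. It assumes that csc+ is placed after DSr+ and LDTACK- and before LDS+, and csc- after DSr- and before D-; this placement is an assumption chosen to match the codes 101101 and 101100 in the figure. The arc list, marking and function names are hypothetical.

```python
SIGNALS = ("DSr", "DTACK", "LDTACK", "LDS", "D", "csc")
OUTPUTS = {"DTACK", "LDS", "D", "csc"}   # csc is internal; it is synthesized like an output

# ASSUMPTION: READ-cycle STG with csc+ and csc- inserted as described in the lead-in.
ARCS = [
    ("DSr+", "csc+"), ("LDTACK-", "csc+"), ("csc+", "LDS+"), ("LDS+", "LDTACK+"),
    ("LDTACK+", "D+"), ("D+", "DTACK+"), ("DTACK+", "DSr-"), ("DSr-", "csc-"),
    ("csc-", "D-"), ("D-", "DTACK-"), ("D-", "LDS-"), ("DTACK-", "DSr+"),
    ("LDS-", "LDTACK-"),
]
MARKED = frozenset({("DTACK-", "DSr+"), ("LDTACK-", "csc+")})


def enabled(marking):
    """Events whose incoming arcs all carry a token."""
    return {dst for _, dst in ARCS if all(a in marking for a in ARCS if a[1] == dst)}


def fire(marking, code, event):
    """Move tokens across `event` and update the bit of its signal."""
    consumed = {a for a in ARCS if a[1] == event}
    produced = {a for a in ARCS if a[0] == event}
    i = SIGNALS.index(event[:-1])
    bit = "1" if event.endswith("+") else "0"
    return frozenset((marking - consumed) | produced), code[:i] + bit + code[i + 1:]


def csc_conflicts():
    """Return (reachable codes, codes whose states enable different output events)."""
    start = (MARKED, "0" * len(SIGNALS))
    seen, todo, by_code = {start}, [start], {}
    while todo:
        marking, code = todo.pop()
        events = enabled(marking)
        outs = frozenset(e for e in events if e[:-1] in OUTPUTS)
        by_code.setdefault(code, set()).add(outs)
        for event in events:
            nxt = fire(marking, code, event)
            if nxt not in seen:
                seen.add(nxt)
                todo.append(nxt)
    return sorted(by_code), sorted(c for c, outs in by_code.items() if len(outs) > 1)


codes, conflicts = csc_conflicts()
print(len(codes), conflicts)
```

Under these assumptions every reachable state gets its own code, so each non-input signal has a well-defined next-state function.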
Decomposition
• Hazards
• Global acknowledgement
• Generating candidates
• Hazard-free signal insertion
– Event insertion
– Signal insertion
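Hazards
[Figure: state graph fragment over the signals a, b, c, x (codes 1000, 1100, 0100, 0110; events b+, a-, c+) next to a gate-level circuit with inputs a, b, c, internal signal z and output x, annotated with the signal values that expose a hazard.]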
Global acknowledgement
[Figure: STG over the events a+, a-, b+, b-, c+, c-, d+, d-, y+, y-, z+, z- and a circuit in which z is driven from a, b, c and y from a, b, d.]
How about 2-input gates?
[Figure, five frames: the same STG with the gates for z (inputs a, b, c) and y (inputs a, b, d) redrawn with 2-input gates in different arrangements; one frame annotates values of 0 at c and z.]
Strategy for correct logic decomposition
• Each decomposition defines a new internal signal of the circuit
• Method: Insert new internal signals such that
– After resynthesis, some large gates are decomposed
– The new specification is SI-implementable (hazard-free under unbounded gate delays)
Decomposition
[Figure: decomposition flow. Candidate decompositions of a gate F around Sr latches, C elements or D latches are generated by Boolean relations and algebraic factorization. Each candidate is tested for hazard-freedom by signal insertion (YES or NO), and the loop repeats until no more progress.]
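Boolean decomposition
[Figure: a single gate F with inputs x1,…,xn and output f, and its decomposition into a block H with outputs h1,…,hm feeding a gate G.]
f = F(x1,…,xn)
f = G(H(x1,…,xn))
Our problem: Given F and G, find H

Example with G a C element with inputs h1, h2:

state   f   next(f)   (h1,h2)
s1      0   0         (0,-) (-,0)
s2      0   1         (1,1)
s3      1   0         (0,0)
s4      1   1         (-,1) (1,-)
dc      -   -         (-,-)

This is a Boolean Relation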
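The rows of the relation follow directly from the behaviour of the C element. As a small illustration, the sketch below enumerates, for each pair (f, next(f)), the input pairs (h1, h2) that make a C element move from f to next(f). The function names are hypothetical, and the standard C-element equation, output = h1·h2 + f·(h1 + h2), is used as the model of G.

```python
from itertools import product


def c_element(h1, h2, f):
    """Next output of a C element with inputs h1, h2 and current output f."""
    return (h1 & h2) | (f & (h1 | h2))


def relation():
    """For each (f, next_f) list every (h1, h2) that G = C element accepts."""
    rel = {}
    for f, next_f in product((0, 1), repeat=2):
        rel[(f, next_f)] = [
            (h1, h2)
            for h1, h2 in product((0, 1), repeat=2)
            if c_element(h1, h2, f) == next_f
        ]
    return rel


for key, pairs in relation().items():
    print(key, pairs)
```

States where f must rise or fall admit exactly one pair, while states where f is stable admit three. This freedom is what makes the problem a relation rather than a pair of incompletely specified functions.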
[Figure, four frames: the complex gate for y, F = acd + y(c + d), is redrawn with smaller gates around an Rs latch; each frame pairs the circuit with the STG over the signals a, c, d and y.]
Ad hoc solver for Boolean Relations
• Existing solvers [Somenzi, Watanabe] aim at minimizing PLA size
• Our approach:
– Targeted to 2-output functions
– Individual minimization of each function
– Branch-and-bound to eliminate incompatible solutions (heuristic pruning)
– Yields several solutions with similar cost
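Event insertion (Vanbekbergen’92)
[Figure: a new event x is inserted into a state graph with events a, b, c; the regions SR(x) and ER(x) are marked before and after the insertion.]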
Event insertion (Continued)
• Properties to preserve during insertion:
– trace equivalence
– speed-independence
    • output-persistency
    • commutativity
• Signal insertion = a few events insertion
[Figure: state graph diamonds over the events a and b illustrating output persistency and commutativity.]
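Event insertion: examples
[Figure: two insertions of an event x into a state graph with concurrent events a and b. In the first, a is not persistent after the insertion; in the second, a is persistent.]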
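Signal insertion for function F
Insertion by input borders
[Figure: the state graph is split into the regions F=0 and F=1; the events F+ and F- are inserted at the borders between them.]
State Graph
[Figure: example state graph over the signals w, x, y, z with 4-bit state codes and arcs labelled w+, w-, x+, x-, y+, y-, z+, z-.]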
[Figure, several frames: the example state graph partitioned into the regions yz=0 and yz=1, shown with candidate circuits built from C elements over w, x, y, z. One frame notes that z is delayed by the new signal. In the final frames the new signal s is inserted with events s+ and s-, splitting the state graph into the regions s=1 and s=0.]
Technology mapping
• BDD-based Boolean matching [Mailhot 93]
• Handles sequential gates and combinational feedbacks
• Merging small gates into larger gates introduces no new hazards
• No guarantee to find correct mapping (some gates cannot be decomposed)
Timing optimization (I)
• If exact timing bounds are unknown, use relative timing assumptions
• Timing assumptions always reduce the set of states
– DC-set is larger
– No new logic dependencies
– Fewer state conflicts
– Simpler logic
READ control in 2-input gates
[Figure: STG of the READ cycle and its implementation in 2-input gates, with the signals DSr, LDTACK, csc, map, LDS, D and DTACK.]
Adding timing assumptions (I)
Assumption: LDTACK- before DSr+, that is, Sep_max(LDTACK-, DSr+) < 0
[Figure, two frames: the STG of the READ cycle with the assumed ordering LDTACK- before DSr+. The first frame shows the circuit with csc and map; the second shows the simpler circuit over DSr, LDTACK, LDS, D and DTACK, valid under the TIMING CONSTRAINT Sep(LDTACK-, DSr+) < 0.]
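The effect of this assumption on the state space can be illustrated by exploring the READ-cycle STG with and without the ordering. This is a rough sketch, not the method of any tool. The timing assumption is modelled simply as an extra causal arc from LDTACK- to DSr+, the arc list and initial marking are an assumed reading of the STG figure, and the names `ARCS`, `MARKED`, `ORDER` and `explore` are hypothetical.

```python
SIGNALS = ("DSr", "DTACK", "LDTACK", "LDS", "D")
OUTPUTS = {"DTACK", "LDS", "D"}          # DSr and LDTACK are inputs

# ASSUMPTION: arcs of the READ-cycle STG as read from the figure; each arc is a place.
ARCS = [
    ("DSr+", "LDS+"), ("LDS+", "LDTACK+"), ("LDTACK+", "D+"), ("D+", "DTACK+"),
    ("DTACK+", "DSr-"), ("DSr-", "D-"), ("D-", "DTACK-"), ("D-", "LDS-"),
    ("DTACK-", "DSr+"), ("LDS-", "LDTACK-"), ("LDTACK-", "LDS+"),
]
MARKED = {("DTACK-", "DSr+"), ("LDTACK-", "LDS+")}


def explore(arcs, marked):
    """Return (number of reachable states, codes with a state-coding conflict)."""
    def enabled(marking):
        return {dst for _, dst in arcs if all(a in marking for a in arcs if a[1] == dst)}

    start = (frozenset(marked), "0" * len(SIGNALS))
    seen, todo, by_code = {start}, [start], {}
    while todo:
        marking, code = todo.pop()
        events = enabled(marking)
        outs = frozenset(e for e in events if e[:-1] in OUTPUTS)
        by_code.setdefault(code, set()).add(outs)
        for event in events:
            i = SIGNALS.index(event[:-1])
            bit = "1" if event[-1] == "+" else "0"
            moved = (marking - {a for a in arcs if a[1] == event}) | {a for a in arcs if a[0] == event}
            nxt = (frozenset(moved), code[:i] + bit + code[i + 1:])
            if nxt not in seen:
                seen.add(nxt)
                todo.append(nxt)
    return len(seen), sorted(c for c, outs in by_code.items() if len(outs) > 1)


# Sep(LDTACK-, DSr+) < 0 modelled as "LDTACK- always fires before the next DSr+".
ORDER = ("LDTACK-", "DSr+")
untimed = explore(ARCS, MARKED)
timed = explore(ARCS + [ORDER], MARKED | {ORDER})
print(untimed, timed)
```

Under these assumptions the ordering removes the states in which DSr+ arrives while LDS or LDTACK is still high, including one of the two states coded 10110, so the conflict disappears without an extra state signal. In a real flow the ordering is not enforced by the circuit; it must be guaranteed by the environment and checked as a timing constraint.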
Timing optimization (II)
• Lazy optimization:
– Idea: Increase concurrency for enabling, without increasing concurrency for firing
– Early enabling of a signal cannot produce new reachable states if some other enabled signal is faster
Adding timing assumptions (II)
Assumptions: Sep_max(LDTACK-, DSr+) < 0 and Sep_max(D-, LDS-) < 0
[Figure: the STG of the READ cycle with both assumed orderings and the resulting circuit over DSr, LDTACK, LDS, D and DTACK, valid under the TIMING CONSTRAINT Sep(LDTACK-, DSr+) < 0 and Sep(D-, LDS-) < 0.]
Summary
• Asynchronous design is applicable to
– asynchronous interfaces
– high-performance computing
– low-power design
– low-emission design
• Interest is growing among a few large companies: Intel, Philips, Sun, Sharp, ARM, HP, Cogency
Summary
• Asynchronous circuits are more difficult to design than synchronous ones
• Clock distribution and on-die variations make synchronous design more difficult
• CAD support is crucial
• CAD tools have matured
• Most steps of the design process covered by this tutorial are supported by the tool Petrify