A Reconfigurable Stochastic Architecture for Highly Reliable Computing Xin Li, Weikang Qian, Marc Riedel, Kia Bazargan & David Lilja Electrical & Computer Engineering University.

A Reconfigurable Stochastic Architecture
for Highly Reliable Computing
Xin Li, Weikang Qian, Marc Riedel, Kia Bazargan & David Lilja
Electrical & Computer Engineering
University of Minnesota
GLSVLSI, Boston – May 12, 2009
Opportunities & Challenges
Novel materials, devices, technologies:
• High density of bits/logic/interconnects.
Challenges for logic synthesis:
• Topological constraints.
• Inherent structural randomness.
• High defect rates.
[Figure: crossbar of M wires by N wires]
Opportunities & Challenges
Strategy:
• Cast synthesis in terms of arithmetic operations on real values.
• Synthesize circuits that compute logical values with probability corresponding to the real-valued inputs and outputs.
[Figure: crossbar of M wires by N wires]
Probabilistic Signals
[Figure labels: deterministic, deterministic, random]
Claude E. Shannon (1916–2001), “A Mathematical Theory of Communication,” Bell System Technical Journal, 1948.
Probabilistic Analysis
• Circuit Reliability
– Probabilistic fault models.
– Random test pattern generation.
• Statistical Timing and Power
(circuit level).
• Statistical Performance Measures
(architectural level).
Probabilistic Analysis
“There are known knowns; and there are unknown unknowns; but today I’ll speak of the known unknowns.”
– Donald Rumsfeld, 2004
[Figure: independent probabilistic inputs → digital circuit → probabilistic outputs; labels: Unknown, Known; caption: Probabilistic Analysis]
Synthesis of Probabilistic Circuits
“There are known knowns; and there are unknown unknowns; but today I’ll speak of the known unknowns.”
– Donald Rumsfeld, 2004
[Figure: independent probabilistic inputs → digital circuit → probabilistic outputs; labels: Specified, Unknown, Unknown, Known (for us to design)]
Synthesis of Probabilistic Logic
• Shannon and von Neumann:
– “Probabilistic Logic,”
– “Reliable Circuits Using Less Reliable Relays”.
• K. Nepal, R. Bahar, J. Mundy, W. Patterson, and A.
Zaslavsky, “Designing Logic Circuits for Probabilistic
Computation in the Presence of Noise.”
• L. Chakrapani, P. Korkmaz, B. Akgul, and K. Palem,
“Probabilistic System-on-a-chip Architecture.”
Stochastic Logic
Probability values are the input and output signals.
[Figure: combinational circuit with input 0.7 and outputs 0.616 and 0.468]
Stochastic Logic
Probability values are the input and output signals.
[Figure: combinational circuit with input $t$ and outputs $0.6t + 0.4t^2$ and $0.8t - 0.8t^2 + 0.3$]
Functions of a probability value t.
Stochastic Logic
[Figure: two gate-level circuits, each with inputs X, Y, Z; one output has probability $0.6t + 0.4t^2$, the other $0.8t - 0.8t^2 + 0.3$]
Pr(X = 1) = Pr(Z = 1) = t
Pr(Y = 1) = 0.3
(independently)
Stochastic Bit Streams
X: 0,1,0,1,0 (x = 2/5)
A real value x in [0, 1] is encoded as a stream of bits X.
For each bit, the probability that it is one is: P(X=1) = x.
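The encoding can be sketched in a few lines of Python. This is only an illustration: the helper names `encode` and `decode`, the stream length, and the seed are assumptions, not part of the talk.

```python
import random

def encode(x, length, rng):
    """Return `length` bits, each equal to 1 with probability x."""
    if not 0.0 <= x <= 1.0:
        raise ValueError("x must lie in [0, 1]")
    return [1 if rng.random() < x else 0 for _ in range(length)]

def decode(bits):
    """Estimate the encoded value as the fraction of ones in the stream."""
    return sum(bits) / len(bits)

rng = random.Random(2009)            # arbitrary fixed seed, for repeatability
stream = encode(0.4, 10_000, rng)    # hypothetical stream length
estimate = decode(stream)            # close to 0.4, not exactly 0.4

slide_value = decode([0, 1, 0, 1, 0])  # the five-bit stream X from the slide
print(slide_value, estimate)
```

The decoded value of a random stream is only an estimate of x; its accuracy improves as the stream gets longer.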
Probabilistic Bundles
X: bundle of wires carrying 0, 1, 0, 0, 1 (x = 2/5)
A real value x in [0, 1] is encoded as a stream of bits X.
For each bit, the probability that it is one is: P(X=1) = x.
Stochastic Logic
Probability values are the input and output signals.
[Figure: combinational circuit with signal values 4/8, 3/8, 4/8, 8/8, 5/8, 3/8]
Stochastic Logic
Probability values are the input and output signals.
[Figure: combinational circuit with serial bit streams]
0,1,1,0,1,0,1,0,…
0,1,1,0,1,0,0,0,…
1,0,1,0,1,0,1,0,…
1,1,1,1,1,1,1,1,…
1,1,0,1,0,1,1,0,…
1,0,0,0,1,1,0,0,…
Stochastic Logic
Probability values are the input and output signals.
[Figure: combinational circuit with parallel bit streams; signal values 4/8, 5/8, 3/8, 4/8, 3/8, 8/8]
Randomness
Analog interface with fractional weighting of 1’s.
[Figure: A/D blocks around a combinational circuit; parallel bit streams]
Randomness
Analog interface with fractional weighting of 1’s.
[Figure: LFSR and Accumulator blocks around a combinational circuit; parallel bit streams]
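One common way to build such an interface, sketched here as an assumption rather than as the circuit on the slide, is to compare the state of a linear feedback shift register (LFSR) against a constant to produce the bit stream, and to count the ones at the output with an accumulator. The 8-bit register width, the tap positions (8, 6, 5, 4), and the function names are illustrative choices.

```python
def lfsr8_states(seed=0x01):
    """Yield successive states of an 8-bit Fibonacci LFSR with taps 8, 6, 5, 4."""
    state = seed & 0xFF
    if state == 0:
        raise ValueError("seed must be nonzero")
    while True:
        yield state
        # XOR of the tapped bits becomes the new most significant bit.
        bit = (state ^ (state >> 2) ^ (state >> 3) ^ (state >> 4)) & 1
        state = (state >> 1) | (bit << 7)

def stochastic_stream(level, cycles, seed=0x01):
    """Randomizer sketch: emit 1 whenever the LFSR state is <= level (0..255)."""
    gen = lfsr8_states(seed)
    return [1 if next(gen) <= level else 0 for _ in range(cycles)]

def accumulate(bits):
    """De-randomizer sketch: count the ones, as an accumulator would."""
    return sum(bits)

PERIOD = 255                                         # nonzero states of the register
bits = stochastic_stream(level=102, cycles=PERIOD)   # target value 102/255
ones = accumulate(bits)
print(ones, ones / PERIOD)
```

Over one full period the register visits every nonzero state once, so the number of ones equals `level`. Streams that feed the same circuit would need differently seeded or differently tapped registers to stay approximately independent.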
Nanowire Crossbar (idealized)
[Figure: crossbar of M wires by N wires, with VDD and signal A]
Randomized connections, yet nearly one-to-one.
Fault Tolerance
Conventional approach: binary radix encoding.
[Figure: values 0.111 (7/8), 0.001 (1/8), 0.010 (2/8)]
Fault Tolerance
Conventional approach: binary radix encoding.
[Figure: values after bit flips: 0.111 (7/8), 0.101 (5/8), 0.110 (6/8)]
Bit flips can result in large error.
Fault Tolerance
Stochastic Logic
[Figure: AND gate with streams 0111111… (7/8), 01000000… (1/8), 1100000… (2/8)]
• Highly redundant.
• Complex operations can be performed with simple logic.
Fault Tolerance
Stochastic Logic
[Figure: AND gate with streams after bit flips: 0111111… (7/8), 01000100… (2/8), 1100100… (3/8)]
Bit flips never result in large errors.
• Highly redundant.
• Complex operations can be performed with simple logic.
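The contrast between the two encodings can be checked with a small calculation. The sketch below is illustrative only: it measures the largest change in value that a single bit flip can cause in a 3-bit binary fraction and in an 8-bit stochastic stream, both representing 2/8. All function names are assumptions.

```python
def radix_value(bits):
    """Value of the binary fraction 0.b1 b2 ... bk, given as a list of bits."""
    return sum(b * 2.0 ** -(i + 1) for i, b in enumerate(bits))

def stream_value(bits):
    """Value of a stochastic stream: the fraction of ones."""
    return sum(bits) / len(bits)

def worst_single_flip_error(bits, value_of):
    """Largest change in value over all possible single-bit flips."""
    base = value_of(bits)
    worst = 0.0
    for i in range(len(bits)):
        flipped = bits[:i] + [1 - bits[i]] + bits[i + 1:]
        worst = max(worst, abs(value_of(flipped) - base))
    return worst

radix_word = [0, 1, 0]                 # 0.010 = 2/8
stream = [1, 1, 0, 0, 0, 0, 0, 0]      # two ones in eight bits = 2/8

radix_err = worst_single_flip_error(radix_word, radix_value)
stream_err = worst_single_flip_error(stream, stream_value)
print(radix_err, stream_err)
```

Flipping the most significant bit of the radix word moves the value by 4/8 (0.010 becomes 0.110), whereas any single flip in the stream moves it by 1/8, and by less for longer streams.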
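Arithmetic Operations
Multiplication (AND gate with inputs A, B and output C):
$$c = P(C) = P(A)\,P(B) = a\,b$$
(Scaled) Addition (MUX with data input A at 1, data input B at 0, select input S, and output C):
$$c = P(C) = P(S)\,P(A) + [1 - P(S)]\,P(B) = s\,a + (1 - s)\,b$$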
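Both identities can be checked by simulation on independent random bit streams. This is a minimal sketch; the probabilities, stream length, seed, and helper names are all assumptions chosen for illustration.

```python
import random

def bitstream(p, length, rng):
    """Independent bits, each equal to 1 with probability p."""
    return [1 if rng.random() < p else 0 for _ in range(length)]

def value(bits):
    """Fraction of ones in the stream."""
    return sum(bits) / len(bits)

def and_gate(a_bits, b_bits):
    """Bitwise AND: multiplies the probabilities of independent streams."""
    return [a & b for a, b in zip(a_bits, b_bits)]

def mux(s_bits, a_bits, b_bits):
    """Select A when S = 1 and B when S = 0: scaled addition."""
    return [a if s else b for s, a, b in zip(s_bits, a_bits, b_bits)]

rng = random.Random(12)        # arbitrary fixed seed
N = 100_000                    # hypothetical stream length
a, b, s = 0.7, 0.3, 0.25       # illustrative input probabilities

A = bitstream(a, N, rng)
B = bitstream(b, N, rng)
S = bitstream(s, N, rng)

product = value(and_gate(A, B))      # estimates a * b
scaled_sum = value(mux(S, A, B))     # estimates s * a + (1 - s) * b
print(product, a * b)
print(scaled_sum, s * a + (1 - s) * b)
```

The estimates match the formulas only when the input streams are independent; feeding the same stream to both AND inputs would return a rather than a squared.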
Synthesizing Stochastic Logic
[Figure: combinational circuit with input t and output g(t)]
Only polynomials…
Questions:
• What kinds of functions can be implemented in the probabilistic domain?
• How can we synthesize the logic to implement these?
Synthesizing Polynomials
[Figure: combinational circuit with input t and output g(t)]
Only polynomials…
• Implement polynomials using AND (multiplication) and MUX (scaled addition).
• Must consider polynomials with coefficients less than 0 or larger than 1…
A little math…
Bernstein basis polynomial of degree n:
$$B_i^n(t) = \binom{n}{i}\, t^i (1 - t)^{n-i}, \quad i = 0, 1, \ldots, n$$
A little math…
Bernstein basis polynomial of degree n:
$$B_i^n(t) = \binom{n}{i}\, t^i (1 - t)^{n-i}, \quad i = 0, 1, \ldots, n$$
Bernstein polynomial of degree n:
$$B^n(t) = \sum_{i=0}^{n} b_i^n B_i^n(t)$$
$b_i^n$ is a Bernstein coefficient.
A little math…
Obtain Bernstein coefficients from power-form coefficients:
Given $g(t) = \sum_{i=0}^{n} a_i^n t^i = \sum_{i=0}^{n} b_i^n B_i^n(t)$, we have
$$b_i^n = \sum_{j=0}^{i} \frac{\binom{i}{j}}{\binom{n}{j}}\, a_j^n, \quad i = 0, 1, \ldots, n$$
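The conversion formula translates directly into code. The sketch below uses exact rational arithmetic and applies it to the cubic $3t - 8t^2 + 6t^3$; the function names are assumptions made for illustration.

```python
from fractions import Fraction
from math import comb

def power_to_bernstein(a):
    """Convert power-form coefficients a[0..n] to Bernstein coefficients b[0..n]."""
    n = len(a) - 1
    return [
        sum(Fraction(comb(i, j), comb(n, j)) * a[j] for j in range(i + 1))
        for i in range(n + 1)
    ]

def eval_power(a, t):
    """Evaluate sum of a[i] * t**i."""
    return sum(c * t ** i for i, c in enumerate(a))

def eval_bernstein(b, t):
    """Evaluate sum of b[i] * C(n, i) * t**i * (1 - t)**(n - i)."""
    n = len(b) - 1
    return sum(c * comb(n, i) * t ** i * (1 - t) ** (n - i)
               for i, c in enumerate(b))

a = [0, 3, -8, 6]                 # g(t) = 3t - 8t^2 + 6t^3
b = power_to_bernstein(a)
print(b)

# Both forms should agree at any point; checked here at t = 1/3.
t = Fraction(1, 3)
print(eval_power(a, t), eval_bernstein(b, t))
```

For this cubic the coefficient of $B_2^3$ comes out negative, so the degree-3 Bernstein form cannot be used directly as a set of probabilities.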
Example: Converting a Polynomial
Power-form polynomial:
$$g(t) = 3t - 8t^2 + 6t^3$$
Bernstein polynomial:
$$\begin{aligned}
g(t) &= B_1^3(t) - \tfrac{2}{3} B_2^3(t) + B_3^3(t) \\
     &= \tfrac{3}{4} B_1^4(t) + \tfrac{1}{6} B_2^4(t) - \tfrac{1}{4} B_3^4(t) + B_4^4(t) \\
     &= \tfrac{3}{5} B_1^5(t) + \tfrac{2}{5} B_2^5(t) + B_5^5(t)
\end{aligned}$$
At degree 5, all coefficients are in the unit interval.
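The step from one degree to the next follows the standard degree-elevation identity for Bernstein polynomials, $b_i^{n+1} = \frac{i}{n+1} b_{i-1}^n + \left(1 - \frac{i}{n+1}\right) b_i^n$. The slides do not state this identity, so the sketch below is an assumption about how the elevation would be carried out; the function names and the `max_degree` cutoff are hypothetical.

```python
from fractions import Fraction

def elevate(b):
    """Raise a degree-n Bernstein coefficient list to degree n + 1."""
    n = len(b) - 1
    out = [b[0]]
    for i in range(1, n + 1):
        w = Fraction(i, n + 1)
        out.append(w * b[i - 1] + (1 - w) * b[i])
    out.append(b[n])
    return out

def elevate_into_unit_interval(b, max_degree=50):
    """Elevate repeatedly until every coefficient lies in [0, 1]."""
    b = list(b)
    while not all(0 <= c <= 1 for c in b):
        if len(b) - 1 >= max_degree:
            raise ValueError("coefficients still outside [0, 1] at max_degree")
        b = elevate(b)
    return b

b3 = [Fraction(0), Fraction(1), Fraction(-2, 3), Fraction(1)]  # degree-3 form of g(t)
b4 = elevate(b3)
b5 = elevate_into_unit_interval(b3)
print(b4)
print(b5)
```

The cutoff exists because elevation is not guaranteed to succeed for every polynomial, for example one that leaves the unit interval somewhere on [0, 1].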
Synthesizing Polynomials
[Figure: combinational circuit with input t and output g(t)]
Synthesis steps:
1. Convert the polynomial into a Bernstein form.
2. Elevate it until all coefficients are in the unit interval.
3. Implement this with “generalized multiplexing”.
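Probabilistic Multiplexing
[Figure: MUX with data inputs A and B, select input T, and output C]
$$c = P(C) = P(T)\,P(A) + [1 - P(T)]\,P(B) = t\,a + (1 - t)\,b$$
This is a Bernstein polynomial of degree 1 in t, with coefficients b and a.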
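Probabilistic Multiplexing
X1, …, Xn are independent Boolean random variables with Pr(Xi = 1) = t, for 1 ≤ i ≤ n.
Z0, …, Zn are independent Boolean random variables with Pr(Zi = 1) = $b_i^n$, for 0 ≤ i ≤ n.
The output Y is Zk, where k = X1 + … + Xn is the number of ones among the Xi; the sum drives the select input of a multiplexer. Since Pr(k = i) = $B_i^n(t)$,
$$\Pr(Y = 1) = \sum_{i=0}^{n} b_i^n B_i^n(t)$$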
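A quick simulation can confirm the identity. The sketch below counts the ones among n random inputs and uses the count to select one of the coefficient streams; the coefficient values, the seed, the cycle count, and the function names are illustrative assumptions.

```python
import random
from math import comb

def resc_output_bit(t, coeffs, rng):
    """One clock cycle: the count of ones among X1..Xn selects one of Z0..Zn."""
    n = len(coeffs) - 1
    k = sum(1 for _ in range(n) if rng.random() < t)       # X1 + ... + Xn
    z = [1 if rng.random() < c else 0 for c in coeffs]     # Z0 .. Zn
    return z[k]

def estimate(t, coeffs, cycles, rng):
    """Fraction of ones at the output Y over many cycles."""
    return sum(resc_output_bit(t, coeffs, rng) for _ in range(cycles)) / cycles

def bernstein_value(t, coeffs):
    """Exact value of the Bernstein polynomial with the given coefficients."""
    n = len(coeffs) - 1
    return sum(c * comb(n, i) * t ** i * (1 - t) ** (n - i)
               for i, c in enumerate(coeffs))

coeffs = [2 / 8, 5 / 8, 3 / 8, 6 / 8]   # illustrative coefficients in [0, 1]
rng = random.Random(7)                  # arbitrary fixed seed
est = estimate(0.5, coeffs, 50_000, rng)
exact = bernstein_value(0.5, coeffs)
print(est, exact)
```

Changing only the list `coeffs` changes the function being computed, while the selection logic stays the same.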
A Reconfigurable Architecture
Implement different functions by setting the coefficients:
$$\Pr(Y = 1) = \sum_{i=0}^{n} b_i^n B_i^n(t)$$
Example
Implement
$$f(t) = \frac{1}{4} + \frac{9}{8}\,t - \frac{15}{8}\,t^2 + \frac{5}{4}\,t^3$$
Example
Convert to
$$f(t) = \frac{2}{8} B_0^3(t) + \frac{5}{8} B_1^3(t) + \frac{3}{8} B_2^3(t) + \frac{6}{8} B_3^3(t)$$
Example
$$f(t) = \frac{2}{8} B_0^3(t) + \frac{5}{8} B_1^3(t) + \frac{3}{8} B_2^3(t) + \frac{6}{8} B_3^3(t)$$
Adder inputs:
x1: 0,0,0,1,1,0,1,1 (4/8)
x2: 0,1,1,1,0,0,1,0 (4/8)
x3: 1,1,0,1,1,0,0,0 (4/8)
x1 + x2 + x3: 1,2,1,3,2,0,2,1 (MUX select)
MUX data inputs:
z0: 0,0,0,1,0,1,0,0 (2/8)
z1: 0,1,0,1,0,1,1,1 (5/8)
z2: 0,1,1,0,1,0,0,0 (3/8)
z3: 1,1,1,0,1,1,0,1 (6/8)
MUX output:
y: 0,1,0,0,1,1,0,1 (4/8)
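The bit streams of this example can be replayed cycle by cycle. The sketch below adds the three x streams to form the select signal and picks the corresponding z bit in each cycle; the variable names are assumptions.

```python
# Input streams x1, x2, x3 from the example (each has value 4/8).
x = [
    [0, 0, 0, 1, 1, 0, 1, 1],
    [0, 1, 1, 1, 0, 0, 1, 0],
    [1, 1, 0, 1, 1, 0, 0, 0],
]

# Coefficient streams z0..z3 from the example (2/8, 5/8, 3/8, 6/8).
z = [
    [0, 0, 0, 1, 0, 1, 0, 0],
    [0, 1, 0, 1, 0, 1, 1, 1],
    [0, 1, 1, 0, 1, 0, 0, 0],
    [1, 1, 1, 0, 1, 1, 0, 1],
]

# Adder: number of ones among x1, x2, x3 in each cycle.
select = [sum(column) for column in zip(*x)]

# Multiplexer: in cycle i, output the bit of the stream chosen by select[i].
y = [z[k][i] for i, k in enumerate(select)]

print(select)
print(y, sum(y) / len(y))
```

With only eight cycles, the output value 4/8 happens to equal f(0.5) exactly; for other stream contents it would only approximate it.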
Non-Polynomial Functions
Find a Bernstein polynomial to approximate the function:
$$B^n(t) = \sum_{i=0}^{n} b_i^n B_i^n(t)$$
with $0 \le b_i^n \le 1$, such that
$$\int_0^1 \left( f(t) - B^n(t) \right)^2 dt$$
is minimized.
Non-Polynomial Functions
Example: Gamma correction function.
$f(t) = t^{0.45}$
Degree 6 Bernstein coefficients are:
b0 = 0.0955, b1 = 0.7207, b2 = 0.3476, b3 = 0.9988,
b4 = 0.7017, b5 = 0.9695, b6 = 0.9939
Deterministic vs. Stochastic Implementation of Gamma Correction Function with 10% Noise Injection
[Figure: panels labeled 1%, 2%, 10% for the Conventional Implementation and the Stochastic Implementation]
• Deterministic implementation: 37% of pixels with errors > 20%.
• Stochastic implementation: no pixels with errors > 20%!
Comparison with Conventional Hardware
Implementation of Image Processing Functions
Number of LUTs in FPGA mapping
* The entire ReSC architecture, including Randomizers and De-Randomizers.
** The ReSC Unit by itself.
Comparison with Conventional Software
Implementation of Image Processing Functions
Speedup (1024 cycles needed)
* Software using math functions from ‘math.h’
** Software using direct function table lookup
Comparison of Fault Tolerance for
Image Processing Functions
Noise is injected in the form of a percentage of bit flips.
Percentage of Output Pixels with Errors Greater than 25%
The stochastic implementation never produces such errors!
Comparison of Fault Tolerance for
Mathematical Functions
Sixth-order Maclaurin polynomial approx., 10 bits:
sin(x), cos(x), tan(x), arcsin(x), arctan(x), sinh(x),
cosh(x), tanh(x), arcsinh(x), exp(x), ln(x+1)
[Plot: relative error (axis 0 to 60) versus error ratio of input data (0 to 0.1), for the Stochastic and Deterministic implementations]
Conclusions
• The hardware cost is comparable.
• Stochastic computation is much more error tolerant.
• The advantage is dramatic for applications where large errors are critical but small fluctuations can be tolerated.
• (Also some pretty interesting math…)
Future Directions
• Apply the method at the processor level.
• Apply the method at the circuit level (e.g., with PCMOS).
Synthetic Biology
[Figure: quantities of different types → biological process [computational] → probability distribution on outcomes]
Synthetic Biology
[Figure: biological process with input quantities X and Y (label: fixed) producing outcome Z with Pr = X / (X + Y)]