Transcript (.ppt)

Accuracy-Configurable Adder for
Approximate Arithmetic Designs
Andrew B. Kahng, Seokhyeong Kang
VLSI CAD LABORATORY, UC San Diego
49th Design Automation Conference
June 6th, 2012
-1-
Outline
Background and Motivation
Accuracy Configurable Adder Design
Experimental Setup and Results
Conclusions and Ongoing Works

-2-
Why Approximate Designs?

Threats to traditional IC design approach ...
Extreme variations / Reliability issues / Cost
  Variations: PVT variation uncertainty leads to design overhead
  Reliability issues: Hard errors (NBTI, latchup), Soft errors (α-particle)
  Cost: Cost (power/performance) of perfect accuracy is too high!
Approximate designs
  Relaxing the requirement of correctness can dramatically reduce costs of the design
What is the square root of 10?
  “a little more than three” vs. “3.162278....”
Approximation could be faster and more powerful
-3-
Previous Approximate Adders
Lu et al. IEEE Computer 2004
  Faster adder w/ shorter carry chain
  High performance with small error rate
  Large area overhead: not applicable for low energy design
Zhu et al. TVLSI 2010
  ETAI : accurate part + inaccurate part
  Reduce error size
  Error rate is high
Output accuracy is fixed → benefits can be limited by required accuracy
-4-
Our Work: Accuracy-Configurable Approximate Adder
How power benefits can be achieved …
[Figure: normalized power and required accuracy vs. time for the accurate design (1.0) and the accuracy-configurable design; labels: accurate mode, approximate mode, event occurred; required accuracy levels 80%, 90%, 100%.]
Accuracy-configurable design adapts to changing requirements by using different modes in each situation
-5-
Our Work: Accuracy-Configurable Approximate Adder
Accuracy-configurable approximate adder
[Figure: approximate adder followed by two error correction blocks (ECC-1, ECC-2).]
  Mode 1: turn off ECC-1, ECC-2 (accuracy: 90%)
  Mode 2: turn off ECC-2 (accuracy: 95%)
  Mode 3: turn on all ECC (accuracy: 100%)
-6-
Outline
Background and Motivation
Accuracy Configurable Adder Design
Experimental Setup and Results
Conclusions and Ongoing Works

-7-
Approximate Adder Implementation
16-bit adder case
AH=A[15:8], AM=A[11:4], AL=A[7:0]
[Figure: A[15:0] and B[15:0] feed three 8-bit sub-adders: AH+BH (SUMH), AM+BM (SUMM), AL+BL (SUML); the carry between sub-adders is cut. Output labels: SUM[16], SUM[15:12], SUM[11:8], SUM[7:4], SUM[3:0].]
Carry chain is cut to reduce critical path delay
Sub-adders generate results of partial summation
Middle sub-adder improves accuracy (error 50% → 5.5%)
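
The sub-adder arrangement on this slide can be sketched in software. The sketch below is a behavioral model only, not the authors' RTL. It assumes that the lowest sub-adder supplies all of its 2k sum bits, that every higher sub-adder supplies only its upper k sum bits, and that the carry-out of the top sub-adder becomes SUM[n]. The function name `aca_add` and its parameters are made up for illustration.

```python
def aca_add(a: int, b: int, n: int = 16, k: int = 4) -> int:
    """Approximate n-bit add using overlapping 2k-bit sub-adders (carry-in 0)."""
    assert n % k == 0 and n >= 2 * k, "sketch assumes k divides n"
    window = 2 * k
    wmask = (1 << window) - 1
    kmask = (1 << k) - 1
    last = n - window
    total = 0
    for lo in range(0, last + 1, k):
        # Each sub-adder sees only its own 2k-bit slice; no carry comes in from below.
        part = ((a >> lo) & wmask) + ((b >> lo) & wmask)
        if lo == 0:
            total |= part & wmask                       # lowest sub-adder: all 2k sum bits
        else:
            total |= ((part >> k) & kmask) << (lo + k)  # others: upper k sum bits only
        if lo == last:
            total |= (part >> window) << n              # top carry-out becomes SUM[n]
    return total


print(hex(aca_add(0x1234, 0x1111)))  # 0x2345, no long carry chain, so the result is exact
print(hex(aca_add(0x00FF, 0x0001)))  # 0x0, the exact sum is 0x100 (carry chain longer than k)
```

With n=16 and k=4 the three slices are A[7:0], A[11:4] and A[15:8], matching AL, AM and AH above.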
-8-
Approximate Adder Implementation
N-bit adder case
N: bit width, k: ½ carry-chain depth
[Figure: operand slices A[N-1:N-k], A[N-k-1:N-2k], A[N-2k-1:N-3k] and the matching B slices feed the sub-adders, which produce SUM[N-1:N-k], SUM[N-k-1:N-2k], SUM[N-2k-1:N-3k]; the carry between sub-adders is cut.]
Probability of correct result :
Approximate adder can be configured with “k”
Estimation over CLA (N=16)
k                 2      3      4      5      6
Min. clock cycle  0.5    0.65   0.75   0.83   0.89
area              0.87   1.05   1.12   1.15   1.12
power             0.44   0.68   0.84   0.95   1.00
pass rate         0.554  0.829  0.942  0.982  0.995
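
The pass rate can be estimated empirically by comparing an approximate add against the exact sum over random inputs. The sketch below is a Monte Carlo estimate under assumptions, not the closed-form probability from the slide: it assumes uniformly random operands and a behavioral model in which every sub-adder above the lowest contributes its upper k bits. It only handles values of k that divide N, so the k=3, 5, 6 columns are out of its scope. `aca_add` and `pass_rate` are illustrative names.

```python
import random


def aca_add(a: int, b: int, n: int = 16, k: int = 4) -> int:
    """Approximate n-bit add using overlapping 2k-bit sub-adders (carry-in 0)."""
    assert n % k == 0 and n >= 2 * k, "sketch assumes k divides n"
    window = 2 * k
    wmask, kmask = (1 << window) - 1, (1 << k) - 1
    last = n - window
    total = 0
    for lo in range(0, last + 1, k):
        part = ((a >> lo) & wmask) + ((b >> lo) & wmask)
        if lo == 0:
            total |= part & wmask
        else:
            total |= ((part >> k) & kmask) << (lo + k)
        if lo == last:
            total |= (part >> window) << n
    return total


def pass_rate(n: int = 16, k: int = 4, trials: int = 20000, seed: int = 0) -> float:
    """Fraction of random operand pairs for which the approximate sum is exact."""
    rng = random.Random(seed)  # fixed seed so the estimate is repeatable
    hits = 0
    for _ in range(trials):
        a, b = rng.getrandbits(n), rng.getrandbits(n)
        hits += aca_add(a, b, n, k) == a + b
    return hits / trials


print(pass_rate(16, 4))
```

For N=16 and k=4 the estimate should land close to the 0.942 entry in the table; small differences from sampling noise are expected.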
-9-
Error Detection and Correction
[Figure: approximate adder with EDC circuit. Labels: IN, sub-adder i, sub-adder i+1, sum i, carry i+1, error i, incrementor, SUMapprox, SUMcorrect, OUT, error, data stall.]
Error can be detected and corrected with small overhead
  Error detection: ‘and’ gates
  Error correction: incrementor circuit
Variable latency operation
  Error detection and correction can take more time than critical path delay of “sub-adder”; the throughput can be reduced
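
The detect-then-increment idea can be modeled behaviorally. The sketch below is one possible reading of the slide, not the authors' circuit. It assumes an error in sub-adder i occurs exactly when the lower k bits of its slice all propagate and a carry arrives from below (an AND of those signals), and that the fix is to add 1 to the sub-adder's upper part. The name `aca_add_edc` and the returned tuple layout are made up for illustration.

```python
import random


def aca_add_edc(a: int, b: int, n: int = 16, k: int = 4):
    """Return (approximate sum, corrected sum, per-sub-adder error flags)."""
    assert n % k == 0 and n >= 2 * k, "sketch assumes k divides n"
    window = 2 * k
    wmask, kmask = (1 << window) - 1, (1 << k) - 1
    last = n - window

    first = (a & wmask) + (b & wmask)
    approx = first & wmask
    if last == 0:                      # single sub-adder: it is already exact
        approx |= (first >> window) << n
    corrected = approx
    # True carry into bit k, taken inside the lowest sub-adder (no cut below it).
    carry = ((a & kmask) + (b & kmask)) >> k
    flags = []
    for lo in range(k, last + 1, k):
        wa, wb = (a >> lo) & wmask, (b >> lo) & wmask
        upper = (wa + wb) >> k         # k sum bits plus the sub-adder carry-out
        # Detection: the lower k bits all propagate AND a carry arrives from below.
        err = ((wa ^ wb) & kmask) == kmask and carry == 1
        fixed = upper + 1 if err else upper  # correction: incrementor on the upper part
        keep = (1 << (k + 1)) - 1 if lo == last else kmask  # top sub-adder keeps SUM[n]
        approx |= (upper & keep) << (lo + k)
        corrected |= (fixed & keep) << (lo + k)
        flags.append(err)
        # Carry into the next boundary: internal carry of this slice, or the detected one.
        carry = (((wa & kmask) + (wb & kmask)) >> k) | int(err)
    return approx, corrected, flags


approx, corrected, flags = aca_add_edc(0x00FF, 0x0001)
print(hex(approx), hex(corrected), flags)  # 0x0 0x100 [True, False]
```

In this model the corrections ripple from low to high sub-adders, which is consistent with the slide's note that correction can take longer than one sub-adder delay.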
-10-
Accuracy Configuration with Pipeline
[Figure: 4-stage pipeline. Stage 1: approximate adder on A and B produces partial sums S3, S2, S1, S0 plus error signals. Stage 2: correction on S1. Stage 3: correction on S2. Stage 4: correction on S3, giving SUMcorrect. Partial sums are marked approximate or correct at each stage.]
Each stage generates a result with different accuracy
Can turn off later stages with power gating according to accuracy requirement
Config.  Power gating    Accuracy  Power reduction
Mode-1   None            1.000     -11.5%
Mode-2   Stage 4         0.960     12.4%
Mode-3   Stage-3, 4      0.925     31.0%
Mode-4   Stage-2, 3, 4   0.900     51.6%
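
Choosing a mode from this table can be written as a small lookup: take the modes whose accuracy meets the requirement and pick the one with the largest power reduction. The sketch below uses only the numbers on this slide; the `Mode` type and the `select_mode` function are hypothetical names, and a real controller would presumably drive the power-gating signals rather than return a record.

```python
from typing import NamedTuple, Tuple


class Mode(NamedTuple):
    name: str
    gated_stages: Tuple[int, ...]   # pipeline stages that are power-gated
    accuracy: float
    power_reduction_pct: float      # negative means the mode costs extra power


# Values copied from the table on this slide.
MODES = [
    Mode("Mode-1", (), 1.000, -11.5),
    Mode("Mode-2", (4,), 0.960, 12.4),
    Mode("Mode-3", (3, 4), 0.925, 31.0),
    Mode("Mode-4", (2, 3, 4), 0.900, 51.6),
]


def select_mode(required_accuracy: float) -> Mode:
    """Pick the feasible mode with the largest power reduction."""
    feasible = [m for m in MODES if m.accuracy >= required_accuracy]
    if not feasible:
        raise ValueError("no mode meets the required accuracy")
    return max(feasible, key=lambda m: m.power_reduction_pct)


print(select_mode(0.95).name)  # Mode-2
print(select_mode(0.90).name)  # Mode-4
```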
-11-
Outline
Background and Motivation
Accuracy Configurable Adder Design
Experimental Setup and Results
Conclusions and Ongoing Works

-12-
Experimental Setup and Metrics

Experimental Setup
  Library: TSMC 65GP
  Implementation: Synopsys Design Compiler
  Simulation: Cadence NC-SIM
  Input patterns: random data and actual data
  Library preparation: Cadence Library Characterizer
Accuracy Metrics
Metric   Definition      Data type
ACCamp   1-|Rc-Re|/Rc    Amplitude data
ACCinf   1-Be/Bw         Information data
Rc and Re : correct and obtained results
Be: number of error bits, Bw: bit-width of data
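
The two metrics translate directly into code. The sketch below follows the definitions in the table, with two assumptions that the slide does not spell out: Rc is taken to be positive for ACCamp, and "number of error bits" is read as the number of bit positions where the correct and obtained results differ. The function names are illustrative.

```python
def acc_amp(rc: float, re: float) -> float:
    """ACCamp = 1 - |Rc - Re| / Rc, for amplitude data (assumes Rc > 0)."""
    if rc <= 0:
        raise ValueError("this sketch assumes a positive correct result Rc")
    return 1 - abs(rc - re) / rc


def acc_inf(rc: int, re: int, bw: int) -> float:
    """ACCinf = 1 - Be / Bw, with Be counted as differing bit positions."""
    diff = (rc ^ re) & ((1 << bw) - 1)   # 1 wherever the two results disagree
    be = bin(diff).count("1")
    return 1 - be / bw


print(acc_amp(100, 90))            # close to 0.9
print(acc_inf(0b1111, 0b1101, 4))  # 0.75
```

A single wrong high-order bit barely moves ACCinf but can move ACCamp a lot, which is presumably why the slide reports both.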
-13-
Approximate Adder Comparison

Accuracy vs. power consumption
Image smoothing (Gaussian filter)
  (a) Original image
  (b) Accurate adder
  (c) ACA (PSNR 24.5dB)
  (d) ETAI (25.3dB)
  (e) ETAII (16.2dB)
  (f) LU (11.1dB)
(c)~(f) have 50% power of accurate adder (b)
* ETAI cannot detect and correct errors
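
The slide does not say how the PSNR values were computed. As a point of reference, the sketch below uses the standard definition, 10·log10(peak² / MSE), with an assumed 8-bit peak of 255 and images given as flat lists of pixel values; the function name and parameters are illustrative.

```python
import math
from typing import Sequence


def psnr(reference: Sequence[float], test: Sequence[float], peak: float = 255) -> float:
    """Peak signal-to-noise ratio in dB between two equally sized pixel lists."""
    if len(reference) != len(test) or len(reference) == 0:
        raise ValueError("inputs must be non-empty and the same length")
    mse = sum((r - t) ** 2 for r, t in zip(reference, test)) / len(reference)
    if mse == 0:
        return math.inf            # identical images
    return 10 * math.log10(peak ** 2 / mse)


print(psnr([0, 0, 0, 0], [0, 0, 0, 255]))  # about 6.02 dB
```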
-14-
Approximate Adder Comparison
Accuracy vs. power consumption w/ voltage scaling
[Figure: two plots, ACCamp and ACCinf (0.400 to 1.000) vs. total power (W), 2.00E-04 to 8.00E-04, under voltage scaling (1.0V~0.6V). Series: ACA adder, CLA, Lu's adder, ETAI, ETAIIM.]
ACA adder shows fine results (accuracy vs. power) on both ACCamp and ACCinf metrics
-15-
Accuracy Configuration and Power Saving
Power saving from voltage scaling + mode change
4-stage 32-bit adder case
[Figure: total power consumption (W), 0.00E+00 to 7.00E-03, vs. ACCinf, 0.80 to 1.00, for the conventional pipelined adder and the ACA adder (modes 1 to 4). Annotations: accurate result; Accuracy: 1.0 → 0.9; voltage scaling; mode change; 4X reduction.]
Accuracy configuration w/ mode change is more effective than w/ voltage scaling
-16-
Accuracy Configuration and Power Saving
Power consumption when accuracy requirement is varying (w/ SPEC 2006 benchmarks)
[Figure: normalized power consumption (0 to 1) broken down by mode-1 to mode-4; labels: High accuracy, 0.95, Accuracy, 1.00.]
Average 30% power savings over no accuracy configuration
-17-
Outline
Background and Motivation
Accuracy Configurable Adder Design
Experimental Setup and Results
Conclusions and Ongoing Works

-18-
Conclusions and Ongoing Works

Conclusions
  We proposed an accuracy-configurable approximate (ACA) adder, which can adapt to changing accuracy requirements
  ACA can provide 30% power reduction with accuracy configuration during runtime
Ongoing Works
  Accuracy-configurable design for other arithmetic units (multiplier, divider)
  Automated synthesis flow (minimize power under the required accuracy)
[Figure: synthesis flow. Labels: RTL, Required accuracy, exact adder, approximate adder, Accuracy estimation, Synthesis.]
-19-
Thank You!
-20-
Accuracy-Configurable Approximate Design
Required accuracy can change during runtime
  Idea of High-Efficiency Math highlighted by Intel Labs at ISSCC-2012
  Variable-precision floating point unit w/ accuracy tracking : 24-bit → 12-bit → 6-bit as needed
Accuracy-configurable design adapts to changing requirements, maximizing benefits of approximate design paradigm
[Figure: Variable-precision Mantissa.]
[Figure: normalized power and required accuracy vs. time for the accurate design (1.0) and the accuracy-configurable design; labels: accurate mode, approximate mode, event occurred; required accuracy levels 80%, 90%, 100%.]
-21-