Accuracy-Configurable Adder for
Approximate Arithmetic Designs
Andrew B. Kahng, Seokhyeong Kang
VLSI CAD LABORATORY, UC San Diego
49th Design Automation Conference
June 6th, 2012
UC San Diego / VLSI CAD Laboratory
-1-
Outline
Background and Motivation
Accuracy Configurable Adder Design
Experimental Setup and Results
Conclusions and Ongoing Works
-2-
Why Approximate Designs?
Threats to traditional IC design approach ...
Extreme variations: PVT variation uncertainty leads to design overhead
Reliability issues: Hard errors (NBTI, latchup), Soft errors (α-particle)
Cost: Cost (power/performance) of perfect accuracy is too high!
Approximate designs
Relaxing the requirement of correctness can
dramatically reduce costs of the design
What is the square root of 10?
“3.162278....”
“a little more than three”
Approximation could be faster and more powerful
-3-
Previous Approximate Adders
Lu et al. IEEE Computer 2004
Faster adder w/ shorter carry chain
High performance with small error rate
Large area overhead: not applicable for
low energy design
Zhu et al. TVLSI 2010
ETAI : accurate part + inaccurate part
Reduce error size
Error rate is high
Output accuracy is fixed, so benefits can be limited by the
required accuracy
-4-
Our Work: Accuracy-Configurable Approximate Adder
How power benefits can be achieved …
[Figure: normalized power and required accuracy vs. time. Required accuracy steps through 80%, 100%, 90%, 80% as events occur; curves: accurate design (normalized power 1.0) and accuracy-configurable design (accurate mode / approximate mode)]
Accuracy-configurable design adapts to changing
requirements by using different modes in each situation
-5-
Our Work: Accuracy-Configurable Approximate Adder
How power benefits can be achieved …
[Figure: normalized power and required accuracy vs. time; curves: accurate design (normalized power 1.0) and accuracy-configurable design (accurate mode / approximate mode)]
Accuracy-configurable approximate adder
[Figure: approximate adder followed by two error correction blocks, ECC-1 and ECC-2]
Mode 1: turn off ECC-1, ECC-2 (accuracy: 90%)
Mode 2: turn off ECC-2 (accuracy: 95%)
Mode 3: turn on all ECC (accuracy: 100%)
-6-
Outline
Background and Motivation
Accuracy Configurable Adder Design
Experimental Setup and Results
Conclusions and Ongoing Works
-7-
Approximate Adder Implementation
16-bit adder case
AH=A[15:8], AM=A[11:4], AL=A[7:0]
[Figure: inputs A[15:0] and B[15:0] feed three 8-bit adders whose carry chains are not connected]
SUMH = AH+BH → SUM[16] (carry), SUM[15:12]
SUMM = AM+BM → SUM[11:8]
SUML = AL+BL → SUM[7:4], SUM[3:0]
Carry chain is cut to reduce critical path delay
Sub-adders generate results of partial summation
Middle sub-adder improves accuracy (error 50% → 5.5%)
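The 16-bit case above can be sketched behaviourally in Python. This is a minimal software model under stated assumptions, not the authors' circuit: the name `aca_add` and the parameters `n` and `k` are illustrative, and the sketch assumes `k` divides `n` so that the overlapping 2k-bit windows tile the operands exactly (n=16, k=4 gives the three 8-bit sub-adders of the slide).

```python
def aca_add(a, b, n=16, k=4):
    """Approximate n-bit add using overlapping 2k-bit sub-adders (carry-in 0).

    Hypothetical helper for illustration; assumes k divides n and n >= 2k.
    """
    full = (1 << (2 * k)) - 1   # mask for one 2k-bit window
    half = (1 << k) - 1         # mask for k result bits

    # Lowest sub-adder (AL+BL): all 2k sum bits are used.
    low = (a & full) + (b & full)
    result = low & full
    carry_out = low >> (2 * k)

    # Remaining sub-adders (AM+BM, AH+BH): only the upper k bits are used.
    for offset in range(k, n - 2 * k + 1, k):
        s = ((a >> offset) & full) + ((b >> offset) & full)
        result |= ((s >> k) & half) << (offset + k)
        carry_out = s >> (2 * k)   # the top sub-adder supplies SUM[n]

    return result | (carry_out << n)


if __name__ == "__main__":
    print(hex(aca_add(0x00F0, 0x0010)))  # carry travels 4 bits
    print(hex(aca_add(0x00FF, 0x0001)))  # carry would need to travel 8 bits
```

In this model 0x00F0 + 0x0010 comes out exact because the carry crosses only four bit positions, while 0x00FF + 0x0001 does not, because the carry generated at bit 0 would have to ripple through bits 4 to 7 before the middle window could see it.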
-8-
Approximate Adder Implementation
N-bit adder case
N: bit width, k: ½ carry-chain depth
[Figure: N-bit adder built from overlapping sub-adders with carry chains of limited length; inputs A and B [N-1:N-k], [N-k-1:N-2k], [N-2k-1:N-3k], ...; outputs SUM [N-1:N-k], SUM [N-k-1:N-2k], SUM [N-2k-1:N-3k], ...]
Probability of correct result :
Approximate adder can be configured with “k”
Estimation over CLA (N=16)
k                 2      3      4      5      6
Min. clock cycle  0.5    0.65   0.75   0.83   0.89
Area              0.87   1.05   1.12   1.15   1.12
Power             0.44   0.68   0.84   0.95   1.00
Pass rate         0.554  0.829  0.942  0.982  0.995
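The dependence of the pass rate on k can be explored with a small Monte Carlo experiment. The sketch below is an assumption-laden software model rather than the estimation method behind the table: `aca_add` and `pass_rate` are hypothetical names, inputs are uniformly random, and only values of k that divide N are supported, so k = 3, 5, 6 from the table are not covered and the estimates need not match the slide's figures.

```python
import random


def aca_add(a, b, n, k):
    """Approximate add with overlapping 2k-bit windows; assumes k divides n."""
    full = (1 << (2 * k)) - 1
    half = (1 << k) - 1
    low = (a & full) + (b & full)
    result, carry_out = low & full, low >> (2 * k)
    for offset in range(k, n - 2 * k + 1, k):
        s = ((a >> offset) & full) + ((b >> offset) & full)
        result |= ((s >> k) & half) << (offset + k)
        carry_out = s >> (2 * k)
    return result | (carry_out << n)


def pass_rate(n, k, trials=20000, seed=0):
    """Fraction of uniformly random operand pairs whose approximate sum is exact."""
    rng = random.Random(seed)   # fixed seed keeps the estimate repeatable
    exact = 0
    for _ in range(trials):
        a, b = rng.getrandbits(n), rng.getrandbits(n)
        exact += aca_add(a, b, n, k) == a + b
    return exact / trials


if __name__ == "__main__":
    for k in (2, 4, 8):
        print(f"k={k}: estimated pass rate {pass_rate(16, k):.3f}")
```

With k = 8 and N = 16 there is a single 16-bit window, so the model degenerates to an exact adder; smaller k shortens the carry chain and lowers the pass rate.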
-9-
Error Detection and Correction
[Figure: approximate adder (IN, sub-adder i, sub-adder i+1, SUMapprox) feeding an EDC circuit (sum i, carry i+1, error i, incrementor, SUMcorrect, OUT); the error signal triggers a data stall, giving variable latency operation]
Error can be detected and corrected with small overhead
Error detection: ‘and’ gates
Error correction: incrementor circuit
Error detection and correction can take more time than
critical path delay of “sub-adder”; the throughput can be
reduced
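One plausible reading of the detection and correction scheme can be modelled in software as follows. This is a behavioural sketch under assumptions that are not stated on the slide: a sub-adder's result is flagged as wrong when its lower k sum bits are all ones (the AND of those bits) and a carry arrives from the sub-adder below, and the fix is to increment the upper half of that sub-adder's result. The function name and return shape are hypothetical, and k is assumed to divide n.

```python
def aca_add_with_edc(a, b, n=16, k=4):
    """Return (approx_sum, corrected_sum, error_flags); assumes k divides n."""
    half = (1 << k) - 1
    full = (1 << (2 * k)) - 1

    # Lowest sub-adder is exact on bits [2k-1:0].
    first = (a & full) + (b & full)
    approx = corrected = first & full
    top_approx = top_corrected = first >> (2 * k)
    # True carry out of the lowest k bits, i.e. into the next sub-adder.
    carry_in = ((a & half) + (b & half)) >> k

    error_flags = []
    for offset in range(k, n - 2 * k + 1, k):
        wa, wb = (a >> offset) & full, (b >> offset) & full
        window = wa + wb                       # sub-adder result with carry-in 0
        low = (wa & half) + (wb & half)        # its lower k bits
        all_ones = (low & half) == half        # AND of the lower k sum bits
        error = bool(all_ones and carry_in)    # detection
        upper = window >> k                    # upper k sum bits plus carry-out
        fixed = upper + 1 if error else upper  # correction by incrementing

        approx |= (upper & half) << (offset + k)
        corrected |= (fixed & half) << (offset + k)
        top_approx, top_corrected = upper >> k, fixed >> k
        error_flags.append(error)
        # Carry into the next sub-adder: generated here or propagated through.
        carry_in = (low >> k) | int(error)

    return (approx | (top_approx << n),
            corrected | (top_corrected << n),
            error_flags)


if __name__ == "__main__":
    print(aca_add_with_edc(0x00FF, 0x0001))
```

The loop corrects sub-adders from low to high because a corrected carry can itself trigger an error in the next sub-adder, which is consistent with the slide's remark that detection and correction can take longer than the sub-adder critical path.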
-10-
Accuracy Configuration with Pipeline
[Figure: four-stage pipeline with inputs A and B. Stage 1: approximate adder produces S3, S2, S1, S0 (approximate SUM). Stage 2: correction on S1 using errors on S1. Stage 3: correction on S2 using errors on S2. Stage 4: correction on S3 using errors on S3, giving SUMcorrect]
Each stage generates a result with different accuracy
Can turn off later stages with power gating
according to accuracy requirement
Config.  Power gating   Accuracy  Power reduction
Mode-1   None           1.000     -11.5%
Mode-2   Stage 4        0.960     12.4%
Mode-3   Stage 3, 4     0.925     31.0%
Mode-4   Stage 2, 3, 4  0.900     51.6%
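The mode table suggests a simple runtime policy: pick the mode with the largest power reduction whose accuracy still meets the current requirement. The sketch below illustrates that policy using the four rows from the slide; the `MODES` structure and the `select_mode` function are hypothetical names, and the slide does not say how mode selection is actually implemented.

```python
# (name, power-gated stages, accuracy, power reduction in %), values from the slide.
MODES = [
    ("Mode-1", (), 1.000, -11.5),
    ("Mode-2", (4,), 0.960, 12.4),
    ("Mode-3", (3, 4), 0.925, 31.0),
    ("Mode-4", (2, 3, 4), 0.900, 51.6),
]


def select_mode(required_accuracy):
    """Return the mode with the largest power reduction that meets the requirement."""
    candidates = [m for m in MODES if m[2] >= required_accuracy]
    if not candidates:
        raise ValueError(f"no mode reaches accuracy {required_accuracy}")
    return max(candidates, key=lambda m: m[3])


if __name__ == "__main__":
    for req in (1.00, 0.95, 0.92, 0.90):
        name, gated, acc, saving = select_mode(req)
        print(f"required {req:.2f}: {name}, gate stages {gated}, "
              f"accuracy {acc:.3f}, power reduction {saving}%")
```

Note that Mode-1 has a negative power reduction in the table, so under this policy it is chosen only when no cheaper mode satisfies the requirement.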
-11-
Outline
Background and Motivation
Accuracy Configurable Adder Design
Experimental Setup and Results
Conclusions and Ongoing Works
-12-
Experimental Setup and Metrics
Experimental Setup
Library: TSMC 65GP
Implementation: Synopsys Design Compiler
Simulation: Cadence NC-SIM
Input patterns: random data and actual data
Library preparation: Cadence Library Characterizer
Accuracy Metrics
Metric   Definition     Data type
ACCamp   1-|Rc-Re|/Rc   Amplitude data
ACCinf   1-Be/Bw        Information data
Rc and Re : correct and obtained results
Be: number of error bits, Bw: bit-width of data
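The two metrics can be written out directly from their definitions. In this sketch the function names are illustrative, Rc is assumed to be a positive non-zero value, and Be is assumed to be the number of bit positions in which the correct and obtained results differ within the stated bit-width.

```python
def acc_amp(rc, re):
    """ACCamp = 1 - |Rc - Re| / Rc for amplitude data (assumes Rc > 0)."""
    if rc == 0:
        raise ValueError("ACCamp is undefined for Rc = 0")
    return 1 - abs(rc - re) / rc


def acc_inf(rc, re, bit_width):
    """ACCinf = 1 - Be / Bw for information data.

    Be is taken as the count of differing bits between rc and re
    within bit_width bits (an assumption about how error bits are counted).
    """
    mask = (1 << bit_width) - 1
    error_bits = bin((rc ^ re) & mask).count("1")
    return 1 - error_bits / bit_width


if __name__ == "__main__":
    # A single wrong high-order bit: large amplitude error, small bit error.
    print(acc_amp(0x100, 0x000), acc_inf(0x100, 0x000, 16))
```

The example shows why both metrics are reported: one wrong high-order bit wipes out the amplitude accuracy while leaving the bit-level accuracy high.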
-13-
Approximate Adder Comparison
Accuracy vs. power consumption
Image smoothing
(Gaussian filter)
(a) Original image
(b) Accurate adder
(c) ACA (PSNR 24.5dB)
(d) ETAI (25.3dB)
(e) ETAII (16.2dB)
(f) LU (11.1dB)
(c)~(f) have 50% power of
accurate adder (b)
* ETAI cannot detect and correct errors
-14-
Approximate Adder Comparison
Accuracy vs. power consumption w/ voltage scaling
[Two plots: ACCamp and ACCinf (0.400 to 1.000) vs. total power (W, 2.00E-04 to 8.00E-04) under voltage scaling (1.0V~0.6V); series: ACA adder, CLA, Lu's adder, ETAI, ETAIIM]
ACA adder shows fine results (accuracy vs. power)
on both ACCamp and ACCinf metrics
-15-
Accuracy Configuration and Power Saving
Power saving from voltage scaling + mode change
4-stage 32-bit adder case
[Plot: total power consumption (W, 0.00E+00 to 7.00E-03) vs. ACCinf (0.80 to 1.00); series: conventional pipelined adder (voltage scaling) and ACA adder in modes 1 to 4 (mode change); annotations: accurate result, accuracy 1.0 → 0.9, 4X reduction]
Accuracy configuration w/ mode change is more
effective than w/ voltage scaling
-16-
Accuracy Configuration and Power Saving
Power consumption when accuracy requirement
is varying (w/ SPEC 2006 benchmarks)
[Chart: normalized power consumption (0 to 1) vs. accuracy requirement (0.95 to 1.00, high accuracy); series: mode-1, mode-2, mode-3, mode-4]
Average 30% power savings
over no accuracy configuration
-17-
Outline
Background and Motivation
Accuracy Configurable Adder Design
Experimental Setup and Results
Conclusions and Ongoing Works
-18-
Conclusions and Ongoing Works
Conclusions
We proposed an accuracy-configurable approximate (ACA)
adder, which can adapt to changing accuracy requirements
ACA can provide 30% power reduction with accuracy
configuration during runtime
Ongoing Works
Accuracy-configurable design for other arithmetic units
(multiplier, divider)
Automated synthesis flow (minimize power under the
required accuracy)
[Flow diagram: inputs RTL and required accuracy; steps: exact adder to approximate adder, accuracy estimation, synthesis]
-19-
Thank You!
-20-
Accuracy-Configurable Approximate Design
Required accuracy can change during runtime
Idea of High-Efficiency Math
highlighted by Intel Labs at ISSCC-2012
Variable-precision floating point unit w/
accuracy tracking: 24-bit → 12-bit →
6-bit as needed
Accuracy-configurable
design adapts to changing
requirements, maximizing
benefits of approximate
design paradigm
[Figure: normalized power and required accuracy (80%, 100%, 90%, 80%) vs. time as events occur; curves: accurate design (normalized power 1.0) and accuracy-configurable design (accurate mode / approximate mode); inset label: variable-precision mantissa]
-21-