41st DAC Tuesday Keynote
Giga-scale Integration for
Tera-Ops Performance
Opportunities and New Frontiers
Pat Gelsinger
Senior Vice President & CTO
Intel Corporation
June 8, 2004
Why Bother?
[Figure: Litho Cost. Litho Tool Cost ($K), $1 to $100,000, vs. year, 1960 to 2010. Source: G. Moore, ISSCC 03]
[Figure: FAB Cost. Fab Cost ($M), $1 to $10,000, vs. year, 1960 to 2010. Source: www.icknowledge.com]
[Figure: Test Capital Per Chip. Test Capital ($), 1.E-04 to 1.E+03, vs. year, 1980 to 2010. Based on SIA roadmap]
Why Bother?
[Figures repeated: Litho Tool Cost ($K) and Fab Cost ($M) vs. year, 1960 to 2010]
Scaling dead at 130-nm, says IBM technologist
By Peter Clarke, Silicon Strategies
May 04, 2004 (2:28 PM EDT)
PRAGUE, Czech Republic — The traditional scaling of
semiconductor manufacturing processes died somewhere
between the 130- and 90-nanometer nodes, Bernie Meyerson,
IBM's chief technology officer, told an industry forum.
Believe in the Law
[Figure: $ per MIPS ($/MIPs), 1.E-02 to 1.E+04, vs. year, 1960 to 2010]
[Figure: $ per Transistor ($/Transistor), 1.E-06 to 1.E-01, vs. year, 1960 to 2010]
No exponential is forever,
but you can delay forever…
–Gordon Moore
Direction For The Future
CMOS Outlook
High Volume Manufacturing    2004  2006  2008  2010  2012  2014  2016  2018
Technology Node (nm)           90    65    45    32    22    16    11     8
Integration Capacity (BT)       2     4     8    16    32    64   128   256
Moore’s Law Is Alive & Well …
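The roadmap rows above follow simple geometric progressions. A minimal sketch, assuming a new generation every two years, a linear shrink of about 0.7x per generation, and a doubling of integration capacity (the constant names and the `roadmap` helper are illustrative, not from the talk):

```python
# Sketch: regenerate the CMOS Outlook rows from assumed rules of thumb.
# Assumptions: one generation every 2 years, ~0.7x linear shrink per
# generation, and integration capacity doubling per generation.

START_YEAR = 2004
START_NODE_NM = 90.0
START_CAPACITY_BT = 2   # billions of transistors
GENERATIONS = 8


def roadmap(generations=GENERATIONS):
    """Return a list of (year, node_nm, capacity_bt) tuples."""
    rows = []
    for g in range(generations):
        year = START_YEAR + 2 * g
        node_nm = START_NODE_NM * 0.7 ** g       # ideal 0.7x shrink
        capacity_bt = START_CAPACITY_BT * 2 ** g  # doubling per generation
        rows.append((year, node_nm, capacity_bt))
    return rows


if __name__ == "__main__":
    for year, node_nm, capacity_bt in roadmap():
        print(f"{year}: ~{node_nm:.0f} nm, {capacity_bt} BT")
```

The ideal 0.7x series lands close to, but not exactly on, the node names in the table (for example 63 nm against the 65 nm node), because node names are rounded marketing labels rather than computed values.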
However …
CMOS Outlook
High Volume Manufacturing    2004  2006  2008  2010  2012  2014  2016  2018
Technology Node (nm)           90    65    45    32    22    16    11     8
Integration Capacity (BT)       2     4     8    16    32    64   128   256
Delay = CV/I scaling          0.7  ~0.7  >0.7  Delay scaling will slow down
Energy/Logic Op scaling     >0.35  >0.5  >0.5  Energy scaling will slow down
Bulk Planar CMOS             High Probability, moving to Low Probability
Alternate, 3G etc            Low Probability, moving to High Probability
Variability                  Medium, then High, then Very High
ILD (K)                        ~3    <3  Reduce slowly towards 2-2.5
RC Delay                        1     1     1     1     1     1     1     1
Metal Layers                  6-7   7-8   8-9  0.5 to 1 layer per generation
Guiding Observations
Transistors (and silicon) are free
Power is the only real limiter
Optimizing for frequency AND/OR area may achieve neither
MOS Transistor Scaling
[Figure: MOS transistor cross-sections before and after scaling, labeled GATE, SOURCE, DRAIN, BODY, Xj, D, Tox, Leff]
Dimensions scale down by 30%: doubles transistor density
Oxide thickness scales down: faster transistor, higher performance
Vdd & Vt scaling: lower active power
Technology has scaled well, and will continue…
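The three scaling claims above can be checked with a few lines of arithmetic. This is a sketch under the textbook constant-field scaling assumption, where dimensions, capacitance and supply voltage all shrink by the same factor s and frequency rises by 1/s; the function name and the returned keys are illustrative:

```python
# Sketch of the arithmetic behind one generation of classical scaling.
# Assumption: ideal constant-field scaling with a single factor s.

def scale_one_generation(s=0.7):
    """Return relative metrics after shrinking linear dimensions by s."""
    area = s * s                 # transistor area shrinks with s^2
    density = 1.0 / area         # transistors per unit area
    capacitance = s              # C scales with dimensions
    vdd = s                      # supply voltage scales with s
    frequency = 1.0 / s          # gate delay shrinks with s
    # Active power per transistor: P = C * Vdd^2 * f
    power_per_transistor = capacitance * vdd ** 2 * frequency
    return {
        "density": density,
        "frequency": frequency,
        "power_per_transistor": power_per_transistor,
        "power_density": power_per_transistor * density,
    }


if __name__ == "__main__":
    for name, value in scale_one_generation().items():
        print(f"{name}: {value:.2f}")
```

With s = 0.7 the density roughly doubles (1/0.49) and power per transistor roughly halves, so power density stays flat in the ideal case. The later slides argue that Vdd scaling is slowing, which is exactly where this ideal picture stops holding.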
Delivering Performance in Power Envelope
[Figures: Relative Performance, 0.9 to 1.3, 130nm vs. 90nm, on MobileMark and Spec 2000; gains shown are 17% (MobileMark), 21% and 23%]
Mobile, Power Envelope ~20-30W
Desktop, Power Envelope ~60-90W
Server, Power Envelope ~100-130W
Strained Silicon – 90nm+
[Figure: NMOS and PMOS transistor cross-sections with G, S and D terminals]
NMOS: Tensile Si3N4 Cap
PMOS: SiGe S-D creates strain
10-25% higher ON current
84-97% leakage current reduction
OR
15% active power reduction
Source: Mark Bohr, Intel
Gate & Source-Drain Leakage
[Figure: 90nm MOS Transistor cross-section showing Gate, 1.2 nm SiO2 and Silicon substrate, 50nm scale]
[Figure: Ioff (nA/µm), 1 to 10000, vs. Temp (C), 30 to 130, for generations from 0.25u to 45nm]
Gate Leakage Solutions:
High-K + Metal Gate
New Transistors: Tri-Gate…
Tri-gate
[Figure: Tri-gate transistor with Gate 1, Gate 2 and Gate 3 between Source and Drain; dimensions Lg, WSi, TSi]
Source: Intel
Improved short-channel effects
Higher ON current for lower SD Leakage
Manufacturing control: research underway
Metal Interconnects
[Figure: Line Res (Relative), Line Cap (Relative) and RC Delay (Relative) vs. technology node, 500 to 32 nm; annotations: Low-K ILD, 0.7x Scaled RC Delay]
[Figure: Delay (ps), 1 to 10000, vs. technology node, 350 to 65 nm: Interconnect RC Delay (RC delay of 1mm interconnect, Copper Interconnect) against Clock Period]
New Challenge: Variations
Static & Dynamic
Random Dopant Fluctuations
[Figure: Mean Number of Dopant Atoms, 10 to 10000, vs. Technology Node (nm), 1000 to 32; insets: Uniform, Non-uniform]
Sub-wavelength Lithography Adds Variations
[Figure: Lithography Wavelength (365nm, 248nm, 193nm, 13nm EUV) and technology Generation (180nm, 130nm, 90nm, 65nm, 45nm, 32nm), in micron, 0.01 to 1, vs. year, 1980 to 2020; the Gap between the two curves is marked]
Impact of Static Variations
[Figure: Normalized Frequency, 0.9 to 1.4, vs. Normalized Leakage (Isb), 1 to 5, 130nm; the spread is 30% in frequency and 5X in leakage]
Frequency: ~30%
Leakage Power: ~5-10X
Dynamic Variations: Vdd & Temperature
[Figure: Heat Flux (W/cm2) map, 0 to 250; results in Vcc variation]
[Figure: Temperature Variation (°C) map, Temperature (C) 40 to 110; hot spots]
Technology Challenges
Power: Active + Leakage
Interconnects (RC Delay)
Variations
Design Methodology Is Changing…
Active Power Reduction
Multiple Vdd
[Figure: logic paths labeled Slow, Fast, Slow, served by a High Supply Voltage and a Low Supply Voltage]
• Vdd scaling will slow down
• Mimic Vdd scaling with multiple Vdd
• Challenges:
– Interface between low & high Vdd
– Delivery and distribution
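The benefit of multiple supplies follows from the quadratic dependence of active power on Vdd. A minimal sketch, assuming active power scales as C * Vdd^2 * f at a fixed frequency and ignoring the level-converter and distribution overheads listed as challenges above (`multi_vdd_power` and its parameters are hypothetical names):

```python
# Sketch: relative active power when part of a design moves to a lower supply.
# Assumption: active power follows C * Vdd^2 * f at fixed frequency;
# interface and power-delivery overheads are ignored.

def multi_vdd_power(low_vdd_fraction, low_vdd_ratio):
    """Return active power relative to running everything at the high Vdd.

    low_vdd_fraction: share of switched capacitance on the low supply (0..1)
    low_vdd_ratio:    low Vdd divided by high Vdd, in (0, 1]
    """
    if not 0.0 <= low_vdd_fraction <= 1.0:
        raise ValueError("low_vdd_fraction must be between 0 and 1")
    if not 0.0 < low_vdd_ratio <= 1.0:
        raise ValueError("low_vdd_ratio must be in (0, 1]")
    high_part = 1.0 - low_vdd_fraction
    low_part = low_vdd_fraction * low_vdd_ratio ** 2
    return high_part + low_part


if __name__ == "__main__":
    # Half of the switched capacitance moved to a supply at 0.7x the high Vdd.
    print(multi_vdd_power(0.5, 0.7))
```

In practice only paths with timing slack can move to the low supply, so the achievable fraction depends on the path delay distribution of the design.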
Leakage Control
Body Bias (Vbp +Ve, Vbn -Ve): 2-10X Reduction
Stack Effect (Vdd, Equal Loading): 5-10X Reduction
Sleep Transistor (Logic Block): 2-1000X Reduction
Adaptive Body Biasing
[Figure: Number of dies vs. Frequency, with f target marked; dies that are too slow receive FBB, dies that are too leaky receive RBB, giving the ABB distribution]
Adaptive Body Biasing
[Figure: Accepted Die, 0% to 100%, in the Low Frequency Bin and High Frequency Bin for No BB, ABB, and Within die ABB; annotations: 100% yield, 97% highest bin]
100% yield with Adaptive Body Biasing
97% highest freq bin with ABB for within die variability
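The per-die decision behind adaptive body biasing can be sketched as a simple classification. This is an illustration only: the function name, the normalized limits, and the "reject" outcome for dies that miss both limits are assumptions, not details from the talk.

```python
# Sketch of a per-die adaptive body bias (ABB) decision.
# Assumptions: each die has a measured frequency and leakage, normalized so
# that f_target and leakage_limit are the acceptance limits.

def choose_body_bias(frequency, leakage, f_target=1.0, leakage_limit=1.0):
    """Return the bias to apply to one die.

    "FBB":    forward body bias, speeds up a die that is too slow
    "RBB":    reverse body bias, cuts leakage on a die that is too leaky
    "none":   die already meets both limits
    "reject": die misses both limits, a single bias cannot fix both
    """
    too_slow = frequency < f_target
    too_leaky = leakage > leakage_limit
    if too_slow and too_leaky:
        return "reject"
    if too_slow:
        return "FBB"
    if too_leaky:
        return "RBB"
    return "none"


if __name__ == "__main__":
    dies = [(0.9, 0.5), (1.2, 3.0), (1.05, 0.8)]
    for frequency, leakage in dies:
        print(frequency, leakage, choose_body_bias(frequency, leakage))
```

Because fast dies tend to be the leaky ones and slow dies the low-leakage ones, most out-of-spec dies fall into exactly one of the two correctable cases, which is what lets biasing pull the distribution toward the target.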
RC Delay Mitigation
Throughput Oriented Design
              One Logic Block at Vdd    Two Logic Blocks at Vdd/2
Freq          1                         0.5
Vdd           1                         0.5
Throughput    1                         1
Power         1                         0.25
Area          1                         2
Power Den     1                         0.125
RC Delay Tolerant Design
Lower Power And Power Density
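The numbers on this slide follow directly from active power scaling as Vdd^2 * f. A minimal sketch that reproduces both columns, assuming identical logic blocks and no overhead for splitting and merging the work (`parallel_design` is an illustrative name):

```python
# Sketch: relative metrics for a throughput-oriented (parallel) design.
# Baseline: one block at vdd=1, freq=1 has power=1 and area=1.
# Assumption: active power per block scales as Vdd^2 * f, and parallelizing
# adds no extra logic.

def parallel_design(blocks, vdd, freq):
    """Return throughput, power, area and power density, all relative."""
    power = blocks * vdd ** 2 * freq
    area = float(blocks)
    return {
        "throughput": blocks * freq,
        "power": power,
        "area": area,
        "power_density": power / area,
    }


if __name__ == "__main__":
    print(parallel_design(1, 1.0, 1.0))   # single block at full Vdd
    print(parallel_design(2, 0.5, 0.5))   # two blocks at Vdd/2, half frequency
```

Halving the frequency is also what makes the design tolerant of RC delay: each block has twice the cycle time to absorb interconnect delay, paid for in area rather than power.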
Variation Tolerant Circuit Design
[Figure: power and target frequency probability, 0 to 2, vs. Transistor size, small to large]
[Figure: power and target frequency probability, 0 to 2, vs. Low-Vt usage, low to high]
Higher probability of target frequency with:
1. Larger transistor sizes
2. Higher Low-Vt usage
But with power penalty
µ-architecture Is Also Changing…
Variations and µ-architecture
[Figure: # of samples (%) vs. Variation (%), -16% to 16%, for Device I ON (NMOS, PMOS) and Delay]
[Figure: Ratio of delay-σ to Ion-σ, 0.0 to 1.0, vs. Logic depth]
[Figure: Number of dies vs. Clock frequency, 0.9 to 1.5, for different # critical paths]
[Figure: Mean clock frequency, 1 to 1.4, vs. # of critical paths]
Variation Tolerant µ-architecture
[Figure: frequency and target frequency probability, 0 to 1.5, vs. Logic depth, Small to Large]
[Figure: frequency and target frequency probability, 0 to 1.5, vs. # uArch critical paths, Less to More]
Decrease variability in the design:
1. Deeper logic depth
2. Smaller number of critical paths
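Both effects can be reproduced with a small Monte Carlo experiment. The sketch below assumes independent Gaussian gate delays and identical, independent critical paths, which is a simplification; the 10% gate sigma, the die count and the seed are illustrative values, not data from the talk.

```python
import random
import statistics

# Sketch: how logic depth and critical-path count shape die frequency.
# Assumptions: gate delays are independent Gaussians; all critical paths are
# identical and independent; a die runs at the speed of its slowest path.


def path_delay(rng, logic_depth, gate_sigma):
    """Delay of one path, normalized so the nominal path delay is 1.0."""
    gate_nominal = 1.0 / logic_depth
    return sum(rng.gauss(gate_nominal, gate_sigma * gate_nominal)
               for _ in range(logic_depth))


def die_frequencies(n_paths, logic_depth, gate_sigma=0.1, n_dies=2000, seed=1):
    """Simulate n_dies dies and return their normalized clock frequencies."""
    rng = random.Random(seed)
    freqs = []
    for _ in range(n_dies):
        worst = max(path_delay(rng, logic_depth, gate_sigma)
                    for _ in range(n_paths))
        freqs.append(1.0 / worst)
    return freqs


if __name__ == "__main__":
    for n_paths, depth in [(1, 16), (16, 16), (1, 49)]:
        f = die_frequencies(n_paths, depth)
        print(n_paths, depth, statistics.mean(f), statistics.stdev(f))
```

More critical paths lower the mean frequency because the slowest of many samples sets the clock, while deeper paths average out per-gate variation, so the spread of a single path shrinks roughly with the square root of the depth.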
Implications For CAD
Logic & Circuits
Layout
Test
Probabilistic Design
[Figure: # of Paths vs. Delay against a Delay Target, Deterministic and Probabilistic views]
[Figure: Probability vs. Path Delay and Frequency, due to variations in Vdd, Vt, and Temp; Deterministic and Probabilistic views]
[Figure: Leakage Power: 10X variation, ~50% total power]
Deterministic design techniques inadequate in the future
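The gap between the two views can be shown with a toy timing check. This sketch assumes independent Gaussian path delays, and the helper names and numbers are illustrative; it uses `statistics.NormalDist` from the Python standard library (3.8+).

```python
from statistics import NormalDist

# Sketch: deterministic vs. probabilistic timing sign-off.
# Assumption: each path delay is an independent Gaussian (mean, sigma).


def path_pass_probability(mean_delay, sigma, delay_target):
    """Probability that one path meets the delay target."""
    return NormalDist(mean_delay, sigma).cdf(delay_target)


def chip_pass_probability(paths, delay_target):
    """Probability that every path meets the target.

    paths: iterable of (mean_delay, sigma) tuples
    """
    probability = 1.0
    for mean_delay, sigma in paths:
        probability *= path_pass_probability(mean_delay, sigma, delay_target)
    return probability


def deterministic_pass(paths, delay_target):
    """Deterministic check: only nominal delays are compared with the target."""
    return all(mean_delay <= delay_target for mean_delay, _ in paths)


if __name__ == "__main__":
    paths = [(1.0, 0.05)] * 4        # four paths sitting exactly on the target
    print(deterministic_pass(paths, 1.0))
    print(chip_pass_probability(paths, 1.0))
```

Four paths that each sit exactly on the target pass the deterministic check, yet each one meets timing only half the time once variation is included, so the chance that all four pass is far lower than the nominal analysis suggests.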
Shift in Design Paradigm
• Multi-variable design optimization for:
– Yield and bin splits
– Parameter variations
– Active and leakage power
– Performance
Today: Local Optimization, Single Variable
Tomorrow: Global Optimization, Multi-variate
Today’s Freelance Layout
[Figure: cell layout with Vdd and Vss rails, input Ip and outputs Op]
No layout restrictions
Future Transistor Orientation Restrictions
[Figure: cell layout with Vdd and Vss rails, input Ip and outputs Op, all transistors in one orientation]
Transistor orientation restricted to improve manufacturing control
Future Transistor Width Quantization
[Figure: cell layout with Vdd and Vss rails, input Ip and outputs Op, transistor widths quantized]
Today’s Unrestricted Routing
Future Metal Restrictions
Today’s Metric:
Maximizing Transistor Density
Dense layout causes hot-spots
Tomorrow’s Metric:
Optimizing Transistor & Power Density
Balanced Layout
Other Challenges …
Test & Debug
Test Challenges
[Figure: Test Capital Per Chip. Test Capital ($), 1.E-04 to 1.E+03, vs. year, 1980 to 2010. Based on SIA roadmap]
Understandable …
[Figure: Test Capital/Transistor. Test Capital ($)/Transistor, 1.E-08 to 1.E-04, vs. year, 1980 to 2010. Based on SIA roadmap]
Disturbing …
On Die Test Methodology
On die debug & test of 8Gb/sec IO interface
[Figure: Voltage (V), -0.25 to 0.25, vs. Time (ps), 0 to 416, with a scale from >1E-4 to <1E-8]
[Figure: On-Die Scope Waveform. Differential Voltage (V), -0.4 to 0.4, vs. Time (ns), 0.0 to 12.5]
• Move from external to on-die “self testing”
• High-speed test & debug hardware on each die
• Low speed, low cost, interface to external tester
ISSCC 2003: 8Gb/s Differential Simultaneous Bidirectional Link with 4mV, 9ps Waveform Capture
Diagnostic Capability
Other Challenges …
Mixed-signal Design
System-level Design
Correctness
Multi-clock domains
Resiliency
Business As Usual Is NOT An Option For CAD…
Summary
BELIEVE
CMOS scaling will continue, transistors become free
SHIFT
Deterministic → Probabilistic, Single → Multi
EMBRACE
local to global optimization: power,…