Transcript Slide 1

Reading assignment presentations for
EN0291 S40
“Effect of increasing chip density on the evolution of computer architectures,” IBM J. Res. & Dev., Vol. 46. → Brendan
“Repeater scaling and its impact on CAD,” IEEE Trans. on CAD, Vol. 23(4). → Elif
“SOI technology for the GHz era,” IBM J. Res. & Dev., Vol. 46. → Cesare
“Turning Silicon on Its Edge,” IEEE Circuits & Devices Magazine, Jan/Feb ’04. → Yiwen
1
Effect of Increasing Chip Density on the
Evolution of Computer Architectures
R. Nair
IBM Journal of Research and Development
Volume 46 Number 2/3 March/May 2002
2
International Roadmap for Semiconductors
(1999)
3
A Billion Transistors on a Chip
• “What functions will be
expected of billion-transistor chips, and how
will they be organized?”
• Move Memory closer to
the processors (physically
speaking)
• System on a Chip –
integration on the same
chip of varied structures
such as processors,
DRAM, sensors, and
transducers
4
Processor Evolution
• New generations
depending on
prediction
algorithms
• Performance benefit
decreasing
• Sometimes simpler
is better!
5
The Current Techniques (Benchmarks)
• Increasing pipeline depth and frequency
• Fewer applications responding well
6
Cellular Architectures
• Little communication overhead between
threads
• Connectionist architecture – large number
of processors with little memory
• Advantages
– Off-the-shelf commodity parts
– Use existing compilers
– Possibilities of redundancy
7
System-on-a-Chip
• Integrate functions that are outside
processor
• Reduce communication costs between
elements
• Current state – performance decrease
when combining technologies on one die
• Help with clock skew
8
Conclusions
• Convergence of processors
• Less focus on more computation power
• Scalable, distributed computing
9
The Scaling Challenge:
Can Correct-by-Construction Design Help?
P. Saxena
N. Menezes
P. Cocchini
D. A. Kirkpatrick
Intel Labs. (CAD Research)
EN0291
Elif Alpaslan
10
Introduction:
FROM LAST LECTURE:
• CMOS scaling in VLSI chips brings new design concerns:
• increasing dominance of interconnects
• leakage
• In this paper: results of scaling studies in the context of typical block-level
wiring distributions, and a study of the impact of the identified
trends on the post-RTL design process.
• Goal of the paper: to show how the exponentially increasing repeater and
clocked-repeater counts will affect logic synthesis,
technology mapping, and layout, and to point out new research problems
relevant to future designs.
11
Some Scaling Experiments (SPICE)
Critical repeater length (CRL)
Min. distance at which inserting
a repeater speeds up a line
CRL for M3 & M6 shrinks at a rate of 0.57x per
generation, faster than the normal scaling of 0.7x
→ Additional repeaters need to be added when
shrinking an optimally repeated interconnect
from one process generation to the next.
Critical sequential length (CSL)
Max. distance that a signal can travel
in an optimally sized and buffered
interconnect in 1 cycle
CSL shrinks at a rate of 0.43x per generation, faster than
both normal scaling and the rate of decrease in CRL
→ Ideally shrunk interconnects will not only
require additional repeaters; many of
those repeaters will need to be clocked
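The two shrink rates on this slide can be turned into a rough repeater count per wire. The following is a minimal sketch, assuming the wire itself shrinks at the normal 0.7x per generation. The starting values (a 10 mm wire, a 2 mm CRL, a 10 mm CSL) and the function names are hypothetical, chosen only for illustration, and are not taken from the paper.

```python
import math

WIRE_SHRINK = 0.7   # normal per-generation scaling (from the slide)
CRL_SHRINK = 0.57   # critical repeater length shrink per generation (from the slide)
CSL_SHRINK = 0.43   # critical sequential length shrink per generation (from the slide)


def segments_needed(wire_len, critical_len):
    """Number of segments so that no segment is longer than critical_len."""
    return max(1, math.ceil(wire_len / critical_len))


def repeater_trend(wire_len0, crl0, csl0, generations):
    """Return (generation, repeaters, clocked_repeaters) for each generation.

    wire_len0, crl0 and csl0 are hypothetical starting lengths in the same unit.
    """
    rows = []
    for g in range(generations + 1):
        wire = wire_len0 * WIRE_SHRINK ** g
        crl = crl0 * CRL_SHRINK ** g
        csl = csl0 * CSL_SHRINK ** g
        repeaters = segments_needed(wire, crl) - 1
        clocked = segments_needed(wire, csl) - 1
        rows.append((g, repeaters, clocked))
    return rows


if __name__ == "__main__":
    for g, repeaters, clocked in repeater_trend(10.0, 2.0, 10.0, 4):
        print(f"generation {g}: {repeaters} repeaters, {clocked} clocked")
```

Because the wire shrinks by 0.7x while CRL shrinks by 0.57x, the ratio of wire length to CRL grows by about 0.7/0.57 ≈ 1.23x per generation. For CSL the ratio grows by about 0.7/0.43 ≈ 1.63x, which is why clocked repeaters appear on wires that previously needed none.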
12
Impact of decrease in critical repeater and
sequential lengths
How do CRL & CSL migrate across the block-level wiring histogram?
• # of nets requiring repeaters ~ area under the histogram curve to the right of the line
representing the critical length
• Leftward migration of the critical length
→ exponentially increasing # of nets
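The "area to the right of the critical length" idea can be sketched as a simple count over a wire-length histogram. This is a minimal illustration, not the paper's method. The histogram below is made-up data in which the net count halves for each extra unit of length, and the name `impacted_nets` is hypothetical.

```python
def impacted_nets(histogram, critical_length):
    """Count nets whose length exceeds critical_length.

    histogram maps a net length (arbitrary units) to the number of nets
    of that length; the count is the discrete 'area to the right'.
    """
    return sum(count for length, count in histogram.items()
               if length > critical_length)


# Hypothetical block-level wiring histogram: many short nets, few long ones.
HISTOGRAM = {1: 8000, 2: 4000, 3: 2000, 4: 1000, 5: 500, 6: 250, 7: 125}

if __name__ == "__main__":
    # Move the critical length to the left, one unit at a time.
    for critical in (6, 5, 4, 3):
        print(critical, impacted_nets(HISTOGRAM, critical))
```

With this made-up distribution, each one-unit leftward move of the critical length roughly doubles the number of impacted nets, which is the qualitative behavior the slide describes.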
13
Impact of decrease in critical repeater and
sequential lengths (cont.)
Block Level Wiring Histogram (Zoomed View)
Increasingly steep slope of curve (log scale on y-axis) => # impacted nets
exploding!
14
Impact of decrease in critical repeater and
sequential lengths (cont.)
• Percentage of block-level nets
impacted by repeaters
• Percentage of block-level nets
impacted by clocked repeaters
15
Impact on POST-RTL CAD
• Logic Synthesis and Technology Mapping:
– Metrics that drive the capacitive load of wires are the traditional literal- or gate-count and fanout-based wire
load metrics, and they don't take interconnect repeaters into account
→ Fanout-based metrics can be misleading because a repeater isolates some sinks of an interconnect from its driver
– The gate-count metric can lead to wrong heuristic choices during the early stages of synthesis, since more
delay migrates to repeated interconnects.
– The amount of logic available in a single pipeline stage shrinks
– The maximum possible benefit of a good logic synthesis solution is reduced
16
Impact on POST-RTL CAD
• Placement and Routing:
– Repeaters have their biggest impact on the placement stage
– # of repeaters required by an interconnect is strongly dependent on the placement of cells
• Current Placement Algorithms:
– Handle repeater insertion by reserving a certain fraction of block area for repeaters prior to placement
and then inserting them into long nets using ECOs after placement
– The ECO technique breaks down when more than 5-10% of the nets in the netlist change
– Block-level placement algorithms at any level have to deal with the complications arising from
repeater requirements for nets at other levels of the hierarchy.
– When CSL shrinks below the dimensions of a synthesizable block, placement algorithms have to
handle clocked repeater insertion, which is not as straightforward as buffering.
• Routing:
– Routers can't operate in a purely geometric world; they must understand buffering
– Complications due to the large number of via blockages
– Complications due to multi-pin nets
SOI technology for the GHz era
(by G. G. Shahidi – 2002 IBM)
• Presented by Cesare Ferri
18
SOI technology for the GHz era
(by G. G. Shahidi – 2002 IBM)
• Silicon-on-Insulator (SOI) : Technology
Introduction
• Brief History
• SOI vs. Bulk (power, performance, scaling)
• Applications
• Future Trends
19
SOI - Introduction
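→ SOI: process technology
→ Basic idea: placing a thin layer of insulator upon the substrate
[Figure: bulk MOSFET cross-section with terminals G, S, D, B; n+ source/drain, poly-Si gate over SiO2, p-substrate, junction capacitances CSB and CDB. Annotation: a lot of capacitance here (i.e. slow).]
[Figure: SOI MOSFET cross-section with a buried oxide layer (TBOX SiO2) between the n+ source/drain and the p-substrate. Annotation: no capacitance here (i.e. fast).]
Less junction-capacitance area → smaller capacitance of the switch → faster transistor!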
20
SOI : video (© IBM 2002)
21
SOI : Brief History
• First developed by IBM in early 70s
• Not viable until the 1990s (expensive process,
progress in bulk CMOS by scaling)
• FD(fully depleted)–SOI vs. PD(partially
depleted)–SOI
• IBM Fabrication technique : SIMOX
(Separation by Implantation of Oxygen)
Implant oxygen
Annealing
22
SOI vs. Bulk
• Pros:
– Less capacitance (~9-25%), no body effect (floating
body, Vbs > 0)
– Same frequency at lower VDD → lower power
– Reduced short-channel effects (higher doping
concentrations)
– No latch-up → layout simplicity (no wells, plugs, …)
– Same scaling rules as bulk
• Cons:
– History-dependent timing (floating body)
– Floating body (Vsb) → reduced effective VT
(= F(Vsb)) → higher off current, Ioff (note: on the other
hand, we are decreasing VDD, so Ioff is the same as
in bulk)
– Self heating (the channel is isolated from the bulk)
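Two of the points above can be made concrete with textbook first-order formulas. The sketch below is only an illustration under stated assumptions. It uses the standard dynamic-power relation P = a·C·VDD²·f and the standard bulk body-effect expression VT = VT0 + γ(√(2φF + Vsb) − √(2φF)). The default parameter values (VT0 = 0.4 V, γ = 0.4 V^0.5, 2φF = 0.7 V) and the function names are hypothetical, not values from the paper.

```python
import math


def relative_dynamic_power(cap_reduction, vdd_soi, vdd_bulk):
    """SOI dynamic power relative to bulk at the same frequency and activity.

    Uses P = a * C * VDD^2 * f, so a and f cancel in the ratio.
    cap_reduction is the fractional capacitance saving (e.g. 0.09 to 0.25).
    """
    return (1.0 - cap_reduction) * (vdd_soi / vdd_bulk) ** 2


def threshold_voltage(vsb, vt0=0.4, gamma=0.4, two_phi_f=0.7):
    """First-order body-effect threshold voltage (textbook bulk expression).

    vsb < 0 (i.e. Vbs > 0, as in a charged floating body) lowers VT.
    All default values are illustrative assumptions.
    """
    if two_phi_f + vsb <= 0:
        raise ValueError("2*phi_F + Vsb must be positive")
    return vt0 + gamma * (math.sqrt(two_phi_f + vsb) - math.sqrt(two_phi_f))


if __name__ == "__main__":
    # 25% less capacitance and a hypothetical VDD drop from 1.2 V to 1.0 V.
    print(relative_dynamic_power(0.25, 1.0, 1.2))
    # Floating body charged to Vbs = 0.2 V (Vsb = -0.2 V): VT drops below VT0.
    print(threshold_voltage(-0.2))
```

The first function shows why less capacitance combined with a lower VDD reduces power at the same frequency. The second shows the direction of the floating-body effect on VT, which is what drives the higher off current listed under the cons.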
23
SOI : Applications & Future Trends
• High Performance processors (servers,
Cell Processors, XBOX360..)
• Low-Power Devices (MPSoC)
• Wireless Technology (high-resistivity
substrate → less crosstalk)
• spacecraft, satellites and military
electronics (less sensitive to alpha
radiation)
24
TURNING SILICON ON ITS EDGE
Overcoming silicon scaling barriers with
double-gate and FinFET technology
by Edward J. Nowak, Ingo Aller, Thomas Ludwig, Keunwoo Kim,
Rajiv V. Joshi, Ching-Te Chuang, Kerry Bernstein, and Ruchir Puri
• Presenter: Yiwen Shi
25
Overcoming Obstacles by Doubling Up
Two dominant barriers
to further CMOS
scaling:
• Subthreshold leakage
• Gate-dielectric leakage
Double-gate (DG) FET
→ Reduces drain-induced barrier lowering (DIBL)
→ Improves subthreshold swing (S)
→ Lower threshold voltage for a given off-current
→ Higher drive current at lower power-supply voltage
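The link between subthreshold swing and threshold voltage can be sketched with the usual first-order leakage model, Ioff = I0 · 10^(−VT/S), where I0 is the drain current at VGS = VT. This is an assumption of the sketch, not a formula quoted from the paper, and the numeric values and the name `required_vt` are hypothetical.

```python
import math


def required_vt(i0, ioff_target, swing):
    """Threshold voltage needed to meet an off-current target.

    Assumes Ioff = i0 * 10 ** (-VT / swing), with swing in volts per decade.
    A smaller (better) swing allows a lower VT for the same Ioff.
    """
    return swing * math.log10(i0 / ioff_target)


if __name__ == "__main__":
    i0, ioff = 1e-6, 1e-9  # hypothetical currents, three decades apart
    for swing in (0.090, 0.060):  # degraded swing vs. near-ideal swing
        print(f"S = {swing * 1000:.0f} mV/dec -> VT = {required_vt(i0, ioff, swing):.3f} V")
```

Under this model, improving S from 90 mV/decade to 60 mV/decade lowers the VT needed for the same off-current, which leaves more gate overdrive (VDD − VT) at a given supply voltage.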
26
Centering of Double-gate Threshold Voltage
• Body doping
→ Halo doping
• Asymmetric gate work function
→ Two gate electrodes of differing work functions,
e.g. degenerately doped n+ & p+ polysilicon
• Symmetric mid-gap work-function gate electrodes
→ Metal gates,
e.g. nickel silicide
27
Double-gate Taxonomy
Planar DG
→ Vertical DG
→ FinFET
→ Four significant obstacles
28
FinFET-DGCMOS Process Flow
& Circuit Demonstration
A ring of 60 inverters with a single two-way NAND
• Demonstration of DGCMOS
– static operation: to prove that the device
parametrics can all be centered to the
practical values demanded for VLSI (W, L, T)
– transient operation: to prove that the numerous
parasitic elements that can degrade circuit
performance can be tamed (inverter delay)
• Achieves numerous landmarks
• May indeed prove manufacturable
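A ring oscillator like the one on this slide is commonly used to extract per-stage delay from a measured oscillation period, using T = 2·N·t_d for N inverting stages. The sketch below assumes the NAND counts as one stage, giving N = 61, and the period value is a hypothetical number, not a measurement from the paper.

```python
def stage_delay(period_s, n_stages=61):
    """Average per-stage delay of a ring oscillator.

    Each edge must travel around the ring twice per period,
    so period = 2 * n_stages * delay. n_stages must be odd
    for the ring to oscillate.
    """
    if n_stages % 2 == 0:
        raise ValueError("a ring oscillator needs an odd number of inverting stages")
    return period_s / (2 * n_stages)


if __name__ == "__main__":
    # Hypothetical measured period of 1.22 ns for the 61-stage ring.
    print(stage_delay(1.22e-9))
```

The odd stage count explains the single NAND in the ring: 60 inverters alone would not oscillate, and the NAND's second input can serve as an enable.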
29
Microprocessor Design with FinFETs
Sidewall image transfer (SIT)
(a converted six-transistor-SRAM cell)
Potential for double-gate applications
a. Low-power design
b. Variable threshold CMOS
c. Simplified logic gates
Still a lot of challenges…
Overall, promising!
Device width quantization
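Device width quantization follows from the fin geometry: in a double-gate FinFET each fin contributes roughly twice the fin height to the effective channel width, so widths come only in whole-fin steps. The sketch below uses that first-order relation. The fin height and target width are hypothetical, and the function name is not from the paper.

```python
def quantized_width_nm(target_width_nm, fin_height_nm):
    """Return (fins, effective_width_nm) for a double-gate FinFET.

    Assumes each fin contributes 2 * fin_height of channel width
    (two sidewall gates); the fin count is rounded up so the target
    width is met or exceeded. Integer nanometers avoid float rounding.
    """
    width_per_fin = 2 * fin_height_nm
    fins = -(-target_width_nm // width_per_fin)  # ceiling division
    return fins, fins * width_per_fin


if __name__ == "__main__":
    # A 500 nm planar-equivalent width with hypothetical 60 nm tall fins.
    print(quantized_width_nm(500, 60))
```

The designer cannot get exactly 500 nm here, only multiples of 120 nm, which matters most for small, ratio-sensitive circuits such as the SRAM cell mentioned on this slide.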
30