Power and Slew-aware Clock Network Design for
Through-Silicon-Via (TSV) based 3D ICs
Xin Zhao and Sung Kyu Lim
School of Electrical and Computer
Engineering
Georgia Institute of Technology
Atlanta, Georgia, U.S.A.
Reference: the slides for this paper from ASP-DAC 2010
Outline
Motivation and contribution
Modeling and synthesis of 3D clock tree
Experimental result and discussions
Conclusion
Motivation
Clock skew is required to be less than
3%-4% of the clock period in an aggressive clock network design according to
ITRS projection.
The clock network drives a large capacitive load and switches at a
high frequency, so an increasingly large proportion of the total
system power is dissipated in the clock distribution network
Motivation
TSV provides the vertical interconnection
to deliver the clock signal to all dies in
the 3D stack
In general, the total wirelength of the 3D
clock network decreases significantly if
more TSVs are used
Too many TSVs often cause routing congestion and yield reduction problems
Contribution
Three major goals
• Clock skew minimization
• Clock slew control
• Clock power reduction
SPICE simulation is used to evaluate the results
Contribution
Investigate the impact of design techniques on 3D clock networks
Electrical model
• Wire
• TSVs
• Clock buffer
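As a rough illustration of how such an electrical model feeds delay estimation, the sketch below computes the Elmore delay of a series path made of lumped RC elements. It assumes each wire segment and each TSV is represented as a pi-model (half of its capacitance at each end of its resistance); the slides do not give the exact model or parameter values, so the `Segment` type, the helper names, and all numbers are hypothetical. Clock buffers are left out of this sketch.

```python
from dataclasses import dataclass


@dataclass
class Segment:
    """One lumped RC element on a source-to-sink path (hypothetical type)."""
    resistance: float   # ohms
    capacitance: float  # farads


def wire(length_um, r_per_um, c_per_um):
    """Lump a wire of the given length into a single RC segment."""
    return Segment(r_per_um * length_um, c_per_um * length_um)


def elmore_delay(segments, load_cap):
    """Elmore delay of a series chain of pi-model segments driving load_cap.

    Segments are listed from the driver side to the sink side.
    """
    delay = 0.0
    downstream = load_cap
    for seg in reversed(segments):
        # The segment's resistance charges half of its own capacitance
        # plus all capacitance further downstream.
        delay += seg.resistance * (seg.capacitance / 2 + downstream)
        downstream += seg.capacitance
    return delay


# Illustrative values only (not taken from the paper).
path = [wire(100, 0.1, 0.2e-15), Segment(1.0, 30e-15)]  # wire, then a TSV
print(elmore_delay(path, load_cap=10e-15))
```

Treating a TSV as just another lumped segment keeps the path evaluation uniform, which is convenient when a path crosses several dies.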
TSV usage
TSV upper bound
• Maximum number of TSVs allowed between adjacent dies
• Set according to yield and routing considerations
TSV count
• The number of TSVs actually used
• Using stacked TSVs
Simple sample clock tree
Problem formulation
Input
• Sink set (N dies), clock source location
• Upper bound of TSV usage
• Slew constraint
Output
• Zero-Elmore-skew 3D clock tree
Problem formulation
Objective
• Minimize wirelength, clock power
Constraint
• Zero-Elmore-skew
• Maximum slew
• Upper bound of TSV usage
3D clock routing algorithm flow
What is MMM?
Method of Means and Medians
• Jackson, Srinivasan, Kuh, “Clock routing for high-performance ICs,” DAC, 1990.
Each clock pin is represented as a point in the region, S.
The region is partitioned into two subregions, SL and SR.
The center of mass is computed for each subregion.
The center of mass of the region S is connected to each of the
centers of mass of subregion SL and SR.
The subregions SL and SR are then recursively split in the Y-direction.
The above steps are repeated with alternate splitting in X- and
Y-direction.
Time complexity: O(n log n).
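The MMM steps above can be sketched as a short recursive routine. This is a minimal illustration, not the original authors' code: it assumes sinks are 2D points, splits at the median with the cut direction alternating between X and Y, and records the center of mass of each region as the tree node. The dictionary layout and function names are choices made for this example. Re-sorting at every level keeps the sketch short but does not achieve the O(n log n) bound quoted above.

```python
def center_of_mass(points):
    """Mean x and mean y of a non-empty list of (x, y) points."""
    n = len(points)
    return (sum(p[0] for p in points) / n, sum(p[1] for p in points) / n)


def mmm(sinks, split_x=True):
    """Build an abstract clock tree by the Method of Means and Medians.

    Each node is {"point": (x, y), "children": [...]}; an edge from a
    node to a child stands for a connection between centers of mass.
    """
    node = {"point": center_of_mass(sinks), "children": []}
    if len(sinks) > 1:
        axis = 0 if split_x else 1
        ordered = sorted(sinks, key=lambda p: p[axis])
        mid = len(ordered) // 2  # median cut: two halves of (nearly) equal size
        node["children"] = [
            mmm(ordered[:mid], not split_x),   # subregion SL
            mmm(ordered[mid:], not split_x),   # subregion SR
        ]
    return node


def count_leaves(node):
    """Number of sinks reachable from this node."""
    if not node["children"]:
        return 1
    return sum(count_leaves(child) for child in node["children"])


tree = mmm([(0, 0), (0, 4), (4, 0), (4, 4)])
print(tree["point"], [child["point"] for child in tree["children"]])
```

For the four corner sinks used here, the root sits at the center of the square and the first cut separates the left and right columns.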
An MMM example
3D abstract tree
3D abstract tree (cont.)
An N-colored binary tree
3D-MMM
Suppose the TSV upper bound is 3 and the
clock source is on die-0
3D-MMM
Slew-aware buffering, merging
and embedding
Merging and slew-aware buffering, embedding
• 3D clock tree with multiple TSVs
• Using deferred-merge embedding (DME)
Slew-aware buffering, merging
and embedding (cont.)
For the purpose of
• Skew control
• Shorter wirelength
That is
• Zero skew in the Elmore delay model
• Minimize the clock power consumption
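To make the zero-skew goal concrete, the sketch below shows the classic zero-skew merge of two subtrees under the Elmore delay model: given each subtree's delay and load capacitance and the length of the wire joining them, it finds the tap point at which both sides see equal delay. This is the standard closed-form balance for a uniform wire, shown here as an illustration; it does not include TSVs, buffers, or slew control, and it is not claimed to be the exact merging procedure used in the paper. All names and numeric values are hypothetical.

```python
def wire_delay(length, load_cap, r, c):
    """Elmore delay of a uniform wire (r, c per unit length) driving load_cap."""
    return r * length * (c * length / 2 + load_cap)


def zero_skew_split(t1, cap1, t2, cap2, length, r, c):
    """Fraction x of the merging wire placed on the subtree-1 side.

    Solves t1 + wire_delay(x * L, cap1) == t2 + wire_delay((1 - x) * L, cap2)
    for x, where L = length (> 0). A result outside [0, 1] means the wire
    must be elongated (snaked), which this sketch does not handle.
    """
    numerator = t2 - t1 + r * length * (cap2 + c * length / 2)
    denominator = r * length * (cap1 + cap2 + c * length)
    x = numerator / denominator
    if not 0.0 <= x <= 1.0:
        raise ValueError("wire detour needed; not handled in this sketch")
    return x


# Illustrative values only: subtree 2 is more heavily loaded,
# so the tap point moves toward it (x > 0.5).
r, c, L = 0.1, 0.2e-15, 1000.0
x = zero_skew_split(0.0, 10e-15, 0.0, 50e-15, L, r, c)
d1 = wire_delay(x * L, 10e-15, r, c)
d2 = wire_delay((1 - x) * L, 50e-15, r, c)
print(x, d1, d2)
```

In a DME-style flow this balance would be evaluated bottom-up for every pair of subtrees being merged, before the merge points are embedded.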
3D clock tree
Unique property of 3D clock tree
• A complete tree + many sub-trees
About clock source location
In general, placing the clock source on the middle die
reduces both #TSVs and wirelength
About clock source location –
Theoretical max TSV usage
Suppose M clock sinks are evenly distributed
over N dies and the clock source is located on
die-s
Experimental result
Sample clock tree of IBM r5
benchmark in 6-die
With TSV upper bound : 20
Impact of TSV bound
Point A: 20% power saving, TSV bound ≥ 70% of #sinks
die
TSV bound and slew distribution
r5, six-die, CMAX = 300 fF
• Slew range [11.4 ps, 86.2 ps], avg. 53.9 ps, #Bufs: 2933
• Slew range [10.9 ps, 79.6 ps], avg. 42.6 ps, #Bufs: 2638
Multi-TSV vs. Single-TSV: 4-die stack
Multi-TSV vs. Single-TSV: 6-die stack
Skew comparison
CMAX and slew
Using single TSV
Using multiple TSVs
Impact of clock source location
on power and wirelength
A uses 33% fewer
TSVs than B
Statistics of TSV count and
the clock source location
#TSVs = 3720
#TSVs = 2791
Conclusions
Provided SPICE simulation results
Using multiple TSVs helps to reduce
wirelength and power. Multi-TSV also gives
better control over slew variations
Smaller CMAX effectively lowers the clock slew
Clock source location also affects the 3D clock
network in a significant way: placing the clock
source on the middle die helps reduce slew
and TSV usage under the same power
budget