Dimensionally-Decomposed Router for 3D-NoC*

Dimensionally-Decomposed
Router for 3D-NoC*
J. Kim et al., ISCA 2007
ECE 284
5/14/2013
Presenter: Hyein Lee
*J. Kim et al., "A Novel Dimensionally-Decomposed Router for On-Chip Communication in 3D
Architectures," Proc. ISCA, pp. 138-149, June 2007.
Outline
• Introduction
• Current 3D NoC Architecture
• Dimensionally-Decomposed Router
• Performance Results
ECE 284 / Dimensionally-decoupled 3D-NoC
Introduction: NoC for 3D Chips
• 3D chip technology
• Reduce interconnect delay by stacking multiple layers
• NoC for 3D chips
• The interconnect architecture across the layers must be
designed carefully
• Needs an integrated approach for the network in the 2D plane
and the vertical interconnect
→ Inter-layer interconnect delay is much smaller than
intra-layer delay (inherent asymmetry)
• This work
• A new 3D dimensionally-decomposed router architecture
Current 3D NoC Architecture
• 3D Symmetric NoC Architecture
• 3D NoC-Bus Hybrid Architecture
• True 3D NoC Router
A Generic NoC Router
Current 3D NoC Architecture
• The 3D Symmetric NoC Architecture
• Simple extension to 3D: 5x5 crossbar in typical 2D router +
inter-layer movement (hop-by-hop traversal)
• Does not exploit the low inter-layer distances
• Requires larger crossbars (7x7) → 2X power of 5x5
crossbar
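The 2X figure above is consistent with a first-order estimate in which crossbar cost grows with the number of crosspoints, i.e. quadratically in the port count. The sketch below only illustrates that estimate; the quadratic model and the helper name `crosspoints` are assumptions for illustration, not something stated on the slide.

```python
def crosspoints(ports: int) -> int:
    """Crosspoint count of a full ports x ports crossbar (assumed cost proxy)."""
    return ports * ports


# 7x7 crossbar of the symmetric 3D router vs. the 5x5 crossbar of a 2D router.
ratio_7_vs_5 = crosspoints(7) / crosspoints(5)  # 49 / 25
print(f"7x7 vs 5x5 crosspoint ratio: {ratio_7_vs_5:.2f}")  # prints 1.96
```

Under this simple model the 7x7 crossbar has about twice the crosspoints of the 5x5 one, which lines up with the power ratio quoted on the slide.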
A 3D Symmetric NoC Network
PE : Processing Elements (CPUs, DSPs, etc.)
Current 3D NoC Architecture
• Area and power comparison of various crossbar types
Current 3D NoC Architecture
• The 3D NoC-Bus Hybrid Architecture
• Uses a bus link in the vertical dimension
• However, it does not allow concurrent communication in the
vertical dimension
• The bus is a shared medium; only one flit can use it at a time
Current 3D NoC Architecture
• A True 3D NoC Router
• Add vertical links to allow flexible flit traversal within the 3D
crossbar
• Dedicated connection boxes (CBs) at each layer
• A 3D connection box links the vertical and horizontal
channels
Connection Box
Current 3D NoC Architecture
• A True 3D NoC Router
• Increases path diversity → good for optimization but hard to
control
• Requires excessive control signals
Number of minimal paths between A and B:
3x3x3 crossbar, k=90
4x4x4 crossbar, k=1680
Impact of the number of vertical bundles on performance
→ use of 4 vertical pillars is not efficient
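The k values quoted above match the count of shortest lattice paths between two opposite corners of the crossbar, which is a multinomial coefficient: with dx, dy, dz steps needed along each dimension, k = (dx+dy+dz)! / (dx! dy! dz!). The sketch below checks this, assuming that A and B sit at opposite corners (the slide does not say so explicitly); `minimal_paths` is an illustrative name.

```python
from math import factorial


def minimal_paths(dx: int, dy: int, dz: int) -> int:
    """Number of shortest paths that take dx, dy, dz unit steps per dimension."""
    return factorial(dx + dy + dz) // (factorial(dx) * factorial(dy) * factorial(dz))


# Opposite corners of a 3x3x3 crossbar are 2 steps apart in each dimension.
print(minimal_paths(2, 2, 2))  # 90
# Opposite corners of a 4x4x4 crossbar are 3 steps apart in each dimension.
print(minimal_paths(3, 3, 3))  # 1680
```

The jump from 90 to 1680 when each dimension grows by one step illustrates why path selection in a true 3D crossbar becomes hard to control.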
Dimensionally-Decomposed Router: 2D RoCo
• 2D Row-Column (RoCo) Decoupled Router
• Incoming traffic is decomposed into two independent
streams (Guided Flit Queuing*)
• X dimension (East-West traffic) and Y dimension (North-South traffic)
• Use two smaller 2x2 crossbars instead of 5x5
*J. Kim et al., "A Gracefully Degrading and Energy-Efficient Modular Router Architecture for On-Chip Networks," Proc.
ISCA, 2006.
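As a rough illustration of the row/column decomposition, the sketch below steers a flit into the row or column module according to the direction it will leave the router in. It assumes deterministic XY routing and a coordinate convention in which East and North are the increasing x and y directions; the function names are hypothetical and this is not the router's actual queuing logic.

```python
from typing import Tuple

Coord = Tuple[int, int]  # (x, y) position of a router in the 2D mesh


def xy_next_direction(cur: Coord, dst: Coord) -> str:
    """Deterministic XY routing: finish the X dimension before moving in Y."""
    (cx, cy), (dx, dy) = cur, dst
    if dx != cx:
        return "E" if dx > cx else "W"
    if dy != cy:
        return "N" if dy > cy else "S"
    return "LOCAL"


def guided_module(cur: Coord, dst: Coord) -> str:
    """Pick the module (row, column, or ejection) that should queue the flit."""
    direction = xy_next_direction(cur, dst)
    if direction in ("E", "W"):
        return "row"      # East-West traffic uses the row module's 2x2 crossbar
    if direction in ("N", "S"):
        return "column"   # North-South traffic uses the column module's 2x2 crossbar
    return "eject"        # flit has reached its destination router


print(guided_module((0, 0), (2, 1)))  # row
print(guided_module((2, 0), (2, 3)))  # column
```

Because each stream only ever needs to continue straight or hand over to the other dimension, two small crossbars can stand in for one 5x5 crossbar.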
Dimensionally-Decomposed Router: 3D DimDe
• 2D RoCo + vertical links
• Segmented vertical links are integrated into
row/column module
Dimensionally-Decomposed Router: 3D DimDe
• Vertical link arbitration
• 2-stage arbitration (in deterministic XYZ routing)
• Local arbitration (intra-layer) → global arbitration (inter-layer)
• Concurrent communication is needed to increase the
bandwidth
• 3D global arbitration request scenarios
Combining scenarios (1), (5), and (9) allows fully
concurrent communication
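The idea that a segmented vertical link can carry several transfers at once can be sketched with a simplified model: each request occupies the link segments between its source and destination layers, and requests whose segments do not overlap can be granted together. The greedy fixed-priority policy, the request format, and the function names below are assumptions for illustration; they do not reproduce the numbered scenarios from the slide's figure.

```python
from typing import List, Set, Tuple

Request = Tuple[int, int]  # (source layer, destination layer), one winner per layer


def segments(req: Request) -> Set[int]:
    """Segments used by a request; segment i joins layer i and layer i + 1."""
    lo, hi = sorted(req)
    return set(range(lo, hi))


def global_arbitrate(requests: List[Request]) -> List[Request]:
    """Grant, in priority order, every request whose segments are still free."""
    granted: List[Request] = []
    busy: Set[int] = set()
    for req in requests:  # requests are assumed to have won local arbitration
        need = segments(req)
        if need and not (need & busy):
            granted.append(req)
            busy |= need
    return granted


# Three neighbour-to-neighbour transfers in a 4-layer stack use disjoint segments.
print(global_arbitrate([(0, 1), (1, 2), (2, 3)]))  # [(0, 1), (1, 2), (2, 3)]
# A transfer spanning the whole stack blocks everything else on this bundle.
print(global_arbitrate([(0, 3), (1, 2)]))          # [(0, 3)]
```

In this model a shared bus corresponds to treating the whole pillar as one segment, which is why it can serve only one transfer at a time.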
Dimensionally-Decomposed Router: 3D DimDe
• Guided Flit Switching
• Decomposes incoming flits into x, y, and z directions
• Early Ejection Mechanism
• Enables flits to bypass the destination router and be
ejected to the NIC
• Vertical module
• Two bidirectional vertical bundles
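To make the x/y/z decomposition concrete, the sketch below traces a flit under deterministic XYZ routing and labels each hop with the module that would handle it. It assumes that a vertical transfer reaches the destination layer in a single hop over the vertical bundle; that assumption, the coordinate format, and the name `xyz_route` are illustrative rather than taken from the slides.

```python
from typing import List, Tuple

Coord = Tuple[int, int, int]  # (x, y, layer)


def xyz_route(src: Coord, dst: Coord) -> List[Tuple[str, Coord]]:
    """Return (module, router reached) for each hop under XYZ routing."""
    hops: List[Tuple[str, Coord]] = []
    cur = list(src)
    # In-plane movement: hop by hop, X (row module) first, then Y (column module).
    for axis, module in ((0, "row"), (1, "column")):
        step = 1 if dst[axis] > cur[axis] else -1
        while cur[axis] != dst[axis]:
            cur[axis] += step
            hops.append((module, (cur[0], cur[1], cur[2])))
    # Vertical movement: assumed to take one hop regardless of layer distance.
    if cur[2] != dst[2]:
        cur[2] = dst[2]
        hops.append(("vertical", (cur[0], cur[1], cur[2])))
    return hops


for module, router in xyz_route((0, 0, 0), (2, 1, 3)):
    print(module, router)
# row (1, 0, 0)
# row (2, 0, 0)
# column (2, 1, 0)
# vertical (2, 1, 3)
```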
Performance Results
• The proposed method remains within 5% on average of the
performance of the full 3D crossbar (ideal)
• 20% improvement in latency over the other 3D crossbars (excluding
the full 3D crossbar); 26% reduction in energy-delay product
THANK YOU!