Transcript PowerPoint

FlexiBuffer: Reducing Leakage Power in On-Chip Network Routers
Gwangsun Kim, John Kim
Dept. of Computer Science
Korea Advanced Institute of Science and Technology
Sungjoo Yoo
Dept. of Electronic and Electrical Engineering
Pohang University of Science and Technology
Motivation
 The on-chip network is becoming more critical.
 Buffer size has a huge impact on performance.
 Buffers take a large portion of router power.
 However, not all of the buffers are fully utilized even at a high load.
[Figure: average latency (cycles) vs. injection rate (flits/node/cycle) for different buffer sizes]
[Figure: Router Power Breakdown: allocator 3%, clock 16%, crossbar switch 35%, input buffer 46% [Kumar et al., ICCD’07]]
[Figure: average utilization of buffers vs. injection rate (flits/node/cycle)]
Use power-gating and turn off unused entries!
Our Approach
 Dynamically adjust the active window size.
• Active window: the set of ON (active) entries of a buffer.
[Figure: the active window of a buffer at a low traffic load (few ON entries holding flits, the remaining entries OFF) and at a high traffic load (more ON entries holding flits)]
Issue 1: Flow Control
 Need to communicate the availability of buffers
Case 1: Increase the active window size using early credit
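[Figure: Router 0, Router 1, and Router 2 connected by flit and credit channels; an OFF buffer entry is turned ON and the credit counter (CR) is updated by an early credit]
 When?
→ There is an incoming flit.
→ There is an OFF buffer entry.
→ There is congestion in both the upstream and the local router.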
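The three conditions above can be read as a single predicate. The following is a minimal sketch in Python, not the authors' implementation: `VCBuffer` and `try_early_credit` are hypothetical names, and the two congestion signals are taken as plain booleans because the slides do not say how congestion is detected.

```python
from dataclasses import dataclass


@dataclass
class VCBuffer:
    """Hypothetical model of one virtual-channel buffer (names are illustrative)."""
    depth: int       # total number of entries
    on_entries: int  # current active window size

    @property
    def off_entries(self) -> int:
        return self.depth - self.on_entries


def try_early_credit(buf: VCBuffer, incoming_flit: bool,
                     upstream_congested: bool, local_congested: bool) -> bool:
    """Grow the active window by one entry if all Case 1 conditions hold.

    Returns True when an early credit should be sent upstream.
    """
    if (incoming_flit and buf.off_entries > 0
            and upstream_congested and local_congested):
        buf.on_entries += 1  # power on one OFF entry
        return True
    return False


buf = VCBuffer(depth=8, on_entries=2)
print(try_early_credit(buf, True, True, True), buf.on_entries)   # True 3
print(try_early_credit(buf, True, True, False), buf.on_entries)  # False 3
```

The second call leaves the window unchanged because only the upstream router is congested in that example.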
Issue 1: Flow Control (cont’d)
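Case 2: Decrease the active window size by withholding credit
[Figure: Router 0, Router 1, and Router 2 connected by flit and credit channels; a flit leaves the buffer and the credit for the freed entry is not returned to the upstream credit counter (CR)]
 When?
→ There is an outgoing flit.
→ There are more than the minimum number of ON entries.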
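The withholding rule can be sketched the same way. This Python fragment is an illustration under assumptions: `on_flit_departure` and `min_on_entries` are hypothetical names, and the two conditions listed on the slide are treated as sufficient, although the actual policy may take more signals into account.

```python
from typing import Tuple


def on_flit_departure(on_entries: int, min_on_entries: int) -> Tuple[int, bool]:
    """Handle one outgoing flit. Returns (new_on_entries, credit_returned)."""
    if on_entries > min_on_entries:
        # Withhold the credit and power off the freed entry:
        # the upstream router now sees one fewer usable slot.
        return on_entries - 1, False
    # Already at the minimum window: return the credit as usual.
    return on_entries, True


window = 4
for _ in range(5):
    window, credit = on_flit_departure(window, min_on_entries=2)
    print(window, credit)
```

In this run the window shrinks from 4 to 2 over the first two departures, after which credits are returned normally and the window stays at the minimum.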
Issue 2: Circular Queue Problem
 When utilization is low, each incoming flit turns on an entry.
→ Each activation of an entry incurs power overhead!
 Problematic circular buffer
• Each flit activates an entry.
[Figure: circular buffer in which FLIT 0 through FLIT 4 each land in a different entry, turning that entry from OFF to ON]
Large power overhead
 Ideal buffer management
• The same entry is reused.
[Figure: FLIT 0 through FLIT 4 all reuse the same ON entry while the other entries stay OFF]
No power overhead
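The difference between the two cases can be made concrete by counting OFF-to-ON transitions. The sketch below is a toy model in Python with hypothetical function names. It assumes low utilization (each flit leaves before the next one arrives), that a drained entry is gated off immediately in the circular case, and that the ideal case keeps one entry permanently ON as the minimum active window.

```python
def circular_activations(depth: int, n_flits: int) -> int:
    """Count OFF->ON transitions when the tail pointer advances on every write."""
    on = [False] * depth
    tail = 0
    activations = 0
    for _ in range(n_flits):
        if not on[tail]:
            on[tail] = True       # wake the entry for the incoming flit
            activations += 1
        on[tail] = False          # flit leaves; the drained entry is gated off
        tail = (tail + 1) % depth
    return activations


def reuse_activations(n_flits: int) -> int:
    """Count OFF->ON transitions when one always-ON entry is reused."""
    entry_on = True               # assumed to be the minimum active window
    activations = 0
    for _ in range(n_flits):
        if not entry_on:
            entry_on = True
            activations += 1
    return activations


print(circular_activations(depth=8, n_flits=5))  # 5
print(reuse_activations(n_flits=5))              # 0
```

Under these assumptions the circular buffer pays one activation per flit, while reusing the same entry pays none.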
Split Queue
 A buffer is separated into two regions.
 Use the primary region only (as long as possible).
 Adjust the active window size dynamically.
[Figure: unified mode. The primary region operates like a circular queue, with its entries turned ON as flits arrive; the secondary region stays OFF and is not used]
Split Queue (cont’d)
 Cannot stay in the unified mode indefinitely.
→ When the next flit's place in the primary region is not available, yet there are unused entries, switch to split mode.
• In split mode, flits are read out from the primary region and written to the secondary region.
 When the primary region is empty,
→ Switch back to unified mode.
[Figure: in unified mode the primary region fills with flits while the secondary region is OFF; in split mode new flits turn ON entries in the secondary region while earlier flits are read out of the primary region]
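The mode switching on these two slides can be sketched as a small FIFO model. This Python class is a simplified illustration, not the authors' design: `SplitQueue` and its parameters are hypothetical, each region is modeled with a `deque` instead of explicit read and write pointers, power gating is not modeled, and the swap of region roles on returning to unified mode is an assumption that the slides do not spell out.

```python
from collections import deque


class SplitQueue:
    """Toy FIFO with a primary and a secondary region (sizes are assumptions)."""

    def __init__(self, primary_size: int = 4, secondary_size: int = 4):
        self.primary = deque()    # flits held in the primary region
        self.secondary = deque()  # flits held in the secondary region
        self.primary_size = primary_size
        self.secondary_size = secondary_size
        self.mode = "unified"

    def push(self, flit):
        if self.mode == "unified":
            if len(self.primary) < self.primary_size:
                self.primary.append(flit)
                return
            # Primary is full but the secondary region is unused:
            # switch to split mode.
            self.mode = "split"
        if len(self.secondary) >= self.secondary_size:
            raise OverflowError("buffer full; flow control should prevent this")
        self.secondary.append(flit)

    def pop(self):
        if not self.primary:
            raise IndexError("pop from an empty queue")
        flit = self.primary.popleft()  # reads always come from the primary region
        if self.mode == "split" and not self.primary:
            # Primary drained: swap roles (assumption) and return to unified mode.
            self.primary, self.secondary = self.secondary, self.primary
            self.primary_size, self.secondary_size = (
                self.secondary_size, self.primary_size)
            self.mode = "unified"
        return flit


q = SplitQueue(primary_size=2, secondary_size=2)
for f in range(3):
    q.push(f)
print(q.mode)                      # split
print([q.pop(), q.pop()], q.mode)  # [0, 1] unified
print(q.pop())                     # 2
```

FIFO order is preserved because every flit in the primary region is older than every flit in the secondary region.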
Summary of Evaluation
Simulator: cycle-accurate OCN simulator (BookSim)
Power measurement: Orion 2.0
Parameter         Value
Topology          8x8 2D mesh
# of VCs          4
VC buffer depth   8
Technology node   32 nm
Clock frequency   1.5 GHz
Vdd               1.0 V
[Figure: Performance. Average latency (cycles) vs. injection rate (flits/node/cycle), baseline vs. FlexiBuffer (SQ)]
[Figure: Power consumption. Total router power (W) vs. injection rate (flits/node/cycle), baseline vs. FlexiBuffer (SQ), with savings of 39% and 13% marked]
Conclusions
 There is a huge opportunity for power saving with fine-grained power gating when buffers are large.
 A modified credit-based flow control is proposed.
 Split queue is proposed to minimize activation power overhead.
 Our simulation results show that, with minimal performance loss, FlexiBuffer + SQ can save
→ 39% of router power at low traffic load
→ 13% of router power at high traffic load
Thank you!
Questions?
For more discussion, please come to my poster!