Wires and Devices - Washington State University


EE 587
SoC Design & Test
Partha Pande
School of EECS
Washington State University
SoC Physical Design Issues
Interconnect Architectures and Signal Integrity
Design Challenges
1. Non-scalable global wire delay
2. Moving signals across a large die within one clock
cycle is not possible.
3. The current interconnection architecture, the bus, is
inherently non-scalable.
4. Transmission of digital signals along wires is not
reliable.
Bus – non scalability
Clock cycle depends on the parasitics and the bus length
Multiple bus segments
•More than one design iteration
•Converges to network
Bus Architectures
Split Bus Architecture
$$E_2 = 0.5 \cdot sw \cdot V^2 \Big[ C_{BUS1} \sum_{i \in BUS1,\, j \in BUS1,\, i \neq j} xfer(M_i, M_j) + C_{BUS2} \sum_{i \in BUS2,\, j \in BUS2,\, i \neq j} xfer(M_i, M_j) + (C_{BUS1} + C_{BUS2}) \sum_{i \in BUS1,\, j \in BUS2} \big( xfer(M_i, M_j) + xfer(M_j, M_i) \big) \Big]$$
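The split-bus energy expression can be evaluated with a few lines of Python. This is a minimal sketch, not the authors' tool: the function name `split_bus_energy`, the dictionary layout of `xfer`, and the example numbers are all assumptions made for illustration. Transfers that stay on one segment are charged that segment's capacitance, while transfers that cross the split are charged both.

```python
def split_bus_energy(xfer, bus1, bus2, c_bus1, c_bus2, sw, v):
    """Energy of a two-segment split bus.

    xfer   : dict mapping (source, destination) module pairs to transfer counts
             (hypothetical data layout, chosen for this sketch)
    bus1/2 : lists of module names attached to each segment
    c_bus* : capacitance of each segment
    sw     : switching activity, v: supply voltage
    """
    def intra(modules):
        # Transfers between distinct modules on the same segment.
        return sum(xfer.get((i, j), 0)
                   for i in modules for j in modules if i != j)

    # Transfers crossing the split, counted in both directions.
    cross = sum(xfer.get((i, j), 0) + xfer.get((j, i), 0)
                for i in bus1 for j in bus2)

    return 0.5 * sw * v ** 2 * (c_bus1 * intra(bus1)
                                + c_bus2 * intra(bus2)
                                + (c_bus1 + c_bus2) * cross)


# Illustrative numbers only.
example_xfer = {("A", "B"): 10, ("B", "A"): 5, ("C", "D"): 20,
                ("A", "C"): 2, ("D", "B"): 3}
energy = split_bus_energy(example_xfer, ["A", "B"], ["C", "D"],
                          c_bus1=1.0, c_bus2=2.0, sw=0.5, v=1.0)
```

Grouping modules that communicate heavily onto the same segment shrinks the cross term, which is the only term that pays for both capacitances.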
Achievable Clock Cycle in a Bus segment
Minimize Power Consumption

Modification of interconnect architectures


Incorporate parallelism (ITRS 2003 & ISSCC 2004)
Decoupling of communication and processing
Modular architecture

Minimize use of global wires
Locality in communication


SoC Micro architecture Trend






50-100K gates block – No global wire delay problem.
Block-based hierarchical design style that uses block sizes of
50-100K gates.
Single synchronous clock regions will span only a small fraction
of the chip area.
Different self-synchronous IPs communicate via network-oriented protocols.
Structured network wiring leads to deterministic electrical
parameters - reduces latency and increases bandwidth.
Failures due to inherent unreliable physical medium can be
addressed by introducing error correction mechanisms.
New design paradigm


New designs – very large number of functional blocks
Moving bits around efficiently
• Develop on-chip infrastructure to solve future inter-block
communication bottlenecks
Development of infrastructure IPs
•
SoC =  (SFIP + SI2P)
Silicon Back plane
MIPS SoC-it
The network-on-chip paradigm

Driven by

Increased levels of integration

Complexity of large SoCs
–
New designs counting 100s of IP blocks

Need for platform-based design methodologies

DSM constraints (power, delay, time-to-market, etc…)
NoC Features


Decoupling of functionality from communication
Dedicated infrastructure for data transport
[Figure: bus-based SoC with a high-performance ARM processor, a high-bandwidth memory interface, and a DMA bus master on the AHB, bridged to an APB carrying the timer, UART, keypad, and PIO; alongside a NoC infrastructure built from switches and links]
Some Common Architectures

[Figure: (a) Mesh, (b) Folded-Torus (FT) and (c) Butterfly Fat Tree (BFT); legend: functional IP, switch]
Data Transmission


Packet-based communication
Low memory requirement

Wormhole routing
Packets are broken down into flow control units or
flits which are then routed in a pipelined fashion


Packet switching
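The flit decomposition described above can be sketched as follows. The header/body/tail labels, the function names, and the one-cycle-per-hop, contention-free timing model are assumptions made for this example; the slides only state that packets are split into flits and routed in a pipelined fashion.

```python
def packet_to_flits(payload, flit_bytes):
    """Split a packet payload into (kind, data) flits of at most flit_bytes."""
    chunks = [payload[i:i + flit_bytes]
              for i in range(0, len(payload), flit_bytes)]
    flits = []
    for index, chunk in enumerate(chunks):
        if index == 0:
            kind = "header"   # assumed to carry the routing information
        elif index == len(chunks) - 1:
            kind = "tail"     # assumed to release the path
        else:
            kind = "body"
        flits.append((kind, chunk))
    return flits


def wormhole_latency(hops, num_flits):
    """Cycles until the last flit arrives, assuming one cycle per hop
    and no contention: the header needs `hops` cycles and the remaining
    flits follow it one cycle apart."""
    return hops + num_flits - 1


flits = packet_to_flits(b"abcdefghij", 4)
```

Because only a few flits need to be buffered per switch rather than a whole packet, this is consistent with the low memory requirement noted above.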
Connecting Different IP Blocks Using Tree Architecture
Communication Pipelining
• Need to constrain the delay of each stage within 15 FO4
Signal Integrity
• According to the ITRS, signal integrity will become a major issue in future technologies
• Causes for such inherent unreliability
  • Shrinking geometries, layout dimensions
  • Reduction in the charge used for storing bits
  • Increased probability of transient events like:
    • Crosstalk
    • Ground bounce
    • Alpha particle hits
Micro network Protocol Stack
On Chip Signal Transmission





Future global wires will function as lossy transmission lines
Reduced-swing signaling
Noise due to crosstalk, electromagnetic interference, and other
factors will have increased impact.
It will not be possible to abstract the physical layer of on-chip
networks as a fully reliable, fixed-delay channel
At the micro network stack layers atop the physical layer, noise
is a source of local transient malfunctions.
Coding Schemes

Low-Power Coding

Reducing self-transition activity
Crosstalk Avoidance Coding

Reducing Coupling with adjacent lines
Error Control Coding



SEC, SECDED
Low Power Coding




Reduction of self-transition activity
Bus-Invert Code
Data is inverted and an invert bit is sent to the decoder if the
current data word differs from the previous data word in more
than half the number of bits
Effectiveness decreases with increase in bus width
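A minimal Python sketch of bus-invert coding follows. It assumes the comparison is made against the value last driven onto the bus lines, which is the usual formulation; the function names and the 8-bit width in the example are illustrative choices, not taken from the slides.

```python
def bus_invert_encode(word, prev_bus, width):
    """Return (bus_value, invert_bit) for the next transfer.

    word     : data word to send
    prev_bus : value currently on the bus lines (assumed reference point)
    width    : number of data lines
    """
    mask = (1 << width) - 1
    toggles = bin((word ^ prev_bus) & mask).count("1")
    if 2 * toggles > width:          # more than half the lines would toggle
        return (~word) & mask, 1     # send the complement and set invert
    return word & mask, 0


def bus_invert_decode(bus_value, invert_bit, width):
    """Recover the original data word at the receiver."""
    mask = (1 << width) - 1
    return (~bus_value) & mask if invert_bit else bus_value & mask


bus, inv = bus_invert_encode(0xFF, 0x00, 8)   # all 8 lines would toggle
```

In the example the encoder leaves the eight data lines unchanged and toggles only the invert line. The saving is bounded by the number of lines, and the relative benefit shrinks for wide buses because a random word rarely differs from its predecessor in far more than half of its bits.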
Error Control Coding







Linear block codes
In an (n, k) linear block code, a data block k bits long is mapped
onto an n-bit code word
Forward Error Correction or Automatic Repeat Request
Redundant wires
Possibility of voltage reduction
Energy efficiency is an important criterion
Codec overhead
Worst Case Crosstalk


Transition from 101 to 010 pattern or vice versa
Due to Miller capacitance, the worst-case capacitance between adjacent wires becomes (1+4λ)CL
[Figure: victim wire between Aggressor Wire 1 and Aggressor Wire 2, switching in the opposite direction to both; victim and aggressor rise times]
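The (1+4λ)CL worst case can be reproduced with a small model. The sketch below assumes the common first-order model in which each neighbour adds λ times the difference between its transition and the victim's, with λ taken to be the ratio of coupling capacitance to bulk capacitance; neither the model nor the function name is stated in the slides.

```python
def victim_capacitance_factor(prev, nxt, i, lam):
    """Effective capacitance seen by wire i, in units of CL.

    prev, nxt : lists of 0/1 values on the bus before and after the transition
    lam       : assumed ratio of coupling to bulk capacitance
    """
    dv = nxt[i] - prev[i]
    if dv == 0:
        return 0.0                       # wire i does not switch
    factor = 1.0                         # the wire's own bulk capacitance
    for n in (i - 1, i + 1):
        if 0 <= n < len(prev):
            dn = nxt[n] - prev[n]
            # 0 if the neighbour moves with the victim, 1 if it is quiet,
            # 2 if it switches in the opposite direction (Miller effect).
            factor += lam * abs(dv - dn)
    return factor


worst = victim_capacitance_factor([1, 0, 1], [0, 1, 0], 1, lam=0.5)
paired = victim_capacitance_factor([0, 0, 1, 1], [1, 1, 0, 0], 1, lam=0.5)
```

With λ = 0.5 the 101 to 010 transition gives 1 + 4λ = 3, while a wire whose duplicate neighbour switches with it sees at most 1 + 2λ = 2.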
Joint Crosstalk Avoidance and Single Error Correction Codes

Reduce crosstalk as well as correct errors due to other transient
events

Duplicate Add Parity (DAP)

Dual Rail Code (DR)

Boundary Shift Code (BSC)
Modified Dual Rail Code (MDR)
Worst case crosstalk capacitance is reduced to (1+2λ)CL


Duplicate-Add-Parity Code



Each bit is duplicated
A parity bit is computed from one copy
Same as Dual Rail Code
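The DAP construction can be sketched in Python as below. The encoder follows the two steps listed above; the decoder is an assumption based on the usual DAP scheme (recompute the parity of one copy and fall back to the other copy on a mismatch), and the bit ordering and function names are illustrative.

```python
def _parity(bits):
    p = 0
    for b in bits:
        p ^= b
    return p


def dap_encode(bits):
    """Duplicate every data bit and append one parity bit (2k + 1 wires)."""
    word = []
    for b in bits:
        word += [b, b]          # duplicates sit on adjacent wires
    return word + [_parity(bits)]


def dap_decode(word):
    """Correct any single-bit error (assumed decoder, see lead-in)."""
    copy_a = word[0:-1:2]
    copy_b = word[1:-1:2]
    received_parity = word[-1]
    # If copy A agrees with the parity wire, the single error (if any)
    # is in copy B; otherwise it is in copy A or the parity wire, and
    # copy B is clean.
    return copy_a if _parity(copy_a) == received_parity else copy_b


data = [1, 0, 1, 1]
codeword = dap_encode(data)
```

Because each bit and its duplicate always switch together, a wire never sees opposite transitions on both neighbours, which is where the (1+2λ)CL bound comes from.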
Crosstalk Avoidance Double Error Correction Code
(CADEC)





The 32-bit flit is Hamming coded and then an overall parity is calculated
All bits apart from the overall parity are duplicated
The 32-bit original flit becomes 77 bits
Minimum Hamming distance is 7
Worst-case crosstalk capacitance is reduced to (1+2λ)CL
[Figure: CADEC encoder. The 32-bit input passes through (38,32) Hamming encoding; DAP duplication of the 38 Hamming-coded bits yields bits 0 to 75, and the overall parity is bit 76, giving the 77-bit output]
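The 77-bit figure follows from simple arithmetic, sketched below. The helper names are hypothetical; the check-bit count uses the standard Hamming condition 2^r ≥ k + r + 1.

```python
def hamming_check_bits(k):
    """Smallest r such that 2**r >= k + r + 1 (single-error-correcting Hamming)."""
    r = 1
    while 2 ** r < k + r + 1:
        r += 1
    return r


def cadec_length(k):
    """Wires needed by CADEC for a k-bit flit: Hamming-code the flit,
    duplicate every Hamming-coded bit, then add one overall parity bit."""
    n = k + hamming_check_bits(k)
    return 2 * n + 1


wires = cadec_length(32)   # 32 + 6 = 38 Hamming bits, 2 * 38 + 1 = 77 wires
```

The same reasoning gives the minimum distance: a Hamming code has distance 3, duplication doubles it to 6, and the overall parity bit raises it to 7.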
Energy Savings with Joint Codes


Due to increased error resilience lower noise margins
can be tolerated and hence operating voltage can be
reduced
Coding adds overhead in terms of extra wires and
codec
Voltage Swing Reduction for CADEC
• The probability of word error for DAP:
$$P_{DAP}(\varepsilon) \approx \frac{3k(k+1)}{2}\,\varepsilon^2$$
[Figure: voltage swing V versus word error rate for ED, DAP, and CADEC]
PCADEC ( )  n2 (n  4) 3
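To see how a word error expression translates into a lower voltage swing, the sketch below works through the DAP case using P_DAP(ε) ≈ 3k(k+1)ε²/2. Two things are assumptions not stated in the slides: the Gaussian noise model ε = Q(V / 2σ) that links bit error probability to swing, and the approximation kε for the uncoded word error rate. The function names are illustrative.

```python
from math import sqrt
from statistics import NormalDist


def q_inverse(p):
    """Inverse of the Gaussian tail function Q(x) = P(N(0, 1) > x)."""
    return -NormalDist().inv_cdf(p)


def dap_swing_ratio(k, p_word):
    """Ratio of DAP swing to uncoded swing at equal word error rate.

    Assumes eps = Q(V / (2 * sigma)) with the same noise sigma in both
    cases, so the swing is proportional to Q^-1(eps).
    """
    eps_uncoded = p_word / k                          # uncoded: P ~ k * eps
    eps_dap = sqrt(2.0 * p_word / (3.0 * k * (k + 1)))  # invert the DAP formula
    return q_inverse(eps_dap) / q_inverse(eps_uncoded)


ratio = dap_swing_ratio(k=32, p_word=1e-20)
```

Under these assumptions a 32-bit DAP-coded link can run at roughly 0.7 of the uncoded swing for a word error rate of 10^-20. Energy per transition scales with the square of the swing, which is what has to pay for the extra wires and the codec.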
Energy Savings with CADEC
  10
20
Communication Pipelining
Inter- and Intra-switch stages
[Figure: pipelined data transfer across encoders, inter-switch links, intra-switch pipelined stages, and decoders]

Latency Characteristics
[Figure: average message latency (cycles) versus injection load, for uncoded and coded links]
• The codes should be optimized
• The codec can be merged with existing pipeline stages
• No latency penalty
Adaptive Supply Voltage Links




Dynamic Voltage Scaling (DVS)
DVS schemes dynamically adjust the processor clock frequency
and supply voltage to just meet instantaneous performance
requirement, making the system energy aware.
Communication architectures display a wide variance in their
utilization, depending on the communication patterns of
applications
An adaptive supply voltage link adapts its frequency and supply
voltage in accordance with the instantaneous traffic bandwidth.
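The idea of an adaptive link can be illustrated with a toy policy. Everything concrete here is hypothetical: the table of operating points, the rule of picking the slowest level that still covers the measured utilization, and the function names. The only physical fact used is that dynamic power scales with f·V².

```python
# Hypothetical operating points: (relative frequency, relative supply voltage).
LEVELS = [(0.25, 0.6), (0.5, 0.8), (1.0, 1.0)]


def pick_level(utilization, levels=LEVELS):
    """Choose the slowest level whose frequency covers the link utilization.

    utilization : fraction of peak bandwidth currently demanded (0.0 to 1.0)
    """
    for freq, volt in sorted(levels):
        if freq >= utilization:
            return freq, volt
    return max(levels)          # saturated link: run at the top level


def relative_power(freq, volt):
    """Dynamic power relative to the top level (proportional to f * V^2)."""
    return freq * volt ** 2


freq, volt = pick_level(0.3)
power = relative_power(freq, volt)
```

With these made-up levels, a link at 30% utilization would drop to half frequency and 0.8 of nominal voltage, cutting dynamic power to about a third of its peak value.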
Repeater Insertion & Coding




Repeater insertion reduces interconnect wire delay
Increases power dissipation due to large drivers
CACs reduce coupling capacitance
Joint repeater insertion and CAC is a promising solution to
reduce power in global wires
Repeater Insertion & Coding
Reference: A Low-Power Bus Design Using Joint Repeater Insertion and Coding
130 nm
Repeater Insertion & Coding
45 nm
Reliability


Crosstalk, electromigration, material ageing, …
Transient failures

Error control coding

Crosstalk avoidance coding
Power, area trade-off
Permanent failures



Spare switches and links

Overall routing complexity

Effect on system performance