Onchip Interconnect Exploration for Multicore Processors Utilizing FPGAs Graham Schelle and Dirk Grunwald University of Colorado at Boulder.


Onchip Interconnect Exploration for Multicore Processors Utilizing FPGAs
Graham Schelle and Dirk Grunwald
University of Colorado at Boulder

Outline

Network on Chip (NoC) defined
Current onchip interconnect tools
NoCem (NoC Emulator) specification
What else is needed before release
  We want it to be used…and cited
Conclusions

Network on Chip Defined (in 1 slide!)

Power/design concerns in modern processors lead to multicore chips
Transistors are seen as “free”, allowing more transistors for non-computational tasks
High-speed clocking means signals no longer propagate across the chip in a single cycle
Networking scales to an arbitrary number of access points and is well understood
[Center of slide diagram: Network on Chip]

Onchip Interconnects for FPGAs

Existing buses on FPGAs
  PLB, OPB, FSL
  Can have multiple masters (e.g. processors)
  Scale well for current uses of FPGAs
Existing NoCs
  Research projects
  Proprietary projects
  Application specific (streaming…)
  Not built for parameterization; each has some other (valid) focus

NoCem Specification

Synthesizable VHDL
  Heavy use of generics / generate statements
  Requires minimal Xilinx IP (FIFOs…)
To modify anything
  Change generics; everything else is generated automatically
  E.g. to go from a 2x2 mesh with 16b datawidth to a 4x4 torus with 8b datawidth, change 3 lines of code!
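
The generics-driven idea can be pictured outside VHDL with a small Python model. This is only a sketch under assumptions: `NocConfig` and its fields (`rows`, `cols`, `topology`, `data_width`) are hypothetical stand-ins for NoCem's generics, which the slides do not name. It shows how the link structure of a mesh or torus can be derived entirely from a few parameters.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class NocConfig:
    """Hypothetical stand-ins for top-level generics (not NoCem's real names)."""
    rows: int
    cols: int
    topology: str    # "mesh" or "torus"
    data_width: int  # bits per datapath word


def neighbors(cfg, row, col):
    """Return the set of (row, col) nodes directly linked to the given node."""
    if cfg.topology not in ("mesh", "torus"):
        raise ValueError(f"unknown topology: {cfg.topology}")
    result = set()
    for d_row, d_col in ((-1, 0), (1, 0), (0, -1), (0, 1)):
        r, c = row + d_row, col + d_col
        if cfg.topology == "torus":
            # Torus: links wrap around the edges of the grid.
            r, c = r % cfg.rows, c % cfg.cols
        elif not (0 <= r < cfg.rows and 0 <= c < cfg.cols):
            # Mesh: edge nodes simply have fewer links.
            continue
        if (r, c) != (row, col):
            result.add((r, c))
    return result


def link_count(cfg):
    """Count bidirectional links by summing node degrees and halving."""
    total = sum(len(neighbors(cfg, r, c))
                for r in range(cfg.rows) for c in range(cfg.cols))
    return total // 2


small = NocConfig(rows=2, cols=2, topology="mesh", data_width=16)
large = NocConfig(rows=4, cols=4, topology="torus", data_width=8)
print(link_count(small), link_count(large))  # prints "4 32"
```

Moving from `small` to `large` changes only configuration values; the neighbor logic is untouched, which is the same spirit as regenerating the network from generics.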

NoCem Interface

FIFO-ish
  Enqueue and dequeue path for every access point
  Packet control and data paths
    Meaning of those paths depends on NoC configuration
Datapath
  Only the width is variable; packet length is determined by packet control
Packet control: src, dest, packet length
Underlying network reads the top-level packet structure and picks out the correct fields at the correct times
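
One way to picture a packet control word carrying src, dest, and packet length is as packed bit fields. The sketch below is an illustration only: the field order, the way widths are derived from the node count, and the function names are all assumptions, since the slides do not give NoCem's actual encoding.

```python
def field_width(count):
    """Bits needed to encode the values 0..count-1 (at least 1 bit)."""
    return max(1, (count - 1).bit_length())


def pack_control(src, dest, length, num_nodes, max_length):
    """Pack the fields into one integer laid out as [src | dest | length].

    The layout is an assumption made for illustration.
    """
    addr_bits = field_width(num_nodes)
    len_bits = field_width(max_length + 1)
    if not (0 <= src < num_nodes and 0 <= dest < num_nodes):
        raise ValueError("src/dest out of range")
    if not (0 <= length <= max_length):
        raise ValueError("length out of range")
    return (src << (addr_bits + len_bits)) | (dest << len_bits) | length


def unpack_control(word, num_nodes, max_length):
    """Recover (src, dest, length) from a word built by pack_control."""
    addr_bits = field_width(num_nodes)
    len_bits = field_width(max_length + 1)
    length = word & ((1 << len_bits) - 1)
    dest = (word >> len_bits) & ((1 << addr_bits) - 1)
    src = (word >> (addr_bits + len_bits)) & ((1 << addr_bits) - 1)
    return src, dest, length


# 4x4 network (16 nodes), packets of up to 15 words: 4 + 4 + 4 bits.
word = pack_control(src=3, dest=12, length=5, num_nodes=16, max_length=15)
print(hex(word))                      # prints "0x3c5"
print(unpack_control(word, 16, 15))   # prints "(3, 12, 5)"
```

Deriving the field widths from the network size means the control word shrinks or grows with the configuration, which fits the note that the meaning of the paths depends on the NoC configuration.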

NoCem Bridges

Use existing buses, bridge to NoC
  Integration into existing Xilinx tool flows
  NoC can look like memory, SoC, …
  Use IPIF interface
PLB, OPB
  Different bus widths…
  But both processors are 32b
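
When the processor-side word is wider than the NoC datawidth, a bridge has to split and reassemble words. The slides do not describe how NoCem's bridges do this; the sketch below simply assumes most-significant-part-first ordering, and `split_word` and `join_parts` are hypothetical names.

```python
def split_word(word, bus_width, noc_width):
    """Split a bus word into NoC-width pieces, most significant first."""
    if bus_width % noc_width != 0:
        raise ValueError("bus width must be a multiple of the NoC width")
    if not (0 <= word < (1 << bus_width)):
        raise ValueError("word does not fit in the bus width")
    count = bus_width // noc_width
    mask = (1 << noc_width) - 1
    return [(word >> (noc_width * i)) & mask for i in reversed(range(count))]


def join_parts(parts, noc_width):
    """Reassemble pieces produced by split_word into the original word."""
    word = 0
    for part in parts:
        word = (word << noc_width) | part
    return word


parts16 = split_word(0xDEADBEEF, bus_width=32, noc_width=16)
parts8 = split_word(0xDEADBEEF, bus_width=32, noc_width=8)
print([hex(p) for p in parts16])   # prints "['0xdead', '0xbeef']"
print([hex(p) for p in parts8])    # prints "['0xde', '0xad', '0xbe', '0xef']"
print(hex(join_parts(parts8, 8)))  # prints "0xdeadbeef"
```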

How Big is NoCem?
NoC Dimensions   Datawidth   LUTs     xc2vp30 LUTs used
2x2              16b         4,086    14%
3x3              16b         11,693   42%
4x4              16b         21,570   78%
2x2              32b         5,822    21%
3x3              32b         16,394   59%
4x4              32b         34,370   125%
Mesh, 16-deep channel FIFOs, RR Arbitration
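
The utilization column can be reproduced from the LUT counts. A hedged check follows: the device total of 27,392 LUTs is an assumption taken from the XC2VP30's published slice count (13,696 slices, two 4-input LUTs each) and does not appear on the slide. The percentages match when the result is truncated rather than rounded.

```python
# Assumption: XC2VP30 has 13,696 slices with 2 four-input LUTs per slice.
XC2VP30_LUTS = 13_696 * 2

# (dimensions, datawidth in bits, LUTs) as reported in the table above.
RESULTS = [
    ("2x2", 16, 4_086),
    ("3x3", 16, 11_693),
    ("4x4", 16, 21_570),
    ("2x2", 32, 5_822),
    ("3x3", 32, 16_394),
    ("4x4", 32, 34_370),
]


def utilization_percent(luts, device_luts=XC2VP30_LUTS):
    """Percentage of device LUTs used, truncated to a whole number."""
    return 100 * luts // device_luts


for dims, width, luts in RESULTS:
    pct = utilization_percent(luts)
    verdict = "fits" if luts <= XC2VP30_LUTS else "does not fit"
    print(f"{dims} {width}b: {luts} LUTs, {pct}% ({verdict})")
```

Under this assumption only the 4x4, 32b configuration exceeds the device.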

Example Uses

Memory Architecture (in paper)
  Various distributed cache configurations
Asymmetric Processor Configuration
  Using MicroBlaze, PowerPC
Special Processor Offloads
  Floating Point, Network Processing
All can be emulated over NoC using NoCem…

For Release

We want NoCem to be used!
  Already in use at CU Boulder
  Full source will be made available online
To do for release
  Clean/zip up code
  Some documentation
ETA: April 2006

Conclusions

NoCem as a research tool
  Open source
  Non-proprietary
  Non-application-specific
NoCem for multicore processor research
  Allows NoC exploration
  Easy integration into Xilinx EDK flow
  Useful for a variety of research topics in this space

Any Questions?