Onchip Interconnect Exploration for Multicore Processors Utilizing FPGAs Graham Schelle and Dirk Grunwald University of Colorado at Boulder.
Onchip Interconnect Exploration for Multicore Processors Utilizing FPGAs
Graham Schelle and Dirk Grunwald
University of Colorado at Boulder
Outline
Network on Chip (NoC) defined
Current onchip interconnect tools
NoCem (NoC Emulator) specification
What else is needed before release
We want it to be used…and cited
Conclusions
Network on Chip Defined (in 1 slide!)
Power/design concerns in modern processors lead to multicore chips
Transistors seen as “free”, allowing more transistors for noncomputational tasks
Network on Chip
High speed clocking leads to signals not propagating across chip in single cycle
Networking scales to infinite number of access points and is well understood
Onchip Interconnects for FPGAs
Existing Buses on FPGAs
PLB,OPB,FSL
Can have multiple masters (e.g. processors)
Scale well for current uses of FPGAs
Existing NoCs
Research projects
Proprietary projects
Application specific (streaming…)
Not built for parameterization; each has some other VALID focus
NoCem Specification
Synthesizable VHDL
Heavy use of generics / generate statements
Requires minimal Xilinx IP (FIFOs…)
To modify anything
Change generics, everything automatically generated
E.g. to go from 2x2 mesh with 16b datawidth to 4x4 torus with 8b datawidth, change 3 lines of code!
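The idea of regenerating the whole network from a few parameters can be sketched in software. The Python below is only an illustrative model, not NoCem's VHDL: the `NocConfig` fields, `neighbors`, and `link_count` are hypothetical names standing in for the generics and generate statements.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class NocConfig:
    """Hypothetical stand-in for the VHDL generics; names are illustrative."""
    rows: int
    cols: int
    topology: str   # "mesh" or "torus"
    datawidth: int  # bits per data word


def neighbors(cfg, r, c):
    """Return the (row, col) routers directly linked to router (r, c)."""
    result = []
    for dr, dc in ((-1, 0), (1, 0), (0, -1), (0, 1)):
        nr, nc = r + dr, c + dc
        if cfg.topology == "torus":
            # A torus wraps around at the edges.
            nr, nc = nr % cfg.rows, nc % cfg.cols
        elif not (0 <= nr < cfg.rows and 0 <= nc < cfg.cols):
            # A mesh has no link past its edge.
            continue
        if (nr, nc) != (r, c) and (nr, nc) not in result:
            result.append((nr, nc))
    return result


def link_count(cfg):
    """Count bidirectional router-to-router links implied by the config."""
    links = set()
    for r in range(cfg.rows):
        for c in range(cfg.cols):
            for n in neighbors(cfg, r, c):
                links.add(frozenset(((r, c), n)))
    return len(links)


small = NocConfig(rows=2, cols=2, topology="mesh", datawidth=16)
large = NocConfig(rows=4, cols=4, topology="torus", datawidth=8)
```

Only the configuration object changes between `small` and `large`; the link structure is derived from it, which is the property the generics are meant to provide.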
NoCem Interface
FIFO-ish
Enqueue and dequeue path for every access point
Packet Control and Data paths
Meaning of those paths depends on NoC configuration
Datapath
Only the width is variable; length of packet is determined by packet control
Packet control: src, dest, packet length
Underlying network reads the toplevel packet structure, picking out the correct fields at the correct times
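A minimal software sketch of the FIFO-ish interface described above, assuming separate queues for packet control and data. The `PacketControl` and `AccessPoint` classes and their method names are hypothetical; the slides name only the fields src, dest, and packet length.

```python
from collections import deque
from dataclasses import dataclass


@dataclass(frozen=True)
class PacketControl:
    src: int
    dest: int
    length: int  # number of data words in the packet


class AccessPoint:
    """Hypothetical model of the control and data paths at one access point."""

    def __init__(self, datawidth):
        self.mask = (1 << datawidth) - 1  # largest value one data word can hold
        self.ctrl_fifo = deque()
        self.data_fifo = deque()

    def enqueue(self, ctrl, words):
        """Push one packet: a control entry plus `ctrl.length` data words."""
        if len(words) != ctrl.length:
            raise ValueError("packet control length must match the data words")
        if any(w < 0 or w > self.mask for w in words):
            raise ValueError("data word does not fit in datawidth")
        self.ctrl_fifo.append(ctrl)
        self.data_fifo.extend(words)

    def dequeue(self):
        """Pop one packet, using the control entry to know how many words to read."""
        ctrl = self.ctrl_fifo.popleft()
        words = [self.data_fifo.popleft() for _ in range(ctrl.length)]
        return ctrl, words


ap = AccessPoint(datawidth=16)
ap.enqueue(PacketControl(src=0, dest=3, length=2), [0xBEEF, 0x0001])
ctrl, words = ap.dequeue()
```

The data path carries no framing of its own here: the reader learns the packet length from packet control, mirroring the split the slide describes.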
NoCem Bridges
Use Existing Buses, bridge to NoC
Integration into existing Xilinx tool flows
NoC can look like memory, SoC, …
Use IPIF interface
PLB, OPB
Different bus widths…
But both processors are 32b
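One task a bridge faces is width conversion between a 32b processor word and a narrower NoC datawidth. The sketch below is an assumption about how that could work, not a description of the NoCem bridges: `split_word` and `join_words` are hypothetical helpers, and sending the most significant chunk first is an arbitrary choice.

```python
def split_word(word, bus_width, datawidth):
    """Split one bus word into datawidth-sized chunks, most significant first."""
    if bus_width % datawidth != 0:
        raise ValueError("bus_width must be a multiple of datawidth")
    if not 0 <= word < (1 << bus_width):
        raise ValueError("word does not fit in bus_width")
    mask = (1 << datawidth) - 1
    count = bus_width // datawidth
    return [(word >> (datawidth * i)) & mask for i in reversed(range(count))]


def join_words(chunks, datawidth):
    """Reassemble chunks (most significant first) into one bus word."""
    word = 0
    for chunk in chunks:
        word = (word << datawidth) | chunk
    return word


chunks16 = split_word(0xDEADBEEF, bus_width=32, datawidth=16)
chunks8 = split_word(0xDEADBEEF, bus_width=32, datawidth=8)
```

With a 16b datawidth one 32b word becomes two NoC data words, and with 8b it becomes four, so the packet length in packet control would grow accordingly.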
How Big is NoCem?
NoC Dimensions   Datawidth   LUTs     xc2vp30 LUTs used
2x2              16b         4,086    14%
3x3              16b         11,693   42%
4x4              16b         21,570   78%
2x2              32b         5,822    21%
3x3              32b         16,394   59%
4x4              32b         34,370   125%
Mesh, 16-deep channel FIFOs, RR Arbitration
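The utilization column can be reproduced from the LUT counts with a short calculation. This assumes the xc2vp30 provides 27,392 LUTs (13,696 slices with two LUTs each), a figure that is not stated in the slides, and that the percentages were truncated rather than rounded.

```python
# Assumption: xc2vp30 LUT total, taken as 13,696 slices x 2 LUTs per slice.
XC2VP30_LUTS = 27_392


def utilization_percent(luts_used, luts_available=XC2VP30_LUTS):
    """Return device utilization as a whole percentage, truncated."""
    return luts_used * 100 // luts_available


# LUT counts from the table, keyed by (dimensions, datawidth).
lut_counts = {
    ("2x2", "16b"): 4086,
    ("3x3", "16b"): 11693,
    ("4x4", "16b"): 21570,
    ("2x2", "32b"): 5822,
    ("3x3", "32b"): 16394,
    ("4x4", "32b"): 34370,
}

utilization = {cfg: utilization_percent(n) for cfg, n in lut_counts.items()}
```

Under that assumption the computed values match the table, and the 4x4 32b configuration comes out above 100%, meaning it would not fit on this device.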
Example Uses
Memory Architecture (in paper)
Asymmetric Processor Configuration
Various distributed cache configurations
Using Microblaze, PowerPC
Special Processor Offloads
Floating Point, Network Processing
All can be emulated over NoC using NoCem…
For Release
We want NoCem to be used!
Already in use at CU Boulder
Full source will be made available online
To do for release
Clean/zip up code
Some Documentation
ETA: April 2006
Conclusions
NoCem as a research tool
Open source
Non-proprietary
Non-application-specific
NoCem for multicore processor research
Allows NoC exploration
Easy integration into Xilinx EDK flow
Useful for a variety of research topics in this space
Any Questions?