Improved EDF schedulability analysis of EDF on


Survey of multicore architectures
Marko Bertogna
Scuola Superiore S.Anna,
ReTiS Lab, Pisa, Italy
Summary
CELL processor
Reconfigurable devices
Software-Hardware co-design
Parallel programming problems: data dependencies, process synchronization, memory barriers, locking mechanisms
Language extensions for parallel programming
Real-time multiprocessor scheduling
Cell processor
A Cell Processor
Cell History
Cell basic concepts
Cell synergy
Cell Chip
Cell features
Cell Processor Components
Synergistic Processor Element (SPE)
SPE
SPE details
Element Interconnect Bus (EIB)
EIB: Data topology
Example: 8 concurrent transactions
Theoretical peak operations
Cell BE performance
Why is Cell Processor so fast?
CELL software environment
System Level Simulator
SPE management library
CELL parallelism
Typical CELL sw development flow
ARM’s MPcore
PicoArray (by PicoChip)
PicoArray scaling
FPGA and Reconfigurable devices
Field Programmable Gate Arrays
SRAM-based matrix of integrated elements whose interconnections can be programmed statically or even dynamically
Basic block is the Logic Element (LE)
Chip capacities from 1k to 1000k LEs
Each LE is typically composed of logic gates, LUTs, Flip-Flops and latches
Need for optimized CAD or pre-bound design libraries
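As a rough illustration of the logic element described above, the following C sketch models one LE as a 4-input LUT feeding a single flip-flop. The names (logic_element, le_lut, le_clock), the 4-input width and the two truth-table constants are assumptions made for this example; they do not describe any specific device.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model of one logic element: a 4-input LUT plus one flip-flop. */
typedef struct {
    uint16_t truth_table; /* bit i holds the LUT output for input pattern i */
    uint8_t  ff_state;    /* registered output, 0 or 1 */
} logic_element;

/* Example "configurations" (assumed values): the same element becomes a
 * different gate simply by loading a different truth table. */
enum {
    LUT_AND4 = 0x8000, /* output 1 only for input pattern 1111 */
    LUT_XOR4 = 0x6996  /* output 1 when an odd number of inputs are 1 */
};

/* Combinational path: select the truth-table bit addressed by the inputs. */
uint8_t le_lut(const logic_element *le, uint8_t inputs)
{
    assert(inputs < 16); /* only 4 input bits exist in this model */
    return (uint8_t)((le->truth_table >> inputs) & 1u);
}

/* Clock edge: capture the LUT output in the flip-flop and return it. */
uint8_t le_clock(logic_element *le, uint8_t inputs)
{
    le->ff_state = le_lut(le, inputs);
    return le->ff_state;
}
```

Reprogramming the element amounts to overwriting truth_table, which is the sense in which the SRAM contents define the logic function in this simplified view.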
FPGA
CSL organization:
Basic Logic Element:
Altera’s Stratix IV basic block

Adaptive Logic Module (ALM)
Flexibility vs efficiency
Reconfigurable devices advantages
Efficiency AND Flexibility
Time to market
Easier upgrade
Lower cost (at production scale)
Reusable IP
Customizable interface
Reconfigurable devices parameters
Block granularity
Coarse grained: Functional Units, Processor Cores, Memory Tiles
Fine grained: gate and register level
Density
Reconfiguration time
Compile-Time Reconfiguration (CTR)
Run-Time Reconfiguration (RTR)
Partial or Total reprogramming
Triscend’s A7S chip
Example: multiplier on Altera’s Stratix IV
Typical FPGA software development environment
FPGA optimized module library
IO Editor
Generate → file.h
Bind (placement and route) → file.csl
Config → file.cfg
Download
Typical FPGA module library
Altera’s Nios II
Nios II is a soft-core processor IP that can be downloaded into an Altera FPGA, obtaining the functionality of a real RISC CPU
Logic elements are programmed so as to behave like the gates of classic ASIC processors
Different Nios versions are available:
faster and with full functionality, but bigger in size
medium sized
compact, but slower and with limited functionality
Nios II core
Selecting Nios II e/s/f
Example of a Nios II Processor system
Final global layout
Soft-core processors and FPGAs
Possible to have multiple cores on a single chip
Customizable hardware can be used to coordinate the various cores
Build and test a whole multicore system in less time
Detect and solve bottlenecks without needing to repeatedly return to the integration phase
Co-design problems with FPGAs
A task may be executed by a (soft-core or ASIC) processor or may be entirely implemented in hardware on the reconfigurable logic
“Programming in Space” versus “Programming in Time”
Centralized vs Distributed computing
Sequential vs Parallel programming
Interconnect Network
What is a task in hardware?
Software programming
c=a+b;
result=c/2;
Assembler expansion:
ldr r0,a
ldr r1,b
add r0,r0,r1
mov r0,r0,LSR #1
str r0,result
5 operations
Hardware implementation
a, b → + → c → shifter → result
All in one clock cycle!
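The contrast above can be sketched in C. The first function mirrors the two statements from the slide as a processor would execute them in sequence. The second is only a software model of the adder-plus-shifter datapath, under the assumption (not stated in the slides) that the hardware adder keeps its carry-out bit. The function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Software view: the two statements from the slide, run one after the other
 * (load, load, add, shift, store on a processor). */
uint32_t task_sw(uint32_t a, uint32_t b)
{
    uint32_t c = a + b; /* wraps around on 32-bit overflow */
    return c / 2;       /* for unsigned values this is the same as c >> 1 */
}

/* Model of the hardware datapath: an adder feeding a shifter.
 * Assumption: the adder output is 33 bits wide (carry-out kept), and the
 * shift by one is fixed wiring rather than a separate step. */
uint32_t task_hw_model(uint32_t a, uint32_t b)
{
    uint64_t c = (uint64_t)a + b;    /* 33-bit sum, no wrap-around */
    assert((c >> 1) <= UINT32_MAX);  /* the halved sum always fits in 32 bits */
    return (uint32_t)(c >> 1);
}
```

Both functions agree whenever a + b fits in 32 bits. They differ on overflow: with a = 0xFFFFFFFF and b = 1, task_sw returns 0 while task_hw_model returns 0x80000000, which shows how a datapath width chosen in hardware can change the result of the "same" task.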
Conclusions
FPGAs are interesting devices for multicore systems developers
Valid benchmark upon which to compare classic serial programming methods and parallel computing approaches
Allow reducing time-to-market for next-generation multicore systems
Provide common platforms that can easily reproduce any architecture (given a proper VHDL/Verilog description)