Performance-Driven Multi-FPGA Partitioning Using
Codesign of Embedded Systems
Allen C.-H. Wu
Department of Computer Science
Tsing Hua University
Hsinchu, Taiwan, R.O.C
{Email: [email protected]}
Outline
Introduction
Implementation technologies
Design technologies
Summary
Ref: Rolf Ernst, “Codesign of Embedded Systems: Status and Trends,” IEEE Design and Test of Computers, pp. 45-54, April-June 1998.
Introduction
Embedded systems:
Executes specific tasks within larger electronic device
Found in nearly everything electric - cars, office automation, PDA’s, home electronics, factory control
[Figure: Embedded IC revenues (B$), 1994-2000, for ASIC and E. uP/uC; vertical axis 0-35. Source: Dataquest]
Embedded system characteristics
Fixed functionality
I/O intensive, reactive
Multiple processes
Time constraints
Low cost ($8-$100), low power (0.5-4 W), small size
A typical embedded system structure
[Figure: block diagram. Blocks shown: SAP, ASIP, DSP, Memory, uC, uP, ASIC, I/O, D/A, A/D; software shown: DSP Code, uP Code, RTOS]
Implementation technologies (Processor types)
Micro-processor and micro-controller
ASIP - application-specific instruction-set processor
DSP - digital signal processor
SAP - single-application processor
ASIC - application-specific integrated circuit
Implementation technologies (Package types)
Full-custom IC
Cell-based IC
Gate array
PLD - FPGA
SOC (System-on-a-chip) - core-based design, intellectual property (IP), system-level integration (merging hw/sw onto 1 chip)
Embedded-system design process
[Figure: design flow. Requirements definition -> Specification -> System architecture development -> SW development, Interface design, HW development -> Integration & test. Also shown: Customer/marketing, System architect, Support (CAD, test), Reused comp. Source: Ernst (IEEE D & T of Computers)]
Two types of codesign
uP/SAP-based design
[Figure: vertical partitioning of a system into SW and HW. Labels shown: Core processor, Application-specific coprocessors]
Two types of codesign
ASIP-based design
[Figure: vertical partitioning of a system into SW and HW. Labels shown: Application SW + simulator, compilers, OS; Application-specific processors]
Design technologies
Design specification, modeling, and capture
Synthesis - system-level, RTL, logic level, physical level.
Design space exploration
Design verification and testing.
Specification and modeling
Executable specification - Verilog, VHDL, C, C++, Java.
Common models: synchronous dataflow (SDF), sequential programs (Prog.), communicating sequential processes (CSP), object-oriented programming (OOP), FSMs, hierarchical/concurrent FSM (HCFSM).
Specification languages are based on different models of computation, depending on the application domain and specification semantics.
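To make one of the listed models concrete, the sketch below computes the repetition vector of a synchronous dataflow (SDF) graph by solving its balance equations (tokens produced on each edge per iteration must equal tokens consumed). The function name, the edge tuple layout, and the three-actor example are illustrative assumptions, not taken from the slides; it assumes Python 3.9+ for math.lcm.

```python
from fractions import Fraction
from math import lcm


def repetition_vector(actors, edges):
    """Return the smallest integer firing counts that balance every edge.

    actors: list of actor names (hypothetical example data).
    edges:  list of (src, tokens_produced, dst, tokens_consumed).
    """
    # Fix the first actor's relative rate at 1 and propagate along edges.
    rate = {actors[0]: Fraction(1)}
    changed = True
    while changed:
        changed = False
        for src, produced, dst, consumed in edges:
            if src in rate and dst not in rate:
                rate[dst] = rate[src] * produced / consumed
                changed = True
            elif dst in rate and src not in rate:
                rate[src] = rate[dst] * consumed / produced
                changed = True
    if len(rate) != len(actors):
        raise ValueError("graph is not connected")
    # Every edge must satisfy rate[src] * produced == rate[dst] * consumed.
    for src, produced, dst, consumed in edges:
        if rate[src] * produced != rate[dst] * consumed:
            raise ValueError("inconsistent rates: no periodic schedule exists")
    # Scale the rational rates to the smallest integer solution.
    scale = lcm(*(r.denominator for r in rate.values()))
    return {a: int(rate[a] * scale) for a in actors}


# Chain A -> B -> C: A produces 2 tokens per firing, B consumes 3, and so on.
example = repetition_vector(
    ["A", "B", "C"],
    [("A", 2, "B", 3), ("B", 1, "C", 2)],
)
print(example)
```

Here A fires 3 times (6 tokens), B fires twice (consuming 6, producing 2), and C fires once (consuming 2), so buffer contents return to their initial state after each iteration.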
Hardware Synthesis
Many RTL, logic level, physical level commercial CAD tools.
Some emerging high-level synthesis tools: the Behavioral Compiler (Synopsys), Monet (Mentor Graphics), and RapidPath (DASYS).
Many open problems: memory optimization, parallel heterogeneous hardware architectures, programmable hardware synthesis and optimization, and communication optimization.
Software synthesis
The use of real-time operating systems (RTOSs)
The use of DSPs and micro-controllers - code generation issues
Special processor compilation in many cases is still far less efficient than manual code generation!
Retargeting issues - C code developed for the TI TMS320C6x is not optimized for running on the Philips TriMedia processor.
Software synthesis (Cont.)
Porting is worse when using parallel compilers because of architecture specialization.
Using libraries of predefined and parameterized code modules adapted to an application: SPW and the Mentor Graphics DSP Station.
Interface synthesis
Interface between:
- hardware-hardware
- hardware-software
- software-software
Timing and protocols
Has been neglected for a long time in commercial tools
Recently, first commercial tools appeared: the CoWare system (hw-sw protocols) and the Synopsys Protocol Compiler (hw interface synthesis tool)
Synthesis: status and trends
Many tools reach a high degree of automation for specific applications; however, many design tasks still need to be done manually
Tools lack the ability to exploit the design space to obtain an optimized solution
IP-based (core-based) synthesis methodology
Design space exploration
Process transformation
Hardware/software codesign tasks
Estimation
Manual/automated/assisted
Design space exploration process
[Figure: design space exploration process. Blocks shown: Customer/marketing, System architect, Cospecification, High-level transformation, Design space exploration, System analysis, Process transformation, HW/SW partitioning and scheduling, Reused functions and processes, HW arch & comp., Reused HW & SW components, HW synthesis, SW synthesis, Evaluation (cosimulation). Source: Ernst (IEEE D & T of Computers)]
Process transformation
Communication transformation
Process merging
Granularity adaptation
Process retargeting - e.g., from a RISC to a DSP
Granularity effects
Optimization potential
Communication overhead
Design effort
Granularity | Analysis
Process | no (explicit)
Function/global data | Global data flow
Basic block/local data set | Global and local data flow
Statement/variables | Global and local data flow
HW/SW codesign
Hardware-software partitioning
Communication synthesis
Hardware-software scheduling
Memory optimization
Estimation
Cosimulation
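Of the tasks listed above, hardware-software partitioning lends itself to a small illustration. The sketch below is a simple greedy heuristic, not a method from the slides or from any particular tool: it starts with everything in software and moves tasks to hardware in order of time saved per unit area until a deadline is met. The Task fields, the additive (purely sequential) timing model, and the example numbers are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Task:
    name: str
    sw_time: int   # execution time if implemented in software
    hw_time: int   # execution time if implemented in hardware
    hw_area: int   # hardware area cost, assumed to be > 0


def greedy_partition(tasks, deadline, area_budget):
    """Return (names moved to HW, total time, deadline met?).

    Assumes tasks run one after another, so total time is a plain sum.
    """
    hw = set()
    total_time = sum(t.sw_time for t in tasks)
    area_left = area_budget
    # Consider only tasks that hardware actually speeds up, best gain first.
    candidates = sorted(
        (t for t in tasks if t.sw_time > t.hw_time),
        key=lambda t: (t.sw_time - t.hw_time) / t.hw_area,
        reverse=True,
    )
    for t in candidates:
        if total_time <= deadline:
            break  # constraint met; keep the rest in software
        if t.hw_area <= area_left:
            hw.add(t.name)
            total_time -= t.sw_time - t.hw_time
            area_left -= t.hw_area
    return hw, total_time, total_time <= deadline


# Hypothetical task set: two signal-processing kernels and a control task.
tasks = [
    Task("fir", sw_time=40, hw_time=5, hw_area=10),
    Task("fft", sw_time=60, hw_time=10, hw_area=25),
    Task("ctrl", sw_time=20, hw_time=18, hw_area=5),
]
hw_set, time_used, ok = greedy_partition(tasks, deadline=70, area_budget=40)
print(sorted(hw_set), time_used, ok)
```

A greedy pass like this can miss better solutions; it is meant only to show the time-versus-area trade-off that partitioning tools explore.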
Communication synthesis
Communication channel selection
Communication channel allocation
Communication channel scheduling
Currently, no tool can cover the whole variety of communication mechanisms
HW/SW scheduling
Static scheduling
Derived from RTOSs - e.g., static table-driven and priority-based preemptive scheduling
Static scheduling for event-driven reactive systems
Distributed scheduling policies for complex embedded architectures
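The two RTOS-derived policies named above can be combined in a small sketch: the function below simulates priority-based preemptive scheduling offline over one hyperperiod and records the result as a static table. Using rate-monotonic priorities (shorter period means higher priority), integer time slots, and deadlines equal to periods are assumptions chosen for illustration; the task names are hypothetical. It assumes Python 3.9+ for math.lcm.

```python
from math import lcm


def build_schedule_table(tasks):
    """Build a slot-by-slot schedule table for one hyperperiod.

    tasks: dict mapping name -> (period, wcet) in integer time units.
    Returns a list with the task name run in each slot (None = idle).
    Raises ValueError if a job is unfinished when its next job is released.
    """
    hyperperiod = lcm(*(period for period, _ in tasks.values()))
    # Rate-monotonic priority order: shortest period first.
    by_priority = sorted(tasks, key=lambda name: tasks[name][0])
    remaining = {name: 0 for name in tasks}
    table = []
    for t in range(hyperperiod):
        # Release new jobs at the start of each period.
        for name, (period, wcet) in tasks.items():
            if t % period == 0:
                if remaining[name] > 0:
                    raise ValueError(f"{name} misses its deadline at t={t}")
                remaining[name] = wcet
        # Run the highest-priority job that still has work left.
        running = next((n for n in by_priority if remaining[n] > 0), None)
        if running is not None:
            remaining[running] -= 1
        table.append(running)
    if any(remaining.values()):
        raise ValueError("a task misses its deadline at the hyperperiod end")
    return table


table = build_schedule_table({"fast": (3, 1), "slow": (6, 3)})
print(table)
```

In the example, the second job of "fast" preempts "slow" at time 3, and the last slot of the hyperperiod is idle. At run time a dispatcher would only need to step through the table.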
Memory optimization
Dominant cost factor in integrated systems and a bottleneck in system performance
Program cache optimization techniques
Optimization for architectures with memories of different types - such as scratch-pad SRAM and DRAM
Dynamic memory allocation
Estimation
Accuracy vs. fidelity
Simulation based
Fast synthesis based
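The difference between accuracy and fidelity can be shown numerically. The sketch below uses one common definition of fidelity, the fraction of design-point pairs that an estimator ranks in the same order as the measured values, alongside mean relative error as an accuracy measure. The definition chosen, the function names, and the sample numbers are assumptions for illustration and do not come from the slides.

```python
from itertools import combinations


def _sign(x):
    return (x > 0) - (x < 0)


def fidelity(estimated, measured):
    """Fraction of design-point pairs ordered the same way by both lists."""
    pairs = list(combinations(range(len(estimated)), 2))
    agree = sum(
        _sign(estimated[i] - estimated[j]) == _sign(measured[i] - measured[j])
        for i, j in pairs
    )
    return agree / len(pairs)


def mean_relative_error(estimated, measured):
    """Average of |estimate - measurement| / |measurement|."""
    errors = [abs(e - m) / abs(m) for e, m in zip(estimated, measured)]
    return sum(errors) / len(errors)


# Hypothetical metric values for three design alternatives.
measured = [100, 200, 300]
rough_but_ordered = [50, 100, 150]   # 50% off everywhere, ranking preserved
close_but_swapped = [110, 190, 185]  # small errors, but two designs swapped

print(fidelity(rough_but_ordered, measured),
      mean_relative_error(rough_but_ordered, measured))
print(fidelity(close_but_swapped, measured),
      mean_relative_error(close_but_swapped, measured))
```

The first estimator is inaccurate yet ranks every pair correctly, which is often enough for comparing design alternatives; the second is closer in absolute terms but misorders one pair.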
Cosimulation
Simulate processor software along with custom hardware
Simulation speed, compile time, debugging capability, test vector creation
Speed vs. accuracy - rate accurate, functionally accurate, cycle accurate, gate accurate
Simulator categorization
General-purpose simulator - event-driven
Uni-purpose simulator - designed to simulate a specific model (e.g., 80586)
Emulator
- Logic emulator
- Processor emulator
- In-circuit emulator (ICE)
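As a rough illustration of what "event-driven" means for a general-purpose simulator, the sketch below implements a minimal event queue: actions are scheduled at future times and executed in time order, and each action may schedule further events. The class and method names and the clock example are hypothetical and far simpler than a real HDL simulator kernel.

```python
import heapq


class EventSimulator:
    """Minimal discrete-event kernel: a time-ordered queue of callbacks."""

    def __init__(self):
        self.now = 0
        self._queue = []
        self._seq = 0  # tie-breaker so same-time events keep insertion order

    def schedule(self, delay, action):
        """Run action(sim) at time now + delay."""
        heapq.heappush(self._queue, (self.now + delay, self._seq, action))
        self._seq += 1

    def run(self, until):
        """Process events in time order up to and including time `until`."""
        while self._queue and self._queue[0][0] <= until:
            self.now, _, action = heapq.heappop(self._queue)
            action(self)


trace = []


def make_clock(period):
    """Return an action that toggles a signal every half period."""
    state = {"level": 0}

    def toggle(sim):
        state["level"] ^= 1
        trace.append((sim.now, state["level"]))
        sim.schedule(period // 2, toggle)  # re-arm for the next edge

    return toggle


sim = EventSimulator()
sim.schedule(5, make_clock(10))
sim.run(until=20)
print(trace)
```

Simulated time jumps directly from one event to the next, so idle intervals cost nothing; this is the property that distinguishes event-driven simulation from stepping every cycle.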
Common cosimulation approaches
HDL simulator
Simple to implement
Slow
Foreign software debug environment
Common cosimulation approaches
Linking software processor simulator and HDL simulator
Eagle-i (Viewlogic), Seamless (Mentor Graphics)
Ptolemy (UC Berkeley) - OO software framework for linking simulators
Faster, native software debug environment
Common cosimulation approaches
Linking processor emulator and logic emulator
Fast
In-circuit debugging
Expensive
Quickturn
Design verification and testing
Closely-coupled design, verification, and testing methodologies
Integrating multi-level design, verification, and testing design tasks
Cosimulation, coemulation, design for test
Rapid prototyping
Rapid prototyping
Custom-designed prototyping board
Logic emulators
Field-programmable PCBs
Development without prototyping
[Figure: development timeline chart. Labels shown: SW; Design, Code, Build, Fab, Integration, Debug]
Development with prototyping
[Figure: development timeline chart with rows SW, HW, CHIP, and Integration. Labels shown: Design, Code, System & SW Debug; Design, Build, HW Integration & Debug; Design, Fab, Chip debug; Final Integration]
Summary
Embedded systems market is big and growing
Computer-aided hardware-software codesign has made considerable progress in the past few years
System analysis is in great demand - cosimulation, coverification, and cospecification
Cosynthesis and computer-aided design space exploration are just beginning to reach industrial practice