PARTIAL RECONFIGURATION : ARCHITECTURE AND TOOLS

1
PARTIAL RECONFIGURATION USING FPGAs:
ARCHITECTURE
2
Agenda
• Introduction
• Partial Reconfiguration Basics
• Design Considerations
• Advantages of Partial Reconfiguration
• Challenges of Partial Reconfiguration
• Application Examples
• Case Study
3
Introduction
• Basic Premise : Hardware reconfiguration is allowed
during execution of an application.
Some Interesting Applications
• Dynamic Instruction Set Architecture
• Software Defined Radio
• Video encoding techniques
• Cryptography
• Networking protocols
[Figure: one FPGA chip shared by Design A, Design B, and Design C]
4
Introduction
• Classification of FPGA with respect to configuration capabilities
• Dynamic Partial Reconfiguration : Reconfiguring only a part of
the device at run time while the rest of the device executes.
• Useful for systems which can time share the FPGA resources.
5
Introduction
Benefits
• Area reduction
• Power reduction
• Hardware Reuse
• Flexibility
• Performance Improvement
• Higher Level of Parallelism
• Time sliced resource sharing
• Fast system start
• Load a basic module to enable a fast system boot up
• Load peripheral modules later.
• Smaller bitstreams sizes
• Application Portability
• Encapsulation of reconfigurable system into a portable application.
6
Partial Reconfiguration Basics
• Each vendor’s products can have different characteristics and utilities
• Some common terminology is listed below.
TERMINOLOGY
• Reconfigurable Partition(RP)
• Dynamic Partial Reconfiguration (DPR)
• Partially Reconfigurable Module (PRM)
• Configuration Memory (CM)
• Frames
• Partial Bitstream
• Merged Bitstream
• Static Logic (Base Region)
• Bus Macro
7
Partial Reconfiguration Basics
Structure Overview
• Overall Structure
• CLBs – Configurable logic blocks
• IOBs – Input-output buffers
• DSP48s – Xilinx’s digital signal processing units
• BRAMs – Block Random Access Memories
• FIFOs – First-in First-out buffers
• DCMs – Digital Clock Managers
Figure 1. Virtex-4 LX15 FPGA layout (labeled regions: CLBs, IOBs, DSP48s, BRAMs and FIFOs, DCMs and clock distribution)
8
Partial Reconfiguration Basics
Structure Overview
9
Partial Reconfiguration Basics
Bit Stream and Frames
• FPGAs are reprogrammed by writing bits into CM
• CM is organized in small blocks called ‘Frames’
• Multiple frames are required to program a column of tiles (after Virtex-II)
• A frame contains both routing and logic tile configuration.
• Virtex-6 frame size: 81 x 32 bits (81 words)
• Typical bitstreams for Virtex-6 are in the range of 43 Mb to 190 Mb
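The frame figures above can be checked with a little arithmetic. The sketch below is only an estimate: it assumes 1 Mb = 10^6 bits and ignores the header and command words that a real bitstream also carries. The function names are illustrative.

```python
FRAME_WORDS = 81   # words per Virtex-6 frame, from the slide
WORD_BITS = 32     # bits per configuration word, from the slide


def frame_bits():
    """Size of one configuration frame in bits."""
    return FRAME_WORDS * WORD_BITS


def approx_frames(bitstream_megabits):
    """Rough frame count for a bitstream of the given size.

    Assumes 1 Mb = 10**6 bits and treats the whole bitstream as frame
    data, so the result slightly overestimates the real count.
    """
    return (bitstream_megabits * 10**6) // frame_bits()


if __name__ == "__main__":
    print(frame_bits())
    print(approx_frames(43), approx_frames(190))
```

Under these assumptions a frame holds 2592 bits, so a full Virtex-6 bitstream corresponds to tens of thousands of frames.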
10
Partial Reconfiguration Basics
Bit Stream
• Different columns of
FPGA fabric can have
different bit streams
• PR overhead for full
flexibility
• Possible to reduce Bit
stream Size :
- Compression Techniques
- Partial Reconfiguration
11
Partial Reconfiguration Basics
Frames
• Row address: 0 to 9
• Top/Bottom row: with respect to HCLK; together with the row address, it locates the tile
• Major address: column number, counting from 0
• Minor address: frame number within the tile column
• Block type: logic blocks, BRAMs, routing blocks.
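The frame address fields listed above can be packed into a single integer. The sketch below is illustrative only: the field widths and ordering (3-bit block type, 1-bit top/bottom, 5-bit row, 8-bit major, 7-bit minor) are an assumption, and the exact layout for a given device should be taken from its configuration user guide.

```python
from typing import NamedTuple

# (field name, width in bits), most significant field first.
# These widths are an assumption made for illustration.
FIELDS = (
    ("block_type", 3),
    ("top_bottom", 1),
    ("row", 5),
    ("major", 8),
    ("minor", 7),
)


class FrameAddress(NamedTuple):
    block_type: int
    top_bottom: int
    row: int
    major: int
    minor: int


def pack(addr):
    """Pack a FrameAddress into one integer, checking each field's range."""
    value = 0
    for (name, width), field in zip(FIELDS, addr):
        if not 0 <= field < (1 << width):
            raise ValueError(f"{name}={field} does not fit in {width} bits")
        value = (value << width) | field
    return value


def unpack(value):
    """Split a packed integer back into its FrameAddress fields."""
    parts = []
    for _name, width in reversed(FIELDS):
        parts.append(value & ((1 << width) - 1))
        value >>= width
    return FrameAddress(*reversed(parts))


if __name__ == "__main__":
    addr = FrameAddress(block_type=0, top_bottom=1, row=3, major=12, minor=5)
    print(hex(pack(addr)), unpack(pack(addr)))
```

Incrementing the minor field steps through the frames of one column; incrementing the major field moves to the next column.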
12
Partial Reconfiguration Basics
Bus Macros
• Bus Macros: Means of communication between PRMs
and static design
• All connections between PRMs and static design must
pass through a bus macro with the exception of a clock
signal
• Types of Bus Macros
- Tri-state buffer (TBUF) based bus macros
- Slice-based (or LUT-based) bus macros
13
Partial Reconfiguration Basics
Xilinx Bus Macros (Tri state Buffer Based)
• Used as connection points to link the static and reconfigurable parts
• Introduced in 2002
• Fixed positions on the FPGA fabric
• Present along a thin vertical slice
• Extra hardware required. No longer supported in modern FPGAs.
14
Partial Reconfiguration Basics
Xilinx Bus Macros (LUT Based)
• LUTs and the switch matrix act as the connection points (2004)
• Passes the boundary of static and reconfigurable regions in a predefined manner.
• Uses 2 LUTs per wire
• Increased latency and area
• Not used any more.
• Partition Pins replace Bus Macros
15
Partial Reconfiguration Basics
Partition Pins
• Partition Pins are the logical and physical connection
between static logic and reconfigurable logic.
• Automatically created for all RP ports.
• Also referred to as Proxy LUTs.
• Each one is a single LUT1
• No special instantiations required
• Not Bidirectional
16
Partial Reconfiguration Basics
Methods of Reconfiguration
• Externally
• Serial configuration port
• JTAG (Boundary Scan) port
• SelectMAP port
• Internally
• Through the Internal Configuration Access Port (ICAP) using
an embedded microcontroller or state machine
Summary of Configuration Options
17
Partial Reconfiguration Basics
Reconfiguration via a processor
18
Partial Reconfiguration Basics
ICAP Interface
• Port to read and write the FPGA
configuration at run time
• Enables a user to write software
programs for an embedded
processor that modifies the circuit
structure and functionality during
the circuit’s operation.
• Allows for automated runtime
reconfiguration
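A minimal sketch of what software driving such a port might look like, under stated assumptions: the port accepts one 32-bit word per write, words are stored big-endian in the partial bitstream, and `write_word` is a hypothetical callback standing in for whatever register or driver the platform provides. The transfer-time estimate assumes a 32-bit port clocked at 100 MHz, which is an illustrative figure rather than a value from the slides.

```python
import struct


def bitstream_words(data):
    """Yield the partial bitstream as 32-bit words (big-endian assumed)."""
    if len(data) % 4:
        raise ValueError("bitstream length must be a multiple of 4 bytes")
    for (word,) in struct.iter_unpack(">I", data):
        yield word


def reconfigure(data, write_word):
    """Feed every word to the port through the hypothetical write_word callback.

    Returns the number of words written.
    """
    count = 0
    for word in bitstream_words(data):
        write_word(word)
        count += 1
    return count


def transfer_time_s(num_bytes, port_bits=32, clock_hz=100e6):
    """Ideal transfer time, assuming one port-wide word per clock cycle."""
    return num_bytes * 8 / (port_bits * clock_hz)


if __name__ == "__main__":
    written = []  # stands in for the real port
    n = reconfigure(bytes.fromhex("aa99556620000000"), written.append)
    print(n, [hex(w) for w in written])
    print(transfer_time_s(4_000_000))
```

On real hardware the callback would be replaced by a driver call or a DMA transfer, and the achieved throughput is usually limited by how fast the bitstream can be fetched from storage.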
19
Partial Reconfiguration Basics
ICAP Interface
• Storage Device
• Bus System
• DMA to Storage Device
• Read back Support
• Configuration manager
20
Design Considerations
Partitioning Style
• Partitioning style could be
• Island style
• Slot based
• Grid based
21
Design Considerations
Placement Flexibility
• Partitioning style affects placement and flexibility
• A partition defines the smallest atomic area a module can be assigned
• Island style – suffers from fragmentation
• Slot style – also suffers from fragmentation but to a lesser extent. Offered by the current vendors Xilinx and Altera.
• Grid style – reduced fragmentation. Difficult to support.
• To enhance flexibility, the PR module must be placed and routed in every region where it may be configured.
• This puts additional stress on bitstream size.
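The fragmentation difference between island and slot styles can be illustrated with a toy model. The sketch below is an assumption-laden simplification: it measures only the area left unused inside an allocated region, treats area as a single number (for example, CLBs), and uses made-up module sizes.

```python
def island_waste(module_sizes, island_size):
    """Unused area when each module occupies one whole fixed-size island."""
    for size in module_sizes:
        if size > island_size:
            raise ValueError(f"module of size {size} does not fit in an island")
    return sum(island_size - size for size in module_sizes)


def slot_waste(module_sizes, slot_size):
    """Unused area when each module occupies the fewest whole slots it needs."""
    waste = 0
    for size in module_sizes:
        slots = -(-size // slot_size)  # ceiling division
        waste += slots * slot_size - size
    return waste


if __name__ == "__main__":
    modules = [300, 120, 450]  # hypothetical module sizes
    print(island_waste(modules, island_size=500))
    print(slot_waste(modules, slot_size=100))
```

With these hypothetical numbers the island layout leaves 630 units unused while the slot layout leaves 130, which matches the qualitative ranking in the bullets above. A finer grid would shrink the waste further at the cost of a harder placement problem.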
22
Design Considerations
Resource
• Column wise layout of different logic primitives
• Must be considered when placing
• Depending on the type of logic primitives used by the
module(SLICEX, SLICEM, etc), relocation may or may not
be possible.
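One way to picture the relocation constraint: a module can only move to a region whose columns offer the same primitive types in the same order. The sketch below models the fabric as a list of column types; the fabric layout and the helper names are hypothetical, and a real tool would check more than column types.

```python
def column_signature(fabric, start, width):
    """Sequence of column types covered by a region."""
    return tuple(fabric[start:start + width])


def compatible_regions(fabric, start, width):
    """Start columns of other regions with the same column-type sequence."""
    target = column_signature(fabric, start, width)
    return [
        col
        for col in range(len(fabric) - width + 1)
        if col != start and column_signature(fabric, col, width) == target
    ]


if __name__ == "__main__":
    # Hypothetical column layout, left to right.
    fabric = ["SLICEM", "SLICEX", "BRAM", "SLICEM", "SLICEX", "BRAM", "SLICEX", "DSP"]
    print(compatible_regions(fabric, start=0, width=3))
    print(compatible_regions(fabric, start=5, width=3))
```

In this made-up fabric, a module placed on columns 0 to 2 can be relocated to columns 3 to 5, while a module that uses the DSP column has no alternative location.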
23
Design Considerations
Power
• One of the potential advantages of PR – Power reduction
• But PR itself requires power.
• Power during PR is spent in:
1. Configuration data access
- Spent on the configuration controller
- Off-chip/on-chip memory access
- Programming interface (ICAP, SelectMAP, etc.)
2. Actual configuration of FPGA resources
Bonamy, R., et al. "Power Consumption Models for the Use of Dynamic and Partial Reconfiguration."
Microprocessors and Microsystems (2014).
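The trade-off above can be made concrete with a back-of-the-envelope calculation. This is not the model from the cited paper: it simply assumes each contribution draws a constant power for the duration of the reconfiguration, and all numbers are hypothetical.

```python
def reconfiguration_energy_j(controller_w, memory_w, interface_w, fabric_w,
                             reconfig_time_s):
    """Energy of one reconfiguration, assuming constant power per contribution.

    controller_w, memory_w and interface_w cover configuration data access;
    fabric_w covers the actual configuration of FPGA resources.
    """
    total_power_w = controller_w + memory_w + interface_w + fabric_w
    return total_power_w * reconfig_time_s


def break_even_time_s(reconfig_energy_j, power_saved_w):
    """How long the saving must last before the reconfiguration pays off."""
    if power_saved_w <= 0:
        raise ValueError("power_saved_w must be positive")
    return reconfig_energy_j / power_saved_w


if __name__ == "__main__":
    # Hypothetical values: watts for each contribution, seconds for the duration.
    energy = reconfiguration_energy_j(0.05, 0.10, 0.03, 0.02, reconfig_time_s=0.01)
    print(energy, break_even_time_s(energy, power_saved_w=0.04))
```

If a module would stay unused for less than the break-even time, swapping it out costs more energy than leaving it configured.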
24
Design Considerations
Power
• Task switching power graph: T1 to T2 and T2 to T1
Bonamy, R., et al. "Power Consumption Models for the Use of Dynamic and Partial Reconfiguration."
Microprocessors and Microsystems (2014).
25
Design Considerations
Design Flows
1. Module-based PR:
• Implement each component separately.
• Constrain components to be placed at a given location.
• The complete bitstream is finally built as the sum of all partial bitstreams.
2. Difference-based PR:
• Implement the complete bitstreams separately.
• Implement fixed parts + reconfigurable parts with components constrained at the same location in all the bitstreams.
• Compute the difference of two bitstreams to obtain the partial bitstream needed to move from one configuration to the next one.
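The difference step can be sketched at frame granularity. This is a simplification made for illustration: it treats each configuration as a flat array of equally sized frames (81 words of 4 bytes, as for Virtex-6) and ignores the command and header words that a vendor tool would emit around the frame data.

```python
FRAME_BYTES = 81 * 4  # one frame: 81 words of 32 bits


def split_frames(config):
    """Cut raw configuration data into frame-sized chunks."""
    if len(config) % FRAME_BYTES:
        raise ValueError("configuration is not a whole number of frames")
    return [config[i:i + FRAME_BYTES] for i in range(0, len(config), FRAME_BYTES)]


def partial_bitstream(old, new):
    """Return (frame index, new frame data) for every frame that differs."""
    old_frames, new_frames = split_frames(old), split_frames(new)
    if len(old_frames) != len(new_frames):
        raise ValueError("configurations cover a different number of frames")
    return [
        (index, new_frame)
        for index, (old_frame, new_frame) in enumerate(zip(old_frames, new_frames))
        if old_frame != new_frame
    ]


if __name__ == "__main__":
    old = bytes(FRAME_BYTES * 4)           # four all-zero frames
    new = bytearray(old)
    new[FRAME_BYTES * 2 + 5] = 0xFF        # touch one byte in frame 2
    diff = partial_bitstream(old, bytes(new))
    print([index for index, _ in diff])
```

Only the frames that changed end up in the result, which is why a small change produces a small partial bitstream and a fast switch.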
26
Design Considerations
Module Based P-R
27
Design Considerations
Difference Based P-R
• Useful for making small on-the-fly changes to design parameters such as logic equations and filter parameters.
• Procedure:
1. Designer makes small logic changes using FPGA_Editor:
- changing I/Os
- block RAM contents
- LUT programming
- muxes
- flip-flop initialization and reset values
- pull-ups or pull-downs on external pins
- block RAM write modes
- Changing any property or value that would impact routing is not recommended due to the risk of internal contention
2. Uses BitGen to generate a bitstream that programs only the difference between the two versions.
- Very quick switching
28
Design Considerations
Difference Based P-R
• LUT equations change
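A LUT equation change amounts to rewriting the LUT's truth table bits. The sketch below computes that truth table as an integer for a 4-input LUT, assuming the common convention that bit i holds the output for input combination i with the first input as the least significant bit. The helper name `lut_init` is illustrative.

```python
def lut_init(func, inputs=4):
    """Truth table of func as an integer: bit i is the output for input pattern i."""
    init = 0
    for index in range(1 << inputs):
        # Input k is bit k of the pattern index (first input is the LSB).
        bits = [(index >> k) & 1 for k in range(inputs)]
        if func(*bits):
            init |= 1 << index
    return init


def and4(a, b, c, d):
    return a & b & c & d


def or4(a, b, c, d):
    return a | b | c | d


if __name__ == "__main__":
    before, after = lut_init(and4), lut_init(or4)
    changed_bits = bin(before ^ after).count("1")
    print(hex(before), hex(after), changed_bits)
```

Switching the equation from a 4-input AND to a 4-input OR changes 14 of the 16 truth table bits and leaves routing untouched, which is the kind of edit the difference-based flow is suited to.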
29
Design Considerations
Difference Based P-R
• Changing BRAM contents
30
Challenges of Partial Reconfiguration
• Complicated design flow
• Manual assistance for reconfiguring different target devices.
• Security issues
• Decreased performance as compared to full configuration.
• Xilinx reports 10% degradation in clock frequency when using PR.
[Figure: Xilinx PR Implementation Flow, with manual steps marked: HDL Design Description, HDL Synthesis, Set Design Constraints, Placement Analysis, Implement Static Design and PR Modules, Merge, Final Bitstreams]
31
Complete Architecture Overview
32
Application examples of Partial
Reconfiguration
• Evolution Architectures
• Artificial Neural Networks
• Evolvable Hardware Platforms
• Fuzzy systems
• Modular Robotics
• Speed Up
• Crypto (Asym)
• Area Saving
• Networking (exchange packet filters according to traffic)
• Modulation/frequency/encryption hopping in military radios
• Digital Signal Processing
• JPEG Encoder/Decoder systems
• Edge detection applications
33
Case Study
Fault Tolerance – Self Healing Architecture
• Fault tolerant Processor
• IF, MAC, and ALU are the PRMs
• Different configurations
available for each module.
• Focus on the self healing
feature more than the
performance itself.
34
Case Study
Reconfigurable Crypto processor
• Processor can choose from different crypto algorithms
• Major area savings
• Some power savings too.
35
Case Study
Fast Start Up
• Fast start-up is a two-step configuration
• Useful in time critical
systems to initiate a
swift system start up.
• Example :
Automotive safety
36
Thank you
Questions ?