DSP Implementation on FPGA

Ahmed Elhossini
ENGG*6090: Reconfigurable Computing Systems
Winter 2006
References




Reconfigurable Computing for Digital Signal Processing: A Survey, Russell Tessier and Wayne Burleson, Journal of VLSI Signal Processing 28, 7–27, 2001.
FPGA implementations of fast Fourier transforms for real-time signal and image processing, I.S. Uzun, A. Amira and A. Bouridane, IEE Proc.-Vis. Image Signal Process., Vol. 152, No. 3, June 2005.
Image Processing Algorithms on Reconfigurable Architecture using HandelC, V. Muthukumar and Daggu Venkateshwar Rao, Proceedings of the EUROMICRO Systems on Digital System Design (DSD’04).
Experiences on developing computer vision hardware algorithms using Xilinx system generator, Ana Toledo Moreo, Pedro Navarro Lorente, F. Soto Valles, Juan Suardíaz Muro, Carlos Fernández Andrés, Microprocessors and Microsystems 29 (2005) 411–419.
ENGG*6090 – Winter 2006
DSP Implementation on FPGA
Introduction





The application domain of digital signal processing has expanded over the past decade because of advances in VLSI technology.
ASICs and programmable DSP processors were the implementation mechanisms of choice for many DSP applications.
Over the last few decades, new system implementations based on reconfigurable computing have been considered.
They offer the functional efficiency of hardware and the programmability of software.
These flexible platforms are quickly maturing as the logic capacity of programmable devices grows and embedded modules (multipliers and hard cores) become available.
Architectural Requirements for DSP

Data path configured for DSP
    Fixed-point arithmetic
    MAC (multiply-accumulate)
Multiple memory banks and buses
Specialized addressing modes
    Bit-reversed addressing
    Circular buffers
Specialized execution control
Specialized peripherals for DSP
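
Three of the features listed above (the MAC operation, circular buffers, and bit-reversed addressing) can be illustrated in software. The following is a minimal Python sketch, not taken from the slides. The names `bit_reverse` and `FirQ15`, and the choice of the Q15 fixed-point format, are assumptions made for illustration only.

```python
def bit_reverse(index, bits):
    """Reverse the lowest `bits` bits of `index` (bit-reversed addressing)."""
    result = 0
    for _ in range(bits):
        result = (result << 1) | (index & 1)
        index >>= 1
    return result


class FirQ15:
    """FIR filter sketch: circular delay line plus a fixed-point MAC loop.

    Coefficients and samples are assumed to be Q15 integers
    (value = integer / 32768). This format is an illustrative choice.
    """

    def __init__(self, coeffs_q15):
        self.coeffs = list(coeffs_q15)
        self.delay = [0] * len(self.coeffs)  # circular delay line
        self.head = 0                        # index of the newest sample

    def step(self, sample_q15):
        n = len(self.coeffs)
        # Circular buffer: move the write pointer instead of shifting data.
        self.head = (self.head - 1) % n
        self.delay[self.head] = sample_q15
        acc = 0  # wide accumulator, as in a hardware MAC unit
        for k in range(n):
            acc += self.coeffs[k] * self.delay[(self.head + k) % n]  # MAC
        return acc >> 15  # scale the Q30 sum of products back to Q15


# Two-tap averaging filter: both coefficients are 0.5 in Q15.
fir = FirQ15([16384, 16384])
outputs = [fir.step(s) for s in (1000, 3000)]
order = [bit_reverse(i, 3) for i in range(8)]
```

On a DSP processor or FPGA, the modulo pointer update and the bit reversal would typically be done by dedicated address-generation hardware rather than by explicit arithmetic as in this sketch.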
Choice Measures




Performance
Cost
Power
Flexibility
DSP Implementation
[Figure: taxonomy of DSP implementation options. Node labels recovered from the diagram, grouped by branch:]
Pure HW Implementation
    ASIC
    Reconfigurable Devices
Hybrid HW/SW Implementation (HW/SW Codesign)
    Soft or Hard Core Processor on Reconfigurable Device/Hardware Accelerator
    DSP Processor/Hardware Accelerator on Reconfigurable Device
Pure SW Implementation
    General Purpose Processor
    DSP Processor
    Hard/Soft Core on Reconfigurable Device (MicroBlaze, PowerPC, Nios)
Topics Covered



FFT implementation on FPGA.
Image processing algorithms on reconfigurable architecture using Handel-C.
Experiences on developing computer vision hardware algorithms using Xilinx System Generator.
Handel-C


Handel-C is essentially an extended subset of the standard ANSI-C language, specifically designed for use in a hardware environment.
Unlike other C-to-FPGA tools, Handel-C allows hardware to be targeted directly from software, allowing a more efficient implementation to be created.
Xilinx System Generator




System Generator is a toolbox added to MATLAB Simulink.
It allows a graphical representation of the algorithm.
It includes many blocks that are commonly used by DSP algorithms.
It allows direct conversion to HDL.
FPGA implementations of fast Fourier transforms for real-time signal and image processing
I.S. Uzun, A. Amira and A. Bouridane
IEE Proc.-Vis. Image Signal Process., Vol. 152, No. 3, June 2005
Target





The design and implementation of a parametrisable architecture, which provides a framework for the implementation of different types of 1-D FFT algorithms.
The development of an FPGA-based FFT library by implementing radix-2, radix-4, split-radix and FHT algorithms in order to provide system designers and engineers with the flexibility to meet different system requirements (such as chip area, memory etc.) with given hardware resources.
The evaluation and comparison of hardware implementations of the aforementioned FFT algorithms. The performance measures to be considered in comparisons are the computation speed, maximum system frequency, chip area and memory usage.
The design and implementation of a generic parallel 2-D FFT architecture for real-time image processing applications, for use in enhancing large medical and astronomical images using frequency-domain filtering techniques.
The development of an FPGA-based parametrisable system for frequency-domain filtering of large images.
FFT Implementation on FPGA

Implementing four different transforms:
    Radix-2 FFT
    Radix-4 FFT
    Split-radix FFT
    Fast Hartley transform (FHT)
Introducing a parallel version of the 2-D FFT based on radix-2 and radix-4.
    It makes use of multiple FFT processing elements to perform the computation.
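
As a software reference for the simplest of the four transforms, the following is a minimal Python sketch of an iterative, in-place radix-2 decimation-in-time FFT. It is not the paper's hardware design. The function names are hypothetical, and floating-point arithmetic is used here for clarity, whereas an FPGA implementation would typically use fixed-point data.

```python
import cmath


def reverse_bits(index, bits):
    """Reverse the lowest `bits` bits of `index`."""
    result = 0
    for _ in range(bits):
        result = (result << 1) | (index & 1)
        index >>= 1
    return result


def fft_radix2(x):
    """Iterative radix-2 decimation-in-time FFT of a power-of-two-length sequence."""
    n = len(x)
    if n == 0 or n & (n - 1):
        raise ValueError("length must be a power of two")
    bits = n.bit_length() - 1
    # Reorder the input into bit-reversed order so the butterflies run in place.
    a = [0j] * n
    for i in range(n):
        a[reverse_bits(i, bits)] = complex(x[i])
    size = 2
    while size <= n:  # one pass per stage: log2(n) stages in total
        half = size // 2
        w_step = cmath.exp(-2j * cmath.pi / size)
        for start in range(0, n, size):
            w = 1 + 0j
            for k in range(half):
                # Radix-2 butterfly: one complex multiply, one add, one subtract.
                u = a[start + k]
                t = w * a[start + k + half]
                a[start + k] = u + t
                a[start + k + half] = u - t
                w *= w_step
        size *= 2
    return a


spectrum = fft_radix2([1, 2, 3, 4])
```

Radix-4 and split-radix variants reduce the number of multiplications by processing larger groups of samples per butterfly, at the cost of more complex control and addressing.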
Proposed system for FFT implementation
Butterfly used with different architectures
Functional block diagram of 1-D FFT processor architecture
Block diagram of radix-2 butterfly used in FPGA FFT processor
Architectural block diagram of AGU
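
The AGU figure is not reproduced in this transcript. As a rough illustration of what an address generation unit has to produce for an in-place radix-2 FFT, the Python sketch below enumerates, per stage, the two data addresses of each butterfly and the index of its twiddle factor. The function name `radix2_addresses` and the exact ordering are assumptions, not the paper's design.

```python
def radix2_addresses(n):
    """Yield (stage, addr_a, addr_b, twiddle_index) for an n-point in-place
    radix-2 decimation-in-time FFT whose input is in bit-reversed order.

    twiddle_index k refers to the factor exp(-2j*pi*k/n).
    """
    stages = n.bit_length() - 1
    for stage in range(stages):
        half = 1 << stage          # distance between the two butterfly inputs
        stride = n // (2 * half)   # twiddle step for this stage
        for start in range(0, n, 2 * half):
            for k in range(half):
                yield stage, start + k, start + k + half, k * stride


schedule = list(radix2_addresses(4))
```

Each stage issues n/2 butterflies, so a 1024-point transform needs 512 address pairs in each of its 10 stages.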
Computation time (µs) of different algorithms for 1024-point FFT
Functional block diagram of parallel 2-D FFT processor architecture
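
A 2-D FFT is commonly computed by row-column decomposition: 1-D FFTs over every row, then 1-D FFTs over every column of the result. Because the rows (and then the columns) are independent of each other, they can be distributed over several processing elements. The Python sketch below shows the decomposition sequentially. It is an illustration under that assumption, not the paper's architecture, and the function names are hypothetical.

```python
import cmath


def fft(x):
    """Recursive radix-2 FFT; the length must be a power of two."""
    n = len(x)
    if n == 1:
        return [complex(x[0])]
    even = fft(x[0::2])
    odd = fft(x[1::2])
    out = [0j] * n
    for k in range(n // 2):
        t = cmath.exp(-2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + t
        out[k + n // 2] = even[k] - t
    return out


def fft2d(image):
    """2-D FFT by row-column decomposition of a list-of-lists image."""
    # Pass 1: every row is independent, so rows could go to separate PEs.
    rows = [fft(list(row)) for row in image]
    # Pass 2: transpose, then transform what were the columns.
    cols = [fft(list(col)) for col in zip(*rows)]
    # Transpose back so the result is indexed [row][column] again.
    return [list(r) for r in zip(*cols)]


result = fft2d([[1, 2], [3, 4]])
```

In hardware, the transpose between the two passes is usually the costly step, since it determines how the intermediate data must be laid out in memory.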
Computation time and Device utilization
2-D FFT performance comparison with existing FPGA-based designs
Conclusion



This work introduces an implementation platform for FFTs.
Handel-C is used as the description language.
A comparison with existing implementations shows lower execution time at reasonable resource utilization.
Image Processing Algorithms on Reconfigurable Architecture using HandelC
V. Muthukumar and Daggu Venkateshwar Rao
Proceedings of the EUROMICRO Systems on Digital System Design (DSD’04)
Target


In this work, a Canny edge detection architecture for 2-D images is developed on a reconfigurable architecture, with the hardware modeled in Handel-C.
The algorithm involves the implementation of several image processing operations:
    First, the image is smoothed by Gaussian convolution, which is a 5x5 convolution operation.
    A morphological operation, which is a 3x3 operator on the image.
    2-D convolution.
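
The 2-D convolution underlying the smoothing step can be sketched in Python as follows. The kernel shown is a widely used integer 5x5 Gaussian approximation (weights sum to 159). It is an assumption for illustration, since the slides do not give the coefficients used in the paper. The function name `convolve2d` is likewise hypothetical, and only the region where the kernel fully overlaps the image is computed.

```python
# Common integer approximation of a 5x5 Gaussian; the weights sum to 159.
# Assumed for illustration; not taken from the paper.
GAUSSIAN_5X5 = [
    [2, 4, 5, 4, 2],
    [4, 9, 12, 9, 4],
    [5, 12, 15, 12, 5],
    [4, 9, 12, 9, 4],
    [2, 4, 5, 4, 2],
]


def convolve2d(image, kernel, divisor=1):
    """Integer 2-D convolution over the fully overlapping ("valid") region."""
    kh, kw = len(kernel), len(kernel[0])
    h, w = len(image), len(image[0])
    out = []
    for r in range(h - kh + 1):
        row = []
        for c in range(w - kw + 1):
            acc = 0
            for i in range(kh):
                for j in range(kw):
                    # True convolution flips the kernel in both directions.
                    acc += kernel[kh - 1 - i][kw - 1 - j] * image[r + i][c + j]
            row.append(acc // divisor)  # integer normalization
        out.append(row)
    return out


flat = [[100] * 7 for _ in range(7)]
smoothed = convolve2d(flat, GAUSSIAN_5X5, 159)
```

Each output pixel of a 5x5 kernel needs 25 multiply-accumulate operations, which is why the absence of embedded multipliers on the target device matters for the hardware design.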
Implementation



The algorithm is modeled using Handel-C.
It is implemented using EDK2 on an RC1000-PP board with a Xilinx Virtex-E FPGA. This chip does not have any embedded multipliers.
The hardware implementation is compared to a software implementation running on a PC with a Pentium processor at 1300 MHz.
Architecture of 3x3 moving window
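
The figure is not reproduced in this transcript. A 3x3 moving window over a raster-ordered pixel stream is commonly built from two line buffers and a 3x3 register array, and the Python sketch below models that arrangement. It is an illustration under that assumption rather than the paper's exact architecture. The function names are hypothetical, and erosion (the minimum over the window) is used only as an example of a 3x3 morphological operator.

```python
def windows_3x3(pixels, width):
    """Yield (row, col, window) for every full 3x3 neighbourhood of a
    raster-ordered pixel stream; (row, col) is the window centre."""
    line1 = [0] * width  # previous image row
    line2 = [0] * width  # row before the previous one
    win = [[0] * 3 for _ in range(3)]
    for idx, p in enumerate(pixels):
        r, c = divmod(idx, width)
        # Shift the window one column to the left.
        for row in win:
            row[0], row[1] = row[1], row[2]
        # New right-hand column: two buffered rows plus the incoming pixel.
        win[0][2] = line2[c]
        win[1][2] = line1[c]
        win[2][2] = p
        # Update the line buffers for the next image row.
        line2[c] = line1[c]
        line1[c] = p
        if r >= 2 and c >= 2:  # window lies fully inside the image
            yield r - 1, c - 1, [row[:] for row in win]


def erode_3x3(pixels, width):
    """Example 3x3 morphological operator: minimum over each window."""
    return [min(min(row) for row in w) for _, _, w in windows_3x3(pixels, width)]


wins = list(windows_3x3(range(16), 4))  # 4x4 test image with values 0..15
eroded = erode_3x3(range(16), 4)
```

The point of the structure is that each pixel is read from memory once, while all nine window values are available in the same cycle.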
Edge Detection Architecture
Results
Results
Conclusion


Handel-C is used to implement 2-D convolution, which in turn is used to implement edge detection.
The implementation is compared to a VC++ implementation on a 1300 MHz Pentium III processor and shows better performance.
Experiences on developing computer vision hardware algorithms using Xilinx system generator
Ana Toledo Moreo, Pedro Navarro Lorente, F. Soto Valles, Juan Suardíaz Muro, Carlos Fernández Andrés
Microprocessors and Microsystems 29 (2005) 411–419
Target

This paper shows how the Xilinx System Generator (XSG) environment can be used to develop hardware-based computer vision algorithms from a system-level approach, which makes it suitable for developing co-design environments.
Application Examples

Binarization algorithm
    Converts a grayscale image into a black-and-white binary image.
    Xilinx System Generator is used to implement this unit.
    Compared with a VHDL implementation.
Generalized convolution blocks
    Convolution is one of the basic image processing algorithms.
    Xilinx System Generator is used to implement different types of algorithms.
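
Binarization amounts to a per-pixel comparison against a threshold, which is why it maps to very little hardware. A minimal Python sketch follows. The function name `binarize`, the default threshold of 128, and the "greater than or equal" convention are assumptions, since the slides do not state the values used in the paper.

```python
def binarize(image, threshold=128):
    """Map each grayscale pixel to 1 (white) if it is at or above the
    threshold, otherwise to 0 (black)."""
    return [[1 if pixel >= threshold else 0 for pixel in row] for row in image]


gray = [
    [0, 127, 128],
    [200, 64, 255],
]
binary = binarize(gray)
```

In a streaming hardware block, the same operation is a single comparator applied to each pixel as it arrives.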
Modular-blockset-based hardware binarization block
VHDL-based hardware binarization block
Hardware convolution block
Hardware binarization block implementation results
Generalized hardware convolution implementation results
Results
Conclusion


This work demonstrates the use of Xilinx System Generator to implement image processing algorithms.
A comparison with the VHDL implementation shows competitive results.
Results
System: FPGA implementations of fast Fourier transforms for real-time signal and image processing
    FPGA: Xilinx XCV2000E, RC1000-PP board
    Tool: Handel-C with EDK2
    Alternative implementation: comparison with different implementations
    Main algorithm: 4 FFT algorithms
System: Image Processing Algorithms on Reconfigurable Architecture using HandelC
    FPGA: Xilinx XCV2000E, RC1000-PP board
    Tool: Handel-C with EDK2
    Alternative implementation: VC++ program on P3 1300 MHz processor
    Main algorithm: 2-D convolution and edge detection
System: Experiences on developing computer vision hardware algorithms using Xilinx system generator
    FPGA: Xilinx XCV800
    Tool: Xilinx System Generator
    Alternative implementation: VHDL
    Main algorithm: binarization and 2-D convolution
Conclusion




In this review, three papers on implementing DSP algorithms on FPGAs were presented.
Handel-C is an efficient tool for implementing DSP algorithms and gives results competitive with those of current HDLs.
Xilinx System Generator, a tool based on MathWorks MATLAB, is a good tool for implementing DSP systems.
Modern tools for implementing DSP algorithms could be used to replace the current HDLs.
Thank You
Questions?