A radix-8/4/2 FFT processor for OFDM systems
Jungmin Park
Project background
• OFDM is widely used for high-speed digital communication
• Real-time applications require a high-performance FFT processor
• A dedicated FFT processor serves only one specific application
• A variable-length FFT processor serves many applications
Application    FFT length
DVB-T/H        2K-8K
DAB            256-2K
xDSL           256-4K
WLAN           64-128
Nicola E. L'Insalata, Sergio Saponara, Luca Fanucci, Pierangelo Terreni, “Automatic Synthesis of Cost Effective FFT/IFFT Cores for VLSI OFDM Systems”
Architecture of conventional FFT processors
• Pipeline architecture
  – Pipeline process
  – Butterfly unit and memory at every computation stage
  – High throughput
  – Large area with large FFT length
• Parallel architecture
  – Parallel process
  – High throughput
  – The worst case in hardware efficiency
• Shared memory architecture
  – The best case in hardware efficiency
  – Low throughput
  – High-radix FFT algorithm
[Figure: (a) Pipeline architecture: radix-r butterflies in series from input to output, with twiddle factors W; (b) Parallel architecture: array of butterfly units; (c) Shared memory architecture: one radix-r butterfly unit with shared memory banks 1 to r]
Project objective and contents
Project objective
• Design of a high-performance, variable-length FFT processor for OFDM systems
Project contents
Hardware efficiency and area
• Shared memory architecture
• Proposed twiddle factor generator
High throughput
• Pipelined radix-8 DIF FFT algorithm
  – Radix-8 only FFT: $8^n$ points (64, 512, 4K points)
  – Mixed radix-8/4 FFT: $4 \cdot 8^n$ points (256, 2K points) and $8^n$ points
  – Mixed radix-8/2 FFT: $2 \cdot 8^n$ points (128, 1K, 8K points) and $8^n$ points
Variable FFT length
• Mixed radix-8/4/2 DIF butterfly unit
Memory assignment and addressing
• Efficient memory assignment and addressing
Structure of proposed FFT processor
• Memory
  – 8 banks
  – Dual port
• Radix-8/4/2 BU
• Memory address generator and commutator
• Twiddle factor generator
• Control unit
[Figure: block diagram with address generator, memory banks 0 to 7, commutator 1, pipelined radix-8/4/2 butterfly unit, commutator 2, twiddle factor generator and control unit]
Operation of the proposed FFT processor (64-point data flow)
The Pipelined radix-8/4/2 DIF butterfly unit
S0   S1   Mode
0    0    4 parallel radix-2
0    1    2 parallel radix-4
1    0    Radix-8 without multiplication
1    1    Radix-8 with multiplication
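The mode table can be read as a small dispatch on (S0, S1). The following is a minimal Python sketch of that idea, not the author's VHDL design. It assumes that the eight inputs are split into contiguous groups and that twiddle multiplication happens only in mode (1, 1); the slides do not state the input grouping. The names `dft`, `MODES` and `butterfly_unit` are illustrative.

```python
import cmath

def dft(x):
    """Direct DFT of a short list; stands in for one radix-r butterfly."""
    n = len(x)
    return [sum(x[k] * cmath.exp(-2j * cmath.pi * k * m / n) for k in range(n))
            for m in range(n)]

# (S0, S1) -> (radix, multiply outputs by twiddle factors)
MODES = {
    (0, 0): (2, False),  # 4 parallel radix-2
    (0, 1): (4, False),  # 2 parallel radix-4
    (1, 0): (8, False),  # radix-8 without multiplication
    (1, 1): (8, True),   # radix-8 with multiplication
}

def butterfly_unit(x, s0, s1, twiddles=None):
    """Process 8 complex inputs according to the mode bits (S0, S1)."""
    if len(x) != 8:
        raise ValueError("the butterfly unit takes exactly 8 inputs")
    radix, multiply = MODES[(s0, s1)]
    out = []
    # Assumed grouping: contiguous blocks of `radix` inputs.
    for start in range(0, 8, radix):
        out.extend(dft(x[start:start + radix]))
    if multiply:
        if twiddles is None or len(twiddles) != 8:
            raise ValueError("mode (1, 1) needs 8 twiddle factors")
        out = [v * w for v, w in zip(out, twiddles)]
    return out

if __name__ == "__main__":
    data = [complex(k, 0) for k in range(8)]
    for bits in sorted(MODES):
        print(bits, butterfly_unit(data, *bits, twiddles=[1] * 8))
```

In a DIF FFT the last computation stage needs no twiddle multiplication, which is one plausible reason for having a radix-8 mode without multiplication.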
Application of proposed butterfly unit
Point   Stages   Stage 1   Stage 2   Stage 3   Stage 4   Stage 5
64      2        Radix-8   Radix-8
128     3        Radix-8   Radix-8   Radix-2
256     3        Radix-8   Radix-8   Radix-4
512     3        Radix-8   Radix-8   Radix-8
1K      4        Radix-8   Radix-8   Radix-8   Radix-2
2K      4        Radix-8   Radix-8   Radix-8   Radix-4
4K      4        Radix-8   Radix-8   Radix-8   Radix-8
8K      5        Radix-8   Radix-8   Radix-8   Radix-8   Radix-2
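The stage assignment in this table follows a simple rule: use as many radix-8 stages as possible, then finish with one radix-4 or radix-2 stage if the length is not a power of 8. A minimal Python sketch of that rule follows; the function name `radix_stages` is illustrative and does not come from the slides.

```python
def radix_stages(n_points):
    """Return the radix of each computation stage for a given FFT length."""
    is_power_of_two = n_points > 0 and (n_points & (n_points - 1)) == 0
    if not is_power_of_two or not 64 <= n_points <= 8192:
        raise ValueError("supported lengths are powers of two from 64 to 8192")
    log2_n = n_points.bit_length() - 1
    # Each radix-8 stage consumes 3 bits of the index.
    stages = [8] * (log2_n // 3)
    leftover_bits = log2_n % 3
    if leftover_bits:
        # 1 leftover bit -> radix-2, 2 leftover bits -> radix-4
        stages.append(1 << leftover_bits)
    return stages

if __name__ == "__main__":
    for n in (64, 128, 256, 512, 1024, 2048, 4096, 8192):
        print(n, radix_stages(n))
```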
Twiddle factor generator
• Twiddle factor generator
  – Recursive feedback difference equation
$$\sin n\theta = 2\cos\theta\,\sin(n-1)\theta - \sin(n-2)\theta$$
$$\cos n\theta = 2\cos\theta\,\cos(n-1)\theta - \cos(n-2)\theta$$
  – Error propagation problem
$$\sin m\theta = 2\cos\theta\,\sin(m-1)\theta - \sin(m-2)\theta$$
3
3
 n  error   n
2
2
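$$\begin{aligned}
\sin m\theta &\approx 2\left(\cos\theta \pm 2^{-(n+1)}\right)\left[\sin(m-1)\theta \pm 2^{-(n+1)}\right] - \left[\sin(m-2)\theta \pm 2^{-(n+1)}\right] \\
&= 2\cos\theta\,\sin(m-1)\theta - \sin(m-2)\theta \pm 2^{-n}\left[\sin(m-1)\theta + \cos\theta + 2^{-1} + 2^{-(n+1)}\right]
\end{aligned}$$
$$\max\left|2^{-n}\left[\sin(m-1)\theta + \cos\theta + 2^{-1} + 2^{-(n+1)}\right]\right| \le 2^{-(n-2)}$$
• Proposed error correction using correction table
$$[z_2 z_1 z_0]_2 = \mathrm{unsigned}\left([x_2 x_1 x_0]_2 - [y_2 y_1 y_0]_2\right)$$
$$\mathrm{Correct\_value} = \mathrm{signed}\left(\mathrm{computed\_value} + [z_2 z_1 z_0]_2\right)$$
(where $[x_2 x_1 x_0]_2$ is the 3 LSBs of the correct value in the correction table and $[y_2 y_1 y_0]_2$ is the 3 LSBs of the computed value)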
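The recurrence and the 3-LSB correction on this slide can be sketched in Python. This is a minimal sketch under stated assumptions, not the author's VHDL design: it assumes 15 fractional bits, round-to-nearest arithmetic, and a correction table built from `math.sin`. The names `to_fixed`, `signed_3bit` and `sine_sequence` are illustrative.

```python
import math

FRAC_BITS = 15            # assumed fixed-point format, not taken from the slides
SCALE = 1 << FRAC_BITS

def to_fixed(value):
    """Round a real value to the nearest fixed-point integer."""
    return int(round(value * SCALE))

def signed_3bit(z):
    """Interpret a 3-bit unsigned value (0..7) as two's complement (-4..3)."""
    return z - 8 if z >= 4 else z

def sine_sequence(theta, count, correct=True):
    """Generate fixed-point sin(n*theta), n = 0..count-1, by the recurrence."""
    two_cos = to_fixed(2.0 * math.cos(theta))
    # Correction table: only the 3 LSBs of each correctly rounded value.
    table = [to_fixed(math.sin(n * theta)) & 0b111 for n in range(count)]
    prev2 = to_fixed(0.0)               # sin(0 * theta)
    prev1 = to_fixed(math.sin(theta))   # sin(1 * theta)
    out = [prev2, prev1][:count]
    for n in range(2, count):
        # sin(n*theta) = 2cos(theta) * sin((n-1)*theta) - sin((n-2)*theta)
        product = (two_cos * prev1 + (SCALE >> 1)) >> FRAC_BITS
        computed = product - prev2
        if correct:
            x = table[n]                # 3 LSBs of the correct value
            y = computed & 0b111        # 3 LSBs of the computed value
            z = (x - y) & 0b111         # unsigned 3-bit difference
            computed += signed_3bit(z)  # signed correction
        out.append(computed)
        prev2, prev1 = prev1, computed
    return out

if __name__ == "__main__":
    theta = 2 * math.pi / 1024
    exact = [to_fixed(math.sin(n * theta)) for n in range(1024)]
    for flag in (False, True):
        seq = sine_sequence(theta, 1024, correct=flag)
        worst = max(abs(a - b) for a, b in zip(seq, exact))
        print("corrected" if flag else "uncorrected", "max error (LSB):", worst)
```

Under these assumptions the error added in one step stays within the range that a signed 3-bit correction can represent, so the corrected sequence matches the directly rounded sine values and the error no longer propagates.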
Structure of proposed twiddle factor generator
[Figure: (a) Sine function generator: recursive datapath with two registers (REG), LUT(cos θ_j), LUT(sin θ_j), LUT (correction table) and an error correction block, 16-bit data with a 17-bit intermediate value, output sin(nθ); (b) Cosine function generator: the same structure with LUT(cos θ_j), 23-bit data with a 24-bit intermediate value, constant 23'h200000, and round-off to a 16-bit output cos(nθ)]
$$\sin n\theta = 2\cos\theta\,\sin(n-1)\theta - \sin(n-2)\theta$$
$$\cos n\theta = 2\cos\theta\,\cos(n-1)\theta - \cos(n-2)\theta$$
Implementation and verification (1)
• VHDL modeling
• How to verify and measure SQNR
[Figure: verification flow with these blocks: random generator, 16-QAM modulation, ideal IFFT (MATLAB), 16-bit quantization, proposed FFT processor (HDL model), ideal FFT (MATLAB), comparison]
$$\mathrm{SQNR} = 10\log_{10}\left(\frac{\sum \mathrm{Re}(A)^2 + \sum \mathrm{Im}(A)^2}{\sum \left(\mathrm{Re}(A) - \mathrm{Re}(B)\right)^2 + \sum \left(\mathrm{Im}(A) - \mathrm{Im}(B)\right)^2}\right)$$
(where A is the MATLAB value and B is the value from the proposed FFT)
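The SQNR measurement can be sketched in Python as follows. This is only an illustration of the formula on this slide. The slides use MATLAB and an HDL model, which are not reproduced here, so a floating-point DFT fed with 16-bit quantized samples stands in for the proposed processor. The names `dft`, `quantize` and `sqnr_db` are illustrative.

```python
import cmath
import math
import random

def dft(x, inverse=False):
    """Direct DFT (or inverse DFT with 1/N scaling) of a list of complex values."""
    n = len(x)
    sign = 1 if inverse else -1
    out = [sum(x[k] * cmath.exp(sign * 2j * cmath.pi * k * m / n) for k in range(n))
           for m in range(n)]
    return [v / n for v in out] if inverse else out

def quantize(values, bits=16):
    """Round real and imaginary parts to signed fixed point with bits-1 fractional bits."""
    scale = 1 << (bits - 1)

    def q(v):
        return max(-scale, min(scale - 1, round(v * scale))) / scale

    return [complex(q(z.real), q(z.imag)) for z in values]

def sqnr_db(a, b):
    """SQNR in dB, with a as the reference values and b as the values under test."""
    signal = sum(z.real ** 2 + z.imag ** 2 for z in a)
    noise = sum((za.real - zb.real) ** 2 + (za.imag - zb.imag) ** 2
                for za, zb in zip(a, b))
    return math.inf if noise == 0 else 10 * math.log10(signal / noise)

if __name__ == "__main__":
    rng = random.Random(0)
    levels = (-3, -1, 1, 3)                      # 16-QAM levels per axis
    symbols = [complex(rng.choice(levels), rng.choice(levels)) / 4 for _ in range(64)]
    time_samples = dft(symbols, inverse=True)    # ideal IFFT
    reference = dft(time_samples)                # ideal FFT, value A
    under_test = dft(quantize(time_samples))     # stand-in for the hardware, value B
    print("SQNR (dB):", sqnr_db(reference, under_test))
```

Because the stand-in models only input quantization, its SQNR is not comparable with the figures reported for the actual processor.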
Implementation and verification (2)
• Simulation (64-point FFT)
[Figure: (a) Constellations; (b) Error and SQNR]
Point       64     128    256    512    1024   2048   4096   8192
SQNR (dB)   66.9   63.2   60.3   57.7   55     51.9   48.1   45.3
Implementation and verification (3)
• FPGA synthesis
  – Xilinx ISE 12.4
  – Xilinx Virtex-5
Input data width (bit)       16
Twiddle factor width (bit)   16
LUTs                         4811
Block RAMs                   22
DSP48s                       57
Critical path (ns)           10.339
Max. freq. (MHz)             96.723
Conclusion
• Design of a high-performance, variable-length FFT processor
• Shared memory architecture
  – Simplicity of hardware
• Proposed radix-8/4/2 DIF butterfly unit
  – FFT computation for every power-of-two length from 64 to 8192 points
• Proposed twiddle factor generator
  – 80% reduction
• SQNR: 45.3 ~ 66.7 dB
• OFDM standard (symbol duration)
  – Proposed FFT processor for OFDM applications such as 802.11a, 802.16a, DAB, DVB-T and so on
Thank you for listening