A radix-8/4/2 FFT processor for OFDM systems
Jungmin Park
Project background
OFDM used widely for high-speed digital communication
High performance of FFT processor for real time application
Dedicated FFT processor for only specific application
Variable-length FFT processor for many applications
Application    FFT length
DVB-T/H        2K-8K
DAB            256-2K
xDSL           256-4K
WLAN           64-128
Nicola E. L'Insalata, Sergio Saponara, Luca Fanucci, Pierangelo Terreni, “Automatic Synthesis of cost effective FFT/IFFT cores for VLSI OFDM Systems”
Architecture of conventional FFT processors
Pipeline architecture
– Pipeline process
– Butterfly unit and memory at every computation stage
– High throughput
– Large area with large FFT length
Parallel architecture
– Parallel process
– High throughput
– The worst case in hardware efficiency
Shared memory architecture
– The best case in hardware efficiency
– Low throughput
– High-radix FFT algorithm
[Figure: (a) Pipeline architecture (radix-r butterflies with twiddle factors W); (b) Parallel architecture (array of butterfly units); (c) Shared memory architecture (radix-r butterfly unit with shared memory, Bank 1 to Bank r)]
Project objective and contents
Project objective
– Design of a high-performance, variable-length FFT processor for OFDM systems
Project contents
Hardware efficiency and area
• Shared memory architecture
• Proposed twiddle factor generator
High throughput
• Pipelined radix-8 DIF FFT algorithm
Variable FFT length
• Mixed radix-8/4/2 DIF butterfly unit
– Radix-8 only FFT: $8^n$ points (64, 512, 4K points)
– Mixed radix-8/4 FFT: $4 \cdot 8^n$ points (256, 2K points) and $8^n$ points
– Mixed radix-8/2 FFT: $2 \cdot 8^n$ points (128, 1K, 8K points) and $8^n$ points
Memory assignment and addressing
• Efficient memory assignment and addressing
Structure of proposed FFT processor
Memory
– 8 banks
– Dual port
Memory address generator and commutator
Pipelined radix-8/4/2 butterfly unit
Twiddle factor generator
Control unit
[Figure: block diagram with address generator, memory banks Bank0 to Bank7, Commutator1, Commutator2, radix-8/4/2 BU, twiddle factor generator, and control unit]
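The slides do not spell out how data is assigned to the 8 banks. As an illustration only, the sketch below uses a widely known digit-sum scheme (bank = sum of the base-8 address digits modulo 8), which is an assumption and not necessarily the assignment used in this design. It checks that the 8 operands of every radix-8 butterfly land in 8 different banks, so they could be accessed in parallel. The function names are hypothetical, and only pure radix-8 lengths ($8^n$ points) are covered.

```python
def digits_base8(addr, num_digits):
    """Return the base-8 digits of addr, least significant digit first."""
    return [(addr >> (3 * i)) & 7 for i in range(num_digits)]


def bank_of(addr, num_digits):
    """Assumed bank assignment: sum of the base-8 digits, modulo 8."""
    return sum(digits_base8(addr, num_digits)) % 8


def butterfly_operands(base, stage, num_digits):
    """Addresses of one radix-8 butterfly: only digit `stage` varies."""
    cleared = base & ~(7 << (3 * stage))
    return [cleared | (d << (3 * stage)) for d in range(8)]


def conflict_free(num_digits):
    """True if every radix-8 butterfly touches 8 distinct banks."""
    n_points = 8 ** num_digits
    for stage in range(num_digits):
        for base in range(n_points):
            operands = butterfly_operands(base, stage, num_digits)
            banks = {bank_of(a, num_digits) for a in operands}
            if len(banks) != 8:
                return False
    return True
```

For example, `conflict_free(2)` checks the 64-point case and `conflict_free(3)` the 512-point case.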
Operation of the proposed FFT processor (64-point data flow)
The Pipelined radix-8/4/2 DIF butterfly unit
S0   S1   Mode
0    0    4 parallel radix-2
0    1    2 parallel radix-4
1    0    Radix-8 without multiplication
1    1    Radix-8 with multiplication
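A behavioral sketch of the four modes, for intuition only. It models each radix-r butterfly as an r-point DFT on 8 complex inputs. Two points are assumptions not stated on the slide: inputs are grouped as consecutive runs of r samples, and "with multiplication" means a twiddle-factor multiply applied to the butterfly outputs. The names `butterfly_unit` and `MODE_RADIX` are hypothetical, and the pipelined hardware structure is not modeled.

```python
import cmath

# (S0, S1) -> radix of the butterflies computed in that mode
MODE_RADIX = {(0, 0): 2, (0, 1): 4, (1, 0): 8, (1, 1): 8}


def dft(x):
    """Direct r-point DFT, used here as a reference radix-r butterfly."""
    n = len(x)
    return [
        sum(x[k] * cmath.exp(-2j * cmath.pi * j * k / n) for k in range(n))
        for j in range(n)
    ]


def butterfly_unit(s0, s1, x, twiddles=None):
    """Process 8 inputs as 8/r independent radix-r butterflies."""
    if len(x) != 8:
        raise ValueError("the unit operates on 8 samples at a time")
    radix = MODE_RADIX[(s0, s1)]
    out = []
    for start in range(0, 8, radix):  # assumed grouping: consecutive samples
        out.extend(dft(x[start:start + radix]))
    if (s0, s1) == (1, 1):  # assumed: twiddle multiply on the outputs
        if twiddles is None:
            raise ValueError("mode (1, 1) needs 8 twiddle factors")
        out = [o * w for o, w in zip(out, twiddles)]
    return out
```

Calling `butterfly_unit(0, 0, samples)` computes four 2-point butterflies, and `butterfly_unit(1, 1, samples, twiddles)` computes one 8-point butterfly followed by the multiply.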
Application of proposed butterfly unit
Point   # Computation stages   Stage 1   Stage 2   Stage 3   Stage 4   Stage 5
64      2                      Radix-8   Radix-8
128     3                      Radix-8   Radix-8   Radix-2
256     3                      Radix-8   Radix-8   Radix-4
512     3                      Radix-8   Radix-8   Radix-8
1K      4                      Radix-8   Radix-8   Radix-8   Radix-2
2K      4                      Radix-8   Radix-8   Radix-8   Radix-4
4K      4                      Radix-8   Radix-8   Radix-8   Radix-8
8K      5                      Radix-8   Radix-8   Radix-8   Radix-8   Radix-2
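The stage assignment in the table follows a simple rule: use as many radix-8 stages as possible, then finish with one radix-4 or radix-2 stage if the length is not a power of 8. A minimal sketch of that rule, where the function name `stage_plan` is hypothetical:

```python
def stage_plan(n_points):
    """Return the radix of each computation stage for an n_points FFT."""
    is_power_of_two = n_points > 0 and (n_points & (n_points - 1)) == 0
    if not is_power_of_two or not 64 <= n_points <= 8192:
        raise ValueError("expected a power of two between 64 and 8192")
    log2_n = n_points.bit_length() - 1
    plan = [8] * (log2_n // 3)        # each radix-8 stage consumes 3 bits
    remainder = log2_n % 3
    if remainder:
        plan.append(2 ** remainder)   # final radix-2 or radix-4 stage
    return plan
```

For instance, `stage_plan(2048)` yields three radix-8 stages followed by one radix-4 stage, matching the 2K row.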
Twiddle factor generator
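– Recursive feedback difference equation
$\sin(n\theta) = 2\cos\theta\,\sin((n-1)\theta) - \sin((n-2)\theta)$
$\cos(n\theta) = 2\cos\theta\,\cos((n-1)\theta) - \cos((n-2)\theta)$
– Error propagation problem
$\sin(m\theta) = 2\cos\theta\,\sin((m-1)\theta) - \sin((m-2)\theta)$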
3
3
n error n
2
2
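$2\,(\cos\theta \pm 2^{-(n+1)})\,[\sin((m-1)\theta) \pm 2^{-(n+1)}] - [\sin((m-2)\theta) \pm 2^{-(n+1)}]$
$= 2\cos\theta\,\sin((m-1)\theta) - \sin((m-2)\theta) \pm 2^{-n}\,[\sin((m-1)\theta) + \cos\theta \pm 2^{-1} \pm 2^{-(n+1)}]$
$\max\,\bigl|\pm 2^{-n}\,[\sin((m-1)\theta) + \cos\theta \pm 2^{-1} \pm 2^{-(n+1)}]\bigr| < 2^{-(n-2)}$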
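To see the error propagation numerically, the sketch below runs the sine recurrence twice: once in double precision and once with every stored value rounded to a fixed number of fractional bits. The rounding model and the parameter name `frac_bits` are assumptions for illustration; the actual word lengths and rounding in the hardware may differ.

```python
import math


def sine_recurrence(theta, count, frac_bits=None):
    """sin(m*theta) for m = 0..count-1 via the two-term recurrence.

    If frac_bits is given, every stored value is rounded to that many
    fractional bits (an assumed fixed-point model).
    """
    def quantize(value):
        if frac_bits is None:
            return value
        scale = 1 << frac_bits
        return round(value * scale) / scale

    two_cos = 2 * quantize(math.cos(theta))
    values = [0.0, quantize(math.sin(theta))]
    for m in range(2, count):
        values.append(quantize(two_cos * values[m - 1] - values[m - 2]))
    return values[:count]


def max_error(theta, count, frac_bits=None):
    """Largest deviation of the recurrence from math.sin."""
    values = sine_recurrence(theta, count, frac_bits)
    return max(abs(v - math.sin(m * theta)) for m, v in enumerate(values))
```

Comparing `max_error(theta, 64)` with `max_error(theta, 64, frac_bits=15)` for `theta = 2 * math.pi / 64` shows how rounding errors fed back through the recurrence accumulate over the iterations. This accumulation is what the error correction addresses.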
Proposed error correction using correction table
$[z_2 z_1 z_0]_2 = \mathrm{unsigned}([x_2 x_1 x_0]_2 - [y_2 y_1 y_0]_2)$
$\mathrm{Correct\_value} = \mathrm{signed}(\mathrm{computed\_value} + [z_2 z_1 z_0]_2)$
(where $[x_2 x_1 x_0]_2$ is the 3 LSBs of the correct value in the correction table and $[y_2 y_1 y_0]_2$ is the 3 LSBs of the computed value.)
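A minimal integer sketch of the 3-LSB correction idea, under stated assumptions: values are fixed-point integers, the table stores only the 3 LSBs of the correct value, and the 3-bit difference is reinterpreted as a signed offset in the range -4 to 3 before being added back. That signed reinterpretation is an assumption about what the slide's signed() means, and the function name `correct_with_lsbs` is hypothetical.

```python
def correct_with_lsbs(computed, table_lsbs):
    """Recover the correct value from the computed one and 3 stored LSBs.

    computed   -- fixed-point result of the recurrence, as an integer
    table_lsbs -- 3 LSBs of the correct value (0..7) from the table
    """
    z = (table_lsbs - (computed & 7)) & 7   # unsigned 3-bit difference
    offset = z - 8 if z >= 4 else z         # assumed: read z as signed 3-bit
    return computed + offset
```

Under these assumptions the correct value is recovered whenever the accumulated error is between -4 and +3 LSBs, which is why a 3-bit table entry per value can replace storing the full word.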
Structure of proposed twiddle factor generator
[Figure: (a) Sine function generator: 16-bit datapath with LUT(cos θ_j), LUT(sin θ_j), two registers (REG), a correction-table LUT, and an error-correction block; output sin(nθ). (b) Cosine function generator: 23-bit datapath with LUT(cos θ_j), two registers (REG, initial value 23'h200000), a correction-table LUT, and an error-correction block, followed by round-off to 16 bits; output cos(nθ).]
Implementation and verification (1)
VHDL modeling
How to verify and measure SQNR
[Figure: verification flow with blocks: random generator, 16-QAM modulation, ideal IFFT (MATLAB), 16-bit quantization, proposed FFT processor (HDL model), ideal FFT (MATLAB), comparison]
$\mathrm{SQNR} = 10\log_{10}\left(\dfrac{(\mathrm{Re}(A))^2 + (\mathrm{Im}(A))^2}{(\mathrm{Re}(A)-\mathrm{Re}(B))^2 + (\mathrm{Im}(A)-\mathrm{Im}(B))^2}\right)$
(where A is the value from MATLAB and B is the value from the proposed FFT)
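A small sketch of the SQNR measurement. Summing signal and error power over all output samples before taking the ratio is an assumption, since the slide formula does not show how samples are combined. The function name `sqnr_db` is hypothetical.

```python
import math


def sqnr_db(reference, measured):
    """SQNR in dB between reference outputs A and measured outputs B.

    Both arguments are sequences of complex numbers of equal length.
    """
    # |a|^2 equals Re(a)^2 + Im(a)^2
    signal_power = sum(abs(a) ** 2 for a in reference)
    # |a - b|^2 equals (Re(a) - Re(b))^2 + (Im(a) - Im(b))^2
    noise_power = sum(abs(a - b) ** 2 for a, b in zip(reference, measured))
    if noise_power == 0:
        return math.inf
    return 10 * math.log10(signal_power / noise_power)
```

Here `reference` would hold the ideal (floating-point) FFT outputs and `measured` the outputs of the fixed-point model, both scaled to the same range.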
Implementation and verification (2)
Simulation (64 point FFT)
(a) Constellations
(b) Error and SQNR
Point       64     128    256    512    1024   2048   4096   8192
SQNR (dB)   66.9   63.2   60.3   57.7   55     51.9   48.1   45.3
Implementation and verification (3)
FPGA synthesis
– Xilinx ISE 12.4
– Xilinx Virtex-5
Input data width (bit)       16
Twiddle factor width (bit)   16
LUTs                         4811
Block RAMs                   22
DSP48s                       57
Critical path (ns)           10.339
Max. freq. (MHz)             96.723
Conclusion
Design of a high-performance, variable-length FFT processor
Shared memory architecture
– Simplicity of hardware
Proposed Radix-8/4/2 DIF butterfly unit
– Every point FFT computation from 64 to 8192 points
Proposed twiddle factor generator
– 80% reduction
SQNR : 45.3 ~ 66.7 dB
OFDM standard (symbol duration)
– Proposed FFT processor for OFDM applications, such as 802.11a, 802.16a, DAB, DVB-T and so on
Thank you for listening