Embedded Signal Processing Laboratory at UT Austin Prof. Brian L. Evans Dept. of Electrical and Computer Eng. The University of Texas at Austin http://www.ece.utexas.edu/~bevans.


Embedded Signal Processing
Laboratory at UT Austin
Prof. Brian L. Evans
Dept. of Electrical and Computer Eng.
The University of Texas at Austin
http://www.ece.utexas.edu/~bevans
Got to Texas as Fast as I Could…

BSEE/CS Rose-Hulman 1987
MSEE Georgia Tech 1988
PhDEE Georgia Tech 1993
Post-Doc UC Berkeley 1993-1996

Very happy to land in Austin in Fall 1996 at …



2
Summary of Previous NI Interaction

NI Support Prior to Fall 2002
  Funding for Babar Ahmed (undergraduate)
Real-Time DSP Lab alumni who went to NI
  Prethi Gopinath, Newton Petersen, Junichi Suguira
NI employees in Embedded Software Systems
  Hugo Andrade, Scott Kovner, Sadia Malik, Kurt Nee, Newton Petersen, Ram Rajagopal, Michael Schaeffer
3
Outline

Real-Time Digital Signal Processing Lab
  Programmable Digital Signal Processors
  Future Uses of LabVIEW
Embedded Software Systems graduate course
  Electronic Design Automation Tools
  Interaction with National Instruments
Research Group (Embedded Signal Proc. Lab)
  Common Themes
  ADSL Transceiver Design
4
Real-Time DSP Lab





Introduced Fall 1997: 384 served
Digital signal processing theory/algorithms
Digital communication systems
Digital signal processor architecture
Deliverable: Voiceband modem
  Design of sinusoidal generators, filters, etc.
  Implementation in C/assembly on TI floating-point TMS320C6700 DSP using Code Composer Studio
  Test implementation with spectrum analyzers, etc.
5
Digital Signal Processors (DSPs)


For real time (guaranteed delivery)
Fixed-point DSPs for high-volume products
  Battery-powered: cell phones, dial-up modems, portable MP3 players, digital still cameras, and digital video (e.g. TI C5000)
  Wall-powered: ADSL modems, VDSL modems, cell phone basestations, modem banks, laser printers, video conferencing systems (e.g. TI C6200, C6400)
Floating-point DSPs for low-volume products and feasibility analysis on fixed-point DSPs
TI 45%, Agere 25%, Motorola 10%, Analog Devices 8%
6
Digital Signal Processor Architecture



Harvard architecture: program/data memory separated and can be accessed on same cycle
Word size: 16, 20, 24, or 32 bits
Programmer must manage memory
  32-128 kwords data/program on chip
  On-chip data cache rare (TI C6000)
  No support for virtual memory
Predictable input/output: deterministic interrupt service routine latency (e.g. 11 cycles on TI C6000)
7
Digital Signal Processor Architecture



Deterministic, no-overhead looping
Single instruction cycle multiply unit(s)
No-overhead addressing modes in hardware
  Modulo addressing for circular buffers, e.g. filters
  Bit-reversed addressing, e.g. fast Fourier transforms (not available on TI C6000)
Native number formats
  Integer: binary point on far right of bit pattern
  Fractional: binary point just right of sign bit
  Floating-point: could emulate on fixed-point DSPs
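
To make the modulo addressing bullet concrete, here is a minimal sketch in Python of an FIR filter whose delay line is a circular buffer. The class name `CircularFIR` and the helper `fir_direct` are illustrative names, not from the slides. On a DSP the wrap-around would be done by the address generation hardware at no cycle cost, whereas this sketch spells it out with the `%` operator.

```python
class CircularFIR:
    """FIR filter whose delay line is a circular buffer indexed modulo its length."""

    def __init__(self, coeffs):
        self.coeffs = list(coeffs)
        self.buf = [0.0] * len(self.coeffs)
        self.newest = 0  # index of the most recently stored sample

    def step(self, x):
        n = len(self.buf)
        # Overwrite the oldest sample instead of shifting the whole delay line.
        self.newest = (self.newest + 1) % n
        self.buf[self.newest] = x
        acc = 0.0
        for k, c in enumerate(self.coeffs):
            # Modulo addressing: step back in time, wrapping at the buffer edge.
            acc += c * self.buf[(self.newest - k) % n]
        return acc


def fir_direct(coeffs, samples):
    """Reference: y[n] = sum_k h[k] * x[n-k] with zero initial conditions."""
    out = []
    for n in range(len(samples)):
        out.append(sum(c * samples[n - k] for k, c in enumerate(coeffs) if n - k >= 0))
    return out


coeffs = [0.5, 0.25, 0.25]
samples = [1.0, 2.0, 3.0, 4.0, 5.0]
filt = CircularFIR(coeffs)
circular_out = [filt.step(x) for x in samples]
```

Each call to `step` touches one buffer slot for the write and one per tap for the reads, so no samples are ever copied.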
8
Drawbacks to Programming DSPs

General drawbacks
  Limited on-chip memory
  Poor C compiler performance
Fixed-point issues
  Non-standard C extensions for fractional data
  Converting floating-point programs to fixed-point
  Manual tracking of binary point prone to error
Conventional DSPs
  No byte addressing (needed for image/video)
  Limited addressable memory on fixed-point DSPs
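
The binary point bookkeeping behind the fixed-point issues can be sketched with 16-bit fractional (Q15) arithmetic, where the binary point sits just right of the sign bit. This is an illustration in Python with made-up helper names (`to_q15`, `from_q15`, `q15_mul`), not code for any particular DSP. The product of two Q15 numbers has 30 fractional bits, so it must be shifted back by 15, and the single overflow case (-1 times -1) must be saturated.

```python
Q15_ONE = 1 << 15  # scale factor: 15 fractional bits after the sign bit


def to_q15(x):
    """Quantize a real number to a 16-bit Q15 integer, saturating to [-1, 1)."""
    v = int(round(x * Q15_ONE))
    return max(-Q15_ONE, min(Q15_ONE - 1, v))


def from_q15(v):
    """Convert a Q15 integer back to a Python float."""
    return v / Q15_ONE


def q15_mul(a, b):
    """Multiply two Q15 values and return a Q15 result."""
    product = a * b          # Q15 * Q15 gives Q30: the binary point has moved
    result = product >> 15   # shift back to Q15 (truncates the low bits)
    return max(-Q15_ONE, min(Q15_ONE - 1, result))  # saturate (-1) * (-1)


half = to_q15(0.5)
quarter = q15_mul(half, half)
minus_one_squared = q15_mul(to_q15(-1.0), to_q15(-1.0))
```

Forgetting the shift, or applying it in a different place in a chain of operations, is the kind of manual tracking error the slide refers to.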
9
LabVIEW for Real-Time DSP Lab


Students use LabVIEW in a prerequisite course
Fall 2003: System-level representation
  In the first lab, students are given a LabVIEW simulation of voiceband modem running on PC
  In each subsequent lab, students substitute the subsystem implemented on the DSP in the LabVIEW simulation to test the design
Future: Synthesis vs. handcoding
  In each lab, students use the LabVIEW modem simulation to synthesize the subsystem being designed and compare their handcode with it
10
Embedded Software Systems



Introduced in Spring 1997: 87 served
Modern methods for specifying, simulating, and synthesizing embedded systems
  Programming languages
  Concurrency
  Dataflow models
  Process network
  Scheduling
  Discrete-event models
  Software synthesis
  Cosimulation
Students evaluate/build system designs in
  Ptolemy from UC Berkeley
  Advanced Design System from Agilent
11
Dataflow Models
Examples in modern design automation tools
| EDA Tool | Dataflow Models | Example Application |
|---|---|---|
| Agilent Advanced Design System | Synchronous Dataflow, Timed Synchronous Dataflow | Mixed analog, digital, and RF communication systems (data transmission subsystem) |
| Co-Centric System Design Studio | Cyclostatic Dataflow | Periodic digital systems, e.g. data converters, MP3 decoder, digital baseband communications |
| Cadence Signal Processing Worksystem | Synchronous Dataflow, Dynamic Dataflow | Periodic digital systems |
| UC Berkeley Ptolemy | Synchronous Dataflow, Boolean Dataflow, Dynamic Dataflow | Periodic and aperiodic digital systems |
12
Synchronous Dataflow


[Lee 1986]
Arcs: one-way first-in first-out queues
A block is enabled for execution when enough tokens are available on all inputs
  Source blocks are always enabled
When block executes, it always produces and consumes the same fixed number of tokens
  Consumed data is dequeued from arc
  Flow of data through graph may not depend on values of data
Delay is a property of an arc
  Delay of n samples means that n tokens are initially in the queue of that arc
13
Synchronous Dataflow

Systems are determinate
  History of tokens produced on communication channels does not depend on the execution order
  May be executed sequentially or in parallel with the same outcome
Scheduling
  Load balancing to make sure that all tokens produced can be consumed: linear complexity
  Find a periodic schedule
    List scheduling: worst-case is exponential complexity
    Heuristics to minimize buffer size: cubic complexity
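
The load balancing step can be illustrated by solving the balance equations, which require that for every arc the tokens produced per schedule period equal the tokens consumed. The Python sketch below is an assumption-laden illustration: the function name `repetitions`, the arc tuple layout, and the three-block example graph are all hypothetical, the graph is assumed connected, and the code rescans the arc list for simplicity, so it is not the linear-time version the slide refers to.

```python
from fractions import Fraction
from math import gcd


def repetitions(actors, arcs):
    """Solve q[src] * produced == q[dst] * consumed for every arc.

    arcs is a list of (src, produced, dst, consumed) tuples. Returns the
    smallest positive integer firing counts per block, or None when the
    rates are inconsistent and no periodic schedule exists.
    """
    rate = {actors[0]: Fraction(1)}
    pending = [actors[0]]
    while pending:
        a = pending.pop()
        for src, produced, dst, consumed in arcs:
            if src == a:
                other, value = dst, rate[a] * produced / consumed
            elif dst == a:
                other, value = src, rate[a] * consumed / produced
            else:
                continue
            if other not in rate:
                rate[other] = value
                pending.append(other)
            elif rate[other] != value:
                return None  # tokens would accumulate or starve on some arc
    # Scale the rational solution to the smallest integer one.
    lcm = 1
    for r in rate.values():
        lcm = lcm * r.denominator // gcd(lcm, r.denominator)
    counts = {a: int(r * lcm) for a, r in rate.items()}
    g = 0
    for c in counts.values():
        g = gcd(g, c)
    return {a: c // g for a, c in counts.items()}


# A produces 2 tokens per firing, B consumes 3; B produces 1, C consumes 2.
balanced = repetitions(["A", "B", "C"], [("A", 2, "B", 3), ("B", 1, "C", 2)])

# Adding a direct A-to-C arc at a conflicting rate makes the graph inconsistent.
inconsistent = repetitions(
    ["A", "B", "C"],
    [("A", 1, "B", 1), ("B", 1, "C", 1), ("A", 2, "C", 1)],
)
```

In the balanced example, firing A three times, B twice, and C once returns every queue to its initial state, which is what makes a periodic schedule possible.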
14
Synchronous Dataflow Modeling
Signal Processing
  Finite impulse response filters
  Infinite impulse response filters
  Fast Fourier transform
  Multirate systems and filter banks
Communication Systems
  Sinusoidal modulation and demodulation
  Pulse shapers
  Transmission subsystem
Inappropriate for data-dependent graphs, e.g. baud rate negotiation at modem startup
15
Process Network


[Kahn 1974]
A set of concurrent processes that communicate through network of one-way infinite first-in first-out (FIFO) queues
Reads from queues are blocking
  If the queue is empty, the process will suspend until there is enough data in the queue.
  When a process blocks, the scheduler will not run the process until enough data becomes available.
Writes to the queues are non-blocking
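
The blocking-read, non-blocking-write rule can be sketched with Python threads and unbounded `queue.Queue` objects. This is only an illustration of the model, not the Ptolemy or UT Austin implementation. The `DONE` end-of-stream marker is an assumption added so that the demo terminates; a signal processing network would normally run forever.

```python
import queue
import threading

DONE = object()  # hypothetical end-of-stream marker so the demo terminates


def source(out, values):
    for v in values:
        out.put(v)          # unbounded queue: a write never blocks
    out.put(DONE)


def scale(inp, out, gain):
    while True:
        v = inp.get()       # blocking read: suspends until a token arrives
        if v is DONE:
            out.put(DONE)
            return
        out.put(gain * v)


def add(in_a, in_b, out):
    while True:
        a = in_a.get()      # the process waits on one input channel at a time
        b = in_b.get()
        if a is DONE or b is DONE:
            out.put(DONE)
            return
        out.put(a + b)


def run_network(xs, ys):
    """Compute 2*x + y element by element with four concurrent processes."""
    qa, qb, qc, qd = (queue.Queue() for _ in range(4))  # maxsize=0: unbounded
    procs = [
        threading.Thread(target=source, args=(qa, xs)),
        threading.Thread(target=source, args=(qb, ys)),
        threading.Thread(target=scale, args=(qa, qc, 2)),
        threading.Thread(target=add, args=(qc, qb, qd)),
    ]
    for p in procs:
        p.start()
    result = []
    while True:
        v = qd.get()
        if v is DONE:
            break
        result.append(v)
    for p in procs:
        p.join()
    return result


output = run_network([1, 2, 3], [10, 20, 30])
```

However the operating system interleaves the four threads, the sequence of tokens on the output queue is the same, because no process can test a queue for emptiness or read from more than one queue at once.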
16
Process Network


A process is either enabled or blocked waiting for data on only one of its input channels
Systems are determinate
  History of tokens produced on communication channels does not depend on the execution order
  May be executed sequentially or in parallel with the same outcome
Supports recurrence and recursion
Formal mathematical representation: processes are functions that map streams into streams
17
Process Network


Turing complete: questions of termination and bounded buffering are undecidable
Undecidable (in finite time) if process network
  Terminates
  Requires bounded memory
Signal processing: run for infinite time
Scheduler can find a bounded memory solution using infinite time [Parks 1995]
  Ptolemy Process Network domain
  UT Austin Computational Process Network framework in C++
  http://www.ece.utexas.edu/~allen/PNSourceCode/
18
NIers in Embedded Software Systems

Hugo Andrade and Scott Kovner, 1998,
“Software Synthesis from Dataflow Models for
Embedded Software Design in the G
Programming Language and the LabVIEW
Development Environment”

Kurt Nee (with Chad Roesle), 1999,
“Feasibility of Implementing an H.263+ Decoder
on a TMS320C6x Digital Signal Processor”
19
NIers in Embedded Software Systems

Michael Schaeffer, 1999,
“An Extension to the Foundation Fieldbus Model for
Specifying Process Control Strategies”

Sadia Malik and Ram Rajagopal, 2000,
“LabVIEW Based Embedded Design”

Newton Petersen (with Martin Wojcik), 2000,
“Node Prefetch Prediction in Dataflow Graphs”
20
Prof. Brian L. Evans
http://signal.ece.utexas.edu
ADSL/VDSL Transceiver Design
  Ph.D. graduates: Güner Arslan (Cicada), Biao Lu (Schlumberger)
  Ph.D. students: Dogu Arifler, Ming Ding, Milos Milosevic (Schlumberger)
Real-Time Imaging
  Ph.D. graduates: Thomas D. Kite (Audio Precision), Niranjan Damera-Venkata (HP Labs)
  Ph.D. students: Gregory E. Allen (UT Applied Research Labs), Serene Banerjee
  MS graduates: Young Cho (UCLA)
  MS students: Vishal Monga
Wireless Communications
  Ph.D. graduates: Murat Torlak (UT Dallas)
  Ph.D. student: Kyungtae Han
  MS graduates: Srikanth K. Gummadi (TI), Amey A. Deosthali (TI)
  MS students: Zukang Shen, Ian Wong
  Wireless Networking and Comm. Group: http://www.wncg.org
Image Analysis
  Ph.D. graduates: Dong Wei (SBC Research), K. Clint Slatton (UT Center for Space Research), Wade C. Schwartzkopf
  Center for Perceptual Systems: http://www.cps.utexas.edu
21
Common Themes


Find or derive optimal algorithm
Develop low-complexity algorithms (bottom-up design)
  Keep in mind that these algorithms will ultimately be realized in real time on a fixed-point DSP
  Algorithms should be statically scheduled
  Evaluate performance-implementation tradeoff
System-level design (top-down design)
  Dataflow modeling for synthesis
  Simulate system to validate algorithm
Software releases
22
ADSL Transceiver Design

Asymmetric Digital Subscriber Line modem
  Line driver (single chip)
  Transceiver: analog front end + digital baseband
Sampling rate: 2.208 MHz (real time)
Bit error rate: $10^{-7}$ (Reed-Solomon codes)
Symbol rate: 4,000 symbols/s
Frame is symbol plus redundant information
Single frame transmission (low delay)
Proper equalizer design can double bit rate
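
The sampling rate and symbol rate above can be tied together with a short calculation. The sketch below uses the downstream G.DMT values quoted elsewhere in this deck (a symbol of 512 samples and a cyclic prefix of 32 samples) and 2.208 million samples per second. The 68-data-frames-plus-one-synchronization-frame superframe is an assumption taken from the G.DMT standard rather than from these slides.

```python
from fractions import Fraction

SAMPLE_RATE = 2_208_000   # samples per second, downstream
SYMBOL_LEN = 512          # N, samples per symbol
PREFIX_LEN = 32           # cyclic prefix samples per frame
DATA_FRAMES = 68          # assumption: data frames per G.DMT superframe
SYNC_FRAMES = 1           # assumption: synchronization frames per superframe

frame_len = SYMBOL_LEN + PREFIX_LEN
frames_per_second = Fraction(SAMPLE_RATE, frame_len)
data_symbols_per_second = (
    frames_per_second * DATA_FRAMES / (DATA_FRAMES + SYNC_FRAMES)
)
subchannel_width_hz = Fraction(SAMPLE_RATE, SYMBOL_LEN)
```

Under these assumptions the data symbol rate comes out at exactly 4,000 per second, and the subchannel spacing at 4312.5 Hz, which matches the 4.3 kHz figure quoted for ADSL subchannels.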




23
Digital Subscriber Line (DSL)
Broadband Access
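[Figure: DSL broadband access. At the customer premises, a DSL modem and a telephone behind a low-pass filter (LPF) share the line. At the central office, a DSL modem in the DSLAM connects to the Internet, and an LPF feeds the voice switch and telephone network. The link is labeled downstream and upstream.]
DSLAM - DSL Access Multiplexer
LPF – Low Pass Filter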
24
Discrete Multitone (DMT) Standards

ADSL – Asymmetric DSL (G.DMT Standard)
  Echo cancelled no longer deployed in central office
  Frequency division multiplexing max. data rates: 13.38 Mbps downstream, 1.56 Mbps upstream
  ADSL:cable modem – 1:2 in US & 5:1 non-US
DMT VDSL – Very High Rate DSL (Proposed)
  Faster G.DMT ADSL
  Freq. division multiplex
  $2^m$ subcarriers, $m \in [8, 12]$

| | G.DMT ADSL | Asymmetric DMT VDSL |
|---|---|---|
| Data band | 25 kHz – 1.1 MHz | 1 MHz – 12 MHz |
| Upstream subcarriers | 32 | 256 |
| Downstream subcarriers | 256 | 2048/4096 |
| Target upstream rate | 1 Mbps | 3 Mbps |
| Target downstream rate | 8 Mbps | 13/22 Mbps |
25
Multicarrier Modulation

Divide channel into narrowband subchannels
  No inter-symbol interference (ISI) if constant gain in every subchannel and ideal sampling
Discrete multitone modulation
  Based on fast Fourier transform (FFT)
  Subchannels are 4.3 kHz wide in ADSL and DMT VDSL
[Figure: an ideal subchannel with band edges $-\omega_c$ and $\omega_c$ has the sinc pulse $\frac{\sin(\omega_c k)}{\pi k}$ as its inverse DTFT. A magnitude vs. frequency plot marks the channel, a carrier, and a subchannel.]
26
Discrete Multitone Modulation Symbol

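Subsymbols are complex-valued
  ADSL training uses 4-level Quadrature Amplitude Modulation (QAM)
  ADSL uses QAM of $2^2, 2^3, 2^4, \ldots, 2^{15}$ levels during data transmission
[Figure: QAM constellation with in-phase and quadrature axes showing a subsymbol $X_i$. The N/2 subsymbols (one subsymbol per carrier) $X_0, X_1, X_2, \ldots, X_{N/2}$ and the mirrored conjugates $X_{N/2-1}^*, \ldots, X_2^*, X_1^*$ feed an N-point inverse FFT, which outputs one symbol of N real-valued samples $x_1, x_2, x_3, \ldots, x_N$.]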
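
Why mirroring the subsymbols yields real-valued samples can be checked numerically. The sketch below is a plain-Python illustration with hypothetical names (`idft`, `dmt_symbol`). A naive O(N^2) inverse DFT stands in for the inverse FFT, and the DC and Nyquist bins are set to zero as a simplifying assumption of this sketch.

```python
import cmath


def idft(X):
    """Naive inverse DFT, O(N^2); stands in for the N-point inverse FFT."""
    N = len(X)
    return [
        sum(X[k] * cmath.exp(2j * cmath.pi * k * n / N) for k in range(N)) / N
        for n in range(N)
    ]


def dmt_symbol(subsymbols):
    """Build one DMT symbol of N samples from N/2 - 1 complex subsymbols.

    Assumption for this sketch: bins 0 (DC) and N/2 (Nyquist) carry no data.
    """
    mirrored = [s.conjugate() for s in reversed(subsymbols)]
    # Layout: [0, X1, ..., X_{N/2-1}, 0, X_{N/2-1}*, ..., X1*]
    X = [0j] + list(subsymbols) + [0j] + mirrored
    return idft(X)


qam4 = [1 + 1j, -1 + 1j, 1 - 1j]   # three 4-QAM subsymbols, so N = 8
samples = dmt_symbol(qam4)
largest_imaginary_part = max(abs(s.imag) for s in samples)
```

Because bin N-k holds the conjugate of bin k, each pair of terms in the inverse transform sums to a real number, so the imaginary parts vanish up to rounding error and only the real parts need to go to the D/A converter.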
27
Discrete Multitone Modulation Frame

Frame through D/A converter and transmitted
  Frame is the symbol with cyclic prefix prepended
  Cyclic prefix (CP) is last ν samples of symbol
Linear convolution of frame w/ channel impulse response
  Is circular convolution if channel length is CP length plus one or shorter
  If circular, frequency equalization in FFT domain
[Figure: frames i and i+1, each a CP of ν samples copied from the end of the symbol, followed by the symbol of N samples.]

ADSL G.DMT Values
| | Downstream | Upstream |
|---|---|---|
| ν | 32 | 4 |
| N | 512 | 64 |
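
The claim that the cyclic prefix turns linear convolution into circular convolution can be verified on a toy example. The following Python sketch uses made-up sample values, a symbol of 8 samples, and a cyclic prefix length `nu` of 2. The function names are illustrative only.

```python
def linear_convolve(x, h):
    """Ordinary (linear) convolution, as performed by the channel."""
    y = [0.0] * (len(x) + len(h) - 1)
    for i, xi in enumerate(x):
        for j, hj in enumerate(h):
            y[i + j] += xi * hj
    return y


def circular_convolve(x, h):
    """Circular convolution of x with h over len(x) points."""
    N = len(x)
    return [sum(h[j] * x[(n - j) % N] for j in range(len(h))) for n in range(N)]


def add_cyclic_prefix(symbol, nu):
    """Prepend a copy of the last nu samples of the symbol (nu >= 1)."""
    return symbol[-nu:] + symbol


symbol = [1.0, -2.0, 3.0, 0.5, -1.0, 2.0, 4.0, -3.0]   # N = 8
nu = 2
channel = [1.0, 0.5, -0.25]                 # length nu + 1: short enough
frame = add_cyclic_prefix(symbol, nu)
received = linear_convolve(frame, channel)
kept = received[nu:nu + len(symbol)]        # receiver discards the prefix

long_channel = [1.0, 0.5, -0.25, 0.125]     # length nu + 2: too long
received_long = linear_convolve(frame, long_channel)
kept_long = received_long[nu:nu + len(symbol)]
```

With the short channel, `kept` matches the circular convolution of the symbol with the channel, so a one-tap division per FFT bin can undo the channel. With the channel one tap longer than the prefix allows, the match breaks and inter-symbol interference remains.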
28
Eliminating Inter-Symbol Interference

Time domain equalizer (TEQ)
  Finite impulse response (FIR) filter
  Effective channel impulse response: convolution of TEQ impulse response with channel impulse response
Frequency domain equalizer (FEQ)
  Compensates magnitude and phase distortion of channel + TEQ by dividing each FFT coefficient by complex number
ADSL G.DMT equalizer training
  Reverb: same symbol sent 1,024 to 1,536 times
  Medley: aperiodic sequence of 16,384 symbols
  At 0.25 s after medley, receiver returns number of bits on each subcarrier that can be supported
[Figure: channel impulse response and the shorter effective channel impulse response, marked with the delay Δ and a window of ν+1 samples. Δ: transmission delay; ν: cyclic prefix length.]
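
The role of the TEQ, shortening the effective channel so that most of its energy fits in a window of cyclic prefix length plus one samples, can be sketched numerically. Everything in this Python example is hypothetical: the geometrically decaying channel, the two-tap equalizer chosen by inspection, and the helper names. None of the TEQ design methods named in this deck is implemented here.

```python
def convolve(a, b):
    """Linear convolution of two finite sequences."""
    out = [0.0] * (len(a) + len(b) - 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            out[i + j] += ai * bj
    return out


def windowed_energy_fraction(h, width):
    """Largest fraction of the energy of h inside `width` consecutive taps."""
    total = sum(v * v for v in h)
    starts = range(max(1, len(h) - width + 1))
    best = max(sum(v * v for v in h[d:d + width]) for d in starts)
    return best / total


nu = 1                                       # toy cyclic prefix length
channel = [0.9 ** k for k in range(8)]       # hypothetical long, decaying channel
teq = [1.0, -0.9]                            # hypothetical 2-tap TEQ
effective = convolve(channel, teq)           # effective channel = channel * TEQ

before = windowed_energy_fraction(channel, nu + 1)
after = windowed_energy_fraction(effective, nu + 1)
```

In this toy case the equalizer cancels all but the first and last taps of the effective channel, so the fraction of energy inside the best two-tap window roughly doubles. The residual energy outside the window is what would cause inter-symbol interference.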
29
ADSL Transceiver: Data Transmission
[Figure: block diagram of data transmission. The time domain equalizer and frequency domain equalizer in the receiver form the conventional ADSL equalizer structure.]
TRANSMITTER: Bits (00110) → S/P → quadrature amplitude modulation (QAM) encoder (N/2 subchannels) → mirror data and N-IFFT (N real samples) → add cyclic prefix → P/S → D/A + transmit filter → channel
RECEIVER: receive filter + A/D → time domain equalizer (FIR filter) → S/P → remove cyclic prefix (N real samples) → N-FFT and remove mirrored data (N/2 subchannels) → invert channel = frequency domain equalizer → QAM demod decoder → P/S
30
Simulation Results for 17-Tap TEQ
Achievable percentage of upper bound on bit rate

| ADSL CSA Loop | Minimum MSE | Maximum Geometric SNR | Maximum Shortening SNR | Minimum ISI | Maximum Bit Rate | Upper Bound (Mbps) |
|---|---|---|---|---|---|---|
| 1 | 43% | 84% | 62% | 99% | 99% | 9.059 |
| 2 | 70% | 73% | 75% | 98% | 99% | 10.344 |
| 3 | 64% | 94% | 82% | 99% | 99% | 8.698 |
| 4 | 70% | 68% | 61% | 98% | 99% | 8.695 |
| 5 | 61% | 84% | 72% | 98% | 99% | 9.184 |
| 6 | 62% | 93% | 80% | 99% | 99% | 8.407 |
| 7 | 57% | 78% | 74% | 99% | 99% | 8.362 |
| 8 | 66% | 90% | 71% | 99% | 100% | 7.394 |

| Parameter | Value |
|---|---|
| Cyclic prefix length | 32 |
| FFT size (N) | 512 |
| Coding gain | 4.2 dB |
| Margin | 6 dB |
| Input power | 23 dBm |
| Noise power | -140 dBm/Hz |
| Crosstalk noise | 8 ADSL disturbers |
| POTS splitter | 5th order Chebyshev |
31
Simulation Results for 3-Tap TEQ
Achievable percentage of matched filter bound on bit rate

| ADSL CSA Loop | Minimum MSE | Maximum Geometric SNR | Maximum Shortening SNR | Minimum ISI | Maximum Bit Rate | Upper Bound (Mbps) |
|---|---|---|---|---|---|---|
| 1 | 54% | 70% | 96% | 97% | 98% | 9.059 |
| 2 | 47% | 71% | 96% | 96% | 97% | 10.344 |
| 3 | 57% | 69% | 92% | 98% | 99% | 8.698 |
| 4 | 46% | 66% | 97% | 97% | 98% | 8.695 |
| 5 | 52% | 65% | 96% | 97% | 98% | 9.184 |
| 6 | 60% | 71% | 95% | 98% | 99% | 8.407 |
| 7 | 46% | 63% | 93% | 96% | 97% | 8.362 |
| 8 | 55% | 61% | 94% | 98% | 99% | 7.394 |

| Parameter | Value |
|---|---|
| Cyclic prefix length | 32 |
| FFT size (N) | 512 |
| Coding gain | 4.2 dB |
| Margin | 6 dB |
| Input power | 23 dBm |
| Noise power | -140 dBm/Hz |
| Crosstalk noise | 8 ADSL disturbers |
| POTS splitter | 5th order Chebyshev |
32
Contributions by Research Group

New time-domain equalizer design methods
  Maximum Bit Rate method maximizes bit rate (upper bound)
  Minimum Inter-Symbol Interference method (real-time, fixed-point)
Minimum Inter-Symbol Interference TEQ design method
  Reduces number of TEQ taps by a factor of ten over Minimum Mean Squared Error method for the same bit rate in discretized simulation
  Implemented in real time on Motorola 56000, TI TMS320C6200 and TI TMS320C5000 DSPs:
  http://www.ece.utexas.edu/~bevans/projects/adsl
33
Matlab DMT TEQ Design Toolbox 3.1
FIR, dual-path, per-tone & filter bank equalizers:
http://www.ece.utexas.edu/~bevans/projects/adsl/dmtteq/
[Screenshot: toolbox window annotated with default parameters from G.DMT ADSL standard (fields showing 23 and -140), various performance measures, and different graphical views.]
34
Future Interaction with NI



Integrate LabVIEW into Real-Time DSP Lab to reinforce modem system being designed
Add lecture on LabVIEW computational model in Embedded Software Systems course
Discuss ideas for extensions to LabVIEW for synthesis onto programmable DSPs
  Evaluate restrictions and extensions to the G language for synthesis
  Investigate methods for conversion of floating-point source code to fixed-point
35