DSP Lecture 01

Chapter 1
Introduction
Chapter 1, Slide 1
Learning Objectives
• Why process signals digitally?
• Definition of a real-time application.
• Why use Digital Signal Processing processors?
• What are the typical DSP algorithms?
• Parameters to consider when choosing a DSP processor.
• Programmable vs ASIC DSP.
• Texas Instruments’ TMS320 family.
Chapter 1, Slide 2
Present Day Applications
DSP: Technology Enabler
Wireless / Cellular
– Voice-band audio
– RF codecs
– Voltage regulation
Consumer Audio
– Stereo A/D, D/A
– PLL
– Mixers
HDD
– PRML read channel
– MR pre-amp
– Servo control
– SCSI transceivers
Automotive
– Digital radio A/D/A
– Active suspension
– Voltage regulation
Multimedia
– Stereo audio
– Imaging
– Graphics palette
– Voltage regulation
DTAD
– Speech synthesizer
– Mixed-signal processor
Chapter 1, Slide 3
Why go digital?
• Digital signal processing techniques are now so
powerful that sometimes it is extremely difficult, if
not impossible, for analogue signal processing to
achieve similar performance.
• Examples:
– FIR filter with linear phase.
– Adaptive filters.
Chapter 1, Slide 4
Why go digital?
• Analogue signal processing is achieved by using
analogue components such as:
– Resistors.
– Capacitors.
– Inductors.
• The inherent tolerances associated with these
components, temperature, voltage changes and
mechanical vibrations can dramatically affect the
effectiveness of the analogue circuitry.
Chapter 1, Slide 5
Why go digital?
• With DSP it is easy to:
– Change applications.
– Correct applications.
– Update applications.
• Additionally DSP reduces:
– Noise susceptibility.
– Chip count.
– Development time.
– Cost.
– Power consumption.
Chapter 1, Slide 6
Why NOT go digital?
• High frequency signals cannot be processed
digitally for two reasons:
– Analog-to-Digital Converters (ADCs) cannot work fast
enough.
– The application can be too complex to be performed in
real-time.
Chapter 1, Slide 7
Real-time processing
• DSP processors have to perform tasks in real-time,
so how do we define real-time?
• The definition of real-time depends on the
application.
• Example: a 100-tap FIR filter is performed in real-time
if the DSP can perform and complete the following
operation between two samples:
$$y(n) = \sum_{k=0}^{99} a_k \, x(n-k)$$
Chapter 1, Slide 8
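The 100-tap sum above can be sketched in C as follows. This is a minimal illustration, not code from the lecture: the names `fir_sample`, `fir_push`, `NUM_TAPS` and the `history` delay line are assumptions made for the example, and the delay line is shifted naively rather than with the circular addressing a DSP would normally use.

```c
#include <assert.h>
#include <stddef.h>

#define NUM_TAPS 100  /* matches the 100-tap example (hypothetical name) */

/*
 * Compute one output sample y(n) = sum_{k=0}^{NUM_TAPS-1} a[k] * x(n-k).
 * history[0] holds x(n), history[1] holds x(n-1), and so on.
 */
float fir_sample(const float a[NUM_TAPS], const float history[NUM_TAPS])
{
    assert(a != NULL && history != NULL);
    float y = 0.0f;
    for (size_t k = 0; k < NUM_TAPS; k++) {
        y += a[k] * history[k];   /* one multiply-accumulate per tap */
    }
    return y;
}

/* Shift a new input sample into the delay line, discarding the oldest one. */
void fir_push(float history[NUM_TAPS], float new_sample)
{
    assert(history != NULL);
    for (size_t k = NUM_TAPS - 1; k > 0; k--) {
        history[k] = history[k - 1];
    }
    history[0] = new_sample;
}
```

For each incoming sample, a caller would invoke `fir_push` and then `fir_sample`; both calls together must finish before the next sample arrives for the filter to run in real-time.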
Real-time processing
[Figure: timing diagram. The Sample Time between samples n and n+1 is divided into Processing Time followed by Waiting Time.]
• We can say that we have a real-time application if:
– Waiting Time ≥ 0
Chapter 1, Slide 9
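The real-time condition can be checked with simple arithmetic on cycle counts. The sketch below is illustrative only: the function names are hypothetical, and it assumes that processing time is expressed in CPU cycles and that the clock rate is an exact multiple of the sample rate.

```c
#include <assert.h>
#include <stdbool.h>

/* Cycles available between two samples: clock rate divided by sample rate. */
long cycles_per_sample(long clock_hz, long sample_rate_hz)
{
    assert(clock_hz > 0 && sample_rate_hz > 0);
    return clock_hz / sample_rate_hz;
}

/* Real-time condition: Waiting Time = Sample Time - Processing Time >= 0. */
bool is_real_time(long clock_hz, long sample_rate_hz, long processing_cycles)
{
    long waiting_cycles =
        cycles_per_sample(clock_hz, sample_rate_hz) - processing_cycles;
    return waiting_cycles >= 0;
}
```

As an example under assumed figures, a 150 MHz device sampling at 48 kHz has 3125 cycles per sample, so an algorithm needing a few hundred cycles per sample leaves a positive waiting time, while one needing 4000 cycles does not.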
Why do we need DSP processors?
•
Why not use a General Purpose Processor (GPP)
such as a Pentium instead of a DSP processor?
– What is the power consumption of a Pentium and a DSP
processor?
– What is the cost of a Pentium and a DSP processor?
Chapter 1, Slide 10
Why do we need DSP processors?
• Use a DSP processor when the following are
required:
– Cost saving.
– Smaller size.
– Low power consumption.
– Processing of many “high” frequency signals in
real-time.
• Use a GPP processor when the following are
required:
– Large memory.
– Advanced operating systems.
Chapter 1, Slide 11
What are the typical DSP algorithms?
• The Sum of Products (SOP) is the key element in
most DSP algorithms:
Algorithm | Equation
Finite Impulse Response Filter | $y(n) = \sum_{k=0}^{M} a_k \, x(n-k)$
Infinite Impulse Response Filter | $y(n) = \sum_{k=0}^{M} a_k \, x(n-k) + \sum_{k=1}^{N} b_k \, y(n-k)$
Convolution | $y(n) = \sum_{k=0}^{N} x(k) \, h(n-k)$
Discrete Fourier Transform | $X(k) = \sum_{n=0}^{N-1} x(n) \exp[-j(2\pi/N)nk]$
Discrete Cosine Transform | $F(u) = \sum_{x=0}^{N-1} c(u) \, f(x) \cos\left[\frac{\pi}{2N} u(2x+1)\right]$
Chapter 1, Slide 12

What Problem Are We Trying To Solve?
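Digital sampling of an analog signal:
[Figure: signal chain x → ADC → DSP → DAC → Y, with a plot of the sampled signal (axis labels A and t).]
Most DSP algorithms can be expressed with MAC:
$$Y = \sum_{i=1}^{\text{count}} a_i \cdot x_i$$
for (i = 0; i < count; i++){
sum += m[i] * n[i]; }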
What does it take to do this fast … and easy?
Chapter 1, Slide 13
Fast MAC using only C
Multiply-Accumulate (MAC) in Natural C Code
for (i = 0; i < count; i++){
sum += m[i] * n[i]; }
• Fastest Execution of MACs
– The ‘C6x roadmap ... from 200 to 2400 MMACs
• Ease of C Programming
– Even using natural C, the ‘C6000 Architecture can perform 2 to 4 MACs
per cycle
– Compiler generates 80-100% efficient code
Chapter 1, Slide 14
How does the ‘C6000 achieve such performance from C?
'C6000 Architecture: Built for Speed
[Figure: 'C6000 CPU block diagram. Memory connects to register files A0 .. A15 .. A31 and B0 .. B15 .. B31 and to eight functional units (.D1, .D2, .M1, .M2, .L1, .L2, .S1, .S2), under a Controller/Decoder.]
• ‘C6000 Compiler excels at Natural C
• While dual-MAC speeds math intensive algorithms,
flexibility of 8 independent functional units allows the
compiler to quickly perform other types of processing
• All ‘C6000 instructions are conditional allowing efficient
hardware pipelining
• Instruction set and CPU hardware orthogonality allow
the compiler to achieve 80-100% efficiency
Chapter 1, Slide 16
Fastest MAC using Natural C
float mac(float *m, float *n, int count)
{ int i; float sum = 0;
for (i=0; i < count; i++) {
sum += m[i] * n[i]; } …
[Figure: 'C6000 CPU block diagram, as on Slide 16: Memory, register files A0 .. A31 and B0 .. B31, functional units .D1 .D2 .M1 .M2 .L1 .L2 .S1 .S2, Controller/Decoder.]
;** --------------------------------------------------*
LOOP: ; PIPED LOOP KERNEL
        LDDW   .D1   A4++,A7:A6
||      LDDW   .D2   B4++,B7:B6
||      MPYSP  .M1X  A6,B6,A5
||      MPYSP  .M2X  A7,B7,B5
||      ADDSP  .L1   A5,A8,A8
||      ADDSP  .L2   B5,B8,B8
|| [A1] B      .S2   LOOP
|| [A1] SUB    .S1   A1,1,A1
;** --------------------------------------------------*
Chapter 1, Slide 17
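One way to read the kernel listing is that each pass loads two elements from each array and keeps two running sums, one per register file (A8 and B8). The C sketch below mirrors that idea by hand. It is an illustration under that reading, not compiler output; the function name is hypothetical, and splitting a floating-point sum into two accumulators can change rounding slightly compared with the single-accumulator loop.

```c
#include <assert.h>

/*
 * Dot product with two independent accumulators, mirroring the two
 * ADDSP running sums (A8 and B8) in the kernel listing.
 */
float mac_two_accumulators(const float *m, const float *n, int count)
{
    assert(count >= 0);
    float sum_a = 0.0f;  /* plays the role of A8 */
    float sum_b = 0.0f;  /* plays the role of B8 */
    int i;

    /* Each pass consumes two elements from each array, like one LDDW pair. */
    for (i = 0; i + 1 < count; i += 2) {
        sum_a += m[i]     * n[i];
        sum_b += m[i + 1] * n[i + 1];
    }
    if (i < count) {          /* leftover element when count is odd */
        sum_a += m[i] * n[i];
    }
    return sum_a + sum_b;     /* combine the partial sums once, after the loop */
}
```

Because the two sums do not depend on each other, two multiplies and two adds can proceed in parallel, which is the source of the two MACs per cycle.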
'C6000 System Block Diagram
[Figure: 'C6000 system block diagram. The CPU (Register Set A, Register Set B, functional units .D1 .D2 .M1 .M2 .L1 .L2 .S1 .S2) connects through the Internal Buses to Internal Memory, External Memory and the Peripherals.]
Looking at the internal buses ...
Chapter 1, Slide 18
‘C6000 Internal Buses
Bus | Width
Program Addr | x32
Program Data | x256
Data Addr - T1 | x32
Data Data - T1 | x32/64
Data Addr - T2 | x32
Data Data - T2 | x32/64
DMA Addr - Read |
DMA Data - Read |
DMA Addr - Write |
DMA Data - Write |
[Figure: buses connecting the PC, A regs and B regs to Internal Memory, External Memory, Peripherals and the DMA.]
Chapter 1, Slide 19
'C6000 System Block Diagram
[Figure: 'C6000 system block diagram, as on Slide 18: CPU (Register Set A, Register Set B, .D1 .D2 .M1 .M2 .L1 .L2 .S1 .S2), Internal Buses, Internal Memory, External Memory.]
Next, the internal memory ...
Chapter 1, Slide 20
‘C6711 Memory
Address | Contents
0000_0000 | 64KB Internal (Prog / Data)
0180_0000 | On-chip Peripherals
8000_0000 | 0: 128MB External
9000_0000 | 1: 128MB External
A000_0000 | 2: 128MB External
B000_0000 | 3: 128MB External
FFFF_FFFF |
[Figure: CPU with a 4K Program Cache, a 4K Data Cache and 64K Prog / Data (Level 2).]
Chapter 1, Slide 21
'C6000 System Block Diagram
[Figure: 'C6000 system block diagram, as on Slide 18: CPU (Register Set A, Register Set B, .D1 .D2 .M1 .M2 .L1 .L2 .S1 .S2), Internal Buses, Internal Memory, External Memory, Peripherals.]
Looking at each peripheral ...
Chapter 1, Slide 24
Hardware vs. Microcode multiplication
• DSP processors are optimised to perform
multiplication and addition operations.
• Multiplication and addition are done in hardware and in
one cycle.
• Example: 4-bit multiply (unsigned).
Hardware:
    1011
  x 1110
--------
10011010
Microcode:
    1011
  x 1110
--------
    0000    Cycle 1
   1011.    Cycle 2
  1011..    Cycle 3
 1011...    Cycle 4
--------
10011010    Cycle 5
Chapter 1, Slide 26
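The microcode column corresponds to the classic shift-and-add method: one partial product per multiplier bit, then a final sum. A minimal C sketch of that method follows. The function name is hypothetical, and the loop adds each partial product as it goes rather than summing them in a separate final step.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Unsigned 4-bit multiply by shift-and-add, one partial product per step,
 * in the manner of the microcode column. Returns the 8-bit product.
 */
uint8_t multiply4_shift_add(uint8_t multiplicand, uint8_t multiplier)
{
    assert(multiplicand < 16 && multiplier < 16);
    uint8_t product = 0;
    for (int bit = 0; bit < 4; bit++) {
        if ((multiplier >> bit) & 1u) {
            /* add the multiplicand shifted left by the bit position */
            product = (uint8_t)(product + (multiplicand << bit));
        }
    }
    return product;
}
```

Calling `multiply4_shift_add(0xB, 0xE)` reproduces the slide's example, 1011 x 1110 = 10011010 (11 x 14 = 154). A hardware multiplier produces the same result without the loop.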
Parameters to consider when choosing a
DSP processor
Parameter | TMS320C6211 (@150MHz) | TMS320C6711 (@150MHz)
Arithmetic format | 32-bit | 32-bit
Extended floating point | N/A | 64-bit
Extended Arithmetic | 40-bit | 40-bit
Performance (peak) | 1200MIPS | 1200MFLOPS
Number of hardware multipliers | 2 (16 x 16-bit) with 32-bit result | 2 (32 x 32-bit) with 32 or 64-bit result
Number of registers | 32 | 32
Internal L1 program memory cache | 32K | 32K
Internal L1 data memory cache | 32K | 32K
Internal L2 cache | 512K | 512K
• C6711 Datasheet: \Links\TMS320C6711.pdf
• C6211 Datasheet: \Links\TMS320C6211.pdf
Chapter 1, Slide 27
Parameters to consider when choosing a
DSP processor
Parameter | TMS320C6211 (@150MHz) | TMS320C6711 (@150MHz)
I/O bandwidth: Serial Ports (number/speed) | 2 x 75Mbps | 2 x 75Mbps
DMA channels | 16 | 16
Multiprocessor support | Not inherent | Not inherent
Supply voltage | 3.3V I/O, 1.8V Core | 3.3V I/O, 1.8V Core
Power management | Yes | Yes
On-chip timers (number/width) | 2 x 32-bit | 2 x 32-bit
Cost | US$ 21.54 | US$ 21.54
Package | 256 Pin BGA | 256 Pin BGA
External memory interface controller | Yes | Yes
JTAG | Yes | Yes
Chapter 1, Slide 28
Floating vs. Fixed point processors
• Applications which require:
– High precision.
– Wide dynamic range.
– High signal-to-noise ratio.
– Ease of use.
need a floating point processor.
• Drawbacks of floating point processors:
– Higher power consumption.
– Can be more expensive.
– Can be slower than fixed-point counterparts and larger in
size.
Chapter 1, Slide 29
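The "ease of use" point is easier to see next to what fixed-point code has to do by hand. The sketch below uses the common Q15 format, chosen here as an assumption because the slides do not name a format; the function names are hypothetical. The programmer has to manage scaling, rounding and saturation explicitly, all of which a floating-point multiply handles automatically.

```c
#include <assert.h>
#include <stdint.h>

/* Q15: a 16-bit integer q represents the real value q / 32768, range [-1, 1). */
int16_t q15_from_float(float value)
{
    assert(value >= -1.0f && value < 1.0f);
    return (int16_t)(value * 32768.0f);   /* truncates toward zero */
}

/* Multiply two Q15 values with rounding and saturation. */
int16_t q15_mul(int16_t a, int16_t b)
{
    int32_t product = (int32_t)a * (int32_t)b;  /* Q30 result in 32 bits */
    product += 1 << 14;                         /* round to nearest */
    product >>= 15;   /* back to Q15; assumes arithmetic shift for negatives */
    if (product > INT16_MAX) {                  /* only -1 * -1 overflows */
        product = INT16_MAX;
    }
    return (int16_t)product;
}
```

For example, 0.5 x 0.5 is `q15_mul(16384, 16384)`, which gives 8192 (0.25), while -1 x -1 cannot be represented and saturates to 32767.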
Floating vs. Fixed point processors
• It is the application that dictates which device and
platform to use in order to achieve optimum
performance at a low cost.
• For educational purposes, use the floating-point
device (C6711) as it can support both fixed and
floating point operations.
Chapter 1, Slide 30
General Purpose DSP vs. DSP in ASIC
• Application Specific Integrated Circuits (ASICs) are
semiconductors designed for dedicated functions.
• The advantages and disadvantages of using ASICs
are listed below:
Advantages
• High throughput
• Lower silicon area
• Lower power consumption
• Improved reliability
• Reduction in system noise
• Low overall system cost
Disadvantages
• High investment cost
• Less flexibility
• Long time from design to market
Chapter 1, Slide 31
General-purpose DSP market in 2003
Chapter 1, Slide 32
System Considerations
Interfacing
Performance
Power
Size
Ease-of-Use
• Programming
• Interfacing
• Debugging
Cost
• Device cost
• System cost
• Development cost
• Time to market
Integration
• Memory
• Peripherals
Chapter 1, Slide 33
Texas Instruments’ TMS320 family
• Different families and sub-families exist to support
different markets.
C2000: Lowest Cost
Control Systems
– Motor Control
– Storage
– Digital Ctrl Systems
C5000: Efficiency
Best MIPS per Watt / Dollar / Size
– Wireless phones
– Internet audio players
– Digital still cameras
– Modems
– Telephony
– VoIP
C6000: Performance & Best Ease-of-Use
Multi Channel and Multi Function App's
– Comm Infrastructure
– Wireless Base-stations
– DSL
– Imaging
– Multi-media Servers
– Video
Chapter 1, Slide 34
Texas Instruments’ TMS320 family
TMS320C64x: The C64x fixed-point DSPs offer the industry's highest level of
performance to address the demands of the digital age. At clock rates of up
to 1 GHz, C64x DSPs can process information at rates up to 8000 MIPS with
costs as low as $19.95. In addition to a high clock rate, C64x DSPs can do
more work each cycle with built-in extensions. These extensions include new
instructions to accelerate performance in key application areas such as
digital communications infrastructure and video and image processing.
TMS320C62x: These first-generation fixed-point DSPs represent
breakthrough technology that enables new equipment and energizes
existing implementations for multi-channel, multi-function applications, such
as wireless base stations, remote access servers (RAS), digital subscriber
loop (xDSL) systems, personalized home security systems, advanced
imaging/biometrics, industrial scanners, precision instrumentation and
multi-channel telephony systems.
TMS320C67x: For designers of high-precision applications, C67x
floating-point DSPs offer the speed, precision, power savings and dynamic range to
meet a wide variety of design needs. These dynamic DSPs are the ideal
solution for demanding applications like audio, medical imaging,
instrumentation and automotive.
Chapter 1, Slide 35
C6000 Roadmap
[Figure: C6000 roadmap, Performance against Time, with Object Code Software Compatibility across generations.]
1st Generation: C6201, C6202, C6203, C6204, C6205, C6211 (fixed point); C6701, C6711, C6712, C6713 (floating point)
2nd Generation: C64x™ DSP, 1.1 GHz: C6411, C6412, C6414, C6415, C6416, DM642
Beyond: Multi-core
C62x/C64x/DM642: Fixed Point
C67x: Floating Point
Chapter 1, Slide 36
’C6000 Floating-Point
[Figure: ’C6000 floating-point roadmap, performance against Time.]
C30, C31, C32, C33: up to 150 MFLOPS
C6712: 600 MFLOPS
C6711: 900 MFLOPS
C6701: 1 GFLOPS
C67x: 3 GFLOPS and beyond
Chapter 1, Slide 37
TI Floating-Point Innovation
TI Floating Point - A History of Firsts:
• ‘C30 (1987): First commercially-successful floating-point DSP
• ‘C40 (1991): First floating-point DSP with multiprocessing support
• ‘C32 (1995): First $10 floating-point DSP
• ‘C6701 (1998): First 1-GFLOPS DSP
• ‘C33 (1999): First $5 floating-point DSP
• ‘C6711 (1999): First 2-level cache floating-point DSP
• ‘C6712 (2000): First to offer 600 MFLOPS for under $10
Chapter 1, Slide 38
Useful Links
• Selection Guide:
– \Links\DSP Selection Guide.pdf
– \Links\DSP Selection Guide.pdf (3Q 2004)
– \Links\DSP Selection Guide.pdf (4Q 2004)
Chapter 1, Slide 39
Looking for Literature on DSP?
• “A Simple Approach to Digital Signal Processing”
by Craig Marven and Gillian Ewers;
ISBN 0-4711-5243-9
• “DSP Primer (Primer Series)”
by C. Britton Rorabaugh;
ISBN 0-0705-4004-7
• “Understanding Digital Signal Processing”
by Richard G. Lyons;
Prentice Hall; 2nd edition (March 15, 2004)
ISBN 0-1310-8989-7
• “DSP First: A Multimedia Approach”
by James H. McClellan, Ronald W. Schafer, and
Mark A. Yoder;
ISBN 0-1324-3171-8
Chapter 1, Slide 40
Looking for Books on ‘C6000 DSP?
• “Digital Signal Processing Implementation
using the TMS320C6000™ DSP Platform”
by Naim Dahnoun; ISBN 0201-61916-4
• “C6x-Based Digital Signal Processing”
by Nasser Kehtarnavaz and Burc Simsek;
ISBN 0-13-088310-7
• “Real-Time Digital Signal Processing: Based on
the TMS320C6000” by Nasser Kehtarnavaz;
Newnes; Book & CD-Rom (July 14, 2004)
ISBN 0-7506-7830-5
• “Digital Signal Processing and Applications with the
C6713 and C6416 DSK (Topics in Digital Signal
Processing)” by Rulph Chassaing;
Wiley-Interscience; Book & CD-Rom (December 3, 2004)
ISBN 0-4716-9007-4
Chapter 1, Slide 41
Looking for Books on ‘C6000 DSP?
• “Real-Time Digital Signal Processing from Matlab
to C with the TMS320C6x DSK” by Thad B. Welch,
Cameron Wright and Michael Morrow; Book & CD-Rom
(2006) ISBN 0-8493-7382-4
Chapter 1, Slide 42
Chapter 1
Introduction
- End -
Chapter 1, Slide 43
