MP3 Overview
John Ehrhardt
Elena Silenok
CSE228 – Spring 03
Where did MP3 come from?
In 1987 the Fraunhofer IIS (Institut für
Integrierte Schaltungen) started work on
audio encoding
In 1991, with Prof. Dieter Seitzer from the
University of Erlangen, they devised a very
powerful algorithm that was standardized as
ISO-MPEG Audio Layer-3
Why “MP3” and what is MP3?
The .mp3 file extension was introduced
for MPEG-1 Layer III encoder and decoder
software on Windows
Files encoded with the MPEG-2 lower
sampling rate extension of Layer III are also
known as mp3s
MPEG Audio
In MPEG audio, one may achieve a typical data
reduction of:
– 1:4 by Layer 1 (corresponds to 384 kbps for a stereo signal)
– 1:6...1:8 by Layer 2 (corresponds to 256..192 kbps for a stereo signal)
– 1:10...1:12 by Layer 3 (corresponds to 128..112 kbps for a stereo signal)
while maintaining CD quality sound.
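The ratios above can be checked with simple arithmetic. The sketch below assumes the reference is an uncompressed CD stereo signal (44.1 kHz, 16 bits, 2 channels); the names `CD_KBPS` and `compression_ratio` are illustrative only.

```python
# Uncompressed CD audio: 44,100 samples/s x 16 bits x 2 channels.
CD_KBPS = 44100 * 16 * 2 / 1000  # about 1411.2 kbps


def compression_ratio(target_kbps: float) -> float:
    """Return how many times smaller the stream is than raw CD audio."""
    return CD_KBPS / target_kbps


# Bitrates quoted on the slide for a stereo signal.
for layer, kbps in [("Layer 1", 384), ("Layer 2", 192), ("Layer 3", 128)]:
    print(f"{layer}: {kbps} kbps is roughly 1:{compression_ratio(kbps):.1f}")
```

The computed values land close to the rounded 1:4, 1:8, and 1:12 figures quoted above.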
MP3 Encoding
Much more than just reducing the sampling rate and the
resolution of the samples
Consists of 3 major stages:
– Hybrid Filter Bank Analysis
– Perceptual Modeling
– Quantization & Coding
Hybrid Filter Bank Analysis
Polyphase filter bank
– Divides the audio signal into 32 equal-width frequency subbands
– Subbands only roughly approximate how human hearing resolves
sound frequencies
Modified Discrete Cosine Transform (MDCT)
– Applied to the output of each of the 32 subbands, splitting each
subband into 18 finer frequency lines (576 lines in total).
– Increases the frequency resolution to 18 times that of
layer 2
Hybrid was chosen for compatibility with layers 1 & 2 that
do not use the MDCT.
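To make the resolution gain concrete, here is a minimal sketch of an MDCT and the resulting bandwidth arithmetic. It assumes a 44.1 kHz sampling rate and the 18-line long-block size; the analysis window that a real Layer 3 encoder applies before the transform is omitted, so this is not a conforming implementation.

```python
import math


def mdct(block):
    """Plain MDCT: 2N input samples -> N coefficients (no window applied)."""
    n_out = len(block) // 2
    return [
        sum(
            x * math.cos(math.pi / n_out * (n + 0.5 + n_out / 2) * (k + 0.5))
            for n, x in enumerate(block)
        )
        for k in range(n_out)
    ]


SUBBANDS = 32            # polyphase filter bank outputs
LINES_PER_SUBBAND = 18   # MDCT lines per subband (long blocks)
NYQUIST_HZ = 44100 / 2   # assumed sampling rate of 44.1 kHz

subband_width_hz = NYQUIST_HZ / SUBBANDS                 # filter bank alone
line_width_hz = subband_width_hz / LINES_PER_SUBBAND     # after the MDCT

# 36 overlapping samples from one subband give 18 frequency lines.
coeffs = mdct([math.sin(0.3 * n) for n in range(2 * LINES_PER_SUBBAND)])
print(len(coeffs), subband_width_hz, line_width_hz)
```

Under these assumptions each subband is about 689 Hz wide, and each MDCT line covers about 38 Hz.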
Perceptual Modeling
Provides a masking threshold that allows the
quantization and coding step to know if the results
are perceptually indistinguishable from the
original signal.
– A strong tonal signal in
one subband will mask
weak noise in close
frequencies
Most important aspect when determining the
quality of an encoder.
Quantization & Coding
Quantization
– Power-law: larger values have less accuracy
– Huffman coding
Coding
– Attempts to quantize the MDCT coefficients from the
filter bank at a level that meets both the bitrate and the
masking requirements
– Huffman Coding and Quantization level provide
feedback for bitrate
– Scale factors for each subband are adjusted until they
meet the masking threshold.
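The power-law quantizer and the bitrate feedback can be sketched as below. This is a simplified illustration, not the encoder from the standard: the 3/4 exponent reflects the power law, plain rounding is used, `estimated_bits` is a hypothetical stand-in for the real Huffman bit count, and the scale-factor loop that checks the masking threshold is left out.

```python
def quantize(x: float, step: float) -> int:
    """Power-law quantizer: compress the magnitude with exponent 3/4."""
    mag = round((abs(x) / step) ** 0.75)
    return mag if x >= 0 else -mag


def dequantize(q: int, step: float) -> float:
    """Inverse power law: expand the magnitude with exponent 4/3."""
    mag = abs(q) ** (4 / 3) * step
    return mag if q >= 0 else -mag


def estimated_bits(quantized) -> int:
    """Hypothetical proxy for the Huffman-coded size (not the real tables)."""
    return sum(abs(q).bit_length() + 1 for q in quantized)


def rate_loop(values, bit_budget: int):
    """Coarsen the step size until the estimated bit count fits the budget."""
    if bit_budget < len(values):
        raise ValueError("budget too small even for all-zero output")
    step = 1.0
    while True:
        quantized = [quantize(v, step) for v in values]
        if estimated_bits(quantized) <= bit_budget:
            return step, quantized
        step *= 2 ** 0.25  # coarser quantization, fewer bits


step, quantized = rate_loop([100.0, -50.0, 3.0, 0.5], bit_budget=12)
print(step, quantized)
```

Because reconstruction levels follow a 4/3 power, the gap between adjacent levels widens as values grow, which is what "larger values have less accuracy" refers to.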
Bitstream Layout
Audio is divided into blocks of 1152 samples
One block is encoded within one MPEG-1 audio
frame (header and data)
Header (First 4 bytes of a frame)
– No file header
– Contains: Frame Sync, MPEG Layer, Sampling
Frequency, Number of Channels, CRC, etc.
– Variable bit rate MP3s switch bitrate between frames
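The 4-byte header can be decoded with a few bit shifts. The following is a minimal sketch for MPEG-1 Layer III only; the function name and returned dictionary keys are illustrative, and free-format or reserved bitrate and sampling-rate codes are simply rejected.

```python
# Lookup tables for MPEG-1 Layer III (None marks free-format or reserved codes).
BITRATES_KBPS = [None, 32, 40, 48, 56, 64, 80, 96, 112,
                 128, 160, 192, 224, 256, 320, None]
SAMPLE_RATES_HZ = [44100, 48000, 32000, None]
CHANNEL_MODES = ["stereo", "joint stereo", "dual channel", "mono"]


def parse_mpeg1_layer3_header(header: bytes) -> dict:
    """Decode the first 4 bytes of an MPEG-1 Layer III frame."""
    if len(header) < 4:
        raise ValueError("need at least 4 bytes")
    bits = int.from_bytes(header[:4], "big")
    if (bits >> 21) != 0x7FF:                      # 11 sync bits, all ones
        raise ValueError("no frame sync")
    version = (bits >> 19) & 0b11                  # 0b11 means MPEG-1
    layer = (bits >> 17) & 0b11                    # 0b01 means Layer III
    if version != 0b11 or layer != 0b01:
        raise ValueError("not an MPEG-1 Layer III frame")
    bitrate = BITRATES_KBPS[(bits >> 12) & 0xF]
    sample_rate = SAMPLE_RATES_HZ[(bits >> 10) & 0b11]
    if bitrate is None or sample_rate is None:
        raise ValueError("unsupported bitrate or sampling rate code")
    padding = (bits >> 9) & 1
    return {
        "crc_present": ((bits >> 16) & 1) == 0,    # protection bit 0 = CRC follows
        "bitrate_kbps": bitrate,
        "sample_rate_hz": sample_rate,
        "channel_mode": CHANNEL_MODES[(bits >> 6) & 0b11],
        # 1152 samples per frame gives 144 * bitrate / sample_rate bytes.
        "frame_bytes": 144 * bitrate * 1000 // sample_rate + padding,
    }


info = parse_mpeg1_layer3_header(bytes([0xFF, 0xFB, 0x90, 0x00]))
print(info)
```

Since every frame carries its own bitrate field, a decoder can follow a variable bit rate stream by re-reading the header at each frame boundary.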
Audio Tag: ID3v1
Contains information about the artist, title,
published year, genre, etc.
The last 128 bytes of the MPEG audio file.
ID3v2 much more complicated
– See www.id3.org for more details
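Reading an ID3v1 tag amounts to slicing the last 128 bytes of the file. The sketch below assumes the usual fixed-width ID3v1 layout (a "TAG" marker, then title, artist, album, year, comment, and a one-byte genre code) and Latin-1 text; the function name `parse_id3v1` is illustrative.

```python
def parse_id3v1(data: bytes):
    """Return the ID3v1 fields from the end of an MP3 file, or None if absent."""
    if len(data) < 128:
        return None
    tag = data[-128:]
    if tag[:3] != b"TAG":
        return None

    def text(raw: bytes) -> str:
        # Fields are fixed width, padded with NUL bytes or spaces.
        return raw.split(b"\x00", 1)[0].decode("latin-1").strip()

    return {
        "title": text(tag[3:33]),
        "artist": text(tag[33:63]),
        "album": text(tag[63:93]),
        "year": text(tag[93:97]),
        "comment": text(tag[97:127]),
        "genre": tag[127],          # numeric index into the genre list
    }


# Build a fake file: some audio bytes followed by a 128-byte tag.
example_tag = (b"TAG" + b"Song".ljust(30, b"\x00") + b"Artist".ljust(30, b"\x00")
               + b"Album".ljust(30, b"\x00") + b"2003"
               + b"".ljust(30, b"\x00") + bytes([17]))
fields = parse_id3v1(b"\xff\xfb\x90\x00" * 10 + example_tag)
print(fields)
```

For a real file, pass the bytes read from disk; a result of `None` means no ID3v1 tag is present.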
Watermarking
Set of secondary digital data embedded in
the primary digital media
Provides ownership protection, copy
control, fingerprinting, authentication, and
control over information
Robust vs. fragile, invisible vs. visible,
public vs. private (detection w/ or w/o the
original unmarked image)
Current State of MP3
3 million MP3 tracks downloaded every
day (International Federation of
Phonographic Industries), mostly pirated
Forrester Research puts MP3 and other online
music sales at 7% by 2003, or $1.1bn a year
MP3 alternative launched in December
1998: the Secure Digital Music Initiative
Secure Digital Music Initiative
Started in 1998, currently over 200
members
Spearheaded by RIAA, IFPI, RIAJ and
major recording companies
SDMI intended to secure music in all
forms, across all delivery channels
Two phases, eventually incorporating dual
watermarking or another protection scheme
References
http://www.stanford.edu/~udara/SOCO/lossy/mp3
http://www.iis.fraunhofer.de/amm/techinf/layer3/index.html
http://www.tnt.uni-hannover.de/project/mpeg/audio/faq/mpeg1.html
http://www.dv.co.yu/mpgscript/mpeghdr.htm
“MP3 And AAC Explained,” Karlheinz Brandenburg, Fraunhofer Institute for Integrated Circuits FhG-IIS A, Erlangen, Germany
http://www.sdmi.org
Thank you for your attention!