MP3 Overview
John Ehrhardt
Elena Silenok
CSE228 – Spring 03
Where did MP3 come from?
In 1987 the Fraunhofer IIS (Institut für
Integrierte Schaltungen) started work on
audio encoding
 In 1991, with Prof. Dieter Seitzer from the
University of Erlangen, they devised a very
powerful algorithm that was standardized as
ISO-MPEG Audio Layer-3
Why “MP3” and what is MP3?
Windows introduced the .mp3 file extension
for MPEG-1 Layer III encoder and decoder
 Files encoded with the MPEG-2 lower
sampling rate extension of Layer III are also
known as mp3s
MPEG Audio
In MPEG audio, one may achieve a typical data
reduction of
1:4 by Layer 1 (corresponds to 384 kbps for a stereo signal),
1:6...1:8 by Layer 2 (corresponds to 256..192 kbps
for a stereo signal),
1:10...1:12 by Layer 3 (corresponds to 128..112
kbps for a stereo signal),
While maintaining CD quality sound.
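The ratios above can be checked against the CD bitrate. The sketch below assumes the usual CD parameters (44.1 kHz, 16 bits, 2 channels), which the slide does not state; the function name is illustrative only. The results land close to, but not exactly on, the rounded figures quoted above.

```python
# Uncompressed CD audio: 44,100 samples/s * 16 bits * 2 channels (assumed reference).
CD_BITS_PER_SECOND = 44100 * 16 * 2  # 1,411,200 bps


def reduction_ratio(kbps):
    """Return how many times smaller a stream at `kbps` is than CD audio."""
    return CD_BITS_PER_SECOND / (kbps * 1000)


if __name__ == "__main__":
    for layer, kbps in [("Layer 1", 384), ("Layer 2", 256), ("Layer 2", 192),
                        ("Layer 3", 128), ("Layer 3", 112)]:
        print(f"{layer} at {kbps} kbps: about 1:{reduction_ratio(kbps):.1f}")
```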
Much more than just reducing the sampling rate and the
resolution of the samples
 Consists of 3 major stages:
– Hybrid Filter Bank Analysis
– Perceptual Modeling
– Quantization & Coding
Hybrid Filter Bank Analysis
 Polyphase filter bank
– Divides the audio signal into 32 equal-width frequency subbands
– Correlates subbands according to human perception of sound
 Modified Discrete Cosine Transform
– Since it’s not 64 values, the DCT has been modified to be used for
– Increases the frequency resolution to 18 times that of
Layer 2
 Hybrid was chosen for compatibility with layers 1 & 2 that
do not use the MDCT.
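A minimal sketch of the second stage, assuming the textbook MDCT definition and the Layer III long-block size of 36 input samples per subband (18 output lines). The window function and the overlap between blocks are left out, so this is an illustration of the transform only, not a conforming encoder stage.

```python
import math

SUBBANDS = 32            # outputs of the polyphase filter bank
LINES_PER_SUBBAND = 18   # MDCT outputs per subband (long block, assumed)


def mdct(block):
    """Textbook MDCT: 2*M input samples give M frequency lines (no windowing)."""
    n = len(block)
    m = n // 2
    return [
        sum(block[i] * math.cos(math.pi / m * (i + 0.5 + m / 2) * (k + 0.5))
            for i in range(n))
        for k in range(m)
    ]


if __name__ == "__main__":
    subband_samples = [math.sin(0.3 * i) for i in range(36)]
    lines = mdct(subband_samples)
    print(len(lines), "lines per subband,",
          SUBBANDS * LINES_PER_SUBBAND, "lines per granule")
```

With 18 lines in each of the 32 subbands, one granule carries 576 frequency lines, which is where the factor of 18 in frequency resolution comes from.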
Perceptual Modeling
 Provides a masking threshold that allows the
quantization and coding step to know if the results
are perceptually indistinguishable from the
original signal.
– A strong tonal signal in
one subband will mask
weak noise in nearby subbands
 Most important aspect when determining the
quality of an encoder.
Quantization & Coding
 Quantization
– Power-law: larger values have less accuracy
– Huffman coding
 Coding
– Attempts to quantize the resulting MDCT from the
Filter Bank at level that meets both the bitrate and the
masking requirements
– Huffman Coding and Quantization level provide
feedback for bitrate
– Scale factors for each subband are adjusted until they
meet the masking threshold.
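The power-law idea can be sketched as follows. This is a simplification: it uses the 3/4 exponent of the Layer III quantizer with a plain step size, and leaves out the rounding offset, the global gain, the scale factors, and the Huffman stage. The function names are illustrative.

```python
def quantize(value, step):
    """Compress the magnitude with a 3/4 power law, then round to an integer."""
    magnitude = round((abs(value) / step) ** 0.75)
    return -magnitude if value < 0 else magnitude


def dequantize(level, step):
    """Invert the power law (exponent 4/3) to rebuild an approximate value."""
    magnitude = (abs(level) ** (4.0 / 3.0)) * step
    return -magnitude if level < 0 else magnitude


if __name__ == "__main__":
    for x in (1.0, 10.0, 100.0, 1000.0):
        q = quantize(x, 1.0)
        print(x, "->", q, "->", round(dequantize(q, 1.0), 2))
```

Because the reconstruction levels spread out as the magnitude grows, large values are stored less accurately than small ones, which is the behaviour the slide describes.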
Bitstream Layout
Divided into 1152 samples per block
 One block is encoded within one MPEG-1 audio
frame (header and data.)
 Header (First 4 bytes of a frame)
– No file header
– Contains: Frame Sync, MPEG Layer, Sampling
Frequency, Number of Channels, CRC, etc.
– Variable bit rate MP3s switch bitrate between frames
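A hedged sketch of reading the 4-byte header, assuming the commonly documented MPEG-1 Layer III bit layout and bitrate table. The function name and the returned dictionary are illustrative, and only MPEG-1 Layer III headers are accepted. The frame length formula (144 * bitrate / sampling frequency, plus padding) is the usual one for this layer.

```python
import struct

# MPEG-1 Layer III tables; None marks "free format" or reserved indexes.
BITRATES_KBPS = [None, 32, 40, 48, 56, 64, 80, 96, 112,
                 128, 160, 192, 224, 256, 320, None]
SAMPLE_RATES_HZ = [44100, 48000, 32000, None]
CHANNEL_MODES = ["stereo", "joint stereo", "dual channel", "mono"]


def parse_header(data):
    """Decode the first 4 bytes of an MPEG-1 Layer III frame."""
    (word,) = struct.unpack(">I", data[:4])
    if (word >> 21) & 0x7FF != 0x7FF:
        raise ValueError("frame sync not found")
    if (word >> 19) & 0x3 != 0b11 or (word >> 17) & 0x3 != 0b01:
        raise ValueError("not an MPEG-1 Layer III header")
    bitrate = BITRATES_KBPS[(word >> 12) & 0xF]
    sample_rate = SAMPLE_RATES_HZ[(word >> 10) & 0x3]
    if bitrate is None or sample_rate is None:
        raise ValueError("unsupported bitrate or sampling frequency")
    padding = (word >> 9) & 0x1
    return {
        "has_crc": (word >> 16) & 0x1 == 0,  # protection bit 0 means CRC follows
        "bitrate_kbps": bitrate,
        "sample_rate_hz": sample_rate,
        "channel_mode": CHANNEL_MODES[(word >> 6) & 0x3],
        "frame_bytes": 144 * bitrate * 1000 // sample_rate + padding,
    }


if __name__ == "__main__":
    print(parse_header(bytes([0xFF, 0xFB, 0x90, 0x00])))
```

Since each frame carries its own bitrate field, a variable bit rate stream is simply one where this value changes from one header to the next.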
Audio Tag: ID3v1
Contains information about the artist, title,
published year, genre, etc.
 The last 128 bytes of the MPEG audio file.
 ID3v2 is much more complicated
– See for more details
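A minimal sketch of reading an ID3v1 tag, assuming the widely used fixed layout: "TAG" marker, then 30-byte title, artist, and album fields, a 4-byte year, a 30-byte comment, and a 1-byte genre index. The function name and the Latin-1 decoding are assumptions for illustration.

```python
def parse_id3v1(data):
    """Return the ID3v1 fields from the last 128 bytes, or None if no tag."""
    if len(data) < 128:
        return None
    tag = data[-128:]
    if tag[:3] != b"TAG":
        return None

    def text(raw):
        # Fields are padded with NUL bytes or spaces.
        return raw.split(b"\x00", 1)[0].decode("latin-1").strip()

    return {
        "title": text(tag[3:33]),
        "artist": text(tag[33:63]),
        "album": text(tag[63:93]),
        "year": text(tag[93:97]),
        "comment": text(tag[97:127]),
        "genre": tag[127],  # index into the ID3v1 genre list
    }


if __name__ == "__main__":
    sample = (b"TAG" + b"Song".ljust(30, b"\x00") + b"Artist".ljust(30, b"\x00")
              + b"Album".ljust(30, b"\x00") + b"2003"
              + b"".ljust(30, b"\x00") + bytes([17]))
    print(parse_id3v1(b"\x00" * 1000 + sample))
```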
Digital Watermarking
Set of secondary digital data embedded in
the primary digital media
 Provides ownership protection, copy
control, fingerprinting, authentication, and
control over information
 Robust vs. fragile, invisible vs. visible,
public vs. private (detection w/ or w/o the
original unmarked image)
Current State of MP3
3 million MP3 tracks are downloaded every
day (International Federation of the
Phonographic Industry), mostly pirated
 Forrester Research says MP3 and other online
music sales have reached 7% by 2003 ($1.1bn a year)
 MP3 alternative launched in December
1998: the Secure Digital Music Initiative
Secure Digital Music Initiative
Started in 1998, currently over 200
 Spearheaded by RIAA, IFPI, RIAJ and
major recording companies
 SDMI intended to secure music in all
forms, across all delivery channels
 2 phases, to finally incorporate dual
watermarking or other protection scheme
“MP3 And AAC Explained,” Karlheinz Brandenburg,
Fraunhofer Institute for Integrated Circuits FhG-IIS
A, Erlangen, Germany
Thank you for your attention!