Introduction to the Special Issue on MPEG-7


Presenter: 張富茂
Advisor: 尤信程
Outline
Overview of MPEG-7 Standard
Overview of MPEG-7 Audio
MPEG-7 Sound-Recognition Tools
Overview of MPEG-7 Standard(I)
MPEG-7 focuses on the description of multimedia content.
MPEG-7 is intended to be an interoperable interface that defines the syntax and semantics of various description tools.
Many groups and organizations have begun active work on defining interoperable frameworks and representations for metadata description.
Overview of MPEG-7 Standard(II)
Descriptors (Ds) define the syntax and semantics of features of audio-visual content.
Description Schemes (DSs) allow construction of complex descriptions by specifying the structure and semantics of the relationships among the constituent Ds or DSs.
The Description Definition Language (DDL) allows flexible definition of MPEG-7 DSs and Ds based on XML Schema.
Overview of MPEG-7 Standard(II)
[Figure: Scope of MPEG-7. Description production -> standard description (the normative part of the MPEG-7 standard) -> description consumption.]
Overview of MPEG-7 Standard(III)
1) ISO/IEC 15938-1: MPEG-7 Systems
2) ISO/IEC 15938-2: MPEG-7 DDL
3) ISO/IEC 15938-3: MPEG-7 Visual
4) ISO/IEC 15938-4: MPEG-7 Audio
5) ISO/IEC 15938-5: MPEG-7 Multimedia DSs
6) ISO/IEC 15938-6: MPEG-7 Reference Software
7) ISO/IEC 15938-7: MPEG-7 Conformance
Overview of MPEG-7 Audio(I)
First, audio content is not inherently textual, which makes textual search impossible.
Second, consider how one typically listens to audio content.
MPEG-7 is a standard for content-based media description.
It is independent of the coding format of the media.
It is independent of the physical location of the media.
Overview of MPEG-7 Audio(II)
MPEG-7 standardizes a representation of metadata associated with media content.
The MPEG-7 audio standard is composed of Descriptors and Description Schemes.
The DDL allows new DSs to be written for specific applications.
Overview of MPEG-7 Audio(III)
Query by Humming
Query for Spoken Content
Assisted Consumer-Level Audio Editing
Extraction and Query Paradigm
Overview of MPEG-7 Audio(V)
The MPEG-7 audio standard comprises six main technologies that can be divided roughly into two classes:
•MPEG-7 Audio Description Framework
•Silence Segment
•Sound Effects Description Tools
•Musical Instrument Timbre Description Tools
•Spoken Content Description Tools
•Melody Contour Description Scheme
•Other Parts of the Standard
MPEG-7 Sound-Recognition Tools(I)
The tools are designed for searching media by automatically indexing a soundtrack.
Sound-recognition tools provide a unified interface for automatic indexing of audio using trained sound classes in a pattern-recognition framework.
Description is divided into two types:
•Text-based description by category labels
•Quantitative description using probabilistic models
MPEG-7 Sound-Recognition Tools(II)
Sound-Recognition Descriptors and Description Schemes
•Qualitative Descriptors
•Quantitative Descriptors
•Probability Model Description Schemes
•Sound-Recognition Model Description Schemes
[Figure: A simple taxonomy of sound categories. "Dogs" has narrower terms (NT) "Bark" and "Howl"; "Bark" is used for (UF) "Woof".]
<SoundCategory term="1" scheme="DOGS">
  <Label>Dogs</Label>
  <TermRelation term="1.1" scheme="DOGS">
    <Label>Bark</Label>
    <TermRelation term="1.2" scheme="DOGS" type="UF">
      <Label>Woof</Label>
    </TermRelation>
  </TermRelation>
  <TermRelation term="1.3" scheme="DOGS">
    <Label>Howl</Label>
  </TermRelation>
</SoundCategory>
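As a rough illustration of how an application might consume such a category description, the sketch below parses the "DOGS" category with Python's standard xml.etree.ElementTree and lists each term with its relation. The helper name walk_terms is hypothetical, and treating a TermRelation without a type attribute as a narrower term (NT) is an assumption made for this example, not something the slide states.

```python
import xml.etree.ElementTree as ET

# The "DOGS" category from the slide, written with straight quotes and balanced tags.
DOGS_XML = """
<SoundCategory term="1" scheme="DOGS">
  <Label>Dogs</Label>
  <TermRelation term="1.1" scheme="DOGS">
    <Label>Bark</Label>
    <TermRelation term="1.2" scheme="DOGS" type="UF">
      <Label>Woof</Label>
    </TermRelation>
  </TermRelation>
  <TermRelation term="1.3" scheme="DOGS">
    <Label>Howl</Label>
  </TermRelation>
</SoundCategory>
"""


def walk_terms(node, depth=0):
    """Yield (depth, term, label, relation) for a category and its nested terms."""
    label = node.findtext("Label", default="").strip()
    # Assumption: a TermRelation with no explicit type is a narrower term (NT).
    relation = node.get("type", "NT") if node.tag == "TermRelation" else "root"
    yield depth, node.get("term"), label, relation
    for child in node.findall("TermRelation"):
        yield from walk_terms(child, depth + 1)


terms = list(walk_terms(ET.fromstring(DOGS_XML)))
for depth, term, label, relation in terms:
    print("  " * depth + f"{term} {label} ({relation})")
```

The indentation of the printed lines mirrors the nesting of the TermRelation elements, so "Woof" appears under "Bark" with the UF relation.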
[Figure: Combining categories into a larger taxonomy. 0 "Pets" has narrower terms (NT) 1 "Dogs" and 2 "Cats". "Dogs" has NT 1.1 "Bark" and 1.3 "Howl", and 1.1 "Bark" is used for (UF) 1.2 "Woof". "Cats" has NT 2.1 "Meow" and 2.2 "Purr".]
<ClassificationScheme term="0" scheme="PETS">
  <Label>Pets</Label>
  <ClassificationSchemeRef scheme="DOGS"/>
  <ClassificationSchemeRef scheme="CATS"/>
</ClassificationScheme>
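One way to picture how the scheme references combine smaller taxonomies is the sketch below, which resolves each ClassificationSchemeRef against an in-memory registry. The SCHEMES dictionary and the resolve function are hypothetical stand-ins introduced for illustration; a real system would presumably load the referenced schemes from their own description files.

```python
import xml.etree.ElementTree as ET

PETS_XML = """
<ClassificationScheme term="0" scheme="PETS">
  <Label>Pets</Label>
  <ClassificationSchemeRef scheme="DOGS"/>
  <ClassificationSchemeRef scheme="CATS"/>
</ClassificationScheme>
"""

# Hypothetical registry standing in for the referenced "DOGS" and "CATS" schemes.
SCHEMES = {
    "DOGS": {"label": "Dogs", "terms": ["Bark", "Woof", "Howl"]},
    "CATS": {"label": "Cats", "terms": ["Meow", "Purr"]},
}


def resolve(xml_text, registry):
    """Replace each scheme reference with the scheme it names."""
    root = ET.fromstring(xml_text)
    refs = [ref.get("scheme") for ref in root.findall("ClassificationSchemeRef")]
    missing = [name for name in refs if name not in registry]
    if missing:
        raise KeyError(f"unresolved scheme references: {missing}")
    return {
        "label": root.findtext("Label").strip(),
        "children": {name: registry[name] for name in refs},
    }


taxonomy = resolve(PETS_XML, SCHEMES)
all_terms = [t for child in taxonomy["children"].values() for t in child["terms"]]
print(taxonomy["label"], all_terms)
```

Keeping the references by name means a category such as "DOGS" can be reused in several larger taxonomies without copying its terms.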
MPEG-7 Sound-Recognition Tools(III)
Building A Sound-Recognition Classifier
•HMM Model Training
•Audio Feature Extraction
Fig 8. Extraction of low-level audio features for sound-recognition classification
[Figure: block diagram. Blocks shown: Audio, Window, Spectrum Envelope, Power Envelope (÷), Basis Extraction (SVD/ICA), Stored Basis Functions, Basis Projection (X).]
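The feature-extraction blocks of Fig 8 can be read as a short numerical pipeline. The following is a minimal sketch of one plausible reading, assuming NumPy is available: window the audio, take a dB-scale spectrum, normalize each frame by its L2 norm (the power envelope), extract a basis by SVD, and project onto it. The frame length, hop size, number of basis functions, and function names are all assumptions made for this example, and the ICA option is omitted.

```python
import numpy as np


def spectrum_envelope(audio, frame_len=256, hop=128):
    """Windowed power spectra in dB, one row per frame."""
    window = np.hanning(frame_len)
    n_frames = 1 + (len(audio) - frame_len) // hop
    frames = np.stack([audio[i * hop: i * hop + frame_len] * window
                       for i in range(n_frames)])
    magnitude = np.abs(np.fft.rfft(frames, axis=1))
    return 10.0 * np.log10(magnitude ** 2 + 1e-12)


def extract_features(audio, n_basis=4):
    """Return the power envelope, basis functions, and basis projection."""
    spec = spectrum_envelope(audio)
    # Power envelope: L2 norm of each frame, used to normalize the spectrum.
    power = np.linalg.norm(spec, axis=1, keepdims=True)
    normalized = spec / np.maximum(power, 1e-12)
    # Basis extraction by SVD; rows of vt are orthonormal spectral basis vectors.
    _, _, vt = np.linalg.svd(normalized, full_matrices=False)
    basis = vt[:n_basis].T            # shape: (n_bins, n_basis)
    projection = normalized @ basis   # shape: (n_frames, n_basis)
    return power, basis, projection


# Synthetic test signal: a 440 Hz tone plus noise, one second at 8 kHz.
rng = np.random.default_rng(0)
t = np.arange(8000) / 8000.0
audio = np.sin(2 * np.pi * 440 * t) + 0.1 * rng.standard_normal(t.size)
power, basis, projection = extract_features(audio)
print(power.shape, basis.shape, projection.shape)
```

The projection reduces each frame from 129 spectral bins to a handful of coefficients, which is the kind of compact feature a classifier can be trained on.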
Fig 9. Extraction of hidden Markov model and basis functions and storage in a DDL representation
[Figure: block diagram. Blocks shown: Audio WAV Files, Feature Extract, Basis Extract, HMM, HMM and Basis.]
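Once an HMM has been trained and stored for each sound class, a new sound can be labeled by the model that assigns it the highest likelihood. The sketch below illustrates that idea with the forward algorithm in the log domain. It is a simplification under stated assumptions: the models use discrete emissions rather than continuous features, the parameters in MODELS are hand-set hypothetical values rather than trained ones, and training (for example by Baum-Welch) is omitted.

```python
import math


def log_sum_exp(values):
    """Numerically stable log of a sum of exponentials."""
    m = max(values)
    return m + math.log(sum(math.exp(v - m) for v in values))


def log_likelihood(obs, start, trans, emit):
    """Forward algorithm for a discrete-emission HMM.

    Assumes every probability is strictly positive so that log() is defined.
    """
    n = len(start)
    alpha = [math.log(start[s]) + math.log(emit[s][obs[0]]) for s in range(n)]
    for o in obs[1:]:
        alpha = [
            log_sum_exp([alpha[p] + math.log(trans[p][s]) for p in range(n)])
            + math.log(emit[s][o])
            for s in range(n)
        ]
    return log_sum_exp(alpha)


# Hypothetical two-state models over two symbols (0 = low band, 1 = high band).
MODELS = {
    "Bark": ([0.9, 0.1], [[0.7, 0.3], [0.3, 0.7]], [[0.9, 0.1], [0.6, 0.4]]),
    "Meow": ([0.5, 0.5], [[0.5, 0.5], [0.5, 0.5]], [[0.2, 0.8], [0.1, 0.9]]),
}


def classify(obs):
    """Return the class whose model gives the observations the highest likelihood."""
    return max(MODELS, key=lambda name: log_likelihood(obs, *MODELS[name]))


print(classify([0, 0, 1, 0, 0]))
print(classify([1, 1, 1, 0, 1]))
```

Working in the log domain avoids numerical underflow on long observation sequences, which matters when a soundtrack yields many frames.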
Conclusion
MPEG-7 provides:
•A multimedia content description framework for interoperable applications
•Description Definition Language (DDL): XML Schema (flexibility) + BiM
•Multimedia Description Schemes (MDSs): a library of description tools that covers a wide range of generic needs
References
http://www.cmlab.csie.ntu.edu.tw/mpeg4workshop/MPEG7%20Introduction.files/frame.htm
http://mpeg.telecomitalialab.com/standards/mpeg-7/mpeg7.htm