Discussion on Video Analysis and
Extraction, MPEG-4 and MPEG-7
Encoding and Decoding in Java, Java
3D, or OpenGL
Presented by:
Emmanuel Velasco
City College of New York
Video Analysis and Extraction
• As more videos are created, digitized, and archived, content-based search and retrieval becomes necessary. This involves analyzing a video and extracting its contents.
• The video is cut into frames. The frames are analyzed, and objects can be extracted using image processing techniques.
Video Analysis and Extraction
Temporal Video Segmentation
• Cut detection: the change in content is visible and occurs instantaneously between two consecutive frames.
• Gradual transition detection: the content changes gradually, so multiple frames must be analyzed. Gradual transitions include fade in, fade out, wipe, and dissolve.
Video Analysis and Extraction
Examples:
• Cut transition
• Gradual transition
Video Analysis and Extraction
• A cut transition is easier to detect: we compute the difference between two consecutive frames and check whether it exceeds a certain threshold. If it does, a cut is declared (a sketch of this test follows this slide).
• Gradual transitions are harder to detect. Several methods exist, including the twin-comparison algorithm, which relies on the observation that the first and last frames of the transition are different while any two consecutive frames between them are similar.
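A minimal sketch of this threshold test in Java, assuming the video has already been split into still image frames on disk. The frame file names, the mean-absolute-difference metric, and the threshold value are illustrative choices, not taken from the slides:

import javax.imageio.ImageIO;
import java.awt.image.BufferedImage;
import java.io.File;

public class CutDetector {

    // Mean absolute difference of gray levels between two equally sized frames.
    static double frameDifference(BufferedImage a, BufferedImage b) {
        long total = 0;
        for (int y = 0; y < a.getHeight(); y++) {
            for (int x = 0; x < a.getWidth(); x++) {
                total += Math.abs(gray(a.getRGB(x, y)) - gray(b.getRGB(x, y)));
            }
        }
        return (double) total / (a.getWidth() * a.getHeight());
    }

    static int gray(int rgb) {
        return (((rgb >> 16) & 0xFF) + ((rgb >> 8) & 0xFF) + (rgb & 0xFF)) / 3;
    }

    public static void main(String[] args) throws Exception {
        double threshold = 30.0;   // illustrative value; would be tuned per video
        BufferedImage prev = ImageIO.read(new File("frame000.png"));
        for (int i = 1; i < 100; i++) {   // assumes frames frame000.png ... frame099.png
            BufferedImage curr = ImageIO.read(new File(String.format("frame%03d.png", i)));
            if (frameDifference(prev, curr) > threshold) {
                System.out.println("Cut detected between frames " + (i - 1) + " and " + i);
            }
            prev = curr;
        }
    }
}

The twin-comparison algorithm extends this idea with two thresholds: a lower one marks a frame as the possible start of a gradual transition, and the difference accumulated from that frame must then exceed the higher (cut) threshold for the transition to be confirmed.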
Video Analysis and Extraction
Twin-Comparison Algorithm Results
Video Analysis and Extraction
Scene and Object Detection
• We want to identify objects in a video. One method is essentially the opposite of transition detection: instead of looking for frame differences above a threshold, we look for image regions whose differences stay below a certain threshold (a sketch follows this slide).
• Another method is to take two images and try all possible transformations between their edges.
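A minimal sketch of the region-based idea in Java: the frame is divided into blocks, and blocks whose difference from the previous frame stays below a threshold are reported as stable regions that may belong to an object. The block size, threshold, and file names are illustrative assumptions, not details from the slides:

import javax.imageio.ImageIO;
import java.awt.image.BufferedImage;
import java.io.File;

public class StableRegionFinder {

    // Mean absolute gray-level difference over one block of the two frames.
    static double blockDifference(BufferedImage a, BufferedImage b,
                                  int x0, int y0, int size) {
        long total = 0;
        for (int y = y0; y < y0 + size; y++) {
            for (int x = x0; x < x0 + size; x++) {
                total += Math.abs(gray(a.getRGB(x, y)) - gray(b.getRGB(x, y)));
            }
        }
        return (double) total / (size * size);
    }

    static int gray(int rgb) {
        return (((rgb >> 16) & 0xFF) + ((rgb >> 8) & 0xFF) + (rgb & 0xFF)) / 3;
    }

    public static void main(String[] args) throws Exception {
        BufferedImage prev = ImageIO.read(new File("frame010.png"));
        BufferedImage curr = ImageIO.read(new File("frame011.png"));
        int block = 16;           // illustrative block size
        double threshold = 5.0;   // illustrative "region is stable" threshold
        for (int y = 0; y + block <= curr.getHeight(); y += block) {
            for (int x = 0; x + block <= curr.getWidth(); x += block) {
                if (blockDifference(prev, curr, x, y, block) < threshold) {
                    System.out.println("Stable region at (" + x + ", " + y + ")");
                }
            }
        }
    }
}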
Video Analysis and Extraction
Text Extraction
• We want to retrieve the captions in a video. While most text segmentation is done on high-resolution media, video is low resolution.
• One method is to assume that the gray levels of the text are lighter or darker than the background. Using a minimum difference from the background, the text can be extracted (a sketch follows this slide).
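A minimal sketch of this gray-level method in Java: the background gray level is estimated as the frame mean, and pixels whose gray level differs from it by at least a minimum amount are kept as candidate text pixels. Using the mean as the background estimate and the particular minimum difference are illustrative assumptions:

import javax.imageio.ImageIO;
import java.awt.image.BufferedImage;
import java.io.File;

public class CaptionExtractor {

    public static void main(String[] args) throws Exception {
        BufferedImage frame = ImageIO.read(new File("frame042.png"));
        int w = frame.getWidth(), h = frame.getHeight();

        // Estimate the background gray level as the mean over the whole frame.
        long sum = 0;
        for (int y = 0; y < h; y++)
            for (int x = 0; x < w; x++)
                sum += gray(frame.getRGB(x, y));
        int background = (int) (sum / ((long) w * h));

        // Keep pixels whose gray level differs from the background by at least minDiff.
        int minDiff = 60;   // illustrative minimum difference
        BufferedImage mask = new BufferedImage(w, h, BufferedImage.TYPE_BYTE_BINARY);
        for (int y = 0; y < h; y++) {
            for (int x = 0; x < w; x++) {
                boolean isText = Math.abs(gray(frame.getRGB(x, y)) - background) >= minDiff;
                mask.setRGB(x, y, isText ? 0xFFFFFF : 0x000000);
            }
        }
        ImageIO.write(mask, "png", new File("caption_mask.png"));
    }

    static int gray(int rgb) {
        return (((rgb >> 16) & 0xFF) + ((rgb >> 8) & 0xFF) + (rgb & 0xFF)) / 3;
    }
}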
Video Analysis and Extraction
Example of Text Extraction
Video Analysis and Extraction
So we see that video analysis and extraction is
useful in our projects.
The Classroom Project:
Object detection is used for finding the location of
the professor.
Text extraction is useful for capturing text in the
PowerPoint slides shown in a video.
Video Analysis and Extraction
The NYC Traffic Project:
Object detection is used for detecting how heavy
or light the traffic is.
Transition detection is used to see if we are
looking at the same view, or if the view has
changed.
MPEG-4
• MPEG-4 is an ISO/IEC compression standard created by the Moving Picture Experts Group (MPEG).
• It has been successfully used in:
• digital television
• interactive graphics applications
• interactive multimedia
MPEG-4
• MPEG-4 can bring multimedia to new networks, such as mobile networks.
• Media objects are audio, video, or audiovisual content, and can be natural (recorded with a camera and/or microphone) or synthetic (generated by a computer).
MPEG-4
• An example of an MPEG-4 scene.
MPEG-4
• Media objects are independent of their background. This allows an object to be easily extracted and edited.
• The objects are synchronized in time and space.
MPEG-4
• With a set of media objects, MPEG-4 allows us to (see the Java 3D sketch after this list):
• place objects anywhere in a given coordinate system.
• apply transforms to change a visual object geometrically or change an audio object acoustically.
• group objects together (such as the visual image of a person and their voice).
• apply streamed data to media objects to modify their attributes.
• change the user’s viewpoint or listening point anywhere in the scene.
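Because the presentation targets Java, Java 3D, or OpenGL, here is a minimal Java 3D sketch of the same composition ideas: grouping nodes, applying a geometric transform to place an object in a coordinate system, and setting the user’s viewpoint. The ColorCube stands in for a media object, and the translation values are illustrative:

import javax.media.j3d.BranchGroup;
import javax.media.j3d.Transform3D;
import javax.media.j3d.TransformGroup;
import javax.vecmath.Vector3d;
import com.sun.j3d.utils.geometry.ColorCube;
import com.sun.j3d.utils.universe.SimpleUniverse;

public class SceneCompositionSketch {
    public static void main(String[] args) {
        // Group node: everything under this branch is treated as one unit.
        BranchGroup scene = new BranchGroup();

        // Geometric transform: place the object at (0.3, 0.0, -1.0) in the scene coordinate system.
        Transform3D translate = new Transform3D();
        translate.setTranslation(new Vector3d(0.3, 0.0, -1.0));
        TransformGroup placed = new TransformGroup(translate);

        // A simple visual object standing in for a media object (e.g., a video plane).
        placed.addChild(new ColorCube(0.2));
        scene.addChild(placed);
        scene.compile();

        // The universe provides the default viewpoint; moving the view platform
        // corresponds to changing the user's viewpoint in the scene.
        SimpleUniverse universe = new SimpleUniverse();
        universe.getViewingPlatform().setNominalViewingTransform();
        universe.addBranchGraph(scene);
    }
}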
Encoder / Decoder Definitions
• Encoder: to format (electronic) data according to a standard format.
• Decoder: to recognize and interpret (an electronic signal).
MPEG-4 Encoder / Decoder
While many MPEG-4 encoders and decoders exist
as standalone applications, we want to be able to
encode and decode using Java, Java 3D, or
OpenGL.
MPEG-4 Encoder / Decoder
• The IBM Toolkit for MPEG-4 is a set of Java classes and APIs with five applications:
• AVgen: a simple, easy-to-use GUI tool for creating audio/video-only content for ISMA- or 3GPP-compliant devices
• XMTBatch: a tool for creating rich MPEG-4 content beyond simple audio and video
• M4Play: an MPEG-4 client playback application
• M4Applet for ISMA: a Java player applet for ISMA-compliant content
• M4Applet for HTTP: a Java applet for MPEG-4 content played back over HTTP
MPEG-4 Encoder / Decoder
[Screenshot: the IBM MPEG-4 XMT Editor tool, showing how to add a media object, edit object attributes, and adjust the time frame.]
MPEG-4 Encoder / Decoder
• IBM MPEG-4 Demos:
http://www.research.ibm.com/mpeg4/Demos/DemoSystems.htm
• SKLMP4 Encoder / Decoder is a C++ library capable of encoding and decoding MPEG-4:
http://skal.planet-d.net/coding/mpeg4codec.html
MPEG-4
MPEG-4 can make it easier for us to extract objects, since the objects are independent of one another.
The Classroom Project:
The professor is an image object, separated from
the PowerPoint background.
MPEG-4
The NYC Traffic Project:
The background (roads) is separate from the objects (cars).
The interactivity that MPEG-4 allows can improve the user interface: users can point and click on the map and view the cameras at that location.
MPEG-7
• As audiovisual data grows and comes from many different sources, searching for a certain type of media content becomes more difficult. We therefore need a way to search the data quickly and efficiently. The solution is MPEG-7.
• MPEG-7 is a standard for describing media content. Unlike MPEG-1, MPEG-2, and MPEG-4, it is not a standard for the actual coding of moving pictures and audio.
MPEG-7
• MPEG-7 uses XML Schema as the language of choice for content description (a sketch follows this slide).
• These descriptions may include information describing the creation of the content (title, author), the storage features of the content (storage format, encoding), and low-level features of the content (color, texture, shape, motion, audio).
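A minimal sketch in Java of building such a description with the standard DOM API. The element names and namespace follow the general shape of an MPEG-7 description but are simplified and illustrative rather than a validated instance of the MPEG-7 schema; the title, creator, and color values are made up:

import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

public class Mpeg7DescriptionSketch {
    public static void main(String[] args) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder().newDocument();

        // Root of an MPEG-7 style description (element names simplified).
        Element mpeg7 = doc.createElementNS("urn:mpeg:mpeg7:schema:2001", "Mpeg7");
        doc.appendChild(mpeg7);

        Element description = doc.createElement("Description");
        mpeg7.appendChild(description);

        // Creation information: title and author of a lecture video.
        Element creation = doc.createElement("CreationInformation");
        Element title = doc.createElement("Title");
        title.setTextContent("Classroom Lecture, Week 3");
        Element creator = doc.createElement("Creator");
        creator.setTextContent("Emmanuel Velasco");
        creation.appendChild(title);
        creation.appendChild(creator);
        description.appendChild(creation);

        // A low-level visual feature: a dominant color descriptor (values illustrative).
        Element color = doc.createElement("DominantColor");
        color.setTextContent("128 128 96");
        description.appendChild(color);

        // Print the description as XML.
        TransformerFactory.newInstance().newTransformer()
                .transform(new DOMSource(doc), new StreamResult(System.out));
    }
}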
So what will MPEG-7 standardize?
• A set of descriptors (D): a descriptor defines the syntax and semantics of each feature (metadata element).
• A set of description schemes (DS): a description scheme specifies the structure and the semantics of the relationships between its components.
So what will MPEG-7 standardize?
• A Description Definition Language (DDL): defines the syntax of the descriptors and description schemes.
Some possible MPEG-7 Applications
• Audio: play a few notes on a keyboard and get back musical pieces with similar tunes.
• Graphics: sketch a few lines on a screen and get a set of images containing similar graphics or logos.
• Images: define objects, color patterns, or textures, and retrieve images that match the description.
MPEG-7 Encoder / Decoder
• MPEG-7 Library is a set of C++ classes implementing the MPEG-7 standard.
http://iis.joanneum.at/mpeg-7/overview.htm
• Java MPEG-7 Audio Encoder is a Java library that describes audio content with some of the descriptors of the MPEG-7 standard.
http://www.ient.rwth-aachen.de/team/crysandt/software/mpeg7audioenc/
MPEG-7
Once we have a lot of media content, MPEG-7 allows us to search through it more easily.
The Classroom Project:
If we have many videos, audio recordings, or both, we can find the content we need quickly.
The NYC Traffic Project:
If there are many cameras at several locations,
finding a specific location can be easier.