MPEG-4 Natural Video Coding

MPEG-4: Multimedia Coding Standard
Supporting Mobile Multimedia System
-MPEG-4 Natural Video Coding
April, 2001
MPEG-4 Natural Video Coding
MPEG-4 Video Coding Basics
Video Coding Details
Binary Shape Coder
Motion Coder
Video Texture Coder
Scalable Video Coding
MPEG-4 Video Coding Basics
Semantic segmentation
Class hierarchy
Prediction structure
Logical structure
Detail structure
Semantic segmentation of a picture into VOPs
Video Object (VO)
Instances of a VO: Video Object Planes (VOPs)
A VOP is described by its texture variations and shape representation
Class hierarchy for structuring coded video Data
Video Session
Video Object
Video Object Layer
Group of Video Object Planes
Video Object Plane
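The hierarchy above can be pictured as nested containers. The following is only a minimal sketch: the class names mirror the slide, but every field (`object_id`, `layer_id`, `time_index`, `vop_type`) and the helper `count_vops` are hypothetical names chosen for illustration, not syntax elements of the standard.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class VideoObjectPlane:
    time_index: int   # hypothetical field: display time of this VOP
    vop_type: str     # "I", "P", or "B"


@dataclass
class GroupOfVOPs:
    vops: List[VideoObjectPlane] = field(default_factory=list)


@dataclass
class VideoObjectLayer:
    layer_id: int     # hypothetical field: 0 = base layer, 1+ = enhancement
    groups: List[GroupOfVOPs] = field(default_factory=list)


@dataclass
class VideoObject:
    object_id: int    # hypothetical field
    layers: List[VideoObjectLayer] = field(default_factory=list)


@dataclass
class VideoSession:
    objects: List[VideoObject] = field(default_factory=list)


def count_vops(session: VideoSession) -> int:
    """Walk the hierarchy from session down to individual VOPs."""
    return sum(
        len(group.vops)
        for obj in session.objects
        for layer in obj.layers
        for group in layer.groups
    )
```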
A prediction structure using I-, P-, and B-VOPs
I - Intra Picture (I-VOP)
P - Predictive Picture (P-VOP)
B - Bidirectionally Predictive Picture (B-VOP)
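Because a B-VOP is predicted from both a past and a future reference, the future reference has to be coded before the B-VOPs that depend on it. The sketch below illustrates that reordering under the assumption that B-VOPs reference only the nearest surrounding I- or P-VOPs; the function name `coding_order` is hypothetical.

```python
def coding_order(display_types):
    """Return display indices in the order they would be coded.

    display_types is a sequence of "I", "P", "B" in display order.
    Each I- or P-VOP is emitted first, followed by the B-VOPs that
    precede it in display order and were waiting for it.
    """
    order = []
    pending_b = []
    for index, vop_type in enumerate(display_types):
        if vop_type == "B":
            pending_b.append(index)   # needs a future reference first
        else:
            order.append(index)       # reference VOP (I or P)
            order.extend(pending_b)   # B-VOPs can now be coded
            pending_b = []
    order.extend(pending_b)           # trailing B-VOPs with no later reference
    return order
```

For the display sequence I B B P B B P, the two B-VOPs between each pair of references are coded after the later reference.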
Logical structure of the VO-based MPEG-4 Video codec
Detailed structure of the video object encoder
Binary Shape Coder
Generating binary alpha planes
Completely transparent
Completely opaque
Partially transparent
Coding binary alpha planes
Lossy or lossless, depending on a threshold
Select the maximum subsampling factor for each 16×16 binary alpha block that results in just-acceptable distortion
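The two steps on this slide, classifying a binary alpha block and choosing the largest subsampling factor whose distortion stays acceptable, can be sketched as follows. This is a simplified illustration, not the normative procedure: the majority-vote downsampling, pixel-replication upsampling, the candidate factors (4, 2, 1), and the `max_errors` threshold expressed as a count of mismatched pixels are all assumptions.

```python
def classify_alpha_block(block):
    """Classify a binary alpha block (0 = outside object, 1 = inside)."""
    ones = sum(sum(row) for row in block)
    size = sum(len(row) for row in block)
    if ones == 0:
        return "transparent"
    if ones == size:
        return "opaque"
    return "boundary"


def downsample(block, f):
    """Shrink an NxN block by factor f with a majority vote per f-by-f cell."""
    n = len(block)
    return [
        [
            1 if 2 * sum(block[r + i][c + j] for i in range(f) for j in range(f)) >= f * f else 0
            for c in range(0, n, f)
        ]
        for r in range(0, n, f)
    ]


def upsample(small, f):
    """Expand a block by factor f using pixel replication."""
    n = len(small) * f
    return [[small[r // f][c // f] for c in range(n)] for r in range(n)]


def mismatches(a, b):
    """Count pixels that differ between two equally sized blocks."""
    return sum(x != y for row_a, row_b in zip(a, b) for x, y in zip(row_a, row_b))


def choose_subsampling(block, max_errors, factors=(4, 2)):
    """Return the largest factor whose round-trip error is within max_errors.

    Falls back to 1 (no subsampling, i.e. lossless) if none qualifies.
    """
    for f in factors:
        if mismatches(block, upsample(downsample(block, f), f)) <= max_errors:
            return f
    return 1
```

With `max_errors=0` the choice only subsamples when the round trip is exact, which corresponds to lossless coding; a larger threshold trades shape accuracy for fewer coded pixels.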
Binary Shape Coder (Cont’d)
Assigning mode to binary alpha planes
1. Zero differential motion vector and no inter shape update
2. Nonzero differential motion vector and no inter shape update
3. Transparent
4. Opaque
5. Intra shape
6. Zero differential motion vector and inter shape update
7. Nonzero differential motion vector and inter shape update
Depending on the coding mode and whether it is an I-, P-, or B-VOP, a variable-length codeword is assigned identifying the coding type of the binary alpha block
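The seven modes can be read as a small enumeration that tells the decoder what else to expect for the block. The sketch below uses the numbering from the slide; the enum and helper names are hypothetical, and the I-VOP restriction is an assumption that follows from I-VOPs not referencing any other VOP.

```python
from enum import IntEnum


class BabMode(IntEnum):
    """Coding modes of a binary alpha block, numbered as on the slide."""
    MVD_ZERO_NO_UPDATE = 1
    MVD_NONZERO_NO_UPDATE = 2
    TRANSPARENT = 3
    OPAQUE = 4
    INTRA = 5
    MVD_ZERO_INTER_UPDATE = 6
    MVD_NONZERO_INTER_UPDATE = 7


def sends_mvd(mode):
    """True if a nonzero differential motion vector accompanies the block."""
    return mode in (BabMode.MVD_NONZERO_NO_UPDATE, BabMode.MVD_NONZERO_INTER_UPDATE)


def sends_shape_data(mode):
    """True if coded shape pixels (intra or inter update) accompany the block."""
    return mode in (
        BabMode.INTRA,
        BabMode.MVD_ZERO_INTER_UPDATE,
        BabMode.MVD_NONZERO_INTER_UPDATE,
    )


def allowed_in_intra_vop(mode):
    """Assumption: without a reference VOP, only non-predicted modes apply."""
    return mode in (BabMode.TRANSPARENT, BabMode.OPAQUE, BabMode.INTRA)
```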
Motion Coder
Motion Coder consists of:
-Motion Estimator
For P-VOPs, compute motion vectors using the current VOP and the temporally previous reconstructed VOP from the previous reconstructed VOP store
For B-VOPs, compute motion vectors using the current VOP and the temporally previous reconstructed VOP from the previous reconstructed VOP store, as well as the current VOP and the temporally next VOP from the next reconstructed VOP store
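A motion estimator of this kind is commonly built on block matching. The sketch below is one possible illustration, not the method prescribed by the slides: it assumes an exhaustive (full) search, the sum of absolute differences (SAD) as the matching cost, and frames stored as lists of rows; the names `sad` and `full_search` are hypothetical.

```python
def sad(cur, ref, cx, cy, rx, ry, n):
    """Sum of absolute differences between two n-by-n blocks."""
    return sum(
        abs(cur[cy + i][cx + j] - ref[ry + i][rx + j])
        for i in range(n)
        for j in range(n)
    )


def full_search(cur, ref, cx, cy, n=16, search=4):
    """Find the displacement (dx, dy) into ref that best matches the
    n-by-n block of cur whose top-left corner is (cx, cy).

    Candidates that fall outside the reference frame are skipped.
    Returns ((dx, dy), cost).
    """
    height, width = len(ref), len(ref[0])
    best_mv, best_cost = (0, 0), None
    for dy in range(-search, search + 1):
        for dx in range(-search, search + 1):
            rx, ry = cx + dx, cy + dy
            if rx < 0 or ry < 0 or rx + n > width or ry + n > height:
                continue
            cost = sad(cur, ref, cx, cy, rx, ry, n)
            if best_cost is None or cost < best_cost:
                best_mv, best_cost = (dx, dy), cost
    return best_mv, best_cost
```

For a P-VOP the search would run once against the previous reconstructed VOP; for a B-VOP it would run twice, once against the previous and once against the next reconstructed VOP.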
Motion Coder (Cont’d)
-Motion Compensator
Use the motion vectors to compute the motion-compensated prediction signal from the temporally previous reconstructed version of the same VOP
-Previous/Next VOPs Store
-Motion Vector (MV) Predictor and Coder
Generate a prediction for the MV to be coded
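The compensator and the MV predictor can be sketched together. The slides do not say how the MV prediction is formed; the code below assumes a component-wise median of three neighbouring vectors, a common choice in block-based codecs, and codes only the difference from that prediction. Integer-pel vectors and all function names are assumptions made for illustration.

```python
def compensate(ref, x, y, mv, n):
    """Fetch the n-by-n prediction block for position (x, y) from the
    reference frame, displaced by the integer motion vector mv = (dx, dy)."""
    dx, dy = mv
    return [[ref[y + dy + i][x + dx + j] for j in range(n)] for i in range(n)]


def median3(a, b, c):
    """Median of three numbers."""
    return sorted((a, b, c))[1]


def predict_mv(left, above, above_right):
    """Assumed predictor: component-wise median of three neighbour MVs."""
    return (
        median3(left[0], above[0], above_right[0]),
        median3(left[1], above[1], above_right[1]),
    )


def mv_difference(mv, prediction):
    """Differential motion vector that would actually be coded."""
    return (mv[0] - prediction[0], mv[1] - prediction[1])
```

When neighbouring blocks move alike, the differential vector is close to zero and costs few bits to code.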
Video Texture Coder
The Texture Coder codes the luminance and chrominance variations of blocks forming macroblocks within a VOP.
The blocks that lie inside the VOP are coded using DCT coding
The blocks that lie on the VOP boundary are first padded and then coded using DCT coding
The remaining blocks are not coded at all
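The three cases above can be sketched for a single block. This is a simplified illustration: the padding shown here fills pixels outside the object with the mean of the inside pixels, which is an assumption standing in for whatever padding rule the encoder actually uses, and the transform is a plain orthonormal 2-D DCT-II written for clarity rather than speed. Function names are hypothetical.

```python
import math


def dct_2d(block):
    """Orthonormal 2-D DCT-II of a square block (naive implementation)."""
    n = len(block)

    def scale(k):
        return math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)

    return [
        [
            scale(u) * scale(v) * sum(
                block[y][x]
                * math.cos((2 * y + 1) * u * math.pi / (2 * n))
                * math.cos((2 * x + 1) * v * math.pi / (2 * n))
                for y in range(n)
                for x in range(n)
            )
            for v in range(n)
        ]
        for u in range(n)
    ]


def code_texture_block(texture, alpha):
    """Return DCT coefficients for a block, or None if it is not coded.

    texture: n-by-n sample values; alpha: n-by-n binary mask (1 = inside VOP).
    """
    n = len(texture)
    inside = [texture[y][x] for y in range(n) for x in range(n) if alpha[y][x]]
    if not inside:
        return None                      # block lies outside the VOP
    if len(inside) < n * n:              # boundary block: pad first
        mean = sum(inside) / len(inside)
        texture = [
            [texture[y][x] if alpha[y][x] else mean for x in range(n)]
            for y in range(n)
        ]
    return dct_2d(texture)
```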
Scalable Video Coding
Scalability of video is the property that allows a video decoder to decode portions of the coded bitstreams to generate decoded video of quality commensurate with the amount of data decoded.
Temporal Scalability
Spatial Scalability
Temporal Scalability
The base layer is shown to have one-half of the total temporal resolution to be coded
The base layer is coded independently as in normal video coding
The enhancement layer uses B-VOPs that use both an immediate temporally previous decoded base layer VOP and an immediate temporally following decoded base layer VOP for prediction
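The layer assignment described above can be sketched as a split of frame indices. The sketch assumes the base layer takes every second frame, matching the one-half temporal resolution on the slide; the function name and the tuple layout are hypothetical.

```python
def temporal_layers(num_frames):
    """Split frame indices into a base layer and an enhancement layer.

    Returns (base, enhancement), where base is a list of frame indices and
    enhancement is a list of (frame, previous_base_ref, next_base_ref).
    next_base_ref is None when no later base layer frame exists.
    """
    base = list(range(0, num_frames, 2))
    enhancement = []
    for t in range(1, num_frames, 2):
        next_ref = t + 1 if t + 1 < num_frames else None
        enhancement.append((t, t - 1, next_ref))
    return base, enhancement
```

A decoder that receives only the base layer plays frames 0, 2, 4, and so on at half the frame rate; adding the enhancement layer fills in the odd frames.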
Spatial Scalability
The base layer is shown to have one-quarter resolution of the enhancement layer
The base layer is coded independently as in normal video coding
The enhancement layer mainly uses B-VOPs that use both an immediate previous decoded enhancement layer VOP and a coincident decoded base layer VOP for prediction
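The relationship between the two layers can be sketched with a frame stored as a list of rows. The sketch assumes that one-quarter resolution means half the width and half the height, uses 2×2 averaging and pixel replication as stand-ins for the real resampling filters, and shows only the prediction from the coincident base layer VOP; the temporal prediction from the previous enhancement layer VOP is omitted. All function names are hypothetical, and frame dimensions are assumed even.

```python
def downsample2(frame):
    """Halve width and height by averaging each 2x2 group of samples."""
    return [
        [
            (frame[y][x] + frame[y][x + 1] + frame[y + 1][x] + frame[y + 1][x + 1]) // 4
            for x in range(0, len(frame[0]), 2)
        ]
        for y in range(0, len(frame), 2)
    ]


def upsample2(small):
    """Double width and height by pixel replication."""
    return [
        [small[y // 2][x // 2] for x in range(2 * len(small[0]))]
        for y in range(2 * len(small))
    ]


def enhancement_residual(frame, base):
    """Difference between the full frame and its base layer prediction."""
    prediction = upsample2(base)
    return [
        [f - p for f, p in zip(frame_row, pred_row)]
        for frame_row, pred_row in zip(frame, prediction)
    ]


def reconstruct(base, residual):
    """Rebuild the full-resolution frame from base layer plus residual."""
    prediction = upsample2(base)
    return [
        [p + r for p, r in zip(pred_row, res_row)]
        for pred_row, res_row in zip(prediction, residual)
    ]
```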