Overview of the H.264/AVC Video Coding Standard

T. Wiegand, G.J. Sullivan, G. Bjøntegaard and A. Luthra,
IEEE Transactions on Circuits and Systems for Video
Technology, Vol. 13, no. 7, Jul. 2003.
Presented by Peter
H.264/AVC
Latest video coding standard
Basic design architecture similar to MPEG-x or H.26x
Better compression efficiency
  Up to 50% bit-rate savings
  Better subjective quality
Advanced functional elements
History of H.264/AVC
Initiated by the Video Coding Experts Group (VCEG) in early 1998
Previously named H.26L
Target: double the coding efficiency of existing standards (half the bit rate at the same fidelity)
First draft adopted in Oct. 1999
In Dec. 2001, VCEG and the Moving Picture Experts Group (MPEG) formed a Joint Video Team (JVT)
Approved by the ITU-T as H.264 and by ISO/IEC as International Standard 14496-10 (MPEG-4 Part 10) Advanced Video Coding (AVC) in Mar. 2003
Timeline of Video Development
Design Features Highlights
Features for enhancement of prediction
  Directional spatial prediction for intra coding
  Variable block-size motion compensation with small block size
  Quarter-sample-accurate motion compensation
  Motion vectors over picture boundaries
  Multiple reference picture motion compensation
  Decoupling of referencing order from display order
  Decoupling of picture representation methods from picture referencing capability
  Weighted prediction
  Improved “skipped” and “direct” motion inference
  In-the-loop deblocking filtering
Design Features Highlights
Features for improved coding efficiency
  Small block-size transform
  Exact-match inverse transform
  Short word-length transform
  Hierarchical block transform
  Arithmetic entropy coding
  Context-adaptive entropy coding
Design Features Highlights
Features for robustness to data errors/losses
  Parameter set structure
  NAL unit syntax structure
  Flexible slice size
  Flexible macroblock ordering (FMO)
  Arbitrary slice ordering (ASO)
  Redundant pictures
  Data partitioning
  SP/SI synchronization/switching pictures
Directional spatial prediction for intra coding
Intra prediction predicts the texture of the current block from previously decoded pixel samples of neighboring blocks
Intra prediction for 4×4 and 16×16 blocks is supported in H.264
Figs. from [2]
Directional spatial prediction for intra coding - 4×4 example
Mode 7 is selected
Figs. from [2]
Directional spatial prediction for intra coding - 16×16 example
Mode 3 is selected
Figs. from [2]
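
The directional intra prediction described above can be illustrated with three of the 4×4 luma modes (vertical, horizontal and DC). This is only a minimal sketch: the function names, the SAD-based mode choice and the sample values are assumptions made for illustration, and the remaining directional modes and the handling of unavailable neighbors are omitted.

```python
def predict_vertical(top):
    # Mode 0: every row copies the 4 samples directly above the block.
    return [list(top) for _ in range(4)]

def predict_horizontal(left):
    # Mode 1: every column copies the 4 samples directly left of the block.
    return [[left[r]] * 4 for r in range(4)]

def predict_dc(top, left):
    # Mode 2: all 16 samples take the rounded mean of the 8 neighbors.
    dc = (sum(top) + sum(left) + 4) >> 3
    return [[dc] * 4 for _ in range(4)]

def sad(block, pred):
    # Sum of absolute differences, used here as a simple selection cost.
    return sum(abs(b - p) for rb, rp in zip(block, pred) for b, p in zip(rb, rp))

def best_mode(block, top, left):
    # Try each candidate predictor and keep the one with the lowest cost.
    candidates = {
        "vertical": predict_vertical(top),
        "horizontal": predict_horizontal(left),
        "dc": predict_dc(top, left),
    }
    return min(candidates, key=lambda m: sad(block, candidates[m]))

top = [10, 20, 30, 40]             # reconstructed samples above the block
left = [10, 10, 10, 10]            # reconstructed samples left of the block
block = [[10, 20, 30, 40]] * 4     # current block: vertical stripes
print(best_mode(block, top, left))
```

A real encoder would typically weigh the residual cost against the bits needed to signal the mode; plain SAD is used here only to keep the sketch short.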
Variable block-size motion compensation with small block size
Macroblocks are partitioned in 2 stages
In the 1st stage, one of the first 4 modes is chosen
  16×16, 16×8, 8×16, 8×8
If mode 4 (8×8) is chosen, each 8×8 block may be further partitioned into smaller blocks
  8×4, 4×8, 4×4
At most 16 motion vectors may be transmitted for a 16×16 macroblock
Determining the modes is computationally expensive
Fig. from [3]
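
The two-stage partitioning can be made concrete by counting motion vectors. The sketch below assumes one motion vector per partition (as in P slices); the names `MB_PARTITIONS`, `SUB_PARTITIONS` and `count_motion_vectors` are illustrative, not taken from the standard.

```python
MB_PARTITIONS = [(16, 16), (16, 8), (8, 16), (8, 8)]   # first stage
SUB_PARTITIONS = [(8, 8), (8, 4), (4, 8), (4, 4)]      # second stage, per 8x8 block

def count_motion_vectors(mb_mode, sub_modes=None):
    """Number of motion vectors for one 16x16 macroblock."""
    w, h = mb_mode
    if mb_mode != (8, 8):
        return (16 // w) * (16 // h)
    # 8x8 mode: each of the four 8x8 blocks picks its own sub-partition.
    return sum((8 // sw) * (8 // sh) for sw, sh in sub_modes)

print(count_motion_vectors((16, 8)))                 # two 16x8 partitions
print(count_motion_vectors((8, 8), [(4, 4)] * 4))    # worst case: sixteen 4x4 blocks

# Number of distinct partition layouts an exhaustive search would face:
# three single-stage modes plus every combination of four sub-partition choices.
print(3 + len(SUB_PARTITIONS) ** 4)
```

The last figure gives a feel for why mode decision is costly: each layout also needs its own motion search.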
Multiple reference picture motion compensation - P slices
More than one previously coded picture can be used as reference for MC prediction
A reference index parameter is transmitted for each 16×16, 16×8, 8×16 or 8×8 MC block
Smaller blocks within an 8×8 block share 1 reference index
P macroblocks can also be coded in P-Skip type
Fig. from [1]
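
How an encoder picks the reference index is not specified by the standard. A minimal sketch of one plausible approach follows, assuming a zero motion vector and a pure SAD cost (both simplifications); `block_sad` and `choose_reference` are hypothetical helper names.

```python
def block_sad(cur, ref):
    # Sum of absolute differences between two equally sized blocks.
    return sum(abs(c - r) for row_c, row_r in zip(cur, ref)
               for c, r in zip(row_c, row_r))

def choose_reference(cur_block, ref_blocks):
    """Return (ref_idx, cost) for the candidate block with the lowest SAD.

    ref_blocks[i] is the co-located block taken from reference picture i.
    """
    costs = [block_sad(cur_block, ref) for ref in ref_blocks]
    ref_idx = min(range(len(costs)), key=costs.__getitem__)
    return ref_idx, costs[ref_idx]

cur = [[5, 5], [5, 5]]
refs = [
    [[9, 9], [9, 9]],   # reference picture 0
    [[5, 6], [5, 5]],   # reference picture 1 (closest match)
    [[0, 0], [0, 0]],   # reference picture 2
]
print(choose_reference(cur, refs))
```

The chosen index is what would be signaled as the reference index parameter for that partition.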
Multiple reference picture motion compensation - B slices
Utilize two distinct lists of reference pictures (list 0 and list 1)
Four different types of inter-picture prediction
  List 0, list 1, bi-predictive, and direct prediction
Bi-predictive
  Weighted average of the MC list 0 and list 1 prediction signals
Direct prediction
  Inferred from previously transmitted syntax elements
  Either list 0 or list 1 prediction, or bi-predictive
Similar macroblock partitioning as in P slices is utilized
B_Skip mode is supported
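
The bi-predictive case can be sketched as a rounded average of the two motion-compensated blocks. This assumes the default equal weighting (explicit weighted prediction is not modeled), and `bi_predict` is an illustrative name.

```python
def bi_predict(pred_l0, pred_l1):
    """Combine list 0 and list 1 predictions with equal weights.

    Each sample is the rounded average (a + b + 1) >> 1 of the two
    motion-compensated prediction samples.
    """
    return [[(a + b + 1) >> 1 for a, b in zip(row0, row1)]
            for row0, row1 in zip(pred_l0, pred_l1)]

pred_l0 = [[100, 101], [50, 60]]   # block fetched via the list 0 reference
pred_l1 = [[102, 104], [50, 61]]   # block fetched via the list 1 reference
print(bi_predict(pred_l0, pred_l1))
```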
Small block-size transform
The transform is applied to 4×4 blocks
Close to a 4×4 DCT
Inverse-transform mismatches are avoided because the transform is defined with exact integer operations
The transform matrix is given as
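
Assuming the matrix referred to above is the 4×4 integer core transform matrix described for H.264/AVC in [1] and [4], the sketch below writes it out as `C` and applies the forward core transform Y = C X Cᵀ. The helper names are illustrative. It also shows that the rows of `C` are orthogonal but not normalized, which is why a separate scaling step is needed.

```python
C = [
    [1,  1,  1,  1],
    [2,  1, -1, -2],
    [1, -1, -1,  1],
    [1, -2,  2, -1],
]

def matmul(a, b):
    # Plain integer matrix product.
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def transpose(m):
    return [list(row) for row in zip(*m)]

def forward_core_transform(x):
    # Y = C X C^T, using integer additions and multiplications only.
    return matmul(matmul(C, x), transpose(C))

# Rows are mutually orthogonal: C C^T is diagonal, but not the identity.
print(matmul(C, transpose(C)))

# A flat residual block ends up in the DC coefficient alone.
flat = [[3] * 4 for _ in range(4)]
print(forward_core_transform(flat))
```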
Short word-length transform
Post-scaling matrix in the forward transform
Pre-scaling matrix in the inverse transform
Only integer operations and shifts are needed for transformation and quantization
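
The idea of folding the post-scaling into quantization with a multiply and a shift can be sketched for a single DC coefficient. The values below are assumptions for illustration: a post-scaling factor of 1/4 for coefficient (0, 0), a quantizer step size of 0.625, 15 fractional bits, and a rounding offset of one third, following the kind of description found in [4].

```python
QBITS = 15
PF_DC = 0.25     # assumed post-scaling factor for coefficient (0, 0)
QSTEP = 0.625    # assumed quantizer step size

# Scaling and step size are merged into one integer multiplier.
MF = round(PF_DC / QSTEP * (1 << QBITS))

def quantize(w, f=(1 << QBITS) // 3):
    """Quantize one core-transform coefficient with integer ops only.

    |w| * MF + f followed by a right shift replaces the floating-point
    computation |w| * PF / Qstep plus a rounding offset.
    """
    level = (abs(w) * MF + f) >> QBITS
    return -level if w < 0 else level

print(MF, quantize(48), quantize(-48))
```

Here 48 is the DC value of a flat 4×4 block of 3s after the core transform, so the sketch picks up where an unscaled transform output would leave off.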
Hierarchical block transform
Applies to macroblocks coded in 16×16 Intra mode and to chrominance blocks
DC coefficients are further grouped and transformed
A Hadamard transform is used for the chrominance blocks
Intended for coding of smooth areas
Figs. from [4]
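
The second-level transform of the chrominance DC coefficients can be sketched as a 2×2 Hadamard transform. This assumes 4:2:0 sampling, so each chroma component of a macroblock consists of four 4×4 blocks and contributes four DC coefficients; `hadamard_2x2` is an illustrative name and scaling/quantization is omitted.

```python
def hadamard_2x2(dc):
    """Apply W = H D H with H = [[1, 1], [1, -1]] to a 2x2 array of DC values."""
    (a, b), (c, d) = dc
    return [
        [a + b + c + d, a - b + c - d],
        [a + b - c - d, a - b - c + d],
    ]

# Smooth area: all four blocks have the same DC value,
# so only one second-level coefficient is non-zero.
print(hadamard_2x2([[40, 40], [40, 40]]))

# Less uniform area: energy spreads to the other coefficients.
print(hadamard_2x2([[1, 2], [3, 4]]))
```

The first example shows why the extra transform stage pays off in smooth areas: four similar DC values collapse into a single coefficient.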
Some results – Foreman QCIF @ 10 Hz
Fig. from [1]
Some results – Foreman CIF @ 30 Hz
Fig. from [1]
Profiles
3 profiles: Baseline, Main and Extended
15 levels
Decoder processing rate: up to 250M pixels/s
Bit rate: up to 240 Mbps
Potential Applications
Baseline (low latency)
  H.320 conversational video services
  3GPP conversational H.324/M services
  H.323 with IP/RTP
  3GPP using IP/RTP and SIP
Main (moderate latency)
  Modified H.222.0/MPEG-2
  Broadcast via satellite, cable, terrestrial or DSL
  DVD and VOD
Extended
  3GPP streaming using IP/RTP and RTSP
  Streaming over wired Internet
Any (no requirement on latency)
  3GPP MMS
  Video mail
References
1. T. Wiegand, G.J. Sullivan, G. Bjøntegaard and A. Luthra, “Overview of the H.264/AVC Video Coding Standard,” IEEE Transactions on Circuits and Systems for Video Technology, Vol. 13, no. 7, Jul. 2003.
2. I.E.G. Richardson, “H.264/MPEG4 Part 10: Intra Prediction,” available at http://www.vcodex.com
3. I.E.G. Richardson, “H.264/MPEG4 Part 10: Inter Prediction,” available at http://www.vcodex.com
4. I.E.G. Richardson, “H.264/MPEG4 Part 10: Transform and Quantization,” available at http://www.vcodex.com