Overview of the H.264/AVC Video Coding Standard
T. Wiegand, G.J. Sullivan, G. Bjøntegaard and A. Luthra,
IEEE Transactions on Circuits and Systems for Video
Technology, Vol. 13, no. 7, Jul. 2003.
Presented by Peter
H.264/AVC
Latest Video coding standard
Basic design architecture similar to MPEG-x
or H.26x
Better compression efficiency
Up to 50% in bit rate savings
Subjective quality is better
Advanced functional elements
History of H.264/AVC
Initiated by the Video Coding Experts Group (VCEG)
in early 1998
Previous name H.26L
Target to double the coding efficiency
First draft was adopted in Oct. of 1999
In Dec. of 2001, VCEG and the Moving Picture
Experts Group (MPEG) formed a Joint Video Team
(JVT)
Approved by the ITU-T as H.264 and by ISO/IEC as
International Standard 14496-10 (MPEG-4 part 10)
Advanced Video Coding (AVC) in Mar. 2003
Timeline of Video Development
Design Features Highlights
Features for enhancement of prediction
Directional spatial prediction for intra coding
Variable block-size motion compensation with small block
size
Quarter-sample-accurate motion compensation
Motion vectors over picture boundaries
Multiple reference picture motion compensation
Decoupling of referencing order from display order
Decoupling of picture representation methods from picture
referencing capability
Weighted prediction
Improved “skipped” and “direct” motion inference
In-the-loop deblocking filtering
Design Features Highlights
Features for improved coding efficiency
Small block-size transform
Exact-match inverse transform
Short word-length transform
Hierarchical block transform
Arithmetic entropy coding
Context-adaptive entropy coding
Design Features Highlights
Features for robustness to data errors/losses
Parameter set structure
NAL unit syntax structure
Flexible slice size
Flexible macroblock ordering (FMO)
Arbitrary slice ordering (ASO)
Redundant pictures
Data Partitioning
SP/SI synchronization/switching pictures
Directional spatial prediction for intra
coding
Intra prediction predicts the texture of the current block from
the pixel samples of neighboring blocks
Intra prediction for 4×4 and 16×16 blocks is supported in
H.264
Figs. from [2]
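As a rough illustration of the idea, the sketch below implements three of the 4×4 luma prediction modes (vertical, horizontal and DC) and picks the one with the smallest sum of absolute differences (SAD). The function names, the argument layout and the SAD-based choice are assumptions made for this example; the standard does not prescribe how an encoder selects a mode, and the other directional modes and unavailable neighbors are left out.

```python
def intra4x4_predict(top, left, mode):
    """Predict a 4x4 block from reconstructed neighboring samples.

    top  -- the 4 samples directly above the block
    left -- the 4 samples directly to the left of the block
    mode -- 'vertical', 'horizontal' or 'dc' (hypothetical labels)
    """
    if mode == "vertical":
        # Every row repeats the samples above the block.
        return [list(top) for _ in range(4)]
    if mode == "horizontal":
        # Every row is filled with the sample to its left.
        return [[left[r]] * 4 for r in range(4)]
    if mode == "dc":
        # Rounded mean of the 8 neighboring samples.
        dc = (sum(top) + sum(left) + 4) >> 3
        return [[dc] * 4 for _ in range(4)]
    raise ValueError("unsupported mode: %r" % (mode,))


def sad(block, pred):
    """Sum of absolute differences between a block and its prediction."""
    return sum(abs(b - p) for rb, rp in zip(block, pred) for b, p in zip(rb, rp))


def best_mode(block, top, left, modes=("vertical", "horizontal", "dc")):
    """Return the candidate mode with the smallest SAD (encoder-side choice)."""
    return min(modes, key=lambda m: sad(block, intra4x4_predict(top, left, m)))
```

Only the residual between the block and the chosen prediction would then be transformed and coded.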
Directional spatial prediction for intra
coding - 4×4 example
Mode 7 is selected
Figs. from [2]
Directional spatial prediction for intra
coding – 16×16 example
Mode 3 is selected
Figs. from [2]
Variable block-size motion compensation
with small block size
Partitioned in 2 stages
In the 1st stage, choose among the first 4 modes:
16×16, 16×8, 8×16, 8×8
If mode 4 (8×8) is chosen, every 8×8 block may be
further partitioned into smaller blocks:
8×4, 4×8, 4×4
At most 16 motion vectors may
be transmitted for a 16×16
macroblock
Determining the modes has a large
computational complexity
Fig. from [3]
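The two-stage partitioning and the limit of 16 motion vectors can be checked with a small counting sketch. It assumes a P macroblock with one motion vector per partition; the table and function names are made up for this illustration.

```python
# Number of partitions produced by each first-stage (macroblock) mode.
MB_PARTITIONS = {"16x16": 1, "16x8": 2, "8x16": 2, "8x8": 4}
# Number of partitions produced by each second-stage mode of one 8x8 block.
SUB_PARTITIONS = {"8x8": 1, "8x4": 2, "4x8": 2, "4x4": 4}


def count_motion_vectors(mb_mode, sub_modes=None):
    """Count motion vectors for one 16x16 macroblock (one vector per partition)."""
    if mb_mode not in MB_PARTITIONS:
        raise ValueError("unknown macroblock mode: %r" % (mb_mode,))
    if mb_mode != "8x8":
        return MB_PARTITIONS[mb_mode]
    # Second stage: each of the four 8x8 blocks chooses its own sub-mode.
    if sub_modes is None or len(sub_modes) != 4:
        raise ValueError("8x8 mode needs exactly four sub-modes")
    return sum(SUB_PARTITIONS[s] for s in sub_modes)
```

The worst case is the 8×8 mode with all four blocks split into 4×4, which gives 4 × 4 = 16 vectors.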
Multiple reference picture motion
compensation – P Slices
More than one previously coded
picture can be used as reference
for MC prediction
A reference index parameter is transmitted for each
motion-compensated 16×16, 16×8, 8×16 or 8×8 block
Smaller blocks within an 8×8 region share 1 reference
index
A P macroblock can also be coded as the P_Skip type
Fig. from [1]
Multiple reference picture motion
compensation – B Slices
Utilize two distinct lists of reference pictures (list 0 and list 1)
Four different types of inter-picture prediction
List 0, list 1, bi-predictive, and direct prediction
Bi-predictive
Weighted average of the MC list 0 and list 1 prediction signals
Direct prediction
Inferred from previously transmitted syntax
Either list 0 or list 1 prediction or bi-predictive
Similar macroblock partitioning as in P slices is utilized
B_Skip mode is supported
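A minimal sketch of the bi-predictive case, assuming equal weights for the two motion-compensated predictions (a rounded average). The function name is hypothetical, and explicit weighted prediction with per-reference weights and offsets is not modeled here.

```python
def bipred_average(pred_l0, pred_l1):
    """Combine list 0 and list 1 predictions by a rounded integer average.

    pred_l0, pred_l1 -- equally sized 2-D lists of integer samples
    """
    return [
        [(a + b + 1) >> 1 for a, b in zip(row0, row1)]
        for row0, row1 in zip(pred_l0, pred_l1)
    ]
```

The `+ 1` before the shift rounds halves upward, so the result stays an integer without any floating-point arithmetic.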
Small block-size transform
Transformation is applied on 4×4 blocks
Close to the 4×4 DCT
Inverse-transform mismatches are avoided
The transform matrix is given as
H = [ 1  1  1  1 ]
    [ 2  1 -1 -2 ]
    [ 1 -1 -1  1 ]
    [ 1 -2  2 -1 ]
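The core of the 4×4 transform can be sketched with integer arithmetic only, using the matrix H from [1] and computing Y = H X Hᵀ. The helper names are assumptions for this example, and the scaling that is folded into quantization is not included.

```python
# Core 4x4 forward transform matrix of H.264/AVC, as given in [1].
H = [
    [1,  1,  1,  1],
    [2,  1, -1, -2],
    [1, -1, -1,  1],
    [1, -2,  2, -1],
]


def matmul(a, b):
    """Plain integer matrix product of two 2-D lists."""
    return [
        [sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
        for i in range(len(a))
    ]


def transpose(m):
    """Transpose of a 2-D list."""
    return [list(row) for row in zip(*m)]


def forward_core_transform(block):
    """Return H * block * H^T for a 4x4 block of integer residuals."""
    return matmul(matmul(H, block), transpose(H))
```

For a flat block of ones, only the DC coefficient is non-zero (it equals 16), which is the behavior expected of a DCT-like transform. Because every entry of H is an integer, encoder and decoder can compute exactly the same values.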
Short word-length transform
Post-scaling matrix in forward transform
Pre-scaling matrix in inverse transform
Only integer operations and shifting are needed in transformation and
quantization
Hierarchical block transform
For macroblocks coded in
16×16 Intra mode and for
chrominance blocks
DC coefficients are further
grouped and transformed
A 2×2 Hadamard transform is used for
the DC coefficients of each chrominance block
Intended for coding of smooth
areas
Figs. from [4]
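The second-level transform of the chrominance DC coefficients can be sketched as a 2×2 Hadamard transform, W = H2 · DC · H2 with H2 = [[1, 1], [1, -1]]. The function name is hypothetical and the subsequent scaling and quantization are omitted.

```python
def hadamard2x2(dc):
    """Apply the 2x2 Hadamard transform to the four DC coefficients
    of the 4x4 blocks of one chrominance component.

    dc -- [[a, b], [c, d]], one DC coefficient per 4x4 block
    """
    (a, b), (c, d) = dc
    return [
        [a + b + c + d, a - b + c - d],
        [a + b - c - d, a - b - c + d],
    ]
```

In a smooth area the four DC values are nearly equal, so almost all of the energy ends up in the top-left output coefficient. Applying the function twice returns the input scaled by 4, which shows that the transform is its own inverse up to a constant.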
Some results – Foreman QCIF @ 10 Hz
Fig. from [1]
Some results – Foreman CIF @ 30 Hz
Fig. from [1]
Profiles
3 profiles: Baseline, Main and Extended
15 levels
Processing rate: up to 250M pixels/s
Bit rate: up to 240 Mbit/s
Potential Applications
Baseline (low latency):
H.320 conversational video services
3GPP conversational H.324/M services
H.323 with IP/RTP
3GPP using IP/RTP and SIP
Main (moderate latency):
Modified H.222.0/MPEG-2
Broadcast via satellite, cable, terrestrial or DSL
DVD and VOD
Extended:
3GPP streaming using IP/RTP and RTSP
Streaming over wired Internet
Any (no requirement on latency):
3GPP MMS
Video mail
References
1. T. Wiegand, G.J. Sullivan, G. Bjøntegaard and A. Luthra, “Overview of the H.264/AVC Video Coding Standard,” IEEE Transactions on Circuits and Systems for Video Technology, Vol. 13, no. 7, Jul. 2003.
2. I.E.G. Richardson, “H.264/MPEG4 Part 10: Intra Prediction,” available at http://www.vcodex.com
3. I.E.G. Richardson, “H.264/MPEG4 Part 10: Inter Prediction,” available at http://www.vcodex.com
4. I.E.G. Richardson, “H.264/MPEG4 Part 10: Transform and Quantization,” available at http://www.vcodex.com