PowerPoint Presentation - National Tsing Hua University

Li Liu, Robert Cohen, Huifang Sun, Anthony Vetro, Xinhua Zhuang
BMSB 2010
1
• Introduction
• Existing New Techniques
• Weighted Prediction (WP)
• Localized Weighted Prediction for Video Coding
• Second Order Prediction on H.264/AVC
• Proposed Second-Order Prediction for Inter Coding
• Proposed Reduced Resolution Update for Intra Coding
• Experimental Results
• Conclusion
2
• The increasing popularity of high-definition TV, video delivery on
mobile devices, and other multimedia applications creates new
demands for video coding standards.
• Both MPEG and VCEG launched next-generation video coding
projects, which could potentially result in either an extension of
H.264/AVC or a brand-new standard.
• In January 2010, MPEG and VCEG established a Joint
Collaborative Team on Video Coding (JCT-VC) to develop the
proposed High Efficiency Video Coding (HEVC) standard.
3
• To provide a software platform to gather and evaluate these
new techniques, a Key Technique Area (KTA) platform was
developed based on JM11.
• Intra prediction: BIP, MDDT
• Inter prediction: resolution increased to 1/8-pel, MB size increased to 64x64
• Quantization: RDOQ, AQMS
• Transform: 16x16 transform
• In-loop filter: QALF, BALF
• Internal bit-depth increase: 12 bits of internal bit depth for 8-bit source
4
• An early form of second-order prediction.
• For scenes with temporal brightness variations (illumination changes, fade-in/fade-out effects, camera flashes)
• A multiplicative weighting factor $a$ and an additive weighting offset $b$ are
used to enhance motion compensation:
$I(x, y, t) = a \cdot I(x + mv_x, y + mv_y, t - 1) + b$
• $I(x, y, t)$: brightness intensity of pixel $(x, y)$ at time $t$
• $(mv_x, mv_y)$: motion vector
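The weighted-prediction formula above can be sketched for a single pixel as follows. This is a minimal illustration, not the H.264/AVC implementation: the function names are hypothetical, motion is assumed to be integer-pel, and the result is assumed to be clipped to the 8-bit range.

```python
def clip8(v):
    """Clamp a value to the 8-bit sample range after rounding (assumed here)."""
    return max(0, min(255, int(round(v))))


def weighted_prediction(ref, x, y, mv, a, b):
    """Predict I(x, y, t) as a * I(x + mv_x, y + mv_y, t - 1) + b.

    ref is the previous frame as a list of rows; mv is an integer-pel
    motion vector (mv_x, mv_y). Both conventions are assumptions.
    """
    mvx, mvy = mv
    return clip8(a * ref[y + mvy][x + mvx] + b)


# Previous frame (t - 1), illustrative values only.
ref = [
    [100, 102, 104],
    [110, 112, 114],
    [120, 122, 124],
]

# A fade modeled with a = 0.5 and b = 10, motion vector (1, 1).
pred = weighted_prediction(ref, x=0, y=0, mv=(1, 1), a=0.5, b=10)
```

With `a = 1` and `b = 0` the function reduces to ordinary motion-compensated prediction.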
5
• Lighting conditions may vary not only between frames but also
within a frame; localized weighted prediction (LWP) handles such local lighting variation.
• Assuming the spatial variance of the intensity in a region is small, the
brightness variation is represented using only a weighting offset $b$:
$s(x, y) = r(x + mv_x, y + mv_y) + b, \quad [x, y] \in B_k$
• $s(x, y)$: pixels in the source picture
• $r(x, y)$: pixels in the reference frame
$b = \mathrm{mean}(s_{B_k}) - \mathrm{mean}(r_{B_k}(mv))$
• Assuming the correlation between the neighboring samples and the current
block is high:
$b \approx \mathrm{mean}(s_{E(B_k)}) - \mathrm{mean}(r_{E(B_k)}(mv))$
• $E(B_k)$: reconstructed neighboring samples of $B_k$
[1] Peng Yin, Alexis Michael Tourapis, Jill Boyce, “Localized Weighted Prediction for Video
Coding,” IEEE Circuits and Systems, 2005.
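The neighbor-based estimate of the offset b can be sketched as below. The shape of the neighborhood E(B_k) is an assumption made for illustration (one row above and one column to the left of the block), as are the function names and the integer-pel motion vector; the slides do not specify these details.

```python
def mean(values):
    """Arithmetic mean of a non-empty sequence."""
    values = list(values)
    return sum(values) / len(values)


def neighbor_samples(frame, x0, y0, n):
    """Samples in the row above and the column left of the n x n block
    whose top-left corner is (x0, y0). This neighborhood is an assumption."""
    top = [frame[y0 - 1][x] for x in range(x0, x0 + n)]
    left = [frame[y][x0 - 1] for y in range(y0, y0 + n)]
    return top + left


def lwp_offset(recon_cur, ref, x0, y0, n, mv):
    """Estimate b as mean(s_E) - mean(r_E(mv)) from reconstructed neighbors,
    so that the decoder can derive the same value without side information."""
    mvx, mvy = mv
    cur_nb = neighbor_samples(recon_cur, x0, y0, n)
    ref_nb = neighbor_samples(ref, x0 + mvx, y0 + mvy, n)
    return mean(cur_nb) - mean(ref_nb)


# Illustrative data: the current frame is the reference brightened by 7.
ref = [[10 * row + col for col in range(4)] for row in range(4)]
cur = [[v + 7 for v in row] for row in ref]
b = lwp_offset(cur, ref, x0=1, y0=1, n=2, mv=(0, 0))
```

Because only already-reconstructed samples are used, no offset needs to be transmitted for the block in this sketch.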
6
7
• For images without brightness change, the LWP method can
actually reduce coding efficiency.
• LWP adaptation
• Calculate the weighting factor:
• $a = \mathrm{mean}(s[x, y]) / \mathrm{mean}(r[x, y])$
• Compare the distance between the current picture and its closest
reference picture, without and with weighting:
• $abs\_diff = \mathrm{mean}(\mathrm{abs}(s[x, y] - r[x, y]))$
• $abs\_diff\_wp = \mathrm{mean}(\mathrm{abs}(s[x, y] - \mathrm{clip}(a \cdot r[x, y])))$
• If $\alpha \cdot abs\_diff < abs\_diff\_wp$, with $\alpha = 0.8$, LWP is not used.
• Otherwise, LWP is used to code the current picture.
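The picture-level decision above can be sketched as follows. Function names are hypothetical, clipping is assumed to target the 8-bit range, and the guard for an all-zero reference is an added assumption to avoid division by zero.

```python
ALPHA = 0.8  # threshold given on the slide


def clip8(v):
    """Clamp a value to the 8-bit sample range after rounding (assumed)."""
    return max(0, min(255, int(round(v))))


def use_lwp(cur, ref, alpha=ALPHA):
    """Return True if LWP should be used for the current picture.

    cur and ref are co-located pictures given as lists of rows.
    """
    s = [p for row in cur for p in row]
    r = [p for row in ref for p in row]
    mean_r = sum(r) / len(r)
    if mean_r == 0:
        return False  # assumption: no meaningful weight can be computed
    a = (sum(s) / len(s)) / mean_r
    abs_diff = sum(abs(si - ri) for si, ri in zip(s, r)) / len(s)
    abs_diff_wp = sum(abs(si - clip8(a * ri)) for si, ri in zip(s, r)) / len(s)
    # LWP is skipped when weighting does not reduce the difference enough.
    return not (alpha * abs_diff < abs_diff_wp)


# Global brightening: weighting removes the difference, so LWP is used.
faded = use_lwp([[150, 150], [150, 150]], [[100, 100], [100, 100]])

# Same mean brightness but different texture: weighting does not help.
textured = use_lwp([[120, 100], [160, 140]], [[100, 120], [140, 160]])
```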
8
• Implementation:
• Step 1: Decide whether LWP should be used for the current slice. If not,
perform normal coding as H.264 does; otherwise, go to Step 2.
• Step 2: For each MB, first calculate the mean of the reconstructed
neighboring pixels of the current MB, then perform ME and mode decision
using LWP.
9
• The predicted blocks generated by MCP result in low
coding efficiency when the video contains complex motion
such as shape transformation, rotation, or fading.
• Weighted prediction in H.264/AVC was introduced to deal with
fading sequences with global illumination change between
frames.
• It utilizes only temporal correlation, not spatial correlation.
• It cannot handle motion such as shape transformation and rotation.
• This paper proposes Second Order Prediction (SOP) to
exploit the remaining signal correlation after MCP.
[2] Shangwen Li, Sijia Chen, Jianpeng Wang and Lu Yu, “Second Order Prediction on
H.264/AVC,” Picture Coding Symposium, 2009.
10
※ All-black blocks indicate the MBs applying P-skip mode in the bit-stream
• Slight rotation
• Visible residual
• Residuals exhibit high spatial correlation
11
• Residual Subjective-Textured MBs (RST MBs) : MBs with
relatively large residuals.
[Figure callout: more than twice]
12
• Apply the intra-prediction of H.264/AVC to the residuals of inter-prediction.
• The reconstructed pixel values of an SOP MB are derived as
follows:
• Reconstructed pixel-value
= Motion-compensated prediction (first-predictor)
+ Prediction of first order residuals (second-predictor)
+ Second order residuals (need to be coded)
• It seems straightforward to use the previously reconstructed
first-order residuals of the neighboring blocks as reference for
the current block.
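The three-term reconstruction rule above can be sketched as below, together with the matching encoder-side computation of the second-order residual. Function names and sample values are hypothetical, transform and quantization are omitted, and clipping to the 8-bit range is an assumption.

```python
def clip8(v):
    """Clamp an integer sample to the 8-bit range (assumed)."""
    return max(0, min(255, v))


def second_order_residual(original, mc_pred, residual_pred):
    """Encoder side: what remains after both predictions are subtracted."""
    return [[o - p - rp for o, p, rp in zip(orow, prow, rrow)]
            for orow, prow, rrow in zip(original, mc_pred, residual_pred)]


def reconstruct_sop(mc_pred, residual_pred, second_residual):
    """Decoder side: first predictor + second predictor + second-order residual."""
    return [[clip8(p + rp + sr) for p, rp, sr in zip(prow, rrow, srow)]
            for prow, rrow, srow in zip(mc_pred, residual_pred, second_residual)]


original = [[104, 104], [107, 111]]
mc_pred = [[100, 102], [104, 106]]   # motion-compensated prediction
residual_pred = [[3, 3], [3, 3]]     # prediction of the first-order residual

second = second_order_residual(original, mc_pred, residual_pred)
recon = reconstruct_sop(mc_pred, residual_pred, second)
```

In this lossless sketch the reconstruction equals the original; in a real codec the second-order residual would be transformed and quantized first.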
13
• The discontinuity caused by motion between blocks prevents
efficient use of the remaining correlation in the first-order residuals.
14
• Reference generation :
• Get the reconstructed pixel values $R(x + d_x, y + d_y)$ in the current frame, and
the reconstructed pixel values $R_1(x + d_x + mv_x, y + d_y + mv_y)$ in the
temporal reference frame
• $d_x$ is an integer within $[-1, 2(n-1)]$ when $d_y = -1$
• $d_y$ is an integer within $[-1, n-1]$ when $d_x = -1$
• $(mv_x, mv_y)$ is the motion vector of the current block.
• Get the reference first-order residual RFR:
$RFR(x + d_x, y + d_y) = R(x + d_x, y + d_y) - R_1(x + d_x + mv_x, y + d_y + mv_y)$
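The reference generation above can be sketched as follows. The sketch assumes that n is the size of the second prediction block, that (x, y) is its top-left corner, and that the motion vector is integer-pel; the function names are hypothetical. The offset ranges are taken as written on the slide.

```python
def reference_positions(n):
    """Offsets (dx, dy) of the reference samples around an n x n block.

    Top row: dy = -1 with dx in [-1, 2*(n-1)]; left column: dx = -1 with
    dy in [0, n-1] (the corner (-1, -1) is already part of the top row).
    """
    top = [(dx, -1) for dx in range(-1, 2 * (n - 1) + 1)]
    left = [(-1, dy) for dy in range(0, n)]
    return top + left


def reference_first_order_residual(recon_cur, recon_ref, x, y, n, mv):
    """RFR(x+dx, y+dy) = R(x+dx, y+dy) - R1(x+dx+mv_x, y+dy+mv_y)."""
    mvx, mvy = mv
    rfr = {}
    for dx, dy in reference_positions(n):
        cur = recon_cur[y + dy][x + dx]
        ref = recon_ref[y + dy + mvy][x + dx + mvx]
        rfr[(dx, dy)] = cur - ref
    return rfr


# Illustrative frames: the current frame is the reference shifted left by
# one sample and brightened by 2, so every reference residual equals 2.
recon_ref = [[10 * r + c for c in range(5)] for r in range(3)]
recon_cur = [[recon_ref[r][c + 1] + 2 for c in range(4)] for r in range(3)]
rfr = reference_first_order_residual(recon_cur, recon_ref, x=1, y=1, n=2, mv=(1, 0))
```

The key point is that the neighbors are compensated with the current block's own motion vector, which avoids the discontinuity that arises when neighboring blocks use different motion.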
15
• SOP may take a 4x4 or an 8x8 block as its second prediction unit.
• Nine 4x4 intra prediction modes of AVC Baseline profile
• Nine 8x8 intra prediction modes of AVC High profile
• The transform of the second prediction residuals uses the same
block size as the second prediction. The block size may be
chosen adaptively based on a rate-distortion criterion.
16
• Coding of the additional side information of SOP
• Indicator of SOP
• An SOP flag to indicate the usage of SOP at MB level.
• Mode indicator of the second prediction mode
• A second prediction mode is calculated for each MB, and the coding
procedure is the same as that of 4x4 or 8x8 intra-prediction modes
encoding in H.264/AVC.
• The decision of whether an MB is coded in SOP mode
follows the rate-distortion criterion.
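A rate-distortion decision of this kind can be sketched as below, assuming the usual Lagrangian cost J = D + λR. The slides only say "rate-distortion criterion", so the cost form, the function names, and all numbers are illustrative assumptions. The rate of the SOP candidate is meant to include the SOP flag and the second prediction mode.

```python
def rd_cost(distortion, rate_bits, lam):
    """Lagrangian rate-distortion cost J = D + lambda * R (assumed form)."""
    return distortion + lam * rate_bits


def choose_mode(candidates, lam):
    """Pick the candidate with the lowest cost.

    candidates maps a mode name to a (distortion, rate_bits) pair.
    """
    return min(candidates, key=lambda mode: rd_cost(*candidates[mode], lam))


# Made-up numbers: SOP lowers distortion but costs extra side information.
candidates = {
    "normal": (1000, 120),
    "sop": (820, 131),
}

low_lambda_choice = choose_mode(candidates, lam=10)    # costs 2200 vs 2130
high_lambda_choice = choose_mode(candidates, lam=20)   # costs 3400 vs 3440
```

As λ grows, the extra side information weighs more heavily, so the same MB can switch from SOP back to normal coding.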
17
Environment
Low-delay IPPP encoding on H.264/AVC JM10.1 Baseline,
benchmarked against its P-picture coding
Stable improvement, with a gain of at most 0.41 dB
18
• SOP is not efficient for blocks smaller than 8x8, as
too much side information needs to be coded.
• Partitions larger than 8x8 are divided into multiple 8x8
sub-blocks, each with its own second-order prediction mode.
19
• Reduced resolution update (RRU) is a technique that
aims to save coding bits by resizing the image or its prediction
residuals to a reduced spatial resolution.
• At low bit rates, it is known that down-sampling an
image to a lower resolution, compressing the lower-resolution
image, and interpolating the result back to the original
resolution can improve the overall PSNR.
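For reference, PSNR can be computed as in the sketch below. This is the standard definition for 8-bit samples; the function name and the sample values are illustrative.

```python
import math


def psnr(original, reconstructed, peak=255):
    """Peak signal-to-noise ratio in dB between two equally sized images,
    each given as a list of rows."""
    sq_errors = [(o - r) ** 2
                 for orow, rrow in zip(original, reconstructed)
                 for o, r in zip(orow, rrow)]
    mse = sum(sq_errors) / len(sq_errors)
    if mse == 0:
        return math.inf  # identical images
    return 10 * math.log10(peak ** 2 / mse)


original = [[100, 100], [100, 100]]
decoded = [[101, 99], [100, 100]]   # mean squared error of 0.5
quality = psnr(original, decoded)
```

Comparing `psnr(original, full_resolution_decode)` with `psnr(original, upsampled_low_resolution_decode)` at the same bit rate is how the claim above would be checked.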
20
JPEG
Reduced resolution
• Blocks of 8x8 pixels
• Allocate too few bits (4 bits per
block on average)
• Only DC coefficients are coded
• Blocking artifacts
21
22
• The H.264/AVC framework is modified so that the residual after intra
prediction can optionally be down-sampled before the transform and
quantization steps.
• For instance, a 16x16 block can be down-sampled by a factor of 2 so that
only an 8x8 block needs to be encoded.
• The decoder up-samples the down-sampled residual to reconstruct the
full-resolution picture.
• The choice of whether to use RRU is made under RDO.
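The block-level flow above can be sketched as follows. To keep the example short, it uses 2x2 averaging for down-sampling and sample replication for up-sampling; these are stand-ins chosen for illustration, not the filters of the proposed method, and the function names are hypothetical. Transform and quantization of the 8x8 block are omitted.

```python
def downsample2(block):
    """Halve both dimensions by averaging each 2x2 group (stand-in filter)."""
    half = len(block) // 2
    return [[(block[2 * i][2 * j] + block[2 * i][2 * j + 1]
              + block[2 * i + 1][2 * j] + block[2 * i + 1][2 * j + 1]) / 4
             for j in range(half)]
            for i in range(half)]


def upsample2(block):
    """Double both dimensions by replicating each sample (stand-in filter)."""
    out = []
    for row in block:
        wide = [v for v in row for _ in range(2)]
        out.append(wide)
        out.append(list(wide))
    return out


# A 16x16 intra-prediction residual that is constant within each 2x2 group,
# so this particular example survives the round trip unchanged.
residual = [[(i // 2) + (j // 2) for j in range(16)] for i in range(16)]

low = downsample2(residual)   # 8x8 block: the only data that is encoded
rec = upsample2(low)          # decoder side: back to 16x16
```

Only 64 samples instead of 256 go through transform and quantization, which is where the bit saving comes from; residuals with fine detail would not survive the round trip unchanged.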
23
• Second-Order Prediction for Inter Coding
• Gains are not significant
• The optimal motion vector position may differ from the first-order
motion vector, which calls for a motion vector search for each individual
second-order prediction mode.
• This increases computational complexity.
Environment
H.264/AVC JM15.1
Compare with original H.264/AVC inter coding
First frame: I picture; remaining frames: P pictures
Only 4x4 DCT is allowed
QP : 23,28,33,38
24
• Proposed Reduced Resolution Update for Intra Coding
Environment
H.264/AVC JM15.1
Compare with original H.264/AVC intra coding
16x16 blocks
Down-sampling : 5-tap filter [-1 2 6 2 -1]/8
Up-sampling : 7-tap filter [-1 0 9 16 9 0 -1]/16
All frames are I frames
QP : 23,28,33,38
Each sequence is coded using RRU
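The two filters listed above can be sketched in one dimension as below. Several details are assumptions: the last tap of the up-sampling filter is taken to be -1 (a symmetric filter, which keeps flat regions flat), borders are handled by repeating the edge sample, the up-sampler is applied to a zero-inserted signal (written here in its equivalent two-phase form), and the function names are hypothetical. For a 2-D block the same filters would presumably be applied to rows and then columns.

```python
DOWN_TAPS = [-1, 2, 6, 2, -1]   # divided by 8, centered on the kept sample


def clamp(i, n):
    """Repeat the edge sample outside the signal (border handling assumed)."""
    return max(0, min(n - 1, i))


def downsample(x):
    """Filter with [-1 2 6 2 -1]/8 and keep every second sample."""
    n = len(x)
    out = []
    for i in range(n // 2):
        acc = sum(t * x[clamp(2 * i + k - 2, n)] for k, t in enumerate(DOWN_TAPS))
        out.append(acc / 8)
    return out


def upsample(low):
    """Up-sample by 2 with [-1 0 9 16 9 0 -1]/16 on a zero-inserted signal.

    Even outputs copy the low-resolution sample (center tap 16/16); odd
    outputs interpolate from four neighbors with weights [-1 9 9 -1]/16.
    """
    n = len(low)
    out = []
    for i in range(n):
        out.append(float(low[i]))
        half = (-low[clamp(i - 1, n)] + 9 * low[i]
                + 9 * low[clamp(i + 1, n)] - low[clamp(i + 2, n)]) / 16
        out.append(half)
    return out


flat = upsample(downsample([8] * 8))   # a flat signal is preserved
ramp = downsample(list(range(16)))     # interior samples follow the ramp
```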
25
• RRU improves the coding
efficiency for content of
medium complexity.
• H.264/AVC is efficient for
flat areas.
• RRU may bring too much
loss for areas with high-frequency
content.
26
• RRU works well for 16x16 blocks, so its contribution to
overall intra coding depends on how often the 16x16
block size is used rather than the 4x4 and 8x8 modes.
27
• Both the new techniques listed and our experiments on second-order
prediction and RRU show that there is still room for improving the
performance of the current coding standard.
• The Call for Evidence for HVC provided results that averaged a
15-25% gain in coding efficiency.
28