Rate-distortion
modeling of scalable
video coders
Advisor: Prof. 許子衡
Student: 王志嘉
Introduction (i)


R-D models can be classified into two categories based
on the theory they apply: models based on Shannon's rate-distortion theory and those derived from high-rate
quantization theory.
These two theories are complementary and converge to the
same lower bound D ∼ e^(-αR) when the input block size goes
to infinity.
Introduction (ii)


Because block length cannot be infinite in real coding systems, it
is widely recognized that classical rate-distortion theory is
often not suitable for accurate modeling of actual R-D
curves.
Adjustable parameters are often incorporated into the
theoretical R-D models to keep up with the complexity of
coding systems and the diversity of video sources.
Introduction (iii)



Recall that most current R-D models are built for images
or non-scalable video coders.
In this paper, we complete the work and examine R-D
models from a different perspective.
We first derive a distortion model based on approximation
theory and then incorporate the ρ-domain bitrate model
into the final result.
Introduction (iv)


We also show that the unifying ρ-domain model is very
accurate in both Fine Granular Scalability (FGS) and
Progressive FGS (PFGS) coders.
Our work demonstrates that distortion D can be modeled
by a function of both the bitrate R and its logarithm log
R:
variance of the source
constants
Motivation (i)



A typical scalable coder includes one base layer and one
or more enhancement layers.
We examine the accuracy of current R-D models for
scalable coders, with R representing the bitrate of the
enhancement layer.
Without loss of generality, we use peak signal-to-noise
ratio (PSNR) to measure the quality of video sequences.
Motivation (ii)

With the PSNR measure, it is well-known that the
classical model becomes a linear function of coding rate
R:
constants

Fig. 1 shows that PSNR is linear with respect to R only
when the bitrate is sufficiently high, and also that model (6)
has much higher convexity than the actual R-D curve.
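As a short worked step, the linearity of the classical model in PSNR can be recovered from the bound D ∼ e^(-αR) stated in the introduction. The sketch below assumes 8-bit video (peak value 255) and writes the proportionality constant as c; both the symbol c and the peak value are assumptions made for illustration and do not come from the slides.

```latex
% Assumption: D = c e^{-\alpha R} with some constant c > 0, 8-bit samples (peak 255).
\begin{aligned}
\mathrm{PSNR}(R)
  &= 10 \log_{10} \frac{255^2}{D}
   = 10 \log_{10} \frac{255^2}{c\, e^{-\alpha R}} \\
  &= \underbrace{10 \log_{10} \frac{255^2}{c}}_{\text{intercept}}
   \;+\; \underbrace{\frac{10\,\alpha}{\ln 10}}_{\text{slope}} \, R .
\end{aligned}
```

Under these assumptions PSNR is an affine function of R with slope 10α / ln 10, which is the behavior Fig. 1 confirms only at sufficiently high bitrates.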
Fig.1
Motivation (iii)

This bound was developed specifically for wavelet-based
coding schemes. Mallat extended it to transform-based
low-bitrate image coding:
parameter
constant
R-D Model For Scalable Coders—
Preliminaries (i)



Uniform quantizers are widely applied to video coders
due to their asymptotic optimality.
We show the lower bound on distortion in quantization
theory assuming seminorm-based distortion measures
and uniform quantizers.
If X and X̂ are k-dimensional vectors and the distortion
between X and X̂ is d(X, X̂) = ||X − X̂||^r, the minimum
distortion for uniform quantizers is
R-D Model For Scalable Coders—
Preliminaries (ii)

△ is the quantization step.
Gamma function
R-D Model For Scalable Coders—
Preliminaries (iii)

When r = 2 and k = 1, we obtain the popular MSE formula
for uniform quantizers:
D = Δ² / β
where β is 12 if the quantization step Δ is much smaller than the
signal variance.
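The Δ²/12 approximation for a fine uniform quantizer can be checked numerically. The following is a minimal sketch, not the authors' experiment: the Gaussian test source, its standard deviation, the step size, and the function names are all assumptions chosen only so that the step is much smaller than the signal spread.

```python
import random


def uniform_quantize(x, step):
    """Round x to the nearest multiple of step (uniform quantizer)."""
    return step * round(x / step)


def empirical_mse(samples, step):
    """Mean squared error between the samples and their quantized values."""
    return sum((x - uniform_quantize(x, step)) ** 2 for x in samples) / len(samples)


# Hypothetical source: zero-mean Gaussian with standard deviation 10,
# quantized with a step (0.5) that is small compared with the spread.
rng = random.Random(0)
samples = [rng.gauss(0.0, 10.0) for _ in range(200_000)]
step = 0.5

mse = empirical_mse(samples, step)
predicted = step ** 2 / 12  # high-resolution approximation with beta = 12
relative_gap = abs(mse - predicted) / predicted
print(f"empirical MSE = {mse:.6f}, step^2/12 = {predicted:.6f}")
```

When the step is made comparable to the signal spread, the quantization error is no longer close to uniform within each bin, and the value 12 should not be expected to hold.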
Distortion Analysis (i)



In the transform domain, distortion D consists of two
parts:
1) distortion Di from discarding the insignificant
coefficients in (-△, △)
2) distortion Ds from quantizing the significant
coefficients
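The two-part split can be illustrated with a small numerical sketch. It assumes a dead-zone quantizer that zeroes every coefficient in (-Δ, Δ) and reconstructs each significant coefficient at the midpoint of its bin of width Δ; the reconstruction rule, the Laplacian test data, and all names are assumptions for illustration, and the code does not reproduce the closed-form expression of the lemma.

```python
import random


def dead_zone_quantize(u, step):
    """Zero out |u| < step; otherwise reconstruct at the midpoint of u's bin."""
    if abs(u) < step:
        return 0.0
    k = int(abs(u) // step)          # bin index, k >= 1 for significant values
    sign = 1.0 if u > 0 else -1.0
    return sign * (k + 0.5) * step


def split_distortion(coeffs, step):
    """Return (D_i, D_s, M): the two MSE parts and the significant count."""
    n = len(coeffs)
    d_i = d_s = 0.0
    m = 0
    for u in coeffs:
        err = (u - dead_zone_quantize(u, step)) ** 2
        if abs(u) < step:
            d_i += err               # discarded (insignificant) coefficient
        else:
            d_s += err               # quantized (significant) coefficient
            m += 1
    return d_i / n, d_s / n, m


# Hypothetical Laplacian-like coefficients: exponential magnitude, random sign.
rng = random.Random(1)
coeffs = [rng.choice((-1.0, 1.0)) * rng.expovariate(0.2) for _ in range(50_000)]
step = 4.0

d_i, d_s, m = split_distortion(coeffs, step)
total = sum((u - dead_zone_quantize(u, step)) ** 2 for u in coeffs) / len(coeffs)
print(f"D_i = {d_i:.4f}, D_s = {d_s:.4f}, D = {total:.4f}, M = {m}")
```

Both parts are normalized by the total coefficient count N, so their sum equals the overall MSE of the frame.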
Given this notation, we have the following lemma.
Lemma 1: Assuming that the total number of transform
coefficients U is N and the number of significant
coefficients is M, the MSE distortion D is:
Distortion Analysis (ii)

In Fig. 2, the left side shows an example of actual
distortion D and simulation results of model (10) for
frame 3 in FGS-coded CIF Foreman, and the right side
shows the average absolute error between model (10) and
the actual distortion in FGS-coded CIF Foreman and
Carphone sequences.
Fig.2
R-D Modeling (i)



To improve the unsatisfactory accuracy of current R-D
models in scalable coders, we derive an accurate R-D
model based on source statistical properties and a recent
ρ-domain model.
Bitrate R is a linear function of the percentage of
significant coefficients z in each video frame.
We extensively examined the relationship between R and
z in various video frames and found this linear model
holds very well for scalable coders.
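One way to test such a linear relationship on a given frame is an ordinary least-squares fit of R against z. The sketch below is only illustrative: the helper name `fit_line` is hypothetical, z is assumed to be expressed as a fraction, and the (z, R) pairs are made-up numbers, not measurements from the paper.

```python
def fit_line(z_values, rates):
    """Ordinary least-squares fit of rates ~ slope * z + intercept."""
    n = len(z_values)
    mean_z = sum(z_values) / n
    mean_r = sum(rates) / n
    cov = sum((z - mean_z) * (r - mean_r) for z, r in zip(z_values, rates))
    var = sum((z - mean_z) ** 2 for z in z_values)
    slope = cov / var
    intercept = mean_r - slope * mean_z
    return slope, intercept


# Made-up sample points for illustration only (arbitrary rate units).
z_demo = [0.05, 0.10, 0.20, 0.30, 0.40]
r_demo = [12.0, 23.0, 47.0, 70.0, 93.0]

slope, intercept = fit_line(z_demo, r_demo)
predicted = [slope * z + intercept for z in z_demo]
max_abs_err = max(abs(p - r) for p, r in zip(predicted, r_demo))
print(f"R ~ {slope:.2f} * z + {intercept:.2f}, max abs error = {max_abs_err:.3f}")
```

A small maximum error relative to the rate range would indicate that the linear model describes the frame well; real measurements of R and z would replace the placeholder lists.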
R-D Modeling (ii)



Fig. 3 demonstrates two typical examples of the actual
bitrate R and its linear estimation in FGS and PFGS video
frames.
Using the ρ-domain model, we have our main result as
follows.
Theorem 1: The distortion of scalable video coders is
given by:
Experimental Results (i)


We apply the proposed model (14) to various scalable
video frames to evaluate its accuracy. Fig. 4 shows two
examples of R-D curves for I (left) and P (right) frames of
FGS-coded CIF Foreman.
All results shown in this paper utilize videos in the CIF
format with the base layer coded at 128 kb/s and 10
frames/s. We contrast the performance of the proposed
model with that of the other two models in FGS-coded
Foreman and Carphone in Fig. 5.
Fig.4
Fig.5
Experimental Results (ii)

Additionally, Fig. 6 shows the same comparison in
PFGS-coded Coastguard and Mobile.
Conclusion


This paper analyzed the distortion of scalable coders and
proposed a novel R-D model from the perspective of
approximation theory.
Given the lack of R-D modeling of scalable coders, we
believe this work will benefit both Internet streaming
applications and theoretical discussion in this area.