Rate-distortion modeling of scalable video coders
Advisor: Prof. 許子衡
Student: 王志嘉
Introduction (i)
R-D models can be classified into two categories based
on the theory they apply: models based on Shannon's
rate-distortion theory and models derived from high-rate
quantization theory.
These two theories are complementary and converge to the
same lower bound D ~ e^{-αR} as the input block size goes
to infinity.
Introduction (ii)
Because block length cannot be infinite in real coding
systems, it is widely recognized that classical
rate-distortion theory is often unsuitable for accurately
modeling actual R-D curves.
Adjustable parameters are therefore often incorporated into
theoretical R-D models to keep up with the complexity of
coding systems and the diversity of video sources.
Introduction (iii)
Recall that most current R-D models are built for images
or non-scalable video coders.
In this paper, we complete the work and examine R-D
models from a different perspective.
We first derive a distortion model based on approximation
theory and then incorporate the ρ-domain bitrate model
into the final result.
Introduction (iv)
We also show that the unifying ρ-domain model is very
accurate in both Fine Granular Scalability (FGS) and
Progressive FGS (PFGS) coders.
Our work demonstrates that distortion D can be modeled
by a function of both the bitrate R and its logarithm
log R. The remaining terms of the model are the variance
of the source and constants.
Motivation (i)
A typical scalable coder includes one base layer and one
or more enhancement layers.
We examine the accuracy of current R-D models for
scalable coders, with R representing the bitrate of the
enhancement layer.
Without loss of generality, we use peak signal-to-noise
ratio (PSNR) to measure the quality of video sequences.
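As a minimal sketch of the quality measure used here, the snippet below converts mean squared error into PSNR. It assumes 8-bit samples with a peak value of 255; the function name `psnr_from_mse` is illustrative and does not come from the slides.

```python
import math

def psnr_from_mse(mse, peak=255.0):
    """Return PSNR in dB for a given MSE (peak=255 assumes 8-bit samples)."""
    if mse <= 0:
        raise ValueError("MSE must be positive; PSNR is unbounded for identical frames")
    return 10.0 * math.log10(peak * peak / mse)

# Halving the MSE raises PSNR by 10*log10(2), about 3.01 dB.
gain = psnr_from_mse(50.0) - psnr_from_mse(100.0)
print(round(gain, 2))
```

Because PSNR is a monotone function of MSE, comparing R-D models in PSNR or in D ranks them the same way; PSNR simply puts the curves on a logarithmic scale.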
Motivation (ii)
With the PSNR measure, it is well known that the
classical model becomes a linear function of the coding
rate R, with a constant slope and a constant offset.
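The linearity can be checked numerically. The sketch below assumes the textbook high-rate form D(R) = σ² · 2^(-2R), with R in bits per sample; this form is an assumption, since the slide's own equation is not preserved. The variance value is hypothetical.

```python
import math

def classical_distortion(rate_bits, variance):
    """Textbook high-rate model D(R) = variance * 2^(-2R) (assumed form)."""
    return variance * 2.0 ** (-2.0 * rate_bits)

def psnr(mse, peak=255.0):
    """PSNR in dB, assuming 8-bit samples."""
    return 10.0 * math.log10(peak * peak / mse)

variance = 400.0                 # hypothetical source variance
rates = [1.0, 2.0, 3.0, 4.0]     # bits per sample
values = [psnr(classical_distortion(r, variance)) for r in rates]

# Each extra bit adds the same PSNR increment, 20*log10(2) dB,
# so PSNR is a straight line in R under this model.
slopes = [b - a for a, b in zip(values, values[1:])]
print([round(s, 2) for s in slopes])
```

Under this assumption the slope is about 6.02 dB per bit and the offset depends only on the source variance, which matches the "linear function with constants" description.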
Fig. 1 shows that PSNR is linear with respect to R only
when the bitrate is sufficiently high, and that model (6)
has much higher convexity than the actual R-D curve.
Fig. 1
Motivation (iii)
This bound was developed specifically for wavelet-based
coding schemes. Mallat extended it to transform-based
image coding at low bitrates, in a model with one
parameter and one constant.
R-D Model for Scalable Coders: Preliminaries (i)
Uniform quantizers are widely applied to video coders
due to their asymptotic optimality.
We show the lower bound on distortion in quantization
theory assuming seminorm-based distortion measures
and uniform quantizers.
If X and X̂ are k-dimensional vectors and the distortion
between X and X̂ is d(X, X̂) = ||X - X̂||^r, the minimum
distortion for uniform quantizers is
R-D Model for Scalable Coders: Preliminaries (ii)
In this bound, Δ is the quantization step and Γ(·) is the
Gamma function.
R-D Model for Scalable Coders: Preliminaries (iii)
When r = 2 and k = 1, we obtain the popular MSE formula
for uniform quantizers, D = Δ^2 / β, where β is 12 if the
quantization step is much smaller than the signal
variance.
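The β = 12 case can be verified with a short numerical experiment. This is a sketch under stated assumptions: the input is a dense, evenly spaced grid on (-1, 1) standing in for a smooth source, the quantizer is a plain rounding quantizer, and the step value is arbitrary.

```python
def uniform_quantize(x, step):
    """Uniform quantizer: map x to the nearest multiple of step."""
    return step * round(x / step)

def quantizer_mse(samples, step):
    """Mean squared error between samples and their quantized values."""
    return sum((x - uniform_quantize(x, step)) ** 2 for x in samples) / len(samples)

# Dense grid on (-1, 1) approximating a uniformly distributed source (assumption).
n = 100_000
samples = [-1.0 + 2.0 * (i + 0.5) / n for i in range(n)]

step = 0.05                      # much smaller than the spread of the input
measured = quantizer_mse(samples, step)
predicted = step ** 2 / 12.0     # MSE formula with beta = 12
print(measured, predicted)
```

With a step this small relative to the input range, the measured MSE agrees with Δ^2 / 12 to well within one percent. For coarse steps the quantization error is no longer close to uniform within each cell and β moves away from 12.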
Distortion Analysis (i)
In the transform domain, distortion D consists of two
parts:
1) distortion D_i from discarding the insignificant
coefficients in (-Δ, Δ)
2) distortion D_s from quantizing the significant
coefficients
Given this notation, we have the following lemma.
Lemma 1: Assuming that the total number of transform
coefficients U is N and the number of significant
coefficients is M, the MSE distortion D is:
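The two-part split of the distortion can be illustrated directly. The sketch below is not the lemma's closed-form expression, which is not preserved in this transcript; it only computes the two parts empirically. The dead-zone quantizer with midpoint reconstruction and the coefficient values are assumptions made for illustration.

```python
def deadzone_quantize(u, step):
    """Zero out coefficients in (-step, step); uniformly quantize the rest.

    Significant coefficients are reconstructed at the midpoint of their cell
    (an assumption of this sketch, not taken from the source).
    """
    if abs(u) < step:
        return 0.0
    sign = 1.0 if u > 0 else -1.0
    index = int(abs(u) // step)          # cell [index*step, (index+1)*step)
    return sign * (index + 0.5) * step

def distortion_parts(coeffs, step):
    """Return (D_i, D_s, M): both distortion parts normalized by N, and the
    number of significant coefficients."""
    n = len(coeffs)
    d_insig = sum(u * u for u in coeffs if abs(u) < step) / n
    d_sig = sum((u - deadzone_quantize(u, step)) ** 2
                for u in coeffs if abs(u) >= step) / n
    m = sum(1 for u in coeffs if abs(u) >= step)
    return d_insig, d_sig, m

coeffs = [0.3, -1.2, 4.7, -0.8, 9.1, 0.05, -6.4, 2.2]   # hypothetical coefficients
step = 2.0
d_i, d_s, m = distortion_parts(coeffs, step)

# The overall MSE equals the sum of the two parts.
total = sum((u - deadzone_quantize(u, step)) ** 2 for u in coeffs) / len(coeffs)
print(d_i, d_s, m, total)
```

A discarded coefficient contributes its full energy u^2 to D_i, while a significant coefficient contributes only its quantization error to D_s, which is why the count M of significant coefficients appears in the lemma.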
Distortion Analysis (ii)
In Fig. 2, the left side shows an example of actual
distortion D and simulation results of model (10) for
frame 3 in FGS-coded CIF Foreman, and the right side
shows the average absolute error between model (10) and
the actual distortion in FGS-coded CIF Foreman and
Carphone sequences.
Fig. 2
R-D Modeling (i)
To improve on the unsatisfactory accuracy of current R-D
models in scalable coders, we derive an accurate R-D
model based on source statistical properties and a recent
ρ-domain model.
Bitrate R is a linear function of the percentage of
significant coefficients z in each video frame.
We extensively examined the relationship between R and
z in various video frames and found this linear model
holds very well for scalable coders.
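One way to test the linear relationship on measured data is an ordinary least-squares fit of R against z. The sketch below treats z as a fraction in [0, 1]; the (z, R) pairs are hypothetical values chosen for illustration, not results from the paper, and the helper names are not from the source.

```python
def fraction_significant(coeffs, step):
    """Fraction z of coefficients with magnitude at least step."""
    return sum(1 for u in coeffs if abs(u) >= step) / len(coeffs)

def fit_line(xs, ys):
    """Ordinary least-squares fit ys ~ slope * xs + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

# Hypothetical measurements: z per frame and the matching bitrate R
# (for example in bits per pixel).
z_values = [0.05, 0.10, 0.20, 0.35, 0.50]
r_values = [0.31, 0.59, 1.22, 2.08, 3.01]

slope, intercept = fit_line(z_values, r_values)
residuals = [r - (slope * z + intercept) for z, r in zip(z_values, r_values)]
print(slope, intercept, max(abs(e) for e in residuals))
```

Small residuals relative to R indicate that the linear model fits the frame well; once the slope is known, z can be computed from the coefficients alone to predict the bitrate.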
R-D Modeling (ii)
Fig. 3 demonstrates two typical examples of the actual
bitrate R and its linear estimation in FGS and PFGS video
frames.
Using the ρ-domain model, we have our main result as
follows.
Theorem 1: The distortion of scalable video coders is
given by:
Experimental Results (i)
We apply the proposed model (14) to various scalable
video frames to evaluate its accuracy. Fig. 4 shows two
examples of R-D curves for I (left) and P (right) frames of
FGS-coded CIF Foreman.
All results shown in this paper utilize videos in the CIF
format with the base layer coded at 128 kb/s and 10
frames/s. We contrast the performance of the proposed
model with that of the other two models in FGS-coded
Foreman and Carphone in Fig. 5.
Fig. 4
Fig. 5
Experimental Results (ii)
Additionally, Fig. 6 shows the same comparison in
PFGS-coded Coastguard and Mobile.
Conclusion
This paper analyzed the distortion of scalable coders and
proposed a novel R-D model from the perspective of
approximation theory.
Given the lack of R-D modeling of scalable coders, we
believe this work will benefit both Internet streaming
applications and theoretical discussion in this area.