Transcript Document

LAYERED QUALITY ADAPTATION
for
INTERNET VIDEO STREAMING
by
Reza Rejaie, Mark Handley and Deborah Estrin
Information Science Institute (ISI), University of Southern California
published in
IEEE Journal On Selected Areas In Communications
Vol. 18., NO.12, December 2000
Review:
- 2 types of flows: TCP vs. UDP
- TCP
: responsive to congestion
: reliable (provides retransmission)
- UDP
: non-responsive
: non-reliable
Problem:
- Explosive growth of audio & video
streaming
- Streaming applications: rate-based,
delay-sensitive, semi-reliable
- Lack of effective congestion
control mechanism in such applications
To support streaming applications
over the Internet, two conflicting
requirements have to be addressed:
- Application Requirement
* require relatively constant BW
to deliver a stream with a certain
quality
- Network Requirement
* end systems are expected to
react to congestion properly and
promptly
To satisfy these two requirements
simultaneously, Internet streaming
applications should be quality adaptive
- the application should adjust the
quality of the delivered stream such
that the required BW matches the
congestion controlled rate-limit
- the main challenge is to minimize
the variations in quality, while obeying
the congestion controlled rate-limit
This paper presents:
- a novel quality adaptation mechanism
that adjusts the quality of congestion
controlled video playback on-the-fly
- the key feature of this mechanism is
the ability to control the level of
smoothing (i.e., frequency of changes) to
improve quality of the delivered stream
Primary Assumption:
- the congestion control protocol is Rate
Adaptation Protocol (RAP)
- RAP is a rate-based congestion control
mechanism that employs an Additive
Increase Multiplicative Decrease (AIMD)
algorithm in a manner similar to TCP
RAP Increase/Decrease Algorithm
Base Equation: Si = PacketSize / IPGi
To increase the rate additively,
IPGi+1 = (IPGi * C) / (IPGi + C)
where C is a constant with the dimension
of time
Upon detecting congestion, the tx rate
is decreased multiplicatively,
Si+1 = α * Si, IPGi+1 = IPGi / α
where α = 0.5
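The AIMD update above can be sketched in a few lines of Python (a hypothetical illustration; the PACKET_SIZE, C, and ALPHA values are assumed, not taken from the paper):

```python
PACKET_SIZE = 1000.0  # bytes per packet (assumed value)
C = 0.1               # constant with the dimension of time, in seconds (assumed value)
ALPHA = 0.5           # multiplicative decrease factor, per the paper

def rate(ipg):
    """Transmission rate implied by the inter-packet gap: Si = PacketSize / IPGi."""
    return PACKET_SIZE / ipg

def increase(ipg):
    """Additive increase: IPGi+1 = (IPGi * C) / (IPGi + C).
    Each step raises the rate by exactly PacketSize / C."""
    return (ipg * C) / (ipg + C)

def decrease(ipg):
    """Multiplicative decrease on congestion: Si+1 = ALPHA * Si,
    i.e. IPGi+1 = IPGi / ALPHA (the gap doubles when ALPHA = 0.5)."""
    return ipg / ALPHA
```

Note that the increase rule is additive in the rate domain even though it is expressed on the gap: substituting IPGi+1 into Si+1 = PacketSize / IPGi+1 gives Si+1 = Si + PacketSize / C.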
RAP Decision Frequency
- how often to change the rate
- depends on the feedback delay
- feedback delay in ACK-based schemes
is equal to one RTT
- it is suggested that rate-based schemes
adjust their rates not more than once
per RTT
Target Environment
- heterogeneous clients
- video on demand
- dominant competing traffic is TCP
- short startup latency expected
[Figure: a remote video server streaming over the INTERNET to heterogeneous clients Client#1 - Client#4]
Quality Adaptation Mechanisms
- “Adaptive Encoding”:
requantize stored encoding on-the-fly
based on the network feedback
disadv: CPU-intensive task
- “Switching among multiple pre-encoded
versions”:
the server keeps several versions of
each stream with different qualities. As
available BW changes, the server chooses
the appropriate version of the stream
disadv: requires large storage at the server
- “Hierarchical Encoding”:
the server maintains a layered encoded
version of each stream. As more BW is
available, more layers of the encoding are
delivered. If the available BW decreases,
the server may then drop some of the
layers being transmitted
*** layered approaches usually have the
decoding constraint that a particular
enhancement layer can only be decoded if
all the lower quality layers have been
received
adv: - less storage at the server
- provides an opportunity for selective
repair of the more important
information
*** the design of a layered approach for
quality adaptation primarily entails the
design of an efficient add and drop
mechanism that maximizes quality while
minimizing the probability of base-layer
buffer underflow
NOTE !
- Hierarchical Encoding provides an
effective way for a video server to coarsely
adjust the quality of a video stream without
transcoding the stored data
- However, it does not provide the
fine-grained control over BW, that is, BW
only changes at the granularity of a layer
- there needs to be a quality adaptation
mechanism with the ability to control the
level of smoothing
“short-term improvement”
vs.
“long-term smoothing”
- in the aggressive approach, a new layer is added as
a result of a minor increase in available BW. However, this
additional bandwidth does not last long. Thus, the
aggressive approach yields only short-term improvement
- in contrast, the conservative approach does not adjust
the quality due to minor changes in BW. Thus, it
results in long-term smoothing
- Hierarchical encoding allows video
quality adjustment over long periods of time,
whereas congestion control changes the tx
rate rapidly over short time intervals
- The mismatch between the two
time scales is made up for by buffering data
at the receiver to smooth the rapid
variations in available BW and allow a near
constant number of layers to be played
The Proposed Architecture
- all the streams are layered-encoded and
stored at the server
- all active layers are multiplexed into a single
RAP flow by the server
- at the client side, layers are de-multiplexed
and each one goes to its corresponding buffer
- the decoder drains data from the buffers and
feeds the display
- assume that each layer has the same BW
and all buffers are drained at the same
constant rate (C)
- the congestion control module continuously
reports available BW to the quality
adaptation module
- the quality adaptation module then adjusts
# of active layers and allocated share of
congestion controlled BW to each active layer
Challenging Questions:
- When is it suitable to add a new layer ?
- When should we drop a layer ?
- What is the optimal buffer allocation
among the layers ?
Assumption:
- would like to survive a single backoff
with all the layers intact
Adding a Layer:
The server may add a new layer when:
1.the instantaneous available BW is
greater than the consumption rate of the
existing layers plus the new layer
R > (na + 1) C, and
2. there is sufficient total buffering
at the receiver to survive an immediate
backoff and continue playing all the
existing layers plus the new layer
Σ i=0..na-1 bufi >= ((na + 1)*C - R/2)^2 / (2S)
where
R    : the current tx rate
na   : the number of currently active layers
bufi : the amount of buffered data for layer i
S    : the rate of linear increase in BW
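The two add conditions can be combined into a small predicate (a sketch; the function name and argument layout are assumed, and symbols follow the slides):

```python
def can_add_layer(R, C, S, bufs):
    """Return True if the server may add a new layer.
    R: current tx rate, C: per-layer consumption rate,
    S: rate of linear increase in BW,
    bufs: buffered data for each currently active layer at the receiver."""
    na = len(bufs)
    # Condition 1: instantaneous BW exceeds consumption of na + 1 layers.
    if R <= (na + 1) * C:
        return False
    # Condition 2: total buffering covers the deficit after an immediate
    # backoff (the area of the recovery triangle), ((na+1)C - R/2)^2 / (2S).
    required = ((na + 1) * C - R / 2) ** 2 / (2 * S)
    return sum(bufs) >= required
```

With plentiful buffering the rate test dominates; with a near-empty buffer even ample rate is not enough to add a layer safely.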
Dropping a Layer:
- once a backoff occurs, if the total
amount of buffering at the receiver is
less than the estimated required buffering
for recovery (i.e., the area of triangle
cde in fig.5), the correct action is to
immediately drop the highest layer
- if the buffering is still insufficient, the
server should iteratively drop the
highest layer until the amount of buffering
is sufficient
while ( na*C > R + sqrt(2S * Σ i=0..na-1 bufi) )
do
na = na - 1
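The drop loop can be sketched directly (names assumed; note the sum is recomputed over only the layers that remain after each drop):

```python
import math

def drop_layers(na, C, R, S, bufs):
    """Drop the highest layers until na*C <= R + sqrt(2*S*total_buffered).
    na: number of active layers, C: per-layer consumption rate,
    R: tx rate after the backoff, S: rate of linear increase in BW,
    bufs: buffered data per layer, lowest layer first."""
    while na > 0 and na * C > R + math.sqrt(2 * S * sum(bufs[:na])):
        na -= 1  # drop the highest active layer
    return na
```

For example, four active layers with thin buffers on the top layers may shrink to two after a deep backoff, since the discarded layers' buffers no longer count toward recovery.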
Optimal Interlayer Buffer Allocation:
- due to the decoding constraint in
hierarchical encoding, each additional
layer depends on all the lower layers
- a buffer allocation mechanism should
provide higher protection for lower layers
by allocating a higher share of total
buffering to them
- the minimum number of buffering layers
that are needed for successful recovery
from short-term reduction in available BW
can be determined as:
nb = na - floor(R/(2C)) ,  if na*C > R/2
nb = 0                  ,  if na*C <= R/2
where
nb : # of buffering layers
na : # of active layers
- the consumption rate of a layer must be
supplied either from the network or from
the buffer or a combination of the two
- If it is supplied entirely from the buffer,
that layer's buffer drains at the
consumption rate C.
The optimal amount of buffering for
layer i is:
if i < nb - 1:
Bufi,opt = C * (C(2na - 2i - 1) - R) / (2S)
if i = nb - 1 (the top buffering layer):
Bufi,opt = (na*C - R/2 - i*C)^2 / (2S)
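A sketch of the allocation (names assumed; the top-layer term is written without a leading C factor, a reconstruction chosen so that every term has the dimension of data and the per-layer amounts sum to the total recovery buffering (na*C - R/2)^2 / (2S)):

```python
import math

def optimal_buffers(na, C, R, S):
    """Optimal buffered data per buffering layer, lowest layer first.
    na: active layers, C: per-layer consumption rate,
    R: tx rate at the (anticipated) backoff, S: linear increase slope."""
    if na * C <= R / 2:
        return []  # rate after backoff still covers all layers: no buffering layers
    nb = na - math.floor(R / (2 * C))  # number of buffering layers
    bufs = []
    for i in range(nb):
        if i < nb - 1:
            # lower layers drain for the full recovery period
            bufs.append(C * (C * (2 * na - 2 * i - 1) - R) / (2 * S))
        else:
            # top buffering layer drains only for part of the recovery
            bufs.append((na * C - R / 2 - i * C) ** 2 / (2 * S))
    return bufs
```

The allocation is decreasing in i, matching the design goal of giving lower (more important) layers a larger share of the total buffering.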
Fine-Grain Bandwidth Allocation:
- the server can control the filling and
draining pattern of receiver’s buffers
by proper fine-grain bandwidth allocation
among active layers
- fine-grain bandwidth allocation is
performed by assigning the next packet
to a particular layer
- the main challenge is that the optimal
interlayer buffer allocation depends on
the transmission rate at the time of a
backoff (R), which is not known a priori
because a backoff may occur at any
random time
- to tackle this problem, during the
filling phase, the server utilizes extra
BW to progressively fill receiver’s buffers
up to an optimal state in a step-wise
fashion
- the server maintains an image of the
receiver’s buffer state, which is
continuously updated based on the
playout information included in ACK pkts
- During the filling phase, the extra BW
is allocated among buffering layers on a
per-packet basis through the following
steps assuming a backoff will occur
immediately:
1. if we keep only one layer (L0), is there sufficient
buffering with optimal distribution to recover ?
* if there is insufficient buffering, the next packet
is assigned to L0 until this condition is met, and then the
second step starts
2. if we keep only two layers (L0, L1), is there sufficient
buffering with optimal distribution to recover ?
* if there is insufficient buffering, the next packet
is assigned to L0 until it reaches its optimal level. Then,
the server starts sending packets for L1 until both layers
have the optimal level of buffering to survive
- we then start a new step: increase the number of
expected surviving layers, calculate a new optimal buffer
distribution, and sequentially fill their buffers up to the new
optimal level
- this process is repeated until all layers can survive a
single backoff
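The per-packet decision in the steps above amounts to "give the next packet of extra BW to the lowest layer still below its target for the current step" (a simplification; the function name and the optimal targets passed in are assumed):

```python
def next_packet_layer(bufs, optimal):
    """Pick the layer that should receive the next packet of extra BW.
    bufs[i]: current buffered data for layer i (server's image of the
    receiver's buffer state); optimal[i]: target level for the current step.
    Returns the lowest deficient layer's index, or None when every layer
    already meets its target (all can survive a backoff)."""
    for i, (buffered, target) in enumerate(zip(bufs, optimal)):
        if buffered < target:
            return i
    return None
```

Filling the lowest deficient layer first mirrors the sequential steps: L0 reaches its step target before L1 receives any packets, and so on.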
- During the draining phase, each layer's BW
share plus its draining rate is equal
to its consumption rate.
- Thus, maximally efficient buffering
results in the upper layers being supplied
from the network during the draining
phase, while the lower layers are supplied
from their buffers
If we would like to survive more than
one backoff (Kmax > 1):
- During the filling phase:
- During the draining phase:
* entered due to one or more backoffs
* reverse the filling phase
* identify between which two steps we are
currently located. This determines how
many layers should be dropped due to lack
of sufficient buffering
* then, we traverse through the steps in
the reverse order to determine which
buffering layers must be drained and by
how much
* the amount and pattern of draining is
then controlled by fine-grain interlayer
BW allocation by the server