slides - network systems lab @ sfu
Flexible Transport of 3-D Videos
Over Networks
Ahmed Hamza
Network Systems Lab
Simon Fraser University
July 15, 2013
Outline
Introduction
State of the Art
3D Video Representation
3D Video Coding
Transport Protocols
P2P Streaming
Adaptive 3D Video Streaming
Stereo Video
Multi-view Video
Case Study: DIOMEDES
Introduction
In the near term, popular 3-D media will most likely be in the
form of stereoscopic and multi-view video.
Transmission of 3-D media, via broadcast or on-demand, to
end users with varying 3-D display terminals (e.g., TV, laptop,
and mobile devices) and bandwidths is one of the biggest
challenges to bring 3-D media to the home and mobile
devices.
Two main platforms for 3-D video delivery:
digital television (DTV) platforms
Internet Protocol (IP) platforms
Platform for 3D Media Transport
IP-based Delivery Platforms
IPTV
multimedia services delivered over IP-based managed
networks that provide the required level of quality of
service (QoS) and experience, security, interactivity, and
reliability
WebTV
services offered over Internet connections that support
best effort delivery with no QoS guarantees, making them
accessible anytime, anywhere as opposed to IPTV
Hybrid DTV-IP Approach
The DVB channel is limited by its physical channel bandwidth,
which is too small to carry multi-view video (MVV).
The IP platform is more flexible in terms of bandwidth but is
not reliable.
A more recent research direction is to consider a
combination of DVB and IP platforms to deliver MVV to
provide free-view TV/video experience.
Stereoscopic Video
The simplest 3D video data representation
Each of the two captured views is presented to one of
the eyes
Can be multiplexed either spatially (passive) or
temporally (active)
Temporal multiplexing has the advantage of maintaining the full
resolution of each view
Disadvantage:
hardware representation dependency (acquisition process is
tailored to a specific type of displays, baseline distance between
the two cameras is fixed)
Multiplexing Stereo Video
Spatial Multiplexing
(half the resolution)
Time Multiplexing
(double the frame rate)
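The two multiplexing options above can be sketched in a few lines. This is only an illustration in plain Python, assuming each frame is a list of pixel rows; the helper names `side_by_side` and `time_interleave` are hypothetical, and the column dropping used here omits the low-pass filtering a real system would apply before decimation.

```python
def side_by_side(left, right):
    """Spatial multiplexing: keep every other column of each view and
    pack both halves into one frame of the original width."""
    return [l_row[::2] + r_row[::2] for l_row, r_row in zip(left, right)]


def time_interleave(left_frames, right_frames):
    """Temporal multiplexing: alternate full-resolution left and right
    frames, which doubles the frame rate of the output sequence."""
    out = []
    for l_frame, r_frame in zip(left_frames, right_frames):
        out.extend([l_frame, r_frame])
    return out


# Toy 2x4 frames for the left and right views.
left = [[1, 2, 3, 4], [5, 6, 7, 8]]
right = [[11, 12, 13, 14], [15, 16, 17, 18]]

packed = side_by_side(left, right)           # one frame, half resolution per view
sequence = time_interleave([left], [right])  # two frames, full resolution each
```

The packed frame has the same width as one input view, while the interleaved sequence holds twice as many frames as either input.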
Video Plus Depth
2D video signal along with geometry information of the scene
texture
depth map
Multi-view Plus Depth (MVD)
[Figure: example views from cameras Cam-0, Cam-3, and Cam-6]
3D Image Warping
Example
Ismaël Daribo and Hideo Saito, “A Novel Inpainting-Based Layered Depth Video for 3DTV,” IEEE
Transactions on Broadcasting, vol. 57, no. 2, June 2011
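A rough sketch of the warping step for a single image row is given below. It assumes rectified, horizontally aligned cameras, so that each pixel shifts by a disparity of f*b/Z (focal length f in pixels, baseline b, depth Z), and it assumes the virtual camera lies on the side that shifts pixels to the left. The function name, parameters, and the use of `None` as a hole marker are illustrative choices, not taken from the cited paper.

```python
def warp_row(colors, depths, focal_px, baseline):
    """Forward-warp one row of a texture image to a virtual view.

    Each pixel moves left by round(focal_px * baseline / depth) columns.
    When two pixels land on the same target, the nearer one wins.
    Targets that receive no pixel stay None (holes).
    """
    width = len(colors)
    out = [None] * width
    out_depth = [float("inf")] * width
    for x, (color, z) in enumerate(zip(colors, depths)):
        target = x - round(focal_px * baseline / z)
        if 0 <= target < width and z < out_depth[target]:
            out[target] = color
            out_depth[target] = z
    return out


# Two far pixels (depth 10) followed by two near pixels (depth 5).
row = warp_row(["a", "b", "c", "d"], [10.0, 10.0, 5.0, 5.0],
               focal_px=10.0, baseline=1.0)
```

In this toy row the near pixels shift further than the far ones, so one far pixel is overwritten and two positions are left empty; those empty positions correspond to disoccluded regions that have no source pixel in the reference view.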
Layered Depth Video (LDV)
Main Layer
(central color view and depth map)
Enhancement Layer
(color and depth occlusions)
projected on central viewpoint
Three-Dimensional Video Coding
3-D video encoding depends on the transport option and raw
video format.
Simulcast encoding:
encode each view and/or depth map independently using a
scalable or non-scalable monocular video codec
enables streaming each view over separate channels
clients can request as many views as their 3-D displays require
Dependent encoding:
encode views using MVC to decrease the overall bit rate by
exploiting the inter-view redundancies
a special inter-view prediction structure must be employed to
enable view-scalable and view-selective adaptive streaming
Multi-view Video Coding (MVC)
Multi-view extension of H.264/AVC
Enables inter-view prediction
Prediction structure is simplified by restricting inter-view prediction to anchor pictures only
Large disparity or different camera calibration affects
coding efficiency
Reference MVC software (JMVC)
temporal and view scalability
Multi-view Plus Depth Coding
Independently code views and depth maps
Dependent encoding is also possible
Exploit correlation between texture and depth map
Examples:
sharing the texture video MVs with the depth map
utilizing inter-layer motion prediction tool in SVC
Transport Protocols
Transmission Control Protocol (TCP)
may not be suitable for streaming live video with a strict
end-to-end delay constraint
lack of control on delay (retransmissions)
rapidly changing transmission rate (congestion control)
provides good performance when available network
bandwidth is about twice the maximum video rate (few
seconds pre-roll delay)
Transport Protocols
Datagram congestion control protocol (DCCP)
implements bidirectional unicast connections
both data and acknowledgements can flow in both directions
congestion-controlled, unreliable datagrams
congestion control mechanism selected at connection
startup
outperforms TCP under congestion when a video
streaming scenario is considered
P2P Streaming
Traditional client-server unicast streaming model is not
scalable by nature.
Advantage of P2P solutions
scalable media distribution (reduce the bandwidth requirement
of the server by utilizing the network capacity of the
clients/peers)
P2P solutions use overlay networks (data are redirected to
another peer by the application)
Tree-Based Approach
Efficient for delivering content from the server that is at
the top of the tree to peers that are connected to each
other in parent–child fashion.
Shortcomings:
ungraceful peer exit leads its descendants to starvation
replicating the content for feeding multiple trees leads to
redundancy within the network
Mesh-Based Approach
Data are distributed over an unstructured network in
which each peer can connect to multiple peers.
Increased connectivity alleviates the problem of
ungraceful peer exit.
building multiple connections dynamically requires a
certain amount of time (initiation interval)
More suitable for applications that may tolerate some
initiation interval.
Example: BitTorrent
Adaptive Streaming
A mechanism should exist to estimate the network
conditions so as to adapt the video rate accordingly, in
order to optimize the received video quality.
Estimation can be performed by
requesting receiver buffer occupancy status (to prevent
buffer underflow/overflow)
combining receiver buffer status with bandwidth
estimation
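One possible way to combine the two signals is sketched below. The function name, the linear scaling rule, and the 2 s and 10 s thresholds are assumptions made for illustration; the slides do not specify a particular formula.

```python
def target_video_rate(bandwidth_kbps, buffer_s, low_s=2.0, high_s=10.0):
    """Pick a target video rate from a bandwidth estimate and the
    receiver buffer level (seconds of video queued for playback).

    At or below low_s the rate is halved so the buffer can refill
    (underflow risk); at or above high_s the full estimate is used;
    in between the rate scales linearly. All thresholds are illustrative.
    """
    if buffer_s <= low_s:
        factor = 0.5
    elif buffer_s >= high_s:
        factor = 1.0
    else:
        factor = 0.5 + 0.5 * (buffer_s - low_s) / (high_s - low_s)
    return bandwidth_kbps * factor
```

For example, with a 4000 kbps estimate the sketch returns 2000 kbps when only 1 s is buffered and the full 4000 kbps when 12 s are buffered.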
Adaptive Streaming
DCCP + TCP-friendly rate control (TFRC)
TFRC rate calculated by DCCP can be utilized by the sender
to estimate the available network rate
When the video is streamed over TCP, an average of the
transmission rate can be used to determine the
available network bandwidth
Basic method in DASH
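For reference, the TFRC rate mentioned above follows the TCP throughput equation given in the TFRC specification (RFC 5348). The sketch below evaluates that equation; the function and parameter names are my own, and the defaults (b = 1, t_RTO = 4 x RTT) are the simplifications suggested in the RFC rather than values taken from the slides.

```python
import math


def tfrc_rate(packet_size, rtt, loss_rate, b=1, t_rto=None):
    """TFRC allowed sending rate in bytes per second.

    packet_size: segment size in bytes
    rtt:         round-trip time in seconds
    loss_rate:   loss event rate p, with 0 < p <= 1
    b:           packets acknowledged by a single ACK
    t_rto:       retransmission timeout in seconds (defaults to 4 * rtt)
    """
    if not 0 < loss_rate <= 1:
        raise ValueError("loss_rate must be in (0, 1]")
    if t_rto is None:
        t_rto = 4 * rtt
    p = loss_rate
    denominator = (rtt * math.sqrt(2 * b * p / 3)
                   + t_rto * 3 * math.sqrt(3 * b * p / 8) * p * (1 + 32 * p ** 2))
    return packet_size / denominator
```

A sender could compare this rate with the bit rates of the available video streams to decide which one the network can sustain.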
Video Rate Adaptation Methods
Adapting video rate to available bandwidth depends on
the encoding characteristics of the views.
One or more views can be encoded multiple times with
varying bit rates, sender can switch between these
streams according to the network conditions
Similar to HTTP live streaming
Encoding views once with multiple layers using SVC and
switching between these layers
Real-time encoding with source rate control
Difficult with MVV
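The first method above (several encodings of a view at different bit rates) can be sketched as a smoothed throughput estimate followed by a stream choice. The names `smooth_throughput` and `pick_stream`, the smoothing weight, and the 0.9 safety margin are illustrative assumptions.

```python
def smooth_throughput(samples_kbps, alpha=0.2):
    """Exponentially weighted moving average of throughput samples."""
    estimate = samples_kbps[0]
    for sample in samples_kbps[1:]:
        estimate = (1 - alpha) * estimate + alpha * sample
    return estimate


def pick_stream(bitrates_kbps, available_kbps, safety=0.9):
    """Return the highest encoded bit rate that fits within a safety
    margin of the available rate, or the lowest one if none fits."""
    usable = [r for r in bitrates_kbps if r <= safety * available_kbps]
    return max(usable) if usable else min(bitrates_kbps)


# Hypothetical encodings of one view and a short throughput history.
encodings = [500, 1000, 2000, 4000]
estimate = smooth_throughput([2500, 2500, 2500])
choice = pick_stream(encodings, estimate)
```

With a steady 2500 kbps estimate the sketch selects the 2000 kbps encoding, since 4000 kbps exceeds the margin.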
Adaptive Stereoscopic
Video Streaming
The behavior of the human visual system is another
paradigm for QoE-aware rate adaptation.
Exploit the suppression theory
human visual system (HVS) tolerates lack of high-frequency components in one of the views
One of the views may be presented at a lower quality
without degrading the 3-D video perception.
Asymmetric quality allocation
Just Noticeable Distortion for
Asymmetric Stereo Coding
Asymmetry can be achieved by scaling the quality in
one of the views (secondary view)
in spatial, signal-to-noise ratio (SNR) or temporal
dimensions
Questions
Which method should be used?
What is the level of asymmetry before observers start
noticing visible degradations?
Just Noticeable Distortion for
Asymmetric Stereo Coding
Threshold PSNR (dB)
Video Sequence    Parallax Barrier    Polarized Projector
Adile             31.9                33.07
Iceberg           31.64               33.05
Flower Pot        31.19               33.2
Train Tunnel      31.74               32.88
Results show that the “just noticeable” threshold PSNR is
33 dB for the polarized projection display
31.5 dB for the parallax barrier display
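The summary figures can be checked against the per-sequence values with a short calculation. The averages come out at roughly 31.6 dB for the parallax barrier display and 33.05 dB for the polarized projector, consistent with the rounded thresholds quoted above. The dictionary keys and the helper `secondary_view_ok` are hypothetical names introduced for this sketch.

```python
# Per-sequence threshold PSNR values (dB) from the table:
# Adile, Iceberg, Flower Pot, Train Tunnel.
thresholds_db = {
    "parallax_barrier": [31.9, 31.64, 31.19, 31.74],
    "polarized_projector": [33.07, 33.05, 33.2, 32.88],
}

# Rounded thresholds as stated in the results summary.
JND_DB = {"parallax_barrier": 31.5, "polarized_projector": 33.0}


def mean(values):
    return sum(values) / len(values)


def secondary_view_ok(psnr_db, display):
    """True if the lower-quality view stays at or above the
    just-noticeable threshold for the given display type."""
    return psnr_db >= JND_DB[display]


averages = {display: mean(values) for display, values in thresholds_db.items()}
```

A rate adaptation module could use a check like `secondary_view_ok` to decide how far the secondary view can be degraded before switching to another scaling method.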
Asymmetric Encoding
for Adaptive Streaming
Asymmetric Coding at a Fixed Rate Using MVC
Spatial asymmetry
using additional down-sampling steps in the encoding
process
Temporal asymmetry
skipping frames from the secondary view
SNR (quality) asymmetry
straightforward compared to other types of asymmetry
(encoding quality of a view depends on the quantization
parameter used)
Asymmetric MVC Coding
Alternating views are coded at high and low quality.
Inter-view dependencies should be carefully
constructed (predict only from high-quality views).
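The constraint on inter-view dependencies can be illustrated with a small sketch. The layout below is an assumption made for illustration, not the prediction structure used in the original work: low-quality views are predicted from their adjacent high-quality neighbours, and each high-quality view is predicted from the previous high-quality view.

```python
def assign_qualities(num_views):
    """Alternate high and low quality, starting with a high-quality view."""
    return ["high" if i % 2 == 0 else "low" for i in range(num_views)]


def inter_view_references(qualities):
    """Map each view index to the views it may be predicted from.

    Low-quality views look at both adjacent views; high-quality views
    look two positions back. Only high-quality views are kept as
    references, so no view is ever predicted from a low-quality one.
    """
    refs = {}
    for i, quality in enumerate(qualities):
        candidates = [i - 2] if quality == "high" else [i - 1, i + 1]
        refs[i] = [j for j in candidates
                   if 0 <= j < len(qualities) and qualities[j] == "high"]
    return refs


qualities = assign_qualities(5)
refs = inter_view_references(qualities)
```

For five views, view 0 is coded independently, views 2 and 4 depend on the previous high-quality view, and views 1 and 3 depend only on their high-quality neighbours.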
Asymmetric Encoding
for Adaptive Streaming
Scalable Asymmetric Coding Using SVC
It is possible to obtain spatial and/or quality scalable right
and left views if they are simulcast coded using the SVC
standard.
Two encoding options for achieving scalable asymmetric
stereoscopic video bitstreams when simulcast coding is
used:
encoding both views using SVC
encoding one view with SVC and the other with H.264/AVC
Scalable Video Coding (SVC)
Asymmetric Encoding
for Stereoscopic 3D Video
Can be done in two ways:
encode both views using SVC
base layer of each view is encoded with a quality ~32 dB
enhancement layers are encoded at the maximum quality
according to channel capacity
only one view (the first) is scalably encoded
second view is encoded using non-scalable H.264/AVC
When the available link capacity is high, the scalable coded
view (with the enhancement layer) becomes the high-quality
view.
Adaptive Multi-view
Video Streaming
Straightforward approach:
extend the concept of asymmetric coding to MVV streaming (for
relatively small number of views)
A more efficient (in terms of bandwidth consumption) and
flexible (in terms of number of views) approach:
streaming the MVD representation (includes view scalability)
View-selective encoding and interactive streaming of multiview video
requires computer vision methods for real-time head/gaze
tracking, can be used to limit the number of views transmitted
View Scaling
Discarding one view entirely and falling back to 2D video is
not a good choice.
switching from 3D to 2D results in significant viewing discomfort
With multi-view video (MVV) format, view scaling is a
possible option
missing view(s) may be outside of the user’s field of view or can
be replaced by an artificial view generated at the client side
Challenge
How to determine which view should be discarded for minimum
degradation in perceived quality?
QoE-based Adaptation Policy
Subjective tests to evaluate the performance of scaling
methods in terms of delivered QoE under different
network conditions.
5-view 3D display at 1920x1200 screen resolution
12 male and 4 female assessors (7 experts)
Description                   Method #   Detail
Symmetric Quality Scaling     1          SNR
Symmetric Quality Scaling     2          Spatial
Asymmetric Quality Scaling    3          SNR
Asymmetric Quality Scaling    4          Spatial
View Scaling                  5          3c+3d
View Scaling                  6          2c+2d
QoE-based Adaptation Policy
Recommended adaptation policy:
State   Method
1       All views transmitted at max quality
2       Asymmetric SNR scaling of intermediate views
3       Keep only edge views (+ depth) and use DIBR
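The recommended policy can be read as a simple fallback chain: use the lowest-numbered state whose bit rate fits the link. The sketch below assumes the bit rate required by each state is known for the content; the `required_kbps` figures in the example are placeholders, not measurements from the study.

```python
POLICY = [
    (1, "All views transmitted at max quality"),
    (2, "Asymmetric SNR scaling of intermediate views"),
    (3, "Keep only edge views (+ depth) and use DIBR"),
]


def choose_state(link_kbps, required_kbps):
    """Return the first state whose required bit rate fits the link.

    required_kbps maps state number -> bit rate needed in that state.
    If nothing fits, the last (most aggressive) state is returned.
    """
    for state, _description in POLICY:
        if required_kbps[state] <= link_kbps:
            return state
    return POLICY[-1][0]


# Placeholder bit rates per state, for illustration only.
example_rates = {1: 4500, 2: 3300, 3: 2100}
state = choose_state(4000, example_rates)
```

With these placeholder rates, a 4000 kbps link cannot carry all views at maximum quality, so the sketch falls back to state 2.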
Adaptation-ready Encoding
Introduce quality difference between adjacent views.
Views that are either transmitted or not transmitted at all are
encoded with H.264/AVC for high coding efficiency.
Views that may have different qualities to achieve
asymmetry are encoded using SVC.
Example:
For a five-view display, can perform this efficiently using SVC for
views 2 and 4.
MVV Adaptation Example
a) High link capacity (4.5 Mbps)
b) Low link capacity (3.3 Mbps)
c) Very low link capacity (2.1 Mbps)
Case Study: Project DIOMEDES
European project
Peer-assisted multi-view video broadcast
Scalable architecture that utilizes the upload capacity of
peers to assist distribution of up to 200 views and
associated 3-D audio
Main Idea:
DVB-T signal provides stereoscopic 3-D media as a
baseline
P2P distribution of remaining MVV views over IP to enable
immersive free-view TV experience
DIOMEDES Architecture
Three modules:
3-D content server
master peers
3-D media streaming server
Peers that use both DVB
and IP channels synchronize
the received signals.
DIOMEDES Client
BitTorrent Protocol
Adopts a mesh-based topology
Flat connections with no hierarchy
Adopts a divide-and-conquer approach and splits
content into equally sized chunks
Peers are of two types:
Seeders: have the whole content and upload chunks to
other peers
Leechers: have some missing chunks
BitTorrent Protocol
Chunk exchange is managed by two governing policies:
Rarest-first chunk scheduling
Determines the chunks to be requested
Favors chunks that are least distributed
Tit-for-Tat
Determines which chunk requests are to be accepted
Sort neighbours based on their level of contribution
May deny requests from neighbours at lower ranks
Optimistic unchoking
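Both policies can be sketched compactly. This is a simplified illustration, not the actual BitTorrent implementation: the function names are hypothetical, ties are broken by identifier to keep the output deterministic, and the optimistically unchoked peer is passed in explicitly, whereas a real client would pick it at random and rotate it periodically.

```python
from collections import Counter


def rarest_first(my_chunks, neighbour_chunks):
    """Order the chunks we still need by how few neighbours hold them."""
    counts = Counter(c for chunks in neighbour_chunks.values() for c in chunks)
    wanted = [c for c in counts if c not in my_chunks]
    return sorted(wanted, key=lambda c: (counts[c], c))


def unchoke(uploaded_to_us, slots, optimistic=None):
    """Tit-for-tat: serve the peers that contributed most to us, plus
    one optimistically unchoked peer so newcomers get a chance."""
    ranked = sorted(uploaded_to_us, key=lambda p: (-uploaded_to_us[p], p))
    chosen = ranked[:slots]
    if optimistic is not None and optimistic not in chosen:
        chosen.append(optimistic)
    return chosen


neighbours = {"A": {0, 1, 2}, "B": {1, 3}, "C": {1, 2}}
request_order = rarest_first({0}, neighbours)
served = unchoke({"A": 50, "B": 10, "C": 30}, slots=2, optimistic="B")
```

In the example, chunk 3 is held by a single neighbour and is requested first, while peer B is served only because of the optimistic slot.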
Modifications to BitTorrent for
3D Video Streaming
Chunk Mapping
Variable-size layered chunks
All chunks are self-decodable
Each chunk contains multiple GoPs
Adaptive video streaming
in 3D video streaming, rate adaptation is not
straightforward and may depend on external information
such as the user’s field of view, the encoding scheme, and
the display properties
Modifications to BitTorrent for
3D Video Streaming
P2P engine determines when to perform adaptation (discard/add a
stream).
Adaptation module determines which streams should be affected first.
Modifications to BitTorrent for
3D Video Streaming
Chunk Downloading
ready-to-play buffer
buffer duration is a variable that provides feedback on the
overall content retrieval rate
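One way the buffer duration could drive the decision to discard or add a stream is sketched below. The function name, the 4 s and 12 s thresholds, and the convention that streams are listed most important first are assumptions for illustration.

```python
def adapt_streams(active, spare, buffer_s, low_s=4.0, high_s=12.0):
    """Return updated (active, spare) stream lists.

    active: streams currently downloaded, most important first
    spare:  streams currently not downloaded, most important first
    A short buffer drops the least important active stream; a long
    buffer adds the most important spare stream; otherwise no change.
    """
    active, spare = list(active), list(spare)
    if buffer_s < low_s and len(active) > 1:
        spare.insert(0, active.pop())   # discard the least important stream
    elif buffer_s > high_s and spare:
        active.append(spare.pop(0))     # add the most important spare stream
    return active, spare
```

The gap between the two thresholds acts as hysteresis, so small fluctuations in buffer duration do not cause streams to be added and dropped repeatedly.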
Modifications to BitTorrent for
3D Video Streaming
Chunk uploading
Request prioritization
Favor requests that belong to streams of high priority
Depth map streams should have the highest priority
because they are used to generate multiple views at the
client side.
Base and enhancement layers may be prioritized similar to
the case of 2D video streaming
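Request prioritization of this kind amounts to a stable sort on stream type. In the sketch below, the request tuple layout and the `PRIORITY` table are hypothetical; placing base layers ahead of enhancement layers follows the usual 2D practice referred to above.

```python
# Lower number = served first. Depth maps come first because they are
# needed to synthesise views at the client side.
PRIORITY = {"depth": 0, "base": 1, "enhancement": 2}


def order_requests(requests):
    """Sort (peer, stream_type, chunk_id) requests by stream priority.

    sorted() is stable, so requests of equal priority keep their
    arrival order.
    """
    return sorted(requests, key=lambda request: PRIORITY[request[1]])


pending = [
    ("p1", "enhancement", 7),
    ("p2", "depth", 7),
    ("p3", "base", 7),
    ("p4", "depth", 8),
]
ordered = order_requests(pending)
```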
Other Use Cases
Full-resolution stereoscopic 3D video delivery
The full-resolution source video is encoded as an
enhancement layer on top of the frame-compatible base stream
that is transmitted over the DTV channel.
The enhancement layer is transmitted to enable full
resolution 3D video for users with Internet access
Other Use Cases
Head tracking system for multi-view video delivery
head tracking system coupled with a stereoscopic display
View pairs change according to a user’s viewing position
if the available link capacity is low, only required video
streams are received, based on the feedback from the
head tracking device
increase efficiency of rapid view selection by using a
sparse camera arrangement and transmitting
corresponding depth maps
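The view selection step can be sketched as a mapping from the tracked head position to a pair of adjacent views. The normalised position in [0, 1], the function name, and the uniform spacing of views are assumptions for illustration; a real system would depend on the camera arrangement and the tracker output.

```python
def select_view_pair(head_x, num_views):
    """Map a normalised horizontal head position to an adjacent view pair.

    head_x:    0.0 at the leftmost viewing position, 1.0 at the rightmost
    num_views: number of available camera views (at least 2)
    Returns the (left, right) view indices to request.
    """
    if num_views < 2:
        raise ValueError("at least two views are needed for stereo")
    head_x = min(max(head_x, 0.0), 1.0)   # clamp tracker noise
    left = min(int(head_x * (num_views - 1)), num_views - 2)
    return left, left + 1
```

When link capacity is low, only the two streams returned for the current position would need to be requested.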
Conclusions
Digital TV platforms are not flexible enough to support multi-view
video (they cannot provide sufficient bandwidth).
Three adaptive streaming solutions:
Asymmetric streaming
Streaming using MVD
Selective streaming
Combining adaptation methods with adaptive P2P video
streaming will provide a successful 3D video services solution
in the near future.
Streaming holographic 3D video over IP might be possible in
the long term.
References
Flexible Transport of 3-D Video Over Networks, Proceedings of the
IEEE, 2011
Peer-to-peer system design for adaptive 3D video streaming, IEEE
Communications Magazine, 2013
DIOMEDES: Content Aware and Adaptive Delivery of 3D Media
over P2P/IP and DVB-T2, Networked & Electronic Media (NEM)
Summit, 2011
Evaluation of Asymmetric Stereo Video Coding and Rate Scaling for
Adaptive 3D Video Streaming, IEEE Transactions on Broadcasting,
2011
Thank You!