Multimedia Communications


Video Compression
Video Compression Standards

- JPEG: ISO and ITU-T
  - for compression of still images
- Moving JPEG (MJPEG)
- H.261: ITU-T SG XV
  - for audiovisual services at p x 64 Kbps
- JBIG: ISO
  - for compression of bilevel images
- H.263: ITU-T SG XV
  - for videophone at a bit rate below 64 Kbps
- MPEG-1, 2, 4, 7: ISO/IEC JTC1/SC29/WG11
  - for compression of combined video and audio
- Non-standardized techniques
  - DVI: de facto standard from Intel for storage compression and real-time decompression
  - QuickTime: Macintosh
Frame/Picture Types

- I frame: Intra-coded frame
  - coded using JPEG, except that the quantization threshold values are the same for all DCT components
  - provides points for random access
  - used as a reference for coding other frames
- P frame: Predictively coded frame
  - coded based on the reference frame (the previous I or P frame)
- B frame: Bidirectionally predictively coded frame
  - coded based on the previous and following I and/or P frames
- D frame: DC-coded frame
  - intra-coded frame, neglecting the AC coefficients
  - used for the fast-forward and rewind modes
Group of Pictures (GOP) Structure
Display and Transmission Order
- Transmission order and display order may differ
  - Reference frames must be transmitted first

Display order:      1 2 3 4 5 6 7 8 9  →  I B B B P B B B I
Transmission order: 1 5 2 3 4 9 6 7 8  →  I P B B B I B B B

(In the original figure, the P frame is forward-predicted from the preceding I frame, and each B frame is bidirectionally predicted from the surrounding I/P frames.)
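As an illustration of this reordering (not from the original slides), the Python sketch below derives the transmission order from a display-order frame sequence by emitting each B frame only after the reference frame that follows it; the function name and list representation are assumptions made for this example.

```python
# Minimal sketch: derive transmission order from display order by sending each
# B frame only after both of its reference frames are available at the decoder.
def transmission_order(display_frames):
    """display_frames: list of 'I', 'P', 'B' types in display order (1-based positions)."""
    out = []        # (display_position, frame_type) in transmission order
    pending_b = []  # B frames waiting for their following reference frame
    for pos, ftype in enumerate(display_frames, start=1):
        if ftype == 'B':
            pending_b.append((pos, ftype))   # hold until the next I/P arrives
        else:
            out.append((pos, ftype))         # reference frame goes out first
            out.extend(pending_b)            # then the B frames that depend on it
            pending_b = []
    out.extend(pending_b)                    # any trailing B frames
    return out

# The slide's example: display order I B B B P B B B I (positions 1..9)
print(transmission_order(['I', 'B', 'B', 'B', 'P', 'B', 'B', 'B', 'I']))
# -> positions 1 5 2 3 4 9 6 7 8, i.e. I P B B B I B B B, matching the slide
```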
Motion Estimation and Compensation


- Macroblock: the motion compensation unit
- Motion Estimation (a block-matching sketch follows this list)
  - extracts the motion information from a video sequence
- Motion information
  - one motion vector for a forward-predicted macroblock
  - two motion vectors for a bidirectionally predicted macroblock
- Motion Compensation
  - reconstructs an image using blocks from the previous image along with the motion information, i.e., the motion vectors
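To make the block-matching idea concrete, here is a minimal full-search sketch in Python/NumPy using the sum of absolute differences (SAD) as the matching criterion. The function name, block size, and search range are illustrative assumptions; practical encoders use faster search strategies.

```python
import numpy as np

def estimate_motion(ref, cur, mb_row, mb_col, mb_size=16, search_range=7):
    """Full-search block matching for one macroblock.
    ref, cur: 2-D luminance arrays; (mb_row, mb_col): top-left corner of the
    macroblock in the current frame. Returns (dy, dx, sad) of the best match."""
    block = cur[mb_row:mb_row + mb_size, mb_col:mb_col + mb_size].astype(np.int32)
    h, w = ref.shape
    best = (0, 0, np.inf)
    for dy in range(-search_range, search_range + 1):
        for dx in range(-search_range, search_range + 1):
            r, c = mb_row + dy, mb_col + dx
            if r < 0 or c < 0 or r + mb_size > h or c + mb_size > w:
                continue                           # candidate lies outside the frame
            cand = ref[r:r + mb_size, c:c + mb_size].astype(np.int32)
            sad = int(np.abs(block - cand).sum())  # matching criterion
            if sad < best[2]:
                best = (dy, dx, sad)               # keep the best motion vector so far
    return best
```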
Implementation Issues

- For P-frames, the encoding of each macroblock depends on the output of the motion estimation unit (see the sketch after this list):
  - if the two contents are the same, only the address of the MB in the reference frame is encoded
  - if they are very close, both the motion vector and the difference matrices are encoded
  - if no close match is found, the MB is encoded in the same way as in an I-frame
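A hedged sketch of that three-way decision is shown below; the thresholds and function names are invented for illustration and are not values defined by any standard.

```python
import numpy as np

def choose_mb_mode(cur_block, best_match_block, motion_vector,
                   identical_thr=0, close_thr=2048):
    """cur_block, best_match_block: 16x16 luminance arrays; motion_vector: (dy, dx)
    from motion estimation. Returns a (mode, payload) description of what is coded."""
    diff = cur_block.astype(np.int32) - best_match_block.astype(np.int32)
    sad = int(np.abs(diff).sum())
    if sad <= identical_thr:
        # contents identical: only the reference-MB address (vector) is encoded
        return ("address_only", motion_vector)
    if sad <= close_thr:
        # close match: encode the motion vector plus the difference (residual) block
        return ("motion_compensated", (motion_vector, diff))
    # no close match: fall back to intra coding, exactly as in an I-frame
    return ("intra", cur_block)
```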
Implementation Schematics
Bitstream format
Performance

- I-frame
  - similar to JPEG
  - 10:1 – 20:1
- P-frame
  - 20:1 – 30:1
- B-frame
  - 30:1 – 50:1
Video Compression
H.261
H.261 Overview

- ITU-T standard for the compression/decompression of digital video (1990)
  - to facilitate video conferencing and videophone over ISDN at the rate of p x 64 kbps; p = 1, 2, ..., 30
  - real-time encoding and decoding (≤ 150 ms)
  - low-cost VLSI implementation
Picture preparation

- An image consists of 3 rectangular matrices (components)
  - luminance Y
  - chrominance Cb (blue), Cr (red)
  - 4:1:1 format
- Image formats
  - CIF (Common Intermediate Format): 352 x 288
    - used for video conferencing
    - 30 fps, progressive scanning
  - QCIF (Quarter CIF): 176 x 144
    - used for video telephony
    - 15 / 7.5 fps, progressive scanning
  - QCIF is mandatory; CIF is optional
- Bandwidth requirement of CIF at 15 fps (see the worked figures after this list)
  - Y = 352 x 288 x 8 bits/pixel x 15 frames/sec
  - Cb + Cr = 2 x ¼ x Y
  - ≈ 18.3 Mbps → more than 50:1 compression is needed to transmit at 384 Kbps (p = 6)
- I- and P-frames are used in H.261
  - 3 P-frames between each pair of I-frames
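The raw-bandwidth figure on this slide can be reproduced with a few lines of arithmetic; Python is used here purely as a calculator.

```python
# Raw CIF bandwidth at 15 fps, with chrominance at one quarter of the luminance resolution
y_bps = 352 * 288 * 8 * 15        # luminance: ~12.2 Mbps
chroma_bps = 2 * 0.25 * y_bps     # Cb + Cr together: half of Y
total_bps = y_bps + chroma_bps

print(total_bps / 1e6)            # ~18.2 Mbps of raw video (the slide rounds to 18.3)
print(total_bps / 384_000)        # ~47.5, i.e. roughly the 50:1 compression quoted for p = 6
```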
H.261 Encoding Format
- Frame format
- GOB structure
- Macroblock format
H.261 Video Encoder
Entropy Encoding

- Run-length encoding
  - (run, amplitude) pairs
- Huffman encoding
  - the Huffman tables are predefined by the H.261 standard
    - a table for motion vectors
    - a table for quantized DCT coefficients
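The (run, amplitude) step can be sketched as follows. The input is assumed to be a block of quantized DCT coefficients that has already been zig-zag scanned, and the end-of-block marker is shown only schematically; the resulting pairs are what the predefined Huffman tables would then code.

```python
def run_length_encode(zigzag_coeffs):
    """zigzag_coeffs: quantized DCT coefficients in zig-zag order (DC term first).
    Returns the DC value and a list of (run, amplitude) pairs for the AC terms."""
    dc, acs = zigzag_coeffs[0], zigzag_coeffs[1:]
    pairs, run = [], 0
    for amp in acs:
        if amp == 0:
            run += 1                  # count zeros preceding the next non-zero value
        else:
            pairs.append((run, amp))  # each pair is then Huffman-coded from a fixed table
            run = 0
    pairs.append("EOB")               # end-of-block marker covers the trailing zeros
    return dc, pairs

print(run_length_encode([12, 5, 0, 0, -3, 0, 0, 0, 1, 0, 0, 0, 0]))
# -> (12, [(0, 5), (2, -3), (3, 1), 'EOB'])
```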
Video Compression
H.263
H.263

- Low bit rate standard for teleconferencing applications
  - optimizes H.261 so as to operate below 64 Kbps, e.g. over a V.34 modem
  - about 2.5 times more compression than H.261
- An extension of H.261
  - 2 image formats → 5 image formats
  - motion-compensated prediction has been refined
  - supports B frames (which have only P frames as references)
- Used in the IETF RTSP (Real Time Streaming Protocol)
- Used in RealPlayer G2
Picture Preparation

- Digitization formats
  - QCIF (Quarter CIF): 176 x 144
    - used for video telephony
    - 15 / 7.5 fps, progressive scanning
  - Sub-QCIF (S-QCIF): 128 x 96
    - 15 / 7.5 fps, progressive scanning
- Frame types
  - I, P, B frames
Picture Processing

- Unrestricted motion vectors
  - for those pixels of a potential close-match MB that fall outside of the frame boundary, the edge pixels themselves are used instead
  - if the resulting MB produces a close match, the motion vector is then allowed, if necessary, to point outside of the frame area
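This edge-pixel substitution amounts to clamping the reference coordinates to the frame boundary. The NumPy sketch below illustrates the idea; the function name and block size are assumptions for this example.

```python
import numpy as np

def fetch_block_unrestricted(ref, top, left, size=16):
    """Return a size x size block from ref whose top-left corner (top, left) may lie
    partly outside the frame; out-of-frame samples reuse the nearest edge pixels."""
    h, w = ref.shape
    rows = np.clip(np.arange(top, top + size), 0, h - 1)    # clamp row indices
    cols = np.clip(np.arange(left, left + size), 0, w - 1)  # clamp column indices
    return ref[np.ix_(rows, cols)]                          # edge pixels are repeated
```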
Error resilience


- The target networks for H.263 are wireless networks and the PSTN → relatively high error rates
- Error propagation
  - due to the resulting errors in the motion estimation vectors and motion compensation information, errors within a GOB may propagate to other regions of the frame
- To minimize error propagation:
  - error tracking
  - independent segment decoding
  - reference picture selection
Error tracking

- Error detection methods
  - out-of-range motion vectors
  - invalid variable-length codewords
  - out-of-range DCT coefficients
  - excessive number of coefficients within a MB
Independent Segment Decoding


- Each GOB is treated as a separate sub-video which is independent of the other GOBs in the frame
- Motion estimation and compensation is limited to the boundary pixels of a GOB rather than of the frame
- This limits the effect of a GOB being corrupted
- Used together with error tracking
Reference Picture Selection
- NAK mode
- ACK mode
MPEG
Video Compression
MPEG

- MPEG (Moving Picture Experts Group)
  - ISO/IEC JTC1/SC29/WG11
  - standard for synchronized video and audio
  - consists of System, Video, Audio, ... parts
    - System: for multiplexing and synchronization
- MPEG-1
  - ISO Recommendation 11172
  - intended for the storage of VHS-quality audio-visual information on CD-ROM at bit rates up to 1.5 Mbps
  - video resolution: SIF (up to 352 x 288 pixels)
  - compressed bandwidth ≤ 1.5 Mbps
    - about 1.1 Mbps for video, 128 Kbps for audio, the remainder for system data
  - allows random access, fast forward and rewind
- MPEG-2
  - intended for the recording and transmission of studio-quality audio and video
- MPEG-4
  - initially concerned with a similar range of applications to those of H.263, at very low bit rates (4.8 – 64 kbps)
  - later extended to interactive multimedia applications over the Internet and the various types of entertainment networks
- MPEG-7
  - describes the structure and features of the content of the (compressed) MM information
  - used by search engines
MPEG-1
MPEG-1 frames



- Spatial resolution: 352 x 288 pixels (SIF)
- Progressive scanning with a refresh rate of 30 Hz (for NTSC) and 25 Hz (for PAL)
- The standard allows the use of
  - I-frames only
  - I- and P-frames only
  - I-, P-, and B-frames
  - no D-frames are supported
- I-frames are used for random-access functions
- Example sequences
  - IBBPBBPBBI… for PAL
  - IBBPBBPBBPBBI… for NTSC
Use of B Frame
Overview

- The compression algorithm is based on H.261
- MB (macroblock)
  - Y plane: 16 x 16
  - Cb, Cr planes: 8 x 8
- Differences from H.261
  - time-stamps (temporal references) enable the decoder to resynchronize more quickly in the event of one or more corrupted or missing MBs
  - introduction of B-frames:
    - the search window in the reference frame is increased
    - to improve the accuracy of the motion vectors, a finer resolution is used
- Typical compression ratios (see the worked example after this list)
  - I-frame: 10:1
  - P-frame: 20:1
  - B-frame: 50:1
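Using these typical per-frame ratios together with the PAL example sequence IBBPBBPBB from the earlier slide, a rough average compression ratio for a whole GOP can be estimated. This is back-of-the-envelope arithmetic, not a measured figure.

```python
# Each coded frame costs 1/ratio of a raw frame, so the GOP-average ratio is
# (number of frames) / (sum of per-frame fractions).
ratios = {'I': 10, 'P': 20, 'B': 50}
gop = "IBBPBBPBB"                 # 1 I, 2 P, 6 B frames

avg = len(gop) / sum(1 / ratios[f] for f in gop)
print(round(avg, 1))              # ~28.1:1 averaged over the GOP
```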
MPEG System

- MPEG Standard
  - Video coding
  - Audio coding
  - System coding
- Timing and synchronization (see the illustration after this list)
  - Presentation Time Stamps (PTS)
  - Decoding Time Stamps (DTS)
  - System Clock Reference (SCR)
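As a simplified illustration of why separate decoding and presentation time stamps are needed (the actual MPEG Systems clock model is more detailed than this), consider a reordered I/P/B sequence; the time unit and the assignment rule below are assumptions made for the example.

```python
frame_period = 1   # one display interval per time unit

# transmission/decoding order as (display_position, type) for display order I B B P
decode_order = [(1, 'I'), (4, 'P'), (2, 'B'), (3, 'B')]

for n, (disp_pos, ftype) in enumerate(decode_order):
    dts = n * frame_period          # frames are decoded in the order they arrive
    pts = disp_pos * frame_period   # but presented in display order
    print(f"{ftype}{disp_pos}: DTS={dts}  PTS={pts}")
# I1: DTS=0 PTS=1, P4: DTS=1 PTS=4, B2: DTS=2 PTS=2, B3: DTS=3 PTS=3
# Reordered reference frames carry PTS > DTS because they are decoded ahead of
# their display time; B frames are presented as soon as they are decoded.
```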
MPEG-1 Video Bitstream Structure
Composition
Format

- GOP layer: video coding unit
  - the first picture must be an I-frame, to allow editing
- Picture layer: primary coding unit
- Slice layer: resynchronization unit
- Macroblock layer: motion compensation unit
- Block layer: DCT unit
MPEG Frame Structure
MPEG-1
MPEG-2
Constrained Parameter Set

- horizontal size <= 720 pels
- vertical size <= 576 pels
- total number of macroblocks/picture <= 396
- total number of macroblocks/second <= 396 * 25 = 330 * 30
- picture rate <= 30 fps
- bit rate <= 1.86 Mbps
- decoder buffer <= 376,832 bits
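A small validation sketch of these bounds is given below; the function name and argument list are assumptions made for the example.

```python
def is_constrained(width, height, fps, bit_rate_bps, decoder_buffer_bits):
    """Check picture parameters against the constrained-parameter bounds above."""
    mbs_per_picture = (width // 16) * (height // 16)
    return (width <= 720 and height <= 576
            and mbs_per_picture <= 396
            and mbs_per_picture * fps <= 396 * 25   # = 330 * 30 = 9900 MB/s
            and fps <= 30
            and bit_rate_bps <= 1.86e6
            and decoder_buffer_bits <= 376_832)

print(is_constrained(352, 288, 25, 1.15e6, 327_680))   # SIF at 25 fps -> True
```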
MPEG Encoding Scheme
MPEG Decoding Scheme
MPEG-2
MPEG-2 Video






- jointly developed by ISO/IEC (IS 13818-2) and ITU-T (H.262)
- permits data rates of up to 100 Mbps
- supports interlaced video formats
- supports HDTV
- can be used for video over satellite, cable, and other broadband channels
- backward compatible with MPEG-1 and H.261
MPEG-1 and MPEG-2
Parameter                    | MPEG-1                          | MPEG-2
Standardized                 | 1992                            | 1994
Main application             | Digital video on CD-ROM         | Digital TV (and HDTV)
Spatial resolution           | SIF format (1/4 TV), 360x288    | TV (4x TV), 720x576 (1440x1152)
Temporal resolution          | 25/30 frames/s                  | 50/60 fields/s (100/120 fields/s)
Bit rate                     | 1.5 Mbps                        | 4 Mbps (20 Mbps)
Quality                      | VHS                             | NTSC/PAL for TV
Compression ratio over PCM   | 20-30                           | 30-40
MPEG-2 Profiles and Levels
Main Profile at Main Level (MP@ML)


- Target application: digital TV broadcasting
- Interlaced scanning: 2 fields
  - field mode: suitable for live sports
  - frame mode: suitable for studio-based programs
HDTV

- 3 standards
  - ATV (Advanced Television) in North America
  - DVB (Digital Video Broadcast) in Europe
  - MUSE (Multiple sub-Nyquist Sampling Encoding) in Japan and the rest of Asia
- ITU-R HDTV specification
  - 16/9 aspect ratio
  - 1920 samples/line, 1152 (1080 visible) lines/frame
  - interlaced scanning with 4:2:0 format
- ATV standard: Grand Alliance standard
  - ITU-R spec + 1280 x 720, 16/9 aspect ratio
  - video compression: MP@HL
  - audio compression: Dolby AC-3
- DVB standard
  - 4/3 aspect ratio, 1440 x 1152 (1080 visible)
  - video compression: SSP@H1440 (spatially scalable profile)
- MUSE standard
  - 16/9 aspect ratio, 1920 x 1034
  - video compression: similar to MP@HL
MPEG-4
Goal of MPEG-4 (1)


- The initial goal was to refine H.261 with a compression ratio 10 times better, but this failed
- Consequently, the focus shifted to the development of a standard for
  - flexible bitstreams that are scalable for receivers with different capabilities, such as resolutions
  - extendable configurations that allow transmitters to download new applications and algorithms into receivers
  - content-based interactivity for multimedia data access, manipulation and bitstream editing, and hybrid natural and synthetic data
  - network independence, so that it can be used with any communication network to provide universal accessibility
Goal of MPEG-4 (2)

- MPEG-4 provides standards for
  - multimedia content generation
  - a network interface for multimedia transport
  - interactivity for users
- Content-based interactivity
  - defined by the SNHC (Synthetic and Natural Hybrid Coding) group
  - coding for a synthetic human face and body
  - animation of the face and body
  - media integration of text and graphics
  - texture coding for view-dependent applications
  - static and dynamic mesh coding with texture mapping
  - an interface for text-to-speech synthesis and synthetic audio
AVO: Audio/Visual Object

- MPEG-4 treats audiovisual activities and their associated operations, including compression, decompression, multiplexing and synchronization, as objects – similar to OOP
  - viewed as the configuration, communication, and instantiation of classes of objects
- Primitive AVOs
  - a 2D fixed background
  - the picture of a walking and talking lady without the background
  - the voice associated with that person
- Compound AVO
  - e.g., an AVO that contains both the audio and visual components of a talking and walking person
- VOP (Video Object Plane)
  - a video object at any given time
  - the video encoder encodes each VOP separately
Content-based Video Coding
User Interaction

- User interaction operations with the decoded scene, following the design of the scene’s author:
  - changing the view/listening point of the scene by navigating through a scene
  - dragging objects to different positions
  - triggering a sequence of events by clicking on a specific object, including the starting and stopping of a video stream
  - selecting the desired language when multiple language tracks are available
Scalability and Accessibility

- MPEG-4 video object coding supports spatial and temporal scalability
  - this allows the receiver to decode only a part of a bitstream and reconstruct images or image sequences
  - good for video delivery over multimedia networks with limited bandwidth
  - good for display at limited resolution due to the receiver’s capability
- Universal accessibility to support various communication media
  - MPEG-4 provides error robustness and resilience for noisy environments such as mobile networks
  - supports audio and video compression algorithms in error-prone environments at low bit rates (< 64 Kbps)
Audio Compression

- Audio is compressed using one of several algorithms, depending on the available bit rate of the transmission channel and the sound quality required, e.g.
  - G.723.1 (CELP) for interactive MM applications over the Internet
  - Dolby AC-3 or MPEG Layer 2 for interactive TV applications over entertainment networks
MPEG-4 Encoder/Decoder
VOP encoder
MPEG-4 decoder
Error Resilience Techniques

- Use of fixed-length video packets (VP: 188 B) instead of GOBs
  (figure: conventional GOB approach vs. using fixed-length VPs)
- New variable-length coding (VLC) scheme based on reversible VLCs
Applications of MPEG-4








- Real-time communication systems
- Mobile computing
- Content-based storage and retrieval
- Streaming video on the Internet
- Collaborative scene visualization
- High-quality broadcasting
- Studio and TV post-production
- Interactive movies, travel guides, computer-based teaching, karaoke
MPEG-7
Multimedia Content Description Interface
Overview


- Description, identification and access of AV information
- Used to perform searches for AV information
  - e.g., search for pictures using characteristics such as color, texture or the shape of objects
- An MPEG-7 description can be attached to any kind of multimedia material, independent of the format of the representation
- Visual descriptions based on
  - color, texture, sketch, 2D and 3D shape, still images, 3D visual data, spatial composition relations, temporal composition information
- Audio descriptions based on
  - frequency contour, frequency profile, prototypical sound, source of sound, stereo, 5.1-channel or binaural sound
MPEG-7 Applications





- Medical diagnosis
- Home shopping
- Search of video and audio databases
- Architecture, interior design
- Multimedia directory services
MHEG
Overview



- Standardized by ISO/IEC JTC1/SC29 WG12
- Describes how video is displayed and audio is replayed, and the means by which a user can interact with the ongoing presentation
- Also addresses multiplatform issues
  - uses ASN.1 for representing data structures
- More functionality than HTML
  - multimedia handling capabilities such as synchronization of streams, replay speed control, and user interactivity with stream events
  - uses 3 spatial coordinates and time to synchronize the presentation
MHEG Applications



- Video on demand
- Interactive multimedia services
- Interactive TV