ECE-C490 Winter 2000 Image Processing Architecture

ECEC 453
Image Processing Architecture
Lecture 11, 2/19/2004
MPEG and Friends
Oleh Tretiak
Drexel University
Lecture 11
Image Processing Architecture, © 2001-2004 Oleh Tretiak
Page 1
Lecture Outline





Basic Video Coding
Features of MPEG-1
Features of H.261
MPEG-2
Introduction to MPEG-4
Picture of Layers
Video Compression: Picture Types

Group of Pictures: three picture types
  - I: intraframe coding only
  - P: predictive coding
  - B: bi-directional coding
Typical MPEG coding parameters

Typical sequence: IPBBPBBPBBPBBPBB (16 frames: 1 I, 5 P, 10 B)

Picture type   Average size (bits)   Compression
I              156000                6.5
P              62000                 16.4
B              15000                 67.6

\[
\text{Compression(GOP)} = \frac{\text{BitsPerFrameU} \times N_{\text{FramesPerGOP}}}{\text{BitsPerCodedGOP}}
\]
\[
\text{BitsPerCodedGOP} = N_{\text{Iframes}} \times (\text{Bits/Iframe}) + N_{\text{Pframes}} \times (\text{Bits/Pframe}) + N_{\text{Bframes}} \times (\text{Bits/Bframe})
\]
\[
\text{Bits/Iframe} = \frac{\text{BitsPerFrameU}}{C_I}, \quad
\text{Bits/Pframe} = \frac{\text{BitsPerFrameU}}{C_P}, \quad
\text{Bits/Bframe} = \frac{\text{BitsPerFrameU}}{C_B}
\]
\[
\text{Compression(GOP)} = \frac{N_{\text{FramesPerGOP}}}{N_{\text{Iframes}}/C_I + N_{\text{Pframes}}/C_P + N_{\text{Bframes}}/C_B}
= \frac{16}{1/6.5 + 5/16.4 + 10/67.6} \approx 26.4
\]
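The GOP compression arithmetic on this slide can be checked with a few lines of Python. This is only a sketch: the function name gop_compression and the dictionary layout are illustrative choices, and the per-type compression ratios are the ones quoted in the table.

```python
from collections import Counter

# Per-picture-type compression ratios quoted on the slide.
COMPRESSION = {"I": 6.5, "P": 16.4, "B": 67.6}


def gop_compression(pattern, ratios):
    """Overall compression of one GOP: N_frames / sum(N_type / C_type)."""
    counts = Counter(pattern)
    # Each coded frame costs 1/C_type of an uncompressed frame.
    coded_fraction = sum(n / ratios[ptype] for ptype, n in counts.items())
    return len(pattern) / coded_fraction


ratio = gop_compression("IPBBPBBPBBPBBPBB", COMPRESSION)
print(round(ratio, 1))  # 26.4
```

Changing the pattern string shows how the I/P/B mix moves the overall ratio, under the assumption that the per-type ratios stay fixed.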
MPEG2 features



Schemes for frame and field coding.
There are two fields in a frame: T (top) and B (bottom)
Either can be first
Frame prediction for frame pictures
  - What’s there to say?
Field prediction for field pictures
  - Target macroblock is in one field
  - Prediction pixels come from one field
  - Can be the same or different parity as the target field
Field prediction for frame pictures
Dual prime for P-pictures
16x8 macroblock for field pictures
Motion vectors coded at half-pel resolution
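To make the frame/field distinction concrete, here is a minimal sketch of splitting an interlaced frame into its two fields and weaving them back together. It assumes rows are numbered from 0 and that the top field holds the even-numbered rows; the function names are hypothetical and the frame is a plain list of rows.

```python
def split_fields(frame):
    """Return (top, bottom) fields of a frame given as a list of rows.

    Assumption: the top field holds rows 0, 2, 4, ... and the bottom
    field holds rows 1, 3, 5, ...
    """
    return frame[0::2], frame[1::2]


def merge_fields(top, bottom):
    """Interleave two fields of equal height back into one frame."""
    frame = []
    for top_row, bottom_row in zip(top, bottom):
        frame.extend([top_row, bottom_row])
    return frame


# Tiny 6-row, 4-column test frame; each pixel encodes its row and column.
frame = [[10 * r + c for c in range(4)] for r in range(6)]
top, bottom = split_fields(frame)
print(len(top), len(bottom))  # 3 3
```

A field picture codes top or bottom on its own, while a frame picture codes the interleaved rows together.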
MPEG2 - Alternate Scan
[Figure: zig-zag scan and alternate scan patterns]
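The conventional zig-zag order can be generated by walking the anti-diagonals of the block and reversing direction on every other one. The sketch below covers only the zig-zag scan. The alternate scan is a fixed table in the MPEG-2 standard and is not reproduced here. Function names are illustrative.

```python
def zigzag_order(n=8):
    """Return (row, col) pairs of an n x n block in zig-zag scan order."""
    order = []
    for s in range(2 * n - 1):  # s = row + col indexes one anti-diagonal
        rows = range(max(0, s - n + 1), min(s, n - 1) + 1)
        if s % 2 == 0:
            rows = reversed(rows)  # even diagonals run bottom-left to top-right
        order.extend((r, s - r) for r in rows)
    return order


def zigzag_scan(block):
    """Flatten a square block (list of rows) into zig-zag order."""
    return [block[r][c] for r, c in zigzag_order(len(block))]


order = zigzag_order(8)
print(order[:6])  # [(0, 0), (0, 1), (1, 0), (2, 0), (1, 1), (0, 2)]
```

The scan places low-frequency DCT coefficients first, so long runs of zeros tend to collect at the end of the list.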
MPEG-4
Multimedia Standard
Thumbnail Description
What Is Left for MPEG-4?

Initial goals
  - Coding standards for lower-than-MPEG-1 rates
  - Hidden agenda: incorporate new coding methods (wavelet, fractal)
Revised agenda: object-based coding
MPEG-4 Architecture
  - Input to the coder consists of audio, video, and stored objects
  - Decoder combines encoded objects with local objects
  - Example: send text by sending character codes; the receiver uses a character generator
Schematic Overview of MPEG-4
MPEG-4 Ideas

Video Object Plane (VOP)







A VOP can be a natural image from a video camera or from a graphics database
A VOP can consist of several visual objects. Visual objects do not have to have a rectangular outline (arbitrary shape)
A scene consists of several VOs and VOPs with appropriate compositing
Different VOPs can have their own motion
In principle, a visual scene can be decomposed into video objects by segmentation.
Color and texture can be attributes of visual objects
A viewer can manipulate VOs.
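As a rough illustration of compositing an arbitrarily shaped object into a scene, the sketch below pastes an object onto a background wherever a binary shape mask is set. It is not MPEG-4 syntax: the function name, the list-of-rows image representation, and the binary mask are all assumptions made for the example.

```python
def composite(background, obj, mask, top, left):
    """Paste `obj` onto a copy of `background` wherever `mask` is 1.

    `top` and `left` give the object's position in the scene, so moving
    an object independently only requires changing these two values.
    """
    out = [row[:] for row in background]  # leave the background untouched
    for r, (obj_row, mask_row) in enumerate(zip(obj, mask)):
        for c, (pixel, inside) in enumerate(zip(obj_row, mask_row)):
            if inside:
                out[top + r][left + c] = pixel
    return out


background = [[0] * 6 for _ in range(4)]
obj = [[9, 9, 9], [9, 9, 9]]
mask = [[0, 1, 0], [1, 1, 1]]  # non-rectangular outline
scene = composite(background, obj, mask, top=1, left=2)
for row in scene:
    print(row)
```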
Animation Objects



Facial animation
Body animation
2-D animation meshes
2-D Animation Mesh
Sprite coding

[Figure: sprite, background plane, and resulting composite]
Teleconferencing Standards

Digital video areas
  - Broadcast television
  - Recorded programs
  - Two-way communications
Review: Action in the Video Arena



The sponsors: ITU-T SG 15 and ISO/IEC MPEG
The players: H.x standards and MPEG-x standards
Standards, ITU-T (Telecom Guys)
  - H.261 (1990)
  - H.263 (draft March 1995)
  - New standards in the works
Standards, ISO/IEC (Entertainment Video)
  - MPEG family
Review: Video Telephone System
[Figure: video telephone system block diagram; labels: H.261, H.221, H.200/AV.250-Series, H.320]
Review: H.261 Features

Common Interchange Format
  - Interoperability between 25 fps and 30 fps countries
  - 352 pix/line, 288 lines, 30 fps noninterlaced
  - Terminal equipment converts frame and line numbers
  - Y Cb Cr components, color sub-sampled by a factor of 2 in both directions
Coding
  - DCT, 8x8, 4 Y and 2 chrominance blocks per macroblock
  - I and P frames only, P blocks can be skipped
  - Motion compensation optional, only integer compensation
  - (Optional) forward error correction coding
H.324/H.263

H.324: Like H.320
[Figure: H.324 block diagram; labels: H.261/H.263, H.223, G.723.1, H.245 (signaling), H.233 and H.234 (encryption)]
Parts of H.324




H.263: Video coding for low rate communications
G.723.1: Audio and speech for multimedia, 5.3 and 6.3 kbps
H.223: Multiplexing protocol
H.245: Control protocol. Can be used to specify standard, LAN,
and ATM networks
Features of H.263







Intended for lower rates than H.261, including 28.8 kbit/sec modems
Includes QCIF (176 x 144) and sub-QCIF (128 x 96 in Y channel) formats
Optional error correction for mobile channels
Half-pixel accuracy motion compensation
Differential encoding of motion vectors
Improved coding of DCT coefficients
Optional advanced coding options
  - Better SNR at the same rate, lower rate at the same SNR
  - 50% more complex than basic H.261
Picture Formats for H.263
Format     Image size (Y)   Image size (Cb, Cr)
sub-QCIF   128 x 96         64 x 48
QCIF       176 x 144        88 x 72
CIF        352 x 288        176 x 144
4CIF       704 x 576        352 x 288
16CIF      1408 x 1152      704 x 576
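The chroma sizes in the format table follow from subsampling by 2 in both directions, and the same numbers give the macroblock count and raw frame size. The sketch below assumes 8 bits per sample and 16 x 16 macroblocks. The frame rate passed to required_compression is a hypothetical value, not one taken from the slides, and 4CIF denotes the 704 x 576 row.

```python
# Luminance sizes from the H.263 picture format table.
FORMATS = {
    "sub-QCIF": (128, 96),
    "QCIF": (176, 144),
    "CIF": (352, 288),
    "4CIF": (704, 576),
    "16CIF": (1408, 1152),
}


def chroma_size(width, height):
    """Cb/Cr size when chroma is subsampled by 2 horizontally and vertically."""
    return width // 2, height // 2


def raw_bits_per_frame(width, height, bits_per_sample=8):
    """Uncompressed frame size: one Y plane plus two subsampled chroma planes."""
    cw, ch = chroma_size(width, height)
    return (width * height + 2 * cw * ch) * bits_per_sample


def macroblocks(width, height):
    """Number of 16 x 16 macroblocks, rounding partial blocks up."""
    return ((width + 15) // 16) * ((height + 15) // 16)


def required_compression(width, height, fps, channel_bps):
    """Compression ratio needed to fit raw video into a channel."""
    return raw_bits_per_frame(width, height) * fps / channel_bps


for name, (w, h) in FORMATS.items():
    print(name, chroma_size(w, h), macroblocks(w, h), raw_bits_per_frame(w, h))

# Hypothetical example: QCIF at 10 frames/s over a 28.8 kbit/s modem.
print(round(required_compression(176, 144, fps=10, channel_bps=28800), 1))
```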
All JPEG, ~ 12 Kbytes
[Figure: the same image at four resolutions: 551x369, 389x261, 327x219, 231x155]
Experimental Procedure




Original image subsampled (using Photoshop®) to various resolutions (pixel count from max to max/8)
Each subsampled image JPEG coded to various quality levels with Matlab®
A group of images with ~ 12 Kbytes per image is compared
Result: Subsampling + JPEG coding is better, at given total bits, than just JPEG coding
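One way to read this experiment is in bits per pixel: with the file size held near 12 Kbytes, a smaller image leaves more bits for each coded pixel. The sketch below uses the four resolutions shown on the earlier comparison slide and assumes exactly 12 x 1024 bytes per file, since the actual file sizes are not given.

```python
# Resolutions from the "All JPEG, ~ 12 Kbytes" comparison slide.
SIZES = [(551, 369), (389, 261), (327, 219), (231, 155)]

# Assumption: "~12 Kbytes" taken as exactly 12 * 1024 bytes.
FILE_BYTES = 12 * 1024


def bits_per_pixel(width, height, file_bytes=FILE_BYTES):
    """Coded bits available per pixel at a fixed file size."""
    return 8 * file_bytes / (width * height)


for width, height in SIZES:
    print(f"{width}x{height}: {bits_per_pixel(width, height):.2f} bpp")
```

The largest image gets roughly half a bit per pixel and the smallest several times that, which is the trade-off the experiment compares.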
Future of Low-Rate Video


Solution looking for a user
‘Picturephone’ - not popular
  - Liked by inventors, surveys of the public less than enthusiastic
Videoconferencing: some success, but limited acceptance
What is needed to make it successful?
Video Coding Trials

MPEG-1 encoder
  - http://bmrc.berkeley.edu/frame/research/mpeg/mpeg_encode.html
Set encoder parameters
  - Picture sequence
  - Motion compensation search range
  - Motion compensation algorithm
  - Quantizer parameters for I, P, B
Three trials

Picture sequence   Quantizer scale (I, P, B)   Output bit rate (bits/sec at 30 fps)
ibbpbbpbbp         8, 10, 25                   795096
ibbpbbpbbp         31, 31, 31                  311856
ippppppppp         31, 31, 31                  209952
‘High quality’

PATTERN: ibbpbbpbbp
RANGE: +/-10, HALF
PSEARCH: LOGARITHMIC, BSEARCH: CROSS2
QSCALE: I=8, P=10, B=25
I FRAME SUMMARY
  Blocks: 330 ( 94083 bits) ( 285 bpb)
  Compression: 21:1 ( 1.1150 bpp)
P FRAME SUMMARY
  I Blocks: 89 ( 19554 bits) ( 219 bpb)
  P Blocks: 890 (111443 bits) ( 125 bpb)
  Skipped: 11
  Compression: 46:1 ( 0.5182 bpp)
B FRAME SUMMARY
  I Blocks: 1 ( 148 bits) ( 148 bpb), B Blocks: 1883 ( 38486 bits) ( 20 bpb)
  B types: 173 ( 14 bpb) forw 291 ( 15 bpb) back 1419 ( 22 bpb) bi
  Skipped: 96
  Compression: 309:1 ( 0.0775 bpp)
Total Compression: 76:1 ( 0.3137 bpp) 795096 bits/sec @ 30 fps

Show MPEG
‘Low Quality’





QSCALE: 31 31 31
Compression 195:1 ( 0.1230 bpp)
Total Frames Per Second: 0.714286 (235 mps, macroblocks per second)
CPU Time: 1.388889 fps (458 mps)
Total Output Bit Rate (30 fps): 311856 bits/sec
Show movie
All P frames



Sequence: ippppppppp
QSCALE: 31 31 31
P FRAME SUMMARY
  I Blocks: 75 ( 6641 bits) ( 88 bpb)
  P Blocks: 1559 ( 39500 bits) ( 25 bpb)
  Skipped: 1336
Total Compression: 289:1 ( 0.0828 bpp)
Total Frames Per Second: 1.428571 (471 mps, macroblocks per second)
CPU Time: 2.702703 fps (891 mps)
Total Output Bit Rate (30 fps): 209952 bits/sec
Show video
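The bit rates, bits per pixel, and compression ratios reported for the three trials can be cross-checked against each other. The sketch below assumes a 352 x 240 frame (inferred from the 330 macroblocks reported for the single I frame, i.e. 22 x 15 blocks of 16 x 16 pixels) and a 24 bits/pixel uncompressed reference. Neither value is stated on the slides.

```python
# Assumed frame geometry: 330 macroblocks = 22 x 15 blocks of 16 x 16 pixels.
WIDTH, HEIGHT = 352, 240
# Assumed uncompressed reference used for the compression ratios.
RAW_BITS_PER_PIXEL = 24

# Output bit rates (bits/sec at 30 fps) reported for the three trials.
TRIALS = {
    "ibbpbbpbbp, QSCALE 8/10/25": 795096,
    "ibbpbbpbbp, QSCALE 31/31/31": 311856,
    "ippppppppp, QSCALE 31/31/31": 209952,
}


def bits_per_pixel(bit_rate, fps=30):
    """Average coded bits per pixel at the given output bit rate."""
    return bit_rate / fps / (WIDTH * HEIGHT)


def compression(bit_rate, fps=30):
    """Compression ratio relative to the assumed raw bits per pixel."""
    return RAW_BITS_PER_PIXEL / bits_per_pixel(bit_rate, fps)


for name, rate in TRIALS.items():
    print(f"{name}: {bits_per_pixel(rate):.4f} bpp, {compression(rate):.1f}:1")
```

Under these assumptions the computed values line up with the 0.3137, 0.1230, and 0.0828 bpp figures in the encoder summaries.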
Digital Versatile Disk


Digital Video (Versatile?) Disc (DVD) is a medium for the distribution of 4.7 to 17 billion bytes of digital data on a 120-mm (4.75 inch) disc. This huge volume of data (today's CD can store 680 million bytes of data) can be used to store up to nine hours of studio quality video and multi-channel surround-sound audio, highly interactive multimedia computer programs, 30 hours of CD-quality audio, or anything else that can be represented as digital data.
Same size as CD (compact disc)
Physical parameters
CD: 1.6 µm track spacing, 0.83 µm bit spacing
DVD: 0.74 µm track spacing, 0.5 µm bit spacing
DVD: Thickness profile
Comparison: DVD vs. CD
Parameter                   DVD        CD
Diameter                    120 mm     120 mm
Thickness                   0.6 mm     1.2 mm
Track pitch                 0.74 µm    1.6 µm
Minimum pit length          0.40 µm    0.834 µm
Laser wavelength            640 nm     780 nm
Data capacity (per layer)   4.7 GB     0.68 GB
Layers                      1, 2, 4    1
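The geometric part of the DVD capacity gain can be estimated from the track pitch and minimum pit length in the comparison. The sketch below treats the area per pit as track pitch times minimum pit length, which is a simplification. Whatever gain is left over must come from factors the table does not list.

```python
# Values from the DVD vs. CD comparison.
CD = {"track_pitch_um": 1.6, "min_pit_um": 0.834, "capacity_gb": 0.68}
DVD = {"track_pitch_um": 0.74, "min_pit_um": 0.40, "capacity_gb": 4.7}


def areal_gain(old, new):
    """Density gain from tighter tracks and shorter pits (simplified model)."""
    old_area = old["track_pitch_um"] * old["min_pit_um"]
    new_area = new["track_pitch_um"] * new["min_pit_um"]
    return old_area / new_area


def capacity_gain(old, new):
    """Ratio of per-layer capacities."""
    return new["capacity_gb"] / old["capacity_gb"]


geometric = areal_gain(CD, DVD)
total = capacity_gain(CD, DVD)
print(f"geometric gain: {geometric:.2f}x")
print(f"capacity gain:  {total:.2f}x")
print(f"gain not explained by geometry: {total / geometric:.2f}x")
```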
DVD production
DVD Player to Replace VHS
Estimated production cost: $3.50 VHS, $1.00 DVD
Next Generation ‘DVD’

Consortium sets new DVD standard (Blu-Ray)

20 February 2002



By using a 405 nm semiconductor laser, the new video-recording format enables 27 (23?) Gbyte, equivalent to thirteen hours of TV broadcasting, to be contained on a single-sided, single-layer 12 cm DVD.
Increased recording density is achieved using a 0.85 numerical aperture lens in combination with the 405 nm laser. A 0.1 mm optical transmittance protection layer is also used to minimize aberration caused by disc-tilt and give a better readout.
The companies involved are: Hitachi, LG Electronics, Matsushita, Pioneer, Philips, Samsung, Sharp, Sony and Thomson Multimedia. Notably absent from the consortium are Toshiba, one of the first companies to commercialize DVDs, and JVC, which has a vested interest in the conventional video format.
News

Nine Blu-ray Disc Founder Companies Begin Licensing
of Disc

February 14, 2003 (9:32 a.m. EST)

/PRNewswire-FirstCall/ -- Hitachi, Ltd., LG Electronics Inc., Matsushita
Electric Industrial Co., Ltd., Pioneer Corporation, Royal Philips Electronics,
Samsung Electronics Co. Ltd., Sharp Corporation, Sony Corporation, and
Thomson today announced the start of licensing of the rewritable format of
"Blu-ray Disc", the large capacity optical disc utilizing blue-violet laser.
Licensing will commence as of February 17, 2003. The introduction of
products based on "Blu-ray Disc", the first optical disc format capable of
recording High Definition broadcasts, will enable the enjoyment of even
greater picture quality within the home.

http://www.blu-ray.com/
DVD(?) Format War

HD DVD

HD-DVD, also known as AOD (Advanced Optical Disc), is the name of a competing next-generation optical disc format developed by Toshiba and NEC. The format is similar to Blu-ray and also utilizes blue-laser technology to achieve a higher storage capacity. The rewritable versions of the discs will be able to hold 20GB on a single-layer disc and 32GB on a dual-layer disc, while the read-only discs will only be able to hold 15GB on a single-layer disc and 30GB on a dual-layer disc. The read-only version of the format has been approved by the DVD Forum as the successor to the current DVD technology.
Comparison
Parameters                BD             DVD            HD-DVD
Recording capacity        27 GB          4.7 GB         20 GB
Number of layers          single-layer   single-layer   single-layer
Laser wavelength          405 nm         650 nm         405 nm
Numerical aperture (NA)   0.85           0.60           0.65
Protection layer          0.1 mm         0.6 mm         0.6 mm
Data transfer rate        36 Mbps        11 Mbps        36 Mbps
Video compression         MPEG-2         MPEG-2         MPEG-4 AVC
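The capacity differences in this comparison track the optics fairly closely. The focused spot diameter scales roughly as wavelength divided by numerical aperture, so areal density scales as (NA / wavelength) squared. The sketch below applies that rule of thumb to the table values. It is an estimate under that scaling assumption only and ignores coding and format differences.

```python
# (laser wavelength in nm, numerical aperture, single-layer capacity in GB)
DISCS = {
    "BD": (405, 0.85, 27.0),
    "DVD": (650, 0.60, 4.7),
    "HD-DVD": (405, 0.65, 20.0),
}


def optical_density_gain(disc, reference="DVD"):
    """Density gain predicted by spot size alone: (spot_ref / spot_disc) ** 2."""
    wavelength, na, _ = DISCS[disc]
    ref_wavelength, ref_na, _ = DISCS[reference]
    return ((ref_wavelength / ref_na) / (wavelength / na)) ** 2


def capacity_gain(disc, reference="DVD"):
    """Ratio of the capacities listed in the table."""
    return DISCS[disc][2] / DISCS[reference][2]


for name in ("BD", "HD-DVD"):
    print(f"{name}: optics {optical_density_gain(name):.2f}x, "
          f"capacity {capacity_gain(name):.2f}x")
```

The higher numerical aperture is what separates BD from HD-DVD here, since both use the same 405 nm laser.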
Red Laser



Pixonics Inc.
Backward-compatible technology (new disc plays on standard DVD player).
Pixonics boasts that 3.5 hours of high-definition programming can be stored on a DVD-9 disc with a 9 gigabyte capacity.