ECE-C490 Winter 2000 Image Processing Architecture

ECEC-453
Image Processing Architecture
3/11/2004
Exam Review
Oleh Tretiak
Drexel University
Lecture 15
Image Processing Architecture, © 2001, 2002 Oleh Tretiak
Page 1
Announcements
• Final on March 20
• Cumulative
• Extra credit problem: write a plugin for ImageJ (everybody does a different plugin)
Architecture for the DCT
• Separable DCT: $Y = T X T^T = (T (T X)^T)^T$
• Options
  - Fast DCT ~ conventional computer
  - Vector DCT ~ parallel hardware
• 8x8 1-D DCT: $Z = T X$
• Unit operation: multiply 8x8 matrix with 8x1 matrix ~ 64 ops
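The separable form can be sketched in a few lines of plain Python. This is only an illustration: it assumes T is the orthonormal DCT-II matrix (the slide does not state the normalization), and the helper names `dct_matrix`, `matmul`, `transpose` and `dct2_separable` are made up for this example.

```python
import math

def dct_matrix(n=8):
    """Orthonormal DCT-II matrix T (assumed normalization); rows are basis vectors."""
    t = []
    for k in range(n):
        scale = math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)
        t.append([scale * math.cos((2 * j + 1) * k * math.pi / (2 * n))
                  for j in range(n)])
    return t

def matmul(a, b):
    """Plain matrix product of two lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def transpose(a):
    return [list(row) for row in zip(*a)]

def dct2_separable(x):
    """Y = (T (T X)^T)^T: the same 1-D unit operation applied twice."""
    t = dct_matrix(len(x))
    z = matmul(t, x)                           # Z = T X (transform the columns)
    return transpose(matmul(t, transpose(z)))  # (T Z^T)^T = Z T^T

# A constant 8x8 block has all of its energy in the DC coefficient.
x = [[10.0] * 8 for _ in range(8)]
y = dct2_separable(x)
```

Both passes reuse the same matrix-times-columns operation, which is why a single 8x8 by 8x1 multiplier unit is enough in hardware.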
Computational Complexity
• 1D DCT
  - N input and output samples ~ $N^2$ operations (additions + multiplications); 64 for N = 8
• 2D DCT - direct implementation
  - $M = N^2$ input values, M output values -> $M^2 = N^4$ operations
• 2D DCT - separable implementation, $Y = T X T^T = Z T^T$, where $Z = T X$, all matrices are NxN -> $2N^3$ operations
• For N = 8
  - 2D DCT direct: 4096 operations, 64 operations per pixel
  - 2D DCT separable: 1024 operations, 16 ops/pixel
• Big savings due to separable transform
• Inverse DCT: same story.
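The counts on this slide follow directly from the two formulas. A small sketch (the function name `dct_op_counts` is hypothetical) that reproduces them for N = 8:

```python
def dct_op_counts(n=8):
    """Operation counts (additions + multiplications) for an n x n block."""
    pixels = n * n
    direct = n ** 4          # M^2 with M = n^2 inputs and outputs
    separable = 2 * n ** 3   # two n x n matrix products
    return {
        "direct": direct,
        "separable": separable,
        "direct_per_pixel": direct // pixels,
        "separable_per_pixel": separable // pixels,
    }

counts = dct_op_counts(8)
```

The saving factor is N/2, so it grows with block size; for 8x8 blocks it is a factor of 4.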
DCT: Encoding in JPEG, MPEG
• Take 8x8 blocks of pixels
• Subtract range mean value
• Compute 8x8 DCT
• Quantize the DCT coefficients
  - Typically, many of the samples are equal to zero
• Lossless entropy coding of the quantized samples
• Different quantization step is used for different DCT coefficients
  - $y_{kl}$: DCT coefficients, $q_{kl}$: quantizer steps
  - $z_{kl}$: quantized values
  $z_{kl} = \mathrm{round}\left( \frac{y_{kl}}{q_{kl}} \right)$
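A minimal sketch of the quantization step, assuming round-half-away-from-zero (the slide only says "round"). The 2x2 coefficient and step values are invented for illustration and are not taken from any standard table.

```python
import math

def round_half_away(v):
    """Round to nearest integer, ties away from zero (an assumed convention)."""
    return int(math.copysign(math.floor(abs(v) + 0.5), v))

def quantize(y, q):
    """z_kl = round(y_kl / q_kl), element by element."""
    return [[round_half_away(yv / qv) for yv, qv in zip(yrow, qrow)]
            for yrow, qrow in zip(y, q)]

def dequantize(z, q):
    """Decoder side: reconstruct y_kl as z_kl * q_kl."""
    return [[zv * qv for zv, qv in zip(zrow, qrow)]
            for zrow, qrow in zip(z, q)]

# Hypothetical corner of a coefficient block and of a step-size table.
y = [[236.0, -22.5], [7.9, -3.0]]
q = [[16, 11], [12, 12]]
z = quantize(y, q)
y_hat = dequantize(z, q)
```

Small coefficients such as -3.0 quantize to zero, which is what produces the long zero runs exploited by the entropy coder.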
Optimized (fast) DCT
• 1-D Chen DCT diagram. Dashed lines indicate subtraction; separate symbols mark multiplication by a constant and multiplication by 0.5 (shift).

Characteristics of optimized DCT algorithms

DCT or IDCT Method          Multiplications     Additions
                            1-D     2-D         1-D     2-D
1-D Chen                    16      256         26      416
1-D Lee                     12      192         29      464
1-D Loeffler, Ligtenberg    11      176         29      464
2-D Kamangar, Rao           -       128         -       430
2-D Cho, Lee                -       96          -       466
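The 2-D figures listed for the 1-D methods (Chen, Lee, Loeffler and Ligtenberg) are consistent with a row-column implementation: an 8x8 block needs 8 row transforms plus 8 column transforms, so each 1-D count is multiplied by 16. A short check, with the helper name `row_column_counts` chosen only for this sketch:

```python
# (multiplications, additions) for one 8-point 1-D transform, from the table.
ONE_D = {
    "Chen": (16, 26),
    "Lee": (12, 29),
    "Loeffler, Ligtenberg": (11, 29),
}

def row_column_counts(mults, adds, n=8):
    """2-D cost when the 1-D transform is applied to n rows and n columns."""
    passes = 2 * n
    return mults * passes, adds * passes

two_d = {name: row_column_counts(*c) for name, c in ONE_D.items()}
```

The true 2-D algorithms (Kamangar and Rao; Cho and Lee) are not built this way, which is how they get below the row-column multiplication counts.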
Huffman Coding - Block Diagram
Coding AC Coefficients
• AC coefficients are coded in zig-zag (called ZZ in standard) order to maximize possible runs of zeros.
• Code unit consists of run length followed by coefficient size.
• Baseline coding of size category is the same as for DC differences (Table 2.9)
• Example: run of 6 zeros, coefficient value = -18. In the table, -18 is in category 5. Code is (6/5, 01101). If the Huffman code for 6/5 is 1101, codeword = 110101101
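The example can be reproduced with a short sketch. The size category is the number of bits needed for the magnitude, and negative values are sent as the low-order bits of (value - 1), which is the usual baseline JPEG convention. The one-entry Huffman table below holds only the code the slide supposes for 6/5; the function names are hypothetical.

```python
def size_category(value):
    """Number of bits needed for |value|; -18 and 18 both fall in category 5."""
    return abs(value).bit_length()

def amplitude_bits(value):
    """Extra bits that pick the value within its category."""
    size = size_category(value)
    if size == 0:
        return ""
    if value < 0:
        value += (1 << size) - 1   # low-order bits of (value - 1)
    return format(value, "0{}b".format(size))

def code_ac(run, value, huffman_table):
    """Huffman code for the (run, size) symbol followed by the amplitude bits."""
    return huffman_table[(run, size_category(value))] + amplitude_bits(value)

table = {(6, 5): "1101"}            # only the entry assumed on the slide
codeword = code_ac(6, -18, table)
```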
Example of JPEG compression
• Very high quality: compression = 2.33 (Photoshop image)
• Very low quality: compression = 115 (produced by MATLAB with Quality = 0)
Compression = 64
• JPEG
• JPEG2000
Predictive Coding of Video
• E(x, y, t) = I(x, y, t) - P(x, y, t)
  - I ~ image, P ~ prediction, E ~ error
• P(x, y, t+1) = P(x, y, t) + Code(E(x, y, t))
• At receiver, Ie(x, y, t) = P(x, y, t+1)
  - Ie(x, y, t) ~ estimate of image at time t
[Block diagram: encoder and decoder with predictor; signals $q_i$, $\hat{x}_i$, $p_i$]
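The prediction loop can be sketched for a single pixel position followed over time. `Code()` is not specified on the slide, so a uniform quantizer with a hypothetical step size stands in for it here; the sample values are invented.

```python
def code(e, step):
    """Stand-in for Code(E): uniform quantizer (an assumption, not from the slide)."""
    return step * round(e / step)

def predictive_codec(samples, step=4):
    """Run the encoder loop; the receiver repeats the same update from `sent`."""
    p = 0                  # P(x, y, 0): initial prediction
    sent, recon = [], []
    for i_t in samples:    # I(x, y, t) for one pixel over time
        e = i_t - p        # E = I - P
        c = code(e, step)
        sent.append(c)
        p = p + c          # P(t+1) = P(t) + Code(E(t))
        recon.append(p)    # Ie(t) = P(t+1)
    return sent, recon

samples = [100, 103, 110, 108, 90]
sent, recon = predictive_codec(samples, step=4)
```

Because the encoder predicts from the coded error rather than from the original image, encoder and decoder stay in step and the reconstruction error never exceeds half a quantizer step.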
Generic Encoder - simplified
Motion Estimation Methods
[Figure panels: no compensation, full search, logarithmic search, 3-level hierarchical]
Full-Search Method
• Compute for $(2p+1)^2$ values of (i, j).
• Each location requires 3MN operations
• Picture dimensions IxJ, F pictures per second
  - $3IJF(2p + 1)^2$ operations per second
  - I = 720, J = 480, F = 30, p = 15 -> 30 GOPS
• Guaranteed to find best (MAE) displacement
• How to do it?
  - Special computers
  - Smaller p
  - Faster (suboptimal) algorithm
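A minimal full-search sketch, assuming images stored as lists of rows and a sum-of-absolute-differences cost (which ranks candidates the same way as MAE). The function names and the synthetic test pattern are made up for the example; candidates that fall outside the reference picture are simply skipped.

```python
def block_cost(cur, ref, top, left, di, dj, m, n):
    """Sum of absolute differences between the current block and a shifted reference block."""
    return sum(abs(cur[top + r][left + c] - ref[top + di + r][left + dj + c])
               for r in range(m) for c in range(n))

def full_search(cur, ref, top, left, m, n, p):
    """Try all (2p+1)^2 displacements; return (cost, di, dj) of the best one."""
    best = None
    for di in range(-p, p + 1):
        for dj in range(-p, p + 1):
            inside = (0 <= top + di and top + di + m <= len(ref) and
                      0 <= left + dj and left + dj + n <= len(ref[0]))
            if not inside:
                continue
            cost = block_cost(cur, ref, top, left, di, dj, m, n)
            if best is None or cost < best[0]:
                best = (cost, di, dj)
    return best

def full_search_ops_per_second(i, j, f, p):
    """3 I J F (2p+1)^2, the count given on the slide."""
    return 3 * i * j * f * (2 * p + 1) ** 2

# Synthetic pictures: the current picture is the reference shifted by (1, 2).
ref = [[(7 * r + 13 * c) % 31 for c in range(12)] for r in range(12)]
cur = [[(7 * (r + 1) + 13 * (c + 2)) % 31 for c in range(12)] for r in range(12)]
best = full_search(cur, ref, top=4, left=4, m=4, n=4, p=2)
ops = full_search_ops_per_second(720, 480, 30, 15)
```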
Hierarchical Search
• Prepare downsampled versions of current and reference images
  - Full macroblock 16x16
  - Down 2 macroblock 8x8
  - Down 4 macroblock 4x4
• Full search in Down 4 reference image
  - 16 x speedup, smaller macroblock
  - 16 x speedup, fewer displacement vectors
    o p = ±16, p' = ±4
• Around point of best match, do local search in Down 2 reference image (3x3 search zone)
• Repeat for Full reference image (3x3 search zone)
[Figure labels: Full, Down 2, Down 4]
Comparison
Search Method   Operations per Macroblock                                  Operations for video, 720x480 at 30 fps, GOPS
                                                                           p = 15    p = 7
Full Search     $3(2p + 1)^2 NM$                                           29.89     6.99
Logarithmic     $3(8 \lceil \log_2 p \rceil + 1) NM$                       1.02      0.78
PHODS           $3(4 \lceil \log_2 p \rceil + 1) NM$                       0.53      0.40
Hierarchical    $3 [ (2 \lceil p/4 \rceil + 1)^2 + 180 ] NM / 16$          0.51      0.40
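The GOPS columns can be checked from the per-macroblock formulas: dividing by NM gives operations per pixel, and multiplying by 720 x 480 x 30 gives operations per second. This sketch assumes the logarithm and p/4 are rounded up, and that the first column uses p = 15; both assumptions reproduce the listed values (29.89 is exactly the full-search result for p = 15). The function names are hypothetical.

```python
import math

def ops_per_pixel(method, p):
    """Per-macroblock operation count divided by NM."""
    steps = math.ceil(math.log2(p))
    if method == "full":
        return 3 * (2 * p + 1) ** 2
    if method == "logarithmic":
        return 3 * (8 * steps + 1)
    if method == "phods":
        return 3 * (4 * steps + 1)
    if method == "hierarchical":
        # Down-4 full search plus two 3x3 refinements (36/16 + 144/16 = 180/16).
        return 3 * ((2 * math.ceil(p / 4) + 1) ** 2 + 180) / 16
    raise ValueError("unknown method: " + method)

def gops(method, p, width=720, height=480, fps=30):
    return ops_per_pixel(method, p) * width * height * fps / 1e9

table = {(m, p): gops(m, p)
         for m in ("full", "logarithmic", "phods", "hierarchical")
         for p in (15, 7)}
```

The constant 180 in the hierarchical formula comes from the two local searches: 9 positions on 8x8 blocks and 9 positions on 16x16 blocks, expressed in units of NM/16.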
MPEG-1: ‘1.5’ Mbps
• Sample rate reduction in spatial and temporal domains
• Spatial
  - Block-based DCT
  - Huffman coding (no arithmetic coding) of motion vectors and quantized DCT coefficients
    o 352 x 240 pixels, 12 bits per pixel, picture rate 30 pictures per second -> 30.4 Mbps
    o Coded bit stream 1.15 Mbps (must leave bandwidth for audio)
    o Compression 26:1
    o Quality better than VHS!
• Temporal
  - Block-based motion compensation
  - Interframe coding (two kinds)
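The bit-rate arithmetic is easy to verify. The sketch below assumes the SIF picture size of 352 x 240, which is the size that reproduces the 30.4 Mbps and 26:1 figures; the function name is made up.

```python
def raw_bit_rate(width, height, bits_per_pixel, pictures_per_second):
    """Uncompressed video bit rate in bits per second."""
    return width * height * bits_per_pixel * pictures_per_second

raw = raw_bit_rate(352, 240, 12, 30)   # about 30.4 Mbps
coded = 1.15e6                         # coded video stream, from the slide
compression = raw / coded              # about 26:1
```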
Picture Types
• MPEG-1 is designed to support random access & editing
  - I: intraframe coding only
  - P: predictive coding
  - B: bi-directional coding
Picture of Layers
Coding Image Blocks
• B pictures
  - Inter or intra?
  - Forward, backward, interpolational?
  - Code block or skip?
  - Quantization step?

Picture      Macroblock type
Type         I       P       B        Zero MV   Skipped   Total
I            3300    -       -        -         -         3300
P            897     8587    -        5128      568       15180
B            60      7356    22845    -         429       30690
MPEG-1 Wrap-up
• Data below for decoder, SIF pictures, 2 B pictures per P
• IDCT must be precise, because of inter-frame coding
• MPEG-1 does not deliver quality acceptable for broadcast -> MPEG-2

Decoding Function                       Load (%)
Bit-stream header parsing               0.44
Huffman decoding and dequantization     19.00
Inverse DCT                             22.10
Motion compensation                     38.64
Color transformation and display        19.82
Typical MPEG coding parameters
• Typical sequence
  - IPBBPBBPBBPBBPBB (16 frames)

Picture    Average size (bits)    Compression
I          156000                 6.5
P          62000                  16.4
B          15000                  67.6

$\text{Compression (GOP)} = \frac{\text{BitsPerFrameU} \times N_{\text{FramesPerGOP}}}{\text{BitsPerCodedGOP}}$

$\text{BitsPerCodedGOP} = N_{\text{Iframes}} \times (\text{Bits}/\text{Iframe}) + N_{\text{Pframes}} \times (\text{Bits}/\text{Pframe}) + N_{\text{Bframes}} \times (\text{Bits}/\text{Bframe})$

$\text{Bits}/\text{Iframe} = \text{BitsPerFrameU} / C_I, \quad \text{Bits}/\text{Pframe} = \text{BitsPerFrameU} / C_P, \quad \text{Bits}/\text{Bframe} = \text{BitsPerFrameU} / C_B$

$\text{Compression (GOP)} = \frac{N_{\text{FramesPerGOP}}}{N_{\text{Iframes}} / C_I + N_{\text{Pframes}} / C_P + N_{\text{Bframes}} / C_B} = \frac{16}{1/6.5 + 5/16.4 + 10/67.6} = 26.4$
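The group-of-pictures compression is the harmonic-style average of the per-picture ratios, weighted by how often each picture type occurs. A short sketch (the function name `gop_compression` is hypothetical) that takes the frame pattern directly:

```python
def gop_compression(pattern, ratios):
    """N_frames / sum(1 / C_type) over all frames in the group of pictures."""
    return len(pattern) / sum(1.0 / ratios[t] for t in pattern)

ratios = {"I": 6.5, "P": 16.4, "B": 67.6}
pattern = "IPBBPBBPBBPBBPBB"          # 1 I, 5 P and 10 B pictures
c_gop = gop_compression(pattern, ratios)
```

The single I picture and the five P pictures account for about three quarters of the coded bits, so adding more B pictures per P raises the overall ratio only slowly.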
MPEG-2 Goals
• Compatibility with MPEG-1
• Good picture quality
• Flexibility in input format
• Random access capability (I pictures)
• Capability for fast forward, fast reverse play, stop frame
• Bit stream scalability
• Low delay for 2-way communications (videoconferencing)
• Resilience to bit errors
MPEG-2 profiles
• A profile is a subset of the entire MPEG-2 bit-stream syntax
  - Simple, Main, 4:2:2, SNR, Spatial, High, Multiview
• Each profile has several levels (resolution quality)
  - Low: MPEG1
  - Main: CCIR 601
  - High-1440 (Video Editing)
  - High (HDTV)
MPEG2 - Alternate Scan
[Figure panels: zig-zag scan, alternate scan]
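The zig-zag order can be generated rather than tabulated: coefficients are visited one anti-diagonal at a time, with the direction alternating between diagonals. The sketch below covers only the zig-zag scan; the alternate scan is defined by a table in the standard and is not reproduced here. The function name is made up.

```python
def zigzag_order(n=8):
    """Row-major indices of an n x n block listed in zig-zag scan order."""
    cells = [(r, c) for r in range(n) for c in range(n)]
    # Sort by anti-diagonal r + c; on odd diagonals go down (row ascending),
    # on even diagonals go up (column ascending).
    cells.sort(key=lambda rc: (rc[0] + rc[1],
                               rc[0] if (rc[0] + rc[1]) % 2 else rc[1]))
    return [r * n + c for r, c in cells]

order = zigzag_order(8)
```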
MPEG2 — Subsampling
• Suppose picture is 720x480
  - 4:4:4
    o Luminance and chrominance @ 720x480
  - 4:2:2
    o Luminance @ 720x480, chrominance 360x480
  - 4:2:0
    o Luminance 720x480, chrominance 360x240
• Weird terminology
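The three formats differ only in how much the two chrominance planes are reduced; luminance is always full size. A small sketch with hypothetical names, assuming 8 bits per sample:

```python
# (horizontal divisor, vertical divisor) applied to each chrominance plane.
CHROMA_DIVISORS = {"4:4:4": (1, 1), "4:2:2": (2, 1), "4:2:0": (2, 2)}

def plane_sizes(width, height, fmt):
    """Return (luminance size, chrominance size) for one picture."""
    dx, dy = CHROMA_DIVISORS[fmt]
    return (width, height), (width // dx, height // dy)

def bits_per_pixel(fmt, bits_per_sample=8):
    """Average bits per luminance pixel: one Y sample plus a share of Cb and Cr."""
    dx, dy = CHROMA_DIVISORS[fmt]
    return bits_per_sample * (1 + 2 / (dx * dy))

sizes = {fmt: plane_sizes(720, 480, fmt) for fmt in CHROMA_DIVISORS}
```

With 8-bit samples, 4:2:0 works out to 12 bits per pixel, 4:2:2 to 16, and 4:4:4 to 24.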
Teleconferencing Standards
• Digital video areas
  - Broadcast television
  - Recorded programs
  - Two-way communications
Review: Video Telephone System
[Diagram labels: H.261, H.221, H.200/AV.250-Series, H.320]
Review: H.261 Features
• Common Interchange Format
  - Interoperability between 25 fps and 30 fps countries
  - 352 pix/line, 288 lines, 30 fps noninterlaced
  - Terminal equipment converts frame and line numbers
  - Y Cb Cr components, color sub-sampled by a factor of 2 in both directions
• Coding
  - DCT, 8x8, 4 Y and 2 chrominance blocks per macroblock
  - I and P frames only, P blocks can be skipped
  - Motion compensation optional, only integer compensation
  - (Optional) forward error correction coding
Picture Formats for H.263
Format       Image Size
             Y              Cb, Cr
sub-QCIF     128 x 96       64 x 48
QCIF         176 x 144      88 x 72
CIF          352 x 288      176 x 144
4CIF         704 x 576      352 x 288
16CIF        1408 x 1152    704 x 576
Encoder: Where’s the meat?
[Encoder block diagram. Blocks: Video; Color, Downsample; DCT; Quantization; Entropy coding; Buffer; Bitstream; Inv. quantization; IDCT; Reference Memory; Motion Compensation; Predicted Picture; Motion Estimation. Load annotations on the diagram: 10%, 63%, 10%]
Advanced Video Coding
• H.263 and MPEG-4 based on ~1995 technology
• After 1995, MPEG and VCEG (video coding) started working on a new low-rate standard (H.26L)
• Rec H.264 released in September 2002
• Information on http://www.vcodex.com/ (some is on our web site)
• Site maintained by Ian Richardson, who has written books about video coding
New Features
• Prediction in I pictures
• Different block transform
• Different Block Sizes
• Changes in motion compensation
• VLC and arithmetic coding
I Picture Prediction
• System operates with 4x4 blocks and 16x16 macroblocks
9 Prediction Modes for 4x4 Blocks
4 Modes for 16x16 Macroblocks
• Mode 0: Vertical, extrapolate from upper samples
• Mode 1: Horizontal, extrapolate from left samples
• Mode 2: DC, mean of upper and left-hand samples
• Mode 3: Plane, linear fit to left and upper samples
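Modes 0 to 2 are simple enough to sketch directly. The function below works for any block size and is shown on a 4x4 example to keep the numbers small; the rounding used for the DC mean is an assumption, and mode 3 (plane) is left out because the slide does not give its fit equations. All names and sample values are invented.

```python
def predict_block(top, left, mode):
    """Predict an n x n block from the n samples above and the n samples to the left."""
    n = len(top)
    if mode == 0:    # vertical: copy the upper samples down each column
        return [list(top) for _ in range(n)]
    if mode == 1:    # horizontal: copy the left samples across each row
        return [[left[r]] * n for r in range(n)]
    if mode == 2:    # DC: rounded mean of upper and left samples (assumed rounding)
        dc = (sum(top) + sum(left) + n) // (2 * n)
        return [[dc] * n for _ in range(n)]
    raise ValueError("mode 3 (plane) is not sketched here")

top = [10, 20, 30, 40]      # hypothetical reconstructed samples above the block
left = [12, 14, 16, 18]     # hypothetical reconstructed samples to the left
vertical = predict_block(top, left, 0)
horizontal = predict_block(top, left, 1)
dc = predict_block(top, left, 2)
```

An encoder would form the residual against each candidate prediction and keep the mode that is cheapest to code.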
Different Block Transform
• Basically, 4x4 DCT
• Scanning sequence for 16x16 macroblock is shown below
• 4x4 and 2x2 DC coefficients transformed (again)
4x4 DCT Tricks
$A = \begin{bmatrix} a & a & a & a \\ b & c & -c & -b \\ a & -a & -a & a \\ c & -b & b & -c \end{bmatrix}$

• $Y = A X A^T$
• $a = 1/2$, $b = 0.707 \cos(\pi/8)$, $c = 0.707 \cos(3\pi/8)$
• Trick: $Y = (C X C^T) \mathbin{.*} E$, where .* is element-by-element multiplication

$C = \begin{bmatrix} 1 & 1 & 1 & 1 \\ 2 & 1 & -1 & -2 \\ 1 & -1 & -1 & 1 \\ 1 & -2 & 2 & -1 \end{bmatrix}$

$E = \begin{bmatrix} a^2 & ab/2 & a^2 & ab/2 \\ ab/2 & b^2/4 & ab/2 & b^2/4 \\ a^2 & ab/2 & a^2 & ab/2 \\ ab/2 & b^2/4 & ab/2 & b^2/4 \end{bmatrix}$
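The factorization can be checked numerically. One caveat: with the integer matrix C (entries 1 and 2), the identity holds exactly only when c = b/2, and the rows stay orthonormal only if b = sqrt(2/5). These are the slightly modified values commonly described for H.264 and are treated here as an assumption, since the slide lists the exact DCT constants. The sample block and helper names are invented.

```python
import math

A_CONST = 0.5
B_CONST = math.sqrt(2.0 / 5.0)      # assumed modified value (exact DCT: 0.707*cos(pi/8))
C_MAT = [[1, 1, 1, 1], [2, 1, -1, -2], [1, -1, -1, 1], [1, -2, 2, -1]]

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]

def transpose(a):
    return [list(row) for row in zip(*a)]

def transform_direct(x):
    """Y = A X A^T with c = b/2."""
    a, b, c = A_CONST, B_CONST, B_CONST / 2
    a_mat = [[a, a, a, a], [b, c, -c, -b], [a, -a, -a, a], [c, -b, b, -c]]
    return matmul(matmul(a_mat, x), transpose(a_mat))

def transform_trick(x):
    """Y = (C X C^T) .* E: integer core transform, then per-coefficient scaling."""
    w = matmul(matmul(C_MAT, x), transpose(C_MAT))       # integers only
    d = [A_CONST, B_CONST / 2, A_CONST, B_CONST / 2]     # E[i][j] = d[i] * d[j]
    return [[w[i][j] * d[i] * d[j] for j in range(4)] for i in range(4)]

x = [[5, 11, 8, 10], [9, 8, 4, 12], [1, 10, 11, 4], [19, 6, 15, 7]]
y_direct = transform_direct(x)
y_trick = transform_trick(x)
max_diff = max(abs(y_direct[i][j] - y_trick[i][j])
               for i in range(4) for j in range(4))
```

The point of the trick is that the core transform needs only additions and shifts, while the multiplication by E can be folded into the quantizer.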
Motion Compensation Ideas
• Adaptive motion compensation blocks:
  - 16x16, 16x8, 8x16, 8x8, 8x4, 4x8, 4x4
Coding Ideas
• Constant quantizer value
• Zig-zag scan with novel run-length code
• Arithmetic coding an option
• Motion vectors to 1/4 pixel
Loop Filter
• Concept to overcome block artifacts
• Average across inter-block lines if difference is too big
• Difference threshold depends on coding mode (intra or inter) and quantization step size
Example of Loop Filter