
Motion compensated inter-frame
prediction
• The previous chapter concentrated on removing spatial redundancy, that is, the redundancy within a single frame or field.
• The next topic is the removal of temporal redundancy between frames.
• The technique relies on the fact that within a short
sequence of the same general image, most objects
remain in the same location, while others move only
a short distance.
Motion Compensated
Inter-Frame
tMyn
1
• Difference coding is a very simple interframe
compression process during which each frame of a
sequence is compared with its predecessor and only
pixels that have changed are updated.
• If the number of pixels to be updated is large, then the overhead of specifying which pixels have changed can adversely affect compression.
• How to make it better?
• Firstly, the intensity of many pixels will change only
slightly and when coding is allowed to be lossy, only
pixels that change significantly need be updated.
• Thus, not every changed pixel will be updated.
• Secondly, difference coding need not operate at the
pixel level, but at the block level.
• If the frames are divided into non-overlapping blocks
and each block is compared with its counterpart in
the previous frame, then only blocks that change
significantly need be updated.
• Updating whole blocks of pixels at once reduces the
overhead required to specify where updates take
place.
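The block-level difference coding described above can be sketched as follows. This is a minimal sketch, not taken from any standard: the block size, threshold, and mean-absolute-difference test are illustrative assumptions.

```python
import numpy as np

def diff_code_blocks(prev, curr, block=8, threshold=10.0):
    """Block-level difference coding: compare each block of the current
    frame with the co-located block in the previous frame and keep only
    blocks whose mean absolute difference exceeds the threshold."""
    h, w = curr.shape
    updates = []  # (row, col, pixels) for each block that must be sent
    for r in range(0, h, block):
        for c in range(0, w, block):
            a = prev[r:r + block, c:c + block].astype(float)
            b = curr[r:r + block, c:c + block].astype(float)
            if np.mean(np.abs(a - b)) > threshold:
                updates.append((r, c, b))
    return updates

def reconstruct(prev, updates):
    """Decoder side: start from the previous frame and patch in the
    updated blocks."""
    out = prev.astype(float).copy()
    for r, c, b in updates:
        out[r:r + b.shape[0], c:c + b.shape[1]] = b
    return out
```

Updating whole blocks means the encoder only has to signal one (row, col) position per changed block, rather than one per changed pixel.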
• If pixels are updated in blocks, some pixels will be
updated unnecessarily, especially if large blocks are
used.
• Also, in parts of the image where updated blocks
border parts of the image that have not been
updated, discontinuities might be visible and this
problem is worse when larger blocks are used.
• Block based difference coding can be further
improved upon by compensating for the motion
between frames.
• Difference coding, no matter how sophisticated, is
almost useless where there is a lot of motion.
• Only objects that remain stationary within the image
can be effectively coded.
• If there is a lot of motion or indeed if the camera itself
is moving, then very few pixels will remain
unchanged.
• Even a very slow pan of a still scene will have too
many changes to allow difference coding to be
effective, even though much of the image content
remains from frame to frame.
• To solve this problem it is necessary to compensate
in some way for object motion.
• The basic coding unit for removal of spatial redundancy was defined to be an 8×8 block.
• However, with MPEG-2, motion compensation is usually based on a 16×16 block, termed a macroblock.
• This size is a trade-off between the requirement for a large macroblock, to minimize the bit rate needed to transmit the motion representation (the motion vectors), and the requirement for a small macroblock, to allow the prediction process to adapt to the picture content and motion.
• There are many methods available to generate a
motion compensated prediction.
• These include forward prediction, where a macroblock is predicted from a past macroblock; backward prediction, where a macroblock is predicted from a future macroblock; and intra coding, where the macroblock is coded without any prediction.
• These prediction modes are applied to MPEG-2 pictures depending on the picture type.
• The motion is described as a two-dimensional motion
vector that specifies where to retrieve a macroblock
from a previously decoded frame to predict the
sample values of the current macroblock.
• After a macroblock has been compressed using
motion compensation, it contains both the spatial
difference (motion vectors) and content difference
(error terms) between the reference macroblock and
macroblock being coded.
• Note that there are cases where information in a
scene cannot be predicted from the previous scene,
such as when a door opens.
• The previous scene doesn’t contain the details of the
area behind the door.
• Motion-compensated interframe prediction is based
on techniques similar to the well-known differential
pulse-code modulation (DPCM) principle, Figure 1.
[Figure: the input s(n) minus the prediction forms the prediction error s(n) − ŝ(n−1), which is fed to a QUANTIZER; the quantized prediction error goes to the channel; a PREDICTOR driven by the locally-decoded output supplies the prediction based on the previous locally-decoded output.]
Figure 1. Basic DPCM coder.
• Using DPCM means that what is quantized and transmitted is only the difference between the input and a prediction based on the previous locally-decoded output.
• Note that the prediction cannot be based on previous
source pictures, because the prediction has to be
repeatable in the decoder (where the source pictures
are not available).
• Consequently, the coder contains a local decoder
which reconstructs pictures exactly as they would be
in the actual decoder.
• The locally-decoded output then forms the input to
the predictor.
• In interframe prediction, samples from previously decoded ”reference” frames are used in the prediction of samples in the current frame.
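The DPCM loop described above, in which the predictor is fed by the locally decoded output rather than by the source samples, can be sketched for a one-dimensional signal. The uniform quantizer and its step size of 4 are illustrative assumptions:

```python
def dpcm_encode(samples, step=4):
    """Scalar DPCM sketch: quantize the difference between each input
    sample and a prediction based on the previous *locally decoded*
    sample, so that encoder and decoder predictions stay in lockstep."""
    codes = []
    pred = 0  # predictor state = last locally decoded sample
    for s in samples:
        e = s - pred             # prediction error
        q = round(e / step)      # quantized error, sent to the channel
        codes.append(q)
        pred = pred + q * step   # local decoder: reconstruct the sample
    return codes

def dpcm_decode(codes, step=4):
    """Decoder: identical reconstruction loop, using only the
    transmitted quantized errors."""
    out = []
    pred = 0
    for q in codes:
        pred = pred + q * step
        out.append(pred)
    return out
```

Because the encoder's predictor uses its own reconstruction, the decoder (which never sees the source samples) arrives at exactly the same values, and the reconstruction error per sample is bounded by the quantizer step.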
• Many ”moving” images or image sequences consist
of a static background with one or more moving
foreground objects.
• In this simplified case it is easy to see how some
coding advantage can be gained.
• Figure 2 shows part of two temporally adjacent images from a sequence.
• Most of the image is unchanged from one time
instant to the next, but a foreground object moves.
• Suppose that the first image has been encoded using
DCT and quantization.
[Figure: previous frame (stationary background) at time t − Δt and current frame at time t; the moving object is shifted by the displacement vector (d_x, d_y).]
Prediction for the luminance signal S(x, y, t) within the moving object:
Ŝ(x, y, t) = S(x − d_x, y − d_y, t − Δt) .
Figure 2. Two adjacent frames, stationary background, moving object.
• That image has been transmitted and reconstructed
at the decoder.
• Now let’s look at the second image, but keep the first
image in store at both the encoder and decoder.
• This is referred to as the reference image.
• We now treat the second image one block of pixels at
a time, but before performing the DCT, we compare
the block with the same block in the reference image.
• If the block is part of the static background, it will be
identical to the corresponding block in the reference
image.
• Instead of encoding this block, we can just tell the
decoder to use the block from its copy of the
reference image.
• All we need is one special code, and we can avoid
sending any data from the background blocks.
• Where the pixel block in either image includes part of
the moving foreground object, we will probably not
find a match, so we can encode and transmit that
block using DCT and quantization just as we did with
the first image.
• The benefit gained from this approach obviously
depends on the picture content, but it is certainly
possible that in an image with a static background
half or perhaps three-quarters or more of the blocks
will need no coding other than the ”same as previous
image” code.
• The previous example is clearly a very special case.
• Let’s consider an example where the camera pans slightly to one side between the two images, Figures 3 and 4.
• Now if we try the test of comparing a block to the
corresponding block in the previous image we will not
get any matches.
• However, we know that for most of the blocks the
data exists in the previous image.
• It is just not in quite the same place!
Figure 3. Camera panning, n:th frame.
Figure 4. Camera panning, (n+1):th frame.
• We must send to the decoder the instruction to use
data from the previous image, plus a motion vector to
inform the decoder exactly where in the previous
image to get the data.
• It would be sensible to make this a relative measure,
so that the motion vector would be zero for a static
background.
• The motion vector is a two-dimensional value,
normally represented by a horizontal (x) component
and a vertical (y) component.
• The process of obtaining the motion vector is known
as motion estimation.
• Using the motion vector to eliminate or reduce the
effects of motion is known as motion compensation.
• Static backgrounds and moving backgrounds provide
a simple visualization of how motion vectors may be
used to identify correlation between images in a
sequence.
• Block based motion compensation uses blocks from
a past frame to construct a replica of the current
frame.
• The past frame is a frame that has already been
transmitted to the receiver.
• For each block in the current frame a matching block
is found in the past frame and if suitable, its motion
vector is substituted for the block during
transmission.
• Depending on the search threshold some blocks will
be transmitted in their entirety rather than substituted
by motion vectors.
• Block based motion compensated video compression
takes place in a number of distinct stages.
• Figure 5 illustrates how the output from the earlier processes forms the input to later processes.
• Consequently, choices made at early stages can have an impact on the effectiveness of later stages.
[Figure: Past/Future Frame and Current Frame → Frame Segmentation → Search Threshold → Block Matching → Motion Vector Correction.]
Figure 5a. Flow of information through the motion compensation process.
[Figure: Prediction Error Coding → Vector Coding → Block Coding → Transmission.]
Figure 5b. Flow of information through the motion compensation process.
Frame segmentation
• The current frame of video to be compressed is
divided into equal sized non-overlapping rectangular
blocks.
• Ideally the frame dimensions are multiples of the
block size and square blocks are most common.
• Block size affects the performance of compression
techniques.
• The larger the block size, the fewer the number of
blocks, and hence fewer motion vectors need to be
transmitted.
• However, borders of moving objects do not normally
coincide with the borders of blocks and so larger
blocks require more correction data to be transmitted.
• Small blocks result in a greater number of motion
vectors, but each matching block is more likely to
closely match its target and so less correction data is
required.
• If the block size is too small then the compression
system will be very sensitive to noise.
• Thus block size represents a trade off between
minimising the number of motion vectors and
maximising the quality of the matching blocks.
• For architectural reasons block sizes of integer
powers of 2 are preferred.
• Both the MPEG-2 and H.261 video compression standards use blocks of 16×16 pixels.
Search Threshold
• If the difference between the target block and the
candidate block at the same position in the past
frame is below some threshold then it is assumed
that no motion has taken place and a zero vector is
returned.
• Most video codecs employ a threshold in order to
determine if the computational effort of a search is
warranted.
Block Matching
• Block matching is the most time consuming part of
the encoding process.
• During block matching each target block of the
current frame is compared with a past frame in order
to find a matching block.
• When the current frame is reconstructed by the
receiver this matching block is used as a substitute
for the block from the current frame.
• Block matching takes place only on the luminance
component of frames.
• The colour components of the blocks are included
when coding the frame but they are not usually used
when evaluating the appropriateness of potential
substitutes or candidate blocks.
• The search can be carried out on all of the past
frame, but is usually restricted to a smaller search
area centered around the position of the target block
in the current frame, Figure 6.
Figure 6. Corresponding blocks from a current and past frame, and the search
area in the past frame.
• This practice places an upper limit, known as the
maximum displacement, on how far objects can
move between frames, if they are to be coded
effectively.
• The maximum displacement is specified as the
maximum number of pixels in the horizontal and
vertical directions that a candidate block can be from
the position of the target block in the original frame.
• The quality of the match can often be improved by
interpolating pixels in the search area, effectively
increasing the resolution within the search area by
allowing hypothetical candidate blocks with fractional
displacements.
• The search area need not be square.
• Because motion is more likely in the horizontal
direction than vertical, rectangular search areas are
popular.
• The problem is the lack of adequate temporal
sampling.
• If the temporal sampling obeyed Nyquist, we would
have a very easy task in tracking an object from one
sample to the next.
• The temporal sampling rate of any common imaging system falls far below the Nyquist rate.
• This fact leads to a simple conclusion: given the
position of an object in one image of the sequence,
we have no idea where it will be in the next!!
• To obey the Nyquist criterion, a sharp edge must not move by more than one spatial sample between two temporal samples; suppose there are 720 samples per line and 50 temporal samples per second.
• The fastest permissible motion for a sharp edge is then one that travels from one side of the screen to the other in 720/50 = 14.4 seconds!!!
• At first this seems a terrible result.
• It implies that all current motion imaging systems
suffer from gross temporal aliasing.
• If we aimed at tracking an object that traverses the screen in about half a second, we would need to accommodate a displacement of about 50 pixels per frame (720/(0.5 × 25) = 57.6) in standard-definition television.
• If we wish to predict for, say, three frames, that
means a search range of about 150 pixels in each
direction.
• In real-world scenes, there is usually more or faster
motion horizontally than vertically, and research has
shown that for a given search area it is optimal for the
width to be about twice the height.
• This means a total search area of 300 pixels * 150
pixels.
• Full-search block matching tests every possible block
within a defined search range against the block it is
desired to match.
• The technique is accurate and exhaustive – if there is
a match within the search range, this method will find
it.
• It is, however, computationally demanding.
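A full-search matcher of this kind can be sketched as follows, using the mean absolute difference between blocks to score candidates. The block size default echoes an MPEG-2 macroblock, but the search range and scoring details here are illustrative assumptions:

```python
import numpy as np

def mad(a, b):
    """Mean absolute difference between two equal-sized blocks."""
    return np.mean(np.abs(a.astype(float) - b.astype(float)))

def full_search(ref, curr, top, left, block=16, max_disp=7):
    """Exhaustive block matching: test every candidate position within
    +/-max_disp of the target block's position in the past frame and
    return the motion vector (dx, dy) with the minimum MAD, plus that
    minimum MAD."""
    target = curr[top:top + block, left:left + block]
    h, w = ref.shape
    best, best_mad = (0, 0), float('inf')
    for dy in range(-max_disp, max_disp + 1):
        for dx in range(-max_disp, max_disp + 1):
            r, c = top + dy, left + dx
            if r < 0 or c < 0 or r + block > h or c + block > w:
                continue  # candidate would fall outside the past frame
            score = mad(target, ref[r:r + block, c:c + block])
            if score < best_mad:
                best_mad, best = score, (dx, dy)
    return best, best_mad
```

The cost is visible in the loop structure: a range of ±d requires (2d + 1)² evaluations of the distortion function per target block.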
Matching Criteria
• In order for the compressed frame to look like the
original, the substitute block must be as similar as
possible to the one it replaces.
• Thus a matching criterion, or distortion function, is
used to quantify the similarity between the target
block and candidate blocks.
• If, due to a large search area, many candidate blocks
are considered, then the matching criteria will be
evaluated many times.
• If the matching criterion is slow, then the block
matching will be slow.
• If the matching criterion results in bad matches then
the quality of the compression will be adversely
affected.
• The mean absolute difference (MAD) is the most
popular block matching criterion.
• Corresponding pixels from each block are compared
and their differences summed.
• Blocks A and B are of size n*m.
• A[p,q] is the value of the pixel in the p:th row and q:th
column of block A.
MAD = (1 / (m·n)) · Σ_{p=1..m} Σ_{q=1..n} |A[p,q] − B[p,q]| .
• The lower the MAD the better the match and so the
candidate block with the minimum MAD should be
chosen.
• The function is alternatively called Mean Absolute
Error (MAE).
• The mean square difference function (MSD) is similar
to the mean absolute difference function, except that
the difference between pixels is squared before
summation:
MSD = (1 / (m·n)) · Σ_{p=1..m} Σ_{q=1..n} (A[p,q] − B[p,q])² .
• The mean square difference is more commonly
called the Mean Square Error (MSE) and the lower
this value the better the match.
• The Pel Difference Classification (PDC) distortion
function compares each pixel of the target block with
its counterpart in the candidate block and classifies
each pixel pair as either matching or not matching.
• Pixels are matching if the difference between their
values is less than some threshold and the greater
the number of matching pixels the better the match.
PDC = Σ_{p=1..m} Σ_{q=1..n} ord(|A[p,q] − B[p,q]| ≤ t) ,
where ord(e) evaluates to 1 if e is true and 0 if false.
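The three criteria above can be written directly. The threshold value in the PDC sketch is an illustrative assumption:

```python
import numpy as np

def mad(a, b):
    """Mean absolute difference: lower is a better match."""
    return np.mean(np.abs(a.astype(float) - b.astype(float)))

def mse(a, b):
    """Mean square difference (mean square error): lower is better."""
    return np.mean((a.astype(float) - b.astype(float)) ** 2)

def pdc(a, b, t=5):
    """Pel difference classification: count pixel pairs whose absolute
    difference is within threshold t. Higher is a better match."""
    return int(np.sum(np.abs(a.astype(float) - b.astype(float)) <= t))
```

Note the opposite polarities: a block matcher minimizes MAD or MSE but maximizes PDC.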
• Integral projections (IP) are calculated by summing
the values of pixels from each column and each row
of a block.
• The most attractive feature of this criterion is that
values calculated for a particular candidate block can
be reused in calculating the integrals for overlapping
candidate blocks.
• This feature is of particular value during an
exhaustive search, but less useful in the case of suboptimal searches.
IP = Σ_{p=1..m} |Σ_{q=1..n} A[p,q] − Σ_{q=1..n} B[p,q]| + Σ_{q=1..n} |Σ_{p=1..m} A[p,q] − Σ_{p=1..m} B[p,q]| .
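A direct sketch of the integral projection criterion follows. In a real exhaustive search the row and column sums of overlapping candidate blocks would be cached and reused rather than recomputed as here:

```python
import numpy as np

def integral_projection(a, b):
    """Integral projection criterion: compare the row sums and column
    sums of two equal-sized blocks; lower is a better match."""
    a = a.astype(float)
    b = b.astype(float)
    row_term = np.sum(np.abs(a.sum(axis=1) - b.sum(axis=1)))  # sums over q
    col_term = np.sum(np.abs(a.sum(axis=0) - b.sum(axis=0)))  # sums over p
    return row_term + col_term
```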
Sub-Optimal Block Matching Algorithms
• The exhaustive search is computationally very
intensive and requires the distortion function
(matching criteria) to be evaluated many times for
each target block to be matched.
• Considerable research has gone into developing
block matching algorithms that find suitable matches
for target blocks but require fewer evaluations.
• Such algorithms test only some of the candidate
blocks from the search area and choose a match
from this subset of blocks.
• Hence they are known as sub-optimal algorithms.
• Because they do not examine all of the candidate
blocks, the choice of matching block might not be as
good as that chosen by an exhaustive search.
• The quality-cost trade-off is usually worthwhile
however.
• Signature based algorithms successfully reduce the
complexity of block matching while preserving many
of the advantages of the exhaustive search.
• Signature based algorithms reduce the number of
operations required to find a matching block by
performing the search in a number of stages using
several matching criteria.
• During the first stage every candidate block in the search area is evaluated using a computationally simple matching criterion (e.g. pel difference classification).
• Only the most promising candidate blocks are examined during the second stage, when they are evaluated by a more selective matching criterion.
• Signature based algorithms may have several stages
and many different matching criteria.
• Coarse quantization of vectors.
• While signature based algorithms reduce complexity
by minimising the complexity of the criteria applied to
each block in the search space, it is also possible to
reduce complexity by reducing the number of blocks
to which the criterion is applied. These algorithms
consider only a subset of the search space.
• The decision on which candidate blocks to examine
and which candidate blocks to ignore is never
arbitrary.
• Research indicates that humans cannot perceive fast
moving objects with full resolution, which results in
fast moving objects appearing blurred.
• Thus if the quality of an image portion containing a
fast moving object was to drop whilst the object was
in motion, this reduction in quality might go
unnoticed.
• All candidate blocks close to the centre of the search
area (i.e. around the zero vector) are evaluated as
potential matches, but only a subset of the candidate
blocks far from the centre are considered.
• This results in slow moving or stationary objects
being coded with the best available match, whereas
fast moving blocks might be coded with less ideal
matches.
• Figure 7 illustrates a search pattern (Gilge) that could be used in place of a full search with a maximum displacement of ±6 pixels in both the horizontal and vertical directions.
• For a search area of this size the number of matching
criteria evaluations is reduced from 169 to 65.
Figure 7. Search pattern for a matching algorithm that coarsely quantizes motion vectors
of fast moving objects.
• The principle of locality suggests that very good
matches, if they exist, are likely to be found in the
neighbourhood of other good matches.
• For example, the assertion that ”if the wine from a
particular winery is very good then the wine from a
nearby winery is likely to be good” is based on the
principle of locality.
• Although such assumptions can prove false, they are
nonetheless useful.
• Block matching algorithms that are based on the
principle of locality first examine a sparsely spaced
subset of the search area and then narrow the search
to only those areas that show promise, Figure 8.
[Figure: candidate grid with displacements from −10 to +10; a sparse first hierarchical level, the best match at the first level, and a denser second hierarchical level around it.]
Figure 8. Two-level hierarchical search, the principle of locality.
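A two-level hierarchical search in this spirit might look as follows. The grid spacing, refinement window, and use of MAD are illustrative assumptions, and like all searches based on the principle of locality it can converge on a local minimum:

```python
import numpy as np

def mad(a, b):
    """Mean absolute difference between two equal-sized blocks."""
    return np.mean(np.abs(a.astype(float) - b.astype(float)))

def two_level_search(ref, curr, top, left, block=8, max_disp=8, coarse=4):
    """Two-level hierarchical search: score a sparse grid of candidate
    displacements first, then densely refine around the best coarse
    match. Returns the chosen motion vector (dx, dy)."""
    target = curr[top:top + block, left:left + block]
    h, w = ref.shape

    def score(dx, dy):
        r, c = top + dy, left + dx
        if r < 0 or c < 0 or r + block > h or c + block > w:
            return float('inf')  # candidate outside the past frame
        return mad(target, ref[r:r + block, c:c + block])

    # First hierarchical level: sparsely spaced candidates.
    level1 = [(dx, dy)
              for dy in range(-max_disp, max_disp + 1, coarse)
              for dx in range(-max_disp, max_disp + 1, coarse)]
    bx, by = min(level1, key=lambda v: score(*v))
    # Second level: dense search in the neighbourhood of the best match.
    level2 = [(bx + dx, by + dy)
              for dy in range(-coarse // 2, coarse // 2 + 1)
              for dx in range(-coarse // 2, coarse // 2 + 1)]
    return min(level2, key=lambda v: score(*v))
```

For a ±8 range this evaluates 25 coarse plus 25 fine candidates instead of the 289 a full search would need.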
• By adopting a quadrant monotonic model of the
image data to be compressed, it is possible to
significantly decrease the number of computations
required to find a matching block, without significantly
decreasing the suitability of the match.
• This model assumes that the value of the distortion
function increases as the distance from the point of
minimum distortion increases.
• Therefore, not only are candidate blocks close to the
optimal block better matches than those far from it,
but the value of the distortion function is a function of
the distance from the optimal position.
• Thus the quadrant monotonic assumption is a special
case of the principle of locality.
• The strength of a radio signal at various distances
from the transmitter is an example of quadrant
monotonic data.
• The quadrant monotonic assumption allows for the
development of sub-optimal algorithms that, like
others, examine only some of the candidate blocks in
the search area.
• In addition they use the values of the distortion
function to guide the search towards a good match.
• There are a number of sub-optimal block matching
algorithms that use the quadrant monotonic
assumption:
– 2-D Logarithmic Search
– Three Step Search
– Orthogonal Search Algorithm
– One at a Time Search
– Cross Search Algorithm
– Greedy Algorithms
• The Dynamic Search Window Algorithm attempts to
achieve improved performance by directly addressing
the problem of convergence on local minima.
• Although this algorithm starts with a step size in the
normal way, the extent to which the area of the
search window decreases depends on the difference
between the minimum distortion and the second
lowest distortion.
• There could be for example three convergence
modes: fast, normal, and slow.
• If the difference between the two lowest distortions
was small then the outcome of a stage of the
algorithm was deemed inconclusive and so the step
size was reduced only a little (slow).
• This gave the algorithm the opportunity to recover if,
in fact, the wrong point was chosen.
• Conversely, if the difference between the two lowest
distortions was large, then the algorithm converged
more quickly on the minimum (fast).
• If the difference between the two lowest distortions
fell between the thresholds for fast and slow modes
then the algorithm reduced the search area in the
normal way.
Dependent Algorithms
• There are two primary sources of motion in image sequences.
• They are camera motion (zoom, pan, and tilt) and the motion of objects within the scene.
• Since blocks are generally smaller than the objects, it is reasonable to assume that there is correlation between the motion of adjacent blocks.
• Dependent block matching algorithms calculate the motion of a given block with the aid of the motion vectors of its neighbouring blocks.
• The motion vectors of the neighbouring blocks are
used to calculate a prediction of the block’s motion
and this prediction is used as a starting point for the
search.
• Dependency introduces memory into the block
matching algorithm, but usually results in a more
ordered set of motion vectors.
• Dependency can be spatial or temporal or both.
• Spatial dependency exploits the correlation between
the motion vectors of neighbouring blocks to provide
a prediction to matching algorithms as to the likely
position of a block’s match.
• Frequently the prediction is formed by taking a
weighted average of the neighbouring blocks’ motion
vectors.
• Using all the neighbouring blocks’ motion vectors is inefficient, however, not least because they must all be determined first, so sparser patterns of neighbouring blocks are used to achieve the same effect.
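One widely used pattern of this kind, the componentwise median of the left, above, and above-right neighbours' vectors (as in H.263), can be sketched as:

```python
def predict_motion_vector(neighbours):
    """Predict a block's motion vector as the componentwise median of
    its neighbours' already-determined vectors. The median rejects a
    single outlying vector at an object boundary, where a plain
    average would be pulled off course."""
    def med(vals):
        s = sorted(vals)
        return s[len(s) // 2]
    return (med([v[0] for v in neighbours]),
            med([v[1] for v in neighbours]))
```

The prediction is then used as the starting point for the search rather than as the final vector.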
• In Figure 9 a number of alternatives are presented.
• In the figure the green squares are blocks whose
motion vector will be used to predict the motion
vector of the red block, or target block.
Figure 9. Examples of spatial dependency patterns.
• To circumvent the problems associated with spatial
dependency at object boundaries, some spatially
dependent algorithms do not average the
neighbouring vectors.
• Instead they consider how many of the neighbouring
vectors are similar.
• If, for example, the motion vectors of four neighbouring blocks are considered and three of them are the same, then that common motion vector might be chosen as the starting position for the search.
• If the motion vectors of the neighbouring blocks are
not sufficiently uniform then the search for the target
block might be carried out as normal, as though no
spatial dependency was being exploited.
Temporal dependency
• If the assumption is made that moving objects in a
scene move with constant velocity then a large
degree of temporal redundancy can be expected.
• This redundancy can be exploited in a manner similar
to spatial redundancy by providing predictions to
block matching algorithms as to likely positions of
matching blocks.
• By examining the motion of blocks in the previous
frame(s), predictions about their behaviour in the
current frame can be made.
• The distance and direction of object motion between
frames tends to remain the same from frame to
frame.
• Temporal dependency exploits this tendency by assuming that motion vectors from previous frames indicate good starting points for searches in subsequent frames.
• The effectiveness of compression techniques that
use block based motion compensation depends on
the extent to which the following assumptions hold:
1. Objects move in a plane that is parallel to the
camera plane. Thus the effects of zoom and object
rotation are not considered, although tracking in the
plane parallel to object motion is.
2. Illumination is spatially and temporally uniform.
That is, the level of lighting is constant throughout the
image and does not change over time.
3. Occlusion of one object by another, and uncovered
background are not considered.
• Bidirectional motion compensation uses matching
blocks from both a past frame and a future frame to
code the current frame.
• A future frame is a frame that is displayed after the
current frame.
• Bidirectional compression is much more successful
than compression that uses a single past frame,
because information that is not to be found in the
past frame might be found in the future frame.
• This allows more blocks to be replaced by motion
vectors.
• Bidirectional motion compensation, however, requires that frames be encoded and transmitted in a different order from that in which they will be displayed.
Motion vector correction
• Once the best substitute, or matching block, has
been found for the target block, a motion vector is
calculated.
• The motion vector describes the location of the
matching block from the past frame with reference to
the position of the target block in the current frame.
• Motion vectors, irrespective of how they are
determined, might not correspond to the actual
motion in the scene.
• This may be due to noise, weaknesses in the
matching algorithms, or local minima.
• The property exploited by spatially dependent algorithms can also be utilised after the vectors have been calculated, in an attempt to correct them.
• Smoothing techniques can be applied to the motion
vectors that can detect erratic vectors and suggest
alternatives.
• The alternative motion vectors can be used in place
of those suggested by the block match algorithm.
• Usually the candidate blocks to which they refer are first evaluated as potential matches, and a corrected motion vector is used only if its block proves suitable.
• Smoothing motion vectors, however, can add
considerable complexity to a video compression
algorithm and should only be used where the benefits
outweigh these costs.
• If frames are going to be interpolated by the receiver, then motion vector correction is likely to be worthwhile.
• Smoothing can also reduce the amount of data
required to transmit the motion vector information,
because this information is subsequently compressed
and smooth vectors can be compressed more
efficiently.
• Vector smoothing causes problems of its own.
• Smoothing can cause small objects to be coded
badly because their motion vectors might be
considered erroneous when they are in fact correct.
• Smoothing such motion vectors can adversely affect
the quality of the compressed image.
Vector coding
• Once determined, motion vectors must be assigned
bit sequences to represent them.
• Because so much of the compressed data will consist
of motion vectors, the efficiency with which they are
coded has a great impact on the compression ratio.
• In fact up to 40% of the bits transmitted by a codec
might be taken up with motion vector data.
• Fortunately, the high correlation between motion
vectors and their non-uniform distribution makes
them suitable for further compression.
• This compression must be lossless.
• Any of the general-purpose lossless compression
algorithms is suitable for coding vectors.
• The ISO/IEC video compression standard known as
MPEG specifies variable-length codes to be used for
motion vectors.
• The zero vector, for example, has a short code,
because it is the most frequently occurring.
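The mapping below sketches this idea in the spirit of the Exp-Golomb codes that H.264 uses for motion vector differences: values near zero, being the most frequent, receive the shortest codewords (the helper names are illustrative):

```python
def signed_to_unsigned(v):
    """Map 0, 1, -1, 2, -2, ... to 0, 1, 2, 3, 4, ... so that the
    zero vector component gets the shortest codeword."""
    return 2 * v - 1 if v > 0 else -2 * v

def exp_golomb(u):
    """Order-0 exponential-Golomb codeword for a non-negative integer:
    a prefix of zeros followed by the binary form of u + 1."""
    bits = bin(u + 1)[2:]
    return '0' * (len(bits) - 1) + bits

def code_mv_component(v):
    return exp_golomb(signed_to_unsigned(v))
```

The zero component codes as the single bit '1', while larger (rarer) displacements get progressively longer codewords.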
Prediction Error Coding
• Although the battery of techniques described thus far
can code video very successfully, they rarely
generate perfect replicas of the original frames.
• Thus the difference between a predicted frame and
the original uncompressed frame might be coded.
• Generally this is applied on a block-by-block basis,
and only where portions of the coded frame are
significantly different from the original.
• Transform coding is most frequently used to achieve
this, and completely lossless coding is rarely a goal.
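The block-level decision can be sketched as follows (the SAD threshold and block size are illustrative; a real codec would transform and quantise the residual rather than emit it raw):

```python
import numpy as np

def residual_blocks(original, predicted, block=8, threshold=500):
    """Yield (y, x, residual) for each block whose prediction error is
    significant; blocks that are predicted well enough are skipped."""
    h, w = original.shape
    for y in range(0, h, block):
        for x in range(0, w, block):
            diff = original[y:y+block, x:x+block].astype(int) - \
                   predicted[y:y+block, x:x+block].astype(int)
            if np.abs(diff).sum() > threshold:
                yield y, x, diff
```

Only the poorly predicted blocks generate residual data; perfectly predicted regions cost nothing beyond their motion vectors.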
• H.264/AVC/MPEG-4 Part 10 contains a number of
new features that allow it to compress video much
more effectively than older standards and to provide
more flexibility for application to a wide variety of
network environments. In particular, key features
include:
 Multi-picture inter-picture prediction including the
following features:
• Using previously-encoded pictures as references in a
much more flexible way than in past standards,
allowing up to 16 reference frames (or 32 reference
fields, in the case of interlaced encoding) to be used
in some cases. This is in contrast to prior standards,
where the limit was typically one or, in the case of
conventional "B pictures", two. This particular feature
usually allows modest improvements in bit rate and
quality in most scenes. But in certain types of scenes,
such as those with repetitive motion or back-and-forth
scene cuts or uncovered background areas, it allows
a significant reduction in bit rate while maintaining
clarity.
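Multi-picture prediction simply widens the search to several previously decoded frames and signals which one was used. A sketch (the search range, frame sizes, and function name are illustrative):

```python
import numpy as np

def best_multi_ref(target_block, refs, by, bx, search=2):
    """Search several previously decoded reference frames (H.264 allows
    up to 16) and return (ref_index, (dy, dx), sad) of the best match."""
    b = target_block.shape[0]
    best = None
    for ri, ref in enumerate(refs):
        for dy in range(-search, search + 1):
            for dx in range(-search, search + 1):
                y, x = by + dy, bx + dx
                if y < 0 or x < 0 or y + b > ref.shape[0] \
                        or x + b > ref.shape[1]:
                    continue
                sad = np.abs(target_block.astype(int)
                             - ref[y:y+b, x:x+b].astype(int)).sum()
                if best is None or sad < best[2]:
                    best = (ri, (dy, dx), sad)
    return best
```

Content that reappears after a scene cut or behind an uncovering object can be matched from an older frame instead of being re-coded.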
• Variable block-size motion compensation (VBSMC)
with block sizes as large as 16×16 and as small as
4×4, enabling precise segmentation of moving regions.
The supported luma prediction block sizes include
16×16, 16×8, 8×16, 8×8, 8×4, 4×8, and 4×4, many of
which can be used together in a single macroblock.
Chroma prediction block sizes are correspondingly
smaller according to the chroma subsampling in use.
• The ability to use multiple motion vectors per
macroblock (one or two per partition) with a
maximum of 32 in the case of a B macroblock
constructed of 16 4×4 partitions. The motion vectors
for each 8×8 or larger partition region can point to
different reference pictures.
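A toy mode decision between one 16×16 partition and four 8×8 partitions can illustrate VBSMC: the split is taken only when the extra motion vectors pay for themselves (the search range and per-vector cost here are illustrative assumptions, not values from the standard):

```python
import numpy as np

def _best_sad(tb, ref, by, bx, search=2):
    """Minimum SAD over a small search window around (by, bx)."""
    b = tb.shape[0]
    best = None
    for dy in range(-search, search + 1):
        for dx in range(-search, search + 1):
            y, x = by + dy, bx + dx
            if 0 <= y and 0 <= x and y + b <= ref.shape[0] \
                    and x + b <= ref.shape[1]:
                s = np.abs(tb.astype(int)
                           - ref[y:y+b, x:x+b].astype(int)).sum()
                if best is None or s < best:
                    best = s
    return best

def choose_partition(target, ref, by, bx, mv_cost=50):
    """Split a macroblock into four 8x8 partitions only when four
    independent searches reduce the SAD by more than the cost of the
    three extra motion vectors."""
    whole = _best_sad(target[by:by+16, bx:bx+16], ref, by, bx)
    quads = sum(_best_sad(target[by+y:by+y+8, bx+x:bx+x+8],
                          ref, by + y, bx + x)
                for y in (0, 8) for x in (0, 8))
    return '8x8' if quads + 3 * mv_cost < whole else '16x16'
```

A macroblock whose quadrants move differently is split; a uniformly moving one keeps the single, cheaper vector.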
• The ability to use any macroblock type in B-frames,
including I-macroblocks, resulting in much more
efficient encoding when using B-frames. This feature
was notably left out from MPEG-4 ASP.
• Six-tap filtering for derivation of half-pel luma sample
predictions, for sharper subpixel motion
compensation. Quarter-pixel motion is derived by
linear interpolation of the half-pel values, to save
processing power.
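The six-tap filter is the (1, −5, 20, 20, −5, 1)/32 kernel from the standard. A one-dimensional sketch of the half-pel and quarter-pel derivation (function names are illustrative):

```python
def half_pel(samples, i):
    """H.264 six-tap filter (1, -5, 20, 20, -5, 1)/32 producing the
    half-sample value between integer positions i and i+1."""
    taps = (1, -5, 20, 20, -5, 1)
    acc = sum(t * samples[i - 2 + k] for k, t in enumerate(taps))
    return min(255, max(0, (acc + 16) >> 5))   # round and clip to 8 bits

def quarter_pel(samples, i):
    """Quarter-sample value between position i and the half-sample:
    linear interpolation of the two, as the standard specifies."""
    return (samples[i] + half_pel(samples, i) + 1) >> 1
```

On a flat signal the filter is transparent, and on a ramp it lands on the geometric midpoint, which is exactly the behaviour a good interpolation filter should show.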
• Quarter-pixel precision for motion compensation,
enabling precise description of the displacements of
moving areas. For chroma the resolution is typically
halved both vertically and horizontally (see 4:2:0)
therefore the motion compensation of chroma uses
one-eighth chroma pixel grid units.
• Weighted prediction, allowing an encoder to specify
the use of a scaling and offset when performing
motion compensation, and providing a significant
benefit in performance in special cases—such as
fade-to-black, fade-in, and cross-fade transitions.
This includes implicit weighted prediction for B-frames, and explicit weighted prediction for P-frames.
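A single weighted-prediction sample can be sketched as below (the fixed-point shift of 5 is a typical weight denominator; the function itself is an illustrative simplification of the per-sample operation):

```python
def weighted_pred(ref_sample, weight, offset, shift=5):
    """Explicit weighted prediction: scale a reference sample by
    weight / 2**shift, add an offset, and clip to the 8-bit range --
    the kind of operation an encoder signals for a fade."""
    return min(255, max(0, ((ref_sample * weight) >> shift) + offset))
```

During a fade-to-black, a weight of 16 with shift 5 (a scale of 0.5) predicts the darkened frame directly from the bright one, leaving almost no residual to code.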