MPEG

• MPEG-Video: This deals with the compression of video signals to about 1.5 Mbits/s.

• MPEG-Audio: This deals with the compression of digital audio signals at rates of 64, 128 or 192 kbits/s per channel.

• MPEG-System: This deals with the synchronisation and multiplexing of multiple compressed audio and video bit streams.

MPEG Versions

• MPEG 1 for CD-ROM.

• MPEG 2 for broadcast quality.

• Some other MPEG standards (e.g. MPEG7) are not compression systems.

• We will be talking about MPEG 2

Data and Compression rates

• MPEG1 video has data rates up to 1.5 Mbits/s.

• MPEG2 video has data rates between 3-15 Mbits/s for broadcast and 15-30 Mbits/s for high definition.

• The commercial uncompressed digital video data stream (SDI) has a data rate of 270 Mbits/s, although this includes audio and provision for 10-bit video coding.

Data and Compression rates

• However, consider transmitting just monochrome television pictures of size 576 x 720 at 25 frames per second and 8 bits resolution. The data rate is 576 x 720 x 25 x 8 = 82.944 Mbits/s.

• We have to double this (at least) for colour, giving nearly 166 Mbits/s.

• Therefore, at 1.5 Mbits/s, MPEG can give a compression ratio of 100:1 or more.
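
The arithmetic on this slide can be checked with a short sketch. The variable and function names are illustrative only, and the 1.5 Mbits/s figure is the MPEG1 rate quoted earlier.

```python
# Raw data rate of the monochrome example from the slide.
WIDTH, HEIGHT = 720, 576      # picture size in pixels
FRAME_RATE = 25               # frames per second
BITS_PER_SAMPLE = 8           # bits resolution

mono_rate = WIDTH * HEIGHT * FRAME_RATE * BITS_PER_SAMPLE   # bits per second
colour_rate = 2 * mono_rate                                 # "double this (at least)"


def compression_ratio(raw_bits_per_s, coded_mbits_per_s):
    """Ratio of the raw data rate to the coded data rate."""
    return raw_bits_per_s / (coded_mbits_per_s * 1_000_000)


print(mono_rate / 1e6)                                  # 82.944
print(colour_rate / 1e6)                                # 165.888
print(round(compression_ratio(colour_rate, 1.5), 1))    # 110.6
```

With the same raw figure, MPEG2 broadcast rates of 3-15 Mbits/s correspond to ratios of roughly 55:1 down to 11:1.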

Spatial and temporal redundancy

• MPEG makes use of temporal and spatial redundancy.

• Temporal redundancy means that we are unnecessarily transmitting the same information (data) over time.

– E.g. backgrounds do not need to be sent every frame.

Spatial and temporal redundancy

• Spatial redundancy means that we are unnecessarily transmitting detail information (spatial information) which cannot be perceived by the eye.

• This is what JPEG does on still images.

• By not carrying this unnecessary (redundant) information we can achieve compression.

• Note that while the spatial compression is lossy, the temporal compression is not.

Comparison with JPEG

• MPEG uses the same spatial compression method as JPEG.

• The temporal compression uses other techniques.

Overview of MPEG

• MPEG takes incoming frames and produces a spatially compressed image.

• MPEG also predicts motion in the scene and estimates where blocks of pixels have moved to in another frame.

• MPEG can then transmit vector (or motion) information only to predict the next frame.

• However, since the prediction can be inaccurate, MPEG also transmits an error picture (spatially compressed) along with the vector predictions.

Prediction and macroblocks

• MPEG divides each frame into blocks of size 16 x 16 pixels called macroblocks.

• The idea is to find where in the predicted frame the pixels of each block in the reference frame have moved to.

• This can be done by comparing each macroblock in the reference frame with possible positions in the predicted frame and finding the closest match.

• We then send a prediction “vector” which describes the movement of each block.
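
The search described on this slide can be sketched as follows. This is a minimal illustration, not the method of any particular encoder: the slides do not say how "closest match" is measured, so the sum of absolute differences (SAD), the exhaustive search window and all function names are assumptions. The example uses 2 x 2 blocks instead of 16 x 16 to keep the frames small.

```python
def sad(frame_a, ay, ax, frame_b, by, bx, size):
    """Sum of absolute differences between two size x size blocks."""
    total = 0
    for dy in range(size):
        for dx in range(size):
            total += abs(frame_a[ay + dy][ax + dx] - frame_b[by + dy][bx + dx])
    return total


def find_motion_vector(source, target, top, left, size=16, search=7):
    """Find (dy, dx) so that the block at (top, left) in `source` best
    matches the block at (top + dy, left + dx) in `target`."""
    height, width = len(target), len(target[0])
    best_vector, best_cost = (0, 0), None
    for dy in range(-search, search + 1):
        for dx in range(-search, search + 1):
            y, x = top + dy, left + dx
            if y < 0 or x < 0 or y + size > height or x + size > width:
                continue  # candidate block would fall outside the picture
            cost = sad(source, top, left, target, y, x, size)
            if best_cost is None or cost < best_cost:
                best_vector, best_cost = (dy, dx), cost
    return best_vector, best_cost


def make_frame(top, left, n=8, size=2, value=200):
    """Black n x n frame with one bright size x size square (test data)."""
    frame = [[0] * n for _ in range(n)]
    for y in range(top, top + size):
        for x in range(left, left + size):
            frame[y][x] = value
    return frame


source = make_frame(2, 2)   # square at row 2, column 2
target = make_frame(3, 5)   # same square, moved 1 down and 3 right
vector, cost = find_motion_vector(source, target, top=2, left=2, size=2, search=3)
print(vector, cost)         # (1, 3) 0
```

A cost of zero means the block was found exactly; in real pictures the best match usually leaves a non-zero cost, which is why an error picture is also needed.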

Difference pictures

• Unfortunately, 16 x 16 blocks are quite large, so it is unlikely that all the pixels in a block will have moved together to the same new position.

• There will generally, therefore, be errors in the prediction made by moving blocks around.

• We know the error at the sending end, because it is simply the difference between the actual picture and the predicted picture.

• So if we send the error as well as the prediction, we can reconstruct the actual picture.
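
The relationship between the actual picture, the predicted picture and the error can be sketched with small arrays. The pixel values and function names are made up for illustration.

```python
def difference_picture(actual, predicted):
    """Error picture: actual minus predicted, pixel by pixel."""
    return [[a - p for a, p in zip(row_a, row_p)]
            for row_a, row_p in zip(actual, predicted)]


def reconstruct(predicted, error):
    """Decoder side: predicted plus error gives back the actual picture."""
    return [[p + e for p, e in zip(row_p, row_e)]
            for row_p, row_e in zip(predicted, error)]


actual = [[52, 55, 61], [59, 79, 61]]
predicted = [[50, 55, 60], [60, 80, 61]]

error = difference_picture(actual, predicted)
print(error)                                      # [[2, 0, 1], [-1, -1, 0]]
print(reconstruct(predicted, error) == actual)    # True
```

The reconstruction is exact here because the error is sent unchanged. In MPEG the error picture is itself spatially compressed, so the decoder's result is close to the actual picture rather than identical.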

I-Frames

• I (Intrapicture) – I-frames do not use any motion prediction. They use spatial compression only. That is, the complete frame is transmitted in a JPEG-like form.

• They are needed for several reasons, including:

– To start an MPEG sequence off (since there is nothing from which to predict the first frame).

– So that an MPEG stream may be joined at a point other than the start.

– To recover from errors and degradation caused by repeated reference to previous frames.

• Sometimes called keyframes.

P-Frames

• P (Predicted Picture) – P-frames send only motion prediction information and a spatially compressed error picture.

• The actual frame is constructed from a previous frame, with the pixels in the “macroblocks” moved to their new location.

• Since this may be far from perfect, the compressed error picture is added to compensate.

• The previous frame could be an I frame or another P-frame.

• If nothing moves in the scene, the P-frame information is zero and the constructed frame is the same as the previous one (maximum compression).
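
A decoder-side sketch of building a P-frame is given here under stated assumptions: blocks are 2 x 2 rather than 16 x 16, each vector points from a block in the new frame back to where its pixels sit in the previous frame, and the dictionary layout and function name are hypothetical.

```python
def reconstruct_p_frame(previous, vectors, error, size):
    """Copy each block from `previous` at the offset given by its vector,
    then add the error picture."""
    height, width = len(previous), len(previous[0])
    frame = [[0] * width for _ in range(height)]
    for (top, left), (dy, dx) in vectors.items():
        for y in range(size):
            for x in range(size):
                moved = previous[top + dy + y][left + dx + x]
                frame[top + y][left + x] = moved + error[top + y][left + x]
    return frame


previous = [[1, 2, 0, 0],
            [3, 4, 0, 0],
            [0, 0, 0, 0],
            [0, 0, 0, 0]]
zero_error = [[0] * 4 for _ in range(4)]

# Nothing moves: all vectors and the error picture are zero.
still = {(0, 0): (0, 0), (0, 2): (0, 0), (2, 0): (0, 0), (2, 2): (0, 0)}
print(reconstruct_p_frame(previous, still, zero_error, 2) == previous)   # True

# The top-left block moves two pixels right; one pixel needs correcting.
moving = {(0, 0): (2, 0), (0, 2): (0, -2), (2, 0): (0, 0), (2, 2): (0, 0)}
error = [[0, 0, 1, 0], [0, 0, 0, 0], [0, 0, 0, 0], [0, 0, 0, 0]]
print(reconstruct_p_frame(previous, moving, error, 2)[0])                # [0, 0, 2, 2]
```

The first case is the still-scene situation from the slide: with zero vectors and zero error the constructed frame equals the previous one.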

B-Frames

• Imagine the situation where an object moves to reveal a (stationary) background.

• Since this background may be fully revealed in a later frame, we could use this future frame as a reference and predict previous frames backwards from it.

• Also, if we know the positions of blocks in future and previous frames, we can predict intermediate frames.

• B (Bi-directional prediction) – Allows interpolation and prediction from both previous and future (I and P) frames.

• B-frames allow the most compression.
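
The revealed-background case can be illustrated with a small sketch that picks, for one block, whichever of three candidate predictions is closest to the actual block. Choosing by sum of absolute differences, the rounding used in the average and all names here are assumptions for illustration; motion vectors are ignored to keep the example short.

```python
def sad(block_a, block_b):
    """Sum of absolute differences between two equally sized blocks."""
    return sum(abs(a - b)
               for row_a, row_b in zip(block_a, block_b)
               for a, b in zip(row_a, row_b))


def average(block_a, block_b):
    """Interpolated prediction: rounded mean of the two reference blocks."""
    return [[(a + b + 1) // 2 for a, b in zip(row_a, row_b)]
            for row_a, row_b in zip(block_a, block_b)]


def choose_prediction(actual, previous_block, future_block):
    """Return the best prediction mode and the predicted block."""
    candidates = {
        "forward": previous_block,       # predicted from the previous frame
        "backward": future_block,        # predicted from the future frame
        "interpolated": average(previous_block, future_block),
    }
    mode = min(candidates, key=lambda name: sad(actual, candidates[name]))
    return mode, candidates[mode]


previous_block = [[200, 200], [200, 200]]   # object still covers the background
future_block = [[50, 50], [50, 50]]         # background fully revealed
actual = [[50, 50], [50, 50]]               # intermediate frame: already revealed

mode, prediction = choose_prediction(actual, previous_block, future_block)
print(mode)                                 # backward
```

Here the future frame predicts the block perfectly, whereas the previous frame still shows the object, so backward prediction leaves no error to send.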

B-Frames

• There are clearly associated problems with bi-directional frames.

• We have to wait for future incoming video before they can be coded. This causes delay.

• We have to transmit future frames before intermediate B-frames so that the decoder has the future and previous references available to construct the actual frame from the B-frames.

Groups of pictures (GOP)

• The MPEG sequence therefore consists of a combination of I-, P- and B-frames.

• This sequence is called a group of pictures (GOP).

• Usually the group repeats (but it does not have to); for example, a typical group of 12 frames is:

– B1 B2 I3 B4 B5 P6 B7 B8 P9 B10 B11 P12

– The numbers indicate the original video frame order.

Groups of pictures (GOP) (order of sending)

• However, as indicated above the order is different in the actual bit stream because frames cannot be predicted without the appropriate reference.

• The corresponding sending order (bitstream) would therefore be:

– I3 B1 B2 P6 B4 B5 P9 B7 B8 P12 B10 B11
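
The reordering can be sketched as a simple rule: hold each B-frame back until the next I- or P-frame has been sent. The function name and the string labels are illustrative, and the rule is one plausible way to obtain the order shown on the slide.

```python
def display_to_bitstream_order(display_order):
    """Reorder frames so every B-frame follows the reference frame
    (I or P) that comes after it in display order."""
    bitstream, pending_b = [], []
    for frame in display_order:
        if frame.startswith("B"):
            pending_b.append(frame)       # wait for the future reference
        else:
            bitstream.append(frame)       # send the reference first
            bitstream.extend(pending_b)   # then the B-frames that needed it
            pending_b = []
    # Any B-frames left over would really wait for the next group's
    # reference frame; this sketch simply appends them.
    bitstream.extend(pending_b)
    return bitstream


gop = ["B1", "B2", "I3", "B4", "B5", "P6", "B7", "B8", "P9", "B10", "B11", "P12"]
print(" ".join(display_to_bitstream_order(gop)))
# I3 B1 B2 P6 B4 B5 P9 B7 B8 P12 B10 B11
```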

Exercise

• A video sequence is coded using the following GOP:

– B3 B4 P1 P2 I5

• Suggest a suitable corresponding bitstream sequence.

Quality of service and variable quantisation.

• The amount of redundancy (both spatial and temporal) in moving video pictures varies, depending on the programme content.

• Sometimes almost no data is transmitted, for example for a still frame, while in action sequences the amount of data produced is large.

• It is desirable to produce a constant data rate.

Quality of service and variable quantisation.

• The data is therefore buffered (stored) and often transmitted at a constant rate.

• This allows the system to nearly fill the buffer when the data produced is large, but operate with an empty buffer when little data is produced.

• Sometimes, when there is a lot of change between one frame and the next, the buffer would overflow if some action were not taken to prevent this from happening.
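
The buffering behaviour can be sketched with a toy simulation. All the numbers (frame sizes, output rate, buffer capacity) are hypothetical and in arbitrary units, and the function name is illustrative.

```python
def simulate_buffer(bits_per_frame, drain_per_frame, capacity):
    """Return (fullness, overflowed) after each frame when the buffer
    is emptied at a constant rate."""
    fullness = 0
    history = []
    for frame_bits in bits_per_frame:
        # The coder adds this frame's data; the channel removes a fixed
        # amount. An empty buffer cannot go below zero.
        fullness = max(0, fullness + frame_bits - drain_per_frame)
        history.append((fullness, fullness > capacity))
    return history


frame_sizes = [10, 20, 180, 260, 300, 40]   # quiet scene, then action
history = simulate_buffer(frame_sizes, drain_per_frame=100, capacity=250)
for number, (fullness, overflowed) in enumerate(history, start=1):
    print(number, fullness, "OVERFLOW" if overflowed else "ok")
```

In this made-up run the buffer stays empty during the quiet frames, nearly fills on the fourth frame, and would overflow on the fifth unless the coder reduced the amount of data it produces.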

Quality of service and variable quantisation.

• The system therefore applies larger quantisation steps to the DCT coefficients (rejecting more high frequency components) when this happens, to prevent system failure.

• Sometimes only the dc component remains.

• This results in poorer quality pictures (blocking and smearing) at times of low spatial and temporal redundancy.
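
The effect of a larger quantisation step can be sketched on a made-up list of DCT coefficients, ordered with the dc component first. A single uniform step is a simplification assumed here for illustration; the coefficient values and function names are hypothetical.

```python
def quantise(coefficients, step):
    """Divide each coefficient by the step and round to an integer level."""
    return [round(c / step) for c in coefficients]


def dequantise(levels, step):
    """Decoder side: scale the integer levels back up."""
    return [level * step for level in levels]


coefficients = [624, -95, 40, -18, 9, 3, -2, 1]   # dc first, then rising frequency

fine = quantise(coefficients, 8)
coarse = quantise(coefficients, 256)
print(fine)                      # [78, -12, 5, -2, 1, 0, 0, 0]
print(coarse)                    # [2, 0, 0, 0, 0, 0, 0, 0]
print(dequantise(coarse, 256))   # [512, 0, 0, 0, 0, 0, 0, 0]
```

With the small step, only the highest frequencies are lost. With the large step only the dc component survives, so the block is reconstructed as a single flat value, which is the source of the blocking described on the slide.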

Quality of service and variable quantisation.

• This can be seen on most digital television systems.

• Therefore the quality of service depends on the (previously agreed) output data rate.

Further reading.

• www.mpeg.org

• The Art of Digital Video, Watkinson, Focal Press.

• www.snellwilcox.com/reference/pdfs/ecomp.pdf