[Class 6: Multimedia] IN350: Document Management


[Week 10: Multimedia Management]
IN350: Document Handling and Information
Management
Judith Molka-Danielsen, Oct. 29, 2001
7/17/2015
Multimedia Services: Stages of Processing
1. Capture
Synchronize audio, video, and other media streams
Compression in real-time
2. Presentation
Decompression in real-time
Low end-to-end delay (< 150ms )
3. Storage
Quick access for forward and backward retrieval
Random access (< 1/2 s)
Multimedia Services need Compression
• Typical Example of a Video Sequence
– 24 image frames/sec
– 24 bits/pixel = 3 bytes/pixel
– Image resolution 1280 x 800 pixels
– 24 x 1280 x 800 = 24,576,000 bits = 3,072,000 bytes per frame
– At a data rate of 24 frames/sec, a 600 Mbyte CD ROM could store about 8 seconds of a movie, or about 200 still images
– Therefore, Compression is required
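The arithmetic on this slide can be checked directly. A small sketch, using the slide's assumed figures (1280x800 pixels, 24 bits/pixel, 24 frames/sec, a 600 Mbyte CD ROM counted in decimal megabytes):

```python
# Back-of-the-envelope figures from the slide, reproduced as a sanity check.
BITS_PER_PIXEL = 24
WIDTH, HEIGHT = 1280, 800
FPS = 24
CD_BYTES = 600 * 1_000_000          # 600 Mbyte CD ROM (decimal megabytes)

bits_per_frame = BITS_PER_PIXEL * WIDTH * HEIGHT   # 24,576,000 bits
bytes_per_frame = bits_per_frame // 8              # 3,072,000 bytes
bytes_per_second = bytes_per_frame * FPS           # ~73.7 Mbyte/s uncompressed

images_on_cd = CD_BYTES // bytes_per_frame         # ~200 still images
seconds_on_cd = CD_BYTES / bytes_per_second        # ~8 seconds of movie

print(bits_per_frame, bytes_per_frame)
print(images_on_cd, round(seconds_on_cd, 1))
```

This reproduces the slide's numbers: roughly 8 seconds of uncompressed movie, or about 200 stills, per CD.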
Multimedia Services need Compression
• Transmission
– At 56 kbps access to the Internet, it would take about 439 seconds (over 7 minutes) to download the previous uncompressed image.
– Both fast access to the software and fast transmission are needed.
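The download-time figure follows directly from the previous slide's frame size, assuming "56 kbps" means 56,000 bits per second of usable throughput:

```python
# Transmission time for one uncompressed 1280x800, 24-bit image over 56 kbps.
image_bits = 24 * 1280 * 800        # 24,576,000 bits (from the previous slide)
link_bps = 56_000                   # 56 kbps modem link

seconds = image_bits / link_bps     # ~439 s, i.e. a bit over 7 minutes
print(round(seconds))
```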
General Requirements for Real Time
Multimedia Services
• Low Delay (end-to-end)
• Low Complexity of compression
algorithm
• Efficient Implementation of compression
algorithm (in software, on hardware)
• High quality output after decompression
What allows compression of Images?
1. Redundancy – (Unlike text) two adjacent
rows of an image are almost identical.
2. Tolerance – The human eye is tolerant of
approximation errors in an image, so some
information loss can be irrelevant. (Not
true for text! You cannot have missing
information in financial data.)
Compression Coding Categories
1. Entropy and Universal Coding – (Lossless)
Run Length Coding – a code represents a run of identical
pixels as one compressed symbol. In a fax, a line of white
pixels is represented as one symbol. (Show example)
Huffman Coding and Adaptive Algorithms – use the
probability of occurrence of each symbol to assign shorter
codes to more frequent symbols.
Arithmetic Coding – assumes the stream of bits in a file is a
typical string and codes prefix strings. (Lempel-Ziv is a
related universal coding scheme.)
2. Source Coding – (Can be lossy)
Prediction, DCT (Discrete Cosine Transform, related to the
Fourier transform), subband coding, quantization
File formats that use both: JPEG, MPEG, H.261, DVI
RTV, DVI PLV
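The run-length example the slide asks for can be sketched as follows. This is a minimal Python illustration of the idea, not a real fax code such as ITU-T T.4:

```python
def rle_encode(pixels):
    """Collapse runs of identical values into (value, run_length) pairs."""
    runs = []
    for p in pixels:
        if runs and runs[-1][0] == p:
            runs[-1][1] += 1          # extend the current run
        else:
            runs.append([p, 1])       # start a new run
    return [(v, n) for v, n in runs]

def rle_decode(runs):
    return [v for v, n in runs for _ in range(n)]

# A fax-like scan line: mostly white (W) with a short black (B) stroke.
line = ["W"] * 10 + ["B"] * 3 + ["W"] * 5
encoded = rle_encode(line)
print(encoded)                        # [('W', 10), ('B', 3), ('W', 5)]
assert rle_decode(encoded) == line    # lossless: the line is fully recovered
```

Eighteen pixels become three (symbol, count) pairs; an all-white fax line compresses to a single pair.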
Why not use only Lossless compression on
Images (as we do with text)?
1. Image files are bigger than text files, so
they need stronger compression.
2. The human eye cannot detect small losses
in resolution.
3. We get smaller files by combining
multiple compression techniques.
(Both lossy and lossless on the same
file.)
What is Lossy Compression?
1. Simplification – In a string of pixels, neighboring
pixels might increase or decrease in brightness by
one level, which is hard for the eye to see.
So where a prefix (1,-1) would record such changes in
brightness, you replace it by (0), no change.
Then you have longer strings of 0s. Use RLE next.
2. Median filters – Replace a whole neighborhood of
pixels by the median pixel value (a mean filter uses
the average). This reduces detail. Then use RLE.
(Most changes are at the edges of objects.)
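The simplification step above can be sketched in a few lines. This is an illustrative interpretation, assuming one row of 8-bit brightness values, where changes within a tolerance of one level are flattened so that RLE can then compress the long constant runs:

```python
def simplify(pixels, tolerance=1):
    """Zero out pixel-to-pixel changes of at most `tolerance` brightness
    levels. Lossy: variations the eye can hardly see are discarded."""
    out = [pixels[0]]
    for p in pixels[1:]:
        # Keep the previous value if the change is within tolerance.
        out.append(out[-1] if abs(p - out[-1]) <= tolerance else p)
    return out

# Small flicker around 100, then a real edge up to ~180.
row = [100, 101, 100, 100, 99, 100, 180, 181, 180]
print(simplify(row))   # [100, 100, 100, 100, 100, 100, 180, 180, 180]
```

The output has only two runs instead of eight distinct values, so run-length coding afterwards is far more effective; the true edge at the object boundary is preserved.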
Source Coding – Discrete Cosine Transform
This type of coding takes into account specific image
characteristics and human visual sensitivity.
Luminance (brightness): the overall response to the
total light energy, dependent on the wavelength of the
light. (How light or dark the pixel is.)
Saturation: how pure the color is, or how diluted it
is by the addition of white.
The human eye is more sensitive to certain
wavelengths than to others (more to yellow and green;
less to red and violet).
DCT – the basic image unit is an 8x8 pixel block.
Its 64 pixel values are transformed into a set of DCT
weights for these characteristics.
Discrete Cosine Transformation (DCT)
DCT transforms the image from the spatial
domain to the frequency domain.
Typical pictures have minimal changes in colour or
luminance between two adjacent pixels.
The frequency representation describes the amount
of variation: the coefficients for the high
frequencies are small, while the big coefficients of
the low frequencies contain most of the information.
Coefficients are the weights. Remove the small
coefficients and round off the coefficient values,
and less data can represent the whole image.
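The effect of dropping small coefficients can be shown on a single 8-pixel row. This is a pure-Python sketch of an orthonormal DCT-II and its inverse, not JPEG's actual 2-D implementation; the threshold of 1.0 is an arbitrary illustrative choice:

```python
import math

def dct(x):
    """Orthonormal DCT-II: spatial samples -> frequency coefficients."""
    N = len(x)
    return [
        (math.sqrt(1 / N) if k == 0 else math.sqrt(2 / N))
        * sum(x[n] * math.cos(math.pi / N * (n + 0.5) * k) for n in range(N))
        for k in range(N)
    ]

def idct(X):
    """Orthonormal DCT-III, the inverse transform."""
    N = len(X)
    return [
        sum(
            (math.sqrt(1 / N) if k == 0 else math.sqrt(2 / N))
            * X[k] * math.cos(math.pi / N * (n + 0.5) * k)
            for k in range(N)
        )
        for n in range(N)
    ]

# A smooth 8-pixel row: adjacent values barely change.
row = [52, 54, 55, 56, 56, 55, 54, 53]
coeffs = dct(row)
# Most energy sits in the low-frequency coefficients; zero the small ones.
kept = [c if abs(c) > 1.0 else 0.0 for c in coeffs]
approx = idct(kept)
# The reconstruction stays close to the original despite the discarded data.
print([round(v) for v in approx])
```

For this row only two of the eight coefficients survive the threshold, yet the reconstructed pixels differ from the originals by at most about one brightness level.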
Discrete Cosine Transformation (DCT)
DCT can be used for photos.
DCT cannot be used for vector graphics or two-colour
pictures (black text, white background).
DCT can be a lossless technique: use all
coefficients and do not round off values.
Or lossy: information can be cut out, and at the
quantization step the divisors selected for the
matrix can reduce precision.
Compression for images, video and audio
Basic Capture Encoding Steps:
1. Image Preparation – choose resolution and frame
rate. (Choose a compression format.)
2. Image Processing – DCT. Lossy or lossless.
3. Quantization – map the transformed values onto a
reduced set of discrete levels; this is where
precision is lost.
4. Second-stage Encoding – use Run Length
Encoding or Huffman coding on the digital string.
Lossless.
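Step 3 can be sketched for one row of DCT coefficients. The step sizes below are illustrative only, not taken from a real JPEG quantization table:

```python
# Quantization: divide each coefficient by a step size and round.
# This rounding is the irreversible, lossy part of the pipeline.
coeffs = [153.8, -0.5, -3.6, -0.4, -0.35, -0.28, -0.42, -0.1]
steps  = [16, 11, 10, 16, 24, 40, 51, 61]   # coarser steps at high frequencies

quantized = [round(c / s) for c, s in zip(coeffs, steps)]
print(quantized)          # long runs of zeros: ideal input for RLE/Huffman

# The decoder can only multiply back, so the rounding error remains.
dequantized = [q * s for q, s in zip(quantized, steps)]
print(dequantized)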
JPEG (Joint Photographics Experts Group)
JPEG allows different resolutions of individual
components (interleaved / non-interleaved) using lossless
and lossy modes.
(There are 29 modes for compressing images.)
Lossy JPEG uses 1. simplification by DCT (the lossy stage) and 2.
entropy coding by Huffman or RLE (the lossless stage).
There are 7 predictive methods for lossless JPEG; the main categories:
1. Predict that the next pixel on a line is the same as the previous one on the same line.
2. The next pixel on a line is the same as the one directly above it.
3. The next pixel on a line is related to the previous 3 (e.g., their average).
After the predicted pixel value is formed, it is compared with the
actual pixel value, and the differences are encoded. If the differences
are large you save nothing, but no information is lost.
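The predictor families above can be sketched as follows. This is a simplified illustration assuming an 8-bit grayscale image stored as rows of ints, with out-of-image neighbors treated as 0; real lossless JPEG defines the boundary cases and predictors more precisely:

```python
def residuals(img, mode):
    """mode 1: predict from the left pixel; mode 2: from the pixel above;
    mode 3 (here): from the average of the two. Only the residuals would
    be entropy coded; the decoder rebuilds each pixel exactly."""
    out = []
    for r, row in enumerate(img):
        for c, actual in enumerate(row):
            left = row[c - 1] if c > 0 else 0
            above = img[r - 1][c] if r > 0 else 0
            if mode == 1:
                pred = left
            elif mode == 2:
                pred = above
            else:
                pred = (left + above) // 2
            out.append(actual - pred)   # small values cluster near zero
    return out

img = [[100, 101, 102],
       [101, 102, 103],
       [102, 103, 104]]
print(residuals(img, 1))
```

On this smooth gradient almost every residual is 1, so the entropy coder that follows sees a highly skewed, very compressible distribution.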
[Figure] JPEG image – original 921,600-byte file. Left: 252,906 bytes
using JPEG lossless compression. Right: 33,479 bytes, at the first level
of lossy compression that a person can detect.
JPEG
[Figure: the same image at 22,418, 14,192, and 8,978 bytes; compression
ratios 41.1:1, 64.9:1, 102.6:1; distortions apparent.]
GIF (Graphics Interchange Format)
GIF was developed by CompuServe to send images over
telephone lines.
It allows interlacing so that a downloading image
appears faster.
It supports only 256 colors or shades of gray.
It uses Lempel-Ziv-Welch (LZW) universal coding to encode
whole bytes. This is faster than LZ coding that works at
the bit level.
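The byte-oriented LZW idea can be sketched briefly. This is a simplified illustration: real GIF adds variable-width codes, clear codes, and an initial code size tied to the palette, all omitted here:

```python
def lzw_encode(data):
    """Byte-oriented LZW, simplified. The dictionary starts with all 256
    single bytes and learns longer phrases as it reads the input."""
    table = {bytes([i]): i for i in range(256)}
    next_code = 256
    out, current = [], b""
    for byte in data:
        candidate = current + bytes([byte])
        if candidate in table:
            current = candidate          # extend the current phrase
        else:
            out.append(table[current])   # emit code for longest known phrase
            table[candidate] = next_code # learn the new phrase
            next_code += 1
            current = bytes([byte])
    if current:
        out.append(table[current])
    return out

codes = lzw_encode(b"ABABABAB")
print(codes)                             # [65, 66, 256, 258, 66]
```

Eight input bytes become five codes; on a repetitive image row (the typical GIF case) the learned phrases grow quickly and the savings compound.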
Digital Video – a sequence of images
It takes 50 images/second for full motion
video.
Each image in sequence closely resembles the
previous image.
Edges of moving objects change the most.
Different applications have different needs.
Digital Video – Needs
Digital Movie Editing – accuracy
Digitized TV – low quality image, high bit
rate
HDTV – higher quality image, higher bit rate
Video Discs (DVD) – high-capacity storage, TV
quality; (CD ROM) lower-capacity storage.
Internet Video – low access speed, only
support low quality picture
Video Teleconferencing – low quality picture,
but real time, audio cannot take delays.
MPEG (Moving Pictures Experts Group)
Algorithms for Compression of Audio and Video:
A movie is presented as a sequence of frames,
the frames typically have temporal
correlations
If parts of frame #1 do not change then they do not need
to be encoded in Frame #2 and Frame #3 again.
Motion vectors describe the re-use of a 16x16 pixel
area in other frames.
New areas are described as differences.
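The motion-vector search can be illustrated with a toy example. This is a 1-D sketch of block matching by sum of absolute differences (SAD); real MPEG encoders search 16x16 macroblocks over a 2-D window:

```python
def best_match(prev_block, frame, max_shift=4):
    """Return (shift, SAD) where prev_block best matches `frame`.
    The winning shift is the motion vector; SAD measures the residual."""
    best = (None, float("inf"))
    for shift in range(0, max_shift + 1):
        if shift + len(prev_block) > len(frame):
            break
        sad = sum(abs(a - b) for a, b in
                  zip(prev_block, frame[shift:shift + len(prev_block)]))
        if sad < best[1]:
            best = (shift, sad)
    return best

prev_block = [10, 20, 30, 40]
frame = [0, 0, 10, 20, 30, 40, 0, 0]   # the block moved 2 pixels right
print(best_match(prev_block, frame))
```

A shift of 2 gives a SAD of zero, so the encoder can send just the motion vector instead of re-coding the block.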
MP3 is MPEG-1 Audio Layer III; Layers I and II are
simpler audio coders in the same standard.
MPEG
Single frame from a high-quality 5.26-second video sequence,
compressed from an original size of 5.648 Mbytes to 2.437 Mbytes
with the MPEG-1 compressor, set to preserve essentially all of
the quality of the video sequence. See the distortions in the
single frame.
MPEG frame types
MPEG frame types with different temporal
correlations:
(I-Frames) Intra-coded full frames are transmitted
periodically.
(P-Frames) predictive frames and (B-Frames)
bidirectional frames exploit temporal correlations:
P-Frames predict from the "past", while B-Frames use
both "past" and "future" reference frames.
MPEG modes
MPEG-1: compressed video for storage on CDs,
40 kbps – 1.2 Mbps. Used on websites; can store
movies on CD discs.
MPEG-2: for high-quality video or HDTV, 4–10 Mbps.
Can store full-length high-quality movies
on DVD discs.
MPEG-4: for low-quality video teleconferencing.
Supported at low bit rates, 4.8 kbps – 64 kbps.
Other Video Compression standards
H.261 (Px64):
• Video compression for video conferences.
• Compression in real time for use with ISDN.
• Compressed data stream = p x 64 kbps, where
p = 1 to 30. There are 2 resolutions:
• CIF (Common Intermediate Format)
• QCIF (Quarter CIF)
Digital Video Interactive (DVI):
Intel/IBM technology for digital video.
• DVI PLV (Production Level Video, for VCR quality),
• DVI RTV (Real Time Video, lower quality).
Storage Options – see additional notes.
Disc Technology:
• CD ROM, RW, etc.
• DVD
System Requirements for real time
environments
1. The processing time for all system
components must be predictable
2. Reservation of all types of resources
must be possible
3. Short response time for time critical
processes
4. The system must be stable at high load
Problems with continuous media
Computers handle data as discrete, so continuous
media like video and audio must be treated by the
computer as periodic and discrete.
There are few file system services for continuous
media applications.
Kernel and hardware availability varies over
time, so it is difficult to schedule processes, and
many interrupts must be handled by the system.
Reservation of resources, like CPU time, is not
possible on a standard desktop PC.
Delivery delays on the network make
scheduling the periodic processing of the media
difficult.
Possible responses to Problems
The hardware architecture and the system
software in desktop computers are not
adapted for handling continuous media.
• Reserve network bandwidth, if you can.
• Hardware support: use a dedicated server,
or replace the single asynchronous bus.
• Operating system software is not suited to
scheduling multimedia services or reserving
resources, but application-level languages
such as SMIL can help.
• SMIL (Synchronized Multimedia Integration
Language) is an XML-based language for mixing
media presentations even over low-speed
connections (>= 28.8 kbps). See www.w3c.org