Media Data and File Formats
Howell Istance
© De Montfort University, 2001

Text Files
• 1. Plain text (unformatted)
  – The ASCII character set is the most common; 7 bits are used
  – This can represent 128 code words
    • A = 1000001
    • a = 1100001
• Computers store data in bytes
• The extra bit can be used for parity or extended character sets:
  – Error detection
    • A parity bit is used
    • 10000011 (odd parity)
  – Extend code words to 256
    • IBM's EBCDIC
• 2. Formatted text
  – Characters are used to give both text and formatting information
    • Bold, italic, position, etc.
    • Also contains information on page numbers, version, index, etc.
  – Formatted files are usually much larger than their plain text equivalent

Vector Graphics and Bitmapped Images
• Vector graphics: image represented and stored as a collection of shapes, together with data (parameters) defining how the shapes will be produced and where they will be located
• Bitmapped images: image represented and stored as a collection of pixels which, when displayed, make up the image

Bitmapped images
Pixels make up an image

Colour Depth
#FF00A3 = 255,0,163 = 11111111,00000000,10100011
Each pixel has a colour depth. A certain number of bits is used to define a pixel's colour, e.g. held as an RGB value.
For 256 (or fewer) colours a palette is used as an index describing which 256 actual colours to use in an image.
(figure label: 240,200,171)

Models for bitmapped images
• Model consists of a 2-D array of pixel values
• May be of a different size and colour depth from the image which will be finally displayed.
• A view of the image displayed on screen in an image editor is not the model; the view has been transformed, clipped and displayed
• You do not see the model

Models in vector graphics
[Figure: x and y axes with origin (0,0)]
• Model is a series of mathematical constructs, together with data to define the location, size and attributes of each, such as colour, line style…
• Constructs include
  – shapes (rectangles, ovals, lines)
  – curves
  – polygons (sets of points, given as coordinate pairs, with lines joining consecutive points), polylines
  – polygon meshes (sets of points with instructions to show which points are to be joined by lines)

Storing models… as bitmapped images
[Figure: rectangle, 4.5 cm per side]
• If the rectangles measure 4.5 cm, then on a 72 dpi (dots per inch) monitor, each side will contain 128 pixels (4.5 cm ≈ 1.77 inches, and 1.77 * 72 ≈ 128)
• The image will contain 128 * 128 = 16384 pixels
• If 3 bytes are used to store each pixel value, then 16384 * 3 bytes = 49152 bytes are required
• Size is constant regardless of the complexity of the image within the 128 pixel square

Storing models… as vector graphics
[Figure: rectangle, 4.5 cm per side]
(PostScript)
    0 1 0 setrgbcolor
    0 0 128 128 rectfill
    1 0 1 setrgbcolor
    32 32 64 64 rectfill
• 78 bytes required!
• But a PostScript renderer is required, which slows the display process and has to be available on the host machine
• Size increases as the complexity of the image increases, as more instructions are needed to define the image

Representation as vector graphics…
• Vector graphics enable images to be composed of filled shapes
• Each object can be manipulated individually
• Scaling objects is easy (by applying mathematical transforms to the object definition)

Distorted poppy…
Easy to manipulate individual elements of the image here…

Vector representation of complex images
• To approach a realistic image, a complex definition of gradient meshes is required
• File size approx. 10 MB
• Generated in Illustrator
• Taken from Wiley book site

Rendered as a bitmapped image
• File size of this image is 152K
• No longer possible to interact with separate components
• Edits and application of effects are done on the vector version and the end result is saved as a bitmapped image.

GIF Files
• Image files hold a lot of data
  – Image files tend to be large files
• To reduce storage space, COMPRESSION techniques are used
• One solution is RUN LENGTH ENCODING
  – Count the number of pixels that are the same
  – Decoder uses this count to copy the original pixel X times

GIF Files
• Developed by CompuServe
• Used for single or multiple images
• Based on LZW compression
  – Lempel and Ziv invented the original algorithm
  – Welch developed it further
• Replaces multiple strings of data with a TOKEN… and a count value
• LZW can give reasonable compression (about 50%)

Compression - LZW / RLE
Runs of colour can be defined in a simple Run / Colour / Number format, e.g. R0206 for black, R0304 for gray, RF005 for green, R04FE for red, R0203 for black, R0614 for blue, taken from the palette below.
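The run-length idea described above (count identical pixels, then let the decoder repeat the pixel that many times) can be sketched in a few lines. This is only an illustration under stated assumptions: the function names `rle_encode` and `rle_decode`, the (palette index, count) pair layout and the sample row are hypothetical, and this is not the on-disk format of GIF or TIF files, nor a decoding of the R.... codes on the slide.

```python
def rle_encode(pixels):
    """Collapse a sequence of palette indices into (index, run_length) pairs."""
    runs = []
    for p in pixels:
        if runs and runs[-1][0] == p:
            # Same colour as the previous pixel: extend the current run.
            runs[-1] = (p, runs[-1][1] + 1)
        else:
            # Colour changed: start a new run of length 1.
            runs.append((p, 1))
    return runs


def rle_decode(runs):
    """Rebuild the pixel sequence by repeating each index run_length times."""
    pixels = []
    for index, count in runs:
        pixels.extend([index] * count)
    return pixels


# Hypothetical row of 15 pixels using palette entries 2, 3 and 5.
row = [2] * 6 + [3] * 4 + [5] * 5
encoded = rle_encode(row)
print(encoded)                      # three runs instead of fifteen pixel values
print(rle_decode(encoded) == row)   # the round trip is lossless
```

Rows with long runs of one colour shrink a lot, while a row in which every pixel differs from its neighbour grows, because each pixel then costs an index plus a count.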
[Figure: the palette, colours numbered 1 to 6]

GIF Files
• Decompression is fairly quick
• Universal standard
• Not optimised for image compression
• UNISYS holds a patent on LZW, so there may be a problem with royalties

Compression - LZW / RLE
[Figure: two example images with their file sizes]
• TIF uncompressed = 289k; TIF LZW compressed = 248k
• TIF uncompressed = 90k; TIF LZW compressed = 5k

JPEG Files
• Joint Photographic Experts Group
• Uses a Fourier-related transform (the discrete cosine transform) to eliminate high frequency components in the image
• Uses several algorithms including run-length encoding
• Can be lossy
  – blockiness
  – posterisation
  – ringing

Digital Video
• We see a sequence of still images as continuous movement if the rate of presentation is greater than the critical flicker fusion frequency (about 40 images/sec)
• Film is shot at 24 frames/second, but each frame is effectively shown twice in projection, giving a refresh rate of 48 frames/second
• 3 broadcast TV and video standards
  – NTSC: US, Canada
  – PAL: Europe, Australia
  – SECAM: France, Eastern Europe

Interlacing and Frame rates
• For each system, the refresh rate is double the broadcast frame rate, achieved by showing half of one frame followed by the other half
• NTSC (30 frames/second), PAL and SECAM (25 frames/second)

Image sizes and Data Rates…
• Consider the amount of data needed to represent a sequence of digital images at NTSC broadcast rate, if 3 bytes are used to represent a pixel value
  – 640 * 480 * 3 = 900K per image
  – 1 second at 30 frames/second = 26 MB
  – 1 minute at 30 frames/second = 1.6 GB
• For PAL/SECAM, similar
  – 768 * 576 * 3 = 1296K per image
  – 1 second at 25 frames/second = 31 MB
  – 1 minute at 25 frames/second = 1.85 GB
• Data (transfer) rate = data amount per unit time (e.g. 26 MB/sec for NTSC, 31 MB/sec for PAL)

Compression techniques
• Can't assume that devices (camera, video card) used to digitise
images will be available for playback on the end user machine
• Need to provide a software codec to apply a compression technique suited to the capabilities of the end user machine
• All techniques operate on a sequence of bitmapped images
• Video data is normally compressed twice:
  – when captured (hardware codec); real-time compression is needed
  – in order to be transmitted (software codec)

Intra- and Inter-frame compression
• Spatial (intra-frame) compression compresses each frame in isolation
  – Lossy techniques are applied, leading to some loss of image quality
• Temporal (inter-frame) compression calculates and compresses the differences between frames in a sequence
  – 1 key frame + a succession of (usually) 6 difference frames
  – A difference frame contains the difference between the original frame and the preceding key frame or preceding difference frame
• Time to compress may be (much) longer than time to decompress: an asymmetric codec
• Fast decompression times are important

Software codecs
• Four main contenders for compressing video for delivery on CD-ROM or via the internet: Cinepak, Intel Indeo, Sorenson and MPEG-1
• Cinepak, Intel Indeo and Sorenson all use vector quantisation
  – Frame is divided into small blocks (vectors)
  – Code book contains typical block patterns
  – The closest code book entry to each block is worked out, and the index into the code book is stored instead of the original vector
• Decompression (fast) is obtained by replacing indices from the data stream with code book entries
• Compression (slow) can take as much as 150 times the decompression time

Sound Files
• Two main types
• WAV files
  – Digital samples of analogue waveforms
• MIDI files
  – Set of instructions to control computer

WAV Files
• Sound is sampled according to the Nyquist Sampling Theorem
• SAMPLE RATE: at least 2 X highest frequency
• Range of frequencies occurring in the human voice is 300 - 3400 Hz
  – Telephone sampling rate is
at least 6800 Hz; it is actually 8000 Hz (8 kHz)
• Range of frequencies the ear can detect is 20 - 20,000 Hz
  – Sampling rate is at least 40,000 Hz; it is actually 44,100 Hz

[Figure: sound level in volts (axis from -15 to +15) plotted against time. The sound is sampled at regular intervals]

Conversion to digital
• There are 21 signal levels
  – -10 to 0 to +10
• We need 5 bits to represent this range
• Note 5 bits gives 32 combinations
  – Use 0XXXX for positive values
  – Use 1XXXX for negative values

3 volts is represented by 00011
7 volts is represented by 00111
10 volts is represented by 01010
-3 volts is represented by 10011
-7 volts is represented by 10111
-10 volts is represented by 11010
0 volts is represented by 00000
• Each sample is transmitted to an output device sequentially

Quantisation noise
• The example uses a 1 volt step size
• What if the audio sample is 7.5 volts?
• The encoder gives a value of 8 volts
• The decoder outputs an 8 volt signal
• This error is called QUANTISATION NOISE

Companding
• Most audio signals are quiet: there are more signals at lower levels than at high levels
• Companding means using a non-linear scale
  – For example, 0-5 volts might have 20 values
  – 5-8 volts might have 8 values
  – 8-10 volts might have 2 values
• This gives better resolution at lower levels at the expense of high signal levels

CD Quality WAV files
• Use 16 X 2 bits to represent the audio signal (16 bits for each of 2 stereo channels)
• This gives 65536 X 2 "steps"
  – Quantisation noise is low
  – A lot of bits will carry no information (low sound levels)
  – This means a lot of data redundancy
  – WAV file size becomes large
  – 1 Mbit ≈ 0.7 seconds of sound (at 44,100 samples/second, 1 Mbyte ≈ 6 seconds)

MIDI Files
• These are digital sound files
• Control computers, sequencers, etc.
• Each bit in the signal is used
• Must have a MIDI player to hear the sound
• File size is very small compared
to WAV files

Audio Compression
• ADPCM
  – Predicts next sample value
• TrueSpeech
  – Based on a mathematical model of airflow over the vocal tract
  – Highly efficient (1/16th)
• MPEG Audio
  – Fits with MPEG Video files

Psychoacoustic model
[Figure: perception threshold curve]
Throw away samples which will not be perceived, i.e. those under the curve

Zip Files
• Popular file compression utility
  – Based on Lempel-Ziv compression (the usual Deflate method combines LZ77 with Huffman coding)
• Used to transfer or store large files
• Zipped files give good results for text and WAV files
• Poor results for graphics / video (typically 3%)

File Size / Performance
• There is a trade-off between:
  – Speed of loading
  – File size
  – Quality
• There is no one correct solution for all multimedia applications
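The file-size side of this trade-off can be made concrete by recomputing the uncompressed figures quoted in these notes: the 128 pixel square, one second of NTSC and PAL video, and one second of CD-quality sound. The sketch below is only a worked calculation; the function names are hypothetical, 1 MB is assumed to mean 1024 * 1024 bytes, and CD audio is assumed to be 16-bit stereo at 44,100 samples/second.

```python
MB = 1024 * 1024  # assumption: binary megabytes, which matches the 26 MB and 31 MB figures


def bitmap_bytes(width, height, bytes_per_pixel=3):
    """Size in bytes of one uncompressed bitmapped image."""
    return width * height * bytes_per_pixel


def video_bytes_per_second(width, height, fps, bytes_per_pixel=3):
    """Uncompressed video data rate: one bitmap per frame, fps frames per second."""
    return bitmap_bytes(width, height, bytes_per_pixel) * fps


def audio_bytes_per_second(sample_rate_hz, bits_per_sample, channels):
    """Uncompressed sampled-audio data rate."""
    return sample_rate_hz * (bits_per_sample // 8) * channels


square = bitmap_bytes(128, 128)              # the 4.5 cm square at 72 dpi
ntsc = video_bytes_per_second(640, 480, 30)  # NTSC, 30 frames/second
pal = video_bytes_per_second(768, 576, 25)   # PAL/SECAM, 25 frames/second
cd = audio_bytes_per_second(44_100, 16, 2)   # CD-quality stereo sound

print(f"128 x 128 bitmap: {square} bytes")
print(f"NTSC video: {ntsc / MB:.1f} MB per second")
print(f"PAL video: {pal / MB:.1f} MB per second")
print(f"CD audio: {cd} bytes per second, so 1 MB holds {MB / cd:.1f} seconds")
```

Every one of these figures is before compression, which is why each media type above is paired with a codec that trades some quality or loading speed for a smaller file.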