Media Types
  Text
  Image
  Graphics
  Audio
  Video

Text
  Representation
    ASCII
    ISO Character Sets
    Marked-up Text
    Structured Text
    Hypertext
  Operations
    Character Operations
    String Operations
    Editing
    Formatting
    Pattern-matching & searching
    Sorting
    Compression
    Encryption
    Language-specific operations

Text - Representation
  ASCII
    7-bit code
    128 values in ASCII character set
    use of 8th bit in text editors/word processors creates incompatibility
  ISO character sets
    extended ASCII to support non-English text
    ISO Latin provides support for accented characters à, ö, ø, etc.
    ISO sets include Chinese, Japanese, Korean & Arabic
  UNICODE
    16-bit format
    65536 different symbols

Text - Representation
  Marked-up text
    nroff, troff
    LaTeX
    SGML
  Structured Text
    HTML
    HyTime
    XML, XSL, XLL
    structure of text represented in data structure, usually tree-based
    ODA, structure embedded in byte-stream with content
  Hypertext
    non-linear graph or “web” structure : nodes and links
    currently subject of intensive ISO standards activity

Text - Operations
  Character operations
    basic data type with assigned value
    permits direct character comparison (a<b)
  String operations
    comparison
    concatenation
    substring extraction and manipulation
  Editing
    perhaps the most familiar set of operations on text
    cut/copy/paste
    strings v. blocks, dependent on document structure

Text - Operations
  Formatting
    interactive or non-interactive (WYSIWYG v. LaTeX)
    formatted output
      bitmap
      page description language (PostScript, PDF)
    font management
      typeface
      point size (1 point = 1/72 of an inch)
      TrueType fonts : geometric description + kerning
  Pattern-matching and Searching
    search and replace
    wildcards
    regular expressions
    for large bodies of text, or text databases, use of inverted indices, hashing techniques and clustering
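The inverted indices mentioned for searching large bodies of text can be sketched as follows. This is a minimal illustration only: the function names, the tokenising rule and the sample documents are assumptions made for the example and do not come from the slides.

```python
import re
from collections import defaultdict


def build_inverted_index(documents):
    """Map each lower-cased word to the set of document ids that contain it."""
    index = defaultdict(set)
    for doc_id, text in documents.items():
        # Assumed tokeniser: runs of letters and digits, case-insensitive.
        for word in re.findall(r"[a-z0-9]+", text.lower()):
            index[word].add(doc_id)
    return index


def search_all(index, query):
    """Return the ids of documents that contain every word of the query."""
    words = re.findall(r"[a-z0-9]+", query.lower())
    if not words:
        return set()
    result = set(index.get(words[0], set()))
    for word in words[1:]:
        result &= index.get(word, set())
    return result


# Hypothetical sample documents.
docs = {
    1: "Huffman coding assigns short codes to frequent characters",
    2: "Lempel-Ziv coding replaces repeating strings by pointers",
    3: "Regular expressions support pattern matching",
}
index = build_inverted_index(docs)
print(search_all(index, "coding strings"))
```

A query is answered by intersecting the small per-word sets rather than scanning every document, which is why the technique suits text databases.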
Text - Operations
  Sorting
    numerous varieties of sort, all of them extensively studied in basic programming
    sort complexity is a major factor in data handling performance
  Compression
    ASCII uses 7 bits per character, though most word processors actually use the 8th bit and so take up a byte per character
    Information theory estimates 1-2 bits per character to be sufficient for natural language text
    This redundancy can be removed by encoding :
      Huffman : varies the number of bits used to represent characters, shortest codes for highest frequency characters
      Lempel-Ziv : identifies repeating strings and replaces them by pointers to a table
    Both techniques compress English text at a ratio of between 2:1 and 3:1

Text - Operations
  Encryption
    text encryption is widely used in electronic mail and networked information systems
    most widely-used techniques :
      DES
      RSA public-key
      PGP
    subject of major controversy :
      key escrow systems
      Clipper chip
      “strong” encryption now being outlawed in a number of countries
  Language-specific operations
    spell-checking
    parsing and grammar checking
    style analysis

Image
  Representation
    Colour Model
    Alpha Channels
    Number of Channels
    Channel Depth
    Interlacing
    Indexing
    Pixel Aspect Ratio
    Compression
  Operations
    Editing
    Point operations
    Filtering
    Compositing
    Geometric transformations
    Conversion

Image - Representation
  Colour Model
    2 main types
      colour production on output device
      theory of human colour perception
    CIE colour space
      international standard used to calibrate other colour models
      developed in 1931, as CIE XYZ, based on tristimulus theory of colour specification

Image - Representation
  RGB
    numeric triple specifying red, green and blue intensities
    convenient for video display drivers since numbers can be easily mapped to voltages for RGB guns in colour CRTs
  HSB
    Hue - dominant colour of sample, angular value varying from red to green to blue at 120° intervals
    Saturation - the intensity of the colour
    Brightness - the amount of gray in the colour
  CMYK
    displays emit light, so produce colours by adding red, green and blue intensities
    paper reflects light, so to produce a colour on paper one uses inks that subtract all colours other than the one desired
    printers use inks corresponding to the subtractive primaries, cyan, magenta and yellow (complements of RGB)

Image - Representation
    additionally, since inks are not pure, a special black ink is used to give better blacks and grays
  YUV
    colour model used in the television industry
    also YIQ, YCbCr, and YPbPr
    Y represents luminance, effectively the black-and-white portion of a video signal
    UV are colour difference signals, form the colour portion of a video signal, and are called chrominance or chroma
    YUV makes efficient use of bandwidth as the human eye has greater sensitivity to changes in luminance than chrominance, so bandwidth can be better utilised by allocating more to luminance and less to chrominance
  Alpha Channels
    images may have one or more alpha channels defining regions of full or partial transparency

Image - Representation
    can be used to store selections and to create masks and blends
  Number of channels
    the number of pieces of information associated with each pixel
    usually the dimensionality of the colour model plus the number of alpha channels
  Channel depth
    number of bits-per-pixel used to encode the channel values
    commonly 1, 2, 4 or 8 bits, less commonly 5, 6, 12 or 16 bits
    in a multiple channel image, different channels can have different depths
  Interlacing
    storage layout of a multiple channel image could separate channel values (all R values, followed by all G, followed by all B)
    or could use interlacing (all RGB for pixel 1, all RGB for pixel 2, ...)
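The two storage layouts just described can be compared with a small sketch. The function names and the four sample pixels are assumptions for illustration; real image formats add headers, padding and compression on top of this.

```python
def to_planar(pixels):
    """Separate layout: all R values, followed by all G, followed by all B."""
    return [pixel[c] for c in range(3) for pixel in pixels]


def to_interleaved(pixels):
    """Interlaced layout: all RGB for pixel 1, all RGB for pixel 2, and so on."""
    return [value for pixel in pixels for value in pixel]


def planar_to_interleaved(planar, channels=3):
    """Convert a separated-channel buffer into the interlaced layout."""
    count = len(planar) // channels  # number of pixels
    return [planar[c * count + i] for i in range(count) for c in range(channels)]


# Hypothetical 4-pixel image as (R, G, B) triples.
pixels = [(255, 0, 0), (0, 255, 0), (0, 0, 255), (10, 20, 30)]
print(to_planar(pixels))
print(to_interleaved(pixels))
```

Both buffers hold the same twelve values; only the order differs, which affects how cheaply a single channel or a single pixel can be read.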
Image - Representation
  Indexing
    pixel colours can be represented by an index in a colour map or a colour lookup table (CLUT)
  Pixel aspect ratio
    ratio of pixel width to height
    square pixels are simple to process, but some displays and scanners work with rectangular pixels
    if the pixel aspect ratios of an image and a display differ the image will appear stretched or squeezed
  Compression
    a page-sized 24-bit colour image produced by a scanner at 300dpi takes up about 20 Mbytes
    many image formats compress pixel data, using run-length coding, LZW, predictive coding and transform coding
    many image formats : JPEG, GIF, TIFF, BMP most widely used

Image - Operations
  These operations can operate directly on pixel data or on higher-level features such as edges, surfaces and volumes
  Operations on higher-level features fall into the domain of image analysis and understanding and will not be considered here
  Editing
    changing individual pixels for image touch-up, forms the basis of airbrushing and texturing
    cutting, copying and pasting are supported for groups of pixels, from simple shape manipulation through to more complex foreground and background masking and blending
  Point operations
    consist of applying a function to every pixel in an image

Image - Operations
    only use the pixel's current value, neighbouring pixels cannot be used
    Thresholding
      a pixel is set to 1 or 0 depending on whether it is above or below a threshold value - creates binary images which are often used as masks when compositing
    Colour Correction
      modifying the image to increase or reduce contrast, brightness, gamma effects, or to strengthen or weaken particular colours
  Filtering
    like point operations, operate on every pixel in an image, but use values of neighbouring pixels as well
    used to blur, sharpen or distort images, producing a variety of special effects

Image - Operations
  Compositing
    the combining of two or more images to produce a new image
    generally done by specifying mathematical relationships between
    the images
  Geometric Transformations
    basic transformations involve displacing, rotating, mirroring or scaling an image
    more advanced transformations involve skewing and warping images
  Conversions
    conversions between image formats are commonplace and a number of public-domain, shareware and commercial tools exist to support these
    other forms of conversion include compression and decompression, changing colour models, and changing image depth and resolution

Graphics
  Representation
    Geometric Models
    Solid Models
    Physically-based Models
    Empirical Models
    Drawing Models
    External formats for Models
  Operations
    Primitive Editing
    Structural Editing
    Shading
    Mapping
    Lighting
    Viewing
    Rendering

Graphics - Representation
  The central notion of graphics, as opposed to image data, is in the rendering of graphical data to produce an image. A graphics type or model is therefore the combination of a data type plus a rendering operation

Graphics Representation
  Please note - object in graphics modelling usually refers to an element of the scene being modelled, unless you are using object-oriented graphics programming
  Geometric Models
    consist of 2D and/or 3D geometric primitives
    2D primitives include lines, rectangles, ellipses plus more general polygons and curves
    3D primitives include the above plus surfaces of various forms. Curves and curved surfaces are described by parameterised polynomials

Graphics - Representation
    primitives are first described in local or object co-ordinates, then arranged in groups in a common world co-ordinate system by applying modelling transformations
    transformations include rotation, translation and scaling
    primitives can be used to build structural hierarchies, allowing each structure thus created to be broken down into lower-level structures and primitives (i.e.
    blueprinting)
    Several standard device-independent graphics libraries are based on geometric modelling
      GKS (Graphical Kernel System (ISO))
      PHIGS (Programmer's Hierarchical Interactive Graphics System (ISO)) - see also PHIGS+ and PEX
      OpenGL - portable version of Silicon Graphics library
  Solid Models
    Constructive Solid Geometry (CSG) : solid objects are combined using the set operators union, intersection and difference.

Graphics - Representation
    Surfaces of revolution : a solid is formed by rotating a 2D curve about an axis in 3D space - lathing
    Extrusion : a 2D outline is extended in 3D space along an arbitrary path
    Using the above techniques will produce models much faster than building them up from geometric primitives, but rendering them will be expensive
  Physically-based Models
    realistic images can be produced by modelling the forces, stresses and strains on objects
    when one deformable object hits another, the resulting shape change can be numerically determined from their physical properties
  Empirical Models
    complex natural phenomena (clouds, waves, fire, etc.) are difficult to describe realistically using geometric or solid modelling

Graphics - Representation
    while physically based models are possible, they may be computationally expensive or intractable
    the alternative is to develop models based on observation rather than physical laws; such models do not embody the underlying physical processes that cause these phenomena but they do produce realistic images
    fractals, probabilistic graph grammars (used for branching plant structures) and particle systems (used for fires and explosions) are examples of empirical models
  Drawing Models
    describing an object in terms of drawing or painting actions
    the description can be seen as a sequence of commands to an imaginary drawing device - PostScript, LOGO turtle graphics
  External formats for Models
    need for export/import formats between graphics packages
    CGM & CAD are OK.
    PostScript and RIB are render-only

Graphics - Operations
  Primitive editing
    specifying and modifying the parameters associated with the model primitives
    e.g. specify the type of a primitive and the vertex coordinates and surface normals
  Structural editing
    creating and modifying collections of primitives
    establish spatial relationships between members of collections
  Shading
    the modelling techniques described so far have provided the means to specify the shape of objects, but shading provides further information for the image in describing the interaction of light with the object. This interaction is described in terms of the colour of an object, how it reflects light and if it transmits light

Graphics - Operations
    several general-purpose methods exist to describe shading, most initially describe the surface of the object using meshes of small, polygonal surface patches
      flat shading - each patch is given a constant colour
      Gouraud shading - colour information is interpolated across a patch
      Phong shading - surface normal information is interpolated across a patch
      Ray tracing & Radiosity - physical models of light behaviour are used to calculate colour information for each patch, giving highly realistic results
    for photorealistic images extremely flexible shading is required; tools such as RenderMan actually provide programmable shaders which can be attached to objects, simulating different light effects and surface normals.
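The difference between flat, Gouraud and Phong shading can be illustrated at a single point inside a triangular patch. This is a sketch under stated assumptions: a simple Lambert (cosine) diffuse term stands in for the lighting calculation, the point is given by barycentric weights, and all function names are invented for the example.

```python
def normalise(v):
    """Scale a 3D vector to unit length."""
    length = sum(x * x for x in v) ** 0.5
    return tuple(x / length for x in v)


def lambert(normal, light_dir):
    """Assumed lighting model: cosine of the angle between unit vectors, clamped at 0."""
    return max(0.0, sum(n * l for n, l in zip(normal, light_dir)))


def flat(face_normal, light_dir):
    """Flat shading: one intensity for the whole patch."""
    return lambert(normalise(face_normal), light_dir)


def gouraud(vertex_normals, weights, light_dir):
    """Gouraud shading: light each vertex, then interpolate the intensities."""
    intensities = [lambert(normalise(n), light_dir) for n in vertex_normals]
    return sum(w * i for w, i in zip(weights, intensities))


def phong(vertex_normals, weights, light_dir):
    """Phong shading: interpolate the normals, then light the interpolated normal."""
    blended = tuple(
        sum(w * n[k] for w, n in zip(weights, vertex_normals)) for k in range(3)
    )
    return lambert(normalise(blended), light_dir)


# Hypothetical patch: three vertex normals, light along +z, point at the centroid.
normals = [(1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (0.0, 0.0, 1.0)]
light = (0.0, 0.0, 1.0)
centre = (1 / 3, 1 / 3, 1 / 3)
print(flat((1.0, 1.0, 1.0), light), gouraud(normals, centre, light), phong(normals, centre, light))
```

Gouraud needs one lighting calculation per vertex, while Phong needs one per shaded point, which is the cost behind its smoother highlights.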
  Mapping
    techniques for enhancing the visual appearance of objects

Graphics - Operations
    Texture mapping
      an image, the texture map, is applied to a surface
      requires a mapping from 3D surface coordinates to 2D image coordinates, so given a point on the surface the image is sampled and the resulting value used to colour the surface at that point
      shaders can also provide solid textures, where the texture is obtained from 3D rather than 2D space, and procedural textures, where the texture is calculated rather than sampled
    Bump mapping
      as texture mapping, but used to change the normal vector of the surface rather than the colour
      used to describe minor surface changes such as scratches or scrapes
    Displacement mapping
      local modifications to the position of a surface
      produces ridges or grooves

Graphics - Operations
    Environment mapping
      also known as reflection mapping, used to handle limited forms of reflection
      more primitive technique than ray-tracing
    Shadow mapping
      similar to environment mapping in that it provides a primitive lighting effect without the expense of ray-tracing
      produces shadows
  Lighting
    within a model, in addition to the graphics objects, there are lights to illuminate the scene.
    There are various forms of light source, each of which can be parametrically specified
      ambient light - background lighting, comes from all directions with equal intensity
      point lights - come from specific points in space, intensity governed by inverse square law

Graphics - Operations
      directional lights - located at infinity in some direction, intensity is constant
      spot lights - illuminating a cone-shaped volume
  Viewing
    to produce an image of a 3D model we require a transformation which projects 3D world coordinates onto 2D image coordinates
    transformation applied to viewing volume, that part of the model that appears in the image
    view specification consists of selecting the projection transformation, usually from parallel or perspective projections although camera attributes can be specified in some renderers, and the view volume
  Rendering
    rendering converts a model, including shading, lighting and viewing information, into an image
    software allows selection and fine-tuning of control parameters

Graphics - Operations
      output resolution - the width and height of the output image in pixels, and the pixel depth
      rendering time - quick and low-quality v. slow and high resolution

Digital Video
  Representation
    Analog formats sampled
    Sampling rate
    Sample size and quantisation
    Data rate
    Frame rate
    Compression
    Support for interactivity
    Scalability
  Operations
    Storage
    Retrieval
    Synchronisation
    Editing
    Mixing
    Conversion

Digital Video - Representation
  Analog formats sampled
    Digital video frames can be obtained in two ways :
      Synthesis - usually by a computer program
      Sampling - of an analog video signal. Since analog video comes in various different flavours, according to frame rate, scan rate, composite v. component, sampling rate and size vary.
Digital Video - Representation
  Sampling rate
    the value of the sampling rate determines the storage requirement and data transfer rate
    the lower limit for the frequency at which to sample in order to faithfully reproduce the signal, the Nyquist rate, is twice the highest frequency within the signal
    video processing is simplified if each frame and each scan line give rise to the same number of samples, requiring the sampling frequency to be an integer multiple of the scan rate
  Sample size and quantisation
    sample size is the number of bits used to represent sample values
    quantisation refers to the mapping from the continuous range of the analog signal to discrete sample values
    choice of sample size is based on :
      signal to noise ratio of sampled signal
      sensitivity of medium used to display frames

Digital Video - Representation
      sensitivity of the human eye
    digital video commonly uses linear quantisation, where quantisation levels are evenly distributed over the analog range (as opposed to logarithmic quantisation)
  Data rate
    high data rate formats can be reduced to lower data rates by a combination of :
      compression
      reducing horizontal and vertical resolution
      reducing the frame rate
    for example :
      start with broadcast quality digital video at 10Mbytes/s
      divide the horizontal and vertical resolutions by 2, giving VHS quality resolution
      divide the frame rate by 2
      compress at a rate of 10:1
      data rate becomes 1Mbit/s, suitable for use on LANs and on optical storage devices (e.g. CD-ROM)

Digital Video - Representation
  Frame rate
    25 or 30 fps equates to analog frame rate, or full-motion video
    at 10-15 fps motion is less accurately depicted and the image flickers, but the data rate is much reduced
  Compression
    we have already considered compression techniques, in digital video we can compare methods by three factors :
      Lossy v. lossless
      Real-time compression - trade-off between symmetric models and asymmetric models with real-time decompression
      Interframe (relative) v.
      Intraframe (absolute) compression (e.g. MPEG-1 v. Motion JPEG)
  Support for interactivity
    random access to frames
    differential rate and reverse playback
    cut and paste capability

Digital Video - Representation
  Scalability
    scalable video allows control over video quality, we can identify 2 forms :
      Transmit scalability - encoded data rate is chosen at compression time from a range of rates, governed by transmission and processing constraints and/or storage capacity. Currently in use for low rate digital video
      Receive scalability - decoded data rate is chosen at decompression time to match playback requirements. Attractive concept but not yet available in current video coding standards
    current approaches to low rate digital video include :
      DVI (Digital Video Interactive) - two forms, Production Level Video (PLV) and Real-Time Video (RTV). PLV only really intended for playback, RTV produces poorer quality but is intended for compression. Both use interframe compression to achieve rates of 1Mbit/s, but require costly hardware.
      MPEG-1 - 1Mbit/s

Digital Video - Representation
      MPEG-2 - broadcast quality video at rates between 2 and 15Mbit/s
      MPEG-4 - low data rate video
      MPEG-7 - metadata standard for video representation
      Motion JPEG
      px64 (CCITT H.261) - intended for video applications using ISDN (Integrated Services Digital Network). Known as px64 since it produces rates that are multiples of ISDN's 64Kbit/s B channel rate. Uses similar techniques to MPEG but, since compression and decompression must be real-time, quality tends to be poorer.
      H.263 - based on H.261, but offers 2.5 times greater compression, uses MPEG-1 and MPEG-2 techniques.
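The data rate example given for the representation of digital video (10Mbytes/s reduced to 1Mbit/s) can be checked with a few lines of arithmetic. This is a sketch only: the function names are invented, and decimal units (1 Mbyte = 1,000,000 bytes) are assumed throughout.

```python
def reduced_rate_bits_per_s(source_bytes_per_s, resolution_divisor,
                            frame_rate_divisor, compression_ratio):
    """Apply the three reductions in turn and return the result in bits per second."""
    # Dividing both horizontal and vertical resolution cuts the samples by divisor squared.
    bytes_per_s = source_bytes_per_s / resolution_divisor ** 2
    bytes_per_s /= frame_rate_divisor
    bytes_per_s /= compression_ratio
    return bytes_per_s * 8


def playing_time_minutes(disk_bytes, rate_bits_per_s):
    """How many minutes of video at the given rate fit on a disk of the given size."""
    return disk_bytes * 8 / rate_bits_per_s / 60


rate = reduced_rate_bits_per_s(10_000_000, 2, 2, 10)
print(rate)                                      # bits per second after all reductions
print(playing_time_minutes(100_000_000, rate))   # minutes on an assumed 100 Mbyte disk
```

Halving the resolution in both directions contributes a factor of 4, so the overall reduction is 4 x 2 x 10 = 80, taking 80Mbit/s down to 1Mbit/s.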
Digital Video - Operations
  Storage
    to record or play back digital video in real-time, the storage system must be capable of sustaining data transfer at the video data rate
    4 main forms of storage for digital video are :
      Magnetic tape - at present only magnetic tape can provide the very high capacity storage required for digital video at practical costs (1 hour of CCIR 601 4:2:2 uses 72 Gbytes, while 1 hour of digital HDTV requires nearly 1 Tbyte)
      Special purpose magnetic storage systems - useful for short durations of high data rate digital video, can be connected direct to external equipment and are thus useful for capture and editing (see diagram)
      Video memory boards - specialist boards with large amounts of semiconductor memory (several hundred Mbytes or more), capable of storing short durations of uncompressed digital video, useful for capture and editing.

Digital Video - Operations
      General purpose magnetic and optical storage systems - most low data rate video representations (MPEG, etc.) were designed to support the use of conventional storage media for real-time video playback. Problem is size of storage, even using MPEG-1 13 minutes of video will fill a 100Mbyte disk.
  Retrieval
    uses frame addressing, as in analog video, but there are some problems :
      low data rate formats result in variable sized frames, so an index giving frame offsets needs to be maintained to support random access
      interframe compression techniques, e.g. MPEG, only code key frames independently, other frames are derived from these key frames.
      So random access requires first finding the nearest key frame and then using this to decode the desired frame, again using the index but enhancing it with key frame locations

Digital Video - Operations
  Synchronisation
    suffers same problems as analog video, so uses same techniques
    digital video also has some additional techniques not available in analog video, such as changing resolution to maintain frame rate
  Editing
    2 types :
      tape-based - same procedures as with analog video, except no generation loss and the players are on the same machine
      nonlinear - basically a clips-library, using cut and paste techniques to build a video sequence
  Mixing
    real-time effects, such as tumbles, wipes and fades, are calculated in the same way as for analog video; in fact for the majority of such effects, whether the original source is analog or digital, the effects are digitised

Digital Video - Operations
    non-real-time effects are only possible using digital video, and obviate the need for specialist equipment, being only dependent on the speed of the processor and the patience of the user; storage considerations can be overcome with the use of pointers and single frame editing
  Conversion
    the variety of formats demands conversion between formats
    real-time conversion requires specialist hardware
    compression/decompression within a single format also requires specialist software/hardware

Digital Audio
  Representation
    Sampling frequency
    Sample size and quantisation
    Number of channels (tracks)
    Interleaving
    Negative samples
    Encoding
  Operations
    Storage
    Retrieval
    Editing
    Effects and filtering
    Conversion

Digital Audio - Representation
  Digital Audio Representation
    2 main areas :
      telecommunications
      entertainment (audio CD)
    Produced by sampling a continuous signal generated by a sound source. An analog-to-digital converter (ADC) takes as input an electrical signal corresponding to the sound and converts it into a digital data stream.
    The reverse process, to generate the sound through an amplifier and speakers, involves a digital-to-analog converter (DAC)
  Sampling frequency (rate)
    sampling theory shows that a signal can be reproduced without error from a set of samples, providing the sampling frequency is at least twice the highest frequency present in the original signal

Digital Audio - Representation
    telephone networks allocate a 3.4kHz bandwidth to voice-grade lines, thus a sampling rate of 8kHz is used for digital telecommunications
    the human ear is sensitive to frequencies of up to about 20kHz, so to digitise any perceivable sound a sampling rate of over 40kHz is required
  Sample size and quantisation
    during sampling, the continuously varying amplitude of the analog signal is approximated by digital values; this introduces a quantisation error, being the difference between the actual amplitude and the digital approximation
    quantisation error is apparent when the signal is reconverted to analog form as distortion, a loss in audio quality
    quantisation error can be reduced by increasing the sample size, as allowing more bits per sample will improve the accuracy of the approximation

Digital Audio - Representation
    quantisation refers to breaking the continuous range of the analog signal into a number of unique digital intervals, based on one of a number of schemes :
      linear quantisation - uses equally spaced intervals, so if the sample size is 3 bits (8 intervals) and the maximum signal variation is 5.0 then the quantisation interval would be 5.0 / 8 = 0.625 units of signal amplitude
      nonlinear quantisation (especially logarithmic quantisation) - uses non-equally spaced intervals, lower amplitude intervals are more closely spaced than higher amplitude, results in greater sensitivity to lower amplitude sound where the human ear is most sensitive
  Number of channels (tracks)
    speech quality audio is mono (1 track)
    stereo audio requires 2 tracks
    some consumer audio equipment uses 4 tracks (quadraphonic)
    professional audio
    equipment uses 16, 32 or more

Digital Audio - Representation
  Interleaving
    a multi-channel audio value can be encoded by interleaving channel samples or by providing separate streams for each channel
    the advantage of interleaving is in synchronisation, and it also offers some benefits in storage and transmission
    the disadvantages of interleaving are that it can be wasteful of space or bandwidth if not all channels are needed, it freezes the synchronisation between channels thus preventing temporal shifts, and it may not allow variation in the number of channels
  Negative samples
    the voltages found in analog audio signals alternate between positive and negative values
    negative values can be encoded successfully for processing in two's complement, one's complement or sign-magnitude representation

Digital Audio - Representation
  Encoding
    encoding audio data reduces storage and transmission costs, and compressed audio also provides better quality when compared to uncompressed audio at the same data rate
    2 commonly-used methods :
      PCM (Pulse Code Modulation) - uses the fact that a digital signal can be formed from a series of pulses. PCM values are simply sequences of uncompressed samples, so they provide a reference format for comparison with more complex coding methods
      ADPCM (Adaptive Differential Pulse Code Modulation) - reduces PCM data rate by encoding the differences between samples. ADPCM is widely used and is associated with some encoding standards, such as CCITT G.721.

Digital Audio - Operations
  Storage
    it is possible to record digital audio, even at the data rates of the high quality formats, on general purpose magnetic storage
    theoretically, a magnetic disk with a sustainable transfer rate of 5 Mbytes per second could play back 50 channels of CD-quality digital audio.
    In practice this would not be possible without a highly optimised layout, but one or two channels are easily within the reach of small computer systems
    since an hour of stereo digital audio, at the CD data rate, requires over half a Gigabyte of storage, tertiary storage in the form of DAT tapes, CD discs or optical disks is normally adopted, with the information being mounted onto the system manually or through a jukebox
  Retrieval
    need to support random access and ensure continuous flow of data to DAC

Digital Audio - Operations
    portions of audio sequences, segments, are identified by their starting time and duration; these can be located by mapping the starting time to a segment address, which the file system then maps to a physical address on disk
    where there is no direct mapping to enable segment location by time code, an index of segments must be separately maintained
    continuous flow of data is easy to maintain with a dedicated storage system, but requires careful control where storage is scheduled for a number of such tasks
  Editing
    as with digital video, 2 types :
      tape-based
      disk-based
    to avoid audible clicks when inserting one sample into another, cross-fades are used, where the amplitudes of the original segment and the inserted segment are added and scaled about the insertion point

Digital Audio - Operations
    digital audio also supports non-destructive editing, where the segments of data are accessed through a data structure known as a play-list, which essentially contains a set of pointers to the data and details on ordering and other forms of edit to be performed on the data when it is joined
  Effects and filtering
    digital filtering techniques permit a number of effects on audio :
      Delay
      Equalisation & Normalisation
      Noise reduction & Time compression and expansion
      Pitch shifting
      Stereoisation
      Acoustic environments
  Conversion
    one format to another (uncompressing ADPCM->PCM)
    altering encoding parameters (i.e.
    resampling at lower frequency)

Music
  Representation
    Operational v. Symbolic
    MIDI
    SMDL
  Operations
    Playback & Synthesis
    Timing
    Editing & Composition

Music - Representation
  The existence of powerful, low-cost digital signal processors means that many computers can now record, generate and process music. Music is also widely used in multimedia applications, so we require a media type for music to focus on the computer's musical capabilities.
  Representation of Music
    Operational v. Symbolic
      operational representations specify exact timings for music and physical descriptions of the sounds to be produced
      symbolic representations use descriptive symbolism to describe the form of the music and allow great freedom in the interpretation
      both types are described as structural representations, since instead of representing music by audio samples there is information about the internal structure of the music

Music - Representation
  To illustrate the structural representations, we can consider two :
    MIDI - a widely used protocol allowing the connection of computers and musical equipment, an operational representation
    SMDL - a proposal for a standard structure for documents containing musical information, having both operational and symbolic aspects
  MIDI
    the Musical Instrument Digital Interface was developed in the early ‘80s by musical equipment makers
    Devices :
      electronic keyboards and synthesisers
      drum machines
      sequencers (to record and play back MIDI messages)
      music<->film and music<->video synchronisation equipment

Music - Representation
    Connection ports :
      MIDI OUT - allows a device to send MIDI messages it has produced to other MIDI devices
      MIDI IN - receives MIDI messages from other MIDI devices
      MIDI THRU - repeats received messages, permitting daisy-chaining of MIDI devices
    MIDI devices process MIDI messages differently, according to their function or to the sound palette used by the device, hence different synthesisers can produce different sounds when supplied with the same MIDI messages
    MIDI Concepts:
      Channel - a MIDI connection has 16 message channels, devices can be set to respond to all channels or only to specific channels
      Key number - notes are identified by key number, with 128 key numbers compared with the 88 keys of a standard keyboard
      Controller - 128 different controllers are available under the MIDI protocol, though not all are currently defined; changing the value of a controller typically alters sound production

Music - Representation
      Patch/program - an audio
palette is called a program or patch; a synthesiser capable of having a number of patches active at the same time is called multi-timbral
Polyphony - the ability of a synthesiser to play many notes at a time
Song - a recorded or preprogrammed MIDI sequence
Timing clock - a MIDI sequencer timestamps messages using a timebase measured in parts per quarter note (PPQ). Typical timebase values are 24, 96 and 480 PPQ. To convert the timebase into actual time you use the tempo, measured in beats per minute (BPM), where we assume that one beat is equal to a quarter note. Thus at a tempo of 180 BPM one beat lasts 1/3 s, so with a timebase of 96 PPQ one tick lasts 1/3 x 1/96 s = 3.47 ms
MIDI synchronisation - MIDI devices can be set to internal synch or external synch; when set to internal synch a device is known as a master and produces a timing clock message on its MIDI OUT at 24 PPQ, which slave devices use for external synch
MTC - MIDI Time Code is used to synchronise MIDI with film or video, for example to trigger sound effects or musical sequences

Music - Representation
MIDI Protocol : based on an 8-bit code for messages; each message consists of a single command byte and possibly one or more data bytes (see table)
Channel voice messages (8c-Ec) - determine the actual notes played, speed of hit and release and the values of controllers
Channel mode messages (Bc, with controllers 121-127) - select the mode of a synthesiser: responding to one channel or all channels, each channel separately voiced or all voices used for one channel
System messages (F0-FF) - general system functions: timing clock, MIDI time code messages, system reset, start device, stop device, etc.
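The timing-clock arithmetic and the command-byte ranges listed above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names `tick_duration_ms` and `classify_message` are hypothetical, the controller range 121-127 for channel mode messages is taken directly from the list above, and the low four bits of a channel command byte (the "c" in 8c-Ec) are assumed to carry the channel, reported here as 1-16.

```python
from typing import Optional, Tuple


def tick_duration_ms(tempo_bpm: float, ppq: int) -> float:
    """Length of one sequencer tick in milliseconds.

    Assumes, as in the text, that one beat equals one quarter note.
    """
    seconds_per_beat = 60.0 / tempo_bpm
    return seconds_per_beat / ppq * 1000.0


def classify_message(command: int, data1: Optional[int] = None) -> Tuple[str, Optional[int]]:
    """Classify a MIDI message by its command byte (and first data byte).

    Returns (category, channel); channel is None for system messages.
    """
    if not 0x80 <= command <= 0xFF:
        raise ValueError("a command byte must lie in the range 0x80-0xFF")
    if command >= 0xF0:
        # F0-FF: system messages carry no channel number.
        return ("system", None)
    channel = (command & 0x0F) + 1  # low nibble, shown as channel 1-16
    # Bc with controller 121-127 is a channel mode message (range per the text).
    if (command & 0xF0) == 0xB0 and data1 is not None and 121 <= data1 <= 127:
        return ("channel mode", channel)
    return ("channel voice", channel)


if __name__ == "__main__":
    # The worked example from the text: 180 BPM at 96 PPQ.
    print(round(tick_duration_ms(180, 96), 2), "ms per tick")
    print(classify_message(0x90, 60))
    print(classify_message(0xB3, 123))
    print(classify_message(0xF8))
```

The first function reproduces the 3.47 ms figure from the worked example; the second shows why a receiver needs the first data byte as well as the command byte to tell a channel mode message from an ordinary controller change.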
Limitations of MIDI :
operates at 31250 bps, which allows about 500 notes per second and may not be enough for complex pieces
limited number of channels, lack of device addressing and other flaws make configuring large MIDI networks difficult
device dependence of MIDI data

Music - Representation
SMDL
the Standard Music Description Language was developed by the MIPS committee of ANSI
SMDL encompasses the representation of music for electronic dissemination and production by software, the representation of scores and musical examples in printed documents, and the representation of musical annotation and attributes used for musical analysis or by music databases
SMDL is a DTD of SGML, based on a document type called musical works or works. Each work has 4 hierarchically structured sections :
core section - musical events, such as note sequences, which form the work
gestural section - performances of the core, which may differ in interpretation
visual section - displays the core in printed form, includes formatting and lyrics
analytical section - allows a number of theoretical analyses of the core, its score and performances to be included in the work

Music - Operations
In considering music representation, we can recognise several advantages over audio :
music representation will be more compact than audio
it is portable and can be synthesised with the fidelity and complexity appropriate to the output devices used
while digital audio suffers from inherent noise, musical representations are noise free
many operations can be performed on music that would be infeasible or require extensive processing on audio
Playback & Synthesis
during audio playback, the listener has limited influence over the musical aspects of the performance, beyond changing the volume or processing the audio in some way.
If music is produced by synthesis from a structural representation, the listener can independently change pitch and tempo, increase or decrease individual instruments' volumes or change the sounds they produce

Music - Operations
musical representations offer greater potential for interactivity than audio
Timing
structural representation makes the timing of musical events explicit
the ability to modify tempo makes it possible to alter the timing of groups of musical events and adjust the synchronisation of those events with other events (film, video, etc.)
Editing & Composition
basic editing allows the user to modify primitive events and notes
more complex editing operations operate on musical aggregates (chords, bars, etc.) to permit phrase-repetition, melody replacement and other such functions
composition software simplifies the task of generating and combining or rearranging tracks, and prints the score

Animation
Representation
  Cel models
  Scene-based models
  Event-based models
  Key frames
  Articulated objects & hierarchical models
  Scripting & procedural models
  Physically-based & empirical models
Operations
  Graphics operations
  Motion & parameter control
  Rendering
  Playback

Animation - Representation
Separating animation and video follows the same track we took in separating image and graphic, based on modelling. Animation types provide models which are rendered to produce video.
Animation is distinct from graphic in that it is time-dependent, but as in the image<->video relationship, sampling an animation model at a particular time will result in a graphics model, which can be rendered to produce an image
Animation Representation
Cel models
early animators drew on transparent celluloid sheets or cels; different sheets contained different parts of the scene, which was assembled by overlaying the sheets
in animation, cels are digital images with a transparency channel

Animation - Representation
scenes are rendered by drawing the cels back to front, with movement being added by changing the position of cels from one frame to the next
a cel model is therefore a set of images, their back to front order, and their relative position and orientation in each frame
Scene-based models
simply a sequence of graphics models, each representing a complete scene
highly redundant and do not support continuity of activities
Event-based models
express the difference between successive scenes as events that transform one scene to the next
still discrete rather than continuous, but permit the management of scenes by input devices (e.g. mouse, tablet)
rather than each scene having to be entered manually

Animation - Representation
Key frames
in essence, the animator models the beginning and end frames of a sequence and lets the computer calculate the others by interpolation
Articulated objects & hierarchical models
attempt to overcome the problems of key frames by developing articulated objects, jointed assemblies where the configuration and movement of sub-parts are constrained
ensure proper relative positioning and constraint maintenance during interpolation (will not allow solid objects to pass through other solid objects)
Scripting and procedural models
current state-of-the-art animation modelling systems have tools allowing the animator to specify key frames, preview sequences in real time and control the interpolation of model parameters
an additional feature in many such systems is a scripting language

Animation - Representation
scripting languages offer the animator the opportunity to express sequences in concise form, particularly useful for repetitive and structured motion, and also provide high-level operations intended specifically for animation
Physically-based models & empirical models
this approach is used to produce sequences depicting evolving physical systems
a mathematical model of the system is derived from physical principles or empirical data and the model is then solved, numerically or through simulation, at a sequence of time points, each one resulting in a single frame for the sequence

Animation - Operations
Graphics operations
since animation models are graphics models extended in time, all the graphics operations we have already covered are applicable here
Motion and parameter control
since the essential difference between graphics and animation operations is the addition of the temporal dimension, graphics objects become animations through the assignment of complex trajectories or behaviours over time
commercial 3D animation systems provide modelling tools and animation tools, the
modelling tools produce 3D graphic models and the animation tools add temporal transformations to these objects
Rendering
2 basic forms :
real-time - the model is rendered as frames are displayed; 10+ frames per second are required to avoid jerkiness, so this is only appropriate for simple models or with special hardware
non-real-time - frames are pre-rendered, taking as long as necessary to do so, which provides higher visual quality and consistency of frame rate

Animation - Operations
Playback
non-real-time rendering offers the same operational possibilities in playback as digital video, namely control over rate and direction
real-time rendering is much more interactive and modifiable: objects can be added and removed, lights turned on and off, the viewpoint changed, and so on
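
As a closing illustration of the key-frame idea described under Animation - Representation, the sketch below fills in the frames between a start and an end key frame by straight-line interpolation of each model parameter. It is a minimal example under assumptions, not how any particular animation system works: the function name `interpolate_keyframes`, the representation of a frame as a dictionary of numeric parameters, and the choice of linear interpolation are all hypothetical (real systems let the animator control the interpolation curve).

```python
from typing import Dict, List


def interpolate_keyframes(start: Dict[str, float],
                          end: Dict[str, float],
                          n_frames: int) -> List[Dict[str, float]]:
    """Return n_frames frames running linearly from start to end inclusive.

    Each frame is a dict of model parameters (position, angle, ...).
    """
    if n_frames < 2:
        raise ValueError("need at least the two key frames themselves")
    if start.keys() != end.keys():
        raise ValueError("both key frames must define the same parameters")
    frames = []
    for i in range(n_frames):
        t = i / (n_frames - 1)  # 0.0 at the first key frame, 1.0 at the last
        frames.append({name: start[name] + t * (end[name] - start[name])
                       for name in start})
    return frames


if __name__ == "__main__":
    first = {"x": 0.0, "angle": 10.0}
    last = {"x": 10.0, "angle": 30.0}
    for frame in interpolate_keyframes(first, last, 5):
        print(frame)
```

Because every parameter is interpolated independently, nothing here stops two solid objects from passing through each other, which is the weakness that articulated and hierarchical models are meant to address.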