Transcript Slide 1

Chapter 3
Representation
Key Concepts
• Digital vs Analog
• How many bits?
• Some standard representations
• Compression Methods
3-2
Chapter Goals
• Distinguish between analog and digital
information.
• Explain data compression and calculate
compression ratios.
• Describe the characteristics of the ASCII and
Unicode character sets.
3-3
Chapter Goals
• Perform various types of text compression.
• Explain the nature of sound and its
representation.
3-4
Chapter Goals
• Explain how RGB values define a color.
• Distinguish between raster and vector
graphics.
3-5
Data and Computers
• Computers are multimedia devices
• We need to represent many kinds of data,
using only bits
– Numbers
– Text
– Audio
– Images and graphics
– Video
3-6
Analog and Digital Information
• Computers are finite.
• Computer memory has only so much room to
store a certain amount of data.
• The goal is to represent enough of the world
to satisfy our needs.
3-7
Analog and Digital Information
• Information can be represented in one of two ways:
analog or digital.
Analog data: A continuous representation, analogous to
the actual information it represents.
INFINITE # of values
Digital data: A discrete representation, breaking the
information up into separate elements.
FINITE # of values
3-8
Analog and Digital Information
Figure 3.1
A mercury thermometer
continually rises in direct
proportion to the
temperature
3-9
Digitization of Analog Information
• We digitize information by breaking it into pieces
and representing those pieces separately.
• Basically, analog data becomes digital when we
measure it.
• Why digitize?
– Because computers use bits
– Digital data can be kept “perfect”
3-10
Electronic Signals (Cont’d)
• Periodically, a digital signal is reclocked to
regain its original shape.
Figure 3.2
An analog and a digital signal
Figure 3.3
Degradation of analog and digital signals
3-11
3-12
Binary Representations
• We need to REPRESENT different kinds of
information using only bits
• So, how many bits do we need? It depends
on how many things we need to represent.
Who has a pet bunny?
3-13
Binary Representations
• One bit can be either 0 or 1. Therefore, one bit
can represent only two things.
• Two bits can represent four things because
there are four combinations of 0 and 1 that
can be made from two bits:
00
01
10
11
3-14
Binary Representations
• Three bits can represent eight things because
there are eight combinations of 0 and 1 that
can be made from three bits.
000
001
010
011
100
101
110
111
3-15
Binary Representations
Figure 3.4
Bit combinations
3-16
Binary Representations
• In general, n bits can represent 2^n things
because there are 2^n combinations of 0 and 1
that can be made from n bits.
• Note that every time we increase the number
of bits by 1, we double the number of things
we can represent.
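The doubling rule can be checked with a short Python sketch. The helper names bit_patterns and bits_needed are illustrative only and do not come from the slides.

```python
from itertools import product

def bit_patterns(n):
    """Return every combination of 0 and 1 that n bits can form."""
    return ["".join(bits) for bits in product("01", repeat=n)]

def bits_needed(things):
    """Smallest n such that 2^n patterns are enough for `things` items."""
    n = 0
    while 2 ** n < things:
        n += 1
    return n

print(bit_patterns(2))        # ['00', '01', '10', '11']
print(len(bit_patterns(3)))   # 8
print(bits_needed(26))        # 5, because 2^4 = 16 < 26 <= 32 = 2^5
```

Adding one bit doubles the length of the list that bit_patterns returns, which is the rule stated on the slide.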
3-17
Representing Text
• We need to represent every possible character.
• The general approach is to list them all and assign
each an integer value.
• A character set is a list of characters and the integer
codes used to represent each one.
• This works because we already know how to
represent the integers (Chapter 2).
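As a rough illustration of "list them all and assign each an integer", Python's built-in ord and chr expose the integer code behind each character. The helper name char_codes is hypothetical.

```python
def char_codes(text):
    """Pair each character with its integer code and that code in 8-bit binary."""
    return [(ch, ord(ch), format(ord(ch), "08b")) for ch in text]

for ch, code, bits in char_codes("Hi!"):
    print(ch, code, bits)
# H 72 01001000
# i 105 01101001
# ! 33 00100001

print(chr(72))  # H  (chr goes from the integer back to the character)
```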
3-18
The ASCII Character Set
• ASCII stands for American Standard Code for
Information Interchange.
• Standard ASCII uses seven bits to represent 128
characters; extended ASCII uses eight bits to represent 256.
3-19
The ASCII Character Set
3-20
The ASCII Character Set
• Note that the first 32 characters in the ASCII
character chart do not have a simple
character representation that you could print
to the screen.
• Thus, they don’t show up in Notepad.
3-21
The Unicode Character Set
• 8-bit (extended) ASCII is not enough. It can represent only 2^8 = 256
characters.
• 16-bit Unicode can represent 2^16 = 65,536
characters.
• Unicode was designed to be a superset of ASCII. That
is, the first 256 characters in the Unicode character
set correspond exactly to the extended ASCII
character set.
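The superset claim can be checked in Python, under the assumption that "extended ASCII" here means the ISO 8859-1 (Latin-1) character set. Other 8-bit extensions of ASCII exist and would not line up the same way.

```python
# Decode every possible byte value 0..255 as Latin-1 (assumed meaning of
# "extended ASCII") and compare each character's Unicode code with its byte.
latin1_text = bytes(range(256)).decode("latin-1")
same_codes = all(ord(ch) == i for i, ch in enumerate(latin1_text))

print(same_codes)               # True
print(ord("A"))                 # 65, the same code in ASCII and Unicode
print(ord("\u00e9"))            # 233, e with acute accent, still fits in 8 bits
print(ord("\u20ac") > 255)      # True: the euro sign needs more than 8 bits
```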
3-22
The Unicode Character Set
Figure 3.6 A few characters in the Unicode character set
3-23
Data Compression
• Data compression: Reduction in the amount of space
needed to store a piece of data.
• Compression ratio: The size of the compressed data
divided by the size of the original data.
• A data compression technique can be
– lossless, which means the data can be retrieved without
any loss of the original information, or
– lossy, which means some information may be lost in the
process of compaction.
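A minimal sketch of both ideas follows. It uses Python's zlib module (DEFLATE), which is not one of the methods covered in these slides and is swapped in only because it is a readily available lossless compressor. The function name compression_ratio is illustrative.

```python
import zlib

def compression_ratio(original, compressed):
    """Size of the compressed data divided by the size of the original data."""
    return len(compressed) / len(original)

data = b"systems " * 100            # 800 bytes of highly repetitive text
packed = zlib.compress(data)
restored = zlib.decompress(packed)

print(restored == data)                      # True: lossless, nothing was lost
print(compression_ratio(data, packed) < 1)   # True: the compressed form is smaller
```

A ratio below 1 means the data shrank; the closer to 0, the better the compression.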
3-24
Text Compression
• It is important that we find ways to store and
transmit text efficiently, which means we must
find ways to compress text.
– keyword encoding
– run-length encoding
– Huffman encoding
3-25
Keyword Encoding
• Frequently used words are replaced with a
single character. For example,
3-26
Example: Uncompressed
The human body is composed of many
independent systems, such as the circulatory
system, the respiratory system, and the
reproductive system. Not only must all systems
work independently, they must interact and
cooperate as well. Overall health is a function of
the well-being of separate systems, as well as how
these separate systems work in concert.
3-27
Compressed with Keyword
Encoding
The human body is composed of many
independent systems, such ^ ~ circulatory system,
~ respiratory system, + ~ reproductive system. Not
only & all systems work independently, they &
interact + cooperate ^ %. Overall health is a
function of ~ %-being of separate systems, ^ % ^
how # separate systems work in concert.
3-28
Keyword Encoding
• 349 characters in the original
• 314 characters in the compressed version
Thus:
• The compression ratio for this example is
314/349, or approximately 0.9
• That is about 10% smaller. Not very good.
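Keyword encoding can be sketched in Python as follows. The substitution table is inferred from the compressed example text, so treat it as an assumption. The sketch also assumes the symbols never occur in the original text, and it matches whole lowercase words only.

```python
import re

# Word -> symbol table inferred from the compressed example (an assumption).
KEYWORDS = {"as": "^", "the": "~", "and": "+", "must": "&", "well": "%", "these": "#"}
SYMBOLS = {symbol: word for word, symbol in KEYWORDS.items()}

# \b keeps "the" from matching inside "they" or "these".
_WORD_RE = re.compile(r"\b(" + "|".join(map(re.escape, KEYWORDS)) + r")\b")
_SYMBOL_RE = re.compile("|".join(map(re.escape, SYMBOLS)))

def keyword_encode(text):
    """Replace each listed word with its single-character symbol."""
    return _WORD_RE.sub(lambda m: KEYWORDS[m.group(1)], text)

def keyword_decode(text):
    """Replace each symbol with the word it stands for."""
    return _SYMBOL_RE.sub(lambda m: SYMBOLS[m.group(0)], text)

sample = "they must interact and cooperate as well"
encoded = keyword_encode(sample)
print(encoded)                            # they & interact + cooperate ^ %
print(keyword_decode(encoded) == sample)  # True
print(len(encoded) / len(sample))         # compression ratio for this sample
```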
3-29
Run-Length Encoding (RLE)
• Sometimes bytes repeat in a long sequence.
– This type of repetition doesn’t generally take place
in English text, but often occurs in large data
streams.
• In RLE, a sequence of repeated characters is
replaced by:
<flag><repeated character><number of repeats>
3-30
Run-Length Encoding Examples
• AAAAAAA would be encoded as *A7
• *n5*x9ccc*h6 some other text *k8eee would be decoded into
the following original text
nnnnnxxxxxxxxxccchhhhhh some other text kkkkkkkkeee
• Original: 51 characters
• Compressed 35 characters
• Compression ratio: 35/51, or approximately 0.69.
• Better than keyword encoding, and it can do much better for certain kinds
of data
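The examples above can be reproduced with a small Python sketch. It assumes `*` is the flag character and never appears in the text, that the repeat count is a single digit (longer runs are split), and that only runs of four or more are encoded, since a three-character code saves nothing on shorter runs. These choices are inferred from the examples rather than stated on the slides.

```python
from itertools import groupby

FLAG = "*"  # assumed flag character; must not appear in the original text

def rle_encode(text, min_run=4, max_run=9):
    """Replace runs of min_run..max_run repeats with <flag><char><count>."""
    out = []
    for char, group in groupby(text):
        n = len(list(group))
        while n >= min_run:
            chunk = min(n, max_run)      # keep the count to a single digit
            out.append(f"{FLAG}{char}{chunk}")
            n -= chunk
        out.append(char * n)             # short leftovers are copied as-is
    return "".join(out)

def rle_decode(encoded):
    """Expand every <flag><char><count> triple back into a run."""
    out = []
    i = 0
    while i < len(encoded):
        if encoded[i] == FLAG:
            out.append(encoded[i + 1] * int(encoded[i + 2]))
            i += 3
        else:
            out.append(encoded[i])
            i += 1
    return "".join(out)

original = "nnnnnxxxxxxxxxccchhhhhh some other text kkkkkkkkeee"
packed = rle_encode(original)
print(packed)                        # *n5*x9ccc*h6 some other text *k8eee
print(len(original), len(packed))    # 51 35
print(rle_decode(packed) == original)  # True
```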
3-31
Huffman Encoding
• Why should the character “x”, which is seldom
used in text, take up the same number of bits
as the “e”, which is used very frequently?
• Huffman codes use variable-length bit
strings to represent each character.
• A few characters may be represented by five
bits, and another few by six bits, and yet
another few by seven bits, and so forth.
3-32
Huffman Encoding
• If we use only a few bits to represent
characters that appear often and reserve
longer bit strings for characters that don’t
appear often, the overall size of the document
being represented is smaller.
3-33
Huffman Encoding
• For example
3-34
Huffman Encoding
• DOORBELL would be encoded in binary as
1011110110111101001100100.
• DOORBELL in 8-bit ASCII: 8*8 = 64 bits
• DOORBELL in Huffman: 25 bits
• Compression ratio: 25/64, or approximately 0.39
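The code table from the slide is not reproduced in this transcript. The Python sketch below uses a table that is only an assumption, chosen because it yields exactly the DOORBELL bit string shown above. It applies a fixed table and does not build a Huffman tree from character frequencies.

```python
# Assumed code table: consistent with the DOORBELL example, not confirmed by
# the slides. No code is a prefix of another, so decoding is unambiguous.
CODES = {"A": "00", "E": "01", "L": "100", "O": "110",
         "R": "111", "B": "1010", "D": "1011"}

def huffman_encode(text):
    """Concatenate the variable-length code of each character."""
    return "".join(CODES[ch] for ch in text)

def huffman_decode(bits):
    """Read bits left to right, emitting a character whenever a code matches."""
    lookup = {code: ch for ch, code in CODES.items()}
    out, current = [], ""
    for bit in bits:
        current += bit
        if current in lookup:
            out.append(lookup[current])
            current = ""
    if current:
        raise ValueError("bit string ends in the middle of a code")
    return "".join(out)

bits = huffman_encode("DOORBELL")
print(bits)                      # 1011110110111101001100100
print(len(bits), len(bits) / 64) # 25 0.390625
print(huffman_decode(bits))      # DOORBELL
```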
3-35
Representing Audio Information
3-36
Representing Audio Information
• To digitize the signal we periodically measure
the level of the signal and record the
appropriate numeric value. The process is
called sampling.
• In general, a sampling rate of around 40,000
times per second is enough to create a
reasonable sound reproduction.
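Sampling can be sketched in Python by measuring a signal at regular intervals and rounding each measurement to an integer. The sine wave stands in for a real audio signal, and the function name, the 440 Hz tone, and the 16-bit sample size are assumptions made for illustration.

```python
import math

def sample_sine(freq_hz, duration_s, rate_hz=40_000, bits=16):
    """Measure a sine wave rate_hz times per second and round each level
    to a signed integer that fits in `bits` bits."""
    peak = 2 ** (bits - 1) - 1               # 32767 for 16-bit samples
    count = round(duration_s * rate_hz)      # how many measurements to take
    return [round(peak * math.sin(2 * math.pi * freq_hz * i / rate_hz))
            for i in range(count)]

samples = sample_sine(440, 0.01)   # 10 milliseconds of a 440 Hz tone
print(len(samples))                # 400 samples
print(samples[0])                  # 0, the wave starts at zero
```

Each rounded value is one discrete sample; the continuous signal between samples is not stored.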
3-37
Representing Audio Information
Figure 3.8 Sampling an audio signal
3-38
How big is an Audio File?
• Assume we want a 16 bit resolution on each
sample
• And there are 40,000 samples per second
• How big (in bytes) is a 3 minute song file?
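One way to work the question out, assuming a single (mono) channel and no compression, neither of which is stated on the slide:

```python
def audio_size_bytes(seconds, samples_per_second=40_000, bits_per_sample=16, channels=1):
    """Uncompressed size: samples x bytes per sample x channels."""
    total_bits = seconds * samples_per_second * bits_per_sample * channels
    return total_bits // 8

print(audio_size_bytes(3 * 60))              # 14400000 bytes, about 14.4 million
print(audio_size_bytes(3 * 60, channels=2))  # 28800000 bytes for stereo
```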
3-39
COOL PICTURE, HUH!?
Figure 3.9
A CD player reading
binary information
3-40
Audio Formats
• Audio Formats
– WAV, AU, AIFF, VQF, and MP3.
– Some are compressed, some are not
• MP3 is common
– MP3 employs both lossy and lossless compression.
First it analyzes the frequency spread and compares it to mathematical models of human psychoacoustics (the study
of the interrelation between the ear and the brain), then it discards information that can’t be heard by humans. Then
the bit stream is compressed using a form of Huffman encoding to achieve additional compression. Your results may
vary. Don’t use MP3s if you suffer from a preference for quality music. Ask your doctor if MP3s are right for you.
3-41
Graphics
• Raster
• Vector
3-42
Raster Graphics
• Digitizing a picture is the act of representing it
as a collection of individual dots called pixels.
• Two resolutions are involved:
– Pixel resolution: the number of pixels used to represent a
picture
– Color resolution: how many bits are used for
each pixel
3-43
Raster Graphics
• The storage of image information on a pixel-by-pixel
basis is called a raster-graphics
format. Several popular raster file formats exist,
including bitmap (BMP), GIF, and JPEG.
3-44
Digitized Images and Graphics
Figure 3.12 A digitized picture composed of many individual pixels
3-46
Representing Graphic Colors
• Color is our perception of the various
frequencies of light that reach the retinas of
our eyes.
• Our retinas have three types of color
photoreceptor cone cells that respond to
different sets of frequencies. These
photoreceptor categories correspond to the
colors of red, green, and blue.
3-47
Representing Graphic Colors
• Color is often expressed in a computer as an
RGB (red-green-blue) value, which is actually
three numbers that indicate the relative
contribution of each of these three primary
colors.
• For example, an RGB value of (255, 255, 0)
maximizes the contribution of red and green,
and minimizes the contribution of blue, which
results in a bright yellow.
3-48
Representing Images and Graphics
Figure 3.10 Three-dimensional color space
3-49
Representing Images and Graphics
• The number of bits that is used to represent a
color is called the color depth.
• TrueColor indicates a 24-bit color depth.
– 8 bits are used for Red
– 8 bits are used for Green
– 8 bits are used for Blue
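A 24-bit color can be sketched in Python as three 8-bit values packed into one integer. Putting red in the highest byte is a common convention but is an assumption here, and the function names are illustrative.

```python
def pack_rgb(red, green, blue):
    """Pack three 0..255 values into one 24-bit integer (red in the high byte)."""
    for value in (red, green, blue):
        if not 0 <= value <= 255:
            raise ValueError("each color component must fit in 8 bits")
    return (red << 16) | (green << 8) | blue

def unpack_rgb(color):
    """Split a 24-bit integer back into its (red, green, blue) components."""
    return (color >> 16) & 0xFF, (color >> 8) & 0xFF, color & 0xFF

yellow = pack_rgb(255, 255, 0)
print(format(yellow, "06X"))   # FFFF00
print(unpack_rgb(yellow))      # (255, 255, 0)
print(2 ** 24)                 # 16777216 distinct colors at 24-bit depth
```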
3-50
Representing Images and Graphics
3-51
How big is a Raster Graphic File?
• Assume we want a 24 bit color depth
• And the image is X pixels by Y pixels
• How big is the file?
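Ignoring file headers and compression (an assumption, since real formats such as BMP, GIF, and JPEG add one or both), the size is X times Y pixels times 3 bytes per pixel. A small Python sketch, with a 1024 by 768 image chosen only as an example:

```python
def raster_size_bytes(width, height, color_depth_bits=24):
    """Uncompressed pixel data: one color value per pixel, rounded up to whole bytes."""
    total_bits = width * height * color_depth_bits
    return (total_bits + 7) // 8

print(raster_size_bytes(1024, 768))   # 2359296 bytes, about 2.4 million
```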
3-52
Vector Graphics
• A vector-graphics format describes an image in
terms of lines and geometric shapes.
• Keywords describe each line's:
– Direction
– Thickness
– Color
– Etc…
• Not every pixel has to be accounted for.
3-53
Vector Graphics
• Vector graphics can be resized mathematically,
and these changes can be calculated
dynamically as needed.
(example: TrueType fonts)
• However, vector graphics are not well suited to
representing real-world images (photographs).
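Resizing "mathematically" can be sketched in Python with a line stored as a description rather than as pixels. The Line class and its fields are hypothetical and do not correspond to any particular vector file format.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Line:
    """A line described by its endpoints, thickness, and RGB color."""
    x1: float
    y1: float
    x2: float
    y2: float
    thickness: float
    color: tuple

    def scaled(self, factor):
        """Return the same line resized by `factor`; the color is unchanged."""
        return Line(self.x1 * factor, self.y1 * factor,
                    self.x2 * factor, self.y2 * factor,
                    self.thickness * factor, self.color)

line = Line(0, 0, 10, 5, 1, (255, 255, 0))
print(line.scaled(2))
# Line(x1=0, y1=0, x2=20, y2=10, thickness=2, color=(255, 255, 0))
```

The description stays the same size no matter how large the line is drawn, which is why no pixel-by-pixel data is needed.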
3-54
Vector Graphics
3-55