Transcript Slide 1

Data representation considers how a computer uses
numbers to represent data inside the computer. Three
types of data are considered at this stage:
1. Numbers, including positive numbers, negative numbers and fractions.
2. Text.
3. Graphics.
CS Topic 1 - Data Representation v2
Binary (Base 2)
The binary system requires only two symbols, 0 and 1.
The columns in binary represent:

2^7   2^6   2^5   2^4   2^3   2^2   2^1   2^0
128s  64s   32s   16s   8s    4s    2s    units

e.g. the binary number

0     0     0     1     0     1     0     1

is equal to 16 + 4 + 1 = 21 in decimal.
The number 1110 = 8+4+2 = 14 in decimal.
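The column method above can be sketched in Python. This is only an illustration; the function name binary_to_decimal is made up for this example and is not part of the course material.

```python
def binary_to_decimal(bits: str) -> int:
    """Add up the place value of every column that holds a 1."""
    total = 0
    # Walk from the rightmost column (units) to the leftmost.
    for position, bit in enumerate(reversed(bits)):
        if bit == "1":
            total += 2 ** position  # units, 2s, 4s, 8s, ...
    return total


print(binary_to_decimal("00010101"))  # 16 + 4 + 1
print(binary_to_decimal("1110"))      # 8 + 4 + 2
```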
Try the following. Show your working:
Example: 1110 = 8+4+2 = 14 in decimal
1. 0110 = 4+2 = 6
2. 1001 = 8+1 = 9
3. 0101 = 4+1 = 5
4. 1111 = 8+4+2+1 = 15
5. 0010 = 2
6. 1101 = 8+4+1 = 13
Try to learn the following powers of 2 by heart:
2^8  = 256
2^10 = 1024 = 1 K
2^16 = 65,536 = 64 K
2^20 = 1,048,576 = 1 MB
2^24 = 16 MB
2^30 = 1 GB
2^32 = 4 GB
2^40 = 1 TB = 1 Terabyte
Remember the units used in the binary system.
1 byte     = 8 bits
1 Kilobyte = 1024 bytes
1 Megabyte = 1024 Kilobytes
1 Gigabyte = 1024 Megabytes
1 Terabyte = 1024 Gigabytes
2048 Kilobytes = ?
A. 1024 Megabytes
B. 1 Gigabyte
☺C. 2 Megabytes
D. 4096 bytes
3 Gigabytes = ?
A. 24 Terabytes
☺B. 3072 Megabytes
C. 24 Kilobytes
D. 3072 Terabytes
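The two conversions above can be checked with a short Python sketch. The UNITS list and the convert function are hypothetical names chosen for this example; the only fact used is that each unit is 1024 times the one before it.

```python
# Units in increasing order; each step up is a factor of 1024.
UNITS = ["bytes", "KB", "MB", "GB", "TB"]


def convert(value: float, from_unit: str, to_unit: str) -> float:
    """Convert between units by multiplying or dividing by 1024 per step."""
    steps = UNITS.index(from_unit) - UNITS.index(to_unit)
    return value * 1024 ** steps


print(convert(2048, "KB", "MB"))  # Kilobytes to Megabytes
print(convert(3, "GB", "MB"))     # Gigabytes to Megabytes
```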
Here are some useful terms used in binary:
Bit: binary digit (1 or 0)
Byte: group of 8 bits; 2^8 = 256 values
Least significant bit (LSB): bit furthest to the right (units)
Most significant bit (MSB): bit furthest to the left
The computer is a two-state (binary) machine. All
components inside a computer and all backing storage
devices have only two states. e.g.
• a switch is on or off.
• a transistor conducts or does not conduct.
• a signal is a pulse of electricity or no pulse.
• an area of a magnetic disk is positive or negative.
• with laser technology light can reflect in two different directions.
Binary, using the digits 0 and 1, can be represented
by any two-state system.
Advantages of using Binary
1. A simple two-state system is less complex to
represent using electrical signals than our
decimal ten-state system. Degradation in signal
levels does not corrupt the information as easily
and so there is less chance of errors.
2. A two-state system is easy to store
magnetically and optically.
3. Calculations are simpler.
There are only four rules
for addition. These can
be easily built into the
electronic circuits.
0 + 0 = 0
0 + 1 = 1
1 + 0 = 1
1 + 1 = 0 carry 1
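As a rough illustration of how the four addition rules are enough to add whole binary numbers, here is a Python sketch. The RULES table and the add_binary function are invented for this example; a real processor does this in hardware rather than with strings.

```python
# The four addition rules: (bit, bit) -> (sum bit, carry bit)
RULES = {
    ("0", "0"): ("0", "0"),
    ("0", "1"): ("1", "0"),
    ("1", "0"): ("1", "0"),
    ("1", "1"): ("0", "1"),
}


def add_binary(a: str, b: str) -> str:
    """Add two binary strings column by column, right to left."""
    width = max(len(a), len(b))
    a, b = a.zfill(width), b.zfill(width)
    carry = "0"
    result = []
    for bit_a, bit_b in zip(reversed(a), reversed(b)):
        # Add the two bits, then add the carry from the previous column.
        partial, carry_1 = RULES[(bit_a, bit_b)]
        total, carry_2 = RULES[(partial, carry)]
        carry = "1" if "1" in (carry_1, carry_2) else "0"
        result.append(total)
    if carry == "1":
        result.append("1")  # a final carry needs an extra column
    return "".join(reversed(result))


print(add_binary("0101", "0011"))  # 5 + 3
```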
The disadvantages of using binary are that:
1. A binary number has more digits than its
decimal equivalent, i.e. it will be longer.
This is not a problem for the computer but
it makes it harder for us to read and work
with.
2. Binary is more difficult than decimal for us to
read as we are more used to decimal.
An integer is a whole number, positive or negative.
Every integer stored in the computer is allocated the same
amount of space, whether it is a large integer or a small
integer. The number of bits allocated determines the
range of numbers which can be stored.
If one byte was allowed then the largest integer would be:
11111111
which is 255 in decimal, or 2^8 - 1.
Two bytes would allow:
2^16 = 65,536 possibilities, a range from 0 to 65535 (2^16 - 1).
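The link between the number of bits and the largest positive integer can be sketched in Python. The function name unsigned_range is an assumption made for this example.

```python
def unsigned_range(bits: int) -> tuple[int, int]:
    """Smallest and largest positive integers that fit in the given number of bits."""
    return 0, 2 ** bits - 1


print(unsigned_range(8))   # one byte
print(unsigned_range(16))  # two bytes
```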
If a computer only had to store positive integers then
we could easily convert each number into its binary
equivalent as you saw in the examples earlier.
However, negative numbers have to be stored too and
we need to find a method of representing a negative sign
using 1s and 0s.
Modern computers use the Two’s complement method
to represent integers.
With this method we take the most significant bit (the
one on the far left) and give its column a negative value.
The following examples illustrate the principle using 4
bit numbers to help you understand. A modern
computer would use 32 bit numbers for integers.
In your NABS and final exams you are likely to be
asked to use 8 bit numbers and you will practise with
these later.
Two’s Complement

Binary   Decimal
1000     -8
1001     -7
1010     -6
1011     -5
1100     -4
1101     -3
1110     -2
1111     -1
0000     0
0001     +1
0010     +2
0011     +3
0100     +4
0101     +5
0110     +6
0111     +7

In this table the 1 at the far left represents -8 (negative 8).
Make sure that you understand this concept.
Note that the range is still 2^4 = 16 numbers = -8 to +7.
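The table can be reproduced with a few lines of Python. This is a sketch only; twos_complement_value is a name made up for this example. It treats the leftmost column as negative and the remaining columns as ordinary binary.

```python
def twos_complement_value(bits: str) -> int:
    """Leftmost column counts as negative; the other columns are normal binary."""
    msb_weight = 2 ** (len(bits) - 1)          # 8 for a 4 bit number
    value = -msb_weight if bits[0] == "1" else 0
    if len(bits) > 1:
        value += int(bits[1:], 2)              # remaining columns: 4s, 2s, units
    return value


for pattern in ["1000", "1011", "1111", "0000", "0111"]:
    print(pattern, twos_complement_value(pattern))
```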
Range and Accuracy of Two’s Complement
1. The range of numbers which can be stored
depends on the number of bits being used.
4 bit numbers have a range -8 to +7
8 bit numbers have a range -128 to +127
2. In a modern computer 32 bits are used to store integers.
This gives 2^32 different values, a range from
-2,147,483,648 to +2,147,483,647.
3. Integers within this range are stored with 100% accuracy
using two’s complement.
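The ranges quoted above follow from the number of bits, as this Python sketch shows. The function name twos_complement_range is hypothetical.

```python
def twos_complement_range(bits: int) -> tuple[int, int]:
    """Lowest and highest integers for a two's complement number of this width."""
    lowest = -(2 ** (bits - 1))      # only the negative leftmost column is set
    highest = 2 ** (bits - 1) - 1    # every column except the leftmost is set
    return lowest, highest


for width in (4, 8, 32):
    print(width, twos_complement_range(width))
```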
8 Bit Two’s complement numbers
Here is an example of how to work out the
Two’s complement for the number -80

        -128   64   32   16    8    4    2    1
-80 =      1    0    1    1    0    0    0    0

Working:
128 - 80 = 48
48 - 32 = 16
16 - 16 = 0

1. The number is negative so put a 1 in the first column.
2. Subtract the 80 from 128.
3. Now make 48 from the remaining columns using normal
binary rules.
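The three steps can be sketched in Python as follows. The function name to_twos_complement_8bit is made up for illustration, and the range check is an addition of this sketch rather than one of the steps above.

```python
def to_twos_complement_8bit(number: int) -> str:
    """Follow the slide's method: sign column first, then the remainder."""
    if not -128 <= number <= 127:
        raise ValueError("Number out of range")
    if number < 0:
        sign_bit = "1"              # step 1: put a 1 in the -128 column
        remainder = 128 + number    # step 2: e.g. 128 - 80 = 48
    else:
        sign_bit = "0"
        remainder = number
    # Step 3: make the remainder from the other seven columns.
    return sign_bit + format(remainder, "07b")


print(to_twos_complement_8bit(-80))
```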
Express the following numbers using 8 bit Two’s
complement:

           -128   64   32   16    8    4    2    1
1. -45        1    1    0    1    0    0    1    1
2. -21        1    1    1    0    1    0    1    1
3. -16        1    1    1    1    0    0    0    0
4. 127        0    1    1    1    1    1    1    1
5. -129    Number out of range
Real numbers (numbers with a decimal point in them)
are stored using floating point representation. This is
like standard form/scientific notation used in decimal.
1101.101 = .1101101 x 2^100 (the exponent 100 is written in binary)
1. The binary point is moved to the far left.
2. The point has been moved 4 places to the left so we need
to multiply by 2^4. The power 4 = 100 in binary.
The general form of this representation is
m x b^e
where m = mantissa (the number)
      b = base
      e = exponent (the power)
As the base is always 2 and the point is always at the far
left, we only need to store the mantissa and the
exponent, so the number 1101.101 becomes:

Mantissa   Exponent
1101101    100
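The split into mantissa and exponent can be sketched in Python. This is a simplified illustration that assumes a positive binary number with a non-zero whole part, written as a string; the function name normalise is hypothetical.

```python
def normalise(binary: str) -> tuple[str, str]:
    """Split e.g. '1101.101' into (mantissa digits, exponent in binary)."""
    whole, _, fraction = binary.partition(".")
    mantissa = whole + fraction   # the digits once the point is at the far left
    exponent = len(whole)         # how many places the point was moved
    return mantissa, format(exponent, "b")


print(normalise("1101.101"))
```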
Range and Precision of Floating Point numbers
1. The range of numbers which can be stored depends
on the number of bits being used for the exponent.
The exponent has no effect on precision.
2. The precision of the numbers being stored depends
on the number of bits being used for the mantissa.
The mantissa has no effect on range.
3. In a modern computer, floating point allows:
A 4 byte mantissa: -2^31 to +2^31
A 1 byte exponent: -128 to 127
In decimal this means accuracy to 9 significant
figures and a range from 10^-38 to 10^38.
Text is made up of characters and each character is
allocated its own binary code.
The set of characters that can be represented by a
computer is known as the character set.
Western world alphabets need around 80 characters.
These are made up of 26 upper case letters, 26 lower case
letters, 10 digits 0-9, and around 20 punctuation marks.
80 characters would need a 7 bit code, as a 6 bit code gives
only 2^6 = 64 codes. A 7 bit code allows 2^7 = 128 different codes.
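The number of bits needed for a character set of a given size can be worked out in Python. The function name bits_needed is an assumption for this example; it finds the smallest code length that gives at least one code per character.

```python
def bits_needed(characters: int) -> int:
    """Smallest number of bits x such that 2**x >= characters."""
    # The highest code used is characters - 1; count the bits it needs.
    return max(1, (characters - 1).bit_length())


print(bits_needed(80))   # around 80 Western characters
print(bits_needed(128))  # a full 7 bit code
```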
It is useful to have a standard code so that text can be
transferred between different types of computer easily
without the need for translation.
ASCII and Unicode are two of the most common codes in
use today.
ASCII (American Standard Code for Information
Interchange) is a 7 bit code allowing 128 characters.
These include 95 printable characters (codes 32 to 126)
and 33 control characters (codes 0 to 31 and 127), which
control the display devices. Examples of control
characters include:
Code 13 = Carriage Return
Code 9 = TAB
Code 10 = Line feed
Code 8 = Backspace
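Python's built-in ord function returns the code of a character, so the ASCII codes above can be checked directly. The helper ascii_code is a made-up name for this sketch.

```python
def ascii_code(char: str) -> str:
    """Return the 7 bit ASCII code of a character as a binary string."""
    code = ord(char)
    if code > 127:
        raise ValueError("not an ASCII character")
    return format(code, "07b")


print(ascii_code("A"))
# The four control characters listed above: carriage return, tab,
# line feed and backspace.
print(ord("\r"), ord("\t"), ord("\n"), ord("\b"))
```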
ASCII is often extended to 8 bits, which allows
2^8 = 256 different characters.
These include alphabetic characters in foreign languages
and accented characters. This standard became known
as extended ASCII and then ISO 8859.
ASCII was designed to cope with Western based
character sets such as English, French, German but
did not include Japanese or Arabic symbol shapes.
The increase in worldwide communication led to a
need for a larger standard code to cope with other
foreign alphabets, technical symbols etc.
Unicode provides a unique number for every character,
no matter what the platform,
no matter what the program,
no matter what the language.
www.unicode.org
Unicode was originally designed around a 16 bit code for each character.
This provides a unique code for up to 2^16 = 65,536 characters.
Unicode includes all the ASCII character codes
to ensure compatibility.
Unicode
Advantage –
Can represent many more characters
than ASCII.
Disadvantage –
takes up more space to store Unicode
than it does to store ASCII.
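The storage cost can be seen in a short Python sketch. It assumes ASCII text is stored one byte per character and uses Python's "utf-16-be" codec as a stand-in for a 16 bit Unicode code; the function name storage_bytes is hypothetical.

```python
def storage_bytes(text: str) -> tuple[int, int]:
    """Bytes needed to store the text as ASCII and as 16 bit Unicode."""
    ascii_bytes = text.encode("ascii")        # one byte per character
    unicode_bytes = text.encode("utf-16-be")  # two bytes per character here
    return len(ascii_bytes), len(unicode_bytes)


print(storage_bytes("Data"))
# The ASCII code for "A" (65) is kept, padded with a leading zero byte.
print("A".encode("utf-16-be"))
```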
A bit-mapped graphic is seen as a matrix of pixels (picture elements)
and the colour of each pixel is represented by a binary code.
This simple graphic of a match stick man could be stored as a
series of binary numbers.
In black and white mode, each pixel requires a one bit
code: 0 for white, 1 for black.

   ███       000111000
   ███       000111000
█████████    111111111
   ███       000111000
  █   █      001000100
██     ██    110000011
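The idea of storing the match stick man as rows of bits can be tried out in Python. The ROWS list holds the six binary numbers for the graphic; the render function and the use of "#" for a black pixel are choices made for this sketch.

```python
# One string per row of the graphic; 1 = black pixel, 0 = white pixel.
ROWS = [
    "000111000",
    "000111000",
    "111111111",
    "000111000",
    "001000100",
    "110000011",
]


def render(rows: list[str]) -> str:
    """Turn each 1 into '#' and each 0 into a space."""
    return "\n".join(row.replace("1", "#").replace("0", " ") for row in rows)


print(render(ROWS))
# One bit per pixel, so the storage needed is width x height bits.
print(sum(len(row) for row in ROWS), "bits")
```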
Resolution refers to the number of pixels in the width
and height of the image. The more pixels there are in
the image the higher the resolution.
A typical 15 inch TFT screen could have a
resolution of 1024 x 768 = 786,432 pixels.
Bit depth refers to the number of bits needed to represent
the colour of each pixel. Greyscale simply means shades of
grey and so each shade needs its own code.
A 2 colour image would
require a 1 bit code. e.g.
0 = red
1 = green
A 16 colour image would need
a 4 bit code (2^4 = 16), e.g.
0000 = red
0001 = green
0010 = blue
0011 = yellow
0100 = orange
0101 = etc.
The remaining codes are 0110, 0111, 1000, 1001, 1010,
1011, 1100, 1101, 1110 and 1111.
Increasing the number of colours
that are available increases the size
of the code for each colour.

Bit depth x             No of colours
(No of bits in code)    available = 2^x
1                       2
4                       16
8                       256
16                      65,536
24 (true colour)        16,777,216 (about 16 million)
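The relationship between bit depth and the number of colours is a single power of 2, as this Python sketch shows. The function name colours_available is made up for the example.

```python
def colours_available(bit_depth: int) -> int:
    """Number of different colour codes for a given bit depth."""
    return 2 ** bit_depth


for depth in (1, 4, 8, 16, 24):
    print(depth, colours_available(depth))
```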
Here is an example of how to calculate memory requirements
for an image on a screen 800 x 600 using 16 million colours.
Number of pixels = 800 x 600 = 480000 pixels
Bit depth: 2^x = 16 million, so bit depth x = 24 bits
i.e. you need a 24 bit code to represent the colour for each
pixel.
The file size is 480000 x 24 bits = 11520000 bits
Divide by 8 to find the number of bytes: 11520000/8 = 1440000 bytes
Keep dividing by 1024 until you have an appropriate unit.
1440000/1024 = 1406.25 KB
1406.25/1024 = 1.37 MB (approximately 1.4 MB)
Remember that the size of an image depends on the
number of pixels and the bit depth.
1. Find the number of pixels.
2. Find the bit depth. (express answer in bits)
3. Multiply the pixels by the bit depth to give an answer in bits.
4. Divide by 8 to give the answer in bytes.
5. Keep dividing by 1024 to find the answer in KB, MB or GB.

Resolution    No of colours   File size
640 x 420     16              131.25 KB
800 x 600     65,536          937.5 KB
1024 x 768    256             768 KB
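The steps for finding the size of an image can be followed in a short Python sketch. The function names are hypothetical, and the bit depth is passed in directly (4 bits for 16 colours, 16 bits for 65,536 colours, 8 bits for 256 colours).

```python
def image_size_bytes(width: int, height: int, bit_depth: int) -> float:
    """Steps 1 to 4: pixels, then bits, then bytes."""
    pixels = width * height     # step 1
    bits = pixels * bit_depth   # steps 2 and 3
    return bits / 8             # step 4


def image_size_kb(width: int, height: int, bit_depth: int) -> float:
    """Step 5, stopping at Kilobytes."""
    return image_size_bytes(width, height, bit_depth) / 1024


print(image_size_kb(640, 420, 4))
print(image_size_kb(800, 600, 16))
print(image_size_kb(1024, 768, 8))
```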
Sometimes you are given the bit depth in the question e.g.
24 bit colour. This makes the question easier.
If you are only told how many colours can be
represented then unfortunately you have to
calculate the bit depth using the equation:
2^x = number of colours
where x is the bit depth.
Use a calculator to do this if necessary.
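Instead of a calculator, the bit depth can be found with a logarithm in Python. This is a sketch; the function name bit_depth is an assumption, and it rounds up so that a figure like "16 million" still gives a whole number of bits.

```python
import math


def bit_depth(colours: int) -> int:
    """Smallest whole x for which 2**x is at least the number of colours."""
    return math.ceil(math.log2(colours))


print(bit_depth(16))
print(bit_depth(65536))
print(bit_depth(16_000_000))  # "16 million" colours
```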
A higher bit depth allows more colours so the quality of
photographs etc will improve.
Disadvantage:
the file size will increase.
If asked to work out how many images can be stored on a
backing storage medium then remember to round down
your answer as you would want to store complete images.
Here is a worked example:
How many 8.4 MB images can be stored on a 1 GB
memory stick?
1. Make sure that each number is using the same units.
So 1 GB = 1024 MB
2. Divide the capacity by the size of one image
1024/8.4 = 121.904
3. Round down the answer (Remember that you
wouldn’t store a part of an image!)
You can store 121 images on a 1 GB memory stick.
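The same working can be written as a Python sketch. The function name images_that_fit is hypothetical; math.floor does the rounding down.

```python
import math


def images_that_fit(capacity_gb: float, image_mb: float) -> int:
    """Number of complete images that fit on the storage medium."""
    capacity_mb = capacity_gb * 1024             # step 1: same units
    return math.floor(capacity_mb / image_mb)    # steps 2 and 3


print(images_that_fit(1, 8.4))
```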
Bit Map graphics - Advantages
1. You can edit individual pixels in the image.
2. It is easy to draw freehand shapes.
Bit Map graphics - Disadvantages
1. File sizes are large as the content of every pixel has
to be stored (even blank (background) pixels).
2. Resolution dependent - when a graphic is created at a
particular resolution it cannot then take advantage of a
higher resolution device. It becomes "blocky" if enlarged.
3. It is difficult to manipulate shapes on the screen.
(e.g. move, scale, rotate or layer)
A vector graphic is seen as being made up of a series of objects.
A mathematical description of each object is stored as a set
of instructions or formulae.
A straight line can
be stored as a set
of two co-ordinate
pairs, a line
colour, thickness,
pattern and layer.
A square has co-ordinates for four points (four
co-ordinate pairs), line colour, thickness, pattern,
fill pattern and layer.
This information allows the objects to be represented accurately.
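As an illustration of storing an object by its attributes rather than by its pixels, here is a Python sketch of a straight line. The class name Line, its attribute names and the scaled method are all invented for this example; real vector file formats define their own fields.

```python
from dataclasses import dataclass


@dataclass
class Line:
    """A straight line described by its attributes, not by pixels."""
    start: tuple[int, int]   # first co-ordinate pair
    end: tuple[int, int]     # second co-ordinate pair
    colour: str
    thickness: int
    pattern: str
    layer: int

    def scaled(self, factor: int) -> "Line":
        """Scaling only changes the co-ordinates; nothing is redrawn."""
        return Line(
            (self.start[0] * factor, self.start[1] * factor),
            (self.end[0] * factor, self.end[1] * factor),
            self.colour, self.thickness, self.pattern, self.layer,
        )


line = Line((0, 0), (10, 5), "black", 1, "solid", 1)
print(line.scaled(2))
```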
Vector graphics - Advantages
1. Resolution independent - a graphic created at a
particular resolution can take advantage of a higher
resolution device. It will still look in proportion.
2. It is easy to manipulate shapes on the screen.
(e.g. move, scale, rotate or layer)
3. File sizes are generally smaller as values do not
need to be held for every pixel.
4. Objects can be grouped to form larger objects that can
then be manipulated as a single object.
Vector graphics - Disadvantages
1. It is difficult to represent freehand shapes as the
computer needs to describe them mathematically.
2. You cannot edit individual pixels.
Bit mapped & vector graphics - File size
Vector -
The more objects there are on the
screen the bigger the file size will be.
Bit-mapped -
At any given resolution and bit depth, the
file size will be the same. It doesn’t matter
what is actually on the screen. The
content of every pixel has to be stored.
Graphics on screen and at the printer
Bit mapped and Vector are different ways of
representing graphics in RAM and on disk.
It is important to remember that monitors and printers
always display a graphic as a bit-map.
A vector graphic has to be converted into a bit map before
it is displayed on the screen or printed out. This is called
rasterising or rendering.
Bit mapped packages often have the word Paint or Photo
associated with them. e.g. Adobe Photoshop.
Vector packages often contain the words Draw or Design
e.g. Corel Draw.