
Data Rep
CSCI130
Instructor: Dr. Lynn Ziegler
Overview of Computer Design

Memory is divided into locations/registers/words
Each holds a fixed amount of data
Each has an address used to access its contents
P.O. boxes: the number is the address and the box is the register

Each word is made of a fixed number of memory units called
bits (Binary digITs), each having one of two states. (Word sizes
for modern computers are almost always 32 or 64 bits.)
1 means electric current is strong
0 means no/weak electric current

The normal addressable chunk of information is 8 bits (1
byte)
Data and addresses are represented in binary
Numbers, characters
Memory sizes are measured in bytes in groups of
Overview of Computer Design

The number system that we use daily is called the
decimal system




How many different digits?
Why?
In computers, we have only two possibilities (0 or 1)
→ we can use the binary system
… + 10^4 + 10^3 + 10^2 + 10^1 + 10^0 becomes
… + 2^4 + 2^3 + 2^2 + 2^1 + 2^0 = … + 16 + 8 + 4 + 2 + 1

37 = 3x10^1 + 7x10^0
100101 = 1x2^5 + 0x2^4 + 0x2^3 + 1x2^2 + 0x2^1 + 1x2^0
Binary to Decimal and Vice Versa

Binary → decimal
1100 1001_2 = ?
1001 1001_2 = ?
0010 1001_2 = ?

Decimal → binary
14_10 = ?
9_10 = ?
129_10 = ?
This is how we can represent (positive) numbers
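
The place-value idea above can be sketched in a few lines of Python. This is only an illustration; the function names `binary_to_decimal` and `decimal_to_binary` and the default width of 8 bits are assumptions made for the example, not part of the course material.

```python
def binary_to_decimal(bits: str) -> int:
    """Read the bits left to right; each step doubles the running total
    and adds the new bit, which is equivalent to summing place values."""
    value = 0
    for bit in bits.replace(" ", ""):
        value = value * 2 + int(bit)
    return value


def decimal_to_binary(n: int, width: int = 8) -> str:
    """Repeatedly divide by 2; the remainders are the bits, right to left."""
    digits = ""
    while n > 0:
        digits = str(n % 2) + digits
        n //= 2
    return digits.rjust(width, "0")  # pad with leading zeros (assumed width)


print(binary_to_decimal("100101"))  # 37
print(decimal_to_binary(37))        # 00100101
```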
Hexadecimal

Binary is not very convenient for humans to use
1001000010010010
Instead we use the hexadecimal system (base 16)
Group every four binary digits into a single hexadecimal
value starting at the right.

In base 10, we have 10 digits (0-9)
In base 2, we have 2 digits (0-1)
What about base 16 (hexadecimal)?
0, 1, 2, 3, …, 9, A, B, C, D, E, and F
… + 16^3 + 16^2 + 16^1 + 16^0 = … + 4096 + 256 + 16 + 1
A1F = A*16^2 + 1*16^1 + F*16^0
= 10*16*16 + 1*16 + 15 = 2591_10
Binary to Hexa and Vice Versa

From binary → hexadecimal: group 4 bits at a
time as one hex digit
1001111110010010_2 = 1001 1111 1001 0010
= 9F92_16
11001001_2 = ?
10011001_2 = ?
Group starting from left or right?
0101001_2 = ?

From hexadecimal → binary:
9A92_16 = 1001 1010 1001 0010
A4_16 = ?
9_16 = ?
BC_16 = ?
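
The grouping rule can be tried out with a short sketch. The helper names `binary_to_hex` and `hex_to_binary` are hypothetical, chosen for this example only. Note that the padding is added on the left, so the groups of four are formed starting from the right.

```python
HEX_DIGITS = "0123456789ABCDEF"


def binary_to_hex(bits: str) -> str:
    bits = bits.replace(" ", "")
    # Pad on the left to a multiple of 4 so grouping starts at the right.
    bits = bits.zfill((len(bits) + 3) // 4 * 4)
    groups = [bits[i:i + 4] for i in range(0, len(bits), 4)]
    return "".join(HEX_DIGITS[int(group, 2)] for group in groups)


def hex_to_binary(hex_str: str) -> str:
    # Each hex digit expands to exactly four bits.
    return " ".join(format(HEX_DIGITS.index(d), "04b") for d in hex_str.upper())


print(binary_to_hex("1001111110010010"))  # 9F92
print(hex_to_binary("9A92"))              # 1001 1010 1001 0010
```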
Unsigned Integers



Range of values for 8-bit unsigned?
What if we want to represent higher
values? Say up to 300? Up to 600?
How many bytes (or memory locations)
would that require?
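
As a rough check on these questions, the sketch below computes the unsigned range for a given number of bits and the number of whole bytes needed for a given maximum value. The function names are assumptions made for illustration.

```python
def unsigned_range(bits: int):
    """Smallest and largest value an unsigned integer of this size can hold."""
    return 0, 2 ** bits - 1


def bytes_needed(max_value: int) -> int:
    """Whole bytes needed to store max_value as an unsigned integer."""
    bits = max(1, max_value.bit_length())
    return (bits + 7) // 8  # round up to a whole number of bytes


print(unsigned_range(8))   # (0, 255)
print(bytes_needed(300))   # 2
print(bytes_needed(600))   # 2
```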
Signed Integers

Easy to store positive numbers


Negative numbers?
Signed-magnitude representation






Use the last (leftmost) bit as the sign bit
1 for negative and 0 for positive
1000 0011 = ?
0100 0011 = ?
1000 0000 = ?
0000 0000 = ?
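
A minimal sketch of reading a signed-magnitude pattern, assuming the leftmost bit is the sign and the remaining bits are the magnitude. The function name `sign_magnitude_to_int` is hypothetical.

```python
def sign_magnitude_to_int(bits: str) -> int:
    bits = bits.replace(" ", "")
    magnitude = int(bits[1:], 2)  # all bits except the sign bit
    return -magnitude if bits[0] == "1" else magnitude


for pattern in ["1000 0011", "0100 0011", "1000 0000", "0000 0000"]:
    print(pattern, "=", sign_magnitude_to_int(pattern))
```

Running it shows that two different bit patterns decode to zero.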
Signed Integers

Problems:

Two values for zero!



Problems for comparisons
Addition won’t work well
Adding in binary?
  1000 0011   (-3)
+ 0100 0011   (67)
-------------
  1100 0110   (-70)


Obviously, we need something better

→ 2’s complement representation
2’s Complement Representation


Used by almost all computers today
All places hold the value they would in binary except for the
leftmost place which is negative



8-bit integer place values: -128 64 32 16 8 4 2 1
Range?
If the leftmost bit
is 0, then the number is positive → it looks just as before
is 1, then the number is negative → add the values of the other 7 bits and then subtract 128
1000 1101 = 1+4+8-128 = -115
What if I wanted to represent numbers up to
-200?
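
The reading rule above (leftmost place is negative, all other places as in ordinary binary) can be sketched as follows. The function names and the assumption of at least 2 bits are illustrative choices, not part of the slides.

```python
def twos_complement_to_int(bits: str) -> int:
    bits = bits.replace(" ", "")
    width = len(bits)               # assumed to be at least 2
    lower = int(bits[1:], 2)        # value of every place except the leftmost
    if bits[0] == "1":
        lower -= 2 ** (width - 1)   # the leftmost place counts as negative
    return lower


def signed_range(width: int):
    """Smallest and largest value for a 2's complement integer of this width."""
    return -(2 ** (width - 1)), 2 ** (width - 1) - 1


print(twos_complement_to_int("1000 1101"))  # -115
print(signed_range(8))                      # (-128, 127)
print(signed_range(9))                      # (-256, 255)
```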
2’s Complement Representation

Converting from decimal to 2’s complement


For positive numbers: find the binary representation
43 = 0010 1011

For negative numbers:
Find the binary representation for its positive equivalent
Flip all bits
Add 1
-43: 43 = 0010 1011 → 1101 0100 + 1 → 1101 0101
1101 0101 = -128+64+16+4+1 = -43
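
The flip-and-add-one recipe can be written out directly. This is a sketch only; the function name `int_to_twos_complement` and the default width of 8 bits are assumptions.

```python
def int_to_twos_complement(n: int, width: int = 8) -> str:
    if not -(2 ** (width - 1)) <= n <= 2 ** (width - 1) - 1:
        raise ValueError(f"{n} does not fit in {width} bits")
    if n >= 0:
        return format(n, f"0{width}b")
    positive = format(-n, f"0{width}b")                        # step 1: positive equivalent
    flipped = "".join("1" if b == "0" else "0" for b in positive)  # step 2: flip all bits
    return format(int(flipped, 2) + 1, f"0{width}b")           # step 3: add 1


print(int_to_twos_complement(43))   # 00101011
print(int_to_twos_complement(-43))  # 11010101
```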
We can use the usual laws of binary addition
2’s Complement Representation



125 - 19 = 106
  0111 1101   (125)
+ 1110 1101   (-19)
-------------
  0110 1010 = 106 !
What happened to the one that we carried at the
end?

Got lost but still we got the right answer!


We carried into and carried out of the leftmost column
Try 125+65 and -125-65
2’s Complement Representation




125+65 = 190
  0111 1101   (125)
+ 0100 0001   (65)
-------------
  1011 1110 = -66 !!!
We only carried into the leftmost column → overflow
Number too big to fit in 8 bits since range = [-128, 127]

125+65 = 190 > 127
2’s Complement Representation



-125-65 = -190
  1000 0011   (-125)
+ 1011 1111   (-65)
-------------
  0100 0010 = 66 !!!
We only carried out of the leftmost column → overflow

-190 < -128
Solution


Use larger registers (more than 8 bits)
For very large positive and very small (very negative) numbers we might still
have a problem → combine two registers (double precision)
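
The carry rule from these slides (overflow exactly when the carry into the leftmost column differs from the carry out of it) can be checked with a small sketch. The function name `add_8bit` and the choice to return a `(result, overflow)` pair are assumptions made for the example.

```python
def add_8bit(a: int, b: int):
    """Add two integers the way an 8-bit 2's complement register would."""
    raw = (a & 0xFF) + (b & 0xFF)          # add the 8-bit patterns
    result_bits = raw & 0xFF               # keep only 8 bits; a carry out is lost
    result = result_bits - 256 if result_bits & 0x80 else result_bits
    carry_out = raw > 0xFF                 # carried out of the leftmost column
    carry_in = ((a & 0x7F) + (b & 0x7F)) & 0x80 != 0  # carried into it
    overflow = carry_in != carry_out
    return result, overflow


print(add_8bit(125, -19))  # (106, False)
print(add_8bit(125, 65))   # (-66, True)
```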
Real Number Representation

We deal with a lot of real numbers in our lives (class
average) and while using computers
Fractions or numbers with decimal parts
3/10 = 0.3 or 82.34
82.34 = 8*10 + 2*1 + 3*1/10 + 4*1/100

In Math: to represent a real number in binary, we use
a radix point instead of a decimal point
1101.11 = 8 + 4 + 1 + ½ + ¼ = 13.75

On Computers: How can we represent the radix
point? 0? 1?
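
For the mathematical notation with a radix point, the place values to the right of the point are 1/2, 1/4, 1/8 and so on. A minimal sketch, with a hypothetical function name, that evaluates such a string:

```python
def binary_real_to_decimal(text: str) -> float:
    whole, _, fraction = text.partition(".")
    value = float(int(whole, 2)) if whole else 0.0
    # Places after the radix point are worth 1/2, 1/4, 1/8, ...
    for position, bit in enumerate(fraction, start=1):
        value += int(bit) / 2 ** position
    return value


print(binary_real_to_decimal("1101.11"))  # 13.75
```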
Real Number Representation



We resort to the floating-point representation
We represent the number without the radix point!!!
Based on a popular scientific notation that has three parts:
Sign
Mantissa (one non-zero digit before the decimal point, and two or
more after), and
Exponent (written with an E followed by a sign and an integer which
represents what power to raise 10 to)

6.124E5 = 6.124 * 10^5 = 612,400
Sign? Mantissa? Exponent?
Scientific to decimal

-9.14E-3 = -9.14 * 10^-3 = -9.14/1000 = -0.00914
Real Number Representation

Decimal to scientific



Divide (or multiply) by 10 until you have only one
(non-zero) digit before the decimal point
Add (or subtract) one to the exponent every time
you divide (or multiply)

123.8 = ?
1.238E2
0.2348 = ?
2.348E-1
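
The divide-or-multiply procedure can be sketched as below. The function name `to_scientific` is hypothetical, and Python's `decimal` module is used here (an implementation choice, not something from the slides) so that the digits stay exact while shifting.

```python
from decimal import Decimal


def to_scientific(text: str) -> str:
    value = Decimal(text)
    if value == 0:
        raise ValueError("zero has no non-zero leading digit")
    sign = "-" if value < 0 else ""
    mantissa = abs(value)
    exponent = 0
    while mantissa >= 10:   # too many digits before the point: divide
        mantissa /= 10
        exponent += 1
    while mantissa < 1:     # no non-zero digit before the point: multiply
        mantissa *= 10
        exponent -= 1
    return f"{sign}{mantissa.normalize()}E{exponent}"


print(to_scientific("123.8"))   # 1.238E2
print(to_scientific("0.2348"))  # 2.348E-1
```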
Real Number Representation

Floating-point numbers are very similar

But use only 0s and 1s instead

+/- 1.xxxx * 2^exponent

For an eight bit number


0 000 0000
Sign, exponent, mantissa (not the standard … lab 2)


Sign = 0 for positive and 1 for negative
The exponent has three digits [0,7], but can be positive or negative:
shift by 4 → [-4,3]
000 → -4, 001 → -3, 010 → -2, …, 111 → 3
→ i.e. subtract 4 from the corresponding decimal value
Use base two now (i.e. 2 raised to the power of the exponent value)
Real Number Representation

The mantissa has one (non-zero) place before the
decimal point and a number of places after it
in binary our digits are 0 and 1 and so we always have 1.xxxx
→ No need to represent the one (add one to the result)

11100010 → 1 110 0010
Negative
110 = 6, 6 - 4 → 2, exponent = 2^2
Mantissa = 0010 = 0*1/2 + 0*1/4 + 1*1/8 + 0*1/16 = 1/8
→ without the initial 1 which we’ve omitted → 1 + 1/8
→ Or 1 + decimal equivalent/16 = 1 + 2/16
-(2*2)(1 + 1/8) = -4 1/2
Real Number Representation

Floating point to decimal conversion






1: break bit pattern into sign (1 bit), exponent (3
bits), mantissa (4 bits)
2: sign is - if 1 and + otherwise
3: exponent = decimal equivalent - 4
4: mantissa = 1 + decimal equivalent/2^4
5: number = (sign) mantissa * 2^exponent
Try 01100011
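
The five steps translate almost line for line into code. This is a sketch of the course's 8-bit teaching format (1 sign bit, 3 exponent bits, 4 mantissa bits), not of any standard format; the function name `decode_float8` is an assumption. It does not treat the all-zero pattern specially.

```python
def decode_float8(bits: str) -> float:
    bits = bits.replace(" ", "")
    if len(bits) != 8:
        raise ValueError("expected exactly 8 bits")
    sign = -1 if bits[0] == "1" else 1      # step 2
    exponent = int(bits[1:4], 2) - 4        # step 3
    mantissa = 1 + int(bits[4:], 2) / 16    # step 4 (2^4 = 16)
    return sign * mantissa * 2 ** exponent  # step 5


print(decode_float8("1 110 0010"))  # -4.5
print(decode_float8("01100011"))    # 4.75
```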
Real Number Representation


What about from decimal to floating point?
Format the number into the form 1.xxxx * 2^exponent








Multiply (or divide) by 2 until we have a 1 before the decimal point
Subtract (or add) 1 from (to) the exponent for every such multiplication
(or division)
Add 4 to the exponent (in floating-to-decimal conversion we subtracted 4)
Convert exponent to binary
Subtract 1 from the resulting mantissa (in floating-to-decimal conversion
we added 1)
Multiply mantissa by 16 (in floating-to-decimal conversion we divided
by 16)
Round to the nearest integer
Convert mantissa to binary
Real Number Representation

-8.48:






Sign is negative → 1
8.48 → 4.24 → 2.12 → 1.06 → 3 divisions
exponent = 3+4 = 7 or 111_2
mantissa = mantissa - 1 → .06
multiply mantissa by 16 (write it in terms of 16ths)
→ 0.96 ~ 1 or 0001
1 111 0001
Real Number Representation

Decimal to floating-point conversion



Sign bit = 1 if negative and 0 otherwise
Exponent = 0, mantissa = absolute(number)
While mantissa < 1 do
    mantissa = mantissa * 2
    exponent = exponent - 1
While mantissa >= 2 do
    mantissa = mantissa / 2
    exponent = exponent + 1
Exponent = exponent + 4
Change to binary
Mantissa = (mantissa - 1)*16
Round off to nearest integer and change to binary
Assemble number: sign | exponent | mantissa
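
A runnable sketch of this algorithm for the 8-bit teaching format follows. The function name `encode_float8` is hypothetical. Three details are assumptions beyond the slide: zero is mapped to the all-zero pattern, a mantissa that rounds up to 16/16 is carried into the exponent, and values whose exponent does not fit in 3 bits raise an error.

```python
def encode_float8(number: float) -> str:
    sign = "1" if number < 0 else "0"
    mantissa = abs(number)
    if mantissa == 0:
        return "00000000"      # assumption: all zeros stands for zero
    exponent = 0
    while mantissa < 1:
        mantissa *= 2
        exponent -= 1
    while mantissa >= 2:
        mantissa /= 2
        exponent += 1
    fraction = round((mantissa - 1) * 16)  # nearest number of 16ths
    if fraction == 16:         # assumption: rounding up carries into the exponent
        fraction = 0
        exponent += 1
    exponent += 4              # shift [-4, 3] to [0, 7]
    if not 0 <= exponent <= 7:
        raise ValueError("exponent does not fit in 3 bits")
    return sign + format(exponent, "03b") + format(fraction, "04b")


print(encode_float8(-8.48))   # 11110001
print(encode_float8(0.319))   # 00100100
```

Python's `round` sends exact halves to the nearest even integer, which may differ from rounding by hand in tie cases.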
Real Number Representation

0.319
Positive → sign bit = 0_2
mantissa = 0.319 * 2 = 0.638 → exponent = -1
mantissa = 0.638 * 2 = 1.276 → exponent = -2
exponent = -2 + 4 = 2 = 010_2
mantissa = (mantissa - 1)*16 = 4.416 ~ 4 = 0100_2
0 010 0100

-0.319
Negative → sign bit = 1_2
exponent = 010_2
mantissa = 0100_2
1 010 0100

0
Sign bit = 0_2
mantissa? Not going to change to 1.xxxx no matter what!
Real Number Representation

No representation for 0



0 000 0000 = 1.0 * 2^-4 (we assumed there is a 1 before the mantissa)
Assume this to be zero!
Another issue, is the method exact?
Rounding and truncation (close numbers give same floating point values)
Mantissa = 4.416 or 4.0123 or 4.400 are all ~ 4 = 0100_2
Problem is also due to using only 8 bits (4-bit mantissa)
Floating-point numbers require 32 or even 64 bits usually
But still we will have to round off and truncate
That happens regularly with us even when not using computers
PI ≈ 22/7
2/3 or 1/3
We approximate a lot … but we should know it!
Higher precision applications use much larger size registers
Non-numeric Data Representation

Text data






The most common type of data people use on computers
How can we transform it to binary --- not as intuitive as
numbers!
Words can be divided into characters
Each character can then be encoded by some binary code
Every language has its set of letters → we will limit
ourselves to the Latin alphabet
Numbers were (relatively) easy to map to binary


Decimal to binary change
What about letters and other symbols?
Non-numeric Data Representation


Many transformations exist and all are arbitrary
Two popular ones:
EBCDIC (Extended Binary Coded Decimal Interchange Code) by IBM
ASCII (American Standard Code for Information Interchange) by
American National Standards Institute (ANSI)
Most widely used
Every letter/symbol is represented by 7 bits
→ How many letters/symbols do we have in total?
A-Z (26), a-z (26), 0-9 (10), symbols (33), control characters (33)
If using 1 byte/character → we have one extra bit
→ Extended ASCII-8 (more mathematical and graphical symbols or
phonetic marks)
Extended Ascii (Macintosh Courier font)
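
To see the 7-bit codes in practice, the sketch below uses Python's built-in `ord`, which returns the ASCII code for ASCII characters. The function name `text_to_ascii_bits` is made up for this example.

```python
def text_to_ascii_bits(text: str):
    """Return the 7-bit ASCII pattern for each character of the text."""
    patterns = []
    for ch in text:
        code = ord(ch)
        if code > 127:
            raise ValueError(f"{ch!r} is not a 7-bit ASCII character")
        patterns.append(format(code, "07b"))
    return patterns


print(text_to_ascii_bits("Hi"))       # ['1001000', '1101001']
print(26 + 26 + 10 + 33 + 33, 2 ** 7)  # 128 128
```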
Data types in VB 6

Data type                                  Storage size              Range
Boolean                                    2 bytes                   True or False
Integer                                    2 bytes                   -32,768 to 32,767
Long (long integer)                        4 bytes                   -2,147,483,648 to 2,147,483,647
Single (single-precision floating-point)   4 bytes                   -3.402823E38 to -1.401298E-45 for negative values;
                                                                     1.401298E-45 to 3.402823E38 for positive values
Double (double-precision floating-point)   8 bytes                   -1.79769313486231E308 to -4.94065645841247E-324 for negative values;
                                                                     4.94065645841247E-324 to 1.79769313486232E308 for positive values
String                                     10 bytes + string length  0 to approximately 2 billion
Non-numeric Data Representation

Given a memory register with the value
00110100
2’s complement → 52
Floating point → 0.625
ASCII → character “4” (check ASCII table in book)
There are some code blocks preceding such
values informing the computer of the type


Sometimes called meta-data
E.g. Font style
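
The point that one bit pattern has several meanings can be demonstrated with a sketch that reads the same byte three ways. The function name `interpret_byte` is hypothetical, and the floating-point reading uses the course's 8-bit teaching format (1 sign bit, 3 exponent bits shifted by 4, 4 mantissa bits), not a standard one.

```python
def interpret_byte(bits: str) -> dict:
    bits = bits.replace(" ", "")
    unsigned = int(bits, 2)
    # 2's complement: the leftmost place is worth -128.
    twos = unsigned - 256 if bits[0] == "1" else unsigned
    # 8-bit teaching float: sign | exponent - 4 | 1 + mantissa/16.
    sign = -1 if bits[0] == "1" else 1
    exponent = int(bits[1:4], 2) - 4
    mantissa = 1 + int(bits[4:], 2) / 16
    return {
        "twos_complement": twos,
        "floating_point": sign * mantissa * 2 ** exponent,
        "ascii": chr(unsigned) if unsigned < 128 else None,
    }


print(interpret_byte("00110100"))
# {'twos_complement': 52, 'floating_point': 0.625, 'ascii': '4'}
```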
Non-numeric Data Representation

Picture/Image Data:
Divide the screen into a grid of cells each referred to as a
pixel
512*256 image → grid has 512 columns and 256 rows
Pixel values and sizes depend on the type of the image
Black & white images: 1 bit for every pixel such that 1 is black and 0
is white
Grayscale images: 1 byte/pixel where 255 is black and 0 is white and
anything in between is gray (higher/lower values are closer to
black/white)
Color images: Three values per pixel
Red/Green/Blue
1 byte per color → 3 bytes per pixel
For a 512x256 image (512x256x3 = 384K bytes)
For any image, we only store the pixel values
Assume all images have
100x80 resolution. Can
you find their sizes in
bits? bytes?
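
The size calculation is just width times height times bits per pixel. A small sketch, with a hypothetical function name, using the three pixel sizes described above (1 bit, 1 byte, 3 bytes):

```python
def image_size_bits(width: int, height: int, bits_per_pixel: int) -> int:
    """Storage needed for the pixel values only (no header or metadata)."""
    return width * height * bits_per_pixel


for label, bpp in [("black & white", 1), ("grayscale", 8), ("color", 24)]:
    bits = image_size_bits(100, 80, bpp)
    print(label, bits, "bits =", bits // 8, "bytes")

print(image_size_bits(512, 256, 24) // 8 // 1024, "K bytes")  # 384 K bytes
```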
An Image File














//B/W IMAGE
P1
15 11
011110111101111
010000100001001
010000111101111
010000000101001
011110111101111
000000000000000
011110011101001
010000000101001
011110000101001
000010010101001
011110011101111

//COLORED IMAGE
P3
2 7
255
255 0 0        0 255 0
0 0 225        255 255 0
255 0 225      0 255 225
100 150 200    200 150 100
50 50 50       100 100 100
0 0 0          255 255 225
5 200 225      192 150 225

http://www.csbsju.edu/computerscience/curriculum/launch/default.htm
Non-numeric Data Representation

Image movies are built from a number of images (or frames) that are
displayed in a certain sequence at high speeds

30 images per second
Assume every image has a size of 500 KB
500 KB * 30 ~ 15 MB per second
2-hr movie needs (assume same sample image used)
15 MB * 60 * 120 ~ 108 GB (108 billion bytes)!
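
The same estimate as a short calculation. The function name is made up for the example, and 1 KB is taken to be 1,000 bytes (an assumption that matches the round numbers on the slide).

```python
def movie_size_bytes(frame_bytes: int, frames_per_second: int, seconds: int) -> int:
    """Uncompressed size: every frame is stored in full."""
    return frame_bytes * frames_per_second * seconds


per_second = movie_size_bytes(500_000, 30, 1)
two_hours = movie_size_bytes(500_000, 30, 2 * 60 * 60)
print(per_second / 1e6, "MB per second")  # 15.0 MB per second
print(two_hours / 1e9, "GB")              # 108.0 GB
```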
Non-numeric Data Representation

Sound/Audio Data

Produced when objects vibrate in
matter (e.g. air)
Sends a wave of pressure fluctuations
Heard sound depends on the
amplitude and frequency
Sounds differ because of variations
in the sound wave frequency
(pitch or speed)
Frequency = # cycles/second
Higher wave frequency → pressure
fluctuation switches back and forth
more quickly during a period of time
We hear this as a higher pitch
Level of air pressure in each fluctuation,
the wave's amplitude or height,
determines how loud the sound is
[Figure: sound wave, Volts against TIME, with one Cycle marked]
Non-numeric Data Representation




Not all frequencies are audible
Your ears are particularly sensitive to sounds in
the middle range, from about 500 Hz to 2 kHz
The hi-fi range is defined as from 20 Hz to
20 kHz
As you get older, you will find it more and more
difficult to hear higher frequencies

By the time you are able to afford a decent hi-fi
system, you will probably be unable to fully
appreciate its performance 
Non-numeric Data Representation


Numbers used to represent the
amplitude of sound wave
Analog is continuous and we
need digital



Digitize the sound signal
Measure the voltage of the signal
at certain intervals (e.g. 44,100
per sec for CDs)
→ process of sampling
Reconstruct wave


Sound will slightly vary
A sound file is nothing but a
sequence of numbers measured
at equal intervals
Digitizing a sound wave
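
Sampling can be imitated in software by evaluating a wave at equal time intervals and rounding each measurement to a whole number. The sketch below does this for a pure sine tone; the function name `sample_sine`, the 440 Hz tone, and the choice of 256 levels (one byte per sample) are assumptions made for illustration only.

```python
import math


def sample_sine(frequency_hz: float, duration_s: float,
                sample_rate: int = 44_100, levels: int = 256):
    """Measure a sine wave at equal intervals and map each value to 0..levels-1."""
    count = round(duration_s * sample_rate)
    samples = []
    for n in range(count):
        t = n / sample_rate                                   # time of this sample
        amplitude = math.sin(2 * math.pi * frequency_hz * t)  # between -1 and 1
        samples.append(round((amplitude + 1) / 2 * (levels - 1)))
    return samples


samples = sample_sine(440, 0.01)
print(len(samples))   # 441 numbers for one hundredth of a second
print(samples[:5])
```

Rounding each measurement to one of a fixed number of levels is one reason the reconstructed sound varies slightly from the original.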
Non-numeric Data Representation


Compression can also be used for audio files
MP3: reduces size to 1/10th
→ faster transfer over the Internet
Non-numeric Data Representation

Digital image and audio have a lot of
advantages over non-digital ones

Can easily be modified by changing the bit pattern



Image enhancement, noise/distortion removal, etc …
Superimposing one sound on another, or one image on
another, results in new ones
Courts won’t accept them as evidence
Digital alterations