Lecture 5 COSC 3430 PH 3: Chapter 3 1

Lecture 5
COSC 3430
PH 3: Chapter 3
1
Converting n bit binary numbers to their decimal equivalent
• $a_{n-1} \dots a_2 a_1 a_0$ (bin) $= a_{n-1} 2^{n-1} + \dots + a_2 2^2 + a_1 2 + a_0$ (dec)
• For 2’s complement numbers we have
• $a_{n-1} \dots a_2 a_1 a_0 = -a_{n-1} 2^{n-1} + \dots + a_2 2^2 + a_1 2 + a_0$
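The two formulas differ only in the weight of the leading bit. A minimal Python sketch of both readings follows; the function names are illustrative and not from the slides.

```python
def unsigned_value(bits: str) -> int:
    """Value of a bit string read as an unsigned binary number."""
    n = len(bits)
    # Bit i from the left has weight 2**(n - 1 - i).
    return sum(int(b) << (n - 1 - i) for i, b in enumerate(bits))


def twos_complement_value(bits: str) -> int:
    """Value of the same bit string read as an n-bit 2's complement number.

    Only the leading bit changes weight: it counts as -2**(n-1) instead of
    +2**(n-1), which is the same as subtracting 2**n when that bit is 1.
    """
    n = len(bits)
    return unsigned_value(bits) - (int(bits[0]) << n)


print(unsigned_value("1010"), twos_complement_value("1010"))
```

For the pattern 1010 the unsigned reading is 8 + 2 and the 2's complement reading is -8 + 2.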
2
Possible Representations
Sign Magnitude    One's Complement    Two's Complement
000 = +0          000 = +0            000 = +0
001 = +1          001 = +1            001 = +1
010 = +2          010 = +2            010 = +2
011 = +3          011 = +3            011 = +3
100 = -0          100 = -3            100 = -4
101 = -1          101 = -2            101 = -3
110 = -2          110 = -1            110 = -2
111 = -3          111 = -0            111 = -1
Issues: balance, number of zeros, ease of operations
Which one is best? Why?
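As an aid for comparing the three schemes, the sketch below decodes every 3-bit pattern under each one. The function names are hypothetical, and values are returned as signed strings so that +0 and -0 stay distinguishable.

```python
def sign_magnitude(bits: str) -> str:
    # Leading bit is the sign; the remaining bits are the magnitude.
    magnitude = int(bits[1:], 2)
    return ("-" if bits[0] == "1" else "+") + str(magnitude)


def ones_complement(bits: str) -> str:
    # A negative number is the bitwise inverse of its magnitude.
    if bits[0] == "0":
        return "+" + str(int(bits, 2))
    inverted = "".join("1" if b == "0" else "0" for b in bits)
    return "-" + str(int(inverted, 2))


def twos_complement(bits: str) -> str:
    # Leading bit has weight -2**(n-1).
    value = int(bits, 2) - (int(bits[0]) << len(bits))
    return f"{value:+d}"


for i in range(8):
    b = format(i, "03b")
    print(b, sign_magnitude(b), ones_complement(b), twos_complement(b))
```

The output shows two encodings of zero in the first two schemes and a single zero plus one extra negative value in two's complement.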
3
MIPS (2’s complement representation)
• 32 bit signed numbers:
0000 0000 0000 0000 0000 0000 0000 0000two = 0ten
0000 0000 0000 0000 0000 0000 0000 0001two = + 1ten
0000 0000 0000 0000 0000 0000 0000 0010two = + 2ten
...
0111 1111 1111 1111 1111 1111 1111 1110two = + 2,147,483,646ten
0111 1111 1111 1111 1111 1111 1111 1111two = + 2,147,483,647ten (maxint)
1000 0000 0000 0000 0000 0000 0000 0000two = – 2,147,483,648ten (minint)
1000 0000 0000 0000 0000 0000 0000 0001two = – 2,147,483,647ten
1000 0000 0000 0000 0000 0000 0000 0010two = – 2,147,483,646ten
...
1111 1111 1111 1111 1111 1111 1111 1101two = – 3ten
1111 1111 1111 1111 1111 1111 1111 1110two = – 2ten
1111 1111 1111 1111 1111 1111 1111 1111two = – 1ten
4
Two's Complement Operations
• Negating a two's complement number: invert all bits and add 1
– remember: “negate” and “invert” are quite different!
• Converting n bit numbers into numbers with more than n bits:
– MIPS 16 bit immediate gets converted to 32 bits for arithmetic
– copy the most significant bit (the sign bit) into the other bits
0010 -> 0000 0010
1010 -> 1111 1010
– "sign extension" (lbu vs. lb, lbu does not sign extend)
– Also lh sign extends whereas lhu does not
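Both operations on this slide can be tried on bit strings. The following is a small sketch with illustrative helper names; `zero_extend` is included only to contrast with what lbu and lhu do.

```python
def negate(bits: str) -> str:
    """Two's complement negation: invert all bits, then add 1 (keeping n bits)."""
    n = len(bits)
    mask = (1 << n) - 1
    inverted = int(bits, 2) ^ mask          # invert all bits
    return format((inverted + 1) & mask, f"0{n}b")


def sign_extend(bits: str, width: int) -> str:
    """Copy the most significant bit (the sign bit) into the new upper bits."""
    return bits[0] * (width - len(bits)) + bits


def zero_extend(bits: str, width: int) -> str:
    """Fill the new upper bits with 0, as an unsigned load would."""
    return "0" * (width - len(bits)) + bits


print(negate("0110"))            # 0110 is +6
print(sign_extend("1010", 8))
print(zero_extend("1010", 8))
```

Note that negating the most negative pattern (1000 in 4 bits) gives back the same pattern, since +8 has no 4-bit representation.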
5
Addition & Subtraction
• Just like in grade school (carry/borrow 1s)
  0111      0111      0110
+ 0110    - 0110    - 0101
• Two's complement operations easy
– subtraction using addition of negative numbers
  0111
+ 1010
 10001
• Overflow (result too large for finite computer word):
– e.g., adding two n-bit numbers does not yield an n-bit number
  0111
+ 0001
  1000
note that overflow term is somewhat misleading, it does not mean a carry “overflowed”
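The two worked sums on this slide can be reproduced with a short sketch that adds n-bit strings and reports the carry out separately. The helper name `add_n_bits` is an assumption made for illustration.

```python
def add_n_bits(a: str, b: str):
    """Add two equal-length bit strings.

    Returns (n-bit sum as a bit string, carry out of the top bit).
    """
    n = len(a)
    total = int(a, 2) + int(b, 2)
    return format(total & ((1 << n) - 1), f"0{n}b"), total >> n


# 7 - 6 done as 7 + (-6): the carry out is discarded and 0001 remains.
print(add_n_bits("0111", "1010"))
# 7 + 1: no carry out, yet the 4-bit result 1000 reads as -8 (overflow).
print(add_n_bits("0111", "0001"))
```

The first case has a carry out but no overflow; the second has overflow but no carry out, which is why the slide calls the term misleading.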
6
Detecting Overflow
• No overflow when adding a positive and a negative number
• No overflow when signs are the same for subtraction
• Overflow occurs when the value affects the sign:
– overflow when adding two positives yields a negative
– or, adding two negatives gives a positive
– or, subtract a negative from a positive and get a negative
– or, subtract a positive from a negative and get a positive
• Consider the operations A + B, and A – B
– Can overflow occur if B is 0 ?
– Can overflow occur if A is 0 ?
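The sign rules above translate directly into code. This is a sketch under stated assumptions: a 4-bit word by default, operands given as Python integers already in range, zero treated as non-negative (sign bit 0), and function names that are illustrative only.

```python
def to_signed(x: int, n: int) -> int:
    """Interpret the low n bits of x as an n-bit two's complement value."""
    x &= (1 << n) - 1
    return x - (1 << n) if x >> (n - 1) else x


def add_overflows(a: int, b: int, n: int = 4) -> bool:
    """Overflow on a + b: operands share a sign and the result's sign differs."""
    result = to_signed(a + b, n)
    return (a >= 0) == (b >= 0) and (result >= 0) != (a >= 0)


def sub_overflows(a: int, b: int, n: int = 4) -> bool:
    """Overflow on a - b: operands differ in sign and the result's sign differs from a."""
    result = to_signed(a - b, n)
    return (a >= 0) != (b >= 0) and (result >= 0) != (a >= 0)


print(add_overflows(7, 1), add_overflows(-8, -1), add_overflows(7, -8))
print(sub_overflows(0, -8), sub_overflows(-8, 1), sub_overflows(5, 0))
```

Trying B = 0 and A = 0 with these functions is one way to explore the two questions on the slide; the case 0 - (-8) is worth a look because +8 does not fit in 4 bits.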
7
Effects of Overflow
• An exception (interrupt) occurs
– Control jumps to predefined address for exception
– Interrupted address is saved for possible resumption
• Details based on software system / language
– example: flight control vs. homework assignment
• Don't always want to detect overflow
– new MIPS instructions: addu, addiu, subu
note: addiu still sign-extends! But ignores overflow
note: sltu, sltiu for unsigned comparisons
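The difference between the trapping and non-trapping instructions can be modeled loosely in Python. This is only an analogy: a Python exception stands in for the hardware exception, and the names `OverflowTrap`, `add`, and `addu` are chosen here to echo the MIPS mnemonics.

```python
class OverflowTrap(Exception):
    """Stands in for the hardware overflow exception."""


def to_signed32(x: int) -> int:
    """Interpret the low 32 bits of x as a two's complement value."""
    x &= 0xFFFFFFFF
    return x - (1 << 32) if x & 0x80000000 else x


def add(a: int, b: int) -> int:
    """Like MIPS add: signals when the true sum does not fit in 32 bits."""
    true_sum = a + b
    if not -(1 << 31) <= true_sum < (1 << 31):
        raise OverflowTrap(f"{a} + {b} does not fit in 32 bits")
    return true_sum


def addu(a: int, b: int) -> int:
    """Like MIPS addu: same bits, but overflow is ignored and the result wraps."""
    return to_signed32(a + b)


print(addu(2147483647, 1))
try:
    add(2147483647, 1)
except OverflowTrap as trap:
    print("trap:", trap)
```

Both functions produce identical results whenever the sum fits; they differ only in what happens when it does not.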
8
Addition
9
Adder
10
Exclusive-Or
• Truth table
x  y  x xor y  (x xor y)’
0  0     0         1
0  1     1         0
1  0     1         0
1  1     0         1
• Equation
x xor y = x’y + xy’
Where xy means x and y and x + y means x or y
11
Exclusive-or continued
The following equation can be represented as (a xor b) xor carryin
Proof: (a xor b) xor ci = (ab’ + a’b) ci’ + (ab’ + a’b)’ ci
= (ab’ + a’b) ci’ + (a’b’ + ab) ci .
Note it is easily shown that (a xor b)’ = a’b’ + ab
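With only three inputs, the identities on this slide can be checked exhaustively. The sketch below builds xor from the slide's equation and compares each expansion over all eight input combinations; it also checks that the result equals the low bit of a + b + ci, which is what a full adder's sum output must be. The helper names are illustrative.

```python
from itertools import product


def NOT(x: int) -> int:
    return 1 - x


def xor(x: int, y: int) -> int:
    # x xor y = x'y + xy'
    return (NOT(x) & y) | (x & NOT(y))


def check_identities() -> bool:
    for a, b, ci in product((0, 1), repeat=3):
        ab = xor(a, b)
        xnor = (NOT(a) & NOT(b)) | (a & b)
        # (a xor b)' = a'b' + ab
        if NOT(ab) != xnor:
            return False
        # (a xor b) xor ci = (ab' + a'b) ci' + (a'b' + ab) ci
        expanded = (ab & NOT(ci)) | (xnor & ci)
        if xor(ab, ci) != expanded:
            return False
        # The same value is the low bit of the arithmetic sum a + b + ci.
        if expanded != (a + b + ci) % 2:
            return False
    return True


print(check_identities())
```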
12
Realization of a full binary adder
13
Parallel (Ripple) binary adder
14
A 32-bit Ripple Carry Adder/Subtractor
• Remember 2’s complement is just
– complement all the bits
– add a 1 in the least significant bit
• control (0=add, 1=sub): the adder sees B0 if control = 0, ⌐B0 if control = 1
• Example: A - B with A = 0111 and B = 0110 is computed as 0111 + 1001 + 1
[Figure: 32 1-bit full adders (FA) in a chain. Stage i takes Ai, Bi (passed through the add/sub control), and carry ci, and produces Si and c(i+1), from c0=carry_in at stage 0 to c32=carry_out at stage 31.]
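The adder/subtractor can be simulated bit by bit. This sketch assumes the usual wiring suggested by the slide text: the control line is XORed with each B bit and is also fed in as c0, which supplies the "add a 1". The function names and the least-significant-bit-first list layout are choices made here for illustration.

```python
def full_adder(a: int, b: int, cin: int):
    """1-bit full adder: returns (sum bit, carry out)."""
    s = a ^ b ^ cin
    cout = (a & b) | (cin & (a ^ b))
    return s, cout


def to_bits(value: int, n: int):
    """n-bit list of bits, least significant bit first."""
    return [(value >> i) & 1 for i in range(n)]


def from_bits(bits) -> int:
    return sum(bit << i for i, bit in enumerate(bits))


def ripple_add_sub(a_bits, b_bits, control: int):
    """control = 0 adds, control = 1 subtracts (A + ~B + 1)."""
    carry = control                      # c0 = carry_in supplies the +1 for subtraction
    sum_bits = []
    for a, b in zip(a_bits, b_bits):
        s, carry = full_adder(a, b ^ control, carry)   # B bit is inverted when control = 1
        sum_bits.append(s)
    return sum_bits, carry               # final carry is carry_out


bits, carry_out = ripple_add_sub(to_bits(0b0111, 4), to_bits(0b0110, 4), 1)
print(format(from_bits(bits), "04b"), carry_out)
```

The same function handles 32 bits by passing 32-element lists, since each stage only depends on the carry from the stage before it.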
15
Multiplication
• More complicated than addition
– accomplished via shifting and addition
• More time and more area
• Approach is like grade school as we will demonstrate on the next slide
  0010   (multiplicand)
x 1011   (multiplier)
• What about negative numbers? We will convert and multiply
– there are better techniques, but we won’t look at them
16
Example
   0011
x  1011
-------
   0011
  0011
 0000
0011
-------
0100001
• Notice this is 3 times 11 and the product is 33.
17
Multiplication: Implementation
Start
1. Test Multiplier0
– Multiplier0 = 1: 1a. Add multiplicand to product and place the result in Product register
– Multiplier0 = 0: skip the add
2. Shift the Multiplicand register left 1 bit
3. Shift the Multiplier register right 1 bit
32nd repetition?
– No: < 32 repetitions (return to step 1)
– Yes: 32 repetitions
Done
[Figure: Datapath with a 64-bit Multiplicand register (shift left), a 32-bit Multiplier register (shift right), a 64-bit ALU, a 64-bit Product register (write), and a Control test block.]
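The numbered steps of this first implementation can be followed in a short sketch. It assumes unsigned operands and models the registers as Python integers masked to their widths; the parameter `n` (register width of the multiplier, 32 on the slide) is added here so that the 4-bit example from the previous slide can be run as well.

```python
def multiply(multiplicand: int, multiplier: int, n: int = 32) -> int:
    """Shift-and-add multiplication of two unsigned n-bit values."""
    mask = (1 << (2 * n)) - 1      # Multiplicand and Product registers are 2n bits wide
    product = 0
    for _ in range(n):             # n repetitions
        if multiplier & 1:         # 1. test Multiplier0
            product = (product + multiplicand) & mask   # 1a. add multiplicand to product
        multiplicand = (multiplicand << 1) & mask       # 2. shift Multiplicand left 1 bit
        multiplier >>= 1                                # 3. shift Multiplier right 1 bit
    return product


print(format(multiply(0b0011, 0b1011, n=4), "07b"))
```

Running the 4-bit case reproduces the partial products of the grade-school example one loop iteration at a time.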
18
Final Version
Start
• Multiplier starts in right half of product
1. Test Product0 (Product0 = 1 or Product0 = 0)
What goes here?
3. Shift the Product register right 1 bit
32nd repetition?
– No: < 32 repetitions (return to step 1)
– Yes: 32 repetitions
Done
[Figure: Datapath with a 32-bit Multiplicand register, a 32-bit ALU, a 64-bit Product register (shift right, write), and a Control test block.]
19
Floating Point (a brief look)
• We need a way to represent
– numbers with fractions, e.g., 3.1416
– very small numbers, e.g., .000000001
– very large numbers, e.g., $3.15576 \times 10^9$
• Representation:
– sign, exponent, significand: $(-1)^{\text{sign}} \times \text{significand} \times 2^{\text{exponent}}$
– more bits for significand gives more accuracy
– more bits for exponent increases range
• IEEE 754 floating point standard:
– single precision: 8 bit exponent, 23 bit significand
– double precision: 11 bit exponent, 52 bit significand
20
IEEE 754 floating-point standard
• Leading “1” bit of significand is implicit
• Exponent is “biased” to make sorting easier
– all 0s is smallest exponent, all 1s is largest
– bias of 127 for single precision and 1023 for double precision
– summary: $(-1)^{\text{sign}} \times (1 + \text{significand}) \times 2^{\text{exponent} - \text{bias}}$
• Example:
– decimal: -.75 = - ( ½ + ¼ )
– binary: -.11 = $-1.1 \times 2^{-1}$
– floating point: exponent = 126 = 01111110
– IEEE single precision: 10111111010000000000000000000000
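The example encoding can be cross-checked with Python's standard `struct` module, which packs a value as an IEEE 754 single precision word. The `decode` helper below is a sketch that applies the summary formula and handles only normalized numbers (it ignores zero, denormals, infinity, and NaN).

```python
import struct


def float_to_bits(x: float) -> str:
    """32-bit IEEE 754 single precision pattern of x as a bit string."""
    (word,) = struct.unpack(">I", struct.pack(">f", x))
    return format(word, "032b")


def decode(bits: str) -> float:
    """Apply (-1)^sign * (1 + significand) * 2^(exponent - bias), bias = 127."""
    sign = int(bits[0])
    exponent = int(bits[1:9], 2)
    significand = int(bits[9:], 2) / 2**23
    return (-1) ** sign * (1 + significand) * 2.0 ** (exponent - 127)


pattern = float_to_bits(-0.75)
print(pattern[0], pattern[1:9], pattern[9:])
print(decode(pattern))
```

The three printed fields are the sign bit, the biased exponent 01111110 (126), and the 23-bit significand field with the leading 1 hidden.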
21
Floating Point Addition
• Addition (and subtraction)
$(F1 \times 2^{E1}) + (F2 \times 2^{E2}) = F3 \times 2^{E3}$
– Step 0: Restore the hidden bit in F1 and in F2
– Step 1: Align fractions by right shifting F2 by E1 - E2 positions (assuming E1 ≥ E2) keeping track of (three of) the bits shifted out in a round bit, a guard bit, and a sticky bit. (Right shifting is equivalent to moving the binary point left.)
– Step 2: Add the resulting F2 to F1 to form F3
– Step 3: Normalize F3 (so it is in the form 1.XXXXX …)
• If F1 and F2 have the same sign ⇒ F3 ∈ [1,4) ⇒ at most a 1 bit right shift of F3, incrementing E3
• If F1 and F2 have different signs ⇒ F3 may require many left shifts, each time decrementing E3
– Step 4: Round F3 and possibly normalize F3 again
– Step 5: Rehide the most significant bit of F3 before storing the result
22
Example
ADD
  1.000 × 2^-1 =   1.000 × 2^-1
- 1.110 × 2^-2 = - 0.111 × 2^-1
                   0.001 × 2^-1
Considering arithmetic as signed, we need to add a sign bit and the subtraction becomes 2’s complement addition (2’s complement arithmetic).
  01000
+ 11001 (-0111)
  00001
Next normalize to get 1.0 × 2^-4
There is no overflow or underflow, since -126 ≤ -4 ≤ 127.
Here there is no rounding error since all bits fit into the allocated 4 bits.
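The align, add, and normalize steps of this example can be traced with a simplified sketch. Assumptions, all made here for illustration: significands are signed integers scaled by 2**frac_bits (so 1.000 is 8 with three fraction bits), bits shifted out during alignment are simply dropped (no guard, round, or sticky bits), and there is no rounding step or exponent range check.

```python
def fp_add(f1: int, e1: int, f2: int, e2: int, frac_bits: int = 3):
    """Add f1 * 2**e1 and f2 * 2**e2; returns a normalized (significand, exponent)."""
    # Step 1: align. Make operand 1 the one with the larger exponent,
    # then shift the other significand right by the exponent difference.
    if e1 < e2:
        f1, e1, f2, e2 = f2, e2, f1, e1
    sign2 = -1 if f2 < 0 else 1
    f2 = sign2 * (abs(f2) >> (e1 - e2))     # shifted-out bits are dropped in this sketch

    # Step 2: add the significands.
    f3, e3 = f1 + f2, e1
    if f3 == 0:
        return 0, 0

    # Step 3: normalize to the form 1.xxx.
    one = 1 << frac_bits
    sign3 = -1 if f3 < 0 else 1
    magnitude = abs(f3)
    while magnitude >= 2 * one:             # too large: shift right, increment exponent
        magnitude >>= 1
        e3 += 1
    while magnitude < one:                  # too small: shift left, decrement exponent
        magnitude <<= 1
        e3 -= 1
    return sign3 * magnitude, e3


# 1.000 x 2^-1 minus 1.110 x 2^-2, i.e. significands 8 and -14.
print(fp_add(0b1000, -1, -0b1110, -2))
```

In the slide's example the left-shift loop runs three times, taking 0.001 × 2^-1 to 1.000 × 2^-4.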
23
24
Floating point addition
Start
1. Compare the exponents of the two numbers. Shift the smaller number to the right until its exponent would match the larger exponent
2. Add the significands
3. Normalize the sum, either shifting right and incrementing the exponent or shifting left and decrementing the exponent
Overflow or underflow?
– Yes: Exception
– No: continue to step 4
4. Round the significand to the appropriate number of bits
Still normalized?
– No: return to step 3
– Yes: Done
[Figure: Datapath. The sign, exponent, and fraction fields of both operands feed a small ALU (exponent difference), a shift-right unit, a big ALU, an increment-or-decrement unit, a shift-left-or-right unit, and rounding hardware, all under Control, producing the sign, exponent, and fraction of the result.]
25
MIPS Floating Point Instructions
• MIPS has a separate Floating Point Register File ($f0, $f1, …, $f31) (whose registers are used in pairs for double precision values) with special instructions to load to and store from them
lwc1 $f1,54($s2)    #$f1 = Memory[$s2+54]
swc1 $f1,58($s4)    #Memory[$s4+58] = $f1
• And supports IEEE 754 single and double precision operations
add.s $f2,$f4,$f6   #$f2 = $f4 + $f6
add.d $f2,$f4,$f6   #$f2||$f3 = $f4||$f5 + $f6||$f7
similarly for sub.s, sub.d, mul.s, mul.d, div.s, div.d
26
MIPS Floating Point Instructions, Cont’d
• And floating point single precision comparison operations
c.x.s $f2,$f4    #if($f2 < $f4) cond=1; else cond=0 (shown for x = lt)
where x may be eq, ne, lt, le, gt, ge
and branch operations
bc1t 25          #if(cond==1) go to PC+4+100 (the offset 25 is in words)
bc1f 25          #if(cond==0) go to PC+4+100
• And double precision comparison operations
c.x.d $f2,$f4    #if($f2||$f3 < $f4||$f5) cond=1; else cond=0
27
Floating Point Complexities
• Operations are somewhat more complicated (see text)
• In addition to overflow we can have “underflow”
• Accuracy can be a big problem
– IEEE 754 keeps two extra bits, guard and round
– four rounding modes
– positive divided by zero yields “infinity”
– zero divided by zero yields “not a number”
– other complexities
• Implementing the standard can be tricky
• Not using the standard can be even worse
– see text for description of 80x86 and Pentium bug!
28
Floating Point Square Root Example
.text
main:
# SPIM starts execution at main.
addi $sp, $sp, -4
sw $ra, 0($sp)
la $a0, prompt
li $v0, 4
syscall
li $v0,6
syscall
mov.s $f2, $f0
la $a0, prompt1
li $v0, 4
syscall
li $v0, 6
syscall
#The number to find square root is in $f2, tol in $f0.
#Return the square root in $f12
jal sqrt
29
Floating point example continued
la $a0, prompt2
li $v0, 4
syscall
beq $v1, $0, L3
li $v0, 2
syscall
j L4
L3: la $a0, neg1
li $v0, 4
syscall
L4: lw $ra, 0($sp)
addi $sp, $sp, 4
jr $ra
30
Square Root procedure
sqrt:
# returns sqrt(R) in $f12, where R is passed in $f2 and the tolerance in $f0;
# returns $v1 = 0 if R < 0, 1 otherwise. This will determine
# the message to be output upon return.
addi $sp, $sp, -8
swc1 $f0, 4($sp)        # save tolerance ($f0)
swc1 $f2, 0($sp)        # save copy of R ($f2)
sub.s $f10, $f10, $f10  # initializes $f10 = 0
c.lt.s $f2, $f10        # sets cond flag if R < 0
bc1t print_neg
lwc1 $f1, 0($sp)        # loads R in $f1 as the initial estimate x
L1:
mul.s $f3, $f1, $f1     # $f3 = x^2, where x is the current estimate
add.s $f4, $f3, $f2     # $f4 = x^2 + R
add.s $f5, $f1, $f1     # $f5 = 2x
div.s $f12, $f4, $f5    # $f12 = (x^2 + R)/2x = new estimate of sqrt(R)
mul.s $f6, $f12, $f12   # $f6 = [(x^2 + R)/2x]^2 = estimate^2
31
Square root continued
sub.s $f7, $f6, $f2     # $f7 = error = estimate^2 - R
abs.s $f8, $f7          # abs(error)
c.lt.s $f8, $f0         # sets cond flag = 1 if abs(error) < tol
bc1t L2
mov.s $f1, $f12         # error >= tol: repeat loop with the new estimate
j L1
print_neg: sub.s $f12, $f12, $f12
add $v1, $0, $0
lwc1 $f0, 4($sp)
lwc1 $f2, 0($sp)
addi $sp, $sp, 8
jr $ra
32
Square root continued
L2: lwc1 $f0, 4($sp)
li $v1, 1
lwc1 $f2, 0($sp)
addi $sp, $sp, 8
jr $ra
#
#
#
.data
prompt: .asciiz "Enter a floating point number\n"
prompt1: .asciiz "enter a tolerance\n"
prompt2: .asciiz "The square root is "
neg1: .asciiz "NaN\n"
# end of program
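For comparison, the iteration performed by the sqrt procedure can be written in a few lines of Python. This is a sketch rather than a translation of the program: it uses Python's double precision floats instead of single precision, returns None where the assembly sets $v1 = 0, and adds a guard for R = 0 that the assembly does not have (there the first division would be 0/0).

```python
def newton_sqrt(r: float, tol: float):
    """Estimate sqrt(r) by repeating x -> (x*x + r) / (2*x) until |x*x - r| < tol."""
    if r < 0:
        return None                 # corresponds to the print_neg path ($v1 = 0)
    if r == 0:
        return 0.0                  # assumption: guard added in this sketch only
    x = r                           # initial estimate, like lwc1 $f1, 0($sp)
    while True:
        estimate = (x * x + r) / (2 * x)     # mul.s, add.s, add.s, div.s
        error = estimate * estimate - r      # mul.s, sub.s
        if abs(error) < tol:                 # abs.s, c.lt.s, bc1t L2
            return estimate
        x = estimate                         # mov.s $f1, $f12 ; j L1


print(newton_sqrt(2.0, 1e-9))
print(newton_sqrt(-4.0, 1e-9))
```

As in the assembly, a tolerance that is too small for the floating point precision in use could keep the loop from terminating, so the tolerance should be chosen with the format's accuracy in mind.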
33
Floating point temperature conversion
main:
# SPIM starts execution at main.
addi $sp, $sp, -4
sw $ra, 0($sp)
la $s0, fahr
l.s $f12, 0($s0)
jal f2c
lw $ra, 0($sp)
addi $sp, $sp, 4
jr $ra
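f2c:
lwc1 $f16,4($s0)        # $f16 = 5.0 (5.0 in memory)
lwc1 $f18,8($s0)        # $f18 = 9.0 (9.0 in memory)
div.s $f16, $f16, $f18  # $f16 = 5.0 / 9.0
lwc1 $f18,12($s0)       # $f18 = 32.0 (32.0 in memory)
sub.s $f18, $f12, $f18  # $f18 = fahr - 32.0
mul.s $f0, $f16, $f18   # $f0 = (5.0 / 9.0) * (fahr - 32.0)
jr $ra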
34
Temp conversion continued
#
#
.data
#fahr: .float 5.6e1, 5.0e0, 9.0e0, 3.2e1
fahr: .float 56., 5., 9., 32., 0., 1.
# end of f2c
35
Registers and instructions for FP calculations
• The floating point registers are $f0, $f1, $f2, …, $f31
• There are actually 32 registers for single precision floating point operations.
• For double precision they are used in pairs, which means each double precision value occupies 64 bits and there are 16 pairs, where $f0 and $f1 comprise the first, $f2 and $f3 the second, etc.
• Memory is organized the same. You can store $2^{30}$ words. However, for double precision you can store only half as many double words since each requires 8 bytes.
36
Floating Point Instructions
• FP add single
add.s $f2, $f4, $f6
• For single precision you can use all registers, not just the
even ones.
• FP subtract single
sub.s $f2, $f4, $f6
• FP multiply single
mul.s $f2, $f4, $f6
• FP divide single
div.s $f2, $f4, $f6
• FP add double
add.d $f2, $f4, $f6
• FP subtract double
sub.d $f2, $f4, $f6
• FP multiply double
mul.d $f2, $f4, $f6
• FP divide double
div.d $f2, $f4, $f6
• Load word copr. 1
lwc1 $f1, 100($s2)
• Store word copr. 1
swc1 $f1, 100($s2)
37
Instructions for Conditional Branching
• Branch on FP true
bc1t 25
If(cond == 1) go to PC + 4 + 100
• Branch on FP false
bc1f 25
If(cond == 0) go to PC + 4 + 100
• FP Compare single
c.lt.s $f2, $f4
If($f2 < $f4) cond = 1; else cond = 0
• (eq, ne, lt, le, gt, ge)
• FP Compare double
c.lt.d $f2, $f4
If($f2 < $f4) cond = 1; else cond = 0
• (eq, ne, lt, le, gt, ge)
38
Chapter Three Summary
• Computer arithmetic is constrained by limited precision
• Bit patterns have no inherent meaning but standards do exist
– two’s complement
– IEEE 754 floating point
• Computer instructions determine “meaning” of the bit patterns
• Performance and accuracy are important so there are many complexities in real machines
• Algorithm choice is important and may lead to hardware optimizations for both space and time (e.g., multiplication)
• You may want to look back (Section 3.10 is great reading!)
39
Reading assignment
• Read PH 3: B.1-B.6
40
