Chapter 2 - Iowa State University
CprE 381 Computer Organization and Assembly Level Programming, Fall 2013
Chapter 2
Instructions: Language
of the Computer
Zhao Zhang
Iowa State University
Revised from original slides provided
by MKP
Procedure/Function Calling
§2.8 Supporting Procedures in Computer Hardware
Steps required
1. Place parameters in registers
2. Transfer control to procedure
3. Acquire storage for procedure
4. Perform procedure’s operations
5. Place result in register for caller
6. Return to place of call
Chapter 2 — Instructions: Language of the Computer — 2
Register Usage Review
$a0 – $a3: arguments (reg’s 4 – 7)
$v0, $v1: result values (reg’s 2 and 3)
$t0 – $t9: temporaries
Can be overwritten by callee
$s0 – $s7: saved
Must be saved/restored by callee
$gp: global pointer for static data (reg 28)
$sp: stack pointer (reg 29)
$fp: frame pointer (reg 30)
$ra: return address (reg 31)
Note: There are additional rules for floating point
registers
Procedure Call Instructions
Procedure call: jump and link
jal ProcedureLabel
Address of following instruction put in $ra
Jumps to target address
Procedure return: jump register
jr $ra
Copies $ra to program counter
Can also be used for computed jumps
e.g., for case/switch statements
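One way to picture a computed jump: a dense switch can be compiled into a table of code addresses, with the selected entry loaded into a register and used as the jump target. The following is a minimal C sketch of that idea, using an array of function pointers as the table. The handler names and the dispatch function are hypothetical and exist only for illustration.

```c
#include <stddef.h>

/* Hypothetical case bodies; in MIPS each would be a label in the text segment. */
static int case_zero(int x) { return x + 1; }
static int case_one(int x)  { return x * 2; }
static int case_two(int x)  { return x - 3; }

/* Jump table: one code address per case value 0..2. */
static int (*const jump_table[])(int) = { case_zero, case_one, case_two };

/* Look up the target address by selector and transfer control to it.
   Out-of-range selectors take a default path that returns x unchanged. */
int dispatch(unsigned selector, int x)
{
    size_t n = sizeof jump_table / sizeof jump_table[0];
    if (selector >= n)
        return x;                       /* default case */
    return jump_table[selector](x);     /* indirect transfer via the table */
}
```

A call through a function pointer also returns to the caller, so it is closer to a jump-and-link through a register than to a bare jr; the table lookup is the part being illustrated.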
Leaf Procedure Example
C code:
int leaf_example (int g, int h, int i, int j)
{
int f;
f = (g + h) - (i + j);
return f;
}
Arguments g, …, j in $a0, …, $a3
f in $s0 (hence, need to save $s0 on stack)
Result in $v0
Leaf Procedure Example
MIPS code:
leaf_example:
    addi $sp, $sp, -4       # save $s0 on stack
    sw   $s0, 0($sp)
    add  $t0, $a0, $a1      # procedure body
    add  $t1, $a2, $a3
    sub  $s0, $t0, $t1
    add  $v0, $s0, $zero    # result
    lw   $s0, 0($sp)        # restore $s0
    addi $sp, $sp, 4
    jr   $ra                # return
Exercise
Write MIPS code for
int add2(int x, int y)
{
return x + y;
}
Chapter 1 — Computer Abstractions and Technology — 7
Exercise
First version, with stack frame
# x in $a0, y in $a1, return in $v0
add2:
    addi $sp, $sp, -4       # alloc frame
    sw   $s0, 0($sp)        # save $s0
    add  $s0, $a0, $a1      # tmp = x + y
    add  $v0, $s0, $zero    # $v0 = tmp
    lw   $s0, 0($sp)        # restore $s0
    addi $sp, $sp, 4        # release frame
    jr   $ra
Exercise
Optimized version, w/o stack frame
# x in $a0, y in $a1, return in $v0
add2:
    add  $v0, $a0, $a1      # $v0 = x + y
    jr   $ra
In this case, we have nothing to store in the stack frame
Exercise
Write MIPS code for
int max(int x, int y)
{
if (x > y)
return x;
else
return y;
}
Exercise
# x in $a0, y in $a1, return in $v0
max:
    slt  $t0, $a1, $a0      # $t0 = (y < x)
    beq  $t0, $zero, else   # no, do else
    add  $v0, $a0, $zero    # to return x
    jr   $ra                # return
else:
    add  $v0, $a1, $zero    # to return y
    jr   $ra                # return
Non-Leaf Procedures
Procedures that call other procedures
For nested call, caller needs to save on the
stack:
Its return address
Any arguments and temporaries needed after
the call
Restore from the stack after the call
Stack Frame Contents
A complete stack frame may hold
Extra arguments exceeding $a0-$a3
Saved registers ($s0-$s7) that will be overwritten
Return address ($ra)
Local, automatic variables
A non-leaf function must have a stack frame,
because $ra has to be saved
Local Data on the Stack
Local data allocated by callee
e.g., C automatic variables
Procedure frame (activation record)
Used by some compilers to manage stack storage
Our examples do not use $fp
Non-Leaf Procedure Example
Write MIPS code for
int max3(int x, int y, int z)
{
return max(max(x, y), z);
}
We have to use a procedure frame on the stack (stack frame)
Non-Leaf Procedure Example
# x in $a0, y in $a1, z in $a2, ret in $v0
max3:
    addi $sp, $sp, -8       # alloc stack frame
    sw   $ra, 4($sp)        # preserve $ra
    sw   $a2, 0($sp)        # preserve z
    jal  max                # call max(x, y)
    add  $a0, $v0, $zero    # $a0 = max(x, y)
    lw   $a1, 0($sp)        # $a1 = z
    jal  max                # 2nd call max(…)
    lw   $ra, 4($sp)        # restore $ra
    addi $sp, $sp, 8        # free stack frame
    jr   $ra                # return
Non-Leaf Procedure Example
Write MIPS code for
int add3(int x, int y, int z)
{
return add2(add2(x, y), z);
}
Non-Leaf Procedure Example
C code:
int fact (int n)
{
if (n < 1)
return 1;
else
return n * fact(n - 1);
}
Argument n in $a0
Result in $v0
Non-Leaf Procedure Example
MIPS code:
fact:
    addi $sp, $sp, -8       # adjust stack for 2 items
    sw   $ra, 4($sp)        # save return address
    sw   $a0, 0($sp)        # save argument
    slti $t0, $a0, 1        # test for n < 1
    beq  $t0, $zero, L1
    addi $v0, $zero, 1      # if so, result is 1
    addi $sp, $sp, 8        #   pop 2 items from stack
    jr   $ra                #   and return
L1: addi $a0, $a0, -1       # else decrement n
    jal  fact               # recursive call
    lw   $a0, 0($sp)        # restore original n
    lw   $ra, 4($sp)        #   and return address
    addi $sp, $sp, 8        # pop 2 items from stack
    mul  $v0, $a0, $v0      # multiply to get result
    jr   $ra                # and return
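To see what the stack frames are doing for fact, it can help to replace the recursion with an explicit array that plays the role of the stack. The sketch below is an illustration only, not the course's solution: the function name and MAX_DEPTH are assumptions, and the saved return address is not modeled, only the saved argument n.

```c
/* Assumed bound on nesting depth, chosen for illustration. With a
   32-bit int the product overflows beyond n = 12 anyway. */
#define MAX_DEPTH 32

int fact_explicit_stack(int n)
{
    int saved_n[MAX_DEPTH];     /* one saved argument per active call */
    int depth = 0;

    /* "Call" phase: while n >= 1, save n and continue with n - 1,
       like sw $a0, 0($sp) followed by addi $a0, $a0, -1 and jal fact. */
    while (n >= 1 && depth < MAX_DEPTH) {
        saved_n[depth++] = n;
        n = n - 1;
    }

    int result = 1;             /* base case: n < 1 returns 1 */

    /* "Return" phase: restore each saved n and multiply,
       like lw $a0, 0($sp) followed by mul $v0, $a0, $v0. */
    while (depth > 0) {
        result = saved_n[--depth] * result;
    }
    return result;
}
```

Each array slot corresponds to one frame's saved $a0; the frames are released in the reverse order of their creation, which is why a stack fits.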
Memory Layout
Text: program code
Static data: global variables
  e.g., static variables in C, constant arrays and strings
  $gp initialized to address allowing ±offsets into this segment
Dynamic data: heap
  E.g., malloc in C, new in Java
Stack: automatic storage
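A small C sketch can make the segments concrete. The names below (call_count, greeting, make_greeting) are hypothetical; the comments mark which segment each object would typically live in under the layout above.

```c
#include <stdlib.h>

int call_count = 0;                     /* static data: global variable */
static const char greeting[] = "hi";    /* static data: constant string */

/* Returns a heap-allocated copy of the greeting, or NULL on failure.
   The caller owns the result and must free() it.
   The machine code of this function itself sits in the text segment. */
char *make_greeting(void)
{
    static int made = 0;                /* static data, despite local scope */
    char local[sizeof greeting];        /* stack: automatic storage */
    char *copy;

    for (size_t i = 0; i < sizeof greeting; i++)
        local[i] = greeting[i];

    copy = malloc(sizeof local);        /* dynamic data: heap */
    if (copy != NULL)
        for (size_t i = 0; i < sizeof local; i++)
            copy[i] = local[i];

    made++;
    call_count = made;
    return copy;
}
```

The array local disappears when the function returns, while the malloc'ed copy stays valid until it is freed.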
Character Data
§2.9 Communicating with People
Byte-encoded character sets
  ASCII: 128 characters
    95 graphic, 33 control
  Latin-1: 256 characters
    ASCII, +96 more graphic characters
Unicode: 32-bit character set
  Used in Java, C++ wide characters, …
  Most of the world’s alphabets, plus symbols
  UTF-8, UTF-16: variable-length encodings
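"Variable-length" means the number of bytes depends on the code point. As a rough illustration for UTF-8, the sketch below returns how many bytes one code point needs. The function name utf8_length is hypothetical; the ranges are the standard UTF-8 boundaries.

```c
#include <stdint.h>

/* Number of bytes UTF-8 uses for one Unicode code point,
   or 0 if the value cannot be encoded. */
int utf8_length(uint32_t code_point)
{
    if (code_point <= 0x7F)
        return 1;                       /* ASCII range: one byte */
    if (code_point <= 0x7FF)
        return 2;
    if (code_point >= 0xD800 && code_point <= 0xDFFF)
        return 0;                       /* surrogates are not encodable */
    if (code_point <= 0xFFFF)
        return 3;
    if (code_point <= 0x10FFFF)
        return 4;
    return 0;                           /* beyond the Unicode range */
}
```

ASCII text therefore costs one byte per character in UTF-8, the same as the byte-encoded sets above.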
Byte/Halfword Operations
Could use bitwise operations
MIPS byte/halfword load/store
  String processing is a common case
lb rt, offset(rs)     lh rt, offset(rs)
  Sign extend to 32 bits in rt
lbu rt, offset(rs)    lhu rt, offset(rs)
  Zero extend to 32 bits in rt
sb rt, offset(rs)     sh rt, offset(rs)
  Store just rightmost byte/halfword
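The difference between lb and lbu only shows up when the top bit of the byte is set. A minimal C model of the three byte operations, treating registers as 32-bit patterns and memory as a byte array; the function names simply mirror the mnemonics and are not a real API.

```c
#include <stdint.h>

/* lb: load one byte and sign-extend it to 32 bits. */
uint32_t lb(const uint8_t *mem, uint32_t addr)
{
    uint32_t b = mem[addr];
    return (b & 0x80u) ? (b | 0xFFFFFF00u) : b;
}

/* lbu: load one byte and zero-extend it to 32 bits. */
uint32_t lbu(const uint8_t *mem, uint32_t addr)
{
    return mem[addr];
}

/* sb: store just the rightmost byte of the register value. */
void sb(uint8_t *mem, uint32_t addr, uint32_t rt)
{
    mem[addr] = (uint8_t)(rt & 0xFFu);
}
```

For the byte 0x80, lb yields 0xFFFFFF80 while lbu yields 0x00000080. Character data is usually loaded with lbu, as in the strcpy example that follows.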
String Copy Example
C code (array-based version)
Null-terminated string
void strcpy (char x[], char y[])
{
int i = 0;
while ((x[i] = y[i]) != '\0')
i++;
}
Addresses of x, y in $a0, $a1
i in $s0
String Copy Example
MIPS code:
strcpy:
    addi $sp, $sp, -4       # adjust stack for 1 item
    sw   $s0, 0($sp)        # save $s0
    add  $s0, $zero, $zero  # i = 0
L1: add  $t1, $s0, $a1      # addr of y[i] in $t1
    lbu  $t2, 0($t1)        # $t2 = y[i]
    add  $t3, $s0, $a0      # addr of x[i] in $t3
    sb   $t2, 0($t3)        # x[i] = y[i]
    beq  $t2, $zero, L2     # exit loop if y[i] == 0
    addi $s0, $s0, 1        # i = i + 1
    j    L1                 # next iteration of loop
L2: lw   $s0, 0($sp)        # restore saved $s0
    addi $sp, $sp, 4        # pop 1 item from stack
    jr   $ra                # and return
String Copy Example
C code, pointer-based version
void strcpy (char *x, char *y)
{
while ((*x++ = *y++) != '\0')
{
}
}
A good optimizing compiler may generate the
same, efficient code for both versions (see next)
Strcpy: Optimized Version
strcpy:
# reg: x in $a0, y in $a1, *y in $t0
Loop:
    lbu  $t0, 0($a1)        # load *y
    sb   $t0, 0($a0)        # store to *x
    addi $a0, $a0, 1        # x++
    addi $a1, $a1, 1        # y++
    bne  $t0, $zero, Loop   # *y != 0?
    jr   $ra                # return
5 vs. 7 instructions in the loop
6 vs. 13 instructions in the function
Arrays vs. Pointers
§2.14 Arrays versus Pointers
Array indexing involves
  Multiplying index by element size
  Adding to array base address
Pointers correspond directly to memory addresses
  Can avoid indexing complexity
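The index calculation can be written out by hand in C. This sketch computes the address of array[i] as base plus index times element size, using byte-granular pointer arithmetic; element_address is a hypothetical helper name.

```c
#include <stddef.h>

/* Address of base[i], computed explicitly as
   base address + index * element size (in bytes). */
int *element_address(int *base, size_t i)
{
    char *byte_base = (char *)base;             /* byte-addressed view */
    return (int *)(byte_base + i * sizeof(int));
}
```

The result equals &base[i]; the compiler performs the same multiply (or shift) and add whenever array[i] is written.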
Another Example
Clear an array, Array access
void clear1(int array[], int size)
{
int i;
for (i = 0; i < size; i++) {
array[i] = 0;
}
}
Array Access MIPS Code
# array in $a0, size in $a1
clear1:
    move $t0, $zero         # i = 0
loop1:
    sll  $t1, $t0, 2        # $t1 = i * 4
    add  $t2, $a0, $t1      # $t2 = &array[i]
    sw   $zero, 0($t2)      # array[i] = 0
    addi $t0, $t0, 1        # i = i + 1
    slt  $t3, $t0, $a1      # $t3 = (i < size)
    bne  $t3, $zero, loop1  # if true, repeat
    jr   $ra                # return
Pointer Access
Clear an array, pointer access
void clear2(int *array, int size)
{
    int *p;
    for (p = array; p < array + size; p++) {
        *p = 0;
    }
}
Pointer Access MIPS Code
clear2:
    move $t0, $a0           # p = array
    sll  $t1, $a1, 2        # $t1 = size * 4
    add  $t2, $a0, $t1      # $t2 = &array[size]
    j    loop2_cond
loop2:
    sw   $zero, 0($t0)      # *p = 0
    addi $t0, $t0, 4        # p++
loop2_cond:
    slt  $t3, $t0, $t2      # p < &array[size]?
    bne  $t3, $zero, loop2
    jr   $ra
Comparison of Array vs. Ptr
Multiply “strength reduced” to shift
Array version requires shift to be inside loop
  Part of index calculation for incremented i
  cf. incrementing pointer
Compiler can achieve same effect as manual use of pointers
  Induction variable elimination
  Better to make program clearer and safer
For-Loop Example
Calculate the sum of array
int array_sum(int X[], int size)
{
int sum = 0;
for (int i = 0; i < size; i++)
sum += X[i];
return sum;
}
FOR Loop
Control and Data Flow Graph (figure): Init-expr, then Cond; if true (T), For-body, then Incr-expr, then back to Cond; if false (F), exit
Linear Code Layout
  Init-expr
  Jump (to Test cond)
  For-body
  Incr-expr
  Test cond
  Branch if true (to For-body)
(Optional: prologue and epilogue)
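The linear layout can be tried out in C before writing any assembly, with goto standing in for the jump and the conditional branch. This is a sketch of the array_sum function in that shape; the name array_sum_linear and the label names are made up for illustration.

```c
/* array_sum written in the linear code layout:
   init, jump to test, body, increment, test, branch if true. */
int array_sum_linear(const int X[], int size)
{
    int sum = 0;
    int i;

    i = 0;                  /* init-expr */
    goto test_cond;         /* jump to the test */
for_body:
    sum += X[i];            /* for-body */
    i++;                    /* incr-expr */
test_cond:
    if (i < size)           /* test cond */
        goto for_body;      /* branch if true */
    return sum;
}
```

Placing the test at the bottom means each iteration executes one conditional branch and no unconditional jump; the single jump at the top is paid once.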
For-Loop MIPS Code
# X in $a0, size in $a1, return in $v0
array_sum:
    add  $v0, $zero, $zero     # sum = 0
    add  $t0, $zero, $zero     # i = 0
    j    for_cond
for_loop:
    sll  $t1, $t0, 2           # $t1 = i*4
    add  $t1, $a0, $t1         # $t1 = &X[i]
    lw   $t1, 0($t1)           # $t1 = X[i]
    add  $v0, $v0, $t1         # sum += X[i]
    addi $t0, $t0, 1           # i++
for_cond:
    slt  $t1, $t0, $a1         # i < size?
    bne  $t1, $zero, for_loop  # if true, repeat
    jr   $ra
For-Loop: Pointer Version
Calculate the sum of array
int array_sum(int X[], int size)
{
int *p, sum = 0;
for (p = X; p < &X[size]; p++)
sum += *p;
return sum;
}
Again, do not write the pointer version for performance; a good compiler will take care of it.
Optimized MIPS Code
# X in $a0, size in $a1, return in $v0
array_sum:
    add  $v0, $zero, $zero     # sum = 0
    add  $t0, $a0, $zero       # p = X
    sll  $a1, $a1, 2           # $a1 = 4*size
    add  $a1, $a0, $a1         # $a1 = &X[size]
    j    for_cond
for_loop:
    lw   $t1, 0($t0)           # $t1 = *p
    add  $v0, $v0, $t1         # sum += *p
    addi $t0, $t0, 4           # p++
for_cond:
    slt  $t1, $t0, $a1         # p < &X[size]?
    bne  $t1, $zero, for_loop  # if true, repeat
    jr   $ra
32-bit Constants
§2.10 MIPS Addressing for 32-Bit Immediates and Addresses
Most constants are small
  16-bit immediate is sufficient
For the occasional 32-bit constant
  lui rt, constant
  Copies 16-bit constant to left 16 bits of rt
  Clears right 16 bits of rt to 0
lui $s0, 61          0000 0000 0011 1101 0000 0000 0000 0000
ori $s0, $s0, 2304   0000 0000 0011 1101 0000 1001 0000 0000
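The effect of the lui and ori pair can be checked with two one-line C models. The function names mirror the mnemonics and are only a sketch of the register-level behavior, not an assembler API.

```c
#include <stdint.h>

/* lui: place the 16-bit constant in the upper half, clear the lower half. */
uint32_t lui(uint16_t constant)
{
    return (uint32_t)constant << 16;
}

/* ori: OR a zero-extended 16-bit constant into a register value. */
uint32_t ori(uint32_t rs, uint16_t constant)
{
    return rs | constant;
}
```

With the values from the slide, ori(lui(61), 2304) gives 61 × 65536 + 2304 = 4,000,000.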
32-bit Constants
Translate C to MIPS
f = 0x10203040;
# Assume f in $s0
lui $s0, 0x1020
ori $s0, $s0, 0x3040
32-bit Constants
Load a big value in MIPS
int *p = array;
# assume p in $s0
la $s0, array
MIPS assembly supports the pseudo-instruction “la”, equivalent to
lui $s0, upper_of_array
ori $s0, $s0, lower_of_array
The assembler decides the values of upper_of_array and lower_of_array
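Assuming the lui/ori expansion shown on this slide, the two values the assembler has to produce are simply the upper and lower 16 bits of the address. A sketch of that split follows; the helper names upper_half and lower_half are hypothetical.

```c
#include <stdint.h>

/* Upper 16 bits of a 32-bit value: the operand for lui. */
uint16_t upper_half(uint32_t value)
{
    return (uint16_t)(value >> 16);
}

/* Lower 16 bits of a 32-bit value: the operand for ori. */
uint16_t lower_half(uint32_t value)
{
    return (uint16_t)(value & 0xFFFFu);
}
```

Because ori zero-extends its immediate, recombining the halves as (upper << 16) | lower reproduces the original value exactly.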
Shift Instructions
sll rd, rt, shamt           # shift left logical
Ex: sll $s0, $s0, 4         # sll by 4 bits

0       --      rt      rd      shamt   0
6 bits  5 bits  5 bits  5 bits  5 bits  6 bits

sllv rd, rt, rs             # SLL variable
Ex: sllv $s0, $s0, $t0      # sll by $t0 bits

0       rs      rt      rd      0       4
6 bits  5 bits  5 bits  5 bits  5 bits  6 bits

Source: textbook B-55, B-56
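The field layouts above can be turned into machine words by shifting each field into place. A minimal sketch, where encode_r is a hypothetical helper and the unused field (shown as -- above) is encoded as 0:

```c
#include <stdint.h>

/* Pack the six R-format fields (6, 5, 5, 5, 5, 6 bits) into one word. */
uint32_t encode_r(uint32_t op, uint32_t rs, uint32_t rt,
                  uint32_t rd, uint32_t shamt, uint32_t funct)
{
    return ((op    & 0x3Fu) << 26) |
           ((rs    & 0x1Fu) << 21) |
           ((rt    & 0x1Fu) << 16) |
           ((rd    & 0x1Fu) << 11) |
           ((shamt & 0x1Fu) <<  6) |
            (funct & 0x3Fu);
}
```

With $s0 as register 16 and $t0 as register 8, sll $s0, $s0, 4 is encode_r(0, 0, 16, 16, 4, 0) and sllv $s0, $s0, $t0 is encode_r(0, 8, 16, 16, 0, 4).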
Shift Instructions
Other shift instructions
srl  rd, rt, shamt      # shift right logical
srlv rd, rt, rs         # SRL variable
sra  rd, rt, shamt      # shift right arithmetic
srav rd, rt, rs         # SRA variable
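Logical and arithmetic right shifts differ only in what enters from the left. The sketch below models both on 32-bit patterns; the function names mirror the mnemonics, and the shift amount is masked to 5 bits, as the variable forms use only the low 5 bits of rs.

```c
#include <stdint.h>

/* srl: shift right logical, zeros enter from the left. */
uint32_t srl(uint32_t rt, unsigned shamt)
{
    return rt >> (shamt & 31u);
}

/* sra: shift right arithmetic, copies of the sign bit enter from the left.
   Built from unsigned operations so the result does not depend on how a
   particular compiler shifts negative signed values. */
uint32_t sra(uint32_t rt, unsigned shamt)
{
    shamt &= 31u;
    uint32_t shifted = rt >> shamt;
    if ((rt & 0x80000000u) != 0 && shamt > 0)
        shifted |= (uint32_t)~(UINT32_MAX >> shamt);   /* fill with sign bits */
    return shifted;
}
```

For a value read as a signed integer, sra by k divides by 2^k rounding toward negative infinity, while srl treats the same bits as unsigned.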
Branch Addressing
Branch instructions specify
Opcode, two registers, target address
Most branch targets are near branch
Forward or backward
op      rs      rt      constant or address
6 bits  5 bits  5 bits  16 bits
PC-relative addressing
Target address = PC + offset × 4
PC already incremented by 4 by this time
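The target calculation is short enough to write as one C function. In this sketch pc is taken to be the address of the branch instruction itself, so the "already incremented" PC is pc + 4; branch_target is a hypothetical name.

```c
#include <stdint.h>

/* Target of a taken branch: (PC + 4) + sign-extended offset * 4. */
uint32_t branch_target(uint32_t pc, int16_t offset)
{
    int32_t byte_offset = (int32_t)offset * 4;  /* word offset to bytes */
    return pc + 4u + (uint32_t)byte_offset;     /* wraps modulo 2^32 */
}
```

For example, a branch at address 80012 with offset 2 targets 80016 + 8 = 80024, and negative offsets reach backward.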
Jump Addressing
Jump (j and jal) targets could be
anywhere in text segment
Encode full address in instruction
op      address
6 bits  26 bits
(Pseudo)Direct jump addressing
Target address = PC[31…28] : (address × 4)
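The concatenation in the target formula can be sketched in C as a mask and a shift. As with branches, the PC used is the already-incremented one; jump_target is a hypothetical name and pc is the address of the jump instruction.

```c
#include <stdint.h>

/* Pseudodirect jump target: upper 4 bits of (PC + 4) concatenated
   with the 26-bit address field shifted left by 2. */
uint32_t jump_target(uint32_t pc, uint32_t address26)
{
    uint32_t upper = (pc + 4u) & 0xF0000000u;           /* PC[31..28] */
    uint32_t lower = (address26 & 0x03FFFFFFu) << 2;    /* address * 4 */
    return upper | lower;
}
```

A jump at address 80020 with address field 20000 therefore goes to 80000. The upper 4 bits come from the PC, so a j instruction can only reach targets within the same 256 MB region.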
Target Addressing Example
Loop code from earlier example
Assume Loop at location 80000
Loop: sll  $t1, $s3, 2        80000    0    0   19    9    2    0
      add  $t1, $t1, $s6      80004    0    9   22    9    0   32
      lw   $t0, 0($t1)        80008   35    9    8    0
      bne  $t0, $s5, Exit     80012    5    8   21    2
      addi $s3, $s3, 1        80016    8   19   19    1
      j    Loop               80020    2   20000
Exit: …                       80024
Branching Far Away
If branch target is too far to encode with
16-bit offset, assembler rewrites the code
Example
beq $s0,$s1, L1
↓
bne $s0,$s1, L2
j L1
L2: …
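Whether the assembler needs this rewrite comes down to whether the word offset fits in a signed 16-bit field. A rough sketch of that check follows; the function names are hypothetical, and addresses are assumed to be word-aligned.

```c
#include <stdbool.h>
#include <stdint.h>

/* Word offset a branch at branch_pc would need to reach target,
   measured from the already-incremented PC (branch_pc + 4). */
int64_t branch_word_offset(uint32_t branch_pc, uint32_t target)
{
    return ((int64_t)target - ((int64_t)branch_pc + 4)) / 4;
}

/* True if that offset fits the signed 16-bit field of a branch;
   otherwise the inverted-branch-plus-jump rewrite is needed. */
bool branch_reaches(uint32_t branch_pc, uint32_t target)
{
    int64_t offset = branch_word_offset(branch_pc, target);
    return offset >= INT16_MIN && offset <= INT16_MAX;
}
```

The reachable window is roughly ±128 KB around the branch, which is why most branches never need the rewrite.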
Addressing Mode Summary