CDA 3101 Spring 2001 Introduction to Computer Organization

CDA 3101
Fall 2013
Introduction to Computer Organization
Pointers & Arrays
MIPS Programs
16 September 2013
Overview
• Pointers (addresses) and values
• Argument passing
• Storage lifetime and scope
• Pointer arithmetic
• Pointers and arrays
• Pointers in MIPS
Review: Pointers
• Pointer: a variable that contains the address of
another variable
– HLL version of machine language memory address
• Why use Pointers?
– Sometimes only way to express computation
– Often more compact and efficient code
• Why not?
– Huge source of bugs in real software, perhaps the
largest single source
1) Dangling reference (premature free)
2) Memory leaks (tardy free): can't have long-running
jobs without periodic restart of them
Review: C Pointer Operators
• Suppose c has value 100 and is located in memory at
address 0x10000000
• Unary operator & gives address:
p = &c; gives address of c to p;
– p “points to” c
(p == 0x10000000) (Referencing)
• Unary operator * gives value that pointer points to
– if p = &c then *p == 100 (Dereferencing a pointer)
• Dereferencing => data transfer in assembler
– ... = ... *p ...; => load
(get value from location pointed to by p)
– *p = ...; => store
(put value into location pointed to by p)
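A minimal C sketch of the two directions of dereferencing described above. The helper name `swap_through_pointer` is hypothetical (not from the slides), and the actual address of c is chosen by the platform rather than being 0x10000000.

```c
/* Hypothetical helper, not from the slides: reads the old value through p
   (a load in assembly), then writes new_value through p (a store). */
int swap_through_pointer(int *p, int new_value)
{
    int old = *p;      /* *p on the right-hand side: load  */
    *p = new_value;    /* *p on the left-hand side:  store */
    return old;
}
```

Called with `p = &c` while c holds 100, the function returns 100 and leaves the new value in c.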
Review: Pointer Arithmetic
int x = 1, y = 2;    /* x and y are integer variables */
int z[10];           /* an array of 10 ints, z points to start */
int *p;              /* p is a pointer to an int */
x = 21;              /* assigns x the new value 21 */
z[0] = 2; z[1] = 3;  /* assigns 2 to the first, 3 to the next */
p = &z[0];           /* p refers to the first element of z */
p = z;               /* same thing; p[i] == z[i] */
p = p+1;             /* now it points to the next element, z[1] */
p++;                 /* now it points to the one after that, z[2] */
*p = 4;              /* assigns 4 to there, z[2] == 4 */
p = 3;               /* bad idea! Absolute address!!! */
p = &x;              /* p points to x, *p == 21 */
z = &y;              /* illegal!!!!! array name is not a variable */
[Figure: memory layout showing p, the array z (z[0] = 2, z[1] = 3, next element = 4), y = 2, and x = 21]
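The pointer arithmetic above is scaled by the element size: adding 1 to an int pointer moves it by sizeof(int) bytes, not by one byte. A small sketch, assuming two hypothetical helpers `byte_step` and `fill_like_slide` that are not part of the slides:

```c
#include <stddef.h>

/* Hypothetical helper: byte distance between p and p + n for an int pointer. */
size_t byte_step(const int *p, size_t n)
{
    return (size_t)((const char *)(p + n) - (const char *)p);
}

/* Hypothetical helper: repeats the slide's legal statements on a
   caller-supplied array with at least 3 elements. */
void fill_like_slide(int *z)
{
    int *p = z;   /* same as p = &z[0]       */
    z[0] = 2;
    z[1] = 3;
    p = p + 1;    /* now points to z[1]      */
    p++;          /* now points to z[2]      */
    *p = 4;       /* assigns 4 to z[2]       */
}
```

On a machine with 4-byte ints, such as MIPS32, `byte_step(z, 1)` is 4, which is why the assembly versions later add 4 to advance one element.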
Review: Assembly Code
c is int, has value 100, in memory at address
0x10000000, p in $a0, x in $s0
1. p = &c;      /* p gets 0x10000000 */
   lui  $a0,0x1000     # p = 0x10000000
2. x = *p;      /* x gets 100 */
   lw   $s0, 0($a0)    # dereferencing p
3. *p = 200;    /* c gets 200 */
   addi $t0,$0,200
   sw   $t0, 0($a0)    # dereferencing p
Argument Passing Options
• 2 choices
– “Call by Value”: pass a copy of the item to the
function/procedure
– “Call by Reference”: pass a pointer to the item
to the function/procedure
• Single word variables passed by value
• Passing an array? e.g., a[100]
– Pascal (call by value) copies 100 words of a[]
onto the stack
– C (call by reference) passes a pointer
(1 word) to the array a[] in a register
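A short sketch of the difference in C. The three function names are hypothetical, chosen only for illustration. Strictly speaking C copies every argument, but for an array the thing copied is the pointer to its first element, so the callee can change the caller's array.

```c
/* Hypothetical functions, not from the slides. */

/* v is a copy: the assignment changes only the local copy. */
int set_by_value(int v)
{
    v = 0;
    return v;
}

/* v is a pointer to the caller's variable: the store changes it. */
void set_through_pointer(int *v)
{
    *v = 0;
}

/* a is declared as an array but is really an int * (1 word). */
void clear_first(int a[])
{
    a[0] = 0;
}
```

After `set_by_value(x)` the caller's x is unchanged; after `set_through_pointer(&x)` or `clear_first(a)` the caller's data is modified.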
Lifetime of Storage and Scope
• Automatic (stack allocated)
– Typical local variables of a function
– Created upon call, released upon return
– Scope is the function
• Heap allocated
– Created upon malloc, released upon free
– Referenced via pointers
• External / static
– Exist for entire program
[Figure: memory layout with regions Code, Static, Heap, Stack]
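The three lifetimes can be seen side by side in a few lines of C. This is a sketch with hypothetical function names (`calls_so_far`, `make_counter`), not code from the course.

```c
#include <stdlib.h>

/* Hypothetical example: counts how many times it has been called. */
int calls_so_far(void)
{
    static int calls = 0;  /* static: exists for the entire program      */
    int local = 1;         /* automatic: created on call, gone on return */
    calls += local;
    return calls;
}

/* Hypothetical example: returns heap storage that outlives the call. */
int *make_counter(int start)
{
    int *p = malloc(sizeof *p);  /* heap: lives until free() is called */
    if (p != NULL)
        *p = start;
    return p;                    /* caller must free() the result */
}
```

The static variable keeps its value between calls, while the heap object stays valid after `make_counter` returns and must be released by the caller.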
Arrays, Pointers, and Functions
• 4 versions of array function that adds two
arrays and puts sum in a third array (sumarray)
1. Third array is passed to function
2. Using a local array (on stack) for result and passing
a pointer to it
3. Third array is allocated on heap
4. Third array is declared static
• Purpose of example is to show interaction of C
statements, pointers, and memory allocation
Version 1
int x[100], y[100], z[100];
sumarray(x, y, z);
• C calling convention means:
sumarray(&x[0], &y[0], &z[0]);
• Really passing pointers to arrays
addi $a0,$gp,0    # x[0] starts at $gp
addi $a1,$gp,400  # y[0] above x[100]
addi $a2,$gp,800  # z[0] above y[100]
jal  sumarray
Version 1: Compiled Code
void sumarray(int a[], int b[], int c[]) {
int i;
for(i = 0; i < 100; i = i + 1)
c[i] = a[i] + b[i];
}
      addi $t0,$a0,400   # beyond end of a[]
Loop: beq  $a0,$t0,Exit
      lw   $t1, 0($a0)   # $t1=a[i]
      lw   $t2, 0($a1)   # $t2=b[i]
      add  $t1,$t1,$t2   # $t1=a[i] + b[i]
      sw   $t1, 0($a2)   # c[i]=a[i] + b[i]
      addi $a0,$a0,4     # $a0++
      addi $a1,$a1,4     # $a1++
      addi $a2,$a2,4     # $a2++
      j    Loop
Exit: jr   $ra
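The compiled loop never uses the index i; it walks three pointers and compares the first against an end address. A hand-written C equivalent of that strategy, as a sketch (the name `sumarray_ptr` and the length parameter n are assumptions; the slides fix the length at 100):

```c
/* Pointer-walking form of sumarray. A sketch of what the assembly does,
   not actual compiler output. */
void sumarray_ptr(const int *a, const int *b, int *c, int n)
{
    const int *end = a + n;   /* addi $t0,$a0,400: one past the end of a[] */
    while (a != end) {        /* beq  $a0,$t0,Exit                         */
        *c = *a + *b;         /* lw, lw, add, sw                           */
        a++;                  /* each ++ adds sizeof(int) bytes,           */
        b++;                  /* which is 4 on MIPS32                      */
        c++;
    }
}
```

Calling `sumarray_ptr(x, y, z, 100)` gives the same result as the indexed version on the slide.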
Version 2
int *sumarray(int a[],int b[]) {
int i, c[100];
for(i=0;i<100;i=i+1)
c[i] = a[i] + b[i];
return c;
}
[Figure: stack diagram showing $sp, c[100], a[100], b[100]]
      addi $t0,$a0,400   # beyond end of a[]
      addi $sp,$sp,-400  # space for c
      addi $t3,$sp,0     # ptr for c
      addi $v0,$t3,0     # $v0 = &c[0]
Loop: beq  $a0,$t0,Exit
      lw   $t1, 0($a0)   # $t1=a[i]
      lw   $t2, 0($a1)   # $t2=b[i]
      add  $t1,$t1,$t2   # $t1=a[i] + b[i]
      sw   $t1, 0($t3)   # c[i]=a[i] + b[i]
      addi $a0,$a0,4     # $a0++
      addi $a1,$a1,4     # $a1++
      addi $t3,$t3,4     # $t3++
      j    Loop
Exit: addi $sp,$sp, 400  # pop stack
      jr   $ra
• c[] lives on the stack, so after the pop at Exit the returned
pointer ($v0) refers to released storage (a dangling reference)
Version 3
int * sumarray(int a[],int b[]) {
int i;
int *c;
c = (int *) malloc(100 * sizeof(int));
for(i=0;i<100;i=i+1)
c[i] = a[i] + b[i];
return c;
}
• Not reused unless freed
– Can lead to memory leaks
– Java, Scheme have garbage
collectors to reclaim free space
[Figure: memory layout with regions Code, Static, Heap (holding c[100]), Stack]
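A sketch of the heap version with the two details that matter in practice: malloc takes a size in bytes, and the caller becomes responsible for freeing the result. The name `sumarray_heap` and the explicit length parameter are assumptions made for this illustration.

```c
#include <stdlib.h>

/* Heap-allocating sumarray with an explicit length.
   Returns NULL if the allocation fails. */
int *sumarray_heap(const int a[], const int b[], size_t n)
{
    int *c = malloc(n * sizeof *c);  /* bytes: 400 for n = 100 with 4-byte ints */
    if (c == NULL)
        return NULL;
    for (size_t i = 0; i < n; i++)
        c[i] = a[i] + b[i];
    return c;                        /* caller owns c and must free() it */
}
```

A caller that drops the returned pointer without calling `free` leaks the block, which is the memory-leak case listed above.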
Version 3: Compiled Code
      addi $t0,$a0,400   # beyond end of a[]
      addi $sp,$sp,-12   # space for regs
      sw   $ra, 0($sp)   # save $ra
      sw   $a0, 4($sp)   # save 1st arg.
      sw   $a1, 8($sp)   # save 2nd arg.
      addi $a0,$zero,400
      jal  malloc
      addi $t3,$v0,0     # ptr for c
      lw   $a0, 4($sp)   # restore 1st arg.
      lw   $a1, 8($sp)   # restore 2nd arg.
Loop: beq  $a0,$t0,Exit
      ... (loop as before on prior slide)
      j    Loop
Exit: lw   $ra, 0($sp)   # restore $ra
      addi $sp, $sp, 12  # pop stack
      jr   $ra
Version 4
int * sumarray(int a[],int b[]) {
int i;
static int c[100];
for(i=0;i<100;i=i+1)
c[i] = a[i] + b[i];
return c;
}
• Compiler allocates once for
function, space is reused
– Will be changed next time
sumarray invoked
– Why describe? used in C libraries
[Figure: memory layout with regions Code, Static (holding c[100]), Heap, Stack]
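The "will be changed next time" behavior is easy to observe. In this sketch the array length is shortened to a hypothetical 4 elements (the slides use 100) and the function is renamed `sumarray_static` to keep it distinct from the slide's code.

```c
enum { SUM_LEN = 4 };  /* hypothetical small size; the slides use 100 */

/* Static-array version: every call returns the same storage. */
int *sumarray_static(const int a[], const int b[])
{
    static int c[SUM_LEN];   /* allocated once, reused on every call */
    for (int i = 0; i < SUM_LEN; i++)
        c[i] = a[i] + b[i];
    return c;
}
```

Two calls return the same address, so a result saved from the first call is overwritten by the second. A caller that needs both results has to copy the first one before calling again.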
New Topic - MIPS Programs
• Data types and addressing included in the ISA
• Compromise between application requirements and
hardware implementation
• MIPS data types
– 32-bit word
– 16-bit half word
– 8-bit bytes
• Addressing Modes
– Data
• Register
• 16-bit signed constants
• Base addressing
– Instructions
• PC-relative
• (Pseudo) direct
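The address computations behind three of these modes can be written as small C functions. This is a sketch using the standard MIPS32 definitions; the function names are hypothetical.

```c
#include <stdint.h>

/* Base addressing: register value plus a 16-bit signed constant. */
uint32_t base_address(uint32_t base, int16_t offset)
{
    return base + (uint32_t)(int32_t)offset;
}

/* PC-relative: 16-bit signed word offset added to PC + 4. */
uint32_t branch_target(uint32_t pc, int16_t offset)
{
    return pc + 4 + (uint32_t)((int32_t)offset * 4);
}

/* Pseudo-direct: 26-bit word address joined with the top 4 bits of PC + 4. */
uint32_t jump_target(uint32_t pc, uint32_t addr26)
{
    return ((pc + 4) & 0xF0000000u) | ((addr26 & 0x03FFFFFFu) << 2);
}
```

Because a branch stores only a distance from the PC, its encoding stays valid when the code is moved, while a jump stores (most of) an absolute address.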
Overview of Program Development
C program: foo.c
=> Compiler (cc) => Assembly program: foo.s
=> Assembler (as) => Object (mach lang module): foo.o
=> Linker (ld), together with lib.o => Executable (mach lang pgm): a.out
=> Loader => Memory
Assembler
• Reads and uses directives
• Replaces pseudoinstructions
– subu $sp,$sp,32     => addiu $sp, $sp, -32
– sd $a0, 32($sp)     => sw $a0, 32($sp)
                         sw $a1, 36($sp)
– mul $t7,$t6,$t5     => mult $t6,$t5
                         mflo $t7
– la $a0, 0xAABBCCDD  => lui $at, 0xAABB
                         ori $a0, $at, 0xCCDD
• Produces machine language
• Creates object file (*.o)
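The la expansion works because a 32-bit constant splits cleanly into two 16-bit immediates. A sketch of that split, with hypothetical helper names:

```c
#include <stdint.h>

/* Upper 16 bits: the immediate for lui. */
uint16_t hi16(uint32_t value)
{
    return (uint16_t)(value >> 16);
}

/* Lower 16 bits: the immediate for ori. */
uint16_t lo16(uint32_t value)
{
    return (uint16_t)(value & 0xFFFFu);
}

/* What lui followed by ori leaves in the destination register. */
uint32_t lui_ori(uint16_t hi, uint16_t lo)
{
    return ((uint32_t)hi << 16) | lo;
}
```

For 0xAABBCCDD the halves are 0xAABB and 0xCCDD, matching the lui/ori pair shown above. ori zero-extends its immediate, so no carry correction is needed in this pairing.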
Assembler Directives
• Directions to assembler that don’t produce machine
instructions
.align n        Align the next datum on a 2^n byte boundary
.text           Subsequent items put in user text segment
.data           Subsequent items put in user data segment
.globl sym      sym can be referenced from other files
.asciiz str     Store the string str in memory
.word w1…wn     Store the n 32-bit quantities in successive memory words
.byte b1..bn    Store n 8-bit values in successive bytes of memory
.float f1..fn   Store n floating-point numbers in successive memory words
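The effect of .align n on the assembler's location counter can be sketched as rounding up to a multiple of 2^n. The function name `align_up` is hypothetical, and the sketch assumes n is less than 32.

```c
#include <stdint.h>

/* Round addr up to the next multiple of 2^n (assumes n < 32). */
uint32_t align_up(uint32_t addr, unsigned n)
{
    uint32_t mask = ((uint32_t)1 << n) - 1;
    return (addr + mask) & ~mask;
}
```

With n = 2 the next datum lands on a word boundary, and with n = 0 the address is left unchanged.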
Absolute Addresses
• Which instructions need relocation editing?
• Loads / stores to variables in static area
lw/sw | $gp | $x | address
• Conditional branches
beq/bne | $rs | $rt | address
– PC-relative addressing preserved even if code moves
• Jump instructions
j/jal | xxxxx
• Direct (absolute) references to data (e.g. la)
Producing Machine Language
• Simple case
– Arithmetic, logical, shifts, etc.
– All necessary info is within the instruction already.
• Conditional branches (beq, bne)
– Once pseudoinstructions are replaced by real ones, we
know how many instructions the branch spans
– PC-relative, easy to handle
• Direct (absolute) addresses
– Jumps (j and jal)
– Direct (absolute) references to data
– These can’t be determined yet, so we create two tables
Assembler Tables
• Symbol table
– List of “items” in this file that may be used by other files.
• Labels: function calling
• Data: anything in the .data section; variables which may be
accessed across files
– First Pass: record label-address pairs
– Second Pass: produce machine code
– Can jump to a later label without first declaring it
• Relocation Table
– List of “items” for which this file needs the address.
– Any label jumped to: j or jal
• internal
• external (including lib files)
– Any piece of data (e.g. la instruction)
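The two passes can be sketched in a few lines of C. This is a simplified illustration, not how any particular assembler is written: the names are hypothetical, every source line is assumed to be one 4-byte instruction, and `labels[i]` holds the label attached to line i (or NULL).

```c
#include <string.h>

struct symbol { const char *label; unsigned address; };

/* First pass: record label-address pairs. Returns the number of symbols. */
int first_pass(const char *const labels[], int n, struct symbol *table)
{
    int count = 0;
    for (int i = 0; i < n; i++) {
        if (labels[i] != NULL) {
            table[count].label = labels[i];
            table[count].address = 4u * (unsigned)i;
            count++;
        }
    }
    return count;
}

/* Second pass helper: find a label's address.
   Returns -1 when the label is not defined in this file, in which case
   the reference would go into the relocation table for the linker. */
long lookup(const struct symbol *table, int count, const char *label)
{
    for (int i = 0; i < count; i++)
        if (strcmp(table[i].label, label) == 0)
            return (long)table[i].address;
    return -1;
}
```

Because the table is complete before the second pass starts, a jump on line 0 to a label on line 2 resolves without any forward declaration.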
Object File Format
• Object file header: size and position of the
other pieces of the object file
• Text segment: the machine code
• Data segment: binary representation of the
data in the source file
• Relocation information: identifies lines of
code that need to be “handled”
• Symbol table: list of this file’s labels and
data that can be referenced
• Debugging information
Linker (Link Editor)
• Combines object (.o) files into an executable file
• Enables separate (independent) compilation of files
– Only recompile modified file (module)
• Windows NT source is >30 M lines of code! And Growing!
• Edits the “links” in jump and link instructions
• Process (input: object files produced by assembler)
– Step 1: put together the text segments from each .o file
– Step 2: put together the data segments from each .o file and
concatenate this onto end of text segments
– Step 3: resolve references. Go through Relocation Table
and handle each entry (fill in all absolute addresses)
Resolving References
• Four types of references (addresses)
– PC-relative (e.g. beq, bne): never relocate
– Absolute address (j, jal): always relocate
– External reference (jal): always relocate
– Data reference (lui and ori): always relocate
• Linker assumes first word of first text segment is
at address 0x00000000.
• Linker knows:
– Length of each text and data segment
– Ordering of text and data segments
• Linker calculates:
– Absolute address of each label to be jumped to (internal
or external) and each piece of data being referenced
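Given the lengths and the ordering, the address calculation is simple addition. A sketch under the layout described in the linker steps (all text segments first, then all data segments directly after them); the function names and the array-based interface are assumptions made for illustration.

```c
#include <stdint.h>

/* Lay out n modules: text segments in order starting at `start`,
   then data segments concatenated onto the end of the text. */
void layout_segments(const uint32_t text_len[], const uint32_t data_len[],
                     int n, uint32_t start,
                     uint32_t text_base[], uint32_t data_base[])
{
    uint32_t addr = start;
    for (int i = 0; i < n; i++) {   /* Step 1: text segments */
        text_base[i] = addr;
        addr += text_len[i];
    }
    for (int i = 0; i < n; i++) {   /* Step 2: data segments */
        data_base[i] = addr;
        addr += data_len[i];
    }
}

/* Absolute address of a label: its segment's base plus the label's
   offset inside the object file it came from. */
uint32_t absolute_address(uint32_t segment_base, uint32_t offset)
{
    return segment_base + offset;
}
```

Each relocation entry is then patched with the absolute address computed this way for the label it depends on.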
Loader
• Executable file is stored on disk.
• Loader’s job: load it into memory and start it running.
• In reality, loader is part of the operating system (OS)
1. Reads header to determine size of text and data segments
2. Creates new address space for program large enough to
hold text and data segments, along with a stack segment
3. Copies instructions and data from executable file into memory
4. Copies arguments passed to the program onto the stack
5. Initializes machine registers, $sp = 1st free stack location
6. Jumps to start-up routine that copies program’s arguments
from stack to registers and sets the PC
7. If main routine returns, start-up routine terminates program
with the exit system call
Example: C => Asm => Obj => Exe => Run
#include <stdio.h>
int main (int argc, char *argv[]) {
int i;
int sum = 0;
for (i = 0; i <= 100; i = i + 1)
sum = sum + i * i;
printf ("The sum from 0 .. 100 is %d\n", sum);
}
Example: C => Asm => Obj => Exe => Run
        .text
        .align 2
        .globl main
main:
        subu  $sp,$sp,32
        sw    $ra, 20($sp)
        sd    $a0, 32($sp)
        sw    $0, 24($sp)
        sw    $0, 28($sp)
loop:
        lw    $t6, 28($sp)
        mul   $t7, $t6,$t6
        lw    $t8, 24($sp)
        addu  $t9, $t8,$t7
        sw    $t9, 24($sp)
        addu  $t0, $t6, 1
        sw    $t0, 28($sp)
        ble   $t0,100, loop
        la    $a0, str
        lw    $a1, 24($sp)
        jal   printf
        move  $v0, $0
        lw    $ra, 20($sp)
        addiu $sp,$sp,32
        jr    $ra
        .data
        .align 0
str:
        .asciiz "The sum from 0 .. 100 is %d\n"
Example: C => Asm => Obj => Exe => Run
Replace pseudoinstructions; assign addresses (start at 0x00)
00 addiu $29,$29,-32
04 sw    $31,20($29)
08 sw    $4, 32($29)
0c sw    $5, 36($29)
10 sw    $0, 24($29)
14 sw    $0, 28($29)
18 lw    $14,28($29)
1c mult  $14,$14
20 mflo  $15
24 lw    $24,24($29)
28 addu  $25,$24,$15
2c sw    $25,24($29)
30 addiu $8,$14, 1
34 sw    $8,28($29)
38 slti  $1,$8, 101
3c bne   $1,$0, loop
40 lui   $4, hi.str
44 ori   $4,$4,lo.str
48 lw    $5,24($29)
4c jal   printf
50 add   $2, $0, $0
54 lw    $31,20($29)
58 addiu $29,$29,32
5c jr    $31
Symbol and Relocation Tables
• Symbol Table
  Label     Address
  main:     0x00000000
  loop:     0x00000018
  str:      0x10000430
  printf:   0x004003b0
• Relocation Information
  Address      Instr. Type   Dependency
  0x00000040   HI16          str
  0x00000044   LO16          str
  0x0000004c   jal           printf
Example: C => Asm => Obj => Exe => Run
Edit addresses
00 addiu $29,$29,-32
04 sw    $31,20($29)
08 sw    $4,32($29)
0c sw    $5,36($29)
10 sw    $0, 24($29)
14 sw    $0, 28($29)
18 lw    $14, 28($29)
1c multu $14, $14
20 mflo  $15
24 lw    $24, 24($29)
28 addu  $25,$24,$15
2c sw    $25, 24($29)
30 addiu $8,$14, 1
34 sw    $8,28($29)
38 slti  $1,$8, 101
3c bne   $1,$0, -9
40 lui   $4, 4096
44 ori   $4,$4,1072
48 lw    $5,24($29)
4c jal   1048812
50 add   $2, $0, $0
54 lw    $31,20($29)
58 addiu $29,$29,32
5c jr    $31
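The edited fields at 0x40, 0x44 and 0x4c follow from the symbol table addresses of str (0x10000430) and printf (0x004003b0). A sketch of how each relocation type fills in an instruction word; the enum and the function `apply_reloc` are hypothetical, and the instruction words in the usage note are the standard MIPS32 encodings of lui $4, ori $4,$4 and jal with empty address fields.

```c
#include <stdint.h>

/* Relocation kinds named after the "Instr. Type" column; the enum
   itself is an assumption made for this sketch. */
enum reloc_type { RELOC_HI16, RELOC_LO16, RELOC_JAL };

/* Fill the address field of one instruction word. */
uint32_t apply_reloc(uint32_t word, enum reloc_type type, uint32_t symbol_addr)
{
    switch (type) {
    case RELOC_HI16:  /* upper half of the address into the 16-bit immediate */
        return (word & 0xFFFF0000u) | (symbol_addr >> 16);
    case RELOC_LO16:  /* lower half of the address into the 16-bit immediate */
        return (word & 0xFFFF0000u) | (symbol_addr & 0xFFFFu);
    case RELOC_JAL:   /* word address (byte address / 4) into the 26-bit field */
        return (word & 0xFC000000u) | ((symbol_addr >> 2) & 0x03FFFFFFu);
    }
    return word;
}
```

For str this gives immediates 0x1000 (4096) and 0x0430 (1072); for printf it gives 0x004003b0 / 4 = 1048812, the values shown in the listing.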
Example: C => Asm => Obj => Exe => Run
0x00400000  00100111101111011111111111100000
0x00400004  10101111101111110000000000010100
0x00400008  10101111101001000000000000100000
0x0040000c  10101111101001010000000000100100
0x00400010  10101111101000000000000000011000
0x00400014  10101111101000000000000000011100
0x00400018  10001111101011100000000000011100
0x0040001c  10001111101110000000000000011000
0x00400020  00000001110011100000000000011001
0x00400024  00100101110010000000000000000001
0x00400028  00101001000000010000000001100101
0x0040002c  10101111101010000000000000011100
0x00400030  00000000000000000111100000010010
0x00400034  00000011000011111100100000100001
0x00400038  00010100001000001111111111110111
0x0040003c  10101111101110010000000000011000
0x00400040  00111100000001000001000000000000
0x00400044  10001111101001010000000000011000
0x00400048  00001100000100000000000011101100
0x0040004c  00100100100001000000010000110000
0x00400050  10001111101111110000000000010100
0x00400054  00100111101111010000000000100000
0x00400058  00000011111000000000000000001000
0x0040005c  00000000000000000001000000100001
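Each 32-bit word can be read back by slicing it into fields. A sketch for the I-type layout (6-bit opcode, 5-bit rs, 5-bit rt, 16-bit signed immediate); the struct and function names are hypothetical.

```c
#include <stdint.h>

struct itype {
    unsigned opcode;  /* bits 31..26 */
    unsigned rs;      /* bits 25..21 */
    unsigned rt;      /* bits 20..16 */
    int imm;          /* bits 15..0, sign-extended */
};

/* Split an I-type MIPS instruction word into its fields. */
struct itype decode_itype(uint32_t word)
{
    struct itype f;
    f.opcode = word >> 26;
    f.rs = (word >> 21) & 0x1Fu;
    f.rt = (word >> 16) & 0x1Fu;
    f.imm = (int)(int16_t)(word & 0xFFFFu);
    return f;
}
```

The first word in the listing, 00100111101111011111111111100000 (0x27BDFFE0), decodes to opcode 9 (addiu), rs = 29, rt = 29 and immediate -32, that is addiu $29,$29,-32.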
MIPS Program - Summary
• Compiler converts a single HLL file into a
single assembly language file.
• Assembler removes pseudos, converts what
it can to machine language, and creates a
checklist for the linker (relocation table).
This changes each .s file into a .o file.
• Linker combines several .o files and
resolves absolute addresses.
• Loader loads executable into memory and
begins execution.