Instruction Set Architectures
CMSC411/Computer Architecture
These slides and all associated material are
© 2003 by J. Six and are available only for
students enrolled in CMSC411.
Science can amuse and fascinate us all, but it is engineering that changes the world.
CMSC411 – Computer Architecture / © 2003 J. Six
Use and Distribution Notice
Possession of any of these files implies understanding and
agreement to this policy.
The slides are provided for the use of students enrolled in Jeff
Six's Computer Architecture class (CMSC 411) at the University of
Maryland Baltimore County. They are the creation of Mr. Six and
he reserves all rights as to the slides. These slides are not to be
modified or redistributed in any way. All of these slides may only
be used by students for the purpose of reviewing the material
covered in lecture. Any other use, including but not limited to, the
modification of any slides or the sale of any slides or material, in
whole or in part, is expressly prohibited.
Most of the material in these slides, including the examples, is
derived from Computer Organization and Design, Second Edition.
Credit is hereby given to the authors of this textbook for much of
the content. This content is used here for the purpose of
presenting this material in CMSC 411, which uses this textbook.
Instructions and Instruction Sets
Each computer uses a certain language – each
individual command (a word in the computer’s
language) is called an instruction.
The set of all instructions that a specific computer
understands is called its instruction set.
There are multiple machine languages (instruction
sets). While each is different, reflecting the design
choices made during its construction, most
instruction sets are similar.
Throughout this course, we will study the MIPS
instruction set, with occasional comparisons to
others. MIPS-based microprocessors are currently
used by Silicon Graphics/SGI, NEC, and Cisco
Systems (among others).
At the Beginning: Adding
The most basic instruction for every computer
is that which performs addition.
The MIPS instruction for adding numbers is
quite simple…
add a, b, c
This instruction tells the computer to add the
two variables b and c and to put the sum in
the variable a.
This syntax is fixed … the add instruction in
MIPS always takes exactly three operands: two
source operands and one destination for the result.
Adding More than Two Numbers
Since the instruction format for adding is fixed,
multiple instructions would need to be used to
add more than two numbers. For instance, to
add b, c, d, and e, putting the result in a…
add a, b, c    # sum of b & c is now stored in a
add a, a, d    # sum of b, c, & d is now stored in a
add a, a, e    # sum of b, c, d, & e is now stored in a
The sharp symbol (#) begins a comment. Comments
are ignored by the computer. A comment in this
language always ends at the end of the line.
Also note that one and only one command can
appear on a line.
Fixed Operands: A Design Principle
Addition naturally favors three arguments,
two operands and the sum.
Hardware for a fixed number of operands is
much less complex than hardware that could
support a variable number of operands (it’s
always easier to do two rather than to do
one, two, three, or any other number).
This gives rise to one of the cardinal rules of
computer design…
Simplicity favors regularity.
C to Assembler: A Basic Example
A compiler transforms a program written in a
high-level language (like C) into assembly language
(what we have just seen).
A simple C program…
a = b + c;
d = a - e;
… becomes a simple assembly language
program…
add a, b, c
sub d, a, e
C to Assembler: A More Complex Example
A more complex C statement…
f = (g + h) - (i + j);
… also becomes a fairly simple assembly language
program. Here, we need some temporary storage, so
let’s call these temporary variables t0 and t1 (there’s a
reason for those names - we’ll get to that in a bit)…
add t0, g, h     # temp var t0 = g + h
add t1, i, j     # temp var t1 = i + j
sub f, t0, t1    # f = t0 - t1
Registers
So far, we have used the non-descriptive term
variable for the operands to our instructions.
At the assembly level, the operands of
arithmetic instructions must be located in
registers.
Registers are one of the fundamental concepts
of computer architecture – they are high speed
memory locations located on the same die as
the microprocessor. They are directly
addressable by the CPU.
The size of registers varies based on the
instruction set – MIPS registers are 32 bits.
Register Limitations
There are typically a small number of
registers in a CPU…
- MIPS has 32 32-bit general purpose registers.
- Intel IA-32/x86 has 8 32-bit general purpose registers.
This (severe) limitation on the number of
registers is typically imposed for two reasons…
- Speed – smaller is normally faster
- Cost – Registers are expensive
Modern computer designs have seen an
explosive growth in general purpose
registers…
- Intel IA-64 has 128 64-bit general purpose registers.
General Purpose Registers
These are registers that are
used for normal instructions –
such as holding arithmetic operands.
Non-GP registers would be
registers like the stack pointer,
which contains the location of
the system stack data structure.
Loading Information into Registers
Since all operands for arithmetic instructions
must be in registers, there needs to be a way
to put information from main memory into a
register.
In MIPS, this is accomplished using the load-word
instruction (lw). The operands for this
instruction are the name of the register to be
loaded, and a constant followed by a register
in parentheses.
The memory location to load the value from
is formed by adding the constant to a pointer
contained in that last register.
Memory Addressing
Memory in modern microprocessors is byte
addressable. That means that each individual
byte can be specified by a memory address.
Each microprocessor has the concept of a
word – this is the “preferred size” of data
values for this architecture…
- A MIPS word is 32 bits.
- An Intel IA-32/x86 word is 32 bits.
- An Intel IA-64 word is 64 bits.
Most architectures enforce alignment
restrictions – each word access must be at an
address that is a multiple of the word size.
Alignment Restrictions
For example, a word can be found at
memory locations 0, 4, 8, and so forth.
It is not legal to attempt to access a
word at location 3.
[Figure: memory drawn as a column of 32-bit words at Locations 0, 4, 8, 12, 16, and 20.]
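The alignment rule can be sketched as a quick check. This is a minimal illustration in Python rather than MIPS; the helper name is_word_aligned and the constant WORD_SIZE_BYTES are names chosen for this example only.

```python
WORD_SIZE_BYTES = 4  # a MIPS word is 32 bits = 4 bytes


def is_word_aligned(address: int) -> bool:
    """Return True if a word access at this byte address is legal."""
    return address % WORD_SIZE_BYTES == 0


# Locations 0, 4, and 8 are legal word addresses; location 3 is not.
for address in (0, 3, 4, 8):
    print(address, is_word_aligned(address))
```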
Memory Address Space
The entire memory that is addressable by a
microprocessor is referred to as its address
space.
MIPS has a 32-bit address space. That
means that all memory addresses are 32 bits
long. This allows a maximum of 2^32 memory
locations (remember each memory location is
a byte – this is byte addressable memory).
This means that 32-bit microprocessors can
address 4 GB of memory.
- The first (lowest) byte address is 0.
- The last (highest) byte address is 4294967295.
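The numbers above follow directly from the address width. A small sketch of the arithmetic, written in Python for convenience (the variable names are illustrative, and 1 GB is taken to mean 2^30 bytes):

```python
ADDRESS_BITS = 32

num_locations = 2 ** ADDRESS_BITS            # one location per byte
highest_address = num_locations - 1          # addresses start at 0
size_in_gb = num_locations // (2 ** 30)      # assuming 1 GB = 2**30 bytes

print(num_locations, highest_address, size_in_gb)
```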
The Load-Word Instruction
Consider the C language statement…
g = h + A[8];
We need to get the value stored in main memory at
location 8 in the array A into a register in order to
perform this instruction. If the base address of A is
in $s3, this can be accomplished using the load-word
instruction…
lw $t0, 8($s3)
…now that the value A[8] is in register $t0, we can
perform the addition…
add $s1, $s2, $t0    # g = h + A[8]
Memory Layout: Endianness
One design decision faced by every microprocessor
designer is how to store multibyte values in
memory.
Big endian processors store the leftmost (highest
order) byte at the actual address of the value – little
endian processors store the rightmost (lowest order)
byte at the actual address of the value.
For example, the value 00000000 11111111
01010101 10101010 stored at location 1024…
Location         Big Endian    Little Endian
Location 1027    10101010      00000000
Location 1026    01010101      11111111
Location 1025    11111111      01010101
Location 1024    00000000      10101010
MIPS is big endian.
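The two layouts can be reproduced with a short sketch. This is an illustration in Python, not part of the MIPS toolchain; the function name layout and the constant BASE_ADDRESS are names made up for this example, and int.to_bytes does the byte ordering.

```python
value = 0b00000000_11111111_01010101_10101010  # the example value from the slide
BASE_ADDRESS = 1024


def layout(value: int, byteorder: str) -> dict:
    """Map each byte address to the 8-bit pattern stored there."""
    raw = value.to_bytes(4, byteorder)  # raw[0] is the byte at the lowest address
    return {BASE_ADDRESS + i: format(b, "08b") for i, b in enumerate(raw)}


big = layout(value, "big")
little = layout(value, "little")

# Print from the highest address down, like the diagram.
for address in sorted(big, reverse=True):
    print(address, big[address], little[address])
```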
Revisiting the Load-Word Instruction
Looking again at the C language statement…
g = h + A[8];
We need to get the value stored in main memory
at location 8 in the array A into a register in
order to perform this instruction. Assuming A is
an integer array and since integers are 32 bits in
the MIPS architecture, the offset must be 4 x 8 =
32, since each entry in A is four bytes big and
memory is byte addressable. We now have the
load-word instruction…
lw $t0, 32($s3)
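The offset calculation is just the index multiplied by the element size. A minimal sketch in Python (the helper names and the base address used in the usage line are hypothetical):

```python
WORD_SIZE_BYTES = 4  # each integer in A occupies one 32-bit word


def element_offset(index: int, element_size: int = WORD_SIZE_BYTES) -> int:
    """Byte offset of A[index] from the base address of A."""
    return index * element_size


def element_address(base: int, index: int) -> int:
    """Byte address of A[index], given the base address held in a register."""
    return base + element_offset(index)


print(element_offset(8))                   # offset used in lw $t0, 32($s3)
print(hex(element_address(0x1000, 8)))     # 0x1000 is a made-up base address
```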
The Store-Word Instruction
We have an instruction to load a value
from memory into a register – now we
need an instruction to take a value from a
register and copy it into memory.
This is accomplished using the store-word
(sw) instruction, which has the
same syntax as the load-word
instruction.
Store-Word Example
For example, consider the C statement…
A[12] = h + A[8];
Assuming A is an array of 100 integers, the base
address of A is stored in register $s3, and the
variable h is already in register $s2, this
statement could be implemented in MIPS
assembly language as…
lw  $t0, 32($s3)     # $t0 <= A[8]
add $t0, $s2, $t0    # $t0 <= h + A[8]
sw  $t0, 48($s3)     # A[12] <= $t0
Machine Instructions
So how are these instructions actually represented in
the computer?
Like everything else … as a number.
In fact, each portion of an instruction is usually
represented as a number and then the numbers are
abutted together to form one instruction.
So far we have used registers like $s2 and $t0.
These names are for our benefit … the CPU only
understands numbered registers. We normally use a
convention for name->number register mappings.
So far we have the s registers and the t registers
(what the s and t represent will be discussed soon).
In MIPS design, $s0->$s7 map to registers 16->23
and $t0->$t7 map to registers 8->15.
Instruction Format: R-Type
In the MIPS architecture, all instructions are 32 bits wide. The
layout of these bits is known as the instruction format.
The addition and subtraction instructions are considered R-type
(R for register) instructions in MIPS – they follow a standard
format, common among all R-type instructions…
opcode    rs        rt        rd        shamt     funct
6 bits    5 bits    5 bits    5 bits    5 bits    6 bits
Opcode – This is the basic operation – what does this instruction do?
Rs – This is the first operand register.
Rt – This is the second operand register.
Rd – This is the destination register.
Shamt – This is the shift amount – it is only used in shift
instructions which will be discussed at a later point.
Funct – This is the function (or function code). This selects
the specific variant of the operation (represented in the opcode)
that is desired.
Instruction Formats
Remember our first design principle,
simplicity favors regularity – all MIPS
instructions are 32 bits and all register based
instructions use the common R-type format.
Ideally, all instructions would use the same
format. However, this is not always possible
– how would one encode a data transfer
(lw/sw) instruction using an R-type instruction
format?
Since we need a (potentially large) offset
encoded in the instruction and only two
registers, the R-type format is not ideal.
Instruction Format: I-Type
This leads to another design principle…
Good design demands good compromises.
Data transfer instructions are encoded in MIPS using
the I-type instruction format…
opcode    rs        rt        address
6 bits    5 bits    5 bits    16 bits
Opcode – This is the basic operation – what does this instruction do?
Rs – This is the first operand register (the base address).
Rt – This is the destination register (loaded into or stored
from).
Address – This is the offset for the instruction.
So, lw/sw instructions can reference a region
+/- 2^15 (32768) bytes from the base address.
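The reach of a 16-bit offset can be checked with a few lines. This sketch assumes the offset is a signed two's-complement value, which is how MIPS interprets it, so the exact range is -32768 to +32767; the names are illustrative only.

```python
OFFSET_BITS = 16
OFFSET_MIN = -(2 ** (OFFSET_BITS - 1))      # most negative signed 16-bit value
OFFSET_MAX = 2 ** (OFFSET_BITS - 1) - 1     # most positive signed 16-bit value


def fits_in_offset(offset: int) -> bool:
    """Return True if the offset can be encoded in the I-type address field."""
    return OFFSET_MIN <= offset <= OFFSET_MAX


print(OFFSET_MIN, OFFSET_MAX)
print(fits_in_offset(1200), fits_in_offset(40000))
```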
MIPS Opcodes
Note that the R-type and I-type formats are similar
(they both start with the opcode and then the
registers). Each opcode has a specific instruction
format and the CPU can look at the opcode to see
what format the rest of the instruction is in.
Let’s look at how the four instructions we have been
looking at are encoded…
instruction  format  opcode  rs   rt   rd   shamt  function  address
add          R       0       reg  reg  reg  0      32        NA
sub          R       0       reg  reg  reg  0      34        NA
lw           I       35      reg  reg  NA   NA     NA        addr
sw           I       43      reg  reg  NA   NA     NA        addr
reg = register number, NA = field does not appear in this instruction format,
addr = 16 bit address/offset
C -> Assembly -> Machine
Let’s consider this C statement…
A[300] = h + A[300];
Assuming $t1 has the base address of A and $s2 corresponds
to h, this statement can be compiled into MIPS assembly…
lw $t0, 1200($t1)
add $t0, $s2, $t0
sw $t0, 1200($t1)
We can then express the assembly language in machine
language (let’s start with decimal numbers)…
opcode  rs  rt  rd  shamt  funct
35      9   8   1200 (address)
0       18  8   8   0      32
43      9   8   1200 (address)
C -> Assembly -> Machine
Now that we have the machine instructions…
opcode  rs  rt  rd  shamt  funct
35      9   8   1200 (address)
0       18  8   8   0      32
43      9   8   1200 (address)
…we can express our instructions in binary,
just like they are stored in the computer…
opcode   rs     rt     rd     shamt  funct
100011   01001  01000  0000 0100 1011 0000 (address)
000000   10010  01000  01000  00000  100000
101011   01001  01000  0000 0100 1011 0000 (address)
C -> Assembly -> Machine
A[300] = h + A[300];
lw $t0, 1200($t1)
add $t0, $s2, $t0
sw $t0, 1200($t1)
10001101001010000000010010110000
00000010010010000100000000100000
10101101001010000000010010110000
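The three bit patterns can be rebuilt by shifting each field into place. The sketch below is in Python and is only an illustration of the R-type and I-type layouts; the function names encode_r and encode_i and the small REGISTERS table are made up for this example, with register numbers taken from the naming convention given earlier.

```python
REGISTERS = {"$t0": 8, "$t1": 9, "$s2": 18}  # register name -> register number


def encode_r(opcode, rs, rt, rd, shamt, funct):
    """Pack R-type fields (6/5/5/5/5/6 bits) into a 32-bit string."""
    word = (opcode << 26) | (rs << 21) | (rt << 16) | (rd << 11) | (shamt << 6) | funct
    return format(word, "032b")


def encode_i(opcode, rs, rt, address):
    """Pack I-type fields (6/5/5/16 bits) into a 32-bit string."""
    word = (opcode << 26) | (rs << 21) | (rt << 16) | (address & 0xFFFF)
    return format(word, "032b")


t0, t1, s2 = REGISTERS["$t0"], REGISTERS["$t1"], REGISTERS["$s2"]
lw_bits = encode_i(35, t1, t0, 1200)        # lw  $t0, 1200($t1)
add_bits = encode_r(0, s2, t0, t0, 0, 32)   # add $t0, $s2, $t0
sw_bits = encode_i(43, t1, t0, 1200)        # sw  $t0, 1200($t1)
print(lw_bits, add_bits, sw_bits, sep="\n")
```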
The Stored Program Concept
So, programs are made up of instructions and
instructions are simply numbers – programs are just
numbers stored in memory.
This means that the same memory can contain
program source code, the compiled machine code,
the data being used, created, and manipulated by the
program, and even the compiler used to compile the
program (it’s all numbers!).
This is known as the stored program concept.
[Figure: a processor connected to a memory that holds an accounting program (machine code), an editor program (machine code), a C compiler (machine code), payroll data, book text, and source code in C for the editor program.]
Decision Making: Branching Instructions
Computers are capable of making decisions – a
capability that is frequently used in programming
languages using the if statement.
The MIPS instruction set supports decision making
using two conditional branch instructions. Both of
these instructions involve two registers and a label.
- The branch-if-equal (beq) instruction goes to the
labeled point if the two registers contain the same
value…
beq $s1, $s2, LABEL1
- The branch-if-not-equal (bne) instruction goes to
the labeled point if the two registers do not
contain the same value…
bne $s1, $s2, LABEL2
Labels
Remember, program instructions are stored in
memory, just like data. The label used in the
conditional branch instructions is simply a name for a
specific memory address (that contains the
instruction that should be executed next if the branch
is taken).
The compiler will often generate labels where they
are needed (often in places that you would not think
of). This is one of the many benefits of
programming in high-level languages.
This label is converted into an actual address when
the program is converted from assembly language to
machine language (this is normally done by a
program known as an assembler).
Compiling if-then-else Statements
If-then-else statements compile nicely with beq and
bne instructions.
For example, let’s compile the statement…
if (i==j) f=g+h; else f=g-h;
Assume f->i are stored in $s0->$s4. This can be
compiled into MIPS assembly language quite easily…
      bne $s3, $s4, Else    # go to Else if i != j
      add $s0, $s1, $s2     # f = g + h
      j   Exit              # skip over the else part
Else: sub $s0, $s1, $s2     # f = g - h
Exit:
The jump (j) instruction is an unconditional branch.
It simply goes to the specified label/address.
Compiling Loops
Conditional branches are also useful for loops.
For example, let’s compile the loop…
while (save[i] == k)
i += j;
Assume i,j,k are in $s3,$s4,$s5 and the base address
of the integer array save is in $s6. Let’s compile…
Loop: add $t1, $s3, $s3     # temp = i * 2
      add $t1, $t1, $t1     # temp = i * 4
      add $t1, $t1, $s6     # addr of save[i] in $t1
      lw  $t0, 0($t1)       # load save[i] into $t0
      bne $t0, $s5, Exit    # check loop condition
      add $s3, $s3, $s4     # loop body: i = i + j
      j   Loop              # we’re not done yet
Exit:
The Zero Register and the set-on-less-than Instruction
MIPS includes the set-on-less-than (slt)
instruction, which sets the destination register to
one if the first operand register is less than the
second operand register, zero otherwise…
slt $t0, $s0, $s1    # $t0 gets 1 if $s0 < $s1, 0 otherwise
Using only the slt, beq, and bne instructions, a
compiler can produce any conditional expression
(<, >, ==, !=, <=, >=). Note that sometimes
this requires the value zero … the MIPS register
$zero (register 0) is a zero bucket … it always
contains a zero and writing anything to it simply
discards the write.
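One way to see why slt plus beq/bne is enough is to model slt as a function and build the remaining comparisons from it. The sketch below is in Python; the function names are illustrative, and the comments show one possible instruction pairing, not necessarily the sequence a particular compiler emits.

```python
ZERO = 0  # stands in for the $zero register


def slt(a: int, b: int) -> int:
    """Model of slt: 1 if a < b, otherwise 0."""
    return 1 if a < b else 0


def less_than(a, b):          # slt $t0, a, b ; bne $t0, $zero, Label
    return slt(a, b) != ZERO


def greater_than(a, b):       # slt $t0, b, a ; bne $t0, $zero, Label
    return slt(b, a) != ZERO


def less_or_equal(a, b):      # slt $t0, b, a ; beq $t0, $zero, Label
    return slt(b, a) == ZERO


def greater_or_equal(a, b):   # slt $t0, a, b ; beq $t0, $zero, Label
    return slt(a, b) == ZERO


print(less_than(3, 5), greater_than(3, 5), less_or_equal(5, 5), greater_or_equal(3, 5))
```

Equality and inequality need no slt at all, since beq and bne test them directly.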
Computer Hardware Function Support
Most modern high-level languages employ the
concept of a function (or procedure or method).
When a function is called, there are generally six
steps which are performed by the computer…
1. Parameters are placed somewhere that the called
function can access them.
2. Control is transferred to the called function.
3. Storage resources needed in the called function are
acquired.
4. The called function’s task is performed.
5. The return value is placed somewhere that the
calling function can access it.
6. Control is transferred back to the point of origin.
In/Out Registers
As already discussed, registers are the fastest
place to store information.
The MIPS architecture allocates seven of its
registers for function calling…
- $a0 -> $a3 are argument registers – they are
used to pass data into a function.
- $v0 -> $v1 are return registers – they are used to
pass data out of a function.
- $ra is the return address register – this is used to
store a pointer to the point of origin (the place
where the function was called; this is used to
jump back to after the called function completes).
Function Calling
Functions are typically called in the MIPS
architecture using the jump-and-link (jal)
instruction. This instruction takes one argument,
the memory address that the function begins at.
Upon execution, control is passed to the function
entry point. The return address is automatically
stored in the $ra register – this is the address
that will be jumped to after the function
completes; it is the address of the instruction
right after the jal instruction.
This instruction is referred to as CALL in other
popular instruction sets.
The Program Counter
For all of this to work, the address of the current
instruction must be stored somewhere (how
about a register?).
This is done in MIPS using the $pc (or PC)
register – PC stands for program counter. This is
also referred to as an instruction pointer.
Therefore, when a jal instruction is encountered
(the PC points to a jal instruction), PC+4 is
stored in $ra and PC changes to the memory
address specified in the jal instruction.
00001348    add $t0, $t0, $t0
0000134C    jal 00001D60
00001350    add $t1, $t1, $t1
While the jal executes: $pc = 0000134C
After the jal executes: $pc = 00001D60, $ra = 00001350
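The register updates for this example can be traced with a tiny model. This sketch is in Python and follows the PC+4 rule exactly as stated above; the function names jal and jr simply mirror the instructions and are not a real simulator API.

```python
def jal(pc: int, target: int):
    """Return the new (pc, ra) pair after a jal located at address pc."""
    ra = pc + 4          # address of the instruction right after the jal
    return target, ra    # control moves to the function entry point


def jr(register_value: int) -> int:
    """Return the new pc after jumping to the address held in a register."""
    return register_value


pc, ra = jal(0x0000134C, 0x00001D60)
print(f"$pc = {pc:08X}, $ra = {ra:08X}")

pc = jr(ra)              # the function returns
print(f"$pc = {pc:08X}")
```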
Returning from a Function
So once we are in a function and we’re all
done, how do we get back?
The return address has been stored in the $ra
register, so we simply need to jump to it.
This can be accomplished using the MIPS
jump-to-register (jr) instruction. This
instruction takes one parameter, the register
that contains the address to jump to.
Since the $ra register has the return address,
the end of the function is very simple…
jr $ra
Function Call Flow
The calling function (the caller) puts input
parameters for the called function (the callee)
into registers $a0->$a3.
It then uses the jal instruction to jump to the
other function’s entry point.
jal address
The callee performs the required function and
puts the results into registers $v0->$v1.
It then returns to the next instruction in the
caller by using the jr instruction and the $ra
register.
jr $ra
Register Spilling
A function call should “cover its own tracks.”
It should not alter the register contents of the
calling function.
So, if the called function needs more than the
$a0->$a3 and $v0->$v1 registers, some
registers must be spilled. This means that
the register contents are copied into memory,
the registers are used, and then the original
contents are restored from memory.
The ideal data structure for spilling registers
is a stack. This is a basic last-in-first-out data
structure.
Stack Layout
[Figure: three views of the stack, drawn from high address (top) to low address (bottom). (a) $sp marks the top of the stack before the call. (b) The contents of register $t1, register $t0, and register $s0 have been stored below that point, and $sp points at the saved $s0. (c) $sp is back where it was in (a).]
For instance, if a function needs to use $t1, $t0, and
$s0, those registers must be spilled while in the
called function.
In this diagram, (a) shows the stack before the
function call, (b) shows the stack during the function
call, and (c) shows the stack after the function call.
Stack Layout – The Details
The stack is a data structure that is typically
managed by the program with some assistance
from the hardware.
In MIPS, one of the registers, the stack pointer
($sp), is reserved for storing the address of the
most recently allocated location in the stack
(where did the last thing that was put on the
stack end up in memory?).
The stack pointer is adjusted by one word for
each register that is spilled onto the stack
(remember, a word in MIPS is 32 bits, the size
of a register).
Stack Operations
Placing something on the stack is known as
pushing it onto the stack.
Removing something from the stack is known
as popping it off of the stack.
Stacks normally grow down – they start at
higher memory addresses and each push gets
stored at a lower memory address.
Each push moves the stack pointer down by 4
bytes ($sp = $sp – 4).
Each pop moves the stack pointer up by 4
bytes ($sp = $sp + 4).
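Push and pop on a downward-growing stack can be modeled in a few lines. This is a sketch in Python, not MIPS; the Stack class, its field names, and the starting address are all hypothetical and exist only to show how $sp moves.

```python
WORD = 4  # bytes per MIPS word


class Stack:
    """Toy model of a stack that grows toward lower addresses."""

    def __init__(self, initial_sp: int):
        self.sp = initial_sp   # plays the role of $sp
        self.memory = {}       # byte address -> stored word

    def push(self, value: int) -> None:
        self.sp -= WORD                  # $sp = $sp - 4
        self.memory[self.sp] = value     # store at the new top of the stack

    def pop(self) -> int:
        value = self.memory.pop(self.sp) # read the word at the top of the stack
        self.sp += WORD                  # $sp = $sp + 4
        return value


stack = Stack(initial_sp=0x7FFFFFFC)     # made-up starting address
stack.push(11)
stack.push(22)
print(hex(stack.sp))                     # two words lower than where it started
```

Popping returns the values in the reverse order they were pushed, which is the last-in-first-out behavior that makes a stack suitable for spilling registers.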
Function Prologues and Epilogues
[Figure: the stack (a) before, (b) during, and (c) after the function call, with the contents of registers $t1, $t0, and $s0 stored on the stack in (b).]
Going from (a) to (b)…
sub $sp, $sp, 12
sw  $t1, 8($sp)
sw  $t0, 4($sp)
sw  $s0, 0($sp)
This is the very beginning of the
function. It is sometimes referred
to as the function prologue.
Going from (b) to (c)…
lw  $s0, 0($sp)
lw  $t0, 4($sp)
lw  $t1, 8($sp)
add $sp, $sp, 12
jr  $ra
This is the very end of the
function. It is sometimes referred
to as the function epilogue.
Intel x86 and the Stack
Some microprocessors have more explicit support
for the stack data structure.
Intel, for example, has PUSH and POP
instructions in the x86 instruction set (both take
a register as an argument)…
- PUSH copies that register onto the current end of the
stack and moves the stack pointer down 4 bytes (32 bits).
- POP copies the current piece of data on the stack into
that register and moves the stack pointer up 4 bytes (32 bits).
MIPS
sub $sp, $sp, 12
sw  $t1, 8($sp)
sw  $t0, 4($sp)
sw  $s0, 0($sp)
x86
PUSH EAX ($t1)
PUSH EBX ($t0)
PUSH ECX ($s0)
Register Semantics
The MIPS $t# registers are known as temporary
registers. They are just like temporary variables –
their values are no longer needed once the
computation is complete.
Therefore, MIPS provides two categories of
registers with different semantics…
- $t0->$t9 – these are 10 temporary registers that are
not preserved by the called function on a function
call (their values can change within the called
procedure and are not restored prior to returning).
- $s0->$s7 – these are 8 saved registers that are
preserved by the called function (if their values are
changed within the called procedure, the original
values must be restored prior to returning).
So, in our previous example, we did not need
to spill $t0 and $t1, only $s0.
Nested Procedures
So, when a procedure gets
called, the jal instruction saves
the return address into the $ra
register.
If the called procedure calls
another procedure, how is this
accounted for?
Well, the $ra value must be
spilled before the jal instruction
is executed (otherwise, the first
return address would be
overwritten and lost).
sub $sp, $sp, 16
sw  $ra, 12($sp)
sw  $t1, 8($sp)
sw  $t0, 4($sp)
sw  $s0, 0($sp)
… function body …
lw  $s0, 0($sp)
lw  $t0, 4($sp)
lw  $t1, 8($sp)
lw  $ra, 12($sp)
add $sp, $sp, 16
jr  $ra
If there is a jump-and-link instruction in this
function body, we need the sw and lw
instructions involving $ra. If there are no
function calls in the body, they are not needed.
Local Data
The stack is also used for function local
storage (arrays, variables, structures, and so
forth that are local to the function).
The segment of the stack that is used by a
function (including saved/spilled registers and
any local variables) is called the stack frame.
To keep track of all of this, MIPS uses a
second register to keep track of the stack, the
frame pointer ($fp). This register points to
the first word of the current function’s stack
frame.
The Frame Pointer
[Figure: three views of the stack, drawn from high address (top) to low address (bottom), each with $fp above $sp. In (b), the region between $fp and $sp holds, from top to bottom: saved argument registers (if any), saved return address, saved saved registers (if any), and local arrays and structures (if any).]
Here, (a), (b), and (c) represent the stack
before, during, and after the function call.
Note that the frame pointer points to the first
word of the stack frame and the stack pointer
points to the top of the stack.
Accessing Data on the Stack
Looking at the stack…
[Figure: one stack frame. $fp points at the top of the frame, which holds, from top to bottom: saved argument registers (if any), saved return address, saved saved registers (if any), and local arrays and structures (if any). $sp points at the bottom.]
- Normally saved registers,
including the $ra, are accessed
relative to the frame pointer.
- Local variables are normally
referenced relative to the stack
pointer (as we sometimes do not
know at compile time how many
local variables might be pushed
onto the stack and only $sp
moves with each stack push).
- If there are no local variables,
the frame pointer is not
normally used.
Immediate Addressing
Many common operations in computer
programs involve constants (such as in the C
statement x=x+12;).
To allow this, MIPS has an immediate
addressing mode in which the constant is
encoded right into the instruction, such as the
instructions that we have already seen for
adding and subtracting the stack pointer.
MIPS defines the instruction format involving
constants (or immediate data) – this is the
I-type we have already seen – as having a 16 bit
data field.
op    rs    rt    immediate
Immediate Addressing and Constant Comparison
Immediate addressing comes up a lot when doing
comparisons and arithmetic operations.
Comparisons are accomplished using the
immediate version of the set-on-less-than
instruction (slti)…
slti $t0, $s2, 10    # $t0=1 if $s2 < 10
The arithmetic instructions have immediate
versions as well…
addi $sp, $sp, 4
op        rs       rt       immediate
8         29       29       4
001000    11101    11101    0000 0000 0000 0100
The Common Case
Constants occur in arithmetic operations and
comparisons a lot.
The inclusion of the immediate addressing
versions of arithmetic and set-on-less-than
instructions in the MIPS architecture is an
illustration of “making the common case
fast,” something that is a prevailing theme in
computer system performance.
This is so because it is much quicker to get
the constant right from the instruction than to
keep it in memory and load it into a register
when necessary.
Immediate Data Size
The immediate addressing mode in MIPS allows 16 bit
constants. So, when a 32 bit constant is necessary, how
can this value be loaded and used?
MIPS provides a load upper immediate (lui) instruction
that takes a 16 bit constant and copies it into the upper
16 bits of the target register (filling the lower 16 bits
with zeros).
lui $t0, 255
op        rs       rt       immediate
001111    00000    01000    0000 0000 1111 1111
Before instruction execution…
$t0: 0101 0101 0101 0101 0101 0101 0101 0101
After instruction execution…
$t0: 0000 0000 1111 1111 0000 0000 0000 0000
Loading a 32-bit Constant
Let’s say we wanted to load the 32-bit
constant
0000 0000 0011 1101 0000 1001 0000 0000
into $s0.
lui  $s0, 61           # 61 = 0000 0000 0011 1101
addi $s0, $s0, 2304    # 2304 = 0000 1001 0000 0000
Before either instruction…
$s0: xxxx xxxx xxxx xxxx xxxx xxxx xxxx xxxx
After the lui…
$s0: 0000 0000 0011 1101 0000 0000 0000 0000
After the addi…
$s0: 0000 0000 0011 1101 0000 1001 0000 0000
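The two-instruction sequence can be checked with a small model of lui and addi. This is a sketch in Python; the function names mirror the instructions but are otherwise made up, and the addi model includes the sign extension of the 16-bit immediate, which does not matter here because 2304 is below 32768.

```python
MASK32 = 0xFFFFFFFF  # keep results within a 32-bit register


def lui(immediate: int) -> int:
    """Place a 16-bit constant in the upper half; the lower half becomes zero."""
    return (immediate << 16) & MASK32


def addi(register_value: int, immediate: int) -> int:
    """Add a 16-bit immediate, sign-extended to 32 bits, to a register value."""
    if immediate & 0x8000:       # top bit set means a negative immediate
        immediate -= 0x10000
    return (register_value + immediate) & MASK32


s0 = lui(61)
print(format(s0, "032b"))        # upper half filled in, lower half zero
s0 = addi(s0, 2304)
print(format(s0, "032b"), s0)    # the full 32-bit constant
```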
Jump Addressing
The addressing mode associated with branch
instructions again follows the “keep it simple”
design idea.
The jump instruction follows the MIPS J-type
addressing format, which consists of the
opcode and the address to jump to…
opcode      jump target address
(6 bits)    (26 bits)
The opcode for jump is 2. So, to jump to
memory address 10000, the machine
language instruction is…
2           10000
Branch Addressing
Conditional branch instructions must also specify
two registers (for the condition checking)…
opcode      rs          rt          branch target (PC-relative offset)
(6 bits)    (5 bits)    (5 bits)    (16 bits)
A 16-bit field is too small to hold an absolute
address for modern programs (a branch could only
reach the lowest 2^16 bytes of memory!). So, branch
instructions employ a PC-relative addressing mode –
the 16-bit field in the branch instruction is treated
as an offset and is added to the program counter to
form the branch target.
PC-Relative Branch Addressing
Actually, in MIPS the branch target offset is added to
the program counter plus four (the address of the
instruction immediately after the branch instruction).
This gives us…
New PC = Old PC + 4 + Offset
The reason for this will be explored later, during a discussion on
microprocessor control path design. It turns out that it
is convenient for the hardware to increase the PC early
to point to the next instruction – since this has already
been computed, it is efficient to use it.
PC-relative addressing is useful because the destination
of conditional branching is highly likely to be local to
the branch. In contrast, jump (and jump-and-link) has
no such spatial locality characteristics. (Why is this?)
More Quirks
in MIPS Addressing
As all MIPS instructions are 4 bytes long, the
PC-relative offset in a branch instruction is
actually the number of words (not bytes)
to the target instruction (this gives us four
times the range).
The direct (J-type) addressing used in jump
instructions has a 26-bit target address field.
This field is also a word address: shifted left
by two bits, it forms the low 28 bits of the
target address. The high 4 bits are
copied from the current value of the PC. This
is known as pseudodirect addressing.
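The two address calculations can be sketched in Python. The function names are illustrative. The jump calculation assumes the usual MIPS definition: the 26-bit field is a word address, and the upper four bits are taken from PC + 4.

```python
def sign_extend_16(field):
    """Interpret a 16-bit field as a signed two's-complement value."""
    return field - 0x10000 if field & 0x8000 else field


def branch_target(pc, offset_field):
    """PC-relative: new PC = old PC + 4 + (word offset * 4)."""
    return (pc + 4 + (sign_extend_16(offset_field) << 2)) & 0xFFFFFFFF


def jump_target(pc, address_field):
    """Pseudodirect: upper 4 bits of PC + 4, then the 26-bit field, then 00."""
    upper = (pc + 4) & 0xF0000000
    return upper | ((address_field & 0x03FFFFFF) << 2)


print(branch_target(80016, 2))      # branch two words past the next instruction
print(jump_target(80024, 20000))    # jump to word address 20000
```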
PC-Relative
Addressing Example
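Address  Label  Instruction           Machine language fields
80000    Loop:  add $t1, $s3, $s3     0  | 19 | 19 | 9  | 0 | 32
80004           add $t1, $t1, $t1     0  | 9  | 9  | 9  | 0 | 32
80008           add $t1, $t1, $s5     0  | 9  | 21 | 9  | 0 | 32
80012           lw  $t0, 0($t1)       35 | 9  | 8  | 0
80016           bne $t0, $s5, Exit    5  | 8  | 21 | 2
80020           add $s3, $s3, $s4     0  | 19 | 20 | 19 | 0 | 32
80024           j   Loop              2  | 20000
80028    Exit:  …
(The bne field holds a word offset from PC + 4: (80028 – 80020) / 4 = 2.
The j field holds the word address of Loop: 80000 / 4 = 20000.)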
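The fields in this example can be reproduced with a small Python sketch. The helpers r_type, i_type and branch_offset are hypothetical names. The register numbers follow the MIPS register table ($t0 = 8, $t1 = 9, $s3 = 19, $s5 = 21).

```python
def r_type(rs, rt, rd, shamt, funct, opcode=0):
    """Pack an R-type instruction: op | rs | rt | rd | shamt | funct."""
    return (opcode << 26) | (rs << 21) | (rt << 16) | (rd << 11) | (shamt << 6) | funct


def i_type(opcode, rs, rt, immediate):
    """Pack an I-type instruction: op | rs | rt | 16-bit immediate."""
    return (opcode << 26) | (rs << 21) | (rt << 16) | (immediate & 0xFFFF)


def branch_offset(branch_address, target_address):
    """Word offset measured from the instruction after the branch."""
    return (target_address - (branch_address + 4)) // 4


add_word = r_type(rs=19, rt=19, rd=9, shamt=0, funct=32)   # add $t1, $s3, $s3
offset = branch_offset(80016, 80028)                       # bne at 80016, Exit at 80028
bne_word = i_type(opcode=5, rs=8, rt=21, immediate=offset) # bne $t0, $s5, Exit
print(hex(add_word), offset, hex(bne_word))
```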
Far-Away Branching
We already discussed that the destination of
conditional branching is highly likely to be local
to the branch instruction itself.
Sometimes this is not the case. The assembler
typically deals with this by replacing the branch
with the opposite branch, which skips over an
inserted unconditional jump (with its large 26-bit
target address) to the original target.
For example, if L1 is too far away…
    beq $s0, $s1, L1
becomes
        bne $s0, $s1, L2
        j   L1
    L2:
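A minimal sketch of the rewrite an assembler might perform is shown below. It assumes a signed 16-bit word offset measured from the instruction after the branch. The function names and the default skip label L2 are illustrative, not taken from any real assembler.

```python
OPPOSITE = {"beq": "bne", "bne": "beq"}


def fits_in_16_bits(word_offset):
    """True if the offset fits in a signed 16-bit branch field."""
    return -(1 << 15) <= word_offset <= (1 << 15) - 1


def expand_branch(op, rs, rt, label, branch_address, target_address, skip_label="L2"):
    """Return the instruction(s) needed to reach the target."""
    word_offset = (target_address - (branch_address + 4)) // 4
    if fits_in_16_bits(word_offset):
        return [f"{op} {rs}, {rt}, {label}"]
    # Too far: invert the condition and hop over an unconditional jump.
    return [
        f"{OPPOSITE[op]} {rs}, {rt}, {skip_label}",
        f"j {label}",
        f"{skip_label}:",
    ]


near = expand_branch("beq", "$s0", "$s1", "L1", 80000, 80028)
far = expand_branch("beq", "$s0", "$s1", "L1", 80000, 240004)
print(near)
print(far)
```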
Summary of
MIPS Addressing Modes
1. Immediate addressing: op | rs | rt | Immediate
2. Register addressing: op | rs | rt | rd | ... | funct; Registers → Register
3. Base addressing: op | rs | rt | Address; Register + Address → Memory (Byte, Halfword, or Word)
4. PC-relative addressing: op | rs | rt | Address; PC + Address → Memory (Word)
5. Pseudodirect addressing: op | Address; PC : Address → Memory (Word)
Summary of MIPS Registers
and their Conventional Uses
Number  Name      Usage                        Preserved on function call?
0       $zero     always the constant zero     N/A
1       $at       reserved for assembler       yes
2-3     $v0-$v1   results and expression eval  no
4-7     $a0-$a3   arguments                    yes
8-15    $t0-$t7   temporaries                  no
16-23   $s0-$s7   saved                        yes
24-25   $t8-$t9   more temporaries             no
26-27   $k0-$k1   reserved for the OS          yes
28      $gp       global pointer               yes
29      $sp       stack pointer                yes
30      $fp       frame pointer                yes
31      $ra       return address               yes
The PowerPC Architecture and
Other Addressing Modes
We have seen that MIPS has five addressing
modes: register, base, immediate, PC-relative,
and pseudodirect.
There are a number of other addressing modes
that might be useful. For example, let’s consider
two that are found on the PowerPC architecture.
PowerPC is designed and made by IBM and
Motorola and is typically found in Apple
Macintosh computers. It is similar to
MIPS: PowerPC has 32 integer registers, all
instructions are 32 bits long, and data in memory
is manipulated using loads and stores.
Indexed Addressing
Consider an array of values in memory – remember
an array is simply a set of same-type variables next to
each other in memory.
In this case, we might use indexed addressing. One
register might contain the base address of the array
and another register would contain the offset (or
index). Using this approach, only one register needs
to change to iterate through the array.
Let’s look at MIPS code to do this and the
corresponding PowerPC instruction…
MIPS:
    add $t0, $a0, $s3
    lw  $t1, 0($t0)
PowerPC:
    lw  $t1, $a0+$s3
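The equivalence of the two sequences can be modelled in Python. Here memory is a plain dictionary keyed by byte address, and lw_indexed is a hypothetical helper standing in for the indexed load; neither is a real instruction model.

```python
def load_word(memory, address):
    """Read one word from the toy memory."""
    return memory[address]


def lw_indexed(memory, base, index):
    """Indexed addressing: the effective address is base + index."""
    return load_word(memory, base + index)


# A four-word array starting at byte address 1000.
memory = {1000 + 4 * i: 10 * i for i in range(4)}
a0 = 1000   # base address of the array
s3 = 8      # byte offset of element 2

# MIPS: compute the address explicitly, then load with a zero offset.
t0 = a0 + s3
t1_mips = load_word(memory, t0)

# Indexed addressing: one instruction does both steps.
t1_indexed = lw_indexed(memory, a0, s3)
print(t1_mips, t1_indexed)
```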
Update Addressing
Consider the array again.
Another common set of operations is to load a word from
memory and then increment the base register to point to the
next word.
Update addressing introduces a new version of the data transfer
instructions (load/store) that automatically increments the base
register to point to the next word whenever data is transferred.
Let’s look at MIPS code to do this and the corresponding
PowerPC instruction (remember, a word is 4 bytes in both
architectures)…
MIPS:
    lw   $t0, 4($s3)
    addi $s3, $s3, 4
PowerPC:
    lwu  $t0, 4($s3)
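A small Python model of the update load is given below. The helper lw_update and the dictionary-based register file are assumptions for this sketch. It follows the slide's description: load from base + offset, then leave the base register pointing at the word just loaded.

```python
def lw_update(memory, regs, dest, offset, base):
    """Load from regs[base] + offset, then write that address back to base."""
    address = regs[base] + offset
    regs[dest] = memory[address]
    regs[base] = address


memory = {1000: 7, 1004: 8, 1008: 9}
regs = {"$t0": 0, "$s3": 996}   # $s3 starts one word before the array

loaded = []
for _ in range(3):
    lw_update(memory, regs, "$t0", 4, "$s3")
    loaded.append(regs["$t0"])
print(loaded, regs["$s3"])
```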
Indexed and
Update Addressing
a. Indexed addressing: op | rs | rt | rd | ...; Register + Register → Memory (Word)
b. Update addressing: op | rs | rt | Address; Register + Address → Memory (Word)
PowerPC Instructions
Most PowerPC instructions are very similar to MIPS
instructions and PowerPC followed a lot of the same
design principles.
However, there are some instructions which are more
complex.
As an example, PowerPC introduces a special branch
instruction intended for loops where the control value
starts off at some value and is decremented until it
reaches zero.
Let’s look at MIPS code for such a loop and the
corresponding PowerPC instructions…
MIPS:
    Loop: …
          addi $t0,$t0,-1
          bne  $t0,$zero,Loop
PowerPC:
    Loop: …
          bc   Loop, ctr!=0
The PowerPC bc Instruction
This instruction uses the special ctr
register, a special register (in addition
to the normal 32) used for loop control.
You simply set this register to the number
of iterations you want and end the loop code
with this bc instruction. Easy!
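The saving can be made concrete by counting loop-control instructions in a toy Python model. It assumes, as the slide describes, that bc decrements ctr and branches while it is non-zero, and that the loop runs at least once. Both function names are illustrative.

```python
def mips_style_loop(iterations):
    """Count body runs and control instructions for the addi/bne loop."""
    t0 = iterations
    body_runs = 0
    control_instructions = 0
    while True:
        body_runs += 1
        t0 -= 1
        control_instructions += 1   # addi $t0,$t0,-1
        control_instructions += 1   # bne  $t0,$zero,Loop
        if t0 == 0:
            break
    return body_runs, control_instructions


def ctr_style_loop(iterations):
    """Count body runs and control instructions for the single bc loop."""
    ctr = iterations
    body_runs = 0
    control_instructions = 0
    while True:
        body_runs += 1
        ctr -= 1                    # decrement is part of bc
        control_instructions += 1   # bc Loop, ctr!=0
        if ctr == 0:
            break
    return body_runs, control_instructions


print(mips_style_loop(5), ctr_style_loop(5))
```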
The Intel IA-32/x86
Architecture
The Intel 8086 started life as a 16-bit
microprocessor. With the introduction of the
80386, Intel moved the x86 line (called IA-32
for the past couple of years) to a 32-bit
design.
IA-32 has a huge number of instructions
(over 100), and each new generation adds new
instructions.
Let’s look at some aspects of IA-32, including
the register set, addressing modes, integer
operations, and instruction encoding.
Intel IA-32’s Register Set
Name    Use
EAX     GPR 0
ECX     GPR 1
EDX     GPR 2
EBX     GPR 3
ESP     GPR 4
EBP     GPR 5
ESI     GPR 6
EDI     GPR 7
CS      Code segment pointer
SS      Stack segment pointer (top of stack)
DS      Data segment pointer 0
ES      Data segment pointer 1
FS      Data segment pointer 2
GS      Data segment pointer 3
EIP     Instruction pointer (PC)
EFLAGS  Condition codes
The first observation to make about IA-32
is that there are only eight general
purpose registers (GPRs), in sharp contrast
to MIPS’s 32 GPRs.
IA-32 Addressing
IA-32 uses two operands for its arithmetic, logical,
and data transfer instructions. This means one
operand must act as a source and a destination.
Unlike in MIPS, the operands do not both need to be
registers: IA-32 allows instructions to operate
directly on data stored in memory (one of the two
operands can be a memory location).
When memory locations are referenced, accesses
do not need to be 32 bits wide (you do not always
need to read/write a full 32-bit value). Most
instructions can operate on a byte or on a 16-bit
or 32-bit value.
By the way, in IA-32 a word is 16 bits and a 32-bit
value is considered a doubleword (or DWORD).
IA-32 Integer Instructions
IA-32 integer operations fall into four major
categories…
• Data transfer instructions: move (from one place
  to another; memory or register on either end),
  push, and pop
• Arithmetic/logic instructions: test operands
  (condition evaluation), integer and decimal
  operations
• Control flow instructions: conditional branches,
  unconditional jumps, calls, returns
• String instructions: string moves and
  string comparisons
Conditional Branching
Conditional branching is handled in a similar
manner on PowerPC and IA-32 – this is based
on the concept of condition codes or flags.
Condition codes are a side effect of an
operation; most often they are used to
compare a value to zero and are checked by
branch instructions.
• Most arithmetic and logic operations set condition
  codes. This is good because you do not need to
  explicitly compute such a flag, and since the flags
  are computed as part of the operation, it is faster.
  This is bad because every operation is more
  expensive, as these codes are computed whether
  they are needed or not.
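The idea of flags as a side effect can be sketched in Python. The flag names (zero, negative) and both helper functions are assumptions chosen for illustration. Real PowerPC and IA-32 flag sets are larger.

```python
def sub_with_flags(a, b):
    """Subtract two 32-bit values and report flags as a side effect."""
    result = (a - b) & 0xFFFFFFFF
    flags = {
        "zero": result == 0,
        "negative": bool(result & 0x80000000),
    }
    return result, flags


def branch_if_equal(flags, target, fall_through):
    """Pick the next address by checking only the flags, not the operands."""
    return target if flags["zero"] else fall_through


_, flags_equal = sub_with_flags(7, 7)
_, flags_less = sub_with_flags(3, 5)
print(branch_if_equal(flags_equal, 2000, 1004))
print(branch_if_equal(flags_less, 2000, 1004))
```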
IA-32 Instruction Encoding
Not all IA-32 instructions are the same size.
In fact, they can vary from 1 byte to 17
bytes!
The opcode usually specifies what addressing
mode is being used. Alternatively, the opcode can
indicate that a postbyte follows, which specifies
the addressing mode (this is like a second opcode
field). Sometimes there are two postbytes.
This can be very confusing, and there are so
many addressing modes that it is very difficult
to keep them straight.
IA-32 Example Instructions
Instruction            Function
JE name                If equal (CC) {EIP = name}; EIP – 128 ≤ name < EIP + 128
JMP name               {EIP = name};
CALL name              SP = SP – 4; M[SP] = EIP + 5; EIP = name;
MOVW EBX,[EDI + 45]    EBX = M[EDI + 45]
PUSH ESI               SP = SP – 4; M[SP] = ESI
POP EDI                EDI = M[SP]; SP = SP + 4
ADD EAX,#6765          EAX = EAX + 6765
TEST EDX,#42           Set condition codes (flags) with EDX & 42
MOVSL                  M[EDI] = M[ESI]; EDI = EDI + 4; ESI = ESI + 4
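The PUSH and POP descriptions can be traced with a toy Python machine. The Machine class is a hypothetical model written for this example. It uses the register name SP as the table does and stores one 32-bit value per address key.

```python
class Machine:
    """Tiny model of a stack pointer, two registers and memory."""

    def __init__(self, stack_pointer):
        self.regs = {"SP": stack_pointer, "ESI": 0, "EDI": 0}
        self.memory = {}

    def push(self, reg):
        # SP = SP - 4; M[SP] = reg
        self.regs["SP"] -= 4
        self.memory[self.regs["SP"]] = self.regs[reg]

    def pop(self, reg):
        # reg = M[SP]; SP = SP + 4
        self.regs[reg] = self.memory[self.regs["SP"]]
        self.regs["SP"] += 4


machine = Machine(stack_pointer=0x1000)
machine.regs["ESI"] = 6765
machine.push("ESI")
sp_after_push = machine.regs["SP"]
machine.pop("EDI")
print(hex(sp_after_push), machine.regs["EDI"], hex(machine.regs["SP"]))
```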
IA-32 Instruction Formats
(just a few…)
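a. JE EIP + displacement: JE (4 bits) | Condition (4 bits) | Displacement (8 bits)
b. CALL: CALL (8 bits) | Offset (32 bits)
c. MOV EBX, [EDI + 45]: MOV (6 bits) | d (1 bit) | w (1 bit) | r-m postbyte (8 bits) | Displacement (8 bits)
d. PUSH ESI: PUSH (5 bits) | Reg (3 bits)
e. ADD EAX, #6765: ADD (4 bits) | Reg (3 bits) | w (1 bit) | Immediate (32 bits)
f. TEST EDX, #42: TEST (7 bits) | w (1 bit) | Postbyte (8 bits) | Immediate (32 bits)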
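Adding up the field widths shows how much these formats vary in length. The sketch below copies the widths from the formats above; the FORMATS table and the length_in_bytes helper are names invented for illustration.

```python
# Field widths in bits for each example format.
FORMATS = {
    "JE":   [("JE", 4), ("Condition", 4), ("Displacement", 8)],
    "CALL": [("CALL", 8), ("Offset", 32)],
    "MOV":  [("MOV", 6), ("d", 1), ("w", 1), ("r-m postbyte", 8), ("Displacement", 8)],
    "PUSH": [("PUSH", 5), ("Reg", 3)],
    "ADD":  [("ADD", 4), ("Reg", 3), ("w", 1), ("Immediate", 32)],
    "TEST": [("TEST", 7), ("w", 1), ("Postbyte", 8), ("Immediate", 32)],
}


def length_in_bytes(fields):
    """Total instruction length, checking that it is a whole number of bytes."""
    bits = sum(width for _, width in fields)
    if bits % 8 != 0:
        raise ValueError("fields do not add up to whole bytes")
    return bits // 8


lengths = {name: length_in_bytes(fields) for name, fields in FORMATS.items()}
print(lengths)
```

The 5-byte CALL length agrees with the return address EIP + 5 that CALL saves.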
RISC and CISC Architectures
We have seen MIPS in great detail, and
PowerPC and Intel IA-32 in passing.
There appears to be a fundamental
difference between MIPS and IA-32.
This difference extends beyond just
these two architectures, so let’s look
at RISC and CISC criteria.
MIPS – A Classic RISC Design
MIPS is very simple.
• Data must be in registers before being operated on
  (this is known as a load/store architecture).
• There is a small number of instructions.
• Each instruction has a similar format.
• Each instruction does one basic thing; complex
  tasks are done by combining a (large) number of
  instructions.
MIPS is classically known as a reduced
instruction set computer (RISC) architecture.
IA-32 – A Classic CISC Design
IA-32 is very complex.
• Data does not need to be in registers
  before being operated on (direct memory
  operations are possible).
• There is a large number of instructions.
• Instructions have very different formats.
• Each instruction can do a complex task.
IA-32 is classically known as a complex
instruction set computer (CISC)
architecture.
RISC vs. CISC
Ten years ago, each instruction set
architecture was either CISC or RISC and the
difference was clear.
This is no longer the case: each type has
taken the best parts of the other and
integrated them into its own design…
• PowerPC, the major RISC player right now, has
  some complex instructions and uses CISC-like
  technology such as condition codes.
• IA-32, the major CISC player right now, has a
  microprogrammed core where the complex
  instructions are broken down into simple
  microinstructions and executed in that manner.
The End of RISC vs. CISC
To a large extent, this RISC vs. CISC war is
over, with each side admitting that the other
had some good points and adopting those
points themselves.
Modern and future architectures combine these
principles even further (and introduce new
designs altogether). This includes such ISAs
as Intel’s IA-64, an architecture designed
jointly by Intel (a major CISC organization)
and HP (a major RISC organization).
Summary: Design Principles
Regardless of which design is personally
favored, four design principles have
emerged that are (almost) universally
accepted…
• Simplicity favors regularity.
• Smaller is faster.
• Good design demands good compromises.
• Make the common case fast.