TI1400 Computer Organization at TU Delft


Instructions and Addressing (cont’d.)
Index addressing (1)
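[Figure: the instruction consists of an opcode, a register field (Reg), and an index. The index is added (+) to the contents of the register to form the address of the operand, which resides in memory or in the registers.]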
Index addressing (2)
• Advantages:
- Allows specification of fixed offset to operand address
• Disadvantages:
- Extra addition to operand address
• Notation: ADD X(R1),R3 (X=number)
• Meaning: [R3] ← [R3] + M([R1] + X)
Example index addressing
Program with index addressing:
        Move      #E,R0
        Move      N,R1
        Clear     R2
L       Add       8(R0),R2
        Add       #16,R0
        Decrement R1
        Branch>0  L
        Move      N,R1
        Div       R1,R2
        Move      R2,Sum
Memory layout (offsets are relative to R0, which initially holds E):
N       n
E       Employee ID     0(R0)
        sex             4(R0)
        age             8(R0)
        salary          12(R0)
        Employee ID     16(R0)
        sex             20(R0)
        age
        salary
Q1: What does this program do?
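One way to check an answer to Q1 is to trace the program in a small simulation. The sketch below is not part of the slides: it assumes 4-byte fields, 16-byte records, integer division for Div, and made-up employee data. The function names `average_age` and `build_memory` are hypothetical.

```python
def build_memory(base, records):
    """Lay out (employee_id, sex, age, salary) records, 4 bytes per field."""
    memory = {}
    for i, (emp_id, sex, age, salary) in enumerate(records):
        addr = base + 16 * i
        memory[addr] = emp_id
        memory[addr + 4] = sex
        memory[addr + 8] = age
        memory[addr + 12] = salary
    return memory


def average_age(memory, e_addr, n):
    """Statement-by-statement trace of the slide's program."""
    r0 = e_addr                  # Move #E,R0
    r1 = n                       # Move N,R1
    r2 = 0                       # Clear R2
    while True:
        r2 += memory[r0 + 8]     # L: Add 8(R0),R2  (the age field)
        r0 += 16                 # Add #16,R0       (advance to next record)
        r1 -= 1                  # Decrement R1
        if not r1 > 0:           # Branch>0 L
            break
    r1 = n                       # Move N,R1
    r2 = r2 // r1                # Div R1,R2 (integer division is an assumption)
    return r2                    # Move R2,Sum


# Made-up data for illustration only.
records = [(1, 0, 30, 2000), (2, 1, 40, 2500), (3, 0, 50, 3000)]
mem = build_memory(1000, records)
print(average_age(mem, 1000, len(records)))  # prints 40
```

The fixed offset 8 selects the same field in every record, while R0 steps from record to record; this is the use of index addressing named under "Advantages".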
Additional modes
• Some computers have auto-increment and auto-decrement addressing modes
• Example: (R0)+
• Meaning: .. M(R0) ..; [R0] ← [R0]+1
• Example: -(R0)
• Meaning: [R0] ← [R0]-1; .. M(R0) ..
Additional Instructions
• Logic instructions
- Not R0; invert all bits in R0
- And #$FF000000,R0; AND with bit string
• Shift and rotate instructions
- Many variants for different purposes
Logical shifts
Logical shift left (used in bit packing): C ← R0 ← 0
                            C    R0
before:                     0    0 1 1 1 0 . . . 0 1 1
virtual (LShiftL #1,R0):    0    1 1 1 0 . . . 0 1 1 0
after (LShiftL #2,R0):      1    1 1 0 . . . 0 1 1 0 0
Logical shift right: 0 → R0 → C
                            R0
before:                     0 1 1 1 0 . . . 0 1 1
virtual (LShiftR #1,R0):    0 0 1 1 1 0 . . . 0 1
after (LShiftR #2,R0):      0 0 0 1 1 1 0 . . . 0
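The logical shifts can be sketched in a few lines of code. This is an illustration under assumptions, not the course's definition: the register is taken to be 8 bits wide, the shift count is at least 1, and C receives the bit shifted out last. The names `lshift_l` and `lshift_r` are hypothetical.

```python
WIDTH = 8                      # assumed register width
MASK = (1 << WIDTH) - 1


def lshift_l(value, count):
    """Logical shift left: zeros enter on the right, the MSB goes to C."""
    carry = 0
    for _ in range(count):
        carry = (value >> (WIDTH - 1)) & 1
        value = (value << 1) & MASK
    return value, carry


def lshift_r(value, count):
    """Logical shift right: zeros enter on the left, the LSB goes to C."""
    carry = 0
    for _ in range(count):
        carry = value & 1
        value >>= 1
    return value, carry


value, carry = lshift_l(0b01110011, 2)
print(format(value, "08b"), carry)  # prints 11001100 1
```

Shifting one bit at a time mirrors the "virtual" intermediate state shown on the slide.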
Arithmetic shifts
Arithmetic shift right (signed shift): the sign bit is replicated; R0 → C
                            R0                          C
before:                     1 0 0 1 1 . . . 0 1 0       0
virtual (AShiftR #1,R0):    1 1 0 0 1 1 . . . 0 1       0
after (AShiftR #2,R0):      1 1 1 0 0 1 1 . . . 0       1
Tough questions
Q1: Is AShiftR by n bits equivalent to division by 2^n for numbers in 2C or in 1C?
Q2: Does shifting a negative number round towards 0 or towards -infinity?
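The tough questions can be explored experimentally. The sketch below covers only the two's complement (2C) case and assumes an 8-bit register; `ashift_r` and `to_signed` are hypothetical helper names. It compares the shifted result with Python's floor division, which rounds towards -infinity.

```python
WIDTH = 8                      # assumed register width
SIGN_BIT = 1 << (WIDTH - 1)


def to_signed(bits):
    """Interpret an 8-bit pattern as a two's complement number."""
    return bits - (1 << WIDTH) if bits & SIGN_BIT else bits


def ashift_r(bits, count):
    """Arithmetic shift right: the sign bit is kept, the LSB goes to C."""
    sign = bits & SIGN_BIT
    carry = 0
    for _ in range(count):
        carry = bits & 1
        bits = (bits >> 1) | sign
    return bits, carry


pattern = 0b11111001                     # -7 in 8-bit two's complement
shifted, carry = ashift_r(pattern, 1)
print(to_signed(shifted), carry)         # prints -4 1
print(-7 // 2)                           # prints -4 (floor division)
```

For this sample, -7 shifted right by one bit gives -4, whereas rounding towards 0 would give -3.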
Rotate
Rotate left w/o Carry
                            C    R0
before:                     0    0 1 1 1 0 . . . 0 1 1
virtual (RotateL #1,R0):    0    1 1 1 0 . . . 0 1 1 0
after (RotateL #2,R0):      1    1 1 0 . . . 0 1 1 0 1
Rotate left w/ Carry
                            C    R0
before:                     0    0 1 1 1 0 . . . 0 1 1
virtual (RotateLC #1,R0):   0    1 1 1 0 . . . 0 1 1 0
after (RotateLC #2,R0):     1    1 1 0 . . . 0 1 1 0 0
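The difference between the two rotates is where the incoming bit comes from. A minimal sketch, assuming an 8-bit register and the hypothetical names `rotate_l` and `rotate_lc`:

```python
WIDTH = 8                      # assumed register width
MASK = (1 << WIDTH) - 1


def rotate_l(value, count, carry=0):
    """Rotate left without carry: the MSB re-enters at the right and is copied to C."""
    for _ in range(count):
        msb = (value >> (WIDTH - 1)) & 1
        value = ((value << 1) & MASK) | msb
        carry = msb
    return value, carry


def rotate_lc(value, count, carry=0):
    """Rotate left with carry: the old C enters at the right, the MSB goes to C."""
    for _ in range(count):
        msb = (value >> (WIDTH - 1)) & 1
        value = ((value << 1) & MASK) | carry
        carry = msb
    return value, carry


print(rotate_l(0b01110011, 2))   # prints (205, 1), i.e. 11001101 with C = 1
print(rotate_lc(0b01110011, 2))  # prints (204, 1), i.e. 11001100 with C = 1
```

With carry, the register and C together behave like one 9-bit ring, so the 1 that leaves the register waits in C instead of re-entering immediately.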
Assemblers
http://www.pds.ewi.tudelft.nl/~iosup/Courses/2011_ti1400_5.ppt
History of Computing
(1642-2011)
Done so far…
[Figure: course roadmap. Lecture labels: Lectures 3,4; Lecture 2; Lecture 1; Lecture 0. Layers: Computers; Programmable Devices; Circuit Design; Why Computer Organization Matters? Topics: data representation, conversion, and operations; instruction representation and use; memory organization; program sequencing; von Neumann architecture; instruction levels; digital logic; memory elements; other building blocks (multiplexer, decoder); finite state machines.]
Problem: How to Program Computers?
Program Creation and Execution Flow
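[Figure: four stages. (1) Type source program: the editor produces the source in ASCII. (2) Translate: the assembler produces object code and a listing (source and object code + error messages). (3) Link/load: the linker/loader produces a memory image. (4) Run: the program performs input/output. The figure also marks machine 1 and machine 2.]
TI1400/11-PDS
TU-Delft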
Three levels of instructions
• High-level programming language (C/C++, Java, …)
- program expressed in a high-level language
- translation
• Instruction set (Assembler)
- program expressed as a series of instructions
- direct implementation
• Fetch/execute implementation
- program execution in hardware
Instructions and Addressing
1. Introduction
2. Assembler: What and Why?
3. Assembler Statements and Structure
4. The Stack
5. Subroutines
6. Architectures: CISC and RISC
Why assembler? [1/2]
• Assembler is a symbolic notation for machine
language
• It improves readability (vs machine code):
- Assembler: Move R0,SUM
- Machine code: 0010 1101 1001 0001 (16 bits)
Why assembler? [2/2]
Lecture 0
• Speed of programs in critical applications
• Access to all hardware resources of the machine
• Target for compilers
Source: http://www.cs.berkeley.edu/~volkov/cs267.sp09/hw1/results/
Q: Where to get ISA references?
• Manufacturer’s documentation
• Third-party manuals (ATTN: may be incorrect)
Q: Does each processor have its own
machine language (instruction set)?
• Shared across generations and even competitors
References shown on the slide:
- developer.download.nvidia.com/compute/cuda/3_1/toolkit/docs/ptx_isa_2.1.pdf (NVIDIA)
- www.eng.ucy.ac.cy/theocharides/Courses/ECE656/ia-32.pdf
[Figure: timeline with the years 1982, 1985, 1989, …, 1993, 1995 and the vendors Intel, AMD, and Cyrix.]
Q: Are similar instructions identical on
different platforms?
• Often, they are not
[Figure: NVIDIA, Intel, AMD, Cyrix.]
Machine Language [1/4]
• Is Machine language difficult to learn?
- That holds for every unknown language. Machine
language is more difficult because you have to work with
the specifically defined micro instruction set.
• Is Machine language difficult to read and
to understand?
- Of course, if you do not know the language;
however, assembler is more difficult to read and
understand than a High Level Language (HLL).
Machine Language [2/4]
• Is Machine language difficult to write?
- Often HLL languages use libraries to make programming
simpler. Machine language programmers often start from
scratch. However, full performance may require machine
language implementation (or a smart/expensive
compiler)
• Machine language programming is time consuming
- Coding is estimated to take only 30% of the total development time.
Machine Language [3/4]
• Compilers make machine language superfluous
- A good machine language program often looks very
different from a compiler generated program. Generally,
a C program will win over a hand-made assembly
program (unless you’re Michael Abrash …
or a student at TU Delft)
- Assembler still heavily used for hot/optimized functions
(esp. scientific codes), real-time platforms, embedded
systems, …
Machine Language [4/4]
• Is Machine language difficult to maintain?
- Whether programs are maintainable depends less on the
language they are written in than on the way
they are constructed
• Is Machine language difficult to debug?
- Often debuggers output both the HLL and the machine
language, and the error can only be found in the
generated machine language
Case-in-Point
• Universele Brander Automaat (UBA)
Case
Universele Brander Automaat
Customer: Nefit Fasto B.V.
Market: HVAC (AirCo)
Development (1990) and production (100k/year) of the UBA universal burner controller for Nefit Fasto, equipped with a bipolar Application-Specific Integrated Circuit (ASIC).
First product of a universal nature to have a fail-safe approval.
Case
Universele Brander Automaat
Ignition
230V, Pump and Fan
6 switch inputs
8 analogue inputs
3 switch outputs
3 modulating outputs
2-wire communication bus
External KIM module connection with 178 bytes of config settings
ASIC and micro-Computer
UBA software structure
Application: C language, 15 Kbyte
HWIO: 1 Kbyte
UBA
micro computer
HWIO
1 Kbytes
MC68HC05B16
24 I/O bi-directional
8 A/D analogue inputs
2 TCAP input timers
2 TCMP output compare
2 PWM D/A outputs
1 SCI serial output
1 COP watchdog
256 bytes RAM
256 bytes EEPROM
16 Kbytes (EP)ROM
UBA PuR
Special routine after Power-up Reset
[ see also The Zen of Diagnostics,
http://www.ganssle.com/articles/adiags1.htm ]
- entire instruction set exercised in the test routine
- 16-bit CRC (99.98% data integrity)
- Walking A0 and 05 RAM test (pattern sensitivity)
- Check on A/D (converter linearity)
- Main loop partitioned in modules
- Module check in each phase
- Acknowledge module check by pulse to ASIC (350ms)
- Interrupt program termination check by pulse to ASIC (20ms)
UBA Assembly
• Check instruction set
- Test of each opcode over and over again
- Emergency stop at fault detection
- Not possible in “C”
• Check memory
- As part of the program
- Emergency stop at fault detection
- Difficult in “C”
• Better control on application
- Compiler generated code must be
checked for correctness.
Assembler Statements
• Declarations
- no code generation
- memory reservation
- symbolic data declarations
- where to start the code execution
• Executable statements
- are translated to real machine instructions (often, one-to-one)
Data declarations
Label   Operation   Operand
S       EQU         200
        ORIGIN      201
N       DATA        300
N1      RESERVE     300
        ORIGIN      100
Program
Addr    Operation   Operand
START   Move        N,R1
        Move        #N1,R2
        Clear       R0
LOOP    Add         (R2),R0
        Incr        R2
        Decr        R1
        Branch>0    LOOP
        Move        R0,S
        Return
        End         START
Memory lay-out
Address   Contents      Label
100       Move N,R1
101       .....
102       .....
103       .....
104       .....
105       .....
106       Branch >0
107
200                     S
201       300           N
202       .....         N1
203       .....
          .....
          .....
501                     Nn
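The way the declarations determine the memory lay-out can be sketched with a location counter. This is an illustrative model only: it assumes every DATA item and every executable statement occupies exactly one address, and the function name `layout` is hypothetical.

```python
def layout(statements):
    """Assign addresses to (label, operation, operand) statements.

    Returns the symbol table and the memory contents that were placed.
    """
    symbols, memory, loc = {}, {}, 0
    for label, op, operand in statements:
        if op == "EQU":                 # name a value, no memory used
            symbols[label] = operand
        elif op == "ORIGIN":            # move the location counter
            loc = operand
        elif op == "End":               # end of source; operand is the start label
            continue
        elif op == "RESERVE":           # reserve a block, contents undefined
            symbols[label] = loc
            loc += operand
        else:                           # DATA or an executable statement
            if label:
                symbols[label] = loc
            memory[loc] = operand if op == "DATA" else (op, operand)
            loc += 1                    # one address per item (assumption)
    return symbols, memory


source = [
    ("S", "EQU", 200), (None, "ORIGIN", 201),
    ("N", "DATA", 300), ("N1", "RESERVE", 300),
    (None, "ORIGIN", 100),
    ("START", "Move", "N,R1"), (None, "Move", "#N1,R2"),
    (None, "Clear", "R0"), ("LOOP", "Add", "(R2),R0"),
    (None, "Incr", "R2"), (None, "Decr", "R1"),
    (None, "Branch>0", "LOOP"), (None, "Move", "R0,S"),
    (None, "Return", ""), (None, "End", "START"),
]
symbols, memory = layout(source)
print(symbols)
# prints {'S': 200, 'N': 201, 'N1': 202, 'START': 100, 'LOOP': 103}
```

Under these assumptions the branch lands at address 106 and the reserved block spans 202 to 501, which agrees with the lay-out shown on the slide.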
Structure assembler [1/3]
• Assembler is hardly more than substitution
- substitute 0001 for Move
- substitute 0000 0000 0000 0101 for #5
• Assembler is one level above machine language
• Assembler languages for different architectures
are alike, but not identical
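The "hardly more than substitution" view can be sketched as two table lookups. Only the two substitutions named on the slide are used; the 16-bit width of the immediate field follows the slide's example, and the names `OPCODES`, `encode_opcode`, and `encode_immediate` are hypothetical.

```python
OPCODES = {"Move": "0001"}      # the only opcode encoding given on the slide


def group4(bits):
    """Insert a space after every 4 bits for readability."""
    return " ".join(bits[i:i + 4] for i in range(0, len(bits), 4))


def encode_opcode(mnemonic):
    """Substitute the bit pattern for a mnemonic."""
    return OPCODES[mnemonic]


def encode_immediate(token, width=16):
    """Substitute a fixed-width binary field for an immediate such as '#5'."""
    value = int(token.lstrip("#"))
    return group4(format(value, "0{}b".format(width)))


print(encode_opcode("Move"))      # prints 0001
print(encode_immediate("#5"))     # prints 0000 0000 0000 0101
```

A real assembler also has to resolve labels and relocatable addresses, which is why it is more than a pure substitution table.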
Structure assembler [2/3]
Assembler programs contain three kinds of
quantities:
• Absolute:
- opcodes, constants: can be directly translated
• Relative:
- addresses of instructions, which depend on the final
memory location
• External:
- calls to subroutines
Structure assembler [3/3]
• Literals: constants in programs
• Some assemblers act as if literals are immediate
operands
• Example:
        Load    #1
  is equivalent to:
        Load    One
        ...
  One:  1
Number notation
• Numbers can be represented using various
formats:
        ADD     #93,R1           (decimal)
  or
        ADD     #%01011101,R1    (binary)
  or
        ADD     #$5D,R1          (hexadecimal)
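That the three notations denote the same value can be checked with a small parser. This is a sketch; the function name `parse_immediate` is hypothetical, and only the three prefixes shown on the slide are handled.

```python
def parse_immediate(token):
    """Convert '#93', '#%01011101', or '#$5D' to an integer."""
    if not token.startswith("#"):
        raise ValueError("immediate operands start with '#'")
    body = token[1:]
    if body.startswith("%"):        # binary
        return int(body[1:], 2)
    if body.startswith("$"):        # hexadecimal
        return int(body[1:], 16)
    return int(body, 10)            # decimal


for text in ("#93", "#%01011101", "#$5D"):
    print(text, parse_immediate(text))   # each line ends in 93
```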
The Stack
Main idea
- (Large?) Memory space used to store program data
- Items are added to the stack through a PUSH operation
- Items are removed from the stack through a POP
operation
Details
- Often, a stack is a contiguous array of memory locations
- Often, any number of stacks can be set up by a program
- Often, only one stack can be used at a time
(changing the active stack possible at any time)
Q1: Why use stacks?
Q2: Implications?
Stack registers
[Figure: the CPU contains the PC and the SP (Stack Pointer); the SP points into Main Memory.]
Stack operations
[Figure: memory drawn from address 0, 1, … downwards; SP marks the top of the stack. Stack contents, top first: 70, 300, 20, 10, 60.]
Push
        Subtract #4,SP
        Move     R0,(SP)
or:
        Move     R0,-(SP)
[Figure: R0 holds 80. After the push, SP marks the new top; stack contents, top first: 80, 70, 300, 20, 10, 60.]
Pop
        Move     (SP),R0
        Add      #4,SP
or:
        Move     (SP)+,R0
[Figure: memory contents 80, 70, 300, 20, 10, 60 with SP marking the top of the stack; R0 is shown holding 70.]
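The push and pop sequences can be modelled directly. The sketch below assumes 4-byte stack items and a stack that grows towards lower addresses, as the Subtract/Add #4 instructions suggest; the class name `Machine` and the start address are hypothetical.

```python
class Machine:
    """Memory plus a stack pointer; the stack grows towards address 0."""

    def __init__(self, sp):
        self.memory = {}
        self.sp = sp

    def push(self, value):
        self.sp -= 4                      # Subtract #4,SP
        self.memory[self.sp] = value      # Move R0,(SP)

    def pop(self):
        value = self.memory[self.sp]      # Move (SP),R0
        self.sp += 4                      # Add #4,SP
        return value


m = Machine(sp=100)
for item in (60, 10, 20, 300, 70):        # build the stack from the slide
    m.push(item)
m.push(80)
print(m.pop(), m.pop(), m.sp)             # prints 80 70 88
```

The two-instruction forms and the auto-decrement and auto-increment forms have the same effect; the second simply folds the pointer update into the Move.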
Subroutines
• More structure in programs
• Mimics procedure and function calls in
High Level programming Languages (HLL)
Calling mechanism
Address   Instruction
200       Call SUB
204       next instr.
1000      ................   (first instruction of SUB)
          ................
          RTS
Call SUB: the return address 204 is copied from PC to Link.
RTS: 204 is copied from Link back to PC.
Question
Is a Link register sufficient?
Subroutine nesting
• To nest subroutines, the return address in the link
register must be saved
• Can be implemented by using stacks
Subroutine stack
subroutine:
        Move Link,-(SP)
        ......
        Move (SP)+,Link
        RTS
[Figure: PC, Link (holding 204), and the Stack.]
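Why a single link register is not enough for nested calls can be traced by hand or in code. The sketch below is a simplified model: the call at address 200 with return address 204 comes from the slides, while the inner call's return address 1008 is a made-up value, and the function name `return_target` is hypothetical.

```python
def return_target(save_link):
    """Trace main -> SUB1 -> SUB2 and report where SUB1's RTS ends up."""
    stack = []

    link = 204                    # main: Call SUB1 at 200, next instr. at 204
    if save_link:
        stack.append(link)        # SUB1 entry: Move Link,-(SP)

    link = 1008                   # SUB1: Call SUB2 (1008 is an assumed address)
    pc = link                     # SUB2: RTS, execution resumes inside SUB1

    if save_link:
        link = stack.pop()        # SUB1 exit: Move (SP)+,Link
    pc = link                     # SUB1: RTS
    return pc


print(return_target(save_link=False))  # prints 1008: SUB1 never gets back to main
print(return_target(save_link=True))   # prints 204
```

Without the save, the inner call overwrites Link and the outer return address is lost; saving Link on the stack restores it in last-in, first-out order, which matches the nesting order of the calls.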
Parameter passing (1)
• Through registers
- fast
- limited number of parameters
- caller and callee must know where parameters are
placed
• Example:
        Move A,R0
        Call Sub

Sub:    Move R0,C
        ......
        RTS
Parameter passing (2)
• Through memory
- very flexible
- slower than through registers
• Often implemented through Stack Pointer
• Parameters are pushed on stack before calling
subroutine
• Results are popped from stack after return
• Subroutine needs registers
Parameter passing (3)
Calling subroutine:
        Move #List,-(SP)     push the address of the list
        Move N,-(SP)         push n
        Call LISTADD
        Move 4(SP),SUM       fetch the result
        Add  #8,SP           remove both parameters
Stack contents after each step (top of stack, marked by SP, first):
        before the first Move:       (nothing pushed yet)
        after Move #List,-(SP):      LIST
        after Move N,-(SP):          n, LIST
        after Call LISTADD:          Return, n, LIST
        after the return:            n, sum
        after Add #8,SP:             (parameters removed)
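The calling sequence can be traced in a small model. This sketch makes several assumptions that are not in the slides: stack items are 4 bytes, the call pushes the return address on the stack, list elements sit at consecutive addresses, and the subroutine body is written with Python locals instead of saved registers. The names `Machine`, `listadd`, and `caller` are hypothetical.

```python
class Machine:
    def __init__(self, sp=1000):
        self.mem = {}
        self.sp = sp

    def push(self, value):
        self.sp -= 4
        self.mem[self.sp] = value

    def pop(self):
        value = self.mem[self.sp]
        self.sp += 4
        return value


def listadd(m):
    """On entry: 0(SP) = Return, 4(SP) = n, 8(SP) = LIST."""
    n = m.mem[m.sp + 4]
    addr = m.mem[m.sp + 8]
    total = 0
    for _ in range(n):
        total += m.mem[addr]
        addr += 1                     # consecutive addresses (assumption)
    m.mem[m.sp + 8] = total           # the LIST slot now carries the sum


def caller(m, list_addr, n):
    m.push(list_addr)                 # Move #List,-(SP)
    m.push(n)                         # Move N,-(SP)
    m.push("return address")          # Call LISTADD
    listadd(m)
    m.pop()                           # Return
    total = m.mem[m.sp + 4]           # Move 4(SP),SUM
    m.sp += 8                         # Add #8,SP
    return total


m = Machine()
for offset, value in enumerate([5, 7, 11]):
    m.mem[2000 + offset] = value      # made-up list contents
print(caller(m, 2000, 3), m.sp)       # prints 23 1000
```

The stack pointer ends where it started, which is the point of the final Add #8,SP: the caller that pushed the parameters is also the one that removes them.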
Parameter passing (4)
Subroutine:
LISTADD Move R0,-(SP)
        ...
        Move 16(SP),R1
        Move 20(SP),R2
        Clear R0
LOOP    Add (R2),R0
        Incr R2
        Decr R1
        Branch>0 LOOP
        Move R0,20(SP)
        Move (SP)+,R2
        .....
        Return
Stack frame while the loop runs (SP marks the top):
        0(SP)    [R2]
        4(SP)    [R1]
        8(SP)    [R0]
        12(SP)   Return
        16(SP)   n
        20(SP)   LIST, overwritten with sum by Move R0,20(SP)
The saved registers are pushed one by one on entry, starting with [R0], and popped in reverse order before the return, which leaves Return, n, sum on the stack.
Frame Pointer
Stack frame for called subroutine (top of stack first):
        SP (stack pointer) →    saved [R1]
                                saved [R0]
                                localvar3
                                localvar2
                                localvar1
        FP (frame pointer) →    saved [FP]
                                Return address
                                param1
                                param2
                                param3
                                param4
                                Old ToS (top-of-stack)
Local variables of the subroutine are accessed through index addressing on FP.
Re-entrancy
• Subroutines can be called more than once
- Recursion: subroutine calls itself
- Sub A calls Sub B, which in turn calls Sub A
- Multiple callers “at the same time”
• Special measures for re-entrancy
- No change of instructions
- Each caller must have its own copy of data
- Use stack(s)
CISC characteristics
[Figure: Memory and CPU.]
• Complex Instruction Set
• Traditional architectures
• Powerful instructions
- Complex operations
- Many instructions
• Memory to memory operations
• Programs often use stacks
• Examples 68xxx and 80xxx architectures
• The Pentium architecture
RISC characteristics
[Figure: Memory and CPU.]
• Reduced Instruction Set
• Small number of instructions
• Load/Store from memory
• Operations between registers
• Large register sets
• Example PowerPC architecture
Pro CISC
See also: http://arstechnica.com/cpu/4q99/risc-cisc/rvc-5.html
• Easier to program
• Reduced code size
• Complexity in hardware not in software
- HLL support in hardware
• (Politics) Legacy
- CISCs are in all our PCs and servers
Con CISC
• Instruction encoding complex
• Variable number of cycles to load instruction
- IA-32 instructions can be 1 to 17 bytes long
• Many instructions too specific, thus not used
• May be slow
- Stacks are in main memory, registers are near processor
• May consume more energy
- Not in embedded systems, portable devices, …
Frequency of Instruction Use
[Figure: frequency of use (log scale) plotted against instruction rank.]
Source: http://www.eng.ucy.ac.cy/theocharides/Courses/ECE656/ia-32.pdf
• 50% of code is just 3 instructions (mov, call, jmp)
• 99% of code is under 50 instructions