ENGS 116 Lecture 3
Instruction Set Design
Vincent H. Berk
September 29th, 2008
Reading for Today: Chapter 1.5 – 1.11, Mazor article
Reading for Wednesday: Appendix B.1 – B.11, Wulf article
Homework for Wednesday: 1.1, 1.3, 1.6, 1.7, 1.13
Instruction Sets
software
instruction set
hardware
Interface Design
A good interface:
• Lasts through many implementations (portability, compatibility).
• Is used in many different ways (generality).
• Provides convenient functionality to higher levels.
• Permits an efficient implementation at lower levels.
[Figure: a single interface persisting over time, with several uses and successive implementations imp 1, imp 2, imp 3.]
Evolution of Instruction Sets
Single Accumulator (EDSAC, 1950)
Accumulator + Index Registers (Manchester Mark I, IBM 700 series, 1953)
Separation of Programming Model from Implementation:
– High-level Language Based (B5000, 1963)
– Concept of a Family (IBM 360, 1964)
General Purpose Register Machines:
– Complex Instruction Sets (VAX, Intel 432, 1977-80), leading to CISC (Intel x86, Pentium II/III/4, Core 2, AMD Athlon/Opteron)
– Load/Store Architecture (CDC 6600, Cray 1, 1963-76), leading to RISC (MIPS, SPARC, 88000, IBM RS6000, … 1987)
Evolution of Instruction Sets
• Major advances in computer architecture are typically
associated with landmark instruction set designs
– Ex: Stack vs General Purpose Registers (GPR)
• Design decisions must take into account:
– technology
– machine organization
– programming languages
– compiler technology
– operating systems
• Few will ever design an instruction set, but understanding
ISA design decisions is important
Design Space of ISA
Five Primary Dimensions
– Number of explicit operands (0,1,2,3) - ISA class
– Operand storage: Where besides memory?
– Effective address: How is memory location specified?
– Type & size of operands: byte, int, float, vectors, 32-bits, 64-bits? How is it specified?
– Operations: add, sub, mul, … How is it specified?
Other Aspects
• Successor: How is it specified?
• Conditions: How are they determined?
• Encodings: Fixed or variable? Wide?
• Parallelism
Basic ISA Classes
Accumulator:
    1 address      add A           acc ← acc + mem[A]
    1+x address    addx A          acc ← acc + mem[A + x]
Stack:
    0 address      add             tos ← tos + next
General Purpose Register:
    2 address      add A B         A ← A + B
    3 address      add A B C       A ← B + C
Load/Store:
    3 address      add Ra Rb Rc    Ra ← Rb + Rc
                   load Ra Rb      Ra ← mem[Rb]
                   store Ra Rb     mem[Rb] ← Ra
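To make the classes above concrete, here is a minimal sketch of how C = A + B might be computed on three of them. These are toy models, not any real ISA: the `load`/`store`/`push`/`pop` steps for the accumulator and stack machines, the symbolic memory names, and the register names R1 to R3 are assumptions added for illustration.

```python
# Toy models of three ISA classes, each computing C = A + B.
# Hypothetical mini-machines for illustration; memory is a dict of named cells.

def run_accumulator(mem):
    """1-address machine: every operation goes through the accumulator."""
    acc = mem["A"]              # load A   : acc <- mem[A]
    acc = acc + mem["B"]        # add B    : acc <- acc + mem[B]
    mem["C"] = acc              # store C  : mem[C] <- acc
    return mem["C"]

def run_stack(mem):
    """0-address machine: operands are implicit on top of the stack."""
    stack = []
    stack.append(mem["A"])      # push A
    stack.append(mem["B"])      # push B
    tos = stack.pop()           # add      : tos <- tos + next
    nxt = stack.pop()
    stack.append(tos + nxt)
    mem["C"] = stack.pop()      # pop C
    return mem["C"]

def run_load_store(mem):
    """3-address load/store machine: only load and store touch memory."""
    reg = {}
    reg["R1"] = mem["A"]                  # load R1, A
    reg["R2"] = mem["B"]                  # load R2, B
    reg["R3"] = reg["R1"] + reg["R2"]     # add R3, R1, R2
    mem["C"] = reg["R3"]                  # store R3, C
    return mem["C"]
```

In this sketch all three versions touch memory three times for a single addition; the difference shows up in longer expressions, where the load/store version can keep intermediate values in registers while the accumulator version has to spill them to memory.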
Primary Advantages and Disadvantages
of Each Class of Machine
Stack
A: Simple model of expression evaluation (reverse polish). Short
instructions can yield good code density.
D: A stack cannot be randomly accessed. This limitation makes it
difficult to generate efficient code. It’s also difficult to implement
efficiently, since the stack becomes a bottleneck.
Accumulator
A: Minimizes internal state of machine. Short instructions.
D: Since accumulator is only temporary storage, memory traffic is
highest for this approach.
Register
A: Most general model for code generation.
D: All operands must be named, leading to longer instructions.
While most early machines used stack or accumulator-style
architectures, modern machines (designed in the last 10-15 years and
still in use) use a general-purpose register architecture:
• Registers are faster than memory
• Registers are easier for compilers to use
• Registers can be used more effectively than other forms of
internal storage
Machine Types
Machine          General-purpose registers   Architecture                Year
EDSAC            1                           Accumulator                 1949
IBM 701          1                           Accumulator                 1953
CDC 6600         8                           Load-store                  1963
IBM 360          16                          Register-memory             1964
DEC PDP-8        1                           Accumulator                 1965
DEC PDP-11       8                           Register-memory             1970
Intel 8008       1                           Accumulator                 1972
Motorola 6800    2                           Accumulator                 1974
DEC VAX          16                          Register-memory, mem-mem    1977
Intel 8086       1                           Extended accumulator        1978
Motorola 68000   16                          Register-memory             1980
Intel 80386      8                           Register-memory             1985
MIPS             32                          Load-store                  1985
HP PA-RISC       32                          Load-store                  1986
SPARC            32                          Load-store                  1987
PowerPC          32                          Load-store                  1992
DEC Alpha        32                          Load-store                  1992
FIGURE from SECOND EDITION
Addressing Modes
Mode                      Operand
• Register                Ri
• Immediate (literal)     v
• Direct (absolute)       M[v]
• Base+Displacement       M[Ri + v]
• Register indirect       M[Ri]
• Base+Index (Indexed)    M[Ri + Rj]
• Scaled Index            M[Ri + Rj*d + v]
• Autoincrement           M[Ri++]
• Autodecrement           M[Ri--]
• Memory indirect         M[ M[Ri] ]
(M[...] is memory; Ri and Rj are entries in the register file.)
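The addressing modes listed above can be read as small functions from registers and memory to an operand. The sketch below is one possible rendering under stated assumptions: the function name `operand`, the mode strings, and the list-based `regs`/`mem` are made up for illustration, autoincrement is modeled as post-increment by `d`, and autodecrement as pre-decrement by `d` (the slide notation does not fix the ordering).

```python
def operand(mode, regs, mem, i=None, j=None, v=0, d=1):
    """Return the operand selected by an addressing mode.

    regs and mem are plain lists; i and j are register numbers,
    v is a constant from the instruction, d is a scale / step size.
    """
    if mode == "register":
        return regs[i]
    if mode == "immediate":
        return v
    if mode == "direct":
        return mem[v]
    if mode == "base+displacement":
        return mem[regs[i] + v]
    if mode == "register indirect":
        return mem[regs[i]]
    if mode == "base+index":
        return mem[regs[i] + regs[j]]
    if mode == "scaled index":
        return mem[regs[i] + regs[j] * d + v]
    if mode == "autoincrement":
        value = mem[regs[i]]
        regs[i] += d          # assumption: increment after the access
        return value
    if mode == "autodecrement":
        regs[i] -= d          # assumption: decrement before the access
        return mem[regs[i]]
    if mode == "memory indirect":
        return mem[mem[regs[i]]]
    raise ValueError(f"unknown addressing mode: {mode}")
```

For example, with `regs = [0, 4, 2]` and a 16-entry `mem`, `operand("base+displacement", regs, mem, i=1, v=3)` reads `mem[7]`.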
Memory Alignment
Processors often require data-types to be aligned on addresses
that are a multiple of their size:
• address % sizeof (datatype) == 0
• bytes can be aligned anywhere
• 4 byte integers aligned on addresses divisible by 4
Byte Order
• Little Endian – Little End First (Intel): D C B A
• Big Endian – Big End First (PowerPC, MIPS, network byte order): A B C D
• Bi-Endian – can do both (SPARC v9)
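Both ideas on this slide can be checked in a few lines. The following is a small sketch using Python's standard `struct` module; the helper name `is_aligned` and the sample value `0x0A0B0C0D` are chosen for illustration only.

```python
import struct

def is_aligned(address, size):
    """True if address is a multiple of the data type's size."""
    return address % size == 0

# A 32-bit value whose four bytes are A=0x0A, B=0x0B, C=0x0C, D=0x0D.
value = 0x0A0B0C0D

little = struct.pack("<I", value)   # little end first: D C B A
big = struct.pack(">I", value)      # big end first:    A B C D
network = struct.pack("!I", value)  # network byte order is big endian

# Reading the same four bytes with the wrong byte order gives a different number.
misread = struct.unpack("<I", big)[0]
```

Here `little` holds the bytes `0D 0C 0B 0A`, `big` and `network` both hold `0A 0B 0C 0D`, and `misread` is `0x0D0C0B0A`, which is why data exchanged between machines needs an agreed byte order.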
Operations in the Instruction Set
• Arithmetic and logical – integer arithmetic and logical operations:
add, and, subtract, or
• Data transfer – loads/stores (move instructions on machines with
memory addressing)
• Control – branch, jump, procedure call and return, traps
• System – operating system call, virtual memory management
instructions
• Floating point – floating-point operations: add, multiply
• Decimal – decimal add, decimal multiply, decimal-to-character
conversions
• String – string move, string compare, string search
• Graphics – pixel and vertex operations
Rank   80x86 instruction         Integer average (% total executed)
1      load                      22%
2      conditional branch        20%
3      compare                   16%
4      store                     12%
5      add                       8%
6      and                       6%
7      sub                       5%
8      move register-register    4%
9      call                      1%
10     return                    1%
       Total                     96%
FIGURE B.13 The top 10 instructions for the 80x86.
Control Flow
PIC – Position Independent Code
Caller vs. Callee saving of state
Instruction Set Encoding
• Affects program size:
– Number of instructions: size of the Opcode
– Number of instructions: types of instructions
– Number of operands
– Number of registers: size of the operand fields
– Variable instruction length vs. Fixed instruction length
• Intel x86 instructions are between 1 and 17 bytes long.
RISC vs. CISC
RISC = Reduced Instruction Set Computer
• Small instruction sets
• Fixed-length instructions that often execute in a single cycle
• Operations performed only on registers
• Simpler chip that can run at higher clock speed
CISC = Complex Instruction Set Computer
• Large instruction sets
• Complex, variable-length instructions
• Memory-to-memory operations
Design Principles CISC
(Patterson, 1985)
• Richer instruction sets would simplify compilers.
• Richer instruction sets would alleviate the software crisis.
• Richer instruction sets would improve architecture quality.
– Since execution speed was proportional to program size,
architectural techniques that led to smaller programs also
led to faster computers.
Design Principles RISC
(Patterson, 1985)
• Functions should be kept simple unless there is a very good reason
to do otherwise.
• Simple decoding and pipelined execution are more important than
program size.
• Compiler technology should be used to simplify instructions rather
than to generate complex instructions.
A “Typical” RISC
(Patterson)
• 32-bit fixed format instruction (3 formats)
• 32 32-bit general-purpose registers (R0 contains zero,
double-precision numbers take two registers)
• Single address mode for load/store: base + displacement
(no indirection)
• Simple branch conditions
• Delayed branch to avoid pipeline penalties
Examples: DLX, SPARC, MIPS, HP PA-RISC, DEC Alpha,
IBM/Motorola PowerPC, Motorola M88000
MIPS Instruction Formats (DLX)
Field boundaries (bit positions): 31, 26, 21, 16, 11, 6, 0
R-type:   op (6 bits) | rs (5 bits) | rt (5 bits) | rd (5 bits) | shamt (5 bits) | funct (6 bits)
I-type:   op (6 bits) | rs (5 bits) | rt (5 bits) | immediate/address (16 bits)
J-type:   op (6 bits) | target address (26 bits)
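The three formats above amount to shifting each field into place within a 32-bit word. The sketch below packs the fields accordingly; the function names `encode_r`, `encode_i`, `encode_j`, and `decode_r` are hypothetical helpers, and the example uses the standard MIPS encoding of `add $t0, $s1, $s2` (registers 8, 17, 18; funct 0x20).

```python
def encode_r(op, rs, rt, rd, shamt, funct):
    """R-type: op(6) | rs(5) | rt(5) | rd(5) | shamt(5) | funct(6)."""
    return (op << 26) | (rs << 21) | (rt << 16) | (rd << 11) | (shamt << 6) | funct

def encode_i(op, rs, rt, imm):
    """I-type: op(6) | rs(5) | rt(5) | immediate(16, two's complement)."""
    return (op << 26) | (rs << 21) | (rt << 16) | (imm & 0xFFFF)

def encode_j(op, target):
    """J-type: op(6) | target address(26)."""
    return (op << 26) | (target & 0x03FFFFFF)

def decode_r(word):
    """Split a 32-bit word back into its R-type fields."""
    return {
        "op": (word >> 26) & 0x3F,
        "rs": (word >> 21) & 0x1F,
        "rt": (word >> 16) & 0x1F,
        "rd": (word >> 11) & 0x1F,
        "shamt": (word >> 6) & 0x1F,
        "funct": word & 0x3F,
    }

# add $t0, $s1, $s2  ->  op=0, rs=17, rt=18, rd=8, shamt=0, funct=0x20
add_word = encode_r(0, 17, 18, 8, 0, 0x20)
# addi $t0, $s1, -1  ->  op=8, rs=17, rt=8, imm=-1
addi_word = encode_i(8, 17, 8, -1)
```

Because every field sits at a fixed position, decoding is a handful of shifts and masks, which is the "simple decoding" the RISC design principles call for.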
Impact of Compiler Technology
on Architecture Decisions
The interaction of compilers and high-level languages significantly
affects how programs use an instruction set.
1. How are variables allocated and addressed? How many
registers are needed to allocate variables appropriately?
2. What is the impact of optimization techniques on
instruction mixes?
3. What control structures are used and with what frequency?
Instruction Set Properties
That Simplify Compiler Writing
1. Provide regularity.
2. Provide primitives, not solutions.
3. Simplify tradeoffs among alternatives.
4. Provide instructions that bind the quantities known at compile time
as constants.
DEC VAX: “The penultimate CISC”
• VAX-11/780 introduced in 1977
• 2 goals:
– 32-bit extension of PDP-11 architecture (make customers
comfortable)
– ease task of writing compilers and operating systems
• General-purpose register machine with large orthogonal
instruction set
• 16 general-purpose registers (4 reserved)
• Large number of addressing modes, large number of instructions
• Any combination of addressing modes works with nearly every
opcode
• Variable-length instructions
– 3-operand instruction may have 0 to 3 operand memory
references, each of which may be any of the addressing modes
• Elaborate instructions can take dozens of clock cycles
IBM 360/370
• 360 introduced in 1964 – first to use notion of instruction set
architecture (370 introduced in 1970 as successor to 360)
• Goals:
– exploit storage – large main storage, storage hierarchies
– support concurrent I/O
– create a general-purpose machine with new OS facilities and many
data types
– maintain strict upward and downward machine-language compatibility
• 32-bit machine with byte addressability and support for variety of
data types
• 16 32-bit, general-purpose registers
• 4 double-precision (64-bit) floating-point registers
• 5 instruction formats, each of which is associated with a single
addressing mode
• Basic operations
– logic operations on bits, character strings, and fixed words
– decimal or character operations on strings of characters or
decimal digits
– fixed-point binary arithmetic
– floating-point arithmetic
IBM 360
Cray
ISA Metrics
• Regularity (Orthogonality)
– No special registers, few special cases, all operand modes
available with any data type or instruction type
• Primitives rather than solutions
• Completeness
– Support for a wide range of operations and target
applications
• Streamlined
– Resource needs easily determined
• Ease of compilation
• Ease of implementation
• Scalability
• Density (network bandwidth and power consumption)