CS152: Computer Architecture and Engineering

IKI10230
Introduction to Computer Organization
Lecture 4: CISC vs. RISC Instruction Sets
Sources:
1. Hamacher. Computer Organization, 5th ed.
2. Lecture materials from CS61C/2000 & CS152/1997, UCB.
12 March 2003
Bobby Nazief
Qonita Shahab
Course materials: http://www.cs.ui.ac.id/kuliah/iki10230/
Review: Types of Operations
° Data Transfers
• memory-to-memory move
• register-to-register move
• memory-to-register move
° Arithmetic & Logic
• integer (binary + decimal) or FP
• Add, Subtract, Multiply, Divide
• shift left/right, rotate left/right
• not, and, or, set, clear
° Program Sequencing & Control
• unconditional, conditional Branch
• call, return
• trap, return
° Input/Output Transfers
• register-to-I/O device move
° Synchronization
• test & set (atomic r-m-w)
° String
• search, translate
° Graphics (MMX)
• parallel subword ops (4 16-bit add)
Review: Addressing Modes (1/2)
   Type                 Syntax         Effective Address
1. Immediate:           #Value         ; Operand = Value
                        Add #10,R1     ; R1 ← [R1] + 10
2. Register:            Ri             ; EA = Ri
                        Add R2,R1      ; R1 ← [R1] + [R2]
3. Absolute (Direct):   LOC            ; EA = LOC
                        Add 100,R1     ; R1 ← [R1] + [100]
4. Indirect-Register:   (Ri)           ; EA = [Ri]
                        Add (R2),R1    ; R1 ← [R1] + [[R2]]
   Indirect-Memory:     (LOC)          ; EA = [LOC]
                        Add (100),R1   ; R1 ← [R1] + [[100]]
Review: Addressing Modes (2/2)
   Type                 Syntax             Effective Address
5. Index:               X(R2)              ; EA = [R2] + X
                        Add 10(R2),R1      ; R1 ← [R1] + [[R2]+10]
   Base+Index:          (R1,R2)            ; EA = [R1] + [R2]
                        Add (R1,R2),R3     ; R3 ← [R3] + [[R1]+[R2]]
   Base+Index+Offset:   X(R1,R2)           ; EA = [R1] + [R2] + X
                        Add 10(R1,R2),R3   ; R3 ← [R3] + [[R1]+[R2]+10]
6. Relative:            X(PC)              ; EA = [PC] + X
                        Beq 10             ; if (Z==1) then PC ← [PC]+10
7. Autoincrement:       (Ri)+              ; EA = [Ri], Increment Ri
                        Add (R2)+,R1       ; R1 ← [R1] + [[R2]],
                                           ; R2 ← [R2] + d
8. Autodecrement:       -(Ri)              ; Decrement Ri, EA = [Ri]
                        Add -(R2),R1       ; R2 ← [R2] – d,
                                           ; R1 ← [R1] + [[R2]]
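The effective-address rules reviewed above can be checked with a small model. The following is a minimal sketch in Python, not part of the original lecture: the function name `effective_address`, the dictionary-based register file, the mode names, and the default step size `d = 1` are all assumptions made for illustration.

```python
def effective_address(mode, regs, mem, reg=None, reg2=None, x=0, loc=None, d=1):
    """Return the effective address (EA) of a memory operand.

    regs: dict of register contents, e.g. {"R1": 1200}
    mem:  dict mapping address -> contents (needed only for indirect-memory)
    d:    step used by autoincrement/autodecrement (assumed to be 1 here)
    """
    if mode == "absolute":            # LOC       ; EA = LOC
        return loc
    if mode == "indirect_register":   # (Ri)      ; EA = [Ri]
        return regs[reg]
    if mode == "indirect_memory":     # (LOC)     ; EA = [LOC]
        return mem[loc]
    if mode == "index":               # X(Ri)     ; EA = [Ri] + X
        return regs[reg] + x
    if mode == "base_index":          # X(Ri,Rj)  ; EA = [Ri] + [Rj] + X
        return regs[reg] + regs[reg2] + x
    if mode == "relative":            # X(PC)     ; EA = [PC] + X
        return regs["PC"] + x
    if mode == "autoincrement":       # (Ri)+     ; EA = [Ri], then Ri += d
        ea = regs[reg]
        regs[reg] += d
        return ea
    if mode == "autodecrement":       # -(Ri)     ; Ri -= d, then EA = [Ri]
        regs[reg] -= d
        return regs[reg]
    # Immediate and register operands do not reference memory.
    raise ValueError(f"mode {mode!r} has no memory operand")


# Illustrative register contents (assumed values).
regs = {"R1": 1200, "R2": 4600}
print(effective_address("index", regs, {}, reg="R1", x=20))                   # 1220
print(effective_address("base_index", regs, {}, reg="R1", reg2="R2", x=30))   # 5830
print(effective_address("autodecrement", regs, {}, reg="R2"))                 # 4599
print(effective_address("autoincrement", regs, {}, reg="R1"))                 # 1200
```

Note that autoincrement returns the address before updating the register, while autodecrement updates the register first; that ordering is the only difference between the two branches.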
Homework #1 Solutions
° 2.8 A x B + C x D on a single-accumulator processor
Load      A
Multiply  B      ; Accumulator = A x B
Store     X      ; X can be A, B, or others except C or D
Load      C
Multiply  D      ; Accumulator = C x D
Add       X      ; Accumulator = A x B + C x D
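To see why the intermediate Store is needed on a one-accumulator machine, the sequence above can be traced with a toy interpreter. This is a minimal sketch in Python; the function `run_accumulator`, the tuple encoding of instructions, and the sample operand values are assumptions for illustration only.

```python
def run_accumulator(program, memory):
    """Execute (opcode, operand) pairs on a machine with a single accumulator."""
    acc = 0
    for op, name in program:
        if op == "Load":
            acc = memory[name]
        elif op == "Store":
            memory[name] = acc
        elif op == "Multiply":
            acc *= memory[name]
        elif op == "Add":
            acc += memory[name]
        else:
            raise ValueError(f"unknown opcode {op!r}")
    return acc


program = [
    ("Load", "A"), ("Multiply", "B"), ("Store", "X"),   # X = A x B
    ("Load", "C"), ("Multiply", "D"), ("Add", "X"),     # acc = C x D + X
]
memory = {"A": 2, "B": 3, "C": 4, "D": 5}   # sample values (assumed)
result = run_accumulator(program, memory)
print(result)   # 26, i.e. 2*3 + 4*5
```

Because the second Load overwrites the accumulator, the first product must be parked in memory; choosing C or D as the temporary would destroy an operand that is still needed.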
Homework #1 Solutions
° 2.9 Sum the scores of N students over J tests; J >> number of registers
      Move       #SUM,R0     ; R0 points to SUM
      Move       J,R1        ; R1 = j
      Move       R1,R2
      Add        #1,R2
      Multiply   #4,R2       ; R2 = (j + 1)*4
Lj:   Move       #LIST,R3    ; R3 points to first student
      Move       J,R4
      Sub        R1,R4       ; R4: index to particular test of first student
      Multiply   #4,R4
      Add        R4,R3       ; R3 points to particular test of first student
      Move       N,R4        ; R4 = n
      Clear      R5          ; Reset the accumulator
Ln:   Add        (R3),R5     ; Accumulate the sum of particular test
      Add        R2,R3       ; R3 points to particular test of next student
      Decrement  R4
      Branch>0   Ln          ; Iterate for all students
      Move       R5,(R0)     ; Store the sum of particular test
      Add        #4,R0       ; R0 points to sum of the next test
      Decrement  R1
      Branch>0   Lj          ; Iterate for all tests
Homework #1 Solutions
° 2.10 (a) “Dot product” on a Load/Store architecture
      Move       #AVEC,R1     ; R1 points to vector A.
      Move       #BVEC,R2     ; R2 points to vector B.
      Load       N,R3         ; R3 serves as a counter.
      Clear      R0           ; R0 accumulates the product.
LOOP: Load       (R1)+,R4     ; Compute the product of next
      Load       (R2)+,R5     ; components.
      Multiply   R5,R4
      Add        R4,R0        ; Add to previous sum.
      Decrement  R3           ; Decrement the counter.
      Branch>0   LOOP         ; Loop again if not done.
      Store      R0,DOTPROD   ; Store the product in memory.
Homework #1 Solutions
° 2.13 Effective Address
R1 = 1200, R2 = 4600
Load      20(R1),R5      ; EA = [R1] + 20 = 1200 + 20 = 1220
Move      #3000,R5       ; EA = none (#3000 is an immediate value)
Store     R5,30(R1,R2)   ; EA = [R1] + [R2] + 30 = 5830
Add       -(R2),R5       ; EA = [R2] – 1 = 4600 – 1 = 4599
Subtract  (R1)+,R5       ; EA = [R1] = 1200
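Homework #1 Solutions
° 2.14 Linked list
      Move     #1000,R0     ; R0 points to the first record
      Clear    R1
      Clear    R2
      Clear    R3
LOOP: Add      8(R0),R1     ; Accumulate the field at offset 8
      Add      12(R0),R2    ; Accumulate the field at offset 12
      Add      16(R0),R3    ; Accumulate the field at offset 16
      Move     4(R0),R0     ; Follow the link at offset 4
      Compare  #0,R0
      BNE      LOOP         ; Repeat until the link is 0
      Move     R1,SUM1
      Move     R2,SUM2
      Move     R3,SUM3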
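The list traversal in solution 2.14 can be mirrored in a high-level language to check the offsets. Below is a minimal sketch in Python that models memory as a dictionary from address to word. The helper `make_record`, the record addresses other than 1000, and the field values are assumptions for illustration; only the offsets (link at 4, summed fields at 8, 12, 16) come from the solution.

```python
def sum_linked_list(mem, head):
    """Walk records linked through the word at offset 4; sum the fields at 8, 12, 16."""
    r0 = head              # R0: address of the current record
    r1 = r2 = r3 = 0       # R1..R3: running sums
    while True:
        r1 += mem[r0 + 8]
        r2 += mem[r0 + 12]
        r3 += mem[r0 + 16]
        r0 = mem[r0 + 4]   # Move 4(R0),R0: follow the link
        if r0 == 0:        # Compare #0,R0 / BNE LOOP
            break
    return r1, r2, r3


def make_record(mem, addr, link, fields):
    """Hypothetical helper: lay out one record in the toy memory."""
    mem[addr + 4] = link
    for i, value in enumerate(fields):
        mem[addr + 8 + 4 * i] = value


mem = {}
make_record(mem, 1000, 2320, (10, 20, 30))
make_record(mem, 2320, 1040, (1, 2, 3))
make_record(mem, 1040, 0, (5, 5, 5))      # link 0 marks the end of the list
print(sum_linked_list(mem, 1000))          # (16, 27, 38)
```

Like the assembly version, the loop tests the link only after processing a record, so it assumes the list holds at least one record.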
RISC vs. CISC
° RISC = Reduced Instruction Set Computer
• Term coined at Berkeley, ideas pioneered by IBM, Berkeley, Stanford
° RISC characteristics:
• Load-store architecture
• Fixed-length instructions (typically 32 bits)
• Three-address architecture
• Simple operations
° RISC examples: MIPS, SPARC, IBM/Motorola PowerPC, Compaq Alpha,
ARM, SH4, HP-PA, ...
° CISC = Complex Instruction Set Computer
• Term applied to non-RISC architectures
° CISC characteristics:
• Register-memory architecture
• Variable-length instructions
• Complex operations
° CISC examples: Intel 80x86, VAX, IBM 360, …
MIPS
MIPS I Registers
° Programmable storage
• 2^32 x bytes of memory
• 31 x 32-bit GPRs (R0 = 0)
• 32 x 32-bit FP regs (paired DP)
• HI, LO, PC
[Figure: register file r0 (= 0), r1, ..., r31, plus PC, lo, hi]
MIPS Addressing Modes/Instruction Formats
• All instructions 32 bits wide
Mode                Format                       Operand
Register (direct)   op | rs | rt | rd            register
Immediate           op | rs | rt | immed         immed
Base+index          op | rs | rt | immed         Memory[register + immed]
PC-relative         op | rs | rt | immed         Memory[PC + immed]
• Destination first: OPcode Rdest,Rsrc1,Rsrc2
MIPS Data Transfer Instructions
Instruction          Comment
SW  500(R4), R3      Store word
SH  502(R2), R3      Store half
SB  41(R3), R2       Store byte
LW  R1, 30(R2)       Load word
LH  R1, 40(R3)       Load halfword
LHU R1, 40(R3)       Load halfword unsigned
LB  R1, 40(R3)       Load byte
LBU R1, 40(R3)       Load byte unsigned
LUI R1, 40           Load Upper Immediate (16 bits shifted left by 16)
[Figure: LUI R5 places the immediate in the upper 16 bits of R5; the lower 16 bits are 0000 … 0000]
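Two details of the data transfer instructions are easy to get wrong: how LUI combines with ORI to build a full 32-bit value, and how the signed and unsigned byte loads differ. The following is a minimal sketch in Python that models registers as 32-bit bit patterns. The function names and the sample address are assumptions for illustration, not a model of any particular simulator.

```python
MASK32 = 0xFFFFFFFF


def lui(imm16):
    """LUI: put a 16-bit immediate in the upper half; the lower half becomes zero."""
    return (imm16 & 0xFFFF) << 16


def ori(value, imm16):
    """ORI: OR a register with a zero-extended 16-bit immediate."""
    return (value | (imm16 & 0xFFFF)) & MASK32


def sign_extend(value, bits):
    """Sign-extend the low `bits` of value to a 32-bit pattern."""
    value &= (1 << bits) - 1
    if value & (1 << (bits - 1)):
        value -= 1 << bits
    return value & MASK32


def load_byte(byte, unsigned):
    """LB sign-extends the loaded byte to 32 bits; LBU zero-extends it."""
    return byte & 0xFF if unsigned else sign_extend(byte, 8)


addr = 0x1234ABCD                               # hypothetical 32-bit address
r1 = ori(lui(addr >> 16), addr & 0xFFFF)        # LUI then ORI
print(hex(r1))                                  # 0x1234abcd
print(hex(load_byte(0x80, unsigned=False)))     # 0xffffff80
print(hex(load_byte(0x80, unsigned=True)))      # 0x80
```

The same LUI/ORI pairing is what the dot-product example later in the lecture uses to load the addresses of AVEC and BVEC.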
MIPS Arithmetic Instructions
Instruction          Example            Meaning                         Comments
add                  add $1,$2,$3       $1 = $2 + $3                    3 operands
subtract             sub $1,$2,$3       $1 = $2 – $3                    3 operands
add immediate        addi $1,$2,100     $1 = $2 + 100                   + constant
add unsigned         addu $1,$2,$3      $1 = $2 + $3                    3 operands
subtract unsigned    subu $1,$2,$3      $1 = $2 – $3                    3 operands
add imm. unsign.     addiu $1,$2,100    $1 = $2 + 100                   + constant
multiply             mult $2,$3         Hi, Lo = $2 x $3                64-bit signed product
multiply unsigned    multu $2,$3        Hi, Lo = $2 x $3                64-bit unsigned product
divide               div $2,$3          Lo = $2 ÷ $3, Hi = $2 mod $3
divide unsigned      divu $2,$3         Lo = $2 ÷ $3, Hi = $2 mod $3
Move from Hi         mfhi $1            $1 = Hi                         Used to get copy of Hi
Move from Lo         mflo $1            $1 = Lo                         Used to get copy of Lo
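The split of a multiply result across Hi and Lo, and the difference between the signed and unsigned forms, can be illustrated numerically. This is a minimal sketch in Python that treats registers as 32-bit patterns. The function names mirror the mnemonics but are assumptions for illustration, and each function returns a `(Hi, Lo)` tuple.

```python
MASK32 = 0xFFFFFFFF
MASK64 = 0xFFFFFFFFFFFFFFFF


def to_signed(x):
    """Interpret a 32-bit pattern as a two's-complement integer."""
    x &= MASK32
    return x - (1 << 32) if x & 0x80000000 else x


def mult(rs, rt):
    """mult: signed 32 x 32 -> 64-bit product, returned as (Hi, Lo)."""
    product = (to_signed(rs) * to_signed(rt)) & MASK64
    return product >> 32, product & MASK32


def multu(rs, rt):
    """multu: unsigned 32 x 32 -> 64-bit product, returned as (Hi, Lo)."""
    product = (rs & MASK32) * (rt & MASK32)
    return product >> 32, product & MASK32


def divu(rs, rt):
    """divu: Hi = remainder, Lo = quotient (unsigned), returned as (Hi, Lo)."""
    rs &= MASK32
    rt &= MASK32
    return rs % rt, rs // rt


hi, lo = mult(0xFFFFFFFF, 2)     # (-1) x 2 = -2
print(hex(hi), hex(lo))          # 0xffffffff 0xfffffffe
hi, lo = multu(0xFFFFFFFF, 2)    # 4294967295 x 2
print(hex(hi), hex(lo))          # 0x1 0xfffffffe
print(divu(17, 5))               # (2, 3): Hi = 17 mod 5, Lo = 17 / 5
```

The low word is identical in the signed and unsigned cases; only Hi differs, which is why a program that needs only a 32-bit result reads Lo with mflo.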
MIPS Logical Instructions
Instruction           Example           Meaning            Comment
and                   and $1,$2,$3      $1 = $2 & $3       3 reg. operands
or                    or $1,$2,$3       $1 = $2 | $3       3 reg. operands
xor                   xor $1,$2,$3      $1 = $2 ^ $3       3 reg. operands
nor                   nor $1,$2,$3      $1 = ~($2 | $3)    3 reg. operands
and immediate         andi $1,$2,10     $1 = $2 & 10       Logical AND reg, constant
shift left logical    sll $1,$2,10      $1 = $2 << 10      Shift left by constant
shift right logical   srl $1,$2,10      $1 = $2 >> 10      Shift right by constant
shift right arithm.   sra $1,$2,10      $1 = $2 >> 10      Shift right (sign extend)
shift left logical    sllv $1,$2,$3     $1 = $2 << $3      Shift left by variable
shift right logical   srlv $1,$2,$3     $1 = $2 >> $3      Shift right by variable
shift right arithm.   srav $1,$2,$3     $1 = $2 >> $3      Shift right arith. by variable
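The table writes both srl and sra as `>>`, so the difference between them is worth showing on a concrete value. This is a minimal sketch in Python operating on 32-bit patterns; the function names follow the mnemonics and are assumptions for illustration.

```python
MASK32 = 0xFFFFFFFF


def srl(value, shamt):
    """Shift right logical: vacated high bits are filled with zeros."""
    return (value & MASK32) >> shamt


def sra(value, shamt):
    """Shift right arithmetic: vacated high bits are copies of the sign bit."""
    value &= MASK32
    if value & 0x80000000:
        value -= 1 << 32              # reinterpret as a negative Python int
    return (value >> shamt) & MASK32  # Python's >> on ints is arithmetic


x = 0xFFFFFF00                # -256 as a 32-bit two's-complement pattern
print(hex(srl(x, 4)))         # 0xffffff0
print(hex(sra(x, 4)))         # 0xfffffff0
```

For non-negative values the two shifts agree; they differ only when the sign bit is set, where sra divides by a power of two (rounding toward negative infinity) and srl does not.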
MIPS Compare and Branch Instructions
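° Compare and Branch
• BEQ rs, rt, offset     if R[rs] == R[rt] then PC-relative branch
• BNE rs, rt, offset     if R[rs] <> R[rt] then PC-relative branch
° Compare to zero and Branch
• BLEZ rs, offset        if R[rs] <= 0 then PC-relative branch
• BGTZ rs, offset        if R[rs] > 0 then PC-relative branch
• BLTZ rs, offset        if R[rs] < 0 then PC-relative branch
• BGEZ rs, offset        if R[rs] >= 0 then PC-relative branch
• BLTZAL rs, offset      if R[rs] < 0 then branch and link (into R 31)
• BGEZAL rs, offset      if R[rs] >= 0 then branch and link (into R 31)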
MIPS Compare and Set, Jump instructions
Instruction           Example            Meaning                          Comment
set on less than      slt $1,$2,$3       if ($2 < $3) $1=1; else $1=0     Compare less than; 2’s comp.
set less than imm     slti $1,$2,100     if ($2 < 100) $1=1; else $1=0    Compare < constant; 2’s comp.
set less than uns.    sltu $1,$2,$3      if ($2 < $3) $1=1; else $1=0     Compare less than; natural numbers
set l. t. imm. uns.   sltiu $1,$2,100    if ($2 < 100) $1=1; else $1=0    Compare < constant; natural numbers
jump                  j 10000            go to 10000                      Jump to target address
jump register         jr $31             go to $31                        For switch, procedure return
jump and link         jal 10000          $31 = PC + 4; go to 10000        For procedure call
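The signed and unsigned set-on-less-than instructions have the same "Meaning" column, so an example with a bit pattern whose interpretation changes is useful. This is a minimal sketch in Python; the function names follow the mnemonics and are assumptions for illustration.

```python
MASK32 = 0xFFFFFFFF


def to_signed(x):
    """Interpret a 32-bit pattern as a two's-complement integer."""
    x &= MASK32
    return x - (1 << 32) if x & 0x80000000 else x


def slt(rs, rt):
    """slt: compare as two's-complement numbers."""
    return 1 if to_signed(rs) < to_signed(rt) else 0


def sltu(rs, rt):
    """sltu: compare as natural (unsigned) numbers."""
    return 1 if (rs & MASK32) < (rt & MASK32) else 0


a, b = 0xFFFFFFFF, 1    # a is -1 when signed, 4294967295 when unsigned
print(slt(a, b))        # 1
print(sltu(a, b))       # 0
```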
MIPS I/O Instructions
° MIPS has no special instructions for I/O
° I/O is treated as “memory” → the Status Register & Data Register are “regarded” as memory locations
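Treating device registers as memory means an ordinary load can poll a device. The following is a minimal sketch in Python of that idea. The class `Bus`, the two register addresses, and the "bit 0 means data ready" convention are all hypothetical choices for illustration and do not describe any real MIPS system.

```python
class Bus:
    """Toy address space: most addresses are RAM, two belong to a device."""

    STATUS_ADDR = 0xFFFF0000   # hypothetical address of the Status Register
    DATA_ADDR = 0xFFFF0004     # hypothetical address of the Data Register

    def __init__(self, pending_input):
        self.ram = {}
        self.pending = list(pending_input)   # values the device will deliver

    def load_word(self, addr):
        """What LW would see: device registers answer at their addresses."""
        if addr == self.STATUS_ADDR:
            return 1 if self.pending else 0  # bit 0 set: data is ready
        if addr == self.DATA_ADDR:
            return self.pending.pop(0)       # reading consumes one value
        return self.ram.get(addr, 0)

    def store_word(self, addr, value):
        """What SW would do for ordinary RAM (device writes are not modeled)."""
        self.ram[addr] = value


def read_device(bus):
    """Poll the Status Register, then read the Data Register, using loads only."""
    values = []
    while bus.load_word(Bus.STATUS_ADDR) & 1:
        values.append(bus.load_word(Bus.DATA_ADDR))
    return values


bus = Bus([72, 105])
print(read_device(bus))   # [72, 105]
```

The polling loop uses nothing but loads, which is the point of the slide: no dedicated I/O instruction is required.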
Example Program: Vector Dot Product
      LUI   R1,high(AVEC)        ; R1 points to vector A.
      ORI   R1,R1,low(AVEC)
      LUI   R2,high(BVEC)        ; R2 points to vector B.
      ORI   R2,R2,low(BVEC)
      LUI   R6,high(N)
      LW    R3,low(N)(R6)        ; R3 serves as a counter.
      AND   R4,R4,R0             ; R4 accumulates the product.
LOOP: LW    R5,0(R1)             ; Compute the product of
      LW    R6,0(R2)             ; next components.
      MULT  R5,R5,R6
      ADD   R4,R4,R5             ; Add to previous sum.
      ADDI  R1,R1,4              ; Advance to next component of A.
      ADDI  R2,R2,4              ; Advance to next component of B.
      ADDI  R3,R3,-1             ; Decrement the counter.
      BNE   R3,R0,LOOP           ; Loop again if not done.
      LUI   R6,high(DOTPROD)     ; Store the product in memory.
      SW    low(DOTPROD)(R6),R4
x86
Intel History: ISA evolved since 1978
° 8086: 16-bit, all internal registers 16 bits wide; no general purpose registers; '78
° 8087: + 60 Fl. Pt. instructions (Prof. Kahan); adds 80-bit-wide stack, but no registers; '80
° 80286: adds elaborate protection model; '82
° 80386: 32-bit; converts 8 16-bit registers into 8 32-bit general purpose registers; new addressing modes; adds paging; '85
° 80486, Pentium, Pentium II: + 4 instructions
° MMX: + 57 instructions for multimedia; '97
° Pentium III: + 70 instructions for multimedia; '99
° Pentium 4: + 144 instructions for multimedia; '00
x86 Registers
[Figure: x86 register set, including the Program Counter (PC)]
Organization of Segment Registers
[Figure: organization of the segment registers]
Instruction Format
° Instruction size [n] varies: 1 ≤ n ≤ 16 bytes
Field:          Prefix          Opcode   ModR/M   SIB    Displacement    Immediate
Size (bytes):   0, 1, 2, 3, 4   1, 2     0, 1     0, 1   0, 1, 2, 3, 4   0, 1, 2, 3, 4
• Prefix: (Lock, Repeat), Overrides: Segment, Operand Size, Address Size
• ModR/M: Addressing Mode
• SIB: Scale, Index, Base
• Displacement: Displacement’s Value
• Immediate: Immediate’s Value
° Convention:
OPcode  dst,src     ; dst ← dst OP src
MOV     AL,BL       ; byte (8 bit)
MOV     AX,BX       ; word (16 bit)
MOV     EAX,EBX     ; double-word (32 bit)
Addressing Modes
° Immediate
° Register
° Direct (Absolute)
° Indirect (Register)
° Index:
Addressing Modes: Examples
° Immediate:
MOV EAX,25                 ; EAX ← 25
MOV EAX,NUM                ; NUM: a constant
° Register:
MOV EAX,EBX                ; EAX ← [EBX]
° Direct (Absolute):
MOV EAX,LOC                ; EAX ← [LOC]; LOC: address label
MOV EAX,[LOC]              ; EAX ← [LOC]
° Register Indirect:
MOV EBX,OFFSET LOC         ; EBX ← #LOC
MOV EAX,[EBX]              ; EAX ← [[EBX]]
° Index:
• Base+disp.:
MOV EAX,[EBP+10]           ; EAX ← [EBP+10]
• Index+disp.:
MOV EAX,[ESI*4+10]         ; EAX ← [ESI*4+10]
• Base+Index:
MOV EAX,[EBP+ESI*4]        ; EAX ← [EBP+ESI*4]
• Base+Index+disp.:
MOV EAX,[EBP+ESI*4+10]     ; EAX ← [EBP+ESI*4+10]
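All four indexed forms above are special cases of one formula: base + index x scale + displacement. The following is a minimal sketch in Python of that calculation. The function name `x86_effective_address` and the register contents are assumptions for illustration; the scale factor is limited to 1, 2, 4, or 8, as in the 32-bit x86 addressing forms.

```python
def x86_effective_address(regs, base=None, index=None, scale=1, disp=0):
    """EA = [base] + [index] * scale + disp, truncated to 32 bits."""
    if scale not in (1, 2, 4, 8):
        raise ValueError("scale must be 1, 2, 4, or 8")
    ea = disp
    if base is not None:
        ea += regs[base]
    if index is not None:
        ea += regs[index] * scale
    return ea & 0xFFFFFFFF


regs = {"EBP": 0x1000, "ESI": 3}   # sample register contents (assumed)
print(x86_effective_address(regs, base="EBP", disp=10))                         # 4106
print(x86_effective_address(regs, index="ESI", scale=4, disp=10))               # 22
print(x86_effective_address(regs, base="EBP", index="ESI", scale=4))            # 4108
print(x86_effective_address(regs, base="EBP", index="ESI", scale=4, disp=10))   # 4118
```

With a scale of 4, the index register counts 32-bit array elements rather than bytes, which is how the x86 dot-product example steps through its vectors with a single INC.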
Data Transfer Instructions
° MOV             Move
° PUSH            Push onto stack
° PUSHA/PUSHAD    Push GP registers onto stack
° XCHG            Exchange
° CWD/CDQ         Convert word to doubleword/Convert doubleword to quadword
° XLAT            Table lookup translation
° CMOVE           Conditional move if equal
° CMOVNE          Conditional move if not equal
° CMPXCHG         Compare and exchange
Arithmetic Instructions
° Binary Arithmetic:
• ADD, ADC, SUB, SBB
• IMUL, MUL, IDIV, DIV
• INC, DEC, NEG, CMP
° Decimal Arithmetic:
• DAA    Decimal adjust after addition
• DAS    Decimal adjust after subtraction
• AAA    ASCII adjust after addition
• AAS    ASCII adjust after subtraction
Logical Instructions
° AND      And
° OR       Or
° XOR      Exclusive or
° NOT      Not
° …
° SAR/L    Shift arithmetic right/left
° SHR/L    Shift logical right/left
° SHRD     Shift right double
° SHLD     Shift left double
° ROR/L    Rotate right/left
° RCR/L    Rotate through carry right/left
Branch Instructions
° JMP           Jump
° JE/JZ         Jump if equal/Jump if zero
° JNE/JNZ       Jump if not equal/Jump if not zero
° JC            Jump if carry
° JNC           Jump if not carry
° JCXZ/JECXZ    Jump register CX zero/Jump register ECX zero
° LOOP          Loop with ECX counter
° INT           Software interrupt
° IRET          Return from interrupt
I/O Instructions
° IN     Read from a port
° OUT    Write to a port
String
° MOVS/MOVSB     Move string/Move byte string
° MOVS/MOVSW     Move string/Move word string
° MOVS/MOVSD     Move string/Move doubleword string
° INS/INSB       Input string from port/Input byte string from port
° OUTS/OUTSB     Output string to port/Output byte string to port
° CMPS/CMPSB     Compare string/Compare byte string
° SCAS/SCASB     Scan string/Scan byte string
° REP            Repeat while ECX not zero
° REPE/REPZ      Repeat while equal/Repeat while zero
° REPNE/REPNZ    Repeat while not equal/Repeat while not zero
HLL Support & Extended Instruction Sets
° HLL Support:
• BOUND    Detect value out of range
• ENTER    High-level procedure entry; creates a stack frame
• LEAVE    High-level procedure exit; reverses the action of previous ENTER
° Extended:
• MMX Instructions
- Operate on packed (64-bit) data simultaneously → use the set of 8 MMX registers
• Floating-point Instructions
• System Instructions
• Streaming SIMD Extensions
- Operate on packed (128-bit) floating-point data simultaneously → use the set of 8 XMM registers
Example Program: Vector Dot Product
            LEA   EBP,AVEC          ; EBP points to vector A.
            LEA   EBX,BVEC          ; EBX points to vector B.
            MOV   ECX,N             ; ECX serves as a counter.
            MOV   EAX,0             ; EAX accumulates the product.
            MOV   EDI,0             ; EDI is an index register.
LOOPSTART:  MOV   EDX,[EBP+EDI*4]   ; Compute the product of
            IMUL  EDX,[EBX+EDI*4]   ; next components.
            INC   EDI               ; Increment index.
            ADD   EAX,EDX           ; Add to previous sum.
            LOOP  LOOPSTART         ; Loop again if not done.
            MOV   DOTPROD,EAX       ; Store the product in memory.
MIPS (RISC) vs. Intel 80x86 (CISC)
MIPS vs. Intel 80x86
° MIPS: “Three-address architecture”
• Arithmetic-logic instructions specify all 3 operands
add $s0,$s1,$s2 # s0=s1+s2
• Benefit: fewer instructions → performance
° x86: “Two-address architecture”
• Only 2 operands, so the destination is also one of the sources
add $s1,$s0 # s0=s0+s1
• Often true in C statements: c += b;
• Benefit: smaller instructions → smaller code
MIPS vs. Intel 80x86
° MIPS: “load-store architecture”
• Only Load/Store instructions access memory; all other operations are register-register; e.g.,
lw $t0, 12($gp)
add $s0,$s0,$t0 # s0=s0+Mem[12+gp]
• Benefit: simpler hardware → easier to pipeline, higher performance
° x86: “register-memory architecture”
• All operations can have an operand in memory; the other operand is a register; e.g.,
add 12(%gp),%s0 # s0=s0+Mem[12+gp]
• Benefit: fewer instructions → smaller code
MIPS vs. Intel 80x86
° MIPS: “fixed-length instructions”
• All instructions are the same size, e.g., 4 bytes
• Simple hardware → performance
• Branch offsets can be multiples of 4 bytes
° x86: “variable-length instructions”
• Instructions are a whole number of bytes, 1 to 16 → small code size (30% smaller?)
• More recent performance benefit: better instruction cache hit rates
• Instructions can include 8- or 32-bit immediates
MIPS vs. x86: Code Length
MIPS                                x86
      LUI   R1,high(AVEC)                       LEA   EBP,AVEC
      ORI   R1,R1,low(AVEC)                     LEA   EBX,BVEC
      LUI   R2,high(BVEC)                       MOV   ECX,N
      ORI   R2,R2,low(BVEC)                     MOV   EAX,0
      LUI   R6,high(N)                          MOV   EDI,0
      LW    R3,low(N)(R6)           LOOPSTART:  MOV   EDX,[EBP+EDI*4]
      AND   R4,R4,R0                            IMUL  EDX,[EBX+EDI*4]
LOOP: LW    R5,0(R1)                            INC   EDI
      LW    R6,0(R2)                            ADD   EAX,EDX
      MULT  R5,R5,R6                            LOOP  LOOPSTART
      ADD   R4,R4,R5                            MOV   DOTPROD,EAX
      ADDI  R1,R1,4
      ADDI  R2,R2,4
      ADDI  R3,R3,-1
      BNE   R3,R0,LOOP
      LUI   R6,high(DOTPROD)
      SW    low(DOTPROD)(R6),R4