Transcript Slide 1

COM515 Computer Architecture
Lecture 7. MIPS Processor Design –
Pipelined Processor Design #2
Prof. Taeweon Suh
Computer Science Education
Korea University
Pipelined Datapath
[Figure: pipelined MIPS datapath divided into five stages by the IF/ID, ID/EX, EX/MEM, and MEM/WB pipeline registers]
Korea Univ
Example for lw instruction: Instruction Fetch (IF)
[Figure: pipelined datapath during the instruction fetch stage of lw]
Example for lw instruction: Instruction Decode (ID)
[Figure: pipelined datapath during the instruction decode stage of lw]
Example for lw instruction: Execution (EX)
[Figure: pipelined datapath during the execution stage of lw]
Example for lw instruction: Memory (MEM)
[Figure: pipelined datapath during the memory stage of lw]
Example for lw instruction: Writeback (WB)
[Figure: pipelined datapath during the writeback stage of lw]
Example for sw instruction: Memory (MEM)
[Figure: pipelined datapath during the memory stage of sw]
Example for sw instruction: Writeback (WB): do nothing
[Figure: pipelined datapath during the writeback stage of sw]
Corrected Datapath (for lw)
[Figure: corrected pipelined datapath for lw]
Pipelining Example
add $14, $5, $6
lw $13, 24($1)
add $12, $3, $4
sub $11, $2, $3
lw $10, 20($1)
[Figure: pipelined datapath with the five instructions above occupying the five stages]
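The way five instructions fill the five stages can be traced with a short script. This is a minimal sketch, assuming an ideal pipeline with no stalls and assuming the slide lists the instructions as they sit in IF, ID, EX, MEM, and WB during one cycle (so lw $10 is the oldest instruction). The name `pipeline_schedule` is hypothetical and not from the lecture.

```python
STAGES = ["IF", "ID", "EX", "MEM", "WB"]

def pipeline_schedule(program):
    """Return {cycle: {stage: instruction}} for an ideal pipeline with no stalls."""
    schedule = {}
    for i, instr in enumerate(program):
        for s, stage in enumerate(STAGES):
            cycle = i + s + 1  # instruction i enters IF in cycle i + 1
            schedule.setdefault(cycle, {})[stage] = instr
    return schedule

# Assumed program order: oldest instruction first.
program = [
    "lw $10, 20($1)",
    "sub $11, $2, $3",
    "add $12, $3, $4",
    "lw $13, 24($1)",
    "add $14, $5, $6",
]

schedule = pipeline_schedule(program)
for cycle in sorted(schedule):
    row = ", ".join(f"{stage}: {schedule[cycle][stage]}"
                    for stage in STAGES if stage in schedule[cycle])
    print(f"CC {cycle}: {row}")
```

Under these assumptions the pipeline is full only in cycle 5, and the five instructions need 9 cycles in total.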
Pipeline Control
Note that in this implementation, a branch instruction decides whether to branch in the MEM stage.
[Figure: pipelined datapath annotated with the control signals PCSrc, Branch, RegWrite, MemWrite, MemRead, MemtoReg, ALUSrc, ALUOp, and RegDst; instruction fields [15–0], [20–16], and [15–11] feed the sign extender and the RegDst mux]
Pipeline Control
• We have 5 stages
  - IF, ID, EX, MEM, WB
• What needs to be controlled in each stage?
  - Instruction fetch and PC increment
  - Instruction decode / operand fetch
  - Execution stage
    • RegDst
    • ALUOp[1:0]
    • ALUSrc
  - Memory stage
    • Branch
    • MemRead
    • MemWrite
  - Writeback
    • MemtoReg
    • RegWrite (note that the register file this signal controls sits in the ID stage)
Pipeline Control
• Extend pipeline registers to include control information (created in ID)
• Pass control signals along just like the data

              Execution/Address Calculation     Memory access               Write-back
              stage control lines               stage control lines         stage control lines
Instruction   RegDst  ALUOp1  ALUOp0  ALUSrc    Branch  MemRead  MemWrite   RegWrite  MemtoReg
R-format      1       1       0       0         0       0        0          1         0
lw            0       0       0       1         0       1        0          1         1
sw            X       0       0       1         0       0        1          0         X
beq           X       0       1       0         1       0        0          0         X

[Figure: Control decodes the instruction from IF/ID; ID/EX holds the WB, M, and EX control groups, EX/MEM holds WB and M, and MEM/WB holds WB]
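The control table and the idea of carrying control bits through the pipeline registers can be written down as data. The following is a minimal sketch, assuming the control values are stored as characters so that the don't-care "X" survives. The names `TABLE`, `HELD`, `control_word`, and `held_in` are hypothetical helpers, not part of the lecture.

```python
EX_NAMES = ("RegDst", "ALUOp1", "ALUOp0", "ALUSrc")
M_NAMES = ("Branch", "MemRead", "MemWrite")
WB_NAMES = ("RegWrite", "MemtoReg")

# One row per instruction class: (EX bits, M bits, WB bits), copied from the table.
TABLE = {
    "R-format": ("1100", "000", "10"),
    "lw":       ("0001", "010", "11"),
    "sw":       ("X001", "001", "0X"),
    "beq":      ("X010", "100", "0X"),
}

# Control groups still stored in each pipeline register.
HELD = {"ID/EX": ("EX", "M", "WB"), "EX/MEM": ("M", "WB"), "MEM/WB": ("WB",)}

def control_word(instruction):
    """Decode one table row into named control signals, grouped by stage."""
    ex, m, wb = TABLE[instruction]
    return {
        "EX": dict(zip(EX_NAMES, ex)),
        "M": dict(zip(M_NAMES, m)),
        "WB": dict(zip(WB_NAMES, wb)),
    }

def held_in(instruction, register):
    """Return the control groups that a pipeline register carries for an instruction."""
    word = control_word(instruction)
    return {group: word[group] for group in HELD[register]}

print(held_in("lw", "ID/EX"))
print(held_in("lw", "MEM/WB"))
```

Each group is dropped once its stage has used it, which is why MEM/WB only needs the two write-back bits.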
Datapath with Control
[Figure: pipelined datapath with the Control unit; the WB, M, and EX control groups are carried through ID/EX, EX/MEM, and MEM/WB]
Datapath with Control
IF: lw $10, 9($1)
[Figure: datapath with control; lw is being fetched, no control values are in the pipeline registers yet]
Datapath with Control
IF: sub $11, $2, $3
ID: lw $10, 9($1)
[Figure: datapath with control. Control outputs for lw: WB = 11, M = 010, EX = 0001]
Datapath with Control
IF: and $12, $4, $5
ID: sub $11, $2, $3
EX: lw $10, 9($1)
[Figure: datapath with control. Control outputs for sub: WB = 10, M = 000, EX = 1100. ID/EX holds the lw values: WB = 11, M = 010, EX = 0001]
Datapath with Control
IF: or $13, $6, $7
ID: and $12, $4, $5
EX: sub $11, $2, $3
MEM: lw $10, 9($1)
[Figure: datapath with control. Control outputs for and: WB = 10, M = 000, EX = 1100. ID/EX holds the sub values: WB = 10, M = 000, EX = 1100. EX/MEM holds the lw values: WB = 11, M = 010]
Datapath with Control
IF: add $14, $8, $9
ID: or $13, $6, $7
EX: and $12, $4, $5
MEM: sub $11, ..
WB: lw $10, 9($1)
[Figure: datapath with control. Control outputs for or: WB = 10, M = 000, EX = 1100. ID/EX holds the and values, EX/MEM holds the sub values (WB = 10, M = 000), and MEM/WB holds the lw values (WB = 11)]
Datapath with Control
IF: xxxx
ID: add $14, $8, $9
EX: or $13, $6, $7
MEM: and $12…
WB: sub $11, ..
[Figure: datapath with control. Control outputs for add: WB = 10, M = 000, EX = 1100. MEM/WB holds the sub values (WB = 10)]
Datapath with Control
IF: xxxx
ID: xxxx
EX: add $14, $8, $9
MEM: or $13, ..
WB: and $12…
[Figure: datapath with control; the control values of add, or, and and occupy ID/EX, EX/MEM, and MEM/WB]
Datapath with Control
IF: xxxx
ID: xxxx
EX: xxxx
MEM: add $14, ..
WB: or $13…
[Figure: datapath with control; the control values of add and or occupy EX/MEM and MEM/WB]
Datapath with Control
IF: xxxx
ID: xxxx
EX: xxxx
MEM: xxxx
WB: add $14..
[Figure: datapath with control; only the WB control values of add remain, in MEM/WB]
Dependencies
• Dependencies
  - Problem with starting (or executing) the next instruction before the first is finished
  - Dependencies incur data and control hazards
[Figure: pipeline diagram over clock cycles CC 1 to CC 9. Value of register $2: 10 in CC 1 to CC 4, 10/-20 in CC 5, -20 in CC 6 to CC 9]
Program execution order (in instructions):
sub $2, $1, $3
and $12, $2, $5
or $13, $6, $2
add $14, $2, $2
sw $15, 100($2)
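The dependences on $2 in this sequence can be found mechanically. This is a minimal sketch, assuming no forwarding and assuming a consumer is at risk when it follows the producer within `window` instructions: 2 if the register file writes in the first half of the cycle and reads in the second half, 3 if only the rising clock edge is used. The tuple encoding and the name `raw_hazards` are hypothetical.

```python
# (text, destination register or None, source registers)
PROGRAM = [
    ("sub $2, $1, $3", 2, (1, 3)),
    ("and $12, $2, $5", 12, (2, 5)),
    ("or $13, $6, $2", 13, (6, 2)),
    ("add $14, $2, $2", 14, (2, 2)),
    ("sw $15, 100($2)", None, (15, 2)),
]

def raw_hazards(program, window):
    """List (producer, consumer, distance) read-after-write pairs within `window`."""
    hazards = []
    for i, (producer, dest, _) in enumerate(program):
        if dest is None:
            continue  # stores write no register
        for distance in range(1, window + 1):
            if i + distance >= len(program):
                break
            consumer, next_dest, sources = program[i + distance]
            if dest in sources:
                hazards.append((producer, consumer, distance))
            if next_dest == dest:
                break  # a later write supersedes this producer
    return hazards

for producer, consumer, distance in raw_hazards(PROGRAM, 2):
    print(f"{consumer!r} reads a result of {producer!r} only {distance} instruction(s) later")
```

With `window=2` the script flags and and or, which matches the diagram: add reads $2 in CC 5, the cycle in which the new value becomes available.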
Data Hazard - Software Solution
• Data hazards
  - Dependencies that “go backward in time”
• Have compiler guarantee no hazards?
  - Insert nop (no operation) instructions (“0x00000000” is nop in MIPS)
  - Code scheduling
• Where do we insert the “nops”?
sub $2, $1, $3
and $12, $2, $5
or $13, $6, $2
add $14, $2, $2
sw $15, 100($2)
• Problem?
  - This really slows us down!
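One way to answer the "where do we insert the nops" question is to let a small script do what the compiler would. This is a minimal sketch, assuming no forwarding and assuming a reader must sit at least `min_distance` instructions after the writer of any register it reads: 3 with a register file that writes in the first half of the cycle, 4 with a rising-edge-only design. The name `insert_nops` and the tuple encoding are hypothetical.

```python
# (text, destination register or None, source registers)
PROGRAM = [
    ("sub $2, $1, $3", 2, (1, 3)),
    ("and $12, $2, $5", 12, (2, 5)),
    ("or $13, $6, $2", 13, (6, 2)),
    ("add $14, $2, $2", 14, (2, 2)),
    ("sw $15, 100($2)", None, (15, 2)),
]

def insert_nops(program, min_distance):
    """Pad the program with nops until every reader is far enough from its writer."""
    out = []
    last_write = {}  # register -> index in `out` of its most recent writer
    for text, dest, sources in program:
        earliest = max(
            (last_write[s] + min_distance for s in sources if s in last_write),
            default=0,
        )
        while len(out) < earliest:
            out.append("nop")
        if dest is not None:
            last_write[dest] = len(out)
        out.append(text)
    return out

print("\n".join(insert_nops(PROGRAM, 3)))
```

Under these assumptions the nops go between sub and and: two of them for the split-cycle register file, three for the rising-edge-only design.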
Data Hazard - Pipeline Stalls?
sub $2, $1, $3
stall
stall
stall
and $12, $2, $5
or $13, $6, $2
add $14, $2, $2
sw $15, 100($2)
[Figure: pipeline diagram in which three stall cycles (bubbles) separate sub from and]
Data Hazard - Forwarding
• Use temporary results, don’t wait for them to be written
  - Register file forwarding to handle read/write to same register
  - ALU forwarding

Time (in clock cycles)   CC 1  CC 2  CC 3  CC 4  CC 5    CC 6  CC 7  CC 8  CC 9
Value of register $2     10    10    10    10    10/-20  -20   -20   -20   -20
Value of EX/MEM          X     X     X     -20   X       X     X     X     X
Value of MEM/WB          X     X     X     X     -20     X     X     X     X

Program execution order (in instructions):
sub $2, $1, $3
and $12, $2, $5
or $13, $6, $2
add $14, $2, $2
sw $15, 100($2)
[Figure: pipeline diagram for the sequence above with forwarding paths from EX/MEM and MEM/WB]

Ok.. Then, do we have to do this forwarding?
1. If you are asked to design CPU using only rising-edge of the clock, then?
  • Let’s stick to this for our project
2. If the register file write occurs in the first half of the clock, and read occurs in the 2nd half of the clock, then?
  • Our textbook follows this
Forwarding (simplified)
[Figure: Register File, ID/EX, ALU, EX/MEM, Data Memory, MEM/WB, and the write-back MUX]
Forwarding (from EX/MEM)
[Figure: the value in EX/MEM is fed back through MUXes at the ALU inputs]
Forwarding (from MEM/WB)
[Figure: the value in MEM/WB is fed back through the MUXes at the ALU inputs]
Forwarding (operand selection)
[Figure: a Forwarding Unit drives the select lines of the ALU input MUXes]
Forwarding (operand propagation)
[Figure: Rs, Rt, and Rd travel through ID/EX; the Forwarding Unit receives Rs, Rt, EX/MEM Rd, and MEM/WB Rd]
Forwarding
[Figure: datapath with forwarding. IF/ID.RegisterRs, IF/ID.RegisterRt, and IF/ID.RegisterRd are carried into ID/EX as Rs, Rt, and Rd; the Forwarding unit takes Rs, Rt, EX/MEM.RegisterRd, and MEM/WB.RegisterRd and controls the MUXes at the ALU inputs]
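The decision made by the forwarding unit for one ALU operand can be sketched as a pure function. The slides show only the inputs, so the conditions and the mux encodings below (10 for EX/MEM, 01 for MEM/WB, 00 for the register file) are assumptions taken from the usual textbook formulation, and `forward_select` is a hypothetical name.

```python
def forward_select(src_reg, ex_mem_regwrite, ex_mem_rd, mem_wb_regwrite, mem_wb_rd):
    """Return the ALU input mux select for one source register (Rs or Rt in ID/EX)."""
    # Newest value first: the instruction one ahead sits in EX/MEM.
    if ex_mem_regwrite and ex_mem_rd != 0 and ex_mem_rd == src_reg:
        return 0b10
    # Otherwise the instruction two ahead sits in MEM/WB.
    if mem_wb_regwrite and mem_wb_rd != 0 and mem_wb_rd == src_reg:
        return 0b01
    # No match (or register $0, which is never forwarded): use the register file value.
    return 0b00

# and $12, $2, $5 in EX while sub $2, $1, $3 is in MEM: forward Rs from EX/MEM.
print(format(forward_select(2, True, 2, False, 0), "02b"))
# or $13, $6, $2 in EX while and is in MEM and sub is in WB: forward Rt from MEM/WB.
print(format(forward_select(2, True, 12, True, 2), "02b"))
```

Checking EX/MEM before MEM/WB matters when both stages hold a write to the same register, because the EX/MEM value is the more recent one.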
Can't always forward
• lw (load word) can still cause a hazard
  - An instruction tries to read a register following a load instruction that writes to the same register
[Figure: pipeline diagram over clock cycles CC 1 to CC 9]
Program execution order (in instructions):
lw $2, 20($1)
and $4, $2, $5
or $8, $2, $6
add $9, $4, $2
slt $1, $6, $7
• Thus, we need a hazard detection unit to “stall” the pipeline after the load instruction
Stalling
• We can stall the pipeline by keeping an instruction in the same stage
[Figure: pipeline diagram over clock cycles CC 1 to CC 10; and stays in ID and or stays in IF for one extra cycle while a bubble moves down the pipeline]
Program execution order (in instructions):
lw $2, 20($1)
and $4, $2, $5
or $8, $2, $6
add $9, $4, $2
slt $1, $6, $7
Hazard Detection Unit
• Stall by letting an instruction that won’t write anything go forward
• Stall the pipeline if the instruction in ID/EX is a load (ID/EX.MemRead is set) and its rt matches a source of the instruction in IF/ID (ID/EX.RegisterRt = IF/ID.RegisterRs or ID/EX.RegisterRt = IF/ID.RegisterRt)
[Figure: datapath with forwarding unit and hazard detection unit. The hazard detection unit takes ID/EX.MemRead, ID/EX.RegisterRt, IF/ID.RegisterRs, and IF/ID.RegisterRt, and drives PCWrite, IF/IDWrite, and a MUX on the control signals entering ID/EX]
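The stall condition stated on this slide is small enough to write out directly. This is a minimal sketch: the comparison follows the slide, while the choice to report PCWrite and IF/IDWrite as deasserted during a stall is an assumption about signal polarity, and `hazard_detection` is a hypothetical name.

```python
def hazard_detection(id_ex_memread, id_ex_rt, if_id_rs, if_id_rt):
    """Detect a load-use hazard between the load in ID/EX and the instruction in IF/ID."""
    stall = bool(id_ex_memread) and id_ex_rt in (if_id_rs, if_id_rt)
    return {
        "stall": stall,
        "PCWrite": not stall,     # assumed polarity: hold the PC while stalling
        "IFIDWrite": not stall,   # assumed polarity: hold IF/ID while stalling
    }

# lw $2, 20($1) in EX, and $4, $2, $5 in ID: rs of and equals rt of lw.
print(hazard_detection(True, 2, 2, 5))
# An R-type instruction in EX never triggers this stall.
print(hazard_detection(False, 2, 2, 5))
```

Holding the PC and IF/ID repeats the fetch and decode of the two younger instructions for one cycle, after which the loaded value can be forwarded from MEM/WB.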
Control Hazards - Branch
• When we decide to branch, other instructions are in the pipeline!
• Assume: branch is not taken
  - When this assumption fails, flush 3 instructions
[Figure: pipeline diagram over clock cycles CC 1 to CC 9]
Program execution order (in instructions):
40 beq $1, $3, 7
44 and $12, $2, $5
48 or $13, $6, $2
52 add $14, $2, $2
72 lw $4, 50($7)
• We are predicting “branch not taken”
  - Need to add hardware for flushing instructions if we are wrong
Alleviate Branch Hazards
• Move branch compare to ID stage of the pipeline
• Add adder to calculate branch target in ID stage
• Add IF.Flush signal that zeros (squashes) the instruction in the IF/ID pipeline register
• Reduce penalty to 1 cycle
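Both numbers on the last two slides can be checked with a few lines. This is a minimal sketch, assuming the beq offset is counted in words relative to the instruction after the branch, and assuming one younger instruction is fetched per cycle until the branch resolves. The names `branch_target` and `flushed_on_taken_branch` are hypothetical.

```python
STAGES = ["IF", "ID", "EX", "MEM", "WB"]

def branch_target(pc, word_offset):
    """PC-relative target: address of the next instruction plus the offset in bytes."""
    return pc + 4 + 4 * word_offset

def flushed_on_taken_branch(resolve_stage):
    """Instructions fetched behind the branch by the time it resolves in `resolve_stage`."""
    return STAGES.index(resolve_stage)

print(branch_target(40, 7))            # beq at address 40 with offset 7
print(flushed_on_taken_branch("MEM"))  # branch decided in MEM
print(flushed_on_taken_branch("ID"))   # branch decided in ID
```

The target of 72 agrees with the lw at address 72, and moving the decision from MEM to ID cuts the flush count from 3 to 1.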
Flushing Instructions
[Figure: datapath with the branch comparison (=) and target adder moved to the ID stage, the IF.Flush signal into IF/ID, the hazard detection unit, and the forwarding unit]
Flushing Instructions (cycle N)
beq $1, $3, L2
and $12, $2, $5
or $13, $12, $1
…
L2: lw $4, 40($7)
[Figure: datapath with IF.Flush; beq $1, $3, L2 is in ID and and $12, $2, $5 is in IF]
Flushing Instructions (cycle N)
[Figure: same cycle as the previous slide, with the branch target L2 marked at the PC input]
Flushing Instructions (cycle N+1)
beq $1, $3, L2
and $12, $2, $5
or $13, $12, $1
…
L2: lw $4, 40($7)
[Figure: datapath with IF.Flush; lw $4, 40($7) is in IF, the flushed and has become a nop in ID, and beq $1, $3, L2 is in EX]
Improving Performance
• Try to avoid stalls! E.g., reorder these instructions:
lw $t0, 0($t1)
lw $t2, 4($t1)
sw $t2, 0($t1)
sw $t0, 4($t1)
• Add a “branch delay slot”
  - The next instruction after a branch is always executed
  - Rely on compiler to “fill” the slot with something useful
• Superscalar
  - Start more than one instruction in the same cycle
  - Almost all processors are now pipelined and superscalar
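The benefit of reordering can be checked by counting load-use pairs. This is a minimal sketch, assuming forwarding is available so that only an instruction directly after a lw that reads the loaded register stalls. The reordered sequence is one possible answer rather than the one given in the lecture, and `load_use_stalls` is a hypothetical name.

```python
# (opcode, destination register or None, source registers)
original = [
    ("lw", "$t0", ("$t1",)),        # lw $t0, 0($t1)
    ("lw", "$t2", ("$t1",)),        # lw $t2, 4($t1)
    ("sw", None, ("$t2", "$t1")),   # sw $t2, 0($t1)
    ("sw", None, ("$t0", "$t1")),   # sw $t0, 4($t1)
]

# Swapping the two stores is safe here because they write different addresses.
reordered = [original[0], original[1], original[3], original[2]]

def load_use_stalls(program):
    """Count instructions that read a register loaded by the instruction just before."""
    stalls = 0
    for prev, cur in zip(program, program[1:]):
        opcode, dest, _ = prev
        if opcode == "lw" and dest in cur[2]:
            stalls += 1
    return stalls

print(load_use_stalls(original), load_use_stalls(reordered))
```

In the original order the first store needs $t2 right after it is loaded; after the swap, the store of $t0 fills that slot.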
Dynamic Scheduling
• The hardware performs the “scheduling”
  - Hardware tries to find instructions to execute
  - Out of order (OOO) execution is possible
  - Speculative execution and dynamic branch prediction
• All modern processors are very complicated
  - DEC Alpha 21264: 9 stage pipeline, 6 instruction issue
  - PowerPC and Pentium: branch history table
  - Compiler technology is important
• This class has given you the background you need to learn more
Exceptions & Interrupts
• CPU has to prepare for all possible situations it could face
  - “Unexpected” events require change in flow of control
• Exceptions arise within the CPU
  - Undefined opcode
  - Arithmetic overflow in MIPS
    • Some other architectures (such as x86 and ARM) do not generate an exception on arithmetic overflow. Instead, they set bits of the flag register inside the CPU
• Interrupts are from external I/O devices
  - Keyboard, mouse, network card, etc.
• Many architectures and authors do not distinguish between interrupts and exceptions
  - Often use the term “interrupt” to refer to both types of events
Pipelined Performance Example
• Ideally CPI = 1
• But, need to handle stalling (caused by loads and branches)
• SPECINT2000 benchmark:
  - 25% loads
  - 10% stores
  - 11% branches
  - 2% jumps
  - 52% R-type
• Suppose
  - 40% of loads are used by next instruction
  - 25% of branches are mispredicted
• What is the average CPI?
Pipelined Performance Example
• SPECINT2000 benchmark:
  - 25% loads
  - 10% stores
  - 11% branches
  - 2% jumps
  - 52% R-type
• If there is no stall in the pipelined MIPS, how would you calculate CPI?
  - Average CPI = (0.25)(1 CPI) + (0.10)(1 CPI) + (0.11)(1 CPI) + (0.02)(1 CPI) + (0.52)(1 CPI) = 1
• Suppose
  - 40% of loads are used by next instruction
  - 25% of branches are mispredicted
  - All jumps flush next instruction
• What is the average CPI?
  - Load/Branch CPI = 1 when no stalling, 2 when stalling. Thus
    CPIlw = 1(0.6) + 2(0.4) = 1.4
    CPIbeq = 1(0.75) + 2(0.25) = 1.25
    CPIjump = 2(1) = 2
  - Average CPI = (0.25)(1.4) + (0.1)(1) + (0.11)(1.25) + (0.02)(2) + (0.52)(1) = 1.15
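The CPI arithmetic above can be reproduced as a weighted sum. This is a minimal sketch using only the instruction mix and stall assumptions from the slide; the dictionary names are hypothetical.

```python
# Instruction mix from the SPECINT2000 figures on the slide.
mix = {"load": 0.25, "store": 0.10, "branch": 0.11, "jump": 0.02, "rtype": 0.52}

# Per-class CPI: 1 cycle normally, 2 cycles when the instruction causes a one-cycle stall or flush.
cpi = {
    "load": 1 * 0.60 + 2 * 0.40,    # 40% of loads are used by the next instruction
    "store": 1,
    "branch": 1 * 0.75 + 2 * 0.25,  # 25% of branches are mispredicted
    "jump": 2,                      # every jump flushes the next instruction
    "rtype": 1,
}

average_cpi = sum(mix[kind] * cpi[kind] for kind in mix)
print(f"Average CPI = {average_cpi:.4f}")
```

The exact sum is 1.1475, which the slide rounds to 1.15.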
Pipelined Performance
• Critical path of the pipelined MIPS processor:
Tc = max {
  tpcq + tmem + tsetup,                              // IF stage
  2(tRFread + tmux + teq + tAND + tmux + tsetup),    // ID stage
  tpcq + tmux + tmux + tALU + tsetup,                // EX stage
  tpcq + tmemwrite + tsetup,                         // MEM stage
  2(tpcq + tmux + tRFwrite)                          // WB stage
}
Where does this “2” come from?
1. If you are asked to design CPU using only rising-edge of the clock, then?
  • Let’s stick to this for our project
2. If the register file write occurs in the first half of the clock, and read occurs in the 2nd half of the clock, then?
  • Our textbook follows this
Pipelined Performance Example
Element               Parameter    Delay (ps)
Register clock-to-Q   tpcq_PC      30
Register setup        tsetup       20
Multiplexer           tmux         25
ALU                   tALU         200
Memory read           tmem         250
Register file read    tRFread      150
Register file setup   tRFsetup     20
Equality comparator   teq          40
AND gate              tAND         15
Memory write          tmemwrite    220
Register file write   tRFwrite     100

Tc = 2(tRFread + tmux + teq + tAND + tmux + tsetup)
   = 2[150 + 25 + 40 + 15 + 25 + 20] ps = 550 ps
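To see why the ID stage sets the cycle time, all five stage delays can be evaluated from the table. This is a minimal sketch that plugs the listed delays into the max expression from the critical-path slide, assuming tpcq in that expression is the register clock-to-Q delay of 30 ps; the variable names are hypothetical.

```python
# Element delays in picoseconds, taken from the table.
d = {
    "tpcq": 30, "tsetup": 20, "tmux": 25, "tALU": 200, "tmem": 250,
    "tRFread": 150, "tRFsetup": 20, "teq": 40, "tAND": 15,
    "tmemwrite": 220, "tRFwrite": 100,
}

# One entry per stage, following the terms of the Tc expression.
stage_ps = {
    "IF": d["tpcq"] + d["tmem"] + d["tsetup"],
    "ID": 2 * (d["tRFread"] + d["tmux"] + d["teq"] + d["tAND"] + d["tmux"] + d["tsetup"]),
    "EX": d["tpcq"] + d["tmux"] + d["tmux"] + d["tALU"] + d["tsetup"],
    "MEM": d["tpcq"] + d["tmemwrite"] + d["tsetup"],
    "WB": 2 * (d["tpcq"] + d["tmux"] + d["tRFwrite"]),
}

critical_stage = max(stage_ps, key=stage_ps.get)
tc_ps = stage_ps[critical_stage]
print(stage_ps)
print(f"Tc = {tc_ps} ps, set by the {critical_stage} stage")
```

With these values the other stages need between 270 ps and 310 ps, so the half-cycle register file read in ID is the limiting path.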
Pipelined Performance Example
• For a program with 100 billion instructions executing on a pipelined MIPS processor,
  - CPI = 1.15
  - Tc = 550 ps
Execution Time = (#instructions)(cycles/instruction)(seconds/cycle)
               = (100 × 10^9)(1.15)(550 × 10^-12 s)
               = 63 seconds

Processor      Execution Time (seconds)   Speedup (single-cycle is baseline)
Single-cycle   95                         1
Multicycle     133                        0.71
Pipelined      63                         1.51
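The execution time and the speedup column follow from two short formulas. This is a minimal sketch using the figures on the slide, assuming speedup is the single-cycle time divided by each processor's time; the variable names are hypothetical.

```python
instructions = 100e9   # 100 billion instructions
cpi = 1.15             # cycles per instruction
tc = 550e-12           # seconds per cycle

execution_time = instructions * cpi * tc
print(f"Pipelined execution time = {execution_time:.2f} s")

# Execution times in seconds, as listed in the table.
times = {"Single-cycle": 95, "Multicycle": 133, "Pipelined": 63}
speedup = {name: times["Single-cycle"] / t for name, t in times.items()}
for name, value in speedup.items():
    print(f"{name}: {value:.2f}")
```

The product is 63.25 s before rounding, which the slide reports as 63 seconds.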
Backup Slides
Exception Handling in MIPS and Handler Actions
• Exception handling in MIPS Hardware (CPU)
  - CPU saves PC of offending (or interrupted) instruction to the “Exception Program Counter (EPC)” register
  - CPU saves indication of the problem to the “Cause” register
  - Jump to handler at 0x8000 0180
• Exception Handler in Software
  - Read cause, and transfer to relevant handler
  - If restartable,
    • Take corrective action
    • Use EPC to return to program
  - Otherwise
    • Terminate program
    • Report error using EPC, cause, …
Exceptions in a Pipeline
• Another form of control hazard
• Consider overflow on add in EX stage
add $1, $2, $1
  - Prevent $1 from being clobbered
  - Complete previous instructions
  - Flush add and subsequent instructions
  - Set Cause and EPC register values
  - Transfer control to handler
• Similar to mispredicted branch
  - Use much of the same hardware
Pipeline with Exceptions
Exception Example
• Exception on add in
40 sub $11, $2, $4
44 and $12, $2, $5
48 or $13, $2, $6
4C add $1, $2, $1
50 slt $15, $6, $7
54 lw $16, 50($7)
…
• Handler
80000180 sw $25, 1000($0)
80000184 sw $26, 1004($0)
…
Exception Example
Exception Example