Control – Multicycle Implementation


Intro to Pipelining
Chapter 6 P&H
Pipelining



Multiple instructions are overlapped in execution
Key to making modern processors fast
Example – laundry
  Wash a load of clothes in the machine
  Dry the washed load in the dryer
  Fold the dry load
  Flatmate puts the clothes away
Example – laundry
[Figure: sequential laundry. Loads A to D in task order against time, 6 PM to 2 AM; each load starts only after the previous load is completely finished.]
Example – pipelined laundry
[Figure: pipelined laundry. Loads A to D in task order against the same 6 PM to 2 AM time axis; the steps of different loads overlap.]
Pipelined CPU


The same principles can be applied to CPU design
MIPS instructions classically take 5 steps:
  1. Fetch instruction from memory
  2. Decode instruction and read registers
  3. Execute the operation or calculate an address
  4. Access an operand in data memory
  5. Write the result to a register
Example MIPS CPU
[Figure: three loads, lw $1, 100($0), lw $2, 200($0) and lw $3, 300($0), each passing through Instruction fetch, Reg, ALU, Data access and Reg. Top: non-pipelined execution, a new instruction starts every 8 ns. Bottom: pipelined execution, each stage takes 2 ns and a new instruction starts every 2 ns.]
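The timings in the example can be checked with a little arithmetic. This is a minimal sketch, assuming every instruction passes through all five stages, the non-pipelined design spends 8 ns per instruction and the pipelined design advances one 2 ns stage per clock. The function names are illustrative and do not come from the slides.

```python
def nonpipelined_time_ns(n_instructions, instr_time_ns=8):
    """Total time when each instruction finishes before the next one starts."""
    return n_instructions * instr_time_ns


def pipelined_time_ns(n_instructions, n_stages=5, stage_time_ns=2):
    """Total time when a new instruction enters the pipeline every stage time.

    The first instruction needs n_stages clocks; each later one adds one clock.
    """
    return (n_stages + n_instructions - 1) * stage_time_ns


if __name__ == "__main__":
    for n in (3, 1000):
        slow = nonpipelined_time_ns(n)
        fast = pipelined_time_ns(n)
        print(n, "instructions:", slow, "ns vs", fast, "ns, speedup", round(slow / fast, 2))
```

For the three loads this gives 24 ns against 14 ns. As the instruction count grows, the speedup approaches the ratio of the two start intervals (8 ns / 2 ns = 4) rather than the number of stages, because the pipelined clock is set by the slowest stage.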
Designing Instruction sets for pipelining
All MIPS instructions are the same length
Only a few instruction formats
  Source register fields in same place
Memory operands only appear in loads and stores
  Can calculate address in execute stage
Operands must be aligned in memory
Single Cycle Datapath
[Figure: single-cycle datapath divided into five stages: IF (instruction fetch), ID (instruction decode/register file read), EX (execute/address calculation), MEM (memory access) and WB (write back).]
Pipelined version of Datapath
[Figure: the same datapath with the pipeline registers IF/ID, ID/EX, EX/MEM and MEM/WB inserted between the stages.]

Pipeline Reg naming conventions

Named by the two stages separated by that register, e.g. ID/EX
Work through the stages of executing a load (lw) instruction

LW instruction
Instruction Fetch
  PC + 4 -> IF/ID
  PC + 4 -> PC
  Imem[PC] -> IF/ID
Instruction Decode
  IF/ID supplies the 16-bit immediate, which is sign-extended -> ID/EX
  Two regs are read -> ID/EX
  PC + 4 from IF/ID -> ID/EX
Execute or address calculation
  Register contents and sign-extended immediate are added -> EX/Mem
Memory Access
  Address in EX/Mem used to get data value from data mem -> Mem/WB
Write back
  Mem/WB data val -> reg file
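The five steps can be traced in software. This is a rough sketch rather than a model of the real hardware: the pipeline registers are plain dictionaries, only a single lw is walked through, and the field names (pc_plus_4, read_data_1, dest and so on) are made up for illustration. The destination register number is carried along with the data so that write back knows which register to update.

```python
def sign_extend16(value):
    """Treat the low 16 bits of value as a signed two's-complement number."""
    value &= 0xFFFF
    return value - 0x10000 if value & 0x8000 else value


def encode_lw(rt, offset, rs):
    """Encode lw rt, offset(rs) as a 32-bit MIPS I-format word (opcode 0x23)."""
    return (0x23 << 26) | (rs << 21) | (rt << 16) | (offset & 0xFFFF)


def run_lw(pc, regs, imem, dmem):
    """Walk one lw through IF, ID, EX, MEM and WB; return next PC and the registers."""
    # IF: fetch the instruction and latch PC + 4.
    if_id = {"pc_plus_4": pc + 4, "instr": imem[pc]}
    next_pc = pc + 4

    # ID: read two registers and sign-extend the 16-bit immediate.
    rs = (if_id["instr"] >> 21) & 0x1F
    rt = (if_id["instr"] >> 16) & 0x1F
    id_ex = {
        "pc_plus_4": if_id["pc_plus_4"],
        "read_data_1": regs[rs],
        "read_data_2": regs[rt],
        "imm": sign_extend16(if_id["instr"]),
        "dest": rt,  # destination register number travels with the instruction
    }

    # EX: add base register and immediate to form the address.
    ex_mem = {"alu_result": id_ex["read_data_1"] + id_ex["imm"], "dest": id_ex["dest"]}

    # MEM: read data memory at the computed address.
    mem_wb = {"read_data": dmem[ex_mem["alu_result"]], "dest": ex_mem["dest"]}

    # WB: write the loaded value into the register file.
    regs[mem_wb["dest"]] = mem_wb["read_data"]

    return next_pc, {"IF/ID": if_id, "ID/EX": id_ex, "EX/MEM": ex_mem, "MEM/WB": mem_wb}


if __name__ == "__main__":
    regs = [0] * 32
    imem = {0: encode_lw(1, 100, 0)}   # lw $1, 100($0)
    dmem = {100: 42}
    next_pc, stages = run_lw(0, regs, imem, dmem)
    print(next_pc, regs[1], stages["EX/MEM"])
```

In a real pipeline the five steps of different instructions happen in the same clock cycle; the sketch runs them back to back only to show what each pipeline register has to hold.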
LW instruction



Any information that might be needed in a
later stage must be passed via the pipeline
registers
Each resource can only be used in a single
pipeline stage (otherwise have structural
hazard)
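Bug in previous data-path

Which instruction supplies the destination register number when lw writes back?
  In the previous datapath it comes from IF/ID, which by then holds a later instruction
  Dest reg number needs to be passed through the pipeline registers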
Updated Datapath
[Figure: pipelined datapath corrected so that the destination register number is carried through ID/EX, EX/MEM and MEM/WB back to the write register input of the register file.]
Pipelined Control

Start off by ignoring hazards
Datapath showing control signals
[Figure: pipelined datapath with the control signals PCSrc, Branch, RegWrite, MemWrite, MemRead, ALUSrc, ALUOp, RegDst and MemtoReg. The ALU control takes 6 bits of Instruction [15–0], and the RegDst mux selects between Instruction [20–16] and Instruction [15–11].]
Values on the Control Lines
              Execution stage                  Mem access                  Write back
Instruction   RegDst  ALUOp1  ALUOp0  ALUSrc   Branch  MemRead  MemWrite   RegWrite  MemtoReg
R-format      1       1       0       0        0       0        0          1         0
lw            0       0       0       1        0       1        0          1         1
sw            X       0       0       1        0       0        1          0         X
beq           X       0       1       0        1       0        0          0         X
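The control values can be written down as a lookup table grouped by the stage that uses them. The sketch below assumes that each stage consumes its own group and hands the remaining groups to the next pipeline register; the names CONTROL and pipeline_control are illustrative, and "X" stands for a don't-care.

```python
# Control line values per instruction class, grouped by consuming stage.
CONTROL = {
    "R-format": {
        "EX": {"RegDst": 1, "ALUOp1": 1, "ALUOp0": 0, "ALUSrc": 0},
        "M": {"Branch": 0, "MemRead": 0, "MemWrite": 0},
        "WB": {"RegWrite": 1, "MemtoReg": 0},
    },
    "lw": {
        "EX": {"RegDst": 0, "ALUOp1": 0, "ALUOp0": 0, "ALUSrc": 1},
        "M": {"Branch": 0, "MemRead": 1, "MemWrite": 0},
        "WB": {"RegWrite": 1, "MemtoReg": 1},
    },
    "sw": {
        "EX": {"RegDst": "X", "ALUOp1": 0, "ALUOp0": 0, "ALUSrc": 1},
        "M": {"Branch": 0, "MemRead": 0, "MemWrite": 1},
        "WB": {"RegWrite": 0, "MemtoReg": "X"},
    },
    "beq": {
        "EX": {"RegDst": "X", "ALUOp1": 0, "ALUOp0": 1, "ALUSrc": 0},
        "M": {"Branch": 1, "MemRead": 0, "MemWrite": 0},
        "WB": {"RegWrite": 0, "MemtoReg": "X"},
    },
}


def pipeline_control(instr_class):
    """Return the control groups held in each pipeline register for one instruction."""
    groups = CONTROL[instr_class]
    return {
        "ID/EX": {"EX": groups["EX"], "M": groups["M"], "WB": groups["WB"]},
        "EX/MEM": {"M": groups["M"], "WB": groups["WB"]},
        "MEM/WB": {"WB": groups["WB"]},
    }


if __name__ == "__main__":
    for name, held in pipeline_control("lw").items():
        print(name, held)
```

Grouping the bits this way makes it clear why the pipeline registers get narrower towards the end: by the time an instruction reaches MEM/WB only RegWrite and MemtoReg are still needed.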
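Control Lines for final three stages
[Figure: the control unit decodes the instruction held in IF/ID and writes the EX, M and WB control groups into ID/EX; EX/MEM carries M and WB; MEM/WB carries WB.]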
Updated Datapath
[Figure: pipelined datapath with the control unit added. The EX, M and WB control groups are carried in ID/EX, EX/MEM and MEM/WB; the signals shown are PCSrc, Branch, RegWrite, MemWrite, MemRead, ALUSrc, ALUOp, RegDst and MemtoReg.]
Pipeline Hazards

Situations in pipelining where the next instruction cannot execute in the following clock cycle
Structural Hazards



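Hardware cannot support the combination of instructions we want to execute in the same cycle
E.g. if we used a washer/dryer combination, or the flatmate was doing something else
E.g. MIPS with only a single memory available (an instruction fetch and a data access would need it in the same cycle)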
Control Hazards


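Need to make a decision based on the results of one instruction while it is still executing
Example – The branch instruction
  beq $7, $8, label
[Figure: beq passes through IF, Decode and Execute; the instruction to fetch after it (marked "?") is not known until the branch has been resolved.]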
Control Hazards

Can either:
  Stall
    Bubble(s) inserted into pipeline
  Predict
    Predict outcome of branch
    Stall if prediction wrong
  Delay branch
    Execute instruction after branch regardless of whether branch is taken or not
Branch Prediction
[Figure: branches predicted as not taken, 2 ns per stage. Top: add $4, $5, $6; beq $1, $2, 40; lw $3, 300($0). The branch is not taken and lw follows 2 ns after beq. Bottom: add $4, $5, $6; beq $1, $2, 40; or $7, $8, $9. The branch is taken, a bubble is inserted and or starts 4 ns after beq.]
Branch Prediction

Assume all branches will fail (are not taken)
Assume backwards branches always succeed while forward branches always fail
Use a dynamic branch predictor
  Uses a table to record the outcomes of the previous times branch instructions were executed
  Gets about 90% accuracy
Longer pipelines amplify the problem
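A dynamic predictor of the simplest kind can be sketched in a few lines. The version below is an assumption made for illustration, not the scheme of any particular processor: it keeps one entry per branch address, remembers only the last outcome, and predicts "not taken" for a branch it has never seen. The class and function names are hypothetical.

```python
class LastOutcomePredictor:
    """One-bit predictor: predict that a branch does what it did last time."""

    def __init__(self):
        self.table = {}  # branch address -> last outcome (True = taken)

    def predict(self, branch_pc):
        # Unseen branches default to "not taken".
        return self.table.get(branch_pc, False)

    def update(self, branch_pc, taken):
        self.table[branch_pc] = taken


def accuracy(predictor, trace):
    """Fraction of correct predictions over a trace of (branch_pc, taken) pairs."""
    correct = 0
    for branch_pc, taken in trace:
        if predictor.predict(branch_pc) == taken:
            correct += 1
        predictor.update(branch_pc, taken)
    return correct / len(trace)


if __name__ == "__main__":
    # A loop-closing branch at a made-up address: taken 9 times, then falls through.
    loop_trace = [(0x40, True)] * 9 + [(0x40, False)]
    print(accuracy(LastOutcomePredictor(), loop_trace))
```

On this loop the predictor is wrong on the first iteration and on the exit, which is why real tables often use more than one bit per entry.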
Delayed Branches

Used on MIPS R3000 processors
[Figure: delayed branch, 2 ns per stage. beq $1, $2, 40 is followed by add $4, $5, $6 in the delayed branch slot and then lw $3, 300($0); each instruction starts 2 ns after the previous one.]
Data Hazards


Instruction depends on the outcome of a previous instruction still in the pipeline
E.g. add $s0, $t0, $t1
     sub $t2, $s0, $t3
[Figure: add $s0, $t0, $t1 and sub $t2, $s0, $t3 each pass through IF, Decode, Execute, mem and writeback, one stage apart; sub reads $s0 in Decode before add has written it back.]
Forwarding
[Figure: add $s0, $t0, $t1 followed by sub $t2, $s0, $t3, each through IF, ID, EX, MEM and WB at 2 ns per stage; the result of add is forwarded from the output of its EX stage to the input of sub's EX stage.]
Forwarding
[Figure: lw $s0, 20($t1) followed by sub $t2, $s0, $t3. The loaded value is only available after lw's MEM stage, so a bubble delays sub by one stage even with forwarding.]
Data Hazards and Forwarding

Consider execution of the following sequence of instructions:
  sub $2, $1, $3
  and $12, $2, $5
  or  $13, $6, $2
  add $14, $2, $2
  sw  $15, 100($2)
Pipelined Dependencies
Time (in clock cycles):  CC 1  CC 2  CC 3  CC 4  CC 5    CC 6  CC 7  CC 8  CC 9
Value of register $2:    10    10    10    10    10/-20  -20   -20   -20   -20
[Figure: sub $2, $1, $3; and $12, $2, $5; or $13, $6, $2; add $14, $2, $2; sw $15, 100($2) in program execution order, each passing through IM, Reg, DM and Reg one clock cycle behind the previous instruction.]
Time (in clock cycles):  CC 1  CC 2  CC 3  CC 4  CC 5    CC 6  CC 7  CC 8  CC 9
Value of register $2:    10    10    10    10    10/-20  -20   -20   -20   -20
Value of EX/MEM:         X     X     X     -20   X       X     X     X     X
Value of MEM/WB:         X     X     X     X     -20     X     X     X     X
[Figure: the same five instructions (sub, and, or, add, sw) in program execution order through IM, Reg, DM and Reg, with the values held in EX/MEM and MEM/WB shown for each clock cycle.]
Hazard notation

In our previous example the two pairs of hazard conditions are
  1a. EX/Mem.RegisterRd = ID/EX.RegisterRs
  1b. EX/Mem.RegisterRd = ID/EX.RegisterRt
  2a. Mem/WB.RegisterRd = ID/EX.RegisterRs
  2b. Mem/WB.RegisterRd = ID/EX.RegisterRt
Hazard on $2 between the sub and the and instructions is:
  EX/Mem.RegisterRd = ID/EX.RegisterRs = $2
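The four conditions are plain register-number comparisons, so they can be checked mechanically. The helper below is a hypothetical illustration: it takes the Rd fields held in EX/MEM and MEM/WB and the Rs and Rt fields held in ID/EX, and returns the labels of the conditions that hold. None is used here to mean that a stage holds no destination register.

```python
def hazard_conditions(ex_mem_rd, mem_wb_rd, id_ex_rs, id_ex_rt):
    """Return the labels (1a, 1b, 2a, 2b) of the hazard conditions that hold."""
    found = []
    if ex_mem_rd is not None and ex_mem_rd == id_ex_rs:
        found.append("1a")
    if ex_mem_rd is not None and ex_mem_rd == id_ex_rt:
        found.append("1b")
    if mem_wb_rd is not None and mem_wb_rd == id_ex_rs:
        found.append("2a")
    if mem_wb_rd is not None and mem_wb_rd == id_ex_rt:
        found.append("2b")
    return found


if __name__ == "__main__":
    # and $12, $2, $5 in EX while sub $2, $1, $3 is in MEM.
    print(hazard_conditions(ex_mem_rd=2, mem_wb_rd=None, id_ex_rs=2, id_ex_rt=5))
    # or $13, $6, $2 in EX while and is in MEM and sub is in WB.
    print(hazard_conditions(ex_mem_rd=12, mem_wb_rd=2, id_ex_rs=6, id_ex_rt=2))
    # add $14, $2, $2 in EX while or is in MEM and and is in WB.
    print(hazard_conditions(ex_mem_rd=13, mem_wb_rd=12, id_ex_rs=2, id_ex_rt=2))
```

The sub/and pair matches 1a and the sub/or pair matches 2b. By the time add reaches EX, sub has already written $2, so none of the conditions fire.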
Hazards

The above notation will forward unnecessarily, as some instructions do not write registers
  Soln: check the WB control field (RegWrite) to see whether the instruction writes a register
What about $0
  Do not forward, as $0 should always be zero
[Figure: pipelined dependences for the sub, and, or, add, sw sequence over CC 1 to CC 9, showing the value of register $2 (10 until sub writes -20 in CC 5) together with the values of EX/MEM (-20 in CC 4) and MEM/WB (-20 in CC 5).]
No Forwarding
[Figure a: no forwarding. The register values held in ID/EX feed the ALU; the result passes through EX/MEM, the data memory and MEM/WB to the write-back mux.]
Forwarding
[Figure b: with forwarding. Muxes controlled by ForwardA and ForwardB sit in front of the two ALU inputs; the forwarding unit takes Rs, Rt and Rd from ID/EX together with EX/MEM.RegisterRd and MEM/WB.RegisterRd.]
EX Hazard


If (EX/Mem.RegWrite) and
(EX/Mem.RegRd != 0) and
(EX/Mem.RegRd = ID/EX.RegRs) then
ForwardA = 10
If (EX/Mem.RegWrite) and
(EX/Mem.RegRd != 0) and
(EX/Mem.RegRd = ID/EX.RegRt) then
ForwardB = 10
Mem Hazard

If (Mem/WB.RegWrite) and
(Mem/WB.RegRd != 0) and
(Mem/WB.RegRd = ID/EX.RegRs) then
ForwardA = 01
If (Mem/WB.RegWrite) and
(Mem/WB.RegRd != 0) and
(Mem/WB.RegRd = ID/EX.RegRt) then
ForwardB = 01
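What about:
add $1, $1, $2
add $1, $1, $3
add $1, $1, $4
Both EX/Mem and Mem/WB hold a result for $1; the most recent one (EX/Mem) must be used, so the Mem hazard test needs an extra condition: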

If (Mem/WB.RegWrite) and
(Mem/WB.RegRd != 0) and
(EX/Mem.RegRd != ID/EX.RegRs) and
(Mem/WB.RegRd = ID/EX.RegRs) then
ForwardA = 01
If (Mem/WB.RegWrite) and
(Mem/WB.RegRd != 0) and
(EX/Mem.RegRd != ID/EX.RegRt) and
(Mem/WB.RegRd = ID/EX.RegRt) then
ForwardB = 01
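The EX and Mem hazard tests can be combined into a small forwarding unit. This is a sketch under a few assumptions: the StageReg record and the function names are invented for illustration, a select value of 00 is taken to mean "no forwarding, use the register file value", and the EX hazard is simply tested first so that the most recent result wins. That ordering is slightly stricter than the extra condition written above, because it also requires EX/Mem.RegWrite before the EX/Mem result takes priority.

```python
from collections import namedtuple

# Hypothetical record of the fields the forwarding unit reads from a pipeline register.
StageReg = namedtuple("StageReg", ["reg_write", "rd"])


def forward_select(ex_mem, mem_wb, src):
    """Return the 2-bit mux select for one ALU operand whose source register is src."""
    if ex_mem.reg_write and ex_mem.rd != 0 and ex_mem.rd == src:
        return 0b10  # EX hazard: take the ALU result held in EX/MEM
    if mem_wb.reg_write and mem_wb.rd != 0 and mem_wb.rd == src:
        return 0b01  # Mem hazard: take the value held in MEM/WB
    return 0b00      # assumed meaning: no forwarding


def forwarding_unit(ex_mem, mem_wb, id_ex_rs, id_ex_rt):
    """Return (ForwardA, ForwardB) for the instruction currently in EX."""
    return (forward_select(ex_mem, mem_wb, id_ex_rs),
            forward_select(ex_mem, mem_wb, id_ex_rt))


if __name__ == "__main__":
    # Third add of: add $1,$1,$2 / add $1,$1,$3 / add $1,$1,$4
    ex_mem = StageReg(reg_write=True, rd=1)   # second add
    mem_wb = StageReg(reg_write=True, rd=1)   # first add
    forward_a, forward_b = forwarding_unit(ex_mem, mem_wb, id_ex_rs=1, id_ex_rt=4)
    print(format(forward_a, "02b"), format(forward_b, "02b"))
```

For the third add, ForwardA selects the EX/MEM value even though MEM/WB also holds a result for $1, which is the behaviour the extra condition is there to guarantee.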
Updated datapath to resolve data hazards
[Figure: pipelined datapath with the forwarding unit added. IF/ID.RegisterRs, IF/ID.RegisterRt and IF/ID.RegisterRd are carried into ID/EX as Rs, Rt and Rd; the forwarding unit also takes EX/MEM.RegisterRd and MEM/WB.RegisterRd and controls the muxes on the ALU inputs.]
Data Hazards and Stalls

Forwarding will not work where an instruction tries to read a
register following a load instruction that writes the same register
[Figure: lw $2, 20($1); and $4, $2, $5; or $8, $2, $6; add $9, $4, $2; slt $1, $6, $7 in program execution order over CC 1 to CC 10. A bubble is inserted after lw, delaying and and the instructions behind it by one clock cycle.]
Stalls

Hazard detection unit required at the ID stage

If (ID/EX.MemRead) and
((ID/EX.RegRt = IF/ID.RegRs) or
(ID/EX.RegRt = IF/ID.RegRt)) then
stall the pipeline
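The stall test is a single boolean expression. The sketch below writes it as a function and applies it to the lw, and, or, add, slt sequence from the stall example. The tuple layout (name, rs, rt, mem_read) and the helper count_load_use_stalls are assumptions made for this illustration; only adjacent instruction pairs are checked, since after one bubble the load has moved far enough ahead for forwarding to work.

```python
def must_stall(id_ex_mem_read, id_ex_rt, if_id_rs, if_id_rt):
    """True when the instruction in ID reads the register a load in EX will write."""
    return bool(id_ex_mem_read) and id_ex_rt in (if_id_rs, if_id_rt)


def count_load_use_stalls(program):
    """Count the bubbles needed for a list of (name, rs, rt, mem_read) tuples."""
    stalls = 0
    for previous, current in zip(program, program[1:]):
        _, _, prev_rt, prev_mem_read = previous
        _, cur_rs, cur_rt, _ = current
        if must_stall(prev_mem_read, prev_rt, cur_rs, cur_rt):
            stalls += 1
    return stalls


if __name__ == "__main__":
    program = [
        ("lw $2, 20($1)", 1, 2, True),    # rt = $2 is the load destination
        ("and $4, $2, $5", 2, 5, False),
        ("or $8, $2, $6", 2, 6, False),
        ("add $9, $4, $2", 4, 2, False),
        ("slt $1, $6, $7", 6, 7, False),
    ]
    print(count_load_use_stalls(program))
```

Only the and immediately after the lw triggers the test; the or and add also read $2, but by then the loaded value can be forwarded.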