Training - Ann Gordon-Ross


EEL-4713
Computer Architecture
Designing a Multiple-Cycle Processor
1 EEL-4713 Ann Gordon - Ross
Outline of today’s lecture
°Recap and Introduction
°Introduction to the Concept of Multiple Cycle Processor
°Multiple Cycle Implementation of R-type Instructions
°What is a Multiple Cycle Delay Path and Why is it Bad?
°Multiple Cycle Implementation of Or Immediate
°Multiple Cycle Implementation of Load and Store
°Putting it all Together
Abstract view of our single cycle processor
[Figure: block diagram of the single-cycle processor. Stages: Next PC / PC, Instruction Fetch, Register Fetch, Ext, ALU, Mem Access (Data Mem), Reg. Wrt / Result Store. Main Control (input: op) and ALU control (input: fun) drive the signals nPC_sel, ExtOp, ALUSrc, ALUctr, MemRd, MemWr, RegDst, RegWr; Equal is fed back from the datapath.]
°looks like an FSM with PC as state
What’s wrong with our CPI=1 processor?
[Figure: delay paths through the single-cycle datapath for four instruction classes: Arithmetic & Logical, Load, Store, and Branch. Each path starts at the PC and instruction memory and continues through the register file, muxes, and the ALU (cmp for Branch); Load and Store also pass through data memory. The Load path is marked as the Critical Path.]
°Long Cycle Time
°All instructions take as much time as the slowest
°Real memory is not as nice as our idealized memory
• cannot always get the job done in one (short) cycle
Drawbacks of this single cycle processor
°Long cycle time:
• Cycle time is set by the slowest instruction (load), so it is much longer than every other instruction needs.
Examples:
• R-type instructions do not require data memory access
• Jump requires neither an ALU operation nor a data memory access
°Need for multiple functional units
• Can’t share a functional unit among multiple operations in the same instruction
• E.g., instruction/data memory, adders (PC, ALU, branch target, etc.)
Overview of a multiple cycle implementation
°The root of the single cycle processor’s problems:
• The cycle time has to be long enough for the slowest instruction
°Solution:
• Break the instruction into smaller steps
• Execute each step (instead of the entire instruction) in one cycle
- Cycle time: time it takes to execute the longest step
- All the steps have similar length
• This is the essence of the multiple cycle processor
°The advantages of the multiple cycle processor:
• Cycle time is much shorter
• Different instructions take different numbers of cycles to complete (for now)
- Load takes five cycles
- Jump only takes three cycles
• Allows a functional unit to be used more than once per instruction
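The trade-off above (shorter cycle, more cycles per instruction) can be checked with a little arithmetic. The sketch below is only illustrative: the cycle counts are read off this lecture's control state diagram, while the instruction mix and both cycle times are hypothetical numbers chosen for the example.

```python
# Cycles per instruction class, counted from this lecture's control state diagram.
CYCLES = {"lw": 5, "sw": 4, "rtype": 4, "ori": 4, "beq": 3}

# Hypothetical dynamic instruction mix (fractions sum to 1.0); not from the lecture.
MIX = {"lw": 0.25, "sw": 0.10, "rtype": 0.40, "ori": 0.10, "beq": 0.15}


def average_cpi(cycles, mix):
    """Weighted average number of cycles per instruction."""
    return sum(cycles[op] * fraction for op, fraction in mix.items())


def time_per_instruction_ns(cpi, cycle_time_ns):
    """Average execution time per instruction in nanoseconds."""
    return cpi * cycle_time_ns


# Hypothetical cycle times: one long 10 ns cycle vs. short 2 ns steps.
single_cycle_ns = time_per_instruction_ns(1.0, 10.0)
multi_cycle_ns = time_per_instruction_ns(average_cpi(CYCLES, MIX), 2.0)
print(f"single-cycle: {single_cycle_ns:.2f} ns, multi-cycle: {multi_cycle_ns:.2f} ns")
```

Note that the multiple cycle design only wins if the step time is short enough: with an average CPI a little above 4, a step time above roughly a quarter of the single-cycle time would make it slower under this mix.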
What and when:
°When designing multi-cycle implementations, you must think about:
• What to do on each cycle
• When results are ready for next cycle
°What to do on each cycle:
• Always need to fetch instruction
• Always need to decode instruction (know what to do next)
• Next need to perform actual operation (varies from instruction to
instruction)
• E.g.:
- Load will require address calculation, memory read, reg write
- Branch will require comparison and PC update
- R-type will require ALU operation, reg write
Overview: Control State Diagram
[Figure: control state diagram. The states, the instruction class that selects each branch, and the action taken in each state are:]
°Ifetch: fetch and store in IR; PC=PC+4
°Rfetch/Decode: decode operation; index registers; register operands in busA, busB
°AdrCal (lw or sw): ALU computes rs+imm
• LWmem (lw): read mem. with address calculated above
• LWwr: latch data read from memory in rt
• SWMem (sw): write rt to memory at the address calculated above
°RExec (Rtype): ALU has inputs and control set, can compute
• Rfinish: latch ALU computed result in rd
°OriExec (Ori): ALU has inputs and control set, can compute
• OriFinish: latch ALU computed result in rt
°BrComplete (beq): finish branch; calculate the target address, select mux, write PC
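As a rough illustration of how the state diagram determines the number of cycles per instruction, the sketch below walks the diagram for one instruction class. The dictionary layout, the `"*"` marker for unconditional edges, and the lowercase class names are choices made for this example, not part of the lecture.

```python
# Transitions of the overview state diagram: state -> {instruction class: next state}.
# "*" marks an unconditional transition. States without an entry return to Ifetch.
TRANSITIONS = {
    "Ifetch": {"*": "Rfetch/Decode"},
    "Rfetch/Decode": {"lw": "AdrCal", "sw": "AdrCal", "rtype": "RExec",
                      "beq": "BrComplete", "ori": "OriExec"},
    "AdrCal": {"lw": "LWmem", "sw": "SWMem"},
    "LWmem": {"*": "LWwr"},
    "RExec": {"*": "Rfinish"},
    "OriExec": {"*": "OriFinish"},
}


def state_sequence(instr):
    """List the states visited by one instruction, starting at Ifetch."""
    state, path = "Ifetch", ["Ifetch"]
    while state in TRANSITIONS:
        edges = TRANSITIONS[state]
        state = edges.get("*") or edges[instr]
        path.append(state)
    return path


for instr in ("lw", "sw", "rtype", "ori", "beq"):
    print(instr, len(state_sequence(instr)), state_sequence(instr))
```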
Example: the five steps of a Load instruction
[Figure: timing diagram of a load, split into five steps: Instruction Fetch, Instr Decode / Reg. Fetch, Address, Data Memory, Reg Wr. Signal traces (each changing from Old Value to New Value): Clk, PC, the instruction fields (Rs, Rt, Rd, Op, Func), ALUctr, ExtOp, ALUSrc, RegWr, busA, busB, Address, busW. Delays marked: Clk-to-Q, Instruction Memory Access Time, Delay through Control Logic, Register File Access Time, Delay through Extender & Mux, ALU Delay, Data Memory Access Time, Register File Write Time.]
Multicycle datapath
°Similar to the single-cycle datapath; uses latches for the instruction (Instruction Reg) and the branch target address (Target)
°Control signals are generated by an FSM over multiple clock cycles per instruction
[Figure: multicycle datapath. Blocks: PC, Ideal Memory (RAdr, WrAdr, Din, Dout), Instruction Reg, Reg File (Ra, Rb, Rw, busA, busB, busW), Extend, << 2, ALU with ALU Control, Target register, and muxes selected by IorD, RegDst, ALUSelA, ALUSelB, and PCSrc. Control takes Op and Func (6 bits each) and dispatches on Beq, Rtype, Ori, Memory. Signal values shown: PCWr=0, PCWrCond=0, MemWr=0, IRWr=0, RegWr=0, BrWr=1, ExtOp=1, ALUSelA=0, ALUSelB=10, ALUOp=Add, RegDst=x, IorD=x, PCSrc=x.]
Cycle 1: Fetch
[Figure: control state diagram; the state for this cycle is Ifetch.]
1 Instruction fetch cycle: beginning
°Every cycle begins right AFTER the clock tick:
• The datapath starts computing mem[PC] and PC<31:0> + 4
[Figure: clock waveform with "You are here!" marking the start of one "logic" clock cycle, and the fetch portion of the datapath (PC, Ideal Memory, Instruction Reg, ALU with ALU Control), annotated "Use the main ALU". Control values are still open: PCWr=?, IRWr=?, MemWr=?, ALUop=?]
1 Instruction fetch cycle: end
°Every cycle ends AT the next clock tick (storage element updates):
• IR <-- mem[PC]
• PC<31:0> <-- PC<31:0> + 4
[Figure: clock waveform with "You are here!" marking the end of one "logic" clock cycle, and the fetch portion of the datapath (PC, Ideal Memory, Instruction Reg, ALU with ALU Control) with PCWr=1, IRWr=1, MemWr=0, ALUOp=Add.]
1 Instruction Fetch Cycle: Overall Picture
Ifetch: fetch and store in IR; PC=PC+4
1. Latch IR=mem[PC]
2. Set PC=PC+4
Control signals in this state: ALUOp=Add; 1: PCWr, IRWr; x: PCWrCond, RegDst, MemtoReg; others: 0s
[Figure: multicycle datapath annotated with PCWr=1, PCWrCond=x, IRWr=1, MemWr=0, IorD=0, ALUSelA=0, ALUSelB=00, ALUOp=Add, PCSrc=0, BrWr=0.]
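The two register transfers of the fetch step can be sketched in a few lines. This is a minimal model under stated assumptions: memory is a Python dict keyed by byte address (a stand-in for the slide's Ideal Memory), and the function name and sample instruction word are hypothetical.

```python
def instruction_fetch(pc, memory):
    """One Ifetch step: return the values latched at the clock tick (IR, new PC)."""
    ir = memory[pc]                  # IRWr=1, IorD=0: IR <- mem[PC]
    new_pc = (pc + 4) & 0xFFFFFFFF   # PCWr=1, ALUOp=Add: PC <- PC + 4
    return ir, new_pc


# Hypothetical memory image: one instruction word at address 0.
memory = {0x0000: 0x8C220004}
ir, pc = instruction_fetch(0x0000, memory)
print(hex(ir), hex(pc))
```

Both results are computed during the cycle from the old PC and only take effect at the next clock tick, which is why the function returns them together instead of updating state in place.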
Cycle 2: Register fetch, decode
[Figure: control state diagram; the state for this cycle is Rfetch/Decode.]
2 Register Fetch / Instruction Decode
°busA <- RegFile[rs] ; busB <- RegFile[rt] ;
°The ALU can also be used in this cycle to compute the branch target address (next slide) and latch it in Target; the register compare is done in the next cycle
[Figure: multicycle datapath with Op and Func going to Control, annotated with PCWr=0, PCWrCond=0, MemWr=0, IRWr=0, RegWr=0, ALUSelA=x, ALUSelB=xx, ALUOp=xx, RegDst=x, IorD=x, PCSrc=x.]
2 Register Fetch / Instruction Decode (Continued)
Rfetch/Decode: decode operation; index registers; register operands in busA, busB
°busA <- Reg[rs] ; busB <- Reg[rt] ;
°Target <- PC + SignExt(Imm16)*4
°Generate control signals
Control signals in this state: ALUOp=Add; 1: BrWr, ExtOp; ALUSelB=10; x: RegDst, PCSrc, IorD, MemtoReg; others: 0s
[Figure: multicycle datapath; Control dispatches on Beq, Rtype, Ori, Memory. Annotated with PCWr=0, PCWrCond=0, MemWr=0, IRWr=0, RegWr=0, BrWr=1, ExtOp=1, ALUSelA=0, ALUSelB=10, ALUOp=Add, RegDst=x, IorD=x, PCSrc=x.]
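The target computation `Target <- PC + SignExt(Imm16)*4` can be sketched as follows. The helper names are made up for this example; note that by this cycle the PC has already been incremented by the fetch step, so `pc` below stands for the address of the instruction after the branch.

```python
def sign_extend16(imm16):
    """Interpret the low 16 bits as a two's-complement value."""
    imm16 &= 0xFFFF
    return imm16 - 0x10000 if imm16 & 0x8000 else imm16


def branch_target(pc, imm16):
    """Target <- PC + SignExt(Imm16) * 4, wrapped to 32 bits."""
    return (pc + (sign_extend16(imm16) << 2)) & 0xFFFFFFFF


print(hex(branch_target(0x1004, 0x0003)))  # forward branch
print(hex(branch_target(0x1004, 0xFFFF)))  # backward branch by one word
```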
Cycle 3: Branch completion
[Figure: control state diagram; the state for this cycle is BrComplete.]
3 Branch Completion
BrComplete: finish branch; calculate the target address, select mux, write PC
°if (busA == busB)
• PC <- Target
Control signals in this state: ALUOp=Sub; ALUSelB=01; 1: PCWrCond, ALUSelA, PCSrc; x: IorD, MemtoReg, RegDst, ExtOp
[Figure: multicycle datapath annotated with PCWr=0, PCWrCond=1, MemWr=0, IRWr=0, RegWr=0, BrWr=0, ALUSelA=1, ALUSelB=01, ALUOp=Sub, PCSrc=1, ExtOp=x, RegDst=x, IorD=x.]
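A small sketch of the branch completion step follows. It assumes the PC write enable is formed as `PCWr OR (PCWrCond AND Zero)`; the slides show these three signals near the PC but the exact gating is an assumption here, and the function name is hypothetical.

```python
def branch_complete(pc, target, bus_a, bus_b, pc_wr=0, pc_wr_cond=1):
    """BrComplete step: the ALU subtracts, and the PC is written only if Zero is set."""
    zero = ((bus_a - bus_b) & 0xFFFFFFFF) == 0        # ALUOp=Sub, ALUSelA=1, ALUSelB=01
    write_pc = bool(pc_wr or (pc_wr_cond and zero))   # assumed gating of the PC write
    return target if write_pc else pc                 # PCSrc=1 selects Target


print(hex(branch_complete(0x1004, 0x2000, 7, 7)))  # taken
print(hex(branch_complete(0x1004, 0x2000, 7, 8)))  # not taken
```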
Cycle 3: Rtype execution
[Figure: control state diagram; the state for this cycle is RExec.]
3 R-type Execution
RExec: ALU has inputs and control set, can compute
°ALU Output <- busA op busB
Control signals in this state: 1: RegDst, ALUSelA; ALUSelB=01; ALUOp=Rtype; x: PCSrc, IorD, MemtoReg, ExtOp
[Figure: multicycle datapath (this version also shows a Jump / JumpAddr input, and ALU Control takes Funct) annotated with PCWr=0, PCWrCond=0, MemWr=0, IRWr=0, RegWr=0, BrWr=0, ALUSelA=1, ALUSelB=01, ALUOp=Rtype, RegDst=1, ExtOp=x, MemtoReg=x, PCSrc=x, IorD=x.]
Cycle 4: Rtype completion
[Figure: control state diagram; the state for this cycle is Rfinish.]
4 R-type Completion
Rfinish: latch ALU computed result in rd
°R[rd] <- ALU Output
Control signals in this state: ALUOp=Rtype; 1: RegDst, RegWr, ALUSelA; ALUSelB=01; x: IorD, PCSrc, ExtOp
[Figure: multicycle datapath annotated with PCWr=0, PCWrCond=0, MemWr=0, IRWr=0, RegWr=1, BrWr=0, ALUSelA=1, ALUSelB=01, ALUOp=Rtype, RegDst=1, MemtoReg=0, ExtOp=x, PCSrc=x, IorD=x.]
Cycle 3: Ori execution
[Figure: control state diagram; the state for this cycle is OriExec.]
3 Ori Execution
OriExec
°ALU output <- busA or ZeroExt[Imm16]
Control signals in this state: ALUOp=Or; 1: ALUSelA; ALUSelB=11; x: MemtoReg, IorD, PCSrc
[Figure: multicycle datapath annotated with PCWr=0, PCWrCond=0, MemWr=0, IRWr=0, RegWr=0, BrWr=0, ALUSelA=1, ALUSelB=11, ALUOp=Or, ExtOp=0, RegDst=0, MemtoReg=x, IorD=x, PCSrc=x.]
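The Ori execution step differs from the address calculation only in the extender mode (ExtOp=0, zero extension) and the ALU operation. A minimal sketch, with hypothetical helper names:

```python
def zero_extend16(imm16):
    """ExtOp=0: the upper 16 bits of the extended immediate are zero."""
    return imm16 & 0xFFFF


def ori_exec(bus_a, imm16):
    """ALU output <- busA or ZeroExt[Imm16]."""
    return (bus_a | zero_extend16(imm16)) & 0xFFFFFFFF


# Bit 15 of the immediate is set, but it is not propagated into the upper half.
print(hex(ori_exec(0x12340000, 0x8001)))
```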
Cycle 4: Ori completion
[Figure: control state diagram; the state for this cycle is OriFinish.]
4 Ori Completion
OriFinish: latch ALU computed result in rt
°Reg[rt] <- ALU output
Control signals in this state: ALUOp=Or; 1: ALUSelA, RegWr; ALUSelB=11; x: IorD, PCSrc
[Figure: multicycle datapath annotated with PCWr=0, PCWrCond=0, MemWr=0, IRWr=0, RegWr=1, BrWr=0, ALUSelA=1, ALUSelB=11, ALUOp=Or, ExtOp=0, RegDst=0, MemtoReg=0, IorD=x, PCSrc=x.]
Cycle 3: Address calculation
[Figure: control state diagram; the state for this cycle is AdrCal.]
3 Memory Address Calculation
AdrCal: ALU computes rs+imm
°ALU output <- busA + SignExt[Imm16]
Control signals in this state: 1: ExtOp, ALUSelA; ALUSelB=11; ALUOp=Add; x: MemtoReg, PCSrc
[Figure: multicycle datapath annotated with PCWr=0, PCWrCond=0, MemWr=0, IRWr=0, RegWr=0, BrWr=0, ALUSelA=1, ALUSelB=11, ALUOp=Add, ExtOp=1, RegDst=x, MemtoReg=x, IorD=x, PCSrc=x.]
Cycle 4: Memory access, store
[Figure: control state diagram; the state for this cycle is SWMem.]
4 Memory Access for Store
SWMem: write rt to memory at the address calculated above
°mem[ALU output] <- busB
Control signals in this state: 1: ExtOp, MemWr, ALUSelA; ALUSelB=11; ALUOp=Add; x: PCSrc, RegDst, MemtoReg
[Figure: multicycle datapath annotated with PCWr=0, PCWrCond=0, MemWr=1, IRWr=0, RegWr=0, BrWr=0, ALUSelA=1, ALUSelB=11, ALUOp=Add, ExtOp=1, RegDst=x, MemtoReg=x, IorD=x, PCSrc=x.]
Cycle 4: Memory access, load
[Figure: control state diagram; the state for this cycle is LWmem.]
4 Memory Access for Load
LWmem: read mem. with address calculated above
°Mem Dout <- mem[ALU output]
Control signals in this state: 1: ExtOp, ALUSelA, IorD; ALUSelB=11; ALUOp=Add; x: MemtoReg, PCSrc
[Figure: multicycle datapath annotated with PCWr=0, PCWrCond=0, MemWr=0, IRWr=0, RegWr=0, BrWr=0, ALUSelA=1, ALUSelB=11, ALUOp=Add, ExtOp=1, IorD=1, RegDst=0, MemtoReg=x, PCSrc=x.]
Cycle 5: Write back, load
[Figure: control state diagram; the state for this cycle is LWwr.]
5 Write Back for Load
LWwr: latch data read from memory in rt
°Reg[rt] <- Mem Dout
Control signals in this state: 1: ALUSelA, RegWr, ExtOp, MemtoReg; ALUSelB=11; ALUOp=Add; x: PCSrc, IorD
[Figure: multicycle datapath annotated with PCWr=0, PCWrCond=0, MemWr=0, IRWr=0, RegWr=1, BrWr=0, ALUSelA=1, ALUSelB=11, ALUOp=Add, ExtOp=1, MemtoReg=1, RegDst=0, IorD=x, PCSrc=x.]
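To tie the five load steps together, here is a small behavioral walk-through of one `lw`. It is a sketch, not a model of the datapath: memory is a dict keyed by byte address, the register file is a list, and the function name is hypothetical. The field positions (rs in bits 25-21, rt in bits 20-16, immediate in bits 15-0) are the standard MIPS I-format.

```python
def sign_extend16(imm16):
    imm16 &= 0xFFFF
    return imm16 - 0x10000 if imm16 & 0x8000 else imm16


def run_lw(pc, regs, memory):
    """Execute one lw step by step; returns the new PC and the states visited."""
    steps = []
    ir = memory[pc]                                          # 1 Ifetch: IR <- mem[PC]
    pc = (pc + 4) & 0xFFFFFFFF                               #           PC <- PC + 4
    steps.append("Ifetch")
    rs, rt, imm16 = (ir >> 21) & 0x1F, (ir >> 16) & 0x1F, ir & 0xFFFF
    bus_a = regs[rs]                                         # 2 Rfetch/Decode
    steps.append("Rfetch/Decode")
    address = (bus_a + sign_extend16(imm16)) & 0xFFFFFFFF    # 3 AdrCal
    steps.append("AdrCal")
    dout = memory[address]                                   # 4 LWmem
    steps.append("LWmem")
    regs[rt] = dout                                          # 5 LWwr: Reg[rt] <- Mem Dout
    steps.append("LWwr")
    return pc, steps


regs = [0] * 32
regs[1] = 0x100
memory = {0x0: 0x8C220004, 0x104: 0xDEADBEEF}   # lw $2, 4($1) and one data word
new_pc, steps = run_lw(0x0, regs, memory)
print(hex(new_pc), hex(regs[2]), steps)
```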
Putting it all together: Multiple Cycle Datapath
[Figure: the complete multiple cycle datapath. Blocks: PC, Ideal Memory (RAdr, WrAdr, Din, Dout), Instruction Reg, Reg File (Ra, Rb, Rw, busA, busB, busW), Extend, << 2, ALU with ALU Control, Target register, and muxes. Control signals: PCWr, PCWrCond, IorD, MemWr, IRWr, RegDst, RegWr, ExtOp, ALUSelA, ALUSelB, ALUOp, MemtoReg, PCSrc, BrWr. Status signal: Zero.]
Putting it all together: Control State Diagram
[Figure: control state diagram with the control signals of each state. "1:" lists signals set to 1 and "x:" lists don't-cares.]
°Ifetch: ALUOp=Add; 1: PCWr, IRWr; x: PCWrCond, RegDst, MemtoReg; others: 0s. Next: Rfetch/Decode
°Rfetch/Decode: ALUOp=Add; 1: BrWr, ExtOp; ALUSelB=10; x: RegDst, PCSrc, IorD, MemtoReg; others: 0s. Next: AdrCal (lw or sw), RExec (Rtype), BrComplete (beq), OriExec (Ori)
°AdrCal: 1: ExtOp, ALUSelA; ALUSelB=11; ALUOp=Add; x: MemtoReg, PCSrc. Next: LWmem (lw), SWMem (sw)
• LWmem: 1: ExtOp, ALUSelA, IorD; ALUSelB=11; ALUOp=Add; x: MemtoReg, PCSrc. Next: LWwr
• LWwr: 1: ALUSelA, RegWr, ExtOp, MemtoReg; ALUSelB=11; ALUOp=Add; x: PCSrc, IorD
• SWMem: 1: ExtOp, MemWr, ALUSelA; ALUSelB=11; ALUOp=Add; x: PCSrc, RegDst, MemtoReg
°RExec: 1: RegDst, ALUSelA; ALUSelB=01; ALUOp=Rtype; x: PCSrc, IorD, MemtoReg, ExtOp. Next: Rfinish
• Rfinish: ALUOp=Rtype; 1: RegDst, RegWr, ALUSelA; ALUSelB=01; x: IorD, PCSrc, ExtOp
°BrComplete: ALUOp=Sub; ALUSelB=01; 1: PCWrCond, ALUSelA, PCSrc; x: IorD, MemtoReg, RegDst, ExtOp
°OriExec: ALUOp=Or; 1: ALUSelA; ALUSelB=11; x: MemtoReg, IorD, PCSrc. Next: OriFinish
• OriFinish: ALUOp=Or; 1: ALUSelA, RegWr; ALUSelB=11; x: IorD, PCSrc
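One way to sanity-check a state diagram like this is to write its outputs down as a table and query it. The sketch below transcribes the "1:" signals and the ALUOp/ALUSelB values of each state; treating every unlisted signal as 0 is a simplification made here (the slides mark several of them as don't-cares), and the Ifetch value ALUSelB=00 is taken from the fetch-cycle slide.

```python
# Control outputs per state; signals not listed are treated as 0 in this sketch.
CONTROL = {
    "Ifetch":        {"ALUOp": "Add", "ALUSelB": "00", "PCWr": 1, "IRWr": 1},
    "Rfetch/Decode": {"ALUOp": "Add", "ALUSelB": "10", "BrWr": 1, "ExtOp": 1},
    "AdrCal":        {"ALUOp": "Add", "ALUSelB": "11", "ExtOp": 1, "ALUSelA": 1},
    "LWmem":         {"ALUOp": "Add", "ALUSelB": "11", "ExtOp": 1, "ALUSelA": 1, "IorD": 1},
    "LWwr":          {"ALUOp": "Add", "ALUSelB": "11", "ExtOp": 1, "ALUSelA": 1,
                      "RegWr": 1, "MemtoReg": 1},
    "SWMem":         {"ALUOp": "Add", "ALUSelB": "11", "ExtOp": 1, "ALUSelA": 1, "MemWr": 1},
    "RExec":         {"ALUOp": "Rtype", "ALUSelB": "01", "RegDst": 1, "ALUSelA": 1},
    "Rfinish":       {"ALUOp": "Rtype", "ALUSelB": "01", "RegDst": 1, "ALUSelA": 1, "RegWr": 1},
    "BrComplete":    {"ALUOp": "Sub", "ALUSelB": "01", "PCWrCond": 1, "ALUSelA": 1, "PCSrc": 1},
    "OriExec":       {"ALUOp": "Or", "ALUSelB": "11", "ALUSelA": 1},
    "OriFinish":     {"ALUOp": "Or", "ALUSelB": "11", "ALUSelA": 1, "RegWr": 1},
}


def states_asserting(signal):
    """Names of the states in which a one-bit control signal is set to 1."""
    return sorted(s for s, outputs in CONTROL.items() if outputs.get(signal) == 1)


print("RegWr:", states_asserting("RegWr"))
print("MemWr:", states_asserting("MemWr"))
```

A query like this makes it easy to confirm, for example, that state-changing writes (register file, memory, PC, IR) happen in exactly the states where an instruction finishes a step that must be latched.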
Note: there is a multiple-cycle delay path
°There is no register to save the results between:
• 2) Register Fetch: busA <- Reg[rs] ; busB <- Reg[rt]
• 3) R-type Execution: ALU output <- busA op busB
• 4) R-type Completion: Reg[rd] <- ALU output
[Figure: the Instruction Reg, Reg File, mux (ALUselA, ALUselB), and ALU portion of the datapath, with two candidate locations marked: "Register here to save outputs of Rfetch?" and "Register here to save outputs of RExec?"]
A Multiple Cycle Delay Path (Continued)
°Register is NOT needed to save the outputs of Register Fetch:
• IRWr = 0: busA and busB will not change after Register Fetch
°Register is NOT needed to save the outputs of R-type Execution:
• busA and busB will not change after Register Fetch
• Control signals ALUSelA, ALUSelB, and ALUOp
will not change after R-type Execution
• Consequently ALU output will not change after R-type Execution
°In theory, you need a register to hold a signal value if:
• (1) The signal is computed in one clock cycle and used in another.
• (2) AND the inputs to the functional block that computes this signal
can change before the signal is written into a state element.
°You can save a register if Cond 1 is true BUT Cond 2 is false:
• But in practice, this will introduce a multiple cycle delay path:
- A logic delay path that takes multiple cycles to propagate from one storage element to the next storage element
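The two-part rule above can be written as a one-line predicate. This is only a restatement of the slide's conditions; the function and argument names are invented for the example, and the two calls mirror the busA/busB and ALU-output cases discussed on this slide.

```python
def needs_register(used_in_later_cycle, inputs_can_change_before_use):
    """A holding register is required only when both conditions are true."""
    return used_in_later_cycle and inputs_can_change_before_use


# busA/busB: computed in Register Fetch and used later, but IRWr=0 keeps the
# instruction register (and therefore rs/rt) stable, so no register is required.
bus_a_b = needs_register(used_in_later_cycle=True, inputs_can_change_before_use=False)

# ALU output of R-type Execution: used in R-type Completion, but busA, busB,
# ALUSelA, ALUSelB and ALUOp stay the same, so again no register is required.
alu_out = needs_register(used_in_later_cycle=True, inputs_can_change_before_use=False)

print(bus_a_b, alu_out)
```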
Pros and Cons of a Multiple Cycle Delay Path
°A 3-cycle path example:
• IR (storage) -> Reg File Read -> ALU -> Reg File Write (storage)
°Advantages:
• Register savings
• We can share time among cycles:
- If the ALU takes longer than one cycle, that is still OK as long as the entire path takes less than 3 cycles to finish
[Figure: the Instruction Reg, Reg File, mux, and ALU portion of the datapath that forms this path.]
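The "share time among cycles" advantage can be illustrated numerically. In the sketch below the three stage delays and the cycle time are hypothetical, and clock-to-Q and setup times are ignored; the point is only that the path as a whole can meet a 3-cycle budget even when one stage alone exceeds a cycle.

```python
def path_meets_budget(stage_delays_ns, cycle_time_ns, cycles_allowed):
    """True if the whole path settles within the allowed number of cycles."""
    return sum(stage_delays_ns) <= cycles_allowed * cycle_time_ns


def every_stage_fits_one_cycle(stage_delays_ns, cycle_time_ns):
    """True if each stage would fit in its own cycle (registers between stages)."""
    return all(delay <= cycle_time_ns for delay in stage_delays_ns)


# Hypothetical delays (ns): Reg File Read, ALU, Reg File Write.
delays = [1.5, 2.6, 1.5]
cycle = 2.0

print(path_meets_budget(delays, cycle, cycles_allowed=3))
print(every_stage_fits_one_cycle(delays, cycle))
```

With these numbers the 3-cycle path is fine as a whole, while a design with a register after every stage would fail because the ALU stage alone is longer than one cycle.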
Pros and Cons of a Multiple Cycle Delay Path (Continued)
°Disadvantage:
• A static timing analyzer, which ONLY looks at the delay between two storage elements, will report this as a timing violation
• You have to ignore the static timing analyzer’s warnings
[Figure: the Instruction Reg, Reg File, mux, and ALU portion of the datapath.]
Summary
°Disadvantages of the Single Cycle Processor
• Long cycle time
• Cycle time is too long for all instructions except the Load
°Multiple Cycle Processor:
• Divide the instructions into smaller steps
• Execute each step (instead of the entire instruction) in one cycle
°Do NOT confuse Multiple Cycle Processor with multiple cycle delay path
• Multiple Cycle Processor executes each instruction in multiple clock cycles
• Multiple Cycle Delay Path: a combinational logic path between two storage elements that takes more than one clock cycle to complete
°It is possible (and desirable) to build a Multiple Cycle Processor without a Multiple Cycle Delay Path:
• Use a register to save a signal’s value whenever the signal is generated in one clock cycle and used in a later cycle
Control logic
°Review of Finite State Machine (FSM) control
°From Finite State Diagrams to Microprogramming
Overview
°Control may be designed using one of several initial representations.
The choice of sequence control, and how logic is represented, can then
be determined independently; the control can then be implemented with
one of several methods using a structured logic technique.
                          “hardwired control”            “microprogrammed control”
Initial Representation    Finite State Diagram           Microprogram
Sequencing Control        Explicit Next State Function   Microprogram counter + Dispatch ROMs
Logic Representation      Logic Equations                Truth Tables
Implementation Technique  PLA                            ROM
Initial Representation: Finite State Diagram
[Figure: the control state diagram (same states and control signals as before) with a number assigned to each state:]
0: Ifetch
1: Rfetch/Decode
2: AdrCal
3: LWmem
4: LWwr
5: SWMem
6: RExec
7: Rfinish
8: BrComplete
9: Jump (see Fig C.3.1)
10: OriExec
11: OriFinish
Sequencing Control: Explicit Next State Function
[Figure: a "Control Logic and Next State Logic" block. Inputs: the Opcode and the State Reg. Outputs: the control signals to the Multicycle Datapath and the next state, which is fed back to the State Reg.]
°Next state number is encoded just like datapath controls
Interface in detail
Logic Representation: Logic Equations
°Next state from current state
• State 0 -> State 1
• State 1 -> S2, S6, S8, S9, S10
• State 2 -> S3, S5
• State 3 -> State 4
• State 4 -> State 0
• State 5 -> State 0
• State 6 -> State 7
• State 7 -> State 0
• State 8 -> State 0
• State 9 -> State 0
• State 10 -> State 11
• State 11 -> State 0
°Alternatively, prior state & condition -> next state
S4, S5, S7, S8, S9, S11  -> State 0
State 0                  -> State 1
State 1 & op = lw|sw     -> State 2
State 2 & op = lw        -> State 3
State 3                  -> State 4
State 2 & op = sw        -> State 5
State 1 & op = R-type    -> State 6
State 6                  -> State 7
State 1 & op = beq       -> State 8
State 1 & op = jmp       -> State 9
State 1 & op = ori       -> State 10
State 10                 -> State 11
See Fig. C.3.3
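The next-state equations can be sketched as an explicit next-state function over the state numbers. The lowercase opcode names are placeholders for the decoded opcode, and the jump is dispatched from state 1 here, which is an assumption consistent with the three-cycle jump mentioned earlier in the lecture.

```python
def next_state(state, op=None):
    """Explicit next-state function for the 12-state controller (states 0..11)."""
    if state == 0:
        return 1
    if state == 1:   # dispatch on the opcode after decode
        return {"lw": 2, "sw": 2, "rtype": 6, "beq": 8, "jmp": 9, "ori": 10}[op]
    if state == 2:
        return {"lw": 3, "sw": 5}[op]
    if state in (3, 6, 10):          # LWmem, RExec, OriExec step to their finish state
        return state + 1
    if state in (4, 5, 7, 8, 9, 11):  # last state of every instruction
        return 0
    raise ValueError(f"unknown state {state}")


def cycles(op):
    """Number of clock cycles from Ifetch until the controller returns to state 0."""
    state, count = 0, 0
    while True:
        count += 1
        state = next_state(state, op)
        if state == 0:
            return count


for op in ("lw", "sw", "rtype", "ori", "beq", "jmp"):
    print(op, cycles(op))
```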
Multicycle Control
°Given the numbers assigned to the FSM states, the next state can be determined as a function of the inputs and the current state
°These can be turned into Boolean equations for each bit of the next-state lines
• Implement easily using a PLA (programmable logic array) or a ROM storing truth tables
• See Figs. C.3.6 and C.3.8 for tables showing outputs and next state as a function of current state and opcode
°What if there are many more states, many more conditions?
• State machine gets too large; very large ROMs/PLAs
°What if you need to add a state?
• May need to increase the address width of the ROM, or the number of inputs to the PLA gates
°Or just implement the FSM in VHDL
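To see why adding states can enlarge a ROM-based controller, the sketch below estimates the ROM size when the ROM is addressed by the current state and the 6-bit opcode. The function names are made up for this example, and the number of control output bits per word is deliberately left as a parameter because it depends on how ALUOp and the other fields are encoded.

```python
def state_bits(num_states):
    """Bits needed to encode num_states distinct states (at least 1)."""
    return max(1, (num_states - 1).bit_length())


def rom_words(num_states, opcode_bits=6):
    """The ROM is addressed by {current state, opcode}: one word per combination."""
    return 2 ** (state_bits(num_states) + opcode_bits)


def rom_total_bits(num_states, control_output_bits, opcode_bits=6):
    """Each word holds the control outputs plus the encoded next state."""
    word_width = control_output_bits + state_bits(num_states)
    return rom_words(num_states, opcode_bits) * word_width


print(state_bits(12), rom_words(12))   # the 12 states of this lecture
print(state_bits(17), rom_words(17))   # a 17th state needs one more address bit
```

Growing from 16 to 17 states doubles the number of ROM words even though only one state was added, which is the cost the last bullets point at.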