CS152 Computer Organization and Design

EEM 486: Computer Architecture
Lecture 3
Designing Single Cycle Control
The Big Picture: Where are We Now?
[Figure: the five components of a computer. The Processor contains Control and Datapath and connects to Memory, Input, and Output.]
An Abstract View of the Implementation
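[Figure: abstract view of the implementation. The PC supplies the Instruction Address to an Ideal Instruction Memory, which returns the 32-bit Instruction. The Rd, Rs, and Rt fields (5 bits each) drive the Rw, Ra, and Rb ports of the register file (32 32-bit registers). The register outputs A and B (32 bits each) feed the ALU, and an Ideal Data Memory has Data Address, Data In, and Data Out ports. Control receives the Instruction and the Conditions from the Datapath and produces the Control Signals. The PC, register file, and data memory are clocked by Clk, and Next Address logic updates the PC.]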
Recap: A Single Cycle Datapath
➢ We have everything except the control signals (underlined in the figure)
[Figure: the single-cycle datapath. Blocks: Instruction Fetch Unit, register file (32 32-bit registers; ports Rw, Ra, Rb; buses busA, busB, busW), Extender (imm16 to 32 bits), ALU with Zero output, and Data Memory (WrEn, Adr, Data In), all clocked by Clk. Instruction<31:0> supplies Rs <25:21>, Rt <20:16>, Rd <15:11>, and Imm16 <15:0>. Control signals: nPC_sel, RegDst, RegWr, ExtOp, ALUSrc, ALUctr, MemWr, MemtoReg.]
Recap: Meaning of the Control Signals
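➢ nPC_MUX_sel:
• 0 => PC <- PC + 4
• 1 => PC <- PC + 4 + (SignExt(imm16) || 00)
[Figure: instruction fetch unit. The PC (clocked by Clk) drives the Adr input of Inst Memory. One adder computes PC + 4; a second adder adds the sign-extended imm16 (PC Ext) with 00 appended. A mux controlled by nPC_MUX_sel selects the next PC.]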
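The two nPC_MUX_sel cases can be sketched in Python. This is only an illustrative model, assuming a 32-bit PC that wraps around; the helper names `sign_extend16` and `next_pc` are invented for this sketch and do not come from the lecture.

```python
MASK32 = 0xFFFFFFFF

def sign_extend16(imm16: int) -> int:
    """Interpret a 16-bit field as a signed two's-complement integer."""
    imm16 &= 0xFFFF
    return imm16 - 0x10000 if imm16 & 0x8000 else imm16

def next_pc(pc: int, imm16: int, npc_mux_sel: int) -> int:
    """Model the two adders and the mux of the instruction fetch unit."""
    pc_plus_4 = (pc + 4) & MASK32
    if npc_mux_sel == 0:
        return pc_plus_4
    # SignExt(imm16) || 00 is the signed word offset shifted left by two bits.
    return (pc_plus_4 + (sign_extend16(imm16) << 2)) & MASK32

sequential = next_pc(0x1000, 0x0003, 0)   # PC + 4
forward = next_pc(0x1000, 0x0003, 1)      # PC + 4 + 3 words
backward = next_pc(0x1000, 0xFFFF, 1)     # PC + 4 - 1 word
```

With these inputs the sequential case gives 0x1004, the forward branch 0x1010, and the backward branch 0x1000.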
Recap: Meaning of the Control Signals
➢ MemWr: 1 => write memory
➢ ExtOp: “zero”, “sign”
➢ MemtoReg: 0 => ALU; 1 => Mem
➢ ALUsrc: 0 => regB; 1 => immed
➢ ALUctr: “add”, “sub”, “or”
➢ RegDst: 0 => “rt”; 1 => “rd”
➢ RegWr: 1 => write register
[Figure: the datapath without the fetch unit. RegDst selects Rd or Rt as the Rw port of the register file (32 32-bit registers); busA and busB feed the ALU and an “=” comparator that produces Equal; the Extender widens imm16 to 32 bits; ALUSrc selects busB or the extended immediate; the Data Memory has WrEn, Adr, and Data In ports; MemtoReg selects the ALU result or the memory output for busW.]
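As a rough illustration of how ExtOp and the mux selects steer data, the sketch below models the Extender and a two-input mux. The names `extender` and `mux2` and the sample values are assumptions made for this example, not part of the lecture.

```python
def extender(imm16: int, ext_op: str) -> int:
    """Widen a 16-bit immediate to 32 bits; ext_op is "zero" or "sign"."""
    imm16 &= 0xFFFF
    if ext_op == "zero":
        return imm16
    if ext_op == "sign":
        # Copy bit 15 into the upper 16 bits.
        return imm16 | 0xFFFF0000 if imm16 & 0x8000 else imm16
    raise ValueError(f"unknown ExtOp: {ext_op!r}")

def mux2(sel: int, in0, in1):
    """Two-input mux: sel = 0 passes in0, sel = 1 passes in1."""
    return in1 if sel else in0

# ALUSrc picks the second ALU operand: 0 => regB (busB), 1 => immed.
bus_b = 0x00000010
alu_in_b = mux2(1, bus_b, extender(0x8001, "sign"))

# RegDst picks the destination register field: 0 => "rt", 1 => "rd".
dest_field = mux2(0, "rt", "rd")
```

The same `mux2` helper would also model MemtoReg, which chooses between the ALU result and the data memory output.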
RTL: The Add Instruction
Bits:    31-26    25-21    20-16    15-11    10-6     5-0
Field:   op       rs       rt       rd       shamt    funct
Width:   6 bits   5 bits   5 bits   5 bits   5 bits   6 bits
➢ add rd, rs, rt
• mem[PC]: fetch the instruction from memory
• R[rd] <- R[rs] + R[rt]: the actual operation
• PC <- PC + 4: calculate the next instruction’s address
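The field layout and the three register-transfer steps can be tried out with a small Python sketch. It is a simplified model under stated assumptions: the instruction memory is a dictionary keyed by byte address, results wrap to 32 bits, and details the slide does not cover (such as overflow handling) are ignored. The function names are hypothetical.

```python
MASK32 = 0xFFFFFFFF

def decode_rtype(instr: int) -> dict:
    """Split a 32-bit R-type word into its six fields."""
    return {
        "op": (instr >> 26) & 0x3F,
        "rs": (instr >> 21) & 0x1F,
        "rt": (instr >> 16) & 0x1F,
        "rd": (instr >> 11) & 0x1F,
        "shamt": (instr >> 6) & 0x1F,
        "funct": instr & 0x3F,
    }

def execute_add(pc: int, regs: list, imem: dict) -> int:
    """Carry out the three RTL steps of add and return the new PC."""
    instr = imem[pc]                                           # mem[PC]
    f = decode_rtype(instr)
    regs[f["rd"]] = (regs[f["rs"]] + regs[f["rt"]]) & MASK32   # R[rd] <- R[rs] + R[rt]
    return (pc + 4) & MASK32                                   # PC <- PC + 4

regs = [0] * 32
regs[1], regs[2] = 5, 7
imem = {0: 0x00221820}            # encoding of: add $3, $1, $2
fields = decode_rtype(imem[0])
new_pc = execute_add(0, regs, imem)
```

After the call, register 3 holds 12 and `new_pc` is 4.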
Instruction Fetch Unit at the Beginning of Add
➢ Fetch the instruction from Instruction memory: Instruction <- mem[PC]
• Same for all instructions
[Figure: instruction fetch unit (PC, Inst Memory, two adders, mux selected by nPC_MUX_sel); Inst Memory outputs Instruction<31:0>.]
The Single Cycle Datapath During Add
➢ R[rd] <- R[rs] + R[rt]
• Control settings: nPC_sel = +4, RegDst = 1, RegWr = 1, ExtOp = x, ALUSrc = 0, ALUctr = Add, MemWr = 0, MemtoReg = 0
[Figure: the single-cycle datapath annotated with the control settings above.]
Instruction Fetch Unit at the End of Add
➢ PC <- PC + 4
• This is the same for all instructions except Branch and Jump
[Figure: instruction fetch unit (PC, Inst Memory, two adders, mux selected by nPC_MUX_sel).]
The Single Cycle Datapath During Or Immediate
➢ R[rt] <- R[rs] or ZeroExt[Imm16]
• Control settings (left blank on the slide, to be filled in): nPC_sel = , RegDst = , RegWr = , ExtOp = , ALUSrc = , ALUctr = , MemWr = , MemtoReg =
[Figure: the single-cycle datapath with the control settings left blank.]
The Single Cycle Datapath During Load
➢ R[rt] <- Data Memory [ R[rs] + SignExt[imm16] ]
• Control settings: nPC_sel = +4, RegDst = 0, RegWr = 1, ExtOp = 1, ALUSrc = 1, ALUctr = Add, MemWr = 0, MemtoReg = 1
[Figure: the single-cycle datapath annotated with the control settings above.]
The Single Cycle Datapath During Store
➢ Data Memory [ R[rs] + SignExt[imm16] ] <- R[rt]
• Control settings (left blank on the slide, to be filled in): nPC_sel = , RegDst = , RegWr = , ExtOp = , ALUSrc = , ALUctr = , MemWr = , MemtoReg =
[Figure: the single-cycle datapath with the control settings left blank.]
The Single Cycle Datapath During Store
• Control settings: nPC_sel = +4, RegDst = x, RegWr = 0, ExtOp = 1, ALUSrc = 1, ALUctr = Add, MemWr = 1, MemtoReg = x
[Figure: the single-cycle datapath annotated with the control settings above.]
The Single Cycle Datapath During Branch
➢ if (R[rs] - R[rt] == 0) then Zero <- 1; else Zero <- 0
• Control settings: nPC_sel = “Br”, RegDst = x, RegWr = 0, ExtOp = x, ALUSrc = 0, ALUctr = Sub, MemWr = 0, MemtoReg = x
[Figure: the single-cycle datapath annotated with the control settings above.]
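The walkthroughs for load, store, and branch can be replayed with a small behavioural model of the datapath. The sketch below is only an approximation under stated assumptions: memory is a dictionary keyed by byte address, don't-care (x) control values are arbitrarily set to 0, and the names `step`, `LW`, `SW`, and `BEQ` are invented for this example.

```python
MASK32 = 0xFFFFFFFF

def sign_extend16(imm16: int) -> int:
    imm16 &= 0xFFFF
    return imm16 - 0x10000 if imm16 & 0x8000 else imm16

# Control settings taken from the slides; x entries are set to 0 here (an assumption).
LW = dict(nPC_sel=0, RegDst=0, RegWr=1, ExtOp=1, ALUSrc=1, ALUctr="add", MemWr=0, MemtoReg=1)
SW = dict(nPC_sel=0, RegDst=0, RegWr=0, ExtOp=1, ALUSrc=1, ALUctr="add", MemWr=1, MemtoReg=0)
BEQ = dict(nPC_sel=1, RegDst=0, RegWr=0, ExtOp=0, ALUSrc=0, ALUctr="sub", MemWr=0, MemtoReg=0)

def step(pc, regs, dmem, instr, ctrl):
    """Execute one instruction in a single 'cycle' and return the next PC."""
    rs, rt, rd = (instr >> 21) & 0x1F, (instr >> 16) & 0x1F, (instr >> 11) & 0x1F
    imm16 = instr & 0xFFFF
    bus_a, bus_b = regs[rs], regs[rt]
    ext = (sign_extend16(imm16) & MASK32) if ctrl["ExtOp"] else imm16
    operand_b = ext if ctrl["ALUSrc"] else bus_b              # ALUSrc mux
    if ctrl["ALUctr"] == "add":
        result = (bus_a + operand_b) & MASK32
    elif ctrl["ALUctr"] == "sub":
        result = (bus_a - operand_b) & MASK32
    elif ctrl["ALUctr"] == "or":
        result = bus_a | operand_b
    else:
        raise ValueError(f"unsupported ALUctr: {ctrl['ALUctr']!r}")
    zero = int(result == 0)
    if ctrl["MemWr"]:
        dmem[result] = bus_b                                  # store R[rt]
    if ctrl["RegWr"]:
        bus_w = dmem.get(result, 0) if ctrl["MemtoReg"] else result
        regs[rd if ctrl["RegDst"] else rt] = bus_w            # RegDst mux
    pc_plus_4 = (pc + 4) & MASK32
    if ctrl["nPC_sel"] and zero:                              # taken branch
        return (pc_plus_4 + (sign_extend16(imm16) << 2)) & MASK32
    return pc_plus_4

regs = [0] * 32
regs[1] = 0x100
dmem = {0x104: 42}
pc = step(0, regs, dmem, (0x23 << 26) | (1 << 21) | (2 << 16) | 4, LW)    # lw  $2, 4($1)
pc = step(pc, regs, dmem, (0x2B << 26) | (1 << 21) | (2 << 16) | 8, SW)   # sw  $2, 8($1)
pc = step(pc, regs, dmem, (0x04 << 26) | (2 << 21) | (2 << 16) | 3, BEQ)  # beq $2, $2, +3 words
```

The load copies 42 into register 2, the store writes it to address 0x108, and the taken branch moves the PC from 8 to 8 + 4 + 12 = 24.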
Instruction Fetch Unit at the End of Branch
➢ What is the encoding of nPC_sel?
• Direct MUX select?
• Branch / not branch
nPC_sel   zero?   MUX
0         x       0
1         0       0
1         1       1
[Figure: instruction fetch unit. nPC_sel and Zero are combined to form nPC_MUX_sel, which selects between PC + 4 and the branch target computed from imm16.]
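Under the branch / not-branch encoding, the table above reduces to an AND of nPC_sel and Zero. A minimal sketch, with a function name chosen only for illustration:

```python
def npc_mux_select(npc_sel: int, zero: int) -> int:
    """MUX select for the next-PC mux: take the branch only if nPC_sel and Zero are both 1."""
    return npc_sel & zero

# Enumerate all input combinations as (nPC_sel, zero, MUX).
rows = [(s, z, npc_mux_select(s, z)) for s in (0, 1) for z in (0, 1)]
```

The first two rows both give MUX = 0, which is why the table lists zero? as a don't care when nPC_sel is 0.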
Step 4: Given Datapath: RTL -> Control
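[Figure: Inst Memory (addressed by Adr) outputs Instruction<31:0>, which is split into the Op, Fun, Rs, Rt, Rd, and Imm16 fields. Control takes Op and Fun and produces nPC_sel, RegWr, RegDst, ExtOp, ALUSrc, ALUctr, MemWr, and MemtoReg for the DATA PATH; the datapath produces Zero.]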
Summary of Control Signals
inst     Register Transfer
ADD      R[rd] <- R[rs] + R[rt]; PC <- PC + 4
         ALUsrc = RegB, ALUctr = “add”, RegDst = rd, RegWr, nPC_sel = “+4”
SUB      R[rd] <- R[rs] - R[rt]; PC <- PC + 4
         ALUsrc = RegB, ALUctr = “sub”, RegDst = rd, RegWr, nPC_sel = “+4”
ORi      R[rt] <- R[rs] or zero_ext(Imm16); PC <- PC + 4
         ALUsrc = Im, Extop = “Z”, ALUctr = “or”, RegDst = rt, RegWr, nPC_sel = “+4”
LOAD     R[rt] <- MEM[ R[rs] + sign_ext(Imm16) ]; PC <- PC + 4
         ALUsrc = Im, Extop = “Sn”, ALUctr = “add”, MemtoReg, RegDst = rt, RegWr, nPC_sel = “+4”
STORE    MEM[ R[rs] + sign_ext(Imm16) ] <- R[rt]; PC <- PC + 4
         ALUsrc = Im, Extop = “Sn”, ALUctr = “add”, MemWr, nPC_sel = “+4”
BEQ      if ( R[rs] == R[rt] ) then PC <- PC + 4 + (sign_ext(Imm16) || 00) else PC <- PC + 4
         nPC_sel = “Br”, ALUctr = “sub”
Summary of the Control Signals
See Appendix A for the op and func encodings. For the instructions that are not R-type, the func field does not matter (“We Don’t Care :-)”).
              add       sub       ori       lw        sw        beq
func          10 0000   10 0010   x         x         x         x
op            00 0000   00 0000   00 1101   10 0011   10 1011   00 0100
RegDst        1         1         0         0         x         x
ALUSrc        0         0         1         1         1         0
MemtoReg      0         0         0         1         x         x
RegWrite      1         1         1         1         0         0
MemWrite      0         0         0         0         1         0
nPCsel        0         0         0         0         0         1
ExtOp         x         x         0         1         1         x
ALUctr<2:0>   Add       Sub       Or        Add       Add       Sub
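One way to make the summary table concrete is to store it as a lookup keyed by (op, func). The sketch below is an illustrative encoding only: `None` stands for a don't care, and the names `CONTROL`, `FIELDS`, and `control_for` are assumptions of this example.

```python
X = None  # don't care

FIELDS = ("RegDst", "ALUSrc", "MemtoReg", "RegWrite", "MemWrite", "nPCsel", "ExtOp", "ALUctr")

# Keyed by (op, func); func is None where the table does not care about it.
CONTROL = {
    (0b000000, 0b100000): (1, 0, 0, 1, 0, 0, X, "Add"),  # add
    (0b000000, 0b100010): (1, 0, 0, 1, 0, 0, X, "Sub"),  # sub
    (0b001101, None):     (0, 1, 0, 1, 0, 0, 0, "Or"),   # ori
    (0b100011, None):     (0, 1, 1, 1, 0, 0, 1, "Add"),  # lw
    (0b101011, None):     (X, 1, X, 0, 1, 0, 1, "Add"),  # sw
    (0b000100, None):     (X, 0, X, 0, 0, 1, X, "Sub"),  # beq
}

def control_for(instr: int) -> dict:
    """Look up the control word for a 32-bit instruction (KeyError if unsupported)."""
    op, func = (instr >> 26) & 0x3F, instr & 0x3F
    key = (op, func) if op == 0 else (op, None)
    return dict(zip(FIELDS, CONTROL[key]))

add_ctrl = control_for(0x00221820)                                    # add $3, $1, $2
lw_ctrl = control_for((0b100011 << 26) | (1 << 21) | (2 << 16) | 4)   # lw $2, 4($1)
```

Keeping the don't cares explicit, rather than forcing them to 0 or 1, mirrors the freedom the table leaves to the logic designer.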
Concept of Local Decoding
op           00 0000    00 1101   10 0011   10 1011   00 0100
             R-type     ori       lw        sw        beq
RegDst       1          0         0         x         x
ALUSrc       0          1         1         1         0
MemtoReg     0          0         1         x         x
RegWrite     1          1         1         0         0
MemWrite     0          0         0         1         0
Branch       0          0         0         0         1
ExtOp        x          0         1         1         x
ALUop<N:0>   “R-type”   Or        Add       Add       Sub
[Figure: Main Control decodes op (6 bits) into ALUop (N bits); ALU Control (Local) combines ALUop with func (6 bits) to produce ALUctr (3 bits) for the ALU.]
Encoding of ALUop
[Figure: Main Control decodes op (6 bits) into ALUop (N bits); ALU Control (Local) combines ALUop with func (6 bits) to produce ALUctr (3 bits).]
➢ In this exercise, ALUop has to be 2 bits wide to represent:
• (1) “R-type” instructions
• “I-type” instructions that require the ALU to perform (2) Or, (3) Add, and (4) Subtract
➢ To implement the full MIPS ISA, ALUop has to be 3 bits wide to represent:
• (1) “R-type” instructions
• “I-type” instructions that require the ALU to perform (2) Or, (3) Add, (4) Subtract, (5) And, and (6) Xor
                   R-type     ori     lw      sw      beq
ALUop (Symbolic)   “R-type”   Or      Add     Add     Sub
ALUop<2:0>         1 00       0 10    0 00    0 00    0 01
Decoding of the “func” Field
[Figure: Main Control decodes op (6 bits) into ALUop (N bits); ALU Control (Local) combines ALUop with func (6 bits) to produce ALUctr (3 bits) for the ALU.]
                   R-type     ori     lw      sw      beq
ALUop (Symbolic)   “R-type”   Or      Add     Add     Sub
ALUop<2:0>         1 00       0 10    0 00    0 00    0 01
R-type format:
Bits:    31-26    25-21    20-16    15-11    10-6     5-0
Field:   op       rs       rt       rd       shamt    funct
funct<5:0>   Instruction Operation
10 0000      add
10 0010      subtract
10 0100      and
10 0101      or
10 1010      set-on-less-than
ALUctr<2:0>  ALU Operation
000          And
001          Or
010          Add
110          Subtract
111          Set-on-less-than
Truth Table for ALUctr
                   R-type     ori     lw      sw      beq
ALUop (Symbolic)   “R-type”   Or      Add     Add     Sub
ALUop<2:0>         1 00       0 10    0 00    0 00    0 01
funct<3:0>   Instruction Op.
0000         add
0010         subtract
0100         and
0101         or
1010         set-on-less-than
ALUop<2:0>   func<3:0>   ALU Operation   ALUctr<2:0>
0 0 0        x x x x     Add             0 1 0
0 x 1        x x x x     Subtract        1 1 0
0 1 x        x x x x     Or              0 0 1
1 x x        0 0 0 0     Add             0 1 0
1 x x        0 0 1 0     Subtract        1 1 0
1 x x        0 1 0 0     And             0 0 0
1 x x        0 1 0 1     Or              0 0 1
1 x x        1 0 1 0     Set on <        1 1 1
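The truth table, don't cares included, can be expressed directly as pattern matching. The following is a sketch rather than a hardware description: each row is stored as bit-pattern strings with `x` as a wildcard, and the first matching row wins, so for the unused ALUop value 011 the result is simply whatever the row order yields. The names `TRUTH_TABLE` and `alu_control` are invented for this example.

```python
# (ALUop<2:0> pattern, func<3:0> pattern, ALUctr<2:0>)
TRUTH_TABLE = [
    ("000", "xxxx", 0b010),  # Add
    ("0x1", "xxxx", 0b110),  # Subtract
    ("01x", "xxxx", 0b001),  # Or
    ("1xx", "0000", 0b010),  # R-type add
    ("1xx", "0010", 0b110),  # R-type subtract
    ("1xx", "0100", 0b000),  # R-type and
    ("1xx", "0101", 0b001),  # R-type or
    ("1xx", "1010", 0b111),  # R-type set-on-less-than
]

def _matches(pattern: str, bits: str) -> bool:
    """True if every pattern position is 'x' or equals the corresponding bit."""
    return all(p == "x" or p == b for p, b in zip(pattern, bits))

def alu_control(aluop: int, func: int) -> int:
    """Return ALUctr<2:0> for a 3-bit ALUop and the low 4 bits of func."""
    a = format(aluop & 0b111, "03b")
    f = format(func & 0b1111, "04b")
    for aluop_pat, func_pat, aluctr in TRUTH_TABLE:
        if _matches(aluop_pat, a) and _matches(func_pat, f):
            return aluctr
    raise ValueError(f"no table row for ALUop={a}, func={f}")

slt_ctr = alu_control(0b100, 0b101010)   # R-type with funct 10 1010
lw_ctr = alu_control(0b000, 0b111111)    # lw: func is ignored
```

Here `slt_ctr` is 0b111 and `lw_ctr` is 0b010, matching the last and first rows of the table.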
Logic Equation for ALUctr<2>
ALUop<2:0>   func<3:0>   ALUctr<2>
0 x 1        x x x x     1
1 x x        0 0 1 0     1
1 x x        1 0 1 0     1
The last two rows differ only in func<3>. This makes func<3> a don’t care:
ALUctr<2> = !ALUop<2> & ALUop<0> +
            ALUop<2> & !func<2> & func<1> & !func<0>
Logic Equation for ALUctr<1>
ALUop<2:0>   func<3:0>   ALUctr<1>
0 0 0        x x x x     1
0 x 1        x x x x     1
1 x x        0 0 0 0     1
1 x x        0 0 1 0     1
1 x x        1 0 1 0     1
ALUctr<1> = !ALUop<2> & !ALUop<1> +
            !ALUop<2> & ALUop<0> +
            ALUop<2> & !func<2> & !func<0>
Logic Equation for ALUctr<0>
ALUop<2:0>   func<3:0>   ALUctr<0>
0 1 x        x x x x     1
1 x x        0 1 0 1     1
1 x x        1 0 1 0     1
ALUctr<0> = !ALUop<2> & ALUop<1> +
            ALUop<2> & !func<3> & func<2> & !func<1> & func<0> +
            ALUop<2> & func<3> & !func<2> & func<1> & !func<0>
ALU Control Block
[Figure: ALU Control (Local) takes func (6 bits) and ALUop (3 bits) and produces ALUctr (3 bits).]
ALUctr<2> = !ALUop<2> & ALUop<0> +
            ALUop<2> & !func<2> & func<1> & !func<0>
ALUctr<1> = !ALUop<2> & !ALUop<1> +
            !ALUop<2> & ALUop<0> +
            ALUop<2> & !func<2> & !func<0>
ALUctr<0> = !ALUop<2> & ALUop<1> +
            ALUop<2> & !func<3> & func<2> & !func<1> & func<0> +
            ALUop<2> & func<3> & !func<2> & func<1> & !func<0>
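The three sum-of-products equations can be checked mechanically. The sketch below transcribes them into Python bit operations and evaluates them for the ALUop and func combinations that the tables in this lecture actually use; the function name `alu_ctr` and the helper `n` (logical NOT of a single bit) are assumptions of this example.

```python
def alu_ctr(aluop: int, func: int) -> int:
    """Evaluate the ALUctr<2:0> logic equations for a 3-bit ALUop and func<3:0>."""
    a2, a1, a0 = (aluop >> 2) & 1, (aluop >> 1) & 1, aluop & 1
    f3, f2, f1, f0 = (func >> 3) & 1, (func >> 2) & 1, (func >> 1) & 1, func & 1

    def n(bit: int) -> int:
        return bit ^ 1  # invert one bit

    c2 = (n(a2) & a0) | (a2 & n(f2) & f1 & n(f0))
    c1 = (n(a2) & n(a1)) | (n(a2) & a0) | (a2 & n(f2) & n(f0))
    c0 = (n(a2) & a1) | (a2 & n(f3) & f2 & n(f1) & f0) | (a2 & f3 & n(f2) & f1 & n(f0))
    return (c2 << 2) | (c1 << 1) | c0

# (ALUop, func<3:0>) -> ALUctr for the cases listed in the truth table.
checked = {
    "lw/sw add": alu_ctr(0b000, 0b0000),
    "beq sub": alu_ctr(0b001, 0b0000),
    "ori or": alu_ctr(0b010, 0b0000),
    "R add": alu_ctr(0b100, 0b0000),
    "R sub": alu_ctr(0b100, 0b0010),
    "R and": alu_ctr(0b100, 0b0100),
    "R or": alu_ctr(0b100, 0b0101),
    "R slt": alu_ctr(0b100, 0b1010),
}
```

Only the encodings that Main Control generates are checked here; inputs such as ALUop = 011 fall under the don't cares and are not meaningful.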
Step 5: Logic For Each Control Signal
➢ nPC_sel <= if (OP == BEQ) then “Br” else “+4”
➢ ALUsrc <= if (OP == “Rtype” or OP == BEQ) then “regB” else “immed”
➢ ALUctr <= if (OP == “Rtype”) then funct
            elseif (OP == ORi) then “OR”
            elseif (OP == BEQ) then “sub”
            else “add”
➢ ExtOp <= _____________
➢ MemWr <= _____________
➢ MemtoReg <= _____________
➢ RegWr <= _____________
➢ RegDst <= _____________
“Truth Table” for the Main Control
[Figure: Main Control decodes op (6 bits) into RegDst, ALUSrc, and the other control signals plus ALUop (3 bits); ALU Control (Local) combines ALUop with func (6 bits) to produce ALUctr (3 bits).]
op                 00 0000    00 1101   10 0011   10 1011   00 0100
                   R-type     ori       lw        sw        beq
RegDst             1          0         0         x         x
ALUSrc             0          1         1         1         0
MemtoReg           0          0         1         x         x
RegWrite           1          1         1         0         0
MemWrite           0          0         0         1         0
nPC_sel            0          0         0         0         1
ExtOp              x          0         1         1         x
ALUop (Symbolic)   “R-type”   Or        Add       Add       Subtract
ALUop<2>           1          0         0         0         0
ALUop<1>           0          1         0         0         0
ALUop<0>           0          0         0         0         1
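With local decoding, the main control depends on the op field alone. A minimal sketch of that table as a Python dictionary follows; `None` marks a don't care, and the names `MAIN_CONTROL` and `main_control` are chosen for illustration only.

```python
X = None  # don't care

# op -> control signals, including the 3-bit ALUop handed to the local ALU control.
MAIN_CONTROL = {
    0b000000: dict(RegDst=1, ALUSrc=0, MemtoReg=0, RegWrite=1, MemWrite=0, nPC_sel=0, ExtOp=X, ALUop=0b100),  # R-type
    0b001101: dict(RegDst=0, ALUSrc=1, MemtoReg=0, RegWrite=1, MemWrite=0, nPC_sel=0, ExtOp=0, ALUop=0b010),  # ori
    0b100011: dict(RegDst=0, ALUSrc=1, MemtoReg=1, RegWrite=1, MemWrite=0, nPC_sel=0, ExtOp=1, ALUop=0b000),  # lw
    0b101011: dict(RegDst=X, ALUSrc=1, MemtoReg=X, RegWrite=0, MemWrite=1, nPC_sel=0, ExtOp=1, ALUop=0b000),  # sw
    0b000100: dict(RegDst=X, ALUSrc=0, MemtoReg=X, RegWrite=0, MemWrite=0, nPC_sel=1, ExtOp=X, ALUop=0b001),  # beq
}

def main_control(instr: int) -> dict:
    """Return the control signals for a 32-bit instruction, using only its op field."""
    return MAIN_CONTROL[(instr >> 26) & 0x3F]

r_type = main_control(0x00221820)        # add $3, $1, $2 (op = 00 0000)
store = main_control(0b101011 << 26)     # sw
```

Every R-type instruction maps to the same entry; the func field is resolved later by the local ALU control.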
“Truth Table” for RegWrite
op         00 0000   00 1101   10 0011   10 1011   00 0100
           R-type    ori       lw        sw        beq
RegWrite   1         1         1         0         0
RegWrite = R-type + ori + lw
         = !op<5> & !op<4> & !op<3> & !op<2> & !op<1> & !op<0>   (R-type)
         + !op<5> & !op<4> & op<3> & op<2> & !op<1> & op<0>      (ori)
         + op<5> & !op<4> & !op<3> & !op<2> & op<1> & op<0>      (lw)
[Figure: AND gates decode op<5>..op<0> into the R-type, ori, lw, sw, and beq lines; an OR gate combines the R-type, ori, and lw lines to produce RegWrite.]
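The RegWrite equation can be evaluated for the five opcodes to confirm that it reproduces the table row. This is a small sketch; the name `reg_write` and the dictionary of opcodes are assumptions of the example.

```python
def reg_write(op: int) -> int:
    """Evaluate RegWrite = R-type + ori + lw from the six opcode bits."""
    b = [(op >> i) & 1 for i in range(6)]   # b[i] is op<i>
    n = [bit ^ 1 for bit in b]              # n[i] is !op<i>
    r_type = n[5] & n[4] & n[3] & n[2] & n[1] & n[0]
    ori = n[5] & n[4] & b[3] & b[2] & n[1] & b[0]
    lw = b[5] & n[4] & n[3] & n[2] & b[1] & b[0]
    return r_type | ori | lw

OPCODES = {"R-type": 0b000000, "ori": 0b001101, "lw": 0b100011, "sw": 0b101011, "beq": 0b000100}
results = {name: reg_write(op) for name, op in OPCODES.items()}
```

Each product term is a full decode of one opcode, which is the structure the PLA implementation builds on.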
PLA Implementation of the Main Control
[Figure: PLA implementation. An AND plane decodes op<5>..op<0> into the R-type, ori, lw, sw, and beq product terms; an OR plane combines them to drive RegWrite, ALUSrc, RegDst, MemtoReg, MemWrite, Branch, ExtOp, ALUop<2>, ALUop<1>, and ALUop<0>.]
Putting it All Together: A Single Cycle Processor
[Figure: the complete single-cycle processor. The Instruction Fetch Unit outputs Instruction<31:0>. Main Control decodes op = Instr<31:26> into nPC_sel, RegDst, RegWr, ExtOp, ALUSrc, MemWr, MemtoReg, and ALUop (3 bits); ALU Control combines ALUop with func = Instr<5:0> to produce ALUctr (3 bits). The datapath consists of the register file (32 32-bit registers; ports Rw, Ra, Rb; buses busA, busB, busW), the Extender for imm16 = Instr<15:0>, the ALU with its Zero output, and the Data Memory (WrEn, Adr, Data In).]
Recap: An Abstract View of the Critical Path (Load)
[Figure: abstract view of the datapath: PC, Ideal Instruction Memory, register file (32 32-bit registers; Rd, Rs, Rt fields of 5 bits each; 16-bit Imm), ALU with inputs A and B, Ideal Data Memory (Data Address, Data In), and Next Address logic, clocked by Clk.]
Critical Path (Load Operation) =
  PC’s Clk-to-Q +
  Instruction Memory’s Access Time +
  Register File’s Access Time +
  ALU to Perform a 32-bit Add +
  Data Memory Access Time +
  Setup Time for Register File Write +
  Clock Skew
Worst Case Timing (Load)
[Figure: worst-case timing diagram for load. After the clock edge, each signal changes from its old value to its new value in turn: PC (after Clk-to-Q); Rs, Rt, Rd, Op, Func (after the Instruction Memory Access Time); ALUctr, ExtOp, ALUSrc, MemtoReg, RegWr (after the Delay through Control Logic); busA (after the Register File Access Time); busB (after the Delay through Extender & Mux); Address (after the ALU Delay); busW (after the Data Memory Access Time). The register write occurs at the end of the cycle.]
Drawback of this Single Cycle Processor
➢ Long cycle time:
• Cycle time must be long enough for the load instruction:
  PC’s Clock-to-Q +
  Instruction Memory Access Time +
  Register File Access Time +
  ALU Delay (address calculation) +
  Data Memory Access Time +
  Register File Setup Time +
  Clock Skew
➢ Cycle time for load is much longer than needed for all other instructions
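A quick calculation shows why the load path sets the cycle time. The delay values below are hypothetical numbers picked only for illustration; the lecture gives no figures. The path for add is also an assumption of this sketch: it reuses the load path but skips the data memory access.

```python
# Hypothetical component delays in nanoseconds (not from the lecture).
DELAY_NS = {
    "pc_clk_to_q": 1,
    "inst_mem_access": 10,
    "reg_file_access": 5,
    "alu": 8,
    "data_mem_access": 10,
    "reg_file_setup": 1,
    "clock_skew": 1,
}

# Components on each instruction's path; "load" follows the slide, "add" is assumed.
PATHS = {
    "load": ["pc_clk_to_q", "inst_mem_access", "reg_file_access", "alu",
             "data_mem_access", "reg_file_setup", "clock_skew"],
    "add": ["pc_clk_to_q", "inst_mem_access", "reg_file_access", "alu",
            "reg_file_setup", "clock_skew"],
}

def path_delay(instruction: str) -> int:
    """Sum the component delays along one instruction's path."""
    return sum(DELAY_NS[part] for part in PATHS[instruction])

# In a single-cycle design every instruction gets the same clock period,
# so the period must cover the slowest path.
cycle_time_ns = max(path_delay(name) for name in PATHS)
slack_for_add_ns = cycle_time_ns - path_delay("add")
```

With these made-up numbers the load path takes 36 ns and the add path 26 ns, so every add would spend 10 ns of each cycle idle.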
Summary
➢ Single cycle datapath => CPI = 1, clock cycle time (CCT) => long
➢ 5 steps to design a processor
• 1. Analyze instruction set => datapath requirements
• 2. Select set of datapath components & establish clock methodology
• 3. Assemble datapath meeting the requirements
• 4. Analyze implementation of each instruction to determine the setting of control points that effect the register transfer
• 5. Assemble the control logic
➢ Control is the hard part
➢ MIPS makes control easier
• Instructions same size
• Source registers always in same place
• Immediates same size, location
• Operations always on registers/immediates
[Figure: the five components of a computer. The Processor contains Control and Datapath and connects to Memory, Input, and Output.]