CS 61C: Great Ideas in Computer Architecture (Machine Structures) Single Cycle MIPS CPU Instructors: Randy H.


CS 61C: Great Ideas in Computer
Architecture (Machine Structures)
Single Cycle MIPS CPU
Instructors:
Randy H. Katz
David A. Patterson
http://inst.eecs.Berkeley.edu/~cs61c/sp11
11/6/2015
Spring 2011 -- Lecture #18
1
You Are Here!
Software
• Parallel Requests: assigned to computer, e.g., Search “Katz”
• Parallel Threads: assigned to core, e.g., Lookup, Ads
• Parallel Instructions: >1 instruction @ one time, e.g., 5 pipelined instructions
• Parallel Data: >1 data item @ one time, e.g., Add of 4 pairs of words
• Hardware descriptions: all gates functioning in parallel at same time
Harness Parallelism & Achieve High Performance
Hardware
[Figure: Warehouse Scale Computer and Smart Phone; Computer (Core … Core, Memory (Cache), Input/Output, Main Memory); Core (Instruction Unit(s), Functional Unit(s): A0+B0 A1+B1 A2+B2 A3+B3); Logic Gates (Today)]
Levels of Representation/Interpretation
High Level Language Program (e.g., C)
    temp = v[k];
    v[k] = v[k+1];
    v[k+1] = temp;
Compiler
Assembly Language Program (e.g., MIPS)
    lw $t0, 0($2)
    lw $t1, 4($2)
    sw $t1, 0($2)
    sw $t0, 4($2)
Assembler
Machine Language Program (MIPS)
    0000 1001 1100 0110 1010 1111 0101 1000
    1010 1111 0101 1000 0000 1001 1100 0110
    1100 0110 1010 1111 0101 1000 0000 1001
    0101 1000 0000 1001 1100 0110 1010 1111
Anything can be represented as a number, i.e., data or instructions
Machine Interpretation
Hardware Architecture Description (e.g., block diagrams)
Architecture Implementation
Logic Circuit Description (Circuit Schematic Diagrams)
Review
• Clocks tell us when D-flip-flops change
– Setup and Hold times important
• We pipeline long-delay CL for faster clock
• Finite State Machines extremely useful
• Use muxes to select among inputs
– S input bits select among 2^S inputs
– Each input can be n bits wide, independent of S
• Can implement muxes hierarchically
• Can implement FSM with register + logic
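A minimal sketch of the mux bullets above, written in Python and not taken from the slides (the names `mux2` and `mux4` are made up for illustration). It builds a 4-to-1 mux hierarchically from three 2-to-1 muxes: S = 2 select bits choose among 2^S = 4 inputs, and the inputs may be any width.

```python
def mux2(sel, a, b):
    """2-to-1 mux: returns a when sel is 0, b when sel is 1."""
    return b if sel else a


def mux4(sel, inputs):
    """4-to-1 mux built from three 2-to-1 muxes.

    sel is a 2-bit select value (0..3); inputs is a list of 4 values.
    """
    s0 = sel & 1          # low select bit drives the first level
    s1 = (sel >> 1) & 1   # high select bit drives the second level
    low = mux2(s0, inputs[0], inputs[1])
    high = mux2(s0, inputs[2], inputs[3])
    return mux2(s1, low, high)
```

The same pattern extends to wider muxes by adding one level per extra select bit.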
Agenda
• MIPS-lite Datapath
• Administrivia
• CPU Timing
• MIPS-lite Control
• Datapath Control
• Technology Break
• Control Implementation
The MIPS-lite Subset
• ADDU and SUBU
  – addu rd,rs,rt
  – subu rd,rs,rt
    Bits:   31-26   25-21   20-16   15-11   10-6    5-0
    Field:  op      rs      rt      rd      shamt   funct
    Width:  6 bits  5 bits  5 bits  5 bits  5 bits  6 bits
• OR Immediate:
  – ori rt,rs,imm16
    Bits:   31-26   25-21   20-16   15-0
    Field:  op      rs      rt      immediate
    Width:  6 bits  5 bits  5 bits  16 bits
• LOAD and STORE Word
  – lw rt,rs,imm16
  – sw rt,rs,imm16
    Bits:   31-26   25-21   20-16   15-0
    Field:  op      rs      rt      immediate
    Width:  6 bits  5 bits  5 bits  16 bits
• BRANCH:
  – beq rs,rt,imm16
    Bits:   31-26   25-21   20-16   15-0
    Field:  op      rs      rt      immediate
    Width:  6 bits  5 bits  5 bits  16 bits
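As an illustration of the field layouts on this slide, here is a small Python sketch (not from the lecture; the function name `decode` is an assumption) that slices a 32-bit instruction word into its fields with shifts and masks. It extracts every field regardless of format; the control logic decides which ones are meaningful.

```python
def decode(word):
    """Split a 32-bit MIPS instruction word into its fields."""
    return {
        "op":    (word >> 26) & 0x3F,   # bits 31-26
        "rs":    (word >> 21) & 0x1F,   # bits 25-21
        "rt":    (word >> 16) & 0x1F,   # bits 20-16
        "rd":    (word >> 11) & 0x1F,   # bits 15-11 (R-type only)
        "shamt": (word >> 6) & 0x1F,    # bits 10-6  (R-type only)
        "funct": word & 0x3F,           # bits 5-0   (R-type only)
        "imm16": word & 0xFFFF,         # bits 15-0  (I-type only)
    }


# ori $2, $1, 0x00FF encodes as op=001101, rs=1, rt=2, imm16=0x00FF
fields = decode(0x342200FF)
```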
Processor Design Process
• Five steps to design a processor:
Step 1: Analyze instruction set to determine datapath
requirements (see next slide)
Step 2: Select set of datapath components & establish
clocking methodology
Step 3: Assemble datapath components that meet the
requirements
Step 4: Analyze implementation of each instruction to
determine setting of control points that realizes the
register transfer
Step 5: Assemble the control logic
Register Transfer Language (RTL)
• RTL gives the meaning of the instructions
• All start by fetching the instruction
    {op, rs, rt, rd, shamt, funct} ← MEM[ PC ]
    {op, rs, rt, Imm16} ← MEM[ PC ]
Inst     Register Transfers
ADDU     R[rd] ← R[rs] + R[rt]; PC ← PC + 4
SUBU     R[rd] ← R[rs] – R[rt]; PC ← PC + 4
ORI      R[rt] ← R[rs] | zero_ext(Imm16); PC ← PC + 4
LOAD     R[rt] ← MEM[ R[rs] + sign_ext(Imm16) ]; PC ← PC + 4
STORE    MEM[ R[rs] + sign_ext(Imm16) ] ← R[rt]; PC ← PC + 4
BEQ      if ( R[rs] == R[rt] )
           then PC ← PC + 4 + (sign_ext(Imm16) || 00)
           else PC ← PC + 4
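The register transfers above can be read as a tiny interpreter. The following Python sketch is an illustration under stated assumptions, not the course's reference model: instructions are already decoded into a tuple, memory is a dictionary of words keyed by byte address, and register 0 is not forced to zero. The names `step`, `sign_ext`, and `MASK32` are made up here.

```python
MASK32 = 0xFFFFFFFF


def sign_ext(imm16):
    """Sign-extend a 16-bit value to a signed Python int."""
    return imm16 - 0x10000 if imm16 & 0x8000 else imm16


def step(inst, R, MEM, pc):
    """Apply one instruction's register transfers and return the new PC.

    inst is (name, rs, rt, rd, imm16); R is a list of 32 register values;
    MEM maps byte addresses to 32-bit words.
    """
    name, rs, rt, rd, imm16 = inst
    if name == "addu":
        R[rd] = (R[rs] + R[rt]) & MASK32
    elif name == "subu":
        R[rd] = (R[rs] - R[rt]) & MASK32
    elif name == "ori":
        R[rt] = R[rs] | imm16                       # zero_ext(Imm16)
    elif name == "lw":
        R[rt] = MEM.get((R[rs] + sign_ext(imm16)) & MASK32, 0)
    elif name == "sw":
        MEM[(R[rs] + sign_ext(imm16)) & MASK32] = R[rt]
    elif name == "beq":
        if R[rs] == R[rt]:
            # sign_ext(Imm16) || 00 is the offset shifted left by 2
            return (pc + 4 + (sign_ext(imm16) << 2)) & MASK32
    else:
        raise ValueError("not in MIPS-lite: " + name)
    return (pc + 4) & MASK32
```

Every branch of `step` ends with a PC update, mirroring the fact that each RTL line above pairs an operation with `PC ← PC + 4` (or the branch target).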
Step 1: Requirements of the
Instruction Set
• Memory (MEM)
– Instructions & data (will use one for each: really caches)
• Registers (R: 32 x 32)
– Read rs
– Read rt
– Write rt or rd
• PC
• Extender (sign/zero extend)
• Add/Sub/OR unit for operation on register(s) or extended
immediate
• Add 4 (+ maybe extended immediate) to PC
• Compare if registers equal?
Generic Steps of Datapath
1. Instruction Fetch
2. Decode/Register Read
3. Execute
4. Memory
5. Register Write
[Figure: PC with +4 adder and mux, instruction memory, registers (rs, rt, rd), imm, ALU, Data memory]
Step 2: Components of the Datapath
• Combinational Elements
• Storage Elements + Clocking Methodology
• Building Blocks
[Figure: three building blocks. Adder: 32-bit inputs A and B, CarryIn; outputs 32-bit Sum and CarryOut. Multiplexer (MUX): 32-bit inputs A and B, Select; 32-bit output Y. ALU: 32-bit inputs A and B, OP; 32-bit output Result]
ALU Needs for MIPS-lite + Rest of MIPS
• Addition, subtraction, logical OR, ==:
    ADDU   R[rd] = R[rs] + R[rt]; ...
    SUBU   R[rd] = R[rs] – R[rt]; ...
    ORI    R[rt] = R[rs] | zero_ext(Imm16)...
    BEQ    if ( R[rs] == R[rt] )...
• Test to see if output == 0 for any ALU operation
gives == test. How?
• P&H also adds AND, Set Less Than (1 if A < B, 0
otherwise)
• ALU from Appendix C, section C.5
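One way to answer the "How?" above: subtract the two operands and test whether the result is zero. The Python sketch below is an illustrative model, not the Appendix C design; the operation names follow the slides, while the function name `alu` and the string encoding of the operation are assumptions.

```python
MASK32 = 0xFFFFFFFF


def alu(op, a, b):
    """Return (result, zero) for a 32-bit ALU operation."""
    if op == "ADD":
        result = (a + b) & MASK32
    elif op == "SUB":
        result = (a - b) & MASK32
    elif op == "OR":
        result = (a | b) & MASK32
    else:
        raise ValueError("unsupported ALU operation: " + op)
    # zero is 1 exactly when every result bit is 0, so SUB gives an == test
    return result, int(result == 0)
```

For BEQ the control selects SUB, and the `zero` output then reports whether R[rs] == R[rt].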
Storage Element: Idealized Memory
• Memory (idealized)
  – One input bus: Data In
  – One output bus: Data Out
• Memory word is found by:
  – Address selects the word to put on Data Out
  – Write Enable = 1: address selects the memory word to be written via the Data In bus
• Clock input (CLK)
  – CLK input is a factor ONLY during write operation
  – During read operation, behaves as a combinational logic block: Address valid → Data Out valid after “access time”
[Figure: memory block with Write Enable, Address, 32-bit Data In, Clk, 32-bit DataOut]
Storage Element: Register (Building Block)
• Similar to D Flip Flop except
  – N-bit input and output
  – Write Enable input
• Write Enable:
  – Negated (or deasserted) (0): Data Out will not change
  – Asserted (1): Data Out will become Data In on rising edge of clock
[Figure: register with N-bit Data In, N-bit Data Out, Write Enable, clk]
Storage Element: Register File
• Register File consists of 32 registers:
  – Two 32-bit output busses: busA and busB
  – One 32-bit input bus: busW
• Register is selected by:
  – RA (number) selects the register to put on busA (data)
  – RB (number) selects the register to put on busB (data)
  – RW (number) selects the register to be written via busW (data) when Write Enable is 1
• Clock input (clk)
  – Clk input is a factor ONLY during write operation
  – During read operation, behaves as a combinational logic block:
    • RA or RB valid → busA or busB valid after “access time.”
[Figure: 32 x 32-bit Registers block with 5-bit RW, RA, RB inputs, Write Enable, Clk, 32-bit busW in, 32-bit busA and busB out]
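A minimal behavioral sketch of the register file described on this slide, in Python. It is an illustration only: the class name `RegFile` and its method names are made up, and register 0 is not hard-wired to zero because the slide does not describe that. Reads are combinational (a plain function of RA and RB); a write only takes effect when the clock-edge method is called with Write Enable set.

```python
class RegFile:
    """32 x 32-bit register file: two read ports, one clocked write port."""

    def __init__(self):
        self.regs = [0] * 32

    def read(self, ra, rb):
        """Combinational read: returns (busA, busB) for register numbers ra, rb."""
        return self.regs[ra], self.regs[rb]

    def clock_edge(self, write_enable, rw, bus_w):
        """On the clock edge, store bus_w into register rw if write_enable is 1."""
        if write_enable:
            self.regs[rw] = bus_w & 0xFFFFFFFF
```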
Step 3: Assemble DataPath Meeting
Requirements
• Register Transfer Requirements → Datapath Assembly
• Instruction Fetch
• Read Operands and Execute Operation
• Common RTL operations
  – Fetch the Instruction: mem[PC]
  – Update the program counter:
    • Sequential Code: PC ← PC + 4
    • Branch and Jump: PC ← “something else”
[Figure: PC (clocked by clk) and Next Address Logic; PC supplies Address to Instruction Memory, which outputs the 32-bit Instruction Word]
Step 3: Add & Subtract
• R[rd] = R[rs] op R[rt] (addu rd,rs,rt)
– Ra, Rb, and Rw come from instruction’s Rs, Rt, and Rd fields
    Bits:   31-26   25-21   20-16   15-11   10-6    5-0
    Field:  op      rs      rt      rd      shamt   funct
    Width:  6 bits  5 bits  5 bits  5 bits  5 bits  6 bits
– ALUctr and RegWr: control logic after decoding the instruction
• … Already defined the register file & ALU
[Figure: 32 x 32-bit Registers with Rw, Ra, Rb driven by rd, rs, rt (5 bits each), RegWr and clk; 32-bit busA and busB feed the ALU (controlled by ALUctr); the 32-bit Result returns on busW]
Administrivia
• Project 3, Part 2 due Sunday 4/3
– Thread-Level Parallelism and OpenMP
• Project 4, Part 1 due Sunday 4/10
– Design a 16-bit pipelined computer in Logisim
– Last homework due Sunday 4/10
– Project 4, Part 2 due Sunday 4/17
• Extra Credit due 4/24 – Fastest Matrix
Multiply
• Final Exam Monday 5/9 11:30-2:30PM
Project 3 speeds
[Figure: histogram of how many teams got each speed; x-axis Gflop/s (5 to 18), y-axis number of teams (0 to 45); annotations mark the Fall 61C Average and the Speed of Math Library]
Lines of Code vs. Performance
[Figure: Lines of code (16 to 512) vs. Gflop/s (5 to 19); labeled points: 18.5 GFLOPS at 535 LOC and 16.7 GFLOPS at 39 LOC]
Administrivia
• What classes should I take (now)?
• Take classes from great teachers! (teacher > class)
– Distinguished Teaching Award (very hard to get: ~3/year)
• http://teaching.berkeley.edu/dta-dept.html
– HKN Course evaluations (≥6 is very good)
• https://hkn.eecs.berkeley.edu/coursesurveys
– EECS web site has plan for year (up in late spring: now)
• http://www.eecs.berkeley.edu/Scheduling/CS/schedule-draft.html
• If have choice of multiple great teachers
  – EE122 Networking
  – CS152 Computer Architecture and Engineering
  – CS162 Operating Systems and Systems Programming
  – CS169 Software Engineering (for SaaS with Fox)
  – CS194 Engineering Parallel Software (offered in Fall?)
61c in the News
• World’s most admired companies
  – Published by Fortune magazine, March 2011
• Picked by CEOs of 5000 companies
• IT companies: 3 of top 5, 1/3 of top 12 in all industries
  – Client and Cloud focus
1 Apple (mobile client)
2 Google (cloud)
3 Berkshire Hathaway (invest)
4 Johnson & Johnson (health)
5 Amazon.com (cloud)
6 Procter & Gamble (consumer)
7 Toyota Motor (car)
8 Goldman Sachs (finance)
9 Wal-Mart Stores (retail)
10 Coca-Cola (beverage)
11 Microsoft (PC->client-cloud)
12 Southwest Airlines (airline)
Agenda
• MIPS-lite Datapath
• Administrivia
• CPU Timing
• MIPS-lite Control
• Datapath Control
• Technology Break
• Control Implementation
Clocking Methodology
[Figure: clocking diagram, Clk]
• Storage elements clocked by same edge
• “Critical path” (longest path through logic) determines length
of clock period
• Have to allow for Clock to Q and Setup Times too
• This lecture (and P&H sections 4.3-4.4) does the whole instruction
in 1 clock cycle for pedagogic reasons
– Project 4 will do it in 2 clock cycles via simple pipelining
– Next week explain pipelining and use 5 clock cycles per instruction
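To make the critical-path bullets concrete, here is a hedged Python sketch that computes a minimum clock period as clock-to-Q delay plus the longest path through the logic plus setup time. All delay numbers are hypothetical picosecond values chosen for illustration; they do not come from the lecture, and the function name `min_clock_period` is made up.

```python
def min_clock_period(clk_to_q, path_delays, setup):
    """Smallest period that lets the slowest path settle before the next edge."""
    critical_path = max(sum(stages) for stages in path_delays.values())
    return clk_to_q + critical_path + setup


# Hypothetical per-stage delays in picoseconds (assumptions, not measured values).
paths = {
    "addu": [200, 100, 200],        # instruction memory, register read, ALU
    "lw":   [200, 100, 200, 200],   # instruction memory, register read, ALU, data memory
}
period_ps = min_clock_period(clk_to_q=30, path_delays=paths, setup=20)
```

In a single-cycle design every instruction pays for the slowest one, so with these made-up numbers the load path sets the period.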
Register-Register Timing:
One Complete Cycle
[Figure: timing diagram for one cycle. Signals change from Old Value to New Value in sequence: PC after the clock edge; Rs, Rt, Rd, Op, Func after the Instruction Memory Access Time; ALUctr and RegWr after the Delay through Control Logic; busA, B after the Register File Access Time; busW after the ALU Delay. A marker labeled "Register Write Occurs Here" is shown on the clock. Below: RegFile (Rw, Ra, Rb driven by Rd, Rs, Rt; RegWr; clk) feeding busA and busB into the ALU (ALUctr), with the result on busW]
Logical Operations with Immediate
• R[rt] = R[rs] op ZeroExt[imm16]
    Bits:   31-26   25-21   20-16   15-0
    Field:  op      rs      rt      immediate
    Width:  6 bits  5 bits  5 bits  16 bits
    ZeroExt[imm16]: bits 31-16 = 0000000000000000, bits 15-0 = immediate
But we’re writing to Rt register??
[Figure: RegFile (Rw, Ra, Rb driven by Rd, Rs, Rt; RegWr; clk) with busA and busB feeding the ALU (ALUctr); result on busW]
Logical Operations with Immediate
• R[rt] = R[rs] op ZeroExt[imm16]
    Bits:   31-26   25-21   20-16   15-0
    Field:  op      rs      rt      immediate
    Width:  6 bits  5 bits  5 bits  16 bits
    ZeroExt[imm16]: bits 31-16 = 0000000000000000, bits 15-0 = immediate
What about rt register write??
• Already defined 32-bit MUX; Zero Ext?
[Figure: a mux controlled by RegDst selects rd (1) or rt (0) as Rw; Ra = rs, Rb = rt; imm16 passes through ZeroExt (16 to 32 bits); a mux controlled by ALUSrc selects busB (0) or the extended immediate (1) as the second ALU input; ALU controlled by ALUctr]
Load Operations
• R[rt] = Mem[R[rs] + SignExt[imm16]]
Example: lw rt,rs,imm16
    Bits:   31-26   25-21   20-16   15-0
    Field:  op      rs      rt      immediate
    Width:  6 bits  5 bits  5 bits  16 bits
What sign extending??
[Figure: same datapath as for ori: RegDst mux selects rd (1) or rt (0) as Rw; Ra = rs, Rb = rt; imm16 through ZeroExt; ALUSrc mux selects busB (0) or the extended immediate (1); ALU controlled by ALUctr]
Load Operations
• R[rt] = Mem[R[rs] + SignExt[imm16]]
Example: lw rt,rs,imm16
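    Bits:   31-26   25-21   20-16   15-0
    Field:  op      rs      rt      immediate
    Width:  6 bits  5 bits  5 bits  16 bits
[Figure: datapath extended for load: RegDst mux selects rd (1) or rt (0) as Rw; imm16 passes through an Extender controlled by ExtOp; ALUSrc mux selects busB (0) or the extended immediate (1); the ALU output drives Adr of Data Memory (WrEn = MemWr, Data In marked "?", clk); a mux controlled by MemtoReg selects the ALU output (0) or Data Memory output (1) for busW]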
RTL: The Add Instruction
    Bits:   31-26   25-21   20-16   15-11   10-6    5-0
    Field:  op      rs      rt      rd      shamt   funct
    Width:  6 bits  5 bits  5 bits  5 bits  5 bits  6 bits
add rd, rs, rt
– MEM[PC]                 Fetch the instruction from memory
– R[rd] = R[rs] + R[rt]   The actual operation
– PC = PC + 4             Calculate the next instruction’s address
Instruction Fetch Unit at the Beginning of Add
• Fetch the instruction from Instruction memory: Instruction = MEM[PC]
  – same for all instructions
[Figure: instruction fetch unit. PC (clk) drives Inst Address of Inst Memory, which outputs Instruction<31:0>; one Adder adds 4; a second Adder adds PC Ext of imm16 (with 00 appended); a Mux controlled by nPC_sel selects the next PC]
Single Cycle Datapath during Add
    Bits:   31-26   25-21   20-16   15-11   10-6    5-0
    Field:  op      rs      rt      rd      shamt   funct
• R[rd] = R[rs] + R[rt]
Control signals: nPC_sel=+4, RegDst=1, RegWr=1, ExtOp=x, ALUSrc=0, ALUctr=ADD, MemWr=0, MemtoReg=0
[Figure: single-cycle datapath: instr fetch unit (Instruction<31:0> split into Rs <21:25>, Rt <16:20>, Rd <11:15>, Imm16 <0:15>), RegFile, Extender, ALU with zero output, Data Memory]
Instruction Fetch Unit at End of Add
• PC = PC + 4
  – Same for all instructions except: Branch and Jump
[Figure: instruction fetch unit with nPC_sel=+4: PC (clk) drives Inst Address of Inst Memory; one Adder adds 4; a second Adder adds PC Ext of imm16 (with 00 appended); Mux selects the next PC]
Single Cycle Datapath during Or Immediate
    Bits:   31-26   25-21   20-16   15-0
    Field:  op      rs      rt      immediate
• R[rt] = R[rs] OR ZeroExt[Imm16]
Control signals to set: nPC_sel= , RegDst= , RegWr= , ExtOp= , ALUSrc= , ALUctr= , MemWr= , MemtoReg=
[Figure: single-cycle datapath: instr fetch unit (Instruction<31:0> split into Rs <21:25>, Rt <16:20>, Rd <11:15>, Imm16 <0:15>), RegFile, Extender, ALU with zero output, Data Memory]
Single Cycle Datapath during Or Immediate
    Bits:   31-26   25-21   20-16   15-0
    Field:  op      rs      rt      immediate
• R[rt] = R[rs] OR ZeroExt[Imm16]
Control signals: nPC_sel=+4, RegDst=0, RegWr=1, ExtOp=zero, ALUSrc=1, ALUctr=OR, MemWr=0, MemtoReg=0
[Figure: single-cycle datapath: instr fetch unit (Instruction<31:0> split into Rs <21:25>, Rt <16:20>, Rd <11:15>, Imm16 <0:15>), RegFile, Extender, ALU with zero output, Data Memory]
Single Cycle Datapath during Load
    Bits:   31-26   25-21   20-16   15-0
    Field:  op      rs      rt      immediate
• R[rt] = Data Memory {R[rs] + SignExt[imm16]}
Control signals to set: nPC_sel= , RegDst= , RegWr= , ExtOp= , ALUSrc= , ALUctr= , MemWr= , MemtoReg=
[Figure: single-cycle datapath: instr fetch unit (Instruction<31:0> split into Rs <21:25>, Rt <16:20>, Rd <11:15>, Imm16 <0:15>), RegFile, Extender, ALU with zero output, Data Memory]
Single Cycle Datapath during Load
    Bits:   31-26   25-21   20-16   15-0
    Field:  op      rs      rt      immediate
• R[rt] = Data Memory {R[rs] + SignExt[imm16]}
Control signals: nPC_sel=+4, RegDst=0, RegWr=1, ExtOp=sign, ALUSrc=1, ALUctr=ADD, MemWr=0, MemtoReg=1
[Figure: single-cycle datapath: instr fetch unit (Instruction<31:0> split into Rs <21:25>, Rt <16:20>, Rd <11:15>, Imm16 <0:15>), RegFile, Extender, ALU with zero output, Data Memory]
Single Cycle Datapath during Branch
    Bits:   31-26   25-21   20-16   15-0
    Field:  op      rs      rt      immediate
• if (R[rs] - R[rt] == 0) then Zero = 1 ; else Zero = 0
Control signals to set: nPC_sel= , RegDst= , RegWr= , ExtOp= , ALUSrc= , ALUctr= , MemWr= , MemtoReg=
[Figure: single-cycle datapath: instr fetch unit (Instruction<31:0> split into Rs <21:25>, Rt <16:20>, Rd <11:15>, Imm16 <0:15>), RegFile, Extender, ALU with zero output, Data Memory]
Single Cycle Datapath during Branch
    Bits:   31-26   25-21   20-16   15-0
    Field:  op      rs      rt      immediate
• if (R[rs] - R[rt] == 0) then Zero = 1 ; else Zero = 0
Control signals: nPC_sel=br, RegDst=x, RegWr=0, ExtOp=x, ALUSrc=0, ALUctr=SUB, MemWr=0, MemtoReg=x
[Figure: single-cycle datapath: instr fetch unit (Instruction<31:0> split into Rs <21:25>, Rt <16:20>, Rd <11:15>, Imm16 <0:15>), RegFile, Extender, ALU with zero output, Data Memory]
Instruction Fetch Unit at the End of Branch
    Bits:   31-26   25-21   20-16   15-0
    Field:  op      rs      rt      immediate
• if (Zero == 1) then PC = PC + 4 + SignExt[imm16]*4 ; else PC = PC + 4
• What is encoding of nPC_sel?
  – Direct MUX select?
  – Branch inst. / not branch
• Let’s pick 2nd option
    nPC_sel   zero?   MUX
    0         x       0
    1         0       0
    1         1       1
Q: What logic gate?
[Figure: instruction fetch unit. PC (clk) drives Adr of Inst Memory, which outputs Instruction<31:0>; one Adder adds 4; a second Adder adds PC Ext of imm16 (with 00 appended); the Mux select (MUX ctrl) is derived from nPC_sel and Zero]
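The truth table above has a 1 in the MUX column only when both nPC_sel and Zero are 1, which is the behavior of an AND gate. The Python sketch below illustrates the resulting next-PC choice under that reading; the function names are made up for illustration and this is not the course's reference implementation.

```python
def sign_ext(imm16):
    """Sign-extend a 16-bit value to a signed Python int."""
    return imm16 - 0x10000 if imm16 & 0x8000 else imm16


def next_pc(pc, imm16, npc_sel, zero):
    """Next PC for the fetch unit; npc_sel is 1 for a branch instruction."""
    take_branch = npc_sel & zero            # MUX select: nPC_sel AND Zero
    pc_plus_4 = (pc + 4) & 0xFFFFFFFF
    if take_branch:
        return (pc_plus_4 + sign_ext(imm16) * 4) & 0xFFFFFFFF
    return pc_plus_4
```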
Summary: Datapath’s Control Signals
• ExtOp:    “zero”, “sign”
• ALUsrc:   0 → regB; 1 → immed
• ALUctr:   “ADD”, “SUB”, “OR”
• MemWr:    1 → write memory
• MemtoReg: 0 → ALU; 1 → Mem
• RegDst:   0 → “rt”; 1 → “rd”
• RegWr:    1 → write register
[Figure: complete single-cycle datapath. Fetch unit: PC (clk), Inst Address, Adder (+4), Adder with PC Ext of imm16 (00 appended), Mux controlled by nPC_sel. Execute: RegDst mux (Rd/Rt) into Rw, RegFile (Ra = Rs, Rb = Rt, RegWr, clk), Extender (ExtOp) on imm16, ALUSrc mux, ALU (ALUctr), Data Memory (WrEn = MemWr, Adr, Data In, clk), MemtoReg mux onto busW]
Agenda
• MIPS-lite Datapath
• Administrivia
• CPU Timing
• MIPS-lite Control
• Datapath Control
• Technology Break
• Control Implementation
Given Datapath: RTL → Control
[Figure: Inst Memory (Adr) outputs Instruction<31:0>, split into Op <26:31>, Fun <0:5>, Rs <21:25>, Rt <16:20>, Rd <11:15>, Imm16 <0:15>. Op and Fun feed Control, which drives nPC_sel, RegWr, RegDst, ExtOp, ALUSrc, ALUctr, MemWr, MemtoReg into the DATA PATH]
Summary of the Control Signals (1/2)
inst   Register Transfer
add    R[rd] ← R[rs] + R[rt]; PC ← PC + 4
       ALUsrc=RegB, ALUctr=“ADD”, RegDst=rd, RegWr, nPC_sel=“+4”
sub    R[rd] ← R[rs] – R[rt]; PC ← PC + 4
       ALUsrc=RegB, ALUctr=“SUB”, RegDst=rd, RegWr, nPC_sel=“+4”
ori    R[rt] ← R[rs] | zero_ext(Imm16); PC ← PC + 4
       ALUsrc=Im, Extop=“Z”, ALUctr=“OR”, RegDst=rt, RegWr, nPC_sel=“+4”
lw     R[rt] ← MEM[ R[rs] + sign_ext(Imm16) ]; PC ← PC + 4
       ALUsrc=Im, Extop=“sn”, ALUctr=“ADD”, MemtoReg, RegDst=rt, RegWr, nPC_sel=“+4”
sw     MEM[ R[rs] + sign_ext(Imm16) ] ← R[rt]; PC ← PC + 4
       ALUsrc=Im, Extop=“sn”, ALUctr=“ADD”, MemWr, nPC_sel=“+4”
beq    if (R[rs] == R[rt]) then PC ← PC + 4 + (sign_ext(Imm16) || 00)
       else PC ← PC + 4
       nPC_sel=“br”, ALUctr=“SUB”
Summary of the Control Signals (2/2)
See Appendix A
              add       sub       ori       lw        sw        beq       jump
func          10 0000   10 0010   We Don’t Care :-)
op            00 0000   00 0000   00 1101   10 0011   10 1011   00 0100   00 0010
RegDst        1         1         0         0         x         x         x
ALUSrc        0         0         1         1         1         0         x
MemtoReg      0         0         0         1         x         x         x
RegWrite      1         1         1         1         0         0         0
MemWrite      0         0         0         0         1         0         0
nPCsel        0         0         0         0         0         1         ?
Jump          0         0         0         0         0         0         1
ExtOp         x         x         0         1         1         x         x
ALUctr<2:0>   Add       Subtract  Or        Add       Add       Subtract  x

Formats (bit boundaries 31, 26, 21, 16, 11, 6, 0):
R-type   op | rs | rt | rd | shamt | funct     add, sub
I-type   op | rs | rt | immediate              ori, lw, sw, beq
J-type   op | target address                   jump
Boolean Expressions for Controller
RegDst    = add + sub
ALUSrc    = ori + lw + sw
MemtoReg  = lw
RegWrite  = add + sub + ori + lw
MemWrite  = sw
nPCsel    = beq
Jump      = jump
ExtOp     = lw + sw
ALUctr[0] = sub + beq
ALUctr[1] = ori
(assume ALUctr is 00: ADD, 01: SUB, 10: OR)
Where:
rtype = ~op5 · ~op4 · ~op3 · ~op2 · ~op1 · ~op0
ori   = ~op5 · ~op4 ·  op3 ·  op2 · ~op1 ·  op0
lw    =  op5 · ~op4 · ~op3 · ~op2 ·  op1 ·  op0
sw    =  op5 · ~op4 ·  op3 · ~op2 ·  op1 ·  op0
beq   = ~op5 · ~op4 · ~op3 ·  op2 · ~op1 · ~op0
jump  = ~op5 · ~op4 · ~op3 · ~op2 ·  op1 · ~op0
add   = rtype · func5 · ~func4 · ~func3 · ~func2 · ~func1 · ~func0
sub   = rtype · func5 · ~func4 · ~func3 · ~func2 ·  func1 · ~func0
How do we implement this in gates?
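The equations on this slide have a two-level shape: an AND stage that recognizes each instruction from its opcode and funct bits, then an OR stage that combines instruction lines into control signals. A rough Python sketch of that structure follows. The function name `control` and the dictionary layout are assumptions for illustration, and don't-care outputs simply come out as 0 here.

```python
def control(op, funct):
    """Decode a 6-bit opcode and 6-bit funct into the slide's control signals."""
    # "AND" stage: one line per instruction, matching the bit patterns above.
    rtype = op == 0b000000
    add = rtype and funct == 0b100000
    sub = rtype and funct == 0b100010
    ori = op == 0b001101
    lw = op == 0b100011
    sw = op == 0b101011
    beq = op == 0b000100
    jump = op == 0b000010
    # "OR" stage: each control signal is an OR of instruction lines.
    return {
        "RegDst": int(add or sub),
        "ALUSrc": int(ori or lw or sw),
        "MemtoReg": int(lw),
        "RegWrite": int(add or sub or ori or lw),
        "MemWrite": int(sw),
        "nPCsel": int(beq),
        "Jump": int(jump),
        "ExtOp": int(lw or sw),
        # ALUctr[1] = ori, ALUctr[0] = sub + beq (00 ADD, 01 SUB, 10 OR)
        "ALUctr": (int(ori) << 1) | int(sub or beq),
    }
```

Comparing whole opcodes with `==` stands in for the six-input AND gates; in hardware each comparison is one AND of true or inverted opcode bits.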
Controller Implementation
[Figure: opcode and func feed the “AND” logic, which produces one line each for add, sub, ori, lw, sw, beq, jump; these feed the “OR” logic, which produces RegDst, ALUSrc, MemtoReg, RegWrite, MemWrite, nPCsel, Jump, ExtOp, ALUctr[0], ALUctr[1]]
AND Control in Logisim
Summary: Single-cycle Processor
• Five steps to design a processor:
  1. Analyze instruction set → datapath requirements
  2. Select set of datapath components & establish clock methodology
  3. Assemble datapath meeting the requirements
  4. Analyze implementation of each instruction to determine setting of control points that effects the register transfer.
  5. Assemble the control logic
     • Formulate Logic Equations
     • Design Circuits
[Figure: Processor (Control, Datapath), Memory, Input, Output]
Single Cycle Datapath during Store
    Bits:   31-26   25-21   20-16   15-0
    Field:  op      rs      rt      immediate
• Data Memory {R[rs] + SignExt[imm16]} = R[rt]
Control signals to set: nPC_sel= , RegDst= , RegWr= , ExtOp= , ALUSrc= , ALUctr= , MemWr= , MemtoReg=
[Figure: single-cycle datapath: instr fetch unit (Instruction<31:0> split into Rs <21:25>, Rt <16:20>, Rd <11:15>, Imm16 <0:15>), RegFile, Extender, ALU with zero output, Data Memory]
Single Cycle Datapath during Store
    Bits:   31-26   25-21   20-16   15-0
    Field:  op      rs      rt      immediate
• Data Memory {R[rs] + SignExt[imm16]} = R[rt]
Control signals: nPC_sel=+4, RegDst=x, RegWr=0, ExtOp=sign, ALUSrc=1, ALUctr=ADD, MemWr=1, MemtoReg=x
[Figure: single-cycle datapath: instr fetch unit (Instruction<31:0> split into Rs <21:25>, Rt <16:20>, Rd <11:15>, Imm16 <0:15>), RegFile, Extender, ALU with zero output, Data Memory]
Single Cycle Datapath during Jump
    J-type   Bits:   31-26   25-0
             Field:  op      target address      jump
• New PC = { PC[31..28], target address, 00 }
Control signals to set: nPC_sel= , Jump= , RegDst= , RegWr= , ExtOp= , ALUSrc= , ALUctr= , MemWr= , MemtoReg=
[Figure: single-cycle datapath: Instruction Fetch Unit (Instruction<31:0> split into Rs <21:25>, Rt <16:20>, Rd <11:15>, Imm16 <0:15>, TA26 <0:25>), 32 32-bit Registers, Extender, ALU with Zero output, Data Memory]
Instruction Fetch Unit at the End of Jump
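    J-type   Bits:   31-26   25-0
             Field:  op      target address      jump
• New PC = { PC[31..28], target address, 00 }
How do we modify this to account for jumps?
[Figure: instruction fetch unit with Jump and nPC_sel inputs. PC (Clk) drives Adr of Inst Memory, which outputs Instruction<31:0>; one Adder adds 4; a second Adder adds imm16 (00 appended); nPC_MUX_sel is derived from nPC_sel and Zero and controls the Mux that selects the next PC]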
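The jump target formula on this slide, New PC = { PC[31..28], target address, 00 }, can be sketched in Python as a mask and a shift. This follows the slide's formula literally; the function name `jump_target` is made up for illustration.

```python
def jump_target(pc, target26):
    """New PC = { PC[31..28], 26-bit target address, 00 }."""
    upper = pc & 0xF0000000                     # keep PC[31..28]
    lower = (target26 & 0x03FFFFFF) << 2        # append 00 to the target
    return upper | lower
```

In the fetch unit this value would be one more input to the next-PC mux, selected when the Jump control signal is 1.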