Introduction - AWARDSPACE.COM

Basic MIPS Architecture:
Single-Cycle Datapath and Control
Chapter 4
Sections 4.1 – 4.4
Appendix D.1 and D.2
Dr. Iyad F. Jafar
Outline
 Introduction
 Clocking
 Single-cycle Datapath
 Single-cycle Control
 Performance Analysis
Introduction
 So far, we have built a small ALU
 ADD, SUB, SLT, AND, OR, …
 What about
 Memory and registers?
 Control operations?
 Interpreting (decoding) instructions?
 The big picture
 The CPU’s datapath deals with moving data
around
 The CPU’s control manages the data
 Generic implementation
[Figure: the instruction cycle. Fetch (PC = PC+4) -> Decode -> Execute -> back to Fetch]
Clocking
 The clocking methodology defines when signals can
be read and when they are written

 An edge-triggered methodology
 Typical execution
 read contents of state elements
 send values through combinational logic
 write results to one or more state elements
[Figure: state element -> combinational logic -> state element, all within one clock cycle]
 This assumes state elements are written on every clock cycle; if not,
an explicit write control signal is needed
 The write occurs only when both the write control is asserted and the clock edge
occurs
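The write rule above can be sketched in Python. This is a minimal illustration, not part of the slides: the class name `Register` and the methods `set_input` and `clock_edge` are hypothetical, and a method call stands in for the clock edge.

```python
class Register:
    """Hypothetical model of one edge-triggered state element."""

    def __init__(self, value=0):
        self.value = value    # output visible during the current cycle
        self._next = value    # value waiting at the data input
        self._write = False   # write control sampled at the clock edge

    def set_input(self, data, write_enable=True):
        # Combinational logic may drive the input at any time in the cycle.
        self._next = data
        self._write = write_enable

    def clock_edge(self):
        # State changes only here, and only if the write control is asserted.
        if self._write:
            self.value = self._next
        self._write = False


reg = Register(5)
reg.set_input(9, write_enable=False)
reg.clock_edge()
held = reg.value           # write control deasserted: old value is kept
reg.set_input(9, write_enable=True)
before_edge = reg.value    # still the old value until the edge arrives
reg.clock_edge()
after_edge = reg.value     # new value appears only after the edge
```

Reading `value` never changes the state, which is why reads need no control signal in this model.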
Single-Cycle Datapath
 The first implementation considered
 All instructions start and finish execution in
one cycle!
 This includes the time required to fetch,
decode, and execute the instruction
 In the following, we will consider the
datapath of each of these steps
Single-Cycle Datapath
 Fetch Datapath
 Fetching the instruction from memory requires
 Sending the PC to memory to read the instruction
 Updating the PC to point to the next instruction
[Figure: the PC drives the Read Address input of the Instruction Memory, whose output is the Instruction; an adder computes PC + 4 and feeds it back to the PC]
 Do we need an explicit write signal for writing the PC?
 Do we need an explicit read signal for reading the memory?
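The fetch step can be sketched as follows. This is an illustrative model under stated assumptions: the function name `fetch` and the list `imem` are hypothetical, and the instruction memory is modeled as a Python list holding one 32-bit word per entry.

```python
def fetch(instruction_memory, pc):
    """Return (instruction, next_pc) for a byte-addressed, word-aligned PC."""
    instruction = instruction_memory[pc // 4]   # one 32-bit word per list entry
    next_pc = (pc + 4) & 0xFFFFFFFF             # dedicated adder computes PC + 4
    return instruction, next_pc


# Two example words: add $t0,$t1,$t2 and lw $t1,4($t2)
imem = [0x012A4020, 0x8D490004]
first_instr, pc_after_first = fetch(imem, 0)
second_instr, pc_after_second = fetch(imem, pc_after_first)
```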
Single-Cycle Datapath
 Decode Datapath
 Regardless of the instruction
 Send the opcode (31-26) and the function (5-0) fields of the
instruction to the control unit
 Read two registers; rs (25-21) and rt (20-16)
 Reading is not harmful!
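[Figure: the Instruction supplies Read Addr 1 and Read Addr 2 to the Register File, which outputs Read Data 1 = R[rs] and Read Data 2 = R[rt]; the opcode and function fields go to the Control Unit. The Register File also has Write Addr and Write Data ports]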
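The field positions used in decoding can be checked with a short sketch. The function name `decode` and the dictionary keys are hypothetical; the bit ranges for rd and the 16-bit immediate are the ones given elsewhere in these slides (15-11 and 15-0).

```python
def decode(instr):
    """Split a 32-bit MIPS instruction word into its fixed-position fields."""
    return {
        "opcode": (instr >> 26) & 0x3F,   # bits 31-26
        "rs":     (instr >> 21) & 0x1F,   # bits 25-21
        "rt":     (instr >> 16) & 0x1F,   # bits 20-16
        "rd":     (instr >> 11) & 0x1F,   # bits 15-11
        "funct":  instr & 0x3F,           # bits 5-0
        "imm16":  instr & 0xFFFF,         # bits 15-0
    }


add_fields = decode(0x012A4020)   # add $t0,$t1,$t2
lw_fields = decode(0x8D490004)    # lw $t1,4($t2)
```

Because every field sits at a fixed position, all of them can be extracted in parallel before the control unit knows which ones the instruction actually uses.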
Single-Cycle Datapath
 Inside the Register File
 How can we read a register out of 32 registers?
[Figure: two 32-to-1 multiplexors, one per read port. The Read Register 1 and Read Register 2 numbers select which of Register 0 to Register 31 drives Read Data 1 and Read Data 2]
Single-Cycle Datapath
 Inside the Register File
 How can we write a register out of 32 registers?
[Figure: the write port. A 5-to-32 decoder on the Register Number, gated by Write and Clock, drives the C input of each of Register 0 to Register 31; Write Data goes to every D input]
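The read and write ports can be modeled together in a few lines. This is a sketch, not the slides' circuit: the class `RegisterFile` and its method names are hypothetical, list indexing stands in for the 32-to-1 multiplexors, and the model does not include the MIPS convention that register 0 always reads as zero.

```python
class RegisterFile:
    """Hypothetical model of a 32 x 32-bit register file."""

    def __init__(self):
        self.regs = [0] * 32

    def read(self, read_reg1, read_reg2):
        # Each read port behaves like a 32-to-1 multiplexor
        # selected by a 5-bit register number.
        return self.regs[read_reg1 & 0x1F], self.regs[read_reg2 & 0x1F]

    def write(self, write_reg, write_data, reg_write):
        # 5-to-32 decoder: exactly one enable line is high.
        enables = [i == (write_reg & 0x1F) for i in range(32)]
        for i, selected in enumerate(enables):
            # A register loads Write Data only if it is selected
            # and the write signal is asserted.
            if selected and reg_write:
                self.regs[i] = write_data & 0xFFFFFFFF


rf = RegisterFile()
rf.write(8, 42, reg_write=False)
unchanged = rf.read(8, 9)     # write signal deasserted: nothing stored
rf.write(8, 42, reg_write=True)
updated = rf.read(8, 9)
```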
Single-Cycle Datapath
 Execution Datapath
 R-type instructions (ADD, SUB, SLT, AND, OR)
 The two registers are read already!
 Perform operation based on OPCODE and FUNC fields
 Store the result back into the register file; the destination
register is specified in the rd field of the instruction (15-11)
[Figure: Read Data 1 (R[rs]) and Read Data 2 (R[rt]) from the Register File feed the ALU, which is steered by ALU Control; the ALU result returns to Write Data, and RegWrite enables the write]
 The register file is not written on every cycle! Need an
explicit write signal
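The R-type execution step can be sketched as below. The names `alu`, `execute_r_type`, and `to_signed` are hypothetical, the register file is a plain list, and operations are selected by name rather than by the encoded control bits.

```python
MASK32 = 0xFFFFFFFF


def to_signed(value):
    """Interpret a 32-bit pattern as a two's-complement integer."""
    return value - (1 << 32) if value & 0x80000000 else value


def alu(operation, a, b):
    """Combinational ALU for the five R-type operations named on the slide."""
    if operation == "add":
        return (a + b) & MASK32
    if operation == "sub":
        return (a - b) & MASK32
    if operation == "and":
        return a & b
    if operation == "or":
        return a | b
    if operation == "slt":
        return 1 if to_signed(a) < to_signed(b) else 0
    raise ValueError(f"unsupported operation: {operation}")


def execute_r_type(regs, operation, rs, rt, rd, reg_write=True):
    """Read R[rs] and R[rt], run the ALU, and write R[rd] if RegWrite is set."""
    result = alu(operation, regs[rs], regs[rt])
    if reg_write:
        regs[rd] = result
    return result


regs = [0] * 32
regs[9], regs[10] = 7, 5
execute_r_type(regs, "add", rs=9, rt=10, rd=8)    # R[8] = 7 + 5
execute_r_type(regs, "sub", rs=10, rt=9, rd=11)   # R[11] = 5 - 7, wrapped to 32 bits
execute_r_type(regs, "slt", rs=11, rt=9, rd=12)   # R[12] = 1 because -2 < 7
```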
Single-Cycle Datapath
 Execution Datapath
 Load Instruction
 Compute the load address
 Store the loaded data in the register file. The destination
register is the rt field of the instruction (20-16)
[Figure: the load datapath. The ALU adds R[rs] to the sign-extended offset to form the Data Memory Address; with MemRead asserted, the Data Memory output goes to the Register File's Write Data port, and RegWrite enables the write]
Single-Cycle Datapath
 Execution Datapath
 Store Instruction
 Compute the store address
 Store the contents of register rt in memory
[Figure: the store datapath. The ALU adds R[rs] to the sign-extended offset to form the Data Memory Address; R[rt] (Read Data 2) goes to the Data Memory's Write Data input, and MemWrite enables the write]
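Load and store share the same address computation, which can be sketched as follows. The function names are hypothetical, and the data memory is modeled as a dictionary keyed by byte address, which is an assumption made for brevity.

```python
def sign_extend16(imm16):
    """Sign-extend a 16-bit field to a Python integer."""
    imm16 &= 0xFFFF
    return imm16 - 0x10000 if imm16 & 0x8000 else imm16


def effective_address(base, imm16):
    # The ALU adds the base register R[rs] to the sign-extended offset.
    return (base + sign_extend16(imm16)) & 0xFFFFFFFF


def store_word(regs, dmem, rs, rt, imm16):
    # SW: memory[R[rs] + offset] = R[rt]
    dmem[effective_address(regs[rs], imm16)] = regs[rt]


def load_word(regs, dmem, rs, rt, imm16):
    # LW: R[rt] = memory[R[rs] + offset]; the destination is the rt field
    regs[rt] = dmem.get(effective_address(regs[rs], imm16), 0)


regs = [0] * 32
dmem = {}
regs[10] = 100        # base address
regs[9] = 0xABCD      # value to store
store_word(regs, dmem, rs=10, rt=9, imm16=0xFFFC)   # offset -4, address 96
load_word(regs, dmem, rs=10, rt=8, imm16=0xFFFC)    # reads it back into R[8]
```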
Single-Cycle Datapath
 Execution Datapath
 Branch Instruction
 Compare the two registers
 Compute the branch address
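 Change the PC to the branch address if the comparison is true
[Figure: the branch datapath. The ALU compares Read Data 1 and Read Data 2 and raises Zero when they are equal; a separate adder adds PC + 4 to the sign-extended offset multiplied by 4 (x4) to form the Branch Address; a multiplexor selects between PC + 4 (input 0) and the Branch Address (input 1)]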
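The branch decision for BEQ can be sketched as below. The function name `branch_next_pc` is hypothetical; the comparison is done by subtraction and a zero test, as the Zero output in the figure suggests.

```python
MASK32 = 0xFFFFFFFF


def sign_extend16(imm16):
    """Sign-extend a 16-bit field to a Python integer."""
    imm16 &= 0xFFFF
    return imm16 - 0x10000 if imm16 & 0x8000 else imm16


def branch_next_pc(pc, rs_value, rt_value, imm16):
    """Return the next PC for a BEQ located at address pc."""
    pc_plus_4 = (pc + 4) & MASK32
    # The ALU subtracts the operands; Zero is raised when they are equal.
    zero = ((rs_value - rt_value) & MASK32) == 0
    # A separate adder forms PC + 4 + (sign-extended offset x 4).
    branch_addr = (pc_plus_4 + (sign_extend16(imm16) << 2)) & MASK32
    return branch_addr if zero else pc_plus_4


taken = branch_next_pc(0x100, 5, 5, 3)           # equal: 0x104 + 12
not_taken = branch_next_pc(0x100, 5, 6, 3)       # not equal: fall through
backward = branch_next_pc(0x100, 7, 7, 0xFFFF)   # offset -1: 0x104 - 4
```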
Single-Cycle Datapath
 Execution Datapath
 Jump Instruction
 Compute the jump address
 Store it in the PC
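[Figure: the jump datapath. The jump address is formed from the Instruction's address field multiplied by 4 (x4); when Jump is 1, a multiplexor loads it into the PC instead of PC + 4 (input 0)]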
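The jump address computation can be sketched as follows. The function name `jump_target` is hypothetical; the upper four bits are taken from PC + 4, following the standard MIPS definition of the J instruction.

```python
def jump_target(pc, instr):
    """Return the target of a J instruction located at address pc."""
    pc_plus_4 = (pc + 4) & 0xFFFFFFFF
    address_field = instr & 0x03FFFFFF      # Instr[25-0]
    # Upper 4 bits from PC + 4, lower 28 bits from the field shifted left by 2.
    return (pc_plus_4 & 0xF0000000) | (address_field << 2)


# j 0x00400020 encodes as opcode 2 in bits 31-26 and the word address in bits 25-0
j_instr = (2 << 26) | (0x00400020 >> 2)
target_low = jump_target(0x00400000, j_instr)
target_high = jump_target(0x10000000, j_instr)   # same field, different region
```

The second call shows why a jump cannot leave the current 256 MB region: the top four bits always come from the PC.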
Single-Cycle Datapath
 Creating the Single Datapath
 Assemble the datapath segments and add control
lines and multiplexors as needed
 Single cycle design
 Fetch, decode and execute each instruction in one clock
cycle
 No datapath resource can be used more than once per
instruction, so some must be duplicated (e.g., separate
Instruction Memory and Data Memory, several adders)
 Multiplexors needed at the input of shared elements with
control lines to do the selection
 Write signals to control writing to the Register File and
Data Memory
 Cycle time is determined by length of the longest path
Single-Cycle Datapath
[Figure: the complete single-cycle datapath. The Control Unit decodes Instr[31-26] and drives RegDst, Jump, Branch, MemRead, MemtoReg, ALUOp, MemWrite, ALUSrc, and RegWrite. Instr[25-21] and Instr[20-16] address the Register File read ports; a multiplexor picks Instr[20-16] or Instr[15-11] as the Write Addr; Instr[15-0] is sign-extended from 16 to 32 bits; Instr[5-0] goes to the ALU control; Instr[25-0], shifted left by 2 and joined with PC[31-28], forms the jump address. The ALU outputs zero and ovf; PCSrc selects between PC + 4 and the branch address]
Single-Cycle Control
 Need to design the control that generates the
appropriate control signals based on the Opcode and
Function fields to
 Specify the operation of the ALU
 Control the data flow by selecting the appropriate input of the
multiplexors
 With the following observations across different
instructions
 Op field is always in bits 31-26 of the instruction
 Addresses of registers to be read are always specified by
 The rs field (bits 25-21)
 The rt field (bits 20-16)
 For LW and SW, the rs field is the base register
 Address of register to be written is in one of two places
 For LW, the address is the rt field (bits 20-16 )
 For R-type, the address is the rd field (bits 15-11)
 Offset for BEQ, LW, and SW is always in bits 15-0 of the instruction
Single-Cycle Control
Signal Name  Value  Effect
RegDst       0      The destination register is from the rt field
             1      The destination register is from the rd field
RegWrite     0      None
             1      Enable writing to the register selected by the Write register port
ALUSrc       0      The second ALU operand comes from the second register file output
             1      The second ALU operand is the sign-extended offset
PCSrc        0      PC value is PC+4
             1      PC is the branch address
MemRead      0      None
             1      Contents of the memory address are put on the Read data output
MemWrite     0      None
             1      Data on the Write data input is placed in the specified address
MemtoReg     0      The data fed to the register file Write data input comes from the ALU
             1      The data fed to the register file Write data input comes from memory
ALUOp        -      Used with the function field of the instruction to generate the
                    ALU control signals that specify the ALU operation
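The multiplexor selections in this table can be sketched as below. The names `mux` and `route` are hypothetical, and deriving PCSrc as the AND of Branch and the ALU's Zero output is an assumption taken from the standard textbook design rather than something the table states.

```python
def mux(select, input0, input1):
    """Two-input multiplexor: returns input1 when select is 1, else input0."""
    return input1 if select else input0


def route(ctrl, rt, rd, read_data2, offset, alu_result, mem_data,
          pc_plus_4, branch_addr, zero):
    """Apply the control signals to the four datapath multiplexors."""
    # Assumption: PCSrc = Branch AND Zero.
    pc_src = ctrl["Branch"] & zero
    return {
        "write_reg": mux(ctrl["RegDst"], rt, rd),
        "alu_operand2": mux(ctrl["ALUSrc"], read_data2, offset),
        "write_data": mux(ctrl["MemtoReg"], alu_result, mem_data),
        "next_pc": mux(pc_src, pc_plus_4, branch_addr),
    }


# Control values for a load word, with made-up datapath values
lw_ctrl = {"RegDst": 0, "ALUSrc": 1, "MemtoReg": 1, "Branch": 0}
lw_paths = route(lw_ctrl, rt=9, rd=0, read_data2=111, offset=4,
                 alu_result=104, mem_data=0xABCD,
                 pc_plus_4=0x104, branch_addr=0x114, zero=0)
```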
R-type Instruction Data/Control Flow
[Figure: the complete single-cycle datapath, showing the data and control flow for an R-type instruction]
Load Word Instruction Data/Control Flow
[Figure: the complete single-cycle datapath, showing the data and control flow for a load word instruction]
Branch Equal Instruction Data/Control Flow
[Figure: the complete single-cycle datapath, showing the data and control flow for a branch equal instruction]
Jump Instruction Data/Control Flow
[Figure: the complete single-cycle datapath, showing the data and control flow for a jump instruction]
Single-Cycle Control
 The Main Control Unit
 The input is the Op field (6 bits) from the instruction
 The output is nine control signals
 The truth table (X = don't care)
        Inputs (opcode bits)     Outputs
Instr.  Op5 Op4 Op3 Op2 Op1 Op0  RegDst ALUSrc MemtoReg RegWrite MemRead MemWrite Branch ALUOp1 ALUOp0
R-type  0   0   0   0   0   0    1      0      0        1        0       0        0      1      0
LW      1   0   0   0   1   1    0      1      1        1        1       0        0      0      0
SW      1   0   1   0   1   1    X      1      X        0        0       1        0      0      0
BEQ     0   0   0   1   0   0    X      0      X        0        0       0        1      0      1
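The truth table can be written as a lookup from opcode to control signals. This is a sketch: the names `MAIN_CONTROL` and `main_control` are hypothetical, don't-care entries are represented as `None`, and the two ALUOp bits are packed into one 2-bit value.

```python
# One entry per row of the truth table; None marks a don't-care (X).
MAIN_CONTROL = {
    0b000000: dict(RegDst=1, ALUSrc=0, MemtoReg=0, RegWrite=1,
                   MemRead=0, MemWrite=0, Branch=0, ALUOp=0b10),      # R-type
    0b100011: dict(RegDst=0, ALUSrc=1, MemtoReg=1, RegWrite=1,
                   MemRead=1, MemWrite=0, Branch=0, ALUOp=0b00),      # LW
    0b101011: dict(RegDst=None, ALUSrc=1, MemtoReg=None, RegWrite=0,
                   MemRead=0, MemWrite=1, Branch=0, ALUOp=0b00),      # SW
    0b000100: dict(RegDst=None, ALUSrc=0, MemtoReg=None, RegWrite=0,
                   MemRead=0, MemWrite=0, Branch=1, ALUOp=0b01),      # BEQ
}


def main_control(opcode):
    """Return the control signals for a 6-bit opcode in the table."""
    if opcode not in MAIN_CONTROL:
        raise ValueError(f"opcode {opcode:06b} is not in the truth table")
    return MAIN_CONTROL[opcode]


lw_signals = main_control(0b100011)
beq_signals = main_control(0b000100)
```

A dictionary lookup plays the role of the PLA here: each key corresponds to one product term over the six opcode bits.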
Single-Cycle Control
 The Main Control Unit
 To design the logic circuit, generate the appropriate
minterms for each output signal
 Simply, use a PLA!
Single-Cycle Control
 The ALU Control Unit
 It has two inputs
 ALUop (2 bits) from Main control
 Func (6 bits) from the instruction
 It has two outputs
 Bnegate (1 bit)
 Operation (2 bits)
 Supported operations
Function  Bnegate  Operation
and       0        00
or        0        01
add       0        10
sub       1        10
slt       1        11
[Figure: the ALU control block, with inputs ALUOp and Func and outputs Bnegate and Operation]
Single-Cycle Control
 The ALU Control Unit
 Truth table
        Inputs                            Outputs
Instr.  ALUOp1 ALUOp0 F5 F4 F3 F2 F1 F0   Bnegate Operation1 Operation0
AND     1      0      1  0  0  1  0  0    0       0          0
OR      1      0      1  0  0  1  0  1    0       0          1
ADD     1      0      1  0  0  0  0  0    0       1          0
SUB     1      0      1  0  0  0  1  0    1       1          0
SLT     1      0      1  0  1  0  1  0    1       1          1
LW      0      0      n/a                 0       1          0
SW      0      0      n/a                 0       1          0
BEQ     0      1      n/a                 1       1          0
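The ALU control truth table can be sketched as a small function. The names `FUNCT_TABLE` and `alu_control` are hypothetical; the return value is the pair (Bnegate, Operation) as listed in the table.

```python
# Function field -> (Bnegate, Operation), used only when ALUOp = 10 (R-type)
FUNCT_TABLE = {
    0b100100: (0, 0b00),   # and
    0b100101: (0, 0b01),   # or
    0b100000: (0, 0b10),   # add
    0b100010: (1, 0b10),   # sub
    0b101010: (1, 0b11),   # slt
}


def alu_control(alu_op, funct=0):
    """Return (Bnegate, Operation) for a 2-bit ALUOp and 6-bit function field."""
    if alu_op == 0b00:     # LW and SW: add base and offset, funct ignored
        return (0, 0b10)
    if alu_op == 0b01:     # BEQ: subtract to compare, funct ignored
        return (1, 0b10)
    if alu_op == 0b10:     # R-type: the function field decides
        return FUNCT_TABLE[funct & 0x3F]
    raise ValueError(f"unsupported ALUOp: {alu_op:02b}")


slt_ctrl = alu_control(0b10, 0b101010)
lw_ctrl = alu_control(0b00)
beq_ctrl = alu_control(0b01)
```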
Single-Cycle Control
 The ALU Control Unit
 Hardware Implementation
 Generating minterms!! Minimization!!
 By inspection!
Performance Analysis
 All instructions have to finish in one cycle!
 How long is the cycle time?
 Different units are used in different instructions
 Each unit has its own delay
 Need to find the longest path!
 Assume the following times
Unit           Delay
ALU            2 ns
Memory         2 ns
Register File  1 ns

Instruction  Path                                                               Total
R-type       Instr. Fetch + Register Read + ALU + Register Write                6 ns
LW           Instr. Fetch + Register Read + ALU + Memory Read + Register Write  8 ns
SW           Instr. Fetch + Register Read + ALU + Memory Write                  7 ns
Branch       Instr. Fetch + Register Read + ALU                                 5 ns
Jump         Instr. Fetch                                                       2 ns
 Thus, the cycle time should be at least 8 ns
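The longest-path calculation can be reproduced with a short sketch. The dictionary names are hypothetical; the delays and the units on each path are the ones assumed on this slide.

```python
# Unit delays in nanoseconds, as assumed on the slide
DELAY_NS = {"memory": 2, "register_file": 1, "alu": 2}

# Units each instruction class passes through, in order
PATHS = {
    "R-type": ["memory", "register_file", "alu", "register_file"],
    "LW":     ["memory", "register_file", "alu", "memory", "register_file"],
    "SW":     ["memory", "register_file", "alu", "memory"],
    "Branch": ["memory", "register_file", "alu"],
    "Jump":   ["memory"],
}

# Latency of each class is the sum of the delays along its path
latency_ns = {name: sum(DELAY_NS[unit] for unit in units)
              for name, units in PATHS.items()}

# The single clock must cover the slowest class
cycle_time_ns = max(latency_ns.values())
slowest = max(latency_ns, key=latency_ns.get)
```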
Performance Analysis
 The cycle time is fixed!
 However, not all instructions require the same time, so the
faster instructions waste part of each cycle
[Figure: clock timing over Cycle 1 and Cycle 2. LW fills its whole cycle; SW finishes early and the remainder of its cycle is wasted]
 Possible solution?
Performance Analysis
 Example 1. Consider the following two
implementations of a single-cycle machine:
 Machine A: all instructions execute in one cycle of fixed
length
 Machine B: all instructions execute in one cycle; however, the
cycle time adapts to the instruction type
Use the information given in the tables to compare the two
machines

Instruction type  Percentage (%)
ALU               45
Load              25
Store             10
Branch            15
Jump              5

Unit              Time (ps)
Memory            200
ALU and adders    100
Register File     50
Performance Analysis
 Example 1. Continued.
 CPU Execution Time = IC x CPI x Clock cycle time
 CPI_A = CPI_B = 1
 IC_A = IC_B
 CC_A = 600 ps (the longest instruction, Load)

Instruction  Inst.   Register  ALU  Data    Register  Total
Type         Memory  Read           Memory  Write     (ps)
R-type       200     50        100  0       50        400
Load         200     50        100  200     50        600
Store        200     50        100  200     -         550
Branch       200     50        100  0       -         350
Jump         200     -         -    -       -         200

 CC_B = 600 x 0.25 + 550 x 0.1 + 400 x 0.45 + 350 x 0.15 + 200 x 0.05 = 447.5 ps
 Performance_B / Performance_A = 600 / 447.5 = 1.34
 So, adaptive clock cycle is faster; however it is hard to implement !
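The arithmetic in Example 1 can be checked with a short sketch. The variable names are hypothetical; the instruction mix and per-class times are the ones given in the example, and instruction count and CPI cancel because they are equal for both machines.

```python
# Instruction mix (fraction of executed instructions) from the example
mix = {"R-type": 0.45, "Load": 0.25, "Store": 0.10, "Branch": 0.15, "Jump": 0.05}

# Time each class needs, in picoseconds, from the example's table
time_ps = {"R-type": 400, "Load": 600, "Store": 550, "Branch": 350, "Jump": 200}

# Machine A: fixed cycle, long enough for the slowest class
cc_a = max(time_ps.values())

# Machine B: cycle adapts per class, so use the weighted average
cc_b = sum(mix[name] * time_ps[name] for name in mix)

# With equal IC and CPI, the performance ratio is the ratio of cycle times
speedup = cc_a / cc_b
```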
Single Cycle Disadvantages & Advantages
 Single-cycle implementation assumes that all
instructions can execute in one cycle
 Advantages
 Simple and easy to understand
 Disadvantages
 Hardware duplication!
 Uses the clock cycle inefficiently – the clock cycle must
be timed to accommodate the slowest instruction
(especially problematic for more complex instructions like
floating point multiply)