L10-Pipeline-Review - Peer Instruction for Computer Science

Try to put everything together for pipelines before going on to caches.
Pipeline Summary
Peer Instruction Lecture Materials for Computer Architecture by Dr. Leo Porter is
licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 3.0
Unported License.
Given our 5-stage MIPS pipeline – what is the steady state CPI
for the following code? Assume the branch is taken thousands
of times. Recall – a processor is in steady state when all stages
are active.
Steady-State CPI = (#insts + #stalls + #flushed_insts) / #insts
Loop: lw r1, 0 (r2)
add r2, r3, r4
sub r5, r1, r2
beq r5, $zero, Loop
Selection   CPI
A           1
B           1.25
C           1.5
D           1.75
E           None of the above
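The steady-state CPI formula above can be sketched as a small helper. This is only an illustration: the function name is hypothetical, and the stall and flush counts passed in below are placeholder values, not numbers given on the slide. The real counts per loop iteration depend on the forwarding and branch-resolution assumptions you make for the pipeline.

```python
def steady_state_cpi(insts: int, stalls: int, flushed: int) -> float:
    """Steady-state CPI = (#insts + #stalls + #flushed_insts) / #insts.

    All three counts are per loop iteration once the pipeline is full.
    """
    if insts <= 0:
        raise ValueError("insts must be positive")
    if stalls < 0 or flushed < 0:
        raise ValueError("stalls and flushed must be non-negative")
    return (insts + stalls + flushed) / insts


# Placeholder counts for a 4-instruction loop body (assumed, not from the slide).
print(steady_state_cpi(4, 0, 0))  # no stalls, no flushes: ideal pipeline
print(steady_state_cpi(4, 1, 0))  # one stall cycle per iteration
print(steady_state_cpi(4, 1, 1))  # one stall plus one flushed instruction
```

To use it on the question, count the stall cycles and flushed instructions in one iteration of the loop by hand, then plug them in.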
Hardware engineers determine these to be the execution times per stage of the MIPS 5-stage pipeline processor:
IF = 200ps
ID = 100ps
EX = 100ps
M = 200ps
WB = 100ps
Consider splitting IF and M into 2 stages each. (So IF1 IF2 and M1 M2.) The most important code run by the company is (assume branch is taken most of the time):
Loop: lw r1, 0 (r2)
add r2, r3, r4
sub r5, r1, r2
beq r5, $zero, Loop
What would be the impact of the new 7-stage pipeline compared to the
original 5-stage MIPS pipeline? Assume the pipeline has forwarding where
available, predicts branch not taken, and resolves branches in ID.
Selection   CPI        CT
A           Increase   Increase
B           Increase   Decrease
C           Decrease   Increase
D           Decrease   Decrease
E           Increase   No Change
Hardware engineers determine these to be the execution times per stage of the MIPS 5-stage pipeline processor:
IF = 200ps
ID = 100ps
EX = 200ps
M = 200ps
WB = 100ps
Consider splitting IF and M into 2 stages each. (So IF1 IF2 and M1 M2.) The most important code run by the company is (assume branch is taken most of the time):
Loop: lw r1, 0 (r2)
add r2, r3, r4
sub r5, r1, r2
beq r5, $zero, Loop
What would be the impact of the new 7-stage pipeline compared to the
original 5-stage MIPS pipeline? Assume the pipeline has forwarding where
available, predicts branch not taken, and resolves branches in ID.
Selection   CPI        CT
A           Increase   Increase
B           Increase   Decrease
C           Decrease   Increase
D           Decrease   Decrease
E           Increase   No Change
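The cycle-time half of the two questions above can be checked with a short sketch. It rests on assumptions the slides do not state: the clock period equals the slowest stage latency, pipeline-register overhead is ignored, and a split stage divides evenly into two halves. The helper names are hypothetical.

```python
def cycle_time_ps(stages: dict) -> int:
    """Clock period is set by the slowest stage (register overhead ignored)."""
    return max(stages.values())


def split_stage(stages: dict, name: str, parts: int = 2) -> dict:
    """Replace stage `name` with `parts` equal sub-stages (assumed even split)."""
    result = {}
    for stage, latency in stages.items():
        if stage == name:
            for i in range(1, parts + 1):
                result[f"{stage}{i}"] = latency // parts
        else:
            result[stage] = latency
    return result


# Stage latencies in picoseconds, taken from the two question variants.
variant_ex_100 = {"IF": 200, "ID": 100, "EX": 100, "M": 200, "WB": 100}
variant_ex_200 = {"IF": 200, "ID": 100, "EX": 200, "M": 200, "WB": 100}

for label, five_stage in (("EX = 100ps", variant_ex_100),
                          ("EX = 200ps", variant_ex_200)):
    seven_stage = split_stage(split_stage(five_stage, "IF"), "M")
    print(label,
          "| 5-stage CT:", cycle_time_ps(five_stage),
          "| 7-stage CT:", cycle_time_ps(seven_stage))
```

Under these assumptions, splitting IF and M shortens the clock only when they were the sole slowest stages. If another stage is just as slow, it still sets the cycle time.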
7-stage Pipeline
Loop: lw r1, 0 (r2)
add r2, r3, r4
sub r5, r1, r2
beq r5, $zero, Loop
Pipelining -- Key Points
• Pipelining focuses on improving instruction throughput, not individual instruction latency.
• Data hazards can be handled by hardware or software – but most modern processors have hardware support for stalling and forwarding.
• Control hazards can be handled by hardware or software – but most modern processors use Branch Target Buffers and advanced dynamic branch prediction to reduce the hazard.
• ET = IC*CPI*CT
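The performance equation shows why a deeper pipeline can still win even when its CPI goes up. A minimal sketch follows. The instruction count, CPI values, and cycle times below are hypothetical placeholders chosen only to show the arithmetic; they are not results from the slides.

```python
def execution_time_ps(ic: int, cpi: float, ct_ps: float) -> float:
    """ET = IC * CPI * CT, with CT in picoseconds."""
    return ic * cpi * ct_ps


def speedup(et_old: float, et_new: float) -> float:
    """How many times faster the new design is than the old one."""
    return et_old / et_new


# Hypothetical designs running the same program (same instruction count).
IC = 4000
et_shallow = execution_time_ps(IC, cpi=1.5, ct_ps=200)  # lower CPI, slower clock
et_deep = execution_time_ps(IC, cpi=2.0, ct_ps=100)     # higher CPI, faster clock

print(et_shallow, et_deep, speedup(et_shallow, et_deep))
```

Because IC is the same for both designs, the comparison reduces to the product CPI * CT. A CPI increase is worth paying whenever the cycle time shrinks by a larger factor.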