Princess Sumaya Univ. Computer Engineering Dept. Princess Sumaya University 22444 – Computer Arch.
Slide 1
Princess Sumaya Univ.
Computer Engineering Dept.
Slide 2
Princess Sumaya University
22444 – Computer Arch. & Org. (1)
Computer Engineering Dept.
Pipelining
Applied to sequential steps to speed up operation
Example: Compute $R_i = A_i \times B_i + C_i$ for $i = 1, 2, \dots, n$
Perform the multiplication first
Perform the addition next
Slide 3
Sequential Execution
[Figure: sequential datapath with registers R1, R2, and R3, a multiplier, and an adder; operations MUL and ADD]
Slide 4
Pipelined Execution
[Figure: pipelined datapath with registers R1, R2, R3, and R4, a multiplier, and an adder; operations MUL and ADD]
Slide 5
Pipelined Execution
[Figure: timing diagram against Clock. Signals and their successive values:]
Input A: A1, A2, A3, A4
Input B: B1, B2, B3, B4
Input C: C1, C2, C3, C4
R1: A1, A2, A3, A4
R2: B1, B2, B3, B4
Mult.: A1B1, A2B2, A3B3, A4B4
R3: A1B1, A2B2, A3B3, A4B4
R4: C1, C2, C3, C4
Adder: A1B1+C1, A2B2+C2, A3B3+C3, A4B4+C4
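The register behaviour in this timing diagram can be imitated with a small clocked simulation. This is only a sketch under stated assumptions: all registers load on the same clock edge, R4 receives C_i one clock after R1 and R2 receive A_i and B_i, and the adder output is latched in a register called R5 (that name is borrowed from homework problem 9-5; the slide itself only shows the adder output). The function name `simulate` is hypothetical.

```python
def simulate(a, b, c):
    """Clock a multiply-add pipeline and return the results in completion order."""
    n = len(a)
    r1 = r2 = r3 = r4 = r5 = None  # None marks an empty register
    results = []
    for clock in range(n + 2):  # n tasks need n + 2 clock pulses to drain
        # Every register loads on the same edge, so the next values
        # are computed from the current register contents first.
        next_r5 = r3 + r4 if r3 is not None else None   # adder stage
        next_r3 = r1 * r2 if r1 is not None else None   # multiplier stage
        next_r4 = c[clock - 1] if 1 <= clock <= n else None
        next_r1 = a[clock] if clock < n else None
        next_r2 = b[clock] if clock < n else None
        r1, r2, r3, r4, r5 = next_r1, next_r2, next_r3, next_r4, next_r5
        if r5 is not None:
            results.append(r5)
    return results


print(simulate([1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12]))
```

While R3 and R4 hold the operands of task i, R1 and R2 already hold the operands of task i+1, which is the overlap the diagram illustrates.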
Slide 6
Pipelines
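[Figure: four-segment pipeline with Input, segments S1 to S4, registers R1 to R4, and a common Clock]
Space-time diagram for 6 tasks (T1 to T6):
Time: 1   2   3   4   5   6   7   8   9
S1:   T1  T2  T3  T4  T5  T6
S2:       T1  T2  T3  T4  T5  T6
S3:           T1  T2  T3  T4  T5  T6
S4:               T1  T2  T3  T4  T5  T6
Serial Execution Time: $T_S = 6 \text{ tasks} \times 4 \text{ time units per task}$
Parallel Execution Time: $T_P = (4) + (6 - 1)$ time units
Speedup $= \frac{T_S}{T_P} = 2.67$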
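The space-time diagram can also be generated programmatically, which makes it easy to check the total of k + (n - 1) time units for other sizes. The sketch below assumes an ideal pipeline in which every task spends exactly one time unit in each segment; the function names `space_time` and `render` are hypothetical.

```python
def space_time(n_tasks, k_segments):
    """Return rows[s][t]: the task number in segment s at time t, or None if idle."""
    total = k_segments + n_tasks - 1  # time units until the last task leaves
    rows = []
    for s in range(k_segments):
        row = []
        for t in range(total):
            task = t - s  # task t-s reaches segment s at time t (0-based)
            row.append(task + 1 if 0 <= task < n_tasks else None)
        rows.append(row)
    return rows


def render(rows):
    """Format the diagram as text, one line per segment."""
    lines = []
    for s, row in enumerate(rows, start=1):
        cells = " ".join(f"T{x}".ljust(3) if x else "   " for x in row)
        lines.append(f"S{s}: {cells}".rstrip())
    return "\n".join(lines)


diagram = space_time(6, 4)
print(render(diagram))
```

For 6 tasks and 4 segments each row has 9 columns, matching the 9 time units of the diagram.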
Slide 7
Pipelines
Speedup $= \frac{n\,(k\,T)}{(k + (n - 1))\,T}$
n = number of tasks
k = number of segments
T = segment time
Speedup $= \frac{n k}{k + (n - 1)} \approx k$ for $n \gg k$
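The speedup formula is easy to evaluate numerically. The following minimal sketch assumes equal segment times, so T cancels; the function name `speedup` is hypothetical. It reproduces the 2.67 figure for 6 tasks on 4 segments and shows the value approaching k when n is much larger than k.

```python
def speedup(n, k):
    """Ideal pipeline speedup for n tasks on k equal-time segments."""
    return (n * k) / (k + (n - 1))


for n in (6, 100, 10_000):
    print(n, round(speedup(n, 4), 2))
```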
Slide 8
Pipelining
Non Pipelined Process
[Figure: non-pipelined datapath: PC, Instr. Mem. (Fetch Instr., delay τMem), Register File (Get Operands, τReg), ALU (Execute, τALU), Data Mem. (Store Result, τMem); Instr. 1 passes through the four steps in turn]
Slide 9
Pipelining
Non Pipelined Process
[Figure: non-pipelined datapath: PC, Instr. Mem. (Fetch Instr., delay τMem), Register File (Get Operands, τReg), ALU (Execute, τALU), Data Mem. (Store Result, τMem); Instr. 2 passes through the four steps in turn]
Slide 10
Pipelining
Non Pipelined Process
● Clock Period =
● Clocks per Instruction =
[Figure: non-pipelined datapath: PC, Instr. Mem. (τMem), Register File (τReg), ALU (τALU), Data Mem. (τMem)]
Slide 11
Pipelining
Pipelined Process
[Figure: pipelined datapath with PC, Instr. Mem., IR, Register File, X, Y, ALU, Result, and Data Mem.; stages Fetch Instr. (τMem), Get Operands (τReg), Execute (τALU), Store Result (τMem); instructions 1 through 5 move through the stages in order]
Slide 12
Pipelining
Pipelined Process
[Figure: pipelined datapath with PC, Instr. Mem., IR, Register File, X, Y, ALU, Result, and Data Mem.]
Time  Stage 1         Stage 2         Stage 3    Stage 4
1     Fetch Instr. 1
2     Fetch Instr. 2  Get Operands 1
3     Fetch Instr. 3  Get Operands 2  Execute 1
4     Fetch Instr. 4  Get Operands 3  Execute 2  Store Result 1
5     Fetch Instr. 5  Get Operands 4  Execute 3  Store Result 2
6     Fetch Instr. 6  Get Operands 5  Execute 4  Store Result 3
Slide 13
Pipelining
Pipelined Process
● Clock Period =
● Clocks per Instruction =
[Figure: pipelined datapath with PC, Instr. Mem. (τMem), IR, Register File (τReg), X, Y, ALU (τALU), Result, and Data Mem. (τMem)]
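The clock period and clocks per instruction of the two designs can be compared with a rough sketch. It assumes a single-cycle reading of the non-pipelined process (one long clock covering all four steps) and an ideal pipeline whose clock is set by the slowest stage, with register overhead and hazards ignored. The delay values and function names are hypothetical, not taken from the slides.

```python
# Hypothetical stage delays in ns, in stage order:
# tau_Mem (fetch), tau_Reg (get operands), tau_ALU (execute), tau_Mem (store).
DELAYS_NS = [10, 5, 8, 10]


def non_pipelined_clock(delays):
    """Assumed model: one clock must cover all four steps."""
    return sum(delays)


def pipelined_clock(delays):
    """Assumed model: the slowest stage sets the clock."""
    return max(delays)


def total_time(n_instr, delays, pipelined):
    """Time in ns to finish n_instr instructions under the assumed model."""
    if pipelined:
        k = len(delays)
        return (k + n_instr - 1) * pipelined_clock(delays)
    return n_instr * non_pipelined_clock(delays)


serial = total_time(100, DELAYS_NS, pipelined=False)
piped = total_time(100, DELAYS_NS, pipelined=True)
print(serial, piped, round(serial / piped, 2))
```

Under these assumptions both designs complete about one instruction per clock once running, and the gain comes entirely from the shorter clock period. Because the stages are unbalanced, the speedup stays below the stage count of 4.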
Slide 14
Pipelining Hazards
Structural Hazards
The hardware cannot support the combination of instructions that
occupy the pipeline at a given time.
Example:
[Figure: pipelined datapath with PC, Instr. Mem., IR, Register File, X, Y, ALU, Result, and Data Mem.]
Slide 15
Pipelining Hazards
Data Hazards
One instruction has to wait for another to complete.
Example:
[Figure: pipelined datapath with PC, Instr. Mem., IR, Register File, X, Y, ALU, Result, and Data Mem.]
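A data hazard of the read-after-write kind can be spotted by scanning a program for an instruction that reads a register written by a recent predecessor. The sketch below is illustrative only: the `Instr` record, the register names, and the `window` of 2 are assumptions. The window follows from assuming the four-stage pipeline of the earlier slides with no forwarding, where a result stored in stage 4 can be read in stage 2 only in a later clock.

```python
from typing import NamedTuple


class Instr(NamedTuple):
    dest: str     # register written by the instruction
    srcs: tuple   # registers read by the instruction


def raw_hazards(program, window=2):
    """Return (producer, consumer, register) index triples that would force a wait."""
    hazards = []
    for j, consumer in enumerate(program):
        # Only producers at most `window` instructions ahead are still unfinished.
        for i in range(max(0, j - window), j):
            if program[i].dest in consumer.srcs:
                hazards.append((i, j, program[i].dest))
    return hazards


program = [
    Instr("r1", ("r2", "r3")),
    Instr("r4", ("r1", "r5")),  # reads r1 right after it is written
    Instr("r6", ("r7", "r8")),
    Instr("r9", ("r1", "r6")),  # r1 is ready by now, r6 is not
]
print(raw_hazards(program))
```

Each reported pair is a case where the consumer has to wait for the producer to complete, or where the hardware must supply the value some other way.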
Slide 16
Pipelining Hazards
Control Hazards
A decision depends on the result of an instruction that has not yet finished.
Example:
[Figure: pipelined datapath with PC, Instr. Mem., IR, Register File, X, Y, ALU, Result, and Data Mem.]
Time  Stage 1          Stage 2          Stage 3     Stage 4
1     Fetch Instr. 24
2     Fetch Instr. 28  Get Operands 24
3     ?                Get Operands 28  Execute 24
4     ?                                 Execute 28  Store Result 24
Slide 17
Pipelining
Slide 18
Homework
Mano
9-1
In certain scientific computations it is necessary to
perform the arithmetic operation
$(A_i + B_i)(C_i + D_i)$ with a stream of numbers. Specify a
pipeline configuration to carry out this task. List the
contents of all registers in the pipeline for i = 1 through 6.
9-2
Draw a space-time diagram for a six-segment pipeline
showing the time it takes to process eight tasks.
9-3
Determine the number of clock cycles that it takes to
process 200 tasks in a six-segment pipeline.
Slide 19
Homework
9-4
A nonpipeline system takes 50 ns to process a task. The
same task can be processed in a six-segment pipeline with
a clock cycle of 10 ns. Determine the speedup ratio of the
pipeline for 100 tasks. What is the maximum speedup that
can be achieved?
Slide 20
Homework
9-5
The pipeline of Fig 9-2 has the following propagation
times: 40 ns for the operands to be read from memory
into registers R1 and R2, 45 ns for the signal to propagate
through the multiplier, 5 ns for the transfer into R3, and
15 ns to add the two numbers into R5.
a. What is the minimum clock cycle time that can be
used?
b. A nonpipeline system can perform the same operation
by removing R3 and R4. How long will it take to
multiply and add the operands without using the
pipeline?
c. Calculate the speedup of the pipeline for 10 tasks and
again for 100 tasks.
d. What is the maximum speedup that can be achieved?