Princess Sumaya Univ. Computer Engineering Dept. Princess Sumaya University 22444 – Computer Arch.


Slide 1

Princess Sumaya Univ.
Computer Engineering Dept.


Slide 2

Princess Sumaya University

22444 – Computer Arch. & Org. (1)

Computer Engineering Dept.

Pipelining
• Applied to sequential steps to speed up operation
Example: Compute Ri = Ai × Bi + Ci for i = 1, 2, …, n
Perform the multiplication first
Perform the addition next


Slide 3

Sequential Execution
[Figure: non-pipelined datapath with registers R1, R2, and R3, a multiplier, and an adder. Timeline: a MUL step followed by an ADD step.]


Slide 4

Pipelined Execution
[Figure: pipelined datapath with registers R1, R2, R3, and R4, a multiplier, and an adder. Timeline: MUL and ADD steps overlap.]


Slide 5

Pipelined Execution
[Timing diagram: values carried by each signal on successive clock periods]

Signal    Successive values
Input A   A1, A2, A3, A4
Input B   B1, B2, B3, B4
R1        A1, A2, A3, A4
R2        B1, B2, B3, B4
Mult.     A1B1, A2B2, A3B3, A4B4
R3        A1B1, A2B2, A3B3, A4B4
Input C   C1, C2, C3, C4
R4        C1, C2, C3, C4
Adder     A1B1+C1, A2B2+C2, A3B3+C3, A4B4+C4
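The register contents in the timing diagram can be reproduced with a small simulation. This is a minimal sketch, not the lecture's own material: the function name `simulate_multiply_add` is hypothetical, and it assumes that all registers load on the same clock edge and that each Ci arrives one clock after the matching Ai and Bi, so that it reaches R4 together with the product in R3.

```python
def simulate_multiply_add(a, b, c):
    """Clock a two-stage multiply-add pipeline; return one row per clock.

    Each row is (clock, R1, R2, R3, R4, adder_output).
    None means the value is not valid yet (or no longer valid).
    The three input lists are assumed to have the same length.
    """
    n = len(a)
    r1 = r2 = r3 = r4 = None
    trace = []
    for clock in range(1, n + 2):          # n loads plus one clock to drain
        # All registers load on the same clock edge, so the new values
        # are computed from the old register contents first.
        new_r3 = r1 * r2 if r1 is not None else None
        new_r4 = c[clock - 2] if clock >= 2 else None   # assumption: C lags A, B by one clock
        new_r1 = a[clock - 1] if clock <= n else None
        new_r2 = b[clock - 1] if clock <= n else None
        r1, r2, r3, r4 = new_r1, new_r2, new_r3, new_r4
        adder = r3 + r4 if r3 is not None else None     # combinational adder output
        trace.append((clock, r1, r2, r3, r4, adder))
    return trace


if __name__ == "__main__":
    for row in simulate_multiply_add([1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12]):
        print(row)
```

With four input triples the first result appears on clock 2 and one more result follows on every clock after that, which is the overlap the diagram illustrates.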


Slide 6

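Pipelines
[Figure: four-segment pipeline driven by a common clock: Input → S1 → R1 → S2 → R2 → S3 → R3 → S4 → R4]

Space-time diagram for six tasks T1 to T6:

Time   1   2   3   4   5   6   7   8   9
S1     T1  T2  T3  T4  T5  T6
S2         T1  T2  T3  T4  T5  T6
S3             T1  T2  T3  T4  T5  T6
S4                 T1  T2  T3  T4  T5  T6

Serial Execution Time: $T_S = 6 \text{ tasks} \times 4 \text{ time units per task} = 24 \text{ time units}$
Parallel Execution Time: $T_P = 4 + (6 - 1) = 9 \text{ time units}$
Speedup: $T_S / T_P = 24 / 9 \approx 2.67$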
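The space-time diagram and the two execution times can be generated rather than drawn by hand. The sketch below is illustrative only; `space_time` is a hypothetical helper, and it assumes every segment takes exactly one time unit and a new task enters on every clock.

```python
def space_time(n_tasks, k_segments):
    """Return one row per segment; row[t] is the task number in that
    segment during clock t + 1, or None when the segment is idle."""
    total_clocks = k_segments + (n_tasks - 1)
    rows = []
    for segment in range(k_segments):            # 0-based segment index
        row = [None] * total_clocks
        for task in range(1, n_tasks + 1):
            row[segment + task - 1] = task       # task enters one segment per clock
        rows.append(row)
    return rows


if __name__ == "__main__":
    rows = space_time(n_tasks=6, k_segments=4)
    for number, row in enumerate(rows, start=1):
        print(f"S{number}", " ".join(f"T{t}" if t else "  " for t in row))
    serial = 6 * 4                # every task passes through all four segments alone
    parallel = len(rows[0])       # number of clock columns in the diagram
    print(serial, parallel, round(serial / parallel, 2))
```

For six tasks and four segments the diagram has nine clock columns, so the ratio is 24 / 9, matching the figures on the slide.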


Slide 7

Pipelines

$$\text{Speedup} = \frac{n\,(kT)}{\bigl(k + (n - 1)\bigr)\,T}$$

n = number of tasks
k = number of segments
T = segment time

$$\text{Speedup} = \frac{nk}{k + (n - 1)} \approx k \quad \text{for } n \gg k$$
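The limiting value follows from dividing the numerator and denominator by n. A short derivation using the slide's symbols n and k (S is introduced here only as shorthand for the speedup):

```latex
\[
S \;=\; \frac{nk}{k + (n - 1)}
  \;=\; \frac{k}{1 + \dfrac{k - 1}{n}}
  \;\longrightarrow\; k
  \quad \text{as } n \to \infty .
\]
```

For the six-task, four-segment case, S = 24 / 9 ≈ 2.67, well below the limit of 4 because n is not much larger than k.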


Slide 8

Pipelining
• Non-Pipelined Process

[Figure: the PC addresses the Instr. Mem.; Instr. 1 then passes through the four steps one after another]

Step            Unit            Delay
Fetch Instr.    Instr. Mem.     ƮMem
Get Operands    Register File   ƮReg
Execute         ALU             ƮALU
Store Result    Data Mem.       ƮMem


Slide 9

Pipelining
• Non-Pipelined Process

[Figure: the same four-step datapath, now occupied by Instr. 2: Fetch Instr. (Instr. Mem., ƮMem), Get Operands (Register File, ƮReg), Execute (ALU, ƮALU), Store Result (Data Mem., ƮMem)]


Slide 10

Pipelining
• Non-Pipelined Process
● Clock Period =
● Clocks per Instruction =

[Figure: PC → Instr. Mem. (ƮMem) → Register File (ƮReg) → ALU (ƮALU) → Data Mem. (ƮMem)]
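The slide leaves both values blank. One common reading, taken here as an assumption rather than as the lecturer's answer, is a design in which each instruction passes through all four units within a single long clock period. Writing τ for the slide's Ʈ, that reading gives:

```latex
\[
T_{\text{clock}} \;=\; \tau_{\text{Mem}} + \tau_{\text{Reg}} + \tau_{\text{ALU}} + \tau_{\text{Mem}},
\qquad
\text{Clocks per Instruction} \;=\; 1 .
\]
```

Under this assumption the clock must be long enough for the signal to travel through the instruction memory, register file, ALU, and data memory in sequence.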


Slide 11

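Pipelining
• Pipelined Process

[Figure: pipelined datapath with a register after each unit: PC → Instr. Mem. (Fetch Instr., ƮMem) → IR → Register File (Get Operands, ƮReg) → X, Y → ALU (Execute, ƮALU) → Result → Data Mem. (Store Result, ƮMem). Animation: instructions 1 to 5 enter Fetch Instr. in turn while earlier instructions move on to the later steps.]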


Slide 12

Pipelining
• Pipelined Process

[Figure: PC → Instr. Mem. → IR → Register File → X, Y → ALU → Result → Data Mem., divided into Stage 1 to Stage 4]

Time   Stage 1          Stage 2          Stage 3     Stage 4
1      Fetch Instr. 1
2      Fetch Instr. 2   Get Operands 1
3      Fetch Instr. 3   Get Operands 2   Execute 1
4      Fetch Instr. 4   Get Operands 3   Execute 2   Store Result 1
5      Fetch Instr. 5   Get Operands 4   Execute 3   Store Result 2
6      Fetch Instr. 6   Get Operands 5   Execute 4   Store Result 3


Slide 13

Pipelining
• Pipelined Process
● Clock Period =
● Clocks per Instruction =

[Figure: PC → Instr. Mem. (ƮMem) → IR → Register File (ƮReg) → X, Y → ALU (ƮALU) → Result → Data Mem. (ƮMem)]
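The slide leaves the pipelined clock period and clocks per instruction blank. A common textbook reading, taken here as an assumption, is that the slowest stage sets the pipelined clock and that one instruction completes per clock once the pipeline is full. The sketch below compares that with a single long clock covering all four units; the delay values are hypothetical, and register setup and propagation overhead is ignored.

```python
def clock_periods(stage_delays_ns):
    """Return (non_pipelined_period, pipelined_period) in ns.

    non-pipelined: one clock must cover every unit in sequence.
    pipelined:     the clock only has to cover the slowest stage.
    """
    non_pipelined = sum(stage_delays_ns)
    pipelined = max(stage_delays_ns)
    return non_pipelined, pipelined


if __name__ == "__main__":
    # Hypothetical delays for Instr. Mem., Register File, ALU, Data Mem.
    delays_ns = [10, 5, 8, 10]
    non_pipelined, pipelined = clock_periods(delays_ns)
    print(f"non-pipelined clock: {non_pipelined} ns")
    print(f"pipelined clock:     {pipelined} ns")
    print(f"throughput gain when full: {non_pipelined / pipelined:.2f}x")
```

With unequal stage delays the gain stays below the number of stages, because the faster stages sit idle for part of every clock.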


Slide 14

Pipelining Hazards
• Structural Hazards
The hardware cannot support the combination of instructions that are in the pipeline at a certain time.
Example:

[Figure: PC → Instr. Mem. → IR → Register File → X, Y → ALU → Result → Data Mem.]
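The slide's own example is a figure whose details did not survive extraction. One standard illustration, offered here as an assumption, is a design in which Instr. Mem. and Data Mem. are a single memory: the Fetch Instr. step of one instruction and the Store Result step of an earlier one then need the same unit in the same clock. The sketch below checks a four-stage schedule for such clashes; all names in it are hypothetical.

```python
from collections import Counter

STAGES = ["Fetch Instr.", "Get Operands", "Execute", "Store Result"]

# Resource used by each stage: separate memories versus one shared memory.
SEPARATE = {
    "Fetch Instr.": "Instr. Mem.",
    "Get Operands": "Register File",
    "Execute": "ALU",
    "Store Result": "Data Mem.",
}
SHARED = {**SEPARATE, "Fetch Instr.": "Memory", "Store Result": "Memory"}


def structural_conflicts(n_instr, resource_of_stage):
    """Return {clock: [resources needed by more than one instruction]}."""
    conflicts = {}
    last_clock = n_instr + len(STAGES) - 1
    for clock in range(1, last_clock + 1):
        in_use = Counter()
        for instr in range(1, n_instr + 1):
            stage_index = clock - instr          # instruction i starts at clock i
            if 0 <= stage_index < len(STAGES):
                in_use[resource_of_stage[STAGES[stage_index]]] += 1
        clashing = sorted(r for r, count in in_use.items() if count > 1)
        if clashing:
            conflicts[clock] = clashing
    return conflicts


if __name__ == "__main__":
    print(structural_conflicts(5, SEPARATE))
    print(structural_conflicts(5, SHARED))
```

With separate memories the schedule is conflict free; with the shared memory, every clock in which one instruction fetches while another stores is flagged.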


Slide 15

Pipelining Hazards
• Data Hazards
An instruction has to wait for an earlier instruction to complete because it needs that instruction's result.
Example:

[Figure: PC → Instr. Mem. → IR → Register File → X, Y → ALU → Result → Data Mem.]
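The slide's example is a figure only, so the following is a hedged sketch rather than the lecture's case. It assumes the four-stage model used in these slides, in which an instruction reads its operands in stage 2 (Get Operands) and its result becomes available only after stage 4 (Store Result). Under that assumption an instruction that reads a value produced one or two instructions earlier would read it too soon. The program encoding and the name `raw_hazards` are hypothetical, and the check ignores the case where a later instruction overwrites the same location.

```python
def raw_hazards(program, read_stage=2, write_stage=4):
    """Find read-after-write hazards in a straight-line program.

    program: list of (destination, sources) tuples, in program order.
    Returns a list of (producer_index, consumer_index, name), 0-based.
    """
    hazards = []
    for j, (dest, _) in enumerate(program):
        for i in range(j + 1, len(program)):
            _, sources = program[i]
            reads_at = i + read_stage        # clock in which instruction i reads
            writes_at = j + write_stage      # clock in which instruction j stores
            if dest in sources and reads_at <= writes_at:
                hazards.append((j, i, dest))
    return hazards


if __name__ == "__main__":
    program = [
        ("r1", ("a", "b")),      # r1 = a op b
        ("r2", ("r1", "c")),     # needs r1 from the instruction just before
        ("r3", ("d", "e")),
        ("r4", ("r1", "r3")),    # r1 is ready by now, r3 is not
    ]
    print(raw_hazards(program))
```

Each reported pair is a point where the consumer must wait (or the hardware must otherwise supply the value) before it can proceed.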


Slide 16

Pipelining Hazards
• Control Hazards
A decision depends on the result of an instruction that has not finished yet.
Example:

[Figure: PC → Instr. Mem. → IR → Register File → X, Y → ALU → Result → Data Mem., divided into Stage 1 to Stage 4]

Time   Stage 1           Stage 2           Stage 3      Stage 4
1      Fetch Instr. 24
2      Fetch Instr. 28   Get Operands 24
3      ?                 Get Operands 28   Execute 24
4      ?                                   Execute 28   Store Result 24
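The cost of a control hazard can be estimated with a simple stall model. This is a sketch under stated assumptions, not the lecture's method: the fetch stage is assumed to wait after every branch until the branch has finished a given stage (`resolve_stage`, hypothetically the Execute stage), and no branch is the last instruction. The function name `total_clocks` is also hypothetical.

```python
def total_clocks(n_instr, n_branches, k_stages=4, resolve_stage=3):
    """Clocks needed to run n_instr instructions when fetch stalls after each branch.

    A branch fetched in clock c finishes stage `resolve_stage` in clock
    c + resolve_stage - 1, so the next fetch slips from clock c + 1 to
    clock c + resolve_stage: resolve_stage - 1 clocks are lost per branch.
    """
    stall_per_branch = resolve_stage - 1
    return k_stages + (n_instr - 1) + n_branches * stall_per_branch


if __name__ == "__main__":
    print(total_clocks(100, 0))     # no branches: the ideal k + (n - 1)
    print(total_clocks(100, 20))    # one instruction in five is a branch
```

With no branches the result reduces to the k + (n - 1) count used for the ideal pipeline; every branch adds a fixed number of idle fetch clocks on top of it.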


Slide 17

Pipelining


Slide 18

Homework
• Mano
9-1

In certain scientific computations it is necessary to
perform the arithmetic operation
(Ai + Bi)(Ci + Di) with a stream of numbers. Specify a
pipeline configuration to carry out this task. List the
contents of all registers in the pipeline for i = 1 through 6.

9-2

Draw a space-time diagram for a six-segment pipeline
showing the time it takes to process eight tasks.

9-3

Determine the number of clock cycles that it takes to
process 200 tasks in a six-segment pipeline.


Slide 19

Homework
9-4

A nonpipeline system takes 50 ns to process a task. The
same task can be processed in a six-segment pipeline with
a clock cycle of 10 ns. Determine the speedup ratio of the
pipeline for 100 tasks. What is the maximum speedup that
can be achieved?


Slide 20

Homework
9-5

The pipeline of Fig 9-2 has the following propagation
times: 40 ns for the operands to be read from memory
into registers R1 and R2, 45 ns for the signal to propagate
through the multiplier, 5 ns for the transfer into R3, and
15 ns to add the two numbers into R5.
a. What is the minimum clock cycle time that can be
used?
b. A nonpipeline system can perform the same operation
by removing R3 and R4. How long will it take to
multiply and add the operands without using the
pipeline?
c. Calculate the speedup of the pipeline for 10 tasks and
again for 100 tasks.
d. What is the maximum speedup that can be achieved?