Transcript Slide 1

COMP60611
Fundamentals of Parallel
and Distributed Systems
Lecture 2
Introduction to Parallel Programs
John Gurd, Graham Riley
Centre for Novel Computing
School of Computer Science
University of Manchester
Combining the strengths of UMIST and
The Victoria University of Manchester
Overview
• We focus on the higher of the two implementation-oriented Levels of Abstraction
– The Program Level
• sequential state-transition programming model
• two fundamental ways to go parallel
– processes (message-passing)
– threads (data-sharing)
• implications for parallel programming languages
– Summary
On the Nature of
Digital Systems
• Programs and their hardware realisations, whether parallel or
sequential, are essentially the same; i.e., programmed state-transition.
• For example, a sequential system (abstract machine or
concrete hardware) comprises a state-transition machine
(processor) attached to a memory which is divided into two
logical sections; one fixed (the code state) and one changeable
(the data state).
– The code state contains instructions (or statements) that can be
executed in a data-dependent sequence, changing the contents of
the data state in such a way as to progress the required
computation. The sequence is controlled by a program counter.
– The data state contains the variables of the computation. These
start in an initial state, defining the input of the computation, and
finish by holding its logical output. In programming terms, the data
state contains the data structures of the program.
Sequential Digital Systems
• Performance in the above model is governed by a state-transition cycle. The program counter identifies the 'current'
instruction. This is fetched from the code state and then
executed. Execution involves reading data from the data state,
performing appropriate operations, then writing results back to
the data state and assigning a new value to the program
counter.
• To a first approximation, execution time will depend on the exact
number and sequence of these actions (we ignore, for the
moment, the effect of any memory buffering schemes).
• This is the programmer's model of what constitutes a sequential
computation. It is predominantly a model of memory, and the
programmer's art is essentially to map algorithms into memory
in such a way that they will execute with good performance.
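• As a concrete (and purely illustrative) sketch of this cycle, the following C program implements a tiny state-transition machine; the five-instruction set, the accumulator, and the three-word data state are assumptions made for the example, not any real architecture.

/* A tiny state-transition machine: fixed code state, changeable
   data state, a program counter, and processor state (an
   accumulator) rolled in alongside. The instruction set and state
   sizes are illustrative assumptions only. */
#include <stdio.h>

typedef enum { LOAD, ADD, STORE, JUMP_NZ, HALT } Op;
typedef struct { Op op; int arg; } Instr;      /* one code-state entry */

int main(void) {
    Instr code[] = {                 /* code state: computes data[2] =  */
        { LOAD,  0 },                /*   data[0] + data[1]             */
        { ADD,   1 },
        { STORE, 2 },
        { HALT,  0 }
    };
    int data[3] = { 3, 4, 0 };       /* data state: input and output    */
    int acc = 0;                     /* processor state                 */
    int pc  = 0;                     /* program counter                 */

    for (;;) {                       /* the state-transition cycle:     */
        Instr i = code[pc++];        /* fetch current instruction       */
        switch (i.op) {              /* execute it                      */
        case LOAD:    acc  = data[i.arg]; break;   /* read data state   */
        case ADD:     acc += data[i.arg]; break;
        case STORE:   data[i.arg] = acc;  break;   /* write data state  */
        case JUMP_NZ: if (acc != 0) pc = i.arg;    /* data-dependent    */
                      break;                       /*   sequencing      */
        case HALT:    printf("result = %d\n", data[2]);
                      return 0;
        }
    }
}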
Sequential Digital Systems
• It will be convenient to think of memory in diagrammatic terms.
In this sense, the model can be visualised as follows:
[Diagram: a memory image in two sections: Code (fixed memory) above Data (changeable memory).]
• The Code has an associated, data-dependent locus of control,
governed by the program counter; there is also some associated
processor state which we roll into the Data, for the moment. This
whole memory image is called a process (after Unix
terminology; other names are used).
Parallel Execution
• It is possible to execute more than one process concurrently,
and to arrange for the processes to co-operate in solving some
large problem using a message-passing protocol (cf. Unix
pipes, forks, etc.).
• However, the 'start-up' costs associated with each process are
large, mainly due to the cost of protecting its data memory from
access by any other process. As a consequence, a large parallel
grain size is needed.
• An alternative is to exploit parallelism within a single process,
using some form of 'lightweight' process, or thread. This should
allow use of a smaller parallel grain size, but carries risks
associated with sharing of data.
• We shall look at the case where just two processors are active.
This can be readily generalised to a larger number of
processors.
Two-fold Parallelism
• In the message-passing scheme, two-fold parallelism is
achieved by simultaneous activation of two 'co-operating'
processes.
• Each process can construct messages (think of these as values
of some abstract data type) and send them to other processes.
A process has to receive incoming messages explicitly (this
restriction can be overcome, but it is not a straightforward matter
to do so).
• The message-passing scheme is illustrated in the following
diagram:
[Diagram: Process A (Code A with Data A) and Process B (Code B with Data B), each a separate memory image, exchanging messages.]
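• As one concrete illustration (assuming a POSIX system; not part of the original slides), the following C sketch creates a second process with fork() and passes a single-int message through a Unix pipe. Real message-passing programs would send values of richer types, but the shape is the same: separate data states, explicit send and receive.

/* A minimal sketch of two-fold message-passing parallelism using
   Unix fork() and a pipe; the "message" here is a single int. */
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <sys/wait.h>

int main(void) {
    int fd[2];                       /* fd[0]: read end, fd[1]: write end */
    if (pipe(fd) == -1) { perror("pipe"); return 1; }

    pid_t pid = fork();              /* creates Process B as a copy of A  */
    if (pid == -1) { perror("fork"); return 1; }

    if (pid == 0) {                  /* Process B: its own code and data  */
        int msg;
        close(fd[1]);
        read(fd[0], &msg, sizeof msg);    /* explicit receive             */
        printf("Process B received %d\n", msg);
        close(fd[0]);
        exit(0);
    } else {                         /* Process A                         */
        int msg = 42;                /* Data A: private to this process   */
        close(fd[0]);
        write(fd[1], &msg, sizeof msg);   /* send the message             */
        close(fd[1]);
        wait(NULL);                  /* clean up the finished process     */
    }
    return 0;
}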
Two-fold Parallelism
• Within a single process, an obvious way of allowing two-fold
parallel execution is to allow two program counters to control
progress through two separate, but related, code states. To a
first approximation, the two streams of instructions will need to
share the sequential data state.
[Diagram: within one process, Thread A (Code A) and Thread B (Code B) both operate on a single Shared Data region.]
• When, as frequently happens, Code A and Code B are identical,
this scheme is termed single-program, multiple-data (SPMD).
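• A minimal sketch of the SPMD case using POSIX threads (the names, thread count, and array size are illustrative): both threads execute the same worker code, and a per-thread identity selects disjoint parts of the shared data state, so no two threads write the same location.

/* SPMD with two threads: one code state (worker), one shared data
   state, two program counters. Compile with -pthread. */
#include <pthread.h>
#include <stdio.h>

#define N 8
int shared_data[N];                  /* shared data state            */

void *worker(void *arg) {            /* same code for both threads   */
    int id = *(int *)arg;            /* each thread's own identity   */
    for (int i = id; i < N; i += 2)  /* disjoint halves of the work  */
        shared_data[i] = i * i;
    return NULL;
}

int main(void) {
    pthread_t a, b;
    int id_a = 0, id_b = 1;
    pthread_create(&a, NULL, worker, &id_a);   /* Thread A */
    pthread_create(&b, NULL, worker, &id_b);   /* Thread B */
    pthread_join(a, NULL);
    pthread_join(b, NULL);
    for (int i = 0; i < N; i++) printf("%d ", shared_data[i]);
    printf("\n");
    return 0;
}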
Privatising Data
• Each stream of instructions (from Code A and from Code B) will
issue references to the shared data state using a global
addressing scheme (i.e. the same address, issued from
whichever stream of instructions, will access the same shared
data memory location).
• There are obvious problems of contention and propriety
associated with this sharing arrangement; it will be necessary to
use locks to protect any variable that might be shared, and
these will affect performance.
• Hence, it is usual to try and identify more precisely which parts
of the data state really need to be shared; then at least the use
of locks can be confined to those variables (and only those
variables) that really need the protection.
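• A minimal sketch of such lock protection, using a POSIX mutex (an illustrative choice): every access to the shared variable is guarded, which is safe but pays a lock/unlock on each update.

/* Every update to the shared 'total' is protected by a lock.
   Compile with -pthread. */
#include <pthread.h>
#include <stdio.h>

long total = 0;                                /* shared variable */
pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;

void *worker(void *arg) {
    (void)arg;
    for (int i = 0; i < 100000; i++) {
        pthread_mutex_lock(&lock);             /* protect the shared access */
        total += 1;
        pthread_mutex_unlock(&lock);
    }
    return NULL;
}

int main(void) {
    pthread_t a, b;
    pthread_create(&a, NULL, worker, NULL);
    pthread_create(&b, NULL, worker, NULL);
    pthread_join(a, NULL);
    pthread_join(b, NULL);
    printf("total = %ld\n", total);            /* 200000, reliably */
    return 0;
}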
Privatising Data
• In general, there will be some variables that are only referenced
from one instruction stream or the other. Assuming that these
can be identified, we can segregate the data state into three
segments, as follows:
[Diagram: Thread A (Code A) and Thread B (Code B) over a data state split into three segments: Private Data A, Shared Data, and Private Data B.]
• We can then isolate the execution objects, thread A and thread
B, within the process, which have the Shared Data as their only
part in common.
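• Continuing the previous sketch under the same assumptions, privatisation confines the lock to the one access that really needs it: each thread accumulates into a private stack variable and touches the shared total, under the lock, only once.

/* Privatised version: the per-thread partial sum needs no lock;
   only the final merge into the shared total is protected.
   Compile with -pthread. */
#include <pthread.h>
#include <stdio.h>

long total = 0;                                /* shared data          */
pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;

void *worker(void *arg) {
    (void)arg;
    long mine = 0;                             /* private data: on this
                                                  thread's own stack   */
    for (int i = 0; i < 100000; i++)
        mine += 1;                             /* no lock needed       */
    pthread_mutex_lock(&lock);
    total += mine;                             /* one protected access */
    pthread_mutex_unlock(&lock);
    return NULL;
}

int main(void) {
    pthread_t a, b;
    pthread_create(&a, NULL, worker, NULL);
    pthread_create(&b, NULL, worker, NULL);
    pthread_join(a, NULL);
    pthread_join(b, NULL);
    printf("total = %ld\n", total);
    return 0;
}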
Identifying Private Data
• Determining which variables fall into which category (private-to-A; private-to-B; shared) is non-trivial. In particular, the required
category for a certain variable may depend on the values of
variables elsewhere in the data state.
• In the general case (more than two threads) the procedure for
identifying categories must distinguish the following:
– Shared variable --- can potentially be accessed by more than one
thread.
– Private variable (to thread X) --- can only ever be accessed by
thread X.
• How to achieve this distinction in acceptable time is an
interesting research problem.
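• The following hypothetical fragment illustrates why this is hard: whether a[k] is private to one thread or shared by both depends on index values read only at run time, so no static category may be assignable.

/* a[idx[me]] is private if idx[0] != idx[1], but shared (and a
   race) if they collide; idx[] comes from input, so the category
   depends on values elsewhere in the data state. Compile with
   -pthread. */
#include <pthread.h>
#include <stdio.h>
#include <stdlib.h>

int a[100];
int idx[2];                          /* filled in from input at run time */

void *worker(void *arg) {
    int me = *(int *)arg;            /* 0 for thread A, 1 for thread B   */
    a[idx[me]] += 1;                 /* private or shared? it depends.   */
    return NULL;
}

int main(int argc, char **argv) {
    if (argc != 3) { fprintf(stderr, "usage: %s i0 i1\n", argv[0]); return 1; }
    idx[0] = atoi(argv[1]);
    idx[1] = atoi(argv[2]);
    if (idx[0] < 0 || idx[0] >= 100 || idx[1] < 0 || idx[1] >= 100) return 1;
    int id0 = 0, id1 = 1;
    pthread_t t0, t1;
    pthread_create(&t0, NULL, worker, &id0);
    pthread_create(&t1, NULL, worker, &id1);
    pthread_join(t0, NULL);
    pthread_join(t1, NULL);
    printf("a[%d]=%d a[%d]=%d\n", idx[0], a[idx[0]], idx[1], a[idx[1]]);
    return 0;
}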
Parallel Programming Language
Requirements
• Consider the additional programming language constructs that
will be necessary to handle parallelism in either of the ways we
have described (concrete library counterparts are sketched in the
code after the lists below).
• Message-Passing (between processes):
– means to create new processes;
– means to place data in a process;
– means to send/receive messages;
– means to terminate 'dead' processes.
• Data-Sharing (between threads in one process):
– means to create new threads;
– means to share/privatise data;
– means to synchronise shared accesses;
– means to terminate 'dead' threads.
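• As one concrete illustration (not from the original slides), the MPI library meets the message-passing list roughly as follows: the launcher (e.g. mpirun -np 2) creates the processes, MPI_Send/MPI_Recv move messages, and MPI_Finalize handles orderly termination. The data-sharing list is met analogously by a threads API such as POSIX threads (pthread_create, mutexes, pthread_join). The tag value 0 below is arbitrary.

/* A minimal MPI message exchange between process 0 and process 1. */
#include <mpi.h>
#include <stdio.h>

int main(int argc, char **argv) {
    MPI_Init(&argc, &argv);                    /* join the set of processes */
    int rank;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);      /* which process am I?       */

    int msg;
    if (rank == 0) {
        msg = 42;                              /* data placed in process 0  */
        MPI_Send(&msg, 1, MPI_INT, 1, 0, MPI_COMM_WORLD);      /* send     */
    } else if (rank == 1) {
        MPI_Recv(&msg, 1, MPI_INT, 0, 0, MPI_COMM_WORLD,       /* explicit */
                 MPI_STATUS_IGNORE);                           /* receive  */
        printf("process 1 received %d\n", msg);
    }

    MPI_Finalize();                            /* orderly termination       */
    return 0;
}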
Summary
• The transition from a sequential programming model
to a parallel programming model can be made in two
distinct ways:
– Message-passing between separate processes (process-based parallel programming); or
– Data-sharing between separate threads within a single
process.
• Both models place new requirements for constructs
in programming languages.
• Neither model precludes use of the other; they are
simply different ways of introducing parallelism at the
program level.