Parallel Computer Architecture


Parallel Computer Architecture
Taylor Hearn, Fabrice Bokanya, Beenish Zafar, Mathew Simon, Tong Chen
What is Parallel Computing?
• Definition: Parallel computing is a type of computing architecture in which several processors execute or process an application or computation simultaneously.
• The compute resources
• The computational problem
Why Parallel Computing?
• Parallelism occurs naturally in the real world.
• Parallel computing is well suited to modeling real-world phenomena.
• Reasons for using parallel computing:
  • Save time and/or money
  • Solve larger/more complex problems
  • Provide concurrency
  • Take advantage of non-local resources
  • Achieve cost savings
  • Overcome memory constraints
Types of Parallelism
• Bit-level
  • ALU parallelism
  • Based on increasing the processor word size (see the sketch after this list)
• Instruction-level
  • Pipelining
  • Execution of several instructions simultaneously
• Thread-level
  • Splitting a program into parts, then having them run side by side
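A minimal sketch of bit-level parallelism (illustrative, not from the original slides): with a 64-bit word size, one bitwise instruction operates on 64 bit positions at once, where a narrower word would need several instructions.

#include <stdint.h>
#include <stdio.h>

int main(void) {
    /* Illustrative values: two 64-bit flag words. */
    uint64_t flags_a = 0xF0F0F0F0F0F0F0F0ULL;
    uint64_t flags_b = 0xFF00FF00FF00FF00ULL;

    /* One AND instruction combines all 64 bit positions in parallel. */
    uint64_t both = flags_a & flags_b;

    printf("result: %016llx\n", (unsigned long long)both);
    return 0;
}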
Hardware Architecture
Flynn’s Classical Taxonomy
• Michael J. Flynn, 1966
• Classification of parallel computer architectures based on the number of instruction and data streams available.
• Machines can have one or multiple instruction streams and one or multiple data streams.
• SISD: Single Instruction, Single Data
• SIMD: Single Instruction, Multiple Data
• MISD: Multiple Instruction, Single Data
• MIMD: Multiple Instruction, Multiple Data
SISD
• Single Instruction, Single Data
• Uniprocessors
• Simple to design
• Not as flexible as MIMD
SIMD
• Single Instruction, Multiple Data
• Array processors
• All processors must execute the same instruction simultaneously (a small sketch follows this list)
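A minimal SIMD sketch (illustrative; assumes an x86 processor with SSE, which the slides do not specify): a single _mm_add_ps instruction applies the same addition to four data elements at once.

#include <immintrin.h>

/* Add two float arrays; each _mm_add_ps performs four additions
   with one instruction (the trailing loop handles leftover elements). */
void add_arrays(const float *a, const float *b, float *out, int n) {
    int i = 0;
    for (; i + 4 <= n; i += 4) {
        __m128 va = _mm_loadu_ps(a + i);
        __m128 vb = _mm_loadu_ps(b + i);
        _mm_storeu_ps(out + i, _mm_add_ps(va, vb));
    }
    for (; i < n; i++)
        out[i] = a[i] + b[i];
}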
MISD
• Multiple Instruction, Single Data
MIMD
• Multiple Instruction, Multiple Data
• Each processor has its own independent instruction and data stream
• Can be further separated into shared memory and distributed memory
Flynn’s Classical Taxonomy
Advantages
• The most widely accepted classification
Disadvantages
• Very few applications of MISD machines
• Assumes parallelism is homogeneous
• No consideration of how processors are connected or how they view memory in the MIMD category
Memory Architectures
Shared Memory
• General characteristics: Shared memory parallel computers vary widely, but generally have in common the ability for all processors to access all memory as a global address space. Multiple processors can operate independently but share the same memory resources.
• Changes in a memory location effected by one processor are visible to all other processors.
• Historically, shared memory machines have been classified as UMA (Uniform Memory Access) and NUMA (Non-Uniform Memory Access), based upon memory access times.
• UMA: most commonly found in Symmetric Multiprocessor (SMP) machines; identical processors with equal access and equal access times to memory.
• NUMA: often made by physically linking two or more SMPs; one SMP can directly access the memory of another SMP. Not all processors have equal access time to all memories, and memory access across the link is slower.
Distributed Memory
• General characteristics: Distributed memory systems require a communication network to connect inter-processor memory.
• Processors have their own local memory. Memory addresses in one processor do not map to another processor, so there is no concept of a global address space across all processors.
• Because each processor has its own local memory, it operates independently. Changes it makes to its local memory have no effect on the memory of other processors.
• Advantages: Memory is scalable with the number of processors. Increase the number of processors and the size of memory increases proportionately.
• Disadvantages: The programmer is responsible for many of the details associated with data communication between processors. It may also be difficult to map existing data structures, based on global memory, to this memory organization.
Hybrid Distributed-Shared Memory
• The largest and fastest computers in the world today employ both shared and distributed memory architectures.
• The distributed memory component is the networking of multiple shared memory/GPU machines, which know only about their own memory, not the memory on another machine. Network communications are therefore required to move data from one machine to another.
• Advantages and disadvantages: whatever is common to both shared and distributed memory architectures.
• Increased scalability is an important advantage.
• Increased programmer complexity is an important disadvantage.
Parallel Programming Models
Shared Memory Model
1. In the shared-memory programming model, tasks share a common address space, which they read and write asynchronously.
2. Various mechanisms such as locks and semaphores may be used to control access to the shared memory (a small sketch follows this list).
3. An advantage of this model from the programmer's point of view is that the notion of data "ownership" is lacking, so program development can often be simplified.
4. An important disadvantage in terms of performance is that it becomes more difficult to understand and manage data locality.
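A minimal shared-memory sketch using OpenMP (illustrative; the thread and iteration counts are assumptions): four tasks asynchronously update one variable in the common address space, with a lock controlling access so no update is lost. Compile with, e.g., gcc -fopenmp.

#include <omp.h>
#include <stdio.h>

int main(void) {
    long shared_sum = 0;              /* lives in the common address space */
    omp_lock_t lock;
    omp_init_lock(&lock);

    #pragma omp parallel num_threads(4)
    {
        for (int i = 0; i < 1000; i++) {
            omp_set_lock(&lock);      /* serialize access to the shared variable */
            shared_sum += 1;
            omp_unset_lock(&lock);
        }
    }

    omp_destroy_lock(&lock);
    printf("shared_sum = %ld\n", shared_sum);   /* expect 4000 */
    return 0;
}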
Threads Model
1. In the threads model of parallel programming, a single process can have multiple, concurrent execution paths.
2. Threads are commonly associated with shared memory architectures and operating systems.
3. Unrelated standardization efforts have resulted in two very different implementations of threads: POSIX Threads and OpenMP (a POSIX Threads sketch follows this list).
4. Microsoft has its own implementation of threads, which is not related to the UNIX POSIX standard or OpenMP.
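A minimal POSIX Threads sketch (illustrative only): one process launches two concurrent execution paths that share the process's address space. Compile with, e.g., gcc -pthread.

#include <pthread.h>
#include <stdio.h>

/* Each thread is a separate execution path inside the same process. */
static void *worker(void *arg) {
    int id = *(int *)arg;
    printf("hello from thread %d\n", id);
    return NULL;
}

int main(void) {
    pthread_t threads[2];
    int ids[2] = {0, 1};

    for (int i = 0; i < 2; i++)
        pthread_create(&threads[i], NULL, worker, &ids[i]);
    for (int i = 0; i < 2; i++)
        pthread_join(threads[i], NULL);   /* wait for both paths to finish */

    return 0;
}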
Message Passing Model
1. In the message passing model, parallel tasks exchange data by passing messages to one another. These communications can be asynchronous or synchronous.
2. The Communicating Sequential Processes (CSP) formalization of message passing employed communication channels to 'connect' processes, and led to a number of important languages such as Joyce, Occam and Erlang.
3. The Message Passing Interface (MPI) is now the "de facto" industry standard for message passing, replacing virtually all other message passing implementations used for production work. Most, if not all, of the popular parallel computing platforms offer at least one implementation of MPI; a few offer a full implementation of MPI-2. (A minimal MPI sketch follows this list.)
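A minimal MPI sketch (illustrative; compile with mpicc and run under, e.g., mpirun -np 2): rank 0 passes one integer message to rank 1.

#include <mpi.h>
#include <stdio.h>

int main(int argc, char **argv) {
    MPI_Init(&argc, &argv);

    int rank;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    if (rank == 0) {
        int value = 42;                   /* illustrative payload */
        MPI_Send(&value, 1, MPI_INT, 1, 0, MPI_COMM_WORLD);
    } else if (rank == 1) {
        int value;
        MPI_Recv(&value, 1, MPI_INT, 0, 0, MPI_COMM_WORLD, MPI_STATUS_IGNORE);
        printf("rank 1 received %d\n", value);
    }

    MPI_Finalize();
    return 0;
}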
Data Parallel Model
• The address space is treated globally.
• Focuses on performing operations on a data set.
• A set of tasks works simultaneously, each on a different partition of the same data structure.
• Each task performs the same operation on its partition (a small sketch follows this list).
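A minimal data-parallel sketch using an OpenMP parallel loop (illustrative; the array size and operation are assumptions): the loop iterations are partitioned across threads, and every thread applies the same operation to its own partition of the shared array.

#include <omp.h>
#include <stdio.h>

#define N 1000000

static double a[N], b[N];

int main(void) {
    /* Each thread gets a partition of the index range and applies
       the same operation (scale by 2.0) to its part of the array. */
    #pragma omp parallel for
    for (int i = 0; i < N; i++)
        b[i] = 2.0 * a[i];

    printf("processed %d elements\n", N);
    return 0;
}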
References
• https://computing.llnl.gov/tutorials/parallel_comp/#MemoryArch
• Blaise Barney. "Introduction to Parallel Computing." http://computing.llnl.gov/tutorials/parallel_comp/#WhyUse
• Linda Null, Julia Lobur (2012). "Computer Organization and Architecture," Alternative Architectures, Third Edition, 505-541.
• http://en.wikipedia.org/wiki/Parallel_programming_model
• cosy.univ-reims.fr/~fnolot/.../introduction_to_parallel_computing.ppt
• https://computing.llnl.gov/tutorials/parallel_comp/#ModelsData