CS 838: Pervasive Parallelism
Introduction to OpenMP
Copyright 2005 Mark D. Hill
University of Wisconsin-Madison
Slides are derived from online references of
Lawrence Livermore National Laboratory, National
Energy Research Scientific Computing Center,
University of Minnesota, OpenMP.org
Thanks!
Outline
• Introduction
– Motivation
– Example
• Programming Model
– Expressing parallelism
– Synchronization
• Syntax
(C) 2005
CS 838
2
Introduction to OpenMP
• What is OpenMP?
– Open specification for Multi-Processing
– “Standard” API for defining multi-threaded shared-memory
programs
• Header
– Preprocessor (compiler) directives
– Library Calls
– Environment Variables
Motivation
• Thread libraries are hard to use
– P-Threads/Solaris threads have many library calls for initialization,
synchronization, thread creation, condition variables, etc.
– Usually require alternate programming styles
» Programmer must code with multiple threads in mind
• Synchronization between threads introduces a new
dimension of program correctness
Motivation
• Wouldn’t it be nice to write serial programs and
somehow parallelize them “automatically”?
– OpenMP can parallelize many serial programs with relatively few
annotations that specify parallelism and independence
– OpenMP is a small API that hides cumbersome threading calls with
simpler directives
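As a minimal sketch of "relatively few annotations": the function below is an ordinary serial loop plus one directive. The name scale is hypothetical, and the example assumes the loop iterations really are independent. Built without OpenMP support, the pragma is ignored and the loop runs serially with the same result.

```c
/* Multiply every element of a[] by k.
   scale() is a hypothetical name used only for illustration. */
void scale(double *a, int n, double k)
{
    int i;
    /* The only change from the serial version is this one line. */
    #pragma omp parallel for
    for (i = 0; i < n; i++) {
        a[i] *= k;   /* each iteration touches a different element */
    }
}
```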
(C) 2005
CS 838
5
Example: Hello, World!
Files:
(Makefile)
hello.c
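The course's hello.c is not reproduced in these slides. A minimal sketch of what such a file might contain is shown here; say_hello is a hypothetical name, and the serial fallback is an assumption added so the sketch also builds when OpenMP support is switched off.

```c
#include <stdio.h>

#ifdef _OPENMP
#include <omp.h>
#else
/* Serial fallback (assumption): behaves like a one-thread team. */
static int omp_get_thread_num(void) { return 0; }
#endif

/* Every thread in the team prints one greeting.
   Returns the number of greetings printed. */
int say_hello(void)
{
    int greetings = 0;              /* shared by all threads */
    #pragma omp parallel
    {
        printf("Hello, World! from thread %d\n", omp_get_thread_num());
        #pragma omp atomic
        greetings++;                /* protected update of the shared counter */
    }
    return greetings;
}
```

With an OpenMP-enabled compiler the number of lines printed depends on the team size, so the order and count of greetings can differ between runs and machines.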
Programming Model
• Thread-like Fork/Join model
– Typically multiple thread creation/destruction events
– Some built-in automatic parallelization
(Figure: threads fork at the start of a parallel region and join at its end)
Programming Models
• Data parallelism
– Threads perform similar functions, guided by thread identifier
• Control parallelism
– Threads perform differing functions
» One thread for I/O, one for computation, etc.
(Figure: Fork/Join diagram)
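A rough sketch of data parallelism "guided by thread identifier": every thread runs the same code, but uses its own ID to pick which elements it handles. The name square_cyclic and the cyclic distribution are illustrative choices, not something the slides prescribe; the serial fallbacks are an assumption so the sketch builds without OpenMP.

```c
#ifdef _OPENMP
#include <omp.h>
#else
/* Serial fallbacks (assumption): behave like a one-thread team. */
static int omp_get_thread_num(void)  { return 0; }
static int omp_get_num_threads(void) { return 1; }
#endif

/* Square every element of a[]. Thread `id` handles elements
   id, id + nthreads, id + 2*nthreads, ... so no two threads
   ever touch the same element. */
void square_cyclic(int *a, int n)
{
    #pragma omp parallel
    {
        int id = omp_get_thread_num();        /* private: declared inside the region */
        int nthreads = omp_get_num_threads();
        int i;
        for (i = id; i < n; i += nthreads) {
            a[i] = a[i] * a[i];
        }
    }
}
```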
Programming Model
Master Thread
• Thread with ID=0
• Only thread that exists in
sequential regions
• Depending on
implementation, may have
special purpose inside
parallel regions
• Some special directives
affect only the master thread
(like master)
(Figure: master thread 0 forks into threads 0 to 7, which join back into thread 0)
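The two properties of the master thread listed above can be checked with a small sketch. Both function names are hypothetical, and the serial fallback is an assumption for builds without OpenMP.

```c
#ifdef _OPENMP
#include <omp.h>
#else
/* Serial fallback (assumption): the only thread is thread 0. */
static int omp_get_thread_num(void) { return 0; }
#endif

/* In a sequential region only the master exists, so the ID is 0. */
int id_in_sequential_region(void)
{
    return omp_get_thread_num();
}

/* Inside a parallel region exactly one thread has ID 0.
   Returns how many threads took the "master" branch. */
int count_master_visits(void)
{
    int master_visits = 0;
    #pragma omp parallel
    {
        if (omp_get_thread_num() == 0) {
            master_visits++;   /* only thread 0 writes here, so no race */
        }
    }
    return master_visits;
}
```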
Programming Model
Parallel Code Sections
• Loop annotation directives can be used to parallelize
loops
• Explicit parallel regions are declared with the
parallel directive
– Start of region corresponds to N pthread_create() calls (Fork
Event)
– End of region corresponds to N pthread_join() calls (Join Event)
Programming Model
Synchronization Constructs
• Synchronization provided by OpenMP specification
– Can declare Critical Sections in code
» Mutual exclusion guaranteed at runtime
– Can declare “simple” statements as atomic
– Barrier directives
– Lock functions
Programming Model
Directives
• Directives (preprocessor) used to express parallelism
and independence to OpenMP implementation
• Some synchronization directives
– Atomic, Critical Section, Barrier
Programming Models
Library Calls
• Library calls provide functionality that cannot be
built-in at compile-time
• Mutator/Accessor functions
– omp_[get,set]_num_threads()
• Lock/Unlock functionality
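A small sketch of the mutator/accessor pair: request a team size, then ask from inside a parallel region how many threads were actually provided. team_size is a hypothetical name, and the fallbacks are an assumption for builds without OpenMP. Note that omp_get_num_threads() returns 1 when called outside a parallel region, which is why the query sits inside one.

```c
#ifdef _OPENMP
#include <omp.h>
#else
/* Serial fallbacks (assumption): requests are ignored, team size is 1. */
static void omp_set_num_threads(int n) { (void)n; }
static int  omp_get_num_threads(void)  { return 1; }
#endif

/* Ask for `requested` threads and report the size of the team
   that the runtime actually created. */
int team_size(int requested)
{
    int actual = 0;
    omp_set_num_threads(requested);      /* a request, not a guarantee */
    #pragma omp parallel
    {
        #pragma omp single
        actual = omp_get_num_threads();  /* one thread records the team size */
    }
    return actual;
}
```

The runtime may provide fewer threads than requested, so code should not assume the two numbers match.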
Programming Model
Environment Variables
• Provide default behavior
• Allow other processes to change behavior of
OpenMP-enabled programs
• Useful for scripting:
– setenv OMP_NUM_THREADS 4
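To see the effect of the setenv line above from inside a program, a sketch like the following can help. report_default_threads is a hypothetical name and the fallback is an assumption for builds without OpenMP; the exact team size chosen when the variable is unset is implementation-dependent.

```c
#include <stdio.h>
#include <stdlib.h>

#ifdef _OPENMP
#include <omp.h>
#else
/* Serial fallback (assumption): at most one thread is available. */
static int omp_get_max_threads(void) { return 1; }
#endif

/* Print the raw environment setting next to the team size the runtime
   would use for the next parallel region, and return that size. */
int report_default_threads(void)
{
    const char *env = getenv("OMP_NUM_THREADS");
    int max_threads = omp_get_max_threads();
    printf("OMP_NUM_THREADS=%s, default team size=%d\n",
           env ? env : "(unset)", max_threads);
    return max_threads;
}
```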
Limitations
• OpenMP isn’t guaranteed to divide work optimally
among threads
• Highly sensitive to programming style
• Overheads higher than traditional threading
• Requires compiler support (use cc)
• Doesn’t parallelize dependencies
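The last limitation is easiest to see side by side. In this sketch (both function names are hypothetical), the first loop has independent iterations and can be annotated; the second has a loop-carried dependence, so it is left serial. Adding the same directive to it would compile but could produce wrong results.

```c
/* Independent iterations: c[i] depends only on a[i] and b[i]. */
void add_arrays(const int *a, const int *b, int *c, int n)
{
    int i;
    #pragma omp parallel for
    for (i = 0; i < n; i++) {
        c[i] = a[i] + b[i];
    }
}

/* Loop-carried dependence: iteration i needs the result of
   iteration i-1, so this loop is deliberately NOT annotated. */
void prefix_sum(int *a, int n)
{
    int i;
    for (i = 1; i < n; i++) {
        a[i] += a[i - 1];
    }
}
```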
OpenMP Syntax
• General syntax for OpenMP directives
#pragma omp directive [clause…] CR
• Directive specifies type of OpenMP operation
– Parallelization
– Synchronization
– Etc.
• Clauses (optional) modify semantics of Directive
OpenMP Syntax
• PARALLEL syntax
#pragma omp parallel [clause…] CR
structured_block
Ex:
#pragma omp parallel
{
    printf("Hello!\n");
} // implicit barrier
Output (N=4):
Hello!
Hello!
Hello!
Hello!
OpenMP Syntax
• DO/for Syntax (DO-Fortran, for-C)
#pragma omp for [clause…] CR
for_loop
Ex:
#pragma omp parallel shared(x)   // shared() belongs on parallel; omp for does not accept it
{
    #pragma omp for private(i) \
            schedule(static,x/N)   // N = number of threads; chunk size must be >= 1
    for(i=0;i<x;i++) printf("Hello!\n");
} // implicit barrier
Note: Must reside inside a parallel region
OpenMP Syntax
More on Clauses
• private() – A variable in private list is private to
each thread
• shared() – Variables in shared list are visible to all
threads
– Implies no synchronization, or even consistency!
• schedule() – Determines how iterations will be
divided among threads
– schedule(static, C) – Iterations are split into chunks of C, handed out to threads in round-robin order
» Usually N*C = Number of total iterations (N threads, one chunk each)
– schedule(dynamic) – Each thread will be given additional
iterations as-needed
» Often less efficient than a well-chosen static allocation
• nowait – Removes implicit barrier from end of block
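The clauses above can be combined in one region. The following is a sketch only: fill_two and the chunk size of 4 are arbitrary illustrative choices. nowait is safe here because the second loop never reads what the first loop writes.

```c
/* Fill a[i] = i*i and b[i] = 2*i using two work-shared loops. */
void fill_two(int *a, int *b, int n)
{
    int i;
    /* a, b, n are visible to all threads; each thread gets its own i. */
    #pragma omp parallel shared(a, b, n) private(i)
    {
        /* Static chunks of 4 iterations; nowait drops the barrier at the
           end of this loop so threads move straight on to the next one. */
        #pragma omp for schedule(static, 4) nowait
        for (i = 0; i < n; i++) {
            a[i] = i * i;
        }

        /* Iterations are handed out on demand; the implicit barrier at
           the end of this loop is kept. */
        #pragma omp for schedule(dynamic)
        for (i = 0; i < n; i++) {
            b[i] = 2 * i;
        }
    }
}
```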
OpenMP Syntax
• PARALLEL FOR (combines parallel and for)
#pragma omp parallel for [clause…] CR
for_loop
Ex:
#pragma omp parallel for shared(x) \
        private(i) \
        schedule(dynamic)
for(i=0;i<x;i++) {
    printf("Hello!\n");
}
Example: AddMatrix
Files:
(Makefile)
addmatrix.c // omp-parallelized
matrixmain.c // non-omp
printmatrix.c // non-omp
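The course's addmatrix.c is not included in these slides. A plausible sketch of an OpenMP-parallelized matrix add follows; the name add_matrix and the fixed 4x4 size are assumptions made to keep the example short.

```c
#define ROWS 4
#define COLS 4

/* c = a + b, element by element. Rows are divided among threads;
   the inner index j must be private so threads do not share it. */
void add_matrix(double a[ROWS][COLS], double b[ROWS][COLS],
                double c[ROWS][COLS])
{
    int i, j;
    #pragma omp parallel for private(j)
    for (i = 0; i < ROWS; i++) {
        for (j = 0; j < COLS; j++) {
            c[i][j] = a[i][j] + b[i][j];
        }
    }
}
```

Forgetting private(j) is a common mistake here: j is declared outside the loop, so it would be shared by default and the threads would overwrite each other's inner loop counter.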
OpenMP Syntax
• ATOMIC syntax
#pragma omp atomic CR
simple_statement
Ex:
#pragma omp parallel shared(x)
{
#pragma omp atomic
x++;
} // implicit barrier
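A slightly larger sketch of the same idea: a shared counter updated from a parallel loop. count_even is a hypothetical name. Without the atomic directive, two threads could read and write count at the same time and lose increments.

```c
/* Count the even values in v[0..n-1]. */
int count_even(const int *v, int n)
{
    int count = 0;
    int i;
    #pragma omp parallel for shared(count)
    for (i = 0; i < n; i++) {
        if (v[i] % 2 == 0) {
            #pragma omp atomic
            count++;           /* a "simple statement": one protected update */
        }
    }
    return count;
}
```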
OpenMP Syntax
• CRITICAL syntax
#pragma omp critical CR
structured_block
Ex:
#pragma omp parallel shared(x)
{
#pragma omp critical
{
// only one thread in here
}
} // implicit barrier
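A sketch of a case where a critical section fits: the update is a compare followed by a conditional write, which is more than atomic can express. index_of_max is a hypothetical name, and the function assumes n >= 1. Serializing every iteration like this is simple rather than fast.

```c
/* Return the index of the largest value in v[0..n-1] (assumes n >= 1).
   If the maximum occurs more than once, which index wins is unspecified. */
int index_of_max(const int *v, int n)
{
    int best = 0;
    int i;
    #pragma omp parallel for shared(best)
    for (i = 1; i < n; i++) {
        #pragma omp critical
        {
            /* only one thread at a time compares and updates */
            if (v[i] > v[best]) {
                best = i;
            }
        }
    }
    return best;
}
```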
OpenMP Syntax
ATOMIC vs. CRITICAL
• Use ATOMIC for “simple statements”
– Usually lower overhead than CRITICAL
• Use CRITICAL for larger blocks of code
– May involve an unseen implicit lock
OpenMP Syntax
• MASTER – only Thread 0 executes a block
#pragma omp master CR
structured_block
• SINGLE – only one thread executes a block
#pragma omp single CR
structured_block
• MASTER implies no synchronization; SINGLE ends with an implicit barrier unless nowait is given
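A short sketch contrasting the two directives; single_vs_master is a hypothetical name. Each block runs exactly once per parallel region regardless of team size. The difference is which thread runs it (any thread for single, thread 0 for master) and whether the other threads wait afterwards.

```c
/* Run one single block and one master block inside a parallel region
   and report how many times each was executed. */
void single_vs_master(int *single_runs, int *master_runs)
{
    *single_runs = 0;
    *master_runs = 0;
    #pragma omp parallel
    {
        #pragma omp single
        {
            (*single_runs)++;   /* executed by whichever thread arrives first */
        }
        /* other threads wait at the end of single (no nowait given) */

        #pragma omp master
        {
            (*master_runs)++;   /* executed by thread 0 only */
        }
        /* no barrier here: other threads skip past the master block */
    }
}
```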
OpenMP Syntax
• BARRIER
#pragma omp barrier CR
• Locks
– Locks are provided through omp.h library calls
– omp_init_lock()
– omp_destroy_lock()
– omp_test_lock()
– omp_set_lock()
– omp_unset_lock()
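A sketch of the usual lock life cycle (init, set, unset, destroy) protecting a shared total. sum_with_lock is a hypothetical name; the stub definitions in the #else branch are an assumption so the sketch also builds without OpenMP and are not part of any real library.

```c
#ifdef _OPENMP
#include <omp.h>
#else
/* Serial stand-ins (assumption): with one thread a lock is a no-op flag. */
typedef int omp_lock_t;
static void omp_init_lock(omp_lock_t *l)    { *l = 0; }
static void omp_set_lock(omp_lock_t *l)     { *l = 1; }
static void omp_unset_lock(omp_lock_t *l)   { *l = 0; }
static void omp_destroy_lock(omp_lock_t *l) { (void)l; }
#endif

/* Add up amounts[0..n-1], guarding the shared total with a lock. */
long sum_with_lock(const long *amounts, int n)
{
    omp_lock_t lock;
    long total = 0;
    int i;

    omp_init_lock(&lock);              /* must be initialized before first use */
    #pragma omp parallel for shared(total, lock)
    for (i = 0; i < n; i++) {
        omp_set_lock(&lock);           /* blocks until the lock is free */
        total += amounts[i];
        omp_unset_lock(&lock);
    }
    omp_destroy_lock(&lock);           /* release the lock's resources */
    return total;
}
```

For a single addition like this, atomic would be the lighter tool; locks become useful when the protected region spans several statements or several functions.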
OpenMP Syntax
• FLUSH
#pragma omp flush CR
• Guarantees that threads’ views of memory are
consistent
• Why? Remember OpenMP directives…
– Code generated by directives at compile-time cannot respond to
dynamic events
» Variables are not always declared as volatile
» Using variables from registers instead of memory can seem like
a consistency violation
– Synchronization often has an implicit flush
» ATOMIC, CRITICAL
Example: Synchronization
Files:
(Makefile)
increment.c
OpenMP Syntax
• Functions
omp_set_num_threads()
omp_get_num_threads()
omp_get_max_threads()
omp_get_num_procs()
omp_get_thread_num()
omp_set_dynamic()
omp_[init destroy test set unset]_lock()