Transcript paper.ppt
Parallel Apps
November 6, 2000
Hyang-Ah Kim
Brenda Liu
SoYoung Park
Outline
11/6/00
Introduction
Barnes background
Barnes optimizations
Ocean background
Ocean optimizations
Conclusion
Parallel Applications
Introduction
Minimum problem size
Scale application performance
Programming models
SAS
CC-NUMA
Parallel efficiency?
(speedup over uniprocessor) / p
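The efficiency definition above can be written out as a small calculation. This is a minimal sketch; the function name and the timings are hypothetical and are not measurements from the talk.

```python
def parallel_efficiency(t_uniprocessor, t_parallel, p):
    """Speedup over the uniprocessor run, divided by the processor count p."""
    if p <= 0 or t_parallel <= 0:
        raise ValueError("p and t_parallel must be positive")
    speedup = t_uniprocessor / t_parallel
    return speedup / p

# Hypothetical timings: 100 s on one processor, 1.25 s on 128 processors.
print(parallel_efficiency(100.0, 1.25, 128))  # prints 0.625
```

An efficiency of 1.0 would mean perfect linear speedup; values below that show how much of the machine is lost to overheads.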
Barnes Background
N-body galaxy simulation
Star on which forces are being computed
Small group far enough away to approximate to center of mass
Star too close to approximate
Large group far enough away to approximate
Communication pattern?
Irregular
Hierarchical
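The "far enough away to approximate" test on this slide can be sketched as a size-over-distance check. The threshold theta and the function name are assumptions for illustration; the slide does not state the criterion that was used.

```python
import math

def can_approximate(group_size, group_center_of_mass, star_pos, theta=0.5):
    """True if the group may be replaced by a point mass at its center of mass.

    theta is an assumed accuracy threshold, not a value from the talk.
    """
    d = math.dist(group_center_of_mass, star_pos)
    if d == 0.0:
        return False
    return group_size / d < theta

star = (0.0, 0.0)
print(can_approximate(1.0, (10.0, 0.0), star))  # small group, far away
print(can_approximate(4.0, (2.0, 0.0), star))   # too close to approximate
print(can_approximate(8.0, (20.0, 0.0), star))  # large group, far away
```

Because the test depends on where each star sits relative to each group, the set of groups a processor must read differs from star to star, which is one way to see why the communication pattern is irregular.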
Barnes Problem Size
Optimizations visited:
Data placement
Dynamic partitioning
Prefetching
Work needed to scale is algorithmic
Scaling Performance
Performance change from 32 to
128 processors?
Degradation:
Communication-to-computation ratio, communication pattern, load balance, locality, synchronization
How can they be overcome?
Increase problem size
Application restructuring
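To illustrate why a larger problem size helps, the sketch below estimates the communication-to-computation ratio for a nearest-neighbor grid computation. It assumes an n x n grid split into square blocks, one per processor, with communication proportional to block perimeter and computation proportional to block area. This model and the numbers are assumptions, not results from the talk.

```python
import math

def comm_to_comp_ratio(n, p):
    """Perimeter-to-area ratio of one square block of an n x n grid on p processors."""
    side = n / math.sqrt(p)        # block edge length (assumes p is a perfect square)
    communication = 4 * side       # boundary points exchanged with neighbors
    computation = side * side      # points updated locally
    return communication / computation

print(comm_to_comp_ratio(1024, 64))    # baseline
print(comm_to_comp_ratio(1024, 256))   # same grid, 4x the processors: ratio doubles
print(comm_to_comp_ratio(2048, 256))   # doubling n brings the ratio back down
```

Under this model the ratio grows as sqrt(p) / n, so keeping it constant while adding processors requires growing the grid.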
General Findings
Scaling to 128 processors without any change
Scaling Barnes
Memory bottleneck: building shared tree (31% on 128 processors vs. 2% on a uniprocessor)
Original algorithm: globally shared tree
Scaling Barnes
New algorithm: MergeTree
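The slide gives only the name of the new algorithm. The sketch below assumes that MergeTree means each processor first builds a private tree from its own bodies and the private trees are then merged into the shared tree, so the shared structure is touched once per cell rather than once per body. The fixed-depth, dictionary-based tree and all names here are simplifications for illustration, not the authors' implementation.

```python
def build_local_tree(bodies, depth=3):
    """Build a private tree mapping each cell path to [mass, mass*x, mass*y].

    bodies is a list of (x, y, mass) with positions assumed in [0, 1) x [0, 1).
    """
    tree = {}
    for x, y, m in bodies:
        x0, y0, size, path = 0.0, 0.0, 1.0, ()
        for _ in range(depth + 1):
            acc = tree.setdefault(path, [0.0, 0.0, 0.0])
            acc[0] += m
            acc[1] += m * x
            acc[2] += m * y
            size /= 2                      # descend into the quadrant holding the body
            qx, qy = int(x >= x0 + size), int(y >= y0 + size)
            x0, y0 = x0 + qx * size, y0 + qy * size
            path += (2 * qy + qx,)
    return tree

def merge_trees(global_tree, local_tree):
    """Merge one private tree into the shared tree, one update per cell."""
    for path, (m, mx, my) in local_tree.items():
        acc = global_tree.setdefault(path, [0.0, 0.0, 0.0])
        acc[0] += m
        acc[1] += mx
        acc[2] += my
    return global_tree

# Two hypothetical "processors", each with its own bodies.
bodies_a = [(0.25, 0.25, 1.0), (0.75, 0.25, 2.0)]
bodies_b = [(0.25, 0.75, 1.0), (0.625, 0.875, 4.0)]
shared = {}
for local in (build_local_tree(bodies_a), build_local_tree(bodies_b)):
    merge_trees(shared, local)
total_mass, mx, my = shared[()]
print(total_mass, (mx / total_mass, my / total_mass))  # root mass and center of mass
```

In a real shared-memory version the merge step would still need synchronization on the shared cells; the point of the sketch is only that the per-body insertions happen in private memory.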
Ocean Background
Ocean simulation using multigrid solver
Communication pattern?
Nearest neighbor iterative
Hierarchical
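The nearest-neighbor iterative pattern can be illustrated with a single relaxation sweep over a grid. This sketch uses a plain Jacobi update as a stand-in; it is not the multigrid solver the slide refers to, and the grid values are hypothetical.

```python
def jacobi_sweep(grid):
    """One nearest-neighbor relaxation sweep; boundary values stay fixed."""
    n = len(grid)
    new = [row[:] for row in grid]
    for i in range(1, n - 1):
        for j in range(1, n - 1):
            # Each interior point reads only its four nearest neighbors.
            new[i][j] = 0.25 * (grid[i - 1][j] + grid[i + 1][j]
                                + grid[i][j - 1] + grid[i][j + 1])
    return new

grid = [[4.0, 4.0, 4.0],
        [4.0, 0.0, 4.0],
        [4.0, 4.0, 4.0]]
print(jacobi_sweep(grid)[1][1])  # prints 4.0
```

When the grid is partitioned among processors, only points on a partition boundary need values owned by another processor, which is what makes the pattern nearest neighbor. A multigrid solver repeats such sweeps on coarser and finer grids, which is plausibly what the slide means by hierarchical.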
Ocean Problem Size
Optimizations visited:
Processor-centric array data structures
Data placement
Prefetching
Work needed to scale is difficult
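One reading of "processor-centric array data structures" is that each processor's block of the grid is stored as its own contiguous sub-array instead of being scattered through one global 2D array. The sketch below shows that reorganization under that assumption; the function name and the block layout are hypothetical.

```python
def to_processor_centric(grid, pr, pc):
    """Split an n x n grid into pr x pc blocks indexed as blocks[bi][bj][i][j].

    Assumes pr and pc divide n evenly.
    """
    n = len(grid)
    rows, cols = n // pr, n // pc
    return [[[grid[bi * rows + i][bj * cols:(bj + 1) * cols]
              for i in range(rows)]
             for bj in range(pc)]
            for bi in range(pr)]

# A 4 x 4 grid whose entries encode their own position (value = 4*row + col).
grid = [[4 * r + c for c in range(4)] for r in range(4)]
blocks = to_processor_centric(grid, 2, 2)
print(blocks[1][0])  # prints [[8, 9], [12, 13]]
```

With this layout the data a processor works on can be placed in its local memory as a unit, which ties in with the data placement item on the same slide.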
Programming Models
Options
Shared Address Space
Message Passing
SHMEM
Motivation
If application is regular / predictable?
If we can use similar algorithms and partitions across the models?
Ocean Discussions
Conclusion
Some guidelines
Load balancing for moderate systems, communication for large systems
Data partition & placement
Very application dependent
Optimization
Programming model