Integrating Performance Analysis in
the Uintah Software Development Cycle
Allen D. Malony,
Sameer Shende
J. Davison de St. Germain,
Allan Morris, Steven G. Parker
{malony,sameer}@cs.uoregon.edu
{dav,amorris,sparker}@cs.utah.edu
Department of Computer and Information Science
Computational Science Institute
University of Oregon
School of Computing
University of Utah
Outline
Scientific software engineering
C-SAFE and Uintah Computational Framework (UCF)
TAU performance system
Role of performance mapping
Performance analysis integration in UCF
Goals and design
Challenges for performance technology integration
TAU performance mapping
X-PARE
Concluding remarks
May 16, 2002
2
ISHPC 2002
Scientific Software (Performance) Engineering
Modern scientific simulation software is complex
Large development teams of diverse expertise
Simultaneous development on different system parts
Iterative, multi-stage, long-term software development
Need support for managing complex software process
Software engineering tools for revision control,
automated testing, and bug tracking are commonplace
In contrast, tools for performance engineering are not commonplace:
evaluation (measurement, analysis, benchmarking)
optimization (diagnosis, tracking, prediction, tuning)
Incorporate performance engineering methodology and
support it with flexible and robust performance tools
Utah ASCI/ASAP Level 1 Center (C-SAFE)
C-SAFE was established to build a problem-solving
environment (PSE) for the numerical simulation of
accidental fires and explosions
Combine fundamental chemistry and engineering physics
Integrate non-linear solvers, optimization, computational
steering, visualization, and experimental data verification
Support very large-scale coupled simulations
Computer science problems:
Coupling multiple scientific simulation codes with
different numerical and software properties
Software engineering across diverse expert teams
Achieving high performance on large-scale systems
Example C-SAFE Simulation Problems
Heptane fire simulation
Typical C-SAFE simulation with
a billion degrees of freedom and
non-linear time dynamics
Material stress simulation
Uintah Problem Solving Environment (PSE)
Enhanced SCIRun PSE
Pure dataflow → component-based
Shared memory → scalable multi-/mixed-mode parallelism
Interactive only → interactive plus standalone
Design and implement Uintah component architecture
Application programmers provide
description of computation (tasks and variables)
code to perform a task on a single “patch” (sub-region of space)
Components for scheduling, partitioning, load balance, …
Follow Common Component Architecture (CCA) model
Design and implement Uintah Computational Framework
(UCF) on top of the component architecture
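The per-task description an application programmer supplies can be pictured with a small sketch. All types and names below are hypothetical illustrations, not the real UCF interface: a task names the variables it requires and computes, and carries a callback that performs the work on one patch.

```cpp
#include <functional>
#include <string>
#include <vector>

// Hypothetical sketch of a task description (illustrative only):
// what the task is called, which variables it consumes and produces,
// and the kernel that runs on a single patch.
struct Patch { int id; };

struct TaskDescription {
    std::string name;
    std::vector<std::string> inputs;        // variables consumed
    std::vector<std::string> outputs;       // variables produced
    std::function<void(const Patch&)> run;  // per-patch kernel
};

TaskDescription makeInterpolateTask() {
    return TaskDescription{
        "interpolateParticlesToGrid",
        {"p.x", "p.mass"},   // assumed particle variables
        {"g.mass"},          // assumed grid variable
        [](const Patch&) { /* per-patch computation here */ }
    };
}
```

The framework, not the programmer, then decides where and when each task instance runs, using the declared inputs/outputs to derive dependencies.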
Uintah High-Level Component View
Uintah Parallel Component Architecture
[Component diagram: PSE components — C-SAFE Problem Specification, Simulation Controller, Scheduler, Numerical Solvers, MPM, Fluid Model, Mixing Model, Subgrid Model, High Energy Simulations, Material Properties Database, Chemistry Databases, Data Manager, Visualization, Post Processing and Analysis; non-PSE components — Performance Analysis, Parallel Services, Resource Management, Checkpointing; connected through the UCF (High Level Architecture, control / light data paths); Performance Analysis is implicitly connected to all components]
Uintah Computational Framework (UCF)
Execution model based on software (macro) dataflow
Exposes parallelism and hides data transport latency
Computations expressed as directed acyclic graphs of tasks
Each task consumes input and produces output (input to a future task)
Inputs/outputs specified for each patch in a structured grid
Abstraction of global single-assignment memory
DataWarehouse
Directory mapping names to values (array structured)
Write value once then communicate to awaiting tasks
Task graph gets mapped to processing resources
Communications schedule approximates global optimal
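The single-assignment DataWarehouse semantics can be sketched as follows. This is a minimal scalar-valued toy with hypothetical put/get names; the real UCF stores array-structured values per patch and handles the communication to awaiting tasks.

```cpp
#include <map>
#include <stdexcept>
#include <string>

// Minimal sketch of single-assignment memory (illustrative only):
// a name may be written exactly once; a second write is an error,
// and reading a name that was never written is an error.
class DataWarehouse {
public:
    void put(const std::string& name, double value) {
        if (store_.count(name))
            throw std::runtime_error("single assignment violated: " + name);
        store_[name] = value;
    }
    double get(const std::string& name) const {
        auto it = store_.find(name);
        if (it == store_.end())
            throw std::runtime_error("value not yet computed: " + name);
        return it->second;  // in UCF, reads await/trigger communication
    }
private:
    std::map<std::string, double> store_;
};
```

The write-once discipline is what makes the task graph safe to schedule: once a value exists it never changes, so any consumer can read it at any later time.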
Uintah Task Graph (Material Point Method)
Diagram of named tasks
(ovals) and data (edges)
Imminent computation
Dataflow-constrained
MPM
Newtonian material point
motion time step
Solid: values defined at
material point (particle)
Dashed: values defined at
vertex (grid)
Prime (’): values updated
during time step
Example Taskgraphs (MPM and Coupled)
Uintah PSE
UCF automatically sets up:
Domain decomposition
Inter-processor communication with aggregation/reduction
Parallel I/O
Checkpoint and restart
Performance measurement and analysis (stay tuned)
Software engineering
Coding standards
CVS (Commits: Y3 - 26.6 files/day, Y4 - 29.9 files/day)
Correctness regression testing with bugzilla bug tracking
Nightly build (parallel compiles)
170,000 lines of code (Fortran and C++ tasks supported)
Performance Technology Integration
Uintah presents challenges to performance integration
Software diversity and structure
UCF middleware, simulation code modules
component-based hierarchy
Portability objectives
cross-language and cross-platform
multi-parallelism: thread, message passing, mixed
Scalability objectives
High-level programming and execution abstractions
Requires flexible and robust performance technology
Requires support for performance mapping
TAU Performance System Framework
Tuning and Analysis Utilities
Performance system framework for scalable parallel and
distributed high-performance computing
Targets a general complex system computation model
nodes / contexts / threads
Multi-level: system / software / parallelism
Measurement and analysis abstraction
Integrated toolkit for performance instrumentation,
measurement, analysis, and visualization
Portable performance profiling/tracing facility
Open software approach
TAU Performance System Architecture
[Architecture diagram; Paraver and EPILOG appear as supported trace formats]
Performance Analysis Objectives for Uintah
Micro tuning
Optimization of simulation code (task) kernels for
maximum serial performance
Scalability tuning
Identification of parallel execution bottlenecks
overheads: scheduler, data warehouse, communication
load imbalance
Adjustment of task graph decomposition and scheduling
Performance tracking
Understand performance impacts of code modifications
Throughout course of software development
C-SAFE application and UCF software
Task Execution in Uintah Parallel Scheduler
Profile methods
and functions in
scheduler and in
MPI library
Task execution time
dominates (what task?)
Task execution time
distribution per process
MPI communication
overheads (where?)
Need to map
performance data!
Semantics-Based Performance Mapping
Associate
performance
measurements
with high-level
semantic
abstractions
Need mapping
support in the
performance
measurement
system to assign
data correctly
Hypothetical Mapping Example
Particles distributed on surfaces of a cube
Particle* P[MAX]; /* array of particles */
void GenerateParticles() {
    /* distribute particles over all faces of the cube */
    for (int face = 0, last = 0; face < 6; face++) {
        /* number of particles on this face */
        int particles_on_this_face = num(face);
        for (int i = last; i < last + particles_on_this_face; i++) {
            /* particle properties are a function of face */
            P[i] = ... f(face);
            ...
        }
        last += particles_on_this_face;
    }
}
Hypothetical Mapping Example (continued)
void ProcessParticle(Particle *p) {
    /* perform some computation on p */
}
int main() {
    GenerateParticles();        /* create the list of particles */
    for (int i = 0; i < N; i++) /* iterate over the list */
        ProcessParticle(P[i]);
}
How much time (flops) spent processing face i particles?
What is the distribution of performance among faces?
How is this determined if execution is parallel?
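One way to answer these questions is sketched below. This is hypothetical bookkeeping, not TAU's interface: if each particle remembers which face generated it, per-face cost can be accumulated even though ProcessParticle sees only a flat list.

```cpp
#include <array>

// Sketch of what mapping must accomplish (illustrative only):
// carry the semantic attribute (the cube face) with each particle,
// then attribute measured cost back to that face.
struct FaceParticle { int face; /* other properties elided */ };

std::array<double, 6> face_time{};  // accumulated cost per cube face

void ProcessParticleMapped(const FaceParticle& p, double cost) {
    // 'cost' stands in for a measured time or flop count per particle
    face_time[p.face] += cost;
}
```

In a parallel run, each process would keep its own per-face accumulators and the tool would merge them, which is exactly the kind of support a measurement system must provide.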
No Performance Mapping versus Mapping
Typical performance
tools report performance
with respect to routines
Does not provide support
for mapping
TAU (w/ mapping)
TAU (no mapping)
TAU’s performance
mapping can observe
performance with respect
to scientist’s
programming and
problem abstractions
Uintah Task Performance Mapping
Uintah partitions individual particles across processing
elements (processes or threads)
Simulation tasks in task graph work on particles
Tasks have domain-specific character in the computation
e.g., “interpolate particles to grid” in the Material Point Method
Task instances generated for each partitioned particle set
Execution scheduled with respect to task dependencies
How to attribute execution time among different tasks?
Assign semantic name (task type) to a task instance
SerialMPM::interpolateParticleToGrid
Map TAU timer object to (abstract) task (semantic entity)
Look up timer object using task type (semantic attribute)
Further partition along different domain-specific axes
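The timer-lookup step above can be sketched roughly as follows. Names and types here are illustrative, not TAU's actual API: one timer object per task type, found by the task's semantic name, so every instance of a task accumulates into the same timer regardless of which patch or process ran it.

```cpp
#include <map>
#include <string>

// Illustrative sketch of semantic performance mapping:
// timer objects are keyed by task type (the semantic attribute),
// not by the routine that happened to execute the task instance.
struct Timer { double total = 0.0; long calls = 0; };

std::map<std::string, Timer> timers;  // semantic name -> timer

Timer& timerFor(const std::string& taskType) {
    return timers[taskType];          // created on first lookup
}

void recordTask(const std::string& taskType, double elapsed) {
    Timer& t = timerFor(taskType);
    t.total += elapsed;
    t.calls += 1;
}
```

The key design point is the lookup by semantic attribute: the scheduler executes anonymous task instances, but the measurement is attributed to the abstraction the scientist cares about.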
Task Performance Mapping (Profile)
Mapped task
performance
across processes
Performance
mapping for
different tasks
Task Performance Mapping (Trace)
Work packet
computation
events colored
by task type
Distinct phases of
computation can be
identified based on task
Task Performance Mapping (Trace - Zoom)
Startup
communication
imbalance
Task Performance Mapping (Trace - Parallelism)
Communication
/ load imbalance
Comparing Uintah Traces for Scalability Analysis
8 processes
32 processes
Performance Tracking and Reporting
Integrated performance measurement allows
performance analysis throughout development lifetime
Applied performance engineering in software design and
development (software engineering) process
Create “performance portfolio” from regular performance
experimentation (couple with software testing)
Use performance knowledge in making key software
design decisions, prior to major development stages
Use performance benchmarking and regression testing to
identify irregularities
Support automatic reporting of “performance bugs”
Enable cross-platform (cross-generation) evaluation
XPARE - eXPeriment Alerting and REporting
Experiment launcher automates measurement / analysis
Reporting system conducts performance regression tests
Configuration and compilation of performance tools
Instrumentation control for Uintah experiment type
Execution of multiple performance experiments
Performance data collection, analysis, and storage
Integrated in Uintah software testing harness
Apply performance difference thresholds (alert ruleset)
Alerts users via email if thresholds have been exceeded
Web alerting setup and full performance data reporting
Historical performance data analysis
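The alert-ruleset idea can be illustrated with a minimal sketch. The assumption here is a simple relative-change rule; XPARE's actual rules and thresholds are configurable and not reproduced here.

```cpp
// Illustrative regression check: flag a metric that has degraded by
// more than a relative tolerance against the stored historical baseline.
bool exceedsThreshold(double baseline, double current, double relTolerance) {
    if (baseline <= 0.0) return false;        // no baseline recorded yet
    double change = (current - baseline) / baseline;
    return change > relTolerance;             // e.g. >10% slower -> alert
}
```

A run of such checks, one per instrumented metric, is what would feed the email alerting step after each automated experiment.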
XPARE System Architecture
[Architecture diagram: Experiment Launch, Performance Database, Regression Analyzer, Comparison Tool, Performance Reporter, Alerting Setup, Mail server]
Scaling Performance Optimizations (Past)
Last year: initial “correct” scheduler
Reduced communication by 10x
Reduced task graph overhead by 20x
ASCI Nirvana, SGI Origin 2000
Los Alamos National Laboratory
Scalability to 2000 Processors (Current)
ASCI Nirvana, SGI Origin 2000
Los Alamos National Laboratory
Concluding Remarks
Modern scientific simulation environments involve a
complex (scientific) software engineering process
Complex parallel software and systems pose challenging
performance analysis problems that require flexible and
robust performance technology and methods
Iterative, diverse expertise, multiple teams, concurrent
Cross-platform, cross-language, large-scale
Fully-integrated performance analysis system
Performance mapping
Need to support performance engineering methodology
within scientific software design and development
Performance comparison and tracking
Acknowledgements
Department of Energy (DOE), ASCI Academic
Strategic Alliances Program (ASAP)
Center for the Simulation of Accidental Fires and
Explosions (C-SAFE), ASCI/ASAP Level 1 center,
University of Utah
http://www.csafe.utah.edu
Computational Science Institute, ASCI/ASAP
Level 3 projects with LLNL / LANL,
University of Oregon
http://www.csi.uoregon.edu
ftp://ftp.cs.uoregon.edu/pub/malony/Talks/ishpc2002.ppt