Tools and Utilities for parallel and serial codes in ENEA


C.R. ENEA-Portici. 11/12/2007
Tools and Utilities for parallel and serial codes in ENEA-GRID environment
CRESCO Project: Salvatore Raia
SubProject I.2
OUTLINE:
- GRID, cluster and parallel computing (intro)
- ENEA-GRID: architecture and functionality
- My activity for the CRESCO project and results on ENEA-GRID
- Conclusion and objectives
What is a cluster ?
Supercomputer = computer with many processors connected via a high-speed computer bus and that share the memory (SMP).
- It runs one operating system
Cluster = collection of resources (HW, SW) connected via a public or private network.
- Each CPU runs a separate instance of the operating system
- Administration: local
[Figure: a supercomputer and "cluster 1"]
How to get a Grid ?
GRID = nodes made of clusters
- Collection of interconnected and geographically distributed clusters that share processes.
- Each node may have shared or distributed memory architectures (hybrid).
- Administration: sometimes the clusters belong to different departments or companies.
ENEA-GRID has the same structure, with 6 clusters: Bologna, Casaccia, Frascati, Portici, Trisaia, Brindisi.
[Figure: "GRID 1", made of cluster 1, cluster 2, cluster 3, ... cluster N]
ENEA-GRID structure (HW)
GRID features
Pro:
- Shared resources
- Low costs (clock ?)
- Open systems
- Scalability
Frequency scaling (domain ?)
Power consumption: P = C × V × V × F
How is it managed on ENEA-GRID ?
Con:
- Several platforms
- Load balancing
- User access
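As a rough illustration of the power relation on this slide, the minimal Python sketch below evaluates P = C × V × V × F, reading C as capacitance, V as supply voltage and F as clock frequency. The function name and all numeric values are assumptions chosen for the example, not measurements from ENEA-GRID.

```python
def dynamic_power(capacitance, voltage, frequency):
    """Dynamic power P = C * V * V * F (farads, volts, hertz -> watts)."""
    return capacitance * voltage * voltage * frequency


# Hypothetical values, chosen only to show how P scales.
base = dynamic_power(1e-9, 1.2, 2.0e9)
half_clock = dynamic_power(1e-9, 1.2, 1.0e9)
half_voltage = dynamic_power(1e-9, 0.6, 2.0e9)

print(base, half_clock, half_voltage)
```

With these inputs, halving the clock halves the power, while halving the voltage cuts it to a quarter, because the voltage enters the formula squared.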
ENEA-GRID structure (SW)
ICA client
Resources management
File System
Operating Systems
User Interface
USER ACCESS
- ICA client
- ssh or telnet
- web
Switch host
Run Appl.
Jobs status
My activity on ENEA-GRID (CRESCO pr.)
Serial and Parallel (MPI) codes
How to cope with these problems?
Problems with:
- Multiple platforms
- Load balancing
- User access
LSF utilities
Software dev.
User interfaces
Tools for Serial and Parallel (MPI) codes
Multi Platform
- Serial codes
- Parallel codes (MPI)
Compilers:
- GNU
- PGI
- IBM
MPI implementations:
- MPICH
- LAM-MPI
- POE
…So we need a lot of binaries, one for each platform.
After compiling our source code on each platform, we have “binary1”…“binaryN” for host1,…hostN.
- Launcher: it is a shell script (placed on AFS) that selects the right “binary” for the selected host.
Problems with execution too … tools
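The selection step performed by the launcher can be sketched as follows. The original launcher is a shell script; Python is used here only to illustrate the lookup logic. The table `BINARIES`, its keys, the paths and both function names are hypothetical, and the strings a real host reports for its system and architecture may differ from the ones shown.

```python
import platform

# Hypothetical table: (operating system, CPU architecture) -> binary path.
# Keys and paths are placeholders, not real ENEA-GRID values.
BINARIES = {
    ("Linux", "x86_64"): "/afs/example/bin/linux_x86_64/myprog",
    ("AIX", "powerpc"): "/afs/example/bin/aix_powerpc/myprog",
}


def select_binary(system, machine, table=BINARIES):
    """Return the binary registered for a platform, or raise LookupError."""
    try:
        return table[(system, machine)]
    except KeyError:
        raise LookupError(f"no binary built for {system}/{machine}") from None


def binary_for_this_host(table=BINARIES):
    """Look up the binary for the host the launcher is running on."""
    return select_binary(platform.system(), platform.machine(), table)
```

Keeping the table in one file on a shared file system means every host reads the same mapping, and an unsupported platform fails with a clear message instead of running the wrong binary.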
Some MPI problems
Results: tools for serial and parallel (MPI) codes
SERIAL
- Program for compiling Fortran 77/90, C and C++ serial codes (see the Java interface)
- Launcher for the “NS2” application (uses external libraries)
PARALLEL (MPI)
- Launcher for running a test program (check command)
- Launcher for the HPL test on AIX and Linux
user1 installation
user2 installation
Analyzing LSF utilities: serial and parallel codes
- Serial codes
- Parallel codes (MPI)
LSF resources
- Resources definition
- “NS2” application
Serial LSF utilities
- Job array (Multicase)
- “lsgrun”
No correlation
Parallel LSF utilities
- “mpijob” (MPICH)
- “poejob” (POE)
Correlation
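For the serial case, a job array submission can be sketched as below. The snippet only builds the `bsub` argument list and does not submit anything. The array name, the case count, the script `./run_case.sh` and the function name are hypothetical; the `-J "name[1-N]"` array syntax and the `%I` index placeholder in the output file name are standard LSF features.

```python
import shlex


def job_array_command(name, n_cases, program):
    """Build a `bsub` argument list for an LSF job array of n_cases elements."""
    if n_cases < 1:
        raise ValueError("n_cases must be at least 1")
    return [
        "bsub",
        "-J", f"{name}[1-{n_cases}]",  # array name and index range
        "-o", f"{name}.%I.out",        # %I expands to the array index
        program,
    ]


# Hypothetical example: ten independent cases of the same serial program.
cmd = job_array_command("multicase", 10, "./run_case.sh")
print(shlex.join(cmd))
```

Each element of the array runs independently, and LSF exposes its index to the program through the `LSB_JOBINDEX` environment variable, so one script can pick a different input per case.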
Results: integration with other applications
(My) Java interface
- Serial codes
- Parallel codes (MPI)
Conclusion and objectives
- Launcher + LSF utilities + user interface make it possible to create a homogeneous environment
- Objectives:
  - Optimization of the programs that launch serial and parallel codes, including checking the resources needed to run the application (e.g. libraries, other programs, etc.)
  - Exploitation of LSF utilities to make it easy to run MPI programs (mpijob, poejob, etc.) and to balance the load
  - Improved error handling for user interfaces …
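The resource check named in the first objective could look roughly like the sketch below, which reports the programs and libraries a launcher cannot find before starting the application. It is written in Python for illustration only; the function name and the `program:`/`library:` labels are assumptions, not part of the existing tools.

```python
import shutil
from ctypes.util import find_library


def missing_resources(programs=(), libraries=()):
    """Return the programs and libraries that cannot be found on this host."""
    # shutil.which searches PATH for an executable with the given name.
    missing = [f"program:{p}" for p in programs if shutil.which(p) is None]
    # find_library looks a shared library up by its short name (e.g. "m").
    missing += [f"library:{lib}" for lib in libraries if find_library(lib) is None]
    return missing
```

A launcher could call this first and stop with a readable message when the returned list is not empty, which also serves the error-handling objective.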
Andrew File System
LSF - Load Sharing Facility