Getting started on the Cray XE6 Beagle Beagle Team (beagle-support@ci.uchicago.edu) Computation Institute University of Chicago & Argonne National Laboratory www.ci.anl.gov www.ci.uchicago.edu.

Getting started on the Cray XE6
Beagle
Beagle Team (beagle-support@ci.uchicago.edu)
Computation Institute
University of Chicago & Argonne National Laboratory
www.ci.anl.gov
www.ci.uchicago.edu
Outline
• What is the Computation Institute?
• Beagle hardware
• Basics about the work environment
• Data transfer using Globus Online
• Use of the compilers (C, C++, and Fortran)
• Launch of a parallel application
• Job monitoring
• Introduction to debugger and profiler
• Introduction to Parallel scripting with Swift
Intro to Beagle – beagle-support@ci.uchicago.edu
www.ci.anl.gov
www.ci.uchicago.edu
Computation Institute
Director: Ian Foster
http://www.ci.uchicago.edu/
Contact: [email protected]
Computation Institute
• Joint Argonne/Chicago institute, with ~100 Fellows (~50 UChicago faculty) and ~60 staff
• Primary goals:
  – Pursue new discoveries using multi-disciplinary collaborations and computational methods
  – Develop new computational methods and paradigms required to tackle these problems, and create the computational tools required for the effective application of advanced methods at the largest scales
  – Educate the next generation of investigators in the advanced methods and platforms required for discovery
How the CI supports people who use Beagle
Beagle Services
• Startup assistance
• User administration assistance
• Job management services
• Technical support
• User campaign management
• Assistance with planning, reporting
• Collaboration within science domains
• Beagle point of coordination
• Workshops & seminars
• Customized training programs
• On-line content & user guides
  – Beagle’s wiki*
  – Beagle’s web page**
• Performance engineering
• Application tuning
• Data analytics
• I/O tuning
* http://www.ci.uchicago.edu/wiki/bin/view/Beagle/WebHome
** http://beagle.ci.uchicago.edu/
Beagle: hardware overview
Beagle “under the hood”
Beagle Cray XE6 system overview
• Login nodes
  – # 2
  – Accessible
  – Where jobs are submitted
• Sandbox node
  – # 1
  – Accessible
  – Compilation, script design…
• Compute nodes
  – # 736
  – Not directly accessible
  – Where computations are performed
• Service nodes:
  – Network access
  – Scheduler
  – I/O
  – …
• To know more:
http://www.ci.uchicago.edu/wiki/bin/view/Beagle/SystemSpecs#Overview
Compute nodes
• 2 AMD Opteron 6100 “Magny-Cours”
• 12-core (24 per node)
• 2.1-GHz
• 32 GB RAM (8 GB per processor)
• No disk on node (mounts DVS and Lustre network filesystems)
Details about the Processors (sockets)
• Superscalar:
  – 3 Integer ALUs
  – 3 Floating point ALUs (can do 4 FP per cycle)
• Cache hierarchy:
  – Victim cache
  – 64KB L1 instruction cache
  – 64KB L1 data cache (latency 3 cycles)
  – 512KB L2 cache per processor core (latency of 9 cycles)
  – 12MB shared L3 cache (latency 45 cycles)
• To know more:
http://www.ci.uchicago.edu/wiki/bin/view/Beagle/SystemSpecs
Interconnect
• Communication between compute nodes and with service nodes
• Gemini Interconnect
• 2 nodes per Gemini ASIC
• 4 x 12-cores (48 per Gemini)
• Gemini are arranged in a 3D torus
• Latency ~ 1 μs
• 168 GB/s of switching capacity (20 GB/s injection bandwidth per node)
• Resilient design
• To know more:
http://www.ci.uchicago.edu/wiki/bin/view/Beagle/SystemSpecs#Details_about_the_Interconnect
Steps for computing on Beagle
• You need a user id on Beagle
• You need an active project
• You need to understand the basics of how the system works (check files, move files, create directories)
• You need to move your data to Beagle
• The application(s) that perform the calculations need to be installed on Beagle
• You need to submit your jobs to the compute nodes and monitor them
• You need to transfer your data back to your system
What you need to get started on Beagle
• A CI account: if you don’t have it, get one
  – https://accounts.ci.uchicago.edu/
  – You will need some person at the CI to sponsor you, this person can be:
    o Your PI, if he or she is part of the CI
    o A collaborator that is part of the CI
    o A catalyst you will be working with
• A CI project (for accounting)
  – https://www.ci.uchicago.edu/hpc/projects/
    o For joining an HPC project
    o For creating a new HPC project
  – This will change later this year, to let the allocations committee make decisions
• To know more about CI account and HPC basics
  – http://www.ci.uchicago.edu/faq
• To know more about Beagle accounts and basics
  – http://www.ci.uchicago.edu/wiki/bin/view/Beagle/HowToStart
Basics on using Beagle
• Login
  – ssh to login.beagle.ci.uchicago.edu to submit jobs
  – ssh to sandbox.beagle.ci.uchicago.edu for CPU-intensive development and interactive operations
  – To know more:
http://www.ci.uchicago.edu/wiki/bin/view/Beagle/ComputeOnBeagle#How_to_log_in
• Data transfer
  – For small files scp or sftp
  – GridFTP to gridftp.beagle.ci.uchicago.edu
  – Or use Globus Online (coming later in the talk)
  – To know more:
http://www.ci.uchicago.edu/wiki/bin/view/Beagle/ComputeOnBeagle#How_to_move_data_to_and_from_Bea
• How to receive personalized support
  – beagle-support@ci.uchicago.edu
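For a few small files, the scp route mentioned above might look like the sketch below. The user name, the file names, and the choice of the login host as the scp endpoint are assumptions made for illustration; by default the snippet only prints the commands it would run.

```shell
#!/bin/bash
# Sketch: copy one small input file to Beagle and fetch a result back with scp.
# Assumptions: "myuser", input.dat and result.out are placeholders, and the
# directory /lustre/beagle/myuser already exists on Beagle.
BEAGLE_USER=myuser
BEAGLE_HOST=login.beagle.ci.uchicago.edu
REMOTE_DIR=/lustre/beagle/${BEAGLE_USER}

# Print each command instead of running it, unless DRY_RUN=0 is set.
run() {
    if [ "${DRY_RUN:-1}" = "0" ]; then "$@"; else echo "$*"; fi
}

run scp input.dat "${BEAGLE_USER}@${BEAGLE_HOST}:${REMOTE_DIR}/"
run scp "${BEAGLE_USER}@${BEAGLE_HOST}:${REMOTE_DIR}/result.out" .
```

Setting DRY_RUN=0 in the environment performs the real transfers; for anything beyond a handful of small files, Globus Online is the better fit.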
Beagle’s operating system
• Cray XE6 uses Cray Linux Environment v3 (CLE3)
• SuSE Linux-based
• Compute nodes use Compute Node Linux (CNL)
• Login and sandbox nodes use a more standard Linux
• The two are different.
• Compute nodes can operate in
  – ESM (extreme scalability mode) to optimize performance of large multi-node calculations
  – CCM (cluster compatibility mode) for out-of-the-box compatibility with Linux/x86 versions of software, without recompilation or relinking!
• To know more:
http://www.ci.uchicago.edu/wiki/bin/view/Beagle/ComputeOnBeagle#Basics_about_the_work_environmen
Modules and work environment
• Modules set the environment necessary to use a specific application, collection of applications, or library
• A module dynamically modifies the user environment
• The module command provides a number of capabilities including:
  – loading a module (module load)
  – unloading a module (module unload)
  – unloading a module and loading another (module swap)
  – listing which modules are loaded (module list)
  – determining which modules are available (module avail)
• To know more:
http://www.ci.uchicago.edu/wiki/bin/view/Beagle/ComputeOnBeagle#Modules_and_work_Environment
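The module subcommands listed above can be tried in a short session such as the following sketch. The module names (PrgEnv-pgi, PrgEnv-gnu, fftw) come from other slides in this talk, installed versions may differ, and the guard is only there so that the snippet does nothing harmful on a machine without the module command.

```shell
#!/bin/bash
# Sketch of a typical module session on a login or sandbox node.
# Assumption: the "module" shell function is defined, as it normally is in a
# login shell on Beagle; elsewhere the snippet only reports that it is missing.
if type module >/dev/null 2>&1; then
    module list                          # what is loaded right now
    module avail 2>&1 | grep -i fftw     # module avail writes to stderr
    module swap PrgEnv-pgi PrgEnv-gnu    # replace the PGI environment with the GNU one
    module load fftw                     # add a library on top of it
    module list                          # confirm the change
    status="module session finished"
else
    status="module command not found"
fi
echo "$status"
```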
Beagle’s filesystems
• /lustre/beagle: local Lustre filesystem (read-write; this is where batch jobs should do most of their I/O. NO BACKUP!)
• /gpfs/pads: PADS GPFS (read-write), for permanent storage
• /home: CI home directories (read-only on compute nodes)
• USE LUSTRE ONLY for I/O on compute nodes:
  – It is considerably faster than other filesystems
  – Use of other filesystems can seriously affect performance as they rely on network and I/O external to Beagle
• /soft, /tmp, /var, /opt, /dev, … usually you won’t need to worry about those
• To know more:
http://www.ci.uchicago.edu/wiki/bin/view/Beagle/ComputeOnBeagle#How_to_work_on_the_filesystem
The Lustre filesystem
• The I/O during computation should be done through the high-performance Lustre filesystem
• Lustre is mounted as /lustre/beagle
• Users have to create their own directory on Lustre. This is done to give them more freedom in how to set it up (naming, privacy …)
• To know more:
http://www.ci.uchicago.edu/wiki/bin/view/Beagle/ComputeOnBeagle#Tuning_the_performance_of_the_Lu
Lustre performance: striping
• Files in the Lustre filesystem are striped by default: split up into pieces and sent to different disks.
• This parallelization of the I/O allows the user to use more disks at the same time and may give them a higher bandwidth for I/O if used properly.
• Usually good values for the stripe count are between one and four. Higher values might be better for specific applications, but this is not likely.
• To know more:
http://www.ci.uchicago.edu/wiki/bin/view/Beagle/ComputeOnBeagle#Tuning_the_performance_of_the_Lu
Lustre basic commands
• lfs df: system configuration information
• lfs setstripe: create a file or directory with a specific striping pattern
• lfs getstripe: display file striping patterns
• lfs find [directory | file name]: find a file or directory
• Try typing: man lfs
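Put together, creating a personal Lustre directory and choosing its striping could look like the sketch below. Naming the directory after the user, the 700 permissions, and a stripe count of 4 (the upper end of the range suggested earlier) are assumptions chosen for illustration, not site policy.

```shell
#!/bin/bash
# Sketch: create a personal directory on Lustre and set its default striping.
# Assumptions: the directory is named after the user and a stripe count of 4
# suits the workload; adjust both to your own project.
LUSTRE_ROOT=/lustre/beagle
MYDIR="${LUSTRE_ROOT}/$(whoami)"

if command -v lfs >/dev/null 2>&1 && [ -d "$LUSTRE_ROOT" ]; then
    mkdir -p "$MYDIR"
    chmod 700 "$MYDIR"            # privacy is up to the user; 700 keeps it private
    lfs setstripe -c 4 "$MYDIR"   # new files created here are striped over 4 targets
    lfs getstripe "$MYDIR"        # display the resulting striping pattern
    lfs df -h                     # usage of the filesystem, per storage target
else
    echo "Lustre not available here; would set up $MYDIR"
fi
```

Striping set on a directory only applies to files created in it afterwards, so it is worth choosing before the first job writes there.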
How to move data to and from Beagle
• Beagle is not HIPAA-compliant: do not put PHI data on Beagle
• Example of factors for choosing a data movement tool:
  – how many files, how large the files are …
  – how much fault tolerance is desired,
  – performance,
  – security requirements, and
  – the overhead needed for software setup.
• Recommended tools:
  – scp/sftp can be OK for moving a few small files
    o pros: quick to initiate
    o cons: slow and not scalable
  – For optimal speed and reliability we recommend Globus Online:
    o high-performance (i.e., fast)
    o reliable and easy to use
    o easy to use from either a command line or web browser,
    o provides fault tolerant, fire-and-forget transfers.
  – If you know you'll be moving a lot of data or find scp is too slow/unreliable we recommend Globus Online
• To know more:
http://www.ci.uchicago.edu/wiki/bin/view/Beagle/ComputeOnBeagle#How_to_move_data_to_and_from_Bea
Getting data to the right place…
“I need my data over there – at my _____” (supercomputing center, campus server, etc.)
[Figure: Data Source → Data Destination]
Trivial, right?
What’s the big deal?
[Figure: Data Source → Data Destination; caption: “GAAAH! %&@#&”]
Reality: it is tedious and time-consuming
How It Works
[Figure: Data Source → Data Destination]
1. User initiates transfer request
2. Globus Online moves files
3. Globus Online notifies user
Getting Started (2 easy steps)
1. Sign up: Visit www.globusonline.org to create an account
2. Start moving files: Pick your data and where you want to move it, then click to transfer
File Movement Options
We strive to make Globus Online broadly accessible…
• You can just move files using the Web GUI
• To automate workflows you use the Command Line Interface (CLI)
• To know more: (quickstart, tutorials, FAQs …)
https://www.globusonline.org/resources/
Steps for computing on Beagle
• You need a user id on Beagle ✔
• You need an active project ✔
• You need to understand the basics of how the system works (check files, move files, create directories) ✔
• You need to move your data to Beagle ✔
• The application(s) that perform the calculations need to be installed on Beagle
• You need to submit your jobs to the compute nodes and monitor them
• You need to transfer your data back to your system ✔
Applications on Beagle
• Applications on Beagle are run from the command line, e.g.:
aprun -n 17664 myMPIapp <myInput >& this.log
• How do I know if an application is on Beagle?
  – http://beagle.ci.uchicago.edu/software/
  – http://www.ci.uchicago.edu/wiki/bin/view/Beagle/SoftwareOnBeagle
  – Use module avail, e.g.:
lpesce@login2:~> module avail 2>&1 | grep -i -E "gromacs|namd"
gromacs/4.5.3(default)
namd/2.7(default)
• What if it isn’t there?
• What if I want to use my own application?
If you need a tool that isn’t on Beagle
For any specific requirements, submit a ticket to beagle-support@ci.uchicago.edu with the following information:
• Research project, group and/or PI
• Name(s) of software package(s)
• Intended use and/or purpose
• Licensing requirements (if applicable)
• Specific instructions or preferences (specific release/version/vendor, associated packages, URLs for download, etc.)
Porting software to Beagle: modules
lpesce@login2:~> module list
Currently Loaded Modulefiles:
  1) modules/3.2.6.6
  2) nodestat/2.2-1.0301.23102.11.16.gem
  :
 12) xtpe-network-gemini
 13) pgi/11.1.0
 14) xt-libsci/10.5.0
 15) pmi/1.0-1.0000.8256.50.6.gem
 16) xt-mpich2/5.2.0
 17) xt-asyncpe/4.8
 18) atp/1.1.1
 19) PrgEnv-pgi/3.1.61
 20) xtpe-mc12
 21) torque/2.5.4
 22) moab/5.4.1
• PrgEnv-xxxx refers to the programming environment currently loaded
• Default is PGI (Portland Group compilers)
Compilation environment
lpesce@login2:~> module avail PrgEnv
lpesce@login2:~> module avail 2>&1 | grep PrgEnv
PrgEnv-cray/1.0.2
PrgEnv-cray/3.1.49A
PrgEnv-cray/3.1.61(default)
PrgEnv-gnu/3.1.49A
PrgEnv-gnu/3.1.61(default)
PrgEnv-pgi/3.1.49A
PrgEnv-pgi/3.1.61(default)
• Cray compilers
  – Excellent Fortran
  – CAF and UPC
• Gnu compilers
  – Excellent C
  – Standard
• PGI compilers
  – Excellent Fortran
  – Reliable
• We will soon also have Pathscale compilers
Compiling on Beagle
• Compilers are called
  – cc for a C compiler
  – CC for a C++ compiler
  – ftn for a Fortran compiler
• Do not use gcc, gfortran … those commands will produce an executable for the sandbox node!
• CC, cc, ftn, etc. … are cross-compilers (driver scripts) and produce code to be run on the compute nodes
• To know more:
http://www.ci.uchicago.edu/wiki/bin/view/Beagle/DevelopOnBeagle
Compiling on Beagle: environment set up
• Move your source files to Beagle
• Select a compiler and load it
e.g., module swap PrgEnv-pgi PrgEnv-gnu
• Determine whether additional libraries are required and whether
  – Native, optimized versions are available for the Cray XE6
http://docs.cray.com/cgi-bin/craydoc.cgi?mode=SiteMap;f=xe_sitemap
under “Math and Science Libraries”
  – For a list of all libraries installed on Beagle use:
module avail 2>&1 | less
• Load the required libraries
e.g., FFTW, via module load fftw
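As a concrete illustration of the compile step, the sketch below builds a tiny C program with the cc driver. The file hello.c and the -O2 flag are assumptions for the example; on Beagle, cc resolves to the Cray driver script for whichever PrgEnv module is loaded, while on another machine it is simply the local C compiler.

```shell
#!/bin/bash
# Sketch: build a small C program with the compiler driver "cc".
# Assumptions: hello.c is a hypothetical source file written here for the
# example; on Beagle the resulting binary targets the compute nodes.
cat > hello.c <<'EOF'
#include <stdio.h>
int main(void) { printf("hello from Beagle\n"); return 0; }
EOF

# On Beagle one would first select a programming environment, e.g.:
#   module swap PrgEnv-pgi PrgEnv-gnu
if command -v cc >/dev/null 2>&1; then
    cc -O2 -o hello hello.c     # cc, not gcc: the driver adds the Cray libraries
    built=$?
else
    echo "no cc driver on this machine"
    built=127
fi
echo "compile exit status: $built"
```

The same pattern applies to CC for C++ sources and ftn for Fortran sources.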
User’s guides and man pages
• PGI:
  – http://www.pgroup.com/resources/docs.htm
  – Or type man pgf90, man pgcc, man pgCC
• GCC:
  – http://gcc.gnu.org/onlinedocs/
  – Or type man gfortran, man gcc, man g++
• Cray: under “Programming Environment”
  – http://docs.cray.com/cgi-bin/craydoc.cgi?mode=SiteMap;f=xe_sitemap
  – Or type man crayftn, man craycc, man crayc++
• Pathscale:
  – http://www.pathscale.com/documentation
  – Or type man pathf90, man pathcc, man pathCC, man eko
More details about the environment
• Beagle can use both statically and dynamically linked (shared) libraries
http://www.ci.uchicago.edu/wiki/bin/view/Beagle/DevelopOnBeagle#Static_vs_Dynamic_linking
• All compilers on Beagle support:
  – MPI (Message Passing Interface, standard for distributed computing) and
  – OpenMP (standard for shared memory computing). Note: flags activating OpenMP pragmas or directives might be different among compilers, see man pages.
• Some compilers also support PGAS languages (e.g., CAF or UPC), for example the Cray compilers
Steps for computing on Beagle
• You need a user id on Beagle ✔
• You need an active project ✔
• You need to understand the basics of how the system works (check files, move files, create directories) ✔
• You need to move your data to Beagle ✔
• The application(s) that perform the calculations need to be installed on Beagle ✔
• You need to submit your jobs to the compute nodes and monitor them
• You need to transfer your data back to your system ✔
On running jobs on compute nodes
• The system operates through a resource manager (Torque) and a scheduler (Moab)
• Beagle CLE (Cray Linux Environment) supports both interactive and batch computations
• When running applications on the compute nodes, it is best to work from the login nodes (as opposed to the sandbox node, which is better used to develop)
• It is not possible to log in on the compute nodes
Launching an application on compute nodes
The steps below are usually all part of a PBS (Portable Batch System) script:
• The first step is to obtain resources, using the qsub command
• The second step is to set the appropriate environment to run the calculations
• The third step is to move input files, personal libraries and applications to the Lustre file system
• The fourth step is to run the application on the compute nodes using the application launcher (aprun)
• The final step is to move files back to /home or /gpfs/pads/projects
First step: request resources with qsub
• Users cannot access compute nodes without a resource request managed by Torque/Moab
• That is, you will always need to use qsub
• Typical calls to qsub are:
  – For an interactive job
qsub -I -l walltime=00:10:00,mppwidth=24
  – For a batch job
qsub my_script.pbs
Interactive
• When you run interactive jobs you will see a qsub prologue:
lpesce@login2:~> qsub -I -l walltime=00:10:00,mppwidth=24
qsub: waiting for job 190339.sdb to start
qsub: job 190339.sdb ready
############################# Beagle Job Start ##################
# Job ID: 190339 Project: CI-CCR000070
# Start time: Tue Jul 26 12:23:14 CDT 2011
# Resources: walltime=00:10:00
##############################################################
• Good for debugging and small tests
• Limited to one node (24 cores)
• After you receive a prompt, you can run your jobs via aprun:
lpesce@login2:~> aprun -n 24 myjob.exe <myinput >& my_log
lpesce@login2:~> aprun -n 24 myjob2.exe <myinput2 >& my_log2
Batch scripts
• Batch scheduling is usually done with a PBS script
• Scripts can be very complex (see following talk about Swift)
• Note: the script is executed on the login node! Only what follows the aprun command is run on the compute nodes
• We’ll look into simple scripts
• To know more:
http://www.ci.uchicago.edu/wiki/bin/view/Beagle/ComputeOnBeagle#How_to_submit_jobs
Example of an MPI script
#!/bin/bash
#PBS -N MyMPITest
#PBS -l walltime=1:00:00
#PBS -l mppwidth=240
#PBS -j oe
# Move to the directory where the script was submitted -- by the qsub command
cd $PBS_O_WORKDIR
# Define and create a directory on /lustre/beagle where to run the job
LUSTREDIR=/lustre/beagle/`whoami`/MyMPITest/${PBS_JOBID}
echo $LUSTREDIR
mkdir -p $LUSTREDIR
# Copy the input file and executable to /lustre/beagle
cp /home/lpesce/tests/openMPTest/src/hello_smp hello.in $LUSTREDIR
# Move to /lustre/beagle
cd $LUSTREDIR
# Note that here I was running hello_smp on 240 cores, i.e., using 240 PEs (by using -n 240)
# each with 1 thread -- i.e., just itself (default by not using -d)
aprun -n 240 ./hello_smp <hello.in > hello.out3
• Set shell (I use bash)
• Give a name to the job
• Set wall time to 1 hr (hh:mm:ss)
• Ask to merge err and output from the scheduler
• $PBS_O_WORKDIR: directory from where the script was submitted
• Name, output and make a directory on lustre
• Move all the files that will be used to lustre
• Go to lustre
• Use aprun to send the computation to the compute nodes
• -n 240 asks for 240 MPI processes
Example of an openMP script
#!/bin/bash
#PBS -N MyOMPTest
#PBS -l walltime=48:00:00
#PBS -l mppwidth=24
#PBS -j oe
# Move to the directory where the script was submitted -- by the qsub command
cd $PBS_O_WORKDIR
# Define and create a directory on /lustre/beagle where to run the job
LUSTREDIR=/lustre/beagle/`whoami`/MyTest/${PBS_JOBID}
echo $LUSTREDIR
mkdir -p $LUSTREDIR
# Copy the input file and executable to /lustre/beagle, these have to be user and project specific
cp /home/lpesce/tests/openMPTest/src/hello_smp hello.in $LUSTREDIR
# Move to /lustre/beagle
cd $LUSTREDIR
# Note that here I was running one PE (by using -n 1)
# each with 24 threads (by using -d 24)
# Notice the setting of the environmental variable OMP_NUM_THREADS for openMP
# if other multi-threading approaches are used they might need to be handled differently
OMP_NUM_THREADS=24 aprun -n 1 -d 24 ./hello_smp <hello.in > hello.out4
• Set shell (I use bash)
• Give a name to the job
• Set wall time to max: 48 hrs (hh:mm:ss)
• Ask to merge err and output from the scheduler
• $PBS_O_WORKDIR: directory from where the script was submitted
• Name, output and make a directory on lustre
• Move all the files that will be used to lustre
• Go to lustre
• Use aprun to send the computation to the compute nodes
• First set environmental variable OMP_NUM_THREADS to desired value (24 is rarely optimal!)
• -d 24 asks for 24 OMP threads per MPI process
• -n 1 asks for only one MPI process
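The relation between the aprun options and the number of nodes a job occupies can be checked with a little arithmetic. The helper below is hypothetical (it is not a Beagle or Cray tool) and assumes that each process keeps all of its threads on one 24-core node and that nodes are filled completely before the next one is used.

```shell
#!/bin/bash
# Hypothetical helper: estimate how many 24-core compute nodes a job occupies
# for a given combination of aprun -n (processes) and -d (threads per process).
CORES_PER_NODE=24   # from the compute-node slide: 2 x 12-core processors

nodes_needed() {
    local pes=$1            # number of processing elements (aprun -n)
    local depth=${2:-1}     # threads per PE (aprun -d), 1 if omitted
    if (( depth < 1 || depth > CORES_PER_NODE )); then
        echo "depth must be between 1 and ${CORES_PER_NODE}" >&2
        return 1
    fi
    local per_node=$(( CORES_PER_NODE / depth ))   # PEs that fit on one node
    echo $(( (pes + per_node - 1) / per_node ))    # round up to whole nodes
}

nodes_needed 240      # MPI example: 240 PEs, 1 thread each   -> 10
nodes_needed 1 24     # OpenMP example: 1 PE with 24 threads  -> 1
nodes_needed 8 6      # hybrid sketch: 8 PEs x 6 threads each -> 2
```

The first two calls correspond to the MPI and OpenMP scripts shown earlier; the third is an illustrative hybrid layout that does not appear in the talk.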
Recap of queues available on Beagle
Queue Name    Max Walltime   Max # nodes   Default # nodes   Max # jobs in queue   Total # Reserved nodes
interactive   4 hours        1             1                 1                     8
development   30 min         3             1                 2                     16
scalability   30 min         10            1                 4                     10
batch         2 days         none          1                 744                   N/A
• interactive
  – Recommended as first step in porting applications to Beagle
  – To test and debug code in real time
  – On one node
  – Provides dedicated resources to run continuous refinement sessions
• development
  – Recommended as second step, after the code compiles and runs using the interactive queue on one node
  – To test parallelism on a small scale
  – Up to 3 nodes
  – Provides dedicated resources to efficiently optimize and test parallelism
• scalability
  – Recommended as third step, after parallelism was tested on a small scale
  – Up to 10 nodes
  – Provides dedicated resources to efficiently test and refine scalability
• batch
  – Default queue, to run all the rest
• To know more:
http://www.ci.uchicago.edu/wiki/bin/view/Beagle/SchedulingPolicy
More about aprun
• The number of processors, both for MPI and OpenMP, is determined at launch time by the aprun command (more or less, that is)
• The aprun application launcher handles stdin, stdout and stderr for the user’s application
• To know more:
http://www.ci.uchicago.edu/wiki/bin/view/Beagle/ComputeOnBeagle#Aprun
Or type man aprun
To monitor applications and queues
• qsub: batch jobs are submitted using the qsub command
• qdel: is used to delete a job
• qstat: shows the jobs the resource manager, Torque, knows about (i.e., all those submitted using qsub).
  – qstat -a: show all jobs in submit order
  – qstat -a -u username: show all jobs of a specific user in submit order
  – qstat -f job_id: receive a detailed report on the job status
  – qstat -n job_id: what nodes is a job running on
  – qstat -q: gives the list of the queues available on Beagle
• showq: show all jobs in priority order. showq tells which jobs Moab, the scheduler, is considering eligible to run or is running
• showres: show all the reservations currently in place or that have been scheduled
• To know more:
http://www.ci.uchicago.edu/wiki/bin/view/Beagle/ComputeOnBeagle#commands_for_submitting_and_inqu
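The commands above are typically combined into a submit-then-check sequence like the sketch below. The script name my_script.pbs is the placeholder used earlier in the talk, and the job_number helper is hypothetical; it assumes qsub prints an identifier of the form 190339.sdb, as in the interactive example.

```shell
#!/bin/bash
# Sketch: submit a batch script, then inspect the job with qstat and showq.
# Assumptions: my_script.pbs exists in the current directory and qsub prints
# a job identifier such as "190339.sdb".

# Strip the server suffix from a Torque job identifier: 190339.sdb -> 190339
job_number() {
    echo "${1%%.*}"
}

if command -v qsub >/dev/null 2>&1; then
    if jobid=$(qsub my_script.pbs); then
        echo "submitted job $(job_number "$jobid")"
        qstat -a -u "$(whoami)"     # all of my jobs, in submit order
        qstat -f "$jobid"           # detailed report for this job
        showq | head -n 20          # what Moab considers running or eligible
        # qdel "$jobid"             # uncomment to remove the job from the queue
    fi
else
    echo "Torque commands not found; nothing submitted"
fi
```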
Acknowledgments
• BSD for funding most of the operational costs of Beagle
• A lot of the images and the content have been taken or learned from Cray documentation or their staff
• Globus for providing us with many slides and support; special thanks to Mary Bass, manager for communications and outreach at the CI.
• NERSC and its personnel provided us with both material and direct instruction; special thanks to Katie Antypas, group leader of the User Services Group at NERSC
• All the people at the CI who supported our work, from administrating the facilities to taking pictures of Beagle
Thanks!
We look forward to working with you.
Questions?
(or later: beagle-support@ci.uchicago.edu)