Job Submission Using PBSPro and Globus Job Commands

Overview




Computational Resources
Queuing Systems
PBSPro
Globus Toolkit
Computational Resources







All computers have limited resources:
    Hard Disks (permanent storage)
    Number of CPUs (processing power)
    CPU time (processing time)
    Physical memory (program size)
These resources may span across multiple processors/machines
How to allocate these resources fairly, amongst the many users?
Queuing Systems




Holding area for pending requests
A method for allocating the needed resources based on a user request
Processing of requests dynamically, quickly and fairly
Many different implementations
PBSPro




Queuing System used to control the allocation of computational resources to user-submitted jobs
Allows optimal sharing of all resources
Ensures that the limited resources are not over-run or exceeded
Unattended processing of requests
PBS Queues


Allows distribution of resources into clearly defined groups, called queues
Queues can be defined by:
    Maximum CPU time
    Number of CPUs available
    Memory needed
    Concurrently executing jobs
Queuing schemes evolve over time as user requests and workload vary
Interacting With PBS





qsub – Submit a job to the queues
qstat – Check the status of a job
qdel – Delete a job from the queues
qmgr – Create/modify queue settings
xpbs – Monitor all queues and jobs
qsub – Submit A Job




To submit a job to the Queuing System, use “qsub”
E.g. qsub test.sub
test.sub is a script file containing the commands to be executed
qsub returns your “QueueID” if the job was submitted successfully:
61606.master
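When qsub is called from a script, the identifier it prints can be captured and reused with qstat or qdel. The following is a minimal sketch, assuming the identifier has the number.server form shown above. The helper names job_number and job_server are hypothetical and are not part of PBS.

```shell
#!/bin/bash
# Split a PBS job identifier such as "61606.master" into its parts.
# job_number and job_server are illustrative helpers, not PBS commands.

job_number() {
    # Everything before the first dot is the sequence number.
    printf '%s\n' "${1%%.*}"
}

job_server() {
    # Everything after the first dot is the server name.
    printf '%s\n' "${1#*.}"
}

# Only attempt a real submission when qsub is installed and test.sub exists.
if command -v qsub >/dev/null 2>&1 && [ -f test.sub ]; then
    if job_id=$(qsub test.sub); then
        echo "Submitted job $(job_number "$job_id") on server $(job_server "$job_id")"
    fi
fi
```

Keeping the full identifier in a variable such as job_id means it can be passed unchanged to later commands.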
Submit File For A Serial Job

#!/bin/bash
#PBS -l walltime=2:00:00
#PBS -l mem=5mb
#PBS -j oe
#PBS -m be
cd Serial
./test
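In the serial submit file, -l lines request resources (wall-clock time and memory), -j oe merges standard error into standard output, and -m be asks for mail when the job begins and ends. As a rough sketch, a submit file of this shape could be generated from parameters. The function name make_serial_job and its argument order are assumptions made for illustration only.

```shell
#!/bin/bash
# Write a serial submit file to standard output.
# make_serial_job is a hypothetical helper; the directives mirror the
# serial submit file shown in the slide.

make_serial_job() {
    local walltime=$1 mem=$2 workdir=$3 program=$4
    cat <<EOF
#!/bin/bash
#PBS -l walltime=${walltime}
#PBS -l mem=${mem}
#PBS -j oe
#PBS -m be
cd ${workdir}
${program}
EOF
}

# Reproduce the slide's example on standard output.
make_serial_job 2:00:00 5mb Serial ./test
```

Redirecting the output to a file (for example test.sub) gives a script that can then be passed to qsub.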
Submit File For A Parallel Job

#!/bin/sh
#PBS -l nodes=4:ppn=2
#PBS -l walltime=48:00:00
#PBS -j oe
cd ./MPI/Examples
mpiexec ./test
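The request nodes=4:ppn=2 asks for 4 nodes with 2 processors on each, which gives 8 processor slots in total. The arithmetic can be sketched as below. The helper total_procs is hypothetical and only understands the simple nodes=N:ppn=M form, treating a missing value as 1.

```shell
#!/bin/bash
# Compute the processor slots requested by a "nodes=N:ppn=M" string.
# total_procs is an illustrative helper, not a PBS command.

total_procs() {
    local spec=$1 nodes ppn
    # Extract the digits following "nodes=" and "ppn=" (empty if absent).
    nodes=$(printf '%s\n' "$spec" | sed -n 's/.*nodes=\([0-9][0-9]*\).*/\1/p')
    ppn=$(printf '%s\n' "$spec" | sed -n 's/.*ppn=\([0-9][0-9]*\).*/\1/p')
    # Fall back to 1 for any value that was not given.
    echo $(( ${nodes:-1} * ${ppn:-1} ))
}

total_procs "nodes=4:ppn=2"   # prints 8
```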
qstat – Check Job Status

To check the status of a job submitted to the Queuing System, use “qstat”

Job id           Name    User     Time      S  Queue
---------------  ------  -------  --------  -  -----
61395.master     G5D2C   ngs0140  70:00:40  R  cpu16
61494.master     STDIN   ngs0227  17:58:40  R  cpu1
61495.master     STDIN   ngs0227  18:15:42  R  cpu1
61496.master     STDIN   ngs0227  18:15:02  R  cpu1
61497.master     STDIN   ngs0227  18:13:42  R  cpu1
61498.master     STDIN   ngs0227  18:14:13  R  cpu8
61555.master     Co_V    ngs0133  20:57:53  R  cpu24
61567.master     20_12   ngs0234  00:31:17  R  cpu1
61576.master     20_21   ngs0234  00:11:59  R  cpu4
61578.master     20_23   ngs0234  00:05:51  R  cpu1
61580.master     20_25   ngs0234  00:03:09  R  cpu1
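A qstat listing covers every user's jobs, so it is often filtered down to one user. The sketch below assumes the six-column layout of the listing (job id, name, user, time, state, queue) and uses a few of its rows as sample input. The function name jobs_for_user is hypothetical.

```shell
#!/bin/bash
# List the running jobs of one user from qstat-style output.
# jobs_for_user is an illustrative helper; it assumes the columns
# Job id, Name, User, Time, S, Queue in that order.

jobs_for_user() {
    # $1 = user name; the listing is read from standard input.
    awk -v user="$1" '$3 == user && $5 == "R" { print $1 }'
}

# Sample rows copied from the qstat listing.
sample='61395.master G5D2C ngs0140 70:00:40 R cpu16
61494.master STDIN ngs0227 17:58:40 R cpu1
61555.master Co_V  ngs0133 20:57:53 R cpu24
61567.master 20_12 ngs0234 00:31:17 R cpu1'

printf '%s\n' "$sample" | jobs_for_user ngs0234   # prints 61567.master
```

On a live system the same function could read from qstat directly, e.g. qstat | jobs_for_user "$USER", provided the local qstat output uses the same column order.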
qdel – Delete A Job


To delete a job submitted to the Queuing System, use “qdel”
E.g. qdel QueueID
     qdel QueueID1 QueueID2
Globus Toolkit


An open source toolkit for developing Grid-based applications and connectivity
Allocates computational resources on remote (Globus-aware) machines for the execution of user-submitted jobs
Globus Job Commands




globus-job-run <options>
globus-job-submit <options>
globus-job-status URL
globus-job-get-output URL
globus-job-run




Allows you to run a job as though it were interactive, on a local or remote machine
No need to log on to the machine itself
Not submitted to the Queuing System
Returns the program’s output as though you were running interactively
globus-job-run Examples




globus-job-run grid-data.man.ac.uk /bin/date
globus-job-run grid-data.rl.ac.uk ./test
globus-job-run \
grid-compute.leeds.ac.uk/jobmanager-pbs \
-np 8 -x '(jobtype=mpi)(environment=(NGSMODULES clusteruser))' \
./MPI/test
globus-job-run grid-data.rl.ac.uk -s ./test
globus-job-submit


Submits a job through a Globus Job Manager
The command returns a URL to the program’s output
https://grid-compute.leeds.ac.uk:64167/5291/1094639422/

The status and output of the job can be checked through this URL
globus-job-submit Examples



globus-job-submit \
grid-data.rl.ac.uk/jobmanager-pbs ./test
globus-job-submit \
grid-compute.leeds.ac.uk/jobmanager-pbs \
-x '(jobtype=mpi)(directory=/home/bob/mpi)(environment=(NGSMODULES clusteruser))(count=8)' \
./mpi_program
globus-job-submit \
grid-data.man.ac.uk/jobmanager-pbs -s ./test
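The argument to -x in these examples is a string of parenthesised name=value relations. Building that string in one place keeps it on a single line and avoids stray characters from line continuations inside the quotes. The helper below is a sketch only; the name rsl is hypothetical and is not a Globus command.

```shell
#!/bin/bash
# Build an RSL-style string from name=value arguments.
# rsl is an illustrative helper, not part of the Globus Toolkit.

rsl() {
    local out="" pair
    # Wrap each name=value argument in parentheses and concatenate.
    for pair in "$@"; do
        out+="(${pair})"
    done
    printf '%s\n' "$out"
}

# Rebuild the relation string used in the MPI example.
rsl jobtype=mpi directory=/home/bob/mpi 'environment=(NGSMODULES clusteruser)' count=8
```

The result can then be passed as a single quoted argument, for instance -x "$(rsl jobtype=mpi count=8)".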
globus-job-status URL


To get the status of a submitted job
globus-job-status \
https://grid-compute.leeds.ac.uk:64167/5291/1094639422/

Returns: Pending, Active, Done or Failed
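Because a submitted job runs unattended, a script usually polls its status until a final state is reached. The loop below is a minimal sketch under stated assumptions: the state names are the four listed here, compared case-insensitively, and the 30 second default interval is arbitrary. STATUS_CMD is a hypothetical override that lets the loop be tried without access to a grid.

```shell
#!/bin/bash
# Poll a job URL until the job is Done or Failed.
# wait_for_job and STATUS_CMD are illustrative names, not Globus features.

wait_for_job() {
    local url=$1 interval=${2:-30} state
    while :; do
        # Ask for the current state; give up if the status command fails.
        state=$("${STATUS_CMD:-globus-job-status}" "$url") || return 2
        case $(printf '%s' "$state" | tr '[:lower:]' '[:upper:]') in
            DONE)   echo "$state"; return 0 ;;
            FAILED) echo "$state"; return 1 ;;
        esac
        # Pending or Active: wait and ask again.
        sleep "$interval"
    done
}
```

A typical call would pass the URL returned by globus-job-submit, and retrieve the output with globus-job-get-output only when wait_for_job returns successfully.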
globus-job-get-output URL


To retrieve the output of a job submitted through a Globus Job Manager
globus-job-get-output \
https://grid-compute.leeds.ac.uk:64167/5291/1094639422/

Returns the output of the program to the console