Transcript CS 696 Introduction to Grid Computing: Lecture #1
GRAM: Globus Resource Allocation and Management
• GRAM is designed to provide a single common protocol and API for requesting and using remote system resources
– a uniform, extensible interface to local job scheduling systems.
• API for
– submitting and canceling a job request
– checking the status of a submitted job.
• Specifications are written by the user in the Resource Specification Language (RSL)
– processed by GRAM as part of the job request.
• By design, GRAM does not guarantee user environments on remote hosts.
GRAM Architecture (GT2)
GRAM:
• Resource – An entity capable of running one or more processes on behalf of a user.
• Client – The process that is using the resource allocation client-side API.
• Job – A process or set of processes resulting from a job request.
• Job Request – A request to the gatekeeper to create one or more job processes, expressed in RSL.
• Gatekeeper – A process, running as root, which begins the process of handling allocation requests.
– It exists on the remote computer before any request is submitted.
– When the gatekeeper receives an allocation request from a client, it mutually authenticates with the client, maps the requestor to a local user, starts a job manager on the local host as the local user, and passes the allocation arguments to the newly created job manager.
• Job Manager – One job manager is created by the gatekeeper for every request submitted to the gatekeeper.
– It starts the job on the local system and handles all further communication with the client.
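The gatekeeper steps described above (authenticate, map the requestor to a local user, start a job manager, pass on the allocation arguments) can be sketched as a toy model. This is an illustration only, not Globus code: the `GRID_MAP` dictionary, the `JobManager` class and the `gatekeeper_handle` function are hypothetical names, and mutual authentication is reduced to a boolean flag.

```python
from dataclasses import dataclass

# Hypothetical mapping from a requestor's identity to a local account.
GRID_MAP = {"/O=Grid/CN=Dana": "dana"}


@dataclass
class JobManager:
    """Toy stand-in for the job manager created for one request."""
    local_user: str
    rsl: str

    def start_job(self) -> str:
        # A real job manager would hand the job to the local scheduler.
        return f"job started as {self.local_user}: {self.rsl}"


def gatekeeper_handle(subject: str, authenticated: bool, rsl: str) -> JobManager:
    """Mimic the gatekeeper: authenticate, map the user, create a job manager."""
    if not authenticated:
        raise PermissionError("mutual authentication failed")
    if subject not in GRID_MAP:
        raise PermissionError(f"no local user mapped for {subject}")
    # One job manager per request, running as the mapped local user.
    return JobManager(local_user=GRID_MAP[subject], rsl=rsl)


if __name__ == "__main__":
    jm = gatekeeper_handle("/O=Grid/CN=Dana", True, "&(executable=/bin/date)")
    print(jm.start_job())
```

The point of the sketch is the division of labour: the gatekeeper only decides who the request runs as, and everything after that belongs to the job manager.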
RSL Resource Specification Language
• RSL is a type of formal language
– Has its own syntax and parsing rules
– Works like Unix regular expressions and shell scripts
– We will cover in more detail next week
– Ugly… users hate it…

    & (rsl_substitution = (TOPDIR "/home/nobody")
                          (DATADIR $(TOPDIR)"/data")
                          (EXECDIR $(TOPDIR)/bin) )
      (executable = $(EXECDIR)/a.out
                    (* ^-- implicit concatenation *))
      (directory = $(TOPDIR) )
      (arguments = $(DATADIR)/file1
                   (* ^-- implicit concatenation *)
                   $(DATADIR) # /file2
                   (* ^-- explicit concatenation *)
                   '$(FOO)'  (* <-- a quoted literal *))
      (environment = (DATADIR $(DATADIR)))
      (count = 1)

• Performing all variable substitution and removing comments yields an equivalent RSL string:

    & (rsl_substitution = (TOPDIR "/home/nobody")
                          (DATADIR "/home/nobody/data")
                          (EXECDIR "/home/nobody/bin") )
      (executable = "/home/nobody/bin/a.out" )
      (directory = "/home/nobody" )
      (arguments = "/home/nobody/data/file1"
                   "/home/nobody/data/file2"
                   "$(FOO)" )
      (environment = (DATADIR "/home/nobody/data"))
      (count = 1)
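To make the substitution step concrete, here is a minimal Python sketch of how `$(NAME)` references could be expanded in definition order. It is not an RSL parser: it ignores quoting, the `#` concatenation operator and `(* ... *)` comments, and the helper names `expand` and `resolve` are hypothetical.

```python
import re

# Matches a variable reference such as $(TOPDIR).
_VAR = re.compile(r"\$\(([A-Za-z_][A-Za-z0-9_]*)\)")


def expand(value: str, variables: dict) -> str:
    """Replace every $(NAME) in value with its definition."""
    return _VAR.sub(lambda m: variables[m.group(1)], value)


def resolve(definitions: list) -> dict:
    """Resolve (name, raw value) pairs in order; later ones may use earlier ones."""
    variables = {}
    for name, raw in definitions:
        variables[name] = expand(raw, variables)
    return variables


if __name__ == "__main__":
    variables = resolve([
        ("TOPDIR", "/home/nobody"),
        ("DATADIR", "$(TOPDIR)/data"),
        ("EXECDIR", "$(TOPDIR)/bin"),
    ])
    print(expand("$(EXECDIR)/a.out", variables))
    print(expand("$(DATADIR)/file1", variables))
```

An undefined name raises `KeyError` in this sketch; a real RSL processor also has to leave quoted literals such as `'$(FOO)'` untouched, which is deliberately not handled here.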
GRAM Job Execution Environment
HOME
– The user's home directory.
LOGNAME
– The user's login name.
X509_USER_PROXY
– The path to the job manager's delegated credential. (GSI only).
GLOBUS_GRAM_JOB_CONTACT
– The job manager's contact string for this job.
GLOBUS_GRAM_MYJOB_CONTACT
– The GRAM MyJob contact string for intrajob communication.
GLOBUS_LOCATION
– The path to the Globus installation on the job manager host.
X509_CERT_DIR*
– The path to a trusted certificate directory. This variable will only be set if the -x509-cert-dir argument is given to the job manager.
GLOBUS_GASS_CACHE_DEFAULT*
– The path to the job's GASS cache (if the gass_cache RSL attribute is present).
GLOBUS_TCP_PORT_RANGE*
– A system-specific range of TCP ports which may be used by the job. Globus I/O will automatically honor this range. Only present if the related configuration option is present in the job manager configuration file.
GLOBUS_REMOTE_IO_URL*
– The path to a file containing a URL string of a GASS server which the job may access (if the remote_io_url attribute is present).
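A job can inspect this environment at start-up. The sketch below, in Python, simply reports which of the variables listed above are set; the function name `gram_environment` is hypothetical, and the variables marked with * are treated as optional because they are only set under the conditions given in their descriptions.

```python
import os

# Names copied from the list above.
GRAM_VARIABLES = [
    "HOME", "LOGNAME", "X509_USER_PROXY", "GLOBUS_GRAM_JOB_CONTACT",
    "GLOBUS_GRAM_MYJOB_CONTACT", "GLOBUS_LOCATION", "X509_CERT_DIR",
    "GLOBUS_GASS_CACHE_DEFAULT", "GLOBUS_TCP_PORT_RANGE",
    "GLOBUS_REMOTE_IO_URL",
]


def gram_environment(environ=None) -> dict:
    """Return the GRAM-related variables that are actually set."""
    if environ is None:
        environ = os.environ
    return {name: environ[name] for name in GRAM_VARIABLES if name in environ}


if __name__ == "__main__":
    for name, value in gram_environment().items():
        print(f"{name}={value}")
```

Passing a dictionary instead of reading `os.environ` directly makes the function easy to try outside a GRAM job.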
Globus client tools
• Use these to submit jobs to GRAM and get remote tasks done:
– globusrun – most basic way
– globus-job-run – powerful
• Test authentication (effectively ‘ping’):
$ globusrun -a -r cab047.info.uvt.ro
GRAM Authentication test successful
• Execute simple remote Unix commands:
$ globus-job-run blue.info.uvt.ro /bin/uname -a
Linux blue 2.6.16-hardened-r10 #2 SMP Fri Sep 1 22:36:46 EEST 2006 i686 Intel(R) Xeon(TM) CPU 2.40GHz GenuineIntel GNU/Linux
Globus Jobs: complex but powerful
• Most of these require RSL input
• Typically used for batch job submission (e.g. jobs submitted to a queuing system on a cluster)
– globus-job-get-output
– globus-job-run
– globus-job-status
– globus-job-submit
– globus-job-cancel
– globus-job-clean
globus-job-run: used to run code remotely
• Run on a single node of a remote cluster:
[dana@Hport ~]$ globus-job-run cab047 ./hello.sh
Hello from cab047
[dana@Hport ~]$
• Run on multiple CPUs:
[dana@Hport ~/.globus]$ globus-job-run cab047 -np 4 /home/dana/hello.sh
Hello from cab047
Hello from cab047
Hello from cab047
Hello from cab047
globus-job-run: Examples
• Run multiple commands:
globus-job-run cab047 /bin/sh -c "cd my_dir ; ls"
• Run several MPI jobs:
globus-job-run \
    -: wn01 -np 64 -s my-aix-exec \
    -: nanosim1 -np 128 -s my-linux-exec
• For help:
globus-job-run -help
Examples taken from NPACI training class (L. Brieger)
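When these commands are driven from a script, building the argument list explicitly avoids the shell-quoting pitfalls of the "run multiple commands" example. The following Python sketch only assembles the arguments for a single-host call and uses only the `-np` option shown in the slides; `job_run_argv` is a hypothetical helper, and nothing is executed.

```python
import shlex


def job_run_argv(host: str, executable: str, *args: str, np: int = 1) -> list:
    """Build the argument vector for a single-host globus-job-run call."""
    argv = ["globus-job-run", host]
    if np > 1:
        # -np requests several processes, as in the multi-CPU example.
        argv += ["-np", str(np)]
    argv.append(executable)
    argv.extend(args)
    return argv


if __name__ == "__main__":
    argv = job_run_argv("cab047", "/bin/sh", "-c", "cd my_dir ; ls")
    # shlex.join shows how the command would be quoted in a shell.
    print(shlex.join(argv))
```

On a machine with the Globus client tools installed, the resulting list could be passed to `subprocess.run`; that step is left out because it depends on the local installation.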
globus-job-submit: Remote batch jobs
• For help:
globus-job-submit -help
• To submit jobs to the remote batch scheduler:
% globus-job-submit \
    cab047/jobmanager-batch \
    -queue normal -np 4 /home/dana/mpi/little
https://cab047.info.uvt.ro:44864/68982/1047069851/
(jobID in response to submission)
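The jobID returned on submission is an ordinary URL, so a script can split it with standard tools. The Python sketch below extracts the host and port of the job contact; `parse_job_contact` is a hypothetical helper, and the slides do not say what the path components mean, so they are kept as opaque strings.

```python
from urllib.parse import urlsplit


def parse_job_contact(contact: str) -> dict:
    """Split a job contact URL into host, port and opaque path components."""
    parts = urlsplit(contact)
    return {
        "host": parts.hostname,
        "port": parts.port,
        # Meaning of these components is not given in the slides.
        "path": parts.path.strip("/").split("/"),
    }


if __name__ == "__main__":
    contact = "https://cab047.info.uvt.ro:44864/68982/1047069851/"
    print(parse_job_contact(contact))
```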
Job management
• Use jobID to check on job status:
globus-job-status https://cab047.info.uvt.ro:44864/68982/1047069851
PENDING … ACTIVE … DONE
• Use jobID to retrieve output or cancel job:
globus-job-get-output \
    https://cab047.info.uvt.ro:44864/68982/1047069851
globus-job-cancel \
    https://cab047.info.uvt.ro:44864/68982/1047069851
• Use jobID to clean up cached output from job (on remote machine):
globus-job-clean https://cab047.info.uvt.ro:44864/68982/1047069851
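The PENDING, ACTIVE, DONE progression suggests a simple polling loop. The Python sketch below waits for a terminal state using an injected status function, so it can be tried without a grid; `wait_for_job` is a hypothetical helper, and treating `FAILED` as a terminal state is an assumption, since the slides only show `DONE`.

```python
import time
from typing import Callable

# DONE comes from the slides; FAILED is an assumed extra terminal state.
TERMINAL_STATES = {"DONE", "FAILED"}


def wait_for_job(get_status: Callable[[], str], interval: float = 5.0,
                 max_polls: int = 100) -> list:
    """Poll get_status until a terminal state; return the distinct states seen."""
    seen = []
    for _ in range(max_polls):
        state = get_status().strip()
        # Record a state only when it differs from the previous one.
        if not seen or seen[-1] != state:
            seen.append(state)
        if state in TERMINAL_STATES:
            return seen
        time.sleep(interval)
    last = seen[-1] if seen else "unknown"
    raise TimeoutError(f"job still {last} after {max_polls} polls")


if __name__ == "__main__":
    fake = iter(["PENDING", "PENDING", "ACTIVE", "DONE"])
    print(wait_for_job(lambda: next(fake), interval=0))
```

In real use, `get_status` would wrap a call to `globus-job-status` with the jobID and return its output; that wrapper is omitted because it needs a working Globus installation.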