CS 696 Introduction to Grid Computing: Lecture #1


GRAM: Globus Resource Allocation and Management

• GRAM is designed to provide a single common protocol and API for requesting and using remote system resources
  – a uniform, extensible interface to local job scheduling systems.
• API for
  – submitting and canceling a job request
  – checking the status of a submitted job.
• Specifications are written by the user in the Resource Specification Language (RSL)
  – processed by GRAM as part of the job request.
• By design, GRAM does not guarantee user environments on remote hosts.

GRAM Architecture (GT2)

GRAM:

• Resource – An entity capable of running one or more processes on behalf of a user.
• Client – The process that is using the resource allocation client-side API.
• Job – A process or set of processes resulting from a job request.
• Job Request – A request to a gatekeeper to create one or more job processes, expressed in RSL.
• Gatekeeper – A process, running as root, which begins the process of handling allocation requests.
  – It exists on the remote computer before any request is submitted.
  – When the gatekeeper receives an allocation request from a client, it mutually authenticates with the client, maps the requestor to a local user, starts a job manager on the local host as the local user, and passes the allocation arguments to the newly created job manager.
• Job Manager – The gatekeeper creates one job manager for each request submitted to it.
  – It starts the job on the local system, and handles all further communication with the client.
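The gatekeeper and job manager hand-off described above can be sketched as a toy model. This is only an illustration, not GRAM code: the `GRID_MAP` table, the certificate subject, and the class and method names are all hypothetical, and authentication is reduced to a dictionary lookup.

```python
from dataclasses import dataclass

# Hypothetical mapping from certificate subject to local account
# (illustrative only; the subject string and user name are made up).
GRID_MAP = {"/O=Grid/CN=Dana Example": "dana"}


@dataclass
class JobManager:
    """Toy job manager: one per request, runs as the mapped local user."""
    local_user: str
    rsl: str
    state: str = "PENDING"

    def start_job(self) -> str:
        # A real job manager would pass the job to the local scheduler here.
        self.state = "ACTIVE"
        return self.state


class Gatekeeper:
    """Toy gatekeeper: checks the requestor, maps the user, creates a job manager."""

    def __init__(self, grid_map):
        self.grid_map = grid_map

    def handle_request(self, subject: str, rsl: str) -> JobManager:
        # Step 1: authentication, reduced here to "is the subject known?".
        if subject not in self.grid_map:
            raise PermissionError(f"no local mapping for {subject}")
        # Step 2: map the requestor to a local user.
        local_user = self.grid_map[subject]
        # Step 3: create one job manager for this request and pass it the RSL.
        return JobManager(local_user=local_user, rsl=rsl)


gatekeeper = Gatekeeper(GRID_MAP)
manager = gatekeeper.handle_request("/O=Grid/CN=Dana Example", "&(count=1)")
print(manager.local_user, manager.start_job())
```

After the hand-off the client talks only to the job manager, which is why `handle_request` returns it instead of a status.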

RSL: Resource Specification Language

RSL is a type of formal language
– Has its own syntax and parsing rules
– Works like Unix regular expressions and shell scripts
– We will cover it in more detail next week
– Ugly… users hate it…

    & (rsl_substitution = (TOPDIR "/home/nobody")
                          (DATADIR $(TOPDIR)"/data")
                          (EXECDIR $(TOPDIR)/bin) )
      (executable = $(EXECDIR)/a.out
          (* ^-- implicit concatenation *))
      (directory = $(TOPDIR) )
      (arguments = $(DATADIR)/file1
          (* ^-- implicit concatenation *)
                   $(DATADIR) # /file2
          (* ^-- explicit concatenation *)
                   '$(FOO)'   (* <-- a quoted literal *))
      (environment = (DATADIR $(DATADIR)))
      (count = 1)

Performing all variable substitution and removing comments yields an equivalent RSL string:

    & (rsl_substitution = (TOPDIR "/home/nobody")
                          (DATADIR "/home/nobody/data")
                          (EXECDIR "/home/nobody/bin") )
      (executable = "/home/nobody/bin/a.out" )
      (directory = "/home/nobody" )
      (arguments = "/home/nobody/data/file1"
                   "/home/nobody/data/file2"
                   "$(FOO)" )
      (environment = (DATADIR "/home/nobody/data"))
      (count = 1)
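The substitution step can be imitated in a few lines of Python. This is a minimal sketch, not an RSL parser: it only replaces `$(NAME)` references inside plain strings, and it ignores comments, the `#` concatenation operator, and quoted literals such as `'$(FOO)'`. The function names `expand` and `resolve` are made up for the illustration.

```python
import re

# Matches a variable reference such as $(TOPDIR).
_REF = re.compile(r"\$\(([A-Za-z_][A-Za-z0-9_]*)\)")


def expand(text, variables):
    """Replace every $(NAME) in text with its value from variables."""
    return _REF.sub(lambda match: variables[match.group(1)], text)


def resolve(definitions):
    """Resolve definitions in order, so later ones may refer to earlier ones."""
    variables = {}
    for name, raw in definitions:
        variables[name] = expand(raw, variables)
    return variables


# The three definitions from the rsl_substitution clause above,
# written with the concatenation already spelled out as one string.
definitions = [
    ("TOPDIR", "/home/nobody"),
    ("DATADIR", "$(TOPDIR)/data"),
    ("EXECDIR", "$(TOPDIR)/bin"),
]
variables = resolve(definitions)
executable = expand("$(EXECDIR)/a.out", variables)
print(executable)
```

In this sketch a reference to an undefined variable raises `KeyError`, which is one simple way to surface a typo in a variable name.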


GRAM Job Execution Environment

HOME

– The user's home directory.

LOGNAME

– The user's login name.

X509_USER_PROXY

– The path to the job manager's delegated credential. (GSI only).

GLOBUS_GRAM_JOB_CONTACT

– The job manager's contact string for this job.

GLOBUS_GRAM_MYJOB_CONTACT

– The GRAM MyJob contact string for intrajob communication.

GLOBUS_LOCATION

– The path to the Globus installation on the job manager host.

X509_CERT_DIR*

– The path to a trusted certificate directory. This variable will only be set if the -x509-cert-dir argument is given to the job manager.

GLOBUS_GASS_CACHE_DEFAULT*

– The path to the job's GASS cache (if the gass_cache RSL attribute is present).

GLOBUS_TCP_PORT_RANGE*

– A system-specific range of TCP ports which may be used by the job. Globus I/O will automatically honor this range. Only present if the related configuration option is present in the job manager configuration file.

GLOBUS_REMOTE_IO_URL*

– The path to a file containing a URL string of a GASS server which the job may access (if the remote_io_url attribute is present).
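A job can inspect these variables at start-up. The sketch below is one possible way to do that in Python. It assumes that the variables marked with an asterisk, plus `X509_USER_PROXY` (noted as GSI only), may be absent, and the helper name `describe_environment` is hypothetical.

```python
import os

# Variables the list above gives without an asterisk or a condition.
ALWAYS = [
    "HOME",
    "LOGNAME",
    "GLOBUS_GRAM_JOB_CONTACT",
    "GLOBUS_GRAM_MYJOB_CONTACT",
    "GLOBUS_LOCATION",
]
# Variables assumed to be set only under the conditions described above.
CONDITIONAL = [
    "X509_USER_PROXY",
    "X509_CERT_DIR",
    "GLOBUS_GASS_CACHE_DEFAULT",
    "GLOBUS_TCP_PORT_RANGE",
    "GLOBUS_REMOTE_IO_URL",
]


def describe_environment(env=None):
    """Return (present, missing): the GRAM variables found in env and the
    unconditional ones that are absent."""
    env = os.environ if env is None else env
    present = {name: env[name] for name in ALWAYS + CONDITIONAL if name in env}
    missing = [name for name in ALWAYS if name not in env]
    return present, missing


present, missing = describe_environment()
print("present:", sorted(present))
print("missing:", missing)
```

Run outside a GRAM job, most of the Globus variables will show up under `missing`, which makes the helper a quick check of whether a script was started by a job manager.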

Globus client tools

Use these to submit jobs to GRAM and get remote tasks done:

• globusrun – most basic way
• globus-job-run – powerful
• Test authentication (effectively ‘ping’):

    $ globusrun -a -r cab047.info.uvt.ro
    GRAM Authentication test successful

• Execute simple remote Unix commands:

    $ globus-job-run blue.info.uvt.ro /bin/uname -a
    Linux blue 2.6.16-hardened-r10 #2 SMP Fri Sep 1 22:36:46 EEST 2006 i686 Intel(R) Xeon(TM) CPU 2.40GHz GenuineIntel GNU/Linux

Globus Jobs: complex but powerful

• Most of these require RSL input
• Typically used for batch job submission (e.g. jobs submitted to a queuing system on a cluster)
  – globus-job-get-output
  – globus-job-run
  – globus-job-status
  – globus-job-submit
  – globus-job-cancel
  – globus-job-clean

globus-job-run: used to run code remotely

• Run on a single node of a remote cluster:

    [dana@Hport ~]$ globus-job-run cab047 ./hello.sh
    Hello from cab047
    [dana@Hport ~]$

• Run on multiple CPUs:

    [dana@Hport ~/.globus]$ globus-job-run cab047 -np 4 /home/dana/hello.sh
    Hello from cab047
    Hello from cab047
    Hello from cab047
    Hello from cab047

globus-job-run: Examples

• Run multiple commands:

    globus-job-run cab047 /bin/sh -c "cd my_dir ; ls"

• Run several MPI jobs:

    globus-job-run \
        -: wn01 -np 64 -s my-aix-exec \
        -: nanosim1 -np 128 -s my-linux-exec

• For help:

    globus-job-run -help

Examples taken from NPACI training class (L. Brieger)
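When the list of subjobs is generated by a script, the multi-request command line can be assembled programmatically. The sketch below only builds the argument list and prints it, so it runs without a Globus installation. The helper name `multi_request_argv` is hypothetical, and the description of `-s` in the comment is our reading of the flag, not something the slide states.

```python
import shlex


def multi_request_argv(subjobs):
    """Build the argv for a multi-request globus-job-run call.

    subjobs is a list of (host, process_count, executable) tuples.
    """
    argv = ["globus-job-run"]
    for host, count, executable in subjobs:
        # "-:" introduces each subjob, as in the example above.
        # "-s" is assumed to mark the executable as staged from the
        # submitting machine.
        argv += ["-:", host, "-np", str(count), "-s", executable]
    return argv


argv = multi_request_argv([
    ("wn01", 64, "my-aix-exec"),
    ("nanosim1", 128, "my-linux-exec"),
])
# Print the command as it could be pasted into a shell.
print(shlex.join(argv))
```

Keeping the arguments as a list, not one string, avoids quoting problems if the list is later passed to `subprocess.run`.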

globus-job-submit: Remote batch jobs

• For help:

    globus-job-submit -help

• To submit jobs to the remote batch scheduler:

    % globus-job-submit \
        cab047/jobmanager-batch \
        -queue normal -np 4 /home/dana/mpi/little
    https://cab047.info.uvt.ro:44864/68982/1047069851/

  (jobID in response to submission)

Job management

• Use jobID to check on job status:

    globus-job-status https://cab047.info.uvt.ro:44864/68982/1047069851
    PENDING … ACTIVE … DONE

• Use jobID to retrieve output or cancel job:

    globus-job-get-output \
        https://cab047.info.uvt.ro:44864/68982/1047069851
    globus-job-cancel \
        https://cab047.info.uvt.ro:44864/68982/1047069851

• Use jobID to clean up cached output from job (on remote machine):

    globus-job-clean https://cab047.info.uvt.ro:44864/68982/1047069851
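The PENDING, ACTIVE, DONE progression suggests a simple polling loop around `globus-job-status`. The sketch below is one way to write it. The names `wait_for_job` and `cli_status` are hypothetical, the `FAILED` terminal state is an addition not shown on the slide, and the status lookup is passed in as a function so the loop can be exercised without a grid.

```python
import subprocess
import time

# States after which polling stops. DONE comes from the slide;
# FAILED is an assumption added for this sketch.
TERMINAL_STATES = {"DONE", "FAILED"}


def cli_status(job_id):
    """Ask globus-job-status for the job state (needs the Globus client tools)."""
    result = subprocess.run(
        ["globus-job-status", job_id],
        capture_output=True, text=True, check=True,
    )
    return result.stdout.strip()


def wait_for_job(job_id, get_status=cli_status, poll_seconds=5.0,
                 max_polls=100, sleep=time.sleep):
    """Poll until the job reaches a terminal state.

    Returns the list of distinct states observed, in order.
    """
    seen = []
    for _ in range(max_polls):
        status = get_status(job_id)
        if not seen or seen[-1] != status:
            seen.append(status)
        if status in TERMINAL_STATES:
            return seen
        sleep(poll_seconds)
    raise TimeoutError(f"job {job_id} still {seen[-1]} after {max_polls} polls")


# Dry run with a scripted status source in place of the real CLI.
scripted = iter(["PENDING", "PENDING", "ACTIVE", "DONE"])
history = wait_for_job("hypothetical-job-id", lambda _job: next(scripted),
                       sleep=lambda _seconds: None)
print(history)
```

On a machine with the client tools installed, calling `wait_for_job(job_id)` with the jobID returned by `globus-job-submit` would use `cli_status` by default.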