Globus Toolkit
Execution Management
6a.1
[Figure: Globus Toolkit component diagram, credited to I. Foster.
Categories: Security, Data Management, Execution Management, Information Services, Common Runtime.
Layers: Web Services Components and Non-WS Components, tagged with the toolkit versions GT2, GT3 and GT4.
Components shown: Community Authorization Service; WS Authentication Authorization; Pre-WS Authentication Authorization; Delegation Service; Credential Management; OGSA-DAI [Tech Preview]; Reliable File Transfer; GridFTP; Replica Location Service; Grid Resource Allocation Mgmt (WS GRAM); Grid Resource Allocation Mgmt (Pre-WS GRAM); Community Scheduler Framework [contribution]; Monitoring & Discovery System (MDS4); Monitoring & Discovery System (MDS2); Java WS Core; C WS Core; Python WS Core [contribution]; C Common Libraries; XIO.]
Grid Resource Allocation
Manager (GRAM)
Job submission
6a.3
Resource Management
• Job submission
• Job status
• Basic resource allocation
6a.4
Outline
• GT 2 job submission using RSL version
1 language.
• GT 3.2 job submission using RSL
version 2 language.
• GT 4 job submission (can use RSL
version 2 language).
6a.5
Resource Allocation
Globus (2, 3.2, or 4.0) does not have its own
job scheduler to find resources and
automatically send jobs to suitable
machines.
For that, use a separate scheduler, e.g.
Condor, Sun Grid Engine, LSF, PBS, … .
6a.6
Globus Version 2
(pre-2004)
Pre-WS GRAM
6a.7
Globus version 2
From: “Introduction to Grid Computing
with Globus,” IBM Redbooks, Fig. 7-3.
6a.8
GT 2 GRAM
Job startup is done using the GRAM service.
Consists of:
• Gatekeeper
• Job Manager
Job manager can connect to a local resource
manager (scheduler).
GASS service -- provides access to remote files
and redirects standard output streams.
6a.9
GRAM Commands
• globusrun -- Runs a single executable on a
remote site.
• globus-job-run -- Allows you to run a job at one
or several remote resources. Uses globusrun to
submit job.
• globus-job-submit -- For batch job submission
(e.g. using a local scheduling job manager). Not
recommended; use globus-job-run or globusrun
instead, with job manager specified
6a.10
Scheduler
• Can specify a job scheduler with
globusrun, by adding scheduler name to
hostname:
<hostname>/jobmanager-lsf
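As a rough illustration, the contact string can be assembled in Python from a host name and a scheduler name. The function name `gram_contact` and the host `grid.example.org` are hypothetical; only the `<hostname>/jobmanager-<scheduler>` shape comes from the slide.

```python
def gram_contact(hostname, scheduler=None):
    """Return a pre-WS GRAM contact string, optionally naming a scheduler."""
    if scheduler is None:
        return hostname
    return f"{hostname}/jobmanager-{scheduler}"


# Hypothetical host, used only for illustration.
contact = gram_contact("grid.example.org", "lsf")
print(contact)
```

The resulting string is what would be passed to globusrun as the resource contact.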
6a.11
Specifying job
• Command uses a file to describe the job in
a language called Resource
Specification Language (RSL).
• RSL Version 1 -- a metalanguage
describing the job and its required
execution.
6a.12
Resource Specification Language
RSL
Provides a specification for:
• Resource requirements - machine type,
number of nodes, memory, etc.
• Job description - directory, executable,
arguments, environment
6a.13
RSL Version 1
Constraints Example
Conjunction (AND): &
• To create 3-5 instances of myProg, each on
a machine with at least 64 Mbytes memory
available to me for 1 hour:
&(executable=myProg)
(count>=3)(count<=5)(memory>=64)
(max_time=60)
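A minimal Python sketch of how such a conjunction could be assembled from individual relations. The helper names `relation` and `conjunction` are hypothetical and are not part of any Globus API; the attribute names are copied from the slide.

```python
def relation(attr, op, value):
    """Format one RSL relation, e.g. (count>=3)."""
    return f"({attr}{op}{value})"


def conjunction(*relations):
    """Join relations under the & (AND) operator."""
    return "&" + "".join(relations)


rsl = conjunction(
    relation("executable", "=", "myProg"),
    relation("count", ">=", 3),
    relation("count", "<=", 5),
    relation("memory", ">=", 64),
    relation("max_time", "=", 60),  # minutes, i.e. 1 hour
)
print(rsl)
```

Building the string from parts keeps each constraint on its own line and avoids hand-typed parenthesis mistakes.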
6a.14
RSL Version 1
Constraints Example
Disjunction (OR): |
• To create 5 instances of myProg, each on a
machine with at least 64 Mbytes memory or 7
instances of myProg, each on a machine with at
least 32 Mbytes memory :
&(executable=myProg)
(|(&(count=5)(memory>=64))
(&(count=7)(memory>=32)))
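The nesting in this example can be made explicit by composing it step by step. The following Python sketch uses hypothetical helper names (`rel`, `combine`, `sub`) and only mirrors the string structure shown on the slide; it does not validate the request against any scheduler.

```python
def rel(attr, op, value):
    """Format one RSL relation, e.g. (memory>=64)."""
    return f"({attr}{op}{value})"


def combine(operator, *parts):
    """Prefix a sequence of parts with an RSL operator (& or |)."""
    return operator + "".join(parts)


def sub(spec):
    """Wrap a nested specification in parentheses."""
    return f"({spec})"


option_a = combine("&", rel("count", "=", 5), rel("memory", ">=", 64))
option_b = combine("&", rel("count", "=", 7), rel("memory", ">=", 32))
either = combine("|", sub(option_a), sub(option_b))
rsl = combine("&", rel("executable", "=", "myProg"), sub(either))
print(rsl)
```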
6a.15
RSL version 1
Requesting multiple resources
multirequest: +
• To execute 5 instances of myProg1 on a
machine with at least 64 Mbytes memory
and execute 2 instances of myProg2:
+(&(count=5)(memory>=64)
(executable=myProg1))
(&(count=2)(executable=myProg2))
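Hand-written multirequests nest parentheses deeply, so a quick balance check before submission can catch typing slips. This is a simplified sketch: the function name `parens_balanced` is hypothetical, and it assumes no parentheses appear inside quoted values.

```python
def parens_balanced(rsl):
    """Return True if every '(' in the RSL string has a matching ')'."""
    depth = 0
    for ch in rsl:
        if ch == "(":
            depth += 1
        elif ch == ")":
            depth -= 1
            if depth < 0:  # a ')' appeared before its '('
                return False
    return depth == 0


# The multirequest for myProg1 and myProg2, written on one line.
multirequest = (
    "+(&(count=5)(memory>=64)(executable=myProg1))"
    "(&(count=2)(executable=myProg2))"
)
print(parens_balanced(multirequest))
```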
6a.16
Can specify different resource managers on
different machines using
resourceManagerContact attribute.
6a.17
RSL creation with Globus version 2
• GT2 globus-job-run can be used to
generate RSL from command line
arguments with -dumprsl flag
• -help gives options
6a.18
Globus 3.2
6a.19
GT 3.2 GRAM
“Globus Resource Allocation Manager”
A set of “OGSI” compliant services provided
to start remote jobs, notably:
• Master Managed Job Factory Service
(MMJFS).
Also a set of non-OGSI compliant services
(Gatekeeper, Jobmanager) from pre-GT3.
6a.20
Globus GT 3.x
6a.21
Resource Specification
Language, RSL, version 2
• GT3 and GT 4 use RSL version 2.
(Some differences in RSL language
specification in GT4, so not completely
interchangeable.)
• RSL Version 2 is an XML language.
6a.22
Resource Specification
Language Version 2 (RSL -2)
• Can specify everything from executable,
paths, arguments, input/output, error file,
number of processes, max/min execution
time, max/min memory, job type etc. etc.
6a.23
RSL-2
• Much more elegant and flexible, and in
keeping with systems using XML.
• Can use XML parsers.
• Allows more powerful mechanisms with
job schedulers.
• Resource scheduler/broker applies
specification to local resources.
6a.24
RSL-2 Example
Specifying Executable
(executable=/bin/echo)
<gram:executable>
<rsl:path>
<rsl:stringElement value="/bin/echo"/>
</rsl:path>
</gram:executable>
In GT 4 version of RSL-2, can simply write:
<executable>/bin/echo</executable>
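Because RSL-2 is XML, a standard parser can read it. The sketch below uses Python's `xml.etree.ElementTree` to pull the executable path out of the GT 3.2 form; the namespace URIs are the ones declared in the full GT 3.2 example in this deck, and the variable names are illustrative only.

```python
import xml.etree.ElementTree as ET

# Namespace prefixes as declared in the GT 3.2 RSL-2 document.
NS = {
    "rsl": "http://www.globus.org/namespaces/2003/04/rsl",
    "gram": "http://www.globus.org/namespaces/2003/04/rsl/gram",
}

fragment = """
<gram:executable
    xmlns:rsl="http://www.globus.org/namespaces/2003/04/rsl"
    xmlns:gram="http://www.globus.org/namespaces/2003/04/rsl/gram">
  <rsl:path>
    <rsl:stringElement value="/bin/echo"/>
  </rsl:path>
</gram:executable>
""".strip()

root = ET.fromstring(fragment)
element = root.find("rsl:path/rsl:stringElement", NS)
executable = element.get("value")
print(executable)
```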
6a.25
RSL-2 Example
Specifying Directory
(directory=“/bin”)
<gram:directory>
<rsl:path>
<rsl:stringElement value="/bin/"/>
</rsl:path>
</gram:directory>
In GT 4 version of RSL-2, can simply write:
<directory>/bin/</directory>
6a.26
RSL-2 Example
Specifying Number
(count=1)
<gram:count>
<rsl:integer value="1"/>
</gram:count>
In GT 4 version of RSL-2, can simply write:
<count>1</count>
6a.27
RSL-2 Example
Specifying Arguments
(arguments="Hello World")
<gram:arguments>
<rsl:string>
<rsl:stringElement value="Hello World"/>
</rsl:string>
</gram:arguments>
In GT 4 version of RSL-2, can simply write:
<argument>Hello World</argument>
6a.28
RSL and (GT 3.2) RSL-2 comparison for echo program
RSL-2 (GT 3.2):
<?xml version="1.0" encoding="UTF-8"?>
<rsl:rsl xmlns:rsl="http://www.globus.org/namespaces/2003/04/rsl"
xmlns:gram="http://www.globus.org/namespaces/2003/04/rsl/gram"
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xsi:schemaLocation="
http://www.globus.org/namespaces/2003/04/rsl
c:/ogsa-3.0/schema/base/gram/rsl.xsd
http://www.globus.org/namespaces/2003/04/rsl/gram
c:/ogsa-3.0/schema/base/gram/gram_rsl.xsd">
<gram:job>
<gram:executable> <rsl:path>
<rsl:stringElement value="/bin/echo"/> </rsl:path>
</gram:executable>
<gram:directory> <rsl:path>
<rsl:stringElement value="/bin"/> </rsl:path>
</gram:directory>
<gram:arguments>
<rsl:string> <rsl:stringElement value="Hello World"/> </rsl:string>
</gram:arguments>
<gram:stdin> <rsl:path>
<rsl:stringElement value="/dev/null"/> </rsl:path> </gram:stdin>
<gram:stdout>
<rsl:pathArray>
<rsl:path>
<rsl:substitutionRef name="HOME"/>
<rsl:stringElement value="/stdout"/>
</rsl:path>
</rsl:pathArray>
</gram:stdout>
<gram:stderr>
<rsl:pathArray>
<rsl:path>
<rsl:substitutionRef name="HOME"/>
<rsl:stringElement value="/stderr"/>
</rsl:path>
</rsl:pathArray>
</gram:stderr>
<gram:count> <rsl:integer value="1"/> </gram:count>
<gram:jobType>
<gram:enumeration>
<gram:enumerationValue> <gram:multiple/> </gram:enumerationValue>
</gram:enumeration>
</gram:jobType>
<gram:gramMyJobType>
<gram:enumeration>
<gram:enumerationValue> <gram:collective/> </gram:enumerationValue>
</gram:enumeration>
</gram:gramMyJobType>
<gram:dryRun> <rsl:boolean value="false"/> </gram:dryRun>
<gram:saveState> <rsl:boolean value="true"/> </gram:saveState>
<gram:twoPhase> <rsl:integer value="600"/> </gram:twoPhase>
</gram:job>
</rsl:rsl>
RSL version 1:
&((executable=/bin/echo)
(directory="/bin")
(arguments="Hello World")
(stdin=/dev/null)
(stdout="stdout")
(stderr="stderr")
(count=1)
)
6a.29
Running GT 3 Job
• Command:
managed-job-globusrun
with arguments naming the master managed job
factory service that processes the job and an
XML file that specifies the job.
• Equivalent to the GT 2 globusrun command.
6a.30
Globus 4.0
6a.31
GT 4 WS-GRAM
6a.32
• In WS GRAM, jobs are started by the
ManagedExecutionJobService,
which is a Java service
implementation running within the
Globus service container.
6a.33
Running GT 4 Job
• Command:
globusrun-ws
and arguments to specify job.
• Equivalent to the GT 3.2 managed-job-globusrun
command and the GT 2 globusrun command.
6a.34
GT4 job submission command
globusrun-ws
• Submit and monitor GRAM jobs
• Replaces (java) managed-job-globusrun
• Written in C, faster startup and execution
• Supports multiple and single job
submission
• Handles credential management
• Streaming of job stdout/err during
execution
6a.35
Simple job submission
• Step 1: Create proxy with grid-proxy-init
command.
• Step 2: globusrun-ws with parameters
to specify job.
6a.36
Some globusrun-ws flags
for job submission
6a.37
Job submission
-submit Submits (or resubmits) a job
to a job host in one of three output
modes: batch, interactive, or interactive-streaming.
This flag is required for job submission.
6a.38
Specifying where job is submitted
(ManagedJobFactory)
-F Specifies the “contact” for the job submission.
Default:
https://localhost:8443/wsrf/services/ManagedJobFactoryService
In assignment 3, simply localhost and the container port
number are used, i.e.
-F localhost:8443
6a.39
Submitting a single job
-c Causes globusrun-ws to generate
a simple job description with the named
program and arguments. This flag, if
used, must be the last flag.
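The ordering rule for -c can be captured in a small helper that always appends it last. This is a hedged Python sketch: `submit_command` is a hypothetical name, and only the flags described on these slides (-submit, -F, -c) are used.

```python
import shlex


def submit_command(program, args=(), factory=None):
    """Build a globusrun-ws argument list with -c as the final flag."""
    argv = ["globusrun-ws", "-submit"]
    if factory is not None:
        argv += ["-F", factory]
    # -c and the program with its arguments must come last.
    argv += ["-c", program, *args]
    return argv


argv = submit_command("/bin/echo", ["hello"], factory="localhost:8443")
command_line = shlex.join(argv)
print(command_line)
```

On a machine with GT 4 installed and a valid proxy, the list could be handed to `subprocess.run(argv)`; here it is only printed.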
6a.40
Example
Submit program echo with argument hello to
default local host.
% globusrun-ws -submit -c /bin/echo hello
Submitting job...Done.
Job ID: uuid:d23a7be0-f87c-11d9-a53b-0011115aae1f
Termination time: 07/20/2005 17:44 GMT
Current job state: Active
Current job state: CleanUp
Current job state: Done
Destroying job...Done.
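Output of this shape is easy to post-process in a script. The Python sketch below extracts the job ID and the sequence of reported states from a captured transcript; the regular expressions assume the exact line prefixes shown in this example and may need adjusting for other versions or output modes.

```python
import re

# Sample output in the shape shown on the slide.
output = """Submitting job...Done.
Job ID: uuid:d23a7be0-f87c-11d9-a53b-0011115aae1f
Termination time: 07/20/2005 17:44 GMT
Current job state: Active
Current job state: CleanUp
Current job state: Done
Destroying job...Done.
"""

# Job ID appears on its own line after the "Job ID: " prefix.
job_id = re.search(r"^Job ID: (\S+)$", output, re.MULTILINE).group(1)
# Each state change is reported on a "Current job state: " line.
states = re.findall(r"^Current job state: (\w+)$", output, re.MULTILINE)

print(job_id)
print(states)
```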
6a.41
• A successful submission will create a
new ManagedJob resource with its own
unique EPR for messaging.
• globusrun-ws will output this EPR to a
file when requested, as the sole
standard output when running in batch
mode.
6a.42
Selecting a different host
Example
$ globusrun-ws -submit -F https://140.221.65.193:4444/wsrf/services/ManagedJobFactoryService -c /bin/echo hello
6a.43
Using an RSL file
-f Similar to -c except job description held
in a file.
Example
globusrun-ws -submit -f echo.xml
where echo.xml is an RSL-2 file describing
job.
6a.44
Contents of echo.xml
<job>
<executable>/bin/echo</executable>
<argument>hello</argument>
</job>
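A job description like this can also be generated programmatically. The following Python sketch builds the same document with `xml.etree.ElementTree`; `build_job` is a hypothetical helper, and the element order (executable, directory, argument, count) simply follows the order used in the examples in this deck.

```python
import xml.etree.ElementTree as ET


def build_job(executable, arguments=(), directory=None, count=None):
    """Build a GT 4 style <job> element from simple Python values."""
    job = ET.Element("job")
    ET.SubElement(job, "executable").text = executable
    if directory is not None:
        ET.SubElement(job, "directory").text = directory
    for arg in arguments:  # one <argument> element per argument
        ET.SubElement(job, "argument").text = arg
    if count is not None:
        ET.SubElement(job, "count").text = str(count)
    return job


job = build_job("/bin/echo", ["hello"])
xml_text = ET.tostring(job, encoding="unicode")
print(xml_text)
```

Writing `xml_text` to echo.xml would give a file usable with the -f flag.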
6a.45
Batch Submission
-batch
Results in ManagedJob EPR as
the sole standard output (unless in quiet
mode) and then exits.
-o filename Created ManagedJob EPR
written to file (if submission successful)
6a.46
Batch Job Submission
$ globusrun-ws -submit -batch -o job_epr -c /bin/sleep 50
Submitting job...Done.
Job ID: uuid:f9544174-60c5-11d9-97e3-0002a5ad41e5
Termination time: 01/08/2005 16:05 GMT
6a.47
Monitoring Batch Submission
-monitor Attaches to an existing job in
interactive or interactive-streaming output
modes.
-j filename EPR for ManagedJob read from
file.
6a.48
Monitoring Batch Job
globusrun-ws -monitor -j job_epr
job state: Active
Current job state: CleanUp
Current job state: Done
Requesting original job description...Done.
Destroying job...Done
6a.49
Batch Submission
-status Reports the current state of
the job and exits
-kill Requests immediate
cancellation of job and exits.
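The batch workflow (submit, then monitor, query or cancel later) can be sketched as two small Python helpers that build the command lines. The helper names are hypothetical, and the sketch assumes that -status and -kill identify the job with -j in the same way as -monitor.

```python
def batch_submit(epr_file, program, args=()):
    """Command line for a batch submission that saves the job EPR to a file."""
    return ["globusrun-ws", "-submit", "-batch", "-o", epr_file,
            "-c", program, *args]


def job_action(action, epr_file):
    """Command line to monitor, query or cancel a previously submitted job."""
    if action not in ("monitor", "status", "kill"):
        raise ValueError(f"unsupported action: {action}")
    # Assumption: -status and -kill accept -j like -monitor does.
    return ["globusrun-ws", f"-{action}", "-j", epr_file]


submit_argv = batch_submit("job_epr", "/bin/sleep", ["50"])
status_argv = job_action("status", "job_epr")
print(" ".join(submit_argv))
print(" ".join(status_argv))
```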
6a.50
6a.51
Input/Output
RSL file can specify where stdout/err goes.
Example
<job>
<executable>/bin/echo</executable>
<directory>/tmp</directory>
<argument>hello</argument>
<stdout>job.out</stdout>
<stderr>job.err</stderr>
…
</job>
6a.52
Stream Input/Output
-s The standard output and standard error
files of the job are monitored and data
is written to the corresponding output
of globusrun-ws.
Allows streaming of stdout/err to the
terminal during execution.
6a.53
Stream output data files
Can also “stream” output data files. Specify
the destination in the RSL file.
6a.54
Example
<job>
…
<fileStageOut>
<transfer>
<sourceUrl>file:///tmp/job.out</sourceUrl>
<destinationUrl>gsiftp://host.domain:2811/tmp/stage.out</destinationUrl>
</transfer>
</fileStageOut>
…
</job>
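The stage-out section can be added to a job description in code as well. This Python sketch, with the hypothetical helper `add_stage_out`, reproduces the element nesting shown above; the URLs are the placeholder values from the slide, and the rest of the job description is reduced to a single executable for brevity.

```python
import xml.etree.ElementTree as ET


def add_stage_out(job, source_url, destination_url):
    """Append a <fileStageOut> block with one <transfer> to a <job> element."""
    stage_out = ET.SubElement(job, "fileStageOut")
    transfer = ET.SubElement(stage_out, "transfer")
    ET.SubElement(transfer, "sourceUrl").text = source_url
    ET.SubElement(transfer, "destinationUrl").text = destination_url
    return transfer


job = ET.Element("job")
ET.SubElement(job, "executable").text = "/bin/echo"
add_stage_out(
    job,
    "file:///tmp/job.out",
    "gsiftp://host.domain:2811/tmp/stage.out",
)
xml_text = ET.tostring(job, encoding="unicode")
print(xml_text)
```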
6a.55
Reliable File Transfer (RFT)
Example
<job>
…
<transfer>
…
<rftOptions>
<subjectName>/O=NCSU/OU=HPC/OU=unity.ncsu.edu/CN=Barry Wilkinson</subjectName>
<parallelStreams>4</parallelStreams>
</rftOptions>
</transfer>
…
</job>
6a.56