Transcript Slide 1

Enabling Grids for E-sciencE
gLite Information System and
Workload Management System
Diego Scardaci
INFN Catania
International Summer School on Grid Computing
Ischia, 9-21 July, 2006
www.eu-egee.org
EGEE-II INFSO-RI-031688
International Summer School on Grid Computing 2006
Outline
• Information System Architecture
− Berkeley DB Information Index (BDII)
− The Relational Grid Monitoring Architecture (R-GMA)
• Workload Management System
− WMS Architecture
− Job Description Language Overview
− WMProxy Overview
− Special Jobs: DAG, Collections, Parametric and MPI
Information System
Information System
• What is it?
– A system that collects information on the state of resources
• Why?
– To discover the resources of the grid and their nature
– To give whoever is in charge of managing the workload the data
needed to do it more efficiently
– To check the health status of resources
• How?
– Monitoring the state of resources locally and publishing fresh data
on the information system
– Adopting a data model that MUST be well known to all
components that want to access monitored information
– Using different approaches that we are going to investigate in
the next slides
Adopted Information Systems
• The BDII (Berkeley DB Information Index)
– It has been adopted in LCG middleware as the Information System
provider
– It is an evolution of the Globus Monitoring and Discovery Service (MDS)
– It is based on Lightweight Directory Access Protocol (LDAP)
servers
• The Relational Grid Monitoring Architecture (R-GMA)
– It is a relational implementation of the Grid Monitoring Architecture (GMA)
standardized by the Global Grid Forum (GGF, now OGF)
– It is strongly Web Services oriented
– It uses standard SQL query syntax
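Since the BDII is served over LDAP, any LDAP client can browse it. The sketch below only assembles an `ldapsearch` command line and does not contact any server. The host name is hypothetical, and the port (2170), base DN (`mds-vo-name=local,o=grid`) and GLUE attribute names are assumptions drawn from common gLite deployments rather than values stated in these slides.

```python
import shlex
from typing import List, Sequence


def bdii_search_command(host: str,
                        attributes: Sequence[str],
                        ldap_filter: str = "(objectClass=GlueCE)",
                        port: int = 2170,                      # assumed BDII port
                        base_dn: str = "mds-vo-name=local,o=grid"  # assumed base DN
                        ) -> List[str]:
    """Build (but do not run) an anonymous ldapsearch query against a BDII."""
    return [
        "ldapsearch", "-x", "-LLL",          # simple bind, plain LDIF output
        "-H", f"ldap://{host}:{port}",
        "-b", base_dn,
        ldap_filter,
        *attributes,                         # attributes to return
    ]


# Hypothetical host; attribute names assumed from the GLUE schema.
cmd = bdii_search_command("bdii.example.org",
                          ["GlueCEUniqueID", "GlueCEStateFreeCPUs"])
print(shlex.join(cmd))
```

Keeping the command as a list of arguments avoids shell quoting problems if it is later passed to `subprocess.run`.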
GRISs, local BDII and BDII
Abbreviations:
• BDII: Berkeley DataBase Information Index
• GIIS: Grid Index Information Server
• GRIS: Grid Resource Information Server
Each site can run a BDII. It collects the information given by the local BDIIs
At each site, a *local* BDII collects the information given by the GRISes
Local GRISes run on CEs and SEs at each site and report dynamic and static information
The IS in gLite
[Figure: Information System hierarchy. Labels: BDII-A, BDII-B, BDII-C; a Site BDII (on a CE) at each of Site 1, Site 2 and Site 3; a local GRIS on every CE, on every SE and on the RB.]
R-GMA
• The Relational Grid Monitoring Architecture (R-GMA)
– It is the relational implementation of GMA defined by the GGF
– Adopts a database model with tables and relations between tables
– Implements a virtual database
– The user queries R-GMA as if querying a classical database (SQL string)
– Implements different types of queries
• The information
– The Producer stores its location (URL) in the Registry.
– The Consumer looks up producer URLs in the Registry.
– The Consumer contacts the Producer to get all the data or to
listen for new data.
[Figure: PRODUCER, REGISTRY and CONSUMER, connected by arrows labelled "Store location", "Lookup location" and "Transfer Data"]
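The Producer / Registry / Consumer interaction can be illustrated with a small in-memory sketch. It is not the R-GMA API: the class names, the `ENDPOINTS` dictionary standing in for the network, the URLs and the `cpuLoad` table are all hypothetical, and a Python predicate replaces the SQL string a real consumer would send.

```python
from typing import Callable, Dict, List, Tuple

Row = Tuple[object, ...]

# Stand-in for the network: maps a producer URL to the producer object.
ENDPOINTS: Dict[str, "Producer"] = {}


class Registry:
    """Knows only WHERE producers are; it never holds the data itself."""

    def __init__(self) -> None:
        self._urls_by_table: Dict[str, List[str]] = {}

    def register(self, table: str, url: str) -> None:
        self._urls_by_table.setdefault(table, []).append(url)

    def lookup(self, table: str) -> List[str]:
        return list(self._urls_by_table.get(table, []))


class Producer:
    def __init__(self, url: str, table: str, registry: Registry) -> None:
        self.url = url
        self.table = table
        self.rows: List[Row] = []
        ENDPOINTS[url] = self
        registry.register(table, url)          # "Store location"

    def insert(self, row: Row) -> None:
        self.rows.append(row)

    def select(self, where: Callable[[Row], bool]) -> List[Row]:
        return [row for row in self.rows if where(row)]


class Consumer:
    def __init__(self, registry: Registry) -> None:
        self.registry = registry

    def query(self, table: str,
              where: Callable[[Row], bool] = lambda row: True) -> List[Row]:
        rows: List[Row] = []
        for url in self.registry.lookup(table):        # "Lookup location"
            rows.extend(ENDPOINTS[url].select(where))  # "Transfer Data"
        return rows


registry = Registry()
site1 = Producer("https://site1.example.org/producer", "cpuLoad", registry)
site2 = Producer("https://site2.example.org/producer", "cpuLoad", registry)
site1.insert(("ce01.site1", 0.42))
site2.insert(("ce07.site2", 0.93))

# The consumer sees one "virtual" table spread over two producers.
busy = Consumer(registry).query("cpuLoad", where=lambda row: row[1] > 0.5)
print(busy)
```

The point of the design is that the Registry stays small: data flows directly from producers to consumers, and only locations are centralised.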
Workload Management System
Outline
• Overview of WMS Architecture
• Job Description Language Overview
• WMProxy Overview
• Special Jobs
– DAG jobs
– Job collections
– Parametric jobs
– MPI jobs
WMS Objectives
• The Workload Management System (WMS) comprises a
set of Grid middleware components responsible for
distribution and management of tasks across Grid
resources.
• The purpose of the Workload Manager (WM) is to accept and
satisfy requests for job management coming from its clients
– Submitting a request means passing responsibility for
the job to the WM
– The WM will then pass the job to an appropriate CE for execution,
taking into account the requirements and preferences expressed in
the job description file
• The decision of which resource should be used is the
outcome of a matchmaking process.
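As a rough illustration of matchmaking, the sketch below filters candidate CEs with a requirements predicate and then picks the one with the highest rank. It is a simplified model, not the WMS implementation: the CE records, their field names (`id`, `free_cpus`, `tags`) and the rank function are hypothetical.

```python
from typing import Any, Callable, Dict, List, Optional

Resource = Dict[str, Any]


def match(resources: List[Resource],
          requirements: Callable[[Resource], bool],
          rank: Callable[[Resource], float]) -> Optional[Resource]:
    """Return the best-ranked resource satisfying the requirements, or None."""
    candidates = [r for r in resources if requirements(r)]
    if not candidates:
        return None            # nothing suitable right now
    return max(candidates, key=rank)


# Hypothetical snapshot of resource information.
ces = [
    {"id": "ce-a", "free_cpus": 4,  "tags": ["GLITE3_0_0"]},
    {"id": "ce-b", "free_cpus": 12, "tags": []},
    {"id": "ce-c", "free_cpus": 9,  "tags": ["GLITE3_0_0"]},
]

best = match(
    ces,
    requirements=lambda ce: "GLITE3_0_0" in ce["tags"],   # hard constraint
    rank=lambda ce: ce["free_cpus"],                      # preference
)
print(best["id"] if best else "no match")
```

Requirements act as a hard filter and rank as a preference among the survivors, which is why `ce-b` is never chosen here despite having the most free CPUs.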
WMS Architecture
[Figure: WMS architecture diagram. Callouts:]
• Job management requests (submission, cancellation), expressed via a Job Description Language (JDL)
• Keeps submission requests; requests are kept for a while if no resources are immediately available
• Repository of resource information available to the matchmaker; updated via notifications and/or active polling on resources
• Finds an appropriate CE for each submission request, taking into account job requests and preferences, Grid status, and utilization policies on resources
• Performs the actual job submission and monitoring
Job Description Language
Job Description Language
• In gLite, the Job Description Language (JDL) is used to describe
jobs for execution on the Grid.
• The JDL adopted within the gLite middleware is based
upon Condor’s CLASSified Advertisement language
(ClassAd).
• A ClassAd is a record-like structure composed of a finite number of
attributes separated by semicolons (;)
• A ClassAd is highly flexible and can be used to represent arbitrary
services
The JDL is used in gLite to specify the job’s characteristics
and constraints, which are used during the matchmaking
process to select the best resources that satisfy the job’s
requirements.
Job Description Language (cont.)
• The JDL syntax consists of statements like:
Attribute = value;
• Comments must be preceded by a sharp character
( # ) or follow the C++ syntax ( // or /* ... */ )
WARNING: The JDL is sensitive to blank
characters and tabs. No blank characters
or tabs should follow the
semicolon at the end of a line.
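The warning about trailing blanks is easy to check mechanically before submission. The helper below is a hypothetical lint function, not part of any gLite tool: it reports the 1-based numbers of lines where spaces or tabs follow the closing semicolon.

```python
import re
from typing import List

# A semicolon followed by one or more spaces/tabs at the very end of a line.
_TRAILING_BLANKS = re.compile(r";[ \t]+$")


def lines_with_trailing_blanks(jdl_text: str) -> List[int]:
    """Return 1-based line numbers where blanks follow the final semicolon."""
    return [
        number
        for number, line in enumerate(jdl_text.splitlines(), start=1)
        if _TRAILING_BLANKS.search(line)
    ]


sample = (
    'Executable = "startGen4.sh";\n'
    'StdOutput = "sample.out"; \n'      # trailing space
    '# a comment\n'
    'StdError = "sample.err";\t\n'      # trailing tab
)
print(lines_with_trailing_blanks(sample))  # [2, 4]
```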
JDL: an example
Type = "Job";
JobType = "Normal";
Executable = "startGen4.sh";
Environment = {"CLASSPATH=./gfal.jar:./gint.jar","LD_LIBRARY_PATH=.:$LD_LIBRARY_PATH","LCG_GFAL_VO=gilda","LCG_RFIO_TYPE=dpm"};
Arguments = " 0 0 10 4 10000 aliserv6.ct.infn.it lfn:/grid/gilda/valeria/2000pillar.dat /gilda/ischia06/vardizzo";
StdOutput = "sample.out";
StdError = "sample.err";
InputSandbox =
{"startGen4.sh","gint.jar","gfal.jar","libGFalFile.so"};
OutputSandbox = {"sample.err","sample.out"};
Requirements = Member("GLITE3_0_0",other.GlueHostApplicationSoftwareRunTimeEnvironment);
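JDL files like the one above are often generated by scripts. The sketch below renders a Python dictionary as `Attribute = value;` statements; the `Expr` wrapper and the function names are hypothetical helpers, and strings containing double quotes are rejected rather than escaped, because the escaping rules are not covered in these slides.

```python
from typing import Any, Dict


class Expr(str):
    """Marks a raw JDL expression that must be emitted without quotes."""


def jdl_value(value: Any) -> str:
    if isinstance(value, Expr):               # must be tested before plain str
        return str(value)
    if isinstance(value, bool):               # must be tested before int
        return "true" if value else "false"
    if isinstance(value, (int, float)):
        return str(value)
    if isinstance(value, (list, tuple)):
        return "{" + ",".join(jdl_value(v) for v in value) + "}"
    if isinstance(value, str):
        if '"' in value:
            raise ValueError("double quotes inside strings are not handled here")
        return f'"{value}"'
    raise TypeError(f"unsupported value type: {type(value).__name__}")


def render_jdl(attributes: Dict[str, Any]) -> str:
    """One statement per line, with nothing after the closing semicolon."""
    return "\n".join(f"{name} = {jdl_value(value)};"
                     for name, value in attributes.items())


job = {
    "Type": "Job",
    "Executable": "startGen4.sh",
    "OutputSandbox": ["sample.err", "sample.out"],
    "Requirements": Expr(
        'Member("GLITE3_0_0",'
        'other.GlueHostApplicationSoftwareRunTimeEnvironment)'
    ),
}
print(render_jdl(job))
```

Generating the text this way also guarantees that no blank characters follow the semicolons.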
Workload Manager Proxy
WMProxy
• WMProxy (Workload Manager Proxy)
– It is a new service providing access to the gLite Workload
Management System (WMS) functionality through a simple Web
Services based interface.
– It has been designed to handle a large number of requests for job
submission
  • gLite 1.5 => ~180 secs for 500 jobs
  • the short-term goal is ~60 secs for 1000 jobs
– It provides additional features such as bulk submission and
support for shared and compressed sandboxes for compound
jobs.
– It is the natural replacement of the Network Server (NS) in the move to a
service-oriented architecture (SOA).
New request types
• Support for new types strongly relies on newly
developed JDL converters and on the DAG submission
support
– all JDL conversions are performed on the server
– a single submission for several jobs
• All new request types can be monitored and controlled
through a single handle (the request id)
– each sub-job can, however, be followed up and controlled
independently through its own id
• “Smarter” WMS client commands/API
– allow submission of DAGs, collections and parametric jobs
exploiting the concept of “shared sandbox”
– allow automatic generation and submission of collections and
DAGs from sets of JDL files located in user specified directories
on the UI
Special Jobs
Outline
• DAG
• Job Collection
• Parametric jobs
• MPI jobs on gLite
DAG job
• A DAG job is a set of jobs where input, output, or
execution of one or more jobs can depend on other
jobs
• Dependencies are represented through Directed
Acyclic Graphs, where the nodes are jobs, and the
edges identify the dependencies
[Figure: example DAG with nodes labelled nodeA, nodeB, nodeC, NodeF and nodeD]
JDL structure
Attribute: Nodes
Attribute: Dependencies
DAG jdl
[
  type = "dag";
  max_nodes_running = 4;
  nodes = [
    nodeA = [
      file ="nodes/nodeA.jdl" ;
    ];
    nodeB = [
      file ="nodes/nodeB.jdl" ;
    ];
    nodeC = [
      file ="nodes/nodeC.jdl" ;
    ];
    nodeD = [
      file ="nodes/nodeD.jdl";
    ];
    dependencies = {
      {nodeA, nodeB},
      {nodeA, nodeC},
      { {nodeB,nodeC}, nodeD }
    }
  ];
]
Note: the node description could also be done inline, inside each node, instead of using separate files.
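To see what execution order the dependencies attribute above implies, the sketch below turns the pairs into a graph and groups nodes into "waves" that can run in parallel. It assumes that in each pair the first element (a node or a set of nodes) must finish before the second starts; the function names are hypothetical, and the standard-library `graphlib` module (Python 3.9+) does the ordering.

```python
from graphlib import TopologicalSorter
from typing import Dict, Iterable, List, Set, Tuple, Union

NodeOrGroup = Union[str, Iterable[str]]


def as_list(item: NodeOrGroup) -> List[str]:
    """Accept either a single node name or a group of node names."""
    return [item] if isinstance(item, str) else list(item)


def build_graph(nodes: Iterable[str],
                dependencies: Iterable[Tuple[NodeOrGroup, NodeOrGroup]]
                ) -> Dict[str, Set[str]]:
    """Map every node to the set of nodes it has to wait for."""
    graph: Dict[str, Set[str]] = {name: set() for name in nodes}
    for parents, children in dependencies:
        for child in as_list(children):
            graph[child].update(as_list(parents))
    return graph


def execution_waves(graph: Dict[str, Set[str]]) -> List[List[str]]:
    """Group nodes into successive waves; raises CycleError on a cycle."""
    sorter = TopologicalSorter(graph)
    sorter.prepare()
    waves: List[List[str]] = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())
        waves.append(ready)
        sorter.done(*ready)
    return waves


# Same dependencies as in the DAG JDL above.
deps = [
    ("nodeA", "nodeB"),
    ("nodeA", "nodeC"),
    (("nodeB", "nodeC"), "nodeD"),
]
graph = build_graph(["nodeA", "nodeB", "nodeC", "nodeD"], deps)
print(execution_waves(graph))
```

With these dependencies nodeB and nodeC land in the same wave, so they could run concurrently once nodeA has finished.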
Job Collection
• A job collection is a set of independent jobs that a user
wants to submit and monitor via a single request
• Jobs of a collection are submitted as DAG nodes
without dependencies
• The JDL is a list of ClassAds, which describe the sub-jobs
[
Type = "collection";
VirtualOrganisation = "gilda";
nodes = {
[ <job descr 1 >],
[ <job descr 2 >],
…
};
]
‘Scattered’ Input Sandboxes
• Input Sandbox can contain
– file paths on the UI machine (i.e. the usual way)
– URIs pointing to files on a remote gridFTP/HTTPS server
InputSandbox = {
"gsiftp://neo.datamat.it:2811/var/prg/sim.exe",
"https://ghemon.cnaf.infn.it:8443/data/idat_1",
"file:///home/pacio/myconf" };
• A base URI to be applied to all sandbox files can also be specified
InputSandboxBaseURI = "gsiftp://matrix.datamat.it:2811/var";
• Only local files (file://) are uploaded to the WMS node
• Files pointed to by URIs are downloaded directly to the WN by the
JobWrapper just before the job is started
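The split between files uploaded to the WMS node and files fetched on the WN can be sketched as a classification by URI scheme. The function name is hypothetical, and treating entries with no scheme as plain local paths is an assumption based on "the usual way" mentioned above.

```python
from typing import Iterable, List, Tuple
from urllib.parse import urlparse

REMOTE_SCHEMES = {"gsiftp", "https"}   # fetched on the WN by the JobWrapper
LOCAL_SCHEMES = {"", "file"}           # uploaded from the UI to the WMS node


def split_input_sandbox(entries: Iterable[str]) -> Tuple[List[str], List[str]]:
    """Return (uploaded_to_wms, downloaded_on_wn) for an InputSandbox list."""
    uploaded: List[str] = []
    downloaded: List[str] = []
    for entry in entries:
        scheme = urlparse(entry).scheme
        if scheme in REMOTE_SCHEMES:
            downloaded.append(entry)
        elif scheme in LOCAL_SCHEMES:
            uploaded.append(entry)
        else:
            raise ValueError(f"unexpected URI scheme in {entry!r}")
    return uploaded, downloaded


sandbox = [
    "gsiftp://neo.datamat.it:2811/var/prg/sim.exe",
    "https://ghemon.cnaf.infn.it:8443/data/idat_1",
    "file:///home/pacio/myconf",
]
uploaded, downloaded = split_input_sandbox(sandbox)
print("uploaded to WMS:", uploaded)
print("fetched on WN:  ", downloaded)
```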
‘Scattered’ Output Sandboxes
• JDL has been enriched with new attributes for specifying the
destinations for the files listed in the OutputSandbox attribute list
OutputSandbox = { "jobOutput",
"run1/event1",
"jobError"
};
OutputSandboxDestURI = {
"gsiftp://matrix.datamat.it/var/jobOutput",
"https://grid003.ct.infn.it:8443/home/cms/event1",
"gsiftp://matrix.datamat.it/var/jobError" };
• A base URI to be applied to all sandbox files can also be specified
OutputSandboxBaseDestURI = "gsiftp://neo.datamat.it/home/run1/";
• When the job has completed execution, the JobWrapper copies the
files to the specified destinations without passing through the
WMS node
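A quick consistency check between the two lists can catch mistakes before submission. The sketch assumes that `OutputSandboxDestURI` entries pair positionally with `OutputSandbox` entries, as the example above suggests; the function name is hypothetical.

```python
from typing import Dict, Sequence


def output_destinations(output_sandbox: Sequence[str],
                        dest_uris: Sequence[str]) -> Dict[str, str]:
    """Pair each output file with its destination URI (assumed positional)."""
    if len(output_sandbox) != len(dest_uris):
        raise ValueError(
            f"{len(output_sandbox)} output files but {len(dest_uris)} destinations"
        )
    return dict(zip(output_sandbox, dest_uris))


mapping = output_destinations(
    ["jobOutput", "run1/event1", "jobError"],
    [
        "gsiftp://matrix.datamat.it/var/jobOutput",
        "https://grid003.ct.infn.it:8443/home/cms/event1",
        "gsiftp://matrix.datamat.it/var/jobError",
    ],
)
for name, uri in mapping.items():
    print(f"{name} -> {uri}")
```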
Job collection example
[
  type = "collection";
  // All nodes will share this Input Sandbox
  InputSandbox = {"date.sh"};
  RetryCount = 0;
  nodes = {
    [
      file ="jobs/job1.jdl" ;
    ],
    [
      Executable = "/bin/sh";
      Arguments = "date.sh";
      StdOutput = "date.out";
      StdError = "date.err";
      OutputSandbox ={"date.out", "date.err"};
    ],
    [
      file ="jobs/job3.jdl" ;
    ]
  };
]
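A collection that only references existing JDL files can be generated from a list of paths. The sketch below is a hypothetical helper that mimics the layout of the example above; it is not the code used by the WMS client commands.

```python
from typing import Sequence


def collection_jdl(jdl_files: Sequence[str],
                   shared_input_sandbox: Sequence[str] = ()) -> str:
    """Build a collection JDL whose nodes point to separate JDL files."""
    lines = ["[", '  Type = "collection";']
    if shared_input_sandbox:
        files = ",".join(f'"{name}"' for name in shared_input_sandbox)
        lines.append(f"  InputSandbox = {{{files}}};")   # shared by all nodes
    lines.append("  nodes = {")
    nodes = [f'    [ file = "{path}"; ]' for path in jdl_files]
    lines.append(",\n".join(nodes))
    lines.append("  };")
    lines.append("]")
    return "\n".join(lines)


text = collection_jdl(["jobs/job1.jdl", "jobs/job3.jdl"],
                      shared_input_sandbox=["date.sh"])
print(text)
```

In practice the list of paths could come from something like `sorted(pathlib.Path("jobs").glob("*.jdl"))`, converted to strings.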
Parametric Job
• A parametric job is a job where one or more of its
attributes are parameterized
• Values of attributes vary according to a parameter
[
JobType = "Parametric";
Executable = "/bin/sh";
Arguments = "md5.sh input_PARAM_.txt";
InputSandbox = {"md5.sh", "input_PARAM_.txt"};
StdOutput = "out_PARAM_.txt";
StdError = "err_PARAM_.txt";
Parameters = 4;
ParameterStart = 1;
ParameterStep = 1;
OutputSandbox = {"out_PARAM_.txt",
"err_PARAM_.txt"};
]
• Job monitoring / management is always done through a
unique jobID, as if it were a single job (see submission
of collections)
Parametric job / 2
• Parameters can also be a list of strings
• InputSandbox (if present) has to be coherent with
parameters
[ui-test] /home/giorgio/param > cat param2.jdl
[
JobType = "Parametric";
Executable = "/bin/cat";
Arguments = "input_PARAM_.txt";
InputSandbox = "input_PARAM_.txt";
StdOutput = "myoutput_PARAM_.txt";
StdError = "myerror_PARAM_.txt";
Parameters = {earth,moon,mars};
OutputSandbox = {"myoutput_PARAM_.txt"};
]
]
[ui-test] /home/giorgio/param > ls
inputEARTH.txt inputMARS.txt inputMOON.txt param2.jdl
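The effect of `_PARAM_` can be pictured as one sub-job description per parameter value. The sketch below assumes plain textual substitution of `_PARAM_` in every string attribute and covers only the list-of-values form; the function name is hypothetical and this is not how the WMS performs the expansion internally.

```python
from typing import Any, Dict, Iterable, List

PLACEHOLDER = "_PARAM_"


def substitute(item: Any, value: Any) -> Any:
    """Replace the placeholder in strings, recursing into lists."""
    if isinstance(item, str):
        return item.replace(PLACEHOLDER, str(value))
    if isinstance(item, list):
        return [substitute(element, value) for element in item]
    return item


def expand_parametric(attributes: Dict[str, Any],
                      values: Iterable[Any]) -> List[Dict[str, Any]]:
    """Return one concrete attribute set per parameter value."""
    return [
        {name: substitute(attr, value) for name, attr in attributes.items()}
        for value in values
    ]


template = {
    "Executable": "/bin/cat",
    "Arguments": "input_PARAM_.txt",
    "InputSandbox": ["input_PARAM_.txt"],
    "StdOutput": "myoutput_PARAM_.txt",
    "StdError": "myerror_PARAM_.txt",
    "OutputSandbox": ["myoutput_PARAM_.txt"],
}
jobs = expand_parametric(template, ["earth", "moon", "mars"])
for job in jobs:
    print(job["Arguments"], "->", job["StdOutput"])
```

Listing the expanded `InputSandbox` entries this way is a simple check that every file the sub-jobs expect actually exists on the UI, with exactly the same spelling and case.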
MPI Overview
• Execution of parallel jobs is essential for many modern
computing applications.
• The most widely used library for parallel job support is MPI
(Message Passing Interface)
• At present, parallel jobs can run inside
a single Computing Element (CE) only;
– several projects are studying the
possibility of executing parallel jobs on Worker Nodes (WNs)
belonging to different CEs.
References
• gLite 3.0 User Guide
– https://edms.cern.ch/file/722398/1.1/gLite-3-UserGuide.pdf
• R-GMA overview page
– http://www.r-gma.org/
• GLUE Schema
– http://infnforge.cnaf.infn.it/glueinfomodel/
• JDL attributes specification for WM proxy
– https://edms.cern.ch/document/590869/1
• WMProxy quickstart
– http://egee-jra1-wm.mi.infn.it/egee-jra1wm/wmproxy_client_quickstart.shtml
• WMS user guides
– https://edms.cern.ch/document/572489/1