Pegasus - a framework for planning for execution in grids
Karan Vahi
[email protected]
USC Information Sciences Institute
May 5th, 2004
People Involved
USC/ISI Advanced Systems: Ewa Deelman, Carl Kesselman, Gaurang Mehta, Mei-Hui Su, Gurmeet Singh, Karan Vahi.

Outline
Introduction to Planning
DAX
Pegasus
Portal
Demonstration

Planning in Grids
The grid offers various alternatives in terms of data and compute resources.
Planning
– Select the best available resources and data sets, and schedule the jobs onto the grid to get the best possible execution time.
– Plan for the data movements between the sites.

Recipe For Planning
Understand the request
– Figure out what data product the request refers to, and how to generate it from scratch.
Locations of data products
– Final data product
– Intermediate data products which can be used to generate the final data product.
Location of job executables
State of the grid
– Available processors, available physical memory, job queue lengths, etc.
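To make the recipe concrete, here is a minimal sketch of how the planning inputs (where the executable is installed, and the state of the grid) might be combined to pick a site for one job. The `Site` fields, the `choose_site` helper, and the ranking rule are all illustrative assumptions; this is a toy heuristic, not the selection policy Pegasus itself uses.

```python
from dataclasses import dataclass


@dataclass
class Site:
    """Hypothetical snapshot of one grid site (all fields are assumptions)."""
    name: str
    executables: set       # logical transformation names installed at the site
    free_processors: int   # part of the "state of the grid"
    queue_length: int      # part of the "state of the grid"


def choose_site(transformation, sites):
    """Return a site that has the executable, preferring short job queues."""
    candidates = [s for s in sites if transformation in s.executables]
    if not candidates:
        return None  # no site can run this transformation as-is
    # Toy ranking: shortest queue first, then most free processors.
    return min(candidates, key=lambda s: (s.queue_length, -s.free_processors))


sites = [
    Site("pool-x", {"preprocess", "analyze"}, free_processors=4, queue_length=12),
    Site("pool-y", {"preprocess"}, free_processors=16, queue_length=3),
    Site("pool-z", {"analyze"}, free_processors=64, queue_length=0),
]
print(choose_site("preprocess", sites).name)
```

A real planner would also weigh where the input data already resides and the cost of moving it, which this sketch leaves out.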
Constituents of Planning
[Figure: Domain Knowledge, Resource Information, and Location Information feed the Planner, which produces the plan submitted to the grid.]

Terms (1)
Abstract Workflow (DAX)
– Expressed in terms of logical entities
– Specifies all logical files required to generate the desired data product from scratch
– Dependencies between the jobs
– Analogous to a build style DAG
Concrete Workflow
– Expressed in terms of physical entities
– Specifies the location of the data and executables
– Analogous to a make style DAG

DAX
The format for specifying the abstract workflow, which identifies the recipe for creating the final data product at a logical level.
In the case of Montage, the IPAC web service creates the DAX for the user request.
Developed at the University of Chicago.

DAX Example
<?xml version="1.0" encoding="UTF-8"?>
<!-- generated: 2003-09-25T11:51:19-05:00 -->
<!-- generated by: vahi [??]
-->
<adag xmlns="http://www.griphyn.org/chimera/DAX"
      xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
      xsi:schemaLocation="http://www.griphyn.org/chimera/DAX http://www.griphyn.org/chimera/dax-1.6.xsd"
      count="1" index="0" name="black-diamond">
  <!-- part 1: list of all files used (may be empty) -->
  <filename file="f.a" link="input"/>
  <filename file="f.b" link="inout"/>
  <filename file="f.c" link="output"/>
  <!-- part 2: definition of all jobs (at least one) -->
  <job id="ID000001" namespace="montage" name="preprocess" version="1.0" level="2">
    <argument>-a top -T60 -i <filename file="f.a"/> -o <filename file="f.b"/></argument>
    <uses file="f.a" link="input" dontRegister="false" dontTransfer="false"/>
    <uses file="f.b" link="output" dontRegister="true" dontTransfer="true" temporaryHint="true"/>
  </job>
  <job id="ID000002" namespace="montage" name="analyze" version="1.0" level="1">
    <argument>-a bottom -T60 -i <filename file="f.b"/> -o <filename file="f.c"/></argument>
    <uses file="f.b" link="input" dontRegister="false" dontTransfer="false"/>
    <uses file="f.c" link="output" dontRegister="false" dontTransfer="false"/>
  </job>
  <!-- part 3: list of control-flow dependencies (empty for single jobs) -->
  <child ref="ID000002">
    <parent ref="ID000001"/>
  </child>
</adag>

Pegasus
A configurable system to map and execute complex workflows on the grid.
– DAX Driven Configuration
– Metadata Driven Configuration
Can do full ahead planning or deferred planning to map the workflows.

Full Ahead Planning
At the time of submission of the workflow, you decide where you want to schedule the jobs in the workflow.
Allows you to perform certain optimizations by looking ahead for bottleneck jobs and then scheduling around them.
However, for large workflows the decision you make at submission time may no longer be valid or optimal at the point the job is actually run.

Deferred Planning
Delay the decision of mapping the job to the site as late as possible.
Involves partitioning the original DAX into smaller DAXes, each of which refers to a partition on which Pegasus is run.
Construct a mega DAG that runs Pegasus automatically on the partition DAXes as each partition becomes ready to run.

High Level Block Diagram
[Figure: block diagram. Labels shown: IPAC/JPL web service, Abstract Workflow, Request Manager, Workflow Planning, Workflow Reduction, Data Management, Replica and Resource Selector, Replica Location, Available Resources, Globus Replica Location Service, Globus Monitoring and Discovery Service, Application Models, Information and Models, Dynamic information, Concrete Workflow, Submission and Monitoring System, workflow executor (DAGMan), Execution, tasks, Grid, Monitoring information, Data Publication, Raw data detector.]

Replica Discovery
Pegasus needs to know where the input files for the workflow reside.
In the Montage case, it should know where the FITS files required for the mProject jobs reside.
Hence Pegasus needs to discover the files that are required for executing a particular abstract workflow.

RLS
1) Pegasus queries the RLI with the LFN.
2) The RLI returns the list of LRCs that contain the desired mappings.
3) Pegasus queries each LRC in the list to get the PFNs.
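The three-step lookup above can be sketched with in-memory dictionaries standing in for the index (RLI) and the catalogs (LRCs). The dictionary layout, the LRC names, the example URLs, and the `lookup_pfns` helper are hypothetical and only illustrate the two-level query; they are not the Globus RLS client API.

```python
# Hypothetical stand-in for the RLI: LFN -> names of LRCs that hold mappings.
RLI = {"f.a": ["lrc-a", "lrc-b"]}

# Hypothetical stand-ins for the LRCs: LFN -> list of PFNs held by that LRC.
LRCS = {
    "lrc-a": {"f.a": ["gsiftp://site-a.example.org/data/f.a"]},
    "lrc-b": {"f.a": ["gsiftp://site-b.example.org/data/f.a"]},
    "lrc-c": {},
}


def lookup_pfns(lfn, rli, lrcs):
    """Resolve a logical file name to physical file names in two hops."""
    pfns = []
    # Steps 1 and 2: ask the index which catalogs know about this LFN.
    for lrc_name in rli.get(lfn, []):
        # Step 3: ask each listed catalog for its physical file names.
        # The index may be out of date, so tolerate a catalog with no entry.
        pfns.extend(lrcs[lrc_name].get(lfn, []))
    return pfns


print(lookup_pfns("f.a", RLI, LRCS))
```

Only the catalogs named by the index are contacted, so the cost of a lookup depends on how many catalogs hold the file rather than on how many exist.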
Each LRC sends periodic updates to the RLI.
Each LRC (LRC A, LRC B, LRC C) is responsible for one pool.
Figure (1): RLS configuration for Pegasus
Interfacing to RLS done by Karan Vahi, Shishir

Alternate Replica Mechanisms
Replica Catalog
– Pegasus supports the LDAP based Replica Catalog
User defined mechanisms
– Pegasus lets users specify their own replica mechanism instead of RLS or Replica Catalog
– The user just has to implement the corresponding interface
Design and implementation done by Karan Vahi

Transformation Catalog
Pegasus needs to access a catalog to determine the pools where it can run a particular piece of code.
If a site does not have the executable, one should be able to ship the executable to the remote site.
Generic TC API for users to implement their own transformation catalog.
Current implementations
– File based
– Database based

File based Transformation Catalog
Consists of a simple text file.
– Contains mappings of logical transformations to physical transformations.
Format of the tc.data file
#poolname   logical tr   physical tr               env
isi         preprocess   /usr/vds/bin/preprocess   VDS_HOME=/usr/vds/;
All the physical transformations are absolute path names.
The environment string contains all the environment variables required for the transformation to run on the execution pool.

DB based Transformation Catalog
Presently ported to MySQL. Postgres to be tested.
Adds support for transformations compiled for different architecture, OS, OS version and glibc combinations, which would enable us to transfer a transformation to remote sites if the executable does not reside there.
Supports multiple profile namespaces. At present only the env namespace is used.
Supports multiple physical transformations for the same (logical transformation, pool, type) tuple.

Pool Configuration (1)
Pool Config is an XML file which contains information about the various pools on which DAGs may execute.
Some of the information contained in the Pool Config file:
– The various job-managers available on the pool for the different types of Condor universes.
– The GridFTP storage servers associated with each pool.
– The Local Replica Catalogs where data residing in the pool has to be cataloged.
– Profiles, such as environment hints, which are common site wide.
– The working and storage directories to be used on the pool.

Pool Configuration (2)
Two ways to construct the Pool Config file
– Monitoring and Discovery Service
– Local Pool Config file (text based)
Client tool to generate the Pool Config file
– The tool genpoolconfig is used to query the MDS and/or the local pool config file(s) to generate the XML Pool Config file.

Pool Configuration (3)
This file is read by the information provider and published into MDS.
Format
gvds.pool.id : <POOL ID>
gvds.pool.lrc : <LRC URL>
gvds.pool.gridftp : <GSIFTP URL>@<GLOBUS VERSION>
gvds.pool.gridftp : gsiftp://sukhna.isi.edu/nfs/asd2/[email protected]
gvds.pool.universe : <UNIVERSE>@<JOBMANAGER URL>@<GLOBUS VERSION>
gvds.pool.universe : [email protected]/[email protected]
gvds.pool.gridlaunch : <Path to Kickstart executable>
gvds.pool.workdir : <Path to Working Dir>
gvds.pool.profile : <namespace>@<key>@<value>
gvds.pool.profile : env@GLOBUS_LOCATION@/smarty/gt2.2.4
gvds.pool.profile : vds@VDS_HOME@/nfs/asd2/gmehta/vds

DAX Driven Configuration (1)
Pegasus uses the IPAC/JPL web service as an abstract workflow generator.
Pegasus takes in this abstract workflow and creates a concrete workflow by consulting the various grid services described before.

DAX Driven Configuration (2)
[Figure: data flow between the IPAC/JPL Service, Request Manager, Current State Generator, MCS, RLS, MDS, Transformation Catalog, Abstract Dag Reduction, Abstract and Concrete Planner, Concrete Planner, VDL Generator, Submit File Generator, DAGMan Submission & Monitoring, and Condor-G/DAGMan. Numbered flows in the figure:]
(1) Abstract Workflow (DAG)
(2) Abstract Dag
(3) Logical File Names (LFNs)
(4) Physical File Names (PFNs)
(5) Full Abstract Dag
(6) Reduced Abstract DAG
(7) Logical Transformations
(8) Physical Transformations and Execution Environment Information
(9) Concrete Dag
(10) Concrete Dag
(11) DAGMan files
(12) DAGMan files
(13) DAG
(14) Log files
(15) Monitoring
(16) Results

DAG Reduction
Abstract Dag Reduction
– Pegasus queries the RLS with the LFNs referred to in the abstract workflow
– If data products are found to be already materialized, Pegasus reuses them and thus reduces the complexity of the concrete workflow (CW)

Abstract Dag Reduction
[Figure: DAG of jobs a through i.]
Pegasus queries the RLS and finds the data products of jobs d, e, f already materialized, and hence deletes those jobs.
On applying the reduction algorithm, the additional jobs a, b, c are deleted.
Key: original node, pull transfer node, registration node, push transfer node.
Implemented by Karan Vahi

Concrete Planner (1)
[Figure: the same DAG after reduction, with transfer and registration nodes added around jobs g, h, i.]
Pegasus adds transfer nodes for transferring the input files for the root nodes of the decomposed DAG (job g).
Pegasus schedules jobs g and h on pool X and job i on pool Y, hence adding an inter-pool transfer node.
Three nodes transfer the output files of the leaf job (f) to the output pool, since job f has been deleted by the reduction algorithm.
Pegasus adds replica (registration) nodes for each job that materializes data (g, h, i).
Key: original node, pull transfer node, registration node, push transfer node, node deleted by the reduction algorithm, inter-pool transfer node.
Implemented by Karan Vahi

Transient Files
Selective transfer of output files
– Data sets generated by intermediate nodes in the DAG are huge
– However, the user may be interested only in the outputs of selected jobs
– Transfer of all the files could severely overload the jobmanagers on the compute sites
Need for selective transfer of files
– For each file, at the virtual data level, the user can specify whether it is transient or not.
– Pegasus bases its decision on whether or not to transfer the file on this.
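As an illustration of selective transfer, the sketch below reads a cut-down copy of the black-diamond DAX and keeps only the output files that are not flagged as transient. It assumes that `dontTransfer="true"` on a `uses` element is how a transient file is marked, which mirrors the attributes in the DAX example but is not stated on the slides. The `outputs_to_transfer` helper is hypothetical and is not Pegasus code.

```python
import xml.etree.ElementTree as ET

# Namespace used by the DAX example earlier in the deck.
DAX_NS = {"dax": "http://www.griphyn.org/chimera/DAX"}

# Cut-down copy of the black-diamond DAX, keeping only the <uses> entries.
SAMPLE_DAX = """<adag xmlns="http://www.griphyn.org/chimera/DAX" name="black-diamond">
  <job id="ID000001" name="preprocess">
    <uses file="f.a" link="input" dontRegister="false" dontTransfer="false"/>
    <uses file="f.b" link="output" dontRegister="true" dontTransfer="true" temporaryHint="true"/>
  </job>
  <job id="ID000002" name="analyze">
    <uses file="f.b" link="input" dontRegister="false" dontTransfer="false"/>
    <uses file="f.c" link="output" dontRegister="false" dontTransfer="false"/>
  </job>
</adag>"""


def outputs_to_transfer(dax_text):
    """Return logical names of output files that should be transferred."""
    root = ET.fromstring(dax_text)
    selected = []
    for uses in root.findall("dax:job/dax:uses", DAX_NS):
        if uses.get("link") != "output":
            continue  # only outputs are candidates for stage-out
        # Assumption: dontTransfer="true" marks the file as transient.
        if uses.get("dontTransfer", "false") == "true":
            continue
        selected.append(uses.get("file"))
    return selected


print(outputs_to_transfer(SAMPLE_DAX))  # ['f.c']
```

Here the intermediate file f.b stays on the execution pool and only the final product f.c is staged out.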
Implemented by Karan Vahi

Portal Architecture

Portal Demonstration

Demonstration
Run a small black diamond DAG using both full ahead planning and deferred planning on the ISI Condor pool.
Show the various configuration files (tc.data and pool.config) and how to generate them (pool.config).
Generate the Condor submit files.
Submit the Condor DAG to Condor DAGMan.

Software Required!!
Submit Host
– Condor DAGMan (to submit the workflows on the grid)
– Java 1.4 (to run Pegasus)
– Globus 2.4 or higher
– Globus RLS (the registration jobs run on the local host)
– Xerces, ant, cog etc. that come with the VDS distribution
Compute Sites (machines in the pool)
– Globus 2.4 or higher (gridftp server, g-u-c, MDS)
– On one machine per pool, an LRC should be running
– Condor daemon running
– Various jobmanagers correctly configured

TC File
Walk through the editing of the TC file.
A command line client is also in the works that allows you to update, add and modify the entries in your transformation catalog regardless of the underlying implementation.

GenPoolConfig (Demo)
genpoolconfig is the client to generate the pool config file required by Pegasus.
It queries the MDS and/or a local pool config file (text based) and generates an XML file.
I am going to generate the pool config file from the text based configuration.
Usage:
– genpoolconfig -Dvds.giis.host <MDS GIIS hostname> -Dvds.giis.dn <MDS GIIS DN> --poolconfig <comma separated local pool config files> --output <pool config output>

gencdag
The concrete planner takes the DAX produced by Chimera and converts it into a set of Condor DAG and submit files.
Usage: gencdag --dax|--pdax <file> --p <list of execution pools> [--dir <dir for o/p files>] [--o <outputpool>] [-force]
You can specify more than one execution pool. Execution will take place on the pools on which the executable exists. If the executable exists on more than one pool, the pool on which it will run is selected randomly.
The output pool is the pool to which you want all the output products transferred. If it is not specified, the materialized data stays on the execution pool.

Mei's Exploits
Mei has been running the Montage code for the past year, including some huge 6 and 10 degree DAGs (for the M16 cluster).
The 6 degree runs had about 13,000 compute jobs and the 10 degree run had about 40,000 compute jobs!
The final mosaic files can be downloaded from
http://www.isi.edu/~griphyn/out_M16_10.fits
http://www.isi.edu/~griphyn/out_M16_6.fits

Questions?