Pegasus: Mapping Scientific Workflows onto the Grid Ewa Deelman Center for Grid Technologies USC Information Sciences Institute.
Pegasus Acknowledgements
- Ewa Deelman, Carl Kesselman, Saurabh Khurana, Gaurang Mehta, Sonal Patil, Gurmeet Singh, MeiHui Su, Karan Vahi (Center for Grid Computing, ISI)
- James Blythe, Yolanda Gil (Intelligent Systems Division, ISI)
- Collaboration with Miron Livny (UW Madison)
- http://pegasus.isi.edu
- Research funded as part of the NSF GriPhyN, NVO and SCEC projects and EU-funded GridLab

Outline
- Workflow Management in Grids
- Pegasus, Planning for Execution in Grids
- Applications Using Pegasus
- In-time planning
- Future Research Directions

Grid Applications
- Increasing in the level of complexity
  - Use of individual application components
  - Reuse of individual intermediate data products (files)
  - Description of data products using metadata attributes
- Execution environment is complex and very dynamic
  - Resources come and go
  - Data is replicated
  - Components can be found at various locations or staged in on demand
- Separation between the application description and the actual execution description

Application Development and Execution Process
Diagram (labels recovered from the slide):
- Application domain -> abstract workflow generation (application component selection); example abstract workflow: FFT filea
- Abstract workflow -> concrete workflow generation (resource selection, data replica selection, transformation instance selection)
- Example concrete workflow: transfer filea from host1://home/filea to host2://home/file1, then run /usr/local/bin/fft /home/file1
- Concrete workflow -> execution environment (host1, host2; data transfer)
- Failure recovery methods: retry, pick different resources, specify a different workflow

Why Automate Workflow Generation?
- Usability: limit the user's necessary Grid knowledge
- Complexity: the user needs to make choices
  - Alternative application components
  - Alternative files
  - Alternative locations
  - The user may reach a dead end
  - Many different interdependencies may occur among components
- Solution cost: evaluate the alternative solution costs
  - Monitoring and Discovery Service
  - Replica Location Service
  - Performance
  - Reliability
  - Resource usage
- Global cost: minimizing cost within a community or a virtual organization requires reasoning about individual users' choices in light of other users' choices

GriPhyN's Executable Workflow Construction
- Build an abstract workflow based on VDL descriptions (Chimera)
- Build an executable workflow based on the abstract workflow (Pegasus)
- Execute the workflow (Condor's DAGMan)
Diagram: VDL -> Chimera -> Abstract Workflow -> Pegasus (consulting RLS, TC and MDS) -> Concrete Workflow -> DAGMan -> Jobs

VDL and Abstract Workflow
- VDL descriptions: a -> d1 -> b; b -> d2 -> c
- User request: data file "c"
- Abstract workflow: a -> d1 -> b -> d2 -> c

Condor's DAGMan
- Developed at UW Madison (Livny)
- Executes a concrete workflow
- Makes sure the dependencies are followed
- Executes the jobs specified in the workflow
  - Execution
  - Data movement
  - Catalog updates
- Provides a "rescue DAG" in case of failure

Pegasus: Planning for Execution in Grids
- Maps from abstract to concrete workflow
  - Algorithmic and AI-based techniques
- Automatically locates physical locations for both components (transformations) and data
- Finds appropriate resources to execute
- Reuses existing data products where applicable
- Publishes newly derived data products
  - Chimera virtual data catalog
  - Provides provenance information

Information Components Used by Pegasus
- Globus Monitoring and Discovery Service (MDS)
  - Locates available resources
  - Finds resource properties
    - Dynamic: load, queue length
    - Static: location of GridFTP server, RLS, etc.
- Globus Replica Location Service
  - Locates data that may be replicated
  - Registers new data products
- Transformation Catalog
  - Locates installed executables

Example Workflow Reduction
- Original abstract workflow: a -> d1 -> b -> d2 -> c
- If "b" already exists (as determined by a query to the RLS), the workflow can be reduced to: b -> d2 -> c

Mapping from abstract to concrete
- Reduced abstract workflow: b -> d2 -> c
- Query RLS, MDS, and TC; schedule computation and data movement
- Concrete workflow: Move b from A to B -> Execute d2 at B -> Move c from B to U -> Register c in the RLS

Montage (NASA and NVO)
- Deliver science-grade custom mosaics on demand
- Produce mosaics from a wide range of data sources (possibly in different spectra)
- User-specified parameters of projection, coordinates, size, rotation and spatial sampling
Figure: Mosaic created by Pegasus-based Montage from a run of the M101 galaxy images on the Teragrid.

Small Montage Workflow
Figure: workflow graph, ~1200 nodes

Montage Acknowledgments
- Bruce Berriman, John Good, Anastasia Laity, Caltech/IPAC
- Joseph C. Jacob, Daniel S. Katz, JPL
- http://montage.ipac.caltech.edu/
- Testbed for Montage: Condor pools at USC/ISI, UW Madison, and Teragrid resources at NCSA, PSC, and SDSC.
- Montage is funded by the National Aeronautics and Space Administration's Earth Science Technology Office, Computational Technologies Project, under Cooperative Agreement Number NCC5-626 between NASA and the California Institute of Technology.
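
Returning to the "Example Workflow Reduction" slide: the idea of dropping jobs whose outputs already exist can be illustrated with a minimal sketch. This is not Pegasus code. The `Job` tuple, the `reduce_workflow` function, and the in-memory `existing` set (standing in for an RLS query) are all hypothetical names introduced for illustration only.

```python
from collections import namedtuple

# Hypothetical job record: logical input and output file names.
Job = namedtuple("Job", ["name", "inputs", "outputs"])


def reduce_workflow(jobs, requested, existing):
    """Return the names of jobs that still have to run to produce `requested`.

    `existing` is an assumption: a plain set standing in for the logical
    files that a replica catalog query would report as already available.
    """
    producer = {out: job for job in jobs for out in job.outputs}
    needed = []      # job names in a valid execution order
    visited = set()  # logical files already examined

    def visit(lfn):
        if lfn in existing or lfn in visited:
            return  # a replica exists, or this file was already handled
        visited.add(lfn)
        job = producer.get(lfn)
        if job is None:
            raise ValueError(f"no replica and no producer for {lfn!r}")
        for inp in job.inputs:
            visit(inp)  # make sure the inputs can be obtained first
        if job.name not in needed:
            needed.append(job.name)

    for lfn in requested:
        visit(lfn)
    return needed


# The slide's workflow: a -> d1 -> b -> d2 -> c, with the user requesting "c".
workflow = [Job("d1", ["a"], ["b"]), Job("d2", ["b"], ["c"])]
full = reduce_workflow(workflow, ["c"], existing={"a"})
reduced = reduce_workflow(workflow, ["c"], existing={"a", "b"})
print(full)     # ['d1', 'd2']
print(reduced)  # ['d2']
```

When "b" is already available, only d2 remains, which matches the reduced workflow b -> d2 -> c on the slide. A real planner would also have to choose among replicas and execution sites, which this sketch leaves out.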

Applications Using Chimera, Pegasus and DAGMan
- GriPhyN applications:
  - High-energy physics: Atlas, CMS (many)
  - Astronomy: SDSS (Fermi Lab, ANL)
  - Gravitational-wave physics: LIGO (Caltech, AEI)
- Astronomy: Galaxy Morphology (NCSA, JHU, Fermi, many others, NVO-funded)
- Biology: BLAST (ANL, PDQ-funded)
- Neuroscience: Tomography for Telescience (SDSC, NIH-funded)

Current System
Diagram (current Pegasus): Original Abstract Workflow -> Pegasus(Abstract Workflow) -> Concrete Workflow -> DAGMan(CW) -> Workflow Execution

Workflow Refinement and execution
Diagram (labels recovered from the slide):
- User's request
- Workflow refinement across levels of abstraction: application-level knowledge, logical tasks, tasks bound to resources and sent for execution
- Inputs: relevant components, policy info
- Workflow repair, task matchmaker
- Full abstract workflow, partial execution
- Time axis: executed, not yet executed

Incremental Refinement
- Partition the abstract workflow into partial workflows
Diagram: a particular partitioning into PW A, PW B and PW C, forming a new abstract workflow

Meta-DAGMan
Diagram: Pegasus(A) -> Su(A) -> DAGMan(Su(A)); Pegasus(B) -> Su(B) -> DAGMan(Su(B)); Pegasus(C) -> Su(C) -> DAGMan(Su(C))
- Pegasus(X): Pegasus generates the concrete workflow and the submit files for X = Su(X)
- DAGMan(Su(X)): DAGMan executes the concrete workflow for X

Conclusions
- Pegasus maps complex workflows onto the Grid
- Uses Grid information services to find resources, data and executables
- Reduces the workflow based on existing intermediate products
- Used in many applications
- Part of GriPhyN's Virtual Data Toolkit

Future Directions
- Investigate various scheduling techniques
- Investigate fault tolerance issues
- Enable flexible interactions between workflow refiners (GriPhyN-wide scope: Pegasus, DAGMan)
- http://pegasus.isi.edu
- GGF10 workshop on workflow management
- GGF Workflow management research group
- [email protected]

Summary: The Grid Now, The Future Grid
- Syntax-based matchmaking of resources to job requirements
- Scheduling of jobs based on Grid-able users that specify job execution sequences and computing requirements
- Condor matchmaker
- Attribute based discovery and selection
- More agility and coordination
- Wide range of users can specify high level requirements in a mixed-initiative mode
- Semantic matchmaking
- Aggregate resource reasoning
- Task-level reasoning to plan and schedule jobs and resources
- Scripting languages
- Workflow languages, task graphs
- Explicit mappings from task to jobs, simple job brokers
- Explicit service
- Knowledge-based reasoning about resources enables mapping of high-level requirements to details required for execution
- End-to-end resource negotiation and adaptive