Transcript Slide 1
Pegasus: Planning for Execution in Grids

Virtual Data Concepts
-- Capture and manage information about relationships among:
   -- Data (of widely varying representations)
   -- Programs (& their execution needs)
   -- Computations (& execution environments)
-- Apply this information to, e.g.:
   -- Discovery: data and program discovery
   -- Workflow: structured paradigm for organizing, locating, specifying, & requesting data
   -- Explanation: provenance
-- Research is part of the NSF-funded GriPhyN project

Pegasus: Planning for Execution in Grids
-- Maps from abstract to concrete workflow
   -- Algorithmic and AI-based techniques
-- Automatically locates physical locations for both components (transformations) and data
   -- Uses Globus RLS and the Transformation Catalog
-- Finds appropriate resources to execute
   -- via Globus MDS
-- Reuses existing data products where applicable
-- Publishes newly derived data products
   -- Chimera virtual data catalog & MCS

Small Montage Workflow

LIGO Scientific Collaboration
-- Continuous gravitational waves are expected to be produced by a variety of celestial objects.
-- We only know about a small fraction of potential sources.
-- Need to perform blind searches, scanning the regions of the sky where we have no a priori information about the presence of a source
   -- Wide-area, wide-frequency searches
-- The search is very compute- and data-intensive.
-- The LSC is using the occasion of SC2003 to initiate a month-long production run with science data collected during 8 weeks in the Spring of 2003.
-- The search is performed for potential sources of continuous periodic waves near the Galactic Center.

Testbed

Montage
-- Delivers science-grade custom mosaics on demand
-- Produces mosaics from a wide range of data sources (possibly in different spectra)
-- User-specified parameters of projection, coordinates, size, rotation, and spatial sampling

The Sword of Orion (M42, Trapezium, Great Nebula).
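The Pegasus planning step described on the slide above (map abstract to concrete workflow, reuse existing data products, pick resources) can be sketched roughly as follows. This is an illustrative sketch, not the Pegasus API: the job/catalog structures and the round-robin site choice are assumptions standing in for Pegasus's real lookups against the replica catalog (Globus RLS), Transformation Catalog, and resource information service (Globus MDS).

```python
# Illustrative sketch (not the Pegasus API) of abstract -> concrete
# workflow mapping with data reuse.

def plan(abstract_jobs, replica_catalog, sites):
    """abstract_jobs: list of dicts with 'name', 'inputs', 'outputs'
    (logical file names). replica_catalog: set of logical file names
    that already have registered physical replicas. sites: candidate
    execution sites, as a resource catalog would report them."""
    concrete = []
    available = set(replica_catalog)
    for job in abstract_jobs:
        # Reuse existing data products: skip the job entirely if all
        # of its outputs are already registered in the replica catalog.
        if all(out in available for out in job["outputs"]):
            continue
        concrete.append({
            "name": job["name"],
            # Site selection here is simple round-robin; Pegasus
            # instead consults resource information (e.g. Globus MDS).
            "site": sites[len(concrete) % len(sites)],
        })
        # Newly derived products become available to downstream jobs
        # (Pegasus publishes them via the replica catalog / MCS).
        available.update(job["outputs"])
    return concrete
```

For example, if `a.dat` is already registered, the job that derives it is pruned and only the downstream job is planned onto a site.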
This mosaic was obtained by running a Montage workflow through Pegasus and executing the concrete workflow on TeraGrid resources.

Just-in-time Planning
-- The abstract workflow is split into partial workflows PW A, PW B, PW C.

Additional resources used: Grid3 iVDGL resources. Thanks to everyone involved in standing up the testbed and contributing the resources!

People Involved:
-- LIGO: Bruce Allen, Scott Koranda, Brian Moe, Xavier Siemens (University of Wisconsin-Milwaukee, USA); Stuart Anderson, Kent Blackburn, Albert Lazzarini, Dan Kozak, Hari Pulapaka, Peter Shawhan (Caltech, USA); Steffen Grunewald, Yousuke Itoh, Maria Alessandra Papa (Albert Einstein Institute, Germany); and many others involved in the testbed
-- Montage: Bruce Berriman, John Good, Anastasia Laity (Caltech/IPAC); Joseph C. Jacob, Daniel S. Katz (JPL)
-- Pegasus: Ewa Deelman, Carl Kesselman, Saurabh Khurana, Gaurang Mehta, Sonal Patil, Gurmeet Singh, Mei-Hui Su, Karan Vahi, James Blythe, Yolanda Gil (ISI)

New Abstract Workflow -- A Particular Partitioning
-- For each partition X, Pegasus(X) generates the concrete workflow and the submit files Su(X); the workflow is submitted to DAGMan, and DAGMan(Su(X)) executes the concrete workflow for X.
-- The diagram steps through the partitions in turn: Pegasus(A), Su(A), DAGMan(Su(A)); Pegasus(B), Su(B), DAGMan(Su(B)); Pegasus(C), Su(C), DAGMan(Su(C)).
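The just-in-time partitioning flow on the final slide can be sketched as a simple loop: each partition X is planned only when it is reached, then handed to the executor, mirroring Pegasus(X) → Su(X) → DAGMan(Su(X)). This is a hypothetical sketch; the `plan` and `execute` callables stand in for invoking the Pegasus planner and Condor DAGMan, which are external tools rather than Python functions.

```python
# Hypothetical sketch of just-in-time planning over workflow
# partitions, as in the Pegasus(X) -> Su(X) -> DAGMan(Su(X)) diagram.

def run_partitions(partitions, plan, execute):
    """partitions: partition ids in dependency order, e.g. ["A", "B", "C"].
    plan(x): generate the concrete workflow / submit files Su(x)
    (Pegasus's role). execute(su): run Su(x) to completion (DAGMan's
    role). Planning for a partition is deferred until all earlier
    partitions have finished, so it can use up-to-date grid state."""
    results = []
    for x in partitions:
        su = plan(x)                 # Pegasus(X) produces Su(X)
        results.append(execute(su))  # DAGMan executes Su(X)
    return results
```

The point of deferring each `plan(x)` call is that resource availability and newly produced data can be taken into account per partition, instead of committing the whole workflow to resources up front.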