Transcript Slide 1

Pegasus: Planning for Execution in Grids
Virtual Data Concepts
-- Capture and manage information about relationships among
-- Data (of widely varying representations)
-- Programs (& their execution needs)
-- Computations (& execution environments)
-- Apply this information to, e.g.
-- Discovery: Data and program discovery
-- Workflow: Structured paradigm for organizing, locating, specifying, & requesting data
-- Explanation: provenance
-- Research part of NSF funded GriPhyN project
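The relationships above (data, programs, computations, provenance) can be sketched as a toy in-memory catalog. This is a minimal illustration only; the class and method names here (`VirtualDataCatalog`, `record`, `provenance`) are hypothetical and do not reflect the actual Chimera API.

```python
# Minimal sketch of a virtual-data-style catalog, assuming a toy in-memory
# model. A derivation links a program to the data it consumed and produced,
# which supports the "explanation: provenance" use case above.
from dataclasses import dataclass

@dataclass
class Derivation:
    program: str      # transformation that was run
    inputs: list      # logical file names consumed
    outputs: list     # logical file names produced

class VirtualDataCatalog:
    def __init__(self):
        self.derivations = []

    def record(self, program, inputs, outputs):
        """Capture the relationship among a program and its data."""
        self.derivations.append(Derivation(program, list(inputs), list(outputs)))

    def provenance(self, lfn):
        """Explanation: which derivation produced this logical file?"""
        for d in self.derivations:
            if lfn in d.outputs:
                return d
        return None

vdc = VirtualDataCatalog()
vdc.record("mProject", ["raw.fits"], ["proj.fits"])
vdc.record("mAdd", ["proj.fits"], ["mosaic.fits"])
print(vdc.provenance("mosaic.fits").program)  # mAdd
```

Walking the `inputs` of each derivation recursively would recover the full derivation history of a data product, which is the basis for both discovery and explanation.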
Pegasus: Planning for Execution in Grids
-- Maps from abstract to concrete workflow
-- Algorithmic and AI based techniques
-- Automatically locates physical locations for both components (transformations) and data
-- Uses Globus RLS and the Transformation Catalog
-- Finds appropriate resources to execute
-- via Globus MDS
-- Reuses existing data products where applicable
-- Publishes newly derived data products
-- Chimera virtual data catalog & MCS
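The mapping described above can be sketched as follows. The dictionaries stand in for the Transformation Catalog and Globus RLS, and the site-selection rule is a placeholder for MDS-based resource discovery; all names here are illustrative, not Pegasus APIs.

```python
# Hedged sketch of abstract-to-concrete workflow mapping, assuming toy
# catalogs. Jobs whose outputs already have replicas are reused rather
# than recomputed, mirroring the data-reuse bullet above.
TRANSFORMATION_CATALOG = {  # logical transformation -> {site: executable path}
    "mProject": {"siteA": "/sw/montage/mProject"},
    "mAdd": {"siteA": "/sw/montage/mAdd", "siteB": "/opt/mAdd"},
}
REPLICA_CATALOG = {  # logical file name -> physical locations
    "raw.fits": ["gsiftp://siteA/data/raw.fits"],
    "proj.fits": ["gsiftp://siteB/data/proj.fits"],  # derived in an earlier run
}

def concretize(abstract_jobs):
    """Map each abstract job to a site; reuse existing data products."""
    concrete = []
    for job in abstract_jobs:
        if all(o in REPLICA_CATALOG for o in job["outputs"]):
            continue  # outputs already exist: reuse, do not recompute
        sites = TRANSFORMATION_CATALOG[job["transformation"]]
        site = sorted(sites)[0]  # placeholder for MDS-based resource selection
        concrete.append({"transformation": job["transformation"],
                         "site": site,
                         "executable": sites[site]})
    return concrete

jobs = [
    {"transformation": "mProject", "inputs": ["raw.fits"], "outputs": ["proj.fits"]},
    {"transformation": "mAdd", "inputs": ["proj.fits"], "outputs": ["mosaic.fits"]},
]
print(concretize(jobs))
```

In this sketch the `mProject` job is pruned because its output already has a replica, so only `mAdd` is placed on a resource.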
Small Montage Workflow
LIGO Scientific Collaboration
-- Continuous gravitational waves are expected to be
produced by a variety of celestial objects.
-- Only know about a small fraction of potential sources.
-- Need to perform blind searches, scanning the regions of the sky where we have no a priori information about the presence of a source
-- Wide area, wide frequency searches
-- The search is very compute and data intensive.
-- LSC is using the occasion of SC2003 to initiate a
month-long production run with science data
collected during 8 weeks in the Spring of 2003
-- The search is performed for potential sources of continuous periodic waves near the Galactic Center.
Testbed
Montage
-- Delivers science-grade custom mosaics on demand
-- Produces mosaics from a wide range of data sources (possibly in different spectra)
-- User-specified parameters of projection, coordinates, size, rotation and spatial sampling.
The Sword of Orion (M42, Trapezium, Great Nebula). This mosaic was obtained by running a Montage workflow through Pegasus and executing the concrete workflow on the TeraGrid resources.
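A small Montage-style abstract workflow like the one above can be sketched as a DAG: each input image is reprojected, and the reprojections are combined into a mosaic. The file names are illustrative, and real Montage runs insert background-matching steps between these two stages; only the `mProject` and `mAdd` module names come from Montage itself.

```python
# Toy sketch of a small Montage-style abstract workflow as a DAG.
# One mProject job per input image fans in to a single mAdd job.
def montage_workflow(images):
    jobs, edges = [], []
    projected = []
    for i, img in enumerate(images):
        out = f"proj_{i}.fits"
        jobs.append({"id": f"mProject_{i}", "transformation": "mProject",
                     "inputs": [img], "outputs": [out]})
        projected.append(out)
        edges.append((f"mProject_{i}", "mAdd"))  # dependency: project before add
    jobs.append({"id": "mAdd", "transformation": "mAdd",
                 "inputs": projected, "outputs": ["mosaic.fits"]})
    return jobs, edges

jobs, edges = montage_workflow(["img0.fits", "img1.fits", "img2.fits"])
print(len(jobs), len(edges))  # 4 3
```

An abstract DAG of this shape is what Pegasus takes as input when it maps the workflow onto concrete resources.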
Just-in-Time Planning
PW A
PW B
PW C
Additional resources used: Grid3 iVDGL resources
Thanks to everyone involved in standing up the testbed and contributing the resources!
People Involved:
LIGO: Bruce Allen, Scott Koranda, Brian Moe, Xavier Siemens, University of Wisconsin Milwaukee, USA, Stuart
Anderson, Kent Blackburn, Albert Lazzarini, Dan Kozak, Hari Pulapaka, Peter Shawhan, Caltech, USA,
Steffen Grunewald, Yousuke Itoh, Maria Alessandra Papa, Albert Einstein Institute, Germany, Many
Others involved in the Testbed
Montage: Bruce Berriman, John Good, Anastasia Laity, Caltech/IPAC, Joseph C. Jacob, Daniel S. Katz, JPL
Pegasus: Ewa Deelman, Carl Kesselman, Saurabh Khurana, Gaurang Mehta, Sonal Patil, Gurmeet Singh, Mei-Hui
Su, Karan Vahi, James Blythe, Yolanda Gil, ISI
New Abstract Workflow
A Particular Partitioning
Pegasus(A)
Su(A)
Workflow submitted to DAGMan
DAGMan(Su(A))
Pegasus(B)
Su(B)
DAGMan(Su(B))
Pegasus(X): Pegasus generates the concrete workflow and the submit files Su(X) for partition X
DAGMan(Su(X)): DAGMan executes the concrete workflow for partition X
Pegasus(C)
Su(C)
DAGMan(Su(C))
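The just-in-time loop sketched in the slide above can be written out as follows. `plan` and `execute` are stand-ins for Pegasus(X) and DAGMan(Su(X)); the function names and the dependency representation are illustrative, not part of Pegasus or DAGMan.

```python
# Sketch of just-in-time planning over workflow partitions: each partition
# (A, B, C) is planned only once its predecessors have finished, so the
# plan reflects the grid state at execution time.
def plan(partition):
    """Pegasus(X): turn one abstract partition X into a submit workflow Su(X)."""
    return f"Su({partition})"

def execute(submit_workflow):
    """DAGMan(Su(X)): run the concrete workflow; here we just report it."""
    return f"DAGMan({submit_workflow}) done"

def just_in_time(partitions, deps):
    """Plan and run partitions in dependency order, planning each lazily."""
    done, log = set(), []
    remaining = list(partitions)
    while remaining:
        ready = [p for p in remaining if deps.get(p, set()) <= done]
        if not ready:
            raise ValueError("unsatisfiable dependencies")
        for p in ready:
            log.append(execute(plan(p)))  # plan immediately before execution
            done.add(p)
            remaining.remove(p)
    return log

print(just_in_time(["A", "B", "C"], {"B": {"A"}, "C": {"A"}}))
# ['DAGMan(Su(A)) done', 'DAGMan(Su(B)) done', 'DAGMan(Su(C)) done']
```

Deferring each `plan` call until its partition is ready is the point of the partitioned design: resource availability and replica locations can change while earlier partitions run, and late planning lets each partition use current information.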