Transcript: Slide 1
Workflow Task Clustering for Best Effort Systems with Pegasus

Gurmeet Singh, Mei-Hui Su, Karan Vahi, Ewa Deelman, Gaurang Mehta
Information Sciences Institute, University of Southern California, Marina del Rey, CA 90292 (pegasus.isi.edu)

Bruce Berriman, John Good
Infrared Processing and Analysis Center, California Institute of Technology, Pasadena, CA 91125

Daniel S. Katz
Center for Computation and Technology, Louisiana State University, Baton Rouge, LA 70803

Generating mosaics of the sky

Size of mosaic (deg. sq.)*  Input data files  Intermediate files  Total data footprint  Number of jobs  Approx. execution time (20 procs)
1                           53                588                 1.2 GB                232             40 mins
2                           212               3,906               5.5 GB                1,444           49 mins
4                           747               13,061              20 GB                 4,856           1 hr 46 mins
6                           1,444             22,850              38 GB                 8,586           2 hrs 14 mins
10                          3,722             54,434              97 GB                 20,652          6 hours

*The full moon is 0.5 deg. sq. when viewed from Earth; the full sky is ~400,000 deg. sq.

[Figure: the structure of a small Montage workflow, with Project, Diff, Fitplane, BgModel, Background, and Add tasks operating on input images Image1-Image3.]

[Figure: a view of the Rho Oph dark cloud constructed with Montage from deep exposures made with the Two Micron All Sky Survey (2MASS) Extended Mission.]

Pegasus

Based on programming language principles:
- Leverage abstraction for workflow description to obtain ease of use, scalability, and portability.
- Provide a compiler to map from high-level descriptions (an abstract, resource-independent workflow) to executable workflows (resources identified): correct mapping, performance-enhanced mapping (a toy sketch of this mapping appears below).
- Rely on a runtime engine to carry out the instructions on the national cyberinfrastructure in a scalable and reliable manner.

[Figure: Pegasus architecture. The abstract workflow is compiled by Pegasus into an executable workflow; DAGMan releases ready tasks to the Condor queue on the local submit host (a community resource), which runs them on the national cyberinfrastructure using resource information from the sites.]

DAGMan (Directed Acyclic Graph MANager)
- Runs workflows that can be specified as directed acyclic graphs.
- Enforces DAG dependencies.
- Progresses as far as possible in the face of failures.
- Provides retries, throttling, etc. (a toy executor illustrating these behaviors appears below).
- Runs on top of Condor (and is itself a Condor job).

Pegasus Workflow Mapping
- Original workflow: 15 compute nodes, devoid of resource assignment.
- Resulting workflow mapped onto 3 Grid sites:
  - 11 compute nodes (4 removed based on available intermediate data)
  - 13 data stage-in nodes
  - 8 inter-site data transfers
  - 14 data stage-out nodes to long-term storage
  - 14 data registration nodes (data cataloging)
  - 60 jobs to execute

Automatic node clustering
[Figure: the small Montage workflow after level-based clustering, with two clusters per level and two tasks per cluster; a clustering sketch appears below.]

Pegasus
- Can map portions of workflows at a time, supporting the range from just-in-time to full-ahead mappings.
- Can cluster workflow nodes to increase computational granularity.
- Can minimize the amount of space required for the execution of the workflow (a data-cleanup sketch appears below).
- Can handle workflows on the order of 100,000 tasks.
- Supports a variety of fault-recovery techniques.

[Figure: disk space over time for a 1 deg. sq. Montage workflow on the TeraGrid with dynamic data cleanup and no clustering.]

SCEC CyberShake workflows run using Pegasus and DAGMan on the TeraGrid and USC resources
- Cumulatively, the workflows consisted of over half a million tasks and used over 2.5 CPU years.
- The largest CyberShake workflow contained on the order of 100,000 nodes and accessed 10 TB of data.
[Figure: jobs and time in hours per week of the year 2005-2006, using level-based clustering with a clustering factor of 5.]

Support for LIGO on the Open Science Grid
- LIGO workflows: 185,000 nodes, 466,000 edges; 10 TB of input data, 1 TB of output data.
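
To make the compiler idea above concrete, here is a minimal Python sketch of mapping an abstract, resource-independent workflow to an executable one. This is not the Pegasus planner: the SITES list, the round-robin site choice, and the plan() function are invented for illustration. It only demonstrates the steps the poster describes: pruning tasks whose outputs already exist, assigning the remaining tasks to Grid sites, and inserting data stage-in and stage-out jobs.

    # Illustrative only; all names here are hypothetical, not the Pegasus API.
    SITES = ["siteA", "siteB", "siteC"]               # stand-ins for 3 Grid sites

    def plan(abstract_tasks, existing_data):
        """abstract_tasks: list of {"name", "inputs", "outputs"} dicts (resource-independent).
        existing_data: set of files already cataloged. Returns executable jobs with sites."""
        executable = []
        for i, task in enumerate(abstract_tasks):
            # Workflow reduction: skip a task whose outputs are already available.
            if task["outputs"] and all(f in existing_data for f in task["outputs"]):
                continue
            site = SITES[i % len(SITES)]              # stand-in for real site selection
            for f in task["inputs"]:
                if f in existing_data:                # pre-existing inputs must be staged in
                    executable.append({"job": "stage_in_" + f, "site": site})
            executable.append({"job": task["name"], "site": site})
            for f in task["outputs"]:                 # outputs go to long-term storage
                executable.append({"job": "stage_out_" + f, "site": "long_term_storage"})
        return executable

In the real system the executable workflow also carries registration jobs that catalog the staged-out data, which is how later runs can skip tasks whose results already exist.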
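
The DAGMan bullets describe dependency enforcement, retries, throttling, and making as much progress as the DAG allows when jobs fail. The toy executor below illustrates those behaviors; it is not DAGMan itself, and run_job() is a made-up stand-in for submitting a job to Condor.

    import random

    def run_job(name):
        return random.random() > 0.2         # pretend each attempt fails 20% of the time

    def execute(dag, retries=3, max_running=10):
        """dag: {job: [parent jobs]}. Returns (done, failed) sets of job names."""
        done, failed = set(), set()
        remaining = dict(dag)
        while remaining:
            # Jobs with a failed ancestor can never run; mark them failed too.
            blocked = [j for j, ps in remaining.items() if any(p in failed for p in ps)]
            for j in blocked:
                failed.add(j)
                del remaining[j]
            # Release only jobs whose parents have all succeeded.
            ready = [j for j, ps in remaining.items() if all(p in done for p in ps)]
            if not ready:
                break                        # nothing runnable remains
            for j in ready[:max_running]:    # crude throttle on concurrent jobs
                ok = any(run_job(j) for _ in range(retries))   # retry failed attempts
                (done if ok else failed).add(j)
                del remaining[j]
        return done, failed

Because failures only poison their own descendants, independent branches of the DAG keep running, which is the "progresses as far as possible" behavior needed on best-effort resources.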
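
The clustering figure and the CyberShake legend both refer to level-based clustering (two tasks per cluster in the small Montage example, a clustering factor of 5 for CyberShake). The sketch below shows the idea, assuming the workflow is given as a {task: [parents]} dictionary; Pegasus itself clusters its own workflow representation, not this toy one.

    def levels(dag):
        """Return {task: level}, where level is the longest distance from a root."""
        level = {}
        def depth(t):
            if t not in level:
                parents = dag[t]
                level[t] = 0 if not parents else 1 + max(depth(p) for p in parents)
            return level[t]
        for t in dag:
            depth(t)
        return level

    def cluster(dag, factor):
        """Cut each level of the DAG into clusters of at most `factor` tasks."""
        by_level = {}
        for task, lvl in levels(dag).items():
            by_level.setdefault(lvl, []).append(task)
        clusters = []
        for lvl in sorted(by_level):
            tasks = by_level[lvl]
            clusters += [tasks[i:i + factor] for i in range(0, len(tasks), factor)]
        return clusters

    # Four independent Project-style tasks feeding one Add task, clustering factor 2:
    dag = {"p1": [], "p2": [], "p3": [], "p4": [], "add": ["p1", "p2", "p3", "p4"]}
    print(cluster(dag, 2))                   # [['p1', 'p2'], ['p3', 'p4'], ['add']]

Each cluster is then submitted as a single job, so a level with thousands of short tasks becomes a much smaller number of Grid jobs, cutting queueing and scheduling overhead on best-effort systems.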
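
The capability list notes that Pegasus can minimize the space a running workflow needs, and the TeraGrid figure refers to dynamic data cleanup. The sketch below shows the underlying idea under a simplifying assumption (tasks finish in the given list order): once the last task that reads a file has completed, a cleanup job for that file can run.

    def add_cleanup_jobs(tasks):
        """tasks: ordered list of {"name", "inputs", "outputs"} dicts.
        Returns {task name: files that may be deleted once that task finishes}."""
        last_reader = {}
        for t in tasks:
            for f in t["inputs"]:
                last_reader[f] = t["name"]   # a later reader overwrites an earlier one
        cleanup = {}
        for f, reader in last_reader.items():
            cleanup.setdefault(reader, []).append(f)
        return cleanup

    # Hypothetical Montage-like fragment: the projected images can be removed
    # once the difference task that consumes them has run.
    tasks = [
        {"name": "project1", "inputs": ["raw1.fits"], "outputs": ["proj1.fits"]},
        {"name": "project2", "inputs": ["raw2.fits"], "outputs": ["proj2.fits"]},
        {"name": "diff12", "inputs": ["proj1.fits", "proj2.fits"], "outputs": ["diff.fits"]},
    ]
    print(add_cleanup_jobs(tasks))
    # {'project1': ['raw1.fits'], 'project2': ['raw2.fits'], 'diff12': ['proj1.fits', 'proj2.fits']}

In a real run the cleanup jobs are added as extra workflow nodes that depend on the last consumer of each file, which is what keeps the on-disk footprint of a large mosaic well below its total intermediate-data volume.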