Pegasus on the Virtual Grid: A Case Study of Workflow Planning over Captive Resources
Yang-Suk Kee, Eun-Kyu Byun, Ewa Deelman, Karan Vahi, Jin-Soo Kim
Oracle US Inc
Korea Advanced Institute of Science and Technology
Information Sciences Institute/University of Southern California
Sungkyunkwan University
Overview
• Motivation
• Background
  – Pegasus
  – Virtual Grid
• Pegasus-VG Proxy
• Conclusion
• Discussion
Motivation
• Challenges in scientific application development
  – Data/control flow, task scheduling, data replication, fault tolerance, etc.
• Challenges in resource management
  – Availability, performance, cost, reliability, fault tolerance, etc.
• How to leverage existing cyberinfrastructures for easy and efficient scientific computing?
Separation of Concerns
• Application domain
  – Workflow management: application management can be conducted independently of target execution environments.
  – E.g., Pegasus, Askalon, Triana
• Resource domain
  – Resource provisioning: resource management can be encapsulated underneath abstractions or virtualizations.
  – E.g., Virtual Grid, virtual cluster, cloud
Workflow planning and execution over provisioned resources
Pegasus
• A framework for workflow planning and execution
• Workflow lifecycle
  – Design: describe the data/control flows of an application via an abstract workflow
  – Planning: map the workflow tasks onto physical resources
  – Execution: schedule and run the workflow tasks on the mapped resources
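The three lifecycle stages above can be illustrated with a minimal Python sketch. This is not Pegasus code: the round-robin `plan` function, the site names, and the `run_task` callback are assumptions made for illustration, and the task names are borrowed from the diamond workflow used in the talk's example.

```python
from graphlib import TopologicalSorter

# Design: an abstract workflow, written as task -> set of parent tasks.
abstract_workflow = {
    "preprocess": set(),
    "findrange_1": {"preprocess"},
    "findrange_2": {"preprocess"},
    "analyze": {"findrange_1", "findrange_2"},
}

def plan(workflow, sites):
    """Planning: map each task to a site (round-robin, a placeholder policy)."""
    order = list(TopologicalSorter(workflow).static_order())
    return [(task, sites[i % len(sites)]) for i, task in enumerate(order)]

def execute(executable_workflow, run_task):
    """Execution: run tasks in the planned, dependency-respecting order."""
    return [run_task(task, site) for task, site in executable_workflow]

# Hypothetical site names; a real planner would read them from a site catalog.
executable = plan(abstract_workflow, ["site_a", "site_b"])
log = execute(executable, lambda task, site: f"{task}@{site}")
```

The point of the split is that `abstract_workflow` never mentions a site: only `plan` knows about resources, which is the separation the slides describe.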
Pegasus Workflow Management
[Figure: Pegasus workflow management. The Pegasus mapper turns an abstract workflow into an executable workflow; Condor DAGMan submits its tasks to a Condor pool in the computing environment. Monitoring and provenance information is collected.]
Virtual Grid
• A programmable virtualized resource provisioning framework
• Components
  – vgDL (Virtual Grid Description Language)
    • Specifies resource requirements
  – vgES (Virtual Grid Execution System)
    • Compiles and coordinates resources
  – PC (Personal Cluster)
    • Provides uniform job management
Virtual Grid Resource Abstraction
[Figure: Virtual Grid resource abstraction. An application workflow (tasks A, B, C, D) and its vgDL program are mapped to a VG. Labels: Classification, Selection, Binding, Environment (PBS), Lease, Timeshare, Batch, P4.]
vgdl=clusterof (node) [2] {
node = [Processor==“P4”]
}
Pegasus on Virtual Grid
• Scope
  – A basic integration for workflow planning and execution over provisioned resources
• Issues
  – Resource capacity estimation
    • Resource specification (vgDL) synthesis for Virtual Grid
  – Resource information publication
    • Site catalog generation for Pegasus
Resource Capacity Estimation
• What Virtual Grid expects from Pegasus
  – vgDL description
• Available information
  – Task execution time, data transfer time, performance metrics, minimum memory capacity, cost, deadline, etc.
• Unknown information
  – Number of virtual processors
• Resource capacity estimate
  – Find the minimum number of processors that can execute a workflow within a deadline
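The estimation problem above can be sketched in Python. This is a plain greedy list-scheduling heuristic, not the BTS algorithm from the referenced paper: the function names, the task times (in minutes), and the choice to ignore data transfer time are all assumptions made to keep the example short. It expects an acyclic workflow.

```python
import heapq

def makespan(tasks, deps, num_procs):
    """Greedy list scheduling of a DAG on num_procs identical processors.

    tasks maps task -> execution time; deps maps task -> list of parents.
    Data transfer time is ignored in this sketch.
    """
    finish = {}                      # task -> finish time
    free_at = [0.0] * num_procs      # min-heap of processor free times

    def ready_time(task):
        return max((finish[p] for p in deps.get(task, ())), default=0.0)

    remaining = set(tasks)
    while remaining:
        ready = [t for t in remaining
                 if all(p in finish for p in deps.get(t, ()))]
        task = min(ready, key=lambda t: (ready_time(t), t))
        start = max(ready_time(task), heapq.heappop(free_at))
        finish[task] = start + tasks[task]
        heapq.heappush(free_at, finish[task])
        remaining.remove(task)
    return max(finish.values())

def min_processors(tasks, deps, deadline):
    """Smallest processor count whose greedy schedule meets the deadline."""
    for p in range(1, len(tasks) + 1):
        if makespan(tasks, deps, p) <= deadline:
            return p
    return None  # even one processor per task misses the deadline

# Hypothetical execution times in minutes for a diamond-shaped workflow.
tasks = {"preprocess": 10, "findrange_1": 25, "findrange_2": 25, "analyze": 10}
deps = {"findrange_1": ["preprocess"], "findrange_2": ["preprocess"],
        "analyze": ["findrange_1", "findrange_2"]}
print(min_processors(tasks, deps, deadline=60))  # 2
```

The search is linear rather than binary because a greedy schedule's makespan is not guaranteed to shrink monotonically as processors are added.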
BTS (Balanced Time Scheduling)
[Figure: example workflow of six tasks (ID 1 to 6) with their execution times (ET), and a schedule over time on processors p1 and p2]
How many processors do we need to run this workflow within 7 time units?
Ref: E.-K. Byun, Y.-S. Kee, et al., e-Science'08
Example
• Execution time of each task: on a Xeon processor
• Data transfer time: over a network with 1 Gbps bandwidth
• Deadline: 1 hour
[Figure: diamond workflow: f.input → preprocess → findrange (two parallel tasks) → analyze → f.output]
Diamond =
ClusterOf [2] (nd) [, 0:30:00] {
nd = [Processor == “Xeon”]
}
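Once a processor count is known, turning it into a resource request is a string-formatting step. The sketch below only mimics the textual shape of the `ClusterOf` snippets shown on these slides; `synthesize_vgdl` is a hypothetical helper, the output has not been checked against the actual vgDL grammar, and the optional time field seen in the example above is left out because the slides do not explain it.

```python
def synthesize_vgdl(name, num_nodes, processor):
    """Render a ClusterOf request in the shape used on the slides.

    Hypothetical helper: not part of vgES, and not validated against vgDL.
    """
    if num_nodes < 1:
        raise ValueError("num_nodes must be at least 1")
    return (
        f"{name} =\n"
        f"ClusterOf (nd) [{num_nodes}] {{\n"
        f'  nd = [Processor == "{processor}"]\n'
        f"}}"
    )

print(synthesize_vgdl("Diamond", 2, "Xeon"))
```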
Resource Information Publication
• What Pegasus expects from Virtual Grid
  – Site catalog
• Virtual Grid
  – VG instance
• Resource information publication
  – Devirtualize a VG instance and generate a site catalog for Pegasus
Personal Cluster
• A partition of resources dedicated to a user under the control of a user-level resource manager during a limited time period
[Figure: Personal Cluster; labels: GT4/PBS]
Ref: Y.-S. Kee and C. Kesselman, HCW'08
Site Catalog Publication
<sitecatalog xmlns="http://pegasus.isi.edu/schema/sitecatalog" …>
  …
    <profile namespace="env" key="PEGASUS_HOME">/home/globus/pegasus2.1.0</profile>
    <profile namespace="condor" key="grid_type">gt4</profile>
    <profile namespace="condor" key="jobmanager_type">PBS</profile>
    <lrc url="rlsn://cat7.kaist.ac.kr" />
    <gridftp url="gsiftp://cat7.kaist.ac.kr:2811" storage="/home/globus" major="4" minor="0" patch="7" />
    <jobmanager universe="transfer" url="https://cat7.kaist.ac.kr:9000/wsrf/services/ManagedJobFactoryService" major="4" minor="0" patch="7" total-nodes="2" />
    <jobmanager universe="vanilla" url="https://cat7.kaist.ac.kr:9000/wsrf/services/ManagedJobFactoryService" major="4" minor="0" patch="7" total-nodes="2" />
    <workdirectory>$HOME/workdir</workdirectory>
  </site>
  …
</sitecatalog>
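Generating such a catalog from a provisioned VG can be sketched with Python's standard `xml.etree.ElementTree`. The element and attribute names below are copied from the excerpt above; everything else is an assumption: the shape of the `vg` dictionary, the `build_site_catalog` helper, and the `handle` attribute on `<site>` (the opening `<site>` tag is elided on the slide). Version attributes and the `<lrc>` entry are omitted for brevity.

```python
import xml.etree.ElementTree as ET

def build_site_catalog(vg):
    """Build a site catalog element from a devirtualized VG description."""
    root = ET.Element(
        "sitecatalog", {"xmlns": "http://pegasus.isi.edu/schema/sitecatalog"}
    )
    # 'handle' is an assumed attribute; the slide elides the <site> tag.
    site = ET.SubElement(root, "site", {"handle": vg["name"]})
    for (namespace, key), value in vg["profiles"].items():
        profile = ET.SubElement(
            site, "profile", {"namespace": namespace, "key": key}
        )
        profile.text = value
    ET.SubElement(site, "gridftp", {
        "url": f"gsiftp://{vg['host']}:2811",
        "storage": vg["storage"],
    })
    for universe in ("transfer", "vanilla"):
        ET.SubElement(site, "jobmanager", {
            "universe": universe,
            "url": vg["jobmanager_url"],
            "total-nodes": str(vg["nodes"]),
        })
    ET.SubElement(site, "workdirectory").text = vg["workdir"]
    return root

# Hypothetical VG description; values are taken from the excerpt above.
vg = {
    "name": "vg_cluster",
    "host": "cat7.kaist.ac.kr",
    "storage": "/home/globus",
    "nodes": 2,
    "jobmanager_url":
        "https://cat7.kaist.ac.kr:9000/wsrf/services/ManagedJobFactoryService",
    "workdir": "$HOME/workdir",
    "profiles": {
        ("condor", "grid_type"): "gt4",
        ("condor", "jobmanager_type"): "PBS",
    },
}
root = build_site_catalog(vg)
xml_text = ET.tostring(root, encoding="unicode")
```

The node count written to `total-nodes` is the same number the capacity estimate produced, which is how the two halves of the integration meet.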
Workflow Planning over Provisioned Resources
[Figure: workflow planning pipeline. Pegasus: creation (abstract workflow), planning (executable workflow), scheduling/execution. VG-Pegasus Proxy: BTS, site catalog. Virtual Grid: vgDL, VG, devirtualization, GT4+PBS.]
vgdl =
ClusterOf (nd) [2] {
nd = [Proc==“Xeon”]
}
Conclusion
• Pegasus on Virtual Grid
  – Implements workflow planning and execution over on-demand captive resources
  – Enables easy and efficient application development and execution
• Issues
  – Resource capacity estimation
  – Site catalog publication
Discussion
• Effective performance
  – What is the cost that a user has to pay to have a successful execution?
• Ongoing studies
  – Fine-grained planning for resource provisioning
    • Performance, cost, reliability
  – Workflow execution for virtualization
    • Recovery of failed tasks
Need More Information?
• Pegasus
  – http://pegasus.isi.edu
• VGrADS
  – Tuesday, 11:30am, RENCI booth (2633)
  – Wednesday, noon, GCAS booth (285)
  – Wednesday, 2:00pm, SDSC booth (568)
  – Wednesday, 4:00pm, RENCI booth (2633)
QUESTIONS
ANSWERS