Exploiting Deadline Flexibility in
Grid Workflow Rescheduling
Wei Chen
Alan Fekete
Young Choon Lee
Agenda
• Introduction
• Deadline Guaranteed Rescheduling
• Workflow Scheduling
• Task Rescheduling
• Performance Study
• Conclusion
Computational Grid and
Workflow Application
• Computational Grid:
$Grid = \{R_1, R_2, \ldots, R_n\}$
– Heterogeneous Computing Site (Resource Instance)
– Advance Reservation
• Workflow Application
– Directed Acyclic Graph (DAG)
– Job (V, E), where V is the set of tasks and
E is the set of directed edges representing precedence
constraints between the corresponding tasks
[Figure: example workflow DAG with tasks 1 to 5]
Grid Workflow Scheduling
• List scheduling heuristics
• Heterogeneous Earliest-Finish-Time (HEFT)
– Greedy Best-First Strategy
– It schedules each workflow job on its own, without an
overall view across different workflow jobs
The Approach we build on: Deadline
Guaranteed Rescheduling (DGR)
• Deadline-based scheduling: it allows each job to
come with a deadline, and from this, each task
of the job can be placed more flexibly (not only
at the earliest possible timeslot)
• A rescheduling mechanism: the tasks of an
earlier job might be rearranged to other time
slots or resource instances, giving extra
resource availability for more urgent tasks
An Example of Scheduling and
Rescheduling Workflow Jobs
[Figure: workflow jobs (A) and (B) and three schedule panels (a), (b), (c) showing tasks A1 to A5 and B1 to B4 on resource instances R1, R2 and R3, with Deadline (A) and Deadline (B) marked]
The Key Points of Our Approach
• First, our approach distributes tasks loosely along the time axis
according to the deadline of the workflow job, rather than squeezing
them into the earliest finish times. This leaves more room in
rescheduling for urgent tasks to get the resource availability they need.
• Second, our approach does not reconsider the schedule of the whole
job. Each task is rescheduled within a time slot boundary so
that the current schedules of its predecessors and successors are
unaffected. This reduces the complexity of our algorithm.
• Third, rescheduling can be done not only in the time dimension
(another time slot), but also in the space dimension (a different
resource instance). This increases the flexibility of rescheduling.
• Our rescheduling rearranges the advance reservations of tasks
before they are submitted for execution, so it incurs no
task migration cost.
Task Deadlines
• Weighted DAG
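$\forall v_i \in V,\quad \mathrm{ET}(v_i) = \mathrm{workload}(v_i) / S$    (1)
$S$: the average computation speed
$\forall e_{ij} \in E,\quad \mathrm{DT}(e_{ij}) = \mathrm{data}(e_{ij}) / B$    (2)
$B$: the average network bandwidth
$\forall v_i \in V,\ v_j \in \mathrm{predecessor}(v_i),\quad \mathrm{EFT}(v_i) = \max\{\mathrm{EFT}(v_j) + \mathrm{DT}(e_{ij})\} + \mathrm{ET}(v_i)$    (3)
$\mathrm{makespan} = \max\{\mathrm{EFT}(v_i)\},\ v_i \in V$    (4)
$\mathrm{ratio} = \mathrm{deadline} / \mathrm{makespan},\quad \mathrm{ET}' = \mathrm{ET} \times \mathrm{ratio},\quad \mathrm{DT}' = \mathrm{DT} \times \mathrm{ratio}$    (5)
• An advisable deadline for each task
$\forall v_i \in V,\ v_j \in \mathrm{successor}(v_i),\quad \mathrm{deadline}(v_i) = \min\{\mathrm{deadline}(v_j) - \mathrm{ET}'(v_j) - \mathrm{DT}'(e_{ij})\}$    (6)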
• The deadline of a workflow job can be guaranteed if all of its tasks are
finished before their deadlines. These advisable deadlines reasonably
balance the time for each task based on their workload proportions.
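Equations (1) to (6) can be traced in a short script. The following is a minimal sketch, not the authors' implementation: the function name `advisable_deadlines`, the dictionary-based DAG encoding, and the rule that exit tasks inherit the job deadline are assumptions made for illustration.

```python
from collections import defaultdict, deque


def advisable_deadlines(workload, data, job_deadline, speed, bandwidth):
    """Distribute a job deadline over its tasks, following Eqs. (1)-(6).

    workload: {task: workload}           (hypothetical encoding)
    data:     {(pred, succ): data size}  one entry per DAG edge
    """
    preds, succs = defaultdict(list), defaultdict(list)
    for u, v in data:
        succs[u].append(v)
        preds[v].append(u)

    # Topological order of the DAG (Kahn's algorithm).
    indeg = {t: len(preds[t]) for t in workload}
    queue = deque(t for t in workload if indeg[t] == 0)
    order = []
    while queue:
        t = queue.popleft()
        order.append(t)
        for s in succs[t]:
            indeg[s] -= 1
            if indeg[s] == 0:
                queue.append(s)

    et = {t: w / speed for t, w in workload.items()}      # Eq. (1)
    dt = {e: d / bandwidth for e, d in data.items()}      # Eq. (2)

    eft = {}
    for t in order:                                       # Eq. (3)
        ready = max((eft[p] + dt[(p, t)] for p in preds[t]), default=0.0)
        eft[t] = ready + et[t]
    makespan = max(eft.values())                          # Eq. (4)
    ratio = job_deadline / makespan                       # Eq. (5)

    deadline = {}
    for t in reversed(order):                             # Eq. (6)
        deadline[t] = min(
            (deadline[s] - (et[s] + dt[(t, s)]) * ratio for s in succs[t]),
            default=job_deadline,  # assumption: exit tasks get the job deadline
        )
    return deadline


# Diamond-shaped job a -> (b, c) -> d with makespan 6 and deadline 12 (ratio 2).
example = advisable_deadlines(
    workload={"a": 10, "b": 20, "c": 10, "d": 10},
    data={("a", "b"): 10, ("a", "c"): 10, ("b", "d"): 10, ("c", "d"): 10},
    job_deadline=12, speed=10, bandwidth=10,
)
print(example)  # a: 2.0, b: 8.0, c: 8.0, d: 12
```

Because the ratio stretches every execution and transfer time by the same factor, the slack between makespan and deadline is shared in proportion to each task's workload.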
Scheduling Algorithm
Input: a DAG
Output: scheduling of the job
calculate deadlines for each task; rank tasks into a priority list
for each task in the list do
    schedule task within its deadline
    if it fails then
        schedule task at the earliest finish time
        if this finish time > job’s deadline then break the loop
    end if
end for
if scheduling is not done then
    roll back the schedules that have been made
    for each task in the list do
        schedule task at the earliest finish time
        if this finish time > job’s deadline then reject the job
    end for
end if
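The two-pass control flow above (deadline-based placement first, then a rollback to plain earliest-finish-time placement) can be sketched in Python. This covers the control flow only: the callbacks `place_within_deadline`, `place_earliest` and `rollback` are hypothetical stand-ins for the actual reservation logic, which the slides do not spell out, and releasing reservations when a job is rejected is an added assumption.

```python
def schedule_job(tasks, job_deadline, place_within_deadline, place_earliest, rollback):
    """Return {task: finish_time}, or None if the job is rejected.

    Hypothetical callbacks (not from the slides):
      place_within_deadline(task) -> finish time, or None if no slot fits
      place_earliest(task)        -> earliest finish time on any resource
      rollback(tasks)             -> release reservations made for `tasks`
    `tasks` is the priority list, already ranked.
    """
    placed = {}
    for task in tasks:                        # pass 1: deadline-based
        finish = place_within_deadline(task)
        failed = finish is None
        if failed:
            finish = place_earliest(task)     # fall back to earliest finish
        placed[task] = finish
        if failed and finish > job_deadline:
            break                             # pass 1 cannot meet the deadline
    else:
        return placed                         # every task was placed

    rollback(list(placed))                    # undo pass 1
    placed = {}
    for task in tasks:                        # pass 2: earliest finish time
        finish = place_earliest(task)
        placed[task] = finish
        if finish > job_deadline:
            rollback(list(placed))            # assumption: free reservations
            return None                       # reject the job
    return placed
```

Passing the placement steps in as callbacks keeps the sketch independent of how resource timelines and advance reservations are represented.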
Time Slot Boundary
• The time slot boundary is calculated when a task tries to
be rescheduled on a specific resource instance
• At that moment, the actual schedules of the task’s
predecessors and successors are known
• Since the target resource is specified, the actual network
bandwidths between the resource instance and that of
the task’s predecessors or successors are also known
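$\forall v_i \in V,\ v_j \in \mathrm{predecessor}(v_i),\ v_k \in \mathrm{successor}(v_i)$
$\mathrm{EST}(v_i, R) = \max\{\mathrm{AFT}(v_j) + \mathrm{data}(e_{ij}) / \mathrm{bandwidth}(e_{ij})\}$
$\mathrm{LFT}(v_i, R) = \min\{\mathrm{AST}(v_k) - \mathrm{data}(e_{ik}) / \mathrm{bandwidth}(e_{ik})\}$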
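As an illustration of the boundary calculation, the sketch below computes EST and LFT for one task on one candidate resource instance. The function names, the `placement` tuple layout and the `bandwidth` lookup table are assumptions made for this example; the slides only give the two formulas. AFT and AST are read here as the actual finish and start times of already scheduled neighbours.

```python
def time_slot_boundary(task, resource, preds, succs, placement, data, bandwidth):
    """Return (EST, LFT) for moving `task` onto `resource`.

    Hypothetical encoding:
      placement: {task: (resource, actual_start, actual_finish)}
      data:      {(u, v): data size on edge u -> v}
      bandwidth: {(res_from, res_to): bandwidth between two resource instances}
    """
    est = max(
        # predecessor's actual finish time plus the transfer time to `resource`
        (placement[p][2] + data[(p, task)] / bandwidth[(placement[p][0], resource)]
         for p in preds),
        default=0.0,                 # assumption: entry tasks may start at time 0
    )
    lft = min(
        # successor's actual start time minus the transfer time from `resource`
        (placement[s][1] - data[(task, s)] / bandwidth[(resource, placement[s][0])]
         for s in succs),
        default=float("inf"),        # assumption: exit tasks are bounded elsewhere
    )
    return est, lft


def fits(duration, est, lft):
    """A task fits the boundary if it can start at EST and still end by LFT."""
    return est + duration <= lft


# Task v moves to R3; predecessor p ended on R1 at t=10, successor s starts on R2 at t=40.
boundary = time_slot_boundary(
    "v", "R3", preds=["p"], succs=["s"],
    placement={"p": ("R1", 0, 10), "s": ("R2", 40, 50)},
    data={("p", "v"): 20, ("v", "s"): 30},
    bandwidth={("R1", "R3"): 4, ("R3", "R2"): 10},
)
print(boundary)  # (15.0, 37.0)
```

Any slot on the target resource that lies inside this interval leaves the schedules of the predecessors and successors untouched.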
Bipartite Graph Matching
• All tasks form one set of nodes T (regardless of which
workflow job a task belongs to), and all resource
instances form the other set R.
• Every task is linked to all of its satisfiable resources. The
direction of an edge shows whether the task has been
scheduled on (matched with) that resource instance:
a matched edge points to the task.
[Figure: three bipartite graphs (a), (b), (c) between task nodes T and resource nodes R]
Rescheduling Algorithm
Input: a task
Output: scheduling of the task
push the task into an empty stack S
while S is not empty
    pop a task from S
    for each satisfiable resource of the task do
        calculate EST and LFT
        if it can be scheduled in the boundary then
            return: the scheduling
        else if a task can be removed then
            push it into S
        end if
    end for
end while
return: scheduling fails
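The displacement idea can be illustrated with the standard augmenting-path search for bipartite matching, which the stack-based loop above resembles. This is a deliberately simplified sketch rather than the DGR algorithm itself: it assumes each resource instance holds at most one task in the window of interest, it takes the satisfiable resources as given instead of computing EST and LFT, and it uses recursion in place of the explicit stack S. The names `reschedule`, `satisfiable` and `assigned` are hypothetical.

```python
def reschedule(task, satisfiable, assigned):
    """Place `task`, displacing already placed tasks if they can move elsewhere.

    satisfiable: {task: [resource, ...]} resources whose boundary fits the task
    assigned:    {resource: task} current matching, updated in place on success
    Returns True if the task (and every displaced task) ends up on a resource.
    """
    def try_place(t, visited):
        for r in satisfiable.get(t, []):
            if r in visited:
                continue                      # each resource is tried once
            visited.add(r)
            occupant = assigned.get(r)
            # Either the resource is free, or its occupant can be moved away.
            if occupant is None or try_place(occupant, visited):
                assigned[r] = t
                return True
        return False

    return try_place(task, set())


# Urgent task B1 only fits R1, which holds A2; A2 can move to R2.
assigned = {"R1": "A2"}
moved = reschedule("B1", {"B1": ["R1"], "A2": ["R1", "R2"]}, assigned)
print(moved, assigned)  # True {'R1': 'B1', 'R2': 'A2'}
```

The matching is only modified along a successful chain of moves, so a failed attempt leaves all existing reservations as they were.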
Experiment Setup
• Heterogeneous Grid
– 1,000 heterogeneous computing sites
– Different settings for resource properties, computation
capacity and speed
– Computing sites are fully connected by varying
network bandwidths
• Workflow Jobs
– Various sizes and degrees of parallelism
– Both computation-intensive and communication-intensive
ones
– Some are more urgent than others
Acceptance Rate
[Figure: Overall Acceptance Rate (0 to 100%) versus Number of Jobs Submitted (0 to 300) for HEFT, DGR and DGR-L]
Resource Utilization
[Figure: Resource Utilization (0% to 100%) versus Number of Submitted Jobs (0 to 300) for HEFT, DGR and DGR-L]
Running Time of Algorithms
[Figure: Running Time in ms (0 to 300) versus Number of Submitted Jobs (0 to 300) for HEFT, DGR and DGR-L]
Conclusion
• A deadline-based strategy to schedule and
reschedule workflow jobs; individual tasks can
be rescheduled, based on the requirements of
later jobs as they arrive.
• The approach satisfies Grid users as more jobs
can be finished before their deadlines, and it
also benefits the Grid owner by improving
resource utilization.
• By using appropriate heuristics, the cost of the
scheduling decision-making is quite acceptable
and scalable to a large number of tasks
scheduled in the system.
Thanks
Questions