Adaptive Job Scheduler
Faucets: Efficient Utilization of
Multiple Clusters
Laxmikant Kale, Jayant DeSouza,
Sameer Kumar, Sindhura Bandhakavi,
Mani Potnuru
Parallel Programming Laboratory
Department of Computer Science
University of Illinois at Urbana-Champaign
http://charm.cs.uiuc.edu/
7/17/2015
Charm++ Workshop 2002
1
Outline
Motivation, and
  the Faucets solution
  the Adaptive Jobs solution
Faucets
  Job Submission
  Job Monitoring
Adaptive Jobs, and
  Performance Results
Adaptive Queuing System, and
  Simulations and Performance Results
Future Work
Motivation
Demand for high-end compute power, but it is
1. Dispersed
2. Hard to use
   Which machine will give me back my results quickest?
   Use ssh to log in, ftp files, decide on a queue, create a script, submit
   Because of the hassle, users just submit the same script to the same machine even if a better alternative exists
   Monitoring a running job
Low operational efficiency of existing computing systems
Solution 1: Faucets
Motivation #1: dispersed, hard to use
Central source of compute power
Match users and providers
  Users
  Providers of compute resources
User account not needed on every resource
Market economy?
  QoS requirements, contracts and bidding systems
GUI or web-based interface
  Submission
  Monitoring
Parallel systems need to maximize their efficiency!
[Diagram labels: Faucets, Job Submission, Job Monitor, and three Clusters]
http://charm.cs.uiuc.edu/research/faucets
Motivation #2: Inefficient Utilization
[Diagram of a 16-processor system. Labels: Allocate A, Conflict, B Queued, Job A, Job B, 8 processors]
Current job schedulers can have low system utilization!
Motivation #2, contd.
Chun & Culler paper
Compares FirstPrice (market-based scheduling) with PrioFIFO
Up to 2.5x improvement as degree of job parallelism increases
Both have “head-of-line” blocking
Adaptive jobs fix this
Brent Chun and David Culler, User-centric Performance Analysis of Market-based Cluster Batch Schedulers, CCGrid 2002.
Solution 2: Adaptive Jobs
Jobs that can shrink or expand the number of processors they are running on at runtime
Improve system utilization and response time
Properties
Min_pe, related to the memory requirements of the job
Max_pe, related to speedup
Adaptive Job Scheduler
Scheduler can take advantage of this adaptivity
Improve system utilization and response time
Scheduling decisions
Shrink existing jobs when a new job arrives
Expand jobs to use all processors when a job finishes
Processor map sent to the job
Bit vector specifying which processors a job is allowed to use
Example: 00011100 (use processors 3, 4, and 5)
Handles regular (non-adaptive) jobs
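The processor map on this slide can be illustrated with a short sketch. This is a minimal Python illustration, not the scheduler's own code: the function names `allowed_processors` and `make_processor_map` are hypothetical, and the sketch assumes the leftmost bit corresponds to processor 0, which is consistent with the slide's example.

```python
def allowed_processors(bitmap):
    """Return the 0-based indices of processors whose bit is '1'.

    Assumption: the leftmost character of the map is processor 0.
    """
    if set(bitmap) - {"0", "1"}:
        raise ValueError("processor map must contain only '0' and '1'")
    return [i for i, bit in enumerate(bitmap) if bit == "1"]


def make_processor_map(processors, total):
    """Build a bit-vector string of length `total` with the given processors set."""
    bits = ["0"] * total
    for p in processors:
        if not 0 <= p < total:
            raise ValueError("processor index out of range: %d" % p)
        bits[p] = "1"
    return "".join(bits)


print(allowed_processors("00011100"))    # [3, 4, 5]
print(make_processor_map([3, 4, 5], 8))  # 00011100
```

A shrink or expand request then amounts to sending the job a new map with fewer or more bits set.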
Two Adaptive Jobs
16 Processor system
Min_pe = 8, Max_pe = 16
[Animation labels: Allocate A, Allocate B, Shrink A, B Finishes, A Expands; legend: Job A, Job B]
Faucets: Job Submission
Submission Mechanism
QoS requirements, contract, bidding
type, number of processors
memory
estimated compute time
or table: processors vs. compute time
deadline
price
Authentication, security
Accounting
Cluster Bartering
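The QoS fields listed above could be gathered into a single request record. The following is a hypothetical sketch only: the class name `JobRequest`, its field names, the units (MB, seconds), and the reading of "deadline" as an allowed elapsed time are all assumptions made for illustration and are not taken from the Faucets implementation.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class JobRequest:
    """Hypothetical record of the QoS fields listed on the slide."""
    machine_type: str                  # "type" on the slide (assumed to mean machine type)
    min_pe: int                        # fewest processors the job can run on
    max_pe: int                        # most processors the job can use
    memory_mb: int                     # memory requirement (unit assumed)
    deadline_s: float                  # allowed elapsed time in seconds (assumed)
    price: float                       # offered price (unit unspecified)
    est_time_s: Optional[float] = None # single compute-time estimate
    time_table: Dict[int, float] = field(default_factory=dict)  # processors -> time

    def estimated_time(self, pes):
        """Prefer the processors-vs-time table; fall back to the single estimate."""
        return self.time_table.get(pes, self.est_time_s)

    def can_meet_deadline(self, pes):
        """True if an estimate exists for `pes` and it fits within the deadline."""
        t = self.estimated_time(pes)
        return t is not None and t <= self.deadline_s


request = JobRequest(
    machine_type="linux-cluster", min_pe=8, max_pe=16, memory_mb=512,
    deadline_s=3600.0, price=10.0, time_table={8: 4000.0, 16: 2200.0},
)
print(request.can_meet_deadline(8), request.can_meet_deadline(16))  # False True
```

A table of processors versus compute time gives a scheduler more to work with than a single estimate, since it can check the deadline for each allocation it is considering.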
Job Monitoring: Appspector
Using Appspector
Charm client-server (CCS) interface
Default server
Default Java client
User can write
Program code to send relevant data
Java class to display data
Clusters Status View
Adaptive Jobs
Adaptive Job Framework
Applications written in MPI or Charm++
Scheduler controls the processor map for each job
Processor map is used by the job’s load balancer
[Diagram labels: Adaptive Application, Scheduler, Proc. Map, Loadbalancer, AMPI, CHARM++, Converse]
Charm++
Charm++: Object based virtualization
Program written as a large number of objects which can migrate
Number of objects typically much larger than processors
Load-balancer can remap objects
Measurement based load balancing
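To make the idea of remapping objects concrete, here is a minimal sketch of a centralized, measurement-based placement restricted to the processors a job is allowed to use. It is a generic greedy heuristic (heaviest object first onto the least-loaded processor), not CommLB or any other Charm++ strategy. The function name `remap_objects`, the input format, and the bit-string processor map with processor 0 leftmost are assumptions for illustration.

```python
import heapq


def remap_objects(object_loads, bitmap):
    """Assign each object to an allowed processor, heaviest object first.

    object_loads: dict mapping object id -> measured load.
    bitmap: string of '0'/'1'; position i set means processor i may be used.
    Returns a dict mapping object id -> processor index.
    """
    allowed = [i for i, bit in enumerate(bitmap) if bit == "1"]
    if not allowed:
        raise ValueError("processor map allows no processors")
    # Min-heap of (accumulated load, processor index).
    heap = [(0.0, pe) for pe in allowed]
    heapq.heapify(heap)
    placement = {}
    for obj, load in sorted(object_loads.items(), key=lambda kv: kv[1], reverse=True):
        total, pe = heapq.heappop(heap)   # least-loaded allowed processor
        placement[obj] = pe
        heapq.heappush(heap, (total + load, pe))
    return placement


loads = {"a": 4.0, "b": 3.0, "c": 2.0, "d": 1.0}
print(remap_objects(loads, "0110"))  # {'a': 1, 'b': 2, 'c': 2, 'd': 1}
```

Because there are many more objects than processors, shrinking the set of allowed processors only changes where objects land; the program itself is unchanged.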
Adaptive Charm++ Programs
A Charm++ program is automatically adaptive if a shrink-expand-enabled centralized load-balancing strategy is used
Currently CommLB and RandCentLB are shrink-expand enabled
Compile with -module CommLB
Run with +balancer CommLB
MPI Jobs
How do we make MPI jobs adaptive?
AMPI
AMPI maps the MPI processes to user-level threads which can migrate
Each thread is embedded in a Charm++ object, thus allowing load balancing and shrink-expand
Adaptive AMPI Programs
Build AMPI with an adaptive load balancing strategy
Call MPI_MIGRATE() at regular intervals in each MPI process; otherwise the job will not respond to changes in the processor map
Performance Results for Adaptive Jobs
Shrink Expand Overhead
Processors   Shrink Time (s)   Expand Time (s)
128, 64      0.61              0.50
64, 32       0.66              0.54
32, 16       0.59              0.46
16, 8        0.56              0.49
Performance for MD program with 10 MB migrated data per processor on NCSA Platinum
Residual Processes
Shrink
Objects are moved from the unallocated processors to the allocated processors
Leaves behind a residual process
Effect of Residual Process
Jobs in System   Performance cost (%)
2                1.98
4                1.43
8                3.24
Performance on a 16-processor system
[Plot: Performance of Job1 and Job2; axes: Utilization (%) vs. Time (s)]
Adaptive Queuing System
AQS Features
Multithreaded
Reliable and robust
Tested on the cool.cs Linux cluster at PPL
Supports most features of standard queuing systems
Can manage adaptive jobs, currently implemented in Charm++ and MPI
Handles regular (non-adaptive) jobs
AQS Scheduling Strategy
Scheduling Strategy
A library component that decides which jobs to schedule
Similar to equipartitioning [N Islam et al]
On job arrival and job completion
All running jobs and the new one are allocated their minimum number of processors
Leftover processors are shared equally, subject to each job's maximum processor usage
If it is not possible to allocate the new job its minimum number of processors, it is queued
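The strategy described above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the AQS code: the function names `equipartition` and `on_arrival`, the `(name, min_pe, max_pe)` tuple format, and the rule that any uneven remainder goes to jobs earlier in the list are all hypothetical choices.

```python
def equipartition(jobs, total_pes):
    """Allocate processors to jobs given as (name, min_pe, max_pe) tuples.

    Every job first gets its minimum; leftovers are handed out one at a time
    in round-robin order, never exceeding a job's maximum.
    Returns {name: processors}, or None if the minimums do not fit.
    """
    if sum(min_pe for _, min_pe, _ in jobs) > total_pes:
        return None
    alloc = {name: min_pe for name, min_pe, _ in jobs}
    leftover = total_pes - sum(alloc.values())
    while leftover > 0:
        growable = [name for name, _, max_pe in jobs if alloc[name] < max_pe]
        if not growable:
            break  # every job is at its maximum; remaining processors stay idle
        for name in growable:
            if leftover == 0:
                break
            alloc[name] += 1
            leftover -= 1
    return alloc


def on_arrival(running, new_job, total_pes):
    """Return (allocation, queued); the new job is queued if its minimum cannot be met."""
    alloc = equipartition(running + [new_job], total_pes)
    if alloc is None:
        return equipartition(running, total_pes), [new_job]
    return alloc, []


job_a = ("A", 8, 16)
job_b = ("B", 8, 16)
print(equipartition([job_a], 16))      # {'A': 16}
print(on_arrival([job_a], job_b, 16))  # ({'A': 8, 'B': 8}, [])
```

With Min_pe = 8 and Max_pe = 16 for both jobs on a 16-processor system, this reproduces the sequence from the Two Adaptive Jobs slide: A runs on 16 processors, shrinks to 8 when B arrives, and expands back to 16 when `equipartition` is called again after B finishes.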
Simulated Utilization
Simulated Mean Response Time (MRT)
Experimental Utilization
Experimental MRT
Summary and Future Work
Ease of use – Faucets
Better utilization – Charm++/AMPI Adaptive Jobs
Go to http://charm.cs.uiuc.edu/research/faucets to download
Future
Extend the system to other parallel machines
Eliminate residual processes
Integrate the scheduler with Globus
More comprehensive QoS contracts being developed
Sophisticated bidding schemes for the faucets framework