How High Throughput
was my cluster?
Greg Thain
Center for High Throughput Computing
High Throughput Defined
$$\frac{\text{Job Runtimes}}{\text{Wall Time}}$$
More Correctly
$$\frac{\text{Completed Job Runtime}}{\text{Wall Time}}$$
Even more Correctly
$$\frac{\text{Completed Job Runtime}}{\text{Wall Time}}^{\,*}$$
* Subject to some notion of fairness
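As a rough illustration of the ratio above, here is a minimal Python sketch. The `Attempt` record and the `goodput_fraction` helper are hypothetical names, and "wall time" is assumed to mean the total slot-hours available in the reporting period; the slides do not pin that down.

```python
from dataclasses import dataclass


@dataclass
class Attempt:
    """One execution attempt on a slot (hypothetical record layout)."""
    runtime_hours: float  # hours the attempt occupied a slot
    completed: bool       # True if the job exited of its own accord


def goodput_fraction(attempts, wall_hours):
    """Completed job runtime divided by the wall time available."""
    if wall_hours <= 0:
        raise ValueError("wall_hours must be positive")
    good = sum(a.runtime_hours for a in attempts if a.completed)
    return good / wall_hours


# Made-up sample: two completed runs and one run killed at a 72-hour limit.
attempts = [Attempt(10.0, True), Attempt(72.0, False), Attempt(2.0, True)]
# Assume 100 slot-hours were available; only completed hours count.
fraction = goodput_fraction(attempts, wall_hours=100.0)
```

The evicted 72-hour run occupies a slot but contributes nothing to the numerator, which is the point of counting only completed runtime.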
∗ There’s always fine print
› Optimize goodput subject to the following:
› “Subject to some notion of fairness”
  ▪ Recent usage
  ▪ Machine ownership
  ▪ Real-world urgency
    • Temporary or otherwise
  ▪ Group membership
  ▪ Etc., etc.
What’s your policy?
› Are you sure you know?
› We’d like to know.
› We’ve got lots of mechanisms
› We would really like to know if they are sufficient
› Please talk to me!
Example policy
› Global limit on jobs from each group
› Also a limit on the sum of sub-groups
› One free-for-all group, can use the whole pool
  ▪ Maybe not such a good idea
› If any job runs longer than two days:
  ▪ It’s drunk, send it home
Policy for CHTC pools
› Big question:
  ▪ Longest allowable job runtime
› Currently 72 hours. Good? Bad?
› Policy note: set with the negotiator, not the startd
Why do we care?
condor_status -tot
               Total  Owner  Claimed  Unclaimed  Matched
INTEL/LINUX        1      0        1          0        0
X86_64/LINUX    6639     63     6141        435        0
Total           6640     63     6142        435        0
$$\frac{72\ \frac{\text{hours}}{\text{job}} \times 3600\ \frac{\text{seconds}}{\text{hour}}}{6000\ \text{machines}} \approx 43\ \text{secs}$$
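The arithmetic on this slide can be checked with a short Python sketch. The function name is hypothetical, and the model is an assumption: every slot is identical, every job runs to the runtime limit, and job end times are spread evenly, so on average one slot frees up every (limit / machines) seconds.

```python
def seconds_between_free_slots(max_job_hours, machines):
    """Average gap between slots freeing up under the assumptions above."""
    if machines <= 0:
        raise ValueError("machines must be positive")
    return max_job_hours * 3600 / machines


# The slide's numbers: a 72-hour limit on roughly 6000 machines,
# which gives about 43 seconds.
gap_72 = seconds_between_free_slots(72, 6000)

# A hypothetical 24-hour limit on the same pool, for comparison.
gap_24 = seconds_between_free_slots(24, 6000)
```

The gap scales linearly with the runtime limit, so cutting the limit to a third cuts the expected wait for a free slot to a third as well.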
Problem: draining
› With homogeneous slots, wait time is a function of pool size, which is big
› Assuming no checkpointing
› If draining is needed, job wait time is a function of the longest job. ☹
› More demand for HTPC jobs.
CHTC: A Flocking Nightmare
[Diagram labels: 3 CHTC Schedds; 80 UW Schedds; Non-UW Schedds; 6,000 cores CHTC; 2,000 cores CS; Infolab pool; CAE Pool; ACI Pool; Glidein!]
Negotiator Records
› “The Accountant”
› Access via
  ▪ condor_userprio
› Records matches, not jobs (e.g. the glidein problem)
Negotiator Reporting
Schedd Records
› “Event Log”: enable in config file
› “History file”: condor_history
› We don’t control them all
Startd also keeps history
› This is the one we use
  ▪ condor_history -f startd_history
  ▪ Enable by setting
  ▪ STARTD_HISTORY = /path/to/file
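For offline analysis, the history file can also be read directly. The following is a minimal sketch, not an official parser: it assumes the file holds one `Attribute = value` pair per line and that each record ends with a banner line starting with `***`. The function name and the sample text are made up for illustration.

```python
def parse_history(text):
    """Split history-file text into a list of {attribute: raw value} dicts."""
    ads, current = [], {}
    for line in text.splitlines():
        line = line.strip()
        if line.startswith("***"):  # assumed end-of-record banner
            if current:
                ads.append(current)
                current = {}
        elif " = " in line:
            name, value = line.split(" = ", 1)
            current[name] = value.strip('"')  # drop quotes around strings
    if current:  # tolerate a final record with no banner
        ads.append(current)
    return ads


# Made-up sample in the assumed layout.
sample = '''Owner = "gthain"
ClusterId = 101
RemoteWallClockTime = 42.0
*** Offset = 0 ClusterId = 101
Owner = "gthain"
ClusterId = 102
RemoteWallClockTime = 3600.0
*** Offset = 80 ClusterId = 102
'''
ads = parse_history(sample)
```

Values are kept as raw strings here; a real report would convert runtimes to numbers before aggregating.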
condor_pool_job_report
The following users have run vanilla jobs that have hit the
MaxJobRetirementTime (72) hour limit in CHTC yesterday.
# of Jobs   User
---------   ------
        3   [email protected]
       79   [email protected]
       81   [email protected]
      353   [email protected]
= 31K hours of badput!
What is/isn’t a job “completion”?
› Strict definition: job exits of own accord
  ▪ Two problems:
    • Very, very short jobs
    • Self-checkpointable jobs
      – How to ID?
      – when_to_transfer_output = on_exit_or_evict
      – Adding explicit flag: requires a carrot
      – +is_resumable = true
› All this requires understanding users
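The distinctions on this slide can be sketched as a small classifier. This is an illustration only: the function names, the dictionary-style job ad, and the 60-second "short" threshold are assumptions, and the labels are not taken from the actual report.

```python
def is_resumable(ad):
    """True if the job ad signals that an eviction does not lose its work."""
    transfer = str(ad.get("WhenToTransferOutput", "")).upper()
    return transfer == "ON_EXIT_OR_EVICT" or ad.get("is_resumable") is True


def classify_attempt(ad, exited_on_own, runtime_seconds, short_seconds=60):
    """Label one execution attempt.

    'short'     : exited on its own, but suspiciously quickly
    'completed' : exited on its own after a plausible runtime
    'resumable' : evicted, but the job says it can pick up where it left off
    'badput'    : evicted with no sign that the work was saved
    """
    if exited_on_own:
        return "short" if runtime_seconds < short_seconds else "completed"
    return "resumable" if is_resumable(ad) else "badput"


# A self-checkpointing job that was evicted after two hours.
label = classify_attempt({"WhenToTransferOutput": "ON_EXIT_OR_EVICT"}, False, 7200)
```

Whether a "short" exit is a real completion or a failure still has to be settled by talking to the user; the code only flags it.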
Then, on to runtimes.
Averages can be deceiving
User     Starts   Total Hours   Mean
gthain     8442          8427   00:59
What about quartiles?
1st quartile   00:01 (one minute)
2nd quartile   00:12
3rd quartile   00:42
4th quartile   68:41
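A quick way to see why the mean misleads is to compute it next to the quartiles. The sketch below uses Python's standard `statistics` module; the helper name and the sample runtimes are made up for illustration and are not the data behind the slide.

```python
import statistics


def runtime_summary(runtimes_minutes):
    """Mean, quartile cut points, and maximum of a list of runtimes."""
    q1, median, q3 = statistics.quantiles(
        runtimes_minutes, n=4, method="inclusive"
    )
    return {
        "mean": statistics.fmean(runtimes_minutes),
        "q1": q1,
        "median": median,
        "q3": q3,
        "max": max(runtimes_minutes),
    }


# Made-up runtimes in minutes: mostly short runs plus one very long one.
summary = runtime_summary([1, 1, 5, 12, 30, 42, 4121])
```

In this sample the single long run pulls the mean far above the third quartile, so the mean describes almost none of the runs.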
“Jobs” vs “Execution attempts”
› If 25% of runs last less than one minute:
› Is that just one bad job?
› Or are all of the jobs bad?
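One way to answer that question is to group execution attempts by job before counting the short ones. This is a minimal sketch: the function name, the `(job_id, runtime_seconds)` input layout, and the 60-second threshold are assumptions for illustration.

```python
from collections import defaultdict


def short_run_breakdown(attempts, short_seconds=60):
    """Count short execution attempts per job.

    attempts: iterable of (job_id, runtime_seconds) pairs.
    """
    runs = defaultdict(int)
    short_runs = defaultdict(int)
    for job_id, runtime in attempts:
        runs[job_id] += 1
        if runtime < short_seconds:
            short_runs[job_id] += 1
    return {
        "jobs": len(runs),
        "jobs_with_short_runs": len(short_runs),
        "short_runs": sum(short_runs.values()),
        "short_runs_by_job": dict(short_runs),
    }


# Made-up sample: job "1.0" restarts repeatedly, the other two run normally.
breakdown = short_run_breakdown(
    [("1.0", 5), ("1.0", 8), ("1.0", 3), ("2.0", 4000), ("3.0", 7200)]
)
```

Here three of five attempts are short, yet they all belong to one job, which is a very different situation from every job failing quickly.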
Added new columns to report
› “Restarted jobs”
› Quartiles
› Short jobs (less than a minute)
› Removed hours
› Mean, Median, SD
› Requires a lot of user facilitation
Problem: Zoo of a pool
Order-of-magnitude different speeds in pool
Naïve solution:
  Create scaled performance numbers
Actual solution:
  Remove very slow machines from pool
  Require users to ask for fast machines
Results of looking at data
› Can lower the 72-hour limit to 24 hours
› Probably need an “escape hatch” for some
› Can drastically improve draining response
Future Work
› Support for slot-based scheduling?
› Support for mixed HPC / HTC submissions?
Thank you!
› Please talk to me about pool policy
  ▪ We’d love to hear from you!
› Important to know the shape of jobs
› Pure hours consumed is not an important metric
› Preempt-Resume right the first time!