Transcript Slide 1

A Grid and Cloud Research Toolbox
[Logos: The Failure Trace Archive; DGSim]
A. Iosup, O. Sonmez, N. Yigitbasi, H. Mohamed, S. Anoep, D.H.J. Epema (PDS Group, ST/EWI, TU Delft)
M. Jan (LRI/INRIA Futurs Paris, INRIA)
I. Raicu, C. Dumitrescu, I. Foster (U. Chicago)
H. Li, L. Wolters (LIACS, U. Leiden)
July 17, 2015
Paris, France
A Layered View of the Grid World
• Layer 1: Hardware + OS
  • Automated
  • Non-grid (XtreemOS?)
• Layers 2-4: Grid Middleware Stack
  • Low Level: file transfers, local resource allocation, etc.
  • High Level: grid scheduling
  • Very High Level: application environments (e.g., distributed objects)
  • User control
  • Simple to complex
• Layer 5: Grid Applications
  • Automated/user control
  • Simple to complex
[Figure: the layered stack, top to bottom: Grid Applications; Grid Very High Level MW; Grid High Level MW; Grid Low Level MW; HW + OS]
Grid Work: Science or Engineering?
• Work on Grid Middleware and Applications
• When is work in grid computing science?
• Studying systems to uncover their hidden laws
• Designing innovative systems
• Proposing novel algorithms
• Methodological aspects: repeatable experiments to verify and extend hypotheses
• When is work in grid computing engineering?
• Showing that the system works in a common case, or in a
special case of great importance (e.g., weather prediction)
• When our students can do it (H. Casanova’s argument)
Grid Research Problem:
We Are Missing Both Data and Tools
• Lack of data
  • Infrastructure: number and type of resources, resource availability and failures
  • Workloads: arrival process, resource consumption, …
We have problems to solve in grid computing (as a science)!
• Lack of tools
  • Simulators: SimGrid, GridSim, MicroGrid, GangSim, OptorSim, MONARC, …
  • Testing tools that operate in real environments: DiPerF, QUAKE/FAIL-FCI, …
Anecdote: Grids are far from being
reliable job execution environments
• Server: 99.99999% reliable
• Small Cluster: 99.999% reliable
• Production Cluster: 5x decrease in failure rate after first year [Schroeder and Gibson, DSN‘06]
So at the moment our students cannot work in grid computing engineering!
• DAS-2: >10% jobs fail [Iosup et al., CCGrid’06]
• TeraGrid: 20-45% failures [Khalili et al., Grid’06]
• Grid3: 27% failures, 5-10 retries [Dumitrescu et al., GCC’05]
• CERN LCG jobs: 74.71% successful, 25.29% unsuccessful (Source: dboard-gr.cern.ch, May’07)
The Anecdote at Scale
• NMI Build-and-Test Environment at U.Wisc.-Madison:
112 hosts, >40 platforms (e.g., X86-32/Solaris/5, X86-64/RH/9)
• Serves >50 grid middleware packages: Condor,
Globus, VDT, gLite, GridFTP, RLS, NWS, INCA(-2), APST, NINF-G,
BOINC …
Two years of functionality tests (‘04-‘06):
over 1 in 3 runs has at least one failure!
(1) Test or perish!
(2) In today’s grids, reliability is
more important than performance!
A. Iosup, D.H.J. Epema, P. Couvares, A. Karp, M. Livny, Build-and-Test Workloads for Grid Middleware: Problem, Analysis, and Applications, CCGrid, 2007.
A Grid Research Toolbox
• Hypothesis: (a) is better than (b). For scenario 1, …
[Figure: the Grid Research Toolbox workflow (three numbered components, including DGSim)]
Research Questions
Q1: How to exchange grid/cloud data?
(e.g., Grid/Cloud * Archive)
Q2: What are the characteristics of grids/clouds?
(e.g., infrastructure, workload)
Q3: How to test and evaluate grids/clouds?
Outline
1. Introduction and Motivation
2. Q1: Exchange Data
1. The Grid Workloads Archive
2. The Failure Trace Archive
3. The Cloud Workloads Archive (?)
3. Q2: System Characteristics
1. Grid Workloads
2. Grid Infrastructure
4. Q3: System Testing and Evaluation
Traces in Distributed Systems Research
• “My system/method/algorithm is better than yours
(on my carefully crafted workload)”
• Unrealistic (trivial): Prove that “prioritize jobs from
users whose name starts with A” is a good scheduling policy
• Realistic? “85% jobs are short”; “10% Writes”; ...
• Major problem in Computer Systems research
• Workload Trace = recording of real activity from a (real)
system, often as a sequence of jobs / requests submitted
by users for execution
• Main use: compare and cross-validate new job and resource
management techniques and algorithms
• Major problem: obtaining real workload traces from several sources
2.1. The Grid Workloads Archive [1/3]
Content
http://gwa.ewi.tudelft.nl
Stats: 6 traces online; per trace, up to 1.5 yrs of operation, >750K jobs, >250 users
A. Iosup, H. Li, M. Jan, S. Anoep, C. Dumitrescu, L. Wolters,
D. Epema, The Grid Workloads Archive, FGCS 24, 672—686, 2008.
2.1. The Grid Workloads Archive [2/3]
Approach: Standard Data Format (GWF)
• Goals
• Provide a unified format for Grid workloads;
• Same format in plain text and relational DB (SQLite/SQL92);
• To ease adoption, base it on the Standard Workload Format (SWF) of the Parallel Workloads Archive.
• Existing
• Identification data: Job/User/Group/Application ID
• Time and Status: Sub/Start/Finish Time, Job Status and Exit code
• Request vs. consumption: CPU/Wallclock/Mem
• Added
• Job submission site
• Job structure: bag-of-tasks, workflows
• Extensions: co-allocation, reservations, others possible
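To make the format concrete, here is a minimal sketch in Python of parsing a GWF-like plain-text job record. The field names and column order are illustrative, derived from the categories above, not the official GWF column layout.

```python
# Sketch: parsing a GWF-like plain-text record (one job per line).
# Field positions are illustrative, not the official GWF column order.
from dataclasses import dataclass

@dataclass
class GwfJob:
    job_id: str
    user_id: str
    submit_time: int   # Unix timestamp [s]
    wait_time: int     # [s]
    run_time: int      # [s]
    num_cpus: int
    status: int        # e.g., 1 = completed

def parse_gwf_line(line: str) -> GwfJob:
    """Parse one whitespace-separated job record."""
    f = line.split()
    return GwfJob(job_id=f[0], user_id=f[1],
                  submit_time=int(f[2]), wait_time=int(f[3]),
                  run_time=int(f[4]), num_cpus=int(f[5]), status=int(f[6]))

# Example: job 42 of user u7, submitted at t=1100, waited 30s, ran 600s on 2 CPUs
job = parse_gwf_line("42 u7 1100 30 600 2 1")
```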
A. Iosup, H. Li, M. Jan, S. Anoep, C. Dumitrescu, L. Wolters, D. Epema, The Grid Workloads Archive, FGCS 24, 672—686, 2008.
2.1. The Grid Workloads Archive [3/3]
Approach: GWF Example
[Example GWF record, showing fields: Submit time, Wait [s], Run time, #CPUs used, Mem [KB], #CPUs requested]
A. Iosup, H. Li, M. Jan, S. Anoep, C. Dumitrescu, L. Wolters,
D. Epema, The Grid Workloads Archive, FGCS 24, 672—686, 2008.
2.2. The Failure Trace Archive
http://fta.inria.fr
• Types of systems
  • (Desktop) Grids
  • DNS servers
  • HPC Clusters
  • P2P systems
• Stats
  • 25 traces
  • 100,000 nodes
  • Decades of operation
2.3. The Cloud Workloads Archive [1/2]
One Format Fits Them All
• Flat format: CWJ, CWJD, CWT, CWTD
• Job and Tasks
• Summary (20 unique data fields) and Detail (60 fields)
• Categories of information
• Shared with GWA, PWA: Time, Disk, Memory, Net
• Jobs/Tasks that change resource consumption profile
• MapReduce-specific (two-thirds of the data fields)
A. Iosup, R. Griffith, A. Konwinski, M. Zaharia, A. Ghodsi, I.
Stoica, Data Format for the Cloud Workloads Archive, v.3, 13/07/10
2.3. The Cloud Workloads Archive [2/2]
The Cloud Workloads Archive
• Looking for invariants
• Wr [%] is ~40% of Total IO, but absolute values vary
Trace ID | Total IO [MB] | Rd. [MB] | Wr [%] | HDFS Wr [MB]
CWA-01   | 10,934        | 6,805    | 38%    | 1,538
CWA-02   | 75,546        | 47,539   | 37%    | 8,563
• #Tasks/Job and the ratio M:(M+R) of tasks vary
• Understanding workload evolution
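The Wr [%] invariant above is plain arithmetic over the table columns; a quick sketch that recomputes it from the Total IO and Rd. values:

```python
# Quick check of the invariant above: Wr [%] = (Total IO - Rd) / Total IO.
# Numbers are the CWA-01 and CWA-02 rows from the table.
for trace, total_io, rd in [("CWA-01", 10_934, 6_805), ("CWA-02", 75_546, 47_539)]:
    wr_pct = 100 * (total_io - rd) / total_io
    print(f"{trace}: writes = {wr_pct:.0f}% of total IO")  # ~38% and ~37%
```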
Outline
1. Introduction and Motivation
2. Q1: Exchange Data
1. The Grid Workloads Archive
2. The Failure Trace Archive
3. The Cloud Workloads Archive (?)
3. Q2: System Characteristics
1. Grid Workloads
2. Grid Infrastructure
4. Q3: System Testing and Evaluation
3.1. Grid Workloads [1/7]
Analysis Summary: Grid workloads differ, e.g., from parallel production envs. (HPC)
• Traces: LCG, Grid3, TeraGrid, and DAS
• long traces (6+ months), active environments (500+K jobs per trace, 100s
of users), >4 million jobs
• Analysis
• System-wide, VO, group, user characteristics
• Environment, user evolution
• System performance
• Selected findings
• Almost no parallel jobs
• Top 2-5 groups/users dominate the workloads
• Performance problems: high job wait time, high failure rates
A. Iosup, C. Dumitrescu, D.H.J. Epema, H. Li, L. Wolters,
How are Real Grids Used? The Analysis of Four Grid Traces
and Its Implications, Grid 2006.
3.1. Grid Workloads [2/7]
Analysis Summary:
Grids vs. Parallel Production Systems
• Similar CPUTime/Year, 5x larger arrival bursts
[Figure: job arrival rates, grids vs. parallel production environments (large clusters, supercomputers); LCG cluster daily peak: 22.5k jobs]
A. Iosup, D.H.J. Epema, C. Franke, A. Papaspyrou, L. Schley,
B. Song, R. Yahyapour, On Grid Performance Evaluation using
Synthetic Workloads, JSSPP’06.
3.1. Grid Workloads [3/7]
More Analysis: Special Workload Components
Bags-of-Tasks (BoTs)
Workflows (WFs)
[Figure: timelines of BoT and WF submissions]
BoT = set of jobs that start at most Δs after the first job
Parameter Sweep App. = BoT with same binary
WF = set of jobs with precedence (think Directed Acyclic Graph)
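A minimal sketch of the BoT rule above: group submissions into batches where every job starts at most Δ seconds after the first job of its batch. Δ = 120s here, matching the next slide; the timestamps are illustrative.

```python
# Sketch of the BoT definition above: jobs belong to the same batch if they
# start at most DELTA seconds after the first job of the batch.
DELTA = 120  # [s], the time parameter from the definition above

def group_into_bots(submit_times: list[int], delta: int = DELTA) -> list[list[int]]:
    bots, current = [], []
    for t in sorted(submit_times):
        if current and t - current[0] > delta:  # too late for this batch
            bots.append(current)
            current = []
        current.append(t)
    if current:
        bots.append(current)
    return bots

# Three submissions within 120s form one BoT; the fourth starts a new one.
print(group_into_bots([0, 50, 110, 400]))  # [[0, 50, 110], [400]]
```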
3.1. Grid Workloads [4/7]
BoTs are predominant in grids
• Selected Findings
• Batches predominant in grid workloads; up to 96% CPUTime
            | Grid’5000   | NorduGrid     | GLOW (Condor)
Submissions | 26k         | 50k           | 13k
Jobs        | 808k (951k) | 738k (781k)   | 205k (216k)
CPU time    | 193y (651y) | 2192y (2443y) | 53y (55y)
• Average batch size (Δ≤120s) is 15-30 (500 max)
• 75% of the batches are sized 20 jobs or less
A. Iosup, M. Jan, O. Sonmez, and D.H.J. Epema, The Characteristics and Performance of Groups of Jobs in Grids, Euro-Par, LNCS vol. 4641, pp. 382-393, 2007.
3.1. Grid Workloads [5/7]
Workflows exist, but they seem small
• Traces
• Selected Findings
  • Loose coupling
  • Graph with 3-4 levels
  • Average WF size is 30/44 jobs
  • 75%+ WFs are sized 40 jobs or less, 95% are sized 200 jobs or less
S. Ostermann, A. Iosup, R. Prodan, D.H.J. Epema, and T. Fahringer, On the Characteristics of Grid Workflows, CoreGRID Integrated Research in Grid Computing (CGIW), 2008.
3.1. Grid Workloads [6/7]
Modeling Grid Workloads: Feitelson adapted
• Adapted to grids: percentage parallel jobs, other values.
• Validated with 4 grid and 7 parallel production env. traces
A. Iosup, D.H.J. Epema, T. Tannenbaum, M. Farrellee, and M. Livny, Inter-Operating Grids Through Delegated MatchMaking, ACM/IEEE Conference on High Performance Networking and Computing (SC), pp. 13-21, 2007.
3.1. Grid Workloads [7/7]
Modeling Grid Workloads: adding users, BoTs
• Single arrival process for both BoTs and parallel jobs
• Reduce over-fitting and complexity of “Feitelson adapted”
by removing the RunTime-Parallelism correlated model
• Validated with 7 grid workloads
A. Iosup, O. Sonmez, S. Anoep, and D.H.J. Epema, The Performance of Bags-of-Tasks in Large-Scale Distributed Systems, HPDC, pp. 97-108, 2008.
3.2. Grid Infrastructure [1/5]
Existing resource models and data
• Compute Resources
  • Commodity clusters [Kee et al., SC’04]
  • Desktop grid resource availability [Kondo et al., FGCS’07]
  • Static! Resource dynamics, evolution, … NOT considered
• Network Resources
  • Structural generators: GT-ITM [Zegura et al., 1997]
  • Degree-based generators: BRITE [Medina et al., 2001]
• Storage Resources, other resources
  • ?
Source: H. Casanova
3.2. Grid Infrastructure [2/5]
Resource dynamics in cluster-based grids
• Environment: Grid’5000 traces
• jobs 05/2004-11/2006 (30 mo., 950K jobs)
• resource availability traces 05/2005-11/2006 (18 mo., 600K events)
• Resource availability model for multi-cluster grids
Grid-level availability: 70%
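As a sketch of what the grid-level number means: availability can be computed as the fraction of observed node-time that resources were up. The intervals below are illustrative, not Grid'5000 data.

```python
# Sketch: grid-level availability as the fraction of observed time that
# resources were up, given per-node (up_start, up_end) intervals.
def availability(up_intervals: list[tuple[float, float]], horizon: float) -> float:
    up_time = sum(end - start for start, end in up_intervals)
    return up_time / horizon

# One node observed for 100h, up for 70h in two stretches -> 0.70
print(availability([(0, 40), (60, 90)], horizon=100))
```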
A. Iosup, M. Jan, O. Sonmez, and D.H.J. Epema, On the
Dynamic Resource Availability in Grids, Grid 2007, Sep 2007.
3.2. Grid Infrastructure [3/5]
Correlated Failures
• Correlated failure, with time parameter Δ: a maximal set of failures, ordered by increasing event time, in which for any two successive failures E and F, ts(F) − ts(E) ≤ Δ, where ts(·) returns the timestamp of the event; Δ = 1-3600s.
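A minimal sketch of this definition: partition failure events into maximal groups in which successive events are at most Δ apart. The timestamps are illustrative.

```python
# Sketch of the definition above: maximal groups of failure events in which
# any two successive events are at most DELTA seconds apart.
def correlated_failures(timestamps: list[float], delta: float) -> list[list[float]]:
    groups, current = [], []
    for ts in sorted(timestamps):
        if current and ts - current[-1] > delta:  # gap ends the group
            groups.append(current)
            current = []
        current.append(ts)
    if current:
        groups.append(current)
    return groups

# With DELTA=60s: events at t=0,30,55 form one correlated failure; t=500 another.
print(correlated_failures([0, 30, 55, 500], delta=60))  # [[0, 30, 55], [500]]
```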
• Grid-level view (size of correlated failures)
  • Range: 1-339
  • Average: 11
• Cluster span
  • Range: 1-3
  • Average: 1.06
  • Failures “stay” within a cluster
[Figure: CDF of the size of correlated failures, grid-level view, average marked]
A. Iosup, M. Jan, O. Sonmez, and D.H.J. Epema, On the Dynamic Resource Availability in Grids, Grid 2007, Sep 2007.
3.2. Grid Infrastructure [4/5]
Dynamics Model
[Model components: MTBF, MTTR, failure correlation]
• Assume no correlation of failure occurrence between clusters
• Which site/cluster?
• fs, fraction of failures at cluster s
• Weibull distribution for failure interarrival times (IAT)
• Shape parameter > 1: increasing hazard rate; the longer a node is online, the higher the chance that it will fail
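A sketch of why shape > 1 implies an increasing hazard rate, using the Weibull hazard h(t) = (k/lam)(t/lam)^(k-1); the parameter values are illustrative, not the fitted Grid'5000 values.

```python
# Sketch: for Weibull shape k > 1, the hazard rate h(t) grows with t, i.e.,
# the longer a node has been online, the likelier it is to fail soon.
# Parameter values are illustrative, not fitted to the Grid'5000 data.
import random

K, LAM = 1.5, 3600.0  # shape > 1, scale [s]

def weibull_hazard(t: float, k: float = K, lam: float = LAM) -> float:
    return (k / lam) * (t / lam) ** (k - 1)

print(weibull_hazard(600) < weibull_hazard(6000))  # True: hazard increases

# Sampling failure interarrival times from the same distribution:
iat = [random.weibullvariate(LAM, K) for _ in range(5)]
```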
A. Iosup, M. Jan, O. Sonmez, and D.H.J. Epema, On the
Dynamic Resource Availability in Grids, Grid 2007, Sep 2007.
3.2. Grid Infrastructure [5/5]
Evolution Model
A. Iosup, O. Sonmez, and D. Epema, DGSim: Comparing Grid Resource Management Architectures through Trace-Based Simulation, Euro-Par 2008.
Q1,Q2: What are the characteristics of grids
(e.g., infrastructure, workload)?
• Grid workloads very different from those of other
systems, e.g., parallel production envs. (large clusters,
supercomputers)
• Batches of jobs are predominant [Euro-Par’07, HPDC’08]
• Almost no parallel jobs [Grid’06]
• Workload model [SC’07, HPDC’08]
• Clouds? (upcoming)
• Grid resources are not static
• Resource dynamics model [Grid’07]
• Resource evolution model [EuroPar’08]
• Clouds? [CCGrid’11]
• Archives: easy to share traces and associated research
http://gwa.ewi.tudelft.nl
Outline
1. Introduction and Motivation
2. Q1: Exchange Data
1. The Grid Workloads Archive
2. The Failure Trace Archive
3. The Cloud Workloads Archive (?)
3. Q2: System Characteristics
1. Grid Workloads
2. Grid Infrastructure
4. Q3: System Testing and Evaluation
4.1. GrenchMark: Testing in LSDCSs
Analyzing, Testing, and Comparing Systems
• Use cases for automatically analyzing, testing, and
comparing systems (or middleware)
• Functionality testing and system tuning
• Performance testing/analysis of applications
• Reliability testing of middleware
• …
• For grids and clouds, this problem is difficult!
  • Testing in real environments is difficult/costly/both
  • Grids/clouds change rapidly
  • Validity and reproducibility of tests
  • …
4.1. GrenchMark: Testing LSDCSs
Architecture Overview
GrenchMark = Grid Benchmark
4.1. GrenchMark: Testing LSDCSs
Testing a Large-Scale Environment (1/2)
• Testing a 1500-processors Condor environment
• Workloads of 1000 jobs, grouped by 2, 10, 20, 50, 100, 200
• Test finishes 1h after the last submission
• Results
• >150,000 jobs submitted
• >100,000 jobs successfully run, >2 yr CPU time in 1 week
• 5% jobs failed (much less than other grids’ average)
• 25% jobs did not start in time and were cancelled
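A hypothetical sketch of how such grouped test workloads can be generated, i.e., 1000 jobs split into batches of a given size; this is not the actual GrenchMark workload generator.

```python
# Hypothetical sketch: split a 1000-job test workload into batches of a
# given size, as in the setup above (not the GrenchMark generator itself).
def grouped_workload(total_jobs: int, group_size: int) -> list[list[int]]:
    jobs = list(range(total_jobs))
    return [jobs[i:i + group_size] for i in range(0, total_jobs, group_size)]

for size in (2, 10, 20, 50, 100, 200):
    batches = grouped_workload(1000, size)
    assert sum(len(b) for b in batches) == 1000
    print(f"group size {size:3d}: {len(batches)} batches")
```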
4.1. GrenchMark: Testing LSDCSs
Testing a Large-Scale Environment (2/2)
• Performance metrics: system-, job-, operational-, application-, and service-level
4.1. GrenchMark: Testing in LSDCSs
ServMark: Scalable GrenchMark
[Figure: ServMark = DiPerF + GrenchMark]
• Blending DiPerF and GrenchMark.
• Tackles two orthogonal issues:
• Multi-sourced testing
(multi-user scenarios, scalability)
• Generate and run dynamic test
workloads with complex structure
(real-world scenarios, flexibility)
• Adds
• Coordination and automation layers
• Fault tolerance module
Performance Evaluation of Clouds [1/3]
C-Meter: Cloud-Oriented GrenchMark
Yigitbasi et al.: C-Meter: A Framework for
Performance Analysis of Computing Clouds.
Proc. of CCGRID 2009
Performance Evaluation of Clouds [2/3]
Low Performance for Sci.Comp.
• Evaluated the performance of resources from four
production, commercial clouds.
• GrenchMark for evaluating the performance of cloud resources
• C-Meter for complex workloads
• Four production, commercial IaaS clouds: Amazon Elastic
Compute Cloud (EC2), Mosso, Elastic Hosts, and GoGrid.
• Finding: cloud performance low for sci.comp.
S. Ostermann et al., A Performance Analysis of EC2 Cloud
Computing Services for Scientific Computing, Cloudcomp 2009,
LNICST 34, pp.115–131, 2010.
A. Iosup et al., Performance Analysis of Cloud Computing Services
for Many-Tasks Scientific Computing, IEEE TPDS, vol.22(6), 2011.
Performance Evaluation of Clouds [3/3]
Cloud Performance Variability
• Long-term performance variability of production cloud services
• IaaS: Amazon Web Services
• PaaS: Google App Engine
[Figure: year-long performance of Amazon S3, GET US HI operations]
• Year-long performance information for nine services
• Finding: about half of the cloud services investigated exhibit yearly and daily patterns; the impact of performance variability depends on the application.
A. Iosup, N. Yigitbasi, and D. Epema, On the Performance
Variability of Production Cloud Services, CCGrid 2011.
4.2. DGSim: Simulating Multi-Cluster Grids
Goal and Challenges
• Simulate various grid resource management architectures
• Multi-cluster grids
• Grids of grids (THE grid)
• Challenges
  • Many types of architectures
  • Generating and replaying grid workloads
  • Management of simulations
    • Many repetitions of a simulation for statistical relevance
    • Simulations with many parameters
    • Managing results (e.g., analysis tools)
    • Enabling collaborative experiments
[Figure: two GRM architectures, as simulated by DGSim]
4.2. DGSim: Simulating Multi-Cluster Grids
Overview
[Figure: DGSim overview, built around a discrete-event simulator]
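At its core, such a simulator processes a time-ordered event queue until it is empty. A minimal sketch of that engine follows; it is not DGSim's actual implementation.

```python
# Minimal discrete-event simulation core (a sketch, not DGSim itself):
# an event queue ordered by time, processed until empty.
import heapq
from typing import Callable

class Simulator:
    def __init__(self) -> None:
        self.now = 0.0
        self._queue: list[tuple[float, int, Callable[[], None]]] = []
        self._seq = 0  # tie-breaker for events scheduled at the same time

    def schedule(self, delay: float, action: Callable[[], None]) -> None:
        heapq.heappush(self._queue, (self.now + delay, self._seq, action))
        self._seq += 1

    def run(self) -> None:
        while self._queue:
            self.now, _, action = heapq.heappop(self._queue)
            action()  # an action may schedule further events

# Example: a job arrives at t=5 and finishes 600s later.
sim = Simulator()
sim.schedule(5, lambda: sim.schedule(600, lambda: print(f"job done at t={sim.now}")))
sim.run()  # prints: job done at t=605.0
```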
4.2. DGSim: Simulating Multi-Cluster Grids
Simulated Architectures (Sep 2007)
• Independent
• Centralized
• Hierarchical
• Decentralized
• Hybrid hierarchical/decentralized
[Figure: the simulated architectures, in DGSim]
A. Iosup, D.H.J. Epema, T. Tannenbaum, M. Farrellee, M. Livny, Inter-Operating Grids through Delegated MatchMaking, SC, 2007.
Q3: How to test and evaluate grids/clouds?
• GrenchMark+C-Meter: testing large-scale distrib. sys.
• Framework
• Testing in real environments: performance, reliability, functionality
• Uniform process: metrics, workloads
• Real tools available: grenchmark.st.ewi.tudelft.nl, dev.globus.org/wiki/Incubator/ServMark
• DGSim: simulating multi-cluster grids
• Many types of architectures
• Generating and replaying grid workloads
• Management of the simulations
Take Home Message: Research Toolbox
• Understanding how real systems work
• Modeling workloads and infrastructure
• Compare grids and clouds with other platforms (parallel production env.,…)
• The Archives: easy to share system traces and associated research
• Grid Workloads Archive
• Failure Trace Archive
• Cloud Workloads Archive (upcoming)
• Testing/Evaluating Grids/Clouds
• GrenchMark
• ServMark: Scalable GrenchMark
• C-Meter: Cloud-oriented GrenchMark
• DGSim: Simulating Grids (and Clouds?)
Publications
2006: Grid, CCGrid, JSSPP
2007: SC, Grid, CCGrid, …
2008: HPDC, SC, Grid, …
2009: HPDC, CCGrid, …
2010: HPDC, CCGrid (Best Paper Award), EuroPar, …
2011: IEEE TPDS, IEEE Internet Computing, CCGrid, …
Thank you for your attention!
Questions? Suggestions? Observations?
More Info:
- http://www.st.ewi.tudelft.nl/~iosup/research.html
- http://www.st.ewi.tudelft.nl/~iosup/research_gaming.html
- http://www.st.ewi.tudelft.nl/~iosup/research_cloud.html
Do not hesitate to contact me…
Alexandru Iosup
[email protected]
http://www.pds.ewi.tudelft.nl/~iosup/ (or google “iosup”)
Parallel and Distributed Systems Group
Delft University of Technology