Hadoop System Simulation with Mumak
Fei Dong, Tianyu Feng, Hong Zhang

Dec 8, 2010
Agenda
• Objective
• Comparison between MRPerf and Mumak
• Modifications to Mumak
• Results and discussion
• Conclusion
Objective
• A large-scale distributed system has an enormous number of parameters.
• The running time of a user program depends nonlinearly on these parameters.
• Goal: predict the running time under various settings to help the user choose the “optimal” setting.
• We start by varying the most basic parameter: cluster size.
MRPerf and Mumak
• MRPerf
– Built on top of a network simulator
– Calculates the task running time and network delay from physical parameters
– Reimplements the Hadoop system in Tcl
– Flexible in what it can simulate
MRPerf and Mumak
[Figure: running time vs. map slots per node and reduce slots per node; 4-node double-rack data center, chunk size = 64 MB, simulated by MRPerf]
MRPerf and Mumak
[Figure: 4 nodes, chunk size = 64 MB, simulated by Mumak]
MRPerf and Mumak
• Mumak
– Inherits the JobTracker class from Hadoop and defines only the simulation interface
– Uses a trace file to build the cluster topology and job story, then feeds them into the simulator
– Can only reproduce previously finished experiments
– Designed to verify/debug the Hadoop system design
– Simulates only the Map/Reduce tasks; the sort and shuffle phases are not simulated
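To make the trace-replay idea concrete, here is a minimal sketch of how logged task durations could be replayed onto a cluster of a given size. It is a toy stand-in, not Mumak's scheduler: the function name `replay_makespan_ms`, the greedy earliest-free-slot policy, and the default of two slots per node are all assumptions made for illustration.

```python
import heapq

def replay_makespan_ms(task_durations_ms, n_nodes, slots_per_node=2):
    """Replay logged task durations on n_nodes * slots_per_node slots.

    Each task is placed on whichever slot becomes free first (an assumed
    greedy policy). Returns the time at which the last task finishes.
    """
    # Min-heap of the times at which each slot becomes free.
    slots = [0.0] * (n_nodes * slots_per_node)
    heapq.heapify(slots)
    finish = 0.0
    for duration in task_durations_ms:
        start = heapq.heappop(slots)      # earliest available slot
        end = start + duration
        heapq.heappush(slots, end)
        finish = max(finish, end)
    return finish
```

Replaying the same list of durations with a different `n_nodes` gives a rough feel for how a trace-driven estimate changes with cluster size, while ignoring locality, sort, and shuffle.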
MRPerf and Mumak
• The approach taken by MRPerf is better
– Takes parameters as input to estimate running time
– Can make predictions
• MRPerf simulates its own reimplementation of Hadoop
• The design of Mumak is better
– Reuses source code from Hadoop
– Easy to understand and to extend
• We decided to take the good parts of MRPerf and implement them in the Mumak framework
– Modify the Rumen log to change the parameters
– Modify the Mumak source code to add a network simulator
Implementation
• Simulate a different cluster size
– Edit the Rumen log to change the data replication factor / locality
– Modify the topology by adding or deleting nodes, for example going from 2 slave nodes to 6 slave nodes
– The job tracker will then assign the tasks to different nodes
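The topology edit described above can be sketched as follows. This assumes the topology has been loaded into nested dictionaries with `name` and `children` keys (root, then racks, then hosts); that shape, the helper name `resize_rack`, and the generated host names are illustrative assumptions, not something taken from the slides.

```python
import copy

def resize_rack(topology, rack_name, n_nodes, prefix="node"):
    """Return a copy of `topology` in which rack `rack_name` has n_nodes hosts.

    Surplus hosts are dropped from the end of the rack; missing hosts are
    added with hypothetical names prefix0, prefix1, ... that do not clash
    with the hosts already present.
    """
    resized = copy.deepcopy(topology)     # leave the original untouched
    for rack in resized["children"]:
        if rack["name"] != rack_name:
            continue
        hosts = rack["children"][:n_nodes]
        existing = {host["name"] for host in hosts}
        i = 0
        while len(hosts) < n_nodes:
            name = f"{prefix}{i}"
            if name not in existing:
                hosts.append({"name": name, "children": None})
                existing.add(name)
            i += 1
        rack["children"] = hosts
        return resized
    raise KeyError(f"rack not found: {rack_name}")
```

For example, calling `resize_rack(topology, "rack0", 6)` on a rack that holds two hosts returns a six-host copy. Changing the topology alone does not update the locality information recorded in the trace, which is why the slide lists the log edit as a separate step.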
Implementation
• Simulate network delay
– We defined a simple network simulator interface
– We modified the Mumak source code to add in the network delay
– In practice, the network delay can be ignored
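A simple network-delay interface of the kind mentioned above might look like the sketch below. The slides do not show the actual interface, so everything here is hypothetical: the class `SimpleNetworkModel`, the latency-plus-bandwidth formula, and the default figures (about 1 Gbit/s and 0.5 ms) are assumptions chosen only for illustration.

```python
from dataclasses import dataclass

@dataclass
class SimpleNetworkModel:
    # Assumed link parameters; 125_000 bytes/ms is roughly 1 Gbit/s.
    bandwidth_bytes_per_ms: float = 125_000.0
    latency_ms: float = 0.5

    def transfer_delay_ms(self, src, dst, n_bytes):
        """Delay for moving n_bytes from src to dst; zero for local reads."""
        if src == dst:
            return 0.0
        return self.latency_ms + n_bytes / self.bandwidth_bytes_per_ms

def task_runtime_ms(logged_runtime_ms, data_node, task_node, input_bytes, net):
    """Logged task runtime plus the modelled cost of fetching remote input."""
    return logged_runtime_ms + net.transfer_delay_ms(data_node, task_node, input_bytes)
```

Under these assumed figures, fetching one 64 MB chunk from another node adds roughly half a second to a task, which gives a sense of why the added delay could be small relative to total job time.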
Results and Discussion
[Figure: 4 nodes. Running time (ms, axis from 0 to 2,500,000) per job for job_0003 through job_0011; series: log, estimate, estimate_local]
Results and Discussion
[Figure: 6 nodes. Running time (ms, axis from 0 to 2,500,000) per job for job_0003 through job_0011; series: log, estimate, estimate_local]
Results and Discussion
• Limitations and future work
– Sort phase time is not included
– Only a single-rack topology was used
– Predictions are not always consistent for the same job with the same configuration
Conclusion
• Our objective is to predict the running time under different parameters
• We took the methods of MRPerf and implemented them in Mumak
• For more flexible and accurate prediction, further modifications to Mumak are needed
– Make it independent of the trace file
– Resolve the instability (inconsistent predictions)
Questions?