ElasticTree: Saving Energy in
Data Center Networks
Very offended by
KALYAN MANDA
LEI XIA
Facts about Network Elements
• Network elements consume only 5% of total power, not the 10-20%
stated in your paper. (Hmmm…)
• ElasticTree can save up to 40% of network energy, i.e. we have saved
1 billion kWh (best-case scenario).
• This is achieved by choosing a trade-off between performance and energy
consumption.
Some Maths to calculate the Profit
Ok, so we have just saved:
= 1 * 10^9 kWh * $0.088/kWh = $88000000.
Number of servers in datacenters in 2007:
= 11.8 million.
(http://royal.pingdom.com/2008/07/25/us-data-centers-consuming-as-much-power-as-5-million-houses/)
Approx. number of datacenters:
= 11.8 * 10^6 / 10000 (assuming 10000 servers per datacenter, a very generous assumption)
= 1180
Profit/Datacenter
• Profit/Datacenter
= 88000000 / 1180
= $74576.27/year
HURRAY!!!
Wait a Second….
• I will have to hire a network admin to configure ElasticTree
to satisfy the needs for performance and fault tolerance
(source: the paper).
• And since my datacenter will be running 24/7, I will have to hire
3 network admins, one per shift (ideally 6, with the other 3 as backup).
• And since I am a miser, I will be paying them only $60000 per year.
• So my total investment = 3 * 60000 = $180000/year
• So my total profit = 74576.27 - 180000 = $-105423.73/year
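The arithmetic on the last few slides can be replayed in a few lines. This is only a sketch using the slides' own figures (1 billion kWh saved, $0.088 per kWh, 11.8 million servers, 10000 servers per datacenter, three admins at $60000 each); the variable names are made up for illustration.

```python
# Replays the slides' back-of-the-envelope numbers.
# Every input below is taken from the slides; the names are illustrative.
ENERGY_SAVED_KWH = 1e9      # best-case energy saved
PRICE_PER_KWH = 0.088       # dollars per kWh
SERVERS = 11.8e6            # servers in datacenters in 2007
SERVERS_PER_DC = 10_000     # the slides' "very generous assumption"
ADMINS = 3                  # one network admin per shift
ADMIN_SALARY = 60_000       # dollars per year

total_savings = ENERGY_SAVED_KWH * PRICE_PER_KWH
datacenters = SERVERS / SERVERS_PER_DC
savings_per_dc = total_savings / datacenters
admin_cost = ADMINS * ADMIN_SALARY
net_per_dc = savings_per_dc - admin_cost

print(f"Datacenters:            {datacenters:,.0f}")
print(f"Savings per datacenter: ${savings_per_dc:,.2f}/year")
print(f"Admin cost:             ${admin_cost:,.2f}/year")
print(f"Net per datacenter:     ${net_per_dc:,.2f}/year")
```

Changing SERVERS_PER_DC or ADMINS shows how sensitive the conclusion is to those two assumptions.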
My Achievements
• I have degraded the performance of my
datacenter by choosing a trade-off
between energy and performance to
save 40% of network energy… assuming
the best-case scenario.
• I have made a profit of $-105423.73
• And, I got myself fired…..
Applications of ElasticTree
• Only to datacenters with Fat tree topology at
present.
• Cannot be applied to a 2N Tree topology.
• Future interest: to explore the applicability to other
topologies such as hypercube and butterfly
topologies.
Come on, guys…
• You based all of your conclusions on tests
performed on k = 4 and k = 6 fat trees with a maximum of
54 hosts.
• How hard could it be to simulate other topologies,
considering that you have not even tested on
a real datacenter?
• This also clearly indicates the lack of interest of
Stanford University, Deutsche Telekom and HP Labs
(your employers) in your work.
ElasticTree v/s Third Party Computing
• Will the ElasticTree approach be more profitable than
third-party computing?
• Wouldn’t it make more sense to outsource my
datacenter’s spare capacity to a third party rather
than shutting it down to save power costs?
• Considering that VMs are instantiated on
demand, will the optimizer algorithm be any good,
since it needs a lot of historical information?
ElasticTree v/s DDoS
• Wouldn’t a datacenter running ElasticTree
be an ideal target for a DDoS attack, since there
is no way to predict a DDoS attack even from
previous statistics?
Robustness Analysis
• A datacenter will use an MST-k configuration to
absorb k simultaneous faults.
• With an MST-k configuration our total energy
savings =
(5% * 40% * Total Power) / k
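Read as a formula, the slide divides the best-case saving by k. A small sketch, assuming the slides' 5% network share and 40% best-case saving; the function name is hypothetical, and the division by k is the slide's own estimate rather than a result from the paper.

```python
def mst_k_savings_fraction(k, network_share=0.05, best_case_saving=0.40):
    """Fraction of total datacenter power saved under the slide's formula.

    network_share and best_case_saving are the slides' 5% and 40% figures;
    dividing by k is the slide's own assumption about MST-k configurations.
    """
    if k < 1:
        raise ValueError("k must be at least 1")
    return network_share * best_case_saving / k

for k in (1, 2, 3):
    print(f"k={k}: {mst_k_savings_fraction(k):.2%} of total power")
```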
Testbed: Far from the real
• Switches: Virtual Switch?
▫ Power consumption, on/off
▫ Latency
▫ Configuration operability
• Topology: simplified fat tree (k=4)
▫ Real DCs: k>12~
▫ Other topologies, not only fat tree
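To put the gap between the testbed and a larger fat tree in numbers, here is a sketch that counts hosts and switches in a standard k-ary fat tree (k pods, k/2 edge and k/2 aggregation switches per pod, (k/2)^2 core switches, k^3/4 hosts). The function name is illustrative, and whether a given real DC follows this exact layout is an assumption.

```python
def fat_tree_size(k):
    """Return (hosts, switches) for a standard k-ary fat tree with even k."""
    if k < 2 or k % 2:
        raise ValueError("k must be an even integer >= 2")
    hosts = k ** 3 // 4
    edge = aggregation = k * (k // 2)   # k pods, k/2 switches of each kind per pod
    core = (k // 2) ** 2
    return hosts, edge + aggregation + core

for k in (4, 6, 12, 48):
    hosts, switches = fat_tree_size(k)
    print(f"k={k:>2}: {hosts:>6} hosts, {switches:>5} switches")
```

The k = 6 case gives the 54 hosts mentioned earlier in the slides.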
Testbed: Far from the real – cont’d
• Workload Traffic:
▫ Collected from a DC with only 290 servers
▫ Different fat tree configurations evaluated only in simulation.
▫ Traffic collected at very coarse granularity.
▪ 10-minute intervals
▪ Anything could happen during this interval in a real
DC (bursts, rises/drops)
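The granularity concern can be illustrated with a toy example. The per-minute utilization values below are hypothetical, invented only to show how a 10-minute average can hide a short burst; they are not taken from the paper or from any real trace.

```python
# Hypothetical per-minute link utilization (fraction of capacity) over one
# 10-minute collection interval: mostly idle, with a single one-minute burst.
per_minute = [0.10, 0.10, 0.10, 0.95, 0.10, 0.10, 0.10, 0.10, 0.10, 0.10]

interval_average = sum(per_minute) / len(per_minute)
peak = max(per_minute)

print(f"10-minute average: {interval_average:.3f}")
print(f"one-minute peak:   {peak:.2f}")
```

An optimizer that only sees the interval average would size the network for a load far below the peak in this made-up interval.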
System response time
• The algorithms themselves
▫ 3.5th order for the formal model
▫ 2nd order for greedy
▫ How much time will these take in a real system?
▫ What is the response time?
▫ None of these are given in the paper.
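A rough way to see why this matters: if "order" is read as the exponent of a polynomial in the number of hosts (an assumption, since the slide does not say what the order is measured in), the growth in running time from the 54-host testbed to a larger DC can be sketched as follows. The 10000-host target reuses the per-datacenter server count assumed earlier in the slides; the function name is illustrative.

```python
def scale_factor(n_small, n_large, exponent):
    """How many times longer a run takes if time grows as n ** exponent."""
    return (n_large / n_small) ** exponent

TESTBED_HOSTS = 54       # largest testbed mentioned on the slides
TARGET_HOSTS = 10_000    # servers per datacenter assumed earlier in the slides

for name, exponent in (("formal model", 3.5), ("greedy", 2.0)):
    factor = scale_factor(TESTBED_HOSTS, TARGET_HOSTS, exponent)
    print(f"{name}: about {factor:.3g}x the 54-host running time")
```

This gives only a relative factor; absolute response times would still need the measurements the slide asks for.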
Real Switch Latency
• Time to boot a real switch
▫ Up to 3 minutes?
▫ How do we handle a network traffic burst in the meantime?
▪ This latency is unbearable in most DCs
▫ The authors propose that we use possible fantasy
features of future switches, such as sleep
mode
How does the Control Module work in a real DC?
• How are the control signals sent out?
▫ In simulation, easy: it is all software
▫ In a real DC, how?
▪ By extra wires from the control machine to each switch?
▪ You argue that in a real DC one can use mechanisms
such as the command-line interface, SNMP, etc. to turn
switches on/off. Why not actually use them rather than just
mention them?
• OpenFlow & NOX
▫ Do widely used switches and DCs have these?
▫ How much effort is needed to port these?
Centralized Control Module? – not
a good idea
• Control module – single point of failure
▫ What is the potential harm of a control module
failure?
▪ If it goes down, the whole DC could malfunction or
stop
▫ What is the recovery process after a failure?
▪ Cost/time
Results – Not convinced
• What is the overall power saving for real DC
traffic?
• In Figures 15 and 16, what are the typical latency
and loss percentage for DCs without ElasticTree
applied?
▫ The authors claim the results in Figures 15 and 16
help DC operators choose the trade-off they
need.
▪ Actually not!!
▪ The statistics differ a lot for DCs with different
configurations and scales