
Application-driven Energy-efficient
Architecture Explorations for Big Data
Authors:
Xiaoyan Gu
Rui Hou
Ke Zhang
Lixin Zhang
Weiping Wang
(Institute of Computing Technology,
Chinese Academy of Sciences)
Reviewed by:
Siddharth Bhave
(University of Washington, Tacoma)
Big Data
 What is Big Data?
 Problems with Big data
 Energy Consumption
 Velocity (Operation latency and throughput)
 Volume (storing capacity)
 Variety
 Managing Big Data Problems
 Storage Technologies
 Partitioning
 Multithreading
 Parallel Processing
 Efficient Architecture
 Hadoop, MapReduce, Mahout
 Find the bottleneck
Introduction
 Big data management at architecture level
 Two architecture systems
 Xeon-based cluster
 Atom-based (micro-server) cluster
 Comparison based on:
 Energy consumption
 Execution time
Motivation
 Ever-increasing data volumes
 Energy and time tradeoff between Xeon-based and Atom-based clusters
 Bottleneck caused by compression/decompression
 Stateless data processing
Mastiff
 Mastiff - Targeted application for performance analysis
 Big data processing engine
 Columnar store policy
Compression ratio by data size:

                 3 GB data    100 GB data    500 GB data
Mastiff          0.54         0.53           0.518
Hadoop HDFS      0.72         0.71           0.7
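To make the comparison concrete, the compression ratios listed above can be turned into a relative storage footprint. This is only a sketch: it assumes the ratio means compressed size divided by original size (lower is better), which the slides do not state, and the names `mastiff`, `hdfs`, and `relative_size` are made up for illustration.

```python
# Compression ratios copied from the slide, keyed by data size in GB.
# Assumption: ratio = compressed size / original size (lower is better).
mastiff = {3: 0.54, 100: 0.53, 500: 0.518}
hdfs = {3: 0.72, 100: 0.71, 500: 0.70}


def relative_size(size_gb):
    """Mastiff's stored size as a fraction of the Hadoop HDFS stored size."""
    return mastiff[size_gb] / hdfs[size_gb]


for size_gb in sorted(mastiff):
    print(f"{size_gb} GB: Mastiff stores {relative_size(size_gb):.2f}x the bytes of HDFS")
```

Under that assumption, Mastiff needs roughly three quarters of the space Hadoop HDFS needs at every data size shown.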
Workflow of Mastiff
Methodology
 TPC-H benchmark: queries and concurrent data modifications
 1 TB of verification data
 2 cases: data load and data query
 Power measured with a Fluke NORMA 4000 power analyzer
 Average cases and median results are reported
Power and Performance Evaluation
 3 cases compared for time and energy consumption
 31 nodes – Atom Cluster (1 master node)
 31 nodes – Xeon Cluster (1 master node)
 16 nodes – Xeon Cluster (1 master node)
Execution time   Atom cluster    Xeon cluster    Xeon cluster
                 (30 nodes)      (30 nodes)      (15 nodes)
Data load        3.435 hours     1.543 hours     3.242 hours
Data query       5.877 hours     2.724 hours     5.564 hours
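The energy and time tradeoff can be worked through from the execution times above. Energy is average power multiplied by time, so a slower cluster uses less energy only if its average power is lower by more than its slowdown factor. The sketch below computes that slowdown and the break-even power ratio; the measured power figures are not in this transcript, so none are assumed, and the names `hours`, `slowdown`, and `break_even_power_ratio` are illustrative.

```python
# Execution times in hours, copied from the slide.
hours = {
    "atom_30": {"load": 3.435, "query": 5.877},
    "xeon_30": {"load": 1.543, "query": 2.724},
    "xeon_15": {"load": 3.242, "query": 5.564},
}


def slowdown(slow, fast, phase):
    """How many times longer the `slow` cluster takes than the `fast` one."""
    return hours[slow][phase] / hours[fast][phase]


def break_even_power_ratio(slow, fast, phase):
    """Largest (slow power / fast power) at which `slow` uses no more energy.

    From E = P * t: P_slow * t_slow <= P_fast * t_fast
    exactly when P_slow / P_fast <= t_fast / t_slow.
    """
    return 1.0 / slowdown(slow, fast, phase)


for xeon in ("xeon_30", "xeon_15"):
    for phase in ("load", "query"):
        s = slowdown("atom_30", xeon, phase)
        r = break_even_power_ratio("atom_30", xeon, phase)
        print(f"atom_30 vs {xeon}, {phase}: {s:.2f}x slower, "
              f"saves energy if power ratio < {r:.2f}")
```

With these times, the 30-node Atom cluster is roughly 2.2x slower than the 30-node Xeon cluster and about 6% slower than the 15-node Xeon cluster.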
Power and Performance Evaluation (cont’d)
Energy consumption between 30-node Atom Cluster and 30-node
Xeon Cluster
Power and Performance Evaluation (cont’d)
Energy consumption between 30-node Atom Cluster and 15-node
Xeon Cluster
Power and Performance Evaluation (cont’d)
Time Breakdown in Map Phase
Power and Performance Evaluation (cont’d)
Time Breakdown in Reduce phase
Findings
 Atom platform is more power-efficient
 Data compression and decompression occupy a significant
percentage of the execution time
 Compression and decompression can be done in software-pipelined
fashion, i.e. with multiple interleaved tasks
Propositions
 Heterogeneous architecture
 Accelerators to perform data compression/decompression
 Multiple interleaved compression/decompression
Off-chip and On-chip Accelerators
Multiple Interleaved Tasks
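The idea of running compression as one stage of a software pipeline, interleaved with other work, can be sketched in Python. This is a minimal illustration, not the authors' implementation: it uses threads, bounded queues, and zlib as stand-ins, and the names `produce`, `compress_stage`, and `run_pipeline` are hypothetical.

```python
import queue
import threading
import zlib

SENTINEL = None  # marks the end of the stream between stages


def produce(chunks, out_q):
    """Stage 1: hand raw chunks to the next stage."""
    for chunk in chunks:
        out_q.put(chunk)
    out_q.put(SENTINEL)


def compress_stage(in_q, out_q):
    """Stage 2: compress each chunk while stage 1 keeps producing."""
    while True:
        chunk = in_q.get()
        if chunk is SENTINEL:
            out_q.put(SENTINEL)
            return
        out_q.put(zlib.compress(chunk))


def run_pipeline(chunks, depth=4):
    """Run both stages concurrently and collect compressed chunks in order."""
    raw_q = queue.Queue(maxsize=depth)     # bounded, so stages stay interleaved
    packed_q = queue.Queue(maxsize=depth)
    workers = [
        threading.Thread(target=produce, args=(chunks, raw_q)),
        threading.Thread(target=compress_stage, args=(raw_q, packed_q)),
    ]
    for w in workers:
        w.start()
    packed = []
    while True:  # stage 3: consume results (a real system would write them out)
        item = packed_q.get()
        if item is SENTINEL:
            break
        packed.append(item)
    for w in workers:
        w.join()
    return packed
```

Each stage has a single worker and the queues are FIFO, so chunk order is preserved. A hardware accelerator, as proposed in the slides, would take the place of the `compress_stage` thread.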
Strengths
 A much needed innovative concept
 Organized well
 Detailed description of energy and time investigation
 Already implemented propositions
Weaknesses
 Not enough power meters to monitor all nodes
 2 assumptions:
 Power of every network router is divided evenly among the
nodes
 Energy consumption of each node is similar
 Results are generalized from Hadoop even though they might not
hold for every application
 Vague implementation of the propositions
FAWN: A Fast Array of Wimpy Nodes
Authors:
David G. Andersen
Jason Franklin
Michael Kaminsky
Amar Phanishayee
Lawrence Tan
Vijay Vasudevan
(Carnegie Mellon University)
Introduction
 High-performance, energy-efficient system for storage
 Large number of small, low-performance (hence wimpy)
nodes with moderate amounts of local storage
 2 parts: FAWN-DS (data store) and FAWN-KV (key-value)
 Motivation
 Traditional architecture consumes too much power
 I/O bottleneck due to the limitations of current storage
Features
 Pairs low-power embedded nodes with flash storage
 FAWN-DS is the backend that consists of the large number of nodes
 Each node has some RAM and flash
 FAWN-KV is a consistent, replicated, highly available and high-
performance key-value storage system
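How a node might combine a small amount of RAM with flash can be sketched as an append-only log plus an in-memory index. The slides do not describe the FAWN-DS internals, so this layout is an assumption drawn from the general log-structured design associated with FAWN-DS; the `LogStore` class is hypothetical and an in-memory buffer stands in for flash.

```python
import io


class LogStore:
    """Toy per-node store: values appended to a log, index kept in RAM."""

    def __init__(self, log=None):
        # io.BytesIO stands in for the flash device (assumption for the sketch).
        self.log = log if log is not None else io.BytesIO()
        self.index = {}  # key -> (offset, length), held in memory

    def put(self, key, value):
        """Append the value to the end of the log and point the index at it."""
        self.log.seek(0, io.SEEK_END)
        offset = self.log.tell()
        self.log.write(value)
        self.index[key] = (offset, len(value))

    def get(self, key):
        """Look up the offset in RAM, then do a single read from the log."""
        entry = self.index.get(key)
        if entry is None:
            return None
        offset, length = entry
        self.log.seek(offset)
        return self.log.read(length)
```

Writes are always sequential appends, and an overwritten key leaves its old value behind in the log until some cleanup pass reclaims it. That cleanup is omitted here.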
FAWN Architecture
Efficient Data Streaming with On-chip
Accelerators: Opportunities and
Challenges
Authors:
Rui Hou
Lixin Zhang
Michael C. Huang
Kun Wang
Hubertus Franke
Yi Ge
Xiaotao Chang
(University of Rochester)
Motivation
 Transistor density keeps increasing
 Many cores are integrated on a single die
 Advantages of an on-chip accelerator over one attached via PCI
On-Chip Accelerator Architecture
Features
 3 types of accelerators
 Crypto accelerators
 Decompression accelerators
 Network offload accelerator
 Data streams in the 3 accelerators share some common characteristics
 Optimize the power and performance of the accelerators.
Thank You