MapReduce & Cloud
PengBo
Dec 6, 2010
MapReduce
Imperative Programming

In computer science, imperative programming
is a programming paradigm that describes
computation in terms of statements that change
a program state.
Declarative Programming

In computer science, declarative
programming is a programming paradigm that
expresses the logic of a computation without
describing its control flow.
Functional Language
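map f lst: ('a->'b) -> ('a list) -> ('b list)
Applies f to every element of the input list and outputs a new list.
fold f x0 lst: ('a*'b->'b) -> 'b -> ('a list) -> 'b
Applies f to every element of the input list together with an accumulator; f returns the next value of the accumulator.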
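[Figure: map applies f to each list element independently; fold applies f along the list, threading an accumulator from "initial" to "returned".]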
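The two signatures above can be sketched in Python. This is only an illustration of the semantics: the names `map_list` and `fold` are chosen here, and the argument order of `f` (element first, accumulator second) follows the 'a*'b->'b type on the slide.

```python
from typing import Callable, Iterable, List, TypeVar

A = TypeVar("A")
B = TypeVar("B")


def map_list(f: Callable[[A], B], lst: Iterable[A]) -> List[B]:
    """Apply f to every element and return a new list; the input is untouched."""
    return [f(x) for x in lst]


def fold(f: Callable[[A, B], B], x0: B, lst: Iterable[A]) -> B:
    """Thread an accumulator through the list: acc = f(element, acc)."""
    acc = x0
    for x in lst:
        acc = f(x, acc)
    return acc


squares = map_list(lambda x: x * x, [1, 2, 3, 4])
total = fold(lambda x, acc: x + acc, 0, squares)
print(squares, total)
```

Because `map_list` never looks at more than one element at a time, each call to `f` could run on a different machine; `fold` is sequential as written.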
From Functional Language View
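map f lst: ('a->'b) -> ('a list) -> ('b list)
Applies f to every element of the input list and outputs a new list.
fold f x0 lst: ('a*'b->'b) -> 'b -> ('a list) -> 'b
Applies f to every element of the input list together with an accumulator; f returns the next value of the accumulator.
Functional operations do not modify data; they always produce new data.
map and reduce are inherently parallel:
Map can be fully parallelized.
Reduce can run concurrently when f is associative (and out of order when f is also commutative).
[Figure: fold diagram, with f applied along the list from "initial" to "returned".]
Reduce ~ foldl : (a -> [a] -> a)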
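Why associativity matters can be shown with a small sketch. The helper `chunked_reduce` is a hypothetical name introduced here: it reduces each chunk on its own (work that could run on separate machines) and then combines the partial results. Note that Python's `functools.reduce` calls `f(accumulator, element)`, the reverse of the SML argument order.

```python
from functools import reduce
from operator import add


def chunked_reduce(f, identity, items, n_chunks):
    """Reduce each chunk independently, then combine the partial results.

    Gives the same answer as a plain left fold only if f is associative
    and `identity` is its identity element.
    """
    items = list(items)
    size = max(1, -(-len(items) // n_chunks))  # ceiling division
    chunks = [items[i:i + size] for i in range(0, len(items), size)]
    partials = [reduce(f, chunk, identity) for chunk in chunks]  # independent
    return reduce(f, partials, identity)


data = list(range(1, 101))
print(chunked_reduce(add, 0, data, 4), reduce(add, data, 0))

# Subtraction is not associative, so regrouping changes the answer.
sub = lambda a, b: a - b
print(chunked_reduce(sub, 0, data, 4), reduce(sub, data, 0))
```

The first line prints the same number twice; the second prints two different numbers, which is the failure mode a non-associative reduce function would hit in a parallel run.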
Example




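(* Each helper is a left fold: x is the current element, a the accumulator. *)
fun sum(lst) = foldl (fn (x,a) => x+a) 0 lst
fun mul(lst) = foldl (fn (x,a) => x*a) 1 lst
fun length(lst) = foldl (fn (x,a) => 1+a) 0 lst
(* foo comes last because SML needs the helpers defined before they are used. *)
fun foo(l: int list) = sum(l) + mul(l) + length(l)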
MapReduce is…

“MapReduce is a programming model and an
associated implementation for processing and
generating large data sets.”[1]
[1] J. Dean and S. Ghemawat, "MapReduce: Simplified Data
Processing on Large Clusters," in OSDI, 2004, pp. 137-150.
From Parallel Computing View

MapReduce is a parallel programming model
Homomorphic skeletons
f is a map operator
map f [] = []
map f (x:xs) = f x : map f xs
g is a reduce operator
reduce g y [] = y
reduce g y (x:xs) = reduce g (g y x) xs
The essence is a single function that executes in
parallel on independent data sets, with outputs
that are eventually combined to form a single or
small number of results.
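A minimal sketch of that idea, assuming a thread pool stands in for a cluster: `parallel_map_reduce` is a name made up for this example, `f` is the map operator and `g` the reduce operator from the definitions above. Threads in CPython do not speed up CPU-bound work, so this only illustrates the structure.

```python
from concurrent.futures import ThreadPoolExecutor
from functools import reduce


def parallel_map_reduce(f, g, y, data_sets, workers=4):
    """Run f on each independent data set concurrently, then combine with g."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        outputs = list(pool.map(f, data_sets))  # results keep input order
    return reduce(g, outputs, y)  # reduce g y outputs


shards = [[1, 2, 3], [4, 5], [6]]
print(parallel_map_reduce(sum, lambda acc, x: acc + x, 0, shards))
```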
MapReduce Framework
Typical problem solved by MapReduce


Read input: records formatted as key/value pairs
Map: extract something from each record
Shuffle: exchange and regroup the data
  Bring intermediate results with the same key together on the same node
Reduce: aggregate, summarize, filter, etc.
Write the output

map (in_key, in_value) -> list(out_key, intermediate_value)
  Processes one input key/value pair
  Emits intermediate key/value pairs
reduce (out_key, list(intermediate_value)) -> list(out_value)
  Merges all values for a given key and computes over them
  Emits the merged result (usually just one)
Shuffle Implementation
Partition and Sort Group
Partition function: hash(key) % number of reducers
Group function: sort by key
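The partition and group functions can be sketched as follows. This is an in-memory illustration, not how any particular runtime implements the shuffle; CRC32 is used as the hash only because Python's built-in `hash` for strings changes between runs, and the function names are chosen for this example.

```python
import zlib
from itertools import groupby
from operator import itemgetter


def partition(key: str, num_reducers: int) -> int:
    """hash(key) % number of reducers, with CRC32 as a stable hash."""
    return zlib.crc32(key.encode("utf-8")) % num_reducers


def shuffle(pairs, num_reducers):
    """Route each (key, value) to a reducer, then sort and group by key."""
    buckets = [[] for _ in range(num_reducers)]
    for key, value in pairs:
        buckets[partition(key, num_reducers)].append((key, value))
    grouped = []
    for bucket in buckets:
        bucket.sort(key=itemgetter(0))  # stable sort keeps value order per key
        grouped.append([(k, [v for _, v in grp])
                        for k, grp in groupby(bucket, key=itemgetter(0))])
    return grouped


pairs = [("b", 1), ("a", 1), ("b", 2), ("c", 1), ("a", 3)]
for reducer_id, groups in enumerate(shuffle(pairs, 2)):
    print(reducer_id, groups)
```

Every occurrence of a key lands in the same bucket, so each reducer sees the complete value list for the keys it owns.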
Word Frequencies in Web pages


Input: one document per record
The user implements the map function, whose input is

  key = document URL
  value = document contents
map outputs (potentially many) key/value pairs.

  For every word occurrence in the document, emit one record <word, "1">
Example continued:


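The MapReduce runtime (library) gathers all records with the same key
together (shuffle/sort)
The user implements the reduce function, which computes over the values for one key

  Sum the values
  Reduce outputs <key, sum>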
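The word-frequency job can be sketched end to end in a few lines. `run_mapreduce` is a hypothetical in-memory stand-in for the runtime, the example URLs are made up, and lowercasing plus whitespace splitting is an assumed tokenizer.

```python
from collections import defaultdict


def map_fn(url, contents):
    """Emit <word, "1"> for every word occurrence in the document."""
    for word in contents.lower().split():
        yield word, "1"


def reduce_fn(word, counts):
    """Sum the counts collected for one word and emit <word, sum>."""
    return word, sum(int(c) for c in counts)


def run_mapreduce(records, mapper, reducer):
    """In-memory stand-in for the runtime: map, shuffle/sort, reduce."""
    groups = defaultdict(list)
    for key, value in records:
        for out_key, out_value in mapper(key, value):
            groups[out_key].append(out_value)
    return [reducer(k, groups[k]) for k in sorted(groups)]


docs = [("http://a.example/1", "the quick brown fox"),
        ("http://a.example/2", "the lazy dog and the fox")]
print(run_mapreduce(docs, map_fn, reduce_fn))
```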
Inverted Index
Build Inverted Index
Map: <doc#, word> ➝[<word, doc-num>]
Reduce: <word, [doc1, doc3, ...]> ➝ <word, “doc1, doc3, …”>
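A small sketch of the inverted-index job, under the assumption that each input record is a document id plus its text and that words are split on whitespace. The names `map_fn`, `reduce_fn`, and `build_index` are illustrative.

```python
from collections import defaultdict


def map_fn(doc_id, text):
    """<doc#, text> -> [<word, doc#>], one pair per distinct word."""
    for word in set(text.lower().split()):
        yield word, doc_id


def reduce_fn(word, doc_ids):
    """<word, [doc1, doc3, ...]> -> <word, "doc1, doc3, ...">."""
    return word, ", ".join(sorted(doc_ids))


def build_index(docs):
    postings = defaultdict(list)  # shuffle: group doc ids by word
    for doc_id, text in docs:
        for word, d in map_fn(doc_id, text):
            postings[word].append(d)
    return dict(reduce_fn(w, ds) for w, ds in postings.items())


docs = [("doc1", "cloud map reduce"), ("doc2", "map of the cloud"),
        ("doc3", "reduce cost")]
index = build_index(docs)
print(index["map"], "|", index["reduce"])
```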
Build index


Input: web page data
Mapper:
  <url, document content> ➝ <term, docid, locid>
Shuffle & Sort:
  Sort by term
Reducer:
  <term, docid, locid>* ➝ <term, <docid, locid>*>
Result:
  Global index file, can be split by docid range
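The positional variant can be sketched the same way. Assumptions made for this example: docids are assigned by input order, locid is the token offset within the page, and `split_by_docid` is a made-up helper showing how the global index could be cut by docid range.

```python
from collections import defaultdict


def mapper(docid, content):
    """Emit <term, docid, locid> for each token; locid is the token offset."""
    for locid, term in enumerate(content.lower().split()):
        yield term, docid, locid


def build_positional_index(pages):
    """pages: iterable of (url, content); docids are assigned by position."""
    postings = defaultdict(list)
    for docid, (url, content) in enumerate(pages):
        for term, d, loc in mapper(docid, content):
            postings[term].append((d, loc))
    # Shuffle & sort by term; the reducer collects sorted <docid, locid> pairs.
    return {term: sorted(postings[term]) for term in sorted(postings)}


def split_by_docid(index, lo, hi):
    """Keep only postings whose docid lies in [lo, hi)."""
    part = {t: [(d, loc) for d, loc in ps if lo <= d < hi]
            for t, ps in index.items()}
    return {t: ps for t, ps in part.items() if ps}


pages = [("http://a.example/x", "to be or not to be"),
         ("http://a.example/y", "not to worry")]
index = build_positional_index(pages)
print(index["to"])
print(split_by_docid(index, 1, 2))
```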
Quiz



PageRank Algorithm
Clustering Algorithm
Recommendation Algorithm
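1. Describe the serial algorithm
   1. The core formula, the steps, and an explanation
   2. The input data representation and the core data structures
2. Implementation under MapReduce:
   1. How to write map and reduce
   2. What the inputs and outputs of each are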
Stories of the Cloud…
A Picture is Worth…
The Information Factories

Googleplex





servers number 450,000,
according to the lowest
estimate
200 petabytes of hard disk
storage
four petabytes of RAM
To handle the current load of
100 million queries a day,
input-output bandwidth must
be in the neighborhood of 3
petabits per second
The Supercomputer that Connects
Everything and Everyone




LARRY PAGE :
And, actually, the ultimate search
engine, which would understand, you
know, exactly what you wanted when
you typed in a query, and it would
give you the exact right thing back,
in computer science we call that
artificial intelligence.
That means it would be smart, and
we're a long ways from having smart
computers.
The Prototype (1995)
Early Google System
Spring 2000 Design
Late 2000 Design
Spring 2001 Design
Empty Google Cluster
Three Days Later…
Age of DataCenters

High-end mainframe vs. commodity PC cluster
Scale in
High reliability
Good price/performance, scale out
But poor reliability
High Capability System

SC5832








5832 Gigaflops
7776 Gigabytes ECC memory
972 6-core 64-bit nodes
2916 2 GByte/s fabric links
about 1 microsecond MPI latency
108 8-lane PCI-Express
18 KW
1 Cabinet
Millicomputers 2007
Millicomputers 2008
Guesses for 2010??
Packaging Comparisons in 1U
Cloud Computing

“The desktop is dead.
Welcome to the Internet
cloud, where massive
facilities across the globe will
store all the data you'll ever
use.”
What is Cloud Computing?
1. First write down your own opinion about "cloud
computing", whatever comes to mind.
2. Questions: What? Who? Why? How? Pros and
cons?
3. The most important question is: what does it
have to do with me?
Cloud Computing is…




No software
access everywhere by Internet
power -- Large-scale data processing
Appeal for startups




Cost efficiency
It is simply so convenient
Software as platform
Cons


Security
Data lock-in
SaaS
PaaS
Utility Computing
Software as a Service (SaaS)

a model of software deployment whereby a
provider licenses an application to customers for
use as a service on demand.
Platform as a Service (PaaS)

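For developing web applications and services, PaaS provides a complete,
Internet-based integrated environment that covers development, testing,
deployment, operation, and maintenance. In particular, it has a multitenant
architecture from the start, so users do not need to deal with multi-user
concurrency; the platform handles it, including concurrency management,
scalability, failure recovery, and security.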
Utility Computing

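"Pay-as-you-go" is like plugging a power cord into the wall: you get the
same voltage that Microsoft gets, you just use less and pay less. The goal
of utility computing is to give computing resources the same kind of
service: users can draw on the computing resources that a Fortune 500
company has, and simply use less, pay less. This is an important aspect
of cloud computing.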
Cloud Computing is…
Key Characteristics
illusion of infinite computing resources available on demand;

elimination of an up-front commitment by Cloud users (startup launch costs);

ability to pay for use of computing resources on a short-term basis as
needed. Billing is in small time slices; the report notes that earlier
utility computing efforts failed on this point.

very large datacenters
large-scale software infrastructure
operational expertise

Why now?


Experience operating very large-scale datacenters,
driven by new technology trends and business models

pay-as-you-go computing
Key Players



Amazon Web Services
Google App Engine
Microsoft Windows Azure
Key Applications




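Mobile interactive applications. Tim O'Reilly believes the future belongs
to services that can deliver information to users in real time. Mobile is
bound to be key, and running the back end in a datacenter is a natural
model, especially for mashup-style services.
Parallel batch processing. Large-scale data processing is a natural fit
for cloud computing, and MapReduce and Hadoop play an important role here.
Moving data into and out of the cloud is a major cost, and Amazon has
begun to host large public datasets for free.
The rise of analytics. Among database applications, transaction-based
applications are still growing, but analytics applications are growing
rapidly, driven strongly by data mining, user behavior analysis, and
similar uses.
Extension of compute-intensive desktop applications. For compute-intensive
tasks, both Matlab and Mathematica now have cloud computing extensions. Wow!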
Cloud Computing = Silver Bullet?

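On March 7, Google Docs had an incident in which
a large number of users' files were exposed.
A US privacy advocacy group then petitioned
the government to take action against Google
so that it would strengthen the security of
its cloud computing products.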

Problem of Data Lock-in
Challenges
Some other Voices
The interesting thing about Cloud Computing is that we’ve redefined
Cloud Computing to include everything that we already do. . . . I
don’t understand what we would do differently in the light of Cloud
Computing other than change the wording of some of our ads.
Larry Ellison, quoted in the Wall Street Journal, September 26, 2008
It’s stupidity. It’s worse than stupidity: it’s a marketing hype
campaign. Somebody is saying this is inevitable — and
whenever you hear somebody saying that, it’s very likely to be
a set of businesses campaigning to make it true.
Richard Stallman, quoted in The Guardian, September 29,
2008
What does it have to do with ME?!

What would you want to do with 1,000 PCs, or even
100,000 PCs?
Cloud is coming…
Cloud Computing Initiative

Google and IBM team on cloud
computing initiative for
universities (2007-2008)



provide several hundred computers
access through the Internet to test
parallel programming projects
The idea for the program came from
Google senior software engineer
Christophe Bisciglia

Google Code University
M45 : Open Academic Clusters

Collaboration with Major Research
Universities



Seed Facility: Datacenter in a Box (DiB)






Foster open research
Focus on large-scale, highly parallel computing
500 nodes, 4000 cores, 3TB RAM, 1.5PB disk
High bandwidth connection to Internet
Located on Yahoo! corporate campus
Runs Yahoo! / Apache Grid Stack
Carnegie Mellon University is Initial
Partner
Public Announcement 11/12/07
Summary

MapReduce



Distributed Programming
Model
It’s fun!
Infrastructure


Cloud computing
Imagination!
Readings

[1] J. Dean and S. Ghemawat, "MapReduce: Simplified
Data Processing on Large Clusters," in OSDI, 2004,
pp. 137-150.
Resources



[Dean and Ghemawat, 2004] J. Dean and S. Ghemawat, "MapReduce: Simplified
Data Processing on Large Clusters," in OSDI, 2004, pp. 137-150.
[Chang et al., 2006] F. Chang, J. Dean, S. Ghemawat, W. C. Hsieh,
D. A. Wallach, M. Burrows, T. Chandra, A. Fikes, and R. E. Gruber,
"Bigtable: A Distributed Storage System for Structured Data" (awarded
Best Paper), in OSDI, 2006, pp. 205-218.
[Dean, 2006] J. Dean, "Experiences with MapReduce, an
abstraction for large-scale computation," in Proceedings of the 15th
International Conference on Parallel Architectures and Compilation
Techniques. Seattle, Washington, USA: ACM Press, 2006.
[Ghemawat et al., 2003] S. Ghemawat, H. Gobioff, and S.-T. Leung, "The
Google file system," in Proceedings of the Nineteenth ACM
Symposium on Operating Systems Principles. Bolton Landing, NY, USA:
ACM Press, 2003.
http://lucene.apache.org/hadoop/, 2008
Thank You!
Q&A
Calculate PageRank


Input: WebGraph <from, <PR, <to>*>>
Iteration Until Convergence
Mapper:
  <from, <PR, <to>*>> ➝ <to, PR / outDegree(from)>
  <from, <PR, <to>*>> ➝ <from, <0, <to>*>>
Shuffle & Sort:
  By <to>
Reducer:
  <to, value>* and <to, <0, <out>*>> ➝ <to, <∑(value), <out>*>>
Result:
  <to, ∑(value)> are PR[], the PageRank result array
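The iteration above can be sketched in memory as follows. This follows the slide literally, so there is no damping factor and rank that flows into pages without outlinks is simply lost; the convergence tolerance `tol`, the cap `max_iter`, and all function names are assumptions made for this example.

```python
from collections import defaultdict


def mapper(frm, pr, outlinks):
    """Emit a rank share to every target, plus the node's own link list."""
    for to in outlinks:
        yield to, pr / len(outlinks)
    yield frm, (0.0, outlinks)  # <from, <0, <to>*>> carries the graph structure


def reducer(node, values):
    """Sum the incoming shares and re-attach the outlinks."""
    total, outlinks = 0.0, []
    for v in values:
        if isinstance(v, tuple):  # the structure record
            outlinks = v[1]
        else:                     # a rank share
            total += v
    return node, (total, outlinks)


def pagerank(graph, tol=1e-10, max_iter=100):
    """graph: {from: (PR, [to, ...])}. Iterate until ranks stop changing."""
    for _ in range(max_iter):
        groups = defaultdict(list)  # shuffle & sort by <to>
        for frm, (pr, outlinks) in graph.items():
            for key, value in mapper(frm, pr, outlinks):
                groups[key].append(value)
        new_graph = dict(reducer(k, vs) for k, vs in groups.items())
        delta = sum(abs(new_graph[k][0] - graph[k][0]) for k in graph)
        graph = new_graph
        if delta < tol:
            break
    return {k: pr for k, (pr, _) in graph.items()}


graph = {"a": (1 / 3, ["b", "c"]), "b": (1 / 3, ["c"]), "c": (1 / 3, ["a"])}
ranks = pagerank(graph)
print({k: round(v, 3) for k, v in sorted(ranks.items())})
```

For this three-page graph the ranks settle near 0.4 for a and c and 0.2 for b. In a real job each pass of the loop would be one MapReduce round, with the reducer output written back as the next round's input.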
MapReduce Framework
[Figure: MapReduce framework. Input key*value pairs from data stores 1 to n feed the map tasks, each of which emits (key, values...) groups for keys 1, 2, and 3. A barrier aggregates intermediate values by output key. One reduce task per key then produces the final values for keys 1, 2, and 3.]
