Transcript Hadoop

Dennis Mulder
Windows Azure Center of Excellence
Spotlight
Services
Global Services Team
10 Senior Cloud Architects
Assessment
Pilots
Architecture and Design Guidance
Cloud Apps
Global Scale
Modern Apps
8 Champs
US, EMEA, APAC
Assess
Design
Pilots
Contact
Design Sessions
Dennis Mulder, Solution Architect, [email protected]
Engage
Four megatrends will dominate the next decade
Mobility
91% of organizations expect to spend on mobile devices in 2012
In 2012, mobile devices will outship PCs by more than 2:1 and generate more revenue than PCs for the first time
85 billion mobile apps will be downloaded in 2012
Social
Social networking will follow not just people but also appliances, devices and products
1/2 of companies expect to use internal social network apps in 2012
Cloud
>80% of new apps in 2012 will be distributed/deployed on clouds
The strategic focus in the cloud will shift in 2012 from infrastructure to application platforms
34% of CIOs say technology as a service (cloud) will have the most profound effect on the CIO role in the future
Big data
49% of CIOs rank BI as the top project priority for 2012
2/3 of mobile apps developed in 2012 will integrate with analytics offerings
2.7 zettabytes in 2012
32% of businesses are likely to invest in BI and analytics in 2012
Microsoft is embracing these megatrends
Rethinking and evolving business strategies
Mobility
Social
Cloud
Big data
How will technology megatrends enable you to save money,
drive innovation, grow your business, and attract and retain customers?
Figure: data sources by volume, velocity, variety and variability
Volume: Gigabytes (10^9), Terabytes (10^12), Petabytes (10^15), Exabytes (10^18)
Eras: ERP / CRM, WEB 2.0, Internet of things
Data sources shown: Payables, Payroll, Inventory, Contacts, Deal Tracking, Sales Pipeline, Web Logs, Digital Marketing, Search Marketing, Recommendations, Advertising, eCommerce, Collaboration, Log Files, Spatial & GPS Coordinates, Data Market Feeds, eGov Feeds, Weather, Text/Image, Click Stream, Wikis / Blogs, Sensors / RFID / Devices, Social Sentiment, Audio / Video, Mobile
Storage cost per GB
Eras: ERP / CRM, WEB 2.0, Internet of things
1980: $190,000
1990: $9,000
2000: $15
2010: $0.07
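To make the storage-cost figures concrete, here is a minimal arithmetic sketch in Python. It uses only the four per-GB prices from the slide. The 1 TB = 1000 GB conversion and the helper names (`cost_of`, `decline_factor`) are assumptions introduced for illustration.

```python
from decimal import Decimal

# Storage cost per GB in US dollars, as listed on the slide.
COST_PER_GB = {
    1980: Decimal("190000"),
    1990: Decimal("9000"),
    2000: Decimal("15"),
    2010: Decimal("0.07"),
}


def cost_of(gigabytes, year):
    """Cost in dollars of storing `gigabytes` GB at the given year's price."""
    return COST_PER_GB[year] * gigabytes


def decline_factor(from_year, to_year):
    """How many times cheaper one GB is in `to_year` than in `from_year`."""
    return COST_PER_GB[from_year] / COST_PER_GB[to_year]


# Cost of one terabyte (taken as 1000 GB, an assumption) in each year.
costs_1tb = {year: cost_of(1000, year) for year in COST_PER_GB}

for year, cost in sorted(costs_1tb.items()):
    print(f"{year}: ${cost:,.2f} per TB")
print(f"1980 to 2010: about {decline_factor(1980, 2010):,.0f}x cheaper per GB")
```

At these prices a terabyte that would have cost $190 million in 1980 costs $70 in 2010, which is why keeping raw, high-volume data became practical.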
Figure: MapReduce example, summing sales amounts by zip code
Input: blocks of the Sales file in HDFS on DataNode1, DataNode2 and DataNode3; records are (custId, zipCode, amount)
Map tasks: each Mapper groups its records by zipCode, with one output bucket per reduce task
Mapper output (zipCode, amount):
54235 $75, 54235 $22, 10025 $60, 10025 $95, 44313 $55, 53705 $65
53705 $30, 53705 $15, 02115 $15, 02115 $15, 44313 $10, 44313 $25
Shuffle: the buckets are sent to the Reducers
Reduce tasks: each Reducer sorts its input by zipCode and SUMs the amounts
Reducer input after Sort:
53705 $65, 53705 $30, 53705 $15
10025 $60, 10025 $95, 44313 $10, 44313 $25, 44313 $55
02115 $15, 02115 $15, 54235 $75, 54235 $22
Result of SUM:
53705 $110
10025 $155
44313 $90
02115 $30
54235 $97
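The sales-by-zip-code flow in the MapReduce diagram can be sketched in plain Python. This is a single-process illustration, not Hadoop code. The zipCode and amount values come from the slide; the custId values, the split of records into three blocks, the choice of three reducers and the modulo partitioner are assumptions made for the example.

```python
# Records in the (custId, zipCode, amount) layout from the diagram.
# The custId values are placeholders: the slide does not show them reliably.
SALES = [
    (4, "54235", 75), (7, "10025", 60), (2, "53705", 30), (1, "02115", 15),
    (0, "54235", 22), (3, "10025", 95), (8, "44313", 55), (5, "53705", 65),
    (5, "53705", 15), (6, "44313", 10), (6, "44313", 25), (9, "02115", 15),
]

NUM_REDUCERS = 3  # assumption: one bucket per reduce task, three reduce tasks


def partition(zip_code, num_reducers):
    """Pick the reduce task for a key (a simple deterministic partitioner)."""
    return int(zip_code) % num_reducers


def map_task(block, num_reducers):
    """Emit (zipCode, amount) pairs into one output bucket per reduce task."""
    buckets = [[] for _ in range(num_reducers)]
    for _cust_id, zip_code, amount in block:
        buckets[partition(zip_code, num_reducers)].append((zip_code, amount))
    return buckets


def reduce_task(pairs):
    """Sort the pairs by zipCode, then SUM the amounts for each zipCode."""
    totals = {}
    for zip_code, amount in sorted(pairs):
        totals[zip_code] = totals.get(zip_code, 0) + amount
    return totals


def run_job(blocks, num_reducers=NUM_REDUCERS):
    """Run the map tasks, shuffle buckets to reducers, run the reduce tasks."""
    map_outputs = [map_task(block, num_reducers) for block in blocks]
    result = {}
    for r in range(num_reducers):
        # Shuffle: reducer r collects bucket r from every mapper.
        shuffled = [pair for output in map_outputs for pair in output[r]]
        result.update(reduce_task(shuffled))
    return result


# One block per data node (an assumed split of the Sales file).
blocks = [SALES[0:4], SALES[4:8], SALES[8:12]]
totals = run_job(blocks)
for zip_code in sorted(totals):
    print(zip_code, f"${totals[zip_code]}")
```

Because every record with the same zipCode lands in the same bucket, each reducer can total its keys without talking to the other reducers. That independence is what lets the real framework spread the work across machines.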
Figure: HDFS and Azure Blob Storage
HDFS API
Name Node
Data Node, Data Node, …
DFS (1 Data Node per Worker Role) and Compute Cluster
Azure Blob Storage: Front end, Partition Layer, Stream Layer
Azure Storage (ASV)
Distributed Processing (MapReduce)
Distributed Storage (HDFS)
Query (Hive)
ODBC
Legend
Red = Core Hadoop
Blue = Data processing
Purple = Microsoft integration points and value adds
Orange = Data Movement
Green = Packages
Hive, Pig, Mahout, Cascading, Scalding, Scoobi, Pegasus…
C#, F# Map/Reduce, LINQ to Hive, .NET management clients
JavaScript Map/Reduce, Browser hosted console, Node.js management clients
PowerShell, Cross Platform CLI tools
http://www.windowsazure.com/
http://hadoop.apache.org/
http://nuget.org/packages?q=hadoop
http://hadoopsdk.codeplex.com
© 2012 Microsoft Corporation. All rights reserved. Microsoft, Windows, Windows Vista and other product names are or may be registered trademarks and/or trademarks in the U.S. and/or other countries.
The information herein is for informational purposes only and represents the current view of Microsoft Corporation as of the date of this presentation. Because Microsoft must respond to changing market conditions, it should not be interpreted to be a commitment on the part of Microsoft, and Microsoft cannot guarantee the accuracy of any information provided after the date of this presentation. MICROSOFT MAKES NO WARRANTIES, EXPRESS, IMPLIED OR STATUTORY, AS TO THE INFORMATION IN THIS PRESENTATION.