An Overview of Cloud Computing Raghu Ramakrishnan Chief Scientist, Audience and Cloud Computing

An Overview of Cloud Computing
Raghu Ramakrishnan
Chief Scientist, Audience and Cloud Computing
Research Fellow, Yahoo! Research
Reflects many discussions with:
Eric Baldeschwieler, Jay Kistler, Chuck Neerdaels, Shelton Shugar, and Raymie Stata
and joint work with the Sherpa team, in particular:
Brian Cooper, Utkarsh Srivastava, Adam Silberstein and Nick Puz in Y! Research
Chuck Neerdaels, P.P. Suryanarayanan and many others in CCDI
CCDI—Research Collaboration
Yahoo! Research
• Raghu Ramakrishnan
• Brian Cooper
• Utkarsh Srivastava
• Adam Silberstein
• Nick Puz
• Rodrigo Fonseca
CCDI
• Chuck Neerdaels
• P.P.S. Narayan
• Kevin Athey
• Toby Negrin
• Plus Dev/QA teams
Pie-in-the-sky
SCENARIOS
Living in the Clouds
• We want to start a new website, FredsList.com
• Our site will provide listings of items for sale, jobs,
etc.
• As time goes on, we’ll add more features
– And illustrate how more cloud capabilities (and
corresponding infrastructure components) are used as
needed
• List of capabilities/components is illustrative, not exhaustive
• Our cloud provides a “dataset” abstraction
– FredsList doesn’t worry about the underlying
components
Step 1: Listings
• FredsList wants to store listings as (key, category, description)
• Example records from the FredsList.com application:
– 1234323, transportation, For sale: one bicycle, barely used
– 5523442, childcare, Nanny available in San Jose
– 215534, wanted, Looking for issue 1 of Superman comic book
DECLARE DATASET Listings AS
( ID String PRIMARY KEY,
Category String,
Description Text )
• Components behind the simple Web service APIs: Sherpa (database)
Step 2: Search
• FredsList’s customers quickly ask for keyword search
• Example queries from the FredsList.com application: “dvd’s”, “bicycle”, “nanny”
ALTER Listings
SET Description SEARCHABLE
• Components behind the simple Web service APIs: Sherpa (database), Vespa (search), YMB (messaging)
Step 3: Photos
• FredsList decides to add photos to listings
ALTER Listings
ADD Photo BLOB
• Components behind the simple Web service APIs: Sherpa (database), MObStor (storage), Vespa (search), YMB (messaging)
• Foreign key: photo → listing
Step 4: Data Analysis
• FredsList wants to analyze its listings to get statistics about category, do geocoding, etc.
• Jobs run for the FredsList.com application:
– Pig query to analyze categories
– Hadoop program to geocode data
– Hadoop program to generate fancy pages for listings
ALTER Listings
MAKE ANALYZABLE
• Components behind the simple Web service APIs: Grid (compute), Sherpa (database), MObStor (storage), Vespa (search), YMB (messaging)
• Foreign key: photo → listing
• Batch export
Step 5: Performance
• FredsList wants to reduce its data access latency
ALTER Listings
MAKE CACHEABLE
• Components behind the simple Web service APIs: Grid (compute), Sherpa (database), MObStor (storage), Vespa (search), memcached (caching), YMB (messaging)
• Foreign key: photo → listing
• Batch export
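The steps above can be read as one dataset gaining capabilities over time while the application keeps calling the same interface. The following is a minimal Python sketch under stated assumptions: the `Dataset` class and its method names are hypothetical (they are not Yahoo! APIs), and the dict, inverted index, and cache are only in-memory stand-ins for Sherpa, Vespa, and memcached.

```python
from collections import defaultdict


class Dataset:
    """Hypothetical stand-in for the cloud's "dataset" abstraction."""

    def __init__(self, name):
        self.name = name
        self.records = {}         # stands in for the database (Sherpa)
        self.index = None         # stands in for search (Vespa); off by default
        self.search_field = None
        self.cache = None         # stands in for caching (memcached); off by default
        self.backend_reads = 0    # counts reads that reach the "database"

    def make_searchable(self, field):
        # Analogue of: ALTER Listings SET Description SEARCHABLE
        self.index = defaultdict(set)
        self.search_field = field
        for key, rec in self.records.items():
            self._index_record(key, rec)

    def make_cacheable(self):
        # Analogue of: ALTER Listings MAKE CACHEABLE
        self.cache = {}

    def _index_record(self, key, rec):
        # Sketch only: words of an overwritten record are not removed.
        for word in rec.get(self.search_field, "").lower().split():
            self.index[word.strip(".,:;")].add(key)

    def put(self, key, rec):
        self.records[key] = rec
        if self.index is not None:
            self._index_record(key, rec)
        if self.cache is not None:
            self.cache.pop(key, None)   # invalidate any cached copy

    def get(self, key):
        if self.cache is not None and key in self.cache:
            return self.cache[key]
        self.backend_reads += 1
        rec = self.records.get(key)
        if self.cache is not None and rec is not None:
            self.cache[key] = rec
        return rec

    def search(self, word):
        if self.index is None:
            raise RuntimeError("dataset is not searchable")
        return sorted(self.index.get(word.lower(), ()))


listings = Dataset("Listings")
listings.put("1234323", {"category": "transportation",
                         "description": "For sale: one bicycle, barely used"})
listings.make_searchable("description")
listings.make_cacheable()
listings.put("5523442", {"category": "childcare",
                         "description": "Nanny available in San Jose"})
```

The application code calls only `put`, `get`, and `search`; which component answers is decided by the capabilities switched on for the dataset.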
Motherhood-and-Apple-Pie
EYES TO THE SKIES
Why Clouds?
• On-demand infrastructure to create a fundamental shift in the OE curve. Lets us:
– Do things we can’t do
– Reduce time to market
– Build more robustly, more efficiently, more globally, more completely, for a given budget
• Cloud services should do the heavy lifting of scaling & high availability
– Today, this is done at the app level, which is not productive
Requirements for Cloud Services
• Multitenant. A cloud service must support multiple, organizationally
distant customers.
• Elasticity. Tenants should be able to negotiate and receive
resources/QoS on-demand.
• Resource Sharing. Ideally, spare cloud resources should be
transparently applied when a tenant’s negotiated QoS is insufficient, e.g.,
due to spikes.
• Horizontal scaling. It should be possible to add cloud capacity in small
increments; this should be transparent to the tenants of the service.
• Metering. A cloud service must support accounting that reasonably
ascribes operational and capital expenditures to each of the tenants of the
service.
• Security. A cloud service should be secure in that tenants are not made
vulnerable because of loopholes in the cloud.
• Availability. A cloud service should be highly available.
• Operability. A cloud service should be easy to operate, with few
operators. Operating costs should scale linearly or better with the capacity
of the service.
Types of Cloud Services
• Two kinds of cloud services:
– Horizontal Cloud Services
• Functionality enabling tenants to build applications or new
services on top of the cloud
– Functional Cloud Services
• Functionality that is useful in and of itself to tenants. E.g.,
various SaaS instances, such as Salesforce.com; Google
Analytics and Yahoo!’s IndexTools; Yahoo! properties aimed
at end-users and small businesses, e.g., Flickr, Groups, Mail,
News, Shopping
• Could be built on top of horizontal cloud services or from
scratch
• Yahoo! has been offering these for a long while (e.g., Mail for
SMB, Groups, Flickr, BOSS, Ad exchanges)
Horizontal Cloud Services
• Horizontal cloud services are foundations on which
tenants build applications or new services. They should
be:
– Semantics-free. Must be “generic infrastructure,” and not tied to
specific app-logic.
• May provide the ability to inject application logic through well-defined
APIs
– Broadly applicable. Must be broadly applicable (i.e., it can't be
intended for just one or two properties).
– Fault-tolerant over commodity hardware. Must be built using
inexpensive commodity hardware, and should mask component
failures.
• While each cloud service provides value, the power of the
cloud paradigm will depend on a collection of well-chosen,
loosely coupled services that collectively make it easy to
quickly develop and operate innovative web applications.
What’s in the Horizontal Cloud?
Simple Web Service APIs
Horizontal Cloud Services
• Provisioning & Virtualization (e.g., EC2)
• Batch Storage & Processing (e.g., Hadoop & Pig)
• Operational Storage (e.g., S3, MObStor, Sherpa)
• Edge Content Services (e.g., YCS, YCPI)
• Other Services (Messaging, Workflow, virtual DBs & Webserving)
Shared Infrastructure
• ID & Account Management
• Metering, Billing, Accounting
• Security
• Monitoring & QoS
Common Approaches to QA, Production Engineering, Performance Engineering, Datacenter Management, and Optimization
Yahoo! CCDI Thrust Areas
• Fast Provisioning and Machine Virtualization: On
demand, deliver a set of hosts imaged with desired
software and configured against standard services
– Multiple hosts may be multiplexed onto the same physical
machine.
• Batch Storage and Processing: Scalable data storage
optimized for batch processing, together with
computational capabilities
• Operational Storage: Persistent storage that supports
low-latency updates and flexible retrieval
• Edge Content Services: Support for dealing with
network topology, communication protocols, caching, and
BCP
Rest of
today’s talk
Hadoop: Batch Storage/Analysis
Why is batch processing important?
• Whether it’s
– response-prediction for advertising,
– machine-learned relevance for Search, or
– content optimization for audience,
data-intensive computing is increasingly central to everything Yahoo! does
– Hadoop is central to addressing this need
• Hadoop is a case-study in our cloud vision
– Processes enormous amounts of data
– Provides horizontal scaling and fault-tolerance for our users
– Allows those users to focus on their app logic
Stack shown in the diagram: [Workflow], High-level query layer (Pig), Map-Reduce, HDFS
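To make "focus on their app logic" concrete, here is a small sketch of the map-reduce pattern in plain Python. It does not use the Hadoop API; `run_map_reduce`, `map_listing`, and `reduce_counts` are hypothetical names, and the fourth listing is invented for illustration. The application supplies only the mapper and the reducer, and the framework function owns grouping.

```python
from collections import defaultdict


def map_listing(listing):
    # Map step: emit (category, 1) for each listing record.
    yield listing["category"], 1


def reduce_counts(category, values):
    # Reduce step: sum the counts emitted for one category.
    return category, sum(values)


def run_map_reduce(records, mapper, reducer):
    # Shuffle step: group mapper output by key, then reduce each group.
    groups = defaultdict(list)
    for record in records:
        for key, value in mapper(record):
            groups[key].append(value)
    return dict(reducer(key, values) for key, values in sorted(groups.items()))


listings = [
    {"id": "1234323", "category": "transportation"},
    {"id": "5523442", "category": "childcare"},
    {"id": "215534", "category": "wanted"},
    {"id": "0000001", "category": "transportation"},  # invented extra record
]

category_counts = run_map_reduce(listings, map_listing, reduce_counts)
```

In a real cluster the grouping, distribution, and failure handling inside `run_map_reduce` are what the platform provides; the two small functions are all the application writes.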
SHERPA
To Help You Scale Your Mountains of Data
The Yahoo! Storage Problem
– Small records – 100KB or less
– Structured records - tens, hundreds or thousands of fields
– Extreme data scale - Tens of TB
– Extreme request scale - Tens of thousands of requests/sec
– Low latency globally - 20+ datacenters worldwide
– High Availability - outages cost $millions
– Variable usage patterns - as applications and users change
The Sherpa Solution
The next generation global-scale record store
– Record-orientation: Routing, data storage optimized
for low-latency record access
– Scale out: Add machines to scale throughput (while
keeping latency low)
– Asynchrony: Pub-sub replication to far-flung
datacenters to mask propagation delay
– Consistency model: Reduce complexity of
asynchrony for the application programmer
– Cloud deployment model: Hosted, managed service
to reduce app time-to-market and enable on demand
scale and elasticity
What is Sherpa?
Example table (the figure shows three replicas of it):
A  42342  E
B  42521  W
C  66354  W
D  12352  E
E  75656  C
F  15677  E
CREATE TABLE Parts (
ID VARCHAR,
StockNumber INT,
Status VARCHAR
…
)
• Parallel database
• Structured, flexible schema
• Geographic replication
• Hosted, managed infrastructure
What Will Sherpa Become?
Example table (the figure shows three replicas of it):
A  42342  E
B  42521  W
C  66354  W
D  12352  E
E  75656  C
F  15677  E
CREATE TABLE Parts (
ID VARCHAR,
StockNumber INT,
Status VARCHAR
…
)
• Parallel database
• Indexes and views
• Structured, flexible schema
• Geographic replication
• Hosted, managed infrastructure
Sherpa Design Goals
Scalability
• Thousands of machines
• Easy to add capacity
• Restrict query language to avoid costly queries
Consistency
• Per-record guarantees
• Timeline model
• Option to relax if needed
Geographic replication
• Asynchronous replication around the globe
• Low-latency local access
Multiple access paths
• Hash table, ordered table
• Primary, secondary access
High availability and fault tolerance
• Automatically recover from failures
• Serve reads and writes despite failures
Hosted service
• Applications plug and play
• Share operational cost
Technology Elements
Applications
Tabular API
PNUTS API
YCA: Authorization
PNUTS
• Query planning and execution
• Index maintenance
Distributed infrastructure for tabular data
• Data partitioning
• Update consistency
• Replication
YDOT FS
• Ordered tables
YDHT FS
• Hash tables
YMB
• Pub/sub messaging
Zookeeper
• Consistency service
Data Manipulation
• Per-record operations
– Get
– Set
– Delete
• Multi-record operations
– Multiget
– Scan
– Getrange
• Web service (RESTful) API
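The operations listed above can be sketched as a small in-memory class. This is only an illustration: `RecordStore` is a hypothetical name, the real service is reached through a RESTful web API rather than Python calls, and the half-open range convention in `getrange` is an assumption.

```python
class RecordStore:
    """Hypothetical in-memory table exposing the operations listed above."""

    def __init__(self):
        self._rows = {}

    # Per-record operations
    def get(self, key):
        return self._rows.get(key)

    def set(self, key, record):
        self._rows[key] = dict(record)

    def delete(self, key):
        self._rows.pop(key, None)

    # Multi-record operations
    def multiget(self, keys):
        # Keys that are not present are simply left out of the result.
        return {k: self._rows[k] for k in keys if k in self._rows}

    def scan(self):
        # Visit every record; sorted by key only to keep the output stable.
        return sorted(self._rows.items())

    def getrange(self, start, end):
        # Assumed convention: records with start <= key < end, in key order.
        return [(k, v) for k, v in sorted(self._rows.items()) if start <= k < end]


store = RecordStore()
store.set("Apple", {"price": 1})
store.set("Kiwi", {"price": 8})
store.set("Lime", {"price": 9})
store.delete("Lime")
```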
Tablets—Hash Table
Tablet boundaries (hash values): 0x0000, 0x2AF3, 0x911F, 0xFFFF
Name        Description                            Price
Grape       Grapes are good to eat                 $12
Lime        Limes are green                        $9
Apple       Apple is wisdom                        $1
Strawberry  Strawberry shortcake                   $900
Orange      Arrgh! Don’t get scurvy!               $2
Avocado     But at what price?                     $3
Lemon       How much did you pay for this lemon?   $1
Tomato      Is this a vegetable?                   $14
Banana      The perfect fruit                      $2
Kiwi        New Zealand                            $8
Tablets—Ordered Table
Name        Description                            Price
A
Apple       Apple is wisdom                        $1
Avocado     But at what price?                     $3
Banana      The perfect fruit                      $2
Grape       Grapes are good to eat                 $12
H
Kiwi        New Zealand                            $8
Lemon       How much did you pay for this lemon?   $1
Lime        Limes are green                        $9
Q
Orange      Arrgh! Don’t get scurvy!               $2
Strawberry  Strawberry shortcake                   $900
Tomato      Is this a vegetable?                   $14
Z
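Both tablet layouts come down to finding the interval that contains a value: a hash of the key for hash tables, the key itself for ordered tables. The sketch below uses the boundary values printed on the two slides. The function names and the choice of hash (first two bytes of MD5) are assumptions made for illustration, not the system's actual hash.

```python
import bisect
import hashlib

HASH_BOUNDS = [0x0000, 0x2AF3, 0x911F, 0xFFFF]   # boundaries from the hash-table slide
ORDERED_BOUNDS = ["A", "H", "Q", "Z"]             # boundaries from the ordered-table slide


def hash16(key):
    # Assumed hash: first two bytes of MD5, giving a value in 0x0000..0xFFFF.
    return int.from_bytes(hashlib.md5(key.encode("utf-8")).digest()[:2], "big")


def tablet_for(value, bounds):
    # Index i of the tablet covering bounds[i] <= value < bounds[i + 1];
    # the last tablet also includes the upper bound itself.
    i = bisect.bisect_right(bounds, value) - 1
    return min(max(i, 0), len(bounds) - 2)


hash_tablet = tablet_for(hash16("Grape"), HASH_BOUNDS)   # depends on the hash
ordered_tablet = tablet_for("Kiwi", ORDERED_BOUNDS)      # "H" <= "Kiwi" < "Q"
```

With hashing, neighbouring keys scatter across tablets, which spreads load; with ordering, neighbouring keys stay together, which is what makes range scans cheap.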
Flexible Schema
Posted date  Listing id  Item   Price
6/1/07       424252      Couch  $570
6/1/07       763245      Bike   $86
6/3/07       211242      Car    $1123
6/5/07       421133      Lamp   $15
Optional fields carried by only some records: Color (Red), Condition (Good, Fair)
Detailed Architecture
[Architecture diagram. Components shown in the local region: Clients, REST API, Routers, YMB, Tablet controller, Storage units. Remote regions are shown alongside the local region.]
Tablet Splitting and Balancing
• Each storage unit has many tablets (horizontal partitions of the table)
• Tablets may grow over time
• Overfull tablets split
• Storage unit may become a hotspot
• Shed load by moving tablets to other servers
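A rough sketch of the two actions named above, splitting an overfull tablet and moving a tablet off a hot storage unit. The policies here (split at the median, move one tablet from the most loaded to the least loaded unit) and the function names are assumptions chosen for brevity, not the system's actual rules.

```python
def split_tablet(tablet, max_records):
    # Split a sorted list of (key, record) pairs at its median
    # once it holds more than max_records entries.
    if len(tablet) <= max_records:
        return [tablet]
    mid = len(tablet) // 2
    return [tablet[:mid], tablet[mid:]]


def rebalance(storage_units):
    # Move one tablet from the most loaded unit to the least loaded one.
    # storage_units maps a unit name to its list of tablets.
    def load(name):
        return sum(len(tablet) for tablet in storage_units[name])

    hot = max(storage_units, key=load)
    cold = min(storage_units, key=load)
    if hot != cold and len(storage_units[hot]) > 1:
        storage_units[cold].append(storage_units[hot].pop())
    return storage_units


halves = split_tablet([("a", 1), ("b", 2), ("c", 3), ("d", 4), ("e", 5)], max_records=4)
units = rebalance({"su1": [[("a", 1), ("b", 2)], [("c", 3)]], "su2": []})
```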
QUERY PROCESSING
Accessing Data
1. Get key k (client to router)
2. Get key k (router to the SU holding the key)
3. Record for key k (SU to router)
4. Record for key k (router to client)
Bulk Read
1. Client sends the key set {k1, k2, … kn} to the scatter/gather server
2. Scatter/gather server issues Get k1, Get k2, Get k3, … to the SUs
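The scatter/gather step can be sketched with a thread pool: keys are grouped by the storage unit that holds them, the units are queried in parallel, and the partial answers are merged. The storage units below are plain dicts and `locate`, `fetch`, and `multiget` are hypothetical names; a real server would issue network requests instead.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical storage units, each holding part of the key space.
STORAGE_UNITS = {
    "su1": {"k1": "record 1"},
    "su2": {"k2": "record 2"},
    "su3": {"k3": "record 3"},
}


def locate(key):
    # Assumed routing step: find the storage unit that holds the key.
    for name, rows in STORAGE_UNITS.items():
        if key in rows:
            return name
    return None


def fetch(unit, keys):
    # One batch of Gets sent to a single storage unit.
    rows = STORAGE_UNITS[unit]
    return {k: rows[k] for k in keys}


def multiget(keys):
    # Scatter: group keys by storage unit and query the units in parallel.
    by_unit = {}
    for key in keys:
        unit = locate(key)
        if unit is not None:
            by_unit.setdefault(unit, []).append(key)
    with ThreadPoolExecutor(max_workers=max(len(by_unit), 1)) as pool:
        parts = list(pool.map(lambda item: fetch(*item), by_unit.items()))
    # Gather: merge the partial results into one response.
    result = {}
    for part in parts:
        result.update(part)
    return result


records = multiget(["k1", "k3", "missing"])
```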
Range Queries in YDOT
• Clustered, ordered retrieval of records
• Example: the router splits the query “Grapefruit…Pear?” into “Grapefruit…Lime?” and “Lime…Pear?”
• Router’s map from tablet start key to storage unit:
– Apple: Storage unit 1
– Cantaloupe: Storage unit 3
– Lime: Storage unit 2
– Strawberry: Storage unit 1
• Records held by each storage unit:
– Storage unit 1: Apple, Avocado, Banana, Blueberry; Strawberry, Tomato, Watermelon
– Storage unit 2: Lime, Mango, Orange
– Storage unit 3: Cantaloupe, Grape, Kiwi, Lemon
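The range split in this example can be reproduced with a sorted list of tablet start keys. This is a sketch, assuming each tablet covers keys from its start key up to the next start key and that query ranges are half-open; `split_range` is a hypothetical name.

```python
import bisect

# Router's interval map from the slide: each tablet starts at the given key
# and runs up to the next start key.
TABLET_STARTS = ["Apple", "Cantaloupe", "Lime", "Strawberry"]
TABLET_UNITS = ["Storage unit 1", "Storage unit 3", "Storage unit 2", "Storage unit 1"]


def split_range(start, end):
    # Break the query range [start, end) into one sub-range per tablet.
    pieces = []
    i = max(bisect.bisect_right(TABLET_STARTS, start) - 1, 0)
    while i < len(TABLET_STARTS) and start < end:
        upper = TABLET_STARTS[i + 1] if i + 1 < len(TABLET_STARTS) else None
        piece_end = end if upper is None or end <= upper else upper
        pieces.append((start, piece_end, TABLET_UNITS[i]))
        start = piece_end
        i += 1
    return pieces


plan = split_range("Grapefruit", "Pear")
```

Each piece names the storage unit to contact, so the router can fetch the sub-ranges in key order and concatenate the results.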
Updates
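Participants: client, routers, message brokers, storage units (SUs)
1. Write key k (client to router)
2. Write key k (router to SU)
3–6. Write key k is exchanged between the SUs and the message brokers, which answer SUCCESS
7. Sequence # for key k (SU to router)
8. Sequence # for key k (router to client)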
ASYNCHRONOUS REPLICATION AND CONSISTENCY
Asynchronous Replication
Consistency Model
• Goal: make it easier for applications to reason about updates
and cope with asynchrony
• What happens to a record with primary key “Brian”?
[Timeline figure: the record is inserted (v. 1), then a series of updates and a delete produce v. 2 through v. 8. All versions shown belong to Generation 1.]
Consistency Model
• Timeline for the record: v. 1 through v. 8, all in Generation 1; v. 8 is the current version and earlier versions are stale
• Read: may return a stale version or the current version
• Read up-to-date: returns the current version (v. 8)
• Read ≥ v.6: returns a version no older than v. 6
• Write: updates the current version
• Write if = v.7: fails with ERROR, because the current version is v. 8 rather than v. 7
• Mechanism: per record mastership
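The read and write variants of the timeline model can be sketched with a single record whose versions form one list. This is an illustration only: `VersionedRecord` and its method names are hypothetical, and the `replica_version` argument stands in for whichever version a nearby replica happens to hold.

```python
class VersionedRecord:
    """Hypothetical record whose updates form a single version timeline."""

    def __init__(self, value):
        self.versions = [value]          # versions[0] is v. 1

    @property
    def current_version(self):
        return len(self.versions)

    def read_any(self, replica_version):
        # Plain read: a replica may answer with a stale version it holds.
        return replica_version, self.versions[replica_version - 1]

    def read_latest(self):
        # Read up-to-date: always the current version.
        return self.current_version, self.versions[-1]

    def read_at_least(self, required, replica_version):
        # Read >= required: use the replica's copy if it is new enough,
        # otherwise fall back to the current version.
        if replica_version >= required:
            return self.read_any(replica_version)
        return self.read_latest()

    def write(self, value):
        # Unconditional write: appends the next version.
        self.versions.append(value)
        return self.current_version

    def write_if_version(self, expected, value):
        # Conditional write: fails unless the current version matches.
        if self.current_version != expected:
            raise ValueError(f"ERROR: current version is v. {self.current_version}")
        return self.write(value)


record = VersionedRecord("v1 data")
for n in range(2, 9):
    record.write(f"v{n} data")           # record is now at v. 8

try:
    record.write_if_version(7, "late update")
    conditional_write_failed = False
except ValueError:
    conditional_write_failed = True      # matches the ERROR case on the slide
```

Because all writes to a record go through one timeline, a reader can state how stale an answer it will accept instead of reasoning about arbitrary replica divergence.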
Mastering
[Figure: three replicas of the same table, each with the rows (A, 42342, E), (B, 42521, W), (C, 66354, W), (D, 12352, E), (E, 75656, C), (F, 15677, E); one replica is labeled “Tablet master”.]
Bulk Insert/Update/Replace
Client
Source Data
Bulk manager
1. Client feeds records to bulk manager
2. Bulk loader transfers records to SUs in batches
• Bypass routers and message brokers
• Efficient import into storage unit
Bulk Load in YDOT
• YDOT bulk inserts can cause performance
hotspots
• Solution: preallocate tablets
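One way to preallocate tablets is to pick boundary keys from a sample of the data about to be loaded, so that each new tablet receives a similar share. The slide does not say how boundaries are chosen, so the equal-count rule and the function name below are assumptions.

```python
def choose_split_points(sample_keys, num_tablets):
    # Pick num_tablets - 1 boundary keys that divide the sorted sample
    # into roughly equal parts.
    keys = sorted(sample_keys)
    if not keys:
        return []
    return [keys[len(keys) * i // num_tablets] for i in range(1, num_tablets)]


sample = ["Apple", "Avocado", "Banana", "Grape", "Kiwi",
          "Lemon", "Lime", "Orange", "Tomato"]
boundaries = choose_split_points(sample, num_tablets=3)
```

With tablets created at these boundaries before the load starts, inserts that arrive in key order spread over several storage units instead of piling onto one.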
Index Maintenance
• How to have lots of interesting indexes,
without killing performance?
• Solution: Asynchrony!
– Indexes updated asynchronously when base
table updated
Planned functionality
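A sketch of asynchronous index maintenance, assuming a single secondary index and an in-process queue. The `Table` class is hypothetical, and since the slide marks this as planned functionality, nothing here reflects an actual implementation. The write path touches only the base table, and the index catches up later.

```python
from collections import deque


class Table:
    """Hypothetical base table with one asynchronously maintained index."""

    def __init__(self, indexed_field):
        self.rows = {}
        self.indexed_field = indexed_field
        self.index = {}            # field value -> set of primary keys
        self.pending = deque()     # index updates not yet applied

    def set(self, key, record):
        # The write path only touches the base table and enqueues index work.
        old = self.rows.get(key)
        self.rows[key] = record
        self.pending.append((key, old, record))

    def apply_index_updates(self):
        # Run later (e.g. by a background worker) to bring the index up to date.
        while self.pending:
            key, old, new = self.pending.popleft()
            if old is not None:
                self.index.get(old[self.indexed_field], set()).discard(key)
            self.index.setdefault(new[self.indexed_field], set()).add(key)

    def lookup(self, value):
        return sorted(self.index.get(value, ()))


table = Table("category")
table.set("1234323", {"category": "transportation"})
before = table.lookup("transportation")   # index has not caught up yet
table.apply_index_updates()
after = table.lookup("transportation")
```

The cost of the faster write path is that index lookups can briefly miss recent writes, as the `before` and `after` values show.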
SHERPA IN CONTEXT
MObStor
• Yahoo!’s next-generation globally replicated, virtualized
media object storage service
• Better provisioning, easy migration, replication, better
BCP, and performance
• New features (Evergreen URLs, CDN integration, REST
API, …)
• The object metadata problem is addressed using Sherpa,
though MObStor itself is focused on blob storage.
Storage & Delivery Stack
The World Has Changed
• Web applications need:
– Scalability!
• Preferably elastic
– Geographic distribution
– High availability
– Reliable storage
• Web applications can do without:
– Complicated queries
– Strong transactions
Web Data Management
Large data analysis (Hadoop)
• Scan oriented workloads
• Focus on sequential disk I/O
• $ per cpu cycle
Structured record storage (PNUTS)
• CRUD
• Point lookups and short scans
• Index organized table and random I/Os
• $ per latency
Blob storage (SAN/NAS)
• Object retrieval and streaming
• Scalable file storage
• $ per GB
Types of Record Stores
• Query expressiveness, from simple to feature rich:
– S3: object retrieval
– PNUTS: retrieval from single table of objects/records
– Oracle: SQL
Types of Record Stores
• Consistency model, from best effort to strong guarantees:
– S3: eventual consistency
– PNUTS: timeline consistency (object-centric consistency)
– Oracle: ACID (program centric consistency)
Types of Record Stores
• Elasticity (ability to add resources on demand)
– Systems, in the order shown: Oracle, PNUTS, S3
– Scale labels: Not scalable; Limited (via data distribution); Elastic; VLSD (Very Large Scale Distribution/Replication)
Data Stores Comparison
• User-partitioned SQL stores (Microsoft Azure SDS, Amazon SimpleDB), versus PNUTS:
– More expressive queries
– Users must control partitioning
– Limited elasticity
• Multi-tenant application databases (Salesforce.com, Oracle on Demand), versus PNUTS:
– Highly optimized for complex workloads
– Limited flexibility to evolving applications
– Inherit limitations of underlying data management system
• Mutable object stores (Amazon S3), versus PNUTS:
– Object storage versus record management
Application Design Space
[Figure: systems placed on two axes, “Get a few things” versus “Scan everything” and “Records” versus “Files”. Systems shown: Sherpa, MySQL, Oracle, BigTable, Everest, MObStor, YMDB, Filer, Hadoop.]
Alternatives Matrix
• Systems compared: Sherpa, Y! UDB, MySQL, Oracle, HDFS, BigTable, Dynamo, Cassandra
• Criteria: Elastic, Operability, Availability, Global low latency, Structured access, Updates, Consistency model, SQL/ACID
Further Reading
Efficient Bulk Insertion into a Distributed Ordered Table (SIGMOD 2008)
Adam Silberstein, Brian Cooper, Utkarsh Srivastava, Erik Vee,
Ramana Yerneni, Raghu Ramakrishnan
PNUTS: Yahoo!'s Hosted Data Serving Platform (VLDB 2008)
Brian Cooper, Raghu Ramakrishnan, Utkarsh Srivastava,
Adam Silberstein, Phil Bohannon, Hans-Arno Jacobsen,
Nick Puz, Daniel Weaver, Ramana Yerneni
Opening Up Yahoo! Search
Phase 1
Giving site owners and developers
control over the appearance of Yahoo!
Search results.
Phase 2
BOSS takes Yahoo!’s open strategy to
the next level by providing Yahoo!
Search infrastructure and technology to
developers and companies to help them
build their own search experiences.
Search Results of the Future
yelp.com
Gawker
babycenter
New York Times
epicurious
LinkedIn
answers.com
webmd
BOSS Offerings
BOSS offers two options for companies and developers and has partnered with top technology
universities to drive search experimentation, innovation and research into next generation
search.
API
A self-service, web services model for
developers and start-ups to quickly
build and deploy new search
experiences.
CUSTOM
Working with 3rd parties to build a
more relevant, brand/site specific
web search experience.
This option is jointly built by Yahoo!
and select partners.
ACADEMIC
Working with the following
universities to allow for wide-scale
research in the search field:
• University of Illinois Urbana Champaign
• Carnegie Mellon University
• Stanford University
• Purdue University
• MIT
• Indian Institute of Technology Bombay
• University of Massachusetts
(Slide courtesy Prabhakar Raghavan)
Partner Examples
QUESTIONS?