Microsoft Zune: Media Storage and Delivery
Bing Realtime Facebook/Twitter Search Ingestion Engine
[Diagram: ingestion engine VMs reading from and writing to Windows Azure Blobs and Windows Azure Tables.]
What’s new for Blobs, Tables and Queues

Blobs
Tables
Queues
Drives
Account
    Container > Blobs: https://<account>.blob.core.windows.net/<container>
    Table > Entities: https://<account>.table.core.windows.net/<table>
    Queue > Messages: https://<account>.queue.core.windows.net/<queue>
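The URL patterns above can be illustrated with a small helper. This is a minimal sketch only: the function name `storage_url` and the account and resource names in the usage line are hypothetical, and only the three URL shapes come from the slide.

```python
def storage_url(account, service, resource=""):
    """Build the endpoint URL for a container, table, or queue.

    `service` must be one of the three services shown on the slide.
    """
    if service not in ("blob", "table", "queue"):
        raise ValueError(f"unknown service: {service!r}")
    return f"https://{account}.{service}.core.windows.net/{resource}"


# Hypothetical account and container names, for illustration only.
print(storage_url("myaccount", "blob", "pictures"))
```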
Blobs – What is new?
• Range requests of the form “Range: bytes=100-”
• Return “Accept-Ranges” response header
• ETags to be quoted
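To make the open-ended range form concrete, the sketch below resolves a Range header value against a blob of known length. It assumes the standard HTTP syntax `bytes=<first>-` or `bytes=<first>-<last>`; the function name `resolve_range` is hypothetical, and suffix ranges such as `bytes=-500` are deliberately not handled.

```python
from typing import Tuple


def resolve_range(header_value: str, blob_length: int) -> Tuple[int, int]:
    """Return the inclusive (first, last) byte offsets a Range value selects."""
    unit, _, spec = header_value.partition("=")
    if unit.strip() != "bytes" or "-" not in spec:
        raise ValueError(f"unsupported Range value: {header_value!r}")
    first_text, _, last_text = spec.strip().partition("-")
    first = int(first_text)  # raises ValueError for suffix ranges like "-500"
    # An open-ended range ("100-") runs to the end of the blob.
    last = int(last_text) if last_text else blob_length - 1
    last = min(last, blob_length - 1)
    if first > last:
        raise ValueError("range not satisfiable")
    return first, last


print(resolve_range("bytes=100-", 1000))
```

An open-ended range is useful for resuming an interrupted download without knowing the blob length in advance.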
Tables – What is new?
• Query Projection ($select)
    • Project only selected columns
• Upsert Entity
    • InsertOrReplace
    • InsertOrMerge
Projection
public class Customer
{
    public string PartitionKey { get; set; }   // Customer Name
    public string RowKey { get; set; }         // Customer Phone Number
    public DateTime CustomerSince { get; set; }
    public double TotalPurchase { get; set; }
    public string State { get; set; }
    // 100 more properties including profile picture etc.…
}

// Partial entity defined here
public class CustomerDiscount
{
    public string PartitionKey { get; set; }
    public string RowKey { get; set; }
    public double TotalPurchase { get; set; }
}
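Projection
// Select partial entities by choosing properties to be projected
// (names lost from the slide, such as partialResults and c, are assumed)
var partialResults = (from c in context.CreateQuery<CustomerDiscount>("Customers" /*Table Name*/)
                      select new CustomerDiscount
                      {
                          PartitionKey = c.PartitionKey,
                          RowKey = c.RowKey,
                          TotalPurchase = c.TotalPurchase
                      });

foreach (CustomerDiscount customer in partialResults)
{
    // Calculate the discount to be given based on total purchases made
}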
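On the wire, a projection is expressed with the `$select` query option named earlier. The sketch below only builds such a query URL; it does not sign or send the request, and the helper name `projection_query_url` and the account name are hypothetical.

```python
def projection_query_url(account, table, properties):
    """Build a Table service query URL that projects only `properties`."""
    if not properties:
        raise ValueError("at least one property must be selected")
    select = ",".join(properties)
    return f"https://{account}.table.core.windows.net/{table}()?$select={select}"


# Project the three properties of the partial CustomerDiscount entity.
print(projection_query_url("myaccount", "Customers",
                           ["PartitionKey", "RowKey", "TotalPurchase"]))
```

Projecting three properties instead of the full entity keeps the large profile data out of the response, which is the point of the partial `CustomerDiscount` class.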
Upsert
// When user logs in from mobile device, it will register the user using upsert
// (the Address property name is assumed; it is one of the properties not shown above)
Customer customer = new Customer
{
    PartitionKey = "Thomas Anderson",
    RowKey = "555-555-0100",
    Address = "4567 Main St. Redmond 48188",
    State = "Washington"
};

// Note: AttachTo method is called without an Etag which indicates
// that this is an Upsert Command
context.AttachTo("Customers" /*Table Name*/, customer);
context.UpdateObject(customer);

// No SaveChangesOptions indicates that a MERGE verb will be used
// to get InsertOrMerge semantics.
// Use SaveChangesOptions.ReplaceOnUpdate for InsertOrReplace semantics.
// But InsertOrReplace will overwrite TotalPurchase if it existed:
// context.SaveChanges(SaveChangesOptions.ReplaceOnUpdate);
context.SaveChanges();
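The difference between the two upsert flavors can be modeled with plain dictionaries. This is an in-memory sketch of the semantics only, not the storage client API; the function names and the sample property values are hypothetical.

```python
def insert_or_merge(table, entity):
    """Insert the entity, or merge its properties into the stored one."""
    key = (entity["PartitionKey"], entity["RowKey"])
    merged = dict(table.get(key, {}))
    merged.update(entity)  # properties not mentioned in `entity` survive
    table[key] = merged


def insert_or_replace(table, entity):
    """Insert the entity, or replace the stored one entirely."""
    key = (entity["PartitionKey"], entity["RowKey"])
    table[key] = dict(entity)  # properties not mentioned in `entity` are lost


key = ("Thomas Anderson", "555-555-0100")
login = {"PartitionKey": key[0], "RowKey": key[1], "State": "Washington"}

merged_table = {key: {"PartitionKey": key[0], "RowKey": key[1], "TotalPurchase": 250.0}}
insert_or_merge(merged_table, login)

replaced_table = {key: {"PartitionKey": key[0], "RowKey": key[1], "TotalPurchase": 250.0}}
insert_or_replace(replaced_table, login)
```

After the merge the stored entity still carries `TotalPurchase`; after the replace it does not, which is why the slide warns about InsertOrReplace.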
Queues – What is new?
UPDATE MESSAGE EXAMPLE
[Diagram: timeline of workers processing work items from an Azure Queue, with time labels 7:00 AM, 7:04, 7:05, 7:07 AM, 7:09 and 7:14.]
• Get Message with 5 minutes visibility timeout (expires @ 7:05 AM)
• Extend visibility timeout with another 5 minutes (expires @ 7:09 AM)
• Periodically store progress information in message content
• Retrieve progress from queue message and resume
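The update-message pattern can be walked through with a tiny in-memory queue. This is a simulation under stated assumptions, not the Queue service API: the class and method names are hypothetical, times are minutes after 7:00 AM, and the first worker is assumed to crash after storing its progress.

```python
class Message:
    def __init__(self, content):
        self.content = content
        self.visible_at = 0  # minute at which the message becomes visible


class SimulatedQueue:
    def __init__(self):
        self.messages = []

    def put(self, content):
        self.messages.append(Message(content))

    def get(self, now, visibility_timeout):
        """Return the first visible message and hide it for the timeout."""
        for message in self.messages:
            if message.visible_at <= now:
                message.visible_at = now + visibility_timeout
                return message
        return None

    def update(self, message, now, visibility_timeout, content=None):
        """Reset the visibility timeout and optionally rewrite the content."""
        message.visible_at = now + visibility_timeout
        if content is not None:
            message.content = content


queue = SimulatedQueue()
queue.put("work item")
msg = queue.get(now=0, visibility_timeout=5)        # 7:00, expires 7:05
queue.update(msg, now=4, visibility_timeout=5)      # 7:04, expires 7:09
queue.update(msg, now=7, visibility_timeout=2,      # 7:07, still expires 7:09
             content="work item;progress=50%")
# The first worker crashes here; the message reappears at 7:09.
resumed = queue.get(now=9, visibility_timeout=5)    # 7:09, expires 7:14
```

Because progress lives in the message itself, the second worker can resume instead of restarting the work item.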
Windows Azure Storage Analytics
Log Version
Accessing Account
Owner Account
Service Type
Request URL
Object Key
Request ID
Operation Number
Request Version
Operation Type
Start Time
Application End to End Latency
Storage Server Latency
Authentication Type
Request Status
HTTP Status Code
Client IP
User Agent
Referrer
Client Request ID
ETag
LMT
Request Packet Size
Request Header Size
Response Packet Size
Response Header Size
Request MD5
Server MD5
Conditions Used
Log Entry in Blob:
1.0;2011-07-28T18:02:40.6271789Z;PutBlob;Success;201;28;21;authenticated;sally;sally;blob;"http://sally.blob.core.windows.net/thumbnails/lake.jpg?timeout=30000";"/sally/thumbnails/lake.jpg";fb658ee6-6123-41f5-81e2-4bfdc178fea3;0;201.9.10.20;2009-09-19;438;100;223;0;100;;"66CbMXKirxDeTr82SXBKbg==";"0x8CE1B67AD25AA05";Thursday, 28-Jul-11 18:02:40 GMT;;;;"req12345"

Log Version: 1.0
Start Time: 2011-07-28T18:02:40.6271789Z
Operation Type: PutBlob
Status: Success
HTTP Status Code: 201
Application E2E Latency (milliseconds): 28
Storage Server Latency (milliseconds): 21
Accessing Account: sally
Owner Account: sally
Service Type: blob
Request URL: PUT http://sally.blob.core.windows.net/thumbnails/lake.jpg
Object Key: /sally/thumbnails/lake.jpg
Request ID: fb658ee6-6123-41f5-81e2-4bfdc178fea3
Operation Number: 0
Request Version: 2009-09-19
Client IP: 201.9.10.20
Client Request ID: req12345
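A log entry like the one above is a semicolon-delimited record with some fields quoted, so a CSV reader can split it. The sketch below names only the leading fields that the slide annotates, in the order they appear in the sample entry, plus the final client request ID; the snake_case field names and the function name are my own labels, not official ones.

```python
import csv

SAMPLE = (
    '1.0;2011-07-28T18:02:40.6271789Z;PutBlob;Success;201;28;21;authenticated;'
    'sally;sally;blob;'
    '"http://sally.blob.core.windows.net/thumbnails/lake.jpg?timeout=30000";'
    '"/sally/thumbnails/lake.jpg";fb658ee6-6123-41f5-81e2-4bfdc178fea3;0;'
    '201.9.10.20;2009-09-19;438;100;223;0;100;;"66CbMXKirxDeTr82SXBKbg==";'
    '"0x8CE1B67AD25AA05";Thursday, 28-Jul-11 18:02:40 GMT;;;;"req12345"'
)

# Labels for the first 17 positions, matched against the annotated sample.
LEADING_FIELDS = [
    "log_version", "start_time", "operation_type", "request_status",
    "http_status_code", "e2e_latency_ms", "server_latency_ms",
    "authentication_type", "accessing_account", "owner_account",
    "service_type", "request_url", "object_key", "request_id",
    "operation_number", "client_ip", "request_version",
]


def parse_log_entry(line):
    """Split one log line and label the fields this sketch knows about."""
    values = next(csv.reader([line], delimiter=";", quotechar='"'))
    entry = dict(zip(LEADING_FIELDS, values))
    entry["client_request_id"] = values[-1]  # last field in the sample
    return entry


entry = parse_log_entry(SAMPLE)
# Time spent outside the storage server (network plus client), in ms.
outside_server_ms = int(entry["e2e_latency_ms"]) - int(entry["server_latency_ms"])
```

Subtracting the server latency from the end-to-end latency gives a rough idea of how much of the request time was spent outside the storage server.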
• Total Transactions
• Availability
• % Success, % Network Errors, % Timeout, % Throttled, etc.
• Average Latency (Application E2E and Storage Server latency)
• Total Ingress
• Total Egress
• Capacity and # of objects
[Diagram: Application E2E Latency spans the whole request; Storage Server Latency runs from when the request arrives at the storage service until it is done.]
[Chart: Avg. Application E2E Latency (ms) and Avg. Storage Server Latency (ms), 8/23/2011 10:00 to 8/26/2011 6:00; latency axis 0 to 1400 ms.]
[Chart: Total Table Transactions and Avg. Application E2E Latency (ms), 8/23/2011 to 8/26/2011; transactions axis 0 to 10000000, latency axis 0 to 1400 ms.]
• Logs: http://account.blob.core.windows.net/$logs/
• Metrics: http://account.table.core.windows.net/$Metrics*
Geo-replication
• North Central US and South Central US
• North Europe and West Europe
• East Asia and South East Asia
Microsoft Windows Azure Support
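[Diagram: geo-failover. A client performs a DNS lookup of account.blob.core.windows.net against Azure DNS and then accesses data at http://account.blob.core.windows.net/. The DNS entry (Hostname, IP Address) initially points to North Central US, which geo-replicates to South Central US. On failover, DNS is updated to point to South Central US.]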
Windows Azure Storage Internals
Design Goals
• “Windows Azure Storage: A Highly Available Cloud Storage Service with Strong Consistency”
Access blob storage via the URL: http://<account>.blob.core.windows.net/
[Diagram: the Location Service directs data access to Storage Stamps. Each Storage Stamp has a load balancer (LB), Front-Ends, a Partition Layer and a DFS Layer with intra-stamp replication. Inter-stamp (geo) replication runs between the stamps.]
Distributed File System (DFS) Layer
• All data from the Partition Layer is stored into files (extents) in the DFS layer
• An extent is replicated 3 times across different fault and upgrade domains
• Checksum all stored data
    • Verified on every client read
    • Scrubbed every few days
• Load balancing
    • 3 replicas are randomly allocated across a candidate set of servers based on available resources
    • Any of the 3 replicas can be read from and read load balancing is used
• Use a journal drive to keep the write latencies low
• Re-replicate on disk/node/rack failure or checksum mismatch
[Diagram: DFS Layer with three masters (M) coordinated by Paxos, above the DFS Servers.]
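The checksum and re-replication bullets for the DFS layer can be sketched in a few lines. This is an illustrative model, not the actual DFS implementation: CRC32 is an assumed checksum (the slides do not name one), and the class and function names are hypothetical.

```python
import zlib


class Replica:
    """One copy of an extent together with the checksum stored at write time."""

    def __init__(self, data):
        self.data = bytes(data)
        self.checksum = zlib.crc32(self.data)

    def is_valid(self):
        return zlib.crc32(self.data) == self.checksum


def write_extent(data, copies=3):
    """Store the extent as `copies` independent replicas."""
    return [Replica(data) for _ in range(copies)]


def read_extent(replicas):
    """Verify the checksum on read; repair bad replicas from a good one."""
    for index, replica in enumerate(replicas):
        if replica.is_valid():
            # Every replica before `index` failed verification: re-replicate.
            for bad in range(index):
                replicas[bad] = Replica(replica.data)
            return replica.data
    raise IOError("all replicas failed checksum verification")


replicas = write_extent(b"extent payload")
replicas[0].data = b"extent pAyload"  # simulate silent corruption on one disk
recovered = read_extent(replicas)
```

A real system would also pick the replica to read based on load; this sketch simply tries them in order.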
Partition Layer
• Provides transaction semantics and strong consistency for high level data abstractions
• Stores and reads the objects to/from extents in the DFS layer
• Provides inter-stamp (geo) replication by shipping logs to other stamps
• Scalable object index via partitioning
[Diagram: Partition Layer with a Partition Master, a Lock Service and Partition Servers, above the DFS Layer (masters coordinated by Paxos, DFS Servers).]
Front End Layer
• Stateless Servers
• Authentication + authorization
• Request routing
[Diagram: Front End Layer (FE servers) above the Partition Layer (Partition Master, Lock Service, Partition Servers) and the DFS Layer (masters coordinated by Paxos, DFS Servers).]
[Diagram: an Incoming Write Request passes from the Front End Layer through the Partition Layer to the DFS Layer, and an Ack is returned.]
• Need a scalable index for the objects that can
    • Spread the index across 100s of servers
    • Dynamically load balance
    • Dynamically change what servers are serving each part of the index based on load
Blob Index

Account Name    Container Name    Blob Name
aaaa            aaaa              aaaaa
……..            ……..              ……..
harry           pictures          sunrise
harry           pictures          sunset
……..            ……..              ……..
richard         videos            soccer
richard         videos            tennis
……..            ……..              ……..
zzzz            zzzz              zzzzz

Partition Map (used by the Front-End Server)
A-H: PS1
H’-R: PS2
R’-Z: PS3
[Diagram: Storage Stamp. The Partition Master holds the Partition Map (A-H: PS1, H’-R: PS2, R’-Z: PS3), and each Partition Server (PS 1, PS 2, PS 3) serves one of the ranges A-H, H’-R and R’-Z.]
[Diagram: load balancing. Requests arrive at a VIP and reach FE 1, FE 2 and FE 3; the PM assigns RangePartitions to Partition Server 1 through Partition Server 4 above the DFS Layer. Legend: RangePartition, Server Load.]
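The partition map lookup a front end performs can be sketched as a binary search over range boundaries. The split points below are assumptions chosen to match the example rows of the blob index (the slides only label the ranges A-H, H’-R and R’-Z), and the function name is hypothetical.

```python
import bisect

# Inclusive upper bound of every range except the last (assumed split points).
UPPER_BOUNDS = [
    ("harry", "pictures", "sunrise"),   # end of range A-H, served by PS1
    ("richard", "videos", "soccer"),    # end of range H'-R, served by PS2
]
SERVERS = ["PS1", "PS2", "PS3"]         # PS3 serves everything after R


def lookup_partition_server(account, container, blob):
    """Return the partition server whose key range contains the blob."""
    key = (account, container, blob)
    # bisect_left finds the first boundary that is >= key, so a key equal
    # to a boundary stays in the lower range.
    return SERVERS[bisect.bisect_left(UPPER_BOUNDS, key)]
```

Load balancing then amounts to editing the boundaries or the server assignments; the lookup itself stays unchanged.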
1. Scalability targets of a single storage account
2. Scalability targets for Blobs, Table Entities and Queues within a storage account

Scalability targets of a single storage account
Account Scalability Targets
• Capacity – Up to 100 TBs
• Transactions – Up to 5000 entities per second
• Bandwidth – Up to 3 gigabits per second
• Partition data across storage accounts to go beyond these targets

Scalability targets for Blobs, Table Entities and Queues within a storage account
• Single Blob – up to 60 MBytes per second
• Single PartitionKey in a Table – up to 500 entities per second
• Single Queue – up to 500 messages per second
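As a rough planning aid, the account-level targets can be turned into an estimate of how many storage accounts a workload needs. This is simple arithmetic under the assumption that load spreads evenly across accounts; the function and dictionary names are hypothetical, and only the three target values come from the slide.

```python
import math

# Per-account targets as stated on the slide.
ACCOUNT_TARGETS = {
    "capacity_tb": 100,
    "transactions_per_sec": 5000,
    "bandwidth_gbps": 3,
}


def accounts_needed(capacity_tb, transactions_per_sec, bandwidth_gbps):
    """Estimate the number of accounts needed to stay within every target."""
    demand = {
        "capacity_tb": capacity_tb,
        "transactions_per_sec": transactions_per_sec,
        "bandwidth_gbps": bandwidth_gbps,
    }
    per_target = [
        math.ceil(demand[name] / limit) for name, limit in ACCOUNT_TARGETS.items()
    ]
    # The most constrained dimension decides; at least one account is needed.
    return max(1, max(per_target))


print(accounts_needed(capacity_tb=250, transactions_per_sec=4000, bandwidth_gbps=1))
```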
• “Windows Azure Storage: A Highly Available Cloud Storage Service with Strong Consistency”
http://blogs.msdn.com/windowsazurestorage/
http://forums.dev.windows.com
http://bldw.in/SessionFeedback