Nicholas Dritsas Represents the customer-facing resources from the Server Product Groups.

Nicholas Dritsas

Represents the customer-facing resources from the Server Product Groups. Azure CAT is comprised of product and solution experts that regularly engage in the largest, most complex, and most unique customer deployments worldwide.

http://sqlcat.com

http://blogs.msdn.com/b/windowsazure

Objectives

− Learn about the largest SQL Server and Azure SQL DB projects
− See the techniques they use to scale
− Fit for purpose

Takeaways

− SQL Server and Azure SQL DB will scale to handle very large projects
− Both are enterprise ready
− Very flexible architecture choices
− DBAs more critical than ever

Category: Metric

Largest single database: 350 TB
Largest table: 1.5 trillion rows
Biggest total data 1 application: 88 PB
Highest database transactions per second 1 server (from Perfmon): 130,000
Fastest I/O subsystem in production (SQLIO 64k buffer serial read test): 18 GB/sec
Fastest “real time” cube: Millisecond latency
Data load for 1TB: 30 minutes*
Largest MOLAP cube: 24 TB

Category: Metric

Largest sharded database: 20 TB
Largest number of databases 1 app: 14,000
Most concurrent users 1 app: 3 M

Largest Deployment?

• *.cs = 14,832
• Web.config = 149
• *.csproj = 889
• *.sln = 210

• Compression
• Table Partitioning
• Partitioned Views
• Service Broker
• File Group and File management
• Caching
• SODA – Services Oriented Database Architecture
• DDR – Data Dependent Routing
• Scale up or Scale out

DATA WAREHOUSE / BI

• Building an 80TB data warehouse
• Very fast disk subsystem
• Scale out AS and RS

• A large electronics company wanted to build a system to support 80TB of data for their reporting needs.
• The data is coming from sensors in their manufacturing line.
• They hold 36 months of data, one per filegroup, to manage compression and HA.
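The 36-month layout, with one month per filegroup, can be pictured as a mapping from a row's date to a filegroup name. The sketch below is only an illustration under stated assumptions: the `FG_YYYY_MM` naming scheme and both function names are hypothetical and do not come from the customer's system.

```python
from datetime import date

MONTHS_RETAINED = 36  # from the slide: 36 months of data, one month per filegroup


def filegroup_for(d: date) -> str:
    """Return the (hypothetical) filegroup name that would hold rows dated d."""
    return f"FG_{d.year:04d}_{d.month:02d}"


def retained_filegroups(today: date, months: int = MONTHS_RETAINED) -> list[str]:
    """List filegroup names for the window ending at today's month, oldest first."""
    names = []
    year, month = today.year, today.month
    for _ in range(months):
        names.append(f"FG_{year:04d}_{month:02d}")
        month -= 1
        if month == 0:  # step back across a year boundary
            year, month = year - 1, 12
    return list(reversed(names))
```

With each month isolated like this, maintenance work can be aimed at a single month rather than at the whole 80TB.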

H/W Configurations

Production Server: HP DL980 (Production)
• MS SQL Server 2008 R2
• Windows Server 2008 R2
• 2 CPU * 10 Core = 20 Core
• 512GB memory
• 8Gbps FC * 20port
• PowerPath

Standby Server: HP DL980 (Standby)
• MS SQL Server 2008 R2
• Windows Server 2008 R2
• 2 CPU * 10 Core = 20 Core
• 512GB memory
• 8Gbps FC * 20port

SAN Switch
• DS5300B * 2ea
• 8Gbps FC * 96port

Storage: VMAX SE * 3 ea
• Cache 128GB
• 8Gbps FC * 16 Port
• 300GB 15k * 88ea Raid-5 : U17TB for Data Area
• 300GB 15k * 6ea : Hot Spare
• 600GB 10k * 44ea Raid-5 : U17TB for BCV
• 600GB 10k * 4ea : Hot Spare
• Same Specs. For 3 VMAX SE
• EMC Replication Manager, TimeFinder/Clone, PowerPath

Other labels in the diagram: 20 (at each server), Total 24, and a U17TB / BCV pair on each of the three storage units.

VMAX SE Building Block

• VMAX-SE
• 88 450GB 15K rpm Disks
  − 8 Hot Spares
  − 80 “Data disks”
• Raid 5 3+1 (3RAID5)
  − 20 RAID sets
  − Each “set” is a single 3+1 group
• 2,800 MB/sec target throughput
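The building-block figures are internally consistent, which a few lines of arithmetic can confirm. The disk counts, RAID layout, and throughput target come from the slide. The derived capacity is a raw figure that ignores formatting overhead, and the per-set and per-disk rates assume the target is spread evenly, which the slide does not state.

```python
TOTAL_DISKS = 88
HOT_SPARES = 8
DISK_GB = 450
DATA_DISKS_PER_SET = 3    # RAID 5 3+1: three data disks ...
PARITY_DISKS_PER_SET = 1  # ... plus one disk's worth of parity
TARGET_MB_PER_SEC = 2800

data_disks = TOTAL_DISKS - HOT_SPARES                                    # 80 "data disks"
raid_sets = data_disks // (DATA_DISKS_PER_SET + PARITY_DISKS_PER_SET)    # 20 RAID sets
raw_usable_gb = raid_sets * DATA_DISKS_PER_SET * DISK_GB                 # capacity net of parity

# Assumption: the throughput target is shared evenly across sets and disks.
mb_per_sec_per_set = TARGET_MB_PER_SEC / raid_sets
mb_per_sec_per_disk = TARGET_MB_PER_SEC / data_disks
```

Under the even-spread assumption, each 3+1 group needs to sustain 140 MB/sec, or 35 MB/sec per disk, to reach the stated target.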

• Room forecasting system
• Full suite of SQL products (SQL, AS, IS, RS)

Scale out AS and RS

• If you need more capacity, just add another server
• Load Balanced Analysis Services reader machines
• 40 to 50 concurrent users per RS server
• Complex queries
• Large data sets returned to many clients
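The "just add another server" rule, combined with the 40 to 50 concurrent users per RS server figure, gives a rough sizing calculation. This is a back-of-the-envelope sketch: the function name is hypothetical, and defaulting to the conservative end of the range (40) is an assumption.

```python
import math


def rs_servers_needed(concurrent_users: int, users_per_server: int = 40) -> int:
    """Estimate how many load-balanced RS servers a given user count would need."""
    if concurrent_users < 0 or users_per_server <= 0:
        raise ValueError("user counts must be non-negative and capacity positive")
    # Round up: a partially used server still has to exist.
    return math.ceil(concurrent_users / users_per_server)
```

For example, 200 concurrent users would need 5 servers at 40 users each, or 4 servers at 50 users each.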

OLTP

• Scale out one database
• Scale up one database
• No downtime allowed – ever
• 16,000 + instances

• The Commerce Transaction Platform supports Billing and subscriptions (eCommerce) for Microsoft products such as adCenter, Xbox Live, Zune, Windows Live Hotmail Plus, and Azure.
• The Commerce Transaction Platform supports payments using 13 payment methods spanning 42 currencies across 65 localized markets.

• 5 DBAs
• 2 Datacenters
• 5 Webstore Clusters
• 15 Replicas (Financial Reporting; Fraud etc)
• 220 SQL 2012 servers (no VM)
• 736 databases (excluding Mirrors and Secondaries)
• 121 TB of datafiles (excluding Mirrors and Secondaries)
• 420 TB of storage – DAS & SAN (EMC/HDS)
• 12 TB monthly growth
• 82 DB Mirror Pairs – in DC most asynch, some with auto failover
• 70 Log Shipping pairs – cross DC
• 400 Replication subscription streams, 6 distributors

• Chose to scale up
• 17 TB OLTP database
• 10,000 concurrent user connections
• Replication for reporting database
• SAN hardware replication used for Disaster Recovery

[Diagram: Primary Data Centre (Active). A Private WAN and Load Balancer sit in front of the OLTP DB Server (SQL2008), which feeds the Reporting DB Server (SQL2008) through SQL Replication.]

How do we scale the solution?

• “Performance Engineering” is at the heart of our methodology
• Scale Application layer by adding additional servers into the VIP
• Scale DB by adding CPUs: support for large SMP hardware platforms is fundamental → MS
• Segregation of OLTP and Reporting workloads → this allows specific tuning of the workloads

• World’s biggest publicly listed online gaming platform
• 15 million page views and up to 980,000 unique users a day

Environment

• 5 DBAs & 1 Database Architect
• 200+ SQL Server Instances
• 150+ TB of data, 4,000+ Databases
• 2+ PB storage
• 10+ TB RAM
• 450,000+ SQL Statements per second on a single server
• 500+ Billion database transactions per day
• No downtime allowed

• Fujitsu RX-600 S5
• 64 Cores
• 1TB RAM
• Multiple 1 Gbit NIC interfaces
  − Separate VLANs for Client Access, Cluster Intercom, Backup and Mirroring traffic
• External SAS Diskshelves for Data Files
  − Attached with 4x 8GB Fibre Channel
• FusionIO PCI-E Solid State Devices for Transaction Log

Scale UP and Scale OUT

• Scale UP main financial transactions
• Scale OUT other application functions

High Availability

• 3 data centers
• Synchronous Database Mirroring adds 1 ms per transaction
• Backup 2 TB per hour over the network
• http://sqlcat.com/whitepapers/archive/2009/08/13/a-technical-case-study-fast-and-reliable-backup-and-restore-of-a-vldb-over-the-network.aspx

Case study

http://www.microsoft.com/casestudies/Microsoft-SQL-Server-2012/bwin.party/Company-Cuts-Reporting-Time-by-up-to-99-Percent-to-3-Seconds-and-Boosts-Scalability/710000000087

Scale UP

• Table Partitioning
• NUMA
• Worker Threads

Scale Out

• Table Partitioning
• Partitioned Views
• SODA
• Data Dependent Routing
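Data Dependent Routing means the application decides which database to call from the data in the request itself. A minimal sketch of that idea follows, assuming a fixed list of shards and a stable hash of the routing key. The shard names and the `shard_for` function are hypothetical, and hash-modulo is only one possible routing rule, not necessarily the one any of these customers used.

```python
import zlib

# Hypothetical shard names; a real deployment would hold connection strings here.
SHARDS = ["shard0", "shard1", "shard2", "shard3"]


def shard_for(key: str, shards: list[str] = SHARDS) -> str:
    """Route a key to a shard using a stable CRC32 hash modulo the shard count."""
    # zlib.crc32 is deterministic across processes, unlike the built-in hash() for str.
    return shards[zlib.crc32(key.encode("utf-8")) % len(shards)]
```

One trade-off of hash-modulo routing is that changing the number of shards remaps most keys. A lookup table from key (or key range) to shard avoids that, at the cost of maintaining the table.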

• Windows Server 2012 limit is 640 cores
• New concept: Kernel Groups
• A bit like NUMA, but an extra layer in the hierarchy
• SQL Server 2012 limit is 640 cores
• 4 TB RAM, both Windows and SQL Server

How do you:

• Manage an 80 TB
• Back up a petabyte
• Build an index on a trillion row table

Answer

− Break it into manageable size pieces

PDW Appliance

• Control Rack: Control Nodes (Active / Passive), Landing Zone, Backup Node, Management Servers
• Compute Rack: Compute Nodes (Database Servers, each running SQL), Storage Nodes, Spare Database Server

• Separate your data by business function
  − Example: HR, Payroll, Accounting, etc
• Or by user function
  − Example: Login, chat, email, pictures, etc
• Each function goes in a different database
• A common database for shared tables
• databases (to keep all joins local)
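Separating data by function comes down to a lookup from the function a request belongs to, to the database that owns that function's tables. The sketch below illustrates that with the user-function examples from the slide. All database names and the `database_for` helper are hypothetical.

```python
# Hypothetical database names, one per user function named on the slide.
FUNCTION_TO_DATABASE = {
    "login": "LoginDB",
    "chat": "ChatDB",
    "email": "EmailDB",
    "pictures": "PicturesDB",
}

# Hypothetical name for the common database that holds shared tables.
SHARED_DATABASE = "CommonDB"


def database_for(function: str) -> str:
    """Return the database that owns the given function's tables."""
    try:
        return FUNCTION_TO_DATABASE[function.lower()]
    except KeyError:
        raise ValueError(f"no database registered for function {function!r}") from None
```

Because each function maps to exactly one database, a function can be moved to its own server by changing one entry in the map.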

auto-scale

• Smart TVs require frequent updates with new apps and software changes for better support and compatibility.
• Due to the high success of the Smart TV line (20M TVs live and counting), Samsung needs a more scalable and elastic system.
• Achieved 10x performance levels compared to the on-premises system, from 500 web requests/sec to 5,000.
• Main optimizations:
  − Dropped iBATIS, a .NET library to access SQL. Now use stored procedures and ADO.NET.
  − Used IIS cache for the lookup data. We have a web role that refreshes the cache every 5 minutes.
  − Moved heavy sequential log inserts to table storage.
  − Moved from 20 L to 10 XL web roles, and raised the connection limit to 500 from 100 per web role. Having XL, we have better bandwidth.
  − Switched from WCF to MVC.
  − Host their SQL Azure DB in XL Resource Reservation to guarantee cores, IOPS and threads for it.
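The lookup-data cache with a 5-minute refresh can be sketched as a small class that reloads its contents once they are older than a time-to-live. This is an illustration only: the class and its parameters are hypothetical, it is written in Python rather than the team's .NET stack, and it refreshes lazily on read, whereas the slide describes a web role that performs the refresh.

```python
import time
from typing import Callable, Dict, Optional


class RefreshingCache:
    """Lookup cache that reloads all entries once they are older than ttl_seconds."""

    def __init__(
        self,
        loader: Callable[[], Dict[str, str]],
        ttl_seconds: float = 300.0,  # 5 minutes, as on the slide
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._loader = loader
        self._ttl = ttl_seconds
        self._clock = clock
        self._data: Dict[str, str] = {}
        self._loaded_at: Optional[float] = None

    def get(self, key: str) -> Optional[str]:
        """Return the cached value for key, reloading first if the cache is stale."""
        now = self._clock()
        if self._loaded_at is None or now - self._loaded_at >= self._ttl:
            self._data = dict(self._loader())  # one database round trip per refresh
            self._loaded_at = now
        return self._data.get(key)
```

The point of the pattern is that lookup reads hit memory, and the database sees one load per refresh interval instead of one query per request.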

The Solution/Architecture

Diagram components:

• Administrator (User ID/Password)
• Web Role – Firmware Upload System: Website, ASP .NET, HTML, Administration, Reporting, Single Tenant
• Blob Storage: Smart TV firmware
• Windows Azure CDN
• Worker Role: Task Automation – Firmware encryption and batch update
• Devices (Smart TV x N)
• Web Role – Firmware download System: Update Management, Device Validation, ASP .NET, Single Tenant
• SQL Azure: Device logs & update status, Single Tenant / No Federation

• Large Australian based ISV developing Accounting software for small businesses.
• User count in the thousands, coming from desktop era
• The application is developed in C#/.NET using LINQ to SQL and Entity Framework
• Databases on premises and Azure are kept in sync via Sync Framework

Workload Review: Cash Converters

Company Information

• Cash Converters is a worldwide pawn broking and personal finance franchise company
• 1,000+ stores in 19 countries; major presence in Australia (home country), the UK, and Spain

Project Description

• Store Point-Of-Sale (POS) built on Azure
• Very low transaction volumes
• ~20 POS txns per minute per store
• Requirement: No downtime during store hours
• Check-out transactions must go through
• Up to 15 seconds is an acceptable response time

Architecture: Cash Converters (Proposed)

Diagram components:

• Stores (Store 1, Store 2, … Store N-5, … Store N) grouped by state: Victoria (VIC) and Queensland (QLD)
• Azure Subscription with one endpoint per state: QLD.trafficmgr.cloudapp.net and VIC.trafficmgr.cloudapp.net
• QLD DB (SGP) with GeoDR to QLD DB (HK)
• VIC DB (SGP) with GeoDR to VIC DB (HK)
• On-premises: Monitoring, Aggregations
• Failover
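The proposed layout pairs each state's database with a GeoDR copy in a second region, so failover amounts to switching which copy of a state's database the application uses. The sketch below illustrates that choice under stated assumptions: that the SGP copy is the primary and the HK copy the GeoDR target, and that health is reported by a caller-supplied flag. The class and function names are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RegionDatabases:
    primary: str    # assumed to be the SGP copy
    secondary: str  # assumed to be the GeoDR copy in HK


# Labels taken from the diagram; the primary/secondary roles are an assumption.
DATABASES = {
    "QLD": RegionDatabases(primary="QLD DB (SGP)", secondary="QLD DB (HK)"),
    "VIC": RegionDatabases(primary="VIC DB (SGP)", secondary="VIC DB (HK)"),
}


def active_database(state: str, primary_healthy: bool) -> str:
    """Pick the database a store in the given state should use right now."""
    dbs = DATABASES[state]
    return dbs.primary if primary_healthy else dbs.secondary
```

Keeping the choice per state means a problem with one state's primary does not force the other state to fail over.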

Current/Next Steps…

• Preparing application for multi-tenancy (in progress)
  − Test the entire stack as a multi-tenanted application
  − Tooling for moving store data from one DB to another
  − Intent is to migrate single-tenanted store data to multi-tenanted DBs
• Multi-tenanted DBs to be initially put under RR and GeoDR’ed
  − Gap here is the time to move a DB under RR - manual action during preview. This is also required after a failover to the original target (now source) and re-seeding to a new target

1. Buy and give the Story
2. Read, Explore and Re-discover the Story
3. Play and participate in the Story
4. Reflect and Share your experience

Harry Potter Online Experience

Why Cloud?

• Original Beta Launch July 2011, on-premises solution
  − 4.2M Page views < 2 minutes
  − 1M Registrations < 10 hours
  − Could not scale easily
• Target User Base
  − Unknown popularity, estimated ~20M users
• Since port to Azure - Open to all
  − 1 billion page views in first 2 weeks
  − ~15K Registrations/day
  − >5M signups
  − Peak ~84K concurrent users, now 25K
  − Silent Launch: on April 14, 2012 and for 2nd book in July, 2012
• Pottermore = Very Happy
  − CTO: “Overwhelmed by support from Microsoft”

Pottermore Architecture

Diagram components:

• Users (Cookie)
• Session State
• Distributed Cache (Worker Role) and Local Cache
  − Content: Taxonomy, Resources
  − Shared Feeds: Great Hall, Common Room, Others…
• Write Only
• All User Data: Profile, Friends, Inventory, Games, Others…
• Web * (Web Role)
• Database (Sharded SQL), Federated by UserID: Activities, Comments, Games (Wins), Others…
• Cache Builder (Worker Role)
  − OnStart: Content
  − OnTime – Shared Feeds, 30 to 120 seconds
• Job Scheduler (Worker Role)
• RMF App * (Web Role): Stateless application servers serving to web site…
• Email Sender (Worker Role)
• Email Proxy
• On-Premise External System

(*) Web and App Server hosted together – different App Pools
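The "Federated by UserID" label in the Pottermore architecture implies that each user's rows live in the one database whose key range covers that UserID. A range lookup of that kind can be sketched with a sorted list of range starts. The boundaries, member names, and `member_for` function below are hypothetical and chosen only for illustration.

```python
import bisect

# Hypothetical lower bounds of each UserID range, in ascending order,
# and the database member that owns each range.
RANGE_STARTS = [0, 1_000_000, 2_000_000, 3_000_000]
MEMBERS = ["users_0", "users_1", "users_2", "users_3"]


def member_for(user_id: int) -> str:
    """Return the member whose range contains user_id (the last range is open-ended)."""
    if user_id < RANGE_STARTS[0]:
        raise ValueError(f"user_id {user_id} is below the first range")
    # bisect_right finds the first start greater than user_id; the owner is one before it.
    return MEMBERS[bisect.bisect_right(RANGE_STARTS, user_id) - 1]
```

With range-based routing, splitting a busy member only requires adding a new boundary and moving the rows above it. Keys in the other ranges stay where they are.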

AZ-302-A | Projects Largest Azure 52

http://channel9.msdn.com/Events/TechEd/Australia/2013

http://www.microsoftvirtualacademy.com/

http://technet.microsoft.com/en-au/

http://msdn.microsoft.com/en-au/

http://technet.microsoft.com/en-us/evalcenter/dn205290.aspx

www.windowsazure.com

http://www.windowsazure.com/en-us/documentation/services/hdinsight/?fb=en-us

www.powerbi.com