Ensuring IT services and operational continuity in the enterprise

Protect mission-critical SQL Server databases using Always On Technologies: Maintenance, Analysis, Testing, Solution Design, Implementation, Deployments, and Best Practices.
Key terms

- High availability: ... operational continuity during a given measurement period
- Disaster recovery: ... business operations due to a natural or human-induced disaster
- RTO, RPO
- Service level agreements

Service level tiers

Availability Class   Acceptable Downtime (hrs/yr) or RTO   Acceptable Data Loss (time of last copy) or RPO
Tier 1               >99.99% (1 hr or less)                5 min or less
Tier 2               99.9% - 99.99% (1 - 8.5 hrs)          5 mins to 8.5 hrs
Tier 3               <99.9% (hours to days)                Hours to days

Local HA, Regional DR, Geographic DR

- Protection against: natural disasters, network outages, site failures
- Location redundancy, regional: city, county (< 100 miles)
- Location redundancy, geographic: state, country (> 100 miles)

Unplanned Downtime, Planned Downtime

[Figure: Database mirroring. The application commits to the principal, which sends log to the mirror, with a witness; numbered steps 1 to 5 across the log and data of both SQL Server instances. The mirror is always redoing, so it remains current.]

[Figure: Failover clustering. A Windows Server cluster hosting failover clusters FC1 and FC2, each with active and passive nodes and its own shared disk; instances InstA, InstB, InstC.]

[Figure: Replication. Transactional replication (reporting + redundancy) and peer-to-peer replication (query scale-out + redundancy) across Boston, New York, England, Shanghai, New Jersey, Tokyo, Seattle.]

1. Back up the principal log with NORECOVERY
2. Recover the secondary
3. Redirect clients to the secondary
Optional: switch roles again

Failover cluster rolling upgrade

[Figure: failover instances FOI1 and FOI2 across the upgrade steps, showing passive nodes offline, possible owners, and upgraded nodes.]

Prepare for Upgrade:
1. Ensure .NET 3.5 and MSI 4.5 are installed on each node
2. Consider upgrading SQL Server shared components on each node first

Upgrade Step 1:
- Upgrade half of the nodes
- Start by upgrading passive nodes to minimize failovers
- Consider moving other instances to avoid a service restart if this is the first Katmai (SQL Server 2008) instance on the node

Upgrade Step 2:
- When half of the nodes are upgraded, or when specified, setup will roll ownership to the upgraded nodes. Down-level nodes will be removed from the possible owners and up-level nodes will be added

Upgrade Step 3:
1. Upgrade the remaining nodes
2. Return ownership to the desired nodes if needed

Comparison of solutions

Column groups:
- RPO: No Data Loss (RPO=0)
- Failover: Failover Unit (Inst, DB, Tab), Auto Failover (RTO)
- Redundancy and Utilization: Read, Write, Multiple
- Cost: Hardware, App Perf Impact, Manageability

Rows: Log Shipping, DBM (Sync, Async), Cluster, Transactional Replication, Peer-Peer Replication

Solution                    Hardware   App Perf Impact   Manageability
Cluster                     High***    Low***            Low***
Transactional Replication   Low        Low               High
Peer-Peer Replication       Low        Low               High

Log Shipping and DBM (Sync, Async) cost ratings, in slide order: Low Low Low *, Low High Low *, Low Low Low

* Log Shipping and Database Mirroring can provide point-in-time read capability using STANDBY and database snapshots, respectively
** Database Mirroring provides the fastest failover to a hot secondary
*** Depends on SAN technology

AdventureWorks Inc Scenario

AdventureWorks Inc is a manufacturing company that makes and sells bicycles across the world. It runs a number of applications, some of them mission critical, on multiple SQL Server instances. The DBA team is run by Darren, who is responsible for deploying and managing the application databases.
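Because the scenario turns on meeting each application's SLA, it may help to see how the availability classes from the service level tiers translate into a yearly downtime budget. The sketch below is illustrative only: the function names are hypothetical, a 365-day year is assumed, and the tier boundaries are read directly from the tier table (Tier 1 above 99.99%, Tier 2 from 99.9% to 99.99%, Tier 3 below 99.9%).

```python
HOURS_PER_YEAR = 365 * 24  # 8760 hours; assumes a non-leap year


def downtime_budget_hours(availability_pct: float) -> float:
    """Return the hours of downtime per year allowed by an availability percentage."""
    if not 0.0 <= availability_pct <= 100.0:
        raise ValueError("availability must be between 0 and 100")
    return HOURS_PER_YEAR * (100.0 - availability_pct) / 100.0


def availability_tier(availability_pct: float) -> str:
    """Classify an availability percentage using the tier boundaries from the slides."""
    if availability_pct > 99.99:
        return "Tier 1"
    if availability_pct >= 99.9:
        return "Tier 2"
    return "Tier 3"


if __name__ == "__main__":
    for pct in (99.999, 99.99, 99.9, 99.0):
        hours = downtime_budget_hours(pct)
        print(f"{pct}% -> {availability_tier(pct)}, {hours:.2f} hrs/yr")
```

Under these assumptions 99.99% works out to a little under one hour per year and 99.9% to a little under nine hours, which is in line with the rounded "1 hr or less" and "1 - 8.5 hrs" bounds in the tier table.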
One of Darren's core responsibilities is to ensure availability of all application databases in order to meet the application SLA.

Application Requirements

[Table: rows Manufacturing, Finance, Scheduling; columns Data Loss RPO=0, RTO in secs, Failover Unit (Inst, DB, Tab), Auto Failover, Read, Multiple Sites, Read Write.]

Solution Choice for Manufacturing Application

[Table: rows Cluster, SAN Replication, DBM - Sync, DBM - Async, Log Shipping, Transactional Replication, Peer-Peer Replication; columns Data Loss RPO=0, Fast RTO, Failover Unit (Inst, DB, Tab), Auto Failover, Read, >1 Sites\Copy, Read Write.]

- Clustering can provide a zero data loss solution that can also provide fast instance-level failover
- Use a RAID configuration to provide data redundancy on the SAN
- If a redundant copy is required that can provide instance failover with zero data loss, use SAN replication
- Use synchronous database mirroring if instance failover is not needed
- High cost solution

Choice: Clustering with RAID

Solution Choice for Finance Application

[Table: rows Cluster, SAN Replication, DBM - Sync, DBM - Async, Log Shipping, Transactional Replication, Peer-Peer Replication; columns Data Loss RPO=0, Fast RTO, Failover Unit (Inst, DB, Tab), Auto Failover, Read, >1 Sites\Copy, Read Write.]

- For database-level redundancy with acceptable data loss and minimal performance impact, asynchronous database mirroring is an optimal choice
- Use database snapshots at periodic intervals to provide a readable snapshot of the data for reporting
- Low cost solution

[Figure: Omaha Datacenter. Finance and Scheduling with async database mirroring; DB snapshot every hour for Reports.]

Adding a regional datacenter into the mix

Regional Site Solution Choices

[Figure: Omaha Datacenter and CB Datacenter. Manufacturing: cluster with SAN, sync mirroring with no witness. Finance and Scheduling: async database mirroring, DB snapshot every hour for Reports. Log shipping.]

Complete Architecture Topology Diagram

[Figure: Manufacturing cluster with SAN, sync mirroring with no witness, log shipping.]

Licensing Facts: HA Features Edition Support

Edition columns: Express, Workgroup, Standard, Enterprise

Feature                   Comments
Database Mirroring (1)    Advanced high availability solution that includes fast failover and automatic client redirection
Failover Clustering (2)
Backup Log-shipping       Data backup and recovery solution
Online System Changes     Includes Hot Add Memory, dedicated administrative connection, and other online operations
Online Indexing
Online Restore
Fast Recovery             Database available when undo operations begin

(1) Single thread redo
(2) Limited to 2 node cluster

Summary

“one size fits all”

Related sessions

DAT401 | High Availability and Disaster Recovery: Best Practices for Customer Deployments
DAT305 | See the Largest Mission Critical Deployment of Microsoft SQL Server around the World
DAT303 | Architecting and Using Microsoft SQL Server Availability Technologies in a Virtualized World
DAT407 | Windows Server 2008 R2 and Microsoft SQL Server 2008: Failover Clustering Implementations
WSV313 | Failover Clustering Deployment Success
WSV314 | Failover Clustering Pro Troubleshooting with Windows Server 2008 R2
DAT09-HOL | Installing a Microsoft SQL Server 2008 + SP1 Clustered Instance
DAT12-HOL | Maintaining a Microsoft SQL Server 2008 Failover Cluster
VIR06-HOL | Implementing High Availability and Live Migration with Windows Server 2008 R2 Hyper-V

Resources

“At the end of the day, IT operations is really about running your business as efficiently as you can so you have more dollars left for innovation. IPD guides help us achieve this.” Peter Zerger, Consulting Practice Lead for Management Solutions, AKOS Technology Services

IPD guides are a free download: www.microsoft.com/ipd

www.microsoft.com/teched
www.microsoft.com/learning
http://microsoft.com/technet
http://microsoft.com/msdn
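The reasoning behind the Manufacturing and Finance solution choices in the AdventureWorks scenario can be sketched as a small decision function. This is only an illustration of the bullet points on those slides, not a complete selection method: the `Requirement` type, its field names, and `suggest_solution` are hypothetical, and the requirement values assumed for each application are inferred from the chosen solutions because the requirement table cells did not survive in the transcript. Combinations the slides do not discuss return `None`.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Requirement:
    """Hypothetical summary of one application's availability needs."""
    zero_data_loss: bool     # RPO = 0 is required
    instance_failover: bool  # the failover unit must be the whole instance
    redundant_copy: bool     # a second copy of the data is required


def suggest_solution(req: Requirement) -> Optional[str]:
    """Map a requirement to the option the scenario slides recommend for it."""
    if req.zero_data_loss:
        if req.instance_failover:
            # Clustering gives zero data loss with fast instance-level failover;
            # SAN replication is added only when a redundant copy is needed.
            if req.redundant_copy:
                return "Cluster with SAN replication"
            return "Clustering with RAID"
        # Instance failover not needed: synchronous mirroring is suggested.
        return "Synchronous database mirroring"
    if not req.instance_failover:
        # Acceptable data loss, database-level redundancy, minimal perf impact.
        return "Asynchronous database mirroring"
    return None  # combination not covered by the slides


if __name__ == "__main__":
    manufacturing = Requirement(zero_data_loss=True, instance_failover=True,
                                redundant_copy=False)  # assumed values
    finance = Requirement(zero_data_loss=False, instance_failover=False,
                          redundant_copy=False)        # assumed values
    print("Manufacturing:", suggest_solution(manufacturing))
    print("Finance:", suggest_solution(finance))
```

Keeping the rules in one function makes it easy to see which requirement drives each choice, and to notice the cases the deck leaves open.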