Ensuring IT services and operational continuity in the enterprise: Protect mission-critical SQL Server databases using Always On Technologies. Maintenance, Analysis, Testing, Solution Design, Implementation, Deployments, and Best Practices.


High availability: [...] operational continuity during a given measurement period
Disaster recovery: [...] business operations due to a natural or human-induced disaster
RTO: Recovery Time Objective
RPO: Recovery Point Objective
Service level agreements

Availability Class   Acceptable Downtime (hrs/yr) OR RTO   Acceptable Data Loss (time of last copy) OR RPO
Tier 1               >99.99% (1 hr or less)                5 min or less
Tier 2               99.9% - 99.99% (1 - 8.5 hrs)          5 mins to 8.5 hrs
Tier 3               <99.9% (hours to days)                Hours to days
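
The downtime figures in the availability tiers follow from simple arithmetic on the availability percentage. The sketch below is illustrative only: the function names are hypothetical, a year is taken as 8,760 hours, and the tier boundaries are read off the table (above 99.99% is Tier 1, 99.9% to 99.99% is Tier 2, anything lower is Tier 3).

```python
HOURS_PER_YEAR = 365 * 24  # 8760 hours, ignoring leap years (assumption)


def downtime_hours_per_year(availability_pct: float) -> float:
    """Hours of downtime per year permitted at the given availability."""
    return HOURS_PER_YEAR * (1 - availability_pct / 100)


def availability_tier(availability_pct: float) -> str:
    """Map an availability percentage to the tier names used in the table."""
    if availability_pct > 99.99:
        return "Tier 1"
    if availability_pct >= 99.9:
        return "Tier 2"
    return "Tier 3"


if __name__ == "__main__":
    for pct in (99.999, 99.99, 99.9, 99.0):
        hours = downtime_hours_per_year(pct)
        print(f"{pct}% -> {availability_tier(pct)}, {hours:.2f} hrs/yr")
```

At 99.99% this gives just under one hour of downtime per year, and at 99.9% about 8.76 hours, which is in line with the rounded "1 hr" and "8.5 hrs" figures in the table.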
Local HA

Regional DR
- Protection against
  – Network Outages
  – Site Failures
- Location Redundancy
  – City, County
  – < 100 miles

Geographic DR
- Protection against
  – Natural Disasters
- Location Redundancy
  – State, Country
  – > 100 miles
Unplanned Downtime
Planned Downtime
[Figure: Database mirroring. Labels: Application, Commit, Principal and Mirror SQL Server (each with Log and Data), Witness; numbered steps 1 to 5.]
Mirror is always redoing, so it remains current.
[Figure: Windows Server Cluster. Labels: FC1 and FC2 active and passive nodes, shared disk for FC1, shared disk for FC2, instances InstA, InstB, InstC.]
Transactional Replication: Reporting + Redundancy
Peer to Peer Replication: Query Scale Out + Redundancy
[Figure: replication topology. Sites shown: Boston, New York, England, Shanghai, New Jersey, Tokyo, Seattle.]
1. Backup principal log with norecovery
2. Recover secondary
3. Re-direct clients to secondary
Optional: switch roles again
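
The role-switch steps above can be sketched as a small helper that builds the T-SQL for the two database-side steps. The slides do not show the commands, so this is an assumption-laden illustration: the function name, database name, and backup path are hypothetical, and the statements use the standard `BACKUP LOG ... WITH NORECOVERY` and `RESTORE LOG ... WITH RECOVERY` forms. Client redirection is application specific and is not covered here.

```python
def role_switch_statements(database: str, tail_log_path: str) -> list[tuple[str, str]]:
    """Return (server, T-SQL) pairs for a planned role switch.

    Hypothetical helper: it only builds strings and does not connect
    to SQL Server.
    """
    db = database.replace("]", "]]")         # escape for a bracketed identifier
    path = tail_log_path.replace("'", "''")  # escape for a string literal
    return [
        # Step 1: back up the tail of the log and leave the database restoring.
        ("principal", f"BACKUP LOG [{db}] TO DISK = N'{path}' WITH NORECOVERY;"),
        # Step 2: apply the tail-log backup and bring the secondary online.
        ("secondary", f"RESTORE LOG [{db}] FROM DISK = N'{path}' WITH RECOVERY;"),
    ]


if __name__ == "__main__":
    for server, sql in role_switch_statements("Sales", "X:/backup/Sales_tail.trn"):
        print(f"-- run on {server}\n{sql}")
```

Because step 1 leaves the old principal in the restoring state, it can later receive log backups from the new principal, which is what makes the optional switch back possible.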
Prepare for Upgrade:
1. Ensure .NET 3.5 and MSI 4.5 are installed on each node
2. Consider upgrading SQL Server shared components on each node first

Upgrade Step 1:
- Upgrade half of the nodes
- Start upgrading passive nodes first to minimize failovers
- Consider moving other instances to avoid service restart if this is the first Katmai (SQL Server 2008) instance on the node

Upgrade Step 2:
- When half of the nodes are upgraded, or when specified, setup will roll ownership to the upgraded nodes. Downlevel nodes will be removed from possible owners and up-level nodes will be added

Upgrade Step 3:
1. Upgrade remaining nodes
2. Return ownership to desired nodes if needed

[Figure: failover instances FOI1 and FOI2 across the cluster nodes at each step. Labels: Passive Nodes Offline, FOI1 Possible Owners, FOI1 Upgraded Nodes.]
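
The ordering in the upgrade steps can be sketched as a planning function. This is only a model of the sequence described on the slides, not what SQL Server setup actually runs: the `Node` type, the function name, and the choice to round "half" up for an odd node count are all assumptions.

```python
from dataclasses import dataclass


@dataclass
class Node:
    name: str
    active: bool  # True if the node currently owns the failover instance


def plan_rolling_upgrade(nodes: list[Node]) -> dict[str, list[str]]:
    """Split nodes into the batches described in Upgrade Steps 1 to 3."""
    # Passive nodes sort first (False < True), so they are upgraded first.
    ordered = sorted(nodes, key=lambda n: n.active)
    half = (len(ordered) + 1) // 2  # assumption: round up for odd counts
    first_batch = [n.name for n in ordered[:half]]
    second_batch = [n.name for n in ordered[half:]]
    return {
        "step1_upgrade": first_batch,
        # After step 1, only upgraded nodes remain possible owners.
        "step2_possible_owners": first_batch,
        "step3_upgrade": second_batch,
    }


if __name__ == "__main__":
    cluster = [Node("N1", True), Node("N2", False),
               Node("N3", False), Node("N4", False)]
    for step, names in plan_rolling_upgrade(cluster).items():
        print(step, names)
```

In this model the active node always lands in the last batch when at least half of the nodes are passive, so the instance fails over only once, when ownership rolls to the upgraded nodes.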
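[Table: comparison of HA solutions. The per-solution marks in the first four column groups are not recoverable.]
Columns: Solutions; No Data Loss (RPO=0); Failover Unit (Inst, DB, Tab); Auto Failover (RTO); Redundancy and Utilization (Read, Write; Sync, Async; Multiple); Cost (Hardware, App Perf Impact, Manageability)
Rows: Log Shipping; DBM; Cluster; Transactional Replication; Peer-Peer Replication

Cost                        Hardware   App Perf Impact   Manageability
Cluster                     High***    Low***            Low***
Transactional Replication   Low        Low               High
Peer-Peer Replication       Low        Low               High

Three further cost rows (Low, Low, Low; Low, High, Low; Low, Low, Low) belong to the Log Shipping and DBM entries, which also carry the * and ** footnote markers.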
* Log Shipping and Database Mirroring can provide point-in-time read capability using STANDBY or database snapshots, respectively
** Database Mirroring provides fastest failover to hot secondary
*** Depends on SAN technology
AdventureWorks Inc Scenario
AdventureWorks Inc is a manufacturing company that manufactures and sells bicycles across the world. It runs a number of applications, some of them mission critical, on multiple SQL Server instances.
The DBA team is run by Darren, who is responsible for deploying and managing the application databases. One of his core responsibilities is to ensure the availability of all application databases in order to meet the application SLA.
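Application Requirements
[Table: requirements per application. The marks in the cells are not recoverable.]
Columns: Applications; Data Loss RPO=0; RTO in secs; Failover Unit (Inst, DB, Tab); Auto Failover; Read; Multiple Sites (Read, Write)
Rows: Manufacturing; Finance; Scheduling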
Solution Choice for Manufacturing Application
[Table: candidate solutions against the requirements. The marks in the cells are not recoverable.]
Columns: Solutions; Data Loss RPO=0; Fast RTO; Failover Unit (Inst, DB, Tab); Auto Failover; Read; >1 Sites\ Copy (Read, Write)
Rows: Cluster; SAN Replication; DBM - Sync; DBM - Async; Log Shipping; Transactional Replication; Peer-Peer Replication

- Clustering can provide a zero data loss solution that can also provide fast instance level failover
- Use RAID configuration to provide data redundancy on the SAN
- If a redundant copy is required that can provide instance failover with zero data loss, use SAN replication
  – High Cost Solution
- Use synchronous database mirroring if instance failover is not needed

Clustering with RAID
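
The selection rules for a zero data loss application can be sketched as a small decision function. The function name and its boolean parameters are hypothetical, and the rules are a direct reading of the bullets above rather than a complete selection method.

```python
def choose_zero_data_loss_solution(instance_failover: bool,
                                   redundant_copy: bool) -> str:
    """Pick a solution for an application that requires RPO = 0."""
    if not instance_failover:
        # Database-level failover is enough.
        return "Synchronous database mirroring"
    if redundant_copy:
        # Instance failover plus a second copy of the data.
        return "Clustering with SAN replication (high cost)"
    # Instance failover with data redundancy provided by RAID on the SAN.
    return "Clustering with RAID"


if __name__ == "__main__":
    # The Manufacturing application: instance failover, no second copy.
    print(choose_zero_data_loss_solution(instance_failover=True,
                                         redundant_copy=False))
```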
Solution Choice for Finance Application
[Table: candidate solutions against the requirements. The marks in the cells are not recoverable.]
Columns: Solutions; Data Loss RPO=0; Fast RTO; Failover Unit (Inst, DB, Tab); Auto Failover; Read; >1 Sites\ Copy (Read, Write)
Rows: Cluster; SAN Replication; DBM - Sync; DBM - Async; Log Shipping; Transactional Replication; Peer-Peer Replication

- For database level redundancy with acceptable data loss and minimal perf impact, asynchronous database mirroring is an optimal choice
- Use database snapshots at periodic intervals to provide a readable snapshot of the data for reporting
- Low cost solution

[Figure: Omaha Datacenter. Labels: Finance, Scheduling, Async Database Mirroring, Db Snapshot every hour, Reports.]
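
An hourly snapshot for reporting could be scripted along the following lines. This sketch only builds the T-SQL text, using the standard `CREATE DATABASE ... AS SNAPSHOT OF` form; the function name, the snapshot naming scheme, the logical file name, and the directory are all hypothetical, and a database with several data files would need one file entry per data file.

```python
from datetime import datetime


def snapshot_statement(source_db: str, logical_file: str,
                       directory: str, taken_at: datetime) -> str:
    """Build a CREATE DATABASE ... AS SNAPSHOT OF statement.

    Assumes a single data file; the name carries the hour so that each
    hourly snapshot gets a distinct name.
    """
    stamp = taken_at.strftime("%Y%m%d_%H00")
    snapshot = f"{source_db}_snap_{stamp}"
    return (
        f"CREATE DATABASE [{snapshot}] ON "
        f"(NAME = {logical_file}, FILENAME = N'{directory}/{snapshot}.ss') "
        f"AS SNAPSHOT OF [{source_db}];"
    )


if __name__ == "__main__":
    print(snapshot_statement("Finance", "Finance_Data",
                             "D:/Snapshots", datetime(2010, 6, 8, 14, 30)))
```

Reports would then query the most recent snapshot, and older snapshots would be dropped on whatever retention schedule the reporting users accept.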
Adding a regional datacenter into the mix
Regional Site Solution Choices
[Figure: Omaha Datacenter and CB Datacenter. Labels: Manufacturing, Cluster with SAN, Sync Mirroring with no witness; Finance, Scheduling, Async Database Mirroring, Db Snapshot every hour, Reports; Log Shipping.]
Complete Architecture
[Figure: Topology Diagram. Labels: Manufacturing, Cluster with SAN, Sync Mirroring with no witness, Log Shipping.]
Licensing Facts
HA Features Edition Support
[Table: feature support by edition. The per-edition marks are not recoverable.]
Columns: Feature; Express; Workgroup; Standard; Enterprise; Comments

- Database Mirroring (1): Advanced high availability solution that includes fast failover and automatic client redirection
- Failover Clustering (2)
- Backup Log-shipping: Data backup and recovery solution
- Online System Changes: Includes Hot Add Memory, dedicated administrative connection, and other online operations
- Online Indexing
- Online Restore
- Fast Recovery: Database available when undo operations begin

(1) Single thread redo
(2) Limited to 2 node cluster
Summary
“one size fits all”
DAT401 | High Availability and Disaster Recovery: Best Practices for Customer Deployments
DAT305 | See the Largest Mission Critical Deployment of Microsoft SQL Server around the World
DAT303 | Architecting and Using Microsoft SQL Server Availability Technologies in a Virtualized World
DAT407 | Windows Server 2008 R2 and Microsoft SQL Server 2008: Failover Clustering Implementations
WSV313 | Failover Clustering Deployment Success
WSV314 | Failover Clustering Pro Troubleshooting with Windows Server 2008 R2
DAT09-HOL | Installing a Microsoft SQL Server 2008 + SP1 Clustered Instance
DAT12-HOL | Maintaining a Microsoft SQL Server 2008 Failover Cluster
VIR06-HOL | Implementing High Availability and Live Migration with Windows Server 2008 R2 Hyper-V
“At the end of the day, IT operations is really about running your business as efficiently as you can so you have more dollars left for innovation. IPD guides help us achieve this.”
Peter Zerger, Consulting Practice Lead for Management Solutions, AKOS Technology Services
It’s a free download!
Go to www.microsoft.com/ipd
www.microsoft.com/teched
www.microsoft.com/learning
http://microsoft.com/technet
http://microsoft.com/msdn
Sign up for Tech·Ed 2011 and save $500
starting June 8 – June 31st
http://northamerica.msteched.com/registration
You can also register at the
North America 2011 kiosk located at registration
Join us in Atlanta next year