Gopal Ashok Program Manager Microsoft Corporation DAT306 Agenda Introduction to High Availability and Disaster Recovery SQL Server Always On Technologies Developing Your Availability Solution Conclusion.

Download Report

Transcript Gopal Ashok Program Manager Microsoft Corporation DAT306 Agenda Introduction to High Availability and Disaster Recovery SQL Server Always On Technologies Developing Your Availability Solution Conclusion.

Gopal Ashok
Program Manager
Microsoft Corporation
DAT306
Agenda
Introduction to High Availability and
Disaster Recovery
SQL Server Always On Technologies
Developing Your Availability Solution
Conclusion
Introduction to High Availability and
Disaster Recovery
Definitions
Introduce key terms and concepts
Business Continuity Planning
Overview of the BCP process
SQL Server High Availability Planning
How does BCP apply to SQL Server availability?
High Availability and Disaster Recovery:
Definition
High Availability
High availability is a system design
protocol and associated implementation
that ensures a certain absolute degree
of operational continuity during a given
measurement period
Availability defined in terms of service
level agreements (SLA)
Recovery Time
Data loss during unplanned downtime
A highly available application should be
accessible by users x% of the time
Disaster Recovery
Processes and procedures
designed to restore business
operations due to a natural or
human-induced disaster
Typically involves providing
redundancy spanning multiple sites
or across geographic regions
Defining x and SLA
Availability
Class
Acceptable
Downtime (hrs/yr)
OR RTO
Acceptable Data
Loss (time of last
copy) OR RPO
Tier 1
>99.99%
(1 hr or less)
5 min or less
Tier 2
99.9% - 99.99% (18.5 hrs)
5 mins to 8.5 hrs
Tier 3
(<99.9%)
(Hours to days)
Hours to days
Recovery Time Objective (RTO)
guided by availability requirements
How much downtime can you tolerate?
Recovery Point Objective (RPO)
guided by criticality of
application data
How much data can you lose?
RPO
Tier1
RTO
Protection Levels
Protection against resource failures
Machine
Database Corruption
Disk
Regional DR
Location Redundancy
Building
< 10 miles
Geographic DR
 Protection against
 Natural Disasters
 Protection against
 Network Outages
 Site Failures
 Location Redundancy
Local HA
– City, County
– < 100-200 miles
 Location Redundancy
– State, Country
– > 100-200 miles
Business Continuity Planning
Impact Analysis
Critical Functions
Threat Identification
Recovery Objectives
Analysis
Solution Design
Maintenance
Solution
Design
Achieve recovery objectives for relevant
threats within specified constraints like
budget, human resources etc
Cost\Benefit analysis of solutions
Implementation
Deploy the recommended solution
Testing
Testing
Implementati
on
Test to see if the solution meets the
recovery requirements
Maintenance
Yearly testing and review of procedures
SQL Server High Availability Planning
Analysis
Application tiers serviced by the databases
Causes of database downtime
Protection levels: Local HA, Regional DR, Geographic DR
Analysis
Solution Design
Need to understand what solutions exists?
What are the characteristics and
cost of the solution?
Maintenance
Solution Design
Implementation
What are the deployment steps and best practices?
Testing
How do I test my implementation?
Maintenance
How do I monitor and maintain the solution?
Testing
Implementation
Database Downtime Drivers
Failure
Unplanned
Downtime
Protection
User Errors
Database
Downtime
Online
Planned
Downtime
Administration
Predictable
Resourcing
Analysis
Solution Design
Understand the
solutions and
choices before
making a decision
Solution
Design
Solution Architecture
HA Capabilities
Limitations and Caveats
Cost Vector
Solution
Design
SQL Server
Always On Technologies
Always On Technologies
Provides a full range
of options to minimize
downtime and
maintain appropriate
levels of application
availability
Increases
Availability
Decreased
Downtime
•
•
•
•
•
Solution
Design
Backup and Restore
Log Shipping
Database Mirroring
Failover Clustering
Peer-Peer Replication
• Online Index Operations
• Table Partitioning
• Enhanced Locking
• Resource Governor
• Database Snapshot
• Dedicated Admin Connection
• Dynamic Configuration
Always On Technology Overview
Solution
Design
Architecture Overview
How does it work?
Solution Characteristics
Data Loss Guarantees
Failover Characteristics
Redundancy Levels
and Utilization
Cost
Limitations and Caveats
Increases
Availability
•
•
•
•
•
Backup and Restore
Log Shipping
Database Mirroring
Failover Clustering
Peer-Peer Replication
Backup and Restore
Base availability technology for any solution
Protects against failures and recovery from errors
Provides Local HA and Site DR
Need to ensure the backups are accessible if site goes down
High RTO due to restore time
RPO=0 can never be guaranteed
Types: Full, Differential, and Transaction Log
File-group backup/restore for large databases
Backup Compression provides faster and
smaller backups in SQL Server 2008
Solution
Design
Log Shipping
Automated transaction log backup and
restore provides redundancy at the
database level
SQLLogship.exe provides the underlying
framework for doing automated backup,
copy and restore
Backup on primary instance
Restore on secondary instance(s)
Scheduling is done through
SQL Server Agent jobs
SQL Server 2008 provides sub-minute
scheduling interval providing the ability
to do quick backup and restores
No automatic failover capabilities
Solution
Design
Database Mirroring
A database level high availability solution
that provides complete protection
against data loss and fast recovery
through automatic failover
Maintains a redundant database by
shipping log blocks when the
transactions are committed on the
principal
Synchronous and Asynchronous
modes provide the spectrum of
options to choose between
availability and performance
Automatic failover when using
witness server
Solution
Design
Failover Clustering
Solution
Design
Instance level protection built on Windows Failover
Clustering shared disk model
Cluster nodes typically co-located within the
same site to provide local HA
Regional DR possible using VLAN and stretch
storage level replication
No built in data redundancy like database mirroring
and log shipping
Data protection has to be provided at the
storage level or by combining with other solutions
Transactional Replication
Solution
Design
A high performance data replication solution that provides
granular table level replication
Logical data movement provides flexibility and
better hardware utilization
Key scenarios:
Customized application-specific DR
Real-time reporting on secondary server that be used for Site DR
Scale out application queries with ability to use any one
database copy for Site DR
Two types relevant for HA and DR
Transactional and Peer-to-Peer
Always On Solution Characteristics
RPO
Redundancy and
Utilization
Failover
Solution
Design
Cost
Hardware
App Perf
Impact
Manageability
Low
Low
Low
*
Low
High
Low
*
Low
Low
Low
Cluster
High***
Low ***
Low***
Transactional
Replication
Low
Low
High
Peer-Peer
Replication
Low
Low
High
Solutions
No Data
Loss
(RPO=0)
Failover Unit
Inst
DB
Tab
Auto
Failover
(RTO)
Sync
Async
Multiple
*
Log Shipping
DBM
Read
+ **
Write
* Database Mirroring and Log Shipping can provide point in time read capability using STANDBY or
database snapshots respectively
** Database Mirroring provides fastest failover to hot secondary
*** Depends on SAN technology
What’s New in SQL Server 2008
New Features
Resource Governor
Manage SQL Server workloads
and resources by specifying
limits on resource consumption
Backup Compression
Reduce backup and restore time
Feature Enhancements
Database Mirroring
Automatic recovery from
page corruption
Log stream compression
Faster recovery on failover
Log Shipping
Sub-Minute Log Shipping
Backup compression
Failover Clustering
16 nodes
Rolling upgrade
Peer-Peer Replication
Hot add new nodes
Backup Compression
Common questions:
“How much compression will I see?”
“Will it be comparable to, say,
SQL Litespeed?”
One simple answer:
“It depends!”
All data compresses
differently – the compression
ratio achieved depends on:
The type of data in the database
Whether the data in the database is
already compressed
Whether the data/database is encrypted
“We saw an 85 percent
reduction in file size using SQL
Server 2008 Backup
Compression,” says Colin
Neller, Senior Software
Engineer at ServiceU and part
of the company’s SQL Server
2008 implementation team.
“A backup file that was
previously over 300 GB is now
only 40 GB, and the job runs
in about half the time.”
Backup Compression: Backup Performance
Backup of a 322 MB Adventureworks database
Compressed
Uncompressed
Hardly any CPU used (avg 5%),
runtime = 39.5s, compression
ratio of 0.
A LOT more CPU used (avg 25%)
BUT runtime = 21.6s (45%
improvement) and backup stored in
76.7MB (4.2x compression ratio)
DEMO: Increasing Availability Using
Always On Technologies
Solution
Design
Developing Your Availability
Solution using SQL Server Always
On Technologies
Recap
Application availability requirements or
SLA drive primary solution choices
RPO and RTO are the key metrics used
to define the SLA
Need mitigation against planned and
unplanned downtimes
Solution
Design
Application Availability
Unplanned
downtime
Multiple solution choices that provides
varying cost\benefits
Clustering
Planned Downtime
Database
Mirroring
Other requirements apart from
application SLA factor into the choice
Understand constraints and tradeoffs
you can make
Log
Shipping
Peer-Peer
Replication
Always On Solution Characteristics
RPO
Redundancy and
Utilization
Failover
Solution
Design
Cost
Hardware
App Perf
Impact
Manageability
Low
Low
Low
*
Low
High
Low
*
Low
Low
Low
Cluster
High***
Low ***
Low***
Transactional
Replication
Low
Low
High
Peer-Peer
Replication
Low
Low
High
Solutions
No Data
Loss
(RPO=0)
Failover Unit
Inst
DB
Tab
Auto
Failover
(RTO)
Sync
Async
Multiple
*
Log Shipping
DBM
Read
+ **
Write
* Database Mirroring and Log Shipping can provide point in time read capability using STANDBY or
database snapshots respectively
** Database Mirroring provides fastest failover to hot secondary
*** Depends on SAN technology
AdventureWorks Inc Scenario
Adventureworks Inc is a
manufacturing company that
manufactures and sells bicycles
across the world. There are a
number of applications, some that
are mission critical that run on
multiple SQL Server Instances
The DBA team is run by Darren who
is responsible for deploying and
managing the application
databases. One of his core
responsibilities is to ensure
availability of all application
databases in order to meet the
application SLA
Solution
Design
One datacenter located in Omaha
Three applications
Manufacturing – Tier 1
Finance – Tier 2
Scheduling – Tier 3
Manufacturing application runs on a
dedicated SQL Server 2008 Instance
All other applications run on
a second instance
Availability of manufacturing
application is critical
Implement a solution at the lowest
possible cost
Application Requirements
Applications
Data
Loss
RPO=0
RTO in
secs
Failover Unit
Inst
DB
Auto
Failover
Read
Solution
Design
Multiple
Sites
Read
Write
Tab
Manufacturing
Finance
Scheduling
Manufacturing application has strict SLA’s
Finance application requires readability on the secondary
The reports are run every 4 hours and need to be fresh as of the last
one hour. To offload the reporting load from the main system they
would like to utilize the mirror
Solution Choice for Manufacturing Application
Solution
Design
Solutions
Data Loss
RPO=0
Fast
RTO
Failover Unit
Inst
DB
Tab
Auto
Failover
Read
>1
Sites\
Copy
Read
Write
Cluster
SAN Replication
DBMClustering
- Sync
can provide a zero data loss
solution that can also provide fast
DBM - Async
instance level failover
Use RAID configuration to provide data
Log Shipping
redundancy on the SAN
Transactional
If a redundant copy is required that can
Replication
provide instance failover with zero
data loss use SAN replication
Peer-Peer


Replication High Cost Solution
Use synchronous database mirroring if
instance failover is not needed
Clustering with RAID
Solution Choice for Finance Application
Solution
Design
Solutions
Data Loss
RPO=0
Fast
RTO
Failover Unit
Inst
DB
Tab
Auto
Failover
Read
>1
Sites\
Copy
Read
Write
Cluster
SAN Replication
DBM - Sync

DBM - Async

Shipping
 LogFor
database level redundancy with acceptable
data loss with minimal perf impact,
Transactional
asynchronous database mirroring is an optimal
Replication
choice
 Peer-Peer
Use database snapshots at periodic intervals to
provide
Replication
a readable
snapshot of the data for reporting
 Low cost solution
Reports
Finance
Scheduling
Async Database
Mirroring
Omaha Datacenter
Db Snapshot
every hour
Adding a Regional Datacenter Into the Mix
Solution
Design
Regulatory and compliance requirements drive the
need for having a additional datacenter within a 10
mile radius to provide redundancy against site level
failure.
It is now required that all applications have the ability to
failover to the regional datacenter across the river in Council
Bluff
The SLA need to be maintained for tier 1 applications
even in the case of site failures
Regional Site Solution Choices
Solution
Design
Manufacturing
Cluster with SAN
Sync Mirroring
no witness
Reports
Finance
Scheduling
Async Database
Mirroring
Db Snapshot
every hour
Log Shipping
Omaha Datacenter
CB Datacenter
A Complete Topology
Solution
Design
Considering the potential of floods and tornadoes
destroying the regional data centers, Adventureworks
Inc wants to maintain a disaster recovery site in
San Antonio, TX
The disaster recovery site has lower SLA requirements
for all applications
The manufacturing application can have an RPO of 1 hour
The RTO is set at 4 hours
Topology Diagram
Manufacturing
Sync Mirroring
No witness
Cluster with SAN
Log Shipping
Solution
Design
Scale Out and Availability Scenario
Adventureworks is
building a new web
based order
management system that
allows customers from all
over the world access the
system and place orders
The core group of
customers are in Western
Europe, South East Asia
and North America
Solution
Design
 Requirements
–
–
–
–
Geo Redundancy
Data Locality
High Availability
Local Read-Scale
 Workload Characteristics
– Mainly reads
– Few writes
 Application Characteristics
– Each user logging in connects to a
particular server
 Partitioned based on user-id and region
 Writes from a user always happen on one
server regardless of the region the user log in
from
– All reads redirected to the closest geolocation
 Reasonable tolerance for latency (5-10 minutes)
Replication Topology
Asia1
Peer Nodes
Read-Only Servers
Solution
Design
Asia2
Implementing and maintaining a HA solution
Licensing Facts
Passive servers are mirror, log
shipped secondary and clustering
passive node
No license required on passive if it is
truly passive
A passive server does not need a
license if the number of processors
in the passive server is equal to or
less than the number of processors
in the active server.
The passive server can take the
duties of the active server for 30
days. Afterwards, it must be licensed
accordingly.
HA Features Edition Support
Feature
Express
Workgroup
Standard
Database Mirroring
1
Failover Clustering
2
Enterprise
Comments
Advanced high
availability solution
that includes fast
failover and
automatic client
redirection
Backup Log-shipping
Data backup and
recovery solution
Online System
Changes
Includes Hot Add
Memory, dedicated
administrative
connection, and
other online
operations
Online Indexing
Online Restore
Fast Recovery
₁Single thread redo
₂ Limited to 2 node cluster
Database available
when undo
operations begin
Summary
There is no “one size fits all” solution
Consider the cost\benefits\constraints and compare that
to availability requirements of the organization to
determine the best solution
Use the charts to understand cost, benefit and constraints of
the various SQL Server High Availability solutions
TEST the solution to ensure it can meet the availability
requirements and meet SLA’s
SQL Server Word of the Day
Wednesday, May 13
DATA COMPRESSION
*Game cards may be picked up at the SQL Server booths in the TLC
Additional Resources
•
•
Team Forum: http://social.msdn.microsoft.com/Forums/enUS/sqldisasterrecovery/threads
Other: http://sqlcat.com/
External Resources
http://www.microsoft.com/sqlserver/2008/en/us/hi
gh-availability.aspxURL
http://sqlcat.com/
SQL Server 2008 Business Value Calculator:
www.moresqlserver.com
Resources
www.microsoft.com/teched
www.microsoft.com/learning
Sessions On-Demand & Community
Microsoft Certification & Training Resources
http://microsoft.com/technet
http://microsoft.com/msdn
Resources for IT Professionals
Resources for Developers
www.microsoft.com/learning
Microsoft Certification and Training Resources
Related Content
Breakout Sessions
DAT302 All You Need to Know about Microsoft SQL Server 2008 Failover Clusters
DAT318 Microsoft SQL Server 2008 Virtualization Considerations and Best Practices
DAT322 Tips and Tricks for Successful Database Mirroring Deployments with
Microsoft SQL Server
Hands On Lab
DAT07-HOL Microsoft SQL Server 2008 Peer-to-Peer Replication
DAT08-HOL Microsoft SQL Server 2008 Table Index and Partitioning
DAT12-HOL Microsoft SQL Server 2008 Database Mirroring, Part 1
DAT13-HOL Microsoft SQL Server 2008 Database Mirroring, Part 2
DAT16-HOL Using Microsoft SQL Server 2008 Resource Governor for
Predictable Performance
Complete an
evaluation on
CommNet and
enter to win!
© 2009 Microsoft Corporation. All rights reserved. Microsoft, Windows, Windows Vista and other product names are or may be registered trademarks and/or trademarks in the U.S. and/or other countries.
The information herein is for informational purposes only and represents the current view of Microsoft Corporation as of the date of this presentation. Because Microsoft must respond to changing market conditions, it should
not be interpreted to be a commitment on the part of Microsoft, and Microsoft cannot guarantee the accuracy of any information provided after the date of this presentation. MICROSOFT MAKES NO WARRANTIES, EXPRESS,
IMPLIED OR STATUTORY, AS TO THE INFORMATION IN THIS PRESENTATION.