• Detect failures reliably
• Able to withstand multiple failures
• Unified solution
• Easy to configure, manage, and monitor
• Reuse existing investments
• SAN/DAS environments
• Allow using HA hardware resources
• Fast seamless failover
SQL Server HA/DR Technologies
Failover Cluster Instances (for servers), pre-existent
• Server failover
• Useful in consolidation scenarios
• Shared storage (SAN / SMB)
• Depends on storage redundancy
• Failover takes 30s to a couple of minutes
• Server restart
• SQL instance is replica for one FCI
• Passive secondary nodes
Availability Groups (for groups of databases), new
• Multi-database Failover
• DBs that app depends on
• Direct attached storage
• Log synchronization
• Failover takes less than 30 seconds
• Secondary replicas are online
• SQL instance hosts one or more AG replicas
• Active Secondary Replicas
Enhancements in SQL Server 2012
Introduced in SQL Server 2012
Integrated
• Multi-database Failover
• Multiple secondaries (4)
• Sync (max 2) / Async
• Compression & Encryption
• Manual/Automatic Failover
• Flexible Failover Policy
• Automatic Page Repair
• Seamless App Connectivity
• Configuration Wizard
• Monitoring Dashboard
• Diagnostics infrastructure
• System Center integration
• Full cross-feature support: Contained Databases, FileStream, FileTable, Service Broker
Efficient
• Active Secondaries: Read workloads, Backups
• PowerShell Automation
• Fast Failover
[Diagram: Sync Log Synchronization, Async Log Synchronization]
SQL Server HA/DR Technologies
Failover Cluster Instances (for servers), pre-existent
• Server failover
• Useful in consolidation scenarios
• Shared storage (SAN / SMB)
• Depends on storage redundancy
• Failover takes minutes
• Server restart
• Multi-node instance
• Passive secondary nodes
Availability Groups (for groups of databases), new
• Multi-database Failover
• DBs that app depends on
• Direct attached storage
• Log synchronization
• Failover takes seconds
• Secondary replicas are online
• Multiple Secondary Replicas
• Active Secondary Replicas
Enhancements
• Increased Number of Secondaries
• Increased Readable Secondaries Availability
• Add Azure Replica Wizard
• Enhanced Diagnostics
• Support for Windows Cluster Shared Volumes
Increased Number of Secondaries
• Single technology to configure / manage
• Higher throughput (~7x) than Replication
• Reduce query latency in geo-distributed environments
• Scale-out read workloads
• Max 2 sync secondaries for high availability
• Secondary delay depends on network latency and I/O: ~1s within data center, ~5s between data centers
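The limits on this slide can be checked before a deployment is planned. The following is only a minimal sketch in Python, not part of the talk: the `Secondary` type and `validate_topology` function are hypothetical names, the limit of 2 synchronous secondaries comes from the slide, and the limit of 8 secondaries is an assumption taken from the "8 async secondaries" figure quoted elsewhere in this deck.

```python
from dataclasses import dataclass

MAX_SECONDARIES = 8        # assumption: based on the "8 async secondaries" figure in the deck
MAX_SYNC_SECONDARIES = 2   # from the slide: "Max 2 sync secondaries"


@dataclass(frozen=True)
class Secondary:
    """Hypothetical description of one planned secondary replica."""
    name: str
    synchronous: bool


def validate_topology(secondaries):
    """Return a list of problems; an empty list means the plan fits the limits."""
    problems = []
    if len(secondaries) > MAX_SECONDARIES:
        problems.append(
            f"{len(secondaries)} secondaries planned, limit is {MAX_SECONDARIES}"
        )
    sync_count = sum(1 for s in secondaries if s.synchronous)
    if sync_count > MAX_SYNC_SECONDARIES:
        problems.append(
            f"{sync_count} synchronous secondaries planned, limit is {MAX_SYNC_SECONDARIES}"
        )
    return problems


# Illustrative plan: two sync secondaries in the local data center, one async remote.
plan = [
    Secondary("dc1-a", synchronous=True),
    Secondary("dc1-b", synchronous=True),
    Secondary("dc2-a", synchronous=False),
]
too_many_sync = plan + [Secondary("dc1-c", synchronous=True)]
```

Here `validate_topology(plan)` returns an empty list, while `validate_topology(too_many_sync)` reports the third synchronous secondary.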
Increased Number of Secondaries
• Commits don’t wait for async secondaries
• Log sender threads share log pool
• Added transaction latency of 8 async secondaries: <1%
• Read_Only connections still routed to first available readable secondary
• Load balancing possible via DNS round-robin or specialized load balancers (e.g. NLB)
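Because read-only connections go to the first available readable secondary, spreading reads has to happen outside the routing list. As a rough illustration of what round-robin rotation does, here is a client-side sketch in Python. It is not how DNS or NLB is configured; the host names and the `make_round_robin` helper are hypothetical.

```python
from itertools import cycle


def make_round_robin(hosts):
    """Return a function that hands out the given hosts in rotating order."""
    hosts = list(hosts)
    if not hosts:
        raise ValueError("at least one readable secondary is required")
    ring = cycle(hosts)
    return lambda: next(ring)


# Hypothetical readable secondaries; real names depend on the deployment.
next_host = make_round_robin([
    "sql-sec1.example.com",
    "sql-sec2.example.com",
    "sql-sec3.example.com",
])

# Five consecutive read connections wrap around after the third host.
picks = [next_host() for _ in range(5)]
```

A DNS round-robin record or a load balancer achieves the same rotation without changing the application, which is why the slide lists those options.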
Increased Readable Secondaries Availability
• Geo-distributed environments (e.g. failure/upgrade of network equipment, ISP failures)
• Hybrid (on-premises to Azure) deployments
• Readable secondaries remain available during “Resolving” state
• Requires direct connections to readable secondaries (Read-only routing not supported yet)
• Replica state and last commit time available in DMV/Dashboard
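An application reading from a disconnected secondary may want to tell users how old the data is. The sketch below, in Python, assumes the last commit times have already been read from the DMV (for example the `last_commit_time` column of `sys.dm_hadr_database_replica_states`); the function names and the tolerance are hypothetical.

```python
from datetime import datetime, timedelta


def estimated_lag(reference_time, secondary_last_commit):
    """Approximate how far the secondary trails the reference point (never negative)."""
    return max(reference_time - secondary_last_commit, timedelta(0))


def is_fresh_enough(reference_time, secondary_last_commit, tolerance):
    """True when the secondary is within the staleness the application accepts."""
    return estimated_lag(reference_time, secondary_last_commit) <= tolerance


# Illustrative values: the last known primary commit is 5 seconds ahead.
primary_commit = datetime(2014, 5, 14, 12, 0, 5)
secondary_commit = datetime(2014, 5, 14, 12, 0, 0)

lag = estimated_lag(primary_commit, secondary_commit)
ok_for_reports = is_fresh_enough(primary_commit, secondary_commit, timedelta(seconds=10))
ok_for_checkout = is_fresh_enough(primary_commit, secondary_commit, timedelta(seconds=1))
```

When the primary is unreachable, the current time can serve as the reference instead, which gives an upper bound on staleness.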
Increased Readable Secondaries Availability
[Diagram: Sync Log Synchronization, Async Log Synchronization]
Increased Readable Secondaries Availability
• Simpler to change DNS than force failover and failback
• Doesn’t result in data loss
“The increased readable secondaries availability means our users can still find answers online and the world keeps spinning”
- StackOverflow
Add Azure Replica Wizard
• Site rent + maintenance, hardware, Ops
• Offload read workloads
• Offload backups (policy compliance)
• Disaster recovery
• West US, East US, East Asia, Southeast Asia, North Europe, West Europe
• Latency / political considerations
Add Azure Replica Wizard
[Diagram: Sync Log Synchronization, Async Log Synchronization]
Add Azure Replica Wizard
• VM and storage
• Free ingress traffic
• Lufthansa, Thomson Reuters, GameStop, Buffalo Hospital Supply
• E2E: From provisioning VM to starting log synchronization
• Validates environment, handles failures, does cleanup
Enhanced Diagnostics
• Simplify troubleshooting & prevent issues
• Based on feedback from customers & CSS
Enhanced Diagnostics
Title | Component
Show timestamps in XEL output in UTC (not adjusted to client computer) | SSMS XEvents Viewer
Warning about log synchronization behavior when primary replica is async | Dashboard
IsPrimaryReplica(database_name) | System function
Add AG name (and replica name and DB name if relevant) to many more XEvents to allow better data correlation between the logs | XEvents
Report major HADRON Manager transitions to AlwaysOn XEvent session | XEvents
Add Replica name context to connection established error log entry | Error Log
Dump relevant output from sys.dm_hadr_database_replica_states to SQL error log when replicas change to resolving state | XEvents
Add new error message to detect AG startup failure when quorum is forced | Error Log
Separate error msg 41142 (replica can't become primary) - raised for two importantly different reasons | Error Log
AlwaysOn Functions/DMVs should also support FCIs where applicable | DMVs
Improve the CREATE AG error message “AG already exists”, to say “It’s possible that a previous DROP AG operation, executed during cluster quorum loss, didn’t delete the AG from the cluster. If so, please retry the DROP operation” | Error Message
Remove FCI setup dependency on cluster.exe (deprecated) – Use PowerShell | Error Log
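With XEL timestamps shown in UTC, entries from replicas in different time zones line up directly, and conversion to a reader's local time becomes an explicit step. The following Python sketch shows that conversion only as an illustration; the `utc_to_client` helper, the sample timestamp, and the UTC-7 offset are all hypothetical.

```python
from datetime import datetime, timedelta, timezone


def utc_to_client(ts_utc, client_offset_hours):
    """Convert a naive UTC timestamp to an aware timestamp at the client's fixed offset."""
    client_tz = timezone(timedelta(hours=client_offset_hours))
    return ts_utc.replace(tzinfo=timezone.utc).astimezone(client_tz)


# Hypothetical event time as it would appear in UTC in the viewer.
event_utc = datetime(2014, 5, 14, 18, 30, 0)

# The same instant as seen by a client at UTC-7.
event_local = utc_to_client(event_utc, -7)
```

Comparing the UTC values across logs avoids the off-by-hours mismatches that arise when each viewer adjusts timestamps to its own machine.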
Support for Windows Cluster Shared Volumes (Windows Server 2012+)
• Shared disk accessible to all nodes (over SMB)
• One or more per physical drive
• Improves SAN utilization: removes limitation of 24 drives
• Increases I/O resiliency: retry read/write via other nodes
• Increases failover resiliency: disks don’t need to be unmounted/mounted
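The I/O resiliency point above (retrying a read or write via other nodes) can be pictured as a fallback loop. This Python sketch is a conceptual model only and does not reflect how Cluster Shared Volumes is implemented; `read_with_redirect` and the path functions are hypothetical stand-ins.

```python
def read_with_redirect(block_id, direct_read, peer_reads):
    """Try the node's own storage path first, then each peer path in order."""
    failures = 0
    for read in [direct_read, *peer_reads]:
        try:
            return read(block_id)
        except OSError:
            failures += 1  # this path is down, fall through to the next one
    raise OSError(f"block {block_id} unreachable via {failures} path(s)")


def broken_path(block_id):
    """Hypothetical direct path whose connection to storage has been lost."""
    raise OSError("direct path to storage lost")


def peer_path(block_id):
    """Hypothetical path that reaches the same disk through another node."""
    return f"data-for-block-{block_id}"


# The direct path fails, so the read is served through the peer node.
result = read_with_redirect(7, broken_path, [peer_path])
```

The caller sees a successful read even though the local path failed, which is the effect the slide describes.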
Windows Cluster Enhancements
• Reduces node evictions
• Removes votes from unavailable nodes
• Enables “last man standing”
• Names (e.g. Listeners) are registered directly to DNS
• Avoid permission/collision issues
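The vote-removal and "last man standing" bullets above describe recalculating quorum as nodes drop out. The Python sketch below contrasts that with a fixed vote count. It is a deliberately simplified model, not the cluster's actual algorithm: it assumes nodes fail one at a time, and that a witness supplies the tie-breaking vote whenever the number of voting nodes is even. Both function names are hypothetical.

```python
def survives_static(total_nodes, failures):
    """Fixed votes: the nodes still up must be a strict majority of all configured nodes."""
    up = total_nodes - failures
    return up * 2 > total_nodes


def survives_dynamic(total_nodes, failures):
    """Votes are recalculated after each single failure (simplified model)."""
    voters = total_nodes
    for _ in range(failures):
        if voters <= 1:
            return False  # the last remaining node has failed
        # Assumption: a witness adds one vote when the node count is even.
        effective_votes = voters if voters % 2 == 1 else voters + 1
        surviving_votes = effective_votes - 1
        if surviving_votes * 2 <= effective_votes:
            return False  # survivors no longer hold a majority
        voters -= 1  # the failed node's vote is removed
    return True


# Five-node cluster: compare how many sequential failures each model tolerates.
static_results = [survives_static(5, f) for f in range(6)]
dynamic_results = [survives_dynamic(5, f) for f in range(6)]
```

Under these assumptions the fixed-vote model stops at the third failure, while the recalculating model keeps the cluster up until only one node remains.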
Breakout Sessions (session codes and titles)
• DBI-B314: Microsoft SQL Server High Availability and Disaster Recovery in Microsoft Azure (Thu 2:45 PM)
Labs (session codes and titles)
• DBI-H304: Implementing HA/DR with Microsoft SQL Server 2014 AlwaysOn Availability Groups
Find Me Later At. . .
• SQL Server Booth (Wed 10:45 AM – 1 PM)
http://www.trySQLSever.com
http://www.powerbi.com
http://microsoft.com/bigdata
http://channel9.msdn.com/Events/TechEd
www.microsoft.com/learning
http://microsoft.com/technet
http://microsoft.com/msdn