Introduction Networking Storage Quorum
But what if there is a catastrophic event and you lose the entire datacenter?
[Diagram: cluster at Site A]
[Diagram: Site A, Site B, SAN] Node is located at a physically separate site (Site B)
Dependence on People
http://www.microsoft.com/whdc/winlogo/default.mspx
http://technet.microsoft.com/en-us/library/cc732035.aspx
KB 943984
Introduction Networking Storage Quorum
Cluster heartbeat properties:
• SameSubnetDelay
• SameSubnetThreshold
• CrossSubnetDelay
• CrossSubnetThreshold
View with: Cluster.exe /prop or Get-Cluster | fl *
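As a rough illustration of how the delay and threshold properties interact, the sketch below estimates how long the cluster waits before treating a silent node as down: one heartbeat every Delay milliseconds, failure after Threshold consecutive misses. The 1000 ms and 5 values are assumed defaults, and the relaxed cross-subnet values are hypothetical; neither is read from a live cluster.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class HeartbeatSettings:
    """Heartbeat tuning for one network class (same-subnet or cross-subnet)."""
    delay_ms: int    # interval between heartbeats, e.g. CrossSubnetDelay
    threshold: int   # missed heartbeats tolerated, e.g. CrossSubnetThreshold

    def detection_time_ms(self) -> int:
        # Approximate time before an unresponsive node is declared down.
        return self.delay_ms * self.threshold


# Assumed default values (not queried from a cluster).
same_subnet = HeartbeatSettings(delay_ms=1000, threshold=5)
# Hypothetical relaxed values for a higher-latency link between sites.
cross_subnet = HeartbeatSettings(delay_ms=2000, threshold=10)

print(same_subnet.detection_time_ms())   # 5000
print(cross_subnet.detection_time_ms())  # 20000
```

Raising the cross-subnet values makes the cluster more tolerant of WAN hiccups at the cost of slower failure detection.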
[Diagram: nodes at Site A and Site B joined by a Public Network and a Redundant Network; addresses shown: 10.10.10.1, 30.30.30.1, 20.20.20.1, 40.40.40.1]
[Diagram: DNS Server 1 at Site A and DNS Server 2 at Site B; the record changes from 10.10.10.111 to 20.20.20.222. Labels: Record Created, Record Updated, DNS Replication, Record Obtained]
RegisterAllProvidersIP HostRecordTTL
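To make the DNS timing concern concrete, here is a minimal sketch of the worst case: a client caches the old address from a DNS server just before the replicated update arrives, then holds it for the full record TTL. The function name and the replication delay are hypothetical, and the 1200-second TTL is an assumed default for HostRecordTTL. RegisterAllProvidersIP is not modeled here.

```python
def worst_case_stale_seconds(host_record_ttl_s: int, dns_replication_s: int) -> int:
    """Upper bound on how long a client may keep using the old site's address.

    The client can cache the stale record right before replication delivers
    the update, and then keeps it until the TTL expires.
    """
    return dns_replication_s + host_record_ttl_s


# Assumed default TTL with a hypothetical 15-minute DNS replication interval.
print(worst_case_stale_seconds(host_record_ttl_s=1200, dns_replication_s=900))  # 2100
# Same replication interval with the TTL lowered to 5 minutes.
print(worst_case_stale_seconds(host_record_ttl_s=300, dns_replication_s=900))   # 1200
```

Lowering the TTL shortens the stale window but increases DNS query traffic, so the value is a trade-off rather than something to minimize blindly.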
[Diagram: Site A and Site B with DNS Server 1; addresses 10.10.10.111 and 20.20.20.222; VM = 10.10.10.111]
[Diagram: Site A and Site B joined by a VLAN, with DNS Server 1 and DNS Server 2; FS = 10.10.10.111]
[Diagram: Site A and Site B with DNS Server 1 and DNS Server 2; addresses 30.30.30.30, 10.10.10.111, 20.20.20.222; VM = 30.30.30.30]
[Diagram: Site A and Site B joined by a VLAN; CSV Network]
DHCP
• IP updated automatically
Static IP
• Admin needs to configure new IP
• Can be scripted
[Table: Multi-Subnet vs. VLAN, compared on Live Migration (seamless), Quick Migration, Fast failover, Cluster Shared Volumes, Static IPs in guest, Flexibility, Complexity]
Introduction Networking Storage Quorum
[Diagram: Site A and Site B, each with a SAN; Site B holds the Replica]
• Changes are made on Site A and replicated to Site B
• DR requires a data replication mechanism between sites
Synchronous
[Diagram: Write Request to Primary Storage, Replication to Secondary Storage, Acknowledgement, then Write Complete]
• No data loss
• Requires high bandwidth/low latency connection
• Stretches over shorter distances
• Write latencies impact application performance
Asynchronous
[Diagram: Write Request to Primary Storage, Write Complete, then Replication to Secondary Storage]
• Potential data loss on hard failures
• Enough bandwidth to keep up with data replication
• Stretches over longer distances
• No significant impact on application performance
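The trade-off between the two replication modes can be sketched with a simple model. It assumes a synchronous write is acknowledged only after the remote copy is written, while an asynchronous write completes locally and replicates in the background. All function names and numbers are hypothetical; real storage arrays overlap and batch these steps.

```python
def sync_write_latency_ms(local_write_ms: float, round_trip_ms: float,
                          remote_write_ms: float) -> float:
    # The application waits for the local write, the trip to the remote
    # site and back, and the remote write before seeing "Write Complete".
    return local_write_ms + round_trip_ms + remote_write_ms


def async_write_latency_ms(local_write_ms: float) -> float:
    # The application sees "Write Complete" as soon as the local write lands.
    return local_write_ms


def async_data_at_risk_mb(write_rate_mb_per_s: float, replication_lag_s: float) -> float:
    # Data written but not yet replicated is lost on a hard site failure.
    return write_rate_mb_per_s * replication_lag_s


print(sync_write_latency_ms(1.0, 10.0, 1.0))  # 12.0
print(async_write_latency_ms(1.0))            # 1.0
print(async_data_at_risk_mb(20.0, 30.0))      # 600.0
```

In this model the inter-site round trip dominates synchronous latency, which is why synchronous replication stretches over shorter distances.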
http://go.microsoft.com/fwlink/?LinkID=119949
[Diagram: SAN with a single volume (Disk5) holding several VHDs] Concurrent access to a single file system
[Diagram: VHD is Read/Write at Site A and Read/Only at Site B] VM attempts to access replica
[Diagram: Site A and Site B] Servers abstracted from storage; virtualized storage presents logical LUN
[Table: Live Migration with Hardware Replication, Software Replication, and Appliance Replication, for Traditional Cluster Storage and Cluster Shared Volumes; surviving cell text: "Consult vendor" (twice)]
[Diagram: VPLEX Cluster-1 with nodes NewYork-01 through NewYork-04 and VPLEX Cluster-2 with nodes NewJersey-01 through NewJersey-04; CSV Volume1 through Volume4 for OS VHDs and CSV Volume1 through Volume4 for SQL VHDs]
Introduction Networking Storage Quorum
[Diagram: five nodes, each labeled Vote]
[Diagram labels: Vote, Vote, "?", Replicated Storage, Vote]
Majority in Primary Site: cross-site network connectivity broken!
5 Node Cluster: Majority = 3
• Site A: Can I communicate with majority of the nodes in the cluster? Yes, then stay up
• Site B: Can I communicate with majority of the nodes in the cluster? No, drop out of cluster membership
Majority in Primary Site: disaster at Site A
5 Node Cluster: Majority = 3
• Site A: We are down!
• Site B: Can I communicate with majority of the nodes in the cluster? No, drop out of cluster membership
• Need to force quorum manually
net start clussvc /fixquorum (or /fq)
Start-ClusterNode -FixQuorum (or -fq)
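The node-majority arithmetic behind these scenarios can be sketched as follows. The function names are hypothetical, and the model only counts votes; it does not reproduce the cluster service's actual membership protocol.

```python
def majority(total_votes: int) -> int:
    """Smallest number of votes that is more than half of the total."""
    return total_votes // 2 + 1


def has_quorum(reachable_votes: int, total_votes: int) -> bool:
    """True if a partition that can see `reachable_votes` may stay up."""
    return reachable_votes >= majority(total_votes)


total = 5                 # 5 node cluster
site_a, site_b = 3, 2     # more nodes in the primary site

print(majority(total))            # 3
print(has_quorum(site_a, total))  # True
print(has_quorum(site_b, total))  # False
```

The last line covers both failure cases for Site B: whether the link is broken or Site A is destroyed, Site B sees only 2 of 5 votes, so it stays down until an administrator forces quorum.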
Complete resiliency and automatic recovery from the loss of any 1 site
[Diagram: Site A, Site B, and Site C (branch office) connected over the WAN; File Share Witness at \\Foo\Share]
• Can I communicate with majority of the nodes (+FSW) in the cluster? Yes, then stay up
Complete resiliency and automatic recovery from the loss of connection between sites
[Diagram: Site A, Site B, and Site C (branch office) connected over the WAN; File Share Witness at \\Foo\Share]
• Site B: Can I communicate with majority of the nodes in the cluster? No (lock failed), drop out of cluster membership
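A small sketch of how a file share witness breaks the tie when two equal-sized sites lose contact with each other. It assumes two nodes per site plus one witness vote, and uses a toy first-come-first-served lock; the class and function names are hypothetical, and this is not the locking the cluster service actually performs on the share.

```python
class FileShareWitness:
    """Toy witness: the first partition to ask gets the lock and keeps it."""

    def __init__(self) -> None:
        self.owner = None

    def try_lock(self, partition: str) -> bool:
        if self.owner is None:
            self.owner = partition
        return self.owner == partition


def stays_up(partition: str, node_votes: int, total_votes: int,
             witness: FileShareWitness) -> bool:
    needed = total_votes // 2 + 1
    # The witness contributes one vote only to the partition holding the lock.
    witness_vote = 1 if witness.try_lock(partition) else 0
    return node_votes + witness_vote >= needed


TOTAL_VOTES = 5  # 2 nodes at Site A + 2 nodes at Site B + 1 witness

# Link between sites is lost; both sites can still reach the witness.
fsw = FileShareWitness()
print(stays_up("Site A", 2, TOTAL_VOTES, fsw))  # True  (lock acquired)
print(stays_up("Site B", 2, TOTAL_VOTES, fsw))  # False (lock failed)

# Site A is lost entirely; Site B is the only one asking for the lock.
fsw = FileShareWitness()
print(stays_up("Site B", 2, TOTAL_VOTES, fsw))  # True
```

Because only one partition can hold the lock, exactly one side survives a split, and either site can survive the loss of the other.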
Node and File Share Majority
• Even number of nodes
• Highest availability solution has FSW in 3rd site
Node Majority
• Odd number of nodes
• More nodes in primary site
Node and Disk Majority
• Use as directed by vendor
No Majority: Disk Only
• Not Recommended
• Use as directed by vendor
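The even/odd guidance above can be written as a small helper. This is a sketch with a hypothetical function name; it covers only the two models recommended by node count and leaves the vendor-directed disk-based models out.

```python
def suggest_quorum_model(node_count: int) -> str:
    """Pick a quorum model from the node count, per the even/odd guidance."""
    if node_count < 1:
        raise ValueError("a cluster needs at least one node")
    if node_count % 2 == 0:
        # An even node count can split evenly; the witness adds a tie-breaking vote.
        return "Node and File Share Majority"
    return "Node Majority"


print(suggest_quorum_model(4))  # Node and File Share Majority
print(suggest_quorum_model(5))  # Node Majority
```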
http://technet.microsoft.com/en-us/library/dd197430.aspx
http://technet.microsoft.com/en-us/library/dd197546.aspx
Breakout Sessions
WSV313 | Failover Clustering Deployment Success
WSV314 | Failover Clustering Pro Troubleshooting with Windows Server 2008 R2
VIR303 | Disaster Recovery by Stretching Hyper-V Clusters across Sites
ARC308 | High Availability: A Contrarian View
DAT207 | SQL Server High Availability: Overview, Considerations, and Solution Guidance
DAT303 | Architecting and Using Microsoft SQL Server Availability Technologies in a Virtualized World
DAT305 | See the Largest Mission Critical Deployment of Microsoft SQL Server around the World
DAT401 | High Availability and Disaster Recovery: Best Practices for Customer Deployments
DAT407 | Windows Server 2008 R2 and Microsoft SQL Server 2008: Failover Clustering Implementations
UNC304 | Microsoft Exchange Server 2010: High Availability Deep Dive
UNC305 | Microsoft Exchange Server 2010 High Availability Design Considerations
VIR06-INT | Failover Clustering with Hyper-V Unleashed with Windows Server 2008 R2
UNC01-INT | Real-World Database Availability Group (DAG) Design
VIR02-INT | Hyper-V Live Migration over Distance: A Multi-Datacenter Approach
BOF34-IT | Microsoft Exchange Server High Availability and Disaster Recovery: Are You Prepared?
Hands-on Labs
WSV01-HOL | Failover Clustering in Windows Server 2008 R2
DAT01-HOL | Create a Two-Node Windows Server 2008 R2 Failover Cluster
DAT02-HOL | Create a Windows Server 2008 R2 MSDTC Cluster
DAT09-HOL | Installing a Microsoft SQL Server 2008 + SP1 Clustered Instance
DAT12-HOL | Maintaining a Microsoft SQL Server 2008 Failover Cluster
UNC02-HOL | Microsoft Exchange Server 2010 High Availability and Storage Scenarios
VIR06-HOL | Implementing High Availability and Live Migration with Windows Server 2008 R2 Hyper-V
Cluster Team Blog: http://blogs.msdn.com/clustering/
Cluster Resources: http://blogs.msdn.com/clustering/archive/2009/08/21/9878286.aspx
http://www.microsoft.com/windowsserver2008/en/us/clustering-home.aspx
http://www.microsoft.com/windowsserver2008/en/us/clustering-resources.aspx
http://forums.technet.microsoft.com/en-US/winserverClustering/threads/
Clustering Forum (2008 R2): http://social.technet.microsoft.com/Forums/en-US/windowsserver2008r2highavailability/threads/
http://technet.microsoft.com/en-us/library/dd443539.aspx
http://technet.microsoft.com/en-us/library/dd197430.aspx
http://technet.microsoft.com/en-us/library/dd197546.aspx
http://www.microsoft.com/virtualization/en/us/solution-continuity.aspx
http://download.microsoft.com/download/3/6/1/36117F2E-499F-42D7-9ADD-A838E9E0C197/SiteRecoveryWhitepaper_final_120309.pdf
Microsoft.com/Virtualization/Events
Facebook.com/Microsoft.Virtualization
Twitter.com/MS_Virt
www.microsoft.com/teched http://microsoft.com/technet www.microsoft.com/learning http://microsoft.com/msdn
Sign up for Tech·Ed 2011 and save $500 starting June 8 – June 31st