VIR303
Network, Storage, Compute, Quorum
Windows Server 2008 R2/Hyper-V: Rapid, Reliable, Manageable HA solution
Site A, Site B: Nodes are located at a physically separate site.
Storage

Cluster storage options
• Traditional Storage: SAN disks, shared-nothing storage model. Unit of failover is at the LUN/disk level. Ideal for Hyper-V Quick Migration scenarios.
• Cluster Shared Volumes (CSV): SAN disk that multiple nodes can access concurrently. Unit of failover is at the VM level. Ideal for Hyper-V Quick and Live Migration.

Replication between sites
Diagram: SAN at Site A, replica at Site B. Changes are made on Site A and replicated to Site B.
• Recovery Time Objective (RTO): duration of time within which services must be restored
• Recovery Point Objective (RPO): window of time before a disaster during which data may be lost

Replication planning
• Select replication method: synchronous or asynchronous
• Select replication platform: array-, host-, or appliance-based
• Infrastructure optimization: traffic prioritization, compression, WAN acceleration

Synchronous replication (primary storage to secondary storage)
1. Write Request
2. Replication
3. Acknowledgement
4. Write Complete

Asynchronous replication (primary storage to secondary storage)
1. Write Request
2. Write Complete
3. Replication

Synchronous vs. asynchronous
• Recovery Point Objectives (RPO)
  Synchronous: high business impact, critical application (RPO = 0)
  Asynchronous: medium-low business impact, critical applications (RPO > 0)
• Application I/O performance
  Synchronous: for applications not sensitive to high I/O latency
  Asynchronous: for applications sensitive to high I/O latency
• Distance between sites
  Synchronous: 50 km to 300 km
  Asynchronous: >200 km
• Bandwidth cost
  Synchronous: high
  Asynchronous: mid-low

Replication platforms
• Modes
  Hardware storage: sync & async
  Software/host: async
  Appliance: sync & async
• Type
  Hardware storage: LUN or volume, block
  Software/host: file system
  Appliance: LUN or volume, block level
• Environment support
  Hardware storage: typically between similar arrays
  Software/host: storage array agnostic
  Appliance: storage array and platform agnostic
• Replication impact
  Hardware storage: inside storage array
  Software/host: utilizes the host CPU, memory, I/O resources
  Appliance: inside storage appliance

Software replication products
• Double-Take Availability
• SteelEye DataKeeper Cluster Edition
• Symantec Storage Foundation for Windows
• Sanbolic Melio 2010
http://go.microsoft.com/fwlink/?LinkID=119949

Asymmetric storage
Diagram: nodes N1, N2, N3, N4 on a SAN with Disk set #1 and Disk set #2.
• Disk #1 is visible on N1 & N2 and Disk #2 on N3 & N4
• SQL and non-SQL workloads separated

CSV and replicated storage
Diagram: Site A holds the read/write copy, Site B the read-only replica. VM attempts to access replica VHD.

Storage virtualization abstraction
Diagram: Site A and Site B.
• Servers abstracted from storage
• Virtualized storage presents logical LUN

Replication and cluster storage compatibility
Table columns: Traditional Cluster Storage, Cluster Shared Volumes, Live Migration.
Table rows: Hardware Replication (Consult vendor), Software Replication, Appliance Replication (Consult vendor).

SAN/iQ Multi-site SAN
Diagram: SAS and SATA storage, Centralized Management Console, SAN/iQ Cluster. A Windows Hyper-V Server Cluster runs on a SAN/iQ Multi-site SAN with volume blocks A, B, C, D copied across storage nodes. Volumes remain online.

Network

Different subnets vs. stretched VLANs
Diagram labels: Public Network, Site A, Site B, 10.10.10.*, 20.20.20.*, 30.30.30.*, Redundant Network.
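The two write sequences from the replication slides (synchronous: write request, replication, acknowledgement, write complete; asynchronous: write request, write complete, replication) can be illustrated with a minimal Python sketch. The `Storage` class and the function names are hypothetical stand-ins for illustration, not part of any replication product.

```python
from dataclasses import dataclass, field


@dataclass
class Storage:
    """Hypothetical in-memory stand-in for a storage array."""
    name: str
    blocks: dict = field(default_factory=dict)


def synchronous_write(primary, secondary, key, value):
    """Synchronous order: the write completes only after the secondary acknowledges."""
    events = ["1. write request"]
    primary.blocks[key] = value
    secondary.blocks[key] = value        # copied before completion is reported
    events.append("2. replication")
    events.append("3. acknowledgement")  # secondary confirms the copy
    events.append("4. write complete")
    return events


def asynchronous_write(primary, pending, key, value):
    """Asynchronous order: the write completes first, replication is queued."""
    primary.blocks[key] = value
    pending.append((key, value))         # not yet present on the secondary
    return ["1. write request", "2. write complete"]


def drain(pending, secondary):
    """Step 3 of the asynchronous flow: ship queued changes to the secondary."""
    while pending:
        key, value = pending.pop(0)
        secondary.blocks[key] = value
```

Anything still in `pending` when the primary site fails is lost, which is why asynchronous replication implies RPO > 0, while the synchronous path never reports completion for data the secondary does not hold.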
Redundant network subnets in the diagram: 20.20.20.* and 40.40.40.*.

Cross-subnet heartbeat settings
Diagram: Site A 10.10.10.*, Site B 20.20.20.* and 30.30.30.*, low-latency subnet.
1. SameSubnetDelay & CrossSubnetDelay: delay between two heartbeats. Default is 1 second between two subsequent heartbeats.
2. SameSubnetThreshold & CrossSubnetThreshold: missed heartbeats before an interface is considered down. Default is 5 missed heartbeats.
With the defaults, an interface is therefore considered down after about 5 seconds of missed heartbeats.
Cluster.exe: cluster.exe /prop CrossSubnetDelay=1500
PowerShell (R2): $cluster = Get-Cluster; $cluster.CrossSubnetDelay = 1500

Securing intra-cluster traffic
Diagram: public traffic and intra-cluster traffic between Site A and Site B.
Cluster.exe: cluster.exe /prop SecurityLevel=2
PowerShell (R2): $cluster = Get-Cluster; $cluster.SecurityLevel = 2

IP addressing after failover
• DHCP: IP updated automatically
• Static IP: admin needs to configure new IP; can be scripted

DNS replication
Diagram: the VM moves from 10.10.10.111 (Site A) to 20.20.20.222 (Site B). Record created on DNS Server 1, record updated, record updated on DNS Server 2, record obtained by clients.

Diagram: DNS Server 1 and DNS Server 2 list 10.10.10.111 and 20.20.20.222; VM = 10.10.10.111; Site A, Site B.

Stretched VLAN
Diagram: DNS Server 1 and DNS Server 2 both return 10.10.10.111; FS = 10.10.10.111 at Site A and Site B.
http://www.cisco.com/en/US/docs/solutions/Enterprise/Data_Center/App_Networking/extmsftw2k8vistacisco.pdf

Diagram: DNS Server 1 and DNS Server 2 return 30.30.30.30; VM = 30.30.30.30, with 10.10.10.111 at Site A and 20.20.20.222 at Site B.

CSV network
Diagram: Site A 10.10.10.*, Site B 20.20.20.*, CSV network 30.30.30.*.

Multi-subnet vs. VLAN
Comparison criteria: Live Migration (seamless), Quick Migration, fast failover, Cluster Shared Volumes, static IPs in guest, flexibility, complexity.

Quorum

Quorum models
1. Disk only
2. Node Majority
3. Node & Disk Majority
4. File Share Witness
Diagram: each node holds a vote; replicated storage is marked with a question mark.

Node Majority: cross-site network connectivity broken
• 5 Node Cluster: Majority = 3
• Site A (majority in primary site): "Can I communicate with a majority of the nodes in the cluster?" Yes, then stay up.
• Site B: "Can I communicate with a majority of the nodes in the cluster?" No, drop out of cluster membership.
Diagram callout: "We are down!"
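The Node Majority question each partition asks ("Can I communicate with a majority of the nodes in the cluster?") comes down to simple arithmetic. The following Python sketch is illustrative only; the function names are hypothetical and it assumes one vote per node.

```python
def majority(total_votes: int) -> int:
    """Smallest number of votes that is more than half of the total."""
    return total_votes // 2 + 1


def stays_up(reachable_votes: int, total_votes: int) -> bool:
    """A partition keeps running only if it can reach a majority of votes."""
    return reachable_votes >= majority(total_votes)


# 5-node cluster split 3/2 by a broken cross-site link.
site_a_up = stays_up(3, 5)   # primary site holds the majority
site_b_up = stays_up(2, 5)   # backup site drops out of membership
```

The same arithmetic shows why an even split is a problem: with 4 nodes divided 2/2, the majority is 3 and neither side can reach it without an additional vote.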
Node Majority: disaster at Site 1 (majority in primary site)
• 5 Node Cluster: Majority = 3
• Surviving site: "Can I communicate with a majority of the nodes in the cluster?" No, drop out of cluster membership.
• Need to force quorum manually
Command line: net start clussvc /fixquorum (or /fq)
PowerShell (R2): Start-ClusterNode -FixQuorum (or -fq)

File Share Witness in a third site
Diagram: Site A and Site B connected over the WAN; File Share Witness \\Foo\Share in Site C (branch office).
• Complete resiliency and automatic recovery from the loss of any one site
• Site A: "Can I communicate with a majority of the nodes (+FSW) in the cluster?" Yes, then stay up.

File Share Witness: loss of connection between sites
Diagram: Site A and Site B connected over the WAN; File Share Witness \\Foo\Share in Site C.
• Complete resiliency and automatic recovery from the loss of connection between sites
• Site B: "Can I communicate with a majority of the nodes in the cluster?" No (lock failed), drop out of cluster membership.

Node weights
Diagram: primary site and backup site.
Cluster.exe: cluster.exe . node <NodeName> /prop NodeWeight=0
PowerShell (R2): (Get-ClusterNode "NodeName").NodeWeight = 0

Preventing quorum
Diagram: primary site Vote=2, backup site Vote=1; cluster configuration labels (3,4)-1,2, (1,2)-3,4, and 3,4.
Command line: net start clussvc /PQ
PowerShell (R2): Start-ClusterNode -PreventQuorum (or -pq)

Quorum model recommendations
• Node and File Share Majority: even number of nodes; best availability solution, with the FSW in a 3rd site
• Node Majority: odd number of nodes; more nodes in primary site
• Node and Disk Majority: use as directed by vendor
• No Majority: Disk Only: not recommended; use as directed by vendor

Resources
http://technet.microsoft.com/en-us/library/dd197430.aspx
http://technet.microsoft.com/en-us/library/dd197546.aspx
http://www.microsoft.com/virtualization/en/us/solutioncontinuity.aspx
http://download.microsoft.com/download/3/6/1/36117F2E-499F42D7-9ADDA838E9E0C197/SiteRecoveryWhitepaper_final_120309.pdf
http://blogs.msdn.com/clustering/
http://forums.technet.microsoft.com/en-US/winserverClustering/threads/
http://blogs.msdn.com/clustering/archive/2009/08/21/9878286.aspx
http://www.microsoft.com/windowsserver2008/en/us/clustering-home.aspx
http://www.microsoft.com/windowsserver2008/en/us/clustering-resources.aspx
http://technet.microsoft.com/en-us/library/dd443539.aspx
Related content: WSV373-INT

SAN/iQ Cluster
Diagram: volume blocks A, B, C, D copied across the storage nodes of the cluster.
• The goal for SAN availability is "no nines," or 100% availability. (Gartner, 2007)
• Human error and firmware bugs are the weakest links, even in properly deployed SANs. (Gartner, 2007)
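The vote counting behind the File Share Witness and NodeWeight slides can be modeled roughly as follows. This is a simplified Python sketch under stated assumptions: `has_quorum` and the node names are hypothetical, each node votes 1 unless its weight is set to 0, and the witness contributes one vote to whichever partition holds its lock.

```python
from typing import Dict, Iterable, Optional


def has_quorum(reachable: Iterable[str],
               node_weight: Dict[str, int],
               witness_held: Optional[bool] = None) -> bool:
    """Return True if the reachable voters form a majority.

    node_weight maps node name to its vote (1, or 0 when NodeWeight=0).
    witness_held is None when no file share witness is configured,
    True when this partition holds the witness lock, False otherwise.
    """
    total = sum(node_weight.values()) + (0 if witness_held is None else 1)
    votes = sum(node_weight[n] for n in reachable) + (1 if witness_held else 0)
    return votes >= total // 2 + 1


# Two nodes per site plus a file share witness in a third site: 5 votes.
weights = {"N1": 1, "N2": 1, "N3": 1, "N4": 1}
site_a = has_quorum(["N1", "N2"], weights, witness_held=True)   # 3 of 5
site_b = has_quorum(["N3", "N4"], weights, witness_held=False)  # 2 of 5

# No witness; backup-site nodes N3 and N4 set to NodeWeight=0: 2 votes.
weighted = {"N1": 1, "N2": 1, "N3": 0, "N4": 0}
primary_alone = has_quorum(["N1", "N2"], weighted)  # 2 of 2
backup_alone = has_quorum(["N3", "N4"], weighted)   # 0 of 2
```

In the first case the site that obtains the witness lock stays up and the other drops out, matching the "lock failed" slide. In the second, removing the backup site's votes lets the primary site keep quorum on its own, at the cost of the backup site never forming quorum without manual intervention.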