VIR303 Network Storage Compute Quorum Windows Server 2008 R2/Hyper-V Rapid, Reliable, Manageable HA solution Site A Site B Nodes are located at a physically separate site.


VIR303
Network
Storage
Compute
Quorum
Windows Server 2008 R2/Hyper-V
Rapid, Reliable, Manageable HA solution
Site A
Site B
Nodes are located at a physically separate site
Network
Storage
Compute
Quorum
Cluster Traditional Storage
• SAN disks
• Shared-Nothing Storage Model
• Unit of Failover is at LUN/Disk level
• Ideal for Hyper-V Quick Migration scenarios
Cluster Shared Volumes (CSV)
• SAN disk
• Multiple nodes can access the disk concurrently
• Unit of Failover is at VM level
• Ideal for Hyper-V Quick and Live Migration
Site A (SAN) and Site B (Replica)
Changes are made on Site A and replicated to Site B
Recovery Time Objective (RTO): duration of time within which services must be restored
Recovery Point Objective (RPO): window of time before a disaster during which data may be lost
Select Replication Method: Synchronous or Asynchronous
Select Replication Platform: Array, Host, or Appliance Based
Infrastructure optimization: Traffic prioritization, Compression, WAN Acceleration
Synchronous replication
1. Write Request to Primary Storage
2. Replication to Secondary Storage
3. Acknowledgement from Secondary Storage
4. Write Complete
Asynchronous replication
1. Write Request to Primary Storage
2. Write Complete
3. Replication to Secondary Storage
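The two write sequences can be sketched as a toy model. This is only an illustration of the ordering on the slide, not any vendor's replication engine; the class and method names (ReplicatedVolume, drain, unreplicated) are hypothetical.

```python
from collections import deque


class ReplicatedVolume:
    """Toy model of a primary volume replicating to a secondary (hypothetical)."""

    def __init__(self, synchronous):
        self.synchronous = synchronous
        self.primary = []
        self.secondary = []
        self.pending = deque()  # writes acknowledged but not yet replicated

    def write(self, block):
        """Return once the application would see 'Write Complete'."""
        self.primary.append(block)        # write request lands on primary storage
        if self.synchronous:
            self.secondary.append(block)  # replicate and wait for acknowledgement
        else:
            self.pending.append(block)    # replication is deferred
        return "complete"

    def drain(self):
        """Replicate queued writes (the last step of the asynchronous sequence)."""
        while self.pending:
            self.secondary.append(self.pending.popleft())

    def unreplicated(self):
        """Number of writes that would be lost if the primary site failed now."""
        return len(self.primary) - len(self.secondary)


sync_vol = ReplicatedVolume(synchronous=True)
sync_vol.write("block-1")

async_vol = ReplicatedVolume(synchronous=False)
async_vol.write("block-1")
lag_before_drain = async_vol.unreplicated()
async_vol.drain()
```

In this model a synchronous volume never has unreplicated writes when a write completes, while an asynchronous volume can lag behind until the queue drains, which is why its recovery point is greater than zero.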
Synchronous vs. Asynchronous
Recovery Point Objectives (RPO)
• Synchronous: High Business Impact, critical applications (RPO = 0)
• Asynchronous: Medium-Low Business Impact, critical applications (RPO > 0)
Application I/O Performance
• Synchronous: For applications not sensitive to high I/O latency
• Asynchronous: For applications sensitive to high I/O latency
Distance between sites
• Synchronous: 50 km to 300 km
• Asynchronous: >200 km
Bandwidth cost
• Synchronous: High
• Asynchronous: Mid-Low
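One reason distance matters for synchronous replication is that every write waits for a round trip to the secondary site. The sketch below estimates only the propagation part of that delay. It assumes light covers roughly 200 km per millisecond in fiber and ignores switch, array, and protocol overhead, so real figures will be higher; the function name is hypothetical.

```python
FIBER_KM_PER_MS = 200.0  # assumption: approximate signal speed in optical fiber


def sync_round_trip_ms(distance_km):
    """Minimum latency added to each synchronous write: out to the replica and back."""
    return 2 * distance_km / FIBER_KM_PER_MS


for km in (50, 300):
    print(f"{km} km: at least {sync_round_trip_ms(km)} ms added per write")
```

Under these assumptions the added delay grows linearly with distance, which is consistent with the slide steering latency-sensitive applications at long distances toward asynchronous replication.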
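Replication platform: Hardware storage, Software/Host, Appliance
Modes
• Hardware storage: Synch & Async
• Software/Host: Async
• Appliance: Synch & Async
Type
• Hardware storage: LUN or Volume Block level
• Software/Host: File System
• Appliance: LUN or Volume Block level
Environment Support
• Hardware storage: Typically between similar arrays
• Software/Host: Storage Array Agnostic
• Appliance: Storage Array and Platform agnostic
Replication Impact
• Hardware storage: Inside storage array
• Software/Host: Utilizes the Host CPU, memory, I/O resources
• Appliance: Inside storage appliance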
Double-Take Availability
SteelEye DataKeeper Cluster Edition
Symantec Storage Foundation for Windows
Sanbolic Melio 2010
http://go.microsoft.com/fwlink/?LinkID=119949
Nodes: N1, N2, N3, N4
SAN: Disk set #1, Disk set #2
• Disk #1 is visible on N1 & N2 and Disk #2 on N3 & N4
• SQL and non-SQL workloads separated
Site A: Read/Write
Site B: Read-Only (replica)
VM attempts to access replica VHD
Storage Virtualization Abstraction
Site A, Site B
• Servers abstracted from storage
• Virtualized storage presents logical LUN
Traditional Cluster Storage
Cluster Shared Volumes
Live Migration
Hardware Replication
Consult vendor
Software Replication
Appliance Replication
Consult vendor
SAS
SATA
Centralized Management Console
SAN/iQ Cluster
[Diagram: SAN/iQ Multi-site SAN under a Windows Hyper-V Server Cluster, with volume blocks A, B, C and D distributed across the storage nodes]
Volumes Remain Online
Network
Storage
Compute
Quorum
Stretched VLANs versus Different Subnets
[Diagram: Site A and Site B joined by a Public Network and a Redundant Network. Subnets shown: 10.10.10.*, 20.20.20.*, 30.30.30.*, 40.40.40.*]
[Diagram: Site A and Site B on subnets 10.10.10.*, 20.20.20.* and 30.30.30.*, with a cross-subnet link and a low-latency subnet]
1. SameSubnetDelay & CrossSubnetDelay
• Delay between 2 heartbeats
• Default is 1 second between two subsequent heartbeats
2. SameSubnetThreshold & CrossSubnetThreshold
• Missed heartbeats before an interface is considered down
• Default is 5 missed heartbeats
Example: set CrossSubnetDelay to 1500 milliseconds
Cluster.exe:
Cluster.exe /prop CrossSubnetDelay=1500
PowerShell (R2):
$cluster = Get-Cluster; $cluster.CrossSubnetDelay = 1500
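The effect of the two settings can be estimated with simple arithmetic. The sketch below assumes an interface is declared down after roughly delay × threshold, which is an approximation for illustration and not a statement of the cluster service's exact timing; the function name is hypothetical.

```python
def detection_time_ms(delay_ms, threshold):
    """Approximate time before an interface is considered down."""
    return delay_ms * threshold


default_ms = detection_time_ms(delay_ms=1000, threshold=5)  # defaults from the slide
tuned_ms = detection_time_ms(delay_ms=1500, threshold=5)    # CrossSubnetDelay = 1500

print(f"default: {default_ms} ms, tuned: {tuned_ms} ms")
```

Raising the delay makes the cluster more tolerant of a slow cross-site link, at the cost of taking longer to notice a real failure.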
[Diagram: Public Traffic and Intra-Cluster Traffic between Site A and Site B]
Setting SecurityLevel to 2 encrypts intra-cluster traffic
Cluster.exe:
Cluster.exe /prop SecurityLevel=2
PowerShell (R2):
$cluster = Get-Cluster; $cluster.SecurityLevel = 2
DHCP
• IP updated automatically
Static IP
• Admin needs to configure new IP
• Can be scripted
DNS Replication
[Diagram: DNS Server 1 and DNS Server 2 across Site A and Site B. The VM record changes from 10.10.10.111 to 20.20.20.222. Labels: Record Created, Record Updated, Record Obtained, Record Updated]
[Diagram: VM = 10.10.10.111; DNS Server 1 with 10.10.10.111 and 20.20.20.222; Site A and Site B]
[Diagram: VLAN spanning Site A and Site B; FS = 10.10.10.111; DNS Server 1 and DNS Server 2 both show 10.10.10.111]
http://www.cisco.com/en/US/docs/solutions/Enterprise/Data_Center/App_Networking/extmsftw2k8vistacisco.pdf
[Diagram: VM = 30.30.30.30; DNS Server 1 and DNS Server 2; addresses 10.10.10.111 and 20.20.20.222 at Site A and Site B]
[Diagram: Site B; subnets 20.20.20.*, 10.10.10.*, 30.30.30.*]
CSV Network
Multi-Subnet versus VLAN, compared on:
• Live Migration (seamless)
• Quick Migration
• Fast failover
• Cluster Shared Volumes
• Static IPs in guest
• Flexibility
• Complexity
Network
Storage
Compute
Quorum
1. Disk only
2. Node Majority
3. Node & Disk Majority
4. File Share Witness
[Diagram: three nodes, each with a Vote; a "?" marks the Replicated Storage]
Cross site network connectivity broken!
5 Node Cluster: Majority = 3
Site A, Site B
• Majority in Primary Site: Can I communicate with majority of the nodes in the cluster? Yes, then Stay Up
• Other site: Can I communicate with majority of the nodes in the cluster? No, drop out of Cluster Membership
Disaster at Site 1
5 Node Cluster: Majority = 3
Site A, Site B
• Majority in Primary Site: We are down!
• Surviving nodes: Can I communicate with majority of the nodes in the cluster? No, drop out of Cluster Membership
Need to force quorum manually
Command Line:
net start clussvc /fixquorum (or /fq)
PowerShell (R2):
Start-ClusterNode -FixQuorum (or -fq)
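The node-majority test on these slides comes down to a small calculation. The sketch below is an illustration only, with hypothetical function names, and it assumes the 5-node cluster is split 3 and 2 between the primary and backup sites.

```python
def majority(total_votes):
    """Smallest number of votes that is more than half."""
    return total_votes // 2 + 1


def has_quorum(reachable_votes, total_votes):
    """True if this group of nodes can stay up."""
    return reachable_votes >= majority(total_votes)


TOTAL = 5
primary_site_nodes = 3  # assumed split: majority in the primary site
backup_site_nodes = 2

# Cross-site link broken: each site only sees its own nodes.
primary_stays_up = has_quorum(primary_site_nodes, TOTAL)
backup_stays_up = has_quorum(backup_site_nodes, TOTAL)
```

If the primary site is lost entirely, the two backup nodes still fail the same test, which is why quorum has to be forced manually in that scenario.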
File Share Witness at Site C (branch office): \\Foo\Share
Site A and Site B are connected over the WAN
Loss of any 1 site
• Complete resiliency and automatic recovery from the loss of any 1 site
• Can I communicate with majority of the nodes (+FSW) in the cluster? Yes, then Stay Up
Loss of connection between sites
• Complete resiliency and automatic recovery from the loss of connection between sites
• The site that fails to lock the witness: Can I communicate with majority of the nodes in the cluster? No (lock failed), drop out of Cluster Membership
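The witness acts as one extra vote that only one side of a split can claim. The sketch below is a simplified illustration, assuming a 4-node cluster with 2 nodes at each site (the slides do not give node counts); the function name is hypothetical.

```python
def site_stays_up(site_nodes, total_nodes, holds_witness):
    """True if a site keeps quorum when it can only see its own nodes."""
    total_votes = total_nodes + 1  # the file share witness adds one vote
    votes = site_nodes + (1 if holds_witness else 0)
    return votes >= total_votes // 2 + 1


# WAN link between Site A and Site B is lost; only one site can lock the witness.
site_with_lock = site_stays_up(site_nodes=2, total_nodes=4, holds_witness=True)
site_without_lock = site_stays_up(site_nodes=2, total_nodes=4, holds_witness=False)
```

With the witness in a third site, either site can win the lock, so the cluster survives the loss of Site A, Site B, or the link between them without manual intervention.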
Setting NodeWeight to 0 removes a node's vote
Cluster.exe:
Cluster.exe . node <NodeName> /prop NodeWeight=0
PowerShell (R2):
(Get-ClusterNode "NodeName").NodeWeight = 0
Primary Site: Vote=2
Backup Site: Vote=1
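The vote counts above can be reproduced by weighting nodes. The sketch below is illustrative only: the node names P1, P2, B1 and B2 are hypothetical, and it assumes two nodes per site with one backup node set to NodeWeight 0.

```python
def has_weighted_quorum(weights, reachable):
    """True if the reachable nodes hold a majority of all assigned votes."""
    total = sum(weights.values())
    votes = sum(weights[node] for node in reachable)
    return votes >= total // 2 + 1


# Primary site keeps both votes; one backup node has its vote removed.
weights = {"P1": 1, "P2": 1, "B1": 1, "B2": 0}

primary_alone = has_weighted_quorum(weights, {"P1", "P2"})  # 2 of 3 votes
backup_alone = has_weighted_quorum(weights, {"B1", "B2"})   # 1 of 3 votes
```

Removing a backup node's vote lets the primary site keep quorum by itself when the sites are split, at the price of the backup site never forming quorum automatically.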
[Diagram: Cluster Configuration; node groups (3,4)-1,2; (1,2)-3,4; 3,4]
Command Line:
net start clussvc /PQ
PowerShell (R2):
Start-ClusterNode -PreventQuorum (or -pq)
Node and File Share Majority
• Even number of nodes
• Best availability solution: FSW in 3rd site
Node Majority
• Odd number of nodes
• More nodes in primary site
Node and Disk Majority
• Use as directed by vendor
No Majority: Disk Only
• Not Recommended
• Use as directed by vendor
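The first two recommendations depend only on whether the node count is even or odd. The sketch below encodes just that rule as a simplification; it deliberately ignores the vendor-directed modes, and the function name is hypothetical.

```python
def recommend_quorum_mode(node_count):
    """Suggest a quorum mode for a multi-site cluster from its node count."""
    if node_count < 1:
        raise ValueError("a cluster needs at least one node")
    if node_count % 2 == 0:
        # Even node count: add a file share witness, ideally in a third site.
        return "Node and File Share Majority"
    # Odd node count: place more nodes in the primary site.
    return "Node Majority"


for n in (4, 5):
    print(n, "nodes:", recommend_quorum_mode(n))
```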
http://technet.microsoft.com/en-us/library/dd197430.aspx
http://technet.microsoft.com/en-us/library/dd197546.aspx
http://www.microsoft.com/virtualization/en/us/solutioncontinuity.aspx
http://download.microsoft.com/download/3/6/1/36117F2E-499F42D7-9ADDA838E9E0C197/SiteRecoveryWhitepaper_final_120309.pdf
http://blogs.msdn.com/clustering/
http://forums.technet.microsoft.com/en-US/winserverClustering/threads/
http://blogs.msdn.com/clustering/archive/2009/08/21/9878286.aspx
http://www.microsoft.com/windowsserver2008/en/us/clustering-home.aspx
http://www.microsoft.com/windowsserver2008/en/us/clustering-resources.aspx
http://technet.microsoft.com/en-us/library/dd443539.aspx
The goal for SAN availability is "no nines," or 100% availability. (Gartner, 2007)
Human error and firmware bugs are the weakest links, even in properly deployed SANs. (Gartner, 2007)