VMware presentation

BCO2874
vSphere High Availability 5.0
and SMP Fault Tolerance –
Technical Overview and Roadmap
Disclaimer
 This session may contain product features that are
currently under development.
 This session/overview of the new technology represents
no commitment from VMware to deliver these features in
any generally available product.
 Features are subject to change, and must not be included in
contracts, purchase orders, or sales agreements of any kind.
 Technical feasibility and market demand will affect final delivery.
 Pricing and packaging for any new technologies or features
discussed or presented have not been determined.
vSphere HA and FT Today
Minimize downtime without the cost/complexity of traditional solutions
 vSphere HA provides rapid recovery from outages
 vSphere Fault Tolerance provides continuous availability
[Figure: coverage (Hardware, VM, Guest OS, Application) versus downtime (none to minutes), positioning Infrastructure HA, Fault Tolerance, Guest Monitoring, App Monitoring APIs, and partner solutions.]
This Talk
1. Technical overview of vSphere HA 5.0
• Presented by Keith Farkas
2. Technical preview of vSphere Fault Tolerance SMP
• Presented by Jim Chow
[Figure: coverage (Hardware, VM, Guest OS, Application) versus downtime (none to minutes), highlighting HA 5.0 and multiple-vCPU Fault Tolerance (FT) alongside Infrastructure HA, Guest Monitoring, App Monitoring APIs, and partner solutions.]
vSphere HA 5.0
Objectives
 Learn about the enhancements in vSphere HA 5.0
 Understand the new architecture
 Identify questions for the breakout / expert sessions
vSphere HA 5.0
vSphere HA was completely rewritten in 5.0 to
• Simplify setting up HA clusters and managing them
• Enable more flexible and larger HA deployments
• Make HA more robust and easier to troubleshoot
• Support network partitions
5.0 architecture is fundamentally different
• This talk
• Describes the three key concepts
• Summarizes host failure responses
• To learn more, see other VMworld HA venues
5.0 Architecture
New vSphere HA agent
• Called the Fault Domain Manager (FDM)
• Provides all the HA on-host functionality
As in previous releases
• vCenter Server (VC) manages the cluster
• Failover operations are independent of VC
• FDMs communicate over the management network
Key Concepts – Part 1
• FDM roles and responsibilities
• Inter-FDM communication
FDM Master
One FDM is chosen to be the master
• Normally, one master per cluster
• All others assume the role of FDM slaves
Any FDM can be chosen as master
• No longer a primary / secondary role concept
• Selection done using an election
Master-specific responsibilities
• Monitors availability of hosts / VMs in cluster
• Manages VM restarts after VM/host failures
• Reports cluster state / failover actions to VC
• Manages persisted state
FDM Slave and Shared Responsibilities
Slave-specific responsibilities
 Forwards critical state changes to the master
 Restarts VMs when directed by the master
 If the master should fail, participates in master election
Each FDM (master or slave)
• Monitors the state of local VMs and the host
• Implements the VM/App Monitoring feature
The Master Election
An election is held when:
• vSphere HA is enabled
• Master’s host becomes inactive
• HA is reconfigured on master’s host
• A management network partition occurs
If multiple masters can communicate,
all but one will abdicate
Master-election algorithm
• Takes 15 to 25 s (depends on the reason for the election)
• Elects the participating host with the greatest number of mounted datastores
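The selection rule above can be sketched in a few lines. This is only an illustration of "greatest number of mounted datastores", not the FDM implementation: the `Candidate` type, the host identifiers, and the tie-break on `host_id` are assumptions made so the example is deterministic.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    host_id: str             # hypothetical identifier for a participating host
    mounted_datastores: int  # number of datastores the host has mounted


def elect_master(candidates):
    """Return the participating host with the most mounted datastores."""
    if not candidates:
        raise ValueError("no participating hosts")
    # The tie-break on host_id is an assumption, not taken from the slides.
    return max(candidates, key=lambda c: (c.mounted_datastores, c.host_id))


hosts = [Candidate("host-1", 3), Candidate("host-2", 5), Candidate("host-3", 5)]
print(elect_master(hosts).host_id)
```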
Agent Communication
FDMs communicate over the
• Management networks
• Datastores
Datastores used when network is unavailable
• Used when hosts are isolated or partitioned
Network communication
• All communication is point to point
• Election is conducted using UDP
• All master-slave communication is via SSL encrypted TCP
Questions Answered Using Datastore Communication
Master
• Is a slave partitioned or isolated?
• Are its VMs running?
• Datastores used: those selected by VC, called the Heartbeat Datastores
Slave
• Is a master responsible for my VM?
• Datastores used: those containing VM config files
Heartbeat Datastores
 VC chooses (by default) two datastores for each host
 You can override the selection or provide preferences
• Use the cluster “edit settings” dialog for this purpose
Responses to Network or Host Failures
Host Is Declared Dead
Master declares a host dead when:
• Master can’t communicate with it over the network
• Host is not connected to master
• Host does not respond to ICMP pings
• Master observes no storage heartbeats
Results in:
• Master attempts to restart all VMs from host
• Restarts on network-reachable hosts and its own host
Host Is Network Partitioned
Master declares a host partitioned when:
• Master can’t communicate with it over the network
• Master can see its storage heartbeats
Results in:
• One master exists in each partition
• VC reports one master’s view of the cluster
• Only one master “owns” any one VM
• A VM running in the “other” partition will be
• monitored via the heartbeat datastores
• restarted if it fails (in master’s partition)
• When partition is resolved, all but one master abdicates
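The dead and partitioned rules from the last two slides can be read as a small decision function. The sketch below is an interpretation, not FDM code: the names are hypothetical, treating "connected or pingable" as reachable is an assumption, and the isolated case is deliberately not modeled here.

```python
from enum import Enum


class HostState(Enum):
    REACHABLE = "reachable"
    PARTITIONED = "partitioned"
    DEAD = "dead"


def classify_host(connected_to_master: bool,
                  responds_to_ping: bool,
                  storage_heartbeats_seen: bool) -> HostState:
    """Classify a host from the master's point of view."""
    # Any sign of life on the network: neither failure rule applies.
    # (Combining the two network checks with OR is an assumption.)
    if connected_to_master or responds_to_ping:
        return HostState.REACHABLE
    # No network contact, but the master still sees storage heartbeats.
    if storage_heartbeats_seen:
        return HostState.PARTITIONED
    # No network contact and no storage heartbeats.
    return HostState.DEAD


print(classify_host(False, False, True).value)
```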
Host Is Network Isolated
A host is isolated when:
 It sees no vSphere HA network traffic
 It cannot ping the isolation addresses
Results in:
 Host invokes (improved) Isolation response
• Checks first if a master “owns” a VM
• Applied if VM is owned or datastore is inaccessible
• Default is now Leave Powered On
 Master
• Restarts those VMs powered off or that fail later
• Reports host isolated if both can access its heartbeat datastores, otherwise dead
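The isolation rules on this slide can be written as three small predicates. This is a sketch under stated assumptions: all function and parameter names are hypothetical, and "cannot ping the isolation addresses" is read as "none of them responds".

```python
from typing import Iterable

# Default isolation response in 5.0, per the slide.
DEFAULT_ISOLATION_RESPONSE = "Leave Powered On"


def host_is_isolated(sees_ha_traffic: bool, ping_results: Iterable[bool]) -> bool:
    """A host is isolated when it sees no HA traffic and no isolation address answers."""
    return not sees_ha_traffic and not any(ping_results)


def isolation_response_applies(vm_owned_by_master: bool, datastore_accessible: bool) -> bool:
    """The response is applied if the VM is owned or its datastore is inaccessible."""
    return vm_owned_by_master or not datastore_accessible


def master_report(master_sees_hb_datastores: bool, host_sees_hb_datastores: bool) -> str:
    """The master reports 'isolated' only if both sides can access the heartbeat datastores."""
    return "isolated" if master_sees_hb_datastores and host_sees_hb_datastores else "dead"


print(host_is_isolated(False, [False, False]))
```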
Key Concepts – Part 2
HA Protection and
failure-response guarantees
vSphere HA Response to Failures
Response: Reset VM (with tools installed)
• Guest OS hangs, crashes
• Application heartbeats stop
Response: Attempt VM restart (for VMs the responding master knows are HA Protected)
• Host fails (e.g., reboots)
• Host isolation (VM powered off)
• VM fails (e.g., VM crashes)
HA Protected Workflow
User issues power on for a VM
Host powers on the VM
VC learns that the VM powered on
VC tells master to protect the VM
Master receives directive from VC
Master writes fact to a file
Write is done
HA Restart Guarantee
User issues power on for a VM
Host powers on the VM
VC learns that the VM powered on
VC tells master to protect the VM
Master receives directive from VC
• An attempt may be made if a failure occurs now
Master writes fact to a file
Write is done
• An attempt will be made for failures now and in future
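The guarantee can be expressed as a function of how far the protection workflow has progressed. This is a sketch only: the `Stage` names are hypothetical labels for the steps on the slide, and returning "none" before the master receives the directive is an assumption, since the slide makes no statement about that window.

```python
from enum import IntEnum


class Stage(IntEnum):
    USER_ISSUES_POWER_ON = 1
    HOST_POWERS_ON_VM = 2
    VC_LEARNS_POWER_ON = 3
    VC_TELLS_MASTER = 4
    MASTER_RECEIVES_DIRECTIVE = 5
    MASTER_WRITES_FACT = 6
    WRITE_DONE = 7


def restart_guarantee(stage: Stage) -> str:
    """Return 'will', 'may', or 'none' for a failure occurring at the given stage."""
    if stage >= Stage.WRITE_DONE:
        return "will"   # the protection fact is persisted
    if stage >= Stage.MASTER_RECEIVES_DIRECTIVE:
        return "may"    # the master knows, but the write is not yet done
    return "none"       # assumption: no guarantee is stated before this point


print(restart_guarantee(Stage.MASTER_WRITES_FACT))
```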
vSphere HA Protection Property
Is a new per-VM property
Reports on whether a restart attempt is guaranteed
Is shown on the VM summary panel and optionally in VM lists
Values of the HA Protection Property
Value reported by VC, in time order:
• N/A: user issues power on for a VM; host powers on the VM
• Unprotected: VC learns that the VM powered on; VC tells master to protect the VM; master receives directive from VC; master writes fact to a file; write is done and master tells VC
• Protected: VC learns VM has been protected
Wrap Up
vSphere HA Summary
 vSphere HA feature provides organizations the ability to run their
critical business applications with confidence
 5.0 Enhancements provide
• A solid, scalable foundation upon which to build to the cloud
• Simpler management and troubleshooting
• Additional and more robust responses to failures
[Figure: resource pool with three VMware ESXi hosts, labeled Operating Server, Failed Server, and Operating Server.]
To Learn More About HA and HA 5.0
 At VMworld
• See demo in VMware booth in solutions exchange
• Try it out in lab HOL04 – Reducing Unplanned Downtime
• Attend group discussions GD15 and GD35 – vSphere HA and FT
• Review panel session VSP1682 – vSphere Clustering Q&A
• Talk with knowledge expert (EXPERTS-09)
 Offline
• Availability Guide
• Best Practices Guide
• Troubleshooting Guide
• Release notes
vSphere Fault Tolerance SMP
Technical Preview
Objectives
 Why Fault Tolerance?
 What’s new: SMP
vSphere Availability Portfolio
[Figure: coverage (Hardware, VM, Guest OS, Application) versus downtime (none to minutes), positioning Infrastructure HA, Fault Tolerance, Guest Monitoring, and App Monitoring APIs.]
Why Fault Tolerance?
 Continuous Availability
• Zero downtime
• Zero data loss
• No loss of TCP connections
• Completely transparent to guest software
• Simple UI:
Turn On Fault Tolerance
• Delegate all management to the virtual infrastructure
Background





2009: vSphere Fault Tolerance in vSphere 4.0
2010: Updates to vSphere Fault Tolerance in vSphere 4.1
2011: Updates to vSphere Fault Tolerance in vSphere 5.0
Details: http://www.vmware.com/products/fault-tolerance/
Problem:
• FT only for uni-processor VMs
• Is FT for multi-processor VMs possible?
• An impressively hard problem
• Concerted effort to find an approach
 Reached milestone
• We’d like to share it
A Starting Point: vSphere FT
[Figure: two stacks (Application, Operating System, Virtualization Layer) kept in vLockstep and connected by FT logging, with a shared disk.]
A Clean Slate
[Figure: two stacks (Application, Operating System, Virtualization Layer) connected by FT logging; the vLockstep and shared-disk elements of the previous diagram are removed.]
 Spare you the details
 See it in action
Live Demo
[Figure: a client (Operating System) connected to two stacks (Application, Operating System, Virtualization Layer) that are linked by FT logging.]
 Experimental setup, caveats
Live Demo Summary
 SMP FT in action
• Presented a good solution
• Client oblivious to FT operation
• SwingBench client
• SSH client
• Transparent failover
• Zero downtime, zero data loss
• Taste for performance / bandwidth
 But that’s not all
Performance Numbers
[Chart: % throughput (FT / non-FT), higher is better, on a 0 to 100 scale, for Microsoft SQL Server 2-vCPU, Microsoft SQL Server 4-vCPU, Oracle Swingbench 2-vCPU, and Oracle Swingbench 4-vCPU.]
 Similar configuration to vSphere 4 FT Performance Whitepaper
• Models real-world workloads: 60% CPU utilization
vSphere FT Summary
 Why Fault Tolerance
• Continuous availability
 Fault Tolerance for multi-processor VMs
• Good solution to impressively hard problem
• A new design
• Demonstrated similar experience to existing vSphere FT
• But more vCPUs
vSphere HA and FT
Future Directions
vSphere HA and FT – Technical Directions
Technical directions include
 More comprehensive coverage of failures for more applications
 Broader set of enablers for improving availability of applications
Solidifying vSphere as the platform for running all mission-critical applications
[Figure: coverage (Hardware/VM, Guest OS, Application, Multi-tier application) versus downtime (none to minutes), showing Infrastructure HA, Fault Tolerance, VM/Guest Monitoring, App Monitoring APIs, partner solutions, multiple vCPUs, MetroHA, protection against host component failures, API extensions, and building blocks for creating available apps.]
Thank you!
Questions?
Additional
vSphere HA 5.0 Details
Troubleshooting
Troubleshooting vSphere HA 5.0
 HA issues proactive warnings about possible future conditions
• VMs not protected after powering on
• Management network discontinuities
• Isolation addresses stop working
 HA host states provide granularity into error conditions
 All HA conditions reported via events; config issues/alarms for some
• Event descriptions describe problem and actions to take
• All event messages contain “vSphere HA”, so searching for HA issues is easier
• HA alarms are more fine-grained and auto-clearing (where appropriate)
 The 5.0 Troubleshooting Guide discusses the likely top issues, e.g.,
• Implications of each of the HA host states
• Topics on HB datastores, failovers, admission control
• Will be updated periodically
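Because every HA event message contains the text "vSphere HA", a plain substring filter is enough to pull HA events out of an exported list. The sketch below assumes the messages are already available as strings; the sample texts are made up for illustration and are not real vCenter Server messages.

```python
def ha_related(messages):
    """Keep only the messages that mention vSphere HA."""
    return [m for m in messages if "vSphere HA" in m]


# Made-up event texts for illustration only.
sample = [
    "vSphere HA example event about a host",
    "Unrelated example event",
]
print(ha_related(sample))
```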
HA Agent Logging
 HA 5.0 writes operational information to a single log file called fdm.log
• A configurable number of historical copies are kept to assist with debugging
 File contains a record of, for example,
• Inventory updates relating to VMs, the host, and datastores received from the host management agent (hostd)
• Processing of configuration updates sent to a master by vCenter Server
• Significant actions taken by the HA agent, such as protecting a VM or restarting a VM
• Messages sent by a slave to a master and by a master to a slave
 Default location
• ESXi 5.0: /var/log/fdm.log (historical copies in /var/run/log)
• Earlier ESX versions: /var/log/vmware/fdm (all files in the same directory)
 Notes
• See vSphere HA best practices guide for recommended log capacities
• HA log files are designed to assist VMware support in diagnosing problems, and the format may change at any time. Thus, for reporting, we recommend you rely on the vCenter Server HA-related events, alarms, config issues, and VM/host properties
Log File Format
 Log file contains time stamped rows
 Many rows report the HA agent (FDM) module that logged the info
 E.g.,
2011-06-01T05:48:00.945Z [FFFE2B90 info 'Invt' opID=SWI-a111addb] [InventoryManagerImpl::ProcessClusterChange] Cluster state changed to Startup
 Noteworthy modules are
• Cluster – module responsible for cluster functions
• Invt – module responsible for caching key inventory details
• Policy – module responsible for deciding what to do on a failure
• Placement – module responsible for placing failed VMs
• Execution – module responsible for restarting VMs
• Monitor – module responsible for periodic health checks
• FDM – module responsible for communication with vCenter Server
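For ad hoc reading of a log, the sample row above can be split into its parts with a regular expression. This is a sketch based only on that one sample, assuming the row is a single line in the file; the group names (`tag`, `source`, and so on) are hypothetical, and since the log format may change at any time this should not be relied on for reporting.

```python
import re

# Pattern inferred from the single sample row; field names are assumptions.
LINE_RE = re.compile(
    r"^(?P<timestamp>\S+)\s+"
    r"\[(?P<tag>\S+)\s+(?P<level>\S+)\s+'(?P<module>[^']+)'"
    r"(?:\s+opID=(?P<op_id>[^\]\s]+))?\]\s+"
    r"(?:\[(?P<source>[^\]]+)\]\s+)?"
    r"(?P<message>.*)$"
)


def parse_fdm_line(line):
    """Return a dict of fields for a matching row, or None if it does not match."""
    match = LINE_RE.match(line.strip())
    return match.groupdict() if match else None


row = ("2011-06-01T05:48:00.945Z [FFFE2B90 info 'Invt' opID=SWI-a111addb] "
       "[InventoryManagerImpl::ProcessClusterChange] Cluster state changed to Startup")
fields = parse_fdm_line(row)
print(fields["module"], "|", fields["message"])
```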
Additional Datastore Details
for HA 5.0
• Heartbeating and heartbeat files
• Protected VM files
• File locations
Heartbeat Datastores (HB): Purpose and Mechanisms
 Used by master for slaves not connected to it over network
 Determine if a slave is alive
• Rely on heartbeats issued to slave’s HB datastores
• Each FDM opens a file on each of its HB datastores for heartbeating purposes
• Files contain no information. On VMFS datastores, file will have the minimum-allowed file size
• Files are named X-hb, where X is the (SDK API) moID of the host
• Master periodically reads heartbeats of all partitioned / isolated slaves
 Determine the set of VMs running on a slave
• An FDM writes a list of powered on VMs into a file on each of its HB datastores
• Master periodically reads the files of all partitioned/isolated slaves
• Each poweron file contains at most 140 KB of info. On VMFS datastores, actual disk usage is determined by the file sizes supported by the VMFS version
• They are named X-poweron, where X is the (SDK API) moID of the host
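The two naming rules above (X-hb and X-poweron, where X is the host's moID) amount to simple string construction. In this sketch the function name and the example moID are hypothetical, and only the file names are derived, not their directory.

```python
def heartbeat_file_names(host_moid: str) -> dict:
    """Build the per-host file names described on the slide from a host moID."""
    return {
        "heartbeat": f"{host_moid}-hb",     # heartbeat file; carries no information
        "poweron": f"{host_moid}-poweron",  # list of powered-on VMs
    }


# "host-42" is a made-up moID used only for illustration.
print(heartbeat_file_names("host-42"))
```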
VM Protected Files
 Protected-vm files are used
• When recovering from a master failure
• To determine whether a master is responsible for a given VM
• To divvy the VMs up between masters during a partition
 One protectedlist file per datastore per cluster using the datastore
• It stores the local paths of the protected VMs
• A VM is listed only in the file on the datastore containing its config file
 Each file is a fixed 2 MB in size
File Locations
 FDMs create a directory (.vSphere-HA) in root of each relevant datastore
 Within it, they create a subdirectory for each cluster using the datastore
 Each subdirectory is given a unique name called the Fault Domain ID
 <VC uuid>-<cluster entity ID>-<8 random hex characters>-<VC hostname>
• Entity ID is the number portion of the (SDK API) moID of the cluster
 E.g., in /vmfs/volumes/clusterDS/.vSphere-HA/
• FDM-C8496A0D-12D2-4933-AE02-601BCDDB9C61-9-d6bfc023-vc23/ (Cluster 9)
• FDM-C8496A0D-12D2-4933-AE02-601BCDDB9C61-17-ad9fd307-vc23/ (Cluster 17)
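A Fault Domain ID can be split back into its parts when working out which cluster a subdirectory belongs to. The sketch below is inferred from the stated format and the two example names only; the "FDM-" prefix appears in the examples but not in the format line, so it is treated as optional, and all group names are assumptions.

```python
import re

# Inferred from the stated format and the two examples; not an official grammar.
FAULT_DOMAIN_RE = re.compile(
    r"^(?:FDM-)?"
    r"(?P<vc_uuid>[0-9A-Fa-f]{8}(?:-[0-9A-Fa-f]{4}){3}-[0-9A-Fa-f]{12})-"
    r"(?P<cluster_entity_id>\d+)-"
    r"(?P<random_hex>[0-9A-Fa-f]{8})-"
    r"(?P<vc_hostname>.+?)/?$"
)


def parse_fault_domain_id(name: str):
    """Return the components of a Fault Domain ID, or None if it does not match."""
    match = FAULT_DOMAIN_RE.match(name)
    if match is None:
        return None
    parts = match.groupdict()
    parts["cluster_entity_id"] = int(parts["cluster_entity_id"])
    return parts


info = parse_fault_domain_id("FDM-C8496A0D-12D2-4933-AE02-601BCDDB9C61-9-d6bfc023-vc23/")
print(info["cluster_entity_id"], info["vc_hostname"])
```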
UI Changes
Summary of UI Changes
 Cluster Summary Screen
• Advanced Runtime Info (improved)
• Cluster Status (new)
• Configuration Issues (improved)
 Cluster and datacenter
• Hosts list view (improved)
 Cluster Configuration
• Datastore Heartbeating (new)
• Admission Control (improved)
 Host, cluster, datacenter
• VM list view (improved)
 Host Summary Screen
• HA host state (improved)
 VM Summary Screen
• HA Protection (improved)
Summary of UI Changes
 Host, cluster, datacenter
• VM list view (improved) showing protected VMs