ITM 7.3: Data Center Host Maintenance Through Virtualization

Anil Kapur (Product Line Manager, VMware)
Vishal Gupta (Member of Technical Staff, VMware)
Data Center World – Certified Vendor Neutral
Each presenter is required to certify that their
presentation will be vendor-neutral.
As an attendee you have a right to enforce this
policy of having no sales pitch within a session by
alerting the speaker if you feel the session is not
being presented in a vendor neutral fashion. If the
issue continues to be a problem, please alert Data
Center World staff after the session is complete.
Data Center Host Maintenance Through Virtualization
Data center infrastructure management (DCIM) tools monitor, measure, manage and/or control major
data center assets and resources, including IT infrastructure (such as servers, storage or networking)
and facilities infrastructure (such as power, cooling or physical space). In this session, we discuss how
virtualization technologies have shaped power savings and availability in the data center: Distributed
Resource Scheduler (DRS), which automatically decides VM physical placement and quality of service,
and Distributed Power Management (DPM), which saves power and cooling by consolidating VMs onto
fewer servers. We also address the role virtualization can play in DCIM and future directions such as
automated power-down of VMs and hosts in response to critical host failure conditions caused by, for
example, a failed power supply or CPU overheating.
Agenda
1. Virtualization and Cluster Basics
2. Hyper-Converged SDDC Infrastructure
3. Future Directions: ProActive High Availability
Section 1: Virtualization and Cluster Basics
What is Virtualization?
• Virtualization is a proven software technology that is rapidly transforming the IT landscape and fundamentally changing the way that people compute
• Virtualization makes it possible to run multiple operating systems and multiple applications on the same SERVER at the same time
Benefit
• Virtualization allows you to reduce IT costs while increasing the efficiency, utilization and flexibility of your existing x86 computer hardware
For the visually inclined…
4 Key Properties
• Partitioning
• Isolation
• Encapsulation
• Hardware Independence
Key Virtualization Properties: Partitioning
• Run multiple operating systems on one physical machine
• Divide system resources between virtual machines
Key Virtualization Properties: Isolation
• Fault and security isolation at the hardware level
• Advanced resource controls preserve performance
(Diagram label: Hypervisor)
Key Virtualization Properties: Encapsulation
• Entire state of the virtual machine can be saved to files
• Move and copy virtual machines as easily as moving and copying files
(Diagram label: Hypervisor)
Live Machine Migration with Zero Downtime
Enables the live migration of virtual machines from one host to another with continuous service availability.
Benefits:
• Revolutionary technology that is the basis for automated virtual machine movement
• Meets service level and performance goals
DRS Overview
Manage a Cluster As a Single Large Host
6 hosts, each with CPU = 10 GHz and Memory = 64 GB
1 “Giant Host” with CPU = 60 GHz and Memory = 384 GB
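The "giant host" view amounts to summing per-host capacities. A minimal Python sketch, using a hypothetical `Host` record that is not part of any vSphere API, reproduces the numbers on the slide:

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Host:
    """Hypothetical per-host capacity record (illustrative only)."""
    cpu_ghz: float
    memory_gb: float


def cluster_capacity(hosts):
    """Sum per-host CPU and memory into one aggregate 'giant host'."""
    return Host(
        cpu_ghz=sum(h.cpu_ghz for h in hosts),
        memory_gb=sum(h.memory_gb for h in hosts),
    )


# Six identical hosts, as on the slide.
hosts = [Host(cpu_ghz=10, memory_gb=64) for _ in range(6)]
giant = cluster_capacity(hosts)
print(giant.cpu_ghz, giant.memory_gb)
```

The aggregate describes the cluster as a pool; any single VM still has to fit on one physical host.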
DRS Goals
• Ease of Host Management
• Initial Placement
• CPU and Memory Load Balancing
• VM-VM Affinity (Anti-Affinity)
• VM-Host Affinity (Anti-Affinity)
• Host Maintenance Mode
(Diagram labels: vMotion, Host Cluster)
DRS Load-balancing: Measuring Imbalance
• Imbalance metric is a cluster-level metric
• First, for each host, DRS computes a normalized entitlement
• Imbalance metric is the standard deviation (spread) of these normalized entitlements
VM entitlement is a measure of how many resources a VM deserves (both CPU and memory).
A host with higher capacity can have more VMs for the same normalized entitlement.
Example: per-host normalized entitlements of 0.65, 0.05 and 0.40 give an imbalance metric of 0.30
Example: per-host normalized entitlements of 0.39, 0.24 and 0.36 give an imbalance metric of 0.08
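The imbalance calculation can be sketched in a few lines of Python. The per-host formula did not survive in the transcript, so `normalized_entitlement` below assumes the usual reading (total VM entitlement on a host divided by that host's capacity). The two imbalance values on the slide are consistent with the sample standard deviation (n - 1 in the denominator) of the three per-host values; that is an observation about these numbers, not a statement about DRS internals.

```python
from statistics import stdev


def normalized_entitlement(vm_entitlements, host_capacity):
    """Assumed definition: total VM entitlement on a host / host capacity."""
    return sum(vm_entitlements) / host_capacity


def imbalance(normalized_entitlements):
    """Sample standard deviation of the per-host normalized entitlements."""
    return stdev(normalized_entitlements)


# Per-host values taken from the two examples on the slide.
first = imbalance([0.65, 0.05, 0.40])
second = imbalance([0.39, 0.24, 0.36])
print(round(first, 2), round(second, 2))  # 0.3 0.08
```

A lower value means the load is spread more evenly relative to each host's capacity, which is why a larger host can carry more VMs without raising the metric.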
Section 2: Hyper-Converged SDDC Infrastructure
Getting to Software Defined DC
Diagram headings: Procure Hardware, Install Software, Lifecycle Management
Tasks shown in the diagram:
• Size Host, Switch & Storage
• Hypervisor and Hypervisor Management Upgrade
• Certify HCL
• Capacity Planning, Health & Risk Monitoring
• Software defined networking
• Cabling
• Highly available installation
• Patch
• Decommission hardware
• Re-license
• Cloud IaaS Management
Making Data Center Infrastructure Simple
Procure
• Single SKU with pre-racked, pre-cabled infrastructure with infra SW pre-installed
• Choice of pre-qualified partner configs
• Easy procurement of incremental capacity in known capacity bundles
Deploy
• Power and network uplink hookups only
• No software installs at deployment
• From rack power-on to deploying VMs in ~2 hours
• Integration with existing datacenter network
Manage / Operate
• Physical and virtual resources visible and operable as one system
• Pre-validated upgrade paths between prescriptive sets of software
• Health, capacity information from single pane
• Fastpath update bundles for emergency patching, e.g. Shellshock, Heartbleed
• REST-based API for datacenter-wide management integration
Support
• One place to call for support
• Backend triage among component products
• Prescriptive configuration with fewer variations
• Incremental virtual capacity addition when needed
What is EVO:RACK?
• Pre-racked and pre-cabled equipment
• Power distribution built-in
• Drops for power and network uplinks
Rack contents (diagram labels):
• Two 32 x 40GE Inter Rack Spine Switches (first rack only)
• 1 GE Management Switch for Out of Band Connectivity
• Two 10 GE ToR Switches for Data Connectivity
• 4 x 40GE uplinks from each switch to existing data center LAN
• 24 x 2 CPU Servers with 256GB of Memory Each
• 480 CPU Cores, 6TB of Memory, 500TB of Raw Storage
External connections: To Data Center Ethernet Network; To Data Center Power
EVO:RACK Architecture
Diagram labels:
• Cloud Infrastructure Administrators
• External Management Tools
• REST APIs
• Operations & Management: vRealize Operations, vRealize Automation, vRealize Log Insight
• vRack Manager (VRM)
• NSX Manager, vCenter
• Virtualization: vSphere, NSX, VSAN
• Hardware Management System (HMS)
• Hardware: Servers, Network Switches, Storage (DAS)
Section 3: Future Directions: ProActive High Availability
Motivation
• Today, vSphere HA reacts to failures by
  o Failing over VMs to a healthy host
  o Restarting the guest OS
• Failure scenarios include
  o Host fails or becomes isolated
  o Guest OS fails
  o Application fails
• Therefore, there is downtime
Motivation
Objective: Can we improve uptime in scenarios where the host has indicated that a failure is likely?
Solution: Avoid future downtime by proactively reacting to error conditions that may lead to failures
Solution
• Detect health conditions that may be catastrophic in the future, such as
  o Memory errors
  o Fan failure
  o CPU temperature above a threshold
  o Redundant link failures
• Take preventive actions to minimize the impact of a future failure
  o Issue a warning
  o Move VMs around the cluster
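The detect-then-act loop above can be illustrated with a small Python sketch. Everything here is hypothetical: `HostHealth`, the 85 °C threshold, and the rule "warn on one condition, move VMs on several" are assumptions chosen for illustration, not vSphere types or documented behavior.

```python
from dataclasses import dataclass

CPU_TEMP_LIMIT_C = 85.0  # assumed threshold, for illustration only


@dataclass
class HostHealth:
    """Hypothetical sensor snapshot for one host (not a real vSphere type)."""
    name: str
    memory_errors: int = 0
    fan_failed: bool = False
    cpu_temp_c: float = 40.0
    redundant_links_down: int = 0


def degraded_conditions(h):
    """Return the names of the slide's health conditions that currently hold."""
    found = []
    if h.memory_errors > 0:
        found.append("memory_errors")
    if h.fan_failed:
        found.append("fan_failure")
    if h.cpu_temp_c > CPU_TEMP_LIMIT_C:
        found.append("cpu_temperature")
    if h.redundant_links_down > 0:
        found.append("redundant_link_failure")
    return found


def preventive_action(h):
    """Assumed rule: warn on a single condition, move VMs when several hold."""
    conditions = degraded_conditions(h)
    if not conditions:
        return "none"
    return "warn" if len(conditions) == 1 else "move_vms"


print(preventive_action(HostHealth("esx-01", fan_failed=True, cpu_temp_c=90.0)))
```

In practice the mapping from conditions to actions would come from administrator policy rather than a fixed count.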
Failure Management: Policy Control
• Admin specifies the monitored conditions and corresponding actions
• User-specified policies determine if the immediate cost is worth the anticipated future gain, e.g., evacuate tier 1 VMs only if their resource allocation won’t be affected
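The tier 1 example can be read as a simple guard: evacuate only when the rest of the cluster can give every selected VM its full allocation. The Python sketch below is one possible reading under stated assumptions; the `VM` record, the single CPU dimension, and the `spare_ghz` input are hypothetical simplifications.

```python
from dataclasses import dataclass


@dataclass
class VM:
    """Hypothetical VM record: a name, a priority tier, and a CPU demand."""
    name: str
    tier: int
    demand_ghz: float


def plan_evacuation(vms, spare_ghz, tiers=(1,)):
    """Return the VMs to move off a degraded host, or [] if any would be squeezed.

    Assumed policy: evacuate only VMs in `tiers`, and only when the remaining
    hosts have enough spare CPU to give every one of them its full demand.
    """
    candidates = [vm for vm in vms if vm.tier in tiers]
    needed = sum(vm.demand_ghz for vm in candidates)
    return candidates if needed <= spare_ghz else []


vms = [VM("db", 1, 4.0), VM("web", 1, 2.0), VM("batch", 3, 5.0)]
print([vm.name for vm in plan_evacuation(vms, spare_ghz=8.0)])  # ['db', 'web']
print(plan_evacuation(vms, spare_ghz=5.0))                      # []
```

A fuller policy would also account for memory and for the cost of the migrations themselves.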
Challenge: Impact of Proactive Actions
• Proactive actions can impact the performance of VMs
• Currently, the system cannot determine if the impact is acceptable
• Users want to understand the impact and evaluate tradeoffs
• Capacity planning must be done to minimize the impact of failures
Solution: What-if Simulations
• Performance-accurate simulation of the impact of failures
• Performance impact is measured by the reduction in CPU cycles and memory allocation after a (simulated) failure
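A toy version of such a what-if calculation is sketched below in Python. It is far from performance-accurate: it assumes the cluster behaves as one pooled resource, models a single dimension (CPU in GHz; memory could be handled the same way), and shares any shortfall proportionally. All function names are hypothetical.

```python
def allocate(demands, capacity):
    """Give each VM its demand, scaled down proportionally if capacity is short."""
    total = sum(demands.values())
    scale = min(1.0, capacity / total) if total > 0 else 1.0
    return {vm: d * scale for vm, d in demands.items()}


def what_if_host_failure(host_capacity, demands, failed_host):
    """Per-VM fractional loss of allocation if `failed_host` is removed."""
    before = allocate(demands, sum(host_capacity.values()))
    remaining = sum(c for h, c in host_capacity.items() if h != failed_host)
    after = allocate(demands, remaining)
    return {
        vm: (1 - after[vm] / before[vm]) if before[vm] > 0 else 0.0
        for vm in demands
    }


hosts = {"h1": 10.0, "h2": 10.0, "h3": 10.0}   # GHz per host
demands = {"a": 12.0, "b": 8.0, "c": 4.0}      # GHz per VM, 24 GHz in total
impact = what_if_host_failure(hosts, demands, "h1")
print({vm: round(loss, 3) for vm, loss in impact.items()})
```

With 30 GHz available every VM gets its full demand; after losing `h1` only 20 GHz remain for 24 GHz of demand, so each VM loses about one sixth of its allocation in this model.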
Key Lessons
1. Virtualization can significantly improve resource utilization and thus lower cost for datacenters
2. Hyper-converged SDDC Infrastructure solutions reduce opex and capex
3. Proactive failure management is key to ensuring continuous availability of VMs and their related services
Thank you
Anil Kapur
Vishal Gupta