The Impact of Logical and Physical Fragmentation in a

Improving Disk Latency and
Throughput with VMware
Presented by
Raxco Software, Inc.
March 11, 2011
Today’s Agenda
• Provide technical information on how NTFS
impacts VMware I/O performance
• Examine ESX I/O test results
• Economic impact of Windows guests
• Solutions
Virtualization Benefits
• Server consolidation
• Less physical space for data centers
• Lower energy costs
• Easier management
• Eco-friendly alternative
Identifying and Correcting Problems
• Latency is your best indicator of a performance problem
– Device latency is vSphere’s report of the physical storage response time
– Kernel latency is vSphere’s report of ESX’s ability to manage IO
• Experts disagree on specifics, but most agree that…
– Device latency in excess of 15ms is worth inspection
– Device latency in excess of 30ms is likely a problem
– Kernel latency in excess of 2ms means ESX queues are overflowing
• High device latency can result in ESX queuing
– So, correct slow hardware first!
– Then, consider reducing VMDKs on a VMFS volume
– Only then consider changing queue depths
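The thresholds above can be expressed as a small check. This is a minimal sketch, assuming latency samples have already been collected elsewhere; the function name `classify_latency` and its messages are hypothetical and not part of any VMware tool.

```python
# Thresholds taken from the slide; all values are in milliseconds.
DEVICE_INSPECT_MS = 15.0
DEVICE_PROBLEM_MS = 30.0
KERNEL_QUEUE_MS = 2.0


def classify_latency(device_ms: float, kernel_ms: float) -> list[str]:
    """Return hypothetical findings for one latency sample."""
    findings = []
    if device_ms > DEVICE_PROBLEM_MS:
        findings.append("device latency likely a problem")
    elif device_ms > DEVICE_INSPECT_MS:
        findings.append("device latency worth inspection")
    if kernel_ms > KERNEL_QUEUE_MS:
        findings.append("ESX queues overflowing")
    return findings


print(classify_latency(device_ms=22.0, kernel_ms=0.5))
print(classify_latency(device_ms=45.0, kernel_ms=3.1))
```

Device latency is checked first because the slide recommends correcting slow hardware before touching queue depths.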
© Copyright 2010 EMC Corporation. All rights reserved.
Storage Contention Solution:
Storage IO Control
• SIOC calculates data store latency
to identify storage contention
– Latency is a normalized average across
virtual machines
– IO size and IOPS included
• SIOC enforces fairness when data
store latency crosses threshold
– Default of 30ms
– Fairness enforced by limiting VMs’ access
to queue slots
• Net effect: trade throughput for
latency
With Storage IO Control
Actual Disk Resources utilized by each VM are in
the correct ratio even across ESX Hosts
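As a rough illustration of the idea, the sketch below computes an IO-weighted average latency across VMs and, once it crosses the 30ms threshold, splits a fixed number of queue slots in proportion to each VM's shares. This is a toy model under stated assumptions, not VMware's actual algorithm: the `VM` fields, the share values, and the 32-slot queue are all hypothetical.

```python
from dataclasses import dataclass

THRESHOLD_MS = 30.0  # default congestion threshold from the slide


@dataclass
class VM:
    name: str
    shares: int        # hypothetical relative priority
    iops: float        # observed IO rate
    latency_ms: float  # observed average latency


def datastore_latency(vms: list[VM]) -> float:
    """IO-weighted average latency across all VMs on the datastore."""
    total_iops = sum(vm.iops for vm in vms)
    return sum(vm.latency_ms * vm.iops for vm in vms) / total_iops


def allocate_slots(vms: list[VM], queue_slots: int = 32) -> dict[str, int]:
    """Below the threshold nobody is limited; above it, slots follow shares."""
    if datastore_latency(vms) <= THRESHOLD_MS:
        return {vm.name: queue_slots for vm in vms}
    total_shares = sum(vm.shares for vm in vms)
    return {vm.name: max(1, queue_slots * vm.shares // total_shares) for vm in vms}


vms = [VM("db", 2000, 900, 42.0), VM("web", 1000, 300, 35.0), VM("test", 1000, 100, 20.0)]
print(round(datastore_latency(vms), 1))
print(allocate_slots(vms))
```

Limiting slots caps how much outstanding IO each VM can have, which is the throughput-for-latency trade the slide describes.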
NTFS I/O Storms
NTFS Behavior
• NTFS fragments files and free space
• Increases logical I/O to storage controller
• More logical I/O = More physical I/O
• Multiple instances of Windows on host can
lead to I/O contention
What is Fragmentation?
Logical v. Physical
• Logical Level
– NTFS needs disk and cluster size,
enumerates LCNs
– Creates $MFT and $Bitmap
metadata
– $Bitmap is how NTFS “sees”
the disk
– Has no idea about physical/virtual
disk types
Anatomy of an MFT Record
(vcn, lcn, run length): (8a85, 9189a, 7)
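To make the tuple concrete, the sketch below treats a file's data runs as a list of (VCN, LCN, run length) extents and translates a virtual cluster number into a logical one. The hexadecimal values come from the slide; the `Extent` type and `vcn_to_lcn` helper are hypothetical names for illustration, and the compressed on-disk encoding NTFS uses for runs is not modeled here.

```python
from typing import NamedTuple, Optional


class Extent(NamedTuple):
    vcn: int     # first virtual cluster number (offset within the file)
    lcn: int     # first logical cluster number (offset within the volume)
    length: int  # run length in clusters


def vcn_to_lcn(extents: list[Extent], vcn: int) -> Optional[int]:
    """Map a file-relative cluster to a volume-relative cluster."""
    for ext in extents:
        if ext.vcn <= vcn < ext.vcn + ext.length:
            return ext.lcn + (vcn - ext.vcn)
    return None  # VCN is not covered by any run


# The run shown on the slide: (8a85, 9189a, 7), read as hexadecimal.
run = Extent(vcn=0x8A85, lcn=0x9189A, length=7)
print(hex(vcn_to_lcn([run], 0x8A88)))
```

A contiguous file needs a single extent; each additional extent in the list is one more fragment.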
File Allocation
• Create $MFT record (one or more)
• $Bitmap accessed to locate free space
• $MFT record is updated with content
[Diagram: Create → Bitmap Access → MFT Update]
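The allocation steps above can be sketched with a toy bitmap. This is a simplified first-fit model, assuming a list of booleans stands in for $Bitmap; the real NTFS allocation policy is more involved and is not reproduced here. The function names are hypothetical.

```python
def free_runs(bitmap: list[bool]) -> list[tuple[int, int]]:
    """Return (start LCN, length) for every run of free clusters."""
    runs, start = [], None
    for lcn, used in enumerate(bitmap):
        if not used and start is None:
            start = lcn
        elif used and start is not None:
            runs.append((start, lcn - start))
            start = None
    if start is not None:
        runs.append((start, len(bitmap) - start))
    return runs


def allocate(bitmap: list[bool], clusters: int) -> list[tuple[int, int]]:
    """First-fit allocation; spills into later free runs when one is too small."""
    extents = []
    for start, length in free_runs(bitmap):
        take = min(length, clusters)
        for lcn in range(start, start + take):
            bitmap[lcn] = True  # mark cluster as in use
        extents.append((start, take))
        clusters -= take
        if clusters == 0:
            return extents
    raise OSError("volume full")


# Fragmented free space: used clusters (True) separate small free gaps.
bitmap = [True, False, False, True, False, True, False, False, False, True]
print(allocate(bitmap, 5))  # the new file is split across several extents
```

Because the free space is already broken into small gaps, a five-cluster file cannot be placed in one piece, which is how fragmented free space leads to fragmented files.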
File Access
• Load portion of MFT with correct record via
directory
• Locate file in the MFT
• Pass starting LCNs and run lengths to disk
controller
• Number of logical fragments influences
number of physical seeks
[Diagram: Load → Locate File → # LCNs → # Physical Seeks]
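The link between logical fragments and physical seeks can be sketched by counting how many non-contiguous extents a file occupies. This assumes one extra seek per break in contiguity, which is a simplification: caches, controller reordering, and the underlying storage layout all affect the real count. The helper name is hypothetical.

```python
def count_fragments(extents: list[tuple[int, int]]) -> int:
    """Count fragments in a list of (start LCN, length) extents in file order."""
    fragments = 0
    expected_next = None
    for start, length in extents:
        if start != expected_next:
            fragments += 1  # a break in contiguity starts a new fragment
        expected_next = start + length
    return fragments


contiguous = [(100, 8), (108, 4)]          # second run continues the first
scattered = [(100, 4), (900, 4), (40, 4)]  # each run is somewhere else
print(count_fragments(contiguous), count_fragments(scattered))
```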
Logical v. Physical
• Physical Level
– Disk controller maps LCNs to PCNs
– Writes data to disk
Wasted Seeks
Partition State | Total Number of I/O Requests Sent to the File System | Total Number of Resulting Disk Accesses/Seeks | Net Wasted Seeks When Running SYSmark | Percent Net Wasted Seeks When Running SYSmark
Fragmented | 1,320,686 | 2,090,649 | 769,963 | 58.30%
After PerfectDisk | 1,434,454 | 1,616,847 | 182,393 | 12.72%
After Built-In | 1,411,613 | 1,931,395 | 519,782 | 36.82%
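The wasted-seek figures follow from the two totals. The sketch below assumes net wasted seeks are disk accesses minus file system requests, expressed as a percentage of the requests; that definition is inferred from the numbers on the slide rather than stated on it.

```python
# (I/O requests sent to the file system, resulting disk accesses) per partition state
results = {
    "Fragmented": (1_320_686, 2_090_649),
    "After PerfectDisk": (1_434_454, 1_616_847),
    "After Built-In": (1_411_613, 1_931_395),
}


def wasted_seeks(requests: int, accesses: int) -> tuple[int, float]:
    """Return (net wasted seeks, wasted seeks as a percent of requests)."""
    net = accesses - requests
    return net, 100.0 * net / requests


for state, (requests, accesses) in results.items():
    net, pct = wasted_seeks(requests, accesses)
    print(f"{state:18} {net:>9,} {pct:6.2f}%")
```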
How This Affects A Virtual
Environment
• P2V Conversion
• Extra Hypervisor Overhead
• Disk Latency Degradation
• Overall Performance
• System Throughput
• Wasted Space
• Costly
P2V Conversion
Physical Drive | No Optimization | Optimization
24GB | 24GB | 22GB (2GB Smaller)
ESX Cluster Testing
• Identical disks - 40% free space
• Optimized one set, the other “as is”
• Installed MS Office and MS SQL
• Captured metrics with VMware’s vscsiStats
utility
Total I/O Count
Metric | Fragmented | PerfectDisk | % Improvement
Total IO Count | 37191 | 29238 | 21.3
Read IO Count | 3066 | 2799 | 8.7
Write IO Count | 34125 | 26439 | 22.5
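The improvement column can be recomputed from the raw counts. This sketch assumes "% Improvement" means the relative drop in IO count from the fragmented disk to the optimized one; the results agree with the slide to within rounding.

```python
def pct_reduction(before: float, after: float) -> float:
    """Relative decrease from `before` to `after`, in percent."""
    return 100.0 * (before - after) / before


# (fragmented, PerfectDisk) IO counts from the slide
counts = {"Total": (37191, 29238), "Read": (3066, 2799), "Write": (34125, 26439)}

for label, (frag, opt) in counts.items():
    print(f"{label:5} IO count: {pct_reduction(frag, opt):.1f}% fewer")
```

Writes account for most of the IO in this test, so the write reduction dominates the total.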
49% Reduction in Latency!
Latency Bucket | 30ms | 50ms | 100ms | >100ms | Total
Fragmented I/O | 12749 | 9877 | 8700 | 9116 | 40,442
PerfectDisk I/O | 6707 | 4923 | 4081 | 5053 | 20,764
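The 49% headline appears to correspond to the drop in the number of I/Os landing in these high-latency buckets. A short sketch that sums the buckets and computes the reduction per bucket and overall, using only the counts on the slide:

```python
buckets = ["30ms", "50ms", "100ms", ">100ms"]
fragmented = [12749, 9877, 8700, 9116]
perfectdisk = [6707, 4923, 4081, 5053]


def pct_fewer(before: int, after: int) -> float:
    """Percent decrease in I/O count from `before` to `after`."""
    return 100.0 * (before - after) / before


for bucket, frag, opt in zip(buckets, fragmented, perfectdisk):
    print(f"{bucket:>7}: {frag:6d} -> {opt:6d} ({pct_fewer(frag, opt):.1f}% fewer)")

total_frag, total_opt = sum(fragmented), sum(perfectdisk)
print(f"  total: {total_frag} -> {total_opt} ({pct_fewer(total_frag, total_opt):.1f}% fewer)")
```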
Disk Latency
[Chart: Histogram of I/O latency in microseconds (us), Volume: 8260. X axis: uSec, from 1 to 100000 and above. Y axis: Freq, 0 to 60000. Series: Fragmented (Average), Defragmented (Average).]
12X More Large I/O
I/O Size | Fragmented Disk | PerfectDisk Disk
Total IO Equal to 524K | 2512 | 848
Total IO > 524K | 247 | 2959
Read IO Equal to 524K | 33 | 7
Read IO > 524K | 125 | 65
Write IO Equal to 524K | 2480 | 841
Write IO > 524K | 122 | 2894
Large I/O: 12 times more of the largest IO
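The "12 times" figure is the ratio of I/Os in the largest size bucket. A minimal check using the counts from the slide (the dictionary layout is just for illustration):

```python
# I/O counts in the two largest size buckets, as labeled on the slide
large_io = {
    "equal_524k": {"fragmented": 2512, "perfectdisk": 848},
    "over_524k": {"fragmented": 247, "perfectdisk": 2959},
}

over = large_io["over_524k"]
ratio = over["perfectdisk"] / over["fragmented"]
print(f"{ratio:.1f}x more I/Os larger than 524K on the optimized disk")
```

Larger requests mean the same data moves in fewer operations, which is consistent with the lower total IO count measured in the same test.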
Improved Sequential I/O
Metric | Fragmented | PerfectDisk | Improvement
Total IO | 127703 | 90526 | 25%
Sequential IO | 22126 | 24340 | 33%
Percent Sequential | 17% | 27% | 58%
Improved Sequential I/O
[Chart: Histogram of distance (in LBNs) between successive commands, Volume: 8260. X axis: LBN. Y axis: Freq, 0 to 35000. Series: Fragmented (Average), Defragmented (Average).]
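The histogram plots the distance, in logical block numbers, between successive commands. The sketch below shows one way such distances could be derived from an I/O trace. It assumes each command is a (start LBN, length in blocks) pair and treats a command as sequential when it starts exactly where the previous one ended; the trace is made up, the names are hypothetical, and the bucketing used by vscsiStats itself may differ.

```python
from collections import Counter


def seek_distances(trace: list[tuple[int, int]]) -> list[int]:
    """Distance from the end of each command to the start of the next."""
    distances = []
    prev_end = None
    for start, length in trace:
        if prev_end is not None:
            distances.append(start - prev_end)
        prev_end = start + length
    return distances


def percent_sequential(trace: list[tuple[int, int]]) -> float:
    """Share of commands that begin where the previous one ended."""
    distances = seek_distances(trace)
    sequential = sum(1 for d in distances if d == 0)
    return 100.0 * sequential / len(distances)


# Hypothetical trace: three sequential reads, a jump, one more, then a jump back.
trace = [(0, 8), (8, 8), (16, 8), (5000, 8), (5008, 8), (120, 8)]
print(Counter(seek_distances(trace)))
print(percent_sequential(trace))
```

A defragmented volume should concentrate more of the histogram near zero distance, since contiguous files are read and written in order.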
Installation Time Comparison
Task | Fragmented | PerfectDisk | % Improvement
MS Office Install | 20 min | 15 min | 25
MS SQL Install | 76 min | 51 min | 33
The Cost of Fragmentation
EXAMPLE:
• 20 files x 6 seconds = 2 minutes
• 300 users x 2 min = 10 hours/day
• 10 hrs x $25/hr = $250/day
• Annual cost = $62,500 (250 working days x $250/day)
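The example can be reproduced step by step. The sketch assumes the 20 files and 6-second delay apply per user per day, and that the annual figure uses 250 working days, which is the value implied by $62,500 divided by $250 per day; the constant names are illustrative.

```python
FILES_PER_USER_PER_DAY = 20
DELAY_SECONDS_PER_FILE = 6
USERS = 300
HOURLY_RATE = 25.0
WORKING_DAYS = 250  # assumption: implied by $62,500 / $250 per day

minutes_per_user = FILES_PER_USER_PER_DAY * DELAY_SECONDS_PER_FILE / 60
hours_per_day = USERS * minutes_per_user / 60
daily_cost = hours_per_day * HOURLY_RATE
annual_cost = daily_cost * WORKING_DAYS

print(f"{minutes_per_user:.0f} min/user, {hours_per_day:.0f} hours/day")
print(f"${daily_cost:,.0f}/day, ${annual_cost:,.0f}/year")
```

Changing any one input, such as the number of users or the hourly rate, scales the annual cost linearly.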
Virtual Guest Fragmentation
• Windows guests have all the same NTFS
behavior
• Fragmentation produces more IOPS
• Fragmentation reduces ESX throughput
• Fragmentation increases ESX disk latency
• Fragmentation creates resource contention between
host & guests
Solutions
• Expensive
– More disks and faster disks
– Upgrade Fibre Channel
– Troubleshooting
• Inexpensive
– Optimize the Windows guest systems
PerfectDisk 12 vSphere
• NEW: Virtualization Awareness/host & client
• NEW: OptiWrite Fragmentation Avoidance
• NEW: “Zero-fill” free space
• NEW: “Short stroking” for thin provisioned disks
• NEW: Schedule guest compaction
• NEW: Snapshot & Linked Clone recognition
PerfectDisk Benefits on ESX
• Saves $$$ in productivity and admin
• Reduces resource contention for VMs
• Reduces total IO workload
• Improves throughput
• Reduces disk latency
• Delivers optimal performance
Contact Raxco
• Free Evaluation Software
• Excellent Support to Get You Started
• White Papers
• Great ROI
• www.raxco.com
• Toll Free: 1.800.546.9728