Bottlenecks
SQL Server on flash storage
Matt Henderson
Consulting Engineer
Violin Memory Inc. Proprietary
Modern Datacenters
Blade servers
Rack servers
Multi-CPU (2-4 sockets)
Multi-core (4-10 cores)
2.0GHz+ per core
Moore's Law shifted from faster CPUs to adding more cores
Cheap and plentiful processing power
Inside the Server
Many threads
Many cores
Many time slices
10-40% utilization
Why is the CPU not at 100%? -> Resource constrained
Databases
Persistent data storage
Data access engine
Application logic processor
SQL Server Process Management
One SQL scheduler per logical core
SQL scheduler time-slices between users
Queues
Running
‒ Currently executing process
Waiting
‒ Waiting for a resource (IO, network, locks, latches, etc.)
Runnable
‒ Resource ready, waiting to get on CPU
Bottleneck Basics
Waits: What is keeping the SQL engine from continuing?
‒ Application (SQL)
‒ Hardware
‒ Architecture
‒ Tuning
Production wait-stats sample:
WaitType                    Wait_S   Resource_S   Signal_S   WaitCount   Percentage   AvgWait_S   AvgRes_S   AvgSig_S
BROKER_RECEIVE_WAITFOR      661.36   661.36       0          4           44.6         165.3388    165.3388   0
LCK_M_IS                    139.46   139.35       0.11       489         9.4          0.2852      0.285      0.0002
LCK_M_X                     96.86    96.54        0.32       373         6.53         0.2597      0.2588     0.0009
LCK_M_U                     83.93    83.91        0.02       32          5.66         2.6227      2.6221     0.0006
PAGEIOLATCH_SH              83.92    83.84        0.08       9835        5.66         0.0085      0.0085     0
LCK_M_S                     82.44    82.1         0.33       419         5.56         0.1967      0.1959     0.0008
ASYNC_NETWORK_IO            54.4     53.61        0.79       33146       3.67         0.0016      0.0016     0
ASYNC_IO_COMPLETION         43.1     43.1         0          37          2.91         1.1649      1.1649     0
BACKUPIO                    42.22    42.19        0.03       12607       2.85         0.0033      0.0033     0
BACKUPBUFFER                36.64    36.48        0.15       2175        2.47         0.0168      0.0168     0.0001
LCK_M_IX                    30.88    30.85        0.03       130         2.08         0.2376      0.2373     0.0003
IO_COMPLETION               28.12    28.11        0.01       2611        1.9          0.0108      0.0108     0
CXPACKET                    23.27    21.6         1.67       3542        1.57         0.0066      0.0061     0.0005
PREEMPTIVE_OS_CREATEFILE    18.84    18.84        0          247         1.27         0.0763      0.0763     0
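A quick way to sanity-check a wait-stats sample like this one: the average columns are the totals divided by the wait count, and each total wait splits into resource wait plus signal wait (time spent runnable). A minimal sketch in Python, using three rows copied from the table:

```python
# Rows copied from the wait-stats sample above:
# (Wait_S, Resource_S, Signal_S, WaitCount)
rows = {
    "BROKER_RECEIVE_WAITFOR": (661.36, 661.36, 0.00, 4),
    "PAGEIOLATCH_SH":         (83.92,  83.84,  0.08, 9835),
    "CXPACKET":               (23.27,  21.60,  1.67, 3542),
}

def averages(wait_s, resource_s, signal_s, wait_count):
    """Recompute AvgWait_S, AvgRes_S, AvgSig_S from the raw counters."""
    return (wait_s / wait_count,
            resource_s / wait_count,
            signal_s / wait_count)

for wait_s, resource_s, signal_s, _ in rows.values():
    # Total wait time = time waiting on the resource + time runnable (signal).
    assert abs((resource_s + signal_s) - wait_s) < 0.05
```

A high signal share relative to resource time points at CPU pressure (long runnable queues) rather than the resource itself; in this sample the signal columns are tiny, so the waits really are resource waits.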
Visualizing Latency
HDD storage: 8 ms latency
‒ Total latency = seek time + rotational latency
I/O-bound apps: Oracle, DB2, SQL Server, etc.
Flash storage: 0.15 ms (150 microsecond) latency
‒ Over 50 I/Os complete in the same amount of time
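The "over 50" figure follows directly from the two latencies on the slide:

```python
# Latencies from the slide: 8 ms per HDD I/O vs 0.15 ms per flash I/O.
hdd_latency_ms = 8.0     # seek time + rotational latency
flash_latency_ms = 0.15  # 150 microseconds

# Number of flash I/Os that complete in the time of a single disk I/O.
flash_ios_per_hdd_io = hdd_latency_ms / flash_latency_ms  # ~53
```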
Optimizing Utilization
• CPUs are faster than hard drives
• Adding cores does not correct the issue
• 8 cores @ 10% = 1 core @ 80%
• 40% utilization = 60% overpaying
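The slide's core-equivalence arithmetic, spelled out:

```python
# 8 cores at 10% busy do the same useful work as 1 core at 80%.
busy_core_equivalents = 8 * 0.10   # 0.8 of one core's worth of work

# At 40% average utilization, 60% of the purchased (and licensed)
# CPU capacity sits idle waiting on bottlenecks.
idle_fraction = 1 - 0.40
```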
Storage Enables Applications
[Diagram: with < 1 ms storage latency, the CPU does not wait on I/O]
Accelerate Workloads & Save Money
Accelerate / Consolidate
• Same CPUs
• Same work
• Less human time
• Fewer cores = fewer licenses
  ‒ Operating System
  ‒ Virtualization
  ‒ Database / Application
Compounding issues
Adding more cores increases licensing costs
Faster cores still have to wait for blocking items
Probably faster, definitely more expensive
Best to find the bottleneck and solve it, rather than buying more or faster CPUs
Buffer / RAM
SQL Buffer
‒ MRU – LRU chain
‒ Pages move down chain until they fall off the end
PLE: Page Life Expectancy
‒ How long a page lives in the chain before being cycled out
‒ Microsoft's traditional guidance: over 300 seconds
Working Set
‒ How much data do you want/NEED in RAM?
‒ Database size isn’t relevant
‒ What’s being worked on now, what will be in the near future?
Workload profile
‒ What data are the users hitting?
‒ Is there any way to predict the future data page hits?
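PLE can be reasoned about as buffer size over churn. A rough sketch, assuming a steady page-read rate (the buffer size and read rate below are hypothetical, not from the slides):

```python
PAGE_SIZE_KB = 8  # SQL Server data pages are 8 KB

def estimated_ple_seconds(buffer_pool_gb, page_reads_per_sec):
    """Rough PLE estimate: how long a page survives on the MRU-LRU chain
    if pages are evicted as fast as new ones are read in."""
    buffer_pages = buffer_pool_gb * 1024 * 1024 / PAGE_SIZE_KB
    return buffer_pages / page_reads_per_sec

# Hypothetical: a 32 GB buffer pool churned at 10,000 page reads/sec
# cycles a page out in about 7 minutes -- comfortably above the old
# 300-second guidance, which predates today's RAM sizes.
ple = estimated_ple_seconds(32, 10_000)
```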
Legacy Storage Architecture
Aggregation & Segregation Model
Many autonomous parts
Separate RAID groups
Data locality
Hot spots
Transient data / usage
Tiering software / admin
Cost Inefficient
Each workload is bottlenecked by a different subset of components
Flash != Flash
‒ SSDs are a modernization: disk substitution
‒ Distributed-block all-flash is a revolution
Database Trends
More, More, More
• Data never dies – archived but not deleted
• More users, more applications, hybrid systems
• Interdependent usage
• Usage and complexity explosion
  ‒ More users
  ‒ SQL is ad hoc
  ‒ Machines consume data (SaaS)
  ‒ Mixed workloads (OLTP & DW)
• Multi-core, multi-CPU servers
  ‒ Processing capacity of host servers is exploding
  ‒ Compute is cheap
• 24x7 is REQUIRED
  ‒ Little to no maintenance windows
• Workload is random
  ‒ Many LUN stripes on each disk
  ‒ Many users, applications, databases, servers
Flash for Databases - General
• Simplicity
  • Less to design and plan; easy to scale; no performance variance
• Faster, quicker, more: speed and scale
• Reduced admin time
  • No LUN/stripe mapping
  • Less chasing down and resolving performance issues
  • Chores and tasks run faster
    • ETLs
    • Backups/restores – reduced RTO
    • No index rebuilding for logical fragmentation
• Reduced admin costs / issues
  • No costly tiering/locality-management software
  • No quarterly new-spindle purchases
  • Turn asynchronous mirroring into synchronous mirroring
  • Less hardware to do the same work (fewer context switches)
  • Drop from Enterprise to Standard edition SQL Server
Right Sizing & Eliminating Bottlenecks
• SQL Server
• Number of database files
• Windows
• Number of “Disks” (LUNs)
• Transport
• FC 8 Gb port
  • 100k IOPS
  • 800 MB/s
• iSCSI (10 GigE)
  • 60k IOPS
  • 550 MB/s
• Virtualization
• One virtual disk
Architecting for I/O
Rule of Many: at every layer, utilize many of each component to reduce bottlenecks
‒ Database files
‒ Virtual disks
‒ MPIO paths
‒ Physical ports
‒ LUNs
Parallelization: use many objects and many processes to increase parallel workloads
‒ Spread transactions over pages
‒ Use a switch (path multiplier)
‒ Use several LUNs
‒ Increase MAXDOP (and test)
I/O latency is sacred: don't add anything to the I/O path that doesn't need to be there
‒ LVM (Logical Volume Manager)
‒ Virtualization
‒ Compression
‒ De-dup
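The Rule of Many can be sketched as simple round-robin placement. The file and LUN names below are illustrative, not from the slides:

```python
def place_files(n_files, luns):
    """Assign database files to LUNs round-robin, so no single LUN
    (and no single queue behind it) serializes the I/O."""
    return {f"datafile_{i:02d}.ndf": luns[i % len(luns)]
            for i in range(n_files)}

# 8 data files spread over 4 LUNs: each LUN carries exactly 2 files,
# and every layer (files, LUNs, and the paths beneath them) stays busy.
layout = place_files(8, ["LUN0", "LUN1", "LUN2", "LUN3"])
```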
SMB Direct: Optimize CPU Utilization
[Diagram: Windows hosts use NDKPI (RDMA*) over the transport (Fibre / iSCSI)]
Remote Direct Memory Access
• Reduce CPU time spent handling I/O
  • More CPU for applications
• Reduce time (latency) to return I/O
  • Accelerate applications
* 30-80% reduction in CPU utilization
Demo
Australian Department of Defense
Challenges
• Latency-sensitive applications
• Massive IOPS requirement
• Data center space limitations
• Unable to take advantage of HA and virtualization
Solution
• Windows Flash Array
• 4-CPU host servers
Applications
• Network monitoring / dashboard reporting
• MSSQL
• Hyper-V
Results
• Sustained 500-800K IOPS at <1 ms latency
• Server consolidation – 45 servers to 2
• Core/licensing consolidation – 90 CPUs to 8
• 90% reduction in DC space & power requirements
• Simplified management and support processes on the MS stack
Logistic company – WFA testing results
Technical details
Technical details
• SQL Server with 1.3 billion rows
• Utilizes Microsoft's SMB Direct networking protocol with RDMA-capable 10 Gb Ethernet cards
• Host server: HP DL580 (4 CPUs – 96 logical cores – 4 rack U)
• Storage: Violin Memory WFA (3 rack U – Windows Server embedded inside the array for native SMB Direct)
Results from Violin WFA storage:

Platform                                                       Test environment                  Prod environment
Existing platform (Disk+Flash, SQL 2012)                       0.455 mill rows/sec               0.621 mill rows/sec
WFA + SMB Direct, SQL 2012                                     13.2 mill rows/sec (30x improved) 13.2 mill rows/sec (21x improved)
WFA + SMB Direct, SQL 2014 (updatable in-memory columnstore)   211 mill rows/sec (464x improved) 211 mill rows/sec (340x improved)
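The improvement factors in the results can be recomputed from the row rates (13.2/0.455 is ~29x, which the original slide evidently rounds up to "30x"; the other factors match after rounding):

```python
# Throughput in million rows/sec, from the results above.
baseline_test, baseline_prod = 0.455, 0.621   # existing Disk+Flash, SQL 2012
wfa_2012 = 13.2                               # WFA + SMB Direct, SQL 2012
wfa_2014 = 211.0                              # WFA + SMB Direct, SQL 2014 columnstore

speedups = {
    "2012_test": wfa_2012 / baseline_test,    # ~29x (slide says 30x)
    "2012_prod": wfa_2012 / baseline_prod,    # ~21x
    "2014_test": wfa_2014 / baseline_test,    # ~464x
    "2014_prod": wfa_2014 / baseline_prod,    # ~340x
}
```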
SharePoint 2013 with Hyper-V and WFA
Reduce risk
‒ New tools & features
‒ New usage pattern
‒ Random and dynamic usage over many users
Reduce costs (fewer servers)
Increase performance
‒ Optimize CPU for application work
‒ Crawls / index searches
‒ Content access (read or write documents)
‒ Permissions authentication (page loads)
Live Migration
‒ Move SQL, app, and web servers between hosts
‒ Move data between arrays
[Diagram: cluster callouts – balance reads/writes, increase performance, isolate operations work]
What is Flash?
Non-volatile storage
Used in cell phones, USB "thumb" drives, SSDs, cameras, etc.
SLC vs. MLC vs. eMLC vs. TLC
Pros
‒ Random access
‒ Storage density
‒ Reads and writes are very fast (~30 µs)
‒ Power & heat
Cons
‒ Memory wear
‒ No deletes or updates at the byte or bit level
‒ Erase only at the block level
  - ~30x slower than reads/writes
  - Blocks the transaction while still-active values are moved to another wafer
‒ Block erases cause the infamous write cliff (latency spikes up to 30x)
Flash Technology
1 package -> 8 dies -> 2 planes per die
1000s of blocks per die, 256 pages per block
Writes happen at the page level; erases happen at the block level
Flash Deployments
PCIe card (1st generation)
‒ Fastest-latency configuration
‒ Not sharable
‒ Crashes host on failure
‒ Read-mostly
‒ Requires host-based RAID
‒ Has write cliff
SSD (2nd generation)
‒ Controller adds latency
‒ Sharable within chassis
‒ Won't crash host
‒ Requires data segregation / RAID
‒ Has write cliff
All-flash array (3rd generation)
‒ One block of storage, pre-RAID'd
‒ Sharable or direct attach (fastest-latency configuration)
‒ Enterprise redundancy
‒ Massive parallelism
‒ No write cliff (Violin only)
NTFS Tuning – Need several LUNs
NTFS Tuning – Content Indexing
LUN Tuning – 4k
• LUN at 4k
• NTFS ALU (allocation unit size) at 4k (the old disk recommendation was 64k)
• Increases utilization of RAM, transport, and the SQL buffer
SQL Tuning – IDENTITY Column & Clustered Indexes
• IDENTITY column
• As PK = OK
• As Clustered index = NOT OK
  • Funnels data onto one data page
  • Bottlenecks on inserts
• Fixing this removes locks/latches as a source of bottleneck
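A toy model of the hotspot the slide describes, assuming 100 rows fit on a data page (the page capacity and key values are illustrative, not from the slides):

```python
ROWS_PER_PAGE = 100  # illustrative page capacity

def target_page_sequential(identity_value):
    """Ever-increasing IDENTITY keys always land on the current last page."""
    return identity_value // ROWS_PER_PAGE

def target_page_spread(key_hash, n_pages):
    """A well-distributed clustered key fans inserts out across pages."""
    return key_hash % n_pages

# Ten concurrent inserts with IDENTITY values 1000..1009 all contend for
# one page; the same ten keys spread over 8 pages touch 8 different pages.
hot_pages = {target_page_sequential(k) for k in range(1000, 1010)}
spread_pages = {target_page_spread(k, 8) for k in range(1000, 1010)}
```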
SQL Tuning – In and Outs
• Out
  • Many files for speed / data placement
  • Many partitions for speed / data placement
  • Index defragging to re-order data
  • MAXDOP <= 4
  • Admin in serial mode (one at a time)
    • Table loads (ETL)
    • Index maintenance
    • Backups
    • Archiving
  • Admin only during off hours
  • Segregated workloads
• In
  • Any number of files and partitions
  • MAXDOP = number of cores in the server
  • Admin in parallel
  • Admin during business processing
  • Mixed workloads (many servers & DBs on the same storage device)
Q&A
Thank you