Freakish Database Performance With Flash Storage
[email protected]
763.228.6463
IBM Internal Use Only
© 2011 IBM Corporation
Agenda
• Share some experience with using solid state/flash storage for database workloads:
  • OLTP (2TB)
  • Warehouse (76TB)
• Which workload characteristics can best leverage flash storage?
• What are some best practices?
OLTP Workload
The Challenge
• Goal: 1.5M trans in 5.5 hours
• Stretch goal: 1.5M trans in 2 hours
• Improve backup time
• What is possible if CPU, memory, storage, and network are not constrained?
Initial profile
• A brokerage house package
• Batch cycle comprised of five Java programs (only one can be parallelized)
• 1.5M transactions in 8 hours after extensive application and SQL tuning
• 1.68TB uncompressed
• Online backup time (backup, then gzip): 36 hours
The Setup
No holds barred
• 2 x 64 cores, 3.86 GHz, 1TB RAM
• 86TB HDD, 256GB cache (2 ms average response time)
• 1TB SSD
• 10GbE
Approaches
• Enabled compression
• No database tuning
• All-HDD
• Mixed: SSD (logs & temp), HDD (data & indexes)
• All-SSD
Accept all default database settings out of the box
• STMM
• Auto runstats
• Auto online table reorg
Results (Mixed: SSD for logs & temp, HDD for data & indexes vs. All-SSD)
• Disk utilization: 14% / 26% (Mixed SSD/HDD tiers), < 1% busy (All-SSD)
• Average IOPS: 20
• Throughput: 450KB/s
• Application engines: 30
• Uncompressed offline backup: 30 – 40 min
• Compressed online/offline backup (SSD to HDD): 18 min
Application Engines Performance
• More application engines generating transactions to reduce batch elapsed time
• Most improvements resulted from more CPUs for the application
  • CPU intensive
• Verbose application logging
  • Application logs generated more IOs than the database!
• Low database IO profile
Final Results
• Goal: 1.5M trans in 5.5 hours ✔
• Stretch goal: 1.5M trans in 2 hours ✔
• Improve backup time: 18 minutes vs. 36 hours ✔
• Best result: 1.5M trans in 1.1 hours (All-SSD) ✔
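The backup improvement above works out to a large constant factor; a quick sanity check of the arithmetic, using only the times stated on the slide:

```python
# Backup time improvement: 36 hours (backup, then gzip) down to 18 minutes.
old_backup_min = 36 * 60   # 2160 minutes
new_backup_min = 18
speedup = old_backup_min / new_backup_min
print(speedup)  # 120.0, i.e. the backup is 120x faster
```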
Warehouse Workload
The Challenge
• Aging servers and storage
• Data center floor space, cooling, and power consumption constraints
• Same or better performance
Initial profile
• Servers and storage running 100% all day long
• Maxed out at around 30 – 40 active users
• Half-stroked disks to get performance and throughput
The Setup
Approach
• Replacement will be very fast, very small, very simple
Database IO Improvement for Warehouse Workload
76TB IBM SSD vs. Old HDD
Sub-millisecond IO response time, sustained
• Synchronous reads: 21.8x
• Synchronous writes: 13.6x
• Asynchronous reads: 17.6x
• Asynchronous writes: 18.34x
• Data pages per asynchronous request: 1.8x
Note: Asynchronous IOs are ~18x faster, and each asynchronous request is ~2x more effective due to the 32K page size, for a combined ~36x improvement.
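The combined improvement in the note above is simply the product of the two measured factors. Using the unrounded numbers from this slide:

```python
# Measured improvement factors (SSD vs. old HDD) from the slide.
async_write_speedup = 18.34   # asynchronous writes are ~18x faster
pages_per_request = 1.8       # each async request moves ~2x the data (32K pages)

# Effective asynchronous throughput gain is the product of the two factors.
combined = async_write_speedup * pages_per_request
print(round(combined, 1))  # 33.0; the slide rounds ~18x * ~2x up to ~36x
```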
Benchmark Queries Improvement for Warehouse Workload
76TB IBM SSD vs. Old HDD
Benchmark details
• Actual IO- and CPU-intensive queries captured from business users
• Runs weekly to monitor any performance degradation with respect to new and organic growth in the warehouse over time
• Noise queries (75) + benchmark queries (25) = 100

                          All-SSD                  Old
Noise queries completed   85%                      32%
BM queries completed      100% (first time ever)   64% (historically never reached 100%)
CPU utilization           30%                      100%
Benchmark Queries Speed-Up Factor for Warehouse Workload (Plotted on Logarithmic Scale)
76TB IBM SSD vs. Old HDD
Speed-up details
• Average: 2.21 (log) or 163.96x faster
• Median: 1.48 (log) or 29.96x faster (50% of queries are at least ~30x faster)
• Low: 0.56 (log) or 3.59x faster
• High: 3.05 (log) or 1,113.56x faster
• Time is measured as elapsed time (prepare + execute + fetch)
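The (log) figures above are base-10 logarithms of the raw speed-up factors. A small helper to convert between the two (round-tripping the rounded log values lands close to, but not exactly on, the slide's factors, which were computed from unrounded data):

```python
import math

def log_to_factor(log10_speedup):
    """Convert a base-10 log speed-up back to a raw factor."""
    return 10 ** log10_speedup

def factor_to_log(speedup):
    """Convert a raw speed-up factor to its base-10 log."""
    return math.log10(speedup)

print(round(log_to_factor(1.48), 1))     # 30.2 (slide: 29.96x)
print(round(factor_to_log(1113.56), 2))  # 3.05
```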
CPU Utilization
• About 30% busy
• Note: we are also using disk-level encryption (self-encrypting drives, SED)
EXP30 Ultra SSD IO Specifications
• Each drive: SFF (1.8”), 1/5 of 1U, 387GB
• IO drawer: 30 drives (6 x 5). Total raw capacity: 11.6TB (30 x 387GB). Cache: 3.1GB
• IOPS: 400K (100% read) / 280K (70/30 R/W) / 165K (100% write)
• Two POWER 740 servers connected to one IO drawer
• PCIe attached via GX++ adapter (8Gb/s)
• Configured as 5+P LUNs (130GB each)
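A quick sanity check of the drawer capacity figure above:

```python
drives_per_drawer = 30     # 6 x 5 layout
drive_capacity_gb = 387    # per 1.8" SFF drive
raw_gb = drives_per_drawer * drive_capacity_gb
print(raw_gb)              # 11610 GB, i.e. ~11.6TB raw per drawer
```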
Deployment Considerations
• IO adapter card (HBA)
  • At 120K – 400K IOPS per IO drawer and a 32K IO size, it is possible to saturate the HBA
  • Plan for an adequate number of HBAs
  • If using SAN, be sure the bandwidth to the storage server is consistent along the whole path, for example, 8Gb/s
  • Balance IOs across HBAs and front-end ports for even utilization
  • Be cautious about mixing flash storage and HDD drives on one HBA
• Fewer, larger LUNs (500GB – 700GB)
  • LUNs do take up available system memory and CPU cycles on the server
  • Multiple logical volumes per LUN; no reason to stripe an LV across LUNs
• Use a large page size (32K) and extent size, but ensure that the database bufferpool(s) are adequately sized to accept big reads
• Optimize data movement with fewer IOPS; it is not about driving up IOPS
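A rough illustration of the saturation point mentioned above. This back-of-the-envelope sketch ignores link encoding and protocol overhead, so treat the result as a lower bound on the number of 8Gb/s paths needed:

```python
import math

iops = 120_000                   # low end of the drawer's IOPS range
io_size_bytes = 32 * 1024        # 32K IO size
link_bytes_per_sec = 8e9 / 8     # one 8Gb/s path, ~1 GB/s raw

required_bytes_per_sec = iops * io_size_bytes             # ~3.9 GB/s
paths_needed = math.ceil(required_bytes_per_sec / link_bytes_per_sec)
print(paths_needed)  # 4 paths, even at the low end of the IOPS range
```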
Candidate Application Considerations
• High IO profile
  • Indexes, data
  • Database logs and temp spaces can already take advantage of cache write-through, so they may not be the best candidates
• Applications that can parallelize well to take advantage of higher IO throughput
  • Before we can process more transactions per second, the applications need to be able to generate more transactions per second
  • For example, we needed to increase the number of application engines from 3 to 30 in order to generate 8x throughput in transaction rate
• Applications that spend more time fetching result sets across a network, rather than executing complex queries in the database, will likely see less improvement (slow consumers)
  • client_idle_wait_time (ms): time spent waiting for the client/application to send its next request
  • If the database spends more time waiting for the client/application to send work, then improving database response time alone will not improve throughput
    • Increase application parallelism
    • Look for network congestion issues
  • call monreport.dbsummary(600) and examine client_idle_wait_time
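A minimal sketch of the slow-consumer check described above. The inputs would come from the database monitor (e.g. the client_idle_wait_time element surfaced by the monreport.dbsummary call mentioned on the slide); the 50% threshold is an illustrative assumption, not a product recommendation:

```python
def is_client_bound(client_idle_wait_ms, interval_ms, threshold=0.5):
    """Flag a workload whose database mostly waits on the application.

    client_idle_wait_ms: time the database spent waiting for the client
    to send its next request (the client_idle_wait_time monitor element).
    threshold: illustrative cutoff; 0.5 means "idle more than half the time".
    """
    return client_idle_wait_ms / interval_ms > threshold

# Example: over a 600-second monitoring interval the database sat idle
# waiting on clients for 450 seconds. Faster storage alone won't help;
# look at application parallelism and the network instead.
print(is_client_bound(450_000, 600_000))  # True
```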
Why Consider Flash Storage
• Greatly beneficial for high-IO workloads
• Much smaller footprint, much more energy efficient
  • Servers (11), IO drawers (7), and power supply all fit in one rack!
• Achieve high performance and throughput quickly, without tuning
• Performance, reliability, price