Transcript Document

SQL Server 2014: In-Memory OLTP
Adoption Considerations
Mike Weiner
@MikeW_CAT
[email protected]
In-Memory OLTP Architectural Pillars - Review

Main-Memory Optimized
• Direct pointers to rows
• Indexes exist only in memory
• No buffer pool
• No write-ahead logging
• Stream-based storage
Principle: Performance-critical data fits in memory
Result: Speed of an in-memory cache with capabilities of a database

T-SQL Compiled to Native Machine Code
• T-SQL compiled to machine code leveraging the VC compiler
• A procedure and its queries become a C function
• Aggressive optimizations at compile time
Principle: Push decisions to compilation time
Result: Queries & business logic run at native-code speed

Non-Blocking Execution
• Multi-version optimistic concurrency control with full ACID support
• Lock-free data structures
• No locks, latches or spinlocks
• No I/O during transaction
Principle: Conflicts are rare
Result: Transactions execute to completion without blocking

SQL Server Integration
• Same manageability, administration & development experience
• Integrated queries & transactions
• Integrated HA and backup/restore
Principle: Built-in
Result: Hybrid engine and integrated experience
Introduction
Application Patterns for In-Memory OLTP
In-Memory OLTP can meet demanding scale and performance goals with SQL Server
In-Memory OLTP is a solution for applications suffering from RDBMS performance and scale
bottlenecks
There are some common workload patterns which work well with In-Memory OLTP
Adoption Methodology and considerations for In-Memory OLTP
Evaluate (early) whether In-Memory OLTP can address the application's needs. Some workloads may not
benefit from In-Memory OLTP; others may, but only with a large amount of work.
Migrate in stages; there are tools to help
Consider aspects of the In-Memory OLTP engine in the design
Successful Patterns with In-Memory OLTP*

High Data Insert Rate
• Pattern characteristics and challenge: primarily append-only store; inability to ingest the write workload
• Main benefits of In-Memory OLTP: eliminate contention; minimize IO logging

Read Performance and Scale
• Pattern characteristics and challenge: high performance read operations; unable to meet scale-up requirements
• Main benefits of In-Memory OLTP: eliminate contention; efficient data retrieval; minimize code execution time; CPU efficiency for scale

Compute Heavy Data Processing
• Pattern characteristics and challenge: insert/update/delete workload; heavy computation inside the database; read and write contention
• Main benefits of In-Memory OLTP: eliminate contention; minimize code execution time; efficient data processing

Low Latency
• Pattern characteristics and challenge: requires low-latency business transactions which typical database solutions cannot achieve; high concurrency exacerbates latency
• Main benefits of In-Memory OLTP: eliminate contention; minimize code execution time; efficient data retrieval

Session State Management
• Pattern characteristics and challenge: heavy insert, update and point lookups; user scale under load from multiple stateless web servers
• Main benefits of In-Memory OLTP: eliminate contention; efficient data retrieval; optional IO reduction/removal

*In-Memory OLTP – Common Workload Patterns and Migration Considerations
Early Workload Performance Gains
Factor X Gains for Applications
• BEST FIT – Apps that take full advantage, e.g. web app session state: ~30x
• INGEST / READ HEAVY – Apps with periodic bulk updates & heavy random reads: ~10x
• LEGACY APP – Existing apps typically see 4-7x improvement: ~5x
• UPDATE HEAVY OLTP – Despite 20 years of optimizing for OLTP benchmarks, we still get ~2x on a workload derived from TPC-C: ~2x
In-Memory OLTP impact within the Engine
Traditional Execution Stack vs. In-Memory OLTP Execution Stack – gains by layer:
• Client Connectivity → Client Connectivity: no improvement
• Query Execution → Procedure Execution: 2x to 30x improvement (including concurrency gains)
• Data Access (Buffer Pool) → Data Access (Memory): 2x to 30x improvement (including concurrency gains)
• Transaction Logging → Transaction Logging: same latency; potentially less volume for durable tables; I/O eliminated for non-durable tables
Adoption Steps
• Assess application
• Is the application suitable for In-Memory OLTP?
• Define goals and measure
• Identify bottlenecks
• “Analysis” through MDW and ‘Transaction Performance Analysis Report’
• Staged migration
• “Migrate” utilizing Advisors in Management Studio for help
• Measure against goals
• Understand considerations
• Memory, Storage, Indexing
Assess Application
• Invest in workloads that are OLTP in nature with few exceptions
• Move critical data-sets/T-SQL into In-Memory OLTP and add more via subsequent dev/test
iterations
• Some applications may require a lot of work to achieve the gains, while others migrate quite
easily.
• If you lack the ability to change application code, In-Memory OLTP in SQL Server 2014 won’t work
• Some deployments may not be a good fit
• Hardware limitations
• Long running reporting workloads
• Row-versions can utilize a lot of memory if queries are “long running”
• Queries which execute much better with parallelism
• Dependencies on locking, MARS, unsupported collations, specialized indexes like XML/FullText
• Very dynamic, undefined workload patterns
Establish Performance Baseline
• Production is great
• More accurate and realistic data
• But hard to get access and may impact workload
• Simulation works, but requires…
• Scale, transaction mix, and profiling period matter
• Try to be as realistic as possible with scale, transaction mix, hardware
configuration, etc.
• Baseline using Workload Tools
• Ostress, Distributed Replay, Visual Studio Test System etc…
• In-Memory OLTP’s Analyze, Migrate and Report Toolset
• Performance Monitor/DMV’s
Identify Bottlenecks
• Table contention
• Look at dm_db_index_operational_stats, dm_os_latch_stats, and
Latch Waits/sec counter
• Target table with heaviest latching
• Table access
• Look at dm_db_index_operational_stats
• Target table with heaviest accesses
• Stored procedure CPU usage
• Look at dm_exec_query_stats
• Target stored procedure with highest CPU usage
• Tooling in place
• In-Memory OLTP’s Analyze, Migrate and Report Toolset
Testing with realistic workload is critical
http://msdn.microsoft.com/en-us/library/dn511655(v=sql.120).aspx
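The DMVs listed above can be queried directly. A minimal sketch (the TOP threshold and column choices are illustrative, not part of the toolset above), ranking tables in the current database by page-latch waits:

-- Sketch: rank tables by page-latch waits, a common signal of the
-- contention that memory-optimized tables remove.
SELECT TOP (10)
    OBJECT_NAME(ios.object_id)                          AS table_name,
    SUM(ios.page_latch_wait_count)                      AS page_latch_waits,
    SUM(ios.page_latch_wait_in_ms)                      AS page_latch_wait_ms,
    SUM(ios.leaf_insert_count + ios.leaf_update_count
        + ios.leaf_delete_count)                        AS write_activity
FROM sys.dm_db_index_operational_stats(DB_ID(), NULL, NULL, NULL) AS ios
GROUP BY ios.object_id
ORDER BY page_latch_waits DESC;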
Taking an Iterative Approach to Migration
• Identify bottlenecks in tables
• Address unsupported constructs and migrate data
• Perform minimal changes for interpreted access
• Identify performance-critical transaction code
• Address surface area limitations and migrate code
Durability Considerations
• Durability = SCHEMA_AND_DATA
• Operations on data are always fully logged
• Data is stored in checkpoint files on disk and can be recovered
• Requires storage considerations
• Durability = SCHEMA_ONLY
• Transient data with no log or checkpoint files
• Schema can be recovered but data cannot be recovered after server restart or
failure
• Reduces dependency on disk drives, improves overall transaction throughput
• Not universally applicable – use for select scenarios only
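As a minimal sketch of the two durability options (table, column, and bucket-count values are hypothetical, and the database is assumed to already have a memory-optimized filegroup):

-- Fully durable: data is logged and recovered after a restart.
CREATE TABLE dbo.SalesOrder
(
    OrderId     BIGINT NOT NULL PRIMARY KEY NONCLUSTERED HASH WITH (BUCKET_COUNT = 1000000),
    CustomerId  INT NOT NULL,
    OrderDate   DATETIME2 NOT NULL
)
WITH (MEMORY_OPTIMIZED = ON, DURABILITY = SCHEMA_AND_DATA);

-- Transient: only the schema survives a restart; no log or checkpoint I/O.
CREATE TABLE dbo.SessionCache
(
    SessionId   UNIQUEIDENTIFIER NOT NULL PRIMARY KEY NONCLUSTERED HASH WITH (BUCKET_COUNT = 1000000),
    Payload     VARBINARY(7000) NOT NULL
)
WITH (MEMORY_OPTIMIZED = ON, DURABILITY = SCHEMA_ONLY);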
Storage Considerations for SCHEMA_AND_DATA objects
Log Drive
• Log IO is still critical to latency
• Index operations and UNDO records are not logged
Data Drive – Checkpoint File Pairs (CFP), space and performance
• Have checkpoint files on drive with high sequential I/O
• Needed for offline checkpoint, garbage collection (merge–read/consolidate), and recovery
• Spread checkpoint files over odd numbers of containers over multiple disks
• Data/Delta files are allocated round-robin across all containers, improving streaming IO bandwidth
• Size of files may be larger than the size of the tables in memory
• Start by allocating storage up to 4 times the size of durable tables
• An automated merge consolidates CFPs to remove deleted rows
• Merged CFPs go through state transitions and are eventually garbage collected
• This requires multiple checkpoints and log truncation; take log backups based on the recovery model
• Use workload’s steady state to determine disk usage
http://blogs.technet.com/b/dataplatforminsider/archive/2014/01/16/storage-allocation-and-management-for-memory-optimized-tables.aspx
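A minimal sketch of such a layout, with hypothetical database, filegroup, and container names/paths (additional containers on further disks can be added the same way):

-- Sketch: add a memory-optimized filegroup with containers spread
-- across multiple disks so data/delta files can stream in parallel.
ALTER DATABASE SalesDB
    ADD FILEGROUP SalesDB_imoltp_fg CONTAINS MEMORY_OPTIMIZED_DATA;

ALTER DATABASE SalesDB ADD FILE
    (NAME = 'SalesDB_imoltp_c1', FILENAME = 'E:\IMOLTP\SalesDB_imoltp_c1')
    TO FILEGROUP SalesDB_imoltp_fg;

ALTER DATABASE SalesDB ADD FILE
    (NAME = 'SalesDB_imoltp_c2', FILENAME = 'F:\IMOLTP\SalesDB_imoltp_c2')
    TO FILEGROUP SalesDB_imoltp_fg;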
Memory Considerations
Memory Allocation
• Size of rows (datatypes) and indexes, plus row-versions
• Row versions must stay in memory until every transaction that might still read them has completed
• Garbage collection, unless under memory pressure, is asynchronous and can take time to remove rows
• Nonclustered hash index: fixed size, a function of bucket count, 8 bytes per bucket
• Nonclustered index: dynamic size, a function of row count plus overhead
Recommendations:
• Around 250GB max for SCHEMA_AND_DATA tables in a database
• For a ‘common’ OLTP workload, plan for roughly 2x memory allocation (2 row versions)
• Consider other workloads which may require memory under the SQL Server process
• Use Resource Governor to control memory allocated at the database level
• Create Resource Pool…MAX_MEMORY_PERCENT
• Pools only allocate a percent (around 80%) to memory-optimized objects for system stability
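A hedged sketch of the Resource Governor binding described above (pool name, database name, and the 80% figure are illustrative):

-- Sketch: cap memory available to memory-optimized objects in SalesDB.
CREATE RESOURCE POOL Pool_IMOLTP WITH (MAX_MEMORY_PERCENT = 80);
ALTER RESOURCE GOVERNOR RECONFIGURE;

-- Bind the database to the pool; the binding takes effect the next time
-- the database is brought online.
EXEC sp_xtp_bind_db_resource_pool 'SalesDB', 'Pool_IMOLTP';
ALTER DATABASE SalesDB SET OFFLINE WITH ROLLBACK IMMEDIATE;
ALTER DATABASE SalesDB SET ONLINE;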
Estimating Memory Consumption
• Memory Size = Table Size + SUM(Index Size)
• Table Size = Row Size * Row Count
• Row Size = SUM(Column Sizes) + Row Header Size
• Row Header Size = 24 + 8 * Index Count
• Column Size = dependent on the column type and associated padding/overhead
• Hash Index Size = Bucket_Count * 8 bytes
• Nonclustered Index Size = Row Count * (Key Size + 8) bytes
• Details available in Books Online - http://msdn.microsoft.com/en-us/library/dn205318(v=sql.120).aspx
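A hedged back-of-envelope example applying these formulas to a hypothetical 1,000,000-row table with one BIGINT key (8 bytes), one CHAR(200) column, a hash index (BUCKET_COUNT = 2,000,000) and one nonclustered index on the 8-byte key:

Row Header Size = 24 + 8 * 2 indexes = 40 bytes
Row Size ≈ 40 + 8 + 200 = 248 bytes
Table Size ≈ 248 bytes * 1,000,000 rows ≈ 248 MB
Hash Index Size = 2,097,152 buckets * 8 bytes ≈ 16 MB (the bucket count rounds up to a power of two)
Nonclustered Index Size ≈ 1,000,000 * (8 + 8) bytes ≈ 16 MB
Estimated memory ≈ 280 MB before row versions; a typical OLTP workload may need roughly 2x this.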
Further System Considerations
No need to re-invest in hardware (although you will likely need more memory).
Processor
• Number of cores/sockets – All deployments have been 2-4 socket
• Not NUMA aware
Network
• Bandwidth, latency and configuration are still critical
• Receive-Side Scaling configurations can become important
• Consider batching client calls to reduce the number of client/server round-trips and lower latency
Index Guidelines
• Nonclustered memory-optimized:
• Execution similar to nonclustered on disk-based tables
• Choose index direction that conforms to seek direction
• The column ordering defined in the index is the only ordering available during a seek – it should match the
ORDER BY
• Nonclustered Hash memory-optimized:
• Bucket_count requirement
• Bucket-count too low or too high impacts read and write behavior & memory
• Performance is generally good with a bucket count of up to 5x the number of distinct key values; if the average number of key values per bucket exceeds 10, consider a nonclustered index instead
• Usually better to overestimate the bucket count than to underestimate it
• Queries on sub-set of columns in hash index result in scans
• Index key should be exact columns search is on
• Recall indexes need to be re-created on startup
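A minimal sketch of these guidelines (table, index names and the bucket count are illustrative; the bucket count is sized at roughly 1-2x the expected number of distinct key values):

-- Sketch: a hash index for point lookups on the full key, and a
-- nonclustered (range) index for ordered/range access.
CREATE TABLE dbo.OrderLine
(
    OrderId   BIGINT NOT NULL,
    LineId    INT NOT NULL,
    DueDate   DATETIME2 NOT NULL,

    CONSTRAINT PK_OrderLine PRIMARY KEY NONCLUSTERED HASH (OrderId, LineId)
        WITH (BUCKET_COUNT = 10000000),
    INDEX IX_OrderLine_DueDate NONCLUSTERED (DueDate)
)
WITH (MEMORY_OPTIMIZED = ON, DURABILITY = SCHEMA_AND_DATA);

Note that a lookup on OrderId alone would not cover the full hash key and, as described above, would fall back to a scan.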
Index Guidelines
• Specify bucket count – Nonclustered: Not Required; Nonclustered Hash: Required
• Index scan, retrieve all table rows – Nonclustered: Supported; Nonclustered Hash: Supported
• Index seek on equality predicate(s) (=) – Nonclustered: Supported*; Nonclustered Hash: Performs better than Nonclustered, but the predicate must include all columns in the hash key, otherwise it will result in a scan
• Index seek on inequality predicates (>, <, <=, >=), BETWEEN – Nonclustered: Performs better than Nonclustered Hash*; Nonclustered Hash: Will result in a scan
• Retrieve rows in a sort order matching the index definition – Nonclustered: Supported; Nonclustered Hash: Not Supported
• Retrieve rows in a sort order matching the reverse of the index definition – Nonclustered: Not Supported; Nonclustered Hash: Not Supported
* The predicate does not need to include all key columns but must include at least the leading key columns, otherwise it will result in a scan
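Against the hypothetical dbo.OrderLine table sketched earlier, these guidelines play out roughly as follows:

-- Seeks the hash index: the equality predicate covers all hash key columns.
SELECT * FROM dbo.OrderLine WHERE OrderId = 42 AND LineId = 1;

-- Scans: only part of the hash key is specified.
SELECT * FROM dbo.OrderLine WHERE OrderId = 42;

-- Seeks the nonclustered (range) index: inequality/range predicate.
SELECT * FROM dbo.OrderLine
WHERE DueDate >= '2014-06-01' AND DueDate < '2014-07-01';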
Interpreted Transact-SQL vs. Natively Compiled
• Interpreted T-SQL Access
– Access both memory- and
disk-based tables
– Less performant
– Virtually full T-SQL surface
• When to use
– Ad hoc queries
– Surface area not available in
native
– Compilation at runtime can
impact plan (positively)
• Natively Compiled Procs
‒ Access only memory-optimized tables
‒ Maximum performance
‒ Limited T-SQL surface area
• When to use
‒ Well defined OLTP-style
operations
‒ Optimize performance
critical business logic
‒ Query plan unlikely to
change based on statistics
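A minimal sketch of a natively compiled procedure against the hypothetical dbo.SalesOrder table from the durability example (procedure name and parameters are illustrative; the options shown are the ones SQL Server 2014 requires):

-- Sketch: required options are NATIVE_COMPILATION, SCHEMABINDING,
-- EXECUTE AS OWNER, and an ATOMIC block with isolation level and language.
CREATE PROCEDURE dbo.usp_InsertSalesOrder
    @OrderId    BIGINT,
    @CustomerId INT,
    @OrderDate  DATETIME2
WITH NATIVE_COMPILATION, SCHEMABINDING, EXECUTE AS OWNER
AS
BEGIN ATOMIC WITH (TRANSACTION ISOLATION LEVEL = SNAPSHOT, LANGUAGE = N'us_english')
    INSERT INTO dbo.SalesOrder (OrderId, CustomerId, OrderDate)
    VALUES (@OrderId, @CustomerId, @OrderDate);
END;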
Query Performance
• Lack of parallelism can be problematic for some queries
• Consider the trade-off between faster execution (native) and parallel operation
• Joins between a memory-optimized table and columnstore can be problematic
• Statistics creation and updates can be problematic for large tables
• The FULLSCAN requirement can take a long time
• If a query references many columns, creating statistics on each individual column can have an impact
• Consider the impact of no auto-update of statistics
• For interop queries: how often and when statistics are updated can affect plans
• For native procedures: load data, update statistics, then create the natively compiled procedure (see the sketch below)
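A hedged sketch of that sequence for the hypothetical dbo.SalesOrder table:

-- Load data first, then refresh statistics. FULLSCAN and NORECOMPUTE are
-- required for memory-optimized tables in SQL Server 2014; statistics are
-- not updated automatically.
UPDATE STATISTICS dbo.SalesOrder WITH FULLSCAN, NORECOMPUTE;
-- Then create (or drop and re-create) the natively compiled procedure so
-- its plan is compiled against the fresh statistics.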
Recap: Migration Steps
Assess application
Is the application suitable for In-memory OLTP?
Define goals and measure
Identify bottlenecks
“Analysis” through MDW and ‘Transaction Performance Analysis Report’
Staged migration
“Migrate” utilizing Advisors in Management Studio for help
Measure against goals
Understand considerations
Memory, Storage, Indexing
Review
Application Patterns for In-Memory OLTP
In-Memory OLTP can meet demanding scale and performance goals with SQL Server
In-Memory OLTP is a solution for applications suffering from engine performance and scale bottlenecks
There are some common workload patterns which work well with In-Memory OLTP
Adoption Methodology and considerations for In-Memory OLTP
Evaluate (early) whether In-Memory OLTP can address the application's needs. Some workloads may not benefit
from In-Memory OLTP; others may, but only with a large amount of work.
Migrate in stages; there are tools to help
Consider aspects of the In-Memory OLTP engine in the design
References
In-Memory OLTP Common Workloads and Migration Considerations
In-Memory OLTP Internals Overview
In-Memory OLTP Sample Database and Workload (based on AdventureWorks)
http://msdn.microsoft.com/en-us/library/dn511655(v=sql.120).aspx
ASP.NET In-Memory OLTP Provider:
http://www.nuget.org/packages/Microsoft.Web.SessionState.SqlInMemory/
In-Memory OLTP Blog Series
Books Online: In-Memory OLTP