Transcript Document
Doing fast! Optimizing Query performance with ColumnStore Indexes in SQL Server 2012 Margarita Naumova | SQL Master Academy Agenda • • • • What is a columnstore index How it works Limits and restrictions When and how to use it About me • Working with SQL Server from v6.5 • MCT from 1998 • SQL Server Trainer and Consultant with over 50 projects in Bulgaria, Finland, Germany, UK, Greece.. • 5 years Senior Consultant in Microsoft, member of Microsoft Worldwide Technical Leadership Team • Microsoft Certified Master SQL Server 2008 • BG SQL UG Leader • MVP SQL Server • Now teaching and presenting SQL Server, running SQL Master Academy Training Program www.SQLMasterAcademy.com What is a Columnstore Index A new column store in SQL Server Database Engine Advanced Query processing techniques Accelerates specific workloads (DW) Performance Example of CSI 1TB DW Database, 1.44 billion rows in fact table, 32 logical procs with 256GB RAM DEMO CSI Performance Example A columnstore Index Structure • Stores data column-wise • Better compression – Uses VertiPaq compression technology • Less IO Only needed cols are fetched – Better Buffer hit rates • Pure Column Store • Fetches only needed columns from disk Segment (values for 1 col for a set of about 1mnl of rows) Advanced Query processing techniques • Processing mode – Batch mode – Row mode • Best performance - when processes column data in batches using highly efficient algorithms • Optimizer makes a cost based decision • Segment elimination is an additional optimization • Optimizer allows you to use hints DEMO CSI Internals DO’s for using CSI Effectively • Put CSI on large tables only – Large fact tables and very large dimension table • Include every column of the table in the columnstore index • Use in read mostly workload, where most updates are appending new data • Workflow permits using partitioning (or drop rebuild index) to handle new data • Structure your queries as star joins with grouping and aggregation as much as possible or entail scanning and aggregating large amounts of data ColumnStore Index Optimizations Rules • DOP>2 and enough memory • On ColumnStore indexed tables avoid using: – – – – joins and filters on string columns OUTER JOIN NOT IN (<subquery>) UNION ALL to directly combine columnstore-indexed tables with other tables Loading data • Table with columnstore index can be read, not updated – INSERT, UPDATE, DELETE and MERGE is not allowed – Partition switching is allowed • There are three possible methods for loading data – Disabling index – Partition switching – UNION ALL (tickle update) Loading Method 1 • Disable (or drop) the index – ALTER INDEX ….DISABLE • Update the table • Rebuild the columnstore index • ALTER INDEX … REBUILD Loading Method 2 • Load new data into a staging table • Build a columnstore index on the staging table – CREATE NONCLUSTERED COLUMNSTORE INDEX myindex ON Staging(……) Switch 5/11 Staging table • Switch the partition into the main table – ALTER TABLE Staging SWITCH TO PARTITION 5 Load DEMO Loading and updating data in CSI Loading Method 3 • • • • Build CSI index on primary table Create staging/delta table with no CSI Insert new data into (row-based) delta table Query both tables using UNION ALL to combine the results – Perform local-global aggregation using CTE • aggregate the data from the columnstore and from the delta table in separate queries • combine the results with a second level of grouping and aggregation – Do not make a view with UNION ALL • Keep delta table small Other Usage and limitations • Must have a base table structure – heap or Clustered Index • CSI is not appropriate for lookups, seeks and in OLTP as a general workload type • Supported data types should be <8 Bytes • The following data types cannot be included in a columnstore index: – – – – – – binary and varbinary, ntext, text, and image varchar(max) and nvarchar(max) Uniqueidentifier, rowversion (and timestamp) decimal (and numeric) with precision greater than 18 digits datetimeoffset with scale greater than 2 CLR types (hierarchyid and spatial types), xml Good news • Most things just work with columnstore indexes – – – – – – Backup and restore Mirroring Log Shipping SSMS Administration tools TDE CSI Summary • Benefits – Interactive experience with data • Near instant response times • Ad hoc queries • Large data sets – Easy to set up • No need for summary tables, indexed views • Fewer indexes to design, create and maintain • Reduces need to manually tune queries • Consider limitation in V1 and overcome with – Good coding practices to achieve batch processing – Respective method of loading data Thank you! www.SQLMasterAcademy.com [email protected] Margarita Naumova| SQL Master Academy