Transcript Document

--A Gem of SQL Server 2012, particularly for Data Warehousing--
Present By Steven Wang
Steven Wang
• Senior DBA/Senior BI Specialist at BNZ
• MCITP/MCTS: BI Developer, Database Developer and
Database Administrator
• Blog: www.msbicoe.com
• Email: [email protected]
Agenda
•
•
•
•
•
•
•
•
•
The Nitty-gritty of Columnstore Indexes (Demo)
Columnstore Indexes Under the Hood (Demo)
Columnstore Indexes Performance Tuning (Demo)
The Limitations of the Columnstore Indexes
How to load data for a colunstore indexed table
The Columnstore Indexes Best Practises
The Impact of Columnstore Indexes on BI
Resources
Q&A
The Nitty-gritty of Columnstore Indexes
• What is columnstore?
The Nitty-gritty of Columnstore Indexes
• What is columnstore?
Row-Store
--Serialize all of the
attribute values in a
row together, then the
values in the next row,
and so on.
ProductKey
…………...
DateKey
…………...
ResellerKey
…………...
Quantity
…………...
Price
…………...
Amount
…………...
Columns-Store
--Serializes all of the
values of an attribute
(column) together
The Nitty-gritty of Columnstore Indexes
• Why columnstore?
--It’s All about I/O;
--It’s all about compression efficiency;
--In columnstore, compression algorithms are working better
and a higher compression ratio can be achieved;
In
Memory
--Columnstore compression improves CPU performance;
--The maximum memory is limited by OS,1TB? 2TB?;
--Get only columns which are queried;
The Nitty-gritty of Columnstore Indexes
• SQL Server 2012 Columnstore Indexes
 For the current release, SQL server 11 RTM,only
nonclustered columnstore indexe is allowed
 A nonclustered index covers all possible columns and
stores data in column-wise fashion.
 Data is highly compressed by using xVelocity
Engine (aka, “Vertipaq”)
 A new “Batch mode” execution model is introduced
 A columstore index breaks each column into 1 million
row chunks called segment.
 Delivers order-of-magnitude gains for DW queries
 Base table is not able to INSERT, UPDATE, DELETE while
a nonclustered columnstore index exists
The Nitty-gritty of Columnstore Indexes
• How to create a columnstore index?
CREATE NONCLUSTERED
COLUMNSTORE INDEX
[ci_FactResellerSales]
ON [dbo].[FactResellerSales]
(
[ProductKey],
[OrderDateKey],
[DueDateKey],
[ShipDateKey],
[ResellerKey],
[UnitPrice],
[ExtendedAmount],
[UnitPriceDiscountPct],
[DiscountAmount],
[ProductStandardCost],
[ShipDate]
);
The Nitty-gritty of Columnstore Indexes
• Demo
 Lots of fundamental of columnstore indexes
 Performance difference between B-Tree and CI
 And more…
Columnstore Indexes Under the Hood
• Data is highly compressed. Dramatically reduced IO. More
Data can fit into memory.
Row Store
Column Store
Column Store, Compressed
Columnstore Indexes Under the Hood
• Data is highly compressed. Dramatically reduced IO. More
data can fit into memory.
Row Store
Column Store, Compressed
Columnstore Indexes Under the Hood
Run-length Encoding
 How data is compressed?
RLE
Columnstore Indexes Under the Hood
Dictionary Encoding
 How data is compressed?
Columnstore Indexes Under the Hood
• A vector-based query execution method called “Batch Mode”
processing is implemented. Dramatically reduced CPU
consumption time.
 A Batch is an object that contains about 1000 rows
Columnstore Indexes Under the Hood
• Segment elimination can skip large chunks of data to speed
up scans.
 A segment a 1 million of rows
 Each segment stores min and max value
 If no rows qualify, then the entire segment is skipped for
scan
Columnstore Indexes Under the Hood
• Demo
Columnstore Indexes Performance Tuning
• Maximizing the use of “Batch Mode” processing
The Limitations of the Columnstore Indexes
• Base table is not able to insert, delete and update data
• For current release, no clustered columnstore index
• Columnstore Indexes are not designed for
needle-in-the-haystack kind of queries. No Seek.
Data Type not allowed:
Other restrictions:










 No Sparse column
 No unique columnstore index
 No customer sort option for
columns
 No replication
 No change tracking
 No change data capture
 No filestream column
Binary and varbinary
Varchar(max) and nvarchar(max)
Decimal/Numeric precision > 18
Ntext, text, and image
Uniqueidentifier
Rowversion
Sql-variant
Datetimeoffset(>2)
CLR data type
XML
How to load data for a colunstore indexed table
• There is really no simple way that loads data into a
columnstore indexed table
• Columnstore is not born for data updating
• 3 possible ways:
 Drop columnstore index, load the data,
recreate. (or disable, load the data, rebuild)
 Partition the columnstore indexed table and using
partition switching. Seems the best way.
 Using t 2 tables, one with historical data with columnstore
index, another is just a normal table. Complicated, lots of
maintenance
The Columnstore Indexes Best Practises
• Include all columns in the columnstore index whenever
possible
• Put columnstore indexes on large tables only
• Consider to create a clustered index on columns which
are frequently used, like date column
• Structure your queries as star joins with grouping and
aggregation as much as possible
• Avoid joins and string filters directly on columns of
columnstore indexed tables.
• Whenever possible, avoid constructing queries with outer
join, Union all, and not in directly on columnstore indexed
• Using integer whenever possilbe
• Using table partitions
The Impact of Columnstore Indexes on BI
•
•
•
•
•
Data type consideration for Data warehouse designing
The columns chosen for creating a clustered index
Partition clustering
ROLAP vs MOLAP
DirectQuery for the Tabular BI semantic model
Resources
• Eric Hanson: SQL Server Columnstore Performance Tuning
http://social.technet.microsoft.com/wiki/contents/articles/4995.sqlserver-columnstore-performance-tuning.aspx
• BOL: Columnstore Indexes
http://msdn.microsoft.com/en-us/library/gg492088.aspx
• Joe Sack: Exploring Columnstore Index Metadata, Segment
Distribution and Elimination Behaviors
http://www.sqlskills.com/blogs/joe/post/Exploring-Columnstore-IndexMetadata-Segment-Distribution-and-Elimination-Behaviors.aspx
• Benjamin Nevarez: Improve the Performance of Data
Warehouse Queries with Columnstore Indexes
http://www.sqlmag.com/article/sqlserverdenali/data-warehousequeries-columnstore-indexes-141712
• Stavros Harizopoulos, Daniel Abadi: Column-Oriented
Database system
Q&A