SQL Performance 2013

Download Report

Transcript SQL Performance 2013

Special Topics
Joe Chang
www.qdpma.com
#SQLSatRiyadh
About Joe
•
•
•
•
•
SQL Server consultant since 1999
Query Optimizer execution plan cost formulas (2002)
True cost structure of SQL plan operations (2003?)
Database with distribution statistics only, no data 2004
Decoding statblob/stats_stream
– writing your own statistics
• Disk IO cost structure
• Tools for system monitoring, execution plan analysis
See ExecStats on
www.qdpma.com
Overview
• Why performance is still important today?
– Brute force?
• Yes, but …
• Special Topics
• Automating data collections
• SQL Server Engine
– What developers/DBA need to know?
CPU & Memory 2001 and 2012
PCI-E
PCI-E
PCI-E
PCI-E
PCI-E
PCI-E
PCI-E
PCI-E
PCI-E
P
QPI
C3
C2
C1
C0
MCH
2001 – 4 sockets, 4 cores
Pentium III Xeon, 900MHz
4-8GB memory?
LLC
MI
MI
QPI
PCI-E
C3
C2
C1
C0
LLC
MI
C3
C2
C1
C0
PCI-E
LLC
MI
C4
C5
C6
C7
C4
C5
C6
C7
MI
QPI
QPI
C3
C2
C1
C0
PCI-E
LLC
MI
C4
C5
C6
C7
MI
PCI-E
PCI-E
PCI-E
PCI-E
PCI-E
PCI-E
DMI 2
PCI-E
PCI-E
16GB $191
32GB $794
QPI
MI
PCI-E
Each core today is more
than 10x over PIII
PCI-E
PCI-E
Xeon MP 2002-4
C4
C5
C6
C7
QPI
QPI
FSB
PCI-E
QPI
P
PCI-E
P
PCI-E
PCI-E
P
L2
2012 – 4 sockets, 8 cores each
4 x 8 = 32 cores total
768GB (48 x 16GB), Westmere-EX 1TB
15 cores in next generation?
Storage 2001 versus 2012/13
QPI
192 GB
10GbE
RAID
RAID
RAID
RAID
SSD
SSD
SSD
SSD
HDD
HDD
HDD
HDD
PCIe x8
PCIe x8
PCIe x8
PCIe x8
PCIe x8
PCIe x4
2001
100 x 10K HDD
125 IOPS each = 12.5K IOPS
IO Bandwidth limited: 1.3GB/s
(1/3 memory bandwidth)
192 GB
QPI
2013
64 SSDs, >10K+ IOPS each, 1M
IOPS possible
IO Bandwidth 10GB/s easy
SAN vendors –
questionable BW
IB
Performance Past, Present, Future
• When will servers be so powerful that …
– Been saying this for a long time
• Today – 10 to 100X overkill
– 32-cores, 60-cores later in 2013?
– Enough memory that IO is only sporadic
– Unlimited IOPS with SSD
• What can go wrong?
Today’s topic
Special Topics
• Data type mistmatch
• Multiple Optional Search Arguments (SARG)
– Function on SARG
•
•
•
•
•
Parameter Sniffing versus Variables
Statistics related (big topic)
first OR, then AND/OR combinations
Complex Query with sub-expressions
Parallel Execution
Not in order of priority
1a. Data type mismatch
DECLARE @name nvarchar(25) = N'Customer#000002760'
SELECT * FROM CUSTOMER WHERE C_NAME = @name
SELECT * FROM CUSTOMER WHERE C_NAME = CONVERT(varchar, @name)
.NET auto-parameter discovery?
Unable to use index seek
1b. Type Mismatch – Row Estimate
SELECT * FROM CUSTOMER WHERE C_NAME LIKE N'Customer#00000276%'
SELECT * FROM CUSTOMER WHERE C_NAME LIKE 'Customer#00000276%'
Row estimate
error could have
severe
consequences in a
complex query
SELECT TOP plus Row Estimate Error
SELECT TOP 1000 [Document].[ArtifactID]
FROM [Document] (NOLOCK)
WHERE [Document].[AccessControlListID_D] IN (1,1000064,1000269)
AND EXISTS (
SELECT [DocumentBatch].[BatchArtifactID]
FROM [DocumentBatch] (NOLOCK)
INNER JOIN [Batch] (NOLOCK)
ON [Batch].ArtifactID = [DocumentBatch].[BatchArtifactID]
WHERE
[DocumentBatch].[DocumentArtifactID] = [Document].[ArtifactID]
AND [Batch].[Name] LIKE N'%Value%'
)
ORDER BY [Document].[ArtifactID]
Data type mismatch – results in estimate rows high
Top clause – easy to find first 1000 rows
In fact, there are few rows that match SARG
Wrong plan for evaluating large number of rows
http://www.qdpma.com/CBO/Relativity.html
2. Multiple Optional SARG
DECLARE @Orderkey int, @Partkey int = 1
SELECT * FROM LINEITEM
WHERE (@Orderkey IS NULL OR L_ORDERKEY = @Orderkey)
AND (@Partkey IS NULL OR L_PARTKEY = @Partkey)
AND (@Partkey IS NOT NULL OR @Orderkey IS NOT NULL)
Dynamically Built Parameterized SQL
DECLARE @Orderkey int, @Partkey int = 1
, @SQL nvarchar(500), @Param nvarchar(100)
SELECT @SQL =
N‘/* Comment */
SELECT * FROM LINEITEM WHERE 1=1‘
, @Param = N'@Orderkey int, @Partkey int'
IF (@Orderkey IS NOT NULL)
SELECT @SQL = @SQL + N' AND L_ORDERKEY = @Orderkey'
IF (@Partkey IS NOT NULL)
SELECT @SQL = @SQL + N' AND L_PARTKEY = @Partkey'
PRINT @SQL
exec sp_executesql @SQL, @Param, @Orderkey, @Partkey
IF block is easier for few options
Dynamically built parameterized SQL better for many options
Considering /*comment*/ to help identify this
IF block
DECLARE @Orderkey int, @Partkey int = 1
IF (@Orderkey IS NOT NULL)
SELECT * FROM LINEITEM
WHERE (L_ORDERKEY = @Orderkey)
AND (@Partkey IS NULL OR L_PARTKEY = @Partkey)
ELSE IF (@Partkey IS NOT NULL)
SELECT * FROM LINEITEM
WHERE (L_PARTKEY = @Partkey)
2b. Function on column SARG
SELECT COUNT(*), SUM(L_EXTENDEDPRICE) FROM LINEITEM
WHERE YEAR(L_SHIPDATE) = 1995 AND MONTH(L_SHIPDATE) = 1
SELECT COUNT(*), SUM(L_EXTENDEDPRICE) FROM LINEITEM
WHERE L_SHIPDATE BETWEEN '1995-01-01' AND '1995-01-31'
DECLARE @Startdate date, @Days int = 1
SELECT COUNT(*), SUM(L_EXTENDEDPRICE) FROM LINEITEM
WHERE L_SHIPDATE BETWEEN @Startdate AND DATEADD(dd,1,@Startdate)
Estimated versus Actual Plan - rows
Estimated Plan – 1 row???
Actual Plan – actual rows 77,356
3 Parameter Sniffing
-- first call, procedure compiles with these parameters
exec p_Report @startdate = '2011-01-01', @enddate = '2011-12-31'
-- subsequent calls, procedure executes with original plan
exec p_Report @startdate = '2012-01-01', @enddate = '2012-01-07'
Need different execution plans for narrow and wide range
Options:
1) WITH RECOMPILE
2) main procedure calls 1 of 2 identical sub-procedures
One sub-procedure is only called for narrow range
Other called for wide range
4 Statistics
• Auto-recompute points
• Sampling strategy
– Percentage
– Random pages versus random rows
– Histogram Equal and Range Rows
– Out of bounds, value does not exist
– etc
Statistics Used by the Query Optimizer in SQL Server 2008
Writer: Eric N. Hanson and Yavor Angelov
Contributor: Lubor Kollar
http://msdn.microsoft.com/en-us/library/dd535534.aspx
Statistics Structure
• Stored (mostly) in binary field
Density Vector
Histogram
Scalar values
Statistics Auto/Re-Compute
• Automatically generated on query compile
• Recompute at 6 rows, 500, every 20%?
Has this changed?
Statistics Sampling
• Sampling theory
– True random sample
– Sample error - square root
• Relative error 1/ N
• SQL Server sampling
– All rows in random pages
N
Row Estimate - Statistics
• Skewed data distribution
• Out of bounds
• Value does not exist
Loop Join - Table Scan on Inner Source
Estimated out from first 2 tabes (at right) is
zero or 1 rows. Most efficient join to third
table (without index on join column) is a loop
join with scan. If row count is 2 or more, then
a fullscan is performed for each row from
outer source
Default statistics rules may lead to serious ETL issues
Consider custom strategy
Compile Parameter Not Exists
Main procedure has cursor around view_Servers
First server in view_Servers is ’CAESIUM’
Cursor executes sub-procedure for each Server
sql:
SELECT MAX(ID) FROM TReplWS
WHERE Hostname = @ServerName
But CAESIUM does not exist in TReplWS!
Good and Bad Plan?
SqlPlan Compile Parameters
SqlPlan Compile Parameters
<?xml version="1.0" encoding="utf-8"?>
<ShowPlanXML xmlns="http://schemas.microsoft.com/sqlserver/2004/07/showplan" Version="1.1" Build="10.50.2500.0">
<BatchSequence>
<Batch>
<Statements>
<StmtSimple StatementText="@ServerName varchar(50) SELECT @maxid = ISNULL(MAX(id),0)
FROM TReplWS WHERE Hostname = @ServerName"
StatementId="1" StatementCompId="43" StatementType="SELECT" StatementSubTreeCost="0.0032843" StatementEstRows="1"
StatementOptmLevel="FULL" QueryHash="0x671D2B3E17E538F1" QueryPlanHash="0xEB64FB22C47E1CF2"
StatementOptmEarlyAbortReason="GoodEnoughPlanFound">
<StatementSetOptions QUOTED_IDENTIFIER="true" ARITHABORT="false" CONCAT_NULL_YIELDS_NULL="true" ANSI_NULLS="true"
ANSI_PADDING="true" ANSI_WARNINGS="true" NUMERIC_ROUNDABORT="false" />
<QueryPlan CachedPlanSize="16" CompileTime="1" CompileCPU="1" CompileMemory="168">
<RelOp NodeId="0" PhysicalOp="Compute Scalar" LogicalOp="Compute Scalar"
EstimateRows="1" EstimateIO="0" EstimateCPU="1e-007“
AvgRowSize="15" EstimatedTotalSubtreeCost="0.0032843" Parallel="0" EstimateRebinds="0" EstimateRewinds="0">
</RelOp>
<ParameterList>
<ColumnReference Column="@ServerName" ParameterCompiledValue="'CAESIUM'" />
</ParameterList>
</QueryPlan>
</StmtSimple>
</Statements>
</Batch>
</BatchSequence>
</ShowPlanXML>
Compile parameter values at bottom of sqlplan file
5a Single Table OR
-- Single table
SELECT * FROM LINEITEM
WHERE L_ORDERKEY = 1
OR L_PARTKEY = 184826
5a Join 2 Tables, OR in SARG
-- subsequent calls, procedure executes with original plan
SELECT O_ORDERDATE, O_ORDERKEY, L_SHIPDATE, L_QUANTITY
FROM LINEITEM
INNER JOIN ORDERS ON O_ORDERKEY = L_ORDERKEY
WHERE L_PARTKEY = 184826 OR O_CUSTKEY = 137099
5a UNION instead of OR
SELECT O_ORDERDATE,
FROM LINEITEM INNER
UNION
SELECT O_ORDERDATE,
FROM LINEITEM INNER
O_ORDERKEY, L_SHIPDATE, L_QUANTITY, O_CUSTKEY, L_PARTKEY
JOIN ORDERS ON O_ORDERKEY = L_ORDERKEY WHERE L_PARTKEY = 184826
O_ORDERKEY, L_SHIPDATE, L_QUANTITY, O_CUSTKEY, L_PARTKEY
JOIN ORDERS ON O_ORDERKEY = L_ORDERKEY WHERE O_CUSTKEY = 137099
Caution, select list
should have keys
to ensure correct
rows
UNION removes
duplicates
5b AND/OR Combinations
• Hash Join is good method to process many rows
– Requirement is equality join condition
SELECT xx FROM A WHERE col1 IN (expr1) AND col2 NOT IN (expr2)
SELECT xx FROM A WHERE (expr1) AND (expr2 OR expr3)
• In complex SQL with AND/OR or IN NOT IN combinations
– Query optimizer may not be to determine that equality join
condition exists
– Execution plan will use loop join,
– and attempt to force hash join will be rejected
• Re-write using UNION in place of OR
• And LEFT JOIN in place of NOT IN
More on AND/OR combinations:
http://www.qdpma.com/CBO/Relativity3.html
Complex Query with Sub-expression
• Query complexity – really high compile cost
• Repeating sub-expressions (including CTE)
– Must be evaluated multiple times
• Main Problem
– Row estimate error propagation
• Solution
– Temp table when estimate is high, actual is low
More on AND/OR combinations:
http://www.qdpma.com/CBO/Relativity4.html
Parallelism
• Designed for 1998 era
– Cost Threshold for Parallelism: default 5
– Max Degree of Parallelism – instance level
– OPTION (MAXDOP n) – query level
• Today – complex system – 32 cores
– Plan cost 5 query might run in 10ms?
– Some queries at DOP 4
– Others at DOP 16?
More on Parallelism:
http://www.qdpma.com/CBO/ParallelismComments.html
http://www.qdpma.com/CBO/ParallelismOnset.html
Full-Text Search
Loop Join with FT as inner
Source Full Text search
Potentially executed
many times
Summary
• Hardware today is really powerful
– Storage may not be – SAN vendor disconnect
• Look for serious blunders first
Special Topics
• Data type mistmatch
• Multiple Optional Search Arguments (SARG)
– Function on SARG
•
•
•
•
•
Parameter Sniffing versus Variables
Statistics related (big topic)
AND/OR
Complex Query with sub-expressions
Parallel Execution
Parallelism