SSIS - The SQL Server Conference

Download Report

Transcript SSIS - The SQL Server Conference

1st Oct 2011
SQL Tuning Dot.net Perspective
Bob Duffy
Database/Solution Architect
Prodata SQL Centre of Excellence
Speaker Bio – Bob Duffy
•
•
•
•
•
•
•
•
Database/Solution Architect at Prodata SQL Centre of Excellence
One of about 25 MCA for SQL Server globally (aka SQL Ranger)
MCM on SQL 2005 and 2008
SQL Server MVP 2009+
20 years in database sector, 250+ projects
Senior Consultant with Microsoft 2005-2008
Regular speaker for TechNet, MSDN, Users Groups, Irish and UK
Technology Conferences
*new* Completed the SSAS Maestro course
Session Goals
Learn to improve performance and scalability by making
changes to your dot.net application.
Target Audience is developers who work with SQL
Or
DBA’s who work in the dot.net world
We will avoid typical SQL query tuning and physical
tuning (covered elsewhere)
What is an Application ?
Pure OLTP
OLTP With Mixed Workload (CRM/ERP)
Departmental (< 200 users )
Enterprise (> 250 users or strict performance SLA’s)
Logging/Auditing
Reporting
Batch Processing
Data Processing
Web Scale OLTP
1. Connecting to SQL Server
Connection Pooling
Connection Authentication
Connection Pooling - Overview
Connections are slow and expensive
(3 * network_packet_size) + 94k
Should be LESS connections than users
General Guidance: Acquire late release early
Connection String sets Pool Size
Default Min of 1 and Max of 100
One Pool is created Per
Process
per application domain
per connection string
when using integrated security, per Windows identity.
Connection Authentication
Decide on Trusted v Delegated
Decide on windows v sql authentication
If Windows Decide on Kerberos v NTLM
To check Kerberos
Good or Bad Connections?
2. Parameterisation
You Must Parameterisation
Optimising Parameterisation
How to Parameterise
Why you MUST Parameterise
Better Performance and Security
Performance
Less plans in the cache
Less compiles (should be close to zero)
Opens the door to prepared statements
Security
Helps prevent SQL injection
Parameter Tips and Pitfalls
Avoid using concatenation instead of proper params
Do specify Length with variable length types
Otherwise will result in procedure cache bloat and more compiles
One plan per input data length
Should we use “.AddWithValue”?
No – the data size is not specified
For optimal performance cache command object
When repeated calls to same statement
Use Prepared Statement (discussed later)
Monitor for Parameter Sniffing Issues
How to Parameterise
Inline SQL
Select * from Customers where CustomerID=@ID
Use Parameters Collection
Cmd.Parameters.add(“@Id,variable,length)
Ad Hoc v Parameterised
Making the Most of a Bad Deal
What if your application is in production and has no
parameterization (eg Siebel)
Update of Application Data Access is best but may not be possible
Simple queries may be “ok” due to auto parameterisation
“Forced Parameterisation” database setting may help
Server setting “Optimise for Ad Hoc” is the second best option
Will solve proc cache bloat
Will not resolve compile Issues (High CPU usage)
Check your Proc Cache for offending Queries
Look for single use queries and total size
SELECT c.cacheobjtype , c.objtype , c.usecounts , c.size_in_bytes , t.dbid , t.text
FROM sys.dm_exec_cached_plans as c
CROSS APPLY sys.dm_exec_sql_text(plan_handle) as t
WHERE c.cacheobjtype = 'Compiled Plan'
ORDER BY c.usecounts DESC
--ORDER BY t.text
3. SqlCommand Tips
Parameterize those queries
Last Section
Use RPC Calls
Use CommandType=StoredProcedure
Do NOT “stuff” statements (causes compiles)
Cache where frequently called
Careful of overzealous “stateless” architecture
Consider Preparation for batches of similar commands
Make good use of methods with no resultsets
ExecuteNonQuery
ExecuteScalar
4. Prepared Commands
Chatty Interfaces may send same request with
different parameters.
Waste of bandwidth and other resources
Use sqlCommand.Prepare to optimise this
Sends the SQL only once over network
Calls sp_prepexec in SQL Server
Calls sp_execute with each set of parameters
Greater startup cost, so avoid with only single
calls
5. The Chunky v Chatty Debate
We consider two types of chunkiness
Data transferred. One row or one table
One call per statement or one per batch
Affects type of dot.net calls
GetAllCustomers()
GetCustomer(CustomerID as int)
Affects guidance on ado.net
Use of Stored Procedures
use of Transactions
Use of Data Caching
Some Test Results
Test Name
AdHoc SQL
Connection Pooling
Parameterized
Connection Pooling
Cached Connection
Prepared
DataReader
Chunky
(Seconds)
126.03
75.56
21.55
19.16
18.04
15.33
13.67
7.43
6. Optimising Resultsets
Data Readers v Data Tables
Ordinal Indexes
SqlDataReader - Overview
Streaming access row by row basis
Data is buffered in the background
Connected while reading
Avoid passing around
Avoid delaying reader or you will see ASYNC_NETWORK_IO
Very fast but inflexible
SqlDataReader - Batching
Avoid making multiple round trips when you can make one
If you need two distinct sets of data, send both requests in a
single batch
Use SqlDataReader.NextResult to move to the next table
This is forward only
Warning – while this will improve performance we do find it
affects maintainability if not used with good helper functions.
DataTable - Overview
Cached, in-memory access to all of the rows
Everything is pulled across the wire BEFORE reading starts
DataAdapter Fills the DataTable
Uses a reader behind the scenes
So double pass over data to read it !!
Disconnected Access
Can be cached and passed around
Supports limited searching and sorting
Bottom Line: Ultimate flexibility, with some performance
sacrificed
What are Ordinal Indexes
Both Reader and table offer column name or
Ordinal Indexes
Column Names calls GetOrdinal under the covers
Recommendations
Align enumerations with the output column list
Call GetOrdinal in advance and store the ordinal
positions to avoid multiple calls
C#
int productIdOrdinal = reader.GetOrdinal(“productId”);
while (reader.read())
productIds.Add((int)reader[productIdOrdinal]);
Some Test Results
7. SqlBulkCopy Overview
.NET class that exposes the functionality of SQL
Server’s Bulk Copy API
Sends rows to SQL Server using TDS
Enables minimally-logged inserts
Accepts data input via either DataTable or
IDataReader
Requirements for Minimal Logging
Table not replicated
TABLOCK
Table is a heap
If table has clustered Index
Pages fully logged unless table empty
Trace Flag 610
allows for Minimal logging on non empty tables
with clustered Indexes
8. Table Valued Parameters
Table-Valued Parameters are a SQL Server
2008 feature, available from ADO.NET 3.5
Allows creation of typed table parameters to
stored procedures
Table types support CHECK, DEFAULT, PRIMARY
KEY, and UNIQUE constraints
Input only; must specify READONLY option in
stored procedure parameter
Some Test Results – 1 million row insert
Test
Row by Row (Prepared)
Secs
245.47
SqlBulkCopy - Minimal Logged
TVP
2.4
6.2
SqlBulkCopy - Fully Logged
4.5
9. Data Caching Considerations
Proper use of caching can be the key to
application scalability.
Consider:
Should cache granularity be user-level or applicationlevel?
Should the cache live on the server or at the client?
How should cache expiration and update be handled?
What form of data or objects should be cached?
How do we distribute cache across servers ?
Do we cache reads/writes or both?
10. Some ORM Guidance
Check the Health of the Plan Cache
Queries Provided Earlier
Log and monitor the SQL being generated
RML OR DMV’s are good here
If a query is not getting optimized properly,
consider migrating to a stored procedure
Coming up…
Speaker
Quest
SQL Server
Community
SQLSentry
Attunity
Title
Room
Trivia Quiz: Test Your SQL Server and IT Knowledge and Win
Aintree
Prizes
SQL Server Community presents : The A to Z of SQL Nuggets Lancaster
Real Time and Historical Performance Troubleshooting with
Pearce
SQL Sentry
Data Replication Redefined – best practices for replicating
Empire
data to SQL Server
#SQLBITS
11 Stored Proc or T-SQL not optimal
Too much IO on query plan
Too many joins
Lack of covering indexes
Non SARGABLE predicates
Use of UDF’s
Too much data in columns or row
Data Model Design not optimal (de-normalised)
Abuse of Temp variables
Procedural or math heavy code in stored procs
Abuse of DISTINCT
In conjunction with UDF or non sargable
Abuse of UNION
Use of Triggers
Implicit Conversions
Resources
SQL Server Connection Pooling
http://msdn.microsoft.com/en-us/library/8xx3tyca.aspx
SET OPTIONS That Affect Results
http://msdn.microsoft.com/en-us/library/ms175088.aspx
How to use Impersonation and Delegation
http://msdn.microsoft.com/en-us/library/ff647404.aspx
Beginners Guide to ADO.Net Entity Framework
http://msdn.microsoft.com/en-us/data/aa937723
Query Notification
How does it work?
Thank You