Performance Tuning MDX Queries

Download Report

Transcript Performance Tuning MDX Queries

Platinum
Learn & Enjoy
www.sqlbits.com
Gold
[Put your phone on Vibrate!]
Group BY:
[Remember get your Tickets for Entry, Coach, Drink]
Silver
Feedback Forms:
[Fill these out at the end of each session please]
Lunch Time Sessions:
[Once you eat, learn some more; Grok Talks in Chicago 1 and 2
Quest are in Menphis, Idera are in Everest]
Ask The Experts
[Sessions need to finish on time, take questions to the ATE area]
Performance Tuning
MDX Queries
Chris Webb
Crossjoin Consulting Limited /
Solid Quality Mentors
[email protected]
Agenda
•
•
•
•
Performance tuning methodology
Tuning the cube
Tuning MDX in queries and calculations
Cheating: warming the cache
Performance Tuning Methodology
• Before you start your work, ensure that:
– Your test environment is comparable with your
production environment
– Nothing else is happening on the server
– No-one will connect to the server and try to run
queries
– You know what exactly you are trying to tune!
Performance Tuning Methodology
• Capture sample queries using Profiler
• Look at the Query End events for the MDX
used
• Look at the Duration for how long the query
took to run in ms
• Note that for some clients (eg Excel before
2007) queries may be dependent on named
sets declared earlier in the session
Performance Tuning Methodology
• Clear the cache using an XMLA ClearCache
command before running your query for the
first time
• Then run the query again immediately
afterwards, so you have timings on a cold and
warm cache
• Disk caching may also have a noticeable
impact if you are querying a newly-processed
cube
Hardware
• Throwing hardware at the problem is rarely the
answer, unless:
– You have problems with performance degredation
with large numbers of concurrent users
– You have a lot of data
• Extra memory will mean you have more room for
caching query results
• Disk IO should be as fast as you can afford
• Extra CPUs can help when you have queries that
cross multiple partitions, or many concurrent
users
Attribute Relationships
• Make sure attribute relationships in your
dimension are set up to reflect all 1:M
relationships in your data
• Failure to do so will mean that you will not get
maximum benefit from aggregations and
partitioning, and the engine itself will be less
efficient
Query Execution
Tuning the Storage Engine
• The percentage of time spent in the Storage
Engine can be worked out by:
– Summing the Duration of all Query Subcube
events seen in Profiler
– Dividing that by the Duration of the Query End
event
• If you see a Query Subcube event take more
than 0.5 seconds it should probably be tuned
Partitioning
• You should partition your measure groups to
reflect the slices that users are likely to use in
their queries
• Partitioning improves query performance in
two ways:
– It reduces overall IO because AS should only scan
the partitions containing the data requested
– It increases parallelism because AS can scan more
than one partition simultaneously
Partitioning
• You should be partitioning so that a single
partition contains between 2 million rows and
15 million rows of data
• Although AS should auto-detect the slice of a
partition, it is good practice to set the Slice
property manually, even on MOLAP partitions
• In Profiler, Progress Report Begin/End events
show which partitions are being read
Aggregations
• Building aggregations improves query
performance by reducing the amount of
calculation done at query time
• Aggregations start to show an effect on
regular Sum measures where partition size is
greater than a few million rows
• Can have an effect on much smaller partitions
where there are complex calculations
Aggregation Design Methodology
1. Make your dimension design as ‘clean’ as
possible – delete or disable unnecessary
attributes
2. Set the AggregationUsage property
appropriately on each Cube dimension
3. Run the Aggregation Design Wizard
1. First using the ‘I Click Stop’ option
2. Then, if this creates too many aggregations, using
‘Performance Gain’ of 30%
Aggregation Design Methodology
4. Don’t bother with Usage-Based Optimisation:
it may overwrite existing useful aggregations
5. If necessary, design aggregations manually
using the version of Aggregation Manager
built-in to BIDS Helper
6. Check aggregations are being used by looking
for the Get Data From Aggregation event in
Profiler
The Formula Engine
• If Storage Engine requests make up a negligible
percentage of the time taken by the query, then
the Formula Engine is the problem
• Another giveaway sign is that when a query runs
on a multi-processor server, only one CPU shows
activity
– The Formula Engine only uses one thread per query
• MDX Script Performance Analyser can help to
work out which, if any, calculation is the problem
Using Named Sets
• Use named sets instead of repeatedly
evaluating the same set expression
• Named sets can be defined
– On the cube and in the session, but since they are
static (until AS2008) they are of limited usefulness
– In the WITH clause, which is useful only if you
have control over the MDX you’re executing
– Inline, which can be very powerful but also a bit
buggy
Non Empty Filtering
• Removing empty values before a set
operation, such as a Sum(), is often important
• This relies on the fact that in many cases, AS
can remove empty values very quickly
• Do not use NonEmptyCrossjoin() any more,
use the NonEmpty() function instead: it is
more reliable, predictable, and just as fast
Non_Empty_Behavior
• Setting the Non_Empty_Behavior property on
calculated members can provide a massive
performance increase
• However, knowing how to set it correctly can
be very difficult:
“I feel that it is the best not to mention
NON_EMPTY_BEHAVIOR at all - unless you can
explain what it does - and (no offence) I don't
think you will be able to do it.”, Mosha
Non_Empty_Behavior
• Non_Empty_Behavior has improves query
performance in two ways:
– When filtering out empty tuples from a set using
NON EMPTY or NonEmpty()
– When working out how to request the data a
calculation needs from the Storage Engine, even
when there is no filtering taking place
• Increasingly, the Formula Engine can ‘guess’
what it should be set to though
Non_Empty_Behavior
• In general, the rule for setting it is if you can
say:
<subcube> =
iif(IsEmpty(<empty expression>)
, null, <else expression>);
Then you can say:
Non_Empty_Behavior(<subcube>)=
<empty expression>;
Appropriate Calculation Scope
• Use scoped assignments to determine when
and where calculations execute, rather than
conditional logic
– For example, a year-to-date calculation will never
be relevant at the year level
– Do not use IIF with
Hierarchy.Currentmember.Level – scope on the
level itself
Aggregation-Aware Calculations
• If you know where you have built your
aggregations, it can make sense to write your
calculations appropriately
• For example, if you have an aggregation at the
year level, the calculation:
Q1 + Q2 + Q3
might be better expressed as:
Year – Q4
MDX to Avoid
• ORDER – there’s a bug which makes it perform
badly! Use TOPCOUNT instead.
• LOOKUPCUBE – redesign your cube instead
• CASE – it isn’t optimised, use IIF instead
• Non-deterministic functions like USERNAME,
STRTOX – they stop AS being able to cache
calculated values
• Late-binding functions like LINKMEMBER – they
stop AS being able to generate efficient query
plans
Warming the Cache
• Analysis Services can cache Storage Engine
data in most circumstances
• It can also cache calculation results in many
cases too
• As a result, most queries execute much faster
(if not immediately) on a warm cache
• So why not run a sample of common queries
every night after cube processing?
How to Warm the Cache
• The Create Cache MDX statement (available in
AS2K and AS2005 post SP2) is useful for
loading the Storage Engine cache
• Several tools can be used to automate running
of queries:
– Ascmd, available in the SQL Server samples
– SSIS, using the Execute SQL task
Resources
• “Analysis Services 2005 Performance Guide”
white paper
• “Identifying and Resolving MDX Query
Performance Bottlenecks” white paper
• Mosha’s blog:
http://sqlblog.com/blogs/mosha
• My blog:
http://cwebbbi.spaces.live.com/
Questions?
Platinum
www.SQLBits.com
www.sqlbits.com
Gold
[Conference Web site]
www.SQLBlogCasts.com
[Becoming the premier Blogging site for SQL professionals]
Silver
www.SQLServerFAQ.com
[UK SQL Server Community Website]
UK SQL Bloggers
cwebbbi.spaces.live.com
sqlblogcasts.com/blogs/simons
sqlblogcasts.com/blogs/tonyrogerson
Remember :: Feedback Forms!!