BIE13-INT Parallel Processing Late Arriving Facts Using the Database Engine Lookup Cache Change Data Capture Slowly Changing Dimensions Null Value Substitution General Performance Tips Package Benchmark Baseline Step Time DimCustomer 00:02:37:987 1,202.0 DimNation 00:00:00:263 95.0 DimPart 00:00:19:777 10,113.0 DimPartSupplier 00:01:51:643 7,165.0 DimRegion 00:00:00:417 12.0 DimSupplier 00:00:02:207 4,533.0 Lineitem 00:13:56:847 7,171.0 Orders 00:01:32:843 16,156.0 1,475 5,805.9 00:18:15:190 KB / Sec Rows / Sec.

Download Report

Transcript BIE13-INT Parallel Processing Late Arriving Facts Using the Database Engine Lookup Cache Change Data Capture Slowly Changing Dimensions Null Value Substitution General Performance Tips Package Benchmark Baseline Step Time DimCustomer 00:02:37:987 1,202.0 DimNation 00:00:00:263 95.0 DimPart 00:00:19:777 10,113.0 DimPartSupplier 00:01:51:643 7,165.0 DimRegion 00:00:00:417 12.0 DimSupplier 00:00:02:207 4,533.0 Lineitem 00:13:56:847 7,171.0 Orders 00:01:32:843 16,156.0 1,475 5,805.9 00:18:15:190 KB / Sec Rows / Sec.

BIE13-INT
Parallel
Processing
Late Arriving
Facts
Using the
Database
Engine
Lookup Cache
Change Data
Capture
Slowly
Changing
Dimensions
Null Value
Substitution
General
Performance
Tips
Package Benchmark
Baseline
Step
Time
DimCustomer
00:02:37:987
350
1,202.0
DimNation
00:00:00:263
121
95.0
DimPart
00:00:19:777
2557
10,113.0
DimPartSupplier
00:01:51:643
2121
7,165.0
DimRegion
00:00:00:417
76
12.0
DimSupplier
00:00:02:207
1218
4,533.0
Lineitem
00:13:56:847
1656
7,171.0
Orders
00:01:32:843
3697
16,156.0
1,475
5,805.9
00:18:15:190
KB / Sec
Rows / Sec
Parallel
Processing
Late Arriving
Facts
Using the
Database
Engine
Lookup Cache
Change Data
Capture
Slowly
Changing
Dimensions
Null Value
Substitution
General
Performance
Tips
Extract
Transform
Load
Parallel
Processing
Late Arriving
Facts
Using the
Database
Engine
Lookup Cache
Change Data
Capture
Slowly
Changing
Dimensions
Null Value
Substitution
General
Performance
Tips
Destination
Source
Staging
Parallel
Processing
Late Arriving
Facts
Using the
Database
Engine
Lookup Cache
Change Data
Capture
Slowly
Changing
Dimensions
Null Value
Substitution
General
Performance
Tips
• Modified flag
• Checksum
• Extract operations from Transaction Log
• LastModifiedDate column
• Comparing database snapshots
Parallel
Processing
Late Arriving
Facts
Using the
Database
Engine
Lookup Cache
Change Data
Capture
Slowly
Changing
Dimensions
Null Value
Substitution
General
Performance
Tips
MERGE INTO [DimProduct] AS FACT
USING [Staging] AS SRC
ON (
FACT.ProductAlternateKey = SRC.ProductAlternateKey
)
WHEN MATCHED AND FACT.EndDate is NULL -- update the current record
THEN UPDATE SET
FACT.[ArabicDescription] = SRC.ArabicDescription
,FACT.[ChineseDescription] = SRC.ChineseDescription
,FACT.[EnglishDescription] = SRC.EnglishDescription
,FACT.[FrenchDescription] = SRC.FrenchDescription
,FACT.[GermanDescription] = SRC.GermanDescription
,FACT.[HebrewDescription] = SRC.HebrewDescription
,FACT.[JapaneseDescription] = SRC.JapaneseDescription
,FACT.[ThaiDescription] = SRC.ThaiDescription
,FACT.[TurkishDescription] = SRC.TurkishDescription
,FACT.[ReorderPoint] = SRC.ReorderPoint
,FACT.[SafetyStockLevel] = SRC.SafetyStockLevel
;
INSERT INTO [DimProduct] ([ProductAlternateKey],[ListPrice],[EnglishDescription],[StartDate])
SELECT [ProductAlternateKey],[ListPrice],[EnglishDescription],[StartDate]
FROM (
MERGE INTO [DimProduct] AS FACT
USING [Staging] AS SRC
ON ( FACT.ProductAlternateKey = SRC.ProductAlternateKey )
WHEN NOT MATCHED THEN
INSERT VALUES (
SRC.ProductAlternateKey
,SRC.ListPrice
,SRC.EnglishDescription
,GETDATE()
-- StartDate
,NULL
-- EndDate
)
WHEN MATCHED AND FACT.EndDate is NULL
THEN UPDATE SET FACT.EndDate = GETDATE()
OUTPUT $Action Action_Out
,SRC.ProductAlternateKey
,SRC.ListPrice
,SRC.EnglishDescription
,GETDATE() StartDate
) AS MERGE_OUT
WHERE MERGE_OUT.Action_Out = 'UPDATE'
Seconds to Process
SCD Performance Comparison – 100k Row Dimension Table
1000
900
800
700
600
500
400
300
200
100
0
SCD Wizard (No Insert/Update)
SCD Wizard
SCD Wizard (Optimized)
MERGE
20k Rows
27
310
70
12
100k Rows
128
1674
404
94
200k Rows
260
3632
854
178
Ease of Creation
Maintenance
Large Dimensions
Small Change Set
Overall Performance
Error Recovery
Parallel
Processing
Late Arriving
Facts
Using the
Database
Engine
Lookup Cache
Change Data
Capture
Slowly
Changing
Dimensions
Null Value
Substitution
General
Performance
Tips
DimProduct
FactOrders
select [ProductKey],
[ProductAlternateKey],
[StartDate],
[EndDate]
from [DimProduct]
where [ProductAlternateKey] = ?
and
[StartDate] <= ?
and (
[EndDate] is null or
[EndDate] > ?
)
OrderDate
ProductNumber
2001-07-01
LJ-0192-L
OrderDate
Product
Key
StartDate
EndDate
2001-07-01
LJ-0192-L
232
2001-07-01
2002-06-30
2001-07-01
LJ-0192-L
233
2002-07-01
2003-06-30
2001-07-01
LJ-0192-L
234
2003-07-01
NULL
OrderDate
Product
Key
2001-07-01
LJ-0192-L
232
140000
120000
Rows / Second
100000
80000
60000
40000
20000
0
Rows Per Second
Lookup
3639
Merge Join
55708
Script (Partial)
123659
Script (Full)
128978
Lookup
Merge Join &
Conditional Split
Custom Script
Ok for small number of rows
Best for equal data set size
Good middle ground
Best for overall performance
Parallel
Processing
Late Arriving
Facts
Using the
Database
Engine
Lookup Cache
Change Data
Capture
Slowly
Changing
Dimensions
Null Value
Substitution
General
Performance
Tips
Reducing database and memory usage
Customers
Cache most common values
SELECT TOP CustomerId …
Customers
Get the rest
Customers
Parallel
Processing
Late Arriving
Facts
Using the
Database
Engine
Lookup Cache
Change Data
Capture
Slowly
Changing
Dimensions
Null Value
Substitution
General
Performance
Tips
Parallel
Processing
Late Arriving
Facts
Using the
Database
Engine
Lookup Cache
Change Data
Capture
Slowly
Changing
Dimensions
Null Value
Substitution
General
Performance
Tips
www.microsoft.com/teched
www.microsoft.com/learning
http://microsoft.com/technet
http://microsoft.com/msdn