#CCNZ
www.mscommunities.co.nz
What is going on here???
Steven Wang
• Senior DBA/Senior BI Specialist at BNZ
• MCITP/MCTS: BI Developer, Database Developer and Database Administrator
• Owner of TellYes Data Intelligence
• Blogs: www.msbicoe.com
• Email: [email protected]
Agenda
• What is minimal logging and why does it matter?
• What can be minimally logged?
• Metadata-Only Operations
• Things under the hood (Demo)
• Data manoeuvring techniques on very large tables
• Real world example
• Q&A
Small Quiz
A. A minimally logged operation will always reduce the transaction log backup size?
True
False
B. A minimally logged operation will always reduce the transaction log size?
True
False
C. A minimally logged operation is always faster than a fully logged operation?
True
False
Answer
A. A minimally logged operation will always reduce the transaction log backup size?
True
False
B. A minimally logged operation will always reduce the transaction log size?
True
False
C. A minimally logged operation is always faster than a fully logged operation?
True
False
What is Minimal Logging and why does it matter?
• SQL Server uses 'write-ahead logging': everything has to be written to the log file first
• A minimally logged operation is one that does not log every row; it only logs the extent allocations
• Under the full recovery model the transaction log fills rapidly and is likely to grow during a large data operation
• "The transaction log is ALWAYS zero initialized when first created, manually grown, or autogrown." --Paul Randal
What can be minimally logged?
• SELECT INTO ... FROM Table
• Bulk import operations:
  BCP
  BULK INSERT
  INSERT INTO ... SELECT FROM OPENROWSET(BULK ...)
• Create/Rebuild index
• Using the .WRITE clause in UPDATE for new data
• The creation of a new heap by dropping a clustered index
Conditions
• The database has to be under the Bulk-Logged or Simple recovery model
• The table is not being replicated
• The TABLOCK hint needs to be used (see the sketch below)
• Further conditions are covered on the next slides
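A minimal sketch of a load that meets these conditions, assuming a hypothetical database MyDB, target heap Target_Table, and illustrative data/format file paths:

-- Hypothetical names: MyDB, Target_Table, and the file paths are illustrative only.
Alter Database MyDB Set Recovery Bulk_Logged;
Go

-- INSERT ... SELECT with the TABLOCK hint can qualify for minimal logging
-- when the conditions above are met.
Insert Into Target_Table With (TABLOCK)
Select Col1, Col2, Col3
From OpenRowset(Bulk 'C:\Data\Data_201101.dat', FormatFile = 'C:\Data\Data.fmt') As src;
Go

-- Switch back to full recovery and take a log backup afterwards.
Alter Database MyDB Set Recovery Full;
Go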
Conditions (Continue)
[Diagram] Heap + TABLOCK: minimal logging whether the table is empty or not; heap with a nonclustered index: full logging.
Conditions (Continue)
[Diagram] Clustered index + TABLOCK: minimal logging when the table is empty; a non-empty clustered table, or one with additional nonclustered indexes, is fully logged.
Conditions (Continue)
• Use Trace Flag 610
  Can be used to get minimal logging into an empty heap or clustered table
  Can be used to get minimal logging into a non-empty B-tree
  The TABLOCK hint is not needed for a table without nonclustered indexes
  The first page is always fully logged
• Can be turned on instance-wide or session-wide (see the sketch below)
• Fully test it in your environment before use
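As a quick sketch, the trace flag can be enabled with the standard DBCC trace-flag commands; test the behaviour in your own environment before relying on it:

-- Session-wide: affects only the current connection.
DBCC TRACEON (610);

-- Instance-wide: the -1 argument applies the flag globally.
DBCC TRACEON (610, -1);

-- Check which trace flags are currently active.
DBCC TRACESTATUS (-1);

-- Turn it off again.
DBCC TRACEOFF (610, -1);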
Metadata-Only Operations
• Truncate Table
• Drop Table
• Partition Switch
• Partition Merge for 2 empty partitions
• Partition Split for an empty partition
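A minimal sketch of these operations, using hypothetical names (staging tables Stage_1 to Stage_3, partitioned table MyTargetTable, partition function MyPF, partition scheme MyPS, filegroup FG1):

-- Truncating deallocates whole extents; only the deallocations are logged.
Truncate Table Stage_1;

-- Dropping a table is likewise a metadata/deallocation operation.
Drop Table Stage_2;

-- Switching a partition moves no data, only metadata.
Alter Table Stage_3 Switch To MyTargetTable Partition 3;

-- Merging a boundary is metadata-only when both adjacent partitions are empty.
Alter Partition Function MyPF() Merge Range ('20110201');

-- Splitting is metadata-only when the partition being split is empty;
-- the scheme must first be told which filegroup to use next.
Alter Partition Scheme MyPS Next Used FG1;
Alter Partition Function MyPF() Split Range ('20110301');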
Things Under the Hood (Demo)
Data Manoeuvring techniques on very large tables
• Bulk load a partitioned table
  Use parallelism: one bulk load per CPU core
  Create staging tables with exactly the same structure as the target table, each in the same filegroup as its target partition, with no indexes
  Bulk load the data into the staging tables
  Create indexes and constraints on the staging tables
  Switch the staging tables into the partitioned table's partitions
Alter Database MyDB Set Recovery Bulk_Logged;
Go

Alter Database MyDB Modify Filegroup FG1 Default;
Go
Select Col1, Col2, Col3, ...
Into Stage_1
From Target_Table Where 0 = 1;
Go

...

Alter Database MyDB Modify Filegroup FG4 Default;
Go
Select Col1, Col2, Col3, ...
Into Stage_4
From Target_Table Where 0 = 1;
Go
[Diagram: source data files Data_201101 ... Data_201104 map one-to-one to staging tables Stage_1 (FG1) ... Stage_4 (FG4), which map to partitions Partition_1 (FG1) ... Partition_4 (FG4) of the destination partitioned table.]
[Diagram: each source data file is bulk inserted into its matching staging table: Data_201101 → Stage_1 (FG1), Data_201102 → Stage_2 (FG2), Data_201103 → Stage_3 (FG3), Data_201104 → Stage_4 (FG4).]
[Diagram: after the bulk inserts, on each staging table: 1. create indexes; 2. apply other settings to comply with the target; 3. create constraints.]
[Diagram: each staging table is switched into its destination partition.]
Alter Table Stage_1 SWITCH TO MyTargetTable Partition 1;
Go
...
Go
Alter Table Stage_4 SWITCH TO MyTargetTable Partition 4;
Go
[Diagram: the complete flow: each source data file is bulk inserted into its staging table (Stage_1 ... Stage_4), which is then switched in to the matching partition (Partition_1 ... Partition_4) on the same filegroup.]
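Put together, one staging table's round trip might look like the following sketch; the file path, key columns (pk_ID, PartitionDate) and boundary values are hypothetical and need to match the real target schema:

-- Bulk load into the empty heap with TABLOCK for minimal logging.
Bulk Insert Stage_1
From 'C:\Data\Data_201101.dat'
With (TABLOCK, DATAFILETYPE = 'char', FIELDTERMINATOR = ',');
Go

-- Build the same clustered index the target partitions have.
Create Clustered Index CIX_Stage_1 On Stage_1 (pk_ID);
Go

-- A trusted check constraint matching the partition boundaries is required to switch in.
Alter Table Stage_1 With Check
Add Constraint CK_Stage_1_Range Check (PartitionDate >= '20110101' And PartitionDate < '20110201');
Go

-- Metadata-only switch into the partitioned table.
Alter Table Stage_1 Switch To MyTargetTable Partition 1;
Go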
Data Manoeuvring techniques on very large tables (Continued)
• Delete a very large amount of rows in a table, for example, over 50% of the total
  Don't delete! Think of using truncate or drop table instead.
  How??
Delete a very large amount of data from a non-partitioned table
[Flowchart]
Start
→ Alter Database MyDB Set Recovery Bulk_Logged
→ Select the data that needs to be kept into a new table
→ Drop the original table
→ Rename the new table to the original table name
→ Create indexes and other settings as needed
→ Set the recovery model back to full
End
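A sketch of that flow in T-SQL; MyDB, MyData, the keep predicate and the index are hypothetical stand-ins:

Alter Database MyDB Set Recovery Bulk_Logged;
Go

-- SELECT INTO is minimally logged under the bulk-logged recovery model.
Select *
Into MyData_Keep
From MyData
Where KeepFlag = 1;   -- whatever predicate identifies the rows to keep
Go

Drop Table MyData;
Go

Exec sp_rename 'MyData_Keep', 'MyData';
Go

-- Recreate indexes and other settings as needed.
Create Clustered Index CIX_MyData On MyData (pk_ID);
Go

Alter Database MyDB Set Recovery Full;
Go
-- Take a log backup afterwards to re-establish full log protection.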
Delete a very large amount of data from a partitioned table
[Diagram: Partition_4 (FG4) and Partition_3 (FG3), where all data in the partition needs to be deleted, are switched out to Stage_4 (FG4) and Stage_3 (FG3); for Partition_1 (FG1), where more than half of the data needs to be deleted, the rows to keep are copied with Select Into to Stage_1 (FG1); Partition_2 (FG2) is left untouched.]
Delete a very large amount of data from a partitioned table
[Diagram: the switched-out staging tables Stage_4 (FG4) and Stage_3 (FG3) are dropped; Partition_1 (FG1) is then switched out to a staging table, leaving the partition empty while Stage_1 (FG1) holds the rows to keep.]
Delete a very large amount of data from a partitioned table
[Diagram: Stage_1 (FG1), holding only the rows to keep, is switched in to the now-empty Partition_1 (FG1); Partition_2, Partition_3 and Partition_4 remain in place.]
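A sketch of the two cases; the schema (pk_ID, PartitionDate, Col1), partition function MyPF and the keep predicate are hypothetical:

-- 1. Empty a whole partition: switch it out to an empty table with the same
--    structure on the same filegroup, then drop that table.
Create Table Stage_4_Out (pk_ID int, PartitionDate date, Col1 int) On [FG4];
Create Clustered Index CIX_Stage_4_Out On Stage_4_Out (PartitionDate, pk_ID);
Alter Table MyTargetTable Switch Partition 4 To Stage_4_Out;
Drop Table Stage_4_Out;
Go

-- 2. Partially empty a partition: copy the rows to keep into a staging table on
--    the same filegroup (minimally logged SELECT INTO under bulk-logged recovery),
--    switch the old partition out and drop it, then index and constrain the staging
--    table and switch it back in, as in the bulk-load sketch above.
Select pk_ID, PartitionDate, Col1
Into Stage_1
From MyTargetTable
Where $Partition.MyPF(PartitionDate) = 1
  And Col1 > 0;   -- hypothetical predicate for the rows to keep
Go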
Data Manoeuvring techniques on very large tables (Continued)
• Update a very large amount of rows in a table for the majority of columns
  Don't update! Think of inserting the data instead to achieve minimal logging.
  Really??
Update a very large amount of data for a non-partitioned table
[Diagram: MyData holds the existing data and MyData_New the new data; a minimally logged Select Into combines them into MyData_Temp, which becomes the updated data.]
Select A.pk_ID
     , Coalesce(B.Col1, A.Col1) As Col1
     , Coalesce(B.Col2, A.Col2) As Col2
     , ...
Into MyData_Temp
From MyData A
Left Outer Join MyData_New B
  On A.pk_ID = B.pk_ID
• Drop MyData
• Truncate MyData_New
• Rename MyData_Temp to MyData
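The final swap could look like this sketch (sp_rename is the standard rename procedure; index recreation depends on the real schema):

Drop Table MyData;
Go
Truncate Table MyData_New;   -- metadata-only; keeps the table for the next batch of changes
Go
Exec sp_rename 'MyData_Temp', 'MyData';
Go
-- Recreate indexes and constraints on MyData as needed.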
Update a very large amount of data for a partitioned table
I believe you have already got a fairly good idea of how to apply a similar technique to perform the update on a partitioned table.
Real World Example
• A table has to be partitioned daily
• The data has to be kept for more than 7 years
• On the first day of every month, a partition merge operation is performed to consolidate the daily partitions of the month that is now 30 months old into one partition
• How will you do it?
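One possible approach, sketched with hypothetical names (MyTargetTable, partition function MyPF on a date key, illustrative dates), relying on the fact that merging is metadata-only when both adjacent partitions are empty:

-- For each day of the month that is now 30 months old:
-- 1. Switch the daily partition out to a staging table on the same filegroup.
Alter Table MyTargetTable Switch Partition $Partition.MyPF('20081101') To Stage_20081101;
-- 2. With the adjacent partitions now empty, removing the daily boundary is metadata-only.
Alter Partition Function MyPF() Merge Range ('20081102');
-- ... repeat for the remaining daily boundaries of that month ...
-- 3. Consolidate the staging tables into one (minimally logged inserts with TABLOCK),
--    switch the result back in to the single monthly partition, and drop the staging tables.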
Thank You
• Email: [email protected]
• Blogs: www.msbicoe.com
• Linkedin:
http://nz.linkedin.com/pub/stevenwang/1b/ab/b8b