Tuning a Very Large Data Warehouse Pichai Bala About Me • Working in the IT industry for the past 17 years • Working in Oracle.

Download Report

Transcript Tuning a Very Large Data Warehouse Pichai Bala About Me • Working in the IT industry for the past 17 years • Working in Oracle.

Tuning a Very Large Data
Warehouse
Pichai Bala
About Me
• Working in the IT industry for the past 17
years
• Working in Oracle since 1993.
• Working in Data Warehouse and BI since
2003
Disclaimer
The views expressed in this presentation are
mine and does not represent the organization I work for
or the organizations I had worked for in the past.
Please always test and validate the ideas presented here
in a test environment first.
A Day in the life of a DBA
How about this one?
What the chaos mean for the DBA?
•
•
•
•
•
•
•
Free buffer waits
enq: KO - fast object checkpoint
enq: TM - contention
db sequential read
CPU time
Logical I/O
Physical I/O
What it means to the End Users?
•
•
•
•
•
•
•
ETL Load/Batch Job Delays
Reporting Delays
Decision Making Delays
Business Analytics Delays
Customer Intelligence Delays
Planning and Forecasting Delays
Key Performance Metrics Delays
Data Warehouse is now in
Death Bed
But, Why?
Data Warehouse Vicious Cycle
• Data gets deployed
• Gains User Acceptance
• More Users and More Demands and
Needs
• Existing Data Grows and New Data gets
Deployed
…and gets into the Death Spiral
Performance
Data Growth
Possible Causes….
• Lack of proper and meaningful
maintenance
• Human errors
• Poor Design
• Bad SQLs by developers, users
• Poor monitoring and scheduling etc..
Tuning Strategy
• Keep it Simple
• Low Intensity Changes with low impact but
with high performance benefits
• Localized changes
• No change in logic
• Easy to understand, test and deploy
Reduce Wastage
•
•
•
•
•
Reduce CPU
Reduce Logical IO
Reduce Physical IO
Reduce UNDO
Reduce Direct Path Reads
How it can be done?
•
•
•
•
•
•
•
•
•
•
Server Tuning
Instance Tuning and Maintenance
Database Tuning and Maintenance
Table Reorganizations/Redefinitions
New Indexes
Regular Statistics Collection
Views
SQL/PLSQL Code Changes
Working with other teams
Educating/Training the users
Instance/Database Tuning
•
•
•
•
•
•
•
SGA Max Size
DB Cache Size
Shared Pool
Large Pool
No. of DB Writers
Redo Log File Size
Typical Init.ORA parameters like
QUERY_REWRITE, BITMAP_MERGE_JOIN
SQL/PLSQL Tuning
•
•
•
•
•
•
Avoid Clutter
Use Indexes when appropriate
Full Table Scan is not bad
Revisit the code
Cunning code is not always necessary
Work with other teams and business to
reduce complexity in code
• Avoid Hints
Query Results can be wrong
• In 10G use ORDER BY whenever GROUP
BY is used
• Hidden parameter can be enabled with the
help of Oracle Support
Pillars of the Data Warehouse
•
•
•
•
•
•
•
Partitioning
Parallelism
Aggregations
Compression
Materialized Views
Read Only Tablespaces
Data Archival
Partitioning
• Range Partitioning
• List Partitioning
• Range List Partitioning
• Range Hash Partitioning
• Hash Partitioning
Caveat: Joins beware.
Parallelism
•
•
•
•
•
•
•
Tables can be built parallel
Parallel Indexes
Parallel Hints while loading or querying.
Alter table <xxx> move … parallel (degree 8) …;
Alter table <xxx> split … parallel( degree 4) …;
Create table <xxx> parallel(degree 4)…
Sufficient LARGE_POOL helps greatly
Aggregations
• Aggregations and MVs are the soul of any
DSS
• Most BI tools supports Aggregation
Awareness
• Have multiple aggregations
• Aggregations help users with adhoc
queries
• Daily, Monthly and Yearly Aggregations
are very common in most DSS
Compression
•
•
•
•
•
•
•
Saves Disk Space by 40 to 50%
Reduces Logical IO
Reduces Physical IO
Reads will be fast
DMLs will be slow
Compress Table as well as Index
Caveat : You can’t uncompress after the table is
compressed
ORA-01735: invalid ALTER TABLE option
Materialized View
•
•
•
•
Fast Refresh may be very slow
From 10G MV can be parallel
MVs can be partitioned
MV_CAPABILITY results can be
misleading.
• ALTER MATERIALIZED VIEW
<mv_name> parallel (degree 4 );
For MV Fast Refresh to be successful a Complete Refresh should happen before
Exchange Partitions
• Very Useful
• Dictionary update only
• Can’t Exchange a table with bitmap
indexes with a partition
Partition exchange has issue with BITMAP indexes with the ora error for mismatch
indexes 0RA-14098
READONLY Tablespaces
• Data Warehouse has time variant nonvolatile data
• Say Range Partition on TIME, and making
historic tablespaces READONLY helps
Database Checkpoint process
Data Archival
• With various regulatory and internal
requirements data needs to be retained for 2 to
30 years.
• Data growth is exponential
• Archival is needed to start it small and keep it
small
• Saves $$$ in Database licenses and
maintenance.
• Helps the optimizer to get results faster from a
smaller set
Rolling Partitions
• If design permits instead of creating new
partitions every time the same partition
can be reused again and again.
• Like SUNDAY can be reloaded again on
the same partition next Sunday.
• Rolling Partitions by HOUR or by DAY of
the WEEK can be considered
• Helps Data Retention Strategies too.
Case of HUGE UNDO
• More than 30G of UNDO was getting
generated for a 1.5G table
Fix the code and fix the problem.
Misleading V$lock
• Blocking locks won’t show in v$lock but
locks would exist
• Use x$kgllk or x$kglpn to identify and kill
the blocking sessions.
Package Invalidations
• Package gets invalidated but can’t
recompile itself because of sessions
holding them invisibly
• Coding and deployment standards can
help
ORA-02049: timeout: distributed transaction
waiting for lock
• Flush the Shared Pool, the failures go
away
• From 10G you can avoid bounces by
flushing buffer_cache and shared_pool
Again?
Stuck in traffic? Meet the new
supercar based on Ferrari that
could fly you out of jams.
* Only £500,000.
Flying Ferrari