10G - New Manageability Features Presented by Lenka Vanek [email protected] Oracle 10g – Manageability  Active Session History (ASH) – Contains recent session activity  Automatic.

Download Report

Transcript 10G - New Manageability Features Presented by Lenka Vanek [email protected] Oracle 10g – Manageability  Active Session History (ASH) – Contains recent session activity  Automatic.

10G - New Manageability
Features
Presented by Lenka Vanek
[email protected]
Oracle 10g – Manageability
 Active Session History (ASH)
– Contains recent session activity
 Automatic Workload Repository (AWR)
= Infrastructure -> Central element – provides
services
 Automatic Database Diagnostic
Monitoring (ADDM)
– Generate advice based on AWR data
 Server Generated Alerts
- metrics computation and and threshold validation
done by Oracle database 10G directly
ASH =v$ session +History
 Contains recent session activity
 History of v$session_wait … records what is session
waiting for
 Every Second
 Inactive sessions not sampled.
• Design – rolling buffer in memory
• Size - between 1M & 128M (avg. sample record 600
bytes)
--- Algorithm used to estimate ASH buffers size
memory_quota = max(2% of sga_target, 5% of shared_pool_size);
/*
sga_target = 0 when AUTO SGA is OFF */
cpu_quota = 2MB * (# of CPUs);
ash_size = min( cpu_quota, memory_quota );
ash_size = max( 1MB, ash_size); /* atleast 1MB */
ash_size = min( 128MB, ash_size); /* atmost 128MB */
ASH
 Provider for ADDM
 v$session and v$session_wait join eliminated –
– Prior 10G - sessions experiencing waits were generally located by joining the
v$session_wait view with the v$session view.
– 10G - offers query simplification. All the wait event columns from v$session_wait
have been added to v$session.
 x$ash
 V$active_session_history - contains one row for each
active session per sample
 DBA_HIST_ACTIVE_SESS_HISTORY - contains
historical data
 the greater the system activity, the smaller the number
of seconds of session activity that can be stored in the
circular buffer
 Flushed every 30 minutes or when buffer is full
– MMON every 30 minutes and by MMNL (Memory
Monitor Light) whenever the buffer is full
– wrh$active_session_history
ASH
Rolling Buffer
Statistics
ASH
Recent History
30 min is just goal
v$session
SGA
V$ACTIVE_SESSION_HISTORY
MMON
MMNL
AWR Snapshots
ASH - Limitations
 Query of v$active_session_history
needs a session
 Query of v$active_session_history
requires all relevant latches in SQL
layer
 If system is crippled ASH will impose
more overhead on these latches
AWR – Automatic Workload Repository
 AWR is Infrastructure
 Collects, maintains and utilizes statistics
 Two major parts:
– In-memory statistics – fixed views- V$
– WR Schema, Snapshots – persistent portion for
historical analysis.
– SYSAUX tablespace - occupies 63.7% of space
– Process MMON – memory monitor – disk transfer,
snapshots purging, retention period
AWR
BG
…..
BG
FG
ASH SGA
In memory statistics
Collections:
Time model, wait classes,
OS stats, Metrics, SQL Stats
Object Stats
SQL*Plus
V$
DBA_%
AWR Snapshots
SYSAUX
7 days - Default
MMON
FG
v$sysstat
v$sql
v$segment_statistics
v$sys_time_model
v$osstat
v$event_name
………
ADDM
AWR = STATSPACK ++
 Foundation for all of the other self-tuning
features.
 Runs every 30 min
 Provides data for
–
–
–
–
–
ADDM
Alerts
Advisors
Cost Based Optimizer
End-to-end tracing
 Automatically installed, populated, purged
for 10G only
 Default retention – 7 days. This can be
changed.
AWR and Snapshots
 Stores information in form of Snapshots (similar
to statspack snapshots, but more precise)
 Snapshot = set of data captured at a certain time
 Each time a snapshot is taken, the ADDM is
triggered to do an analysis of the period
corresponding to the last two snapshots
Snapshot can be taken manually
BEGIN
DBMS_WORKLOAD_REPOSITORY.CREATE_SNAPSHOT ();
END;
/
AWR – Base Statistics and Metrics
 Base statistics – raw data
 Metrics – secondary, derived from base
statistics
– Updated by MMON
– Example: avg number of physical reads per sec in the system in
the last 30 minutes
 Tract the rates of change
• Indicators of DB performance
• Deltas of Stats and Events over 15 and 60 seconds
• Max, min, avg., standard deviation over 30 min
• 10 minutes for File IO
• 30 minutes for SQL – metric values are constantly increasing in x$ until 30
minute snapshot when it is externalized into dba_hist_sqlstat
AWR - Metrics
• New Views
–
–
–
–
v$sysmetric ….not a 1 to 1 map of v$sysstat, there are some new values
v$sessmetric
v$metricname
v$filemetric, v$waitclassmetric, v$eventmetric, ……
 10G Supports metrics for
–
–
–
–
System
Session
File
Wait-event statistics
– The wait event model - steadily gaining ground as a good tuning
tool
– At any given moment an Oracle process is either busy servicing
a request or waiting for something to happen
Wait Event Enhancements

Formed new wait events classes … before too many
individual events
– Changes to v$event_name - CLASS# and CLASS
columns are added. These columns help to group related
events while analyzing the wait issues.
– Example - list the events related to IO,
 Understanding the overall health of the database.

New columns in the v$session and v$session_wait views that
track the resources sessions are waiting for.
Histograms of wait durations

– Assist in determining whether a wait event is a
frequent problem that needs addressing or a unique
event.
New Views

v$system_wait_class – the instance-wide time totals for the
number of waits and the time spent in each class of wait events.
Understanding the overall health of the database

v$session_wait_class - the number of waits and the time spent in
each class of wait event on a per session basis.

v$event_histogram – a histogram of the number of waits, the
maximum wait, and total wait time on a per-child cursor basis.

v$file_histogram – a histogram of all single block reads on a perfile basis. Determine if the bottleneck is a regular or a unique
problem.

v$temp_histogram – a histogram of all single block reads on a pertempfile basis.

v$session_wait_history – This view displays the last 10 wait events
for each active session.
Wait Classes Overview
 Administrative
 Configuration
(39)
–
–
switch logfile
rebuild index
 Application
–
–
(11)
enqueues
sqlnet break/reset
 Cluster (113)
 Commit (1)
–
Log file Sync
 Concurrency
–
–
–
Latches: cbc, lbc,
Lib cache locks
Buffer busy wait
(12)









(20)
log file size
Enqueues: ST, HW, ITL
Latch: redo copy,shared pool
Idle (56)
Network (25)
System I/O (19)
Scheduler (6)
User I/O (12)
Other (485)
AWR and STATISTICS
 Oracle10g to be either robust or simple
 You can control the set of statistics to
capture by using STATISTICS_LEVEL
parameter.
 If STATISTICS_LEVEL is set to:
• BASIC: The computation of AWR statistics
and all self-tuning capabilities are turned off.
• TYPICAL: Only part of the statistics are
collected. They represent what is typically
needed to monitor the Oracle server
behavior. DEFAULT.
• ALL: All possible statistics are captured.
AWR Limitations
Only for 10G
Each DB has its own AWR repository
Cannot provide cross instance analysis
Overhead on production box
PL/SQL interface – cannot
communicate if box is down
 STATSPACK cannot be migrated into
10G
 User cannot modify AWR Schema





Automatic Database Diagnostic
Monitor
 Performance Diagnostics within DB
 Not a monitor – perform analysis – an Advisor
 Analyzes the AWR data, much the same as a
human DBA would analyze a STATSPACK report
 Utilizes AWR and publishes report - every hour
 Identify problems - ADDM searches for lockand-latch contention, file I/O bottlenecks and
SGA shortages, etc…
 Proposes solutions – relies on Advisor for
solution
ADDM
 Where is the time spent?
 What did DB do?
 Elimination method – where is NOT
your problem – TREE Structure
 Uses a tree structure to represent all possible tuning issues
 The tree is based on the new wait and time model statistics
 Root of this tree represents the symptoms, and going down to
the leaves
 If time-based threshold is not exceeded for a particular node,
ADDM prunes the corresponding sub-tree
Limitation of ADDM
 Same limitation as AWR – only for 10G
 Works of a set of rules and these are
not external of DB. Based out of fixed
thresholds.
 Depends on ASH for analysis
Advisory Framework


Advisors are server components - provide useful feedback
about resource utilization and performance
Automatic Database Diagnostic Monitor (ADDM) = An advisor
for the database instance
 ADDM can call
– SQL Tuning Advisor - tuning advice for a SQL
statement
– SQL Access Advisor data
determines optimal ways to access
– Space Advisor
Segment Advisor: Responsible for space issues regarding a
database object. It analyzes the growth trends.
Undo Advisor: Suggests parameter values and the amount of
additional space that is needed to support flashback for a
specified time
Advisory Framework
– Memory Advisor – MMAN
PGA Advisor: recommends optimal usage of PGA
memory based on your workload.
SGA Advisor: tuning and recommending SGA size
depending on pattern of access for the various
components within the SGA:
Buffer Cache Advisor: Predicts cache hit rates for
buffer access for different sizes of the buffer cache.
Library Cache Advisor: Predicts the cursor cache
hit rate for the library cache for different sizes.
 AWR = Common Data Source
10G Server Generated Alerts
 Threshold and non-threshold alerts
– Tablespace full
– Snap-shot too old, Recovery Area Low On Free Space,
Resumable Session Suspended
 Notification by page/email/PDA
 Push not pull for efficiency –
 Advanced Queue
 Server pushes alerts
 Predefined persistent queue ALERT_QUE owned by SYS
 Alerts persist to AWR, review historically
 purged according to Workload Repository snapshot
purging policy
10G Server Generated Alerts
– Most Oracle server-generated alerts are
configured by setting two threshold values
on database metrics:
• Warning threshold – 85%
• Critical threshold – 97%
•Only space-related alerts have
thresholds defined by default –
Tablespace Space usage
– There are 161 metrics for which you can
define thresholds
Server Alerts Limitations
 Concept great – implementation has a
lot to be desired
 Only 4 alerts out of the box
 Rest needs to be set up by user
 Avoid false peaks – need to set # of
occurrences. This is very difficult in
production- How long should I wait?
QUESTIONS & ANSWERS