End-to-End Metrics for Troubleshooting and Monitoring Dave Abercrombie Principal Database Architect, Convio NoCOUG Spring Conference 2009, May 21 © 2009 Convio, Inc.

Download Report

Transcript End-to-End Metrics for Troubleshooting and Monitoring Dave Abercrombie Principal Database Architect, Convio NoCOUG Spring Conference 2009, May 21 © 2009 Convio, Inc.

End-to-End Metrics
for Troubleshooting
and Monitoring
Dave Abercrombie
Principal Database Architect, Convio
NoCOUG Spring Conference 2009, May 21
© 2009 Convio, Inc.
Agenda
■ Benefits
■ Brief description
■ Tag Details
■ Setting tags (PL/SQL and Java)
■ Real-time monitoring
■ Historical troubleshooting
■ Overhead
■ Other ideas
© 2009 Convio, Inc.
Reasons to use End-to-End Metrics
■ Real-time application details
■ Provide meaning to SQL_IDs
■ Application context
■ Customer level
■ Improve staff communication
© 2009 Convio, Inc.
More reasons
■ Similar, but not identical SQL texts
■ Across sessions
■ Enhance historical diagnostics
■ Focus tuning during development
■ Flexible, so customize!
© 2009 Convio, Inc.
Overview
■ Session-level tags (four)
■ Application sets the tags
■ Tags visible in
V$
ASH, AWR
Trace files
■ Initiate tracing
■ Aggregate statistics
■ Flexible, incremental
© 2009 Convio, Inc.
Simple example
SQL>
SQL>
2
3
4
5
6
-- set tags
BEGIN
DBMS_APPLICATION_INFO.SET_MODULE (
module_name => 'simple',
action_name => 'example');
END;
/
PL/SQL procedure successfully completed.
SQL> -- retrieve tags for current session
SQL> SELECT module, action
2
FROM v$session
3
WHERE sid = SYS_CONTEXT('userenv', 'SID');
MODULE
ACTION
---------- ---------simple
example
© 2009 Convio, Inc.
Tags
Name
Length
Module
48
Action
32
Client_Identifier
64
Client_Info
64
© 2009 Convio, Inc.
Tables that use these tags
ACTION MODULE
CLIENT_IDENTIFIER
(CLIENT_ID)
CLIENT_INFO
64
V$SESSION
32
48
64
V$ACTIVE_SESSION_HISTORY
32
48
64
DBA_HIST_ACTIVE_SESS_HISTORY
32
48
64
V$SQL
64
64
DBA_HIST_SQLSTAT
64
64
▶ This is not a complete list!
▶ Length varies (use v$session)
▶ Name varies (v$session.client_identifier, others use client_id)
© 2009 Convio, Inc.
Agenda
■ Benefits
■ Brief description
■ Tag Details
■ Setting tags (PL/SQL and Java)
■ Real-time monitoring
■ Historical troubleshooting
■ Overhead
■ Other ideas
© 2009 Convio, Inc.
Design Challenges
■ Level of detail
■ Availability of tag data
■ Tag quirks:
Various lengths,
Availability varies,
Funny names
■ Short length
© 2009 Convio, Inc.
More Design Challenges
■ Ability to aggregate
■ Uniqueness
■ Hierarchy
■ Restoring to nulls
■ Session management (connection pools)
■ When do called methods override tags?
© 2009 Convio, Inc.
Setting tags with PL/SQL
DBMS_APPLICATION_INFO.SET_MODULE(
module_name IN VARCHAR2,
action_name IN VARCHAR2);
DBMS_APPLICATION_INFO.SET_ACTION(
action_name IN VARCHAR2);
DBMS_APPLICATION_INFO.SET_CLIENT_INFO(
client_info IN VARCHAR2);
© 2009 Convio, Inc.
Setting tags with JDBC
String metrics[] = new
String[OracleConnection.END_TO_END_STATE_INDEX_MAX];
metrics[END_TO_END_ACTION_INDEX] = "Spike";
metrics[END_TO_END_MODULE_INDEX] = "Buffy";
// Set these metrics
conn.setEndToEndMetrics(metrics,(short)0);
// Do some work
// Update a metric
metrics[END_TO_END_MODULE_INDEX] = "Faith";
© 2009 Convio, Inc.
Potential tag values
■
■
■
■
■
■
■
■
■
■
■
PL/SQL package, method
Java object, method
JSP file
Web page URL
Task ID, current step
Customer ID
App server
IP:port
Version
Debugging
Progress indicator
© 2009 Convio, Inc.
Ideas for setting tags
■
■
■
■
■
■
■
■
■
Don’t forget to set back to NULL
Human readable vs. encoded
Overloaded, delimited
Database API wrapper
Hierarchical
Consistency
Mix and match styles
Some utilities already set tags
Integrate with other logging
© 2009 Convio, Inc.
BUG 8524840 - Do not use multibyte tags
■ Multibyte module tags
■ Length? Bytes vs. characters
■ ORA-07445 core dump
■ _MEMSET()+320 KEWE_SET_NEWMODULE
■ New bug
■ Versions 10.2.0.2, 10.2.0.3 (at least)
© 2009 Convio, Inc.
Agenda
■ Benefits
■ Brief description
■ Tag Details
■ Setting tags (PL/SQL and Java)
■ Real-time monitoring
■ Historical troubleshooting
■ Overhead
■ Other ideas
© 2009 Convio, Inc.
Real-time monitoring
■ V$SESSION
■ Progress indicator
Cannot aggregate
CLIENT_INFO
■ Lock pile-ups
■ Inter-session communication
■ Custom probes
■ ASH
■ dbms_monitor (trace)
© 2009 Convio, Inc.
DBMS_MONITOR - tracing
■ Optional
■ Enable/disable traces on specific:
Module
Action
Module and Action
Client_Identifier
■ Service name
Look at v$session.service_name
Perhaps dba_services.network_name?
■ Listed in dba_enabled_traces
■ Examples:
Norton Debes, “Secret Oracle”
James Koopmann, “DBAsupport.com”
© 2009 Convio, Inc.
Example: custom probe
■ Delayed commit from app server
■ Challenges:
Idle wait event (waiting for message)
Common, frequent, widespread DML
Neither ASH nor SQL_ID helped
■ Probe design:
cron script
Join v$transaction and v$session
Filter for idle wait event
Transaction started > 5 seconds ago
Tags identified culprit
© 2009 Convio, Inc.
Agenda
■ Benefits
■ Brief description
■ Tag Details
■ Setting tags (PL/SQL and Java)
■ Real-time monitoring
■ Historical troubleshooting
■ Overhead
■ Other ideas
© 2009 Convio, Inc.
Historical Troubleshooting
■ dba_hist_active_sess_history
■ Aggregate wait counts, durations
■ Complementary, not replacement
■ Supported by OEM, etc.
■ Trace files, deadlocks, etc
■ V$SQL, AWR: Session-level, first parse
■ dbms_monitor (stat)
© 2009 Convio, Inc.
DBMS_MONITOR – aggregate stats
■ Optional
■ Enable/disable aggregations
■ Module, action, client_id, service_name
■ Views
v$serv_mod_act_stats
v$client_stats
v$service_stats
■ Listed in dba_enabled_aggregations
■ Examples:
Norton Debes, “Secret Oracle”
James Koopmann, “DBAsupport.com”
© 2009 Convio, Inc.
Stats gathered, by class
User
SQL
user calls
user I/O wait time
DB time
application wait time
concurrency wait time
cluster wait time
user rollbacks
user commits
session logical reads
opened cursors cumulative
parse time elapsed
execute count
session cursor cache hits
workarea executions – multipass
workarea executions – onepass
parse count (total)
workarea executions - optimal
Redo, Cache
RAC cache
redo size
physical reads
physical writes
db block changes
gc current block receive time
gc current blocks received
gc cr blocks received
gc cr block receive time
© 2009 Convio, Inc.
Example: prioritize tuning efforts
■ Large, expensive background tasks
■ Challenges:
Many executions
Many flavors
Business critical
Complicated, frequent revisions
■ Study design:
Count and sum read wait events
Group by tags
Prioritize
© 2009 Convio, Inc.
Agenda
■ Benefits
■ Brief description
■ Tag Details
■ Setting tags (PL/SQL and Java)
■ Real-time monitoring
■ Historical troubleshooting
■ Overhead
■ Other ideas
© 2009 Convio, Inc.
Overhead
■ About 5 microseconds
PL/SQL loop
Need to defeat compiler optimization
■ No wait events
■ No extra network roundtrips (Norton Debes)
■ OCI no extra DB calls (Cary Milsap)
OCI_ATTR_MODULE
OCI_ATTR_ACTION
OCI_ATTR_CLIENT_INFO
© 2009 Convio, Inc.
Loop elapsed time
DBMS_APPLICATION_INFO
SET_MODULE()
"DB time" as function of number of calls
4.8 microseconds per call
for loop_counter in 1 .. 5000000
Loop
250
dbms_application_info.set_module (
'fourtyEight fourtyEight fourtyEight',
'thirtyTwo thirtyTwo thirtyTwo'
);
DB time in seconds
200
dbms_application_info.set_module (
'module module module module',
'action action action'
);
150
end loop;
100
dbms_application_info.set_module (
null,
null
);
50
0
0.E+00
1.E+07
2.E+07
3.E+07
4.E+07
number of API calls
© 2009 Convio, Inc.
5.E+07
Agenda
■ Benefits
■ Brief description
■ Tag Details
■ Setting tags (PL/SQL and Java)
■ Real-time monitoring
■ Historical troubleshooting
■ Overhead
■ Other ideas
© 2009 Convio, Inc.
Other Ideas
■ Dynamic Monitoring Service (DMS)
Oracle Application Server
Controls tag setting
■ Wrap PL/SQL API
Custom stat gathering code
Redefine public synonym
■ Prevent Routines from Running Simultaneously
Amar Kumar Padhi, Database Journal, 3/10/2004
Simpler than explicit locking
© 2009 Convio, Inc.
Reasons to use End-to-End Metrics
■
■
■
■
■
■
■
■
■
■
Real-time application details
Provide meaning to SQL_IDs
Application context
Customer level
Improve staff communication
Similar, but not identical SQL texts
Across sessions
Enhance historical diagnostics
Focus tuning during development
Flexible, so customize!
© 2009 Convio, Inc.