March 2009 Turning DBQL Data into Actionable Info

Download Report

Transcript March 2009 Turning DBQL Data into Actionable Info

Turning DBQL Data into Actionable
Information
Greg Hamilton
PS Sr. Consultant TD - Teradata
Performance and Workload Management Center of
Expertise
March 19th, 2009
Overview
• Database Query Log (DBQL) Basics
• DBQL Recommendations
• DBQL and TD 12 & 13
• Actionable DBQL
> Application Performance
> Workload Management
• Bonus Queries
2
DBQL Basics
DBQL - Tables
DBQLogTbl
For individual
queries:
query origination
start, stop and other timings
error codes
SQL (truncated)
CPU AND I/O values
DBQLSummaryTbl
Summary of
(short-running) queries:
high-volume queries:
query origination
response time, cpu and
io sums,
query counts
DBQLStepTbl
query step timings, row
counts, CPU and I/O, step
actions, etc
DBQLObjTbl
Tracks usage of database objects
(tables, columns, indexes,
database,etc)
DBQLSqlTbl
Holds full SQL text
DBQLExplainTbl
Holds full EXPLAIN text
4
Detail Table - DBQLogTbl
• Procid and
Queryid: Ties
all DBQL
tables together
except
summary
5
ProcID
CollectTimeStamp
QueryID
UserID
AcctString
ExpandAcctString
SessionID
LogicalHostID
RequestNum
InternalRequestNum
LogonDateTime
AcctStringTime
AcctStringHour
AcctStringDate
LogonSource
AppID
ClientID
ClientAddr
QueryBand
ProfileID
StartTime
FirstStepTime
FirstRespTime
LastStateChange
NumSteps
NumStepswPar
MaxStepsInPar
NumResultRows
TotalIOCount
AMPCPUTime
ParserCPUTime
UtilityByteCount
UtilityRowCount
ErrorCode
ErrorText
WarningOnly
DelayTime
AbortFlag
CacheFlag
StatementType
QueryText
NumOfActiveAMPs
MaxAMPCPUTime
MaxCPUAmpNumber
MinAmpCPUTime
MaxAmpIO
MaxIOAmpNumber
MinAmpIO
SpoolUsage
WDID
OpEnvID
SysConID
LSN INTEGER
NoClassification
WDOverride
SLGMet
ExceptionValue
FinalWDID
TDWMEstMaxRows
TDWMEstLastRows
TDWMEstTotalTime
TDWMAllAmpFlag
TDWMConfLevelUsed
TDWMRuleID
UserName
DefaultDatabase
AMPCPUTimeNorm
ParserCPUTimeNorm
MaxAMPCPUTimeNorm
MaxCPUAmpNumberNorm
MinAmpCPUTimeNorm
EstResultRows
EstProcTime
EstMaxRowCount
ExtraField1
ExtraField2
ExtraField3
ExtraField4
ExtraField5
ExtraField6
ExtraField7
SQL Table - DBQLSqlTbl
• One Row or more are written to DBC.DBQLSQLTBL
for each SQL statement
• Large statements can cause multiple rows to be
written in order to log the full query text.
BEGIN QUERY LOGGING WITH SQL, OBJECTS, STEPinfo, EXPLAIN limit
SQLTEXT=0 ON ALL ACCOUNT = ‘$H1$WG1R&S&D&H’;
Select * From DBC.DBQLSQLTBL;
ProcID
16383
6
CollectTimeStamp
QueryID
8/23/2007 08:23:23.48 163830418894739000
SqlRowNo
1
SqlTextInfo
SELECT * FROM dbc.dbcinfo ;
Object Table - DBQLObjTbl
• DBQL logs one row in DBQLObjTbl for each physical data
object used by the optimizer to satisfy the query.
• An object can be:
Prior to V2R6.2
Database
Table
Column
Primary Index
Secondary Index
Join Index
Journal Table
Added in V2R6.2
View Name
Macro
Trigger
Stored Procedure
Target Objects of Load Utilities
BEGIN QUERY LOGGING WITH SQL, OBJECTS, STEPinfo, EXPLAIN limit
SQLTEXT=0 ON ALL ACCOUNT = ‘$H1$WG1R&S&D&H’;
7
StepInfo Table - DBQLStepTbl
• One row per step is written to DBC.DBQLStepTbl
• Info: Step number, step name, start time, end time, row count
• Diagnostic in nature, shouldn’t be left on
BEGIN QUERY LOGGING WITH OBJECTS, SQL, StepInfo, Explain ON DBC;
END QUERY LOGGING ON DBC;
Select * from DBC.DBQLStepTbl;
ProcID
16383
16383
16383
16383
16383
Num
Of
Active
AMPs
1
2
2
1
?
CollectTimeStamp
8/23/2007 8:23:11 AM
8/23/2007 8:23:11 AM
8/23/2007 8:23:11 AM
8/23/2007 8:23:11 AM
8/23/2007 8:23:11 AM
Max
Amp
CPU
Time
0
0
0
0
?
Max
CPU
Amp
Number
?
?
?
?
?
Min
Amp
CPU
Time
0
0
0
0
?
Max
Amp
IO
0
0
8
8
?
QueryID
163830418894739000
163830418894739000
163830418894739000
163830418894739000
163830418894739000
Max
IO
Amp
Numb
er
?
?
1
1
?
Step Step
Lev1 Lev2 Step
Num Num Name
1
0 MLK
2
0 MLK
3
0 RET
4
0 Edt
5
0 RSF
Min
Max
Amp Spool AMP
IO
Usage Spool
0
0
0
0
0
0
8 2048 1024
0
0
0
?
?
?
Max
Spool
Amp
Number
?
?
1
?
?
Min
AMP
Spool
0
0
1024
0
?
* ExtraField1 = RowsWComprColumns Available in V2R6.1
8
StepStartTime
8/23/2007 8:23:03 AM
8/23/2007 8:23:03 AM
8/23/2007 8:23:03 AM
8/23/2007 8:23:03 AM
?
Step
WD
?
?
?
?
?
Utility
Table
LSN ID
?
?
?
?
?
?
?
?
?
?
StepStopTime
8/23/2007 8:23:03 AM
8/23/2007 8:23:03 AM
8/23/2007 8:23:03 AM
8/23/2007 8:23:03 AM
?
Rows W
Compr
Est IO
Columns Cost
0
0
0
0
0 33.71
0
0
?
?
Est
Net
Cost
0
0
0
0
?
Est
HR
Cost
0
0
0
0
?
Est
Proc
Time
0
0
0.03
0
?
Est
CPU
Cost
0
0
0.39
0
?
Max
Amp
CPU CPU
time Time
Norm Norm
0
0
0
0
0
0
0
0
?
?
Est
CPU IO
Row Row Row
Time count Count Count Count2
0
0
0
1?
0
0
0
2?
*
0
16
2
3?
0
8
0
1?
?
?
?
?
?
Max
CPU
Amp
Number
Norm
?
?
?
?
?
Min
Amp
CPU
Time Extra Extra
Norm Field1 Field2
0
0?
0
0?
0
0?
0
0?
?
?
?
Extra Extra
Field3 Field4
?
0
?
1
?
2
?
3
?
4
Explain Table - DBQLExplainTbl
• One row (or more) is written to DBC.DBQLExplainTbl
• Large explains can cause multiple rows to be written in order to
log the full explain text.
• Diagnostic in nature, shouldn’t be left on
BEGIN QUERY LOGGING WITH OBJECTS, SQL, StepInfo, Explain ON DBC;
END QUERY LOGGING ON DBC;
Select * from DBQLExplainTbl;
ExplainText
ProcID
CollectTimeStamp
QueryID
Exp
Row No
1) First, we lock a distinct EmpDB."pseudo table" for read on a
RowHash to prevent global deadlock for EmpDB.TestTable.
2) Next, we lock EmpDB.TestTable for read.
3) We do an all-AMPs RETRIEVE step from EmpDB.TestTable by way of
an all-rows scan into Spool 1 (group_amps), which is built locally
on the AMPs. The size of Spool 1 is estimated with low confidence
to be 2 rows (206 bytes). The estimated time for this step is
0.03 seconds.
4) Finally, we send out an END TRANSACTION step to all AMPs involved
in processing the request.
-> The contents of Spool 1 are sent back to the user as the result of
statement 1. The total estimated time is 0.03 seconds.
16383 2007-08-23 08:51:20.60
9
163830418894739000
1
Recommendations for Logging and
Maintenance
Logging Recommendations
• ALL usage should be logged in DBQL 24 X 7
• Flush the DBQL Caches at default rate of 10 minutes
• Recommend logging usage in DBQL at the
AccountString level
> This may mean some work needs to be done to the
AccountStrings to ensure the appropriate level of logging
is done for each type of usage
> Our recommendation for AccountString setup:
$H1$WXYZ&S&D&H Where…
– $H1$ = The Performance Group
– WXYZ = The Work Group Identifier
– &S&D&H = ASE Variables – Session, Date, and Hour
11
Logging Recommendations
• Three types of logging Classified by usage
> Type 1 – Long running (Adhoc, Report Apps, Batch Loads)
– Example: BEGIN QUERY LOGGING WITH SQL, OBJECTS LIMIT
SQLTEXT=0 ON ALL ACCOUNT = ‘acctname’;
– TD13 default log all queries as Type 1
– Example: BEGIN QUERY LOGGING WITH SQL, OBJECTS LIMIT SQLTEXT=0 ON ALL;
> Type 2 - ONLY short running (subsecond) KNOWN work
(TPumps, Apps w/10000s of Tactical Queries )
– Example: BEGIN QUERY LOGGING LIMIT SUMMARY=10,50,100
CPUTIME ON ALL ACCOUNT = ‘acctname’; (V2R6)
– TD13 Type2 logged same as before
> Type 3 - Mostly short (subsecond) known work with
occasional long running unknown work (Tactical Apps with
Some Larger or Unknown Queries)
– Example: BEGIN QUERY LOGGING LIMIT THRESHOLD=100 CPUTIME
AND SQLTEXT=10000 ON ALL ACCOUNT = ‘acctname’; (V2R6)
– TD13 Type3 logged same as before
12
Recommendations for Historic DBQL Data
• Our recommendation is to remove the DBQL data
from the DBC database nightly
• If space is a concern consider changing retention
• Recommendation for DDL on Historical DBQL
>
>
>
>
Create new date column populated from Starttime on all tables
Put PPI on this date to provide better performing analysis
Join between DBQL tables on ProcId, QueryId and the Date
End logging on at least one user to dump the cache to ensure all
DBQL history is captured
> Begin logging on the user again
> Retain 13 months Detail & Summary, 3 months SQL & Object
> Keep history tables similar to DBC DBQL tables for other tools like
Teradata Workload Analyzer
NOTE: The standard maintenance process (for V2R6 and TD12) is now available
on Teradata.com
13
DBQL TD 12 & 13
What’s in V12
• Column Name Changes in DBQLogTBL
> Extrafield1 to InternalRequestNum
> TotalCPUTime to AmpCPUTime (Linux is recorded in Milliseconds, Unix is
recorded in hundredths)
> Extrafield5 to StatementType
> HotAmp1CPU to MaxAmpCPUTime
> HotCPUAmpNumber to MaxCPUAmpNumber
> LowAmp1CPU to MinAmpCPUTime
> HotAmp1IO to MaxAmpIO
> HotIOAmpNumber to MaxIOAmpNumber
> LowAmp1IO to MinAmpIO
> WDPeriodID to OPEnvID
> FirstRespTime will not be null and LastRespTime column is removed.
• New Columns in DBQLOGTBL
> LogonSource
> LastStateChange
> ParserCPUTime
15
What’s in V12
• New Columns in DBQLogTBL (Contd)
>
>
>
>
>
>
>
>
>
>
>
>
16
UtilityByteCount (not populated yet)
UtilityRowCount (only populated on the End Loading Statement entry)
SysConId
TDWMRuleID
AmpCPUTimeNorm
ParserCPUTimeNorm
MaxAmpCPUTimeNorm
MaxCPUAmpNumberNorm
MinAmpCPUTimeNorm
EstResultRows
EstProcTime
EstMaxRowCount
What’s in V12
• Column Name Changes in DBQLStepTBL
>
>
>
>
>
>
HotAmp1CPU to MaxAmpCPUTime
HotCPUAmpNumber to MaxCPUAmpNumber
LowAmp1CPU to MinAmpCPUTime
HotAmp1IO to MaxdAmpIO
HotIOAmpNumber to MaxIOAmpNumber
LowAmp1IO to MinAmpIO
• New Columns in DBQLStepTBL
>
>
>
>
>
>
>
>
17
EstCPUCost
RowCount2
SpoolUsage
MaxAmpSpool
MaxSpoolAmpNumber
MinAmpSpool
StepWD
LSN
What’s in V12
• New Columns in DBQLStepTBL
>
>
>
>
>
>
>
>
UtilityTableId
EstIOCost
EstNetCost
EstHrCost
CPUTimeNorm
MaxAmpCPUTimeNorm
MaxCPUAmpNumberNorm
MinAmpCPUTimeNorm
• Columns Changes in DBQLSummaryTBL (Contd)
> TotalCPUTime
AmpCPUTime
• New Columns in DBQLSummaryTBL (Contd)
> ParserCPUTime
> AmpCPUTimeNorm
> ParserCPUTimeNorm
18
TD13 DBQL Enhancement
• Other types of logging
allowed when logging all
turned on
• New SQL to add a WITH
NONE rule to inhibit
applying lower hierarchy
rules for a user,
user/account or group of
users
• Log by application name
or disable logging for an
application.
• Rules have a hierarchy
with a “Best Fit Rule”
policy.
19
Rule Type - Hierarchy
•1
• A rule based on
application name
•2
• A rule for this user and
the specific account
•3
• A rule for this user and
any account
•4
• A rule for all users and
the specific account
•5
• A rule for all users and
any account
Actionable DBQL
Application Performance
• Use DBQLogTBL to Identify Problem Queries
> Fields to focus on
– CPU
– IO
– Spool Usage (Combined with user spool info to proactively identify
users close to spool limit)
– ErrorCode
– Response Time (Identify possible blocking or contention on
resources)
– Returned rows
Response Time Calculation:
EXTRACT (SECOND FROM lastresptime) + (EXTRACT (MINUTE FROM lastresptime) * 60 ) +
(EXTRACT (HOUR FROM lastresptime) *60*60 ) + (86400 * (CAST ( lastresptime AS DATE) CAST ( starttime AS DATE) ) ) - (EXTRACT (SECOND FROM starttime) + (EXTRACT (MINUTE
FROM starttime) * 60 ) + (EXTRACT (HOUR FROM starttime) *60*60 ) )
21
Application Performance
• Use DBQLogTBL (Contd)
> Suspect Query Report
– Ranked on Impact CPU “HotAMP1CPU * (HASHAMP()+1)”
– Calculates CPU & IO skew, Product Join Indicator and Unnecessary
IO Indicator
– Filters out queries that don’t have a certain amount of skew, PJI or
UII
– UII & PJI – minimum depends on hardware version and/or
operating system
– Large PJI indicates possible product join
– Large UII indicates opportunity for reducing IO
PJI Calculation: CASE WHEN totalcputime < 1 OR TotalIOCount = 0 THEN 0 ELSE (a.TotalCPUTime
*1000)/a.TotalIOCount
UII Calculation:CASE WHEN totalcputime < 1 OR TotalCPUTime = 0 THEN 0 ELSE
a.TotalIOCount/(a.TotalCPUTime *1000)
22
Sample Suspect Query Report
23
Suspect Query Report by Process
• Suspect query report can be modified for different purposes
> Application
> Time of day
> Process
UserName
Baseline from MSI
MSI_USER
MSI_USER
MSI_USER
MSI_USER
MSI_USER
MSI_USER
MSI_USER
MSI_USER
MSI_USER
MSI_USER
MSI_USER
MSI_USER
MSI_USER
MSI_USER
MSI_USER
MSI_USER
MSI_USER
MSI_USER
MSI_USER
MSI_USER
MSI_USER
MSI_USER
MSI_USER
MSI_USER
MSI_USER
MSI_USER
MSI_USER
MSI_USER
MSI_USER
MSI_USER
MSI_USER
MSI_USER
24
QryRespTime
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
00:00:00.100000
00:00:00.540000
00:00:06.320000
00:00:01.040000
00:00:03.410000
00:00:01.150000
00:24:57.790000
00:00:00.220000
00:14:43.790000
00:00:00.150000
00:00:09.290000
00:00:00.130000
00:00:08.170000
00:00:00.170000
00:19:58.840000
00:00:00.200000
00:09:45.470000
00:00:00.050000
00:00:00.230000
00:00:00.080000
00:09:43.140000
00:00:00.650000
00:00:00.110000
00:00:00.130000
00:00:00.120000
00:00:00.110000
00:00:00.110000
00:00:00.090000
00:00:00.130000
00:00:00.110000
00:00:00.100000
00:00:00.120000
ActiveAMPS ResultRows
1
64
64
64
64
64
64
64
64
64
64
64
64
64
64
64
64
64
64
64
64
64
64
64
64
64
64
64
64
64
64
64
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
539
0
0
0
0
0
0
0
0
0
0
SumCPU
0.00
0.01
4.24
0.02
0.13
0.02
6,159.66
0.02
5,178.87
0.02
41.20
0.02
41.47
0.02
5,342.41
0.02
5,151.85
0.02
0.09
0.02
5,275.76
3.72
0.01
0.02
0.01
0.01
0.01
0.01
0.01
0.01
0.01
0.01
27,199.67
SumIO
6
325
18094
325
2836
325
12166282
325
11051448
325
112437
325
120831
325
11798487
325
11280620
325
2984
325
11983233
13644
270
270
271
271
271
271
270
270
271
271
58,556,858
CPUSKW
0
0
1.16
0
0
0
1.02
0
1.02
0
1.04
0
1.05
0
1.02
0
1.02
0
0
0
1.02
2.03
0
0
0
0
0
0
0
0
0
0
IOSKW PJI
0
0
1.03
0
0
0
1.01
0
1.02
0
1.02
0
1.39
0
1.02
0
1.02
0
0
0
1.02
1.44
0
0
0
0
0
0
0
0
0
0
0.00
0.00
0.23
0.00
0.00
0.00
0.51
0.00
0.47
0.00
0.37
0.00
0.34
0.00
0.45
0.00
0.46
0.00
0.00
0.00
0.44
0.27
0.00
0.00
0.00
0.00
0.00
0.00
0.00
0.00
0.00
0.00
UII
ImpactCPU
0.00
0.00
4.26
0.00
0.00
0.00
1.98
0.00
2.13
0.00
2.73
0.00
2.91
0.00
2.21
0.00
2.19
0.00
0.00
0.00
2.27
3.67
0.00
0.00
0.00
0.00
0.00
0.00
0.00
0.00
0.00
0.00
0.06
0.13
4.93
0.13
0.32
0.13
6,272.00
0.13
5,259.52
0.13
42.94
0.13
43.39
0.06
5,449.02
0.06
5,231.10
0.06
0.26
0.06
5,374.27
7.55
0.13
0.19
0.06
0.06
0.06
0.13
0.06
0.06
0.13
0.06
QryText
DATABASE "P_MSI";
CREATE VOLATILE TABLE ZZTRE022KV
;insert into ZZTRE022KVWMQ000 select
CREATE VOLATILE TABLE ZZTRE022KV
;insert into ZZTRE022KVWRF001 select d
CREATE VOLATILE TABLE ZZTRE022KV
;insert into ZZTRE022KVWMD002 select
CREATE VOLATILE TABLE ZZTRE022KV
;insert into ZZTRE022KVWMD003 select
CREATE VOLATILE TABLE ZZTRE022KV
;insert into ZZTRE022KVWMD004 select
CREATE VOLATILE TABLE ZZTRE022KV
;insert into ZZTRE022KVWMD005 select
CREATE VOLATILE TABLE ZZTRE022KV
CREATE VOLATILE TABLE ZZTRE022KV
;insert into ZZTRE022KVWSP007 select a
CREATE VOLATILE TABLE ZZTRE022KV
;insert into ZZTRE022KVWMD008 select
CREATE VOLATILE TABLE ZZTRE022KV
DROP
DROP
DROP
DROP
DROP
DROP
DROP
DROP
DROP
DROP
TABLE
TABLE
TABLE
TABLE
TABLE
TABLE
TABLE
TABLE
TABLE
TABLE
ZZTRE022KVWMQ000 ;
ZZTRE022KVWRF001 ;
ZZTRE022KVWMD002 ;
ZZTRE022KVWMD003 ;
ZZTRE022KVWMD004 ;
ZZTRE022KVWMD005 ;
ZZTRE022KVWSP006 ;
ZZTRE022KVWSP007 ;
ZZTRE022KVWMD008 ;
ZZTRE022KVWMD009 ;
Suspect Query Rpt Post Fixes
• Validate success
• Iterative process
UserName
After Modification
SYSDBA
SYSDBA
SYSDBA
SYSDBA
SYSDBA
SYSDBA
SYSDBA
SYSDBA
SYSDBA
SYSDBA
SYSDBA
SYSDBA
SYSDBA
SYSDBA
SYSDBA
SYSDBA
SYSDBA
SYSDBA
SYSDBA
SYSDBA
SYSDBA
SYSDBA
SYSDBA
SYSDBA
SYSDBA
SYSDBA
SYSDBA
SYSDBA
SYSDBA
25
QryRespTime
0 00:00:00.050000
0 00:00:00.080000
0 00:00:00.790000
0 00:00:00.050000
0 00:00:00.210000
0 00:00:00.040000
0 00:00:00.530000
0 00:00:00.050000
0 00:00:00.310000
0 00:00:00.040000
0 00:00:00.320000
0 00:00:00.050000
0 00:00:46.120000
0 00:00:00.060000
0 00:00:00.520000
0 00:00:00.060000
0 00:00:00.190000
0 00:00:00.050000
0 00:00:46.160000
0 00:00:00.680000
0 00:00:00.130000
0 00:00:00.140000
0 00:00:00.090000
0 00:00:00.110000
0 00:00:00.110000
0 00:00:00.110000
0 00:00:00.170000
0 00:00:00.130000
0 00:00:00.120000
ActiveAMPS ResultRows
1
64
64
64
64
64
64
64
64
64
64
64
64
64
64
64
64
64
64
64
64
64
64
64
64
64
64
64
64
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
539
0
0
0
0
0
0
0
0
0
SumCPU
0.00
0.02
4.19
0.01
0.15
0.02
0.58
0.02
2.35
0.02
2.34
0.02
35.54
0.02
0.56
0.02
0.09
0.02
37.02
2.68
0.02
0.01
0.01
0.01
0.01
0.02
0.01
0.01
0.01
85.77
SumIO
6
585
18502
325
2835
325
10408
325
7649
325
7641
325
567418
325
9434
325
2984
325
572712
13702
271
271
271
271
271
271
271
271
271
1,218,915
CPUSKW IOSKW PJI
0
0
1.13
0
0
0
0
0
1.17
0
1.15
0
1.2
0
0
0
0
0
1.18
2.1
0
0
0
0
0
0
0
0
0
0
0
1.02
0
0
0
0
0
1.01
0
1.01
0
1.14
0
0
0
0
0
1.15
1.39
0
0
0
0
0
0
0
0
0
0.00
0.00
0.23
0.00
0.00
0.00
0.00
0.00
0.31
0.00
0.31
0.00
0.06
0.00
0.00
0.00
0.00
0.00
0.06
0.20
0.00
0.00
0.00
0.00
0.00
0.00
0.00
0.00
0.00
UII
ImpactCPU QryText
0.00
0.00
4.41
0.00
0.00
0.00
0.00
0.00
3.25
0.00
3.27
0.00
15.97
0.00
0.00
0.00
0.00
0.00
15.47
5.11
0.00
0.00
0.00
0.00
0.00
0.00
0.00
0.00
0.00
0.00
0.13
4.74
0.06
0.58
0.06
1.02
0.13
2.75
0.13
2.69
0.06
42.50
0.13
0.90
0.06
0.26
0.06
43.58
5.63
0.06
0.06
0.13
0.06
0.06
0.13
0.06
0.13
0.06
DATABASE "P_MSI";;
CREATE VOLATILE TABLE ZZMQ00 , NO FALLB
INSERT INTO ZZMQ00 SELECT DISTINCT a15.D
CREATE VOLATILE TABLE ZZRF01 , NO FALLBA
INSERT INTO ZZRF01 SELECT DISTINCT pr11.D
CREATE VOLATILE TABLE ZZMD02 , NO FALLB
INSERT INTO ZZMD02 SELECT a11.SUBCAT_NB
CREATE VOLATILE TABLE ZZMD03 , NO FALLB
INSERT INTO ZZMD03 SELECT a11.SUBCAT_NB
CREATE VOLATILE TABLE ZZMD04 , NO FALLB
INSERT INTO ZZMD04 SELECT a11.SUBCAT_NB
CREATE VOLATILE TABLE ZZSP05 , NO FALLBA
CREATE VOLATILE TABLE ZZSP06 , NO FALLBA
INSERT INTO ZZSP06 SELECT a11.SUBCAT_NB
CREATE VOLATILE TABLE ZZMD07 , NO FALLB
INSERT INTO ZZMD07 SELECT pa11.SUBCAT_N
CREATE VOLATILE TABLE ZZMD08 , NO FALLB
DROP TABLE ZZMQ00 ; ;
DROP TABLE ZZRF01 ; ;
DROP TABLE ZZMD02 ; ;
DROP TABLE ZZMD03 ; ;
DROP TABLE ZZMD04 ; ;
DROP TABLE ZZSP05 ; ;
DROP TABLE ZZSP06 ; ;
DROP TABLE ZZMD07 ; ;
DROP TABLE ZZMD08 ; ;
Application Performance
• Use DBQLObjectTBL
> Use object table before dropping or altering any DDL to
identify use, users and/or times
> Use to correlate suspect queries to tables involved
> Object tables and index use report
– Can be used after creating an index or as a recurring report
– Save space and CPU used to maintain under utilized indexes and
tables, resulting in reduced batch times
– Exclude collect stats and maintenance requests on base table
Notes : Prior to V2R6.1 – If the object is an index, its number is logged rather than a
name.
In V2R6.1 - There will be one row in the table for each column that makes up the
index, and the ColumnNames will be logged (one row for each component
column)
26
Sample Suspect Tables Report
27
Application Performance
• Use DBQLStepTBL
> Use to help diagnose problem queries
– Each step contains start and finish times, resources used
and skewing if present for each step
– Look at differences between estimated row count and
actual for each step
– Example: Can show blocking (not what was blocked)
– Example: All amp operation taking long time even though
not using much resources may indicate AWT shortage
– Example: Query taking long in aggregation step may
indicate aggregate join index will help
> Use to identify plan changes
– Can be done by # of steps or type of steps
– Should be used in conjunction with DBQLExplainTBL
28
Application Performance
• Use DBQLExplainTBL
> Use to identify plan changes
– Good for any changes to system (upgrades, DDL changes,
etc…)
– Can scan text for specific types of joins
– Example: Site turns on merge joins, with explain turned on
site is able to identify queries that were effected
– When used with DBQLStepTBL can provide more in depth
analysis
29
Workload Management
• Workload Management Example
Response Time for Tactical App
100%
14000
98%
12000
96%
10000
94%
% of Total
92%
8000
90%
6000
88%
86%
4000
84%
2000
82%
0
0
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
0
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
0
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
0
1
2
3
4
5
6
7
8
80%
6/16/2008
6/17/2008
6/18/2008
6/19/2008
Hour
Sum of LT_1
30
Sum of Sec_1_2
Sum of Sec_2_5
Sum of Sec_5_10
Sum of Sec_GT_10
Sum of Count(*)
Workload Management
• Workload Management Example (Contd)
> Previous graph shows large increase in number of
queries running longer than one second at 3 AM
> Additional analysis required
– Schmon, Resusage or DBQL can identify CPU, IO or
skewing issues
– If collecting Amp Worker Task information concurrency
issues could be determined
– DBQL can show if non-tactical work is being done by the
workload that is missing its SLAs
31
Workload Management
• Workload Management Example (Contd)
> Assume Previous example due to AWT shortage
> Need to limit work, great opportunity to use DBQL concurrency report
Concurrency Chart
600
500
# of Queries
400
300
200
100
01
23
45
67
89
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
590
12
34
56
78
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
590
12
34
56
78
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
0
2
3
4
6/16/2008
Date/Hour/Minute
ARCUSER1
PROD_USER
TCRM1
Application1
32
BOUSER
TCRM2
Applicatio2
User2
CMPGN_USER
TCRM3
TCRM
User3
DBCMANAGER
ONLINE_USER
MSTR_USER
USER1
Workload Management
• Determine Success After Changes to Workload management
> In TASM rule changes should be quantifiable and DBQL is just the data to do it
> Graph below shows example of Workload before and after changes to system
throttle
35
30
25
20
15
10
5
0
Average of TotalQuerytime
Average of TotalRunTime
PreTASM
33
Average of avgdelaytime
PostTASM
Average of ifdelayedavg
Workload Management
• Use DBQL with DBC.TDWMexceptionLog
> Identify queries that are exceptions and possible reasons
> Identify time periods that might require Workload adjustments
34
CPU Histogram – Bucket Query
• Shows number of queries, by account string or
WDID, based on the amount of CPU used
>
>
>
>
>
97.9% use < 1/2 CPU second
99.6% use < 100 CPU seconds
100% use < 200 CPU seconds
10% of “Tactical” queries used > 1 CPU second
TAC used > 64% of CPU used
• Useful to help set up TASM
> Indicates what the customer thinks is tactical, isn’t always
> Giving a very high priority for queries estimated to use <
1/2 second should provide the majority of queries a better
opportunity for a good response time
> Medium priority < 100 CPU seconds
> Queries > 300 seconds are
– Candidates for being terminated as “bad queries”
– Exceptions
35
Example CPU Histogram Chart
36
Response Time Histogram
• Average response time for each account string and/or
workload
> Broken down by the amount of CPU used
> Classify by CPU buckets that are site appropriate
• Queries that use < 1 CPU second have unacceptable
response times
> Evident in other areas also
• Useful information to help set up and analyze TASM
> Giving a very high priority for Queries estimated to use
less than 1 second should provide the majority of
queries a better opportunity for a good response time
37
Example Response Time Histogram
• Red areas represent potential areas for response time improvement
> Higher priority
> Reclassification
> Other action
38
[email protected]
®
39
Bonus Queries
DBC Suspect Query Macro
REPLACE MACRO PMCPINFO.SuspQuerysDBC_Rpt ( topN INT)
AS (
/* This Macro can be run anytime to list Poor Performing Querys
*/
/* That are in the DBC DBQL tables. It assumes that only the current */
/* day is in the DBC DBQL tables. The parameter requires the number */
/* of queries you want to list. If you put in 50, you will get the worst 50. */
SELECT RANK(ImpactCPU) AS CPURank
,UserName , AcctString, ExpandAcctString, appid, clientid, DefaultDatabase,
SpoolUsage, StartTime, ParseBlockTime, QryRespTime, NumOfActiveAmps,
NumResultRows, SumCPU, SumIO, CpuSkw, IOSkw, PJI, UII, ImpactCPU
,CASE WHEN s.sqltextinfo IS NOT NULL OR s.sqltextinfo = ' ' THEN s.SQLTextInfo ||';'
ELSE a.querytext||';' END AS QryText
FROM
( SELECT a.UserName, a.ProcId, a.QueryId, a.expandacctstring, appid, clientid,
CAST(EXTRACT(HOUR FROM ((a.firststeptime - a.StartTime) HOUR(2) TO
SECOND(2) ) ) * 3600 + EXTRACT(MINUTE FROM ((a.firststeptime - a.StartTime)
HOUR(2) TO SECOND(2) ) ) * 60 + EXTRACT(SECOND FROM ((a.firststeptime a.StartTime) HOUR(2) TO SECOND(2) ) ) AS INTEGER) AS ParseBlockTime
,((lastresptime-Starttime) DAY(4) TO SECOND) AS QryRespTime
, NumofActiveAmps, NumResultRows
, a.TotalCPUTime AS SumCPU , a.TotalIOCount AS SumIO,
CASE WHEN totalcputime < 1 OR (TotalCPUTime / (HASHAMP()+1)) =0 THEN 0
ELSE HotAmp1CPU/(TotalCPUTime / (HASHAMP()+1)) END (DEC(8,2)) AS
CPUSKW,
CASE WHEN totalcputime < 1 OR (TotalIOCount / (HASHAMP()+1)) =0 THEN 0
ELSE HotAmp1IO/(TotalIOCount / (HASHAMP()+1)) END (DEC(8,2)) AS IOSKW,
CASE WHEN totalcputime < 1 OR TotalIOCount = 0 THEN 0 ELSE (a.TotalCPUTime
*1000)/a.TotalIOCount END AS PJI,
CASE WHEN totalcputime < 1 OR TotalCPUTime = 0 THEN 0 ELSE
a.TotalIOCount/(a.TotalCPUTime *1000) END AS UII,
a.HotAMP1CPU * (HASHAMP()+1) /*no. of Amps*/ AS ImpactCPU,
a.acctstring , a.Starttime , a.querytext, a.logdate, a.defaultdatabase, a.SpoolUsage
FROM PMCPINFO.DBQLogTbl_DBC a
WHERE ( CpuSkw > 1.25 /*2.0*/ OR IOSkw > 1.25 /*2.0*/ /* Start with 2.0 as a threshold for CoExistence systems */
OR UII > 6 /* UnNecessary IO Indicator - may offer an opportunity for Indexing */
OR pji > 6 /* Could use a lower threshold (3) for Older Node systems */ )
GROUP BY 1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23 ) AS a
LEFT OUTER JOIN (SELECT Procid, Queryid, SUBSTR(sqltextinfo,1,15000) AS sqltextinfo
FROM PMCPINFO.DBQLSQLTbl_DBC s
WHERE sqlrowno = 1) AS s
ON s.ProcID = a.ProcID
AND s.QueryID = a.QueryID
QUALIFY RANK(ImpactCPU) <= :TOPN
ORDER BY 1; );
41
TDWM Exceptions joined to DBQL SQL
SELECT
DL.logdate ,DL.starttime, DL.queryid, DL.Username, DL.expandacctstring,
EX.WDName, EX.WDPeriod, EX.ClassificationTime, EX.ExceptionTime,
EX.NewWDName, SumCPU, SumIO, CpuSkw, IOSkw, PJI, UII, ImpactCPU,
trim(EX.ErrorText) as ExceptionText,
Case When dl.sqltextinfo is not null or dl.sqltextinfo = ' ' Then dl.SQLTextInfo||';'
Else dl.querytext||';' End as QryText
FROM
(select cast (a.starttime as date) as logdate,
a.procid, a.queryid, a.Username, a.AcctString, a.expandacctstring,
((firstresptime-starttime) day(4) to second) as QryRespTime,
NumofActiveAmps, NumResultRows, a.totalcputime as SumCPU, a.TotalIOCount as SumIO, Case when (totalcputime / (hashamp()+1) ) =0 then 0 else HotAmp1CPU /(TotalCPUTime / (hashamp()+1) ) End (dec(7,2)) as CpuSkw, Case when
(TotalIOCount / (hashamp()+1) ) =0 then 0 else HotAmp1IO /(TotalIOCount / (hashamp()+1) ) End (dec(7,2)) as IOSkw,
Case when a.TotalIOCount = 0 then 0 else
(a.totalcputime *1000)/a.TotalIOCount end (dec(7,2)) as PJI,
Case when a.totalcputime = 0 then 0 else a.TotalIOCount /(a.totalcputime
*1000) end (dec(7,2)) as UII,
a.HotAmp1CPU * (hashamp()+1) as ImpactCPU, a.acctstring, a.starttime
,a.querytext , s.sqltextinfo, a.defaultdatabase , a.SpoolUsage
from dbc.dbqlogtbl a
left outer join (Select Procid, Queryid,cast(collecttimestamp as date) as
logdate, substr(sqltextinfo,1,15000) as sqltextinfo
from dbc.DBQLSQLTbl s
WHERE cast (collecttimestamp as date) between date and date
and sqlrowno = 1) as s
on s.ProcID = a.ProcID and s.QueryID = a.QueryID and s.logdate =
cast(a.starttime as date)) DL,
(Select
cast(Bufferts as date) as Logdate, ProcID, QueryId, BufferTS, UserName,
el.WDID, wl.WLCName as WDName, wp.Description as WDPeriod,
el.WDPeriodID, ClassificationTime, ExceptionTime, wd.WLCName as
NewWDName, NewWDID, ErrorText, ExtraInfo
from dbc.TDWMExceptionLog el
left outer join tdwm.wlcdefs wl on wdid=wl.wlcid
left outer join tdwm.wlcdefs wd on NewWDID=wd.wlcid
left outer join tdwm.WlcPeriods wp on WDPeriodID=wp.PeriodId
and WDID=wp.WLCID
where Cast(BufferTS as Date) between date and date
and extract(hour from bufferts) > 07 and wd.removedate=0
and wl.removedate=0 and wp.removedate=0) ex
where ex.Logdate=dl.Logdate and ex.Procid=DL.procid and
ex.queryid=DL.queryid and qrytext not like '%COLLECT%'
order by 1, 2,5;
42
Suspect Table Report
LOCKING ROW FOR ACCESS
SEL o.ObjectDatabaseName
,o.ObjectTableName
,SUM(NumOfQrys) AS TotNumOfQrys
,SUM(sumcpu) AS TotCPUConsumed
,SUM(CASE WHEN (cpuskw > 1.25 /*2.0*/ OR ioskw > 1.25 /*2.0*/
OR PJI > 6 OR UII >6 ) THEN sumcpu ELSE 0 END) AS TotSkdPJLSQrys
,SUM(CASE WHEN (cpuskw > 1.25 /*2.0*/ OR ioskw > 1.25 /*2.0*/
OR PJI > 6 OR UII >6 ) THEN sumcpu ELSE 0 END) AS TotSkdPJLSCPU
,SUM(CASE WHEN (cpuskw > 1.25 /*2.0*/ OR ioskw > 1.25 /*2.0*/ )
THEN sumcpu ELSE 0 END) (FLOAT) AS SkewedCPU
,SUM(CASE WHEN (cpuskw > 1.25 /*2.0*/ OR ioskw > 1.25 /*2.0*/ )
THEN 1 ELSE 0 END) (FLOAT) AS SkewedNumQrys
,SUM(CASE WHEN (PJI > 6) THEN sumcpu ELSE 0 END) (FLOAT) AS ProdJnCPU
,SUM(CASE WHEN (PJI > 6) THEN 1 ELSE 0 END) (FLOAT) AS ProdJnNumQrys
,SUM(CASE WHEN (UII > 6) THEN sumcpu ELSE 0 END) (FLOAT) AS LgScansCPU
,SUM(CASE WHEN (UII > 6) THEN 1 ELSE 0 END) (FLOAT) AS LgScansNumQrys
FROM dba_tables.DBQLObjTbl_hst o
, (SEL procid
, queryid
, logdate
,ZEROIFNULL( TotalCPUTime ) (FLOAT) AS sumcpu
,1 (FLOAT) AS NumOfQrys
,CASE WHEN totalcputime < 1 OR (TotalCPUTime / (HASHAMP()+1)) =0 THEN 0
ELSE HotAmp1CPU/(TotalCPUTime / (HASHAMP()+1)) END (DEC(8,2)) AS CPUSKW
,CASE WHEN totalcputime < 1 OR (TotalIOCount / (HASHAMP()+1)) =0 THEN 0
ELSE HotAmp1IO/(TotalIOCount / (HASHAMP()+1)) END (DEC(8,2)) AS IOSKW
,CASE WHEN totalcputime < 1 OR TotalIOCount = 0 THEN 0 ELSE (TotalCPUTime *1000)/TotalIOCount END AS PJI
,CASE WHEN totalcputime < 1 OR TotalCPUTime = 0 THEN 0 ELSE TotalIOCount/(TotalCPUTime *1000) END AS UII
FROM dba_tables.DBQLogTbl_hst
WHERE logdate BETWEEN '2007-10-22' AND '2007-10-28') AS a
WHERE a.logdate BETWEEN '2007-10-22' AND '2007-10-28'
AND o.objecttype = 'T'
AND a.procid = o.procid
AND a.queryid = o.queryid
AND a.logdate = o.logdate
GROUP BY 1,2
order by TotSkdPJLSCPU DESC
43