Practical Advice and Guidance on the use of PeopleTools


Practical Advice and Guidance on
the use of PeopleTools Performance
Monitor
David Kurtz
Go-Faster Consultancy Ltd.
[email protected]
www.go-faster.co.uk
Who Am I?
• Oracle Database Specialist
– Independent consultant
• System Performance tuning
– PeopleSoft ERP
– Oracle RDBMS
• Book
– www.psftdba.com
• UKOUG
– PeopleSoft Technology SIG Committee
Practical PPM
©2009 www.go-faster.co.uk
2
Resources
• If you can’t hear me, say so now.
• Please feel free to ask questions as we go along.
• The presentation is available from
• UKOUG Conference Website
• www.go-faster.co.uk
• blog.psftdba.com
Using PeopleSoft Performance Monitor
• Very Quick Overview
• Performance Tuning the Performance Monitor
• PPM Bugs & Fixes
• General Analyses
– Events
– Components
– Network Latency
• Performance Trace Demo
Think!
• The way you think about performance is more
important than the tools you use.
– Mostly, you need to focus on time.
• Further reading:
– The Goal – Eli Goldratt
– Optimizing Oracle Performance - Millsap & Holt
• www.method-r.com
• http://oreilly.com/catalog/9780596005276/preview.html
PPM – Key Features
• PeopleTools is used to Monitor PeopleTools
– You don’t have to buy any more software
– Monitoring system vs. monitored systems
• One monitoring system can, and should, monitor many systems
• e.g. consolidated Portal/Application
– Any version >= 8.44 can monitor any version
• When it works!
[Diagram: PPM architecture]
Monitoring System: Browser -> http / https (presentation & JavaScript) -> Web Server (PIA Servlet, PPMI Servlet, Monitor Servlet) -> Tuxedo Message -> Application Server (APPQ, PSAPPSRV, PSPPMSRV; application logic) -> SQL -> DBMS (application data & meta-data)
Monitored System: Browser (Screen Paint, JavaScript) -> http / https (presentation & JavaScript) -> Web Server (PIA Servlet; presentation logic) -> Tuxedo Message -> Application Server (APPQ, PSAPPSRV, PSMONITORSRV; application logic) -> SQL -> DBMS (application data & meta-data)
Performance Monitor Metrics
• Transactions
– User activities in PIA that cause communications with
application server
– Sampled
– Can be enabled to form a trace
• Events
– Periodic samples
– Usually initiated by monitoring agents
– e.g. CPU, Tuxedo counters
Performance Monitor Transactions
• User activity in PIA
• Performance Monitoring Unit (PMU)
– Hierarchy of transactions
• Similar to Oracle event 10046 trace
– recursive actions
Transactions
• Stored to PSPMTRANSCURR table
– As PMUs are closed, they are moved to PSPMTRANSHIST
– Later deleted or archived to PSPMTRANSARCH
• ERD downloadable from Metalink
– And you will need to get to grips with it.
ERD of Transaction
[Diagram: entity relationship diagram of the transaction tables]
Tables (aliases): PSPMTRANSHIST (C, carries the search criteria), PSPMTRANSDEFN (E), PSPMAGENT (A), PSPMSYSDEFN (B), PSPMMETRICDEFN (M1 to M7), PSPMCONTEXTDEFN (C1 to C3)
Join keys shown: PM_TRANS_DEFN_SET, PM_TRANS_DEFN_ID; PM_AGENTID; PM_SYSTEMID; PM_METRICIDn (n=1-7); PM_CONTEXTID_n (n=1-3)
Metrics
• Metric IDs specified on transaction definition PSPMTRANSDEFN
– Metric types defined on PSPMMETRICDEFN
• Type 1: Counters (including timers)
– Metric 4: Total Servlet Request time (ms)
• Type 2: Gauges
– Metric 102: %CPU Used
• Type 3: Numeric Identifier
– Metric 20: HTTP response code
• Type 4: String Identifier
– Metric 27: File Name
Transaction 101
• Reported at entry and exit of PIA servlet
– Context 1: Action=View Page
– Context 2: IP Address=10.0.0.3
– Context 3: Session ID=AN7tpzSwpZc4kt9k8 . . .
– Additional Description: http://go-faster-3:7201/psc/ps/EMPLOYEE/HRMS/c/UTILITIES.PTPERF_TEST.GBL
Transaction 101
• 4 metrics
– Metric 19: Response Size (bytes) = 17613
– Metric 20: Response Code = 200
– Metric 22: Static Content Count = 0
– Metric 23: Is this a Pagelet? = 0
Transaction Query Results
Column               Value
-------------------- --------------------------------------------------
PM_TOP_INST_ID       824633721163
PM_INSTANCE_ID       824633721163
PM_PARENT_INST_ID    0
DBNAME               HR88
PM_HOST_PORT         go-faster-3:7201:7202
PM_DOMAIN_NAME       ps
PM_AGENT_TYPE        WEBSERVER
PM_INSTANCE          -1
PM_AGENT_STRT_DTTM   16:12:07 14.06.2004
PM_MON_STRT_DTTM     16:12:09 14.06.2004
OPRID                PS
PM_PERF_TRACE        PS: 2004-06-14 16:01:11
PM_PROCESS_ID        0
PM_TRANS_DEFN_ID     101
DESCR60              Reported at entry and exit of PIA servlet
'CONTEXT1:'||C.PM_CONTEXTID_1||'-'||C1.PM_CONTEXT_LABEL||'='||C.PM_CONTEXT_VALUE …
                     Context1:3-Session ID=AN7tpzSwpZc4kt9k8QNaCcYUWWh9FaFt!1963244185!1087224685145
                     Context2:2-IP Address=10.0.0.3
                     Context3:1-Action=View Page
PM_TRANS_DURATION    1322
'METRIC1:'||M1.PM_METRICLABEL||'='||C.PM_METRIC_VALUE1 …
                     Metric1:Response Size (bytes)=17613
                     Metric2:Response Code=200
                     Metric3:Static Content Count=0
                     Metric4:Is this a Pagelet?=0
                     Metric5:=0
                     Metric6:=0
                     Metric7:=
PM_ADDTNL_DESCR      http://go-faster-3:7201/psc/ps/EMPLOYEE/HRMS/c/UTILITIES.PTPERF_TEST.GBL
Events
• Do not have an explicit context
– The collecting agent provides the context
• Stored in PSPMEVENTHIST
– Later deleted or archived to PSPMEVENTARCH
ERD of Events
[Diagram: entity relationship diagram of the event tables]
Tables (aliases): PSPMEVENTHIST (C, carries the search criteria), PSPMEVENTDEFN (E), PSPMAGENT (A), PSPMSYSDEFN (B), PSPMMETRICDEFN (M1 to M7)
Join keys shown: PM_EVENT_DEFN_SET, PM_EVENT_DEFN_ID; PM_AGENTID; PM_SYSTEMID; PM_METRICIDn (n=1-7)
Event Query Results
Column               Value
-------------------- --------------------------------------------------
DBNAME               HR88
PM_HOST_PORT         go-faster-3:7201:7202
PM_AGENT_TYPE        WEBSERVER
PM_DOMAIN_NAME       ps
PM_INSTANCE          -1
PM_AGENT_DTTM        16:12:08 14.06.2004
PM_INSTANCE_ID       824633721166
PM_EVENT_DEFN_ID     600
DESCR60              PSPING metrics fowarded from browser
'METRICn:'||Mn.PM_METRICLABEL||'='||C.PM_METRIC_VALUEn (n=1-7)
                     Metric1:Network Latency (ms)=435
                     Metric2:WebServer Latency (ms)=100
                     Metric3:AppServer Latency (ms)=561
                     Metric4:DB Latency (millisecs)=451
                     Metric5:=0
                     Metric6:=0
                     Metric7:IP Address=10.0.0.3
PM_ADDTNL_DESCR      PS;AN7tpzSwpZc4kt9k8QNaCcYUWWh9FaFt!1963244185!1087224685145
Tuning Performance Monitor
• Some of the delivered analytics do not perform
well with even moderate data volumes
– Set up the monitoring system to self monitor
– Then you can generate PPM traces on the analytics
– You will need additional indexes
• http://blog.psftdba.com/2006/04/performance-tuningperformance-monitor.html (YMMV)
Purge Process
• Data normally held in history tables
– PSPMTRANSHIST, PSPMEVENTHIST
• Clone tables
– PSPMTRANSHISTCL, PSPMEVENTHISTCL
• PPM writes to tables specified in PSPMTABLEMAP
– Archive process switches this to clone tables
select * from pspmtablemap;

PM_TRANS_TBL_NAME  PM_EVENT_TBL_NAME
------------------ -----------------
PSPMTRANSHIST      PSPMEVENTHIST
PSPMTABLEMAP
• Archive/Purge switches PPM destination
– Prevents concurrent INSERT and DELETE/Query operations
– Saves read consistency problems on Oracle
– Saves page locks on other databases
• PPM appears not to collect data during this processing
– But it is written to clone tables
– Archive process moves it to main hist tables after purge
Purge Process Can’t Keep Up
• Platform-generic query leads to a full scan
– Even if data has been deleted manually, the high water marks (HWM) on the tables are not reset
• Customisation
– Oracle-specific statement
– May need to rebuild HIST tables to reset HWM
• In which case, manually set PSPMTABLEMAP to the clone tables, rebuild the history tables, then run the archive process
Performance Fix for Purge
• Vanilla Code
&TransHistSQL.Open("SELECT …
AND %DateTimeDiff(X.PM_MON_STRT_DTTM, %CurrentDateTimeIn) >=
(PM_MAX_HIST_AGE * 24 * 60)");
• Expands to
… AND ROUND((CAST((SYSDATE) AS DATE) - CAST((X.PM_MON_STRT_DTTM) AS DATE)) * 1440, 0) >=
(PM_MAX_HIST_AGE * 24 * 60)
• My Suggestion
… AND X.PM_MON_STRT_DTTM < SYSDATE - Z.PM_MAX_HIST_AGE
• See blog entry http://blog.psftdba.com/2008/05/performance-tuningperformance-monitor.html
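
The effect of the rewritten predicate can be illustrated outside Oracle. What follows is a minimal sketch and not the delivered code: it uses Python's built-in sqlite3 module, a hypothetical table named hist with a hypothetical index on pm_mon_strt_dttm, and SQLite's julianday() standing in for the Oracle date arithmetic. It compares the query plan when the column is wrapped in an expression with the plan when the bare column is compared to a precomputed cutoff. Whether Oracle behaves the same way on a given system depends on the indexes that exist there.

```python
import sqlite3
from datetime import datetime, timedelta

FMT = "%Y-%m-%d %H:%M:%S"

def plan(conn, sql, params):
    """Return the detail text of the first EXPLAIN QUERY PLAN row."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    return rows[0][-1]

conn = sqlite3.connect(":memory:")
# Hypothetical stand-in for a PPM history table (not the real structure).
conn.execute("CREATE TABLE hist (pm_instance_id INTEGER, pm_mon_strt_dttm TEXT)")
conn.execute("CREATE INDEX hist_dttm ON hist (pm_mon_strt_dttm)")

now = datetime(2009, 1, 31, 12, 0, 0)   # fixed "current time" for repeatability
max_hist_age_days = 7                    # plays the role of PM_MAX_HIST_AGE
conn.executemany(
    "INSERT INTO hist VALUES (?, ?)",
    [(i, (now - timedelta(days=i, hours=1)).strftime(FMT)) for i in range(30)],
)

# Vanilla style: the column sits inside an expression, so every row is evaluated.
vanilla_sql = (
    "SELECT COUNT(*) FROM hist "
    "WHERE (julianday(?) - julianday(pm_mon_strt_dttm)) * 1440 >= ? * 24 * 60"
)
vanilla_params = (now.strftime(FMT), max_hist_age_days)

# Suggested style: the bare column is compared with a cutoff computed once.
cutoff = (now - timedelta(days=max_hist_age_days)).strftime(FMT)
suggested_sql = "SELECT COUNT(*) FROM hist WHERE pm_mon_strt_dttm < ?"
suggested_params = (cutoff,)

vanilla_plan = plan(conn, vanilla_sql, vanilla_params)
suggested_plan = plan(conn, suggested_sql, suggested_params)
vanilla_count = conn.execute(vanilla_sql, vanilla_params).fetchone()[0]
suggested_count = conn.execute(suggested_sql, suggested_params).fetchone()[0]

print(vanilla_plan)      # a SCAN: all rows are visited
print(suggested_plan)    # a SEARCH: index range scan on the timestamp
print(vanilla_count, suggested_count)
```

Both forms select the same rows in this sample; only the access path differs. The same reasoning is why the suggested predicate leaves X.PM_MON_STRT_DTTM unwrapped on one side of the comparison.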
How much data?
• Control sampling
– Proportion of transactions collected
• Depends upon activity on system
• On busy self-service system as little as 1 in 5000
– Event sampling frequency
• For each agent
• 5 minutes – 15 minutes
• Depends on whether you want to be able to see short-lived behaviours.
Recent problems in PT8.49
• Prior to patch 8.49.14
– Ports left open in CLOSE_WAIT
– Unix systems run out of ports
• Get ‘Application Server is down’ errors
– No limit on Windows
• But the system does progressively slow down
– POC 752524 applied to 8.49.06
• Tuxedo Connections capped at 121-127
Outstanding problems in 8.49.14
• Tuxedo Queuing not reported
– Events 300 and 301
• Tuxedo Connections not reported
– Event 300
Practical Examples
• Simple Graphs of Events
• Cumulative Frequency Distributions
• Network Latency
• Performance Trace
Simple Event Graphs
• You set an event collection interval
• All domains collect at that interval
– But each has its own clock
– Each collects at different times.
• If you have multiple web/app servers?
– Need to aggregate for system wide view
– Interpolate between points?
• (PL/SQL package: see notes for this slide)
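
The PL/SQL package mentioned in the notes is not reproduced here. As an illustration of the idea only, the following Python sketch linearly interpolates each domain's samples onto a common time grid and then sums across domains to give a system-wide figure. The domain names, sample times (seconds) and values are hypothetical, and linear interpolation is an assumption; other ways of aligning the samples are possible.

```python
from bisect import bisect_right

def interpolate(samples, t):
    """Linearly interpolate a value at time t from sorted (time, value) samples.

    Returns None when t lies outside the sampled range.
    """
    if not samples:
        return None
    times = [s[0] for s in samples]
    if t < times[0] or t > times[-1]:
        return None
    i = bisect_right(times, t)
    if i == len(times):            # t coincides with the last sample
        return samples[-1][1]
    t0, v0 = samples[i - 1]
    t1, v1 = samples[i]
    return v0 + (v1 - v0) * (t - t0) / (t1 - t0)

def system_total(per_domain, grid):
    """Sum interpolated values across all domains at each grid time.

    Grid times that are not covered by every domain are skipped.
    """
    totals = []
    for t in grid:
        values = [interpolate(samples, t) for samples in per_domain.values()]
        if all(v is not None for v in values):
            totals.append((t, sum(values)))
    return totals

# Hypothetical data: two domains sampling every 300 seconds, 100 seconds out of step.
per_domain = {
    "web1": [(0, 10.0), (300, 40.0), (600, 10.0)],
    "web2": [(100, 5.0), (400, 35.0), (700, 5.0)],
}
totals = system_total(per_domain, grid=range(0, 701, 100))
for t, total in totals:
    print(t, total)
```

Grid times 0 and 700 are dropped because only one domain covers them, which avoids understating the system total at the edges.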
Simple Event Graphs
• Raw data from PSPMEVENTHIST or PSPMEVENTARCH
– Extract into working storage tables
• Possibly two levels
– Aggregating as you go
JVM% Free
[Chart: y axis %JVM Used, 0 to 100; x axis weekly from Sat 11.10.08 to Sat 8.11.08; series CS_PROD C]
JVM Sessions
[Chart: y axis JVM Sessions, 0 to 2500; x axis weekly from Sat 11.10.08 to Sat 8.11.08; series CS_PROD C]
JVM Busy Threads
[Chart: y axis JVM Busy Threads, 0 to 80; x axis weekly from Sat 11.10.08 to Sat 8.11.08; series CS_PROD C]
Application Server Requests
[Chart: CS_PROD C PSAPPSRV; y axis Requests / Sample Period, 0 to 4000; x axis weekly from Sat 11.10.08 0:00 to Sat 8.11.08 0:00]
Jolt Message Sizes
• Transaction 115
– Size of Jolt Messages into and out of Tuxedo
– Message written to disk
• If the message is larger than the specified size
• or would cause the queue to become ¾ full
– Default queue size is 64Kb
• Kernel parameter (Windows too)
– Most systems need 128-256Kb
Cumulative Frequency – ntile()
SELECT <key>, pctile, MIN(<value>)
FROM (
  SELECT <key>, <value>,
         NTILE(100)
         OVER (ORDER BY <value>) AS pctile
  FROM <table>
)
GROUP BY <key>, pctile
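
For readers who want to see what the query computes, here is a small Python sketch of the same calculation for a single key: values are sorted, dealt into n buckets the way NTILE does (the earlier buckets take one extra row when the count does not divide evenly), and the minimum of each bucket is reported. The function name and the sample data are hypothetical.

```python
def ntile_min(values, n=100):
    """Emulate NTILE(n) OVER (ORDER BY value), then MIN(value) per bucket.

    Returns a list of (bucket_number, minimum_value) pairs. When there are
    fewer values than buckets, only the first len(values) buckets appear.
    """
    ordered = sorted(values)
    base, extra = divmod(len(ordered), n)
    result = []
    start = 0
    for bucket in range(1, n + 1):
        # The first `extra` buckets hold one more row than the rest.
        size = base + (1 if bucket <= extra else 0)
        if size == 0:
            break
        result.append((bucket, ordered[start]))
        start += size
    return result

# Hypothetical durations, deliberately unsorted.
quartiles = ntile_min([10, 9, 8, 7, 6, 5, 4, 3, 2, 1], n=4)
print(quartiles)
```

Plotting the bucket number against the minimum value gives the cumulative frequency curve: the value at bucket p is the smallest value in the p-th percentile band.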
[Chart: y axis Jolt Message Size (bytes), logarithmic, 1,000 to 10,000,000; x axis %Tile, 0 to 100; series JOLT_BYTES_SEND, JOLT_BYTES_RCVD]
Anatomy of a Transaction
• Simple PIA Transaction
• 101 – PIA entry/exit
– 115 – Jolt Message
• 400 – Tuxedo Service
– 401 – ICPanel
– 410 – ICScript
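
The hierarchy above can be reassembled from the parent links on the transaction rows. The sketch below is illustrative only: it uses the column names PM_INSTANCE_ID, PM_PARENT_INST_ID, PM_TRANS_DEFN_ID and PM_TRANS_DURATION from the query results earlier, treats a parent ID of 0 as "no parent" (an assumption based on the single sample row shown), and runs on invented instance IDs and durations.

```python
from collections import defaultdict

def group_by_parent(rows):
    """Index PMU rows by their parent instance ID."""
    children = defaultdict(list)
    for row in rows:
        children[row["PM_PARENT_INST_ID"]].append(row)
    return children

def render(children, parent_id=0, depth=0):
    """Return indented lines for the PMU tree below parent_id."""
    lines = []
    ordered = sorted(children.get(parent_id, []), key=lambda r: r["PM_INSTANCE_ID"])
    for row in ordered:
        lines.append(
            "%s%d (%d ms)"
            % ("  " * depth, row["PM_TRANS_DEFN_ID"], row["PM_TRANS_DURATION"])
        )
        lines.extend(render(children, row["PM_INSTANCE_ID"], depth + 1))
    return lines

# Hypothetical rows for one simple PIA transaction.
rows = [
    {"PM_INSTANCE_ID": 1, "PM_PARENT_INST_ID": 0, "PM_TRANS_DEFN_ID": 101, "PM_TRANS_DURATION": 1322},
    {"PM_INSTANCE_ID": 2, "PM_PARENT_INST_ID": 1, "PM_TRANS_DEFN_ID": 115, "PM_TRANS_DURATION": 1200},
    {"PM_INSTANCE_ID": 3, "PM_PARENT_INST_ID": 2, "PM_TRANS_DEFN_ID": 400, "PM_TRANS_DURATION": 1100},
    {"PM_INSTANCE_ID": 4, "PM_PARENT_INST_ID": 3, "PM_TRANS_DEFN_ID": 401, "PM_TRANS_DURATION": 1000},
]
tree_lines = render(group_by_parent(rows))
print("\n".join(tree_lines))
```

The difference between a parent's duration and the sum of its children's durations is the time spent in that tier itself, which is the quantity the duration charts later in the deck compare.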
Anatomy of a Transaction
• Portal PIA Transaction
– PMU is consolidated across databases.
• 106 – Portlet
– 115 – Jolt Message
• 400 – Tuxedo Service
– 401 – ICPanel
– 410 – ICScript
Transaction Duration Distribution
[Chart: y axis Duration (ms), logarithmic, 0.001 to 100; x axis %Tile, 0 to 100; series DUR401, DUR400, DUR115, DUR101, QDUR]
Individual Components
• Now try the same analysis for specific components
– Determine the top-n components by cumulative execution time
• PPM Analytic uses only transaction 401
– Doesn’t take web server or queuing time into account.
• I prefer to use transactions 101 and 106
– But you have to join the transactions.
– Component identification comes from transaction 401 contexts
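
Ranking by cumulative time rather than by average matters because a fast component that runs very often can cost more in total than a slow one that runs rarely. The sketch below shows only the ranking step. It assumes the join has already been done, so each input pair is a component name taken from the 401 contexts together with the duration of the matching 101 or 106 transaction; the durations are invented.

```python
from collections import defaultdict

def top_components(rows, n=3):
    """Rank components by total duration, largest first.

    rows is an iterable of (component_name, duration_ms) pairs.
    """
    totals = defaultdict(float)
    for component, duration_ms in rows:
        totals[component] += duration_ms
    return sorted(totals.items(), key=lambda kv: kv[1], reverse=True)[:n]

# Hypothetical data: one slow call, two medium calls, thirty fast calls.
rows = (
    [("UC_CG_INQUIRY.GBL", 900.0)]
    + [("UC_SAQ_PHOTO.GBL", 400.0)] * 2
    + [("UC_ENROL_QCKAPPRVL.GBL", 50.0)] * 30
)
top2 = top_components(rows, n=2)
print(top2)
```

Here the component with the shortest individual response time comes out on top, which is exactly the case an average-based ranking would miss.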
UC_SAQ_PHOTO.GBL, UC_SAQ_PHOTOGRAPH2, #ICOK
Component that uploads an attachment
[Chart: y axis Duration (ms), logarithmic, 0.001 to 1000; x axis %Tile, 0 to 100; series DUR401, DUR400, DUR115, DUR101, QDUR. Annotation: this area is time spent in web server JVM]
UC_SAQ_PHOTO.GBL, UC_SAQ_PHOTOGRAPH2, #ICOK
Component that uploads an attachment
[Chart: y axis Sizes (bytes), logarithmic, 1,000 to 10,000,000; x axis %Tile, 0 to 100; series JOLT_BYTES_SEND, JOLT_BYTES_RCVD, COMPONENT_BUFFER. Annotation: Jolt Message to App Server increases – large attachments]
Typical Component
UC_ENROL_QCKAPPRVL.GBL, UC_ENROL_QCKAPPRVL, Launch Page/Search Page
[Chart: y axis Duration (ms), logarithmic, 0.01 to 1000; x axis %Tile, 0 to 100; series DUR401, DUR400, DUR115, DUR101, QDUR. Annotations: Interesting jump in response time; Time spent in ICPanel Service – possibly database or PeopleCode]
UC_CG_INQUIRY.GBL, UC_CG_INQUIRY, Click PeopleCode Command Button for Field UC_CG_WRK.REFRESH_BTN
Custom PeopleCode Button
[Chart: y axis Duration (ms), logarithmic, 0.001 to 1000; x axis %Tile, 0 to 100; series DUR401, DUR400, DUR115, DUR101, QDUR. Annotations: Significant jump in response time; Time spent in ICPanel Service – possibly database or PeopleCode]
Component that uploads attachment
UC_SAQ_PHOTO.GBL UC_SAQ_PHOTOGRAPH2 #ICCancel
[Chart: y axis Duration (ms), logarithmic, 0.001 to 1000; x axis %Tile, 0 to 100; series DUR401, DUR400, DUR115, DUR101, QDUR. Annotation: Time spent in web server JVM]
Network Latency
• Most transactions are sampled
• But three transactions are always recorded
• See PSPMTRANSDEFN.PM_SAMPLING_ENABLE
– 108: User Session logout, expiration, timeout, or error
– 109: User Session began (user logged in)
– 116: Redirected round trip time (network latency)
Transaction 116
• Network round trip from webserver to browser and back again
– Includes network transmission time
– Browser response time
– Client IP address
• Although that could be load balancer or NAT
– Operator ID
• LOCATION from HR database?
Analysis by Client IP
• IP addresses are of Routers not Clients
Mon Jun 30                                                         page 1
                      Login Durations by Client IP

IP Address          MIN    AVG    MED        MAX          VAR NUM_EVENTS
---------------- ------ ------ ------ ---------- ------------ ----------
174.149.127.223   0.000  0.128  0.094      0.390         18.3         12
174.149.1.200     0.047  0.356  0.321      3.046         57.8        290
174.149.126.149   0.015  0.604  0.156     14.077      1,835.0       1288
192.168.1.171     0.032  0.692  0.157      6.392      2,064.6       1226
174.149.126.147   0.031  0.712  0.156     15.062      2,322.9       1302
192.168.1.172     0.031  0.742  0.172     61.321      5,218.6       1234
192.168.1.170     0.047  0.748  0.172     12.989      2,342.9       1274
174.149.126.148   0.031  0.762  0.141     44.918      3,963.8       1300
193.113.139.184   0.422  5.380  5.423     20.951        589.8        803
HR location of OPRID
• First three lines are different network topology
• Operator may not actually be at stated location
* especially home workers
Mon Jun 30                                                         page 1
               Login Durations by Location (min 10 logins)

LOCATION   CITY             Cty    MIN    AVG    MED     MAX        VAR EVENTS
---------- ---------------- --- ------ ------ ------ ------- ---------- ------
EXT-00-IN  Bangalore        IND  0.109  5.488  5.468  61.321    5,099.0    688
EXT-00-CZ  Brno             CZE  0.109  4.991  5.218  20.951    2,125.1    555
EXT-BK-UK  Buckingham       GBR  0.062  0.862  0.281   6.578    3,100.8    224

HOME       Home Worker*     GBR  0.031  0.299  0.188   9.109      377.7    489

INT-L3     Liverpool        GBR  0.062  0.211  0.110   2.406      140.6     40
INT-BK     Buckingham       GBR  0.032  0.233  0.078   1.266      114.4     30
INT-WS     Walsall          GBR  0.031  0.086  0.063   0.359        3.5     46
After we fixed the problem…
[Chart: Login Redirect Duration; y axis Duration (s), 0 to 1; x axis Time, daily from Fri 27.6.08 00:00 to Fri 4.7.08 00:00]
Analytics: Top Components
Performance Trace
• Generates a group of PMUs for activity in a user session
– Choose an ID to identify records later
Performance Trace
Performance Monitoring Unit
• Look at PMU Tree
• Demonstration