Storage Operations Management Work Plan

Download Report

Transcript Storage Operations Management Work Plan

Oaktable











Jonathan Lewis and ORACLE_TRACE
Oracle_Trace crashes my Database
I start the SGA attach by searching every offset
Anjo Kolk says James Morle wrote a program using
x$ksmmem
I show James my first draft using x$ksmmem
James is baffled by why I'm hard coding offsets
James says the offsets are in some X$ table
I search, turn up a mail by Jonathan Lewison x$kqfco
Goldmine – all the offsets
Thanks Mogens Nogard!
Thanks to TomKyte's Decimal to Hex
http://oraperf.sourceforge.net
Direct Oracle SGA Memory
Access
Reading data directly from
Oracle’s shared memory segment
using C code
Thursday, July 16, 2015
SGA on UNIX
Snnn
SMON
PMON
Dnnn
SGA
Shared Pool
Database Buffer Cache
Pnnn
Redo Log
Buffer
CKPT
DBWR
ARCH
LGWR
oracle
sqlplus
Machine
Memory
SGA on NT
Snnn
Dnnn
Pnnn
CKPT
SMON
Shared Pool
Database Buffer Cache
Redo Log
Buffer
Machine
Memory
PMON
DBWR
LGWR
ARCH
Process Space
oracle
sqlplus
What is the SGA
 Memory Cache
 Often Used Data
 Rapid Access
 Shareable
 Concurrently Access
SGA 4 main regions
 Fixed information
–
–
–
–
Users info
Database statistics
X$dual
etc
 Data block cache
 SQL cache ( library cache/shared pool)
 Redo log buffer
How is the SGA info Used?
 Automatically
–
–
–
–
data blocks cached
Log buffer
Sql cache
Updates of system and user statistics
 User Queries
–
–
–
–
User info v$session
System info v$parameter
Performance statistics v$sysstat, v$latch, v$system_event
Buffer cache headers, x$bh
Why Direct Access with C?
 Reading Hidden Information
–
–
–
Sort info on version 7
OPS locking info version 8
Contents of data blocks (only the headers or visible in X$)
 Access while Database is Hung
 High Speed Access
–
–
–
Sampling User Waits, catch ephemeral data
Scan LRU chain in X$bh
Statistically approximate statistics
 SQL statistics per user
 Low overhead
Database Slow or Hung
Often happens at the largest sites when cutting
edge support is expected.





Shared Pool errors ORA 4031
Archiver or Log file Switch Hangs
Hang Bugs
Library Cache Latch contention
ORA-00379: no free buffers available in buffer
pool DEFAULT
Statistical Sampling
By Rapidly Sampling SQL statistics
and the users who have the statistics
open, one can see how much work a
particular user does with a particular
SQL statement
Low Overhead
 Marketing Appeal
 Clients are sensitive about their production
databases
 Heisenberg uncertainty affect – less overhead
less affect monitoring has on performance
which we are monitoring
SGA made visible through x$tables





Most of the SGA is not visible
X$KSMMEM Exception, Raw Dump of SGA
Information Externalized through X$ tables
Useful or Necessary information is Externalized
Externalized publicly through V$ Tables
Machine Memory
0x80000000
SGA
SGA
Buffer Cache
Graphic SGA
SGA
0x80000000
Fixed Area
Buffer Cache
Shared Pool
Log Buffer
Fixed Area
SGA
0x80000000
0x85251EF4
X$KSUSECST- user waits
X$KSUSECST
170 Records
2328 bytes
0x85251EF4
Row 1
Row 2
Row 3 …
X$KSUSECST Record
One Record in X$KSUSECST
1276
2328 bytes
X$KSUSECST Fields
1276
Seq #
1278
Event #
1280
p1
1284
1288
p2
p3
Externalization of C structs: X$
tables
If Structure foo was externalized in a X$
SQL> describe x$foo
Column Name
-----------------------------ADDR
INDX
ID
B
Type
-------RAW(8)
NUMBER
NUMBER
NUMBER
SGA is One Large C Struct
struct foo
{
int id;
int A;
int B;
int C;
};
struct foo foo[N];
Struct C code
#include <stdio.h>
#include <fcntl.h>
#define N 20
/* structure definition: */
struct foo
{
int id;
int a;
int b;
int c;
};
/* end structure definition */
Struct Record
main(){
struct foo foo[20];
int fptr;
/* zero out memory of struct */
memset(foo,0,sizeof(foo));
foo[0].id=1;
foo[0].a=12;
foo[0].b=13;
foo[0].c=13;
/* row 0 */
Struct Write to File
foo[1].id=2;
foo[1].a=22;
foo[1].b=23;
foo[1].c=24;
/* row 1 */
/* write to file, simulate SGA */
if ((fptr = open("foo.out",O_WRONLY | O_CREAT,0777)) < 0 )
return -1;
write(fptr,foo,sizeof(foo));
return 0;
}
Simulate SGA with a File
write(fp,foo,sizeof(foo));
Simulate SGA with a File
Row 1
Row 0
ID
0
0
0
0
A
1
4
6
4
4
B
3
8
2
8
1
Memory address
0
C
4
1
8
C
2
1
4
ID
6
8
1
2
4
0
1
1
6
0
2
2
0
4
Increasing
0
4
A
…
bits
bytes
hex bytes
oct bytes
Struct File Contents
$ ./foo
$ ls -l foo.out
-rw-r--r-- joe dba 320 Feb 10 19:41 foo.out
int = 32 bits
Int = 4 bytes
20 entries * 4 int * 4 bytes/int = 320 bytes
od – octal dump
$ od -l foo.out
0000000
1
0000020
2
0000040
0
*
0000500
12
22
0
13
23
0
13
24
0
Struct File Contents
Address is in Hex
Column 2 is the ID
Column 3 is field A
Column 4 is field B
Column 5 is field C
X$ tables ?
 Ok, x$foo =~ foo[20]
 How do I get a list of x$ tables?
 Where is each X$ located?
 V$Fixed_Tables
V$Fixed_Table – list of X$ tables
SQL> desc v$fixed_table;
Name
Null? Type
----------------------------------------- -------- ----------------NAME
VARCHAR2(30)
OBJECT_ID
NUMBER
TYPE
VARCHAR2(5)
TABLE_NUM
NUMBER
Graphic: X$ Addresses
SGA
0x80000000
0x8????????
X$????
V$Fixed_Table
spool addr.sql
select
'select 'addr, ||''''||name||''''||' from ' || name ||' where
rownum < 2;'
from
v$fixed_table
where
name like 'X%'
/
spool off
@addr.sql
Example: finding the address
select
a.addr ,
'X$KSUSE'
from
X$KSUSE
where
rownum < 2 ;
X$ layout
6802B244 X$KSLEMAP
6802B7EC X$KSLEI
6820B758 X$KSURU
6820B758 X$KSUSE
- v$session
6820B758 X$KSUSECST – v$session_wait
6820B758 X$KSUSESTA – v$session_stat
6820B758 X$KSUSIO
6826FBD0 X$KSMDD
6831EA0C X$KSRCHDL
What's in these X$ views
 V$ views are documented
 V$ views are based often on X$ tables
 The map from v$ to X$ is described in :
V$Fixed_View_Definition
V$Fixed_View_Definition
SQL> desc V$Fixed_View_Definition
Name
Type
----------------------------------- -------------VIEW_NAME
VARCHAR2(30)
VIEW_DEFINITION
VARCHAR2(4000)
Definition of V$Session_Wait
SQL> select
VIEW_DEFINITION
from
V$FIXED_VIEW_DEFINITION
where
view_name='GV$SESSION_WAIT';
VIEW_DEFINITION
----------------------------------------------------------------------select s.inst_id,s.indx,s.ksussseq,e.kslednam, e.ksledp1,s.ksussp1,s.ksussp1r,e.
ksledp2, s.ksussp2,s.ksussp2r,e.ksledp3,s.ksussp3,s.ksussp3r, decode(s.ksusstim,
0,0,-1,-1,-2,-2, decode(round(s.ksusstim/10000),0,-1,round(s.ksusstim/10000)))
, s.ksusewtm, decode(s.ksusstim, 0, 'WAITING', -2, 'WAITED UNKNOWN TIME', -1, '
WAITED SHORT TIME', 'WAITED KNOWN TIME') from x$ksusecst s, x$ksled e where bit
and(s.ksspaflg,1)!=0 and bitand(s.ksuseflg,1)!=0 and s.ksussseq!=0 and s.ksussop
c=e.indx
The Fields in X$ tables
 OK, I've picked an X$
 I've got the starting address
 Now, how do I get the fields?
X$KQFTA
 Kernel Query Fixed_view Table
 INDX
use to find column information
 KQFTANAM X$ table names
X$KQFCO





Kernel Query Fixed_view Column
KQFCOTAB Join with X$KQFTA.INDX
KQFCONAM Column name
KQFCOOFF Offset from beginning of the row
KQFCOSIZ
Columns size in bytes
X$KSUSECST Fields
1276
Seq #
2
1278
1280
Event #
p1
2
4
1284
1288
p2
p3
4
4
Address
BYTES
SGA Contents in Resume
In resume:
Oracle takes the C structure defining the
SGA and maps it onto a shared memory
segment
0x800000
0 SGA
Fixed
Memory address
Increasing
Buffer
Redo
Library
Cache
Buffer
Cache
Oracle provides access to some of the SGA
contents via X$ tables
**** Procedure *****
1.
2.
3.
4.
5.
6.
7.
8.
Choose a V$ view
Find base X$ Tables for v$ view
Map X$ fields to V$ fields
Get address of X$ table in SGA
Get the size of each record in X$ table
Get the number of records in X$ table
Get offsets for each desired field in X$ table
Get the base address of SGA
1) V$SESSION_WAIT Example






List of all users waiting
Detailed information on the waits
Data is ephemeral
Useful in Bottleneck diagnostics
High sampling rate candidate
Event 10046 captures this info
Good table for SGA sampling
V$SESSION_WAIT Description
SQL> desc v$session_wait
Name
----------------------------------------SID
SEQ#
EVENT
P1TEXT
P1
P1RAW
P2TEXT
P2
P2RAW
P3TEXT
P3
P3RAW
WAIT_TIME
SECONDS_IN_WAIT
STATE
)
Type
-------------------------,NUMBER
,NUMBER
,VARCHAR2(64)
,VARCHAR2(64)
,NUMBER
,RAW(4)
,VARCHAR2(64)
,NUMBER
,RAW(4)
,VARCHAR2(64)
,NUMBER
,RAW(4)
,NUMBER
,NUMBER
,VARCHAR2(19)
V$SESSION_WAIT Short
SQL> desc v$session_wait
Name
Type
---------------------------------------SID
NUMBER
SEQ#
NUMBER
EVENT
VARCHAR2(64)
P1
NUMBER
P2
NUMBER
P3
NUMBER)
V$FIXED_VIEW_DEFINITION
Gives mappings of V$ views to X$ tables
SQL> select
VIEW_DEFINITION
from
V$FIXED_VIEW_DEFINITION
where
view_name='V$SESSION_WAIT‘;
V$SESSION_WAIT View Definition
VIEW_DEFINITION
--------------------------------------------------------------------select
s.inst_id,
s.indx,
s.ksussseq,
e.kslednam,
e.ksledp1,
s.ksussp1,
s.ksussp1r,
e.ksledp2,
s.ksussp2,
s.ksussp2r,
e.ksledp3,
s.ksussp3,
s.ksussp3r,
round(s.ksusstim / 10000),
s.ksusewtm,
decode(s.ksusstim, 0, 'WAITING', -2, 'WAITED UNKNOWN TIME',
-1, 'WAITED SHORT TIME', 'WAITED KNOWN TIME')
from
x$ksusecst s,
x$ksled e
where
bitand(s.ksspaflg,1)!=0 and
bitand(s.ksuseflg,1)!=0 and
s.ksussseq!=0 and
s.ksussopc=e.indx
View Definition Short
VIEW_DEFINITION
--------------------------------------------------------------------select
s.indx,
s.ksussseq,
e.kslednam,
s.ksussp1,
s.ksussp2,
s.ksussp3
from
x$ksusecst s,
x$ksled e
where
s.ksussopc=e.indx
2) V$SESSION_WAIT Based on X$KSUSECT
VIEW_DEFINITION
---------------------------------------------------select
indx,
ksussseq,
ksussopc,
ksussp1,
ksussp2,
ksussp3
from
x$ksusecst
Equivalent SQL Statements
select
select
indx,
ksussseq,
ksussopc,
ksussp1,
ksussp2,
ksussp3
from
sid
seq#
event
p1
p2
p3
from
x$ksusecst
v$session_wait )
Note: x$ksusecst. Ksussopc is the event #
x$ksled.kslednam is a list of the event names where
x$ksled.indx = x$ksusecst. ksussopc
3) V$ to X$ Field Mapping
4) Get base SGA address for X$ table
Find the location of X$KSUSECST in the SGA
SQL> select addr from x$ksusecst where rownum < 2
ADDR
-------85251EF4
5) Find the Size of Each Record
SQL> select
((to_dec(e.addr)-to_dec(s.addr))) row_size
from
(select addr from x$ksusecst where rownum < 2) s,
(select max(addr) addr from x$ksusecst where rownum < 3) e ;
ROW_SIZE
---------------2328
6) Find the Number of Records in the
structure
SQL> select count(*) from x$ksusecst ;
COUNT(*)
-------------170
Get Offsets for Each Desired Field in X$ table
SQL> select c.kqfconam field_name,
c.kqfcooff offset,
c.kqfcosiz sz
from
x$kqfco c,
x$kqfta t
where
t.indx = c.kqfcotab and
t.kqftanam='X$KSUSECST'
order by
offset
;
X$KQFTA - X$ Tables Names
List of X$ tables


INDX
use to find column information
KQFTANAM X$ table names
To get Column information join with X$KQFCO
X$KQFTA.INDX = X$KQFCO.KQFCOTAB
X$KQFCO – X$ Table Columns
List of all the columns in X$ Tables




KQFCOTAB
KQFCONAM
KQFCOOFF
KQFCOSIZ
Join with X$KQFTA.INDX
Column name
Offset from beginning of the row
Columns size in bytes
Field Offsets
FIELD_NAME
-----------------------------ADDR
INDX
KSUSEWTM
INST_ID
KSSPAFLG
KSUSSSEQ
KSUSSOPC
KSUSSP1
KSUSSP1R
KSUSSP2
KSUSSP2R
KSUSSP3
KSUSSP3R
KSUSSTIM
KSUSENUM
KSUSEFLG
OFFSET
---------0
0
0
0
1
1276
1278
1280
1280
1284
1284
1288
1288
1292
1300
1308
SZ
---------4
4
4
4
1
2
2
4
4
4
4
4
4
4
2
4
What are all the fields at OFFSET 0?
These are all calculated values and not stored
explicitly in the SGA.
ADDR
INDX
INST_ID
KSUSEWTM
memory address
record number, like rownum
database instance ID
calculated field
Unexposed Fields
What happens between OFFSET 1 and 1276?
• Unexposed Fields
• Sometimes exposed elsewhere, in our case
•
•
V$SESSION
V$SESSTAT
Fields at Same Address
Why do some fields start at the same address?
KSUSSP1
KSUSSP1R
Are at the same address
Equivalent of
V$SESSION_WAIT.P1
V$SESSION_WAIT.P1RAW
These are the same data, just exposed as
Hex
Decimal
7) Offsets of Fields
8) Get Base SGA Address
SQL> select addr from x$ksmmem where
rownum < 2
ADDR
-------------80000000
Results X$KSUSECST
Machine Memory
0x80000000
SGA
SGA
Fixed Area
SGA
0x80000000
0x85251EF4
X$KSUSECST- user waits
X$KSUSECST
170 Records
2328 bytes
0x85251EF4
Row 1
Row 2
Row 3 …
X$KSUSECST Record
One Record in X$KSUSECST
1276
2328 bytes
X$KSUSECST Fields
1276
Seq #
1278
Event #
1280
p1
1284
1288
p2
p3
Attaching to the SGA
 UNIX System Call “shmat”
To attach to shared memory Unix as a system
call
void *shmat(
int shmid,
const void *shmaddr,
int shmflg
);
ID and Address arguments to “shmat”
The arguments are:



shmid – shared memory identifier specified
shmaddr – starting address of the shared memory
shmflg - flags
The argument shmflg can be set to SHM_RDONLY . To
avoid any possible data corruption the SGA should only
be attached read only.
The arguments shmid and shmaddr need to be set to
Oracle’s SGA id and address.
Finding Oracle SGA’s ID and Address
Use ORADEBUG to find the SGA id
SQL> oradebug setmypid
Statement processed.
SQL> oradebug ipc
Information written to trace file.
Finding Trace File
SQL> show parameters user_dump
NAME
VALUE
----------------------- -------------------------------user_dump_dest /u02/app/oracle/admin/V901/udump
SQL> exit
$ cd /u02/app/oracle/admin/V901/udump
$ ls -ltr | tail -1
-rw-r----- usupport dba Aug 24 18:01 v901_ora_23179.trc
Finding SHMID in Trace File
$ vi v901_ora_23179.trc
…
Total size 004456c Minimum Subarea size 00000000
Area Subarea Shmid
Stable Addr
Actual Addr
0
0
34401 0080000000 0080000000
…
Attaching to the SGA
Shmid
Shmaddr
Shmflg
34401
0x80000000
SHM_RDONLY
The SGA attach call in C would be:
Shmat(34401, 0x80000000, SHM_RDONLY);
This call needs to be executed as a UNIX user who has
read permission to the Oracle SGA
C Code Headers
#include <stdio.h>
#include <sys/ipc.h>
#include <sys/shm.h>
#include <errno.h>
#include "event.h"
event.h is for translating the event #s into event
names
Events.h
Spool events.h
select 'char event[][100]={' from dual;
select '"'||name||'",' from v$event_name;
select ' "" };' from dual;
spool off
Define Base Addresses and Sizes
/* SGA BASE ADDRESS */
#define SGA_BASE
0x80000000
/* START ADDR of KSUSECST(V$SESSION_WAIT) */
#define KSUSECST_ADDR
0x85251EF4
/* NUMBER of ROWS/RECORDS in KSUSECST */
#define SESSIONS
150
/* SIZE in BYTES of a ROW in KSUSECST */
#define RECORD_SZ
2328
Define Offsets to Fields
#define KSUSSSEQ 1276
#define KSUSSOPC 1278
#define KSUSSP1R 1280
#define KSUSSP2R 1284
#define KSUSSP3R 1288
/* sequence # */
/* event # */
/* p1 */
/* p2 */
/* p3 */
Set Up Variables
main(argc, argv)
int argc;
char **argv;
{
void *addr;
int shmid;
int shmaddr;
void *current_addr;
long p1r, p2r, p3r;
unsigned int i, seq, tim, flg, evn;
Attach to SGA
/* ATTACH TO SGA */
shmid=atoi(argv[1]);
shmaddr=SGA_BASE;
if (
(void *)shmat(
shmid,
(void *)shmaddr,
SHM_RDONLY)
== (void *)-1 ) {
printf("shmat: error attatching to SGA\n");
exit();
}
Set Up Sampling Loop
/* LOOP OVER ALL SESSIONS until CANCEL */
while (1) {
/* set current address to beginning of Table */
current_addr=(void *)KSUSECST_ADDR;
sleep(1);
printf("^[[H ^[[J"); /* clear screen */
/* print page heading */
printf("%4s %8s %-20.20s %10s %10s %10s \n",
"sid", "seq", "wait","p1","p2","p3");
Loop over all Sessions
for ( i=0; i < SESSIONS ; i++ ) {
seq=*(unsigned short *)((int)current_addr+KSUSSSEQ);
evn=*(short *)
((int)current_addr+KSUSSOPC);
p1r=*(long *)
((int)current_addr+KSUSSP1R);
p2r=*(long *)
((int)current_addr+KSUSSP2R);
p3r=*(long *)
((int)current_addr+KSUSSP3R);
if ( evn != 0 ) {
printf("%4d %8u %-20.20s %10X %10X %10X \n",
i, seq, event[evn] ,p1r, p2r,p3r
);
}
current_addr=(void *)((int)current_addr+RECORD_SZ);
}
}
}
Output
$ sga_read_session_wait 34401
sid
seq
wait
p1
0
40582 pmon timer
12C
1
40452 rdbms ipc message
12C
2
43248 rdbms ipc message
12C
3
24706 rdbms ipc message
12C
4
736
smon timer
12C
5
88
rdbms ipc message
2BF20
8
178
SQL*Net message from 6265710
p2
0
0
0
0
0
0
1
p3
0
0
0
0
0
0
0
Pitfalls






Byte Swapping
32 bit vs 64 bit
Multiple Shared Memory Segments
Segmented Memory
Addresses are "unsigned int"
Misaligned Access
Little Endian vs Big Endian
 Is low byte values first or high byte values first ?
 a byte is 8 bits
–
00000000-11111111 bits,0 – 31 dec, 0x0 - 0xFF hex
 Big Endian is "normal" , highest bit first
 In ascii, the word "byte" is stored as
–
b = 62, y = 79, t = 74, e = 65
 echo 'byte' | od -x
–
–
b y t e
62 79 74 65
 Little Endian, ie byte swapped (Linux, OSF, Sequent, ?
)
–
–
y b e t
79 62 65 74
Byte Swap Example
Short = 2 bytes ie 16 bits
Goal, get the flag in the "second" byte
#ifdef __linux
uflg=*(short *)((int)sga_address)>>8;
#else
uflg=*(short *)((int)sga_address);
#endif
Byte Swap
Big Endian:
00 00 00 00 00 00 00 01
Little Endian
00 00 00 01 00 00 00 00
Solution, push the value over 8 places, to the
right,
ie >>8
64 bit vs 32 bit
 SQL> desc x$ksmmem
Name
------------------------------------ADDR
INDX
INST_ID
KSMMMVAL
-> 32 bit
Raw(8) -> 64 bit
Type
--------RAW(4)
NUMBER
NUMBER
RAW(4)
Segmented Memory
x$ksuse – can be dis-contiguous
Work around:
select 'int users[]={' from dual;
select '0x'||addr||',' from x$ksuse;
select '0x0};' from dual;
Misaligned Access
 Some platforms seg fault when addressing
misaligned bytes, need to read in even bytes or
units of 4 bytes depending on platform
1
2
3
4
5
6
7
8
x$ksusecst Record: What's Missing?
One Record in X$KSUSECST
???
???
1276
2328 bytes
Select Addr from X$? where Rownum<
2;
6802B244 X$KSLEMAP
6802B7EC X$KSLEI
6820B758 X$KSURU
6820B758 X$KSUSE – v$session
6820B758 X$KSUSECST – v$session_wait
6820B758 X$KSUSESTA – v$sesstat
6820B758 X$KSUSIO
6826FBD0 X$KSMDD
6831EA0C X$KSRCHDL
x$ksuse Record Contains x$ksusecst
One Record in X$Ksusecst
v$session
v$sesstat
236
v$session_wait v$session
1276
2328 bytes
x$ksusesta
x$ksusecst
x$ksuse
Getting v$sesstat addresses
select '#define '||
upper(translate(s.name,' :-()/*''','________'))||' '||
to_char(c.kqfcooff + STATISTIC# * 4 )
from
x$kqfco c,
x$kqfta t,
v$statname s
where
t.indx = c.kqfcotab
and ( t.kqftanam='X$KSUSESTA' ) and c.kqfconam='KSUSESTV'
and kqfcooff > 0
order by
c.kqfcooff
/
User Drilldown Query: 4 joins
select
w.sid sid,
w.seq# seq,
w.event event,
w.p1raw p1,
w.p2raw p2,
w.p3raw p3,
w.SECONDS_IN_WAIT ctime,
s.sql_hash_value sqlhash,
s.prev_hash_value psqlhash,
st.value cpu
from
v$session s,
v$sesstat st,
v$statname sn,
v$session_wait w
where
w.sid = s.sid and
st.sid = s.sid and
st.statistic# = sn.statistic# and
sn.name = 'CPU used when call started' and
w.event != 'SQL*Net message from client'
order by w.sid;
Other Fun Stuff
The next example is output from an SGA program
that follows the LRU of the Buffer Cache
The program demonstrates the
•
insertion point of LRU
•
cold end of LRU
•
hot end of the LRU
•
Full Table Scan Insertion Point
LRU HOT
LRU COLD