Logical I/O - Julian Dyke

Download Report

Transcript Logical I/O - Julian Dyke

Logical I/O
Julian Dyke
Independent Consultant
Web Version
1
© 2005 Julian Dyke
juliandyke.com
Agenda
2

Introduction

Logical I/Os

Buffer Cache Behaviour

Statistics

Conclusion
© 2005 Julian Dyke
juliandyke.com
Logical I/Os
3

Logical I/Os are read operations

Buffers are cached in shared memory

Most logical I/Os can be satisfied from cache

The remainder will result in physical I/Os

Logical I/Os include
 current reads
 consistent reads
© 2005 Julian Dyke
juliandyke.com
Current Reads

4
Current reads

Current version of block

Can be updated

Can be dirty

Includes all changes

Only one current version of block in buffer cache

Only one current version of block across all instances

Can be used to construct consistent versions
© 2005 Julian Dyke
juliandyke.com
Consistent Reads

Consistent reads

Potentially historic version of block

Consistent to a specific System Change Number (SCN)

Cannot be updated

Cannot be dirty

Can be used to construct consistent versions

Can have multiple versions of same block in buffer cache

Can be
 single
 multi

5
block (sequential reads)
block (scattered reads)
Can be traced using events 10200 / 10201
© 2005 Julian Dyke
juliandyke.com
Logical I/O statistics
6

session logical reads statistic
 Total number of logical reads in session
 Unreliable at system level
 At session level
session logical reads = db block gets + consistent gets

db block gets statistic
 Number of current reads

consistent gets statistic
 Number of consistent reads
© 2005 Julian Dyke
juliandyke.com
Buffer Pools
7

There are up to eight buffer pools
 DEFAULT
 KEEP, RECYCLE
Oracle 8.0 and above
 2K, 4K, 8K, 16K and 32K
Oracle 9.0 and above

32K not available on all platforms

Cannot have non-standard block size same as DEFAULT block
size
© 2005 Julian Dyke
juliandyke.com
Buffer Pool Headers
8

One for each buffer pool (usable or unusable)

Externalized in
 V$BUFFER_POOL
 V$BUFFER_POOL_STATISTICS

Based on X$KCBWBPD

Created in shared pool permanent memory when instance is
started

Contain one or more working sets
© 2005 Julian Dyke
juliandyke.com
X$KCBWBPD

9
Externalises buffer pool header
ADDR
RAW(4)
INDX
NUMBER
INST_ID
NUMBER
BP_NAME
VARCHAR2(20)
BP_ID
NUMBER
BP_BLKSZ
NUMBER
BP_GRANSZ
NUMBER
BP_BUFPERGRAN
NUMBER
BP_LO_SID
NUMBER
BP_HI_SID
NUMBER
BP_SET_CT
NUMBER
BP_SIZE
NUMBER
BP_STATE
NUMBER
BP_CURRGRANS
NUMBER
BP_TGTGRANS
NUMBER
BP_PREVGRANS
NUMBER
© 2005 Julian Dyke
Buffer Pool Name
Buffer Pool ID
Block Size
Granule Size
Buffers per Granule
Minimum Working Set ID
Maximum Working Set ID
Number of Working Sets
Number of Buffers
juliandyke.com
Hash Buckets

Hash value of each block calculated from
 Data Block Address (DBA)
 Block Class

Number of hash buckets dependent on number of buffers in
cache e.g.
# buffers
# hash buckets

10
500
6000
64
1024
Each hash bucket contains
 Cache Buffers Chains latch
 Pointer to array of double linked lists
© 2005 Julian Dyke
juliandyke.com
Hash Buckets
# hash chains
BH
BH
BH
cache
buffers
chain
latch
BH
11
© 2005 Julian Dyke
juliandyke.com
Buffer Headers
12

Each buffer header describes contents of one buffer

All buffers accessed via buffer header

Buffer header contains pointers to
 Buffer
 Cache Buffers Chains latch

Buffer header includes double linked lists for
 Cache Buffers Chain list
 Replacement list
 Users list
 Waiters list
© 2005 Julian Dyke
juliandyke.com
X$BH

13
Externalises buffer headers
ADDR
RAW(4)
INDX
NUMBER
INST_ID
NUMBER
HLADDR
RAW(4)
BLSIZ
NUMBER
NXT_HASH
RAW(4)
PRV_HASH
RAW(4)
NXT_REPL
RAW(4)
Tablespace#
PRV_REPL
RAW(4)
Absolute File Number
TS#
NUMBER
FILE#
NUMBER
Relative File Number
DBARFIL
NUMBER
Block Number
DBABLK
NUMBER
OBJ
NUMBER
BA
RAW(4)
CR_SCN_BAS
NUMBER
© 2005 Julian Dyke
Hash List Address
Block Size
Hash List
Replacement List
Object ID
Buffer Address
juliandyke.com
Working Sets
14

Introduced in Oracle 8.1.5

Each buffer pool contains one or more working sets

Working set header
 created in shared pool permanent memory
 associated with one DBWn process
 protected by cache buffers lru chain latch

Each working set maintains separate set of LRU lists
© 2005 Julian Dyke
juliandyke.com
LRU Lists

15
In Oracle 9.2 each working set maintains 4 LRU lists

LRU - replacement list - normal blocks

LRU-W - write list - dirty blocks

LRU-XO - object list - buffers involved in
 DROP
 TRUNCATE

LRU-XR - range list - buffers involved in
 ALTER TABLESPACE BEGIN BACKUP
 ALTER TABLESPACE END BACKUP
 ALTER TABLESPACE OFFLINE
 ALTER TABLESPACE READ ONLY
© 2005 Julian Dyke
juliandyke.com
Main and Auxiliary Lists
16

Each LRU contains
 main list
 auxiliary list

Auxiliary list includes
 dirty buffers identified by DBWn processes
 buffers being written

Buffers are moved from main to auxiliary list by DBWn
processes to avoid unnecessary scans

Processes scan auxiliary lists first for free buffers

Buffers also allocated to auxiliary list
 at startup
 after FLUSH_CACHE
© 2005 Julian Dyke
juliandyke.com
Working Set Lists
Working Set
Header
Hot
Replacement
List
Cold
MAIN
AUX
Write
List
MAIN
Buffer
Header
AUX
Object
List
MAIN
AUX
Range
List
MAIN
AUX
17
© 2005 Julian Dyke
juliandyke.com
Replacement List







In Oracle 8.1.5 and above a mid-point insertion algorithm is
used
Buffer cache has a hot end and a cold end
Buffers are inserted at mid-point
Mid-point is head of cold end
Starts at hot end - moves down cache
Maximum mid-point determined by _db_percent_hot_default
Default value is 50%
Head of
Hot End
Hot End
18
© 2005 Julian Dyke
Head of
Cold End
Replacement List
Cold End
juliandyke.com
X$KCBWDS

19
Externalises working set header
Working Set ID
ADDR
RAW(4)
INDX
NUMBER
INST_ID
NUMBER
SET_ID
NUMBER
DBWR_NUM
NUMBER
BLK_SIZE
NUMBER
NXT_REPL
RAW(4)
PRV_REPL
RAW(4)
NXT_REPLAX
RAW(4)
PRV_REPLAX
RAW(4)
CNUM_REPL
RAW(4)
ANUM_REPL
RAW(4)
Insertion Point
COLD_HD
RAW(4)
HBMAX
NUMBER
Maximum number of
Hot Buffers
HBUFS
NUMBER
Number of Hot Buffers
NXT_WRITE
RAW(4)
© 2005 Julian Dyke
Database Writer Number
MAIN Replacement List
AUX Replacement List
Number of buffers on
MAIN Replacement List
Number of buffers on
AUX Replacement List
juliandyke.com
Touch Count
20

Each buffer header maintains
 touch count
 timestamp

Touch count represents number of 3 second intervals in which
buffer has been accessed since
 buffer last read into cache
 touch count last reset

Each time buffer is accessed
 if timestamp more than 3 seconds ago
 increment touch count
 set timestamp to current time
© 2005 Julian Dyke
juliandyke.com
Touch Count
21

When buffer reaches tail of cold end
 If touch count >= 2 then
buffer is moved to hot end
 Otherwise used as next free buffer

Hot criteria determined by
 _db_aging_hot_criteria
 default value is 2 touches

Time interval determined by
 _db_aging_touch_time
 default value is 3 seconds
© 2005 Julian Dyke
juliandyke.com
Single versus Multi-Block Reads
22

Single block reads
 Used with current reads
 Can be used with consistent reads
 Waits recorded by db file sequential read

Multi block reads
 Frequently used with consistent reads
 Maximum number of physical blocks read specified by
DB_FILE_MULTIBLOCK_READ_COUNT
 Waits recorded by db file scattered read
 Blocks moved to cold end of buffer cache
© 2005 Julian Dyke
juliandyke.com
Single-Block Reads
Head of
Hot End
92
71
42
34
92
71
72
34
92
45
72
34
52
42
45
33
72
87
11
71
52
42
45
33
72
11
66
71
52
42
45
33
11
49
66
71
52
42
45
11
0
3
0
3
4
0
2
4
2
4
1
2
1
4
0
2
1
0
1
2
Read Block 34
11
42
33
87
Get
Insert
Update
Move
Set
Gettouch
Insert
first
first
block
buffer
buffer
touch
buffer
available
available
count
71
42
at
atcount
contents
to
head
head
head
buffer
buffer
of
of cold
cold
from
end
for
of
on
from
end
hot
block
block
cold
cold
end42
71
42
34
end
end
to zero
23
STOP
Block
Number
Head of
Cold End
© 2005 Julian Dyke
42
33
87
11
Touch
Count
1
juliandyke.com
Consistent Reads
Current Block
Head of
Hot End
40
Head of
Cold End
56
34
17
27
27
34
17
95
27
34
33
95
27
132
128
132
150
150
150
Insert
Get
Read
Apply
first
current
consistent
buffer
undo
available
to
at
version
rollback
head
version
buffer
of
ofcold
of
from cold
block
end
27 SCN
to
into
endbuffer
132
128
STOP
© 2005 Julian Dyke
Consistent Block
17
27
Read Block 27 - SCN 128
132
24
Block
Number
85
33
95
System
Change
27
Number
128
132
150
juliandyke.com
Multi-Block Reads
DB_FILE_MULTIBLOCK_READ_COUNT = 4
Head of
Hot End
ReadBlock
Block7
2
6
58
Read
1
3
4
Head of
Cold End
7
2
1
3
4
8
5
6
6
2
3
7
1
5
5
2
1
6
1
5
41
5
2
3
6
2
3
7
4
1
8
1
5
1
2
5
6
2
1
5
3
6
7
2
3
4
6
7
8
1
5
Insert
Read
Move
Move
Get
Read
Insert
Move
next
next
block
buffers
next
block
block
buffers
four
four
four
17
2
3
4
5
86available
at
to
available
to
to
at
blocks
blocks
head
cold
cold
cold
headend
end
of
end
into
of
into
Get
first
four
cold
buffers
buffers
coldend
end
from cold
cold end
end
from
25
STOP
© 2005 Julian Dyke
juliandyke.com
Dirty Blocks
26

When blocks are updated they are marked dirty

Changes immediately written to redo buffer

Changes written back to disk asynchronously by DBWn
process

DBWn process
 scans from cold end of MAIN replacement list
 moves dirty blocks to auxiliary list
 writes dirty blocks back to disk

Written blocks remain on auxiliary list until re-used
© 2005 Julian Dyke
juliandyke.com
Buffer Pinning
27

In Oracle 8.0 and above, Oracle uses pinning to reduce
number of logical I/Os

If buffer will be accessed again by the statement, it is pinned
in the buffer cache

Frequently used with index scans

Only appears to be used with consistent gets
 not observed with current gets

If pinning was not implemented, number of logical I/Os would
significantly increase
© 2005 Julian Dyke
juliandyke.com
Buffer Pinning Statistics
28

buffer is not pinned count statistic
 Number of pin-able buffers not pinned by this session
when visited
 Equivalent to number of logical I/Os (for that part of
statement)

buffer is pinned count statistic
 Number of buffers already pinned by this session when
visited

Number of buffers visited =
 buffer is not pinned count + buffer is pinned count
© 2005 Julian Dyke
juliandyke.com
Consistent Gets Statistics
29

consistent gets - examination statistic
 Number of consistent gets that could be immediately
performed without pinning the buffer
 Generally apply to indexes
 Require one latch get
 Included in consistent gets statistic

no work - consistent read gets statistic
 Number of consistent gets that could be performed without
requiring rollback or cleanout
 Generally apply to tables
 Require two latch gets
 Included in consistent gets statistic
© 2005 Julian Dyke
juliandyke.com
Full Table Scan
SELECT SUM(c2) FROM t1;
0
SELECT STATEMENT
1 0
TABLE ACCESS (FULL) OF 'T1'
Read Block 21
1
2
4
5
6
8
9
10
12
13
14
16
17
18
20
3
7
11
15
19
Segment
Data
Empty
Block
Block
Header 2
1
3
Segment
Header
Data
Blocks
19
20
12
13
15
4
2
23
22
18
21
10
16
17
11
14
1
3
5
6
7
8
9
1
5
9
12
13
15
19
20
10
16
17
11
14
4
2
22
23
18
21
8
3
6
7
6
16
17
10
12
1
2
4
5
9
18
15
13
14
11
3
7
8
session logical reads
consistent gets
no work - consistent read gets
buffer is not pinned count
table scans (short tables)
table scans rows gotten
table scans blocks gotten
Empty
Blocks
10
12
2
3
4
5
9
16
17
13
14
11
1
6
7
8
18
15
1
20
36
40
48
12
16
24
28
32
52
56
44
8
64
68
4
72
60
10
12
9
13
14
11
6
7
8
16
17
1
2
3
4
5
18
15
Unused
Blocks
Table T1
High Water
Mark
30
STOP
© 2005 Julian Dyke
juliandyke.com
Full Table Scan - Summary
31

In Oracle 9.2
 segment header initially read 3 times
 segment header read again every 10 extents

All blocks are read up to high water mark

For longer tables blocks can be prefetched

Algorithm differs for Automatic Segment Space Managed
tablespaces
© 2005 Julian Dyke
juliandyke.com
Unique Scan
SELECT c2 FROM t1 WHERE c1 = 42;
0
SELECT STATEMENT
1 0
TABLE ACCESS (BY INDEX ROWID) OF 'T1'
2 1
INDEX (UNIQUE SCAN) OF 'I1'
Read Index
Table Block 34
1
Data
Block
Branch
Leaf
Block
Block
Index I1
session logical reads
consistent gets
consistent gets - examination
buffer is not pinned count
index fetch by key
table fetch by rowid
rows fetched by callback
2
1
3
1
2
3
3
1
2
2
1
1
1
1
Branch
Block
Leaf
Blocks
Table T1
32
STOP
© 2005 Julian Dyke
juliandyke.com
Index Organised Table
SELECT c2 FROM t1 WHERE c1 = 42;
0
SELECT STATEMENT
1 0
INDEX (UNIQUE SCAN) OF 'I1'
session logical reads
consistent gets
consistent gets - examination
index fetch by key
2
1
1
2
2
1
1
Read Index Block 1
3
Leaf
Block
Branch
Block
Index I1
Branch
Block
Leaf
Blocks
33
STOP
© 2005 Julian Dyke
juliandyke.com
Single Table Hash Cluster
SELECT c2 FROM t1 WHERE c1 = 42;
0
SELECT STATEMENT
1 0
TABLE ACCESS (HASH) OF 'T1'
Read Table Block 7
session logical reads
consistent gets
no work - consistent read gets
cluster key scans
cluster key scan block gets
buffer is not pinned count
1
1
1
1
1
1
Data Block
Table T1
Leaf
Blocks
34
STOP
© 2005 Julian Dyke
juliandyke.com
Clustering Factor

Measures relationship between index entries and
corresponding data blocks

Used by CBO to calculate cost of using index


Good clustering factor approaches number of blocks in table;
Bad clustering factor approaches number of rows in table

CBO will favour indexes with a better clustering factor
Bad Clustering Factor
35
© 2005 Julian Dyke
Good Clustering Factor
juliandyke.com
Range Scan - Bad Clustering Factor
SELECT c2 FROM t1 WHERE c3 = 42;
0
SELECT STATEMENT
1 0
TABLE ACCESS (BY INDEX ROWID) OF 'T1'
2 1
INDEX (RANGE SCAN) OF 'I2'
Read Table
Index Block 22
23
6
10
14
18
1
Leaf Block
Data
Block
(Pinned)
Branch
Block
Index I2
session logical reads
consistent gets
consistent gets - examination
no work - consistent read gets
index scans kdiixs1
buffer is not pinned count
buffer is pinned count
table fetch by rowid
5
6
8
2
4
17
3
5
6
2
3
4
18
7
11
3
4
6
1
2
5
11
4
5
7
1
2
3
6
3
5
1
2
4
3
4
6
1
2
5
Branch
Block
Leaf
Blocks
Table T1
36
STOP
© 2005 Julian Dyke
juliandyke.com
Range Scan - Good Clustering Factor
SELECT c2 FROM t1 WHERE c4 = 42;
0
SELECT STATEMENT
1 0
TABLE ACCESS (BY INDEX ROWID) OF 'T1'
2 1
INDEX (RANGE SCAN) OF 'I3'
Read Table
Index Block 9
83
4
1
Leaf Block
Data
Block
(Pinned)
(Pinned)
Branch
Block
Index I3
session logical reads
consistent gets
consistent gets - examination
no work - consistent read gets
index scans kdiixs1
buffer is not pinned count
buffer is pinned count
table fetch by rowid
4
1
2
33
4
1
2
33
11
2
11
11
3
1
22
9
6
7
8
1
3
5
42
6
5
1
2
3
42
Branch
Block
Leaf
Blocks
Table T1
37
STOP
© 2005 Julian Dyke
juliandyke.com
Clustering Factor - Summary

38
Bad
Clustering
Factor
Good
Clustering
Factor
session logical reads
8
4
consistent gets
8
4
consistent gets - examination
1
1
no work - consistent gets
6
2
index scans kdiixs1
1
1
buffer is not pinned count
7
3
buffer is pinned count
5
9
table fetch by rowid
6
6
Higher clustering factor
 Reduces number of logical I/Os required
 Increases number of buffers that can be pinned
© 2005 Julian Dyke
juliandyke.com
Row Prefetching
39

For queries returning more than one row specify maximum
number of rows per round trip

If prefetch size too small
 Increased number of round trips
 Degrades performance

If prefetch size too large
 Increased number of packets
 May degrade performance
© 2005 Julian Dyke
juliandyke.com
Row Prefetching


Applies to
 OCI
 Pro*C
 JDBC
 PL/SQL
 SQL*Plus
OCI_ATTR_PREFETCH_ROWS
Host Array
setRowPrefetch ()
BULK COLLECT
SET ARRAYSIZE
OCI default prefetch value is 1 (returns 2 rows per fetch)
res = OCIAttrSet
(
(dvoid *)stmt,
(ub4)OCI_HTYPE_STMT,
(dvoid *)&prefetchRows,
(ub4)0,
(ub4)OCI_ATTR_PREFETCH_ROWS,
(OCIError *)err
);
40
© 2005 Julian Dyke
juliandyke.com
Row Prefetching
41
Example - full table scan
 1000 row table
 31 blocks (+ segment header)
Prefetch Size
Consistent Gets
1
1003
2
518
3
337
4
276
5
227
10
130
20
82
50
53
100
43
250
37
500
35
1000
34
© 2005 Julian Dyke
Consistent Gets

Prefetch Size
juliandyke.com
Thank you for your interest
For more information and to provide feedback
please contact me
My e-mail address is:
[email protected]
My website address is:
www.juliandyke.com
42
© 2005 Julian Dyke
juliandyke.com