Logical I/O - Julian Dyke
Download
Report
Transcript Logical I/O - Julian Dyke
Logical I/O
Julian Dyke
Independent Consultant
Web Version
1
© 2005 Julian Dyke
juliandyke.com
Agenda
2
Introduction
Logical I/Os
Buffer Cache Behaviour
Statistics
Conclusion
© 2005 Julian Dyke
juliandyke.com
Logical I/Os
3
Logical I/Os are read operations
Buffers are cached in shared memory
Most logical I/Os can be satisfied from cache
The remainder will result in physical I/Os
Logical I/Os include
current reads
consistent reads
© 2005 Julian Dyke
juliandyke.com
Current Reads
4
Current reads
Current version of block
Can be updated
Can be dirty
Includes all changes
Only one current version of block in buffer cache
Only one current version of block across all instances
Can be used to construct consistent versions
© 2005 Julian Dyke
juliandyke.com
Consistent Reads
Consistent reads
Potentially historic version of block
Consistent to a specific System Change Number (SCN)
Cannot be updated
Cannot be dirty
Can be used to construct consistent versions
Can have multiple versions of same block in buffer cache
Can be
single
multi
5
block (sequential reads)
block (scattered reads)
Can be traced using events 10200 / 10201
© 2005 Julian Dyke
juliandyke.com
Logical I/O statistics
6
session logical reads statistic
Total number of logical reads in session
Unreliable at system level
At session level
session logical reads = db block gets + consistent gets
db block gets statistic
Number of current reads
consistent gets statistic
Number of consistent reads
© 2005 Julian Dyke
juliandyke.com
Buffer Pools
7
There are up to eight buffer pools
DEFAULT
KEEP, RECYCLE
Oracle 8.0 and above
2K, 4K, 8K, 16K and 32K
Oracle 9.0 and above
32K not available on all platforms
Cannot have non-standard block size same as DEFAULT block
size
© 2005 Julian Dyke
juliandyke.com
Buffer Pool Headers
8
One for each buffer pool (usable or unusable)
Externalized in
V$BUFFER_POOL
V$BUFFER_POOL_STATISTICS
Based on X$KCBWBPD
Created in shared pool permanent memory when instance is
started
Contain one or more working sets
© 2005 Julian Dyke
juliandyke.com
X$KCBWBPD
9
Externalises buffer pool header
ADDR
RAW(4)
INDX
NUMBER
INST_ID
NUMBER
BP_NAME
VARCHAR2(20)
BP_ID
NUMBER
BP_BLKSZ
NUMBER
BP_GRANSZ
NUMBER
BP_BUFPERGRAN
NUMBER
BP_LO_SID
NUMBER
BP_HI_SID
NUMBER
BP_SET_CT
NUMBER
BP_SIZE
NUMBER
BP_STATE
NUMBER
BP_CURRGRANS
NUMBER
BP_TGTGRANS
NUMBER
BP_PREVGRANS
NUMBER
© 2005 Julian Dyke
Buffer Pool Name
Buffer Pool ID
Block Size
Granule Size
Buffers per Granule
Minimum Working Set ID
Maximum Working Set ID
Number of Working Sets
Number of Buffers
juliandyke.com
Hash Buckets
Hash value of each block calculated from
Data Block Address (DBA)
Block Class
Number of hash buckets dependent on number of buffers in
cache e.g.
# buffers
# hash buckets
10
500
6000
64
1024
Each hash bucket contains
Cache Buffers Chains latch
Pointer to array of double linked lists
© 2005 Julian Dyke
juliandyke.com
Hash Buckets
# hash chains
BH
BH
BH
cache
buffers
chain
latch
BH
11
© 2005 Julian Dyke
juliandyke.com
Buffer Headers
12
Each buffer header describes contents of one buffer
All buffers accessed via buffer header
Buffer header contains pointers to
Buffer
Cache Buffers Chains latch
Buffer header includes double linked lists for
Cache Buffers Chain list
Replacement list
Users list
Waiters list
© 2005 Julian Dyke
juliandyke.com
X$BH
13
Externalises buffer headers
ADDR
RAW(4)
INDX
NUMBER
INST_ID
NUMBER
HLADDR
RAW(4)
BLSIZ
NUMBER
NXT_HASH
RAW(4)
PRV_HASH
RAW(4)
NXT_REPL
RAW(4)
Tablespace#
PRV_REPL
RAW(4)
Absolute File Number
TS#
NUMBER
FILE#
NUMBER
Relative File Number
DBARFIL
NUMBER
Block Number
DBABLK
NUMBER
OBJ
NUMBER
BA
RAW(4)
CR_SCN_BAS
NUMBER
© 2005 Julian Dyke
Hash List Address
Block Size
Hash List
Replacement List
Object ID
Buffer Address
juliandyke.com
Working Sets
14
Introduced in Oracle 8.1.5
Each buffer pool contains one or more working sets
Working set header
created in shared pool permanent memory
associated with one DBWn process
protected by cache buffers lru chain latch
Each working set maintains separate set of LRU lists
© 2005 Julian Dyke
juliandyke.com
LRU Lists
15
In Oracle 9.2 each working set maintains 4 LRU lists
LRU - replacement list - normal blocks
LRU-W - write list - dirty blocks
LRU-XO - object list - buffers involved in
DROP
TRUNCATE
LRU-XR - range list - buffers involved in
ALTER TABLESPACE BEGIN BACKUP
ALTER TABLESPACE END BACKUP
ALTER TABLESPACE OFFLINE
ALTER TABLESPACE READ ONLY
© 2005 Julian Dyke
juliandyke.com
Main and Auxiliary Lists
16
Each LRU contains
main list
auxiliary list
Auxiliary list includes
dirty buffers identified by DBWn processes
buffers being written
Buffers are moved from main to auxiliary list by DBWn
processes to avoid unnecessary scans
Processes scan auxiliary lists first for free buffers
Buffers also allocated to auxiliary list
at startup
after FLUSH_CACHE
© 2005 Julian Dyke
juliandyke.com
Working Set Lists
Working Set
Header
Hot
Replacement
List
Cold
MAIN
AUX
Write
List
MAIN
Buffer
Header
AUX
Object
List
MAIN
AUX
Range
List
MAIN
AUX
17
© 2005 Julian Dyke
juliandyke.com
Replacement List
In Oracle 8.1.5 and above a mid-point insertion algorithm is
used
Buffer cache has a hot end and a cold end
Buffers are inserted at mid-point
Mid-point is head of cold end
Starts at hot end - moves down cache
Maximum mid-point determined by _db_percent_hot_default
Default value is 50%
Head of
Hot End
Hot End
18
© 2005 Julian Dyke
Head of
Cold End
Replacement List
Cold End
juliandyke.com
X$KCBWDS
19
Externalises working set header
Working Set ID
ADDR
RAW(4)
INDX
NUMBER
INST_ID
NUMBER
SET_ID
NUMBER
DBWR_NUM
NUMBER
BLK_SIZE
NUMBER
NXT_REPL
RAW(4)
PRV_REPL
RAW(4)
NXT_REPLAX
RAW(4)
PRV_REPLAX
RAW(4)
CNUM_REPL
RAW(4)
ANUM_REPL
RAW(4)
Insertion Point
COLD_HD
RAW(4)
HBMAX
NUMBER
Maximum number of
Hot Buffers
HBUFS
NUMBER
Number of Hot Buffers
NXT_WRITE
RAW(4)
© 2005 Julian Dyke
Database Writer Number
MAIN Replacement List
AUX Replacement List
Number of buffers on
MAIN Replacement List
Number of buffers on
AUX Replacement List
juliandyke.com
Touch Count
20
Each buffer header maintains
touch count
timestamp
Touch count represents number of 3 second intervals in which
buffer has been accessed since
buffer last read into cache
touch count last reset
Each time buffer is accessed
if timestamp more than 3 seconds ago
increment touch count
set timestamp to current time
© 2005 Julian Dyke
juliandyke.com
Touch Count
21
When buffer reaches tail of cold end
If touch count >= 2 then
buffer is moved to hot end
Otherwise used as next free buffer
Hot criteria determined by
_db_aging_hot_criteria
default value is 2 touches
Time interval determined by
_db_aging_touch_time
default value is 3 seconds
© 2005 Julian Dyke
juliandyke.com
Single versus Multi-Block Reads
22
Single block reads
Used with current reads
Can be used with consistent reads
Waits recorded by db file sequential read
Multi block reads
Frequently used with consistent reads
Maximum number of physical blocks read specified by
DB_FILE_MULTIBLOCK_READ_COUNT
Waits recorded by db file scattered read
Blocks moved to cold end of buffer cache
© 2005 Julian Dyke
juliandyke.com
Single-Block Reads
Head of
Hot End
92
71
42
34
92
71
72
34
92
45
72
34
52
42
45
33
72
87
11
71
52
42
45
33
72
11
66
71
52
42
45
33
11
49
66
71
52
42
45
11
0
3
0
3
4
0
2
4
2
4
1
2
1
4
0
2
1
0
1
2
Read Block 34
11
42
33
87
Get
Insert
Update
Move
Set
Gettouch
Insert
first
first
block
buffer
buffer
touch
buffer
available
available
count
71
42
at
atcount
contents
to
head
head
head
buffer
buffer
of
of cold
cold
from
end
for
of
on
from
end
hot
block
block
cold
cold
end42
71
42
34
end
end
to zero
23
STOP
Block
Number
Head of
Cold End
© 2005 Julian Dyke
42
33
87
11
Touch
Count
1
juliandyke.com
Consistent Reads
Current Block
Head of
Hot End
40
Head of
Cold End
56
34
17
27
27
34
17
95
27
34
33
95
27
132
128
132
150
150
150
Insert
Get
Read
Apply
first
current
consistent
buffer
undo
available
to
at
version
rollback
head
version
buffer
of
ofcold
of
from cold
block
end
27 SCN
to
into
endbuffer
132
128
STOP
© 2005 Julian Dyke
Consistent Block
17
27
Read Block 27 - SCN 128
132
24
Block
Number
85
33
95
System
Change
27
Number
128
132
150
juliandyke.com
Multi-Block Reads
DB_FILE_MULTIBLOCK_READ_COUNT = 4
Head of
Hot End
ReadBlock
Block7
2
6
58
Read
1
3
4
Head of
Cold End
7
2
1
3
4
8
5
6
6
2
3
7
1
5
5
2
1
6
1
5
41
5
2
3
6
2
3
7
4
1
8
1
5
1
2
5
6
2
1
5
3
6
7
2
3
4
6
7
8
1
5
Insert
Read
Move
Move
Get
Read
Insert
Move
next
next
block
buffers
next
block
block
buffers
four
four
four
17
2
3
4
5
86available
at
to
available
to
to
at
blocks
blocks
head
cold
cold
cold
headend
end
of
end
into
of
into
Get
first
four
cold
buffers
buffers
coldend
end
from cold
cold end
end
from
25
STOP
© 2005 Julian Dyke
juliandyke.com
Dirty Blocks
26
When blocks are updated they are marked dirty
Changes immediately written to redo buffer
Changes written back to disk asynchronously by DBWn
process
DBWn process
scans from cold end of MAIN replacement list
moves dirty blocks to auxiliary list
writes dirty blocks back to disk
Written blocks remain on auxiliary list until re-used
© 2005 Julian Dyke
juliandyke.com
Buffer Pinning
27
In Oracle 8.0 and above, Oracle uses pinning to reduce
number of logical I/Os
If buffer will be accessed again by the statement, it is pinned
in the buffer cache
Frequently used with index scans
Only appears to be used with consistent gets
not observed with current gets
If pinning was not implemented, number of logical I/Os would
significantly increase
© 2005 Julian Dyke
juliandyke.com
Buffer Pinning Statistics
28
buffer is not pinned count statistic
Number of pin-able buffers not pinned by this session
when visited
Equivalent to number of logical I/Os (for that part of
statement)
buffer is pinned count statistic
Number of buffers already pinned by this session when
visited
Number of buffers visited =
buffer is not pinned count + buffer is pinned count
© 2005 Julian Dyke
juliandyke.com
Consistent Gets Statistics
29
consistent gets - examination statistic
Number of consistent gets that could be immediately
performed without pinning the buffer
Generally apply to indexes
Require one latch get
Included in consistent gets statistic
no work - consistent read gets statistic
Number of consistent gets that could be performed without
requiring rollback or cleanout
Generally apply to tables
Require two latch gets
Included in consistent gets statistic
© 2005 Julian Dyke
juliandyke.com
Full Table Scan
SELECT SUM(c2) FROM t1;
0
SELECT STATEMENT
1 0
TABLE ACCESS (FULL) OF 'T1'
Read Block 21
1
2
4
5
6
8
9
10
12
13
14
16
17
18
20
3
7
11
15
19
Segment
Data
Empty
Block
Block
Header 2
1
3
Segment
Header
Data
Blocks
19
20
12
13
15
4
2
23
22
18
21
10
16
17
11
14
1
3
5
6
7
8
9
1
5
9
12
13
15
19
20
10
16
17
11
14
4
2
22
23
18
21
8
3
6
7
6
16
17
10
12
1
2
4
5
9
18
15
13
14
11
3
7
8
session logical reads
consistent gets
no work - consistent read gets
buffer is not pinned count
table scans (short tables)
table scans rows gotten
table scans blocks gotten
Empty
Blocks
10
12
2
3
4
5
9
16
17
13
14
11
1
6
7
8
18
15
1
20
36
40
48
12
16
24
28
32
52
56
44
8
64
68
4
72
60
10
12
9
13
14
11
6
7
8
16
17
1
2
3
4
5
18
15
Unused
Blocks
Table T1
High Water
Mark
30
STOP
© 2005 Julian Dyke
juliandyke.com
Full Table Scan - Summary
31
In Oracle 9.2
segment header initially read 3 times
segment header read again every 10 extents
All blocks are read up to high water mark
For longer tables blocks can be prefetched
Algorithm differs for Automatic Segment Space Managed
tablespaces
© 2005 Julian Dyke
juliandyke.com
Unique Scan
SELECT c2 FROM t1 WHERE c1 = 42;
0
SELECT STATEMENT
1 0
TABLE ACCESS (BY INDEX ROWID) OF 'T1'
2 1
INDEX (UNIQUE SCAN) OF 'I1'
Read Index
Table Block 34
1
Data
Block
Branch
Leaf
Block
Block
Index I1
session logical reads
consistent gets
consistent gets - examination
buffer is not pinned count
index fetch by key
table fetch by rowid
rows fetched by callback
2
1
3
1
2
3
3
1
2
2
1
1
1
1
Branch
Block
Leaf
Blocks
Table T1
32
STOP
© 2005 Julian Dyke
juliandyke.com
Index Organised Table
SELECT c2 FROM t1 WHERE c1 = 42;
0
SELECT STATEMENT
1 0
INDEX (UNIQUE SCAN) OF 'I1'
session logical reads
consistent gets
consistent gets - examination
index fetch by key
2
1
1
2
2
1
1
Read Index Block 1
3
Leaf
Block
Branch
Block
Index I1
Branch
Block
Leaf
Blocks
33
STOP
© 2005 Julian Dyke
juliandyke.com
Single Table Hash Cluster
SELECT c2 FROM t1 WHERE c1 = 42;
0
SELECT STATEMENT
1 0
TABLE ACCESS (HASH) OF 'T1'
Read Table Block 7
session logical reads
consistent gets
no work - consistent read gets
cluster key scans
cluster key scan block gets
buffer is not pinned count
1
1
1
1
1
1
Data Block
Table T1
Leaf
Blocks
34
STOP
© 2005 Julian Dyke
juliandyke.com
Clustering Factor
Measures relationship between index entries and
corresponding data blocks
Used by CBO to calculate cost of using index
Good clustering factor approaches number of blocks in table;
Bad clustering factor approaches number of rows in table
CBO will favour indexes with a better clustering factor
Bad Clustering Factor
35
© 2005 Julian Dyke
Good Clustering Factor
juliandyke.com
Range Scan - Bad Clustering Factor
SELECT c2 FROM t1 WHERE c3 = 42;
0
SELECT STATEMENT
1 0
TABLE ACCESS (BY INDEX ROWID) OF 'T1'
2 1
INDEX (RANGE SCAN) OF 'I2'
Read Table
Index Block 22
23
6
10
14
18
1
Leaf Block
Data
Block
(Pinned)
Branch
Block
Index I2
session logical reads
consistent gets
consistent gets - examination
no work - consistent read gets
index scans kdiixs1
buffer is not pinned count
buffer is pinned count
table fetch by rowid
5
6
8
2
4
17
3
5
6
2
3
4
18
7
11
3
4
6
1
2
5
11
4
5
7
1
2
3
6
3
5
1
2
4
3
4
6
1
2
5
Branch
Block
Leaf
Blocks
Table T1
36
STOP
© 2005 Julian Dyke
juliandyke.com
Range Scan - Good Clustering Factor
SELECT c2 FROM t1 WHERE c4 = 42;
0
SELECT STATEMENT
1 0
TABLE ACCESS (BY INDEX ROWID) OF 'T1'
2 1
INDEX (RANGE SCAN) OF 'I3'
Read Table
Index Block 9
83
4
1
Leaf Block
Data
Block
(Pinned)
(Pinned)
Branch
Block
Index I3
session logical reads
consistent gets
consistent gets - examination
no work - consistent read gets
index scans kdiixs1
buffer is not pinned count
buffer is pinned count
table fetch by rowid
4
1
2
33
4
1
2
33
11
2
11
11
3
1
22
9
6
7
8
1
3
5
42
6
5
1
2
3
42
Branch
Block
Leaf
Blocks
Table T1
37
STOP
© 2005 Julian Dyke
juliandyke.com
Clustering Factor - Summary
38
Bad
Clustering
Factor
Good
Clustering
Factor
session logical reads
8
4
consistent gets
8
4
consistent gets - examination
1
1
no work - consistent gets
6
2
index scans kdiixs1
1
1
buffer is not pinned count
7
3
buffer is pinned count
5
9
table fetch by rowid
6
6
Higher clustering factor
Reduces number of logical I/Os required
Increases number of buffers that can be pinned
© 2005 Julian Dyke
juliandyke.com
Row Prefetching
39
For queries returning more than one row specify maximum
number of rows per round trip
If prefetch size too small
Increased number of round trips
Degrades performance
If prefetch size too large
Increased number of packets
May degrade performance
© 2005 Julian Dyke
juliandyke.com
Row Prefetching
Applies to
OCI
Pro*C
JDBC
PL/SQL
SQL*Plus
OCI_ATTR_PREFETCH_ROWS
Host Array
setRowPrefetch ()
BULK COLLECT
SET ARRAYSIZE
OCI default prefetch value is 1 (returns 2 rows per fetch)
res = OCIAttrSet
(
(dvoid *)stmt,
(ub4)OCI_HTYPE_STMT,
(dvoid *)&prefetchRows,
(ub4)0,
(ub4)OCI_ATTR_PREFETCH_ROWS,
(OCIError *)err
);
40
© 2005 Julian Dyke
juliandyke.com
Row Prefetching
41
Example - full table scan
1000 row table
31 blocks (+ segment header)
Prefetch Size
Consistent Gets
1
1003
2
518
3
337
4
276
5
227
10
130
20
82
50
53
100
43
250
37
500
35
1000
34
© 2005 Julian Dyke
Consistent Gets
Prefetch Size
juliandyke.com
Thank you for your interest
For more information and to provide feedback
please contact me
My e-mail address is:
[email protected]
My website address is:
www.juliandyke.com
42
© 2005 Julian Dyke
juliandyke.com