Forgotten Features - RAC
Download
Report
Transcript Forgotten Features - RAC
RAC
Internals
Julian Dyke
Independent Consultant
Web Version
1
© 2007 Julian Dyke
juliandyke.com
Agenda
2
© 2007 Julian Dyke
Transactions in RAC
Cross Instance Consistent Reads
juliandyke.com
Introduction
3
© 2007 Julian Dyke
juliandyke.com
System Change Number
4
In RAC clusters SCN must be maintained across all nodes in
cluster
SCN propagation scheme differs according to version
In Oracle 9.2 and below defaults to Lamport algorithm
Lamport or SCN Scheme 2 in alert.log
SCN piggy-backed on GCS/GES messages
Recorded in redo log
Default delay of 7 seconds
In Oracle 10.1 and above uses a new algorithm
SCN Scheme 3 in alert.log
Broadcast on commit
Apparently no delay
© 2007 Julian Dyke
juliandyke.com
MAX_COMMIT_PROPAGATION_DELAY
5
Prior to Oracle 10.2
Default value is 700 centiseconds (7 seconds)
Specifies maximum time taken for a COMMIT on one node
to be reflected on other nodes in the cluster
For some applications, value must be set to 0 (Broadcast
on commit) including:
E-Business suite
SAP
In Oracle 10.2 and above default value is 0
© 2007 Julian Dyke
juliandyke.com
LMS Background Processes
6
LMS background processes:
Implement cache fusion
Serve both consistent and current versions of blocks in
cache of local instance to other instances
Maintain local part of Global Resource Directory
Minimum of 1 LMS process per instance
Maximum is version dependent
Oracle 9.2
10
Oracle 10.1 20
Oracle 10.2 36
Prior to Oracle 10.1, could be configured using _lm_lms
parameter
In Oracle 10.1 and above, initial number of LMS processes
specified by gcs_server_processes parameter
© 2007 Julian Dyke
juliandyke.com
LMS Background Processes
7
Each LMS background process manages a set of blocks
Determined by hash function based on number of LMS
background processes
Consequently
a block will always be handled by the same LMS
process
Number of blocks served recorded in
Session / System statistics
V$CR_BLOCK_SERVER
© 2007 Julian Dyke
juliandyke.com
Cross Instance Consistent Read
Instance 1
Instance 2
Session 15
LMS0
SELECT runs,wickets
FROM score
WHERE team = 'ENG';
Build read
consistent version
of block 42
Session 27
UPDATE score
SET runs = runs + 6
4
2
WHERE team = 'ENG';
segment 5 slot 18:
state: 10
wrap#: 4E7
dba: 00800777
Undo Header
ITL1
ITL1
ITL1
seq: 530 irb 12
xid: 0005.018.4E7
xid: 0005.018.4E7
xid: 0005.018.4E7
xid: 0005.018.4E7
uba:
uba: -800777.530.12
800777.530.13
800777.530.12
800777.530.13
800777.530.14
uba:
uba: -800777.530.12
800777.530.13
800777.530.12
800777.530.13
800777.530.14
uba: 800777.530.14
800777.530.12
800777.530.13
slot 0
slot 0
slot 0
col1: ENG
col1: ENG
col1: ENG
col2: 340
350
344
352
col2: 340
350
344
352
340
col2: 352
344
350
col3: 1
col3: 1
col3: 1
12 uba: 5.1
slot 1
slot 1
col1: AUS
col1: AUS
col1: AUS
col2: 99
col2: 99
col2: 99
col3: 10
col3: 10
col3: 10
DataData
Block
Block
42 (copy)
42
8
© 2007 Julian Dyke
DataData
Block
Block
42 (copy)
42
Data Block 42
col3: 340
13 uba 800777.530.12
5.1
slot 1
block 42 slot 0
block 42 slot 0
col3: 344
14 uba 800777.530.13
5.1
block 42 slot 0
col3: 350
Undo Block 800777
juliandyke.com
V$CR_BLOCK_SERVER
9
Column Name
Data Type
Description
CR_REQUESTS
NUMBER
# CR Blocks served to other instances
CURRENT_REQUESTS
NUMBER
# Current Blocks served to other instances
DATA_REQUESTS
NUMBER
# Data Blocks served to other instances
UNDO_REQUESTS
NUMBER
# Undo Blocks served to other instances
TX_REQUESTS
NUMBER
# Undo Segment Headers served to other instances
CURRENT_REQUESTS
NUMBER
# requests requiring no changes to blocks served
PRIVATE_REQUESTS
NUMBER
# requests requiring changes for requesting transaction only
ZERO_RESULTS
NUMBER
# requests requiring changes for zero-XID transactions only
DISK_READ_RESULTS
NUMBER
# requests requiring requesting instance to read block from disk
FAIL_RESULTS
NUMBER
# requests failing - requesting instance must reissue request
FAIRNESS_DOWN_CONVERTS
NUMBER
# times receiving instance has downgraded an X lock
FAIRNESS_CLEARS
NUMBER
# times fairness counter was cleared
FREE_GC_ELEMENTS
NUMBER
# times request received and X-lock had no buffers
FLUSHES
NUMBER
# times log flushes by LMS process(es)
FLUSHES_QUEUED
NUMBER
# flushes queued by LMS process(es)
FLUSH_QUEUE_FULL
NUMBER
# times flush queue was full
FLUSH_MAX_TIME
NUMBER
maximum time for flush
LIGHT_WORKS
NUMBER
# times light works rule was invoked
ERRORS
NUMBER
# times error signalled by LMS process
© 2007 Julian Dyke
juliandyke.com
Light Works Rule
10
In theory, once a block has been written to disk, the LMS
process will not attempt to read it again when responding to a
consistent read request
Light Works Rule
Prevents LMS processes from going to disk when
responding to CR requests for data, undo or undo segment
blocks
Can prevent LMS process from completing its response to
a CR request
© 2007 Julian Dyke
juliandyke.com
GC Read Committed Block
Committed Block - Data Block on disk
Session15
LMS0
Session27
22:9
22:10
ENG 199
ENG 205
ENG 205
199
200
204
AUS 99
ENG 200
AUS 99
ENG 204
Block 42
Undo
Block
SELECT runs
FROM score
WHERE team = 'ENG';
UPDATE score
SET runs = 200
WHERE team = 'ENG';
UPDATE score
SET runs = 204
WHERE team = 'ENG';
UPDATE score
SET runs = 205
WHERE team = 'ENG';
COMMIT;
199
ENG 205
AUS 99
Instance 1
11
STOP
© 2007 Julian Dyke
Instance 2
juliandyke.com
GC Read Committed Block
Committed Block - Data Block in buffer cache
Session15
LMS0
Session27
22:9
22:10
ENG 199
ENG 205
ENG 205
200
204
199
AUS 99
ENG 200
AUS 99
ENG 204
Block 42
Undo
Block
SELECT runs
FROM score
WHERE team = 'ENG';
UPDATE score
SET runs = 200
WHERE team = 'ENG';
UPDATE score
SET runs = 204
WHERE team = 'ENG';
UPDATE score
SET runs = 205
WHERE team = 'ENG';
COMMIT;
ENG 199
AUS 99
Instance 1
12
STOP
© 2007 Julian Dyke
Instance 2
juliandyke.com
GC Read Uncommitted Block
13
Uncommitted changes MUST be flushed to the redo log before
the LMS process can ship a consistent block to another
instance
Reading process must wait until redo log changes have been
written to redo log by LMS process
Bad for standard RAC databases
Reads must wait for redo log writes
Worse for extended / stretch RAC clusters
Increased latency of cross site disk communications
© 2007 Julian Dyke
juliandyke.com
GC Read Uncommitted Block
14
For each block on which a consistent read is performed, a
redo log flush must first be performed
Number of redo log flushes is recorded in the FLUSHES
column of V$CR_BLOCK_SERVER
Redo log flush time
is recorded in the gc cr block flush time statistic for the
LMS process
will increase time taken to serve consistent block
will increase time taken to perform consistent read
If LMS processes become very busy, consistent reads will
experience high wait times e.g. for a full table scan
gc cr multi block request
© 2007 Julian Dyke
juliandyke.com
GC Read Uncommitted Block
Uncommitted Block - Data Block in buffer cache
Session15
LMS0
Session27
22:10
ENG 199
ENG 199
ENG 199
205
204
200
AUS 99
SELECT runs
FROM score
WHERE team = 'ENG';
ENG 205
199
200
204
ENG 200
AUS 99
AUS 99
ENG 204
Block 42
Copy
Block 42
Undo
Block
UPDATE score
SET runs = 200
WHERE team = 'ENG';
UPDATE score
SET runs = 204
WHERE team = 'ENG';
UPDATE score
SET runs = 205
WHERE team = 'ENG';
ENG 199
AUS 99
Instance 1
15
STOP
© 2007 Julian Dyke
Instance 2
juliandyke.com
GC Read Uncommitted Block
Uncommitted Block - Data Block on disk
Session15
LMS0
Session27
ENG 199
ENG 205
199
200
204
AUS 99
SELECT runs
FROM score
WHERE team = 'ENG';
22:10
ENG 199
ENG 200
ENG 205
199
200
204
ENG 204
AUS 99
ENG 204
Block 42
Undo
Block
ENG 200
Instance 1
STOP
© 2007 Julian Dyke
UPDATE score
SET runs = 204
WHERE team = 'ENG';
UPDATE score
SET runs = 205
WHERE team = 'ENG';
ENG 205
199
200
204
SEE SLIDE NOTES
FOR ADDITIONAL
INFORMATION
AUS 99
16
UPDATE score
SET runs = 200
WHERE team = 'ENG';
Instance 2
juliandyke.com
Consistent Reads in RAC
17
If possible, blocks will always be read from the cache of
another instance
Undo blocks will be flushed to disk more frequently when:
All columns are updated
Indexed columns are updated
Single rows inserted
as opposed to using array inserts
Transactions are regularly rolled back
Rows locked using SELECT FOR UPDATE
Data blocks will be flushed to disk more frequently when:
Most transactions are read-only
© 2007 Julian Dyke
juliandyke.com
Consistent Reads in RAC
18
Consistent read response times in RAC can be reduced by:
Avoid reading uncommitted blocks on remote nodes
Partitioning
Limiting number of rows per block
Specifying SCN
Minimizing size of transactions on remote nodes
Must retain ACID properties
May be possible to use application logic to synchronize
writes and reads
Increasing number of LMS processes on remote node
Should be added dynamically by kernel
Also by obvious hardware changes such as
reducing latency of interconnect
increasing disk speed
© 2007 Julian Dyke
juliandyke.com
Thank you for your interest
For more information and to provide feedback
please contact me
My e-mail address is:
[email protected]
My website address is:
www.juliandyke.com
19
© 2007 Julian Dyke
juliandyke.com