Forgotten Features - RAC

Download Report

Transcript Forgotten Features - RAC

RAC
Internals
Julian Dyke
Independent Consultant
Web Version
1
© 2007 Julian Dyke
juliandyke.com
Agenda

2
© 2007 Julian Dyke
Transactions in RAC
 Cross Instance Consistent Reads
juliandyke.com
Introduction
3
© 2007 Julian Dyke
juliandyke.com
System Change Number

4
In RAC clusters SCN must be maintained across all nodes in
cluster
 SCN propagation scheme differs according to version

In Oracle 9.2 and below defaults to Lamport algorithm
 Lamport or SCN Scheme 2 in alert.log
 SCN piggy-backed on GCS/GES messages
 Recorded in redo log
 Default delay of 7 seconds

In Oracle 10.1 and above uses a new algorithm
 SCN Scheme 3 in alert.log
 Broadcast on commit
 Apparently no delay
© 2007 Julian Dyke
juliandyke.com
MAX_COMMIT_PROPAGATION_DELAY
5

Prior to Oracle 10.2
 Default value is 700 centiseconds (7 seconds)
 Specifies maximum time taken for a COMMIT on one node
to be reflected on other nodes in the cluster
 For some applications, value must be set to 0 (Broadcast
on commit) including:
 E-Business suite
 SAP

In Oracle 10.2 and above default value is 0
© 2007 Julian Dyke
juliandyke.com
LMS Background Processes

6
LMS background processes:
 Implement cache fusion
 Serve both consistent and current versions of blocks in
cache of local instance to other instances
 Maintain local part of Global Resource Directory
 Minimum of 1 LMS process per instance
 Maximum is version dependent
 Oracle 9.2
10
 Oracle 10.1 20
 Oracle 10.2 36
 Prior to Oracle 10.1, could be configured using _lm_lms
parameter
 In Oracle 10.1 and above, initial number of LMS processes
specified by gcs_server_processes parameter
© 2007 Julian Dyke
juliandyke.com
LMS Background Processes
7

Each LMS background process manages a set of blocks
 Determined by hash function based on number of LMS
background processes
 Consequently
 a block will always be handled by the same LMS
process

Number of blocks served recorded in
 Session / System statistics
 V$CR_BLOCK_SERVER
© 2007 Julian Dyke
juliandyke.com
Cross Instance Consistent Read
Instance 1
Instance 2
Session 15
LMS0
SELECT runs,wickets
FROM score
WHERE team = 'ENG';
Build read
consistent version
of block 42
Session 27
UPDATE score
SET runs = runs + 6
4
2
WHERE team = 'ENG';
segment 5 slot 18:
state: 10
wrap#: 4E7
dba: 00800777
Undo Header
ITL1
ITL1
ITL1
seq: 530 irb 12
xid: 0005.018.4E7
xid: 0005.018.4E7
xid: 0005.018.4E7
xid: 0005.018.4E7
uba:
uba: -800777.530.12
800777.530.13
800777.530.12
800777.530.13
800777.530.14
uba:
uba: -800777.530.12
800777.530.13
800777.530.12
800777.530.13
800777.530.14
uba: 800777.530.14
800777.530.12
800777.530.13
slot 0
slot 0
slot 0
col1: ENG
col1: ENG
col1: ENG
col2: 340
350
344
352
col2: 340
350
344
352
340
col2: 352
344
350
col3: 1
col3: 1
col3: 1
12 uba: 5.1
slot 1
slot 1
col1: AUS
col1: AUS
col1: AUS
col2: 99
col2: 99
col2: 99
col3: 10
col3: 10
col3: 10
DataData
Block
Block
42 (copy)
42
8
© 2007 Julian Dyke
DataData
Block
Block
42 (copy)
42
Data Block 42
col3: 340
13 uba 800777.530.12
5.1
slot 1
block 42 slot 0
block 42 slot 0
col3: 344
14 uba 800777.530.13
5.1
block 42 slot 0
col3: 350
Undo Block 800777
juliandyke.com
V$CR_BLOCK_SERVER
9
Column Name
Data Type
Description
CR_REQUESTS
NUMBER
# CR Blocks served to other instances
CURRENT_REQUESTS
NUMBER
# Current Blocks served to other instances
DATA_REQUESTS
NUMBER
# Data Blocks served to other instances
UNDO_REQUESTS
NUMBER
# Undo Blocks served to other instances
TX_REQUESTS
NUMBER
# Undo Segment Headers served to other instances
CURRENT_REQUESTS
NUMBER
# requests requiring no changes to blocks served
PRIVATE_REQUESTS
NUMBER
# requests requiring changes for requesting transaction only
ZERO_RESULTS
NUMBER
# requests requiring changes for zero-XID transactions only
DISK_READ_RESULTS
NUMBER
# requests requiring requesting instance to read block from disk
FAIL_RESULTS
NUMBER
# requests failing - requesting instance must reissue request
FAIRNESS_DOWN_CONVERTS
NUMBER
# times receiving instance has downgraded an X lock
FAIRNESS_CLEARS
NUMBER
# times fairness counter was cleared
FREE_GC_ELEMENTS
NUMBER
# times request received and X-lock had no buffers
FLUSHES
NUMBER
# times log flushes by LMS process(es)
FLUSHES_QUEUED
NUMBER
# flushes queued by LMS process(es)
FLUSH_QUEUE_FULL
NUMBER
# times flush queue was full
FLUSH_MAX_TIME
NUMBER
maximum time for flush
LIGHT_WORKS
NUMBER
# times light works rule was invoked
ERRORS
NUMBER
# times error signalled by LMS process
© 2007 Julian Dyke
juliandyke.com
Light Works Rule
10

In theory, once a block has been written to disk, the LMS
process will not attempt to read it again when responding to a
consistent read request

Light Works Rule
 Prevents LMS processes from going to disk when
responding to CR requests for data, undo or undo segment
blocks
 Can prevent LMS process from completing its response to
a CR request
© 2007 Julian Dyke
juliandyke.com
GC Read Committed Block
Committed Block - Data Block on disk
Session15
LMS0
Session27
22:9
22:10
ENG 199
ENG 205
ENG 205
199
200
204
AUS 99
ENG 200
AUS 99
ENG 204
Block 42
Undo
Block
SELECT runs
FROM score
WHERE team = 'ENG';
UPDATE score
SET runs = 200
WHERE team = 'ENG';
UPDATE score
SET runs = 204
WHERE team = 'ENG';
UPDATE score
SET runs = 205
WHERE team = 'ENG';
COMMIT;
199
ENG 205
AUS 99
Instance 1
11
STOP
© 2007 Julian Dyke
Instance 2
juliandyke.com
GC Read Committed Block
Committed Block - Data Block in buffer cache
Session15
LMS0
Session27
22:9
22:10
ENG 199
ENG 205
ENG 205
200
204
199
AUS 99
ENG 200
AUS 99
ENG 204
Block 42
Undo
Block
SELECT runs
FROM score
WHERE team = 'ENG';
UPDATE score
SET runs = 200
WHERE team = 'ENG';
UPDATE score
SET runs = 204
WHERE team = 'ENG';
UPDATE score
SET runs = 205
WHERE team = 'ENG';
COMMIT;
ENG 199
AUS 99
Instance 1
12
STOP
© 2007 Julian Dyke
Instance 2
juliandyke.com
GC Read Uncommitted Block
13

Uncommitted changes MUST be flushed to the redo log before
the LMS process can ship a consistent block to another
instance

Reading process must wait until redo log changes have been
written to redo log by LMS process

Bad for standard RAC databases
 Reads must wait for redo log writes

Worse for extended / stretch RAC clusters
 Increased latency of cross site disk communications
© 2007 Julian Dyke
juliandyke.com
GC Read Uncommitted Block
14

For each block on which a consistent read is performed, a
redo log flush must first be performed

Number of redo log flushes is recorded in the FLUSHES
column of V$CR_BLOCK_SERVER

Redo log flush time
 is recorded in the gc cr block flush time statistic for the
LMS process
 will increase time taken to serve consistent block
 will increase time taken to perform consistent read

If LMS processes become very busy, consistent reads will
experience high wait times e.g. for a full table scan
gc cr multi block request
© 2007 Julian Dyke
juliandyke.com
GC Read Uncommitted Block
Uncommitted Block - Data Block in buffer cache
Session15
LMS0
Session27
22:10
ENG 199
ENG 199
ENG 199
205
204
200
AUS 99
SELECT runs
FROM score
WHERE team = 'ENG';
ENG 205
199
200
204
ENG 200
AUS 99
AUS 99
ENG 204
Block 42
Copy
Block 42
Undo
Block
UPDATE score
SET runs = 200
WHERE team = 'ENG';
UPDATE score
SET runs = 204
WHERE team = 'ENG';
UPDATE score
SET runs = 205
WHERE team = 'ENG';
ENG 199
AUS 99
Instance 1
15
STOP
© 2007 Julian Dyke
Instance 2
juliandyke.com
GC Read Uncommitted Block
Uncommitted Block - Data Block on disk
Session15
LMS0
Session27
ENG 199
ENG 205
199
200
204
AUS 99
SELECT runs
FROM score
WHERE team = 'ENG';
22:10
ENG 199
ENG 200
ENG 205
199
200
204
ENG 204
AUS 99
ENG 204
Block 42
Undo
Block
ENG 200
Instance 1
STOP
© 2007 Julian Dyke
UPDATE score
SET runs = 204
WHERE team = 'ENG';
UPDATE score
SET runs = 205
WHERE team = 'ENG';
ENG 205
199
200
204
SEE SLIDE NOTES
FOR ADDITIONAL
INFORMATION
AUS 99
16
UPDATE score
SET runs = 200
WHERE team = 'ENG';
Instance 2
juliandyke.com
Consistent Reads in RAC
17

If possible, blocks will always be read from the cache of
another instance

Undo blocks will be flushed to disk more frequently when:
 All columns are updated
 Indexed columns are updated
 Single rows inserted
 as opposed to using array inserts
 Transactions are regularly rolled back
 Rows locked using SELECT FOR UPDATE

Data blocks will be flushed to disk more frequently when:
 Most transactions are read-only
© 2007 Julian Dyke
juliandyke.com
Consistent Reads in RAC


18
Consistent read response times in RAC can be reduced by:
 Avoid reading uncommitted blocks on remote nodes
 Partitioning
 Limiting number of rows per block
 Specifying SCN
 Minimizing size of transactions on remote nodes
 Must retain ACID properties
 May be possible to use application logic to synchronize
writes and reads
 Increasing number of LMS processes on remote node
 Should be added dynamically by kernel
Also by obvious hardware changes such as
 reducing latency of interconnect
 increasing disk speed
© 2007 Julian Dyke
juliandyke.com
Thank you for your interest
For more information and to provide feedback
please contact me
My e-mail address is:
[email protected]
My website address is:
www.juliandyke.com
19
© 2007 Julian Dyke
juliandyke.com