Transcript Flashback Logging Internals
Flashback Logging Internals
Julian Dyke Independent Consultant
1
Web Version - December 2007
© 2007 Julian Dyke juliandyke.com
Agenda
Extended Clusters versus Fast Start Failover Flashback Database Flashback Logging Internals 2 © 2007 Julian Dyke juliandyke.com
Extended Clusters versus Fast Start Failover
juliandyke.com
3 © 2007 Julian Dyke
Extended Clusters Overview
Public Network Private Network Instance 1 Node 1 Quorum Site3 Instance 2 Node 2 4 Database Site1 © 2007 Julian Dyke Database Site2 juliandyke.com
Extended Clusters Overview
Currently the Holy Grail of high availability
RAC nodes located at physically separate sites
In-built disaster recovery
In the event of a site failure, database is still available
Active / Active configuration
Users can access database via either site 5
Storage is duplicated at each site
Can use ASM or vendor-supplied storage technology to ensure all writes are replicated to storage on each site © 2007 Julian Dyke juliandyke.com
6
Extended Clusters Advantages and Disadvantages
Advantages
Disaster recovery - all changes written to both sites
Active / Active - both sites available
Disadvantages
Complexity
Cache fusion traffic between sites
Requires Enterprise Edition licences + RAC option
Cost of inter-site fibre network © 2007 Julian Dyke juliandyke.com
Fast Start Failover Overview
Public Network Private Network Instance 1 Node 1 Instance 2 Node 2 Site3 7 Database Site1 - Primary © 2007 Julian Dyke Database Site2 - Standby juliandyke.com
Fast Start Failover Overview
Target standby database must be nominated
Failure of primary database can be detected and automatically failed over to nominated standby database
Primary database can potentially be reinstated automatically
Requires flashback logging
Requires DGMGRL configuration 8
Must configure MAXIMUM AVAILABILITY protection mode
Standby database archive log destination must be configured as LGWR SYNC © 2007 Julian Dyke juliandyke.com
Fast Start Failover Advantages & Disadvantages
Advantages
No interconnect network required between sites
No fibre network required between sites
RAC licences not required if each site is a single-instance
Disadvantages
Active / Passive
Requires Enterprise Edition licence 9 © 2007 Julian Dyke juliandyke.com
Fast Start Failover Observer
Requires third independent site with:
Oracle client installation (administrative user)
Oracle Net configuration to primary and standby
On third site:
DGMGRL starts observer
Observer monitors state of primary database
If primary database fails observer initiates failover to target standby database
Observer checks if standby database can still see primary database before initiating failover 10
Performance impact of observer process on primary / standby is minimal © 2007 Julian Dyke juliandyke.com
11 © 2007 Julian Dyke
Flashback Database
juliandyke.com
Flashback Database Introduction
Introduced in Oracle 10.1
Uses past block images to back out changes to a database
Allows database to be recovered to a previous time to correct problems caused by:
logical data corruptions
user errors
Amount of time required to flashback a database is proportional to how far back database must be reverted 12
Time to restore and recover entire database could be much longer © 2007 Julian Dyke juliandyke.com
Flashback Database Introduction
During normal database operation, Oracle occasionally logs past block images in flashback logs
Flashback logs are
written sequentially
not archived
Oracle automatically creates, resizes and deletes flashback logs in the flash recovery area 13
DBA should be aware of flashback logs
To monitor performance
To decide how much space to allocate to flash recovery area © 2007 Julian Dyke juliandyke.com
Flashback Database Flashing Back
Before images are used to restore database to a point in the past
Forward recovery is then used to bring the database to a consistent state
Oracle returns datafiles to previous point in time
Not auxiliary files such as initialization parameter files 14 © 2007 Julian Dyke juliandyke.com
Flashback Database Applications
Flashback recovery of database to earlier SCN
Testing
Application / User errors
Recovery through resetlogs
Opening standby database with write access
Fast start failover
Automatic reinstantiation of old primary following fast start failover to standby 15
Alternative to delayed redo application for physical or logical standby databases © 2007 Julian Dyke juliandyke.com
Flashback Database What do we already know?
Introduced in Oracle 10.1
Requires flash recovery area
Maintains before image logs for block changes
Records are appended to flashback logs
Uses RVWR background process 16 © 2007 Julian Dyke juliandyke.com
Flashback Database What don't we know?
Are index blocks logged?
Is undo logged?
Is temporary segments logged?
What happens when a segment is deleted Is a block logged every time it is changed?
If not, how does Oracle know?
What when an object leaves the buffer cache Is there any control structure What about multiple block sizes?
How does it work in RAC?
What about contention - latches?
Undocumented parameters?
When is flashback overwritten?
17 © 2007 Julian Dyke juliandyke.com
Flash Recovery Area Prerequisites
Archiving must be enabled
Flash recovery area must be configured using
DB_RECOVERY_FILE_DEST_SIZE - size of flashback recovery area in bytes
DB_RECOVERY_FILE_DEST - location of flashback recovery area
For example: SQL> ALTER SYSTEM SET db_recovery_file_dest_size = 10G; SQL> ALTER SYSTEM SET db_recovery_file_dest = '/oradata/recovery'; 18 © 2007 Julian Dyke juliandyke.com
Flashback Database Parameters
One supported parameter:
DB_FLASHBACK_RETENTION_TARGET
Specifies upper limit on how far back in time database may be flashed back
Specified in minutes
Default value is 1440 minutes (24 hours)
Affects number of flashback logs retained in flash recovery area 19 © 2007 Julian Dyke juliandyke.com
Flashback Database Configuration
To enable flashback logging database must be mounted but not open SQL> STARTUP MOUNT SQL> ALTER DATABASE FLASHBACK ON; SQL> ALTER DATABASE OPEN;
To disable flashback logging use: SQL> ALTER DATABASE FLASHBACK OFF;
To check if flashback is currently enabled: SQL> SELECT flashback_on FROM v$database;
FLASHBACK_ON ----------- YES
20 © 2007 Julian Dyke juliandyke.com
Flashback Database System Change Numbers and Times
To check current SCN use: SQL> SELECT current_scn FROM v$database;
To check oldest SCN that can be flashed back to use: SQL> SELECT oldest_flashback_scn FROM v$flashback_database_log;
To check oldest time that can be flashed back to use: SQL> ALTER SESSION SET nls_date_format = 'DD-MON-YYYY HH24:MI:SS'; SQL> SELECT oldest_flashback_time FROM v$flashback_database_log; 21 © 2007 Julian Dyke juliandyke.com
Flashback Database Operation
To flashback the database use the following syntax: SQL> FLASHBACK [ STANDBY ] DATABASE [ database ] { TO { { SCN | TIMESTAMP } expr | RESTORE POINT restore_point } | TO BEFORE { SCN | TIMESTAMP } expr | RESETLOGS} };
Database must be mounted and not open to flashback
For example SQL> SHUTDOWN IMMEDIATE SQL> STARTUP MOUNT SQL> FLASHBACK DATABASE TO SCN 461918; Flashback complete.
SQL> ALTER DATABASE OPEN READ ONLY; SQL> ALTER DATABASE OPEN RESETLOGS 22 © 2007 Julian Dyke juliandyke.com
Flashback Database Restrictions
Cannot flash back to an SCN ahead of the current SCN
Cannot flash back to a time in the future
Database must be opened with read write access
Cannot open read only
Database must be opened with RESETLOGS
Cannot flash back if datafile resized (shrunk) during flashback period 23 © 2007 Julian Dyke juliandyke.com
Flashback Database Dynamic Performance Views
V$FLASHBACK_DATABASE_LOG OLDEST_FLASHBACK_SCN OLDEST_FLASHBACK_TIME RETENTION_TARGET FLASHBACK_SIZE ESTIMATED_FLASHBACK_SIZE NUMBER DATE NUMBER NUMBER NUMBER 24
V$FLASHBACK_DATABASE_STAT BEGIN_TIME END_TIME FLASHBACK_DATA DB_DATA REDO_DATA ESTIMATED_FLASHBACK_SIZE DATE DATE NUMBER NUMBER NUMBER NUMBER © 2007 Julian Dyke juliandyke.com
Flashback Database Dynamic Performance Views
V$FLASHBACK_DATABASE_LOGFILE NAME LOG# THREAD# SEQUENCE# BYTES FIRST_CHANGE# FIRST_TIME VARCHAR2(13) NUMBER NUMBER NUMBER NUMBER NUMBER DATE 25 © 2007 Julian Dyke juliandyke.com
26 © 2007 Julian Dyke
Flashback Logging Internals
juliandyke.com
Flashback Log Files Location and Naming
Stored in Flash Recovery Area (mandatory)
Subdirectory is
Use Oracle-Managed Files (OMF) (mandatory) For example
o1_mf_3504ofnh_.flb
o1_mf_350g3r24_.flb
o1_mf_350jl666_.flb
Used sequentially
Can be reused 27
Generated when required
Dropped when space required in flash recovery area © 2007 Julian Dyke juliandyke.com
Flashback Log Files Sizing
Flashback log size same as database block size
e.g. 4096 or 8192
Initial size is 1001 x block size
determined by
_flashback_log_size (defaults to 1000)
additional block for file header e.g
1001 x 8192 = 8200192 bytes 28
Subsequent size reduces to 3989504
probably determined by
size of flashback generation buffer
additional block for file header note there is a rounding error here (3981204) © 2007 Julian Dyke juliandyke.com
29
Flashback Log Files Controlfile Dumps
SQL> ALTER SESSION SET EVENTS 'immediate trace name controlf level 3';
******************************************************************* FLASHBACK LOGFILE RECORDS ******************************************************************* FLASHBACK LOG FILE #4: (name #12) /oradata/recovery/PROD/flashback/o1_mf_350kw47d_.flb
Thread 1 flashback log links: forward: 5 backward: 3 size: 486 seq: 4 bsz: 8192 nab: 0x1e7 flg: 0x0 magic: 3 dup: 1 Low scn: 0x0000.00071169 05/20/2007 14:05:08 High scn: 0x0000.00071980 05/02/2007 15:16:48 FLASHBACK LOG FILE #5: (name #13) /oradata/recovery/PROD/flashback/o1_mf_350p2jz0_.flb
Thread 1 flashback log links: forward: 6 backward: 4 size: 486 seq: 5 bsz: 8192 nab: 0x1e7 flg: 0x0 magic: 5 dup: 1 Low scn: 0x0000.00071980 05/20/2007 15:16:48 High scn: 0x0000.0007247b 05/02/2007 16:43:13
Current Logfile
FLASHBACK LOG FILE #6: (name #14) /oradata/recovery/PROD/flashback/o1_mf_350v4kz1_.flb
Thread 1 flashback log links: forward: 1 backward: 5 size: 486 seq: 4 bsz: 8192 nab: 0xffffffff flg: 0x0 magic: 4 dup: 1 Low scn: 0x0000.0007247b 05/20/2007 16:43:13 High scn: 0xffff.ffffffff 05/02/2007 00:00:00
© 2007 Julian Dyke juliandyke.com
Flashback Logging Recovery Writer Process
Flashback uses the recovery writer ( RVWR ) background process
Copies flashback blocks from flashback generation buffer to flashback logs SELECT description FROM v$bgprocess WHERE name = 'RVWR';
DESCRIPTION -------------- Recovery Writer
Checks for records in flashback generation buffer every 3 seconds
Waits on rdbms ipc message 30
In Linux records written to disk using pwrite64
Multi block writes (8192 byte records) © 2007 Julian Dyke juliandyke.com
Flashback Logging Recovery Writer Process
Recovery process structure is linked into SGA global area SELECT addr FROM x$ksbdp WHERE ksbdpnam = 'RVWR';
ADDR -------------- 2000D860
SELECT ksmfsnam,ksmfstyp FROM x$ksmfsv WHERE ksmfsadr = '2000D860';
KSMFSNAM KSMFSTYP -------- ------ krfwrp_ ksbdp
ksbdp structure for RVWR background process is krfwrp_ 31 © 2007 Julian Dyke juliandyke.com
Flashback Generation Buffer Sizing
Flashback uses a flashback generation buffer
Size of generation buffer is recorded in V$SGASTAT
Size is determined by _flashback_generation_buffer_size
defaults to 4194304
To verify size of buffer use SELECT bytes FROM v$sgastat WHERE pool = 'shared pool' AND name = 'flashback generation buff';
BYTES --------- 4194304
32 © 2007 Julian Dyke juliandyke.com
Flashback Generation Buffer Granules
Flashback generation buffer appears to be limited to a single granule
If granule size is less than _flashback_generation_buffer_size
buffer size will be rounded down
For example for a 4mb granule size: SELECT bytes FROM v$sgastat WHERE pool = 'shared pool' AND name = 'flashback generation buff';
BYTES --------- 3981204
33
Granule size can be controlled using _ksmg_granule_size © 2007 Julian Dyke juliandyke.com
34
Flashback Generation Buffer Location
To determine location of flashback generation buffer use: ALTER SYSTEM SET EVENTS 'immediate trace name global_area level 2';
ksbdp krfwrp_ [2000D860, 2000D88C) = 0000007B 2AE1C924 00000000 00000000 ...
Dump of memory from 0x2000D870 to 0x2000D88C 2000D870 52575652 00000200 00006723 0005A080 [RVWR....#g......] 2000D880 00000001 199DC5EA 00040081 KSBDPPRO = 0X2AE1C924 KSBDPSER = 1 KSBDPERR = 0 KSBDPNAM = 'RVWR' KSBDPFLG = 2
Location of RVWR background process
krfwb krfwbf_ [2000D8DC 2000D970) = 000001E5 00002000 003C7288 00001FE8 ...
Dump of memory from 0x2000D8CC to 0x2000D9F0 2000D8C0 27834200 2000D8D0 003CBD94 000001E6 000001E6 00000003 2000D8E0 29A1B71C 00000002 00037D60 00000001 etc..
Location of flashback generation buffer
In this 32 bit example location is 0x27834200 © 2007 Julian Dyke juliandyke.com
Flashback Generation Buffer Shared Pool Reserved Area
Size of flashback generation buffer is affected by shared pool reserved area
By default 5% of each granule is allocated to shared pool reserved area
For example our flashback generation buffer is 0x27834200
Granule size is 4MB SELECT MAX(baseaddr), gransize FROM x$ksmge WHERE baseaddr <= '27834200';
MAX(BASEADDR) GRANSIZE ------------------------ 27800000 4194304
SELECT ksmchptr,ksmchsiz FROM x$ksmspr WHERE ksmchptr >= '27800000' AND ksmchptr < '27C00000';
KSMCHPTR KSMCHSIZ ---------------------- 27800038 24 27800050 212888 27833FE8 24
35 © 2007 Julian Dyke juliandyke.com
Flashback Logging Latches
The following latches are used by flashback logging
flashback allocation
flashback mapping
flashback copy
flashback sync request
flashback FBA barrier
flashback SCN barrier
hint flashback FBA barrier
flashback hint SCN barrier 36
By default each latch only has one child except
flashback copy latch
maximum number of copy latches may be determined by _flashback_copy_latches © 2007 Julian Dyke juliandyke.com
Flashback Log Files Dumps
The following dumps are undocumented
All flashback records for a thread can be dumped using:
SQL> ALTER SYSTEM DUMP FLASHBACK THREAD
All flashback records for a specific flashback logfile can be dumped using SQL> ALTER SYSTEM DUMP FLASHBACK LOGFILE
Flashback logfiles are numbered from 1 upwards 37 © 2007 Julian Dyke juliandyke.com
Flashback Log Files Dumps
All flashback records for a specific record type can be dumped using: SQL> ALTER SYSTEM DUMP FLASHBACK LOGFILE
All flashback records for a specific database block number can be dumped using: SQL> ALTER SYSTEM DUMP FLASHBACK LOGFILE
By default block dumps etc are included in the dump file To dump a summary of records in the flashback log use: SQL> ALTER SYSTEM DUMP FLASHBACK LOGFILE
Flashback Log Files Dumps
Example of header
DUMP OF FLASHBACK LOG FILE 9 FILE HEADER: Compatibility Vsn = 169869568=0xa200100 Db ID=308670124=0x1265eeac, Db Name='FLASH' Activation ID=308689068=0x126638ac Control Seq=318=0x13e, File size=972=0x3cc File Number=9, Blksiz=8192, File Type=8 FLASH BACK FLASHBACK HEADER: Flashback Block Header: Seq: 9 Block: 1 Cks: 0x22b Flag: 0x1 Lst: 0 description:"Thread 0001, Seq# 0000000009, SCN 0x00000003a2d7" thread: 1 seq: 9 version 0 nab: 0x3cd reset logs count: 0x25102f2c scn: 0x0000.00000001
formatted blocks: 972 usable blocks: 972 magic: 5 previous magic: 0 flags: 0x0 Low scn: 0x0000.0003a2d7 05/07/2007 10:31:48 High scn: 0x0000.000401d3 05/26/2007 16:59:06 Last Marker: fba: (lno 0 thr 0 seq 0 bno 0 bof 0)
39 © 2007 Julian Dyke juliandyke.com
40
Flashback Database Dumps
Example of block image
**** Record at fba: (lno 9 thr 1 seq 9 bno 966 bof 692) **** RECORD HEADER: Type: 1 (Block Image) Size: 28 RECORD DATA (Block Image): file#: 1 rdba: 0x00406efc Next scn: 0x0000.00000000 [0.0] Flag: 0x0 Block Size: 8192 BLOCK IMAGE: buffer rdba: 0x00406efc scn: 0x0000.00034d8e seq: 0x01 flg: 0x06 tail: 0x4d8e0601 frmt: 0x02 chkval: 0xf52b type: 0x06=trans data Hex dump of block: st=0, typ_found=1 Dump of memory from 0xB56CDC00 to 0xB56CFC00 B56CDC00 0000A206 00406EFC 00034D8E 06010000 [[email protected]......] B56CDC10 0000F52B 00000001 0000023D 00034D8C [+.......=....M..]
© 2007 Julian Dyke juliandyke.com
Flashback Records Record Types
Every flashback record has a type Type 1 2 3 4 5 6 7 8 9 10 Description Block Image Marker Skip Set 4 Byte Skip Empty Block Image Begin Crash Recovery Record Drop File Drop Tablespace Add File Type 11 12 13 14 15 16 17 18 19 20 Description Add Tablespace Resize File Convert Plugin Rename Tablespace TSPITR Resetlogs Absolute Set Primary Switchover Standby Switchover Incarnation Change 41 © 2007 Julian Dyke juliandyke.com
42
Flashback Logging RVWR Background Process Dumps
Some additional RVWR background process dumps can be executed from ORADEBUG Dumping session must attach to RVWR process Either use operating system process id
$ ps -ef | grep rvwr | grep -v grep oracle 11055 1 0 16:04 ? 00:00:00 ora_rvwr_PROD
SQL> ORADEBUG SETOSPID 11055;
Or use Oracle process id ( SQL> SELECT pid FROM v$process WHERE addr IN SELECT paddr FROM v$bgprocess WHERE name = 'RVWR' );
PID -- 20
SQL> ORADEBUG SETORAPID 20; © 2007 Julian Dyke juliandyke.com
43
Flashback Logging RVWR Background Process Dumps
To dump flashback generation status use: SQL> ORADEBUG DUMP FLASHBACK_GEN 1
To dump flashback logfile headers use: SQL> ORADEBUG DUMP FBHDR 1
To dump all logical flashback records in the current flashback incarnation use: SQL> ORADEBUG DUMP FBINC 1
To include before images in the above dump use: SQL> ORADEBUG DUMP FBINC 2
To dump the last 2000 flashback records use: SQL> ORADEBUG DUMP FBTAIL 1 © 2007 Julian Dyke juliandyke.com
Flashback Log Physical Structure
Block size determined by db_block_size parameter
Block 0 contains file header
Remaining blocks have 16 byte block header Includes check sum File Header Block Header 44 STOP © 2007 Julian Dyke juliandyke.com
Flashback Records Logical Structure
Added sequentially to flashback logs
Consists of a header and an optional body
If present body is written first followed by header
For all record types
Header includes type and length
Structure is read backwards
Logical records can cross physical record boundaries 45 © 2007 Julian Dyke juliandyke.com
Flashback Records Logical Structure
Record# 1 Record# 2 Record# 3 Record# 4 Record# 5 46 STOP © 2007 Julian Dyke Body Header Body Header Body Header Header Body Header juliandyke.com
Flashback Records Physiological Structure
File Header Flashback Records Empty Space 47 STOP © 2007 Julian Dyke juliandyke.com
Flashback Records Block Images
For block images
Body is a copy of the data block
Used for data blocks, undo blocks
Not compressed
Flashback records are always larger than single block
Include 28 byte header 48
Common block types appearing as block images include
Data and index blocks (trans data)
Segment headers
Undo headers
Undo blocks (manual and automatic)
Local tablespace bitmap blocks
Automatic segment space management bitmap blocks © 2007 Julian Dyke juliandyke.com
49
Flashback Logs Flashback log tail
New flashback records are always appended beyond the flashback log tail Flashback database commands start at the flashback log tail and work forwards
To check flashback log tail use: SQL> ALTER SESSION SET EVENTS 'immediate trace name controlf level 2';
For example:
**************************************************************** CHECKPOINT PROGRESS RECORDS **************************************************************** THREAD #1 - status:0x2 flags:0x0 dirty:15 low cache rba:(0xd.1f33.0) on disk rba:(0xd.1f42.0) on disk scn: 0x0000.0004087e 05/26/2007 18:11:01 resetlogs scn: 0x0000.00000001 05/05/2007 23:07:24 heartbeat: 623592856 mount id: 310450827 Flashback log tail log# 1 thread# 1 seq 10 block 309 byte 0
© 2007 Julian Dyke juliandyke.com
Flashback Logs Flashback log tail
Current pointer is also maintained in SGA. For example: SQL> ALTER SESSION SET EVENTS 'immediate trace name global_area level 2'; 50
krfwb krfwbf_ [2000D8BC, 2000D9F0) = 000001E5 00002000 003C7288 00001FE8 Dump of memory from 0x2000D8CC to 0x2000D9F0 2000D8C0 28434200 [.BC(] 2000D8D0 003CBD94 000001E6 000001E6 00000003 [..<.............] 2000D8E0 2A61B71C 00000002 003C7288 00000001 [..a*.....r<.....] 2000D8F0 00000001 002A21F0 00000002 00000000 [.....!*.........] 2000D900 00000001 00000000 00000000 00000002 [................] 2000D910 00000000 002A01D4 003C6C3C 00000001 [......*.
Sequence Number 0xA = 10 Block Number 0x135=309 Log Number 0x1=1
2000D9E0 001E5ECA 002A2050 00000000 00000000 [.^..P *.........]
© 2007 Julian Dyke juliandyke.com
Flashback Logging Conclusions
Very similar design to LGWR
Changes initially written to memory buffer
RVWR subsequently flushes flashback records to disk
Requires memory buffer
Defaults to 4MB
Efficient multi-block disk writes 51 © 2007 Julian Dyke juliandyke.com
Flashback Logging Conclusions
All blocks are logged when they first become dirty
Includes data, indexes, undo, segment headers, bitmaps
Subsequent changes not necessarily logged
No separate structure identified so probably uses flags in buffer headers to monitor which blocks have been logged
Flag may be reset when DBWR flushes dirty block to disk
Amount of flashback redo log generated roughly equivalent to value of physical writes statistics 52 © 2007 Julian Dyke juliandyke.com
Thank you for listening
[email protected]
53 © 2007 Julian Dyke juliandyke.com