Informix Backups and OnBar
Informix
Backup/Recovery 2000
John F. Miller III
Erik van Veen
Outline
• History
• Architecture
• Clients
• ontape
• onbar
• onbar system tables
• Data Transfer
• Engine Threads
• Internal Data Formats
• Improvements
• Future
Informix
user.conference
History
• 1.X Turbo - Only Quiescent mode archives
• 4.X named OnLine for advanced archiving technology
• 5.X same core technology
• limitation revealed (scalability & extensibility)
• 6.0 new client/server model developed
• 7.1 & 7.20 same core technology
• 7.21 new client (onbar)
• 7.3/9.2 server API re-write
• 9.21 removal of timestamp updating
Pre-DSA Archive Creation
Bad Grammar Archiver
• Archive Checkpoint
• Acquire archive timestamp
• Free extents recorded
• Reserved pages saved
• Chunks backed-up by ascending chunk number
• Before images of pages modified during the archive are placed in the physical log
• tbtape scans the physical log for unarchived before images
• Pages placed directly to tape based on their:
• Page header
• Timestamp counter
Note: Excellent detailed description in 5.0 Admin guide
Understanding Archive Timestamps
[Figure: the timestamp range, marked at 0, Min-Stamp, Archive Stamp, and Current Stamp]
• Min-Stamp - the timestamp 50% away from the Archive Stamp
• Archive Stamp - the timestamp at the start of the archive
• Current Stamp - the timestamp at the current point in time
• Green region - all pages are sent directly to tape
• Red region - all pages have their timestamp updated before being archived
• Remaining region - not archived
Problems seen in Pre-DSA Architecture
• No Parallelism
• Data Streams
• Division of labor (tape and disk I/O)
• Changing a tape can hang the system
• All or nothing restores
• Not an Enterprise solution
• No jukebox support
• No integration with storage vendors
DSA Archive Architecture
Major Differences
• True client-server architecture
• Archived pages logically grouped by dbspaces
• Granularity of creations
• Granularity of restores
• Warm restores
• Physical log pages kept in temp tables
Architecture Overview
[Diagram: the Archive Client on the CLIENT side and ONINIT on the SERVER side, linked by a Network Connection and by Shared Memory]
Architecture Overview
[Diagram components of the Archive Client: Onbar, Common Archive Code, XBSA, SMV XBSA]
Architecture Overview -- OnTape
[Diagram components: Basic I/O, ontape, ESQL/C, Dynamic Server, sysmaster]
Architecture Overview -- OnBar
[Diagram components: Storage Manager, XBSA, onbar, ESQL/C, Dynamic Server (sysutils, sysmaster); files: ixbar, oncfg._, sqlhosts, onconfig]
OnBar System Tables
• All tables reside in the sysutils database
• Four main tables
• bar_server
• bar_object
• bar_action
• bar_instance
OnBar System Tables
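[Diagram: key relationships between the four tables]
• bar_server (srv_name) - bar_object (obj_srv_name)
• bar_object (obj_oid) - bar_instance (ins_oid)
• bar_object (obj_oid) - bar_action (act_oid)
• bar_instance (ins_aid, ins_oid) - bar_action (act_aid, act_oid)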
bar_server
• Contains a list of database servers
• This table is populated based on sqlhosts file
bar_server - Active online servers
Column     Type       Contains
srv_name   char(128)  Name of the online server
srv_node   char(64)   Node name
bar_object
• Describes each backup object
• dbspaces
• logical logs
• A backup must be attempted on an object for
this table to have any values
bar_object - Backed up objects
Column        Type       Contains
obj_srv_name  char(128)  Name of the online server
obj_oid       serial     Object identifier
obj_name      char(128)  Object name
obj_type      char(2)    Object Type (R, CD, ND, B, L)
                         R   Root Dbspace
                         CD  Critical Dbspace
                         ND  Non-Critical Dbspace
                         B   Blobspace
                         L   Logical Logs
bar_action
• List all backup and restore actions ATTEMPTED
• Excluding cold restores
bar_action - actions attempted
Column      Type      Contains
act_aid     integer   Action identifier
act_oid     integer   Object identifier
act_type    smallint  Type of action
                      1  Backup
                      2  Restore
                      3  Imported restore
                      4  Fake backup
                      5  Whole system backup
                      6  Whole system restore
                      7  Backup of deleted object
                      8  External restore
act_status  integer   Status of action (0 is a successful operation)
act_start   datetime  Begin time
act_end     datetime  End time of action
bar_instance
• Track all successful backups
• Description of each backed up object used to
develop a restore strategy
bar_instance - completed action status
Column           Type      Contains
ins_aid          integer   Action ID
ins_oid          integer   Object identifier
ins_time         integer   Archive timestamp from the server
ins_level        smallint  Archive level (0,1,2)
ins_copyid_hi    integer   Copy ID from the Storage Manager
ins_copyid_lo    integer   Copy ID from the Storage Manager
ins_req_aid      integer   Prerequisite object id
ins_first_log    integer   First Logical log required
ins_verify       integer   Has archive been validated
ins_verify_date  datetime  Time of archive validation
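As a rough illustration of how the tables relate, the sketch below builds cut-down copies of bar_object, bar_action, and bar_instance in an in-memory SQLite database and joins them to list the successful backups. This is not Informix: only the table and column names come from the slides, the SQLite types merely approximate the Informix ones, and every sample row is made up.

```python
import sqlite3

# Cut-down stand-ins for the sysutils tables (column names from the slides).
SCHEMA = """
CREATE TABLE bar_object   (obj_srv_name TEXT, obj_oid INTEGER PRIMARY KEY,
                           obj_name TEXT, obj_type TEXT);
CREATE TABLE bar_action   (act_aid INTEGER, act_oid INTEGER,
                           act_type INTEGER, act_status INTEGER);
CREATE TABLE bar_instance (ins_aid INTEGER, ins_oid INTEGER,
                           ins_level INTEGER, ins_first_log INTEGER);
"""


def successful_backups(conn):
    """Return (object name, type, level, first log) for each tracked backup."""
    query = """
        SELECT o.obj_name, o.obj_type, i.ins_level, i.ins_first_log
        FROM bar_instance i
        JOIN bar_object o ON o.obj_oid = i.ins_oid
        JOIN bar_action a ON a.act_aid = i.ins_aid AND a.act_oid = i.ins_oid
        WHERE a.act_status = 0
        ORDER BY o.obj_oid
    """
    return conn.execute(query).fetchall()


def demo():
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    # Made-up sample rows, for illustration only.
    conn.executemany("INSERT INTO bar_object VALUES (?, ?, ?, ?)",
                     [("srv", 1, "rootdbs", "R"), ("srv", 2, "dbspace1", "ND")])
    # Action 79 has a non-zero status, so it has no bar_instance row.
    conn.executemany("INSERT INTO bar_action VALUES (?, ?, ?, ?)",
                     [(77, 1, 1, 0), (78, 2, 1, 0), (79, 2, 1, 131)])
    conn.executemany("INSERT INTO bar_instance VALUES (?, ?, ?, ?)",
                     [(77, 1, 0, 50), (78, 2, 0, 50)])
    return successful_backups(conn)
```

Calling demo() lists one row per successful backup; the failed action appears only in bar_action, which matches the split between "actions attempted" and "completed action status" described on the slides.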
ixbar file format
[Sample ixbar entries; only the clearly legible fields are shown]
Server      Object    Type  Date        Time
jmiller_12  rootdbs   R     2000-10-12  14:50:46
jmiller_12  plog      CD    2000-10-12  14:51:44
jmiller_12  dbspace1  ND    2000-10-12  14:51:44
jmiller_12  dbspace2  ND    2000-10-12  14:51:45
jmiller_12  50        L     2000-10-12  14:52:09
Archive Client - onbar
Configuration Parameters
• BAR_ACT_LOG
• BAR_MAX_BACKUP
• BAR_NB_XPORT_COUNT
• BAR_XFER_BUF_SIZE
• BAR_RETRY
• BAR_BSALIB_PATH
• BAR_PROGRESS_FREQ
• BAR_HISTORY
• BAR_DEBUG_LOG
• BAR_DEBUG
Moving Data between Client/Server
[Diagram: data moves between the Archive Client and ONINIT through ASF, over a Network Connection or Shared Memory]
Configuring Data Transfer Buffers
• OnBar
• BAR_NB_XPORT_COUNT
• BAR_XFER_BUF_SIZE
• Limitations
• Changing the buffer size will render all previously backed-up objects unrestorable
• Maximum size is one online page smaller than 64KB
• Maximum number of buffers is 99
• OnTape
• TAPEBLK
• ARCHIVE_BUF_COUNT
• Limitations
• Maximum number of buffers is 99
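The limits above can be turned into a quick sanity check. The sketch below is only an illustration: it works in bytes (the slide does not state the units of each parameter), the function names are hypothetical, and the lower bounds of 1 are an assumption.

```python
KB = 1024
MAX_BUFFER_COUNT = 99         # limit quoted on the slide
MAX_TRANSFER_BYTES = 64 * KB  # upper bound quoted on the slide


def max_buffer_bytes(page_size):
    """Largest transfer buffer: one online page smaller than 64KB."""
    return MAX_TRANSFER_BYTES - page_size


def check_buffers(count, buffer_bytes, page_size):
    """Return a list of problems with the proposed settings (empty if none)."""
    problems = []
    if not 1 <= count <= MAX_BUFFER_COUNT:
        problems.append(
            f"buffer count {count} is outside 1..{MAX_BUFFER_COUNT}")
    limit = max_buffer_bytes(page_size)
    if not 1 <= buffer_bytes <= limit:
        problems.append(
            f"buffer size {buffer_bytes} is outside 1..{limit} bytes")
    return problems
```

With a 2KB page, for example, max_buffer_bytes(2048) is 63488 bytes, that is, 31 pages.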
Monitoring Data Transfer Buffers
onstat -g stq
• Full Queue - Buffers/work server
• Empty Queue - Buffers being filled by the
archive client
Stream Queue: (session 15 cnt 10) 0:b0e0400 1:b0f0400 2:b100400 3:b110400
4:b120400 5:b130400 6:b140400 7:b150400 8:b160400 9:b170400
Full Queue: (cnt 0 waiters 1)
Empty Queue: (cnt 9 waiters 0) 0:0 1:b0f0400 2:b100400 3:b110400 4:b120400
5:b130400 6:b140400 7:b150400 8:b160400
Stream Queue: (session 13 cnt 10) 0:af6f400 1:af7f400 2:af8f400 3:affe400
4:b00e400 5:b01e400 6:b02e400 7:b03e400 8:b04e400 9:b05e400
Full Queue: (cnt 0 waiters 1)
Empty Queue: (cnt 9 waiters 0) 0:0 1:af7f400 2:af8f400 3:affe400
4:b00e400 5:b01e400 6:b02e400 7:b03e400 8:b04e400
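To watch the queues over time it can help to pull the counters out of this output. The sketch below is a minimal parser that assumes the line layout matches the sample shown on this slide; the function name and the result structure are hypothetical, and other server versions may print the output differently.

```python
import re

STREAM_RE = re.compile(r"^Stream Queue: \(session (\d+) cnt (\d+)\)")
QUEUE_RE = re.compile(r"^(Full|Empty) Queue: \(cnt (\d+) waiters (\d+)\)")

# Shortened copy of the sample output from the slide.
SAMPLE = """Stream Queue: (session 15 cnt 10) 0:b0e0400 1:b0f0400
Full Queue: (cnt 0 waiters 1)
Empty Queue: (cnt 9 waiters 0) 0:0 1:b0f0400
"""


def parse_stq(text):
    """Return one dict per archive session with its queue counters."""
    sessions = []
    for raw in text.splitlines():
        line = raw.strip()
        stream = STREAM_RE.match(line)
        if stream:
            sessions.append({"session": int(stream.group(1)),
                             "buffers": int(stream.group(2))})
            continue
        queue = QUEUE_RE.match(line)
        if queue and sessions:
            # Attach the Full/Empty counters to the most recent session.
            sessions[-1][queue.group(1).lower()] = {
                "cnt": int(queue.group(2)),
                "waiters": int(queue.group(3)),
            }
    return sessions
```

Lines that carry only buffer addresses (the wrapped continuation lines) are ignored.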
Tuning the Transfer Buffers
• Memory required by Transfer Buffers
• parallel sessions * buffer size * number of buffers
• Oscillating performance is an indication that
more memory buffers can improve performance
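The memory formula above is simple enough to check by hand, but a small helper makes it easy to compare settings. The function name and the example values are illustrative only, and sizes are taken in bytes.

```python
def transfer_buffer_memory(parallel_sessions, buffer_size, buffer_count):
    """Memory used by transfer buffers: sessions * buffer size * buffers."""
    for value in (parallel_sessions, buffer_size, buffer_count):
        if value < 1:
            raise ValueError("all three inputs must be positive")
    return parallel_sessions * buffer_size * buffer_count
```

For instance, 4 parallel sessions with 10 buffers of 63488 bytes each need 4 * 63488 * 10 = 2539520 bytes.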
Major Archive Change
• In 7.30 & 9.20 the server changed
• In 9.21 & 7.30.UC7 & 7.31.UC the updating of timestamps was removed
Threads Involved in an Archive
• Each archive session will have its own set of
threads
• Three threads for each archive
• ontape
• arcbackup1
• arcbackup2
onstat -g ath
76  ae83928  ac0cdc8  2  cond wait sm_read  1cpu  ontape
77  ae987d0  ac0f758  2  sleeping forever   1cpu  arcbackup1
78  ae98908  ac0d3b8  3  sleeping secs: 1   1cpu  arcbackup2
Archive Threads
ontape
• The name of this thread is always ontape regardless of the archive client used
• General coordinator of the backup session
• Responsible for starting the two arcbackup threads
• Passes errors to the client
Archive Threads
arcbackup1
• This thread is called archive scanner
• The DUMB thread
• Given a list of pages it sends them to the archive client, concentrating exclusively on I/O
• Checks the format of the pages
Archive Threads
arcbackup2
• This thread is called “Before Image Processor”
• The thinker
• Responsible for collecting all the images that are modified during the archive
• Manager of the temp tables the archiver creates
• Able to create multiple temp tables for a single dbspace
Threads Involved in a Restore
• Each restore session will have its own set of
threads
• Two threads for each restore
• ontape
• ontape is the name of this thread regardless of the archive
client
• physrecover
• writes archive pages back to disk
onstat -g ath
24  aed8558  ac0bbf8  2  sleeping secs: 1  1cpu  ontape
25  adc74e0  ac0d3b8  2  sleeping forever  1cpu  physrecover
Data Stream Format
[Diagram: the data stream holds an Archive Header, Object 1 (dbspace_X data), Object 2 (dbspace_Y data), Object 3 (dbspace_Z data), and an Archive Trailer; Tape Control Pages are marked in the stream]
Format of a Tape Object
[Diagram: a tape object holds a Dbspace Header, Dbspace Data, a Dbspace Trailer, and the before image data for this dbspace; Tape Control Pages are marked in the object]
Blobpages
• Archive examines blob free map pages to
determine which blob pages to archive
• Uses the blob page header to determine the size
• DSA does not lock the blobspace during the archive, but deleted blobs are not freed until after the archive completes
SmartBlobs
• First the extent meta data is backed up
• From this meta data an extent list is created
• The extent list indicates which smart blob pages are sent to tape
onsmsync
• Deletes backup history from the sysutils database and
emergency boot file
• Based on time
• Interval
• Generations
• Regenerates a corrupted or lost ixbar file from sysutils
• Regenerates a damaged sysutils from the ixbar file
If you lose both the sysutils database and the emergency boot file, onsmsync cannot regenerate them from the storage manager.
Managing Temp Space
• Version 7.2
• A single dbspace has exactly one temp table
• If a temp table fills, the archive aborts
• The temp table is kept until the very last dbspace being archived is complete
• Version 7.30 & 9.20 and higher
• A single archive may have multiple temp tables in different dbspaces
• If a temp table fills, the before image processor creates a new table in a different dbspace
• Upon completion of a dbspace’s archive, the associated temp table is dropped immediately
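The difference between the two behaviors when a temp table fills can be shown with a toy model. Everything in the sketch below is hypothetical (class names, the page-count capacities, the order in which temp dbspaces are tried); it only mirrors the fill rule described above and does not model when temp tables are dropped.

```python
class ArchiveAborted(Exception):
    """Raised when no temp space is left for before images."""


class BeforeImageStore:
    """Toy model of the temp tables used while one dbspace is archived."""

    def __init__(self, capacities, multi_table=True):
        self.capacities = capacities        # temp dbspace -> pages it can hold
        self.multi_table = multi_table      # False: 7.2; True: 7.30/9.20+
        self.candidates = list(capacities)  # temp dbspaces not yet used
        self.tables = []                    # [dbspace, pages stored] per table
        self._new_table()

    def _new_table(self):
        if not self.candidates:
            raise ArchiveAborted("no temp dbspace left for before images")
        self.tables.append([self.candidates.pop(0), 0])

    def add_page(self):
        """Store one before image, spilling to a new table if allowed."""
        while self.tables[-1][1] >= self.capacities[self.tables[-1][0]]:
            if not self.multi_table:
                raise ArchiveAborted(
                    f"temp table in {self.tables[-1][0]} is full")
            self._new_table()
        self.tables[-1][1] += 1
```

With two temp dbspaces of two pages each, a third before image lands in a second table under the newer rule, while the 7.2 rule aborts the archive.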
Validating Archives
• How do I validate archives
• What is actually validated
• What other information is there for me
• What else can go wrong with my validated restore
How do I validate my archives
• Standalone - archecker connects directly to the media
• ontape
• archecker -tdvs
• AC_TAPEBLK, AC_TAPEDEV
• Integrated - built into the product
• onbar -r -v (version 7.3X)
• onbar -v (9.20 & 8.30)
• onbar -b -v (8.30)
What is actually validated
• Format of each page on the archive is checked
(similar to oncheck -cd)
• Tape control pages are sanity checked
• Each table is checked ensuring all pages of the
table exist on the archive tape
• Reserved page format is validated
• Each chunk free list is verified
• Table extents are checked for overlap
(oncheck -pe)
Other Information for Me
• AC_MSGPATH - message and debug log for
archecker
• {AC_STORAGE}/INFO
• extent list for each dbspace
• DBS.{dbspace #}
• similar to oncheck -pe
• time to process each tape or object
• TAPE
• Information about the number and type of pages
processed
• profile.{pid}
Archecker Integration
[Diagram components: Storage Manager, XBSA, onbar, archecker, ESQL/C, Dynamic Server (sysutils, sysmaster); files: ixbar, oncfg._, sqlhosts, onconfig]
Algorithm: Scan Phase
• Start reading all data from the data stream
• Save the reserve and control pages
• Check the format of the page
• Mark each page as seen in bitmaps
• At end of each tape sync all structures to disk
Algorithm: Verification Phase
• Verifies the existence of:
• chunk free list pages
• blob pages
• pages for each partition/table
Example Output Missing Pages
• AC_MSGPATH archecker debug log
• Table dbs1:cust missing pages
ERROR: Table dbs1:cust partnum 0x00C0011A missing physical page 0x00D246B3
ERROR: Page 0x00d246b5 not found in bitmaps
ERROR: Table dbs1:cust partnum 0x00C0011A missing physical page 0x00D246B5
...
STATUS: Table checks FAILED   <- NOTE!
STATUS: BLOBChunk checks PASSED
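The scan and verification phases can be pictured with a small sketch. This is not archecker's implementation: a Python set stands in for the bitmaps, the page format check and the chunk free list and blob page checks are left out, and the function names and input shapes are hypothetical. The message text imitates the sample log above.

```python
def scan(stream_pages):
    """Scan phase: mark every page number read from the stream as seen."""
    seen = set()  # stands in for the bitmaps
    for page in stream_pages:
        seen.add(page)
    return seen


def verify(seen, tables):
    """Verification phase: report table pages that were never seen.

    tables maps a table name to (partnum, list of physical page numbers).
    """
    messages = []
    for name, (partnum, pages) in tables.items():
        for page in pages:
            if page not in seen:
                messages.append(
                    f"ERROR: Table {name} partnum 0x{partnum:08X} "
                    f"missing physical page 0x{page:08X}")
    status = "FAILED" if messages else "PASSED"
    messages.append(f"STATUS: Table checks {status}")
    return messages
```

Feeding it a stream that lacks two of a table's pages produces two ERROR lines followed by a FAILED status, much like the log on this slide.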
External Backups & Restores
• Use third party products to make a backup
• Backups created with third party products can
rollforward Informix’s logical logs
• Archive takes place while the system is online
• Modifications to the system are blocked during
the backup
External Backup
• Flush all modified data to disk and suspend any
write activity (including queries with temp
tables)
• onmode -c block
• Back up the desired chunks in a dbspace using a third party product
• Release the server
• onmode -c unblock
• Back up the logical logs
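The ordering of these steps matters most when the third party copy fails, because the server must still be released. The sketch below is one way to script the sequence; the function and parameter names are hypothetical, the copy step is left to the caller, and the logical log backup command is passed in because the slide does not name it.

```python
import subprocess


def external_backup(copy_chunks, log_backup_cmd, run=subprocess.check_call):
    """Block the server, copy the chunks, unblock, then back up the logs.

    copy_chunks    -- callable that runs the third party backup (assumption)
    log_backup_cmd -- argument list for the logical log backup (assumption)
    run            -- command runner, replaceable for testing
    """
    run(["onmode", "-c", "block"])
    try:
        copy_chunks()
    finally:
        # Always release the server, even if the copy step raised.
        run(["onmode", "-c", "unblock"])
    run(log_backup_cmd)
```

If the copy raises, the server is unblocked, the error propagates, and the log backup is not attempted.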
External Restore
Common Command Lines
• Salvage the logical logs before restoring
• onbar -l -s
• Restore the whole system
• onbar -r -e
• Restore a single dbspace
• onbar -r -e dbspace_name
Suspended Restore vs Restartable Restore
• Suspended Restore
• The archive client has encountered an error
before the restore has completed
• Restartable Restore
• The database server was terminated prior to the
restore completing
Restartable Restore
• Turned OFF by default
• Only available with OnBar
• onbar -RESTART
• Requires RESTARTABLE_RESTORE be set to
“ON” or “on” ONLY
• What can restart when?
• Physical Restore
• Fully automatic
• Cold and warm restore are restartable
• Logical Recovery
• Warm logical recovery may NOT be restarted
• Cold logical recovery is fully restartable
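The rules on this slide reduce to a small decision table. The helper names and the string values for phase and mode below are hypothetical; the logic only restates the bullets above.

```python
def restartable_restore_enabled(value):
    """RESTARTABLE_RESTORE counts as enabled only for "ON" or "on"."""
    return value in ("ON", "on")


def can_restart(phase, mode):
    """Say whether a restore step can be restarted.

    phase -- "physical" or "logical"
    mode  -- "cold" or "warm"
    """
    if mode not in ("cold", "warm"):
        raise ValueError(f"unknown mode: {mode}")
    if phase == "physical":
        return True            # cold and warm physical restores restart
    if phase == "logical":
        return mode == "cold"  # warm logical recovery may not be restarted
    raise ValueError(f"unknown phase: {phase}")
```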
New Releases and Future Plans
• Progress frequency
• Level 1 & 2 for smart blobs
• No onarchive
• Dynamic logs
• Point in Time Table Level Restore
onstat -g arc
• Shows the status of archive in progress
• Displays the latest archives which have
occurred
num  DBSpace  Q Size  Q Len  Buffer  partnum   size  scanner
1    rootdbs  37      0      27      0x1002a1  420   0x116ff7

Dbspaces - Archive Status
name      number  level  date              log  log-position
rootdbs   1       0      10/12/2000.14:50  52   0x6b7018
plog      2       0      10/12/2000.14:51  52   0x6bb550
dbspace1  3       0      10/12/2000.14:51  52   0x6bb550
                  1      10/12/2000.14:54  53   0x7a404c
dbspace2  4       0      10/12/2000.14:51  52   0x6bd018
oncheck -pr
• Shows most recent archive for each dbspace
• Display information about each archive level
DBspace number                4
DBspace name                  dbspace2
:
Logical Log Unique Id         0
Logical Log Position          0x0
Oldest Logical Log Unique Id  0
Last Logical Log Unique Id    0
DBspace archive status
Archive Level                 0
Real Time Archive Began       10/12/2000 14:51:44
Time Stamp Archive Began      4520641
Logical Log Unique Id         52
Logical Log Position          0x6bd018
Archive Level                 1
Problems Tech Support has Seen
• Incorrect labeling of tapes
• Mixing of ontape and onbar
• Using rewind tape devices with ISM