ADABAS Disaster Recovery


Backup Methods For a Hot Site
Dieter W. Storr
Los Angeles Times
23 August 2005
B/R Methods







Existing Backup Method
Experiences
Mirroring or Replicating
Fast Copy of Data
Proposals and Costs
Future Technology
Lessons learned
23 August 2005
Dieter W. Storr -www.storrconsulting.com
2
Existing Backup Method
•From disk (databases)
•Copy to
•3490 / 3590-1 / VTS
•Then, copy to
•3590-1 (cartridge)
ADABAS 6.2.2 Back-up at LA Times
[Diagram: weekly back-up flow. Steps shown: ADAPnBKF online SAVE; ADAPnPLC (FEOFPL job and PLOG switch); ADAP1BKO to ADAP5BKO (ADAPnBKO) copy of the online SAVEs; DFDSS full volume back-up of the disk pool, one tape per volume; BRM/ABARS back-up of PDS, GDGs, etc. in several jobs. Times shown: 21:00-21:30, 21:30-1:15, 2:00, 3:00, 8:00-11:00.]
Number of 3490 tapes, with 3590-1 cartridges in parentheses:
2 (1), 35 (1), 16 (1), 8 (1), 4 (1), 65 (5)
DFDSS: 59 (?)
BRM/ABARS: 22 (?)
TOTAL 211 (?) (Only for ADABAS)
Pick-up by Recall
Status: 27 Jan 2005
B/R Methods
“Companies that relied on
tape or on third-party provider
found in many cases they had
difficulty meeting their recovery
time objectives.”
Source: http://www.drj.com/articles/spr02/1502-07.html
B/R Methods
“Flaws in tape-based data backup may be
leaving enterprises without key information
and could lead to legal exposure under
emerging laws such as Sarbanes-Oxley, say
data backup and recovery experts.”
Source: 15 Apr 2004 | SearchSecurity.com
B/R Methods

In a survey of 500 IT departments completed …
found that as many as 20% of routine, nightly
backups fail to capture all data.

40% of IT managers had been unable
to recover data from a tape when
they needed it

More than 23% sought to use data stored on tape
backups more than 20 times in a year
Source: 15 Apr 2004 | SearchSecurity.com
B/R Methods
Are tapes really so bad?
LA Times experiences?
Tape Problems
1 November 2002:

Six tape drive errors

Delay
Tape Problems
24 March 2003:

Only two channel paths per
tape controller were provided

Slow restore time
Tape Problems
5 October 2003:

3590 tape drives were not
defined to DFSMS (SMS)

ADABAS restore and
application test cancelled
Tape Problems
6 December 2003:

VTS problems with GDG
datasets

End-user functions
couldn’t be tested
Tape Problems
5 August 2004:

Restore jobs had to wait for an input
tape that was being used by another
restore job

Delay
Tape Problems
30 October 2004:


Packages didn’t arrive in time,
due to a thunderstorm that
affected FedEx delivery
Major delay
Tape Problems
30 October 2004:

Automated tape library experienced
unit address problems during the
restore process

Delay
Tape Problems
30 October 2004:

VTS logical tapes were not shipped
to Wood Dale (HSM level 2, SAR
level 2)

Delay
Tape Problems
30 October 2004:

Confusion about
when to load DRP1
and DRP2 tapes,
before or after IPL

Delay
Tape Problems
30 October 2004:

ICIS libraries were not
backed up to tape

Application tests were not
possible
Tape Problems
8 December 2004:

Load problems

Tapes were loaded before IPL and
not after IPL

Major delay
Tape Problems
8 December 2004:

Experienced problems when
trying to restore MIG1 data,
e.g. DRADABC0 job

Major delay
Tape Problems
8 December 2004:

Recall sent the tapes to SunGard by FedEx

One damaged package arrived without
tapes

Restored DATA one generation back (-1)
System was generation (0)
Tape Problems
21 March 2005:


Level 2 tapes for VTS not
being sent off-site (but have
been on the list)
Application team couldn’t
test all data
Tape Problems
5 August 2005:

3590-1 cartridges ejected,
not found


DSS8370W - TMS SHOWS TAPE N00318 OUT OF AREA “DRP1”,SLOT 00031
Delay
Time Warner employee data missing
May 2, 2005: 5:51 PM EDT
NEW YORK (CNN) - Time Warner Inc. said
Monday that data on 600,000 current and
former employees stored on computer
backup tapes was lost by an outside
storage company and that the Secret
Service is now investigating.
Lost Backup Tape Held Ameritrade
Client Data
Wednesday, April 20, 2005 - LA Times
… package was damaged during shipping
between vendors ….. fourth tape is still
missing…… The tapes may have included
customers’ Social Security numbers …..
Info On 3.9M Citigroup Customers Lost
Monday, June 6, 2005 – CNN.COM
Citigroup, the nation's biggest financial
services company, said that UPS lost the
tapes while shipping them to a credit
bureau in Texas.
Costs for Tape Backups





SunGard recovery services
Offsite tape storage
Tape handling
Shipping per test
Special extra pick-ups
Yearly $150,000
Costs

Not capable of restoring one day



$$ ???
Last December: 2 weeks to rebuild
manually (?) customer tables
Does it make sense to restore more
than 2 days back ??
Costs
Example:
20 employees x $140 per day x 10 days
= $28,000
And they couldn’t work on other projects
$140 is based on $51,100 yearly income
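
The arithmetic in this example can be checked with a short Python sketch. The function names are illustrative, and deriving the daily rate by dividing the yearly income by 365 calendar days is an assumption; it happens to reproduce the $140 figure exactly.

```python
def daily_rate(yearly_income: float, days_per_year: int = 365) -> float:
    """Daily cost of one employee (assumption: income spread over calendar days)."""
    return yearly_income / days_per_year


def labor_cost(employees: int, rate_per_day: float, days: int) -> float:
    """Labor cost of tying up a team for a number of days."""
    return employees * rate_per_day * days


rate = daily_rate(51_100)
total = labor_cost(employees=20, rate_per_day=rate, days=10)
print(f"daily rate ${rate:,.0f}, total ${total:,.0f}")
```

The figure covers direct labor only; the work those employees could not do on other projects is not included.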
Quantitative Risk Analysis
Single Loss Expectancy



SLE = Single Loss
Expectancy
EF = Exposure Factor, for
example 50% or .50
AV = Asset Value, for
example $1,000,000
SLE = EF * AV
SLE = .5 x $1,000,000 = $500,000
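
As a minimal sketch, the formula can be written as a small Python function. The function name and the range check on the exposure factor are illustrative additions, not part of the slide.

```python
def single_loss_expectancy(exposure_factor: float, asset_value: float) -> float:
    """SLE = EF * AV, with EF given as a fraction between 0 and 1."""
    if not 0.0 <= exposure_factor <= 1.0:
        raise ValueError("exposure factor must be between 0 and 1")
    return exposure_factor * asset_value


sle = single_loss_expectancy(exposure_factor=0.5, asset_value=1_000_000)
print(f"SLE = ${sle:,.0f}")
```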
B/R Methods
Reducing tapes


Stacking datasets to
3590-1 cartridges
Using Delta Save Facility
from ADABAS
B/R Methods
Reducing tapes


Using Forward Index
Compression (FIC) from
ADABAS
Using larger block size
for 3590 tapes = 256K,
supported by ADABAS
Delta Save Facility (DSF)
[Diagram: Delta Save Facility data flow. Labels: nucleus with buffer pool, DSF=YES; ASSO and DATA with the Delta Log (DLOG) of changed RABNs; ADASAV SAVE DELTA, DSF=YES, writing the changed blocks to the delta save (DDSAVE1); dual protection log (DDPLOGR1, DDPLOGR2); ADARES PLCOPY, DSF=YES, producing the PLOG copy (DDSIAUS1) and the extracted DSIM blocks (DDDSIM).]
Delta Save Facility
[Diagram: Delta Save Facility restore. Labels: ADASAV RESTORE, DSF=YES, writing ASSO and DATA; full image save, online/offline (DDREST1); delta save images, online (DDDELT1-8); DSIM with the RABNs extracted from the PLOG (DDDSIM).]
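
The idea behind a delta save, writing only the blocks that changed since the previous save and merging them over a full image on restore, can be sketched in Python. This is a conceptual model only, not how ADASAV is implemented: the `Database` class, its methods, and the assumption that each delta covers the changes since the previous full or delta save are all hypothetical.

```python
class Database:
    """Toy database: a mapping of RABN (block number) to block content."""

    def __init__(self, blocks):
        self.blocks = dict(blocks)
        self.delta_log = set()          # RABNs changed since the last save

    def update(self, rabn, content):
        self.blocks[rabn] = content
        self.delta_log.add(rabn)        # only the block number is logged

    def full_save(self):
        """Save every block and reset the delta log."""
        self.delta_log.clear()
        return dict(self.blocks)

    def delta_save(self):
        """Save only the blocks listed in the delta log, then reset it."""
        delta = {rabn: self.blocks[rabn] for rabn in self.delta_log}
        self.delta_log.clear()
        return delta


def restore(full_image, deltas):
    """Rebuild the blocks from a full image plus deltas, oldest delta first."""
    blocks = dict(full_image)
    for delta in deltas:
        blocks.update(delta)
    return blocks


db = Database({1: "a", 2: "b", 3: "c"})
full = db.full_save()
db.update(2, "B")
first_delta = db.delta_save()           # holds block 2 only
db.update(3, "C")
second_delta = db.delta_save()          # holds block 3 only
print(restore(full, [first_delta, second_delta]) == db.blocks)
```

Each delta here holds one block instead of three, which is where the tape savings come from when only a small part of the database changes between saves.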
B/R Methods
Forward Index Compression
Rochester Gas & Electric
Space savings:
• Normal Index: 37% - 55%
• Upper Index: 21% - 69%
Within an index block the part of the index
value that is identical to the forward part of the
previous index value is suppressed.
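
The suppression of the shared leading part can be illustrated with a generic front-compression sketch in Python. It shows the principle only and makes no claim about the actual ADABAS index block format; the function names and the sample values are made up for the illustration.

```python
def common_prefix_len(a: str, b: str) -> int:
    """Number of leading characters that a and b share."""
    n = 0
    for x, y in zip(a, b):
        if x != y:
            break
        n += 1
    return n


def front_compress(sorted_values):
    """Store each value as (length of prefix shared with previous value, rest)."""
    entries = []
    prev = ""
    for value in sorted_values:
        n = common_prefix_len(prev, value)
        entries.append((n, value[n:]))
        prev = value
    return entries


def front_decompress(entries):
    """Rebuild the original values from the (prefix length, rest) pairs."""
    values = []
    prev = ""
    for n, suffix in entries:
        value = prev[:n] + suffix
        values.append(value)
        prev = value
    return values


index_values = ["MILLER", "MILLS", "MILNER"]
compressed = front_compress(index_values)
print(compressed)   # [(0, 'MILLER'), (4, 'S'), (3, 'NER')]
```

Because index values are kept in sorted order, neighbouring values tend to share long prefixes, which is why the savings can be substantial.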
B/R Methods
IBM Magstar 3494 / Virtual Tape Server
(VTS)
SunGard
LA Times
B/R Methods
VTS problems
LA Times:
• Completion code A78 RC 18
• We switched from VTS to 3590-1 cartridges
B/R Methods
VTS problems
Virginia Information Technologies Agency:
• Ran into the same problem in 2003/2004: system completion code A78 RC 18
• We … converted … to 3490/3590 physical tapes
• Problem solved
B/R Methods
Disk to Disk
• Mirroring
  • Hardware
  • Software
• Replicating
  • Software
B/R Methods – Enterprise Server
[Diagram: enterprise server with NT / 2000 / XP and UNIX systems, and the hot site]
B/R Methods – Open System
Hot Site
B/R Methods
Marty Stewart
Disaster Recovery Manager
AnMed Health:
“…we’d rather have a server that’s
running slower than having no server at
all.”
Disk Mirroring
Benefits
• Asynchronous disk mirroring can provide better physical protection by supporting extended physical distances.
• No loss of committed transactions in synchronous storage (mirroring/RAID) on a CPU failure
[Diagram: ASSO and DATA mirrored to a second ASSO and DATA]
Disk Mirroring
Limitations
• No protection from data corruption
• Secondary site is not guaranteed to be transactionally consistent in the case of asynchronous mirroring.
• Client applications must be restarted after a failure and need to be aware of the failure
Disk Mirroring
Limitations
• Synchronous mirroring and RAID devices can add overhead to application performance.
• Redundant/specialized high availability hardware/software can be expensive and restricted to use for backup purposes only.
Disk Mirroring
Limitations
• Secondary copy of data is not available for use – low hardware utilization.
• Need to replicate everything on disk, no selectivity of data replication
Example For Disk Mirroring
[Diagram: the main platform (S/390 and UNIX on an EMC 5700) is remotely mirrored and synchronized with SRDF over an OC-3 link, 12-15 miles, to the back up / hot site (S/390 and UNIX on an EMC 5700)]
B/R Methods



Can we buy used
Enterprise Servers?
Yes…..and inexpensive
OP system is free for D/R
Search for “selling used mainframes,” for example:
http://www.used-line.com/fdc3236-find-dealer.htm
http://www.azure.co.uk/
etc.
Dedicated line broadband speeds and prices
T-1 - 1.544 megabits per second (24 DS0 lines)
Ave. cost $400.-$650./mo.

T-3 - 43.232 megabits per second (28 T1s)
Ave. cost $6,000.-$16,000./mo.

OC-3 - 155 megabits per second (100 T1s)
Ave. cost $20,000.-$45,000./mo.

OC-12 - 622 megabits per second (4 OC3s) no price

OC-48 - 2.5 gigabits per second (4 OC12s) no price

OC-192 - 9.6 gigabits per second (4 OC48s) no price
Source: http://www.infobahn.com/research-information.htm
prices updated: 12 May 2005
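
To relate these line speeds to backup volumes, the following Python sketch estimates how long a bulk copy would take on each line. The speeds are the ones listed above; the 100 GB data volume is a hypothetical figure chosen for illustration, and protocol overhead, compression, and contention are ignored, so real transfers would take longer.

```python
# Line speeds in megabits per second, as listed above.
LINKS_MBPS = {
    "T-1": 1.544,
    "T-3": 43.232,
    "OC-3": 155.0,
    "OC-12": 622.0,
    "OC-48": 2500.0,
    "OC-192": 9600.0,
}


def transfer_hours(gigabytes: float, mbps: float) -> float:
    """Hours needed to move the data at the raw line speed (1 GB = 10**9 bytes)."""
    bits = gigabytes * 1e9 * 8
    seconds = bits / (mbps * 1e6)
    return seconds / 3600


DATA_GB = 100  # hypothetical volume, not a figure from the presentation

for name, mbps in LINKS_MBPS.items():
    print(f"{name:>6}: {transfer_hours(DATA_GB, mbps):8.2f} hours")
```

The same function can be used the other way round: given a nightly change volume and a copy window, it shows which line class is the smallest that fits.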

Peer-to-Peer Remote Copy Extended Distance (PPRC-XD)
PPRC = 60 miles - PPRC-XD = continent
FlashCopy
ESS Shark
ESS Shark
- IBM ESS DASD
- HDS
also support PPRC
Also see TimeFinder from EMC
External Back-up Systems
Fast Copy of Data
• Snapshot
  • No data movement
  • A virtual copy by copying pointers
• Copy Process
  • Physical copy asynchronous from the logical copy
  • No impact on applications on the original data
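
One common way to obtain a virtual copy without moving data is copy-on-write, sketched below in Python. This is a generic illustration under that assumption; it is not taken from any vendor's implementation, and the `Volume` and `Snapshot` classes are hypothetical.

```python
class Snapshot:
    """Point-in-time view of a volume that shares unchanged blocks with it."""

    def __init__(self, volume):
        self._volume = volume
        self._preserved = {}    # block number -> content at snapshot time

    def preserve(self, block_no, old_content):
        # Keep only the first old version; later writes must not overwrite it.
        self._preserved.setdefault(block_no, old_content)

    def read(self, block_no):
        if block_no in self._preserved:
            return self._preserved[block_no]
        return self._volume.blocks[block_no]    # unchanged: read the source

    def physical_copy(self):
        """Materialize the snapshot, e.g. later and in the background."""
        return [self.read(i) for i in range(len(self._volume.blocks))]


class Volume:
    def __init__(self, blocks):
        self.blocks = list(blocks)
        self._snapshots = []

    def snapshot(self):
        """Create the logical copy: no block is copied at this point."""
        snap = Snapshot(self)
        self._snapshots.append(snap)
        return snap

    def write(self, block_no, content):
        # Save the old block for every snapshot before overwriting it.
        for snap in self._snapshots:
            snap.preserve(block_no, self.blocks[block_no])
        self.blocks[block_no] = content


volume = Volume(["a", "b", "c"])
snap = volume.snapshot()
volume.write(1, "B")                 # the application keeps updating
print(snap.physical_copy())          # still the state at snapshot time
```

The snapshot itself is created instantly; the physical copy can run afterwards while the application continues to update the source.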
External Back-up Systems
Fast Copy of Data
• Specific Hardware Required
  • Software works only with the hardware
• Work on Volume Level
  • Some snapshot-only tools also work on dataset level
Snapshot & Physical Copy
IBM
• Hardware: Enterprise Storage Server
• Software: FlashCopy
http://www.share.org/proceedings/sh98/data/S3087.PDF
EMC
• Hardware: Symmetrix Remote Data Facility
• Software: EMC TimeFinder
http://www.emc.com/interactive_center/media/timefinder/tf_noRC.html
Flash Copy
How It Works
[Diagram: within a pre-defined time window, transactions are suspended, the snapshot of the source data is taken, and transactions are resumed. Before the suspend and after the resume the database is read / update; in between it is read only and update requests are queued. The physical backup is made from the snapshot.]
ADADBS TRANSACTIONS SUSPEND,TTSYN=60,TRESUME=120
Source: SAG
Replication
Benefits
• Warm standby systems can be configured over a Wide Area Network, providing protection from site failures.
• Ability to more quickly swap to the standby system in the event of failure, as backup database is already on-line.
• Data corruption is typically not replicated as transactions are logically reproduced rather than I/O blocks mirrored.
[Diagram: ASSO, DATA, and WORK replicated to a standby ASSO, DATA, and WORK]
Replication
Benefits
• Automatic switch over for clients using a switching mechanism, no client restart needed.
• Originating applications are minimally impacted as replication takes place asynchronously after commit of the originating transaction.
• The warm standby database is available for read-only operations, allowing better utilization of backup systems.
Replication
Benefits
• Ability to resynchronize and easily switch back to primary system when it becomes available without loss of data.
Replication
Limitations
• Warm standby system will be out-of-date by transactions committed at the active database that have not been applied to the standby.
• Protection is limited to components supporting Warm Standby (e.g. DBMS data sources may be protected but file systems may not be supported).
Entire Transaction Propagator

The Entire Transaction
Propagator allows for
asynchronous data
replication.

Replicated data can be
updated and
synchronized with
master data at user
specified intervals.
ADABAS Data Replication








Logical dissemination of ADABAS Data to
homogeneous or heterogeneous targets
Near real time propagation
Event driven at the Transaction level
Implemented at the Database/file level for Store,
Delete and Update commands
Define Replication rules through subscriptions
Minimal Impact on normal nucleus activity
Strategic for Enterprise Data Sharing
Replaces Entire Transaction Propagator
ADABAS Data Replication
[Diagram: an origin ADABAS on z/OS Image A replicating at field and file level to target DBMSs and tables on z/OS Image B, z/OS Image C, and Unix Server D]
Possible Hot Site Solutions
[Diagram: Enterprise Server Los Angeles and Own Enterprise Server Hot Site, each with Shark or EMC storage, linked by OC3 lines using an ESCON converter or FICON fiber optic]
Costs for Tape Backups





SunGard recovery services
Offsite tape storage
Tape handling
Shipping per test
Special extra pick-ups
Yearly $150,000
Costs for Real Disaster







SunGard Declaration Fee
D/R Site Daily Usage Fee
Office Space Daily Usage Fee
Work Group Declaration Fee
Work Group Daily Usage Fee
LAN Bridge Declaration Fee
LAN Bridge Daily Usage Fee
30 Days $475,000
Costs for Own Hot Site







Used IBM Z800-0X2 Mainframe
Used IBM 2105-F20 Shark Storage
Used IBM 3494 Library, VTS and Tape Drives
3rd Party Next Day HW Maintenance
Printer and Terminal Controller Re-location Costs
3490 Tape Drive and Controller Re-location Costs
Other Costs
1st Year $520,000
After 5 Years Total $735,000
Costs for Own Hot Site
5 Years SunGard = $750,000
30 Days Real Disaster = $475,000
5 Years Own Facility = $735,000
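
The comparison can be reproduced from the figures on the preceding cost slides with a few lines of Python. Two points are assumptions made for this sketch: the own-site costs after the first year are spread evenly over years two to five, and a declared disaster is added on top of the regular SunGard fees.

```python
SUNGARD_YEARLY = 150_000        # yearly costs for tape backups
DISASTER_30_DAYS = 475_000      # costs for 30 days of a real disaster
OWN_FIRST_YEAR = 520_000        # own hot site, first year
OWN_FIVE_YEAR_TOTAL = 735_000   # own hot site, total after 5 years
YEARS = 5

sungard_total = SUNGARD_YEARLY * YEARS

# Assumption: the remaining own-site costs are spread evenly over years 2-5.
own_recurring = (OWN_FIVE_YEAR_TOTAL - OWN_FIRST_YEAR) / (YEARS - 1)

difference = sungard_total - OWN_FIVE_YEAR_TOTAL

# Assumption: one declared 30-day disaster comes on top of the regular fees.
sungard_with_disaster = sungard_total + DISASTER_30_DAYS

print(f"SunGard, 5 years:          ${sungard_total:,}")
print(f"Own facility, 5 years:     ${OWN_FIVE_YEAR_TOTAL:,}")
print(f"Own facility, years 2-5:   ${own_recurring:,.0f} per year")
print(f"Difference after 5 years:  ${difference:,}")
print(f"SunGard plus one disaster: ${sungard_with_disaster:,}")
```

Without a disaster the two options end up within $15,000 of each other after five years; a single 30-day declaration shifts the balance clearly toward the own facility.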
Restore Times (Min)
Test      IPL done   DB restore
Feb-02        0         229
Jun-02      330         129
Nov-02      425         165
Mar-03      540         221
Oct-03      365           0
Dec-03      420         104
Oct-04      420          63
Dec-04      330          90
Mar-05      270          78
Aug-05      360         136
Benefits of Own Hot Site





Financial savings > $150,000
annually providing an almost
5% ROI
Reduced recovery time
Reduced impact due to road and airport
closures
Elimination of reliance on external vendors
Mainframe and open system can use the
same facility
Grid Computing
virtual machine
virtual memory
virtual storage
virtual I/O
Grid Computing
[Diagram: grid architecture. Labels: user interface, passkey, hardware, software, information service, resource broker, computer element(s), storage element(s), locations, replica catalog]
Grid Computing
Grid Computing

BBC builds distributed
grid for content sharing
(Gridcast).
62.73.167.57/publicdocs/ppt/prompeg307.ppt
Grid Computing For Backup?

Intra or Extra Grid?

Pull or Push?

Grid Software
http://www.gridcomputing.com/
http://www.gridforum.org/ggf_grid_understand.htm
http://gridcafe.web.cern.ch/gridcafe/animations.html
Backup Methods
Mostly used by other companies
Source: DRJ Magazine


VTS
Disk to disk

Is more and more common for
enterprise storage servers and AIX
server technology, for example.
Source: @server Magazine
B/R Methods
Problems for other companies


High third-party hot site costs, approx.
$10,000 - $70,000 per month
Restore time
24-30 hours
How Far is ‘Far Enough?’
(http://www.drj.com/articles/spr03/1602-02.html)

Alternate Facility

Offsite Storage
Facility
Answer = 105 miles
…according to the survey
Lessons Learned
(http://www.drj.com/articles/spr02/1502-07.html)

Distance is key
Streets, bridges, tunnels, airports are
closed

Tape recovery is not effective

All applications are critical

Inconsistent back-up is no back-up at all
Lessons Learned
(http://www.drj.com/articles/spr02/1502-07.html)

People-dependent processes do not
suffice

Two sites are not enough

People are hard to replace but
information is irreplaceable
…..we should have an
excellent HOT SITE!