Disaster Recovery Costs and Impacts (Tom Walsh)

Disaster Recovery Costs and
Impact on Healthcare Operations
Tom Walsh, CISSP
Tom Walsh Consulting, LLC
Overland Park, KS
Copyright © 2009, Tom Walsh Consulting, LLC
Things to Consider...
• When regional disasters strike, the public
expects healthcare services to be available
• Healthcare is a critical component of our
nation’s infrastructure
• We cannot always control events; the only
thing we can control is our reactions
Tom Walsh
• Certified Information Systems Security Professional
(CISSP)
• Passed the exam for a Certified Business Continuity
Professional
• Co-authored three books on security
• Invited speaker at national conferences
• Information Security Officer for San Antonio
Community Hospital in Upland, CA (Outsourced)
• A little nerdy, but overall, a nice guy :)
Objectives
• Using a Business Impact Analysis (BIA) to
determine patient care and business
operations needs
• Discussing key concepts such as: Recovery
Time Objective (RTO) and Recovery Point
Objective (RPO) and their roles in determining
an appropriate disaster recovery strategy
Objectives (2)
• Examining the pros and cons of the various
strategies
• Comparing recovery costs versus recovery
time
• Selecting recovery strategies that support
recovery time
• Explaining potential solutions such as
virtualization for reducing costs
• Reviewing some lessons learned (if time permits)
Terminology
• Business Continuity Plan (BCP) – The larger
umbrella plan that covers multiple plans; the
overall goal is to ensure the business can continue
to operate in the aftermath of any problem or
disastrous event
A business continuity plan includes all departments
Note: Government agencies often use the term Continuity of
Operations Plan (COOP) or Contingency Plan instead of
business continuity plan
Terminology
• Disaster Recovery Plan (DRP) – Applies to major,
usually catastrophic, events that deny access to the
normal facility for an extended period (tend to focus on
technology in a Data Center)
• Contingency Plan – Focuses on sustaining a business
function during a temporary disruption
• Data Backup Plan – Outlines how backups of systems
are performed, frequency of backups, rotation of
backups, and storage of backups (on-site and off-site
backups)
Terminology
• Business Impact Analysis (BIA) – An exercise that
determines the impact of losing the support of any
resource to an organization, establishes the
escalation of that loss over time, identifies the
minimum resources needed to recover and the
Recovery Time Objective (RTO), and prioritizes the
recovery of processes and supporting systems
Terminology
• Recovery Time Objective (RTO) – The time within
which business functions or application systems
must be restored to acceptable levels of operational
capacity
• Recovery Point Objective (RPO) – The maximum
tolerable loss of information due to the frequency
of the backups
– Example: If backups are made daily, then the RPO is 24
hours, which is the maximum loss of data (unless there are
periodic snapshots of memory, transaction logs, or
journaling)
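The RPO example above can be sketched in a few lines of Python. This is a minimal illustration, not a sizing tool: the function name `worst_case_data_loss` and the 15-minute log interval are assumptions made for the example, not figures from the presentation.

```python
from datetime import timedelta
from typing import Optional


def worst_case_data_loss(backup_interval: timedelta,
                         log_interval: Optional[timedelta] = None) -> timedelta:
    """Return the longest window of data that could be lost.

    backup_interval: time between backups (24 hours for daily backups).
    log_interval: hypothetical time between snapshots, transaction-log
        captures, or journal writes; None means none are taken.
    """
    if backup_interval <= timedelta(0):
        raise ValueError("backup_interval must be positive")
    if log_interval is not None and timedelta(0) < log_interval < backup_interval:
        # More frequent captures shrink the window of possible loss.
        return log_interval
    return backup_interval


# Daily backups only: up to 24 hours of data could be lost.
print(worst_case_data_loss(timedelta(hours=24)))
# Daily backups plus (assumed) 15-minute transaction-log captures.
print(worst_case_data_loss(timedelta(hours=24), timedelta(minutes=15)))
```

Comparing the returned window with the data loss each department says it can tolerate shows whether the current backup schedule meets the RPO.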
Terminology
• Disaster – A calamitous event that creates an
inability on an organization’s part to provide the
critical business functions for some predetermined
period of time and which results in great damage or
loss
Note: The time factor which determines whether a service
interruption is an inconvenience or a disaster will vary from
organization to organization
Healthcare executives should move beyond “What if” to questions of
“Are we prepared?”
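Interruptions, Disasters, & Recovery
[Diagram: a timeline running from the Event along Recovery Time, with the RTO marked as the threshold]
• Recovery time < RTO = Problem: handled by the Contingency Plan
or Downtime Procedures
• Recovery time > RTO = Disaster: activation of the
Disaster Recovery Plan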
The Recovery Time Objective (RTO) is determined by the Business
Impact Analysis
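The RTO threshold described on this slide can be expressed as a simple comparison. The sketch below is illustrative only: the function name `classify_interruption` and the 4-hour RTO are hypothetical, and treating a recovery time exactly equal to the RTO as a "problem" is an assumption, since the slide only covers the less-than and greater-than cases.

```python
from datetime import timedelta


def classify_interruption(expected_recovery: timedelta, rto: timedelta) -> str:
    """Classify an event by comparing expected recovery time with the RTO."""
    if expected_recovery <= rto:
        # Within the RTO: a problem, handled with downtime procedures.
        return "Problem: use the contingency plan or downtime procedures"
    # Beyond the RTO: a disaster.
    return "Disaster: activate the disaster recovery plan"


rto = timedelta(hours=4)  # hypothetical RTO taken from a BIA
print(classify_interruption(timedelta(hours=1), rto))
print(classify_interruption(timedelta(hours=12), rto))
```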
Terminology
• Data Owner – (a.k.a. Information Owner) The
directors or senior managers who are responsible
for the functional areas or business units that
depend on information systems to run their
operations
• Interdependencies – Relying upon input, assistance,
support, or interaction between business units in
order for each to complete their mission and
objectives
Terminology
Instead of… → Try using…
• Redundancy → High availability, Resiliency, or Failover systems
• Backup Data Center → Recovery Site or Alternate Data Center
• Return on Investment → Loss avoidance
• Unimportant → Less critical
Business Continuity Plan
The objectives of a business continuity plan
(BCP) are to:
– Protect human life
– Maintain services to patients
– Lessen the overall impacts by defining strategies
and predetermined responses
– Create a systematic approach to recover and
restore systems
– Comply with applicable laws and regulations
It’s Not Just “A Plan”
Business Continuity and Disaster Recovery
Planning focuses on three things:
#1 People
#2 Data
#3 Information Systems
Key Steps in BCP and DRP
• Define the scope of the project
• Conduct a risk analysis
• Conduct a Business Impact Analysis (BIA)
• Research and recommend strategies
• Write the plan
• Educate staff on the plan
• Exercise and test the plan
• Revise and maintain the plan
Conduct a Business Impact Analysis
• Without a Business Impact Analysis (BIA), the
organization runs the risk of either
overcommitting or underestimating the
resources required to respond to a disaster or
business disruption
• The BIA is the foundation for Business
Continuity and Disaster Recovery Planning
BIA Objectives
1. Identify the critical resources required to
minimally maintain business operations in
the wake of a disastrous event
2. Estimate the operational and financial
impacts due to the loss of an information
resource as it relates to the functioning of
the organization
BIA Objectives
3. Determine business recovery objectives and
assumptions
4. Establish an order or priority for restoring
business functions and the information
resources that support those functions
5. Facilitate planning strategies
BIA Questions
• What is the impact to patient care?
– Identify key patient care departments
• How much downtime, loss of revenue, and
loss of data can each department or business
unit sustain?
• What are the IT systems that support those
mission-critical operations?
BIA Questions
• If this business unit generates revenue, then
on average, what is the hourly revenue
generated?
• How is data or information received and
processed by those departments?
• What are the dependencies?
– Key employees, vendors, workflows, supply chain, etc.
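The hourly-revenue question above lends itself to a quick estimate of what an outage costs a revenue-generating unit. The following is a rough sketch: the function name, the department names, and the dollar figures are all hypothetical, and a real BIA would also weigh the non-revenue impacts.

```python
def estimated_revenue_loss(hourly_revenue: float, downtime_hours: float) -> float:
    """Estimate lost revenue as average hourly revenue times hours of downtime."""
    if hourly_revenue < 0 or downtime_hours < 0:
        raise ValueError("hourly_revenue and downtime_hours must be non-negative")
    return hourly_revenue * downtime_hours


# Hypothetical average hourly revenue per business unit (not from the source).
hourly_revenue_by_unit = {"Radiology": 6000.0, "Laboratory": 2500.0}
downtime_hours = 8  # assumed length of the outage

for unit, hourly in hourly_revenue_by_unit.items():
    loss = estimated_revenue_loss(hourly, downtime_hours)
    print(f"{unit}: ${loss:,.0f} lost over {downtime_hours} hours")
```

A linear estimate like this is only a starting point; the BIA definition earlier notes that losses escalate over time.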
Possible Impacts
• Inability to treat patients
• Financial losses and lost revenue
• Damage to the organization's credibility and reputation
• Penalties or fines for noncompliance
• Litigation
– Executives and officers are potentially culpable for
not allocating the necessary resources to ensure
the continuity of business (Duty of Care)
Analysis of BIA Data
• Determine the Recovery Point Objective (RPO)
for each department or business unit
– Assess any gaps with current backup plan
• Determine the Recovery Time Objective (RTO)
for each department or business unit
– Determine the order in which information systems
are needed (restoration priority)
Analysis of BIA Data (2)
• Identify the vital records necessary for running
the business
– Format and location of the records
• Determine existing technologies for supporting
high availability and recovery
• Assess the gap between current recovery
capabilities and needed capabilities to sustain
the business
Analysis of BIA Data (3)
• List departments and business units ordered
by their recovery time objective (RTO) and/or
impact to patient care
• Identify gaps between current recovery
capability and needed recovery capability
• Validate the BIA with key stakeholders
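The ordering and gap analysis steps above can be sketched as follows. This is one possible reading rather than the presenter's method: the `UnitRecovery` fields, the sample departments, and the hour values are hypothetical, and sorting patient-care units ahead of others before sorting by RTO is an assumption about how to combine the two criteria.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class UnitRecovery:
    name: str
    rto_hours: float               # needed recovery time, from the BIA
    current_recovery_hours: float  # what current capabilities can deliver
    affects_patient_care: bool


def restoration_order(units: List[UnitRecovery]) -> List[UnitRecovery]:
    """Order units with patient-care units first, then by shortest RTO."""
    return sorted(units, key=lambda u: (not u.affects_patient_care, u.rto_hours))


def recovery_gaps(units: List[UnitRecovery]) -> Dict[str, float]:
    """Return hours by which current capability exceeds the RTO, per unit."""
    return {
        u.name: u.current_recovery_hours - u.rto_hours
        for u in units
        if u.current_recovery_hours > u.rto_hours
    }


# Hypothetical BIA results for illustration only.
units = [
    UnitRecovery("Billing", 72, 96, False),
    UnitRecovery("Emergency Department", 4, 24, True),
    UnitRecovery("Laboratory", 8, 8, True),
]

for unit in restoration_order(units):
    print(unit.name, unit.rto_hours)
print(recovery_gaps(units))
```

Units that appear in the gap report are the ones whose recovery strategy needs to change, which is the input to the strategy research that follows.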
New Threat – Pandemic Flu
Research Recovery Strategies
• Determine how gaps between current
recovery capability and recovery needs (RTO
and RPO) will be handled
• Research potential recovery strategies to
meet the overall RTO
• Create a cost-benefit analysis
• Make recommendations for business
continuity and disaster recovery
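One simple way to frame the cost-benefit step is to pick the least expensive strategy whose recovery time still meets the RTO. The sketch below assumes exactly that rule; the `Strategy` class, the recovery times, and the annual costs are invented for illustration and do not come from the presentation or any vendor.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class Strategy:
    name: str
    recovery_hours: float  # assumed time to restore operations
    annual_cost: float     # assumed yearly cost of the strategy


def cheapest_strategy_meeting_rto(strategies: Sequence[Strategy],
                                  rto_hours: float) -> Optional[Strategy]:
    """Return the lowest-cost strategy that recovers within the RTO, if any."""
    candidates = [s for s in strategies if s.recovery_hours <= rto_hours]
    if not candidates:
        return None  # no strategy meets the RTO; the gap must be escalated
    return min(candidates, key=lambda s: s.annual_cost)


# Hypothetical figures for illustration only.
options = [
    Strategy("Hot site", recovery_hours=4, annual_cost=500_000),
    Strategy("Warm site", recovery_hours=48, annual_cost=150_000),
    Strategy("Cold site", recovery_hours=336, annual_cost=40_000),
]

choice = cheapest_strategy_meeting_rto(options, rto_hours=72)
print(choice.name if choice else "No strategy meets the RTO")
```

A `None` result is useful in itself: it signals that either the budget or the RTO has to be revisited with the data owners.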
Strategy – Alternate Sites
• Hot site
– Advantages: Shortest recovery time; equipment is supplied;
easy to test backups and recovery plans
– Disadvantages: Most expensive; short-term use of facility;
facility may not always be available
• Warm site
– Advantages: Moderately priced; basic infrastructure with
some equipment
– Disadvantages: Not easy to test plans; facility may not
always be available
• Cold site
– Advantages: Least expensive; basic infrastructure; can
usually rent the space for a longer period of time
– Disadvantages: Longest recovery time; no equipment is
supplied (it must be ordered, delivered, and installed);
no way to test
Recovery Time versus Strategy
Costs versus Recovery Time
Source: DRI International, DRP-501 Business Continuity
Planning Review
Recovery Site Location
• Too close – It may be affected by the same
regional disaster
• Too far away – May have difficulty getting
employees to leave their homes and families
during a disaster to work at an alternate or
recovery site
– Ability to leave the disaster area
– Costs associated with travel and temporary living
expenses
Strategy – Virtualization
• Virtualization – A condition without boundaries or
constraints
• Virtual machine – A software-based computer with its own
operating system (Windows, Linux, NetWare, etc.) and
applications; a single physical server can run several
virtual machines at once
• Originally developed by IBM in the 1960s for the
mainframe operating system
• Breaks the “one server, one application” standard
by decoupling the physical hardware from the
operating system
Virtualization
• Virtual machine: One server, multiple operating systems
and applications
• Without virtualization: One server per operating system
and application
Virtualization – Benefits
• Near-zero downtime
– Within seconds, systems can be moved from one
physical server to another
• Ease of managing failover systems
– Servers are treated as a uniform pool
– Any spare server could be the recovery target for
a virtual machine
• Virtual machine environment is saved as a
single file
– Easier to back up, move and copy
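The idea that any spare server in a uniform pool can become the recovery target can be illustrated with a small capacity check. This is a conceptual sketch only: `Host`, `VirtualMachine`, and `pick_recovery_target` are hypothetical names and do not correspond to any hypervisor's API, and real placement decisions involve more factors than CPU and memory.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass
class Host:
    name: str
    free_cpu_cores: int
    free_memory_gb: int
    healthy: bool = True


@dataclass
class VirtualMachine:
    name: str
    cpu_cores: int
    memory_gb: int


def pick_recovery_target(vm: VirtualMachine,
                         pool: Sequence[Host]) -> Optional[Host]:
    """Return the first healthy host in the pool with room for the VM."""
    for host in pool:
        if (host.healthy
                and host.free_cpu_cores >= vm.cpu_cores
                and host.free_memory_gb >= vm.memory_gb):
            return host
    return None  # no spare capacity anywhere in the pool


# Hypothetical pool and workload for illustration only.
pool = [
    Host("server-a", free_cpu_cores=2, free_memory_gb=8, healthy=False),
    Host("server-b", free_cpu_cores=4, free_memory_gb=16),
    Host("server-c", free_cpu_cores=16, free_memory_gb=64),
]
vm = VirtualMachine("lab-system", cpu_cores=8, memory_gb=32)

target = pick_recovery_target(vm, pool)
print(target.name if target else "No host available")
```

Because the virtual machine is decoupled from the physical hardware, the selection depends only on available capacity, not on matching a specific server model.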
Virtualization – Benefits
• Owning and maintaining fewer servers
– Making high availability more cost-effective
– Curbing the proliferation of servers
• Maintenance budget
– Reduces hardware, power, cooling, and floor
space requirements
• Data does not leak across virtual machines
Findings and Recommendations
• Present report of findings and
recommendations at meeting with data
owners and senior leadership
• Obtain an agreement on recovery strategies
• Conclude the BIA portion of the project
Providing realistic cost estimates may be difficult given the
many variables and vendors’ unwillingness to disclose prices
Lessons Learned from Katrina
Major challenges:
• Communications outages made it difficult to
locate missing personnel
• Access to restricted areas, and reliable transportation
into them, were not always available
• Lack of electrical power or fuel for generators
rendered computer systems inoperable
Lessons Learned from Katrina
Major challenges:
• Obtaining replacement supplies as initial
stocks are exhausted can be difficult
– Diesel fuel for generators
– Food and water
• May need large amounts of cash to pay for
critical supplies and services
• Mail service was interrupted for months in
some areas
Summary
• Business continuity and disaster recovery
planning should involve the entire
organization
(It is more than the recovery of the technology; it is the
recovery of the business)
• A business impact analysis is the foundation
for planning
• Select strategies that support recovery
objectives which meet the needs of the
organization (RPO & RTO)
References
• DRI International, DRP-501
Business Continuity Planning Review
• FFIEC Lessons Learned From Hurricane
Katrina: Preparing Your Institution for a
Catastrophic Event
NIST Special Publications:
• 800-34 Contingency Planning Guide for
Information Technology Systems
• 800-30 Risk Management Guide for
Information Technology Systems
Tom Walsh, CISSP
[email protected]
913-696-1573