Transcript Slide 1

ME 4054W: Design Projects
RISK MANAGEMENT
Lecture Topics
• What is risk?
• Types of risk
• Risk assessment and
management techniques
2
Risk Trivia
• The study of risk as a science started
during the Renaissance in the 16th
century.
• The initial impetus was from the
financial world.
• The word “risk” is derived from the
early Italian word “risicare” which
means “to dare”.
3
Risk
Risk is a measure of the probability and
severity of adverse effects (Lowrance 1976).
Risk involves the possibility of loss.
Without the potential for loss, there is
no risk.
Managing risk involves choice. Without
choice, there is no risk management.
4
Risk
Risks are future events with a probability
of occurrence and a potential for loss.
If caught in time, risks can be avoided or
have their impacts reduced.
Completely eliminating all risk would be
very expensive, if not impossible. The
key is achieving an acceptable risk level.
5
What are some of the types of
risk found in projects?
6
Types of Risk Found in Projects
• Technology risk
• Economic risk
• Market risk
• Geographic risk
• Timing risk
• Environmental risk
• Human resource risk
• Operational risk
• etc.
The primary focus of the balance of
this lecture will be technology risk
7
Team Exercise
• What are some items on your
project that you consider risks?
• What are some of the attributes
of the items identified as risks?
8
Risk Assessment
It is common in risk assessment
processes to seek answers to the
following set of questions:
• What can go wrong?
• What is the likelihood that it would
go wrong?
• What are the consequences?
Kaplan and Garrick 1981
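The three questions above map naturally onto one small record per risk scenario. The following is a minimal Python sketch of that idea; the class name, field names, and the two sample entries are hypothetical and are not taken from the lecture.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RiskScenario:
    """One answer to the three risk-assessment questions (hypothetical names)."""
    what_can_go_wrong: str   # the scenario
    likelihood: float        # estimated probability, 0.0 to 1.0
    consequence: str         # description of the resulting loss

    def __post_init__(self) -> None:
        # Reject likelihoods that are not valid probabilities.
        if not 0.0 <= self.likelihood <= 1.0:
            raise ValueError("likelihood must be a probability between 0 and 1")


# Illustrative entries only; the scenarios and probabilities are made up.
register = [
    RiskScenario("Supplier delivers motor late", 0.30,
                 "Prototype testing slips two weeks"),
    RiskScenario("Battery overheats under load", 0.05,
                 "Possible injury to the user"),
]

# List the scenarios from most to least likely.
for scenario in sorted(register, key=lambda s: s.likelihood, reverse=True):
    print(f"{scenario.likelihood:.0%}  {scenario.what_can_go_wrong}"
          f" -> {scenario.consequence}")
```

Keeping the consequence as free text is a deliberate simplification; a rating scale such as the FMEA severity scale could replace it.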
9
What is Risk Management?
The identification, analysis,
assessment, control, and
avoidance, minimization, or
elimination of unacceptable risks.
Identify, assess, prioritize, then manage risk.
businessdictionary.com
10
Risk Management
Risk management seeks to answer a
second set of three questions:
1. What can be done and what options
are available?
2. What are the associated tradeoffs in
terms of all costs, benefits, and risks?
3. What are the impacts of current
management decisions on future
options?
Haimes 1991
11
Risk Management
• When should it be done?
– Recommended practice is to apply risk
management before “release” using
judgment, expert knowledge and experience.
– Risk should also be monitored throughout a
product’s life.
– Not all issues are identified before the fact.
Analyzing and responding to failures is an
important aspect of risk management.
12
Failure Mode & Effects Analysis (FMEA)
• FMEA is a structured approach aimed
at identifying all possible failures in a
design, process, product or service.
• It is also called potential failure modes
and effects analysis (PFMEA) and
failure modes, effects and criticality
analysis (FMECA).
Source: American Society for Quality (asq.org)
13
Failure Mode & Effects Analysis (FMEA)
• “Failure modes” means the ways, or
modes, in which something might fail.
• Failures are any errors or defects,
especially ones that affect the customer.
• Failures can be potential or actual.
• “Effects analysis” refers to studying the
consequences of those failures.
Source: American Society for Quality (asq.org)
14
Failure Mode & Effects Analysis (FMEA)
• The purpose of the FMEA is to take
actions to eliminate or reduce failures,
starting with those that are the highest
priority.
• Ideally, the use of FMEA begins
during the earliest conceptual stages
of design and continues throughout
the life of the product or service.
Source: American Society for Quality (asq.org)
15
Failure Mode & Effects Analysis (FMEA)
When to use FMEA
• When a process, product or service is
being designed or redesigned, after
quality function deployment.
• When an existing process, product or
service is being applied in a new way.
• Before developing control plans for a
new or modified process.
Source: American Society for Quality (asq.org)
16
Failure Mode & Effects Analysis (FMEA)
When to use FMEA
• When improvement goals are planned
for an existing process, product or
service.
• When analyzing failures of an existing
process, product or service.
• Periodically throughout the life of the
process, product or service.
Source: American Society for Quality (asq.org)
17
FMEA Procedure
1. Assemble a cross-functional team with
diverse knowledge about the process,
product or service and customer needs.
Functions often included are: design,
manufacturing, quality, testing, reliability,
maintenance, purchasing, sales, marketing
and customer service.
Suppliers and customers can be added, if
needed, to supplement the team.
Source: American Society for Quality (asq.org)
18
FMEA Procedure
2. Identify the scope of the FMEA.
• Is it for a concept, system, design,
process or service?
• What are the boundaries?
• How detailed should we be?
Source: American Society for Quality (asq.org)
19
FMEA Procedure
3. Complete the FMEA form beginning with
the identifying information at the top of the
form.
FAILURE MODE AND EFFECTS ANALYSIS
Identifying information at the top of the form:
• Item
• Model
• Core Team
• Responsibility
• Prepared by
• FMEA number
• Page
• FMEA Date (Orig)
• Rev
Columns of the form, left to right:
• Process Function
• Potential Failure Mode
• Potential Effect(s) of Failure
• Sev
• Class
• Potential Cause(s)/Mechanism(s) of Failure
• Occur
• Current Process Controls
• Detec
• RPN
• Recommended Action(s)
• Responsibility and Target Completion Date
• Post-Action Results: Actions Taken, Sev, Occ, Det, RPN
Source: American Society for Quality (asq.org)
20
FMEA Procedure
• In an FMEA, potential failure modes are
rated on the following attributes:
– Severity (S) is an assessment of the severity
of the potential effect of the failure.
– Occurrence (O) is an assessment of the
likelihood of occurrence of the failure.
– Detection (D) is an assessment of the
likelihood that the problem will be detected
before it reaches the end-user/customer.
21
Severity Rating Scale
(Should be tailored to meet the needs of your project or company)
Rating | Description | Definition (Severity of Effect)
10 | Dangerously high | Failure could injure the customer or an employee.
9 | Extremely high | Failure would create noncompliance with federal regulations.
8 | Very high | Failure renders the unit inoperable or unfit for use.
7 | High | Failure causes a high degree of customer dissatisfaction.
6 | Moderate | Failure results in a subsystem or partial malfunction of the product.
5 | Low | Failure creates enough of a performance loss to cause the customer to complain.
4 | Very Low | Failure can be overcome with modifications to the customer’s process or product, but there is minor performance loss.
3 | Minor | Failure would create a minor nuisance to the customer, but the customer can overcome it without performance loss.
2 | Very Minor | Failure may not be readily apparent to the customer, but would have minor effects on the customer’s process or product.
1 | None | Failure would not be noticeable to the customer and would not affect the customer’s process or product.
22
Occurrence Rating Scale
(Should be tailored to meet the needs of your project or company)
Rating | Description | Potential Failure Rate
10 | Very High: Failure is almost inevitable. | More than one occurrence per day or a probability of more than three occurrences in 10 events (Cpk < 0.33).
9 | High: Failures occur almost as often as not. | One occurrence every three to four days or a probability of three occurrences in 10 events (Cpk ≈ 0.33).
8 | High: Repeated failures. | One occurrence per week or a probability of 5 occurrences in 100 events (Cpk ≈ 0.67).
7 | High: Failures occur often. | One occurrence every month or one occurrence in 100 events (Cpk ≈ 0.83).
6 | Moderately High: Frequent failures. | One occurrence every three months or three occurrences in 1,000 events (Cpk ≈ 1.00).
5 | Moderate: Occasional failures. | One occurrence every six months to one year or five occurrences in 10,000 events (Cpk ≈ 1.17).
4 | Moderately Low: Infrequent failures. | One occurrence per year or six occurrences in 100,000 events (Cpk ≈ 1.33).
3 | Low: Relatively few failures. | One occurrence every one to three years or six occurrences in ten million events (Cpk ≈ 1.67).
2 | Low: Failures are few and far between. | One occurrence every three to five years or 2 occurrences in one billion events (Cpk ≈ 2.00).
1 | Remote: Failure is unlikely. | One occurrence in greater than five years or less than two occurrences in one billion events (Cpk > 2.00).
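The per-event failure rates in this scale can be turned into a lookup. The sketch below is one possible reading, not part of the ASQ scale itself: the function name is hypothetical, and the rule that a probability falling between two rows takes the higher (more cautious) rating is an assumption.

```python
# (rating, failures per event) pairs copied from the scale above, highest first.
# Rating 10 ("more than 3 in 10") and rating 1 ("less than 2 in one billion")
# are open-ended and are handled separately in the function.
OCCURRENCE_ROWS = [
    (9, 3 / 10),
    (8, 5 / 100),
    (7, 1 / 100),
    (6, 3 / 1_000),
    (5, 5 / 10_000),
    (4, 6 / 100_000),
    (3, 6 / 10_000_000),
    (2, 2 / 1_000_000_000),
]


def occurrence_rating(p_failure: float) -> int:
    """Map a per-event failure probability to an occurrence rating (1 to 10).

    Assumption: a probability between two rows is rounded up to the
    more cautious (higher) rating.
    """
    if not 0.0 <= p_failure <= 1.0:
        raise ValueError("p_failure must be a probability between 0 and 1")
    if p_failure < OCCURRENCE_ROWS[-1][1]:
        return 1  # less than two occurrences in one billion events
    # Walk from the lowest listed rate upward and stop at the first row
    # whose rate is at least the given probability.
    for rating, rate in reversed(OCCURRENCE_ROWS):
        if p_failure <= rate:
            return rating
    return 10  # more than three occurrences in 10 events


print(occurrence_rating(0.02))  # between 1 in 100 and 5 in 100, so rating 8
```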
23
Detection Rating Scale
(Should be tailored to meet the needs of your project or company)
Rating | Description | Definition
10 | Absolute Uncertainty | The product is not inspected or the defect caused by failure is not detectable.
9 | Very Remote | Product is sampled, inspected, and released based on Acceptable Quality Level (AQL) sampling plans.
8 | Remote | Product is accepted based on no defectives in a sample.
7 | Very Low | Product is 100% manually inspected in the process.
6 | Low | Product is 100% manually inspected using go/no-go or other mistake-proofing gauges.
5 | Moderate | Some Statistical Process Control (SPC) is used in process and product is final inspected off-line.
4 | Moderately High | SPC is used and there is immediate reaction to out-of-control conditions.
3 | High | An effective SPC program is in place with process capabilities (Cpk) greater than 1.33.
2 | Very High | All product is 100% automatically inspected.
1 | Almost Certain | The defect is obvious or there is 100% automatic inspection with regular calibration and preventive maintenance of the inspection equipment.
24
FMEA “By The Numbers”
• The Risk Priority Number (RPN) is
defined as:
RPN = S x O x D
• The Criticality Index (CI) is defined
as:
CI = S x O
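Both definitions are simple products, so they are easy to compute and check. A minimal Python sketch follows; the function names and the 1 to 10 range check are assumptions based on the rating scales, and the sample ratings are illustrative.

```python
def _check_rating(name: str, value: int) -> None:
    """Ratings are assumed to be whole numbers on the 1-10 scales."""
    if not isinstance(value, int) or not 1 <= value <= 10:
        raise ValueError(f"{name} must be an integer from 1 to 10, got {value!r}")


def risk_priority_number(severity: int, occurrence: int, detection: int) -> int:
    """RPN = S x O x D."""
    for name, value in (("severity", severity),
                        ("occurrence", occurrence),
                        ("detection", detection)):
        _check_rating(name, value)
    return severity * occurrence * detection


def criticality_index(severity: int, occurrence: int) -> int:
    """CI = S x O."""
    _check_rating("severity", severity)
    _check_rating("occurrence", occurrence)
    return severity * occurrence


# Illustrative ratings: S = 8, O = 5, D = 5.
print(risk_priority_number(8, 5, 5))  # 200
print(criticality_index(8, 5))        # 40
```

With all three ratings on a 1 to 10 scale, RPN ranges from 1 to 1,000 and CI from 1 to 100.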
25
FMEA “By The Numbers”
• RPN and CI are used to assess the risk
associated with potential problems in a
product or process design and to prioritize
issues for corrective action.
• The RPN or CI values that require action are
project/product-specific. As a rough rule of
thumb, a risk with RPN ≥ 200 or CI ≥ 40
should be mitigated.
• All risks with severity of 10 must be mitigated.
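The rule of thumb and the severity-10 rule combine into a single check. The sketch below assumes the thresholds quoted above as defaults; the function and parameter names are hypothetical, and a real project would substitute its own limits.

```python
def needs_mitigation(severity: int, occurrence: int, detection: int,
                     rpn_limit: int = 200, ci_limit: int = 40) -> bool:
    """Apply the rule of thumb: mitigate if S = 10, RPN >= limit, or CI >= limit.

    The default limits are the approximate values from the lecture and
    should be tailored to the project or product.
    """
    rpn = severity * occurrence * detection
    ci = severity * occurrence
    return severity == 10 or rpn >= rpn_limit or ci >= ci_limit


# Illustrative ratings (S, O, D).
for ratings in [(8, 5, 5), (10, 1, 1), (6, 2, 7), (8, 2, 10)]:
    print(ratings, needs_mitigation(*ratings))
```

Because the three conditions are joined with "or", a risk that is rare and easy to detect is still flagged when its severity is 10.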
26
Simple FMEA Example
Process Function: Dispense amount of cash requested by customer

Potential Failure Mode: Does not dispense cash
Potential Effect(s) of Failure: Customer very dissatisfied; incorrect entry to demand deposit system; discrepancy in cash balancing
Sev: 8
Potential Cause(s)/Mechanism(s) of Failure | Occur | Current Process Controls | Detec | RPN | CRIT
Out of cash | 5 | Internal low-cash alert | 5 | 200 | 40
Machine jams | 3 | Internal jam alert | 10 | 240 | 24
Power failure during transaction | 2 | None | 10 | 160 | 16

Potential Failure Mode: Dispenses too much cash
Potential Effect(s) of Failure: Bank loses money; discrepancy in cash balancing
Sev: 6
Potential Cause(s)/Mechanism(s) of Failure | Occur | Current Process Controls | Detec | RPN | CRIT
Bills stuck together | 2 | Loading procedure (riffle ends of stacks) | 7 | 84 | 12
Denominations in wrong trays | 3 | Two-person visual verification | 4 | 72 | 18

Potential Failure Mode: Takes too long to dispense cash
Potential Effect(s) of Failure: Customer somewhat annoyed
Sev: 3
Potential Cause(s)/Mechanism(s) of Failure | Occur | Current Process Controls | Detec | RPN | CRIT
Heavy network traffic | 7 | None | 10 | 210 | 21
Power interruption during transaction | 2 | None | 10 | 60 | 6

The Recommended Action(s), Responsibility and Target Completion Date, and Post-Action Results columns are left blank in this example.
Source: American Society for Quality (asq.org)
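The RPN and CRIT columns of the ATM example can be recomputed and ranked with a few lines of Python. This is a sketch: the S, O, and D values are transcribed from the example as read from the slide, and the tuple layout and variable names are hypothetical.

```python
# (failure mode, cause, S, O, D) rows as read from the ATM example.
rows = [
    ("Does not dispense cash", "Out of cash", 8, 5, 5),
    ("Does not dispense cash", "Machine jams", 8, 3, 10),
    ("Does not dispense cash", "Power failure during transaction", 8, 2, 10),
    ("Dispenses too much cash", "Bills stuck together", 6, 2, 7),
    ("Dispenses too much cash", "Denominations in wrong trays", 6, 3, 4),
    ("Takes too long to dispense cash", "Heavy network traffic", 3, 7, 10),
    ("Takes too long to dispense cash", "Power interruption during transaction", 3, 2, 10),
]

# Compute RPN = S x O x D and CRIT = S x O for every cause.
scored = [(mode, cause, s * o * d, s * o) for mode, cause, s, o, d in rows]

# Rank by RPN, highest first, to see which causes to address first.
ranked = sorted(scored, key=lambda row: row[2], reverse=True)

for mode, cause, rpn, crit in ranked:
    print(f"RPN {rpn:>3}  CRIT {crit:>2}  {mode}: {cause}")
```

Ranking by RPN puts "Machine jams" (240) first, while ranking by CRIT would put "Out of cash" (40) first, which is one reason both numbers are tracked.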
27
“Completing” the FMEA
• Use the RPN and CI to set priorities for
corrective actions.
• Assign responsibilities and target
completion dates for the corrective
actions.
• After action is taken, reassess S, O and D
and recalculate RPN and CI to determine
if additional actions are needed.
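The reassessment step can be sketched as a before-and-after comparison. In the Python sketch below, the class and function names are hypothetical, the post-action detection rating is an invented illustration, and the limits are the rule-of-thumb values of RPN ≥ 200 and CI ≥ 40.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Ratings:
    """Severity, occurrence, and detection ratings for one cause."""
    severity: int
    occurrence: int
    detection: int

    @property
    def rpn(self) -> int:
        return self.severity * self.occurrence * self.detection

    @property
    def ci(self) -> int:
        return self.severity * self.occurrence


def still_needs_action(ratings: Ratings,
                       rpn_limit: int = 200, ci_limit: int = 40) -> bool:
    """True if the reassessed ratings still exceed the chosen limits."""
    return (ratings.severity == 10
            or ratings.rpn >= rpn_limit
            or ratings.ci >= ci_limit)


before = Ratings(severity=8, occurrence=3, detection=10)
# Hypothetical result of a corrective action that improves detection only.
after = Ratings(severity=8, occurrence=3, detection=4)

print(f"before: RPN {before.rpn}, CI {before.ci}")   # RPN 240, CI 24
print(f"after:  RPN {after.rpn}, CI {after.ci}")     # RPN 96, CI 24
print("more action needed:", still_needs_action(after))
```

An action that only improves detection lowers RPN but leaves CI unchanged, since CI does not include D.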
28
Risk Management Decision Making
• The graphic at right
provides a general
guideline for decisions
about when to mitigate
risks. There is no
single right answer; a
number of similar
guideline tables exist.
• It is strongly suggested
to err on the side of
caution.
Green = No mitigation needed
Yellow = Consider mitigating risk
Red = Risk mitigation needed
29
You’ve launched the product and
there are failures. What now?
• The FMEA process is primarily focused
on pre-release risk management, but
can be used after launch.
• Many other processes, such as
FRACAS, Fault Tree Analysis, Root
Cause Analysis and 5 Whys are
available to assess and manage failures
of commercial products.
30
FRACAS
• FRACAS stands for Failure Reporting and
Corrective Action System.
• FRACAS provides a process for reporting,
classifying, and analyzing failures, and for
planning corrective actions in response to
those failures.
• Common FRACAS outputs may include:
Field MTBF, MTBR, MTTR, spares
consumption and reliability growth.
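Two of these outputs, MTBF (mean time between failures) and MTTR (mean time to repair), can be estimated directly from failure reports. The sketch below uses the common simple estimates of total operating hours divided by the number of failures and total repair hours divided by the number of repairs; the record layout and the sample data are hypothetical.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass(frozen=True)
class FailureReport:
    """One field failure record (hypothetical layout)."""
    hours_run_before_failure: float  # operating time since the last repair
    repair_hours: float              # time taken to restore service


def mean_time_between_failures(reports: Sequence[FailureReport]) -> float:
    """Total operating hours divided by the number of failures."""
    if not reports:
        raise ValueError("at least one failure report is required")
    return sum(r.hours_run_before_failure for r in reports) / len(reports)


def mean_time_to_repair(reports: Sequence[FailureReport]) -> float:
    """Total repair hours divided by the number of repairs."""
    if not reports:
        raise ValueError("at least one failure report is required")
    return sum(r.repair_hours for r in reports) / len(reports)


# Made-up reports for illustration.
reports = [
    FailureReport(400.0, 2.0),
    FailureReport(650.0, 4.0),
    FailureReport(450.0, 3.0),
]

print(f"MTBF: {mean_time_between_failures(reports):.1f} h")  # 500.0 h
print(f"MTTR: {mean_time_to_repair(reports):.1f} h")         # 3.0 h
```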
31
Sample FRACAS Process Flow
http://www.maintenancephoenix.com/wp-content/uploads/2013/01/FRACAS-MTBF-Process-Map.jpg
32
Root Cause Analysis
• Root cause analysis (RCA) is a method of
problem solving that tries to solve problems
by attempting to identify and correct the root
causes of events, as opposed to simply
addressing their symptoms.
• Focusing correction on root causes has the
goal of preventing problem recurrence.
33
http://withfriendship.com/images/f/25154/a-root-cause-analysis-is.gif
34
Fault Tree Analysis
• Fault tree analysis (FTA) is a top down,
deductive failure analysis in which an
undesired state of a system is analyzed
using Boolean logic to combine a series
of lower-level events.
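The Boolean combination of lower-level events can be sketched as a small recursive evaluator. In the Python sketch below, the tree structure, gate names, and event names are hypothetical illustrations, not an analysis from the lecture.

```python
from typing import Dict, Union

# A node is either a basic event name (str) or a tuple of the form
# ("AND" | "OR", child, child, ...).
Node = Union[str, tuple]


def occurs(node: Node, basic_events: Dict[str, bool]) -> bool:
    """Return True if the event described by `node` occurs."""
    if isinstance(node, str):
        return basic_events[node]          # basic event: look up its state
    gate, *children = node
    results = [occurs(child, basic_events) for child in children]
    if gate == "AND":
        return all(results)                # every input event must occur
    if gate == "OR":
        return any(results)                # any one input event is enough
    raise ValueError(f"unknown gate: {gate!r}")


# Hypothetical tree for the undesired top event "no cash dispensed".
NO_CASH_DISPENSED = (
    "OR",
    "out of cash",
    "machine jam",
    ("AND", "mains power lost", "backup power unavailable"),
)

state = {
    "out of cash": False,
    "machine jam": False,
    "mains power lost": True,
    "backup power unavailable": False,
}
print(occurs(NO_CASH_DISPENSED, state))  # False: backup power covers the outage
```

Working top down means starting from the undesired state and asking which combinations of lower-level events could produce it, which is what the nested tuples encode.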
35
5 Whys
• The 5 Whys is an iterative question-asking
technique used to explore the cause-and-effect
relationships underlying a particular
problem.
• The primary goal is to determine the root
cause of a defect or problem.
36
5 Whys Example
• The vehicle will not start. (the problem)
o Why? - The battery is dead. (first why)
o Why? - The alternator is not functioning. (second why)
o Why? - The alternator belt has broken. (third why)
o Why? - The alternator belt was well beyond its useful
service life and not replaced. (fourth why)
o Why? - The vehicle was not maintained according to the
recommended service schedule. (fifth why, a root cause)
• A possible corrective action that addresses the fifth
why, and therefore the problem’s root cause, is to
maintain the vehicle according to the recommended
service schedule.
Adapted from http://en.wikipedia.org/wiki/5_Whys
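The example above is a chain in which each answer becomes the next question. A minimal Python sketch of that iteration follows; the dictionary layout and function name are hypothetical, and the statements are taken from the vehicle example.

```python
from typing import Dict, List

# Each statement maps to the answer to "Why?" for that statement.
CAUSE_OF: Dict[str, str] = {
    "The vehicle will not start.":
        "The battery is dead.",
    "The battery is dead.":
        "The alternator is not functioning.",
    "The alternator is not functioning.":
        "The alternator belt has broken.",
    "The alternator belt has broken.":
        "The alternator belt was well beyond its useful service life "
        "and not replaced.",
    "The alternator belt was well beyond its useful service life "
    "and not replaced.":
        "The vehicle was not maintained according to the recommended "
        "service schedule.",
}


def ask_whys(problem: str, cause_of: Dict[str, str],
             max_whys: int = 5) -> List[str]:
    """Follow the chain of causes, asking "Why?" at most `max_whys` times."""
    chain = [problem]
    while chain[-1] in cause_of and len(chain) <= max_whys:
        chain.append(cause_of[chain[-1]])
    return chain


chain = ask_whys("The vehicle will not start.", CAUSE_OF)
for depth, statement in enumerate(chain):
    label = "Problem" if depth == 0 else f"Why {depth}"
    print(f"{label}: {statement}")
print("Root cause:", chain[-1])
```

The limit of five is a convention rather than a rule; the loop also stops earlier if no further cause is recorded.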
37
Risk Management Cycle
38