Reliability
October 26, 2004
1
Today
• DFDC (Design for a Developing Country)
• HW November 2
– detailed design
– Parts list
– Trade-off
• Midterm November 4
• Factory Visit November 16th
2
Midterm
• Presentation purpose: a midcourse correction
– Less than 15 minutes, with 5 minutes of discussion
– Approx. 7 PowerPoint slides; all should participate in the presentation
– Show what you have done
– Show what you are going to do
– Discuss issues, barriers, and plans for overcoming them (procedural, team, subject matter, etc.)
– Scored on originality, candor, thoughtfulness, etc., not on total amount accomplished
– Schedule today from 1:00 to 4:00 (speaker at 4:00 PM)
3
Reliability
The probability that no (system) failure will occur
in a given time interval
A reliable system is one that meets the
specifications
Do you accept this?
4
What do Reliability Engineers Do?
• Implement Reliability Engineering Programs across all functions
– Engineering
– Research
– Manufacturing
– Testing
– Packaging
– Field service
5
Reliability as a Process module
INPUT
• Reliability Goals
• Schedule time
• Budget Dollars
• Test Units
• Design Data
Reliability Assurance Module
Internal Methods
• Design Rules
• Components Testing
• Subsystem Testing
• Architectural Strategy
• Life Testing
• Prototype testing
• Field Testing
• Reliability Predictions (models)
Product Assurance
6
Early product failure
• Strongest effect on customer satisfaction
– A field day for competitors
• The most expensive to repair
– Why?
– Rings through the entire production system
– High volume
– Long C/T (cycle time)
• Examples from GE (but problem not confined to GE!)
– GE Variable Power module for House Air Conditioning
– GE Refrigerators
– GE Cellular
7
Early Product Failure
• Can be catastrophic for human life
– Challenger, Columbia
– Titanic
– DC 10
– Auto design
– Aircraft Engine
– Military equipment
8
Reliability as a function of System Complexity
Why computers made of tubes (or discrete transistors)
cannot be made to work
System reliability (%) for components in series:

# of components    Component reliability    Component reliability
in series          = 99.999%                = 99.99%
100                99.90                    99.01
250                99.75                    97.53
500                99.50                    95.12
1000               99.01                    90.48
10,000             90.48                    36.79
100,000            36.79                    <0.01
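The table values can be checked with a short calculation. The sketch below assumes every component fails independently and that the system works only if all components work, so the system reliability is the component reliability raised to the number of components. The function names are illustrative and are not from the lecture.

```python
def series_reliability(component_reliability: float, n_components: int) -> float:
    """Reliability of n independent, identical components in series: r ** n."""
    if not 0.0 <= component_reliability <= 1.0:
        raise ValueError("component_reliability must be between 0 and 1")
    if n_components < 0:
        raise ValueError("n_components must be non-negative")
    return component_reliability ** n_components


def reliability_table(component_reliabilities, component_counts):
    """Return rows of (n, [system reliability in percent for each r])."""
    return [
        (n, [100.0 * series_reliability(r, n) for r in component_reliabilities])
        for n in component_counts
    ]


if __name__ == "__main__":
    counts = [100, 250, 500, 1000, 10_000, 100_000]
    for n, row in reliability_table([0.99999, 0.9999], counts):
        # One line per row of the slide's table, values in percent.
        print(f"{n:>8,d}  " + "  ".join(f"{value:8.4f}" for value in row))
```

With 100,000 components at 99.99% each, the result is about 0.005%, which is why a machine built from that many individually good parts essentially never works.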
9
Three Classifications of
Reliability Failure
Type                                Old remedy (repair mentality)
• Early (infant mortality)          • Burn-in
• Wearout (physical degradation)    • Maintenance
• Chance (overstress)               • In-service testing
10
Bathtub Curve
[Figure: bathtub curve. Failure rate (#/million hours) plotted against time, in three regions:]
• Infant mortality
• Useful life: no memory, no improvement, no wear-out, random causes
• Wear out
11
Reliability
[Figure: bar chart of the probability of dying in the next year (deaths/1000) versus age. Vertical axis from 0 to 90; bar labels as extracted: 86, 70, 50, 30, 19, 16, 12, 5, 2, 0, 0.]
From the Statistical Bulletin 79, no 1, Jan-Mar 1998
12
Early failure causes or infant mortality
(Occur at the beginning of life and then disappear)
• Manufacturing Escapes
– workmanship/handling
– process control
– materials
– contamination
• Improper installation
13
Chance Failures
(Occur throughout the life of a product at a constant rate)
• Insufficient safety factors in design
• Higher than expected random loads
• Human errors
• Misapplication
• Developing world concerns
14
Wear-out
(Occur late in life and increase with age)
• Aging
• Degradation in strength
• Materials Fatigue
• Creep
• Corrosion
• Poor maintenance
• Developing World Concerns
15
Failure Types
• Catastrophic
• Degradation
• Drift
• Intermittent
16
Failure Effects
(What customer experiences)
• Noise
• Erratic operation
• Inoperability
• Instability
• Intermittent operation
• Impaired Control
• Impaired operation
• Roughness
• Excessive effort requirements
• Unpleasant or unusual odor
• Poor appearance
17
Failure Modes
• Cracking
• Deformation
• Wear
• Corrosion
• Loosening
• Leaking
• Sticking
• Electrical shorts
• Electrical opens
• Oxidation
• Vibration
• Fracturing
18
Reliability Remedies
• Early: Quality manufacture / Robust Design
• Wearout: Physically-based models, preventative maintenance, Robust design (FMEA)
• Chance: Tight customer linkages, testing, HAST
19
Reliability
semi-empirical formulae
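f(T) = pdf (probability density function)

Early failure (the extracted formula is incomplete; lost symbols are marked with …):
$f(T) = \ldots\,k\,(T - k_2)^{\ldots 1}\,e^{\ldots(T \ldots k_1)}$

Chance Failure
$f(T) = k\,e^{-kT} = \frac{1}{m}\,e^{-T/m}$
k = constant failure rate
m = MTBF

Wear out
$f(T) = \frac{1}{\sigma\sqrt{2\pi}}\,e^{-(T - M)^2 / 2\sigma^2}$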
20
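The chance-failure and wear-out formulae on the semi-empirical formulae slide can be evaluated directly. The sketch below is illustrative only: the function names are hypothetical, and the survival function R(t) = e^(-t/m) is added here as the standard consequence of an exponential density; it is not shown on the slide.

```python
import math


def exponential_pdf(t: float, mtbf: float) -> float:
    """Chance-failure density f(T) = (1/m) * exp(-T/m), with m = MTBF."""
    return math.exp(-t / mtbf) / mtbf


def exponential_reliability(t: float, mtbf: float) -> float:
    """Probability of no failure up to time t for a constant failure rate 1/m."""
    return math.exp(-t / mtbf)


def normal_wearout_pdf(t: float, mean_life: float, sigma: float) -> float:
    """Wear-out density: normal distribution with mean M and standard deviation sigma."""
    z = (t - mean_life) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))


if __name__ == "__main__":
    mtbf_hours = 1000.0  # hypothetical value, for illustration only
    for hours in (100.0, 500.0, 1000.0):
        survival = exponential_reliability(hours, mtbf_hours)
        print(f"t = {hours:6.0f} h  R(t) = {survival:.3f}")
```

One consequence worth noting: under a constant failure rate, only about 37% of units survive to t = MTBF, so MTBF should not be read as a guaranteed life.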
Failures vs. time as a function of Stress
[Figure: failures vs. time, with separate curves for high stress, medium stress, and low stress]
21
Highly Accelerated Stress Testing
• Test to Failure
• Fix Failed component
• Continue to Test
• Appropriate for developing world?
22
Duane Plot
Reinertson p 237
[Figure: Duane plot. Log failures per 100 hours versus log cumulative operating hours. Data points (x) are labeled Actual Reliability, a line is labeled Predicted, and a level is marked Required Reliability at Introduction.]
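A Duane plot is a straight-line fit on log-log axes, so the prediction can be sketched in a few lines. The code below assumes the usual Duane form, cumulative failure rate = K * T^(-alpha), with the rate expressed in the slide's units (failures per 100 hours). The function names and the test data are hypothetical and not from the lecture.

```python
import math


def fit_duane(cum_hours, cum_failure_rates):
    """Least-squares line through (log10 hours, log10 rate).

    Returns (alpha, k) for the model rate = k * hours ** (-alpha).
    """
    if len(cum_hours) != len(cum_failure_rates) or len(cum_hours) < 2:
        raise ValueError("need at least two matching data points")
    xs = [math.log10(t) for t in cum_hours]
    ys = [math.log10(r) for r in cum_failure_rates]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return -slope, 10.0 ** intercept


def hours_to_reach(target_rate, alpha, k):
    """Cumulative test hours at which the fitted line reaches target_rate."""
    return (k / target_rate) ** (1.0 / alpha)


if __name__ == "__main__":
    # Hypothetical test log: cumulative hours and cumulative failures per 100 hours.
    hours = [100.0, 300.0, 1000.0, 3000.0]
    rates = [8.0, 5.1, 3.2, 2.0]
    alpha, k = fit_duane(hours, rates)
    print(f"growth slope alpha = {alpha:.2f}")
    print(f"hours to reach 1 failure per 100 h: {hours_to_reach(1.0, alpha, k):,.0f}")
```

Extending the fitted line to the required reliability at introduction gives a rough estimate of how many more test hours the program needs, which is the planning use suggested by the plot.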
23
Integration into the Product Development Process
FMEA- Failure Modes and Effects Analysis
[Figure: FMEA process flow. Boxes in the diagram:]
• Customer Requirements
• Baseline data from Previous Products (shown twice in the diagram)
• Brainstorm potential failures
• Summarize results (FMEA)
• Update FMEA
• Feed results to Risk Assessment Process
• Develop Failure Compensation Provisions
• Probabilities developed through analysis
• Test Activity Uncovers new Failure modes
• Failure probabilities through test/field data
• Use at Design Reviews
24
Risk Assessment process
Assess risk
• Program Risk
• Market Risk
• Technology Risk
– Reliability Risk
• Systems Integration Risk
Devise mitigation Strategy
Re-assess
25
Fault Tree analysis
[Figure: fault tree. Top event: Seal Regulator Valve Fails. Numbers are the diagram's transfer labels, listed as extracted.]
• Valve Fails Open when commanded closed (1, continued on next page)
• Excessive leakage
– Excessive port leakage (6)
– Excessive case leakage (7)
• Regulates High (2)
• Regulates Low (3)
• Fails closed when commanded open (4)
• Excessive hysteresis (5)
• Fails to meet response time (appears three times; two instances are labeled 8 and 9)
26
Fault Tree analysis (cont)
[Figure: fault tree for event 1, Valve Fails Open when commanded closed. The hierarchy is not recoverable from the extracted text; the events shown are:]
• Electrical Failure of Solenoid
• Mechanical Failure
• Corrosion
• Open Circuit
• Solder Joint Failure
• Solenoid
• Transient electromechanical force
• Armature
• Contamination
• Coil short
• Insulation
• Wire Broken (appears twice)
• Seals
• Material selection (appears twice)
• Wear
• Valve orientation
• Insufficient filtering
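Once basic-event probabilities are available, a fault tree can be evaluated numerically. The sketch below is illustrative only: it assumes independent basic events, assumes the valve branches combine through OR gates (the gate types are not recoverable from the slides), and uses made-up probabilities. The function names are hypothetical.

```python
from math import prod


def or_gate(probabilities):
    """Output event occurs if any independent input event occurs."""
    return 1.0 - prod(1.0 - p for p in probabilities)


def and_gate(probabilities):
    """Output event occurs only if all independent input events occur."""
    return prod(probabilities)


def example_valve_tree():
    """Top event 'valve fails open when commanded closed' with assumed inputs."""
    # Hypothetical basic-event probabilities, not taken from the lecture.
    electrical = or_gate([0.001, 0.002, 0.0005])  # open circuit, coil short, solder joint
    mechanical = or_gate([0.003, 0.001])          # contamination, wear
    return or_gate([electrical, mechanical])


if __name__ == "__main__":
    print(f"P(top event) = {example_valve_tree():.5f}")
```

With small probabilities an OR gate is close to the sum of its inputs, so the largest basic-event probability tends to dominate the top event and is the natural first target for design effort.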
27
FMEA
28
FMEA Root Cause Analysis
29
Fault Tree Analysis example
Example: A solar cell driven LED
30
Reliability Management
• Redundancy
– Examples
• Computers
• memory chips?
• Aircraft
– What are the problems with this approach?
• 1. Design inelegance
– expensive
– heavy
– slow
– complex
• 2. Suboptimization
– Can take the eye off the ball of improving component and system reliability by reducing defects
– Where should the redundancy be allocated?
• system
• subsystem
• board
• chip
• device
• software module
• operation
31
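The question of where to allocate redundancy can be explored with a small calculation. The sketch below compares duplicating a whole series system against duplicating each component, assuming independent failures, identical components, and perfect switching between copies. These assumptions and the function names are illustrative and are not from the lecture.

```python
def parallel_reliability(unit_reliability: float, copies: int) -> float:
    """At least one of `copies` independent units works: 1 - (1 - r) ** n."""
    return 1.0 - (1.0 - unit_reliability) ** copies


def system_level_redundancy(r: float, n_components: int, copies: int) -> float:
    """Duplicate the whole chain of n series components."""
    chain = r ** n_components
    return parallel_reliability(chain, copies)


def component_level_redundancy(r: float, n_components: int, copies: int) -> float:
    """Duplicate each component, then connect the redundant stages in series."""
    stage = parallel_reliability(r, copies)
    return stage ** n_components


if __name__ == "__main__":
    r, n, copies = 0.9, 3, 2  # hypothetical values
    print(f"no redundancy:    {r ** n:.4f}")
    print(f"system level:     {system_level_redundancy(r, n, copies):.4f}")
    print(f"component level:  {component_level_redundancy(r, n, copies):.4f}")
```

Under these idealized assumptions, redundancy placed lower in the hierarchy gives higher reliability for the same number of spare parts, but it also needs more switching and fault-detection hardware, which is where the cost, weight, and complexity penalties listed above come in.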
Other “best practices”
• Fewer Components
• Small Batch Size (why?)
• Better material selection
• Parallel Testing
• Starting Earlier
• Module to systems test allocation
• Predictive (Duane) testing
• Look for past experience
– emphasize re-use
• over-design
– e.g. power modules
• Best: Understand the physics of the failure and model
– e.g. Crack propagation in airframes or nuclear reactors
32
Other suggestions?
33