
INFO 631
Prof. Glenn Booker
Week 2 – Reliability Models and
Customer Satisfaction
INFO631 Week 2
1
www.ischool.drexel.edu
Reliability Models
• Reliability models treat software as something from which we expect predictable performance under regular use
• Hence they assume fairly stable requirements and a well-controlled environment
Why use Reliability Models?
• Determine objective quality of the product
• Use for planning resources needed to fix
problems
Independent Variable in
Reliability Growth Models
• Typical scope of measurement (X axis)
includes one of these:
– Calendar time (days of testing)
– Cumulative testing effort (hours of testing)
– Computer execution time (e.g. number of
CPU hours)
Key Dependent Variables for
Reliability Growth Models
• Typical dependent variables (Y) include:
– Number of defects found per life cycle phase,
or total number ever
– Cumulative number of failures over time
– Failure rate over time
– Time between failures
Terminology
• Reliability – probability that system
functions without failure for a specified
time or number of natural units in a
specified environment
– “Natural unit” is related to an output of a
system
• Per run of a program
• Per hours of CPU execution
• Per transaction (sale, shipment)
Terminology
• Availability – probability at any given time
that a system functions satisfactorily in a
given environment
• Failure intensity – the number of failures
per natural or time unit
Software Reliability Modeling
• Requires characterizing and applying
– The required major development characteristics or goals:
• Reliability
• Availability
• Delivery date
• Life-cycle cost (development, maintenance, training, etc.)
Software Reliability Modeling
– The expected relative use of the software’s
functions (i.e. its operational profile)
• Focus resources on functions in proportion to their
use and criticality
Operational Profile
• Operational profile - a complete set
of operations with their probabilities
of occurrence
– Operation = a major system logical task of
short duration which returns control to the
system when complete, a.k.a. a scenario
Types of Reliability Models
• Static and Dynamic
• Static
– Uses other product characteristics (size, complexity,
etc.) to estimate number of defects
– Good for module-level estimates (detailed)
– See discussions of size and complexity measures
Types of Reliability Models
• Dynamic
– Based on statistical distribution; uses current defect
pattern to estimate future reliability
– Good for product-level estimates (large scale)
– Includes the Rayleigh and Exponential models
Dynamic Reliability Models
• Model entire development process
– Rayleigh model
• Model back-end formal testing
(after coding)
– Exponential model
• Both are a function of time or life cycle
phase, and are part of the Weibull family
of distributions
“back-end” here refers to the later phases of the life cycle
Define
• PDF = Probability Density Function: here, the number of defects expected to be found per life cycle phase
• CDF = Cumulative Distribution Function: here, the total number of defects expected to be found, as a function of life cycle phase
Weibull Model
Let: m = shape parameter (selects the curve within the Weibull family),
c = scale parameter, t = time
Then:
PDF = (m/t)*(t/c)^m*exp(-(t/c)^m)
CDF = 1 - exp(-(t/c)^m)
What can this look like? Lots of things!
First, fix c=1.5 and look at various ‘m’ values
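The PDF and CDF above are easy to evaluate directly. A minimal Python sketch (the parameter values below are arbitrary illustrations, not from the slides):

```python
import math

def weibull_pdf(t, m, c):
    # PDF as written above: (m/t)*(t/c)^m*exp(-(t/c)^m)
    return (m / t) * (t / c) ** m * math.exp(-((t / c) ** m))

def weibull_cdf(t, m, c):
    # CDF: 1 - exp(-(t/c)^m)
    return 1 - math.exp(-((t / c) ** m))

# With m = 1 the CDF reduces to the exponential form 1 - exp(-t/c)
print(weibull_cdf(1.5, 1, 1.5))  # 1 - exp(-1) ≈ 0.632
```

Setting m to 0.5, 1, 2, etc. reproduces the family of curves shown on the following slides.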
Weibull for M=0.5 and 1, c=1.5
[Figure: Weibull PDF curves for M=0.5 and M=1 (c=1.5), plotted against time from 0.1 to 5.5]
Weibull for M=1.5 and 2, c=1.5
[Figure: Weibull PDF curves for M=1.5 and M=2 (c=1.5), plotted against time from 0.1 to 5.5]
Notice the Y-axis range changes from plot to plot.
Weibull for M=3 and 5, c=1.5
[Figure: Weibull PDF curves for M=3 and M=5 (c=1.5), plotted against time from 0.1 to 5.5]
Weibull for M=7 and 9, c=1.5
[Figure: Weibull PDF curves for M=7 and M=9 (c=1.5), plotted against time from 0.1 to 5.5]
For large ‘m’ values, the Weibull looks like a normal distribution centered on ‘c’.
Rayleigh Model
• The history of defect discovery across the
life cycle phases often looks like the
Rayleigh probability distribution
• Rayleigh model is a formal parametric
model, used to produce estimates of the
future defect count
Rayleigh Model
• Rayleigh model and defect origin/
found analyses deal with the defect
pattern of the entire software development
process
• It is a good tool, since it can provide sound estimates of defect discovery from fairly early in the life cycle
Rayleigh Model
Let: m = 2, c = scale parameter, t = time
Then:
PDF = (2/t)*(t/c)^2*exp(-(t/c)^2)
CDF = Cumulative defect arrival pattern
CDF = 1 - exp(-(t/c)^2)
Rayleigh Model Assumptions
1. Defect rate during development
is correlated with defect rate after release.
2. If defects are discovered and removed
earlier in development, fewer will remain in
later stages.
In short, “Do it right the first time.”
Rayleigh Model
• The value of ‘c’ determines when the
curve peaks
– tmax = c/(2)
is the peak
• Area up to tmax is where 39.35% of all
defects will be found (ideally)
• Now look at influence of ‘c’ value on curve
shape
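The 39.35% figure follows directly from the CDF: at tmax = c/sqrt(2) the CDF equals 1 - exp(-1/2). A quick numeric check (c = 1.5 is an arbitrary choice; the result is the same for any c):

```python
import math

def rayleigh_cdf(t, c):
    # Rayleigh CDF from the slide above: 1 - exp(-(t/c)^2)
    return 1 - math.exp(-((t / c) ** 2))

c = 1.5
t_max = c / math.sqrt(2)                 # where the Rayleigh PDF peaks
print(round(rayleigh_cdf(t_max, c), 4))  # 0.3935 -- the 39.35% quoted above
```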
Weibull for M=2, c=1, 1.5, and 2
[Figure: Weibull PDF curves for M=2 with c=1, 1.5, and 2, plotted against time from 0.1 to 5.5]
Rayleigh Model Implementation
• Various tools can model the Rayleigh
curve
– PASW/SPSS (using Regression Module)
– SAS
– SLIM (by Quantitative Software Management)
– STEER (by IBM)
Rayleigh Model Reliability
• Statistical reliability relates to confidence interval
of the estimate, which is in turn related to
sample size
• Small sample size (only 6 data points per
project) means low statistical reliability, often
underestimating actual later reliability
• Improve this by using other models and
comparing results
PTR Submodel
• A variation on the Rayleigh model can be
used for predicting defects which will be
found during integration of new software
into a system
– PTR is Program Trouble Report or Problem
Tracking Report, a common mechanism for
defect tracking
• Follows the same idea as Rayleigh
Reliability Growth Models: Exponential Model
• Exponential model is the basic reliability
growth model - i.e. reliability will tend to
increase over time
• Other reliability models include: Time
Between Failure Models and Fault Count
Models
Exponential Model
• Reliability growth models are based on data
from the formal testing phase
– After the software has been completely
integrated (compiled & built)
– When the software is being tested with test
cases chosen randomly to approximate an
operational (real-world usage) profile
– Testing is customer oriented
Exponential Model
• Rationale is that defect arrival during
testing is a good indicator of the reliability
of the product when used by customers
• During this testing phase, failures occur,
defects are fixed, software becomes more
stable, and reliability grows over time
Exponential Model
• Is a Weibull distribution with m = 1
• Let: c = scale parameter, t = time, λ = 1/c
CDF = 1 - exp(-t/c) = 1 - exp(-λt)
PDF = (1/c)*exp(-t/c) = λ*exp(-λt)
• λ is the error detection rate or hazard rate
• This form also works for light bulb failures, computer
electrical failures, etc.
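A small sketch of the exponential model; the value c = 100 test hours below is an arbitrary assumption for illustration:

```python
import math

def exp_cdf(t, lam):
    # CDF: 1 - exp(-lambda*t), where lambda = 1/c
    return 1 - math.exp(-lam * t)

def exp_pdf(t, lam):
    # PDF: lambda*exp(-lambda*t)
    return lam * math.exp(-lam * t)

lam = 1 / 100          # hazard rate for an assumed c = 100 test hours
print(round(exp_cdf(100, lam), 3))  # 0.632: ~63% of defects found by t = c
```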
Typical Time Between Failure
Model Assumptions
• There are N unknown software faults at
the start of testing
• Failures occur randomly
• All faults contribute equally to failure
• Fix time is negligibly small
• Fix is perfect for each fault
Time Between Failure Models
• Jelinski-Moranda (J-M) Model
– Assumes random failures, perfect zero time fixes, all
faults equally bad
• Littlewood Models
– Like J-M model, but assumes bigger faults are
found first
• Goel-Okumoto Imperfect Debugging Model
– Like J-M model, but with bad fixes possible
Fault Count Model Assumptions
• Testing intervals are independent of each
other
• Testing during intervals is reasonably
homogeneous
• Numbers of defects detected in different intervals are independent of each other
Fault Count Models
• Goel-Okumoto Nonhomogeneous Poisson
Process Model (NHPP)
– # of failures in a time period, exponential
failure rate (i.e. the exponential model!)
• Musa-Okumoto Logarithmic Poisson
Execution Time Model
– Like NHPP, but later fixes have less effect
on reliability
Cumulative Defects versus
Cumulative Test Hours
Goel-Okumoto model:
m(t) = a*(1 - exp(-b*t))
λ(t) = m’(t) = a*b*exp(-b*t)
where:
m(t) = expected number of failures observed by time t
λ(t) = failure density
a = expected total number of defects
b = constant
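The Goel-Okumoto curves are straightforward to evaluate once a and b have been fitted; the values a = 120 defects and b = 0.02 per test hour below are hypothetical:

```python
import math

def go_mean_failures(t, a, b):
    # m(t) = a*(1 - exp(-b*t)): expected cumulative failures by time t
    return a * (1 - math.exp(-b * t))

def go_failure_density(t, a, b):
    # lambda(t) = m'(t) = a*b*exp(-b*t)
    return a * b * math.exp(-b * t)

a, b = 120, 0.02   # hypothetical fit: 120 total expected defects, rate 0.02/hour
print(round(go_mean_failures(100, a, b), 1))  # 103.8 defects expected by hour 100
```

As t grows, m(t) approaches a, the expected total number of defects.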
Fault Count Models
• The Delayed S and Inflection S Models
– Delayed S: Recognizes time between failure
detection and fix
– Inflection S: As failures are detected,
they reveal more failures
Mean Time to Failure (MTTF)
• Mean Time to Failure is the average
amount of time using the product between
failures
• MTTF = (total run time) /
(number of failures)
Software Reliability Modeling:
Time Between Failures
• Time between failures is expected to
increase, as failures occur and faults are
fixed
[Figure: failures marked along an execution time line starting at 0, spaced progressively farther apart as faults are fixed]
Software Reliability Modeling:
Time Between Failures
Reliability, R(t) - probability of failure free
operation for a specified period of time
[Figure: reliability R(t) plotted against time since last failure (t), starting from 1.0]
WARNING
• Reliability models can be wildly inaccurate,
particularly if based on little and/or
irrelevant data (e.g. from other industries,
or using bad assumptions)
• Validate estimates with other models and
common sense
Reliability Modeling
1. Examine data on a scatter diagram.
Look for trends and level of detail.
2. Select model(s) to fit the data.
3. Estimate the parameters of each model.
4. Obtain fitted model using those parameters.
5. Check goodness-of-fit and reasonableness
of models.
6. Make predictions using fitted models.
Test Compression Factor
• Defect detection during testing differs from defect detection through customer usage, so the defect rates may change
• The result is that fewer defects are found just after product release
• Put another way, testing is better at finding defects than customer usage is
Test Compression Factor
• Hence for maintenance, use reliability
models ONLY for defect number or rate,
and look for field defect rate patterns to be
different from those found during
development (number of defects found
drops after release, due to less effective
customer “testing”)
Customer Satisfaction
Customer Satisfaction
• Customer evaluation of software is the
most critical “test”
• Want to understand what their priorities
are, in order to obtain and keep their
business
Total Quality Management
• Expanded from just product quality to maintaining a
long term customer relationship
• 5x cheaper to keep an existing customer than find a
new one
• Unhappy customers tell 7-20 people, versus happy
customers tell only 3-5 people
Customer Satisfaction Surveys
• Customer call-back after x days
• Customer complaints
• Direct customer visits
• Customer user groups
• Conferences
Customer Satisfaction Surveys
• Want representative sample of all
customers
• Three main methods
– In person interviews
Can note detailed reactions
May introduce interviewer bias
Expensive
Customer Satisfaction Surveys
– Telephone interviews
Can still be very valid
Cheaper than in person interviews
Lack of interaction
Limited audience
– Mail questionnaires
How representative?
Low response rate
Very cheap
Sampling Methods
• Often can’t survey entire user population
• Four methods
– Simple random sample
Must be truly random, not just convenient
– Systematic sampling
Use every nth customer from a list
Stratified Sampling
– Group customers into categories (strata); get
simple random samples from each category
(stratum). Can be very efficient method.
– Can weight each stratum equally (proportional s.s.) or unequally (disproportional s.s.)
– For unequal weights, make each stratum’s sampling fraction proportional to the standard deviation of the stratum, and inversely proportional to the square root of the cost of sampling it:
F ~ s/sqrt(cost)
where “sqrt” is “square root” and “~” means “is proportional to”
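The F ~ s/sqrt(cost) rule can be applied directly; the strata, standard deviations, and per-response costs below are entirely hypothetical:

```python
import math

# Hypothetical strata: (std. dev. of satisfaction, cost per response)
strata = {
    "enterprise": (1.2, 50.0),
    "small_business": (0.8, 20.0),
    "home_user": (0.5, 5.0),
}

# F ~ s/sqrt(cost); normalize so the allocation fractions sum to 1
raw = {name: s / math.sqrt(cost) for name, (s, cost) in strata.items()}
total = sum(raw.values())
fractions = {name: f / total for name, f in raw.items()}

for name, f in fractions.items():
    print(f"{name}: {f:.3f}")
```

Note how the cheap-to-sample stratum gets the largest share even though its standard deviation is smallest.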
Cluster Sampling
• Divide population into (geographic)
clusters, then do simple random samples
within each selected cluster
– Try for representative clusters
– Not as efficient as simple random sampling,
but cheaper
– Typically used for in-person interviews
Bias
• Look out for sample bias!
• E.g. basing a national voting survey on a
Web-based poll
Sample Size
• How big is enough?
• Depends on:
– Confidence level (80 - 99%, to get Z)
– Margin of error (B = 3 - 5%)
• For simple random sample, also need
– Estimated satisfaction level (p), and
– Total population size (N = total number
of customers)
What’s ‘Z’?
• ‘Z’ is the critical Z value for a two-sided
test of means
• Here we are striving for a sample whose
mean customer satisfaction is close
enough to the population’s mean – where
“close enough” is defined by the Z value
What Confidence Level?
• The results are always subject to
the desired confidence level – since we
are never perfectly sure of our results
– For analysis of medical test results, typically
insist on 99% confidence
– Otherwise 95% is commonly used
– Software tests may use as low as 80%
Critical Z values
Confidence Level    2-sided critical Z
80%                 1.28
90%                 1.645
95%                 1.96
99%                 2.576
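These critical values can be reproduced from the standard normal distribution; Python's standard library is enough:

```python
from statistics import NormalDist

def two_sided_z(confidence):
    # Two-sided critical Z: the point leaving alpha/2 in each tail
    alpha = 1 - confidence
    return NormalDist().inv_cdf(1 - alpha / 2)

for conf in (0.80, 0.90, 0.95, 0.99):
    print(f"{conf:.0%}: Z = {two_sided_z(conf):.3f}")
```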
Sample Size
• Sample size is given by
n = [N*Z^2*p*(1-p)]/
[N*B^2 + Z^2*p*(1-p)]
• Note that the sample size depends heavily
on the answer we want to obtain, the
actual level of customer satisfaction (p)!
Sample Size
• If we choose
– 80% confidence level, then Z = 1.28
– 5% margin of error, then B = 0.05
– and expect 90% satisfaction, then p = 0.90
• n = (N*1.28^2*0.9*0.1)/
(N*0.05^2 + 1.28^2*0.9*0.1)
• n = 0.1475*N/(0.0025*N + 0.1475)
Notice for B and p that percents are converted to decimals!
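The worked example above is easy to check in code, using the same Z, p, and B:

```python
def sample_size(N, Z, p, B):
    # n = [N*Z^2*p*(1-p)] / [N*B^2 + Z^2*p*(1-p)]; round up in practice
    q = Z ** 2 * p * (1 - p)
    return (N * q) / (N * B ** 2 + q)

# 80% confidence (Z = 1.28), 5% margin (B = 0.05), expected p = 0.90
for N in (100, 1000, 10000):
    print(N, round(sample_size(N, 1.28, 0.9, 0.05), 2))
    # 100 -> 37.1, 1000 -> 55.7, 10000 -> 58.64
```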
Sample Size
Given: Z = 1.28, p = 0.9, B = 0.05
(hence Z^2 = 1.6384, p(1-p) = 0.09, B^2 = 0.0025)

N           n
10          8.550355
20          14.93558
50          27.06052
100         37.09996
200         45.54935
500         52.75873
1000        55.69724
10000       58.63655
100000      58.94763
1000000     58.97892
Infinity    58.9824

Note: sampling isn’t very helpful for small populations!
Sample Size
• If you don’t know the customer satisfaction value ‘p’, use 0.5 as the worst-case estimate
• Once the real value of ‘p’ is known, solve
for the actual value of B (margin of error)
• Key challenge is finding a truly
representative sample
Analysis of Customer
Satisfaction Data
• Use five point scale (very satisfied, sat.,
neutral, dissat., very dissat.)
• May convert to numeric scale; 1=very
dissatisfied, 2=dissatisfied, etc.
• Typically use 95% confidence level
(Z=1.96), but 80% may be okay to show
hint of trend
Presentation of Customer
Satisfaction Data
• Make running plot of % satisfied vs time,
with +/- margin of error (B)
• Some like to plot percent dissatisfied
instead
• May want to break satisfaction into
detailed categories, and track each
of them separately
Other Satisfaction Notes
• Key issues raised by customers may not
be most needed areas of development
(e.g. documentation vs reliability)
• Can examine correlation of specific
satisfaction attributes to overall
satisfaction; is bad X really an indicator of
dissatisfied customers?
• Use regression analysis to answer this
CUPRIMDA (per IBM)
• Capability (functionality)
• Usability
• Performance
• Reliability
• Installability
• Maintainability
• Documentation
• Availability
Can measure customer satisfaction for each of these areas, plus overall satisfaction
Multiple Regression
• We have had models with one variable
related to another, e.g.
Schedule = a*(Effort)^b
• Linear and logarithmic regression can also
be done with many variables, like: Overall
Satisfaction = a + b*(Usability Sat.) +
c*(Performance Sat.) + d*(Reliability Sat.)
and so on
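A minimal multiple-regression sketch in pure Python; the survey rows and the two satisfaction drivers are entirely hypothetical, and the overall scores are generated from known coefficients so the least-squares fit (via the normal equations) should recover them:

```python
# Hypothetical rows: (usability sat., reliability sat., overall sat.),
# generated as overall = 1.0 + 0.5*usability + 0.3*reliability
rows = [
    (4, 5, 4.5),
    (3, 4, 3.7),
    (5, 2, 4.1),
    (2, 3, 2.9),
    (4, 1, 3.3),
]

def fit_linear(rows):
    # Least-squares fit of y = a + b*x1 + c*x2 via the normal equations
    X = [(1.0, x1, x2) for x1, x2, _ in rows]
    y = [r[2] for r in rows]
    k = 3
    # Build X^T X and X^T y
    A = [[sum(xi[r] * xi[c] for xi in X) for c in range(k)] for r in range(k)]
    b = [sum(xi[r] * yi for xi, yi in zip(X, y)) for r in range(k)]
    # Gaussian elimination with partial pivoting
    for col in range(k):
        piv = max(range(col, k), key=lambda r: abs(A[r][col]))
        A[col], A[piv] = A[piv], A[col]
        b[col], b[piv] = b[piv], b[col]
        for r in range(col + 1, k):
            f = A[r][col] / A[col][col]
            for c in range(col, k):
                A[r][c] -= f * A[col][c]
            b[r] -= f * b[col]
    # Back substitution
    coef = [0.0] * k
    for r in range(k - 1, -1, -1):
        coef[r] = (b[r] - sum(A[r][c] * coef[c] for c in range(r + 1, k))) / A[r][r]
    return coef

a, b1, b2 = fit_linear(rows)
print(round(a, 3), round(b1, 3), round(b2, 3))  # 1.0 0.5 0.3
```

In practice a statistics package would also report standard errors and goodness of fit; this sketch only shows how the coefficients a, b, c arise from the data.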
Multiple Regression
• This results in estimates of constants a, b,
etc.
– A linear regression is often better
for real-valued data
– Logistic regression is often better for data
which may only have two values (Yes/No,
T/F)
• Sometimes both are tried to see which
gives the best results
Now What?
• Plot each factor’s regression coefficient (a,
b, …) vs. the customer satisfaction level
(%) for that factor; then on this plot:
• Determine priorities for improving
customer satisfaction from top to bottom
(then left to right, if there are equal
coefficients)
Non-product Satisfaction
• Many other areas can affect customer satisfaction
– Technical solutions - product factors, and
technologies used
– Support & Service - availability, knowledge
– Marketing - point of contact, information
– Administration - invoicing, warranty
– Delivery - speed, follow-through
– Company image - stability, trustworthiness
Next Steps
• Measure and monitor your and competitors’
customer satisfaction
– In order to compete, your satisfaction level
must be better than your competition’s
• Analyze what aspects are most critical to
customer satisfaction
• Determine the root cause of shortcomings
• Set quantitative targets, both overall and for
specific aspects
• Prepare & implement a plan to do the above