A Study of Estimation Methods for Defect Estimation


A Study of Estimation Methods
for Defect Estimation
by
Syed Waseem Haider
and Dr. João W. Cangussu
The University of Texas at Dallas
Outline




Introduction
Classical Approach
Defect Estimators
Conclusion & Future Work
Introduction






The major goal of software testing is to find and fix as many defects as possible under given constraints, so as to release a product with reasonable reliability.
The time needed to achieve the established goal, and the percentage of the goal achieved so far, are important factors in determining the status of the process.
A clear view of the status of the testing process is crucial to finding a trade-off between releasing a product earlier or investing more time in testing.
Many defect prediction techniques have addressed this important problem by estimating the total number of defects.
The availability of an accurate estimate of the number of defects at early stages of the testing process allows for proper planning of resource usage, estimation of completion time, and assessment of the current status of the process.
Also, an estimate of the number of defects remaining in the product by the time of release allows the required customer support to be inferred.
Introduction (contd.)

We will discuss various estimation methods which are used to develop defect estimation techniques.

We will discuss:

the assumptions that each method makes about the data model;

the probability distribution, and the mean and variance, of the data and of the estimator;

the statistical efficiency of the estimators developed from these estimation methods.
Estimation Approaches

Some of the Bayesian defect estimators in the field are:

Bayesian Estimation of Defects based on Defect Decay Model (BayesED3M)

A Bayesian Reliability Growth Model for Computer Software, by B. Littlewood and J. Verrall

Bayesian Extensions to the Jelinski-Moranda Model, by W. S. Jewell
Classical Approach
For each method we will discuss




Requirements
How to develop the estimator
Statistical Performance
Merits & Demerits
Basic ingredients

Collection of data

Data modeling

Likelihood function
Collection of data samples

Any estimation technique needs samples or data from the
ongoing system testing process.

Samples can be in the form of the number of defects found each day, each week, or in any other time unit.

Samples can also be the total number of defects found by a given instant of time.

Examples of sampling data

In reliability models, the number of defects discovered per unit of execution time.

Calendar-time versions of reliability models also exist.
Estimation of Defects based on Defect Decay Model (ED3M) works for both calendar time and execution time.

Data modeling




Let θ be the parameter to be estimated. It is the total number of defects.
A data model is used to relate θ to the data samples drawn from the system testing.
The data model must also account for random behavior caused by work force relocation, noise in the testing process, testing of a product of varying complexity, among others.
Let us assume that the nth sample x[n] contains θ corrupted by random noise w[n], as given by Eq. 1. The observations of θ made in N intervals are given by Eq. 2.

x[n] = θ + w[n]    (1)

x = hθ + w    (2)

Note that in Eqs. 1 and 2, θ is linearly related to the data.
In Eq. 2, h is the observation vector. It can contain information such as the number of testers, the failure intensity rate, the number of rediscovered faults for each sample, etc.
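
As a minimal illustration (our addition, not part of the original slides), the linear data model of Eq. 2 can be simulated in Python; the noise distribution, the observation vector h, and the value of θ are all assumptions made for the example.

import numpy as np

# Hypothetical example: theta is the total number of defects to be estimated.
theta_true = 500.0
N = 30                                  # number of samples (e.g., weeks of testing)
rng = np.random.default_rng(0)

h = np.ones(N)                          # simplest observation vector: x[n] = theta + w[n]
w = rng.normal(0.0, 20.0, N)            # random noise in the testing process (assumed Gaussian)
x = h * theta_true + w                  # Eq. 2: x = h*theta + w
print(x[:5])

With h equal to 1 for every sample this reduces to Eq. 1; a non-constant h could encode, for example, the number of testers active in each interval.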
Likelihood function

The joint probability distribution of the data is given by p(x[0], x[1], …, x[N−1]; θ), or in vector form p(x; θ) (the PDF of the data).

p(x; θ) is a function of both the data x and the unknown parameter θ. For example, if for a given data set θ is changed from θ1 to θ2, the value of p(x; θ) will change.

When p(x; θ) is viewed as a function of θ it is called the likelihood function.

Intuitively, p(x; θ) indicates how accurately we can estimate θ.
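
A small sketch (our addition, assuming Gaussian noise with a known standard deviation) showing that the likelihood, evaluated on fixed data, takes different values for different candidate θ:

import numpy as np

def log_likelihood(theta, x, sigma=20.0):
    # ln p(x; theta) for the model x[n] = theta + w[n], w[n] ~ N(0, sigma^2)
    N = len(x)
    return -0.5 * N * np.log(2 * np.pi * sigma**2) \
           - np.sum((x - theta) ** 2) / (2 * sigma**2)

x = np.array([480.0, 510.0, 495.0, 505.0, 490.0])          # hypothetical defect data
print(log_likelihood(450.0, x), log_likelihood(495.0, x))  # larger value near the true parameter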
Minimum Variance Unbiased (MVU)
estimator
Minimum Variance Unbiased (MVU)
estimator (contd.)

The probability distribution of the data must be known.

The problem of finding the estimator θ̂ is simply to find a suitable function of the data.

θ̂ must be unbiased: E[θ̂] = θ.

The variability of the estimates determines the efficiency of the estimator.

Among several unbiased estimators, the one with the lowest variance is the efficient estimator.

Among the various methods to determine a lower bound on the variance, the Cramer-Rao Lower Bound (CRLB) is the easiest to determine.
Minimum Variance Unbiased (MVU)
estimator (contd.)

The CRLB states: it is assumed that the PDF p(x; θ) satisfies the regularity condition

E[ ∂ ln p(x; θ) / ∂θ ] = 0  for all θ    (3)

where the expectation is taken with respect to p(x; θ). Then the variance of any unbiased estimator θ̂ must satisfy

VAR(θ̂) ≥ 1 / ( −E[ ∂² ln p(x; θ) / ∂θ² ] )    (4)

An estimator which is unbiased, satisfies the CRLB, and is based on a linear data model is called an efficient MVU estimator. It is found using Eq. 5:

∂ ln p(x; θ) / ∂θ = I(θ) ( g(x) − θ )    (5)

The efficient MVU estimator and its variance are given by Eqs. 6 and 7 respectively:

θ̂ = g(x)    (6)

VAR(θ̂) = 1 / I(θ)    (7)
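
As a worked example (our addition, assuming the linear model of Eq. 2 with zero-mean Gaussian noise of known variance σ²), the quantities in Eqs. 3 through 7 can be written out explicitly:

ln p(x; θ) = −(N/2) ln(2πσ²) − (1/(2σ²)) (x − hθ)ᵀ(x − hθ)

∂ ln p(x; θ)/∂θ = (1/σ²) hᵀ(x − hθ) = (hᵀh/σ²) [ (hᵀh)⁻¹ hᵀx − θ ]

Comparing with Eq. 5 gives I(θ) = hᵀh/σ² and g(x) = (hᵀh)⁻¹ hᵀx, so θ̂ = (hᵀh)⁻¹ hᵀx is an efficient MVU estimator with VAR(θ̂) = σ²/(hᵀh), the CRLB of Eq. 4.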
Minimum Variance Unbiased (MVU)
estimator (contd.)

It may happen that we are able to find an estimator whose variance is lower than that of other estimators but does not attain the CRLB. We simply call such an estimator an MVU estimator.

In other fields, such as signal processing and communication systems, where the system or model under investigation is well defined in terms of physical constraints, it is possible to find an efficient MVU estimator.

In software engineering no model completely captures all the aspects of a software testing process. Different models are based on different assumptions, and this lack of consistency hints at the absence of a mature testing model. Therefore it is unlikely that an efficient MVU estimator can be found.
MVU estimator based on Sufficient Statistic
MVU estimator based on Sufficient Statistic
(contd.)

The minimal set of data that makes the PDF of the data p(x; θ) independent of the unknown parameter θ is called a sufficient statistic.

For example, a simple estimator θ̂ = x[n] will have high variance in estimating θ.

But if sufficient data x[0], x[1], …, x[N−1] is available, then a new sample x[N] will not provide additional information about θ:

p(x[N] | x[0], …, x[N−1]; θ) = p(x[N] | x[0], …, x[N−1])    (8)

If a sufficient statistic exists then p(x; θ) can be factorized as given by Eq. 9, according to the Neyman-Fisher factorization theorem:

p(x; θ) = g(T(x), θ) h(x)    (9)

In Eq. 9, T(x) is the sufficient statistic. A function of T(x),

θ̂ = f(T(x))    (10)

is an MVU estimator only if it is unbiased, E[θ̂] = θ.
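
A worked example of the factorization (our addition, again for the Gaussian model x[n] = θ + w[n] with known σ²):

p(x; θ) = (2πσ²)^(−N/2) exp( −(1/(2σ²)) Σ (x[n] − θ)² )
        = exp( −(1/(2σ²)) (Nθ² − 2θ Σ x[n]) ) · (2πσ²)^(−N/2) exp( −(1/(2σ²)) Σ x[n]² )
        = g(T(x), θ) · h(x),   with T(x) = Σ x[n].

The unbiased function f(T(x)) = T(x)/N, the sample mean, is then an MVU estimator of θ for this model.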
MVU estimator based on Sufficient
Statistic (contd.)








The PDF of ED3M, the technique developed by the authors, can be factorized as given by Eq. 9. The estimator of ED3M is an unbiased function of T(x), but we do not claim that the estimator of ED3M is based on a sufficient statistic, for the following reason.

An example of a sufficient statistic: suppose we want to estimate the accuracy of a surgical precision laser.

We take sufficient samples to estimate the average precision achieved, as shown in the figure.

In software testing, as Dijkstra noted, testing shows the presence of defects but not their absence.

Even though the rate of finding new defects subsides significantly as time elapses, there will still be new defects now and then.

The overall testing process can be considered an increasing function of defects (growing in one direction, not scattered 'around' a fixed value). We can only forecast a saturation in finding defects; a change in strategy can result in a sudden burst of more defects.

Because of this behavior of the testing process, the notion of a sufficient statistic in software testing is arguable.

Therefore, even though ED3M fulfills the mathematical requirements of a sufficient-statistic-based estimator, we do not claim that it is based on this method.
Maximum Likelihood Estimator (MLE)
Maximum Likelihood Estimator (MLE)
(contd.)

Many practical estimators are based on MLE.

Important properties of MLE:

MLE is asymptotically (as N → ∞) an efficient estimator.

For a linear data model as given by Eqs. 1 and 2, MLE achieves the CRLB for a finite data set.

If an efficient estimator exists, MLE will produce it.

The basic idea is to find the value of θ that maximizes ln p(x; θ), the log-likelihood function, for a given x.

If a closed-form solution does not exist, a numerical method such as Newton-Raphson can be used to approximate the solution.

The numerical approximation may not necessarily converge to the maximum of ln p(x; θ), and so may fail to produce the MLE.

An example of a numerical approximation of the MLE is the Musa-Okumoto model.

The authors were able to find a closed-form solution of the MLE for ED3M.
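
A minimal Newton-Raphson sketch (our addition, not the ED3M derivation), maximizing the Gaussian log-likelihood of the simple model x[n] = θ + w[n]; in this toy case the iteration converges to the sample mean.

import numpy as np

def newton_raphson_mle(x, sigma=20.0, theta0=0.0, iters=20, tol=1e-8):
    # Iterate theta <- theta - l'(theta)/l''(theta) on the log-likelihood l(theta).
    theta = theta0
    N = len(x)
    for _ in range(iters):
        grad = np.sum(x - theta) / sigma**2      # first derivative of ln p(x; theta)
        hess = -N / sigma**2                     # second derivative
        step = grad / hess
        theta -= step
        if abs(step) < tol:
            break
    return theta

x = np.array([480.0, 510.0, 495.0, 505.0, 490.0])  # hypothetical defect counts
print(newton_raphson_mle(x))                       # equals the sample mean here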
Method of Moments
Method of Moments (contd.)

The method of moments is generally consistent.

Given p(x; θ), suppose we know that the kth moment μ_k of x[n] is a function of θ, as given by Eq. 11:

μ_k = E[ x[n]^k ] = f(θ)    (11)

θ = f⁻¹(μ_k)    (12)

μ̂_k = (1/N) Σ_{n=0}^{N−1} x[n]^k    (13)

θ̂ = f⁻¹( (1/N) Σ_{n=0}^{N−1} x[n]^k )    (14)

We approximate the kth moment of the data x, μ̂_k, by taking the average of x[n]^k, as given by Eq. 13.

If f is an invertible function, as given by Eq. 12, then substitution of μ̂_k into Eq. 12 results in the estimator θ̂, as given by Eq. 14.
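
A small method-of-moments sketch (our addition; the exponential model and all numbers are purely illustrative). Assume the times between defect discoveries are exponential with rate θ, so the first moment is μ1 = E[x[n]] = 1/θ (Eq. 11); inverting f and plugging in the sample moment gives the estimator of Eq. 14.

import numpy as np

rng = np.random.default_rng(1)
theta_true = 0.25                              # hypothetical: defects discovered per hour
x = rng.exponential(1.0 / theta_true, 200)     # simulated inter-discovery times

mu1_hat = np.mean(x)                           # Eq. 13 with k = 1
theta_hat = 1.0 / mu1_hat                      # Eq. 14: theta = f^(-1)(mu_1) = 1/mu_1
print(theta_hat)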
Best Linear Unbiased Estimator (BLUE)
Best Linear Unbiased Estimator (BLUE)
(contd.)

BLUE is based on two essential requirements, called the linearity conditions:

The data model is linear.

The estimator itself is a linear function of the data.

The two linearity conditions are given by Eqs. 2 and 15:

θ̂ = Σ_{n=0}^{N−1} a_n x[n]    (15)

E[θ̂] = Σ_{n=0}^{N−1} a_n E[ x[n] ] = θ    (16)

Note that the second linearity condition is necessary to make θ̂ unbiased, as given by Eq. 16.

BLUE is a suboptimal estimator because the lower bound of its variance is unknown.

It can be successfully used if its variance is in an acceptable range and it is producing results with reasonable accuracy.
Best Linear Unbiased Estimator (BLUE)
(contd.)



A limitation of this method, from a practical point of view in software testing, is that we have to know the variance of the noise.

To date, no detailed study has investigated the statistical characteristics of noise in the testing process.

A simple way to approximate the variance of the noise is to use the variance of the data, as given by Eqs. 17 and 18:

VAR[x] = VAR[hθ + w]    (17)

VAR[x] = VAR[w]    (18)

However, the effects of this approximation on the performance of the BLUE estimator are unknown with respect to software testing.
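
A sketch of a BLUE for the linear model of Eq. 2 (our addition; this is the standard Gauss-Markov form for uncorrelated, equal-variance noise, not a result from the slides). The noise variance is approximated from the data as in Eqs. 17 and 18.

import numpy as np

def blue_estimate(x, h):
    # Linear unbiased estimator theta_hat = sum_n a_n x[n] with weights a = h / (h^T h);
    # for uncorrelated, equal-variance noise these weights minimize the variance,
    # which equals sigma^2 / (h^T h).
    a = h / np.dot(h, h)
    theta_hat = np.dot(a, x)
    sigma2_hat = np.var(x, ddof=1)      # crude approximation VAR[x] ~ VAR[w] (Eqs. 17-18)
    return theta_hat, sigma2_hat / np.dot(h, h)

h = np.ones(30)                                                   # hypothetical observation vector
x = 500.0 * h + np.random.default_rng(2).normal(0, 20, 30)        # simulated data, Eq. 2
print(blue_estimate(x, h))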
Least Square Error (LSE)
Least Square Error (LSE) (contd.)










It is the most commonly used approximation or estimation method.
The geometrical interpretation of LSE is more intuitive.
If we have data points in space the LSE finds a curve which minimizes the
distance from all these points together.
A weakness of LSE is that it is sensitive to outliers (points which are away
from the group of points).
Due to these outliers the curve may be found away from the vicinity of
points.
A simple way to remedy this situation is to ignore the outliers from the
data set.
Main advantages of LSE is that its simple to develop and no information
about the probability distribution of the data set or noise is needed.
On the other hand the statistical performance of LSE is questionable.
Authors have use LSE to approximate the values of
and
the defect
1
2
decay parameters in ED3M .
From the application of ED3M on several industrial data sets and
simulation data sets the performance of LSE estimator for 1 and 2 was
concluded acceptable.
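
A generic least-squares sketch (our addition; it fits a straight line rather than the ED3M decay parameters, and all data are made up), including a single outlier to illustrate the sensitivity mentioned above.

import numpy as np

def least_squares_fit(t, y, degree=1):
    # Fit a polynomial of the given degree by minimizing the sum of squared residuals.
    A = np.vander(t, degree + 1)                       # design matrix
    coeffs, *_ = np.linalg.lstsq(A, y, rcond=None)
    return coeffs

t = np.arange(10, dtype=float)                          # hypothetical time units
y = 3.0 * t + 5.0 + np.random.default_rng(3).normal(0, 1.0, 10)
y[7] += 25.0                                            # the outlier pulls the fitted line away
print(least_squares_fit(t, y))                          # compare with the outlier-free fit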




Defect Estimators

All of the following approaches are based on MLE, and MLE requires an assumption about the probability distribution.

Therefore each model defines a distribution by making some assumptions while ignoring other factors.

Hence it can be safely deduced that no model will work in all situations.

Padberg’s Maximum likelihood estimates for the Hypergeometric
software reliability model

Musa-Okumoto logarithmic Poisson execution time model for software
reliability measurement

Estimation of Defects based on Defect Decay Model (ED3M)
Padberg’s Approach

Padberg showed that, when the growth quotient Q(m) of the likelihood function L(m) is greater than 1, the likelihood function is still increasing; this observation yields the maximum likelihood estimate.

Q(m) = L(m) / L(m−1) = [ (m − w_1) ··· (m − w_n) ] / [ m^(n−1) (m − c_n) ]    (19)

In Eq. 19, m is the initial number of faults, w_n is the number of newly discovered and rediscovered faults in the nth test, and c_n is the cumulative number of faults in n tests.

For a given data set, first set x = c_n + 1 and compute Q(x).

If Q(x) > 1, then set x = x + 1 and find Q(x) again.

Keep repeating these steps until Q(x) ≤ 1 (a sketch of this iteration is given below).

The statistical performance of Q(m) is not discussed.

We do not know whether the variance of the resulting estimator is asymptotically bounded by the CRLB, in other words whether it is asymptotically an efficient MVU estimator.

Even though the underlying data model is not stated explicitly, it can be observed from Eq. 19 that the model is nonlinear.

If the data model is nonlinear then the MLE cannot achieve the CRLB for finite data records.
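
A sketch of the search procedure above (our addition; the per-test fault counts and the cumulative count are made up, and the variable names are ours).

def growth_quotient(m, w, c_n):
    # Q(m) = L(m)/L(m-1) = [(m - w_1)...(m - w_n)] / [m^(n-1) * (m - c_n)]   (Eq. 19)
    n = len(w)
    num = 1.0
    for wi in w:
        num *= (m - wi)
    return num / ((float(m) ** (n - 1)) * (m - c_n))

def padberg_estimate(w, c_n):
    # w[i]: newly discovered + rediscovered faults in test i; c_n: cumulative faults in n tests.
    m = c_n + 1                                   # start just above the faults seen so far
    while growth_quotient(m, w, c_n) > 1.0:
        m += 1
    return m

# Hypothetical data: 46 fault detections in total, 40 distinct faults (6 rediscoveries).
print(padberg_estimate([12, 9, 7, 6, 5, 4, 3], c_n=40))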
Musa-Okumoto logarithmic Poisson execution time model

Musa and Okumoto proposed a reliability model based on the assumption that the number of faults M(t) found by time t is Poisson distributed with mean μ(t), as given in Eq. 20.

The parameters to be estimated are λ0, the initial failure intensity, and θ, the rate of reduction in the normalized failure intensity per failure.

The data model in Eq. 21 is a nonlinear function of λ0 and θ; hence MLE will not achieve the CRLB for a finite data set.

Pr[ M(t) = m ] = ( [μ(t)]^m / m! ) e^(−μ(t))    (20)

μ(t) = (1/θ) ln(λ0 θ t + 1)    (21)

A closed-form solution of the MLE could not be found for Eqs. 20 and 21.

Therefore a numerical approximation of the MLE is needed.

Whether the approximation of the MLE will be asymptotically an efficient MVU estimator is not guaranteed.
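
A rough sketch (our addition) of the kind of numerical maximization involved. It assumes individual failure times t_1, …, t_n observed up to time T, uses the standard NHPP log-likelihood with the mean value function of Eq. 21, and replaces Newton-Raphson by a crude grid search; the data and search ranges are made up.

import numpy as np

def mo_log_likelihood(lam0, theta, times, T):
    # NHPP log-likelihood: sum_i ln lambda(t_i) - mu(T), with
    # mu(t) = (1/theta) ln(lam0*theta*t + 1) and lambda(t) = lam0 / (lam0*theta*t + 1).
    lam = lam0 / (lam0 * theta * times + 1.0)
    mu_T = np.log(lam0 * theta * T + 1.0) / theta
    return np.sum(np.log(lam)) - mu_T

def fit_musa_okumoto(times, T):
    best = None
    for lam0 in np.linspace(0.01, 2.0, 100):           # illustrative search ranges
        for theta in np.linspace(0.001, 0.2, 100):
            ll = mo_log_likelihood(lam0, theta, times, T)
            if best is None or ll > best[0]:
                best = (ll, lam0, theta)
    return best

times = np.array([5.0, 12.0, 30.0, 55.0, 90.0, 140.0, 210.0, 300.0])   # made-up failure times
print(fit_musa_okumoto(times, T=350.0))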
Estimation of Defects based on Defect Decay
Model (ED3M)




The data model of ED3M is given by Eq. 22, where D is the defect data vector, h is the observation vector, and w is the noise vector. The vectors are of dimension N×1.

We have assumed that D is normally distributed; the PDF of D is given by Eq. 24.

The initial number of defects in the software is given by R0.

The MLE estimator R̂0 for R0 is given by Eq. 25.

D = R0 h + w    (22)

h(n) = 1 − ( λ2 e^(−λ1 n) − λ1 e^(−λ2 n) ) / (λ2 − λ1)    (23)

p(D; R0) = ( 1 / (2πσ²)^(N/2) ) exp[ −(1/(2σ²)) (D − R0 h)ᵀ (D − R0 h) ]    (24)

R̂0 = (hᵀh)⁻¹ hᵀ D    (25)

As seen in Eq. 22 the data model is linear; therefore the MLE estimator R̂0 in Eq. 25 can achieve the CRLB for a finite data set and is an efficient MVU estimator.
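
A sketch (our addition) of computing Eq. 25 on simulated data; the decay parameters used to build h, the noise level, and the "true" R0 are placeholder values, not values from the ED3M case studies.

import numpy as np

def ed3m_r0_estimate(D, h):
    # Eq. 25: R0_hat = (h^T h)^(-1) h^T D, the MLE under the linear model of Eq. 22.
    return np.dot(h, D) / np.dot(h, h)

lam1, lam2 = 0.15, 0.03                      # placeholder decay parameters
n = np.arange(1, 51, dtype=float)
h = 1 - (lam2 * np.exp(-lam1 * n) - lam1 * np.exp(-lam2 * n)) / (lam2 - lam1)   # Eq. 23

R0_true = 500.0                              # made-up "true" value for the simulation
rng = np.random.default_rng(4)
D = R0_true * h + rng.normal(0.0, 10.0, n.size)   # Eq. 22 with Gaussian noise
print(ed3m_r0_estimate(D, h))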
[Figure: Comparison between ED3M and Padberg's MLE, where the total number of defects is 481 and the total time length is 111 units.]

[Figure: Comparison between ED3M and the Musa-Okumoto model, where the total number of defects is 136 and the total time length is 88.682x10^3 CPU seconds.]
Conclusion and future work

An accurate prediction of the total number of software defects helps in evaluating the status of the testing process.

The accuracy of an estimator, however, depends on the estimation method used to develop it.

We have tried to provide a general framework of the available estimation methods for researchers who are interested in defect estimation.

Although the discussion has centered on software testing and defect estimation, it is general enough to be applied to other estimation problems.

We have elicited the requirements of each method.

We have also discussed the statistical efficiency that each method offers.

Even though the discussion is limited to single-parameter estimation, it can easily be extended to a vector of parameters to be estimated.

In the future we will extend our discussion to Bayesian approaches and expand the analysis of existing estimators to be more comprehensive.