RECOMPUTING COVERAGE INFORMATION TO ASSIST REGRESSION TESTING
Written by:
Pavan Kumar Chittimalli
Mary Jean Harrold
Reviewed by:
Joan Baldriche
OUTLINE
Introduction
  Motivation
  Problem Statement
  Coverage Data
Background
Code Examples and Coverage Matrices
Algorithm
ReCover Tool
Empirical Studies
  Subjects
  Study 1
  Study 2
  Study 3
Related Work
Conclusions
Paper Critique
Questions
INTRODUCTION
MOTIVATION
Software is changed for a variety of reasons, such
as correcting errors, adding new features, and
improving performance.
 After the software is changed, regression testing
is applied to the modified version of the software
to ensure that it behaves as intended, and that
modifications have not adversely impacted its
quality.
 Regression testing is expensive: it can consume
as much as 80% of the testing budget and up to
50% of the cost of software maintenance.

INTRODUCTION
PROBLEM STATEMENT



Need to develop a technique that saves time and
resources by not rerunning the entire test suite.
Need to create a procedure that does not use
inaccurate data from outdated coverage data or
estimated coverage data.
Need to produce a method that offers precise
results as if all the test cases in the test suite
were run.
INTRODUCTION
COVERAGE DATA
Coverage data is a measure used in software
testing. It describes the degree to which the
source code of a program has been tested.
 Coverage data collected when testing a version of
software is used by regression testing techniques
to assist in identifying the testing that should be
performed on the new version of the software.
 In this paper we will discuss three kinds of
coverage data: outdated, updated and estimated
coverage data.

INTRODUCTION
COVERAGE DATA



One approach is to reuse the coverage data collected when
the test suite Ti is run on one version of the program Pi for
tasks on Pi+1 and subsequent versions so that the expense
of recomputing it for each subsequent version of Pi is
avoided. This is called outdated coverage data.
The next approach reruns all test cases in Ti on Pi+1 to get
accurate coverage data on Pi+1. This is called updated
coverage data. However, this approach defeats the purpose
of techniques that aim to reduce the number of test cases
that need to be rerun because it reruns all test cases in Ti
on Pi+1.
The last approach estimates coverage data for Pi+1 based on
coverage data for Pi. This is called estimated coverage data.
BACKGROUND




Since it is very expensive to rerun all the test cases
in a test suite during regression testing,
researchers have developed techniques to improve
the efficiency of the retesting.
For example, regression test selection (RTS)
techniques select a subset Ti’ of Ti and use it to
test Pi+1.
We use an RTS technique implemented as
DEJAVOO to illustrate the impact of outdated data
on regression testing.
DEJAVOO creates control-flow graphs for the
original (Porig) and modified (Pmod) versions of a
program.
BACKGROUND



The technique traverses both graphs in depth-first order
to identify dangerous edges.
Dangerous edges are edges whose sinks differ and for
which test cases in T that executed the edge in Porig should
be rerun on Pmod because they may behave differently in
Pmod.
An example of these graphs is shown in Fig. 4, from
running DEJAVOO on the program Grade.
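
The idea of dangerous edges can be sketched roughly as follows. This is not DEJAVOO's implementation: it assumes a control-flow graph stored as a dictionary from node to labeled outgoing edges, compares sinks by their statement text, and uses a hypothetical two-branch program rather than Grade. All function and variable names are illustrative.

```python
from typing import Dict, Set, Tuple

Graph = Dict[str, Dict[str, str]]  # node -> {edge label -> sink node}
Edge = Tuple[str, str]             # (source node, edge label)


def dangerous_edges(g_orig: Graph, text_orig: Dict[str, str],
                    g_mod: Graph, text_mod: Dict[str, str],
                    entry: str = "entry") -> Set[Edge]:
    """Walk both graphs together, depth-first, and collect edges of the
    original graph whose sinks differ in the modified graph."""
    dangerous: Set[Edge] = set()
    visited: Set[Tuple[str, str]] = set()
    stack = [(entry, entry)]
    while stack:
        n_orig, n_mod = stack.pop()
        if (n_orig, n_mod) in visited:
            continue
        visited.add((n_orig, n_mod))
        for label, sink in g_orig.get(n_orig, {}).items():
            sink_mod = g_mod.get(n_mod, {}).get(label)
            if sink_mod is None or text_orig[sink] != text_mod[sink_mod]:
                dangerous.add((n_orig, label))   # do not look past this edge
            else:
                stack.append((sink, sink_mod))
    return dangerous


def select_tests(edge_coverage: Dict[str, Set[Edge]],
                 dangerous: Set[Edge]) -> Set[str]:
    """Test cases that executed at least one dangerous edge in Porig."""
    return {t for t, edges in edge_coverage.items() if edges & dangerous}


# Hypothetical example: only the statement on the false branch changes.
g = {"entry": {"": "n1"}, "n1": {"T": "n2", "F": "n3"},
     "n2": {"": "exit"}, "n3": {"": "exit"}}
text_v0 = {"entry": "entry", "n1": "if x > 0", "n2": "y = 1",
           "n3": "y = 2", "exit": "exit"}
text_v1 = dict(text_v0, n3="y = 3")

edges = dangerous_edges(g, text_v0, g, text_v1)
edge_cov = {"t1": {("entry", ""), ("n1", "T"), ("n2", "")},
            "t2": {("entry", ""), ("n1", "F"), ("n3", "")}}
selected = select_tests(edge_cov, edges)
```

Only t2 traverses the changed branch, so only t2 would be selected for rerunning in this sketch.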
CODE EXAMPLES AND COVERAGE
MATRICES
CONTROL-FLOW GRAPHS: (a) v0, (b) v1, (c) v2
ALGORITHM
The algorithm, RECOMPUTEMATRIX, shown in
Fig. 5, for recomputing coverage data after
changes are made to a program provides the
same coverage data as rerunning all test cases in
the original test suite but requires running only
those test cases selected to run on the modified
program.
 RECOMPUTEMATRIX takes four inputs: Porig,
Pmod, T, and morig (the coverage matrix for Porig).
 RECOMPUTEMATRIX outputs mmod, the coverage
matrix for Pmod.

ALGORITHM

RECOMPUTEMATRIX consists of five main
steps:
1. Creating and initializing the coverage matrix mmod for
Pmod (lines 1 and 2)
2. Identifying T’, the set of test cases in T to rerun on Pmod,
and computing the entity mappings entityMap between
Porig and Pmod (line 3)
3. Creating the selectively instrumented version of Pmod,
Pmod-inst (line 4)
4. Running Pmod-inst with T’ to get coverage data for the
affected entities in Pmod (lines 5-7)
5. Transferring the coverage data for the unaffected parts
of Pmod using the mappings stored in entityMap and the
affected entities in insEntities (lines 8-18)
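
The five steps can be sketched roughly as below. This is a simplified reading, not the algorithm of Fig. 5: test selection and instrumentation (steps 2 and 3) are assumed to have happened already and arrive as the inputs selected, entity_map, and ins_entities, and run_instrumented is a hypothetical callback standing in for running Pmod-inst on one test case.

```python
from typing import Callable, Dict, Iterable, Set

Matrix = Dict[str, Set[str]]  # test case -> covered entities


def recompute_matrix(tests: Iterable[str],
                     m_orig: Matrix,
                     selected: Set[str],
                     entity_map: Dict[str, str],
                     ins_entities: Set[str],
                     run_instrumented: Callable[[str], Set[str]]) -> Matrix:
    tests = list(tests)
    # Step 1: create and initialize the matrix for Pmod.
    m_mod: Matrix = {t: set() for t in tests}
    # Step 4: run only the selected tests on the instrumented program
    # to get coverage of the affected (instrumented) entities.
    for t in selected:
        m_mod[t] |= run_instrumented(t) & ins_entities
    # Step 5: transfer coverage of unaffected entities from m_orig,
    # translating entity names through entity_map.
    for t in tests:
        for entity in m_orig.get(t, set()):
            mapped = entity_map.get(entity)
            if mapped is not None and mapped not in ins_entities:
                m_mod[t].add(mapped)
    return m_mod


# Hypothetical data: entity "c" was changed into "c_new" and "d" was added.
m_orig = {"t1": {"a", "b"}, "t2": {"a", "c"}}
entity_map = {"a": "a", "b": "b"}       # unaffected entities only
ins_entities = {"c_new", "d"}           # affected entities in Pmod
fake_runs = {"t2": {"c_new", "d"}}      # stands in for executing Pmod-inst

m_mod = recompute_matrix(["t1", "t2"], m_orig, {"t2"}, entity_map,
                         ins_entities, lambda t: fake_runs[t])
```

In this sketch t1 is never executed, yet its row in m_mod is filled in from m_orig, which is the saving the technique aims for.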
ALGORITHM
RECOMPUTEMATRIX
ALGORITHM
SELECTIVEINSTRUMENT
ALGORITHM
MATRICES
RECOVER TOOL
EMPIRICAL STUDIES
SUBJECTS
EMPIRICAL STUDIES
STUDY 1

Question:






What are the effects of the three techniques for providing coverage
data—outdated, estimated, and updated—on regression test
selection (RTS)?
To answer this research question, the authors used all six
subjects described earlier. For these subjects, the authors
populated outdated, estimated, and updated coverage data.
Outdated coverage data was obtained by running the test
suite on an earlier version of the program.
Estimated coverage data was obtained using JDIFF.
Updated coverage data was obtained using ReCover.
Then DEJAVOO was run on the subject programs with the
three different coverage data sets.
EMPIRICAL STUDIES
STUDY 1
Jakarta Regexp, ProAX, Assent, NanoXML, JABA
EMPIRICAL STUDIES
STUDY 2

Question:




What is the effect of selective instrumentation in reducing the expense
of running the test suite selected by the RTS algorithm?
To answer this research question, the authors performed
two experiments.
In the first experiment, the authors measured and
compared the number of branches instrumented by the full
instrumentation and by the selective instrumentation.
In the second experiment, the authors measured and
compared the time to run the selected test suite T’ on Pmod
instrumented with full instrumentation and the time to
run T’ on Pmod instrumented with selective
instrumentation.
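
The first experiment's comparison can be illustrated with a small sketch. It assumes, as one plausible reading, that selective instrumentation places probes only on entities reachable from the sinks of dangerous edges, and it counts nodes rather than branches for simplicity. The graph, the function names, and the resulting numbers are hypothetical and are not results from the paper.

```python
from typing import Dict, Iterable, List, Set


def reachable_from(graph: Dict[str, List[str]],
                   starts: Iterable[str]) -> Set[str]:
    """Nodes reachable from any start node (the start nodes included)."""
    seen: Set[str] = set()
    stack = list(starts)
    while stack:
        node = stack.pop()
        if node in seen:
            continue
        seen.add(node)
        stack.extend(graph.get(node, []))
    return seen


def probe_reduction(full_probes: int, selective_probes: int) -> float:
    """Percentage of probes saved by selective instrumentation."""
    return 100.0 * (full_probes - selective_probes) / full_probes


# Hypothetical control-flow graph of Pmod: node -> successor nodes.
graph = {"entry": ["n1"], "n1": ["n2", "n3"],
         "n2": ["n4"], "n3": ["n4"], "n4": []}
dangerous_sinks = {"n3"}               # sinks of the dangerous edges

selective = reachable_from(graph, dangerous_sinks)
saved = probe_reduction(len(graph), len(selective))
```

With full instrumentation all five nodes carry a probe, while the selective variant in this sketch instruments only two.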
EMPIRICAL STUDIES
STUDY 2
Experiment 1
Experiment 2
EMPIRICAL STUDIES
STUDY 3

Question:


What is the efficiency of our technique for updating coverage data as
part of a regression testing process?
To answer this question, the authors measured and
compared regression-testing time for four approaches:
Running all test cases in T on all versions of the program P.
 Selecting T’ using DEJAVOO and running the test cases in T’ on all
modified versions of P.
 Selecting T’ and recording mappings using MOD-DEJAVOO, updating
coverage data for T-T’ using RECOVER, instrumenting the modified
versions of P with full instrumentation, and running the test cases in
T’ on the fully instrumented modified versions of P.
 Selecting T’ and recording the mappings using MOD-DEJAVOO,
updating coverage data for T-T’ using RECOVER, instrumenting
modified versions of P using selective instrumentation, and running
test cases in T’ on the selectively instrumented modified versions of P.

EMPIRICAL STUDIES
STUDY 3

The proposed technique using RECOVER with
selective instrumentation saves, on average,
17.35 percent of the regression-testing time for
all of the experimental subjects.
RELATED WORK

To the best of the authors’ knowledge, no other
technique has been presented to solve the
problem of providing accurate coverage
information without rerunning all test cases in
the test suite. However, several techniques are
related in that they confirm the existence of the
problem or provide alternative approaches.
CONCLUSIONS
The paper presented a technique that
provides updated coverage data for a modified
program without running all test cases in the test
suite that was developed for the original program
and used for regression testing.
 The technique is safe and precise in that it
computes exactly the same information as if all
test cases in the test suite were rerun.
 The paper also reported the results of three empirical
studies on a set of subject programs of varying
sizes, along with versions of those programs and
test suites used to test them.

CONCLUSIONS



The first study confirms that regression test selection
using outdated and estimated coverage data causes
the regression test selection algorithm to both select
unnecessary test cases and omit important test cases.
The second study shows that selective
instrumentation reduces the number of probes that
are required for running the test cases selected by the
regression test selection algorithm. This reduction
saves time when running the selected test cases
and thus reduces the overall regression
testing time.
The third study shows that the technique with
selective instrumentation reduces the time required
for regression testing compared with DEJAVOO.
PAPER CRITIQUE

Benefits:
Description of a novel technique that computes
accurate, updated coverage data when a program is
modified, without rerunning unnecessary test cases.
 Discussion of a tool, RECOVER, that implements the
technique and integrates it with RTS.
 Set of empirical studies that show, for the subjects
studied, that the technique provides an effective and
efficient way to update coverage data for use on
subsequent regression-testing tasks.

QUESTIONS