Locating Causes of Program Failures
Texas State University
CS 5393 Software Quality Project
Yin Deng
Topics

Introduction

What is the problem?

Overview of major solutions

A Sample Failure

Case Study

Complexity and other issues

Conclusion

Related Material
Introduction



Locating Causes of Program Failures

Holger Cleve and Andreas Zeller

ICSE 2005, research papers on Fault Localization

Holger Cleve is a member of the software engineering
research group at Saarland University in Germany.
http://www.st.cs.uni-sb.de/~cleve/

Andreas Zeller is a full professor and the chair of the
software engineering research group at Saarland University.
His research concerns the analysis of why large, complex
software systems fail to work as they should.
http://www.st.cs.uni-sb.de/zeller/
What’s the Problem?


Definitions

Failure: The program’s behavior does not satisfy its
requirement specification.

Fault / Infection: An incorrect intermediate state that
may be entered during program execution.

Failure → Infection → Defect in code: a failure implies an
infection, and an infection implies a defect, but not vice versa.
Problem

Why does a program fail?

How do we find the defects that cause a software failure?
Overview of major solutions


Searching in Space

Search across a single program state to find the infected
variable(s), often among thousands.

Focus on the difference between the program states
where the failure occurs, and the states where the
failure does not occur.

Using Delta Debugging, those initial differences can be
systematically narrowed down to a small set of variables.
Searching in Time

Search over millions of program states to find the
moment when the defect was executed.

Focus on cause transitions (CTS)!
Searching in Space
[Figure: state in the passing run r✔ beside state in the failing run r✘. Legend: variable with same value; irrelevant variable with different value; relevant variable with different value.]

Compare the program states of a passing run r✔ and a
failing run r✘ at a certain moment.

Of all the state differences, only some may be
relevant for the failure.

How do we find a subset of relevant variables that is
as small as possible?

Use Delta Debugging, which behaves very much like a
binary search.
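The narrowing step can be sketched in C. This is a minimal sketch of the simpler "ddmin" minimization variant of Delta Debugging, not the authors' implementation. It assumes the state differences are numbered 0..k-1 and that a hypothetical oracle (here called fails) reports whether applying a given subset of differences reproduces the failure. The names dd_minimize, dd_test_fn, example_fails, and DD_MAX are illustrative assumptions.

```c
#include <string.h>

#define DD_MAX 64  /* illustrative upper bound on the number of differences */

/* Hypothetical oracle: returns 1 if applying these state differences
   makes the failure appear, 0 otherwise. In a real setting each call
   is one test run. */
typedef int (*dd_test_fn)(const int *deltas, int count);

/* Shrinks deltas[0..count) in place to a failing subset from which no
   single difference can be removed; returns the new count.
   Precondition: the full set fails. */
int dd_minimize(int *deltas, int count, dd_test_fn fails)
{
    int n = 2;            /* granularity: number of chunks to split into */
    int buf[DD_MAX];

    if (count > DD_MAX)
        return count;     /* sketch only: refuse inputs larger than buf */

    while (count >= 2) {
        int chunk = (count + n - 1) / n;   /* ceil(count / n) */
        int reduced = 0;
        int start;

        /* 1. Does one chunk alone reproduce the failure? */
        for (start = 0; start < count && !reduced; start += chunk) {
            int len = (count - start < chunk) ? count - start : chunk;
            if (fails(deltas + start, len)) {
                memmove(deltas, deltas + start, (size_t)len * sizeof(int));
                count = len;
                n = 2;
                reduced = 1;
            }
        }

        /* 2. Otherwise, does the set still fail with one chunk removed? */
        for (start = 0; start < count && !reduced; start += chunk) {
            int len = (count - start < chunk) ? count - start : chunk;
            int rest = count - len;
            memcpy(buf, deltas, (size_t)start * sizeof(int));
            memcpy(buf + start, deltas + start + len,
                   (size_t)(count - start - len) * sizeof(int));
            if (fails(buf, rest)) {
                memcpy(deltas, buf, (size_t)rest * sizeof(int));
                count = rest;
                n = (n - 1 < 2) ? 2 : n - 1;
                reduced = 1;
            }
        }

        if (reduced)
            continue;
        if (n >= count)
            break;                          /* finest granularity reached */
        n = (2 * n > count) ? count : 2 * n; /* 3. Try smaller chunks */
    }
    return count;
}

/* Made-up oracle for illustration: the failure needs differences 3 and 6. */
int example_fails(const int *deltas, int count)
{
    int i, has3 = 0, has6 = 0;
    for (i = 0; i < count; i++) {
        if (deltas[i] == 3) has3 = 1;
        if (deltas[i] == 6) has6 = 1;
    }
    return has3 && has6;
}
```

Starting from int d[8] = {0, 1, 2, 3, 4, 5, 6, 7}, the call dd_minimize(d, 8, example_fails) returns 2 and leaves 3 and 6 at the front of d. When a single chunk fails, the search halves the candidate set, which is where the binary-search-like behavior comes from.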
Searching in Time

A cause transition is where a
cause originates. It points to
program code that causes the
transition and hence the failure.

At a cause transition, one variable ceases to be
a failure cause and another variable begins to be one.

Cause transitions are not only
good locations for fixes, they
actually locate the defects that
cause the failure.
[Figure: cause transition]
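The search for cause transitions can be sketched as a recursive binary search over program steps. This is a minimal sketch under stated assumptions, not the authors' implementation: a hypothetical callback cause_at(step) returns an id for the variable that Delta Debugging isolates as the failure cause at that step, and an interval whose two endpoints share the same cause is assumed to contain no transition. The names find_transitions, cause_fn, example_cause, and MAX_TRANSITIONS are illustrative, and the step boundaries in example_cause are made up.

```c
#define MAX_TRANSITIONS 32  /* illustrative capacity of the output array */

/* Hypothetical callback: id of the variable that is the failure cause
   at the given step. Each call would be one run of Delta Debugging;
   a real tool would cache results instead of recomputing them. */
typedef int (*cause_fn)(int step);

/* Stores in out[] every step t in [lo, hi) where the cause at t differs
   from the cause at t + 1. 'found' is the number of entries already in
   out[]; the function returns the updated number. */
int find_transitions(int lo, int hi, cause_fn cause_at, int *out, int found)
{
    int mid;

    if (found >= MAX_TRANSITIONS || cause_at(lo) == cause_at(hi))
        return found;           /* assumed: same cause, no transition */
    if (hi - lo == 1) {
        out[found] = lo;        /* transition happens between lo and hi */
        return found + 1;
    }
    mid = lo + (hi - lo) / 2;
    found = find_transitions(lo, mid, cause_at, out, found);
    return find_transitions(mid, hi, cause_at, out, found);
}

/* Made-up cause timeline with four causes (ids 0..3) over 44 steps. */
int example_cause(int step)
{
    if (step < 11) return 0;
    if (step < 29) return 1;
    if (step < 36) return 2;
    return 3;
}
```

With int out[MAX_TRANSITIONS], the call find_transitions(1, 44, example_cause, out, 0) returns 3 and stores the steps 10, 28, and 35, each being the last step before the cause changes.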
Example – Source Code
 1  /* sample.c -- Sample C program */
 2
 3  #include <stdio.h>
 4  #include <stdlib.h>
 5
 6  static void shell_sort(int a[], int size)
 7  {
 8      int i, j;
 9      int h = 1;
10      do {
11          h = h * 3 + 1;
12      } while (h <= size);
13      do {
14          h /= 3;
15          for (i = h; i < size; i++)
16          {
17              int v = a[i];
18              for (j = i; j >= h && a[j - h] > v; j -= h)
19                  a[j] = a[j - h];
20              if (i != j)
21                  a[j] = v;
22          }
23      } while (h != 1);
24  }
25
26  int main(int argc, char *argv[])
27  {
28      int i = 0;
29      int *a = NULL;
30
31      a = (int *)malloc((argc - 1) * sizeof(int));
32      for (i = 0; i < argc - 1; i++)
33          a[i] = atoi(argv[i + 1]);
34
35      shell_sort(a, argc);
36
37      for (i = 0; i < argc - 1; i++)
38          printf("%d ", a[i]);
39      printf("\n");
40
41      free(a);
42      return 0;
43  }
Example – Running Result

A passing run r✔
$ sample 9 8 7
7 8 9

A failing run r✘
$ sample 11 14
0 11
What’s wrong?
Example – Searching in Space
State differences between the passing run r✔ and the failing run r✘. One of these differences causes sample to fail.
Example – Searching in Space (cont.)


Procedures

Run the passing run r✔ up to Line 9.

Apply half of the state differences from the
failing run r✘ to r✔.

Resume execution and determine the outcome.
Result

At Line 9, a[2] being zero causes sample to fail.

What causes a[2] to be zero?
Example – Searching in Time
Example – Searching in Time (cont.)


Procedures

Find an interval of matches to start with;

There is a cause transition between argc in Step 1 and a[0]
in Step 44;

Use Delta Debugging to find the relevant variables between argc
and a[0] (function calls are preferred); a[2] is isolated;

CTS: Step 26 (a[2] again);

CTS: Step 35 (v).
Result

argc → a[2] in Lines 32–35 (Steps 8–11);

a[2] → v in Line 17 (Step 29);

v → a[0] in Line 21 (Step 36).
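The chain from argc through a[2] and v to a[0] can be replayed deterministically. The following is a minimal sketch, not part of the original program: shell_sort is copied from sample.c, while first_after_sort is a hypothetical helper that models the stray element a[2] as an in-bounds third slot, so that the behavior is well defined. In sample.c itself, a[2] lies outside the allocated array when two arguments are given.

```c
/* shell_sort exactly as in sample.c */
static void shell_sort(int a[], int size)
{
    int i, j;
    int h = 1;
    do {
        h = h * 3 + 1;
    } while (h <= size);
    do {
        h /= 3;
        for (i = h; i < size; i++)
        {
            int v = a[i];
            for (j = i; j >= h && a[j - h] > v; j -= h)
                a[j] = a[j - h];
            if (i != j)
                a[j] = v;
        }
    } while (h != 1);
}

/* Hypothetical helper: sorts {x, y, stray} with the given size (2 or 3)
   and returns a[0], the first value the program would print.
   size == 3 mimics shell_sort(a, argc) for two arguments;
   size == 2 mimics a call with the intended size argc - 1. */
int first_after_sort(int x, int y, int stray, int size)
{
    int a[3];
    a[0] = x;
    a[1] = y;
    a[2] = stray;
    shell_sort(a, size);
    return a[0];
}
```

first_after_sort(11, 14, 0, 3) returns 0, matching the failing output "0 11": the zero in a[2] is read into v at Line 17 and written to a[0] at Line 21. first_after_sort(11, 14, 0, 2) returns 11, because the stray slot is never read.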
Example – Debugging Result
Case Study: The GCC Failure
The program that crashes GCC
Complexity


Searching in space

Best case: Delta Debugging needs 2s log k test runs to
isolate s failure-inducing variables from k state differences.

Worst case: k² + 3k test runs.

In practice, Delta Debugging behaves much closer to
logarithmic than to linear.
Searching in time

A simple binary search over n program steps, repeated for
each cause transition.

For m cause transitions, we need m log n runs of Delta
Debugging.
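To get a feel for these bounds, the formulas can be evaluated for concrete sizes. This is a minimal sketch that assumes base-2 logarithms rounded up (the slide does not state the base); the function names are illustrative.

```c
/* ceil(log2(k)) for k >= 1, using integer arithmetic only. */
unsigned ceil_log2(unsigned long k)
{
    unsigned bits = 0;
    unsigned long p = 1;
    while (p < k) {
        p <<= 1;
        bits++;
    }
    return bits;
}

/* Best case in space: 2 * s * log k test runs. */
unsigned long dd_best_case_runs(unsigned long s, unsigned long k)
{
    return 2 * s * ceil_log2(k);
}

/* Worst case in space: k^2 + 3k test runs. */
unsigned long dd_worst_case_runs(unsigned long k)
{
    return k * k + 3 * k;
}

/* Search in time: m * log n runs of Delta Debugging. */
unsigned long time_search_dd_runs(unsigned long m, unsigned long n)
{
    return m * ceil_log2(n);
}
```

For one failure-inducing variable among 1024 state differences, dd_best_case_runs(1, 1024) gives 20 test runs, while dd_worst_case_runs(1024) gives 1051648. For three cause transitions over 44 steps, time_search_dd_runs(3, 44) gives 18 runs of Delta Debugging.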
Practical Issues



Accessing state

Currently using GDB, which is painfully slow;

More efficient ways need to be explored.
Capturing accurate states

Several heuristics are used to decide how state is
transferred between runs;

When such heuristics fail, the state cannot be transferred.
Incomparable states

When control flow reaches different points in the passing run r✔
and the failing run r✘, the resulting states are not comparable,
simply because the sets of local variables differ.

Some effort is required to determine where the control flows
of r✔ and r✘ diverge and converge.
Conclusion

Cause transitions locate the software defect that
causes a given failure, performing twice as well
as any other technique previously known.

The technique requires an automated test, a
means to observe and manipulate the program
state, and at least one alternate passing
test run.

The technique could be used as an add-on to
running an automated test suite; we not only
know that a test has failed, but also why and
where it failed.
Related Material

Isolating cause-effect chains from computer programs.

A. Zeller. In W. G. Griswold, editor, Proc. Tenth ACM SIGSOFT
Symposium on the Foundations of Software Engineering (FSE-10),
pages 1–10, Charleston, South Carolina, Nov. 2002. ACM Press.

Simplifying and isolating failure-inducing input.

A. Zeller and R. Hildebrandt. IEEE Transactions on Software
Engineering, 28(2):183–200, Feb. 2002.

Visualizing memory graphs.

T. Zimmermann and A. Zeller. In S. Diehl, editor, Proc. of the
International Dagstuhl Seminar on Software Visualization, volume
2269 of Lecture Notes in Computer Science, pages 191–204, Dagstuhl,
Germany, May 2002. Springer-Verlag.

Why Programs Fail: A Guide to Systematic Debugging.

A. Zeller. Morgan Kaufmann Publishers, October 2005.
ISBN 1558608664.
Any Questions?