Software Testing and Reliability


Software Testing and Reliability
Southern Methodist University
CSE 7314
“A working program
remains an elusive thing of
beauty”
Robert Dunn
Syllabus
• Instructor: Rob Oshana
• Office hours: By appointment
• Phone: (281) 274-3211
• Fax: (214) 768-3085
• E-mail: [email protected]
• Web site: www.engr.smu.edu/cse/roshana/cse7314
Syllabus
• Required Text Book: Systematic Software Testing, by Rick Craig and Stefan Jaskiel, Artech House, ISBN 1-58053-508-9
• Practical Guide to Testing Object-Oriented Software, by David A. Sykes and John D. McGregor, Addison-Wesley Pub Co; ISBN: 0201325640
Syllabus
• Student Evaluation: The course grade will be computed as follows:
  – Midterm Exam: 30%
  – Final Exam: 30%
  – Homework: 15%
  – Project: 25%
Course Outcomes
• Upon successful completion of this
course the student will be able to:
• 1. Determine the test techniques
applicable to a given program
• 2. Construct a test suite using the
techniques discussed in class
Course Outcomes
• 3. Determine various test and
quality metrics of a program
• 4. Create and manage an
effective software testing team
Course is / is not
• Is a roadmap approach for test
professionals
• Is not an implementation course
• Is not a software testing tools
course
Outline
Trip 1: Overview of the testing process; Risk analysis; Master test planning; Detailed test planning (Readings: Craig, Chapters 1-4)
Trip 2: Analysis and design; Test implementation; Test execution (Readings: Craig, Chapters 5-7)
Trip 3: Test organization; The software tester; The test manager; Improving the process (Readings: Craig, Chapters 8-11)
Outline
Trip 4: Statistical testing techniques; Testing OO systems (Readings: Notes; Sykes, Chapters 1-6)
Trip 5: Testing OO systems (Readings: Sykes, Chapters 7-10)
Trip 6: Testing RT systems; Testing safety critical systems; Testing web based systems (Readings: Notes)
Outline
• Selected readings will be sent to
the students on a periodic basis
• Homework:
assignments/schedule will be
posted on the web site shortly
• Project will be discussed next
trip
Testing style
Competence / Test question cues:
• Knowledge: List, describe
• Comprehension: Summarize, discuss, describe
• Evaluation: Explain, compare
• Analysis: Analyze, explain, compare
“Errors are more common,
more pervasive, and more
troublesome in software
than with other
technologies”
David Parnas
Homework 1a
• Please send me a couple
paragraphs describing your
background and experience.
Describe to me what you want to
get out of the course
Homework 1a
• Please read the paper entitled
“Improving Software Testability”
• www.stlabs.com/newsletters/testnet/docs/testability.htm
CSE 7314
Software Testing and
Reliability
What is testing?
• How does testing software
compare with testing students?
What is testing?
• “Software testing is the process of comparing the invisible to the ambiguous so as to avoid the unthinkable.” James Bach, Borland Corp.
What is testing?
• “Software testing is the process of predicting the behavior of a product and comparing that prediction to the actual results.” R. Vanderwall
Purpose of testing
• Build confidence in the product
• Judge the quality of the product
• Find bugs
Finding bugs can be difficult
[Figure: a path through the mine field (use case), with bugs shown as mines scattered across the field]
Why is testing important?
• Therac-25: cost 6 lives
• Ariane 5 rocket: cost $500M
• Denver Airport: cost $360M
• Mars missions, orbital explorer & polar lander: cost $300M
Why is testing so hard?
Reasons for customer
reported bugs
• User executed untested code
• Order in which statements were
executed in actual use different from
that during testing
• User applied a combination of
untested input values
• User’s operating environment was
never tested
Interfaces to your
software
• Human interfaces
• Software interfaces (APIs)
• File system interfaces
• Communication interfaces
  – Physical devices (device drivers)
  – Controllers
Selecting test scenarios
• Execution path criteria (control)
– Statement coverage
– Branching coverage
• Data flow
– Initialize each data structure
– Use each data structure
• Operational profile
• Statistical sampling….
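As a small illustration of the last two criteria (operational profile and statistical sampling), the sketch below draws test scenarios at random according to an assumed usage profile. The transaction types and their weights are hypothetical, not from the course material.

import random

# Hypothetical operational profile: relative frequency of each transaction type in the field
OPERATIONAL_PROFILE = {"new_order": 0.70, "change_order": 0.25, "cancel_order": 0.05}

def sample_scenarios(n, seed=1):
    """Draw n test scenarios at random, weighted by the operational profile."""
    rng = random.Random(seed)
    kinds = list(OPERATIONAL_PROFILE)
    weights = [OPERATIONAL_PROFILE[k] for k in kinds]
    return [rng.choices(kinds, weights)[0] for _ in range(n)]

print(sample_scenarios(10))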
What is a bug?
• Error: mistake made in translation or
interpretation (many taxonomies
exist to describe errors)
• Fault: manifestation of the error in
implementation (very nebulous)
• Failure: observable deviation in
behavior of the system
Example
• Requirement: “print the speed,
defined as distance divided by
time”
• Code: s = d/t; print s
Example
• Error: I forgot to account for t = 0
• Fault: omission of code to catch
t=0
• Failure: exception is thrown
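A minimal sketch of this example in Python (the function and variable names are illustrative): the fault is the missing check for t = 0, and the failure only appears when that input is actually tried.

def speed(d, t):
    # Fault: no guard for t == 0 (the error was forgetting that case)
    return d / t

print(speed(100.0, 2.0))   # works: prints 50.0
print(speed(100.0, 0.0))   # failure: raises ZeroDivisionError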
Severity taxonomy
• Mild - trivial
• Annoying - minor
• Serious - major
• Catastrophic - critical
• Infectious - run for the hills
What is your taxonomy?
IEEE 1044-1993
Life cycle
[Figure: errors can be introduced at each stage of the life cycle: Requirements, Design, Code, Testing. The testing and repair flow (Classify, Isolate, Resolve) can be just as error prone as the development process (more so?)]
OK, so let's just design our systems with “testability” in mind…
Testability
• How easily a computer program
can be tested (Bach)
• We can relate this to “design for
testability” techniques applied in
hardware systems
JTAG
[Figure: a standard integrated circuit with boundary scan cells around the core IC logic and I/O pads, a boundary scan path, and a test access port controller driven by Test Data In (TDI), Test Mode Select (TMS), Test Clock (TCK), and Test Data Out (TDO)]
Operability
• “The better it works, the more
efficiently it can be tested”
– System has few bugs (bugs add
analysis and reporting overhead)
– No bugs block execution of tests
– Product evolves in functional
stages (simultaneous development
and testing)
Observability
• “What you see is what you get”
– Distinct output is generated for each
input
– System states and variables are visible
and queriable during execution
– Past system states are ….. (transaction
logs)
– All factors affecting output are visible
Observability
– Incorrect output is easily identified
– Internal errors are automatically
detected through self-testing
mechanisms
– Internal errors are automatically
reported
– Source code is accessible
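One way to get the self-testing and automatic reporting described above is to check internal invariants and log them rather than hide them. A minimal sketch, assuming a hypothetical withdraw operation and invariant:

import logging

logging.basicConfig(level=logging.INFO)
log = logging.getLogger("selftest")

def withdraw(balance, amount):
    new_balance = balance - amount
    # Internal error is automatically detected and reported, not hidden
    if new_balance < 0:
        log.error("invariant violated: balance would go negative (%s)", new_balance)
        raise ValueError("negative balance")
    log.info("withdraw ok: %s -> %s", balance, new_balance)   # system state made visible
    return new_balance

print(withdraw(100, 30))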
Visibility Spectrum
[Figure: visibility spectrum, from end customer visibility to factory visibility to GPP visibility to DSP visibility]
Controllability
• “The better we can control the
software, the more the testing
can be automated and
optimized”
– All possible outputs can be
generated through some
combination of input
– All code is executable through
some combination of input
Controllability
– SW and HW states and variables
can be controlled directly by the
test engineer
– Input and output formats are
consistent and structured
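A small sketch of direct state control, assuming a hypothetical time-based check: passing the clock in as a parameter lets the test engineer drive the state directly instead of waiting for it.

import time

def is_expired(deadline, now_fn=time.time):
    # The 'current time' state is injectable, so a test can control it directly
    return now_fn() >= deadline

# The test engineer controls the state:
assert is_expired(1000.0, now_fn=lambda: 2000.0) is True
assert is_expired(1000.0, now_fn=lambda: 0.0) is False
print("controllability checks passed")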
Decomposability
• “By controlling the scope of
testing, we can more quickly
isolate problems and perform
smarter testing”
– The software system is built from
independent modules
– Software modules can be tested
independently
Simplicity
• “The less there is to test, the
more quickly we can test it”
– Functional simplicity (feature set is
minimum necessary to meet
requirements)
– Structural simplicity (architecture
is modularized)
– Code simplicity (coding standards)
Stability
• “The fewer the changes, the
fewer the disruptions to testing”
– Changes to the software are
infrequent, controlled, and do not
invalidate existing tests
– Software recovers well from
failures
Understandability
• “The more information we have, the
smarter we will test”
– Design is well understood
– Dependencies between external,
internal, and shared components are
well understood
– Technical documentation is accessible,
well organized, specific and detailed,
and accurate
“Bugs lurk in corners and
congregate at boundaries”
Boris Beizer
Types of errors
• What is a Testing error?
– Claiming behavior is erroneous
when it is in fact correct
– ‘fixing’ this type of error actually
breaks the product
Errors in classification
• What is a Classification error ?
– Classifying the error into the
wrong category
• Why is this bad ?
– This puts you on the wrong path
for a solution
Example Bug Report
• “Screen locks up for 10 seconds
after ‘submit’ button is pressed”
• Classification 1: Usability Error
• Solution may be to catch user events
and present an hour-glass icon
• Classification 2: Performance error
• Solution may be a modification to a sort algorithm (or vice versa)
Isolation error
• Incorrectly isolating the erroneous
modules
• Example: consider a client server
architecture. An improperly formed
client request results in an
improperly formed server response
• The isolation step (incorrectly) determined that the server was at fault, so the server was changed
• This resulted in regression failures for other clients
Resolve errors
• Modifications to remediate the
failure are themselves erroneous
• Example: Fixing one fault may
introduce another
What is the ideal test
case?
• Run one test whose output is
"Modify line n of module i."
• Run one test whose output is "Input
Vector v produces the wrong output"
• Run one test whose output is "The
program has a bug" (Useless, we
know this)
More realistic test case
• One input vector and expected
output vector
– A collection of these makes up a Test Suite
• Typical (naïve) Test Case
– Type or select a few inputs and observe
output
– Inputs not selected systematically
– Outputs not predicted in advance
Test case definition
• A test case consists of:
– an input vector
– a set of environmental conditions
– an expected output.
• A test suite is a set of test cases
chosen to meet some criteria (e.g.
Regression)
• A test set is any set of test cases
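A minimal sketch of this definition in Python (the names are illustrative, not from the text): a test case bundles the input vector, environmental conditions, and expected output, and a suite is just a chosen set of cases.

from dataclasses import dataclass, field

@dataclass
class TestCase:
    inputs: tuple                 # input vector
    expected: object              # expected output
    environment: dict = field(default_factory=dict)   # environmental conditions

def run_suite(suite, function):
    """Run each case and compare the actual output against the expected output."""
    for case in suite:
        actual = function(*case.inputs)
        print("PASS" if actual == case.expected else "FAIL", case.inputs, "->", actual)

# Example: a tiny suite for the built-in abs()
run_suite([TestCase((3,), 3), TestCase((-3,), 3), TestCase((0,), 0)], abs)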
Requirements as theory
model
• Suppose we consider a specification
to be a theory describing a program
• How do we test theories?
• By examining the theory and using it
to make predictions
• First Principle of Testing: The expected results of a test should be known before the test is run
Requirements as theory
• "Accumulating evidence to
support a theory is not the
appropriate way to test it. What
you should do is try to falsify it,
to challenge it with your best
efforts at proving it false."
– Karl Popper
Requirements as theory
• Implications for us doing testing
• Testing should not be used to build confidence; that is too easy
• Testing should attempt to find
deviations from the theory, that
is, bugs
• Any other purpose sets up the
wrong goal
Requirements as theory
• "Program testing can be used to
show the presence of bugs, but
never show their absence!" O.-J.
Dahl, E. W. Dijkstra, and C.A.R.
Hoare, Structured Programming,
New York: Academic, 1972.
• "Absence of proof (of bugs) is
not proof of absence."; Logic
101
A few words about
computability
• From the theory of computability, we
know:
• It is undecidable whether a given
program will halt on a given input.
(Halting problem)
• It is undecidable whether two
programs will always output the
same answer for a given input.
(Equivalence)
Implications for testing
• There is no general solution for
the automated oracle problem
– no automatic testing strategy can
be devised that will work in all
cases
• There is no general way to find
the input that causes a specific
line of code to be executed
• Coverage is undecidable
All of the following are
undecidable
• Will a given statement ever be exercised
by any input?
• What input will exercise a given
statement?
• Will a given input exercise some specified
statement?
• Will a given path ever be exercised by any
input?
• What input will exercise a given path?
• Will a given input exercise some specified
path?
Computability
• Note that even though it is in general
undecidable, there is a large class of
programs for which these issues can
be decided
– a large testing tools industry has
emerged because of this
• When examining a tool, make sure
that the class of programs for which
it works is well understood
Reference book
A few words on
combinatorics
• Based on the Cartesian product
of sets, we can count the number
of possible inputs that a program
has, i.e. | I |
Example
• Assume a program has a single
input, Customer ID (CID)
• May be any value in the domain
{00000-99999}
• What is | I | ?
• | I | = 100000
Example
• Now assume we add a second
input, the Order ID (OID)
• This may be any value in the
domain {00000-99999} as well
• Now what is | I |?
• 100000*100000 = 10,000,000,000
Example
• Finally, add a credit card number
to the input
• This is a 12-digit number
• |I| has now reached 10^22
• If we can execute 1 million tests per second, it will take 10^16 seconds, or about 300 million years!!
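The arithmetic behind this slide, as a quick sketch:

cid = 10**5                      # Customer ID: 00000-99999
oid = 10**5                      # Order ID: 00000-99999
ccn = 10**12                     # 12-digit credit card number

total_inputs = cid * oid * ccn   # |I| = 10^22
seconds = total_inputs // 10**6  # at one million tests per second: 10^16 seconds
years = seconds / (60 * 60 * 24 * 365)
print(total_inputs, seconds, round(years / 1e6), "million years")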
Example
• Since we cannot know what
data may exercise a given
statement/path in general, we
may attempt to resort to
exhaustive testing
• This attempt is doomed to fail
due to the combinatorial
explosion
Functional and structural
approaches to testing
Engineering the testing
process
• Any engineered product (and
most other things) can be tested
in one of two ways
– Knowing the specified function
that a product has been designed
to perform, tests can be conducted
that demonstrate each function is
fully operational while at the same
time searching for errors in each
function
Engineering the testing
process
– Knowing the internal workings of a
product, tests can be conducted to
ensure that “all gears mesh”
(internal operations performed
according to specifications)
Structural testing
• Uses knowledge of the internal
workings
• Also known as Clear box/glass
box
• Code based
• Can be useful for finding
interesting inputs
• Misses an entire class of faults,
missing code
Behavioral
• Uses knowledge of the specific
function that is to be performed
• Based solely on the specification
without regard for the internals
• Also known as Black box
• More user oriented
• Misses an entire class of faults,
extra code (surprises) except by
accident
Passing criteria
• How do we know when
• 1. a single test has passed
• 2. when we are done testing
Passing criteria
• A single test passes when its
output is correct
– This requires a specific definition
of correct and ties into the
automated oracle problem
When are we done?
• Conway Criteria:
• No syntactic errors (it compiles)
• No compile errors or immediate
execution failures
• There exists some set of data for which the program gives the correct output
• A typical set of data produces
the correct output
When are we done?
• Difficult sets of data produce the
correct output.
• All possible data sets in the
problem specification produce
the correct output
• All possible data sets and likely erroneous inputs succeed
• All inputs produce the correct
output
Nature of software defects
• Logic errors and incorrect
assumptions are inversely
proportional to the probability
that a program path will be
executed
Nature of software defects
• We often believe that a logical
path is not likely to be executed
when, in fact, it may be executed
on a regular basis
• Typographical errors are random
More of a case for WHITE box testing……
Summary
• Zeroth Principle of Testing: The purpose of testing is to find bugs
• Corollary to the Zeroth Principle: "The program is wrong"
Summary
• First Principle of Testing: The results of a test must be known before the test is run
• Second Principle of Testing: Testing is difficult
Summary
• Exhaustive testing is doomed by
the combinatorial explosion
• Any other technique is
undecidable
• Third Principle of Testing: No
single technique will suffice for
any non-trivial testing effort
Homework 1b
• Discuss a software failure from
your experience or knowledge
and attempt to explain the role of
testing (or lack thereof) in that
failure
Another reference
• Testing techniques newsletter
• www.testworks.com/News/TTNOnline
More on coverage
Logical coverage
• What does statement coverage
imply ?
• Each statement must be
executed at least once
Example
1 IF ((X > 1) AND (Y == 1))
2   Z = Z / X
3 END
4 IF ((X == 2) OR (Z > 1))
5   Z = Z + 1
6 END
What is required for statement coverage?
Example
• To achieve statement coverage,
one can choose x = 2, y = 1
• Many possible errors are
missed:
• 1 IF ((X > 1) OR (Y ==1))
• 1 IF ((X >=1) AND (Y ==1))
• 1 IF ((X > 1) AND (Y >=1))
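A sketch of why this happens, with the program written in Python (illustrative function names): the single vector x = 2, y = 1 executes every statement, yet it cannot distinguish the correct program from the first mutated condition.

def original(x, y, z):
    if (x > 1) and (y == 1):   # statement 1
        z = z / x              # statement 2
    if (x == 2) or (z > 1):    # statement 4
        z = z + 1              # statement 5
    return z

def mutant(x, y, z):
    if (x > 1) or (y == 1):    # fault: AND mistyped as OR
        z = z / x
    if (x == 2) or (z > 1):
        z = z + 1
    return z

# x = 2, y = 1 gives 100% statement coverage of original(), but the fault survives:
assert original(2, 1, 4) == mutant(2, 1, 4)
print("statement coverage achieved, mutation not detected")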
Blindness
• Statement coverage is blind to
several classes of error when
choosing values to achieve
coverage
Assignment blindness
• Due to assignment of a particular
value to a variable, the error
does not propagate
Equality blindness
• Due to an Equality check of a
particular variable, the error
does not propagate
Self blindness
• Conditional itself covers up the
error
Assignment Blindness
example
1 IF (<conditional>)
2   X = 1
3 END
4 IF (X + Y > 1)   // Should have been (Y > 0)
5   ....
6 END
Assignment Blindness
example
• In this example, values are
chosen to make the conditional
true, the statement X=1 is
executed and the error in line 4
is not seen
Assignment Blindness
example
• If every path forces the same assignment, then the 'error' doesn't really matter (does it exist?)
  – For instance, if the conditional in statement 1 always evaluated to true
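A sketch of the same effect with concrete values (illustrative names): once X is assigned 1, the faulty predicate X + Y > 1 and the intended predicate Y > 0 agree, so the error cannot propagate to the output.

def faulty(cond, x, y):
    if cond:
        x = 1
    return "body executed" if x + y > 1 else "body skipped"   # should have been (y > 0)

def intended(cond, x, y):
    if cond:
        x = 1
    return "body executed" if y > 0 else "body skipped"

# With the conditional true, x is forced to 1 and x + y > 1 is equivalent to y > 0:
assert faulty(True, 7, 5) == intended(True, 7, 5)
assert faulty(True, 7, -3) == intended(True, 7, -3)
# Only an input that skips the assignment can expose the fault:
assert faulty(False, 5, -1) != intended(False, 5, -1)
print("assignment blindness demonstrated")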
Equality Blindness
Example
1 IF (A == 2)
2   .....
3 IF (B > 1)   // Should have been (A + B > 3)
Equality Blindness
Example
• In this example, the value of 2 is
chosen for A to force execution
of the body
• The error in statement 3 is
missed
Self Blindness
1 IF (X < 1)   // Should have been (2X < 1)
• In this example, the value of 0 is
chosen for X
Observation
• Statement coverage is the
weakest form of coverage
• Many classes of errors can
escape this testing
• Typical projects have less than
80% statement coverage!!
Branch coverage
• Each branch of a decision must
be executed at least once
• This is stronger than statement coverage, but it is still weak
• Note: A decision is a logical
combination of one or more
conditions
Branch coverage example
1 IF ((X > 1) AND (Y == 1))
2   Z = Z / X
3 END
4 IF ((X == 2) OR (Z > 1))
5   Z = Z + 1
6 END
Branch coverage example
• To achieve branch coverage, one
option is to choose
– x = 2, y = 1
– x = 0, y = 2
Branch coverage example
• Many possible errors are still
missed:
• 1 IF ((X > 1) OR (Y == 1))
• 1 IF ((X >=1) AND (Y ==1))
• 1 IF ((X > 1) AND (Y <=1))
Blindness in branch
coverage
• Branch coverage is blind to
several classes of error when
choosing values to achieve
coverage
– Compound decisions (more than
one condition) are weakly tested
– Boundaries of conditions (within a
decision) are not explicitly tested
Condition coverage
• Each condition within a decision
must assume all possible values
• This is 'potentially' stronger than
branch coverage, but not always
• It may, in fact, be weaker
Example of condition
coverage
1 IF ((X > 1) AND (Y == 1))
2   Z = Z / X
3 END
4 IF ((X == 2) OR (Z > 1))
5   Z = Z + 1
6 END
Example of condition
coverage
• To achieve condition coverage
for statement 1, one must
choose values such that:
• X > 1, ~(Y == 1)
• ~(X > 1), Y == 1
Example of condition
coverage
• Both of these vectors miss the
execution of statement 2
• A better choice may be:
• ~(X > 1), ~(Y == 1)
• X > 1, Y == 1
Multi-condition coverage
• Every combination of possible
condition values within a
decision must be chosen
• This is strictly stronger than
branch or condition coverage
– still has weaknesses!
Multi-condition coverage
example
1 IF ((X > 1) AND (Y == 1))
2   Z = Z / X
3 END
4 IF ((X == 2) OR (Z > 1))
5   Z = Z + 1
6 END
Multi-condition coverage
example
• To achieve multi-condition coverage
for statement 1, one must choose
values such that:
• 1. X > 1, Y == 1
• 2. X > 1, ~(Y == 1)
• 3. ~(X > 1), Y == 1
• 4. ~(X > 1), ~(Y == 1)
Multi-condition coverage
example
• To achieve multi-condition coverage for statement 4 (the second decision), one must choose values such that:
• 5. X == 2, Z > 1
• 6. X == 2, ~(Z > 1)
• 7. ~(X == 2), Z > 1
• 8. ~(X == 2), ~(Z > 1)
Coverage
X   Y   Z   Covers
1   0   1   4, 8
2   1   4   1, 5
1   1   3   3, 7
2   0   1   2, 6
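A sketch that replays these four vectors through the example program and reports which of the eight condition combinations each one covers (the helper function is illustrative):

def covers(x, y, z):
    """Return which of the eight condition combinations a vector exercises."""
    first = {(True, True): 1, (True, False): 2,
             (False, True): 3, (False, False): 4}[(x > 1, y == 1)]
    if (x > 1) and (y == 1):
        z = z / x                      # statement 2 runs before the second decision
    second = {(True, True): 5, (True, False): 6,
              (False, True): 7, (False, False): 8}[(x == 2, z > 1)]
    return first, second

for vector in [(1, 0, 1), (2, 1, 4), (1, 1, 3), (2, 0, 1)]:
    print(vector, "covers", covers(*vector))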
Multi-condition coverage
example
• There are 9 paths through this
code and this test set only
covers 4 of those
Multi-condition coverage
example
• Consider this error
• Line 2 is incorrectly written as Z = Z - X
• Only the second vector will execute this line, but for the values chosen, Z - X = Z / X
Multi-condition coverage
example
• Also consider that no vector will
execute both line 2 and line 5
• In a more complex example, it is
easy to imagine variables being
set in line 2 that are then
incorrectly used in line 5
Multi-condition coverage
example
• For instance:
• 2. Z = Z/X; A = 0
• ....
• 5. Z = Z + 1; B = 1/A;
Path coverage
• Every possible path must be
chosen
• Finding vectors to accomplish
this is, in general, undecidable,
and usually difficult
Path coverage
• This still misses boundary
conditions
• If statement 1 were mistyped as X >= 1, it is conceivable that the chosen test set does not contain a vector with X = 1, missing this error
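A small sketch of this boundary miss (illustrative names): typical vectors such as x = 2 and x = 0 agree on both versions of the predicate, and only the boundary value x = 1 separates them.

def correct(x):
    return x > 1

def mistyped(x):
    return x >= 1   # boundary fault

assert correct(2) == mistyped(2)   # both True
assert correct(0) == mistyped(0)   # both False
assert correct(1) != mistyped(1)   # only x = 1 exposes the difference
print("boundary value x = 1 is needed to catch the fault")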
Backup
Homework
• Suppose we have a routine r1 that
sends its output to routine r2, as in a
unix pipe. Discuss the implications
of R1 != D2, where R1 is the range of
routine r1 and D2 is the domain of
routine r2
• Suppose a program accepts as input a 16-bit positive integer, i, and a single 8-bit character, c. The program has several faults. The first fault is