
Software
Testing
Rob Oshana
Southern Methodist
University
Why do we Test ?
• Assess Reliability
• Detect Code Faults
Industry facts
 30-40% of errors detected after deployment
are run-time errors
[U.C. Berkeley, IBM’s TJ Watson
Lab]
 The amount of software in a typical device doubles
every 18 months
[Reme Bourguignon, VP of Philips
Holland]
 Defect densities are stable over the last 20 years :
0.5 - 2.0 sw failures / 1000 lines
[Cigital
Corporation]
 Software testing accounts for 50% of pre-release
costs,
and 70% of post-release costs
[Cigital
Corporation]
Critical SW Applications
Critical software applications which have failed:
 Mariner 1: missing '-' in Fortran code (NASA, 1962). Rocket bound for Venus destroyed.
 Therac 25: data conversion error (Atomic Energy of Canada Ltd, 1985-87). Radiation therapy machine for cancer.
 Long Distance Service: a single line of bad code (AT&T, 1990). Service outages up to nine hours long.
 Patriot Missiles: endurance errors in tracking system (U.S. military, 1991). 28 US soldiers killed in barracks.
 Tax Calculation Program: incorrect results (Intuit, 1995). SW vendor paid tax penalties for users.
Good and successful
testing
• What is a good test case?
• A good test case has a high
probability of finding an as-yet
undiscovered error
• What is a successful test case?
• A successful test is one that uncovers
an as-yet undiscovered error
Who tests the software
better ?
developer
Understands the system
but, will test “gently”
and, is driven by
“delivery”
independent tester
Must learn about the system,
but, will attempt to break it
and, is driven by quality
Testability – can you develop a
program for testability?
• Operability - “The better it works, the more efficiently it
can be tested”
• Observability - the results are easy to see, distinct output
is generated for each input, incorrect output is easily
identified
• Controllability - processing can be controlled, tests can
be automated & reproduced
• Decomposability - software modules can be tested
independently
• Simplicity - no complex architecture and logic
• Stability - few changes are requested during testing
• Understandability - program is easy to understand
Did You Know...
• Testing/Debugging can worsen
reliability?
• We often chase the wrong bugs?
• Testing cannot show the absence of
faults, only the existence?
• The cost to develop software is
directly proportional to the cost of
testing?
– Y2K testing cost $600 billion
Did you also know...
• The most commonly applied
software testing techniques (black
box and white box) were developed
back in the 1960’s
• Most Oracles are human (error
prone)!!
• 70% of safety critical code can be
exceptions
– this is the last code written!
Testing Problems
• Time
• Faults hide from tests
• Test Management costs
• Training Personnel
• What techniques to use
• Books and education
“Errors are more common,
more pervasive, and more
troublesome in software than
with other technologies”
David Parnas
What is testing?
• How does testing software compare
with testing students?
What is testing?
• "Software testing is the process of comparing the invisible to the ambiguous, so as to avoid the unthinkable." James Bach, Borland Corp.
What is testing?
• "Software testing is the process of predicting the behavior of a product and comparing that prediction to the actual results." R. Vanderwall
Purpose of testing
• Build confidence in the product
• Judge the quality of the product
• Find bugs
Finding bugs can be difficult
A path through the mine field (use case)
[Diagram: a use case traces one path through a mine field of hidden faults]
Why is testing important?
• Therac25: Cost 6 lives
• Ariane 5 Rocket: Cost $500M
• Denver Airport: Cost $360M
• Mars missions, orbital explorer & polar lander: Cost $300M
Why is testing so hard?
Reasons for customer
reported bugs
• User executed untested code
• Order in which statements were executed
in actual use different from that during
testing
• User applied a combination of untested
input values
• User’s operating environment was never
tested
Interfaces to your software
• Human interfaces
• Software interfaces (APIs)
• File system interfaces
• Communication interfaces
– Physical devices (device drivers)
– Controllers
Selecting test scenarios
• Execution path criteria (control)
– Statement coverage
– Branching coverage
• Data flow
– Initialize each data structure
– Use each data structure
• Operational profile
• Statistical sampling….
What is a bug?
• Error: mistake made in translation or
interpretation ( many taxonomies exist to
describe errors)
• Fault: manifestation of the error in
implementation (very nebulous)
• Failure: observable deviation in behavior
of the system
Example
• Requirement: “print the speed,
defined as distance divided by time”
• Code: s = d/t; print s
Example
• Error: I forgot to account for t=0
• Fault: omission of code to catch t=0
• Failure: exception is thrown
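A minimal C sketch of this example, with hypothetical function names; the faulty version shows the fault (the missing t=0 check), and the guarded version shows one way the omitted check could be added.

#include <stdio.h>

/* Faulty version: the error (forgetting about t = 0) becomes a fault
   (missing guard), and a failure (trap or garbage output) when t is 0. */
double speed_faulty(double d, double t) {
    return d / t;              /* fault: no check for t == 0 */
}

/* One possible repaired version. */
int speed_checked(double d, double t, double *s) {
    if (t == 0.0)
        return -1;             /* report the error instead of failing */
    *s = d / t;
    return 0;
}

int main(void) {
    double s;
    if (speed_checked(100.0, 0.0, &s) != 0)
        printf("cannot compute speed: time is zero\n");
    return 0;
}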
Severity taxonomy
• Mild - trivial
• Annoying - minor
• Serious - major
• Catastrophic - critical
• Infectious - run for the hills
What is your taxonomy?
IEEE 1044-1993
Life cycle
Errors can be introduced at each of these stages
The testing and repair process can be just as error prone as the development process (more so??)
[Diagram: errors enter at Requirements, Design, Code, and Testing, and again during the repair steps of Classify, Isolate, and Resolve]
OK, so let's just design our systems with "testability" in mind…
Testability
• How easily a computer program can
be tested (Bach)
• We can relate this to “design for
testability” techniques applied in
hardware systems
JTAG
A standard integrated circuit boundary scan architecture
[Diagram: core IC logic surrounded by boundary scan cells on the I/O pads, connected into a boundary scan path; a test access port controller drives the path via Test Data In (TDI), Test Mode Select (TMS), Test Clock (TCK), and Test Data Out (TDO)]
Operability
• “The better it works, the more
efficiently it can be tested”
– System has few bugs (bugs add
analysis and reporting overhead)
– No bugs block execution of tests
– Product evolves in functional stages
(simultaneous development and testing)
Observability
• “What you see is what you get”
– Distinct output is generated for each input
– System states and variables are visible and
queriable during execution
– Past system states are ….. (transaction logs)
– All factors affecting output are visible
Observability
– Incorrect output is easily identified
– Internal errors are automatically
detected through self-testing
mechanisms
– Internal errors are automatically
reported
– Source code is accessible
Visibility Spectrum
[Diagram: visibility spectrum ranging from end customer visibility, through factory visibility and GPP visibility, to DSP visibility]
Controllability
• “The better we can control the
software, the more the testing can be
automated and optimized”
– All possible outputs can be generated
through some combination of input
– All code is executable through some
combination of input
Controllability
– SW and HW states and variables can be
controlled directly by the test engineer
– Input and output formats are consistent
and structured
Decomposability
• “By controlling the scope of testing,
we can more quickly isolate
problems and perform smarter
testing”
– The software system is built from
independent modules
– Software modules can be tested
independently
Simplicity
• “The less there is to test, the more
quickly we can test it”
– Functional simplicity (feature set is
minimum necessary to meet
requirements)
– Structural simplicity (architecture is
modularized)
– Code simplicity (coding standards)
Stability
• “The fewer the changes, the fewer
the disruptions to testing”
– Changes to the software are infrequent,
controlled, and do not invalidate
existing tests
– Software recovers well from failures
Understandability
• “The more information we have, the
smarter we will test”
– Design is well understood
– Dependencies between external, internal, and
shared components are well understood
– Technical documentation is accessible, well
organized, specific and detailed, and accurate
“Bugs lurk in corners and
congregate at boundaries”
Boris Beizer
Types of errors
• What is a Testing error?
– Claiming behavior is erroneous when it
is in fact correct
– ‘fixing’ this type of error actually breaks
the product
Errors in classification
• What is a Classification error ?
– Classifying the error into the wrong
category
• Why is this bad ?
– This puts you on the wrong path for a
solution
Example Bug Report
• “Screen locks up for 10 seconds after
‘submit’ button is pressed”
• Classification 1: Usability Error
• Solution may be to catch user events and
present an hour-glass icon
• Classification 2: Performance error
• Solution may be a modification to a sort algorithm (or vice versa)
Isolation error
• Incorrectly isolating the erroneous
modules
• Example: consider a client server
architecture. An improperly formed client
request results in an improperly formed
server response
• The isolation determined (incorrectly) that
the server was at fault and was changed
• Resulted in regression failure for other
clients
Resolve errors
• Modifications to remediate the failure
are themselves erroneous
• Example: Fixing one fault may
introduce another
What is the ideal test case?
• Run one test whose output is "Modify line
n of module i."
• Run one test whose output is "Input
Vector v produces the wrong output"
• Run one test whose output is "The
program has a bug" (Useless, we know
this)
More realistic test case
• One input vector and expected output
vector
– A collection of these makes up a Test Suite
• Typical (naïve) Test Case
– Type or select a few inputs and observe output
– Inputs not selected systematically
– Outputs not predicted in advance
Test case definition
• A test case consists of:
– an input vector
– a set of environmental conditions
– an expected output.
• A test suite is a set of test cases chosen
to meet some criteria (e.g. Regression)
• A test set is any set of test cases
Testing Software Intensive
Systems
V&V
• Verification
– are we building the product
right?
• Validation
– are we building the right
product?
– is the customer satisfied?
• How do we do it?
• Inspect and Test
What do we inspect and
test?
• All work products!
• Scenarios
• Requirements
• Designs
• Code
• Documentation
Defect Testing
A Testing Test
• Problem
– A program reads three integer values
from the keyboard separated by spaces.
The three values are interpreted as
representing the lengths of the sides of
a triangle. The program prints a
message that states whether the
triangle is scalene, isosceles or
equilateral.
• Write a set of test cases to
adequately test this program
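For illustration only (the exercise itself is left to the reader), a minimal C sketch of the kind of checks such a program might make, with a few of the classic test cases as comments; the function name and output categories are assumptions, not part of the original problem statement.

#include <stdio.h>

/* Hypothetical classifier for the triangle exercise. */
const char *classify(int a, int b, int c) {
    if (a <= 0 || b <= 0 || c <= 0)             return "not a triangle";
    if (a + b <= c || a + c <= b || b + c <= a) return "not a triangle";
    if (a == b && b == c)                       return "equilateral";
    if (a == b || b == c || a == c)             return "isosceles";
    return "scalene";
}

int main(void) {
    /* A few of the cases a thorough suite would include: */
    printf("%s\n", classify(3, 3, 3));   /* equilateral                  */
    printf("%s\n", classify(3, 3, 4));   /* isosceles (try all 3 orders) */
    printf("%s\n", classify(3, 4, 5));   /* scalene                      */
    printf("%s\n", classify(1, 2, 3));   /* degenerate: a + b == c       */
    printf("%s\n", classify(0, 4, 5));   /* zero side                    */
    printf("%s\n", classify(-3, 4, 5));  /* negative side                */
    return 0;
}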
Static and Dynamic V&V
[Diagram: static verification applies to the requirements specification, high-level design, detailed design, and program; dynamic verification (testing) applies to the prototype and the program]
Techniques
• Static Techniques
– Inspection
– Analysis
– Formal verification
• Dynamic Techniques
– Testing
SE-CMM
PA 07: Verify & Validate
System
• Verification: perform comprehensive
evaluations to ensure that all work
products meet requirements
– Address all work products: from user needs and expectations through production and maintenance
• Validation - meeting customer needs
- continues throughout product
lifecycle
V&V Base Practices
• Establish plans for V&V
– objectives, resources, facilities, special
equipment
– come up with master test plan
• Define the work products to be tested
(requirements, design, code) and the
methods (reviews, inspections, tests)
that will be used to verify
• Define verification methods
– test case input, expected results, criteria
– connect requirements to tests
V&V Base Practices...
• Define how to validate the system
– includes customer as user/operator
– test conditions
– test environment
– simulation conditions
• Perform V&V and capture results
– inspection results; test results;
exception reports
• Assess success
– compare results against expected
results
– success or failure?
Testing is...
• The process of executing a program
with the intent of finding defects
• This definition implies that testing is
a destructive process - often going
against the grain of what software
developers do, i.e.. construct and
build software
• A successful test run is NOT one in
which no errors are found
Test Cases
• A successful test case finds an error
• An unsuccessful test case is one that
causes the program to produce the
correct result
• Analogy: feeling ill, going to the
doctor, paying $300 for a lab test
only to be told that you’re OK!
Testing demonstrates the
presence not the absence of
faults
Iterative Testing Process
[Diagram: unit testing → module testing → sub-system testing → system testing → acceptance testing; unit and module testing are component testing, sub-system and system testing are integration testing, and acceptance testing is user testing]
It is impossible to
completely test a program
Testing and Time
• Exhaustive testing is impossible for
any program with low to moderate
complexity
• Testing must focus on a subset of
possible test cases
• Test cases should be systematically
derived, not random
Testing Strategies
• Top Down testing
– use with top-down programming; stubs
required; difficult to generate output
• Bottom Up testing
– requires driver programs; often combined
with top-down testing
• Stress testing
– test system overload; often want system to
fail-soft rather than shut down
– often finds unexpected combinations of
events
Test-Support Tools
• Scaffolding
– code created to help test the software
• Stubs
– a dummied-up low-level routine so it
can be called by a higher level routine
Stubs Can Vary in
Complexity
• Return, no action taken
• Test the data fed to it
• Print/echo input data
• Get return values from interactive input
• Return standard answer
• Burn up clock cycles
• Function as slow, fat version of ultimate routine
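A minimal sketch of a stub, assuming a higher-level routine that needs a sensor-reading function not yet written; the names are hypothetical, and the stub combines two of the behaviors listed above (echo the input, return a standard answer).

#include <stdio.h>

/* Stub standing in for the real read_sensor() so the caller can be
   tested before the low-level routine exists. */
int read_sensor(int channel) {
    printf("stub: read_sensor(channel=%d)\n", channel);  /* echo the input   */
    return 42;                                            /* standard answer  */
}

/* Higher-level routine under test. */
int average_two_sensors(void) {
    return (read_sensor(0) + read_sensor(1)) / 2;
}

int main(void) {
    printf("average = %d\n", average_two_sensors());
    return 0;
}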
Driver Programs
• Fake (testing) routine that calls other
real routines
• Drivers can:
– call with fixed set of inputs
– prompt for input and use it
– take arguments from command line
– read arguments from file
• main() can be a driver - then
“remove” it with preprocessor
statements. Code is unaffected
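A minimal sketch of the main()-as-driver idea, assuming a unit named add() is the routine under test; the UNIT_TEST macro name is hypothetical. The driver is compiled out of the production build with the preprocessor, leaving the tested code unaffected.

#include <stdio.h>

/* Real routine under test. */
int add(int a, int b) {
    return a + b;
}

#ifdef UNIT_TEST
/* Driver: calls the real routine with a fixed set of inputs.
   Build with -DUNIT_TEST to get this main(); leave the flag off in the
   production build and the driver disappears. */
int main(void) {
    printf("add(2,3)  = %d (expect 5)\n", add(2, 3));
    printf("add(-1,1) = %d (expect 0)\n", add(-1, 1));
    return 0;
}
#endif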
System Tests Should be Incremental
[Diagram: modules A and B are exercised by tests 1 and 2; adding module C adds test 3; adding module D adds test 4]
Not Big-Bang
Approaches to Testing
• White Box testing
– based on the implementation - the
structure of the code; also called
structural testing
• Black Box testing
– based on a view of the program as a
function of Input and Output; also called
functional testing
• Interface Testing
– derived from program specification and
knowledge of interfaces
White Box (Structural)
Testing
• Testing based on the structure of the
code
…
if x
    then j = 2
    else k = 5
…
Start with actual program code.
[Diagram: test data drives the tests run against this code, producing test output]
White Box Technique:
Basis Path Testing
• Objective: test every independent execution path through the program
• If every independent path has been
executed then every statement will be
executed
• All conditional statements are tested
for both true and false conditions
• The starting point for path testing is the
flow graph
Flow Graphs
[Diagrams: flow graph notations for if-then-else, loop-while, and case-of constructs]
How many paths thru this program?
1) j = 2;
2) k = 5;
3) read (a);
4) if a=2
5) then j = a
6) else j = a*k;
7) a = a + 1;
8) j = j + 1;
9) print (j);
[Flow graph nodes: 1,2,3 → 4 → 5 or 6 → 7,8,9]
How Many Independent
Paths?
• An independent path introduces at
least one new statement or condition to
the collection of already existing
independent paths
• Cyclomatic Complexity (McCabe)
• For programs without GOTOs,
Cyclomatic Complexity
= Number of decision nodes + 1
also called predicate nodes
The Number of Paths
• Cyclomatic Complexity gives an
upper bound on the number of tests
that must be executed in order to
cover all statements
• To test each path requires
– test data to trigger the path
– expected results to compare against
Test 1
input: 2, expected output: 3 (path 1,2,3 → 4 → 5 → 7,8,9)
Test 2
input: 10, expected output: 51 (path 1,2,3 → 4 → 6 → 7,8,9)
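The slide's nine-statement program, transcribed into C as a hedged sketch so the two basis-path tests above can actually be run; read(a) is replaced here by a function parameter.

#include <stdio.h>

/* Statements 1-9 from the flow-graph example. */
int path_example(int a) {            /* 3) read (a)                    */
    int j = 2;                       /* 1)                             */
    int k = 5;                       /* 2)                             */
    if (a == 2)                      /* 4)                             */
        j = a;                       /* 5)                             */
    else
        j = a * k;                   /* 6)                             */
    a = a + 1;                       /* 7) kept to mirror the slide    */
    j = j + 1;                       /* 8)                             */
    return j;                        /* 9) print (j)                   */
}

int main(void) {
    printf("input 2  -> %d (expect 3)\n",  path_example(2));
    printf("input 10 -> %d (expect 51)\n", path_example(10));
    return 0;
}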
What Does Statement Coverage Tell You?
• All statements have been executed at least once
• So?
Coverage testing may lead to the false illusion that the software has been comprehensively tested
The Downside of Statement
Coverage
• Path testing results in the execution
of every statement
• BUT, not all possible combinations of
paths thru the program
• There are an infinite number of
possible path combinations in
programs with loops
The Downside of Statement
Coverage
• The number of paths is usually
proportional to program size making
it useful only at the unit test level
Black Box Testing
Forget the code details!
[Diagram: treat the program as a black box, with inputs going in and outputs coming out]
Black Box Testing
• Aim is to test all functional
requirements
• Complementary, not a replacement
for White Box Testing
• Black box testing typically occurs
later in the test cycle than white box
testing
Defect Testing Strategy
[Diagram: input test data is fed to the system; outputs that indicate defects are used to locate the inputs causing erroneous output]
Black Box Techniques
• Equivalence partitioning
• Boundary value testing
Equivalence Partitioning
• Data falls into categories
• Positive and negative numbers
• Strings with & without blanks
• Programs often behave in a comparable way for all values in a category -- also called an equivalence class
[Diagram: valid and invalid input partitions feeding the system]
Choose test cases from partitions
[Diagram: test cases are drawn from each valid and invalid input partition]
Specification determines Equivalence Classes
• Program accepts 4 to 8 inputs
• Each is 5 digits, greater than 10000
Count partitions: less than 4, 4 thru 8, more than 8
Value partitions: less than 10000, 10000 thru 99999, more than 99999
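A minimal sketch, assuming a validator for the specification above (4 to 8 inputs, each in 10000..99999); one test value is drawn from each equivalence class. The function and variable names are illustrative.

#include <stdio.h>

/* Hypothetical validator: 4 to 8 inputs, each in the range 10000..99999. */
int inputs_valid(const int *v, int count) {
    if (count < 4 || count > 8) return 0;            /* count partitions */
    for (int i = 0; i < count; i++)
        if (v[i] < 10000 || v[i] > 99999) return 0;  /* value partitions */
    return 1;
}

int main(void) {
    int ok[]        = {12345, 20000, 54321, 99999};   /* 4 thru 8, valid values  */
    int too_few[]   = {12345, 20000, 54321};          /* fewer than 4 inputs     */
    int too_small[] = {12345, 9999, 54321, 20000};    /* a value below 10000     */
    int too_big[]   = {12345, 100000, 54321, 20000};  /* a value above 99999     */
    /* a fifth case with 9 values would cover the "more than 8" class */

    printf("%d %d %d %d\n",
           inputs_valid(ok, 4), inputs_valid(too_few, 3),
           inputs_valid(too_small, 4), inputs_valid(too_big, 4));
    return 0;
}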
Boundary Value Analysis
• Complements equivalence
partitioning
• Select test cases at the boundaries
of a class
• Range Boundary a..b
– test just below a and just above b
• Input specifies 4 values
– test 3 and 5
• Output that is limited should be
tested above and below limits
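Continuing the same assumed specification, a short sketch of the boundary values such a suite would probe, just inside and just outside each partition edge.

/* Boundary values for the assumed spec "4 to 8 inputs, each 10000..99999" */
int count_boundaries[] = { 3, 4, 8, 9 };                  /* just outside/inside the count range */
int value_boundaries[] = { 9999, 10000, 99999, 100000 };  /* just outside/inside the value range */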
Other Testing Strategies
• Array testing
• Data flow testing
• GUI testing
• Real-time testing
• Documentation testing
Arrays
• Test software with arrays of one
value
• Use different arrays of different sizes
in different tests
• Derive tests so that the first, last and
middle elements are tested
Data Flow testing
• Based on the idea that data usage is
at least as error-prone as control
flow
• Boris Beizer claims that at least half
of all modern programs consist of
data declarations and initializations
Data Can Exist in One of
Three States
• Defined
– initialized, not used
a=2
• Used
x = a * b + c;
z = sin(a)
• Killed
free (a)
– end of for loop or block where it was defined
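A minimal C sketch of the three states on a heap variable; the free() call is where the "killed" state begins, and any later use would be one of the abnormal patterns discussed next.

#include <stdlib.h>
#include <stdio.h>

int main(void) {
    int *a = malloc(sizeof *a);    /* storage allocated                        */
    if (a == NULL) return 1;
    *a = 2;                        /* defined: initialized, not yet used       */
    printf("%d\n", *a * 3 + 1);    /* used                                     */
    free(a);                       /* killed: any further use of *a is a bug   */
    return 0;
}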
Entering & Exiting
• Terms describing context of a routine
before doing something to a variable
• Entered
– control flow enters the routine before the variable is acted upon
• Exited
– control flow leaves routine immediately
after variable is acted upon
Data Usage Patterns
• Normal
– define variable; use one or more times;
perhaps killed
• Abnormal Patterns
– Defined-Defined
– Defined-Exited
• if local, why?
– Defined-Killed
• wasteful if not strange
More Abnormal Patterns
• Entered-Killed
• Entered-Used
– should be defined before use
• Killed-Killed
– double kills are fatal for pointers
• Killed-Used
– what are you really using?
• Used-Defined
– what’s its value?
Define-Use Testing

if (condition-1)
    x = a;
else
    x = b;

if (condition-2)
    y = x + 1;
else
    y = x - 1;

Path Testing
Test 1: condition-1 TRUE, condition-2 TRUE
Test 2: condition-1 FALSE, condition-2 FALSE
These WILL EXERCISE EVERY LINE OF CODE … BUT will NOT test the DEF-USE combinations:
x=a / y = x-1 and x=b / y = x+1
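A runnable sketch of the fragment above, with the two conditions passed in as flags so each combination can be driven directly; the path tests cover every line, but the remaining def-use pairs need the extra combinations listed in main(). Names are illustrative.

#include <stdio.h>

int def_use(int cond1, int cond2, int a, int b) {
    int x, y;
    if (cond1) x = a; else x = b;
    if (cond2) y = x + 1; else y = x - 1;
    return y;
}

int main(void) {
    /* Path tests from the slide: every line executed...          */
    printf("%d\n", def_use(1, 1, 10, 20));  /* x=a, y=x+1          */
    printf("%d\n", def_use(0, 0, 10, 20));  /* x=b, y=x-1          */
    /* ...but two def-use pairs are still untested:                */
    printf("%d\n", def_use(1, 0, 10, 20));  /* x=a, y=x-1          */
    printf("%d\n", def_use(0, 1, 10, 20));  /* x=b, y=x+1          */
    return 0;
}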
GUIs
• Are complex to test because of their
event driven character
• Windows
– move, resized and scrolled
– regenerate when overwritten and then
recalled
– menu bars change when window is
active
– multiple window functionality available?
GUI.. Menus
• Menu bars in context?
• Submenus - listed and working?
• Are names self explanatory?
• Is help context sensitive?
• Cursor changes with operations?
Testing Documentation
• Great software with lousy
documentation can kill a product
• Documentation testing should be
part of every test plan
• Two phases
– review for clarity
– review for correctness
Documentation Issues
• Focus on functional usage?
• Are descriptions of interaction
sequences accurate
• Examples should be used
• Is it easy to find how to do
something
• Is there trouble shooting section
• Easy to look up error codes
• TOC and index
Real-Time Testing
• Needs white box and black box PLUS
– consideration of states, events,
interrupts and processes
• Events will often have different
effects depending on state
• Looking at event sequences can
uncover problems
Real-Time Issues
• Task Testing
– test each task independently
• Event testing
– test each separately; then in context of
state diagrams;
– scenario sequences and random
sequences
• Intertask testing
– Ada rendezvous
– message queuing; buffer overflow
Other Testing Terms
• Statistical testing
– running the program against expected
usage scenarios
• Regression testing
– retesting the program after modification
• Defect testing
– trying to find defects (aka bugs)
• Debugging
– the process of discovering and removing defects
Summary
V&V
• Verification
– Are we building the system right?
• Validation
– Are we building the right system?
• Testing is part of V&V
• V&V is more than testing...
• V&V is plans, testing, reviews,
methods, standards, and
measurement
Testing Principles
• A necessary part of a test case is a definition of the expected output or result
– the eye often sees what it wants to see
• Programmers should avoid testing their
own code
• Organizations should not test their own
programs
• Thoroughly inspect the results of each
test
Testing Principles
• Test invalid as well as valid
conditions
• The probability of more errors in a section of code is proportional to the number of errors already found there
Testing Principles
• Tests should be traceable to customer
requirements
• Tests should be planned before testing
begins
• The Pareto principle applies - 80% of all errors are in 20% of the code
• Begin small, scale up
• Exhaustive testing is not possible
• The best testing is done by a 3rd party
Guidelines
• Testing capabilities is more
important than testing components
– users have a job to do; tests should
focus on things that interfere with
getting the job done, not minor
irritations
• Testing old capabilities is more
important then testing new features
• Testing typical situations is more
important than testing boundary
conditions
System Testing
Ian Sommerville
System Testing
• Testing the system as a whole to
validate that it meets its specification
and the objectives of its users
Development testing
• Hardware and software components
should be tested;
– as they are developed
– as sub-systems are created.
• These testing activities include:
– Unit testing.
– Module testing
– Sub-system testing
Development testing
• These tests do not cover:
– Interactions between components or
sub-systems where the interaction
causes the system to behave in an
unexpected way
– The emergent properties of the system
System testing
• Testing the system as a whole
instead of individual system
components
• Integration testing
– As the system is integrated, it is tested
by the system developer for
specification compliance
• Stress testing
– The behavior of the system is tested
under conditions of load
System testing
• Acceptance testing
– The system is tested by the customer to
check if it conforms to the terms of the
development contract
• System testing reveals errors which
were undiscovered during testing at
the component level
System Test Flow
[Diagram (V-model): requirements specification → system specification → system design → detailed design → unit code and test; each specification/design stage produces the plan for its matching test: acceptance test plan → acceptance test, system integration test plan → system integration test, sub-system integration test plan → sub-system integration test; acceptance testing leads to service]
Integration testing
• Concerned with testing the system
as it is integrated from its
components
• Integration testing is normally the
most expensive activity in the
systems integration process
Integration testing
• Should focus on
– Interface testing where the interactions
between sub-systems and components
are tested
– Property testing where system
properties such as reliability,
performance and usability are tested
Integration Test Planning
• Integration testing is complex and
time-consuming and planning of the
process is essential
• The larger the system, the earlier this
planning must start and the more
extensive it must be
• Integration test planning may be the
responsibility of a separate IV&V
(verification and validation) team
– or a group which is separate from the
development team
Test planning activities
• Identify possible system tests using
the requirements document
• Prepare test cases and test
scenarios to run these system tests
• Plan the development, if required, of
tools such as simulators to support
system testing
• Prepare, if necessary, operational
profiles for the system
• Schedule the testing activities and
estimate testing costs
Interface Testing
• Within a system there may be literally
hundreds of different interfaces of
different types. Testing these is a
major problem.
• Interface tests should not be
concerned with the internal operation
of the sub-system although they can
highlight problems which were not
discovered when the sub-system
was tested as an independent entity.
Two levels of interface
testing
• Interface testing during development
when the developers test what they
understand to be the sub-system
interface
• Interface testing during integration
where the interface, as understood
by the users of the subsystem, is
tested.
Two levels of interface
testing
• What developers understand as the
system interface and what users
understand by this are not always
the same thing.
Interface Testing
[Diagram: test cases applied to the interfaces between sub-systems A, B, and C]
Interface Problems
• Interface problems often arise
because of poor communications
within the development team or
because of poor change
management procedures
• Typically, an interface definition is agreed but, for good reasons, this has to be changed during development
Interface Problems
• To allow other parts of the system to
cope with this change, they must be
informed of it
• It is very common for changes to be
made and for potential users of the
interface to be unaware of these
changes
– problems arise which emerge during
interface testing
What is an interface?
• An agreed mechanism for
communication between different
parts of the system
• System interface classes
– Hardware interfaces
• Involving communicating hardware units
– Hardware/software interfaces
• Involving the interaction between hardware
and software
What is an interface?
– Software interfaces
• Involving communicating software
components or sub-systems
– Human/computer interfaces
• Involving the interaction of people and the
system
– Human interfaces
• Involving the interactions between people in
the process
Hardware interfaces
• Physical-level interfaces
– Concerned with the physical connection
of different parts of the system e.g.
plug/socket compatibility, physical
space utilization, wiring correctness,
etc.
• Electrical-level interfaces
– Concerned with the electrical/electronic
compatibility of hardware units i.e. can
a signal produced by one unit be
processed by another unit
Hardware interfaces
• Protocol-level interfaces
– Concerned with the format of the
signals communicated between
hardware units
Software interfaces
• Parameter interfaces
– Software units communicate by setting
pre-defined parameters
• Shared memory interfaces
– Software units communicate through a
shared area of memory
– Software/hardware interfaces are
usually of this type
Software interfaces
• Procedural interfaces
– Software units communicate by calling
pre-defined procedures
• Message passing interfaces
– Software units communicate by passing
messages to each other
Parameter Interfaces
[Diagram: subsystem 1 passes a parameter list to subsystem 2]
Shared Memory Interfaces
[Diagram: sub-systems SS1, SS2, and SS3 communicate through a shared memory area]
Procedural Interfaces
[Diagram: subsystem 1 calls the defined procedures (API) of subsystem 2]
Message Passing Interfaces
[Diagram: subsystem 1 and subsystem 2 exchange messages]
Interface errors
• Interface misuse
– A calling component calls another
component and makes an error in its
use of its interface e.g. parameters in
the wrong order
• Interface misunderstanding
– A calling component embeds
assumptions about the behavior of the
called component which are incorrect
Interface errors
• Timing errors
– The calling and called component operate at different speeds and out-of-date information is accessed
Stress testing
• Exercises the system beyond its
maximum design load
– The argument for stress testing is that
system failures are most likely to show
themselves at the extremes of the
system’s behavior
• Tests failure behavior
– When a system is overloaded, it should
degrade gracefully rather than fail
catastrophically
Stress testing
• Particularly relevant to distributed
systems
– As the load on the system increases, so
too does the network traffic. At some
stage, the network is likely to become
swamped and no useful work can be
done
Acceptance testing
• The process of demonstrating to the
customer that the system is
acceptable
• Based on real data drawn from
customer sources. The system must
process this data as required by the
customer if it is to be acceptable
Acceptance testing
• Generally carried out by customer
and system developer together
• May be carried out before or after a
system has been installed
Performance testing
• Concerned with checking that the
system meets its performance
requirements
• Number of transactions processed
per second
– Response time to user interaction
– Time to complete specified operations
Performance testing
• Generally requires some logging
software to be associated with the
system to measure its performance
• May be carried out in conjunction
with stress testing using simulators
developed for stress testing
Reliability testing
• The system is presented with a large
number of ‘typical’ inputs and its
response to these inputs is observed
• The reliability of the system is based
on the number of incorrect outputs
which are generated in response to
correct inputs
• The profile of the inputs (the
operational profile) must match the
real input probabilities if the
reliability estimate is to be valid
Security testing
• Security testing is concerned with
checking that the system and its data
are protected from accidental or
malicious damage
• Unlike other types of testing, this
cannot really be tested by planning
system tests. The system must be
secure against unanticipated as well
as anticipated attacks
Security testing
• Security testing may be carried out
by inviting people to try to penetrate
the system through security
loopholes
Some Costly and Famous
Software Failures
Mariner 1 Venus probe loses
its way: 1962
Mariner 1
• A probe launched from Cape
Canaveral was set to go to Venus
• After takeoff, the unmanned rocket
carrying the probe went off course,
and NASA had to blow up the rocket
to avoid endangering lives on earth
• NASA later attributed the error to a
faulty line of Fortran code
Mariner 1
• “... a hyphen had been dropped from
the guidance program loaded aboard
the computer, allowing the flawed
signals to command the rocket to
veer left and nose down…
• The vehicle cost more than $80
million, prompting Arthur C. Clarke
to refer to the mission as "the most
expensive hyphen in history."
Therac 25 Radiation
Machine
Radiation machine kills
four: 1985 to 1987
• Faulty software in a Therac-25
radiation-treatment machine made by
Atomic Energy of Canada Limited
(AECL) resulted in several cancer
patients receiving lethal overdoses
of radiation
• Four patients died
Radiation machine kills
four: 1985 to 1987
• "A lesson to be learned from the Therac-25 story is that focusing on particular software bugs is not the way to make a safe system"
• "The basic mistakes here involved
poor software engineering practices
and building a machine that relies on
the software for safe operation."
AT&T long distance service
fails
AT&T long distance service
fails: 1990
• Switching errors in AT&T's call-handling computers caused the
company's long-distance network to
go down for nine hours, the worst of
several telephone outages in the
history of the system
• The meltdown affected thousands of
services and was eventually traced
to a single faulty line of code
Patriot missile
Patriot missile misses: 1991
• The U.S. Patriot missile's battery was
designed to head off Iraqi Scuds
during the Gulf War
• System also failed to track several
incoming Scud missiles, including
one that killed 28 U.S. soldiers in a
barracks in Dhahran, Saudi Arabia
Patriot missile misses: 1991
• The problem stemmed from a
software error that put the tracking
system off by 0.34 of a second
• System was originally supposed to
be operated for only 14 hours at a
time
– In the Dhahran attack, the missile
battery had been on for 100 hours
– errors in the system's clock
accumulated to the point that the
tracking system no longer functioned
Pentium chip
Pentium chip fails math
test: 1994
• Pentium chip gave incorrect answers
to certain complex equations
– bug occurred rarely and affected only a
tiny percentage of Intel's customers
• Intel offered to replace the affected
chips, which cost the company $450
million
• Intel then started publishing a list of
known "errata," or bugs, for all of its
chips
New Denver airport
New Denver airport misses
its opening: 1995
• The Denver International Airport was
intended to be a state-of-the-art
airport, with a complex,
computerized baggage-handling
system and 5,300 miles of fiber-optic
cabling
• Bugs in the baggage system caused
suitcases to be chewed up and drove
automated baggage carts into walls
New Denver airport misses
its opening: 1995
• The airport eventually opened 16
months late, $3.2 billion over budget,
and with a mainly manual baggage
system
The millennium bug: 2000
• No need to discuss this !!
Ariane 5 Rocket
Ariane 5
• The failure of the Ariane 501 was
caused by the complete loss of
guidance and attitude information 37
seconds after start of the main
engine ignition sequence (30
seconds after lift- off)
• This loss of information was due to
specification and design errors in the
software of the inertial reference
system
Ariane 5
• The extensive reviews and tests
carried out during the Ariane 5
Development Programme did not
include adequate analysis and
testing of the inertial reference
system or of the complete flight
control system, which could have
detected the potential failure
More on Testing
From Beatty – ESC 2002
Agenda
• Introduction
• Types of software errors
• Finding errors – methods and tools
• Embedded systems and RT issues
• Risk management and process
Introduction
• Testing is expensive
• Testing progress can be hard to
predict
• Embedded systems have different
needs
• Desire for best practices
Method
• Know what you are looking for
• Learn how to effectively locate
problems
• Plan to succeed – manage risk
• Customize and optimize the process
Entomology
• What are we looking for ?
• How are bugs introduced?
• What are their consequences?
Entomology – Bug
Frequency
• Rare
• Less common
• More common
• Common
Entomology – Bug severity
• Non functional: doesn't affect object code
• Low: correct problem when convenient
• High: correct as soon as possible
• Critical: change MUST be made
– Safety related or legal issue
Domain Specific!
Entomology - Sources
• Non-implementation error sources
– Specifications
– Design
– Hardware
– Compiler errors
• Frequency: common – 45 to 65%
• Severity: non-functional to critical
Entomology - Sources
• Poor specifications and designs are
often;
– Missing
– Ambiguous
– Wrong
– Needlessly complex
– Contradictory
Testing can fix these problems !
Entomology - Sources
• Implementation error sources;
– Algorithmic/processing bugs
– Data bugs
– Real-time bugs
– System bugs
– Other bugs
Bugs may fit in more than one category !
Entomology – Algorithm
Bugs
• Parameter passing
– Common only in complex invocations
– Severity varies
• Return codes
– Common only in complex functions or libraries
• Reentrance problem
– Less common
– Critical
Entomology – Algorithm
Bugs
• Incorrect control flow
– Common
– Severity varies
• Logic/math/processing error
– Common
– High
• Off by “1”
– Common
– Varies, but typically high
Example of logic error
If (( this AND that ) OR ( that AND other )
AND NOT ( this AND other ) AND NOT
( other OR NOT another ))
Boolean operations and mathematical
calculations can be easily misunderstood
In complicated algorithms!
Example of off by 1
for ( x = 0; x <= 10; x++)
This will execute 11 times, not 10!
for ( x = array_min; x <= array_max; x++)
If the intention is to set x to array_max
on the last pass through the loop, then
this is in error!
Be careful when switching between a 1-based language (Pascal, Fortran) and a zero-based one (C)
Entomology – Algorithm
bugs
• Math underflow/overflow
– Common with integer or fixed point
math
– High severity
– Be careful when switching between
floating point and fixed point
processors
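A small illustration, assuming 16-bit fixed-point values; the product does not fit back into 16 bits unless it is widened first. Variable names are illustrative.

#include <stdio.h>
#include <stdint.h>

int main(void) {
    int16_t gain   = 300;
    int16_t sample = 200;

    int16_t bad  = gain * sample;           /* 60000 does not fit in 16 bits;
                                               typically wraps to -5536       */
    int32_t good = (int32_t)gain * sample;  /* widen before storing the result */

    printf("bad = %d, good = %ld\n", bad, (long)good);
    return 0;
}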
Entomology – Data bugs
• Improper variable initialization
– Less common
– Varies; typically low
• Variable scope error
– Less common
– Low to high
Example - Uninitialized data
int some_function ( int some_param ) {
    int j;
    if (some_param >= 0)
    {
        for ( j = 0; j <= 3; j++ )
        {
            /* iterate through some process */
        }
    }
    else
    {
        if (some_param <= -10)
        {
            some_param += j; /* j is uninitialized */
        }
        return some_param;
    }
    return 0;
}
Entomology – Data bugs
• Data synchronization error
– Less common
– Varies; typically high
Example – synchronized
data
struct state {
GEAR_TYPE gear;
U16 speed;
U16 speed_limit;
U8 last_error_code;
} snapshot;
/* an interrupt will trigger */
/* sending snapshot in a message */
snapshot.speed = new_speed; /* …somewhere in code */
snapshot.gear = new gear;
/* somewhere else */
snapshot.speed_limit = speed_limit_tb[ gear ];
Interrupt splitting these two would be bad
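One common remedy, sketched as a continuation of the struct above; disable_interrupts()/enable_interrupts() are hypothetical placeholders for whatever critical-section mechanism the target actually provides (disabling the interrupt, a mutex, a double buffer).

/* Hypothetical critical-section macros; the real mechanism is target specific. */
#define disable_interrupts()  /* e.g. an intrinsic that masks the interrupt */
#define enable_interrupts()   /* e.g. the matching unmask intrinsic          */

void update_snapshot(GEAR_TYPE new_gear, U16 new_speed) {
    disable_interrupts();                  /* keep the fields consistent   */
    snapshot.gear        = new_gear;
    snapshot.speed       = new_speed;
    snapshot.speed_limit = speed_limit_tb[new_gear];
    enable_interrupts();                   /* interrupt may now fire again */
}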
Entomology – Data bugs
• Improper data usage
– Common
– Varies
• Incorrect flag usage
– Common when hard-coded constants
used
– varies
Example – mixed math error
unsigned int a = 5;
int b = -10;
/* somewhere in code */
if ( a + b > 0 )
{
a+b is not evaluated as –5 !
the signed int b is converted to an unsigned int
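A hedged sketch of one way the comparison could be made to behave as intended, assuming the signed result was what was meant.

#include <stdio.h>

int main(void) {
    unsigned int a = 5;
    int b = -10;

    /* Keep the arithmetic signed by converting before the add. */
    if ((int)a + b > 0)
        printf("taken\n");
    else
        printf("not taken (sum is -5)\n");   /* this prints */

    return 0;
}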
Entomology – Data bugs
• Data/range overflow/underflow
– Common in asm and 16 bit micro
– Low to critical
• Signed/unsigned data error
– Common in asm and fixed point math
– High to critical
• Incorrect conversion/type cast/scaling
– Common in complex programs
– Low to critical
Entomology – Data bugs
• Pointer error
– Common
– High to critical
• Indexing problem
– Common
– High to critical
Entomology – Real-time
bugs
• Task synchronization
– Waiting, sequencing, scheduling, race
conditions, priority inversion
– Less common
– Varies
• Interrupt handling
– Unexpected interrupts
– Improper return from interrupt
• Rare
• critical
Entomology – Real-time
bugs
• Interrupt suppression
– Critical sections
– Corruption of shared data
– Interrupt latency
• Less common
• critical
Entomology – System bugs
• Stack overflow/underflow
– Pushing, pulling and nesting
• More common in asm and complex designs
• Critical
• Resource sharing problem
– Less common
– High to critical
– Mars pathfinder
Entomology – System bugs
• Resource mapping
– Variable maps, register banks,
development maps
– Less common
– Critical
• Instrumentation problem
– Less common
– low
Entomology – System bugs
• Version control error
– Common in complex or mismanaged
projects
– High to critical
Entomology – other bugs
• Syntax/typing
– if (*ptr = NULL)
– Cut & paste errors
– More common
– Varies
• Interface
– Common
– High to critical
• Missing functionality
– Common
– high
Entomology – other bugs
• Peripheral register initialization
– Less common
– Critical
• Watchdog servicing
– Less common
– Critical
• Memory allocation/de-allocation
– Common when using malloc(), free()
– Low to critical
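A minimal sketch of two of the classic malloc()/free() bugs mentioned above: a leak on an early-return path and a double free. Names are illustrative.

#include <stdlib.h>
#include <string.h>

/* Leak: the early return path never frees buf. */
int parse_message_leaky(const char *msg) {
    char *buf = malloc(64);
    if (buf == NULL) return -1;
    if (strlen(msg) >= 64) return -1;   /* BUG: buf leaks here            */
    strcpy(buf, msg);
    /* ... */
    free(buf);
    return 0;
}

/* Double free: freeing the same pointer twice is undefined behavior. */
void double_free_bug(void) {
    char *p = malloc(16);
    free(p);
    free(p);    /* BUG: second free of the same pointer                  */
}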
Entomology – Review
• What are you looking for ?
• How are bugs being introduced ?
• What are their consequences ?
Form your own target list!
Finding the hidden errors…
• All methods use these basic techniques:
– Review: checking
– Tests: demonstrating
– Analysis: proving
These are all referred to as "testing"!
Testing
• “Organized process of identifying
variances between actual and
specified results”
• Goal: zero significant defects
Testing axioms
• All software has bugs
• Programs cannot be exhaustively
tested
• Cannot prove the absence of all
errors
• Complex systems often behave
counter-intuitively
• Software systems are often brittle
Finding spec/design
problems
• Reviews / Inspections / Walkthroughs
• CASE tools
• Simulation
• Prototypes
Still need consistently effective methods!
Testing – Spec/Design
Reviews
• Can be formal or informal
– Completeness
– Consistency
– Feasibility
– Testability
Testing – Evaluating
methods
• Relative costs
– None
– Low
– Moderate
– High
• General effectiveness
– Low
– Moderate
– High
– Very high
Testing – Code reviews
• Individual review
– Effectiveness: high
– Cost: time low, material none
• Group inspections
– Effectiveness: very high
– Cost: time moderate, material none
Testing – Code reviews
• Strengths
– Early detection of errors
– Logic problems
– Math errors
– Non-testable requirements or paths
• Weaknesses
– Individual preparation and experience
– Focus on details, not "big picture"
– Timing and system issues
Step by step execution
• Exercise every line of code or every
branch condition
• Look for errors
– Use simulator, ICE, logic analyzer
– Effectiveness: moderate, dependent on tester
– Cost: time is high, material is low or moderate
Functional (Black Box)
• Exercise inputs and examine outputs
• Test procedures describe expected
behavior
• Subsystems tested and integrated
– Effectiveness is moderate
– Cost: time is moderate, material varies
Tip: where functional testing finds problems, look deeper in that area!
Functional (Black Box)
• Strengths
– Requirements problems
– Interfaces
– Performance issues
– Most critical/most used features
• Weaknesses
– Poor coverage
– Timing and other problems masked
– Error conditions
Functional test process
• ID requirements to test
• Choose strategy
– 1 test per requirement
– Test small groups of requirements
– Scenario; broad sweep of many requirements
• Write test cases
– Environment
– Inputs
– Expected outputs
• Traceability
Structural (White box)
• Looks at how code works
• Test procedures
• Exercise paths using many data values
• Consistency between design and implementation
– Effectiveness: high
– Cost: time is high, material low to moderate
Structural (White box)
• Strengths
– Coverage
– Effectiveness
– Logic and structure problems
– Math and data errors
• Weaknesses
– Interface and requirements
– Focused; may miss "big picture"
– Interaction with system
– Timing problems
Structural (White box)
• Test rigor based on 3 levels of Risk
(FAA)
• C – reduced safety margins or
functionality
– Statement coverage
– Invoke every statement at least once
Structural (White box)
• Test rigor based on 3 levels of Risk
(FAA)
• B – Hazardous – Decision Coverage
– Invoke every statement at least once
– Invoke every entry and exit
– Every control statement takes all
possible outcomes
– Every non-constant Boolean expression
evaluated to both a True and a False
result
Structural (White box)
• Test rigor based on 3 levels of Risk
(FAA)
• A – Catastrophic – Modified
Condition Decision Coverage
– Every statement has been invoked
– Every point of entry and exit has been
invoked
Structural (White box)
– Every control statement has taken all possible
outcomes
– Every Boolean expression has evaluated to
both a True and a False result
– Every condition in a Boolean expression has
evaluated to both True/False
– Every condition in a Boolean expression has
been shown to independently affect that
expression’s outcome
Unit test standards
• What is the white box testing plan?
• What do you test?
• When do you test it?
• How do you test it?
Structural test process
• ID all inputs
• ID all outputs
• ID all paths
• Set up test cases
– Decision coverage
– Boundary value analysis
– Checklist
– Weaknesses
Structural test process
• Measure worst case execution time
• Determine worst case stack depth
• Bottom up
Integration
• Combines elements of white and black box
– Unexpected return codes or
acknowledgements
– Parameters – boundary values
– Assumed initial conditions/state
– Unobvious dependencies
– Aggregate functionality
Integration
• Should you do this…when?
– Depends on the complexity of the
system
– Boundary values of parameters in
functions
– Interaction between units
– “interesting” paths
• Errors
• Most common
Verification
• Verify the structural integrity of the
code
• Find errors hidden at other levels of
examination
• Outside of requirements
• Conformance to standards
Verification
• Detailed inspection, analysis, and
measurement of code to find common
errors
• Examples
– Stack depth analysis
– Singular use of flags/variables
– Adequate interrupt suppression
– Maximum interrupt latency
– Processor-specific constraints
Verification
• Strengths
– Finds problems that testing and inspection
can’t
– Stack depth
– Resource sharing
– Timing
• Weaknesses
– Tedious
– Focused on certain types of errors
Verification
• Customize for your
process/application
– What should be checked
– When
– How
– By whom
Stress/performance
• Load the system to maximum…and
beyond!
• Helps determine “factor of safety”
• Performance to requirements
Stress/performance
• Examples
– Processor utilization
– Interrupt latency
– Worst time to complete a task
– Periodic interrupt frequency jitter
– Number of messages per unit time
– Failure recovery
Other techniques
• Fault injection
• Scenario testing
• Regression
– Critical functions
– Most functionality with the least tests
– Automation
– Risk of not re-testing is higher than the cost
• Boundary value testing
Tools
[Table: capabilities of ICE, simulator, and logic analyzer compared for stepping through code, controlling execution, modifying data, coverage, and timing analysis]
Code Inspection Checklist
Code Inspection Checklist
• Code correctly implements the
document software design
• Code adheres to coding standards
and guidelines
• Code is clear and understandable
• Code has been commented
appropriately
• Code is within complexity guidelines
– Cyclomatic complexity < 12
Code Inspection Checklist
• Macro formal parameters should not have side effects (lint message 665)
• Use parentheses to enhance code robustness; use parentheses around all macro parameters (665, 773)
• Examine all typecasts for correct
operation
• Examine effects of all implicit type conversions (910-919)
Code Inspection Checklist
• Look for off-by-one errors in loop
counters, arrays, etc
• Assignment statements within
condition expressions (use cchk)
• Guarantee that a pointer can never
be Null when de-referencing it
• Cases within a switch should end in
a break (616)
Code Inspection Checklist
• All switch statements should have a
default case (744)
• Examine all arguments passed to
functions for appropriate use of pass
by value, pass by reference, and
const
• Local variables must be initialized
before use
• Equality test on floating point
numbers may never be True (777)
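A small sketch of why the checklist flags float equality tests, and one common tolerance-based alternative; the epsilon value is an assumption that depends on the application.

#include <stdio.h>
#include <math.h>

int main(void) {
    double x = 0.1 + 0.2;

    if (x == 0.3)                 /* never true: x is 0.30000000000000004 */
        printf("equal\n");
    if (fabs(x - 0.3) < 1e-9)     /* compare against a tolerance instead  */
        printf("close enough\n");

    /* Adding values of very different magnitude silently loses the small one: */
    printf("%d\n", (1.0e16 + 1.0) == 1.0e16);   /* prints 1 */
    return 0;
}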
Code Inspection Checklist
• Adding and subtracting floats of
different magnitudes can result in
lost precision
• Ensure that division by zero cannot occur
• Sequential multiplications and
divisions may produce round-off
errors
Code Inspection Checklist
• Subtracting nearly equal values can
produce cancellation errors
• C rounds towards zero – is this
appropriate here ?
• Mathematical underflow/overflow
potential
• Non-deterministic timing constructs
Unit test standards
Unit test standards
• 1. Each test case must be capable of
independent execution, i.e. the setup and
results of a test case shall not be used by
subsequent test cases
• 2. All input variables shall be initialized for
each test case. All output variables shall
be given an expected value, which will be
validated against the actual result for each
test case
Unit test standards
• 3. Initialize variables to valid values taking
into account any relationships among
inputs. In other words, if the value of a
variable A affects the domain of variable
B, select values for A and B which satisfy
the relationship
• 4. Verify that the minimum and maximum
values can be obtained for each output
variable (i.e. select input values that
produce output values as close to the
max/min as possible)
Unit test standards
• 5. Initialize output variables
according to the following;
– If an input is expected to change, set its
initial value to something other than the
expected result
– If an output is not expected to change,
set its initial value to its expected value
• 6. Verify loop entry and exit criteria
Unit test standards
• 7. Maximum loop iterations should
be executed to provide worst case
timing scenarios
• 8. Verify that the loss of precision
due to multiplication or division is
within acceptable tolerance
Unit test standards
• 9. The following apply to conditional
expressions
– “OR” expressions are evaluated by
setting all predicates “FALSE” and then
setting each one “TRUE” individually
– “AND” expressions are evaluated by
setting all predicates “TRUE” and then
setting each one “FALSE” individually
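A sketch of standard 9 applied to a two-predicate OR and a two-predicate AND; each predicate is toggled individually from the stated baseline. The conditions under test are hypothetical.

#include <stdio.h>

/* Hypothetical conditions under test. */
int alarm_needed(int over_temp, int over_pressure) {
    return over_temp || over_pressure;
}

int interlock_ok(int door_closed, int key_present) {
    return door_closed && key_present;
}

int main(void) {
    /* OR: all predicates FALSE, then each TRUE individually. */
    printf("%d %d %d\n", alarm_needed(0, 0),
                         alarm_needed(1, 0),
                         alarm_needed(0, 1));

    /* AND: all predicates TRUE, then each FALSE individually. */
    printf("%d %d %d\n", interlock_ok(1, 1),
                         interlock_ok(0, 1),
                         interlock_ok(1, 0));
    return 0;
}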
Unit test standards
• 10. Do not stub any functions that
are simple enough to include within
the unit test
• 11. Non-trivial tests should include
an explanation of what is being
tested
Unit test standards
• 12. Unit test case coverage is complete
when the following criteria are satisfied
(where applicable)
– 100% function and exit coverage
– 100% call coverage
– 100% statement block coverage
– 100% decision coverage
– 100% loop coverage
– 100% basic condition coverage
– 100% modified condition coverage
Unit test checklist
Common coding error checks | Status | Note(s)
Mathematical expression underflow/overflow | <Pass/Fail/NA> |
Off-by-one errors in loop counters | <Pass/Fail/NA> |
Assignment statements within conditional expressions | <Pass/Fail/NA> | May be detected by compiler, lint, cchk
Floats are not compared solely for equality | <Pass/Fail/NA> | Lint message 777
Variables and calibrations use correct precision and ranges in calculations | <Pass/Fail/NA> |
Unit test checklist
Common coding error checks | Status | Note(s)
Pointers initialized and de-referenced properly | <Pass/Fail/NA> |
Intermediate calculations are not stored in global variables | <Pass/Fail/NA> |
All declared local variables are used in the function | <Pass/Fail/NA> | May be detected by compiler or lint
Typecasting has been done correctly | <Pass/Fail/NA> |
Unreachable code has been removed | <Pass/Fail/NA> | Lint message 527
Common coding error checks | Status | Note(s)
All denominators are guaranteed to be non-zero (no divide by 0) | <Pass/Fail/NA> |
Switch statements handle every case of the control variable (have DEFAULT paths); any cases that "fall through" to the next case are intended to do so | <Pass/Fail/NA> | Lint messages 744, 787; fall through 616
Static variables are used for only one purpose | <Pass/Fail/NA> |
All variables have been properly initialized before being used; do not assume a value of "0" after power-up | <Pass/Fail/NA> |