CS 540 – Quantitative Software Engineering
Lecture 11: Testing, Verification, Validation and Certification
You can’t test in quality
Independent system testers

Software Quality vs. Software Testing

Software Quality Management (SQM) refers to processes designed to engineer value and functional conformance and to minimize faults, failures, and defects
• Includes processes throughout the software life cycle (inspections, reviews, audits, validations, etc.)
Software testing is an activity performed to evaluate quality (and improve it) by identifying defects and problems (SWEBOK)
SWEBOK Software Testing

"Software testing consists of dynamic verification of the behavior of a program on a finite set of test cases, suitably selected from the usually infinite executions domain, against the expected behavior"
• Dynamic (means software execution vs. static inspections, reviews, etc.)
• Finite (trade-off of resources)
• Selected: techniques vary on how they select tests (purpose)
• Expected behavior: functional and operational
SWEBOK Software Testing Topics

Fundamentals:
• Definitions, standards, terminology, etc.
• Key issues: looking for defects vs. verify and validate
Test levels:
• Unit test through beta test
• Objectives: conformance, functional, acceptance, installation, performance/stress, reliability, usability, etc.
Test Techniques:
• Ad hoc, exploratory, specification-based, boundary-value
SWEBOK Software Testing Topics

Test Techniques:
• Ad hoc
• Exploratory
• Specification-based
• Boundary-value analysis
• Decision table
• Finite state/model
• Random generation
• Code-based (control flow vs. data flow)
• Application/technology-based: GUI, OO, protocol, safety, certification
SWEBOK Software Testing Topics

Test Effectiveness Metrics:
• Fault types and categorization
• Fault density
• Statistical estimates of find/fix rates
• Reliability modeling (failure occurrences)
• Coverage measures
• Fault seeding
Testing Metrics

Test Case Execution Metrics
• Percent planned, executed, passed
Defect Rates
• Defect rates based on NCLOC
• Predicted defect detection (upper/lower control limits)
• Fault density/fault criticality (software control board)
Fault types, classification, and root cause analysis
Fault on fault, breakage, regression test failures
Reliability, performance impact
Field faults / prediction of deficiencies
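
As a minimal sketch (not from the lecture; the numbers and function names are illustrative), the first two metric groups above reduce to simple arithmetic:

```python
# Illustrative computation of test case execution metrics and defect density.

def execution_metrics(planned: int, executed: int, passed: int) -> dict:
    """Percent of planned tests executed, and percent of executed tests that passed."""
    return {
        "percent_executed": 100.0 * executed / planned,
        "percent_passed": 100.0 * passed / executed,
    }

def defect_density(defects_found: int, ncloc: int) -> float:
    """Defects per thousand non-comment lines of code (KNCLOC)."""
    return 1000.0 * defects_found / ncloc

print(execution_metrics(planned=200, executed=180, passed=171))
# e.g. 12 defects in 8,000 NCLOC -> 1.5 defects per KNCLOC
print(defect_density(defects_found=12, ncloc=8000))
```
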
Software Testing Axioms

Dijkstra: "Testing can show the presence of bugs but not their absence!"
Independent testing is a necessary but not sufficient condition for trustworthiness.
Good testing is hard and occupies 20% of the schedule.
Poor testing can dominate 40% of the schedule.
Test to assure confidence in operation, not to find bugs.
Software Quality and Testing Axioms

It is impossible to completely test software.
Software testing is a risk-based exercise.
All software contains faults and defects.
The more bugs you find, the more there are.
"A relatively small number of causes will typically produce a large majority of the problems or defects (80/20 Rule)." --Pareto principle
Types of Tests

Unit
Interface
Integration
System
Scenario
Reliability
Stress
Verification
Validation
Certification
When to Test

Boehm: errors discovered in the operational phase incur costs 10 to 90 times higher than those found in the design phase
• Over 60% of the errors were introduced during design
• Two-thirds of these were not discovered until operations
Test requirements specifications, architectures and designs
Testing Approaches

Coverage-based: all statements must be executed at least once
Fault-based: detect faults; artificially seed faults and determine whether tests find at least X% of them
Error-based: focus on typical errors such as boundary values (off by one) or maximum elements in a list (see the sketch below)
Black box: function/specification-based; test cases derived from the specification
White box: structure/program-based; testing considers the internal logical structure of the software
Stress-based: no load, impulse, uniform, linear growth, exponential growth by 2's
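
As a minimal sketch of the error-based (boundary-value) approach, assume a hypothetical validator that accepts ages from 1 to 100; the test cases sit just below, on, and just above each boundary, where off-by-one errors typically hide:

```python
def accept_age(age: int) -> bool:
    """Hypothetical validator: ages 1..100 inclusive are accepted."""
    return 1 <= age <= 100

# Boundary-value cases around both edges of the valid range.
cases = {0: False, 1: True, 2: True, 99: True, 100: True, 101: False}

for value, expected in cases.items():
    assert accept_age(value) == expected, f"boundary case {value} failed"
print("all boundary-value cases passed")
```
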
Testing Vocabulary

Error: human action producing an incorrect result
Fault: a manifestation of an error in the code
Failure: a system anomaly; executing a fault induces a failure
Verification: "The process of evaluating a system or component to determine whether the products of a given development phase satisfy conditions imposed at the start of the phase" -- e.g., ensure the software correctly implements a certain function; have we built the system right?
Validation: "The process of evaluating a system or component during or at the end of the development process to determine whether it satisfies specified requirements" -- have we built the right system?
Certification: "The process of assuring that the solution solves the problem."
IEEE 829: IEEE Standard for Software Test Documentation

Test Case Specification
Test Suite
Test Scripts
Test Scenarios
Test Plans
Test Logs
Test Incident Report
Test Item Transmittal Report
Test Summary Report
Test Process

[Diagram: a test strategy selects a subset of inputs; the program or document under test executes that subset to produce actual output, which is compared against the expected output from a prototype or model to yield the test results.]
Fault Detection vs. Confidence Building

Testing provokes failure behavior: a good strategy for fault detection, but it does not inspire confidence
The user wants failure-free behavior: high reliability
Automatic recovery minimizes user doubts
Test team results can demoralize end users, so report only those impacting them
A project with no problems is in deep trouble
Cleanroom

The developer does not execute code; correctness is established through static analysis
Modules are integrated and tested by independent testers using traffic-based input profiles
Goal: achieve a given reliability level considering expected use
Testing requirements

Review or inspection to check that all aspects of the system have been described
• Scenarios with prospective users resulting in functional tests
Common errors in a specification:
• Missing information
• Wrong information
• Extra information
Boehm's specification criteria

Completeness: all components are present and described completely; nothing pending
Consistency: components do not conflict, and the specification does not conflict with external specifications (internal and external consistency); each component must be traceable
Feasibility: benefits must outweigh cost; risk analysis (safety, e.g. robotics)
Testability: it is possible to determine whether the system does what is described
Roots of ICED-T

Traceability Tables

Features: requirements related to observable system/product features
Source: the source of each requirement
Dependency: relation of requirements to each other
Subsystem: requirements by subsystem
Interface: relation of requirements to internal and external interfaces
Traceability Table (Pressman)

[Table: rows are requirements (R01, R02, R03, ...), columns are subsystems (S01, S02, S03, ...); an X marks each subsystem that addresses a given requirement.]
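
One way to make such a table operational (a sketch, with hypothetical requirement and subsystem names) is to hold it as data so that unallocated requirements can be flagged automatically:

```python
# Traceability matrix as data: requirement -> set of subsystems that address it.
traceability = {
    "R01": {"S01", "S03"},
    "R02": {"S01", "S02"},
    "R03": set(),          # not yet allocated to any subsystem
}

unallocated = [req for req, subsystems in traceability.items() if not subsystems]
print("requirements with no subsystem:", unallocated)
```
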
Maintenance Testing

More than 50% of the project life is spent in maintenance
Modifications induce another round of tests
Regression tests
• Library of previous tests plus added ones (especially if the fix was for a fault not uncovered by previous tests)
• The issue is whether to retest all vs. retest selectively: an expense-related decision (and a decision related to the state of the architecture/design -- when entropy sets in, test thoroughly); see the sketch after this list
• Cuts the testing interval in half
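
A minimal sketch of selective retest (module and test names are hypothetical): rerun only the library tests whose covered modules intersect the modules touched by the fix.

```python
# Map each regression test to the modules it exercises.
regression_library = {
    "test_login":    {"auth", "session"},
    "test_transfer": {"accounts", "ledger"},
    "test_report":   {"ledger", "reporting"},
}
changed_modules = {"ledger"}   # modules touched by the current fix

selected = [name for name, touched in regression_library.items()
            if touched & changed_modules]
print("selective retest:", selected)   # -> test_transfer, test_report
```
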
V&V planning and documentation

IEEE 1012 specifies what should be in a Test Plan
Test Design Document: specifies, for each software feature, the details of the test approach and lists the associated tests
Test Case Document: lists inputs, expected outputs and execution conditions
Test Procedure Document: lists the sequence of actions in the testing process
Test Report: states what happened for each test case; sometimes these are required as part of the contract for system delivery
In small projects many of these can be combined
IEEE 1012

Purpose
Referenced Documents
Definitions
V&V Overview
• Organization
• Master schedule
• Resources summary
• Responsibilities
• Tools, techniques and methodologies
Life Cycle V&V
1. Management of V&V
2. Requirements phase V&V
3. Design phase V&V
4. Implementation V&V
5. Test phase V&V
6. Installation and checkout phase V&V
7. Operation and maintenance (O&M) V&V
Software V&V Reporting
V&V Administrative Procedures
1. Anomaly reporting and resolution
2. Task iteration policy
3. Deviation policy
4. Control procedures
5. Standard practices and conventions
Human static testing

Reading: peer reviews (best and worst technique)
Walkthroughs and inspections
Scenario-based evaluation (SAAM)
Correctness proofs
Stepwise abstraction from code to spec
Inspections

Sometimes referred to as Fagan inspections
Basically, a team of about four people examines code, statement by statement
• Code is read before the meeting
• The meeting is run by a moderator
• Two inspectors or readers paraphrase the code
• The author is a silent observer
• Code is analyzed using a checklist of faults: wrongful use of data, declarations, computation, relational expressions, control flow, interfaces
• Results in identified problems that the author corrects and the moderator reinspects
A constructive attitude is essential; do not use inspections for programmers' performance reviews
Walkthroughs

Guided reading of code using test data to run a "simulation"
Generally less formal
A learning situation for new developers
Parnas advocates a review with specialized roles, where the roles define the questions asked; proven to be very effective (active reviews)
Non-directive listening
The Value of Inspections/Walk-Throughs (Humphrey 1989)

Inspections can be 20 times more efficient than testing.
Code reading detects twice as many defects/hour as testing.
80% of development errors were found by inspections.
Inspections resulted in a 10x reduction in the cost of finding errors.
Beware: bureaucratic code reviews drive away gurus.
SAAM

Software Architecture Analysis Method
Scenarios describe both current and future behavior
Classify the scenarios by whether the current architecture supports them directly (full support) or indirectly
Develop a list of changes to the architecture/high-level design; if semantically different scenarios require a change in the same component, this may indicate flaws in the architecture
• Cohesion: the glue that keeps a module together; low cohesion = bad
» Functional cohesion: all components contribute to the single function of that module
» Data cohesion: encapsulate abstract data types
• Coupling: the strength of inter-module connections; loosely coupled modules are easier to comprehend and adapt; low coupling = good
Coverage based Techniques (unit testing)

Adequacy of testing is based on coverage: percent of statements executed, percent of functional requirements tested
All-paths coverage is an exhaustive testing of the code
Control flow coverage:
• All-nodes coverage (all-statements coverage); recall cyclomatic complexity graphs
• All-edges coverage (branch coverage): all branches chosen at least once
• Multiple-condition coverage (extended branch coverage): covers all combinations of elementary predicates
• Cyclomatic number criterion: tests all linearly independent paths
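
A minimal sketch of all-edges (branch) coverage on a made-up two-decision function; with coverage.py the same idea can be measured automatically via `coverage run --branch`:

```python
def passes_course(grade: int, absences: int) -> bool:
    if grade < 60:        # decision 1: true / false
        return False
    if absences > 10:     # decision 2: true / false
        return False
    return True

# Three inputs drive every branch outcome (edge) at least once.
tests = [
    ((50, 0), False),    # decision 1 true
    ((90, 12), False),   # decision 1 false, decision 2 true
    ((90, 2), True),     # decision 1 false, decision 2 false
]
for (grade, absences), expected in tests:
    assert passes_course(grade, absences) == expected
print("all branches exercised")
```
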
Coverage Based Techniques (2)

Data flow coverage considers definitions and uses of variables
• A variable is defined if it is assigned a value in a statement
• A definition is alive if the variable is not reassigned at an intermediate statement: a definition-clear path
• Variable use: P-use (as a predicate) or C-use (as anything else)
• Testing each possible use of a definition is all-uses coverage
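
A small sketch of the def-use vocabulary on a made-up function:

```python
def classify(x: int) -> str:
    y = x * 2                # definition of y
    if y > 10:               # P-use of y (y appears in a predicate)
        return "big"
    return f"value {y}"      # C-use of y (y used in a computation)

# All-uses coverage: reach each use from the definition along a
# definition-clear path (y is never reassigned in between).
assert classify(6) == "big"       # def(y) -> P-use(y), true branch
assert classify(2) == "value 4"   # def(y) -> P-use(y) false branch, then C-use(y)
```
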
Requirements coverage

Transform the requirements into a graph
• nodes denoting elementary requirements
• edges denoting relations between elementary requirements
Derive test cases
Use control flow coverage
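
A sketch of the idea with invented requirement names: nodes are elementary requirements, edges are relations, and each edge becomes a test obligation (edge coverage):

```python
requirements_graph = {
    "R01 login":          ["R02 view account", "R04 reset password"],
    "R02 view account":   ["R03 transfer funds"],
    "R03 transfer funds": [],
    "R04 reset password": [],
}

# One test case per edge (relation between elementary requirements).
test_obligations = [(src, dst)
                    for src, targets in requirements_graph.items()
                    for dst in targets]
for number, (src, dst) in enumerate(test_obligations, start=1):
    print(f"TC{number:02d}: exercise {src} followed by {dst}")
```
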
Fault Seeding to estimate faults in a program

Artificially seed faults, then test to discover both seeded and new faults:

Estimated real faults = (total faults found - seeded faults found) x (total seeded faults / seeded faults found)

Assumes real and seeded faults have the same distribution, but manually generated faults may not be realistic
Alternative: use two groups; real faults found by X become seeded faults for Y
Trust the results when most faults found are seeded ones
Finding many real faults is a negative signal: redesign the module
The probability of more faults in a module is proportional to the number of errors already found!
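
A sketch of the estimate above with illustrative numbers (a mark-recapture style calculation):

```python
def estimated_real_faults(seeded: int, seeded_found: int, total_found: int) -> float:
    """Estimate total real faults from a fault-seeding experiment."""
    real_found = total_found - seeded_found
    return real_found * seeded / seeded_found

# 20 faults seeded; testing finds 41 faults, 16 of them seeded.
# Real faults found = 25, estimate = 25 * 20/16 = 31.25 real faults,
# so roughly 6 real faults are predicted to remain undetected.
print(estimated_real_faults(seeded=20, seeded_found=16, total_found=41))
```
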
Orthogonal Array Testing

Intelligent selection of test cases
The fault model being tested is that simple interactions are a major source of defects
• Independent variables: factors and the number of values each can take. If you have four variables, each of which can take 3 values, exhaustive testing would require 81 tests (3x3x3x3), whereas the OATS technique requires only 9 tests yet covers all pair-wise interactions (see the sketch below)
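
A sketch of the 4-factor, 3-level case described above: the standard L9 orthogonal array provides the 9 test cases, and the check below confirms every pair-wise combination of factor values appears at least once.

```python
from itertools import combinations, product

# Standard L9(3^4) orthogonal array: 9 rows, 4 factors, levels 0..2.
L9 = [
    (0, 0, 0, 0), (0, 1, 1, 1), (0, 2, 2, 2),
    (1, 0, 1, 2), (1, 1, 2, 0), (1, 2, 0, 1),
    (2, 0, 2, 1), (2, 1, 0, 2), (2, 2, 1, 0),
]

# Every pair of factors must exhibit all 9 level combinations.
for col_a, col_b in combinations(range(4), 2):
    covered = {(row[col_a], row[col_b]) for row in L9}
    assert covered == set(product(range(3), repeat=2)), (col_a, col_b)

print(f"{len(L9)} tests cover all pair-wise interactions (exhaustive would need 81)")
```
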
Top-down and Bottom-up Integration (Humphrey, 1989)

Bottom-up
• Major features: allows early testing; modules can be integrated in various clusters as desired; major emphasis is on module functionality and performance
• Advantages: no test stubs are needed; it is easier to adjust staffing needs; errors in critical modules are found early
• Disadvantages: test drivers and harnesses are needed; many modules must be integrated before a working program is available; interface errors are discovered late

Top-down
• Major features: the control program is tested first; modules are integrated one at a time; major emphasis is on interface testing
• Advantages: no test drivers are needed; the control program plus a few modules forms a basic early prototype; interface errors are discovered early; modular features aid debugging
• Disadvantages: test stubs are needed; the extended early phases dictate a slow staff buildup; errors in critical modules at low levels are found late
Some Specialized Tests

Testing GUIs
Testing with client/server architectures
Testing documentation and help facilities
Testing real-time systems
Acceptance test
Conformance test
Software Testing Footprint

[Chart: test status over time -- tests planned, tests completed, and tests run successfully; a rejection point marks where poor module quality becomes evident.]
Customer Interests

Before installation:
• Features
• Price
• Schedule
After installation:
• Reliability
• Response time
• Throughput
Why bad things happen to good systems

• Customer buys off-the-shelf
• System works with 40-60% flow-through
• Developer complies with enhancements
BUT
• Customer refuses the critical billing module
• Customer demands 33 enhancements and tinkers with the database
• Unintended system consequences
Mindset

Move from a culture of minimal change to one of maximal change.
Move to a "make it work, make it work right, make it work better" philosophy through prototyping and delaying code optimization.
Give the test teams the "right of refusal" for any code that was not reasonably tested by the developers.
Productivity

Productivity = F{people, system nature, customer relations, capital investment}
Software Testing Summary

The software testing Body of Knowledge is very advanced (in terms of standards, literature, etc.)
Software testing is very expensive; statistical risk analysis must be utilized
• Cost of field faults vs. schedule slips
• Release readiness criteria and procedures are required
Testing techniques vary according to operational environment and application functionality
• No magic methods
Organizational conflict of interest exists between development and test, and between project management and test
Involve testers throughout the project
The hardest PM decision: ship/don't ship due to quality