Transcript Software

“Ten Quality Methods which probably won't
improve product quality; and ten quality
methods that more probably will succeed
- for all aspects of quality”
Tom Gilb
www.Gilb.com
[email protected]
Result Planning Limited
MASTER Version May 8th 2002
Slide 2
“won’t improve”
(means, in this talk)
• You do not end up getting ‘as good as you
expected’, when you invested in the method.
• You would not have used the method, if that were
the result
• ALTERNATIVE TALK TITLE
• "Ten Quality Methods
– which probably won't improve product quality,
– and ten quality methods that more probably will
succeed – for all aspects of quality - not just bugs. "
Slide 3
Software
• All elements of the software (not just code)
– The code
– The updates
– The data and databases
– The online user instruction
– The interfaces
– The user manuals
– The training materials
– The development and maintenance documentation
– The test planning
– The test scripts
– Anything else which is not clearly hardware
Slide 4
Quality
• All stakeholder valued aspects of system performance,
including quality and savings
– Speed
– Capacity
– Adaptability
– Maintainability
– Availability
– Reliability
– Portability
– Reusability
– Testability
– Usability
– And very many more
Slide 5
Here are some popular methods or
approaches from which people expect
some software quality, but I
suggest they will in practice be
disappointed - often because of poor
teaching and implementation - often
because of lack of quality focus
Slide 6
1. Go for CMM Level X
• The Software
Engineering
Institute’s Capability
Maturity Models
• CMM and CMMI
• Levels 2 to 5
• Results of Level 3
• Why Not?
– Not “quality” oriented
– CMM Bureaucracy
overwhelms any idea of
quality
– Intended mainly to put
reasonable software
engineering processes in
place, but
– Does not directly address
any quality aspect of a
system
• Maybe you can get
quality in spite of CMM
– but not because of it.
Slide 7
2. Demand ‘Better’
(Conventional) Testing
• Conventional Software
Testing is not normally
directed towards product or
system quality levels
– It looks for bugs (to
oversimplify quite a bit!)
• Conventional Testing is
‘function’ oriented (not
quality-oriented)
– It does not measure multiple
quality type levels
• Conventional Testing is too
late in the development cycle
– You get quality by designing
it in, not testing it in!
• Test can prove presence
of bugs/defects but cannot
prove their absence
• (Note I will suggest
Evolutionary Testing as
a way of improving
software quality later.
Evo testing is not
conventional, yet!)
Slide 8
3. Use Cases
• Use cases are not directed
to qualities of a system
• Use cases cannot express
quality requirements
• Use cases are not judged
on the degrees of quality
they deliver to an
architecture
• There is
– no evidence published
– about the relationship
– between Use cases and
any sort of quality
• I’d be happy to be
informed of evidence
I have overlooked!
The list of problems with Use
Cases and UML
I have no intention of going through
this in detail during my talk, but I
wanted to make the details available
to the participant - to lend more
credibility to my point.
The details are at the end of these
slides.
Slide 10
A Use Case Critique Summary
By Don Mills [Mills01]
• This Appendix lists the “problems with use cases” that I found in my brief, and
unscientific, survey of “the literature” (a mixture of books on my and my employer’s
shelves, with articles found by browsing the Internet). The first eight entries come
from the UI Design.net editorial for October 1999
(http://www.uidesign.net/1999/imho/oct_imho.html).
• Solutions to all of the problems exist, but not within the RUP or the UML (or only
clumsily, ambiguously, or inconsistently), while outside those strictures many
competing solutions have been proposed.
• Note that this is not intended as an exhaustive list ..
• DETAILS AT END OF THESE SLIDES.
Slide 11
4. RUP, RUP SE
“System Quality – Provides the views to support addressing system
quality issues in an architecture driven
process” [RUP SE]
• “In RUP SE, [RUP]
– this idea is carried forward,
adding systems engineers to
the mix.
– Their area of concern is the
design and specification of the
hardware and system
deployment to ensure that the
overall system requirements
are addressed.”
• Rational Unified
Process never did
address quality.
• RUP SE (Systems
Engineering) is a
belated, but weak (TG
Opinion) attempt to
patch that hole in
RUP
Slide 12
RUP SE Example of ‘dealing with quality’ [RUP]
Slide 13
5. Conventional Inspection, Peer
reviews, Reviews
• Reviews do not generally focus
on quality.
• Specific reviews may attempt
to address quality. But in my
view not professionally
(quantified!).
• Conventional Inspections as
they are usually done
– will fail to deal with quality in
general,
– and will be very cost
ineffective for quality in terms
of bugs
• Why are ‘Conventional’
inspections a failure
route?
– They focus on clean up of
bad work (high bug
injection rates)
– Their effectiveness for
bugs is maximum 60%
(one pass)
– They are rarely done at full
effect (likely effect 10%-30%)
Slide 14
5. (continued) Inspections, to
deal with quality, must:
• Deal with all aspects of quality engineering
including quality requirements, quality
design
• Define required quality practices in terms
of
– process ‘Rules’ (failed rule = defect, detected
by Inspection) Like:
• “All quality requirements will be defined with a
scale of measure”
• “All design specification will be evaluated
quantitatively on an impact estimation table”
Slide 15
6. Extreme Programming XP
• XP has no direct focus
on quality
• But there are several
mechanisms which
can help reduce
injection of bugs in
XP
• That does not deal
with many other types
of quality.
• XP can’t hurt you but
it does not pretend to
solve the larger
quality attribute
problem
Kent Beck XP
16
XP Pair Programming
IEEE Software July/Aug 2000
As Beck writes, “Even if you weren’t
more productive, you would still want to
pair, because the resulting code quality is so
much higher.”
By working in tandem, the pairs completed
their assignments 40% to 50% faster.
17
18
Different View 12 March 2002

dear tom,

browsing through your presentation "10 guaranteed ways ..."
that i did not have the opportunity to listen to, i noticed
that you also have a slide concerning the XP practice of pair
programming. you might be interested in a new study on
pair programming to be found at

http://dialspace.dial.pipex.com/town/drive/gcd54/conference2001/papers/nawrocki.pdf .
the study is essentially contradicting earlier findings by
laurie williams.

i actually set up a paper "Extreme Programming Considered
Harmful for Reliable Software Development" that you can
find at

http://www.avoca-vsm.com/Dateien-Download/ExtremeProgramming.pdf

and you might want to have a look at it.

regards,

gerold keefer

=====================================================================
AVOCA GmbH - Advanced Visioning of Components and Architectures
Kronenstrasse 19
D-70173 Stuttgart
fo +49 711 2271374 fa +49 711 2271375
http://www.avoca-vsm.com mailto:[email protected]







Woodward asks about XP 1/3
19
Questions on XP from [email protected]
In response to http://www.extremeprogramming.org
1. How do you manage required changes in Software Architecture? Not all programmers are
architects and not all architects are programmers so who does the work and what do the
programmers do while the architecture is changed?
2. It seems to assume that all team members are equally experienced and skilled i.e. can make
changes to the system with equal levels of confidence and competence. Otherwise, who is
responsible for the integrity of the system, data models etc?
3. Who specifies the requirements? How are they specified? Or do the programmers have free
rein to interpret often-fuzzy statements by the users however they want to?
4. What does the Project Manager do?
5. Why is XP different to what is known as RAD? OR DSDM? Or Evo? Or RUP?
6. XP promotes good practice, right? So where is the Process?
7. How does a system programmed via XP allow changing requirements to be implemented more
easily than in other methods? Getting early feedback will not itself provide the answers.
8. How does XP help to prevent bugs getting into code in the first place? You cannot test quality
into software; you must build it in.
9. It assumes very close contact with end users, right? This is rarer than you might think. And who
co-ordinates and organises and presents the user requirements? Who checks them and makes
sure that they do not invalidate the integrity of the system, current or proposed?
10. All the XP documentation that I have seen seems to set it up as the only way to handle changing
requirements. I refer again to point 5 above.
11. How does XP mitigate risk?
Woodward on XP 2/3
20
12. How can XP handle projects with many man-years of estimated effort? Or many
and complex interfaces?
13. (deleted as redundant)
14. How are the goals of XP different to those of any other method i.e. to produce
software to the customer on time and to budget? Why should XP have different
goals (if they do)? (Possibly redundant SW)
15. Why should XP make it any easier to produce quality products than any other
method? Why should software engineering be easy just because the rules are?
(Possibly redundant SW)
16. What’s the difference between User Stories (XP) and Use Cases + UML? Why
should XP be better in this respect?
17. What is refactoring and how does it produce the most effective architecture? How
does this differ from what we do already?
18. Is XP telling me that programmers can do effective functional testing in pairs or
otherwise? How? What does XP see as the purpose of testing?
19. If the Customers are expected to write User Stories and they do not use some
form of precise language then where is the quality, accuracy, consistency etc
built in? Is this not a recipe for getting all the ambiguities into the code i.e.
hacking?
Woodward on XP 3/3
21
20. Don't bother dividing the project velocity by the length of the iteration or the number
of developers. This number isn't any good to compare two project's productivity
because each project team will have a different bias to estimating stories and tasks,
some estimate high, some estimate low. It doesn't matter in the long run. Tracking
the total amount of work done during each iteration is the key to keeping the project
on an even keel. I agree – you must measure and compare estimates with actuals to
learn!
21. Iterative Development adds agility to the development process. Divide your
development schedule into about a dozen iterations of 1 to 3 weeks in length. Gilb
says 2%. I think this is arbitrary and a natural size develops (environmental factors).
Team size plays a part – see OMAR.
22. Don't schedule your programming tasks in advance. Instead have an iteration
planning meeting at the beginning of each iteration to plan out what will be done. It is
also against the rules to look ahead and try to implement anything that it is not
scheduled for this iteration. There will be plenty of time to implement that
functionality when it becomes the most important story in the release plan. When
you never add functionality early and practice just-in-time planning it is easy to stay
on top of changing user requirements. YUP!
23. What if the real customers cannot be available?
Stuart Woodward comments XP 23
[email protected]
Slide 24
7. Better Programmers
• Programmers do not design
quality into systems
• Designers, engineers,
architects do
• Good Programmers will
correctly program low quality
into a system to meet bad
requirements or design on time
Slide 25
8. Outsourcing
• Outsourcing will not
in itself give you
better software quality
• You have to contract
for it
• You have to specify
the levels you want
• You have to confirm
you got it
Slide 26
Evolutionary Project Management Contract Modifications 1/2
Design idea: designed to work within the scope of present contract with minimum modification. An Evo step is
considered a step on the path to delivering a phase.
You can choose to declare this paragraph has priority over conflicting statements, or to clean up other conflicting
statements.
§30. Evolutionary Result Delivery Management.
30.1 Precedence. This paragraph has precedence over conflicting paragraphs.
30.2 Steps of a Phase. The Society may optionally undertake to specify, accept and pay for evolutionary usable
increments of delivery, of the defined Phase, of any size. These are hereafter called “Steps”.
30.3 Step Size. Step size can vary as needed and desired by the Society, but is assumed to usually be based on a
regular weekly cycle duration.
30.4 Intent. The intent of this evolutionary project management method is that the Society shall gain several
benefits: earlier delivery of prioritised system components, limited risk, ability to improve specification
after gaining experience, incremental learning of use of the new system, better visibility of project progress,
and many other benefits. This method is the best known way to control software projects (now US DoD
Mil Standard 498. 1994).
30.5 Specification Improvement. All specification of requirements and design for a phase will be considered a
framework for planning, not a frozen definition. The Society shall be free to improve upon such
specification in any way that suits their interests, at any time. This includes any extension, change or
retraction of framework specification which the Society needs.
Slide 27
Evolutionary Project Management Contract Modifications 2/2
30.6 Payment for Acceptable Results. Estimates given in proposals are based on initial requirements,
and are for budgeting and planning purposes. Actual payment will be based on successful
acceptable delivery to the Society in Evolutionary Step deliveries, fully under Society Control. The
Society is not obliged to pay for results which do not conform to the Society-agreed Step
Requirements Specification.
30.7 Payment Mechanism. Invoicing will be on a Step basis triggered by end of Step preliminary
(same day) signed acceptance that the Step is apparently as defined in Step Requirements. If
Society experience during the 30 day payment due period demonstrates that there is a breach of
specified Step requirements, and this is not satisfactorily resolved by the Company, then a Stop
Payment signal for that Step can be sent and will be respected until the problem is resolved to meet
specified Step Requirements.
30.8 Invoicing Basis. The documented time and materials will be the basis for invoicing a Step. An
estimate of the Step costs will be made by the Company in advance and form a part of the Step Plan,
approved by the Society.
30.9 Deviation. Deviation plus or minus of up to 100% from Step cost and times estimates will
normally be acceptable (because they are small in absolute terms), as long as the Step
Requirements are met. (The Society prioritises quality above cost). Larger deviations must be
approved by the Society in writing before proceeding with the Step or its invoicing.
30.10 Scope. This project management and payment method can include any aspect of work which the
Company delivers including software, documentation and training, maintenance, testing and any
requested form of assistance.
A Subcontracting Policy
• 1. Specifications are to be made to give both us, and the
suppliers, the highest degree of flexibility (for changes
and unforeseen things) to carry out the real intent of
the contract.
– For example: we shall avoid giving detailed design or feature
lists, when we can control the product or service quality
and performance better by a higher level statement which
forces all necessary detail to happen.
– For example: instead of a list of usability features, we should make
sure we have the measurable testable usability quality
requirements specified.
– If necessary the proposed detail can be a variable attachment
which itself is not mandatory but for guidance.
Policy Quality Control
• All contracts, Requests for proposal and
attached technical specifications will be
Inspected using a rigorous inspection process
against our current specification rules for
contracts or whatever document types we are
using.
• Exit (for signing or reviewing) will be given
when it is measured that there are less than 0.1
major defects/Logical page probably remaining.
Evo Form for quantified stepwise specs of the quality levels you want
Buyer Requirements
Functional Requirements
Benefit/Quality/Performance Requirements
Tag:____________
GIST: __________
SCALE:_____
METER [END STEP ACCEPTANCE TEST] ___
PAST[WHEN?, WHERE?] ___
MUST [when?, where?]____________
PLAN[when?, where?]____________
Tag:____________
AMBITION LEVEL: __________
SCALE:_____
METER [END STEP ACCEPTANCE TEST] ___
PAST[WHEN?, WHERE?] ___
MUST [when?, where?]____________
PLAN[when?, where?]____________
Resource Constraints:
Calendar Time:
Work-Hours:
Qualified People:
Money (Specific Cost Constraints for this step):
Other Constraints
Design Constraints
Legal Constraints
Generic Cost Constraints
Quality Constraints
Assumptions:
Dependencies:
Design:
Technical Design (for Benefit Cost requirements)
Tag:
Description (or pointer to tags defining it):
Expected impacts:
Evidence (for expected level of impacts)
Source (of evidence)
Slide 31
9. Deadline Pressure
• When the
deadline is
clear and holy,
but the quality
is not clear and
not holy
• Deadline will
win
• You will fail to
get the quality
you want
Slide 32
10. Define ‘Quality’ in terms of
Bugs in code
• Do you define
food quality in
terms of bugs per
liter?
• The qualities you and
your stakeholders
want are many and
varied, and bugs are
only one measure, and
not the most
important one.
Slide 33
11. Re-usable software
• One client of
mine invested
on a very large
scale in
reusable
modules
• But when it
came time to
reuse them over
60% of the
modules had far
too many bugs in
them to use at
all.
• What is the
lesson?
Slide 34
Summary of 10+1 Ways to Fail
at Improving Software Quality
• 1. Go for CMM Level X
• 2. Demand Better Testing
• 3. Use Cases
• 4. RUP
• 5. Inspection, Peer reviews, Reviews
• 6. Extreme Programming
• 7. Better Programmers
• 8. Outsourcing
• 9. Deadline Pressure
• 10. Define ‘Quality’ in terms of Bugs in code
• 11. Re-usable software
Slide 35
Ten Better Approaches to
Improve Software Quality
• Better?
– More effective
– More efficient
(effect/cost)
– Better proven
documented track
record available
– More direct attack on
measurable quality
levels themselves
• Improve?
– Quantitative increase
in quality levels
attainable at a given
cost.
– Significant increase
Slide 36
10. Evolutionary Testing
• What is it?
– All quality attributes can
be measured at each Evo
step
– There are many steps
(about 50 steps)
– Delivered quality levels
are compared to numeric
plans
– Tracking is done on an
impact estimation table
– Delivery steps are to real
stakeholders, not just
testers -
• Why is it better?
– Focus is on total
system ( people, data,
platforms, real work)
not code alone
– Early and frequent
measurement
– Opportunity to learn
from small failures
and to prevent big
ones
Philips Evo Pilot May 2001
Frank van Latum, The Manager
[Table: columns # Jobs, Week, [-5%,+10%], [-10%,+20%], [-15%,+30%], out of range.
# Jobs per week: wk 8: 6; wk 9: 11; wk 10: 19; wk 11: 25; wk 12: 25; wk 13: 42; wk 14 to wk 17: 55.
The counts per accuracy range are not legible in this transcript.]
The GxxLine PXX Optimizer EVO team proudly presents the success of the Timing Prediction Improvement EVO steps.
Shown are the results of the test set used to monitor the improvement process.
The size of the test set has grown, as can be seen in the first column. (In the second column the week number is shown.)
We measured the quality of the timing prediction in percentages, in which –5% means that the prediction by the optimizer is 5% too
optimistic.
Excellent quality (–5% to +10%) is given the color green, very good quality is yellow, good quality is orange, & the rest is red.
The results are for the ToXXXz X(i) and EXXX X(i), and are accomplished by thorough analysis of the machines, and appropriate
adaptation of the software.
The GXXline Optimiser Team presented the Word document below to the Business Creation Process review team.
The results were received with great applause. The graphics are based on the timing accuracy scale of
measure that was defined with Jan Verbakel.
Classification: Unclassified
Erieye Project: Inspection Cleanup per Evo Delivery.
Getting all causes of bad quality at early stages
38
The deliveries in the graph below are ordered in time. Observe also that the deliveries differ quite a
lot in size (e.g. numbers 6 and 20 are very small).
[Chart: Corrected Majors per Net Page per Delivery. Vertical axis: 0,00 to 2,50; horizontal axis: Delivery 1 to 30.]
The graph shows the total Major defects/page for all document types for all
inspections in each delivery. The total number of inspections is 994.
Source: Leif Nyberg, Project manager, Ericsson Sweden, in a case study [Personal
Communication to TG]
39
Value delivery in Omar Project
[Chart: OMAR Case delivery value vs Waterfall (1998). Horizontal axis: Project Month, 1 to 22.
Series: Project FF Cumulative Delivered Functionality; Project FF Benefit / Cost;
OMAR Cumulative Delivered Functionality; OMAR Benefit / Cost.]
Source: Using Evolutionary Project Management, To Get More Quality, From Fewer Resources, In Less Time;
By Stuart Woodward, DoubleHelix Software & Services Ltd. © [email protected] 1999
An example of a typical one-week Evo
cycle at the HP Manufacturing Test
Division during a project. [MAY96]
40
Development Team
– Monday: System Test and Release Version N; Decide What to Do for Version N+1; Design Version N+1
– Tuesday: Develop Code
– Wednesday: Develop Code; Meet with users to Discuss Action Taken Regarding Feedback From Version N-1
– Thursday: Complete Code
– Friday: Test and Build Version N+1; Analyze Feedback From Version N and Decide What to Do Next
Users
– Use Version N and Give Feedback
– Meet with developers to Discuss Action Taken Regarding Feedback From Version N–1
Impact Table for Step Management 41
Columns:
– S1 Plan = Step #1 Plan, A: {Design X, Function Y}
– S1 Act. = Step #1 Actual
– S1 Diff. = Step #1 Difference (- is bad, + is good)
– Total 1 = Total Step 1
– S2 Plan = Step #2 Plan, B: {Design Z, Design F}
– S2 Act. = Step #2 Actual
– S2 Diff. = Step #2 Difference
– Total 1+2 = Total Step 1+2
– S3 Plan = Step #3 Next step plan

Requirement                   S1 Plan   S1 Act.  S1 Diff.  Total 1  S2 Plan    S2 Act.    S2 Diff.  Total 1+2  S3 Plan
Reliability 99%-99.9%         50% ±50%  40%      -10%      40%      30% ±20%   20%        -10%      60%        0%
Performance 11 sec.-1 sec.    80% ±40%  40%      -40%      40%      30% ±50%   30%        0%        70%        30%
Usability 30 min.-30 sec.     10% ±20%  12%      +2%       12%      20% ±15%   5%         -15%      17%        83%
Capital Cost 1 mill.          20% ±1%   10%      +10%      10%      5% ±2%     10%        -5%       20%        5%
Engineering Hours 10,000      2% ±1%    4%       -2%       4%       10% ±2.5%  3%         +7%       7%         5%
Calendar Time                 1 week    2 weeks  -1 week   2 weeks  1 week     0.5 weeks  +0.5 wk   2.5 weeks  1 week
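The bookkeeping behind the impact table can be sketched in a few lines of Python. This is only an illustration, not the author's tooling: the names StepResult, difference and cumulative are hypothetical, and the sign convention ("+ is good, - is bad", with costs counted the opposite way to benefits) is inferred from the figures on the slide.

```python
from dataclasses import dataclass


@dataclass
class StepResult:
    """One Evo step for one requirement (hypothetical structure)."""
    planned: float  # planned impact, in % of the goal or of the budget
    actual: float   # measured impact after the step was delivered


def difference(step: StepResult, is_cost: bool) -> float:
    """Plan-versus-actual difference; + is good, - is bad.

    For a benefit (e.g. Reliability) delivering more than planned is good.
    For a cost (e.g. Capital Cost) spending less than planned is good.
    """
    if is_cost:
        return step.planned - step.actual
    return step.actual - step.planned


def cumulative(steps: list[StepResult]) -> float:
    """Running total of actual impact over the steps delivered so far."""
    return sum(step.actual for step in steps)


# Figures taken from the Reliability and Capital Cost rows of the table.
reliability = [StepResult(planned=50, actual=40), StepResult(planned=30, actual=20)]
capital_cost = [StepResult(planned=20, actual=10), StepResult(planned=5, actual=10)]

print(difference(reliability[0], is_cost=False), cumulative(reliability))
print(difference(capital_cost[1], is_cost=True), cumulative(capital_cost))
```

Tracking the difference per step is what lets the next step plan react to a shortfall while it is still small.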
Evo and Requirements, Conceptually
‘Design’ is what delivers performance, and costs resource
Evo development gradually delivers performance, while eating up resources by implementing ‘design’
[Diagram: Design X (done on step 1), Design Y (done on step 2), Design _ (done on step n).
Labels: One or more constraints; Terminal (functions); Resources: Storage, Other;
Performance: Reliability, Usability, Other. Markers 1, 2 and n show the level reached after each step.]
Multiple Test Levels of Microsoft Evo 43
Office 2002 Level
Vital 3rd
6->10 Weeks
Vital 3rd
6->10 Weeks
Reference: Cusumano: Microsoft Secrets. Drawing by TG
See reference [MacCormack2001]
44
Intel View of Industrial Evo cycle
Figure 1. The Evolutionary Product Life Cycle, Ver. 0.6
Courtesy: Erik Simmons, Intel Oregon
Phases: Exploration; Planning; High Level Design (HLD); Iterations and Releases; Maintenance
Project Milestones:
– Development Investment Approval (DIA)
– Iteration Plan Approval (IPA)
– Mid Iteration Review (MIR)
– Iteration Release Approval (IRA)
– Product Discontinuance Approval (PDA)
– Post Project Review (PPR)
Product Documentation:
– Product Overview Proposal (POP)
– Product Iteration Plan (PIP)
– Iteration Requirements
– Iteration Estimate
– Iteration Data Model
– Install Checklist
– Iteration Test Plan
– Product Reference Manual
Slide 45
9. Defect Prevention Process
DPP
• What is DPP?
– CMM Level 5
– Continuous Process learning
– Maybe 2000 small changes
per year (IBM MN)
– Avoiding defect injection (bad
doesn’t happen!)
– 13x more cost effective than
defect removal (Inspection).
– 50% to 95% of all defects can
be prevented
• Why is it better for
Quality?
– It attacks upstream
(requirements, design,
contracts)
– It is completely
general (deals with all
quality aspects, not
just bugs)
For more detail on DPP see Gilb, Software
Inspection, Ch 7 & 17 (by Robert Mays) DPP
Inventor
46
The Bottom Line
The Bottom Line for Process Improvement ...
[Chart, 1987 to 1992, vertical axis 10 to 50: Cost of rework, Appraisal cost and Prevention cost,
with the point marked “Start Improvement Initiative”.]
$15.8 million savings in rework alone
ROI = 770%
Raymond Dion, Process Improvement and the Corporate Balance Sheet,
IEEE Software, July 1993, pp 28-35
47
Reduced Cost of Quality
Philip Crosby’s “Cost Of Quality”
Cost Of Quality = COConformance + CONonConformance
COC = Appraisal + Prevention (cost for doing it right)
CONC = cost of “fix and check fix” (“rework”) (cost of doing it wrong)
[Chart, 1988 to 1995, vertical axis 0% to 50%: COC and CONC per year.
∑ = total Cost of Quality as defined above: ∑65% in 1988, ∑23% in 1995.]
48
Defect Prevention Experiences:
Most defects can be prevented from getting
in there at all
[Chart: % of usual defects prevented (levels marked at 50%, 70%, 80%, 90%) against
years of continuous improvement effort (1 to 6). Labels on the curve:
– Mays & Jones (IBM) 1990
– Mays 1993, User 1996 "72% in 2 years" <-tg
– IBM MN 99.99%+ fixes: Key = "DPP"
– Cleanroom levels: approach zero def.]
IBM Research Triangle Park Networking Laboratory, North Carolina
Half-day Inspection Economics. [email protected]
49
Prevention + Pre-test Detection
is the most effective and efficient
[Chart: cumulative detection (levels marked at 50%, 70%, 80%, 90%, 95%, 100%) against years 1 to 6.
Regions: "Prevented"; "Detected Cheaply" (by Inspection); Test; Use. Labels:
– <- Mays & Jones 50% prevented (IBM) 1990
– <- Mays 1993, 70% prevented
– 70% Detection by Inspection
– by Inspection (state of the art limit)]
• Prevention data based on state of the art prevention experiences (IBM RTP),
Others (Space Shuttle IBM SJ 1-95) 95%+ (99.99% in Fixes)
• Cumulative Inspection detection data based on state of the art Inspection (in an
environment where prevention is also being used, IBM MN, Sema UK, IBM UK)
Half-day Inspection Economics. [email protected]
Slide 50
8. Motivate by Reward for
Quality
• What is motivation
by reward?
– Connecting actual
delivery of specific
quality levels to
some sort of
personal and team
rewards (not
necessarily
money).
• Why is it better?
– We don’t
normally do this
at all
– We reward on
time delivery of
bad qualities
Slide 51
8. Reward Quality
(see the contracts in earlier slides)
• Example:
• Define the quality you
want in ‘Planguage’ [see
refs CE, Posem]
• Maintainability:
– Scale: Average minutes to
find, correct and regression
test for a random bug.
– Meter [Evo Step
Acceptance] at least 10
average bugs and 2
qualified maintainers.
– Plan [Contract, Each Evo
Step] 60 minutes.
• Then stipulate:
• In a sub-supplier contract:
– Payment invoice-able
when all defined quality
levels are proven delivered.
• For in-house team
– Delivery can only be
considered as ‘done’ when
the defined tests prove that
the defined levels of all
qualities due are in fact
delivered
– No quality? You are late!
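As a rough illustration of "no quality, no payment", the Maintainability example can be written as a small acceptance check. This is a sketch under stated assumptions, not Planguage tooling: QualityRequirement and step_is_done are hypothetical names, only the "at least 10 bugs" part of the Meter is modelled (the two qualified maintainers are not), and a lower value on the Scale is taken to be better.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass
class QualityRequirement:
    """A quantified quality requirement (hypothetical structure)."""
    tag: str
    scale: str          # what is measured
    min_samples: int    # from the Meter: how many observations are needed
    plan_level: float   # from the Plan: level that must be reached (lower is better)

    def is_delivered(self, measurements: list[float]) -> bool:
        """True only if enough samples were taken and their average meets the Plan."""
        if len(measurements) < self.min_samples:
            return False  # the Meter was not satisfied, so nothing is proven
        return mean(measurements) <= self.plan_level


def step_is_done(results: list[tuple[QualityRequirement, list[float]]]) -> bool:
    """A step is 'done' (and invoice-able) only when every quality level is proven."""
    return all(req.is_delivered(data) for req, data in results)


maintainability = QualityRequirement(
    tag="Maintainability",
    scale="Average minutes to find, correct and regression test a random bug",
    min_samples=10,
    plan_level=60.0,
)

print(step_is_done([(maintainability, [45.0] * 10)]))  # enough samples, under 60 minutes
print(step_is_done([(maintainability, [75.0] * 10)]))  # enough samples, but too slow
```

The same check serves both cases on the slide: for a sub-supplier it gates the invoice, for an in-house team it gates the word ‘done’.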
Slide 52
7. Entry Level Defect Control:
No Garbage In
• What is it?
– All software engineering
processes (contracting to
coding) will make sure that
the specifications they get
are reasonably ‘good’.
– Good practice is defined by
a set of ‘Rules’ (like Clear,
Complete, Consistent)
– A sample ( 1 or more pages)
of incoming information will
be taken (Inspection)
– A measure of Major Defects
per page will be taken
– A maximum allowed level of defects
will be set
• Why is it better?
– Right now we have a
Major defect level of about
150 ± 100 Major
defects/page against a
simple basic set of rules
– Acceptance levels should
be at less than 1.0
– Average cost of a Major
defect is about 3-10 hours
project time lost
– Current levels of Major
Defects have delayed real
projects by 2 years (Ohio
Case).
Slide 53
7. No Garbage In (continued)
• Policy
– “No software process
shall use input
specifications with
more than one major
defect per page (300
non commentary
words)”
– “exceptions shall be
documented and
approved formally”
• Practice (how to
measure garbage
level)
– 1. Rules agreed (3 go
a long way)
– 2. Sample size set (1
page is fine)
– 3. Processes are
officially redefined to
include this Entry
control
– 4. Time level is set (up
to 30 minutes is fine)
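The entry check itself is simple arithmetic, and a minimal sketch may help. It assumes the policy figures quoted above (a page is 300 non-commentary words, at most one Major per page); the function names majors_per_page and entry_allowed are hypothetical.

```python
def majors_per_page(majors_found: int, words_checked: int,
                    words_per_page: int = 300) -> float:
    """Major defect density of a sample, normalised to 300-word pages."""
    if words_checked <= 0:
        raise ValueError("the sample must contain at least one word")
    return majors_found * words_per_page / words_checked


def entry_allowed(majors_found: int, words_checked: int,
                  max_majors_per_page: float = 1.0) -> bool:
    """Entry control: refuse input specs with more than the allowed density."""
    return majors_per_page(majors_found, words_checked) <= max_majors_per_page


# A one-page sample with 3 Majors is garbage in; 1 Major over two pages is not.
print(entry_allowed(majors_found=3, words_checked=300))
print(entry_allowed(majors_found=1, words_checked=600))
```

Note that this counts only the Majors the checkers actually found in the sample, so it is an optimistic measure of the true density.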
“Rules”:
Best Practice Strong Advice
Introduce the following three rules for inspecting a requirements document:
Three Rules for Requirements:
– 1. Unambiguous to intended Readership
– 2. Clear enough to test.
– 3. No Design specs (= ‘how to be good’) mixed in
with Requirements (= ‘how good to be’)
Report for page 82
(reported inspection results on requirements document, 4 managers)
Total Defects (M+m), Majors, Design (part of Total and M&m)
Manager   M+m   Maj.   Design
1         41    24     D=1
2         33    15     D=5
3         44    30     D=10
4         24    3      D=5
• Team would log unique Majors about ~2x30=60 (2X high score)
• Which is about 30% (roughly one third) of the total, so the total for this page is about ~180 Majors
• If we attempt to fix the 60 we log, and correctly fix 5/6, then ~10 are failed fixes, so:
• The total remaining after inspection and editing = 10+120 = ~130 Majors per page.
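The page 82 estimate can be reproduced with a short calculation. This is a sketch of the arithmetic only: the function name estimate_page is hypothetical, and the inspection effectiveness is assumed to be one third, which is the value that gives the slide's figure of about 180 Majors from 60 logged.

```python
from fractions import Fraction


def estimate_page(individual_majors, team_factor=2,
                  effectiveness=Fraction(1, 3), fix_success=Fraction(5, 6)):
    """Estimate Majors on a page from the individual checkers' counts.

    team_factor:   unique team Majors are about 2x the highest individual count.
    effectiveness: assumed share of all Majors that the team finds (one third).
    fix_success:   share of attempted fixes that are actually correct.
    """
    logged = team_factor * max(individual_majors)   # unique Majors the team logs
    total = logged / effectiveness                  # estimated Majors on the page
    unfound = total - logged                        # never seen, so never fixed
    failed_fixes = logged * (1 - fix_success)       # logged but fixed wrongly
    return {
        "logged": logged,
        "total": total,
        "remaining": unfound + failed_fixes,
    }


# Majors found by the four managers on page 82.
result = estimate_page([24, 15, 30, 3])
print(result["logged"], result["total"], result["remaining"])
```

Exact fractions are used so that the round numbers on the slide come out without floating-point noise.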
Extrapolation to
Total Majors in Whole Document
• Page 81: 120 majors/p (3/4 page checked by 4 other managers)
• Page 82: 180 Majors/p
• Average 150 Majors/physical page x 82 pages = 12,300
Majors in the document.
-----------------
• If a Major has 1/3 chance of causing loss downstream
– 4,100 majors will cause a loss
• And each loss is avg. 10 hours;
– (9.6 hours median at one client for 1,000 majors)
– then total project Rework cost is about 41,000 hours loss.
• (This project was in reality over a year late)
– 1 year = 2,000 hours for 10 people
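The extrapolation above is a chain of multiplications, sketched here for checking. The function name rework_estimate is hypothetical, and the last step assumes that "1 year = 2,000 hours for 10 people" means 2,000 hours per person per year for a team of 10.

```python
from fractions import Fraction


def rework_estimate(majors_per_page, pages, loss_probability, hours_per_loss,
                    hours_per_person_year=2000, team_size=10):
    """Extrapolate sampled defect density to downstream rework cost."""
    total_majors = majors_per_page * pages
    costly_majors = total_majors * loss_probability    # those that cause a loss
    rework_hours = costly_majors * hours_per_loss
    team_years = Fraction(rework_hours) / (hours_per_person_year * team_size)
    return total_majors, costly_majors, rework_hours, team_years


total, costly, hours, years = rework_estimate(
    majors_per_page=150, pages=82,
    loss_probability=Fraction(1, 3), hours_per_loss=10,
)
print(total, costly, hours, float(years))
```

Under that reading the 41,000 hours correspond to roughly two calendar years for the whole team, which is in line with a project that was well over a year late.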
More feedback
• “Love the slides on in-process document review.
We are using this with requirements documents, and
have been able to double the quality of the
documents with only a few hours of effort.”
• Erik Simmons, Intel, Oregon, [email protected]
• January 9th 2002
Slide 58
6. Exit Level Defect Control:
No Garbage Out
• What is Exit about?
– Same as Entry control
except you do quality
control on your own
work
• You check a spec against
your rules for good
specs
• You determine the
defect density (defect
injection rate)
– We can perform checks
using samples
– While the work is in
progress, so we don’t get
surprised at the end
• Why is it better?
– It discovers problems very
early
– It works at all levels of the
development and
maintenance processes
• Not just test and operate
for code
– It can impact all types of
quality (not just ‘bugs’)
– Very inexpensive and fast
(10-30 minutes/check)
Slide 59
The NO ‘G.O.’ Policy
• Policy (kept simple)
– We will not release any
work which has
unacceptable defect
density.
– We will check our work as
it emerges, not just at the
end.
– If bad work is being
produced, we will change
‘whatever it takes’ to avoid
defect injection.
– (=CMM5 DPP)
• Practical
Implementation
– Exit Condition:
• “Maximum 1 Major
defect/300 NC words”
– Sampling Rate:
• Check a page about
every 10 pages
– Checkers: author
and/or one colleague
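A minimal sketch of how the exit condition and sampling rate could be checked. The function names are hypothetical, and picking the first page of every block of ten is an assumption; the slide only says "about every 10 pages".

```python
def major_density_per_300(majors_found, nc_words_checked):
    """Major defects per 300 non-commentary (NC) words in the checked sample."""
    if nc_words_checked <= 0:
        raise ValueError("the sample must contain at least one NC word")
    return majors_found * 300 / nc_words_checked

def may_exit(majors_found, nc_words_checked, max_majors_per_300=1.0):
    """True if the sampled work meets 'Maximum 1 Major defect/300 NC words'."""
    return major_density_per_300(majors_found, nc_words_checked) <= max_majors_per_300

def pages_to_sample(total_pages, interval=10):
    """Assumed sampling plan: the first page of every block of `interval` pages."""
    return list(range(1, total_pages + 1, interval))

print(may_exit(4, 300))       # False: 4 Majors per 300 NC words
print(may_exit(1, 600))       # True: 0.5 Majors per 300 NC words
print(pages_to_sample(45))    # [1, 11, 21, 31, 41]
```

The check uses only the Majors actually logged in the sample; the true density is probably higher, as the entry-control extrapolation earlier in the talk suggests.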
How to Inspect a large
amount of specification or
code!
Sampling for Dummies
“Do a page and then
decide what to do.”
60
61
Sample “During” Authoring 1
[Flowchart: The Author is expected to write about 45 pages. First we write
only 5 of these. Then sample one page with Inspection. Sample: 4 Majors.
Exit? “Good Enough”: Write New Pages. “Too many defects” (No): Re-Write all
5 pages.]
62
Sample “During” Authoring 2
[Flowchart: The Author is expected to write about 45 pages. Then sample one
page with Inspection. Sample: 5 Major. Exit? Yes (“Good Enough”): Exited
Pages; now the Author can write 5 more pages (Write New Pages). No (“Too
many defects”): Re-Write all 5 pages.]
Callout: “I’ve been driving for 2 hours without an accident, so I can now
close my eyes while driving.”
Slide 63
5. Quantify Quality
Requirements
• What does that mean?
– Specify a number, on a
scale of measure
indicating how much
quality you want
– Do this for all types of
quality you want to
manage (reliability,
maintainability, usability)
– Use, for example, ‘Planguage’
[CE, see references at the back
of the slides] as a format.
• How do we do it?
– Identify the critical quality
types (name tag it)
• Availability:
– Define a scale of measure
for them
• Scale: Hours MTBF
– Decide on a good enough
level of quality for the
application
• Plan [First One] 30,000
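As an illustration only, the Availability example could be held in a small data structure and compared with measured levels. The class, its fields, and the Must level of 20,000 are hypothetical; only the tag, the scale and the Plan level of 30,000 come from the slide.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class ScalarRequirement:
    tag: str                        # name tag, e.g. "Availability"
    scale: str                      # scale of measure, e.g. "Hours MTBF"
    plan: float                     # target level
    must: Optional[float] = None    # failure border (hypothetical in this sketch)
    higher_is_better: bool = True

    def _reaches(self, measured: float, level: float) -> bool:
        return measured >= level if self.higher_is_better else measured <= level

    def status(self, measured: float) -> str:
        """Classify a measured level against the Plan and Must levels."""
        if self._reaches(measured, self.plan):
            return "Plan met"
        if self.must is not None and not self._reaches(measured, self.must):
            return "Must failed"
        return "below Plan"

availability = ScalarRequirement("Availability", "Hours MTBF",
                                 plan=30_000, must=20_000)
print(availability.status(32_000))  # Plan met
print(availability.status(25_000))  # below Plan
print(availability.status(15_000))  # Must failed
```

The `higher_is_better` flag covers scales where a smaller number is better, such as minutes to fix a bug.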
Slide 64
5. Quantify Quality 2
• Policy
– All critical quality
requirements will
always be specified
quantitatively
– We will measure the
level of quality
actually delivered
• During development
• At acceptance
• In operation
• Practical
– Train people in
Planguage
– Make specification
templates (next slide)
available
– Make knowledge of
good scales of
measure and practical
meters (tests)
available.
Scalar Requirements Template + <Hints>
65
<name tag of the objective>
Ambition: <give overall real ambition level in 5-20 words>
Type: <quality|objective|constraint>
Stakeholder: { , , } “who can influence your profit, success or failure?”
Scale: <a defined unit of measure, with [parameters] if you like>
Meter [ <for what test level?> ]
==== Scalar Benchmarks ==== the Past
Past [ ] <estimate of past> <- <source>
Record [ <where>, <when record set> <estimate of record level> ] <- <source of record data>
Trend [ <future date>, <where?> ] <prediction of level> <- <source of prediction>
==== Scalar Constraints ==== Fail borders
Limit [ ] <- <source of limit>
Must [ ] <- <source>
==== Scalar Targets ==== the future value and needs
Wish [ ] <- <source of wish>
Plan […] <target level> <- <source>
Stretch [ ] <motivating ambition level> <- <source of level>
Min:
Erieye Project: Usability.Intuitiveness
Requirement (Real Example)
66
Usability. Intuitiveness
Ambition: High probability in % that operator will <immediately> within a specified time from deciding the need
to perform the task (without reference to handbooks or help facility) find a way to accomplish their desired
task.
Scale: Probability that an <intuitive>, TRAINED operator will
• find a way to do whatever they need to do,
• without reference to any written instructions (i.e. on paper or on-line in the system,
• other than help or guidance instructions offered by the system on the screen during
operation of the system)
• within 1 second of deciding that there is a necessity to perform the task.
• <-- MAB "I'm not sure if 1 second is acceptable or realistic, it's just a guess"
Meter: To be defined. Not crucial this 1st draft - TG
Past [GRAPES] ~80% ? <- LN
Record [MAC] 99% ? <- TG
Assumption: we have human operators!
Must [TRAINED, RARETASKS [{<1/week, <1/year}] ] 50 - 90% ? <- MAB
Plan [TASKS DONE [<1/week (but more than 1/Month)]] 99% ? <- LN
[TASKS DONE [<1/year]] 20% ? <- JB
[Turbulence, TASKS DONE [<1/year] ] 10% ? <- TG
Min:
Slide 67
4. Contract Towards Quality
• What does that mean?
– When you contract for
software work, you will
define the work partly by
quantified quality levels
expected.
– This is the same as the
quantified qualities in the
last point, just that we do it
in legal contracts.
• It gets taken more
seriously than mere
requirements!
• Why is it better for
software quality?
– You are more likely to get
the quality levels you want
• At least you shouldn’t
pay if you don’t!
– All aspects of the
development process will
have to find a way to
deliver the contracted
levels.
Slide 68
Symbolic ‘Quality’ Contract
• The Availability will be at
– 99.98%
• The Maintainability will be
– 60 minutes/bug to find, fix and test.
• The Usability will be at
– 30 seconds for average task
familiarization.
Slide 69
3. Reuse Known Quality
• What does that mean?
– The various quality
dimensions of a
reusable software
component are
• known,
• measured,
• predictable,
• quantified,
• documented
• Why is it better for
quality?
– The qualities you get
are ‘by selection’,
rather than ‘by
process’.
– This is a conventional
engineering paradigm
(use known
components with
known attributes)
Slide 70
2. Evolve Towards Quality
• What does that mean?
– It means that your projects
should be divided up into
small (2%) stakeholder-result-delivery increments
• Each one to deliver at
planned quantified levels
• Optionally going initially
for the ‘final quality
levels’ at an initially low
level of functionality
– It means that you have to
prove you know how
• to get your quality levels,
• early and frequently.
• Why is it (Evo) better for
quality?
– You have to prove all
mechanisms; early and
frequently for
• Contracts
• Requirements
• Design
• Reused components
• Development process
• Staff
• Subcontractors
• Stakeholder reactions
Microsoft IE 3.0
Source: MacCormack, Product-Development Practices That Work:
How Internet Companies Build Software
in WINTER 2001 MIT SLOAN MANAGEMENT REVIEW
71
Linux Evolution
Source: MacCormack, Product-Development Practices That Work:
How Internet Companies Build Software
in WINTER 2001 MIT SLOAN MANAGEMENT REVIEW
72
Slide 73
1. Design to Quality
• What does that mean?
– It means we get the qualities
we want by actively
designing/engineering and
architecting
– That means by choosing the
design ideas which predictably
will give us the qualities we
require.
– It means defining all critical
quality dimensions
quantitatively
– It means evaluating all design
options quantitatively in
relation to our quality
requirements levels.
• Why is it better for
software quality?
– Because your design
process is then
• focused on the qualities
you want and
• on the designs which
will give those qualities.
– Because this is the
historically proven way to
get quality in engineering
and architectural
disciplines
– Because current so-called
‘software engineering’ (for
example CMM, RUP) does not
even have this ‘design’
idea on the agenda!
Design process example:
An example of considering two alternatives, based on
their impacts on qualities, their cost, and their risk.
74
(Impact Estimation tool [CE, PoSEM])

                                 Step Candidate A:          Step Candidate B:
                                 {Design-X, Function-Y}     {Design Z, Design F}
Reliability 99%-99.9%            50% ±50%                   100% ±20%
Performance 11 sec.-1 sec.       80% ±40%                   30% ±50%
Usability 30 min.-30 sec.        -10% ±20%                  20% ±15%
Capital Cost 1 mill.             20% ±1%                    5% ±2%
Engineering Hours 10,000         2% ±1%                     10% ±2.5%
Worst Case B/C ratio             (0+40-30)/(21+3) = 0.42    (80-20+5)/(7+12.5) = 3.33
“Worst Worst” case considering   0.8 x 0.42 = 0.33          0.2 x 3.33 = 0.67
estimate credibility factor
Credibility                      0.8 (High)                 0.2 (Low)

See slide note for explanation
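The worst-case ratios quoted on this slide can be reproduced with a few lines. This is a sketch under stated assumptions: the function and variable names are hypothetical, and the pairing of each estimate with its quality or cost row is inferred from the slide's own worst-case formulas.

```python
def worst_case_ratio(benefits, costs):
    """Worst-case benefit/cost: lowest plausible impacts over highest plausible costs."""
    worst_benefit = sum(estimate - uncertainty for estimate, uncertainty in benefits)
    worst_cost = sum(estimate + uncertainty for estimate, uncertainty in costs)
    return worst_benefit / worst_cost

# (estimate %, ± uncertainty %) for Reliability, Performance, Usability,
# then for Capital Cost and Engineering Hours.
candidates = {
    "A": {"benefits": [(50, 50), (80, 40), (-10, 20)],
          "costs": [(20, 1), (2, 1)], "credibility": 0.8},
    "B": {"benefits": [(100, 20), (30, 50), (20, 15)],
          "costs": [(5, 2), (10, 2.5)], "credibility": 0.2},
}

results = {}
for name, c in candidates.items():
    ratio = worst_case_ratio(c["benefits"], c["costs"])
    results[name] = (ratio, ratio * c["credibility"])   # "worst worst" case
    print(f"{name} {ratio:.2f} {ratio * c['credibility']:.2f}")
# A 0.42 0.33
# B 3.33 0.67
```

Candidate B still comes out ahead after the credibility adjustment, although its low credibility removes most of its apparent advantage.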
The Head:Body Model of Evo.
Architecture-level design
combined with step level design.
75
[Figure: two levels. "Head": Project Architecture and Management Level
(Requirements and Architecture; Plan/Study/Act). "Body" or “micro-project”:
a Step, labelled PLAN (Requirements, Design), DO (Quality Control,
Construction/Acquisition, Testing, Integration) and Study (Delivery ->
Stakeholder, Measure & Study Results).]
Slide 76
Some Better Ways to Get Software Quality
you might like to learn more about.
• 10. Evolutionary Testing
• 9. Defect Prevention Process
• 8. Motivate by Reward for Quality
• 7. Entry Level Defect Control: No Garbage In
• 6. Exit Level Defect Control: No Garbage Out
• 5. Quantify Quality Requirements
• 4. Contract Towards Quality
• 3. Reuse Known Quality
• 2. Evolve Towards Quality
• 1. Design to Quality
End of Talk!
Next slides are for extra detail later.
Slide 78
A Use Case Critique Summary
By Don Mills [Mills01]
• This Appendix lists the “problems with use cases” that I found in my brief,
and unscientific, survey of “the literature” (a mixture of books on my and my
employer’s shelves, with articles found by browsing the Internet). The first
eight entries come from the UI Design.net editorial for October 1999
(http://www.uidesign.net/1999/imho/oct_imho.html).
• Solutions to all of the problems exist, but not within the RUP or the UML (or
only clumsily, ambiguously, or inconsistently), while outside those strictures
many competing solutions have been proposed.
• Note that this is not intended as an exhaustive list ...
Slide 79
Use Cases ? 1
• [The precise role of use cases is defined in The UML User Guide to be the
description of a set of actions performed by a system to deliver value to a
user: that is, system process design (at the user interface level).]
Understanding the problem -- the business and its rules -- must happen first.
Defining business process, system operating procedures or lines of
communication is secondary. Use Cases lead to definition of procedures
without proper understanding of the problem domain.
• Developing Use Cases with a User Group or Business Analyst group leads to
premature interaction design by unskilled practitioners.
• It’s hard to determine the completeness of Use Cases because of their “single
path” nature. This can lead to developers using their imagination to
complete exception handling cases or rarely taken paths. This can quickly
ruin a good Interaction Design.
• Use Cases do not lend themselves to OO development due to their nature as
procedural descriptions of functional decomposition.
Slide 80
Use Cases ? 2
• The User Group defining them are required to second guess the future system operation. They find
this difficult or even impossible. This leads to new systems which don’t make an adequate improvement
in operations procedures and can miss the opportunity to simplify a process and remove unnecessary
people.
• Use Cases because of their procedural nature lend themselves to action-object User Interface designs. If
you need or want to have an object-action UI Design (aka OOUI) then Use Cases are a poor foundation.
• Use Cases can end up as the repository for the whole requirements. Everything goes into the Use Cases
and the Business Analyst group will claim, “the design is done already, now write the code”. This is very
very bad for Interaction Design.
• Use Cases are poor input for Object Modeling. They can lead to poor definition of classes from noun
extraction as you may otherwise be hoping to eliminate some of the domain terms used within the object
model.
• The UML Specification is so non-specific and lacking in obligatory integrity checking that it is easy to
produce fragmentary, inconsistent, ambiguous use cases while still following an arguably correct
interpretation of all of the UML’s requirements. Cockburn identified 18 different definitions of Use
Cases, yielding over 24 different combinations of Use Case semantics.
Slide 81
Use Cases ? 3
• Use cases do not require backward or forward traceability of requirements.
• Standard UML specifications of use cases, together with descriptions in the Rational Object
Technology Series of publications, lack a number of important testability elements, such as
domain definitions for input and output variables, testable specifications of input-output
relationships, and sequential and interactional constraints and dependencies between use
cases.
• Use cases, by definition in the UML Specification, emphasise ordering (“sequences of
messages exchanged ... [and] actions performed by the system”, V1.3). Physical sequence of
operations is normally a process restriction, not a true requirement, and when truly required
can be defined more abstractly by preconditions. Early emphasis on ordering is among the
worst mistakes an O-O project can make, but is hard to avoid if use cases are relied on for
analysis, since the UML Specification provides no standard way of expressing the common
situation of optional or flexible sequences of action.
• Because the UML can neither express structure between use cases nor a structural hierarchy
of use cases in an easy and straightforward way, use cases are developed as an
“uncoordinated sprawl” of (by definition) discrete and unrelated functions. This creates a
loose collection of separate partial models, addressing narrow areas of the system
requirements, and presenting problems of relating these partial models and keeping them
consistent with each other.
Slide 82
Use Cases ? 4
• The UML Specification provides no clear semantics of what a use case really
is (“representing a coherent unit of functionality” — but representing in what
way(s)?), and no consistent guidelines on how it should be described. This
“flexibility” may be seen as a good thing, but as the scale of design problems
rises, with larger design teams and more and more use cases, the sort of
“studied sloppiness” that can be beneficial for rapid design of modest
problems begins to become a stumbling block.
• The UML Specification requires a use case to “represent” “actions performed
by the system”, but (despite a popular interpretation) does not restrict these
to externally visible actions. It is not clear what kind of events we should
concentrate on while describing use cases: external-stimuli and responses
only, or internal system activities as well.
• Use cases may not overlap, occur simultaneously, or influence one another,
although actual uses of a computer system may do all of these.
• The level of abstraction of use cases, and their length, are a matter of
arbitrary choice — “just enough detail, but not too much”. The only level of
detail that is “enough” is a level that removes all ambiguity.
Slide 83
Use Cases ? 5
• Furthermore, no modularisation concepts are given to manage large use case models. The
include and extend concepts are presented as a means to provide extensibility, but no rigorous
semantics are provided for these concepts, allowing for multiple disparate interpretations and
uses.
• Use cases in general are descriptions of specific business processes from the perspective of a
particular actor. As such they do not give a clear picture of the overall business context and
imperatives that actually generate the requirements for these business processes. This means
that they can be quite incomprehensible to non-domain experts.
• For the same reasons, the important business requirements and imperatives underlying the use
case model become invisible when taken out of business context and expressed in discrete use
cases. Subsequent readers of the use case model may be quite unable to explain the forces
and business requirements that shaped the model.
• Developing Use Cases with a User Group or Business Analyst group leads to a focus on how
users see the system’s operation. But the system doesn’t exist yet. (A previous system might
exist, but if it were fully satisfactory you would not be asked to change or rewrite it.) So the
system picture that use cases will present is based on existing processes, computerised or not.
The system builder’s task is to come up with new, better scenarios, not to perpetuate
antiquated modes of operation.
Slide 84
Use Cases ? 6 of 6 slides
• A UML use case model can’t specify interaction
requirements where the system initiates an interaction
between the system and an external actor.
• Because the UML Specification forbids interactions
between actors, use cases cannot model a rich system
context involving such interactions.
• The UML requires use cases to be independent of one
another, which means that it offers no way to model
persistent state across use cases, or to identify how the
initial system state required by a use case (specified in
Pre-conditions) is to be achieved.
Slide 85
References 1
• RPL: www.result-planning.com (Gilb site)
– Requirements Slides
– Evo method slides
– Inspection slides and papers
– Planguage Glossary (part of CE book)
• CE: Competitive Engineering book by Tom Gilb
– Forthcoming 2002 Addison Wesley
– A systems engineering and software engineering handbook, based on
Planguage. (parts at www.result-planning.com)
• Inspection:
– GG: Gilb and Graham: “Software Inspection” (1993)
– RR: Ronald A. Radice: “High Quality Low Cost Software Inspections”
2002, Paradoxicon Publishing, Andover MA, USA
• PoSEM: Gilb: Principles of Software Engineering Management
– (1988, Addison Wesley)
Slide 86
References 2
• RUPSE: Rational Unified Process for Systems Engineering
– RUP SE 1.0
– A Rational Software White Paper (possibly available via www.rational.com?)
– TP 165, 8/01
– This paper attempts to tackle the problem of system architecture for multiple quantified quality
requirements. TG
– It fails in that it is not dealing with multiple quality requirements simultaneously, and is not doing
much more than arm waving. It does not do what I would call a good job of quantifying quality. It
does not do a good job of what I would consider showing the relation between a design and
multiple qualities and costs. But it is the best attempt to recognize the need and the problem to
come out of Rational so far. TG
• Mills01: “What’s the Use of a Use Case?”
– Don Mills
– Copyright © Software Education Associates Ltd, Wellington, New Zealand, 2001
– Should be available at www.softed.com
• [MacCormack2001] Evo in MIT Sloan Review Winter 2001
– Product-Development Practices That Work: How Internet Companies
Build Software
Slides added after printed
documentation made for
conference
Kent Beck eXtreme Programming
88
(QUOTED WITH PERMISSION)
On 18/01/02 14:25, "Kent Beck" <[email protected]> wrote:
> I think you are conflating two concepts--how you create a process and how
> you create a community to use the process.
>
> I was quite "scientific" in my creation of XP. First I read voraciously and
> asked lots of questions about a topic. Then I experimented with a technique
> myself, generally to extremes so I understood the range of possible
> behavior. Whatever worked best for me I taught to a few people I trusted. If
> they reported good results I taught it to people I didn't know. Only if they
> reported good results would I begin recommending the practice in speeches
> and in print. I tried combinations of practices (not exhaustively, but I
> tried to be aware of interactions when they occurred).
>
> I put "scientific" in quotes above, because it isn't science like physics is
> science, but it is science as described by Sir Francis Bacon, and as
> contrasted to Aristotelian pure reasoning. My notebooks certainly wouldn't
> survive review by a physical scientist. But we aren't in the physical
> science business.
>
> Now I had some tested ideas, and I was ready to see them implemented on a
> large scale (we can get into motivation later). Given my resources, viral
> marketing driven by storytelling was the only option.
>
> Does that answer your question?
CMM Level 3 Results
89
Slide 90
This is the last slide of the set of
slides!