Transcript Software
Slide 1 Ten Quality Methods which probably won't improve product quality; and ten quality methods that more probably will succeed - for all aspects of quality
Tom Gilb, www.Gilb.com, [email protected]
Result Planning Limited
MASTER Version, May 8th 2002

Slide 2 "won't improve" (means, in this talk)
• You do not end up getting 'as good as you expected' when you invested in the method.
• You would not have used the method, if that were the result.
• ALTERNATIVE TALK TITLE:
– "Ten Quality Methods – which probably won't improve product quality,
– and ten quality methods that more probably will succeed
– for all aspects of quality - not just bugs."

Slide 3 Software
• All elements of the software (not just code):
– The code
– The updates
– The data and databases
– The online user instruction
– The interfaces
– The user manuals
– The training materials
– The development and maintenance documentation
– The test planning
– The test scripts
– Anything else which is not clearly hardware

Slide 4 Quality
• All stakeholder-valued aspects of system performance, including quality and savings:
– Speed
– Capacity
– Adaptability
– Maintainability
– Availability
– Reliability
– Portability
– Reusability
– Testability
– Usability
– And very many more

Slide 5
Here are some popular methods or approaches which people expect some software quality from, but I suggest they will in practice be disappointed
- often because of poor teaching and implementation
- often because of lack of quality focus

Slide 6 1. Go for CMM Level X
• The Software Engineering Institute's Capability Maturity Models
• CMM and CMMI
• Levels 2 to 5
• Results of Level 3
• Why Not?
– Not "quality" oriented
– CMM bureaucracy overwhelms any idea of quality
– Intended mainly to put reasonable software engineering processes in place, but
– Does not directly address any quality aspect of a system
• Maybe you can get quality in spite of CMM – but not because of it.

Slide 7 2. Demand 'Better' (Conventional) Testing
• Conventional software testing is not normally directed towards product or system quality levels
– It looks for bugs (to oversimplify quite a bit!)
• Conventional testing is 'function' oriented (not quality-oriented)
– It does not measure multiple quality type levels
• Conventional testing is too late in the development cycle
– You get quality by designing it in, not testing it in!
• Test can prove presence of bugs/defects but cannot prove their absence
• (Note: I will suggest Evolutionary Testing as a way of improving software quality later. Evo testing is not conventional, yet!)

Slide 8 3. Use Cases
• Use cases are not directed to qualities of a system
• Use cases cannot express quality requirements
• Use cases are not judged on the degrees of quality they deliver to an architecture
• There is no evidence published about the relationship between use cases and any sort of quality
• I'd be happy to be informed of evidence I have overlooked!

The list of problems with Use Cases and UML
I have no intention of going through this in detail during my talk, but I wanted to make the details available to the participants - to lend more credibility to my point. The details are at the end of these slides.

Slide 10 A Use Case Critique Summary By Don Mills [Mills01]
• This Appendix lists the "problems with use cases" that I found in my brief, and unscientific, survey of "the literature" (a mixture of books on my and my employer's shelves, with articles found by browsing the Internet).
The first eight entries come from the UI Design.net editorial for October 1999 (http://www.uidesign.net/1999/imho/oct_imho.html).
• Solutions to all of the problems exist, but not within the RUP or the UML (or only clumsily, ambiguously, or inconsistently), while outside those strictures many competing solutions have been proposed.
• Note that this is not intended as an exhaustive list ...
• DETAILS AT END OF THESE SLIDES.

Slide 11 4. RUP, RUP SE
• "System Quality – Provides the views to support addressing system quality issues in an architecture driven process" [RUP SE]
• "In RUP SE, this idea is carried forward, adding systems engineers to the mix. Their area of concern is the design and specification of the hardware and system deployment to ensure that the overall system requirements are addressed." [RUP]
• Rational Unified Process never did address quality.
• RUP SE (Systems Engineering) is a belated, but weak (TG opinion), attempt to patch that hole in RUP.

Slide 12 RUP SE Example of 'dealing with quality' [RUP]

Slide 13 5. Conventional Inspection, Peer reviews, Reviews
• Reviews do not generally focus on quality.
• Specific reviews may attempt to address quality. But in my view not professionally (quantified!).
• Conventional Inspections as they are usually done
– will fail to deal with quality in general,
– and will be very cost-ineffective for quality in terms of bugs
• Why are conventional Inspections a failure route?
– They focus on clean-up of bad work (high bug injection rates)
– Their effectiveness for bugs is maximum 60% (one pass)
– They are rarely done at full effect (likely effect 10%-30%)

Slide 14 5. (continued) Inspections, to deal with quality, must:
• Deal with all aspects of quality engineering, including quality requirements and quality design
• Define required quality practices in terms of process 'Rules' (failed rule = defect, detected by Inspection). Like:
– "All quality requirements will be defined with a scale of measure"
– "All design specifications will be evaluated quantitatively on an impact estimation table"

Slide 15 6. Extreme Programming XP
• XP has no direct focus on quality
• But there are several mechanisms in XP which can help reduce injection of bugs
• That does not deal with many other types of quality
• XP can't hurt you, but it does not pretend to solve the larger quality attribute problem
• (The slide links to the XP development method.)

Slide 16 Kent Beck, XP

Slide 17 XP Pair Programming (IEEE Software, July/Aug 2000)
As Beck writes, "Even if you weren't more productive, you would still want to pair, because the resulting code quality is so much higher." By working in tandem, the pairs completed their assignments 40% to 50% faster.

Slide 18 Different View (12 March 2002)
dear tom,
browsing through your presentation "10 guaranteed ways ..." that i did not have the opportunity to listen to, i noticed that you also have a slide concerning the XP practice of pair programming. you might be interested in a new study on pair programming to be found at http://dialspace.dial.pipex.com/town/drive/gcd54/conference2001/papers/nawrocki.pdf . the study is essentially contradicting earlier findings by laurie williams. i actually set up a paper "Extreme Programming Considered Harmful for Reliable Software Development" that you can find at http://www.avoca-vsm.com/Dateien-Download/ExtremeProgramming.pdf and you might want to have a look at it.
regards, gerold keefer
=====================================================================
AVOCA GmbH - Advanced Visioning of Components and Architectures
Kronenstrasse 19, D-70173 Stuttgart
fo +49 711 2271374, fa +49 711 2271375
http://www.avoca-vsm.com
mailto:[email protected]

Slide 19 Woodward asks about XP 1/3
Questions on XP from [email protected], in response to http://www.extremeprogramming.org
1. How do you manage required changes in Software Architecture? Not all programmers are architects and not all architects are programmers, so who does the work and what do the programmers do while the architecture is changed?
2. It seems to assume that all team members are equally experienced and skilled, i.e. can make changes to the system with equal levels of confidence and competence. Otherwise, who is responsible for the integrity of the system, data models etc.?
3. Who specifies the requirements? How are they specified? Or do the programmers have free rein to interpret often-fuzzy statements by the users however they want to?
4. What does the Project Manager do?
5. Why is XP different to what is known as RAD? Or DSDM? Or Evo? Or RUP?
6. XP promotes good practice, right? So where is the Process?
7. How does a system programmed via XP allow changing requirements to be implemented more easily than in other methods? Getting early feedback will not itself provide the answers.
8. How does XP help to prevent bugs getting into code in the first place? You cannot test quality into software; you must build it in.
9. It assumes very close contact with end users, right? This is rarer than you might think. And who co-ordinates and organises and presents the user requirements? Who checks them and makes sure that they do not invalidate the integrity of the system, current or proposed?
10. All the XP documentation that I have seen seems to set it up as the only way to handle changing requirements. I refer again to point 5 above.
11. How does XP mitigate risk?

Slide 20 Woodward on XP 2/3
12. How can XP handle projects with many man-years of estimated effort? Or many and complex interfaces?
13. (deleted as redundant)
14. How are the goals of XP different to those of any other method, i.e. to produce software to the customer on time and to budget? Why should XP have different goals (if they do)? (Possibly redundant SW)
15. Why should XP make it any easier to produce quality products than any other method? Why should software engineering be easy just because the rules are? (Possibly redundant SW)
16. What's the difference between User Stories (XP) and Use Cases + UML? Why should XP be better in this respect?
17. What is refactoring and how does it produce the most effective architecture? How does this differ to what we do already?
18. Is XP telling me that programmers can do effective functional testing in pairs or otherwise? How? What does XP see as the purpose of testing?
19. If the Customers are expected to write User Stories and they do not use some form of precise language, then where is the quality, accuracy, consistency etc. built in? Is this not a recipe for getting all the ambiguities into the code, i.e. hacking?

Slides 21-22 Woodward on XP 3/3
20. "Don't bother dividing the project velocity by the length of the iteration or the number of developers. This number isn't any good to compare two project's productivity because each project team will have a different bias to estimating stories and tasks, some estimate high, some estimate low. It doesn't matter in the long run. Tracking the total amount of work done during each iteration is the key to keeping the project on an even keel." I agree – you must measure and compare estimates with actuals to learn!
21. "Iterative Development adds agility to the development process. Divide your development schedule into about a dozen iterations of 1 to 3 weeks in length." Gilb says 2%. I think this is arbitrary and a natural size develops (environmental factors). Team size plays a part – see OMAR.
22. "Don't schedule your programming tasks in advance. Instead have an iteration planning meeting at the beginning of each iteration to plan out what will be done. It is also against the rules to look ahead and try to implement anything that is not scheduled for this iteration. There will be plenty of time to implement that functionality when it becomes the most important story in the release plan. When you never add functionality early and practice just-in-time planning it is easy to stay on top of changing user requirements." YUP!
23. What if the real customers cannot be available?

Slide 23 Stuart Woodward comments XP
[email protected]

Slide 24 7. Better Programmers
• Programmers do not design quality into systems
• Designers, engineers, architects do
• Good programmers will correctly program low quality into a system to meet bad requirements or design on time

Slide 25 8. Outsourcing
• Outsourcing will not in itself give you better software quality
• You have to contract for it
• You have to specify the levels you want
• You have to confirm you got it

Slide 26 Evolutionary Project Management Contract Modifications 1/2
Design idea: designed to work within the scope of the present contract with minimum modification. An Evo step is considered a step on the path to delivering a phase. You can choose to declare this paragraph has priority over conflicting statements, or to clean up other conflicting statements.
§30. Evolutionary Result Delivery Management.
30.1 Precedence. This paragraph has precedence over conflicting paragraphs.
30.2 Steps of a Phase.
The Society may optionally undertake to specify, accept and pay for evolutionary usable increments of delivery, of the defined Phase, of any size. These are hereafter called "Steps".
30.3 Step Size. Step size can vary as needed and desired by the Society, but is assumed to usually be based on a regular weekly cycle duration.
30.4 Intent. The intent of this evolutionary project management method is that the Society shall gain several benefits: earlier delivery of prioritised system components, limited risk, ability to improve specification after gaining experience, incremental learning of use of the new system, better visibility of project progress, and many other benefits. This method is the best known way to control software projects (now US DoD Mil-Std 498, 1994).
30.5 Specification Improvement. All specification of requirements and design for a phase will be considered a framework for planning, not a frozen definition. The Society shall be free to improve upon such specification in any way that suits their interests, at any time. This includes any extension, change or retraction of framework specification which the Society needs.

Slide 27 Evolutionary Project Management Contract Modifications 2/2
30.6 Payment for Acceptable Results. Estimates given in proposals are based on initial requirements, and are for budgeting and planning purposes. Actual payment will be based on successful acceptable delivery to the Society in Evolutionary Step deliveries, fully under Society control. The Society is not obliged to pay for results which do not conform to the Society-agreed Step Requirements Specification.
30.7 Payment Mechanism. Invoicing will be on a Step basis, triggered by end-of-Step preliminary (same day) signed acceptance that the Step is apparently as defined in the Step Requirements. If Society experience during the 30-day payment due period demonstrates that there is a breach of specified Step requirements, and this is not satisfactorily resolved by the Company, then a Stop Payment signal for that Step can be sent and will be respected until the problem is resolved to meet specified Step Requirements.
30.8 Invoicing Basis. The documented time and materials will be the basis for invoicing a Step. An estimate of the Step costs will be made by the Company in advance and form a part of the Step Plan, approved by the Society.
30.9 Deviation. Deviation plus or minus of up to 100% from Step cost and time estimates will normally be acceptable (because they are small in absolute terms), as long as the Step Requirements are met. (The Society prioritises quality above cost.) Larger deviations must be approved by the Society in writing before proceeding with the Step or its invoicing.
30.10 Scope. This project management and payment method can include any aspect of work which the Company delivers, including software, documentation and training, maintenance, testing and any requested form of assistance.

A Subcontracting Policy
• 1. Specifications are to be made to give both us, and the suppliers, the highest degree of flexibility (for changes and unforeseen things) to carry out the real intent of the contract.
– For example: we shall avoid giving detailed design or feature lists, when we can control the product or service quality and performance better by a higher-level statement which forces all necessary detail to happen.
– For example: instead of a list of usability features, we should make sure we have the measurable, testable usability quality requirements specified.
– If necessary the proposed detail can be a variable attachment which itself is not mandatory, but for guidance.

Policy Quality Control
• All contracts, requests for proposal and attached technical specifications will be Inspected, using a rigorous inspection process, against our current specification rules for contracts or whatever document types we are using.
• Exit (for signing or reviewing) will be given when it is measured that there are less than 0.1 major defects/logical page probably remaining.

Evo Form for quantified stepwise specs of the quality levels you want
Buyer Requirements
Functional Requirements / Benefit/Quality/Performance Requirements:
Tag: ____  GIST: ____  SCALE: ____  METER [END STEP ACCEPTANCE TEST]: ____  PAST [when?, where?]: ____  MUST [when?, where?]: ____  PLAN [when?, where?]: ____
Tag: ____  AMBITION LEVEL: ____  SCALE: ____  METER [END STEP ACCEPTANCE TEST]: ____  PAST [when?, where?]: ____  MUST [when?, where?]: ____  PLAN [when?, where?]: ____
Resource Constraints: Calendar Time: / Work-Hours: / Qualified People: / Money (specific cost constraints for this step):
Other Constraints: Design Constraints / Legal Constraints / Generic Cost Constraints / Quality Constraints
Assumptions:
Dependencies:
Design: Technical Design (for Benefit/Cost requirements)
Tag:
Description (or pointer to tags defining it):
Expected impacts:
Evidence (for expected level of impacts):
Source (of evidence):

Slide 31 9. Deadline Pressure
• When the deadline is clear and holy, but the quality is not clear and not holy:
• Deadline will win
• You will fail to get the quality you want

Slide 32 10. Define 'Quality' in terms of Bugs in code
• Do you define food quality in terms of bugs per liter?
• The qualities you and your stakeholders want are many and varied, and bugs is only one measure, and not the most important one.

Slide 33 11. Re-usable software
• One client of mine invested on a very large scale in reusable modules
• But when it came time to reuse them, over 60% of the modules had far too many bugs in them to use at all.
• What is the lesson?

Slide 34 Summary of 10+1 Ways to Fail at Improving Software Quality
• 1. Go for CMM Level X
• 2. Demand Better Testing
• 3. Use Cases
• 4. RUP
• 5. Inspection, Peer reviews, Reviews
• 6. Extreme Programming
• 7. Better Programmers
• 8. Outsourcing
• 9. Deadline Pressure
• 10. Define 'Quality' in terms of Bugs in code
• 11. Re-usable software

Slide 35 Ten Better Approaches to Improve Software Quality
• Better?
– More effective
– More efficient (effect/cost)
– Better proven, documented track record available
– More direct attack on measurable quality levels themselves
• Improve?
– Quantitative increase in quality levels attainable at a given cost.
– Significant increase

Slide 36 10. Evolutionary Testing
• What is it?
– All quality attributes can be measured at each Evo step
– There are many steps (about 50 steps)
– Delivered quality levels are compared to numeric plans
– Tracking is done on an impact estimation table
– Delivery steps are to real stakeholders, not just testers
• Why is it better?
– Focus is on total system (people, data, platforms, real work), not code alone
– Early and frequent measurement
– Opportunity to learn from small failures and to prevent big ones

Slide 37 Philips Evo Pilot, May 2001 (Frank van Latum, the manager)
[Table omitted: number of jobs per weekly Evo step (weeks 8 to 17, growing from 6 to 55 jobs), with each job's timing-prediction quality classified into the bands [-5%,+10%], [-10%,+20%], [-15%,+30%] and out of range.]
The GxxLine PXX Optimizer EVO team proudly presents the success of the Timing Prediction Improvement EVO steps. Shown are the results of the test set used to monitor the improvement process. The size of the test set has grown, as can be seen in the first column. (In the second column the week number is shown.) We measured the quality of the timing prediction in percentages, in which -5% means that the prediction by the optimizer is 5% too optimistic. Excellent quality (-5% to +10%) is given the color green, very good quality is yellow, good quality is orange, and the rest is red. The results are for the ToXXXz X(i) and EXXX X(i), and are accomplished by thorough analysis of the machines, and appropriate adaptation of the software. The GXXline Optimiser Team presented the Word document below to the Business Creation Process review team. The results were received with great applause. The graphics are based on the timing accuracy scale of measure that was defined with Jan Verbakel.

Slide 38 Erieye Project: Inspection Cleanup per Evo Delivery. Getting all causes of bad quality at early stages (Classification: Unclassified)
The deliveries in the graph below are ordered in time. Observe also that the deliveries differ quite a lot in size (e.g. numbers 6 and 20 are very small).
[Graph omitted: corrected Majors per net page for each of 30 deliveries, ranging from about 2.5 down towards 0.]
The graph shows the total Major defects/page for all document types for all inspections in each delivery. The total number of inspections is 994. Source: Leif Nyberg, Project Manager, Ericsson Sweden, in a case study [Personal Communication to TG]

Slide 39 Value delivery in OMAR Project
[Graph omitted: OMAR case delivery value vs. Waterfall (1998); cumulative delivered functionality and benefit/cost per project month (1 to 22) for Project FF and OMAR.]
Using Evolutionary Project Management, To Get More Quality, From Fewer Resources, In Less Time; © [email protected] 1999. By Stuart Woodward, DoubleHelix Software & Services Ltd.

Slide 40 An example of a typical one-week Evo cycle at the HP Manufacturing Test Division during a project. [MAY96]
Development team (Monday through Friday): system test and release Version N; decide what to do for Version N+1; design Version N+1; develop code; meet with users to discuss action taken regarding feedback from Version N-1; complete code, test and build Version N+1; analyze feedback from Version N and decide what to do next.
Users: use Version N and give feedback; meet with developers to discuss action taken regarding feedback from Version N-1.

Slide 41 Impact Table for Step Management (- is bad, + is good)

                            Reliability  Performance    Usability       Capital Cost  Eng. Hours  Calendar
                            99%->99.9%   11 sec->1 sec  30 min->30 sec  1 mill.       10,000      Time
Step #1 Plan A
  {Design-X, Function-Y}    50% ±50%     80% ±40%       10% ±20%        20% ±1%       2% ±1%      1 week
Step #1 Actual              40%          40%            12%             10%           4%          2 weeks
Step #1 Difference          -10%         -40%           +2%             +10%          -2%         -1 week
Total Step 1                40%          40%            12%             10%           4%          2 weeks
Step #2 Plan B
  {Design-Z, Design-F}      30% ±20%     30% ±50%       20% ±15%        5% ±2%        10% ±2.5%   1 week
Step #2 Actual              20%          30%            5%              10%           3%          0.5 weeks
Step #2 Difference          -10%         0%             -15%            -5%           +7%         +0.5 wk
Total Step 1+2              60%          70%            17%             20%           7%          2.5 weeks
Step #3 Next step plan      0%           30%            83%             5%            5%          1 week
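To make the bookkeeping in the table above concrete, here is a minimal sketch (my illustration, not a tool from the talk) of how planned versus actual impacts per Evo step can be tracked and accumulated. The attribute names and numbers are taken from Slide 41; only the three quality columns are shown, and the function itself is just an illustration of the idea, not a prescribed tool.

```python
# Minimal sketch (assumption: impacts are expressed as "% of the requirement
# delivered", as in the Slide 41 table). Compares plan vs. actual per step and
# accumulates the cumulative % delivered for each quality attribute.

def track_steps(plans, actuals):
    totals = {attr: 0.0 for attr in plans[0]}
    for step, (plan, actual) in enumerate(zip(plans, actuals), start=1):
        for attr in plan:
            diff = actual[attr] - plan[attr]      # negative = worse than planned
            totals[attr] += actual[attr]          # cumulative % toward the target
            print(f"Step {step} {attr}: plan {plan[attr]}%, actual {actual[attr]}%, "
                  f"diff {diff:+}%, total {totals[attr]:.0f}%")
    return totals

# Step #1 Plan A and Step #2 Plan B figures from Slide 41 (quality columns only)
plans   = [{"Reliability": 50, "Performance": 80, "Usability": 10},
           {"Reliability": 30, "Performance": 30, "Usability": 20}]
actuals = [{"Reliability": 40, "Performance": 40, "Usability": 12},
           {"Reliability": 20, "Performance": 30, "Usability": 5}]

track_steps(plans, actuals)
# e.g. "Step 2 Reliability: plan 30%, actual 20%, diff -10%, total 60%"
```

A real impact estimation table adds the cost and time columns, uncertainty bands and safety margins; the point is simply that plan-versus-actual tracking per step becomes trivial once the requirements are quantified.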
Slide 42 Evo and Requirements, Conceptually
'Design' is what delivers performance, and costs resource. Evo development gradually delivers performance, while eating up resources, by implementing 'design': Design X (done on step 1), Design Y (done on step 2), and so on to step n.
[Diagram omitted: successive designs move the performance attributes (Reliability, Usability, Performance) towards their targets while consuming resources (Storage, Other Resources) within one or more constraints; 'Terminal' is shown as a function.]

Slide 43 Multiple Test Levels of Microsoft Evo (Office 2002)
[Diagram omitted: nested test levels, each "vital third" taking 6 to 10 weeks.]
Reference: Cusumano, Microsoft Secrets. Drawing by TG. See reference [MacCormack2001].

Slide 44 Intel View of Industrial Evo cycle
Figure 1. The Evolutionary Product Life Cycle (Ver. 0.6). Courtesy: Erik Simmons, Intel Oregon.
[Diagram omitted: phases Exploration, Planning, High Level Design (HLD), Iterations and Releases, Maintenance; project milestones Development Investment Approval (DIA), Iteration Plan Approval (IPA), Mid Iteration Review (MIR), Iteration Release Approval (IRA), Product Discontinuance Approval (PDA), Post Project Review (PPR); product documentation including Product Overview Proposal (POP), Product Iteration Plan (PIP), Iteration Requirements, Iteration Estimate, Iteration Data Model, Iteration Test Plan, Install Checklist, Product Reference Manual.]

Slide 45 9. Defect Prevention Process DPP
• What is DPP?
– CMM Level 5
– Continuous process learning
– Maybe 2000 small changes per year (IBM MN)
– Avoiding defect injection (bad doesn't happen!)
– 13x more cost-effective than defect removal (Inspection)
– 50% to 95% of all defects can be prevented
• Why is it better for Quality?
– It attacks upstream (requirements, design, contracts)
– It is completely general (deals with all quality aspects, not just bugs)
For more detail on DPP see Gilb, Software Inspection, Ch. 7 & 17 (by Robert Mays, DPP inventor).

Slide 46 The Bottom Line for Process Improvement
[Chart omitted: cost of rework, appraisal cost and prevention cost, 1987-1992, from the start of the improvement initiative; savings in rework alone $15.8 million; ROI = 770%.]
Raymond Dion, "Process Improvement and the Corporate Balance Sheet", IEEE Software, July 1993, pp. 28-35.

Slide 47 Reduced Cost of Quality
Cost Of Quality = Cost of Conformance + Cost of Non-Conformance. COC = Appraisal + Prevention ("cost for doing it right"); CONC = cost of "fix and check fix" ("rework", the cost of doing it wrong). Philip Crosby's "Cost Of Quality".
[Chart omitted: total Cost of Quality falling from about 65% (1988) to about 23% (1995).]

Slide 48 Defect Prevention Experiences: most defects can be prevented from getting in there at all
[Chart omitted: % of usual defects prevented over years of continuous improvement effort: 50% (Mays & Jones, IBM, 1990), 70% ("72% in 2 years", Mays 1993, user 1996), rising towards Cleanroom levels approaching zero defects and IBM MN 99.99%+ fixes; key = "DPP".]
IBM Research Triangle Park, North Carolina, Networking Laboratory. Source: Half-day Inspection Economics, [email protected].
Slide 49 Prevention + Pre-test Detection is the most effective and efficient
[Chart omitted: 50% of defects prevented (Mays & Jones, IBM, 1990) rising to 70% prevented (Mays 1993), with about 70% "detection by Inspection, cheaply" on top of prevention, giving roughly 95% cumulative detection (state-of-the-art limit) before test.]
• Prevention data based on state-of-the-art prevention experiences (IBM RTP); others (Space Shuttle, IBM Systems Journal 1-95) 95%+ (99.99% in fixes).
• Cumulative Inspection detection data based on state-of-the-art Inspection (in an environment where prevention is also being used: IBM MN, Sema UK, IBM UK).
Half-day Inspection Economics, [email protected].

Slide 50 8. Motivate by Reward for Quality
• What is motivation by reward?
– Connecting actual delivery of specific quality levels to some sort of personal and team rewards (not necessarily money).
• Why is it better?
– We don't normally do this at all
– We reward on-time delivery of bad qualities

Slide 51 8. Reward Quality (see the contracts in earlier slides)
• Example: define the quality you want in 'Planguage' [see refs CE, PoSEM]
• Maintainability:
– Scale: average minutes to find, correct and regression test a random bug.
– Meter [Evo Step Acceptance]: average over at least 10 bugs, with 2 qualified maintainers.
– Plan [Contract, Each Evo Step]: 60 minutes.
• Then stipulate:
• In a sub-supplier contract:
– Payment invoice-able when all defined quality levels are proven delivered.
• For an in-house team:
– Delivery can only be considered as 'done' when the defined tests prove that the defined levels of all qualities due are in fact delivered
– No quality? You are late!

Slide 52 7. Entry Level Defect Control: No Garbage In
• What is it?
– All software engineering processes (contracting to coding) will make sure that the specifications they get are reasonably 'good'.
– Good practice is defined by a set of 'Rules' (like Clear, Complete, Consistent)
– A sample (1 or more pages) of incoming information will be taken (Inspection)
– A measure of Major Defects per page will be taken
– A maximum level of defects will be allowed
• Why is it better?
– Right now we have a Major defect level of about 150 ± 100 Major defects/page against a simple basic set of rules
– Acceptance levels should be at less than 1.0
– Average cost of a Major defect is about 3-10 hours project time lost
– Current levels of Major Defects have delayed real projects by 2 years (Ohio case).

Slide 53 7. No Garbage In (continued)
• Policy
– "No software process shall use input specifications with more than one major defect per page (300 non-commentary words)"
– "Exceptions shall be documented and approved formally"
• Practice (how to measure garbage level)
– 1. Rules agreed (3 go a long way)
– 2. Sample size set (1 page is fine)
– 3. Processes are officially redefined to include this Entry control
– 4. Time level is set (up to 30 minutes is fine)

"Rules": Best Practice, Strong Advice
Introduce the following three rules for inspecting a requirements document. Three Rules for Requirements:
– 1. Unambiguous to intended readership
– 2. Clear enough to test.
– 3. No design specs (= 'how to be good') mixed up with requirements (= 'how good to be')

Report for page 82 (reported inspection results on a requirements document, 4 managers)
Total defects (Majors + minors), Majors, and Design defects (part of the totals) logged by each checker:
– 41 total, 24 Majors, 1 design
– 33 total, 15 Majors, 5 design
– 44 total, 30 Majors, 10 design
– 24 total, 3 Majors, 5 design
• The team would log unique Majors of about ~2 x 30 = 60 (2x the high score),
• which is 30% of the total, so the total for this page is about ~180 Majors.
• If we attempt to fix the 60 we log, and correctly fix 5/6, then ~10 are failed fixes, so:
• the total remaining after inspection and editing = 10 + 120 = ~130 Majors per page.

Extrapolation to Total Majors in Whole Document
• Page 81: 120 Majors/page (3/4 page checked by 4 other managers)
• Page 82: 180 Majors/page
• Average 150 Majors/physical page x 82 pages = 12,300 Majors in the document.
• If a Major has a 1/3 chance of causing loss downstream, 4,100 Majors will cause a loss.
• And each loss is on average 10 hours (9.6 hours median at one client for 1,000 Majors), so total project rework cost is about 41,000 hours lost.
• (This project was in reality over a year late; 1 year = 2,000 hours for 10 people.)
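The arithmetic in that extrapolation is simple enough to sanity-check. Below is a small sketch of it (my illustration, not a tool from the talk); the 1/3 loss probability and 10 hours per Major are the rule-of-thumb figures quoted above, not universal constants.

```python
# A hedged sketch of the extrapolation above: from sampled Major-defect density
# to whole-document Majors and expected project rework cost.

def rework_estimate(majors_per_page, pages, p_loss=1/3, hours_per_loss=10):
    total_majors = majors_per_page * pages      # Majors in the whole document
    losses = total_majors * p_loss              # Majors expected to cause downstream loss
    hours = losses * hours_per_loss             # expected rework hours
    return total_majors, losses, hours

# Average of the two sampled pages (120 and 180 Majors/page) over 82 pages:
total, losses, hours = rework_estimate(majors_per_page=150, pages=82)
print(total, losses, hours)   # 12300  4100.0  41000.0
```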
More feedback
• "Love the slides on in-process document review.
• We are using this with requirements documents, and have been able to double the quality of the documents with only a few hours of effort."
• Erik Simmons, Intel, Oregon, [email protected], January 9th 2002

Slide 58 6. Exit Level Defect Control: No Garbage Out
• What is Exit about?
– Same as Entry control, except you do quality control on your own work
• You check a spec against your rules for good specs
• You determine the defect density (defect injection rate)
– We can perform checks using samples
– During the work, so we don't get surprised at the end
• Why is it better?
– It discovers problems very early
– It works at all levels of the development and maintenance processes
• Not just test and operate for code
– It can impact all types of quality (not just 'bugs')
– Very inexpensive and fast (10-30 minutes/check)

Slide 59 The NO 'G.O.' Policy
• Policy (kept simple)
– We will not release any work which has unacceptable defect density.
– We will check our work as it emerges, not just at the end.
– If bad work is being produced, we will change 'whatever it takes' to avoid defect injection.
– (= CMM5 DPP)
• Practical implementation
– Exit Condition: "Maximum 1 Major defect/300 NC words"
– Sampling Rate: check a page about every 10 pages
– Checkers: author and/or one colleague

Slide 60 How to Inspect a large amount of specification or code! Sampling for Dummies
"Do a page and then decide what to do."

Slide 61 Sample "During" Authoring 1
The author is expected to write about 45 pages. First we write only 5 of these, then sample one page with Inspection (the example shows 4 Majors found). Exit? If there are too many defects, re-write all 5 pages; if it is good enough, exit and write new pages.

Slide 62 Sample "During" Authoring 2
Once 5 pages have exited, the author can write 5 more pages, then again sample one page with Inspection (the example shows 5 Majors found). Too many defects: re-write all 5 pages. Good enough: exit and continue. (Cartoon caption: "I've been driving for 2 hours without an accident, so I can now close my eyes while driving.")
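As a concrete illustration of the exit decision in the sampling loop above, here is a minimal sketch (my own illustration, not a tool from the talk), assuming the "maximum 1 Major defect per logical page of 300 non-commentary words" exit condition from Slide 59. The inspection of the sampled page is of course done by people against the agreed rules; only the decision rule is shown here.

```python
# Sketch of the "no garbage out" exit decision, assuming an exit threshold of
# 1 Major defect per logical page (300 non-commentary words).

EXIT_THRESHOLD = 1.0   # max Majors per page allowed to exit

def may_exit(majors_found, pages_sampled):
    """Exit decision for a sampled batch of newly written pages."""
    density = majors_found / pages_sampled
    return density <= EXIT_THRESHOLD

# Author writes 5 pages; one page is sampled and 4 Majors are found:
if may_exit(majors_found=4, pages_sampled=1):
    print("Good enough: exit, write the next 5 pages")
else:
    print("Too many defects: re-write all 5 pages before continuing")
```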
Slide 63 5. Quantify Quality Requirements
• What does that mean?
– Specify a number, on a scale of measure, indicating how much quality you want
– Do this for all types of quality you want to manage (reliability, maintainability, usability)
– Use 'Planguage' [CE reference at back of slides], for example, as a format.
• How do we do it?
– Identify the critical quality types (name tag it)
• Availability:
– Define a scale of measure for them
• Scale: Hours MTBF
– Decide on a good-enough level of quality for the application
• Plan [First One] 30,000

Slide 64 5. Quantify Quality 2
• Policy
– All critical quality requirements will always be specified quantitatively
– We will measure the level of quality actually delivered
• During development
• At acceptance
• In operation
• Practical
– Train people in Planguage
– Make specification templates (next slide) available
– Make knowledge of good scales of measure and practical meters (tests) available.

Slide 65 Scalar Requirements Template + <Hints>
<name tag of the objective>
Ambition: <give overall real ambition level in 5-20 words>
Type: <quality|objective|constraint>
Stakeholder: { , , } "who can influence your profit, success or failure?"
Scale: <a defined unit of measure, with [parameters] if you like>
Meter [<for what test level?>]
==== Scalar Benchmarks ==== the past
Past [ ] <estimate of past> <-- <source>
Record [<where>, <when record set>] <estimate of record level> <-- <source of record data>
Trend [<future date>, <where?>] <prediction of level> <-- <source of prediction>
==== Scalar Constraints ==== fail borders
Limit [ ] <-- <source of Limit>
Must [ ] <-- <source>
==== Scalar Targets ==== the future value and needs
Wish [ ] <-- <source of wish>
Plan [...] <target level> <-- <source>
Stretch [ ] <motivating ambition level> <-- <source of level>
Min:

Slide 66 Erieye Project: Usability.Intuitiveness Requirement (Real Example)
Usability.Intuitiveness
Ambition: High probability in % that operator will <immediately>, within a specified time from deciding the need to perform the task (without reference to handbooks or help facility), find a way to accomplish their desired task.
Scale: Probability that an <intuitive>, TRAINED operator will
• find a way to do whatever they need to do,
• without reference to any written instructions (i.e. on paper or on-line in the system, other than help or guidance instructions offered by the system on the screen during operation of the system),
• within 1 second of deciding that there is a necessity to perform the task.
<-- MAB: "I'm not sure if 1 second is acceptable or realistic, it's just a guess"
Meter: To be defined. Not crucial this 1st draft - TG
Past [GRAPES] ~80%? LN
Record [MAC] 99%? TG. Assumption: we have human operators!
Must [TRAINED, RARETASKS [{<1/week, <1/year}]] 50-90%? MAB
Plan [TASKS DONE [<1/week (but more than 1/Month)]] 99%? LN
Plan [TASKS DONE [<1/year]] 20%? - JB
Plan [Turbulence, TASKS DONE [<1/year]] 10%? - TG
Min:

Slide 67 4. Contract Towards Quality
• What does that mean?
– When you contract for software work, you will define the work partly by quantified quality levels expected.
– This is the same as the quantified qualities in the last point, just that we do it in legal contracts.
• It gets taken more seriously than mere requirements!
• Why is it better for software quality?
– You are more likely to get the quality levels you want
• At least you shouldn't pay if you don't!
– All aspects of the development process will have to find a way to deliver the contracted levels.

Slide 68 Symbolic 'Quality' Contract
• The Availability will be at 99.98%
• The Maintainability will be 60 minutes/bug to find, fix and test.
• The Usability will be at 30 seconds for average task familiarization.
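To show how little machinery a quantified, contract-ready requirement needs, here is an illustrative sketch of a Planguage-style scalar requirement held as data, with a check of a measured level against the contracted Plan level. This is my own example under stated assumptions: the field names follow the template above, the Plan figure (60 minutes) comes from the Maintainability example on Slides 51 and 68, and the Past and Must figures are invented purely for illustration.

```python
# Illustrative only: a Planguage-style scalar requirement as data, plus a check
# of a measured level against the contracted Plan level.

from dataclasses import dataclass

@dataclass
class ScalarRequirement:
    tag: str
    scale: str          # defined unit of measure
    meter: str          # how and when it is measured
    past: float         # benchmark level (hypothetical figure here)
    must: float         # worst acceptable level (hypothetical figure here)
    plan: float         # committed target level

    def met(self, measured, higher_is_better=False):
        """True if the measured level meets the Plan level."""
        return measured >= self.plan if higher_is_better else measured <= self.plan

maintainability = ScalarRequirement(
    tag="Maintainability",
    scale="average minutes to find, correct and regression test a random bug",
    meter="Evo step acceptance test: average over at least 10 bugs, 2 qualified maintainers",
    past=120, must=90, plan=60)   # past/must are made up; plan=60 is the Slide 51/68 figure

print(maintainability.met(measured=55))   # True: 55 minutes is within the 60-minute Plan
```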
Slide 69 3. Reuse Known Quality
• What does that mean?
– The various quality dimensions of a reusable software component are known, measured, predictable, quantified, documented
• Why is it better for quality?
– The qualities you get are 'by selection', rather than 'by process'.
– This is a conventional engineering paradigm (use known components with known attributes)

Slide 70 2. Evolve Towards Quality
• What does that mean?
– It means that your projects should be divided up into small (2%) stakeholder-result-delivery increments
• Each one to deliver at planned quantified levels
• Optionally going initially for the 'final quality levels' at an initially low level of functionality
– It means that you have to prove you know how
• to get your quality levels,
• early and frequently.
• Why is it (Evo) better for quality?
– You have to prove all mechanisms, early and frequently, for:
• Contracts
• Requirements
• Design
• Reused components
• Development process
• Staff
• Subcontractors
• Stakeholder reactions

Slide 71 Microsoft IE 3.0
Source: MacCormack, "Product-Development Practices That Work: How Internet Companies Build Software", MIT Sloan Management Review, Winter 2001.

Slide 72 Linux Evolution
Source: MacCormack, "Product-Development Practices That Work: How Internet Companies Build Software", MIT Sloan Management Review, Winter 2001.

Slide 73 1. Design to Quality
• What does that mean?
– It means we get the qualities we want by actively designing/engineering and architecting
– That means by choosing the design ideas which predictably will give us the qualities we require.
– It means defining all critical quality dimensions quantitatively
– It means evaluating all design options quantitatively in relation to our quality requirement levels.
• Why is it better for software quality?
– Because your design process is then
• focused on the qualities you want and
• on the designs which will give those qualities.
– Because this is the historically proven way to get quality in engineering and architectural disciplines
– Because current so-called 'software engineering' (for example CMM, RUP) does not even have this 'design' idea on the agenda!

Slide 74 Design process example (Impact Estimation tool [CE, PoSEM])
An example of considering two alternatives, based on their impacts on qualities, their cost, and their risk.
Requirements: Reliability 99%->99.9%, Performance 11 sec->1 sec, Usability 30 min->30 sec, Capital Cost 1 mill., Engineering Hours 10,000.
• Step Candidate A {Design-X, Function-Y}, credibility 0.8 (high): Reliability 50% ±50%, Performance 80% ±40%, Usability -10% ±20%, Capital Cost 20% ±1%, Engineering Hours 2% ±1%.
• Step Candidate B {Design-Z, Design-F}, credibility 0.2 (low): Reliability 100% ±20%, Performance 30% ±50%, Usability 20% ±15%, Capital Cost 5% ±2%, Engineering Hours 10% ±2.5%.
• Worst-case benefit/cost ratio: A = (0 + 40 - 30)/(21 + 3) = 0.42; B = (80 - 20 + 5)/(7 + 12.5) = 3.33.
• "Worst worst" case, considering the estimate credibility factor: A = 0.8 x 0.42 = 0.33; B = 0.2 x 3.33 = 0.67.
See slide note for explanation.

Slide 75 The Head:Body Model of Evo
Architecture-level design combined with step-level design. The "Head" (project architecture and management level) holds the requirements and architecture and runs a Plan/Study/Act cycle over the project. Each step is a "Body" or "micro-project": Plan (step requirements and design), Do (construction/acquisition, quality control, testing, integration, and delivery to the stakeholder), Study (measure and study the results), Act.
[Diagram omitted.]
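The worst-case comparison on Slide 74 is easy to reproduce. The sketch below (my illustration, using the Slide 74 numbers) shrinks each benefit estimate and inflates each cost estimate by its uncertainty before taking the benefit-to-cost ratio, then scales the result by the credibility of the evidence behind the estimates.

```python
# Sketch of the Slide 74 worst-case benefit/cost comparison.
# Each entry is (estimated % impact, plus-minus uncertainty %).

def worst_case_ratio(benefits, costs, credibility):
    worst_benefit = sum(est - unc for est, unc in benefits)   # benefits reduced by uncertainty
    worst_cost = sum(est + unc for est, unc in costs)         # costs increased by uncertainty
    ratio = worst_benefit / worst_cost
    return ratio, ratio * credibility                         # plain and credibility-adjusted

# Candidate A {Design-X, Function-Y}: credibility 0.8 (high)
a = worst_case_ratio(benefits=[(50, 50), (80, 40), (-10, 20)],
                     costs=[(20, 1), (2, 1)], credibility=0.8)
# Candidate B {Design-Z, Design-F}: credibility 0.2 (low)
b = worst_case_ratio(benefits=[(100, 20), (30, 50), (20, 15)],
                     costs=[(5, 2), (10, 2.5)], credibility=0.2)

print(a)   # (~0.42, ~0.33)
print(b)   # (~3.33, ~0.67): B still looks better, even on the pessimistic reading
```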
Slide 76 Some Better Ways to Get Software Quality you might like to learn more about
• 10. Evolutionary Testing
• 9. Defect Prevention Process
• 8. Motivate by Reward for Quality
• 7. Entry Level Defect Control: No Garbage In
• 6. Exit Level Defect Control: No Garbage Out
• 5. Quantify Quality Requirements
• 4. Contract Towards Quality
• 3. Reuse Known Quality
• 2. Evolve Towards Quality
• 1. Design to Quality

End of Talk! Next slides are for extra detail later.

Slide 78 A Use Case Critique Summary By Don Mills [Mills01]
• This Appendix lists the "problems with use cases" that I found in my brief, and unscientific, survey of "the literature" (a mixture of books on my and my employer's shelves, with articles found by browsing the Internet). The first eight entries come from the UI Design.net editorial for October 1999 (http://www.uidesign.net/1999/imho/oct_imho.html).
• Solutions to all of the problems exist, but not within the RUP or the UML (or only clumsily, ambiguously, or inconsistently), while outside those strictures many competing solutions have been proposed.
• Note that this is not intended as an exhaustive list ...

Slide 79 Use Cases ? 1
• [The precise role of use cases is defined in The UML User Guide to be the description of a set of actions performed by a system to deliver value to a user: that is, system process design (at the user interface level).] Understanding the problem -- the business and its rules -- must happen first. Defining business process, system operating procedures or lines of communication is secondary. Use cases lead to definition of procedures without proper understanding of the problem domain.
• Developing use cases with a User Group or Business Analyst group leads to premature interaction design by unskilled practitioners.
• It's hard to determine the completeness of use cases because of their "single path" nature. This can lead to developers using their imagination to complete exception-handling cases or rarely taken paths. This can quickly ruin a good Interaction Design.
• Use cases do not lend themselves to OO development due to their nature as procedural descriptions of functional decomposition.

Slide 80 Use Cases ? 2
• The User Group defining them are required to second-guess the future system operation. They find this difficult or even impossible. This leads to new systems which don't make an adequate improvement in operations procedures, and can miss the opportunity to simplify a process and remove unnecessary people.
• Use cases, because of their procedural nature, lend themselves to action-object User Interface designs. If you need or want to have an object-action UI Design (aka OOUI) then use cases are a poor foundation.
• Use cases can end up as the repository for the whole requirements. Everything goes into the use cases and the Business Analyst group will claim, "the design is done already, now write the code". This is very, very bad for Interaction Design.
• Use cases are poor input for Object Modeling. They can lead to poor definition of classes from noun extraction, as you may otherwise be hoping to eliminate some of the domain terms used within the object model.
• The UML Specification is so non-specific and lacking in obligatory integrity checking that it is easy to produce fragmentary, inconsistent, ambiguous use cases while still following an arguably correct interpretation of all of the UML's requirements. Cockburn identified 18 different definitions of Use Cases, yielding over 24 different combinations of Use Case semantics.

Slide 81 Use Cases ? 3
• Use cases do not require backward or forward traceability of requirements.
• Standard UML specifications of use cases, together with descriptions in the Rational Object Technology Series of publications, lack a number of important testability elements, such as domain definitions for input and output variables, testable specifications of input-output relationships, and sequential and interactional constraints and dependencies between use cases. • Use cases, by definition in the UML Specification, emphasise ordering (“sequences of messages exchanged ... [and] actions performed by the system”, V1.3). Physical sequence of operations is normally a process restriction, not a true requirement, and when truly required can be defined more abstractly by preconditions. Early emphasis on ordering is among the worst mistakes an O-O project can make, but is hard to avoid if use cases are relied on for analysis, since the UML Specification provides no standard way of expressing the common situation of optional or flexible sequences of action. • Because the UML can neither express structure between use cases nor a structural hierarchy of use cases in an easy and straightforward way, use cases are developed as an “uncoordinated sprawl” of (by definition) discrete and unrelated functions. This creates a loose collection of separate partial models, addressing narrow areas of the system requirements, and presenting problems of relating these partial models and keeping them consistent with each other. Slide 82 Use Cases ? 4 • The UML Specification provides no clear semantics of what a use case really is (“representing a coherent unit of functionality” — but representing in what way(s)?), and no consistent guidelines on how it should be described. This “flexibility” may be seen as a good thing, but as the scale of design problems rises, with larger design teams and more and more use cases, the sort of “studied sloppiness” that can be beneficial for rapid design of modest problems begins to become a stumbling block. • The UML Specification requires a use case to “represent” “actions performed by the system”, but (despite a popular interpretation) does not restrict these to externally visible actions. It is not clear what kind of events we should concentrate on while describing use cases: external-stimuli and responses only, or internal system activities as well. • Use cases may not overlap, occur simultaneously, or influence one another, although actual uses of a computer system may do all of these. • The level of abstraction of use cases, and their length, are a matter of arbitrary choice — “just enough detail, but not too much”. The only level of detail that is “enough” is a level that removes all ambiguity. Slide 83 Use Cases ? 5 • Furthermore, no modularisation concepts are given to manage large use case models. The include and extend concepts are presented as a means to provide extensibility, but no rigorous semantics are provided for these concepts, allowing for multiple disparate interpretations and uses. • Use cases in general are descriptions of specific business processes from the perspective of a particular actor. As such they do not give a clear picture of the overall business context and imperatives that actually generate the requirements for these business processes. This means that they can be quite incomprehensible to non-domain experts. • For the same reasons, the important business requirements and imperatives underlying the use case model become invisible when taken out of business context and expressed in discrete use cases. 
Subsequent readers of the use case model may be quite unable to explain the forces and business requirements that shaped the model.
• Developing use cases with a User Group or Business Analyst group leads to a focus on how users see the system's operation. But the system doesn't exist yet. (A previous system might exist, but if it were fully satisfactory you would not be asked to change or rewrite it.) So the system picture that use cases will present is based on existing processes, computerised or not. The system builder's task is to come up with new, better scenarios, not to perpetuate antiquated modes of operation.

Slide 84 Use Cases ? 6 of 6 slides
• A UML use case model can't specify interaction requirements where the system initiates an interaction between the system and an external actor.
• Because the UML Specification forbids interactions between actors, use cases cannot model a rich system context involving such interactions.
• The UML requires use cases to be independent of one another, which means that it offers no way to model persistent state across use cases, or to identify how the initial system state required by a use case (specified in Pre-conditions) is to be achieved.

Slide 85 References 1
• RPL: www.result-planning.com (Gilb site)
– Requirements slides
– Evo method slides
– Inspection slides and papers
– Planguage Glossary (part of the CE book)
• CE: Competitive Engineering, book by Tom Gilb
– Forthcoming 2002, Addison Wesley
– A systems engineering and software engineering handbook, based on Planguage (parts at www.result-planning.com)
• Inspection:
– GG: Gilb and Graham: "Software Inspection" (1993)
– RR: Ronald A. Radice: "High Quality Low Cost Software Inspections", 2002, Paradoxicon Publishing, Andover MA, USA
• PoSEM: Gilb: Principles of Software Engineering Management (1988, Addison Wesley)

Slide 86 References 2
• RUPSE: Rational Unified Process for Systems Engineering, RUP SE 1.0
– A Rational Software White Paper (possibly available via www.rational.com?), TP 165, 8/01
– This paper attempts to tackle the problem of system architecture for multiple quantified quality requirements. TG
– It fails in that it is not dealing with multiple quality requirements simultaneously, and is not doing much more than arm waving. It does not do what I would call a good job of quantifying quality. It does not do a good job of what I would consider showing the relation between a design and multiple qualities and costs. But it is the best attempt to recognize the need and the problem to come out of Rational so far. TG
• Mills01: "What's the Use of a Use Case?"
– Don Mills, Copyright © Software Education Associates Ltd, Wellington, New Zealand, 2001
– Should be available at www.softed.com
• [MacCormack2001]: Evo in MIT Sloan Management Review, Winter 2001
– "Product-Development Practices That Work: How Internet Companies Build Software"

• Slides added after printed documentation made for conference

Slide 88 Kent Beck, eXtreme Programming (QUOTED WITH PERMISSION)
On 18/01/02 14:25, "Kent Beck" <[email protected]> wrote:
> I think you are conflating two concepts--how you create a process and how
> you create a community to use the process.
>
> I was quite "scientific" in my creation of XP. First I read voraciously and
> asked lots of questions about a topic. Then I experimented with a technique
> myself, generally to extremes so I understood the range of possible
> behavior. Whatever worked best for me I taught to a few people I trusted. If
> they reported good results I taught it to people I didn't know. Only if they
Only if they > reported good results would I begin recommending the practice in speeches > and in print. I tried combinations of practices (not exhaustively, but I > tried to be aware of interactions when they occurred). > > I put "scientific" in quotes above, because it isn't science like physics is > science, but it is science as described by Sir Francis Bacon, and as > contrasted to Aristotelian pure reasoning. My notebooks certainly wouldn't > survive review by a physical scientist. But we aren't in the physical > science business. > > Now I had some tested ideas, and I was ready to see them implemented on a > large scale (we can get into motivation later). Given my resources, viral > marketing driven by storytelling was the only option. > > Does that answer your question? Return to main sequence CMM Level 3 Results 89 Slide 90 This is the last slide of the set of slides!