Experimental Evaluation of SDL and One-Op Mutation for C
Marcio E. Delamaro, Lin Deng, Vinicius H. S. Durelli, Nan Li, and Jeff Offutt
Universidade de São Paulo & George Mason University
Brazil & USA
www.cs.gmu.edu/~offutt/
[email protected]
ICST 2014 © Delamaro, Deng, Durelli, Li, & Offutt

Mutation Analysis

Mutation Analysis: A Generic View
We perform mutation analysis when we …
  … use well-defined rules
  … defined on syntactic descriptions
  … to make systematic changes
  … to the syntax or to objects developed from the syntax

Mutation Analysis: For Programs
  … use mutation operators
  … defined on language syntax
  … to create mutated versions
  … of programs or program components

Mutation Testing

We use mutation analysis for testing to:
1) Help testers design high quality tests or …
2) Evaluate the quality of existing tests

Why Mutation Works

Fundamental Premise of Mutation Testing
If the software contains a fault, there will usually be a set of mutants that can only be killed by a test case that also detects that fault
• This is not an absolute!
• The mutants guide the tester to an effective set of tests
• A very challenging problem:
  – Find a fault and a set of mutation-adequate tests that do not find the fault
• Of course, this depends on the mutation operators …

Benefits & Cost

• Numerous studies have found mutation-adequate tests to be very effective at finding faults in programs
• These studies have also shown mutation is relatively costly
  1. Lots of test requirements (mutants)
  2. Lots of tests are needed
  3. Lots of equivalent mutants to detect
• Three general strategies for reducing costs
  1. Do-fewer: Fewer mutants
  2. Do-smarter: Techniques to reduce the amount of work
  3. Do-faster: Algorithms and tools to speed up the process

Do-Fewer

[Diagram: mutants M1, M2, …, M15, with refined variants M13’ and M14’]
• Sampling: Use X% of the mutants
• Selective: Use certain mutation operators
• Refining: Redefine mutation operators to create fewer mutants
• SDL: Use one operator, delete each statement (Untch, ACM-SE 2009; Deng et al., ICST 2013)

Two Studies

1. Evaluate statement deletion (SDL) in C
   Compare with previous results in Java
2. Generalize statement deletion to one-op mutation
   Try each operator by itself

Statement Deletion in C

• Systematically remove each statement as well as all inner statements
• Ensures that each statement affects behavior
  – This is far more than just statement coverage

    void test ()
    {
        int a, b, c, t, i;
        if (a == 0)
        {
            b = 3;
        }
        for (i = 0; i < 5; i++)
            t = t + b + c;
    }

[Diagram: callouts M1 to M6 mark the SDL mutants of this function]

Advantages of SDL

1. Few mutants: O(LOC)
2. Every program generates SDL mutants
3. Relatively few equivalent mutants

Evaluating 1-Op Mutation

• Tool: Proteum, 75 mutation operators
• Subjects (P): 39 C programs
• Mutants (MS): All mutants for all programs (63,275 mutants)
• Test Universes (TS): Created a “universe” of tests for each subject; each test set killed all mutants for its program
• Equivalent Mutants (ES): Identified by hand while creating tests (7518 equivalent mutants)
• Test Subsets (TSop): For each program, found a subset of tests that killed all mutants from each mutation operator
  – 10 sets of tests for each operator, averaged
• Experimental procedure …

Experimental Procedure

[Diagram: from each program P, generate all mutants M and add tests until MS = 100, giving the test universe T; split M by operator into Mop1, Mop2, …, Mop75; for each operator, draw 10 test sets from T; compute the MS of the tests for each op on all mutants M]

SDL Results

                 Mutants    Equivalent        Tests
  SSDL mutants   2062       176 (8.54%)       203
  All mutants    63,275     7518 (11.88%)     815

• Total mutants killed by SSDL tests: 96%
• Mean of all mutants killed over the 39 programs: 92%

One-Op Results

• 52 Proteum operators (out of 75) generated mutants for at least one program
• We computed the mutation score for each operator
  – Averaged over 10 test sets, all drawn from “universe”
  – Averaged over all 39 software subjects

                            Low       High     SSDL
  Mutation score            26.00%    96%      92.0%
  Relative test set size    2.04%     53%      28.8%
  % of all mutants          0.04%     14%      3.8%
  % equivalent              0.00%     80%      5.1%

[Operator callouts on the slide: CRCR, Vsrr; VDTR; OCOR, SGLR; OIPM; OAEA, OASA, OBNG, OIPM, SBRn, SCRB]

How to balance effectiveness (mutation score) with cost (tests, mutants, equivalent mutants)?

Cost vs. Effectiveness

• Evaluating mutation operators balances two things:
  – Benefit: Mutation score
  – Cost: Number of tests, equivalent mutants, and mutants
  – Number of mutants affects machine cost, which is minor
• We first create a cost function:

    Cost (op) = (%NormalizedTests (op) * Wt) + (%NormalizedEquivMutants (op) * We)

  – Normalization function is given in the paper
  – Wt and We are weights based on the relative costs of creating tests and analyzing mutants for equivalence

Cost vs. Effectiveness (2)

• We next create a cost-effectiveness measure
• Mutation Operator Cost Effectiveness Analysis (MOCEA)

    MOCEA (op) = Cost (op) / MS (op)

• Note that low values are more cost-effective

Cost-Effectiveness Scores

Cost-effectiveness scores with different weights

  Operator   Wt = 1    Wt = .5   Wt = 0    Wt = 1    Wt = 1
             We = 1    We = 1    We = 1    We = .5   We = 0
  CRCR       .73       .38       .03       .72       .70
  Ccsr       .66       .35       .03       .65       .64
  SSDL       .57       .31       .06       .54       .51
  ORBN       .69       .44       .18       .59       .50

• Wt = 1, We = 1: Assumes tests & detecting equivalent mutants cost the same
• Wt = .5, We = 1: Assumes tests are cheaper
• Wt = 0, We = 1: Assumes tests are free
• No ORBN mutants in many programs

CRCR: Required Constant Replacement
Ccsr: Constant for Scalar Replacement
SSDL: Statement Deletion
ORBN: Relational Operator by Bitwise Operator

Contributions & Conclusions

• Generalized SDL-mutation to one-op mutation
• Evaluated SDL-mutation in C
  – 39 C programs
  – Mutation score of SDL-adequate tests on all mutants
• Evaluated one-op mutation in C
  – 75 Proteum mutation operators
  – Mutation scores of OP-adequate tests on all mutants
• Introduced a cost-effectiveness measure
  – Tests, equivalent mutants, mutation score
• SDL best if generating tests and detecting equivalent mutants cost the same
• If tests are free, CRCR or Ccsr is best

Future Directions

• Two-op mutation?
• Cost effectiveness formula should include whether mutants appear in all programs
• What is the true minimum number of mutants needed to achieve near 100% mutation score?

Contacts

Marcio Delamaro
[email protected]
http://www.icmc.usp.br/pessoas/delamaro/

Jeff Offutt
[email protected]
http://cs.gmu.edu/~offutt/
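
Supplementary sketch: cost function and MOCEA

The cost function and the MOCEA measure from the "Cost vs. Effectiveness" slides can be sketched in C as follows. This is a minimal illustration and not the authors' tooling. The struct `OpStats`, its field names, and the three function names are hypothetical. The inputs are assumed to be already normalized and expressed as fractions between 0 and 1, because the normalization function is defined only in the paper and is not reproduced here.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical per-operator record (names are illustrative only).
   All values are assumed to be pre-normalized fractions in [0, 1];
   the normalization itself is defined in the paper, not here. */
typedef struct {
    const char *name;   /* operator label, e.g. "SSDL" */
    double norm_tests;  /* normalized number of tests */
    double norm_equiv;  /* normalized number of equivalent mutants */
    double ms;          /* mutation score on all mutants, in (0, 1] */
} OpStats;

/* Cost(op) = NormalizedTests(op) * Wt + NormalizedEquivMutants(op) * We */
double op_cost(const OpStats *op, double wt, double we)
{
    return op->norm_tests * wt + op->norm_equiv * we;
}

/* MOCEA(op) = Cost(op) / MS(op); lower values are more cost-effective. */
double mocea(const OpStats *op, double wt, double we)
{
    assert(op->ms > 0.0);  /* a zero mutation score would divide by zero */
    return op_cost(op, wt, we) / op->ms;
}

/* Index of the operator with the lowest MOCEA; requires n > 0. */
size_t most_cost_effective(const OpStats *ops, size_t n, double wt, double we)
{
    assert(n > 0);
    size_t best = 0;
    for (size_t i = 1; i < n; i++) {
        if (mocea(&ops[i], wt, we) < mocea(&ops[best], wt, we))
            best = i;
    }
    return best;
}
```

Calling `most_cost_effective(ops, n, 1.0, 1.0)` ranks operators under the assumption that creating tests and detecting equivalent mutants cost the same, while passing `wt = 0.0` models the "tests are free" scenario. Keeping the weights as parameters mirrors the way the slides compare several (Wt, We) combinations.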