CSAIL Research Abstract

Introduction

Architecture, Systems
& Networks

Language, Learning,
Vision & Graphics

Physical, Biological
& Social Systems

Theory

horizontal line

Specification Coverage by Automatic Mutant Generation

Robert Seater, Greg Dennis & Daniel Jackson

Introduction

Specifications can suffer from the same problem as test suites: a lack of coverage. Even if a procedure has been proven to satisfy its specification, the procedure may nevertheless contain a bug, because the specification is incomplete. True completeness -- that an outcome be specified for each input -- is not required, but completeness relative to the code is desirable; the specification should distinguish the procedure from all close variants, or mutants.

We use a constraint solver [1] to automatically find mutants that are semantically inequivalent to the original procedure but which are not distinguishable by the specification. Candidate mutants are not analyzed in turn; the solver searches the space of all possible mutants (using any combination of mutation operators from a given class) with the same backtracking and pruning that it uses to find distinguishing executions. We are examining this technique in the context of a voting system.

Approach

The underyling idea is to represent both the procedure and its specification as logical constraints relating pre- and post-states. A procedure given by a constraint $τ(s,s')$ relating pre-state $s$ to post-state $s′$ satisfies a specification $ψ(s,
s′)$ when

∀ s, s′ . τ(s, s′) ⇒ 
        ψ(s, s′)

To find weaknesses in the specification, we determine whether there is a mutant version of the procedure code that still satisfies the specification but which produces different behavior for at least one input. The existence of such an inequivalent mutant indicates that the specification is not complete. A necessary and sufficient condition on such a mutant $τ$ ^m can be written as a conjunction of two subformulas, the first requiring that the mutant be inequivalent, and the second that it satisfy the specification.

∃ s, s′ . &tau

^m(s, s′) &and ¬τ(s, s′) ∧

∀ s, s′ . &tau

^m(s, s′) ⇒ ψ(s, s′)

We cannot solve this formula directly because the second constraint contains an unbounded universal quantifier for which relational encodings give unexpected results [5]. However, there is a constraint close to the desired constraint that can be solved. It only requires that the mutant be inequivalent on some execution that satisfies the specification:

∃ s, s′ . &tau

^m(s, s′) &and ¬τ(s, s′) ∧ ψ(s, s′)

This is a necessary condition for the desired constraint but not a sufficient one; the mutant may fail to satisy the specification on some other execution. We hypothesize that this will rarely occur and our experiments so far corroborate this.

Case Study: Voting Software

Our group is investigating how to develop reliable electronic voting software. To this end, we applied our mutant generation technique to check specifications of a simple but realistic Java program for tallying votes. The program takes as inputs a set of ballots and a set of candidates and returns the number of votes each candidate received and partitions candidates into winners, losers, and tie-ers.

We translated the code for the core vote-counting procedure into the relational first-order logic of the Alloy modelling language [1] using the method decribed in "Scope-Complete Code Analysis." That model is then parameterized by a set of possible mutations. The mutations we considered were of four types, though our technique is not particular to those four: (1) replacing a variable identifier with the identifier of another variable of the same type; (2) replacing one inequality operator with another; (3) simulating off-by-one errors by adding or subtracting 1 from integer expressions; and (4) replacing boolean expressions with their negation.

Using the techniques described, we analyzed the completeness of four possible specifications of the vote-tallying procedure. Our first parameterized model allowed for only 32 possible single mutants. On the first of the four specifications, our analyis found 7 of the 32 to meet our first-order criteria; subsequent analyses determined only two of these to be false alarms. For each of the next two specifications, 4 mutants were generated, none of which were false alarms. And on the final specification, which was complete, no such mutants were generated. On a less restricted Alloy model that allowed for up to 10 million mutants, the analysis found an inequivalent mutant within 2 minutes, suggesting that the complete mutation coverage analysis would be tractable on this model as well.

Future

It would be straightforward to incorporate the analysis we have described here in a practical tool. The construction of the parameterized model and the generation of the mutants poses no particular challenges. As our results indicate [2], the analysis does not scale well when there are inequivalent mutants that meet our first-order criteria. However, when solutions exist, they are found very rapidly. This suggests that a practical tool might set a time limit on analysis with only minimal loss of precision. A similar strategy is used in many code analyzers.

Our technique also has implications in the domain of more conventional testing. It could easily be adapted to perform automatic generation of effective test cases [4]. In mutation testing, an effective test case is one that kills at least one mutant. In our search for inequivalent mutants, we implicitly searched for the pre-state that witnesses the new behavior. This pre-state would, by definition, be an effective test case for that mutant.

Research Support

This work is supported by the NASA High Dependability Computing Program under cooperative agreement NCC-2-1298 and by grant 0086154 (Design Conformant Software) from the ITR program of the National Science Foundation.

References

[1] Daniel Jackson. Automating First-Order Relational Logic. In Proc. ACM SIGSOFT Conf. Foundations of Software Engineering (FSE), November, 2000.

[2] Greg Dennis, Robert Seater, Daniel Jackson, and Emina Torlak. Specification Coverage by Automatic Mutant Generation. Submitted for Publication, April 2004.

[3] Hana Chockler and Orna Kupferman. Coverage of implementations by simulating specifications. In 2nd IFIP International Conference on Theoretical Computer Science, 2002.

[4] Richard A. DeMillo and A. Jefferson Offutt. Constraints-based automatic test data generation. IEEE Trans. Softw. Eng., 17(9):900--910, September 1991.

[5] Viktor Kuncak and Daniel Jackson. On Relational Analysis of Algebraic Datatypes. MIT CSAIL Technical Report 985, April 2005.

Computer Science and Artificial Intelligence Laboratory (CSAIL)
The Stata Center, Building 32 - 32 Vassar Street - Cambridge, MA 02139 - USA
tel:+1-617-253-0073 - publications@csail.mit.edu
(Note: On July 1, 2003, the AI Lab and LCS merged to form CSAIL.)