CSAIL Publications and Digital Archive header
bullet Research Abstracts Home bullet CSAIL Digital Archive bullet Research Activities bullet CSAIL Home bullet

link to publications.csail.mit.edu link to www.csail.mit.edu horizontal line

 

Research Abstracts - 2007
horizontal line

horizontal line

vertical line
vertical line

Bounded Code Verification with SAT

Greg Dennis, Felix Chang & Daniel Jackson

Introduction

We are developing an analysis method that allows a procedure in a conventional object-oriented language to be checked against a rich specification. In common with all automatic analyses of code, we expect to make some compromises. Rather than the traditional compromise of abstraction, which tends to result in large numbers of false alarms, we explore all executions up to a certain size. Like bounded model checking, we restrict the length of execution traces; in addition, we bound the number of heap objects allocated for each type. This approach, being fully precise, never generates false alarms, but it can fail to discover bugs that require a larger bound for their detection.

The analysis, embodied in a tool called Forge, involves an automatic two-phase reduction. First, the code is translated [1] to an intermediate form in a relational logic suitable for describing heap-manipulating code. In the second phase of the reduction, the Kodkod relational engine [2] translates the logic to a boolean satisfiability problem, which is then handed to an off-the-shelf SAT solver.

Approach

From the code of a procedure, Forge automatically obtains a formula P(s, s') in relational logic that constrains the relationship between a pre-state s and a post-state s', and holds whenever an execution exists from s that terminates in s'. A second formula S(s, s') is obtained from a user-provided specification, and its negation is conjoined to the first, obtaining the formula:

P(s, s') and not S(s, s')

When checking the code of an abstract type, the (abstract) representation used in the specification will not be the same as the (concrete) representation used in the code. The standard technique for bridging the gap involves the user providing an abstraction function that interprets each concrete value as an abstract one, as well as a representation invariant on the concrete values. In our logical setting, the abstraction is provided as a formula A(c, a) and the invariant as a formula R(c). Thus, to check the procedure of an abstract data type for conformance to the specification, the analysis solves:

R(c) and P(c, c') and A(c, a) and A(c', a') and not S(a, a')

The procedure is translated to relational logic by means of a symbolic execution of the code. At each point in the execution, the symbolic environment maps local variables and fields to relational expressions for their values. The relational expressions are constructed according to a relational view of the heap [3]. In this view, fields are binary, functional relations that map elements of their class to elements of their target type; local variables and arguments are singleton sets; and field dereference becomes relational join. To illustrate, consider the field f represented by the functional relation F, and the variable x represented by the singleton set X. Because X is a singleton and F is a function, the join expression X.F yields a singleton set representing the value of the dereference x.f.

Forge accepts programs written in Java and specifications written in the Java Modeling Langauge (JML) [4]. The bounds given by the user is specified as a maximum number of objects that can appear in the heap and a limit on the number of loop iterations. If Kodkod finds a solution to the generated relational formula, Forge will translate that solution into a trace of the given Java method. The architecture of the Forge analysis is shown in the figure below.

Forge Analysis Architecture

Forge Analysis Architecture

The trace produced by Forge is guaranteed to be an actual trace of the code that violates the provided specification. However, counterexamples may be missed if they fall outside the bounds provided by the user. Thus, we call this analysis a bounded verification.

Our technique is intended to be used as a modular, not a whole program, analysis. Because the analysis checks a procedure in isolation against its specification, accounting for all possible contexts of use, that specification can then be assumed in place of the procedure when analyzing client code. As a result, large coded bases can be checked in a modular, incremental fashion.

Results

We have used Forge to check a variety of linked list implementations of the java.util.List interface against the JML specification for List [1]. The implementations were drawn from the Sun Collections Framework, the GNU Trove library, the Apache Jakarta Commons-Collections library, and some variants of the Sun implementation that had been seeded with bugs for an MIT software engineering class. The tool revealed all the seeded bugs and a previously unknown bug in the GNU Trove library.

Our current challenge is applying the tool to verify vote-tallying code used by the nation of the Netherlands. We also hope to incorporate Taghdiri's specification inference techniques [5] [6] into Forge to reduce the need for writing specifications of called procedures.

Research Support

This work is supported in part by the National Science Foundation (Deep and Scalable Analysis of Software, Award 0541183).

References:

[1] G. Dennis, F. S. Chang, and D. Jackson. Modular Verification of Code with SAT. International Symposium on Software Testing and Analysis (ISSTA '06), Portland, Maine, July 2006.

[2] E. Torlak and D. Jackson. Kodkod: A Relational Model Finder. Tools and Algorithms for Construction and Analysis of Systems (TACAS '07), Braga, Portugal, March 2007.

[3] D. Jackson. Object Models as Heap Invariants. Essays on Programming Methodology, edited by Carroll Morgan and Annabelle McIver, Springer Verlag, 2000.

[4] The Java Modeling Language. http://www.cs.iastate.edu/~leavens/JML/

[5] M. Taghdiri. Inferring Specifications to Detect Errors in Code. International Conference on Automated Software Engineering (ASE '04), Linz, Austria, September 2004.

[6] M. Taghdiri, R. Seater, and D. Jackson. Lightweight Extraction of Syntactic Specifications. International Symposium on Foundations of Software Engineering (FSE'06), Portland, Oregon, November, 2006.

 

vertical line
vertical line
 
horizontal line

MIT logo Computer Science and Artificial Intelligence Laboratory (CSAIL)
The Stata Center, Building 32 - 32 Vassar Street - Cambridge, MA 02139 - USA
tel:+1-617-253-0073 - publications@csail.mit.edu