Research Abstracts - 2006

Palulu: Automatic Generation of Unit Regression Tests

Shay Artzi, Michael Ernst, David Glasser, Adam Kiezun, Carlos Pacheco & Jeff Perkins

Introduction

Programmers spend a significant amount of time modifying existing code: for example, fixing bugs, refactoring, or adding new features. Regression testing---comparing the behavior of the changed program with the program's original behavior---is a widely-used practice to ensure behavior preservation across code changes. We have developed a technique that automates the generation of regression tests at the unit level.

Figure 1 below shows a unit regression test for Java's Date class. The test was one among many generated for the Java JDK API by Palulu, a tool that implements our test generation technique. The test records and checks the behavior of a specific implementation of Date.after(Date) (Sun's JDK 1.5). This test uncovers a regression problem in Sun's JDK version 1.6 Beta (released February 2006): when run on 1.6, the call to after throws an exception not specified in the API and not thrown in 1.5. Tests like the one in Figure 1 help developers catch regression errors in new versions of their programs or libraries.

public static void test1() throws Exception {
    GregorianCalendar c1 = new GregorianCalendar(1000, 1, 1);
    Date d1 = c1.getTime();
    d1.setSeconds(1);
    // Sun JDK 1.6 throws a ClassCastException here;
    // this is a regression from Sun JDK 1.5.
    d1.after(d1);
}

Figure 1: An example regression test (created by our technique).

Our technique takes a program and outputs many regression tests like the one in Figure 1. An automated tool like Palulu can help developers create regression tests that complement manually-created tests. Palulu has found a number of regression errors in well-tested libraries like IBM's implementation of the JDK and Sun's JDK 1.6 Beta.

Creating regression tests

Creating a regression test consists of (1) creating test inputs that exercise the program under test, and (2) creating an oracle that checks that the behavior of the program is preserved. Our technique creates test inputs by constructing random sequences of method calls that exercise the methods of the program under test, using feedback-directed generation [2]. For each input, the technique creates a regression oracle by running the program on the input and recording the (primitive) values returned by a set of observer methods for the class under test. Observer methods are obtained via an automatic purity analysis [1,3,4]. After executing a test input and recording its return values, the technique re-executes the input and checks that the recorded behavior is reproduced; this second run filters out recorded behaviors that are not reproducible (consider, for example, a test that records the value of Calendar.getInstance().getTimeInMillis()). Finally, the technique outputs the generated regression tests as source code.
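
The following sketch illustrates the oracle-creation step. It is a simplified approximation, not Palulu's actual code: the class and method names are hypothetical, and for brevity it re-invokes each observer directly rather than re-executing the entire test input.

import java.lang.reflect.Method;
import java.util.ArrayList;
import java.util.List;

class OracleSketch {
    // Record each observer's return value, re-execute, and keep only
    // checks whose observed values are reproducible (filtering out,
    // e.g., wall-clock-dependent values such as getTimeInMillis()).
    static List<String> createChecks(Object receiver, List<Method> observers)
            throws Exception {
        List<String> checks = new ArrayList<String>();
        for (Method obs : observers) {
            Object first = obs.invoke(receiver);   // record the value ...
            Object second = obs.invoke(receiver);  // ... then re-execute
            boolean reproducible =
                (first == null) ? (second == null) : first.equals(second);
            if (reproducible) {
                checks.add("assertEquals(" + first + ", receiver."
                        + obs.getName() + "());");
            }
        }
        return checks;
    }
}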

Creating complex regression tests

Random test generation is effective for testing programs with relatively unconstrained interfaces, i.e., programs (like the JDK) that, for the most part, allow arbitrary sequencing of method calls. Random testing is less effective for programs with more constrained interfaces, which require specific sequences of calls to behave properly. For example, a database server may require calls that initialize the database and establish a connection with a client before servicing requests. Purely random test generation is blind to these constraints and is unlikely to create legal test inputs that require complex setup code, as the sketch below illustrates.
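
As a concrete (hypothetical) illustration of such a constrained interface, consider a minimal database server class; the names below are ours, not from any real library. A random sequence of calls on this class almost never reaches serve() without throwing:

class DatabaseServer {
    private boolean initialized = false;
    private boolean connected = false;

    void initialize() {
        initialized = true;
    }

    void connect() {
        // Legal only after initialize().
        if (!initialized) throw new IllegalStateException("not initialized");
        connected = true;
    }

    String serve(String request) {
        // Legal only after connect(); random call orderings usually fail here.
        if (!connected) throw new IllegalStateException("not connected");
        return "response to " + request;
    }
}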

We have developed a second technique (also implemented in Palulu) that guides the generation process using a model of legal method sequences, derived from a method trace of an example execution of the program. The technique creates a directed graph that summarizes the sequences of method calls observed during the program's run: edges are labeled with method calls, and paths represent call sequences observed during execution. The model also records the primitive values (and strings) that were passed as input arguments during the example run. The technique creates one model for each class under test.
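
One plausible representation of such a model is sketched below; the class and field names are assumptions for illustration, not Palulu's internals. States are graph nodes, and each outgoing edge is labeled with a method call and the primitive arguments observed for it in the trace:

import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class CallSequenceModel {
    static class Edge {
        final String methodSignature;     // e.g., "connect()"
        final List<Object> observedArgs;  // primitives/strings seen in the trace
        final int targetState;
        Edge(String sig, List<Object> args, int target) {
            methodSignature = sig;
            observedArgs = args;
            targetState = target;
        }
    }

    // Maps each state to the calls observed leaving it; a path from the
    // start state corresponds to a call sequence seen in the example run.
    final Map<Integer, List<Edge>> outgoing = new HashMap<Integer, List<Edge>>();

    void addObservedCall(int from, String sig, List<Object> args, int to) {
        List<Edge> edges = outgoing.get(from);
        if (edges == null) {
            edges = new ArrayList<Edge>();
            outgoing.put(from, edges);
        }
        edges.add(new Edge(sig, args, to));
    }
}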

Figure 2 shows an example model for a class implementing a database server. In addition to recording method calls, our instrumentation framework emits all field writes that happen within a method call. We use this additional information to dynamically determine pure (observer) methods from the trace [1], which the generator uses to create a regression oracle. Thus, the technique performs two dynamic analyses based on the example execution: model creation, which guides test-input generation, and purity analysis, which supplies the observer methods for regression oracles.
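
A deliberately simplified sketch of the trace-based purity classification follows. It assumes the instrumentation attributes every field write to its enclosing call, and it ignores subtleties such as nested calls; a method never observed writing a field is classified as an observer:

import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

class PurityFromTrace {
    // Maps each method signature to whether any of its recorded
    // invocations performed a field write.
    private final Map<String, Boolean> wroteField =
        new HashMap<String, Boolean>();

    void onMethodEntry(String signature) {
        if (!wroteField.containsKey(signature)) {
            wroteField.put(signature, Boolean.FALSE);
        }
    }

    void onFieldWrite(String enclosingSignature) {
        wroteField.put(enclosingSignature, Boolean.TRUE);
    }

    // Methods never observed writing a field are treated as observers.
    Set<String> observerMethods() {
        Set<String> pure = new HashSet<String>();
        for (Map.Entry<String, Boolean> e : wroteField.entrySet()) {
            if (!e.getValue().booleanValue()) pure.add(e.getKey());
        }
        return pure;
    }
}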

Figure 2: Example model for a database server class.

Given a set of models (one per class under test), the regression test generator creates inputs randomly, except that objects for modeled classes are constrained to follow the model. For the database example, a database object would be created by following the model from Figure 2, and thus it would be properly initialized.
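
The sketch below shows how model-guided generation might work, building on the hypothetical CallSequenceModel above (again an illustration, not Palulu's code): a random walk over the model graph yields a call sequence that is legal by construction, e.g., initialize() before connect() before serve().

import java.util.ArrayList;
import java.util.List;
import java.util.Random;

class ModelGuidedGenerator {
    private final Random random = new Random();

    // Walks the model from the start state, choosing a random outgoing
    // edge at each step, and returns the resulting method-call sequence.
    List<String> generateSequence(CallSequenceModel model,
                                  int startState, int maxCalls) {
        List<String> calls = new ArrayList<String>();
        int state = startState;
        for (int i = 0; i < maxCalls; i++) {
            List<CallSequenceModel.Edge> edges = model.outgoing.get(state);
            if (edges == null || edges.isEmpty()) break;  // terminal state
            CallSequenceModel.Edge e = edges.get(random.nextInt(edges.size()));
            calls.add(e.methodSignature);  // arguments drawn from e.observedArgs
            state = e.targetState;
        }
        return calls;
    }
}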

Evaluation

We used our tool to create regression tests for classes from Sun's JDK 1.5 and ran the tests on a newer version, Sun's JDK 1.6 Beta. In a second experiment, we created regression tests for the same classes from Sun's JDK 1.5 and ran the tests on IBM's JDK 1.5, a different, independent implementation of the same API. In both experiments, our tool found bugs (like the one illustrated in Figure 1) in a matter of seconds.

In an experiment to evaluate model-based generation of complex inputs, we used Palulu to generate a test for a complex class in Daikon, a large program developed in our group. Based on a model generated from an example execution, Palulu created a test consisting of 20 method calls and involving 10 interrelated classes. We also asked a developer familiar with the code to manually create a test for the class; the task took him 30 minutes, and his test was structurally very similar to the one Palulu generated. Furthermore, random generation without a model was unable to create a sequence of method calls that produced a legal object of the class.

References:

[1] S. Artzi, A. Kiezun, and M. D. Ernst. Exploring Dynamic Purity Analysis. CSAIL Research Abstract, 2006.

[2] C. Pacheco, S. K. Lahiri, S. Pal, T. Ball, and M. D. Ernst. Feedback-Directed Random Test Generation. Under submission.

[3] A. Rountev and B. G. Ryder. Points-to and Side-Effect Analyses for Programs Built with Precompiled Libraries. Lecture Notes in Computer Science, 2027:20+, 2001.

[4] A. Salcianu and M. C. Rinard. Purity and Side Effect Analysis for Java Programs. In VMCAI, pages 199-215, 2005.

 
