CSAIL Research Abstracts - 2005 link to http://publications.csail.mit.edu/abstracts/abstracts05/index.html link to http://www.csail.mit.edu
bullet Introduction bullet Architecture, Systems
& Networks
bullet Language, Learning,
Vision & Graphics
bullet Physical, Biological
& Social Systems
bullet Theory bullet

horizontal line

Chicory: Value Profiling in Java

Eric Fellheimer, Sandra Loosemore, Jeff Perkins & Michael Ernst

What

Value profiling [3] is a technique that observes the run-time values of variables and expressions. It is a general technique that is incorporated in any dynamic analysis concerned with values that a program computes. We have developed Chicory, a value profiler for Java applications. Given a Java program and its input, Chicory runs the program while recording runtime information to a data trace file. Chicory operates by performing a bytecode to bytecode transformation at load time.

Our goal was to develop a tool which is easy to use, easy to maintain, efficient, and does not alter the behavior of the target program (Chicory is just an observer of the target application).

Currently, Chicory records variable and expression values at method entries and exits. The system could easily be modified to record other information or to do so at different times than method entries and exits.

Why

The data trace of a program provides crucial run-time information to developers. For instance, a user could compare two execution traces of the same program, one of which crashed, to deduce where one of the executions went wrong.

More generally, such data traces are important to dynamic analysis tools. For instance, Chicory acts as a Java front end for the Daikon invariant detector. From an execution trace, Daikon infers (likely) invariant properties of a program.

How

Chicory performs a bytecode to bytecode transformation on classes as they are loaded, using the Java 5.0 profiling API. The bytecode transformer adds calls to Chicory at all instrumented method entry and exits points. The inserted routines use reflection to query variable values, then write them to the data trace file.

Chicory's implementation makes calls to the BCEL library for manipulating bytecode, which was considerably simpler than dealing with complicated source file parsing.

The user can invoke Chicory almost as easily as invoking the target application alone. For instance, if the program util.Sorter is invoked as:

  java util.Sorter 2 5 1

then a user can instrument it on the same inputs by calling:

  java daikon.Chicory util.Sorter 2 5 1

During the design phase, we also considered using the JDI (Java Debug Interface) and a source to source transformation as opposed to the bytecode to bytecode transformation. We will now discuss these alternatives briefly.

The Java Debug Interface (JDI) allows developers to extract information such as variable values from a separate JVM that runs the target program. Because it is designed specifically for debuggers, the JDI clearly states which methods have side effects in the target application. Therefore, the instrumentation code can avoid interaction with the target simply by not calling API methods with such side effects. Unfortunately, this interface is designed for debugging on time scales at which end-users interact with a debugger. It would introduce too much overhead into Chicory. Indeed, our tests showed that instrumentation with the JDI took over 10 times as long as instrumentation with a bytecode transformation. One program which took 5 minutes to instrument with the bytecode technique took over 70 minutes with the JDI.

A source code transformation technique provides similar behavior to a bytecode transformation in terms of interaction with the program and efficiency. Such a technique, however, often requires an intermediary compilation step which complicates how a developer invokes the system. Also, source code transformation techniques are often more sensitive to changes in the language, especially syntax. Consider the added complexity, even just in terms of syntax, which Java 1.5 embodies compared to previous versions (i.e., generics).

Progress

Chicory is now included in the Daikon release as the default front-end for Java. Instrumented programs run approximately twice as slow as the uninstrumented version, and they run faster than Daikon itself, so instrumentation overhead is not the bottleneck. Chicory has proved to be robust, and it has been run on programs as large as Daikon itself (approximately 180,000 lines of code).

Ideally, Chicory should not induce any side effects in target program execution. One issue occurs when querying values of static fields. The reflection specification asserts that the virtual machine will initialize the class when trying to get a static field value if the class has not yet been initialized. Therefore, by asking for a static field value, Chicory may change the order in which the target application's classes are initialized. In order to avoid this forced initialization, Chicory needs a way to check if classes were already initialized or not. The instrumentation transforms the bytecode in each class' static initializer to notify Chicory, at the end of the initializer, that the class has been initialized.

Future

There are several features and improvements we are looking forward to implementing. First is "online mode" in which Chicory's execution trace output is piped directly to Daikon, circumventing the intermediate file. This should reduce hard disk related latency.

In the long run, we will also consider modifying the execution trace file format for added efficiency.

Research Support

This research is supported by DARPA contract FA8750-04-2-0254, NSF grants CCR-0133580 and CCR-0234651, the Deshpande Center, NTT, IBM, and the Oxygen project.

References

[1] Michael D. Ernst. Dynamically Discovering Likely Program Invariants. PhD thesis, University of Washington Department of Computer Science, Seattle, Washington, August 15 2000.

[2] Michael D. Ernst, Adam Czeisler, William G. Griswold, and David Notkin. Quickly detecting relevant program invariants. In Proceedings of the 22nd International Conference on Software Engineering, pages 449-458, Limerick, Ireland, November 7-9 2000.

[3] B. Calder, P. Feller, and A. Eustace. Value profiling. In Proceedings of the 27th Annual International Symposium on Microarchitecture (MICRO-97), pages 259 269, San Jose, CA, Dec. 1 3, 1997.

horizontal line

MIT logo Computer Science and Artificial Intelligence Laboratory (CSAIL)
The Stata Center, Building 32 - 32 Vassar Street - Cambridge, MA 02139 - USA
tel:+1-617-253-0073 - publications@csail.mit.edu
(Note: On July 1, 2003, the AI Lab and LCS merged to form CSAIL.)