Abstracts - 2007
Instrumentation of Standard Libraries in Java
Jeff H. Perkins, David Saff & Michael D. Ernst
Program instrumentation is useful for many tasks, including memory profiling, execution profiling, debugging, logging, and other dynamic analyses. Instrumentation modifies a program by adding code at particular program points to capture dynamic information. For example, a program could be instrumented to count how many times each method is called: the instrumented program would contain code at the beginning of each method that increments the count for that method.
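The method-counting example can be sketched at the source level as follows. This is only an illustration: `CallCounter` and `Account` are hypothetical names, and a class-file-based tool would insert the equivalent bytecode rather than source statements.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical runtime support class that instrumented code calls into.
final class CallCounter {
    private static final Map<String, AtomicLong> COUNTS = new ConcurrentHashMap<>();

    // Called at the beginning of every instrumented method.
    static void enter(String methodName) {
        COUNTS.computeIfAbsent(methodName, k -> new AtomicLong()).incrementAndGet();
    }

    // Number of recorded calls; 0 for a method that never ran.
    static long count(String methodName) {
        AtomicLong c = COUNTS.get(methodName);
        return c == null ? 0 : c.get();
    }
}

// What a user class might look like after instrumentation.
class Account {
    private int balance;

    void deposit(int amount) {
        CallCounter.enter("Account.deposit"); // added by the instrumenter
        balance += amount;
    }

    int balance() {
        CallCounter.enter("Account.balance"); // added by the instrumenter
        return balance;
    }
}
```

After two calls to `deposit` and one call to `balance`, `CallCounter.count("Account.deposit")` reports 2 and `CallCounter.count("Account.balance")` reports 1.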
Instrumentation can be accomplished for Java by modifying either source files or class files. Instrumenting class files is preferable to instrumenting source files in many circumstances. Source is not always available, source format changes more often and more drastically than class files, and a class-file-based approach is often easier to use, because it makes it unnecessary to store and compile instrumented source files.
Most instrumentation tasks require that the standard libraries (for example, Java's rt.jar) be instrumented as well as the user class files. For example, instrumentation added to determine abstract types must track the interactions between variables within the standard libraries. Similarly, memory profiling code must keep track of allocations within the standard libraries. Instrumenting user class files is straightforward, but instrumenting the standard libraries is challenging.
Standard Library Instrumentation Challenges
Instrumenting the Java standard libraries (primarily rt.jar) poses a number of challenges.
Previous research proposed the Twin Class Hierarchy (TCH) approach to instrumenting the Java standard libraries. TCH creates a copy of each class in the standard library with a different name. There is no inheritance relationship between the instrumented class and the original class. The instrumented classes have the same inheritance relations to each other as do the original classes. Each reference to an original class within the instrumented classes is modified so that it refers to its corresponding instrumented class.
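The shape of a twin hierarchy can be sketched with two stand-in classes. All names here (`Shape`, `Circle`, the `Twin` prefix, `TwinStats`) are hypothetical and chosen for illustration; the sketch only mirrors the properties described above and is not the TCH implementation.

```java
// Stand-ins for two standard-library classes.
class Shape {
    double area() { return 0.0; }
}

class Circle extends Shape {
    private final double radius;
    Circle(double radius) { this.radius = radius; }
    @Override double area() { return Math.PI * radius * radius; }
    Circle scaled(double factor) { return new Circle(radius * factor); }
}

// Hypothetical data collected by the instrumentation.
final class TwinStats {
    static int areaCalls = 0;
}

// Twin of Shape: a renamed, instrumented copy that is unrelated to Shape by type.
class TwinShape {
    double area() { TwinStats.areaCalls++; return 0.0; }
}

// Twin of Circle: extends TwinShape, mirroring "Circle extends Shape".
class TwinCircle extends TwinShape {
    private final double radius;
    TwinCircle(double radius) { this.radius = radius; }
    @Override double area() { TwinStats.areaCalls++; return Math.PI * radius * radius; }
    // The reference to Circle in the original is rewritten to refer to TwinCircle.
    TwinCircle scaled(double factor) { return new TwinCircle(radius * factor); }
}
```

A `TwinCircle` is a `TwinShape` but not a `Shape`, so a value of the twin type cannot be passed where the original type is expected.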
TCH leaves the original classes unchanged, which solves many of the instrumentation challenges. It does not, however, handle native calls. Native methods require the original classes and cannot be called from the instrumented classes. This can be worked around by delegating the calls to an instance of the original class and translating each argument from the instrumented type to the original type. Since there is no type relationship between the instrumented class and the original class, there is no automatic way to do this. Each native call must thus be implemented by hand, and there is no guarantee that the mechanism will always work.
Our approach is to double (duplicate) each method within the class rather than to twin each class. The doubled method is given a unique name and each reference to a method within it is modified to call the doubled version rather than the original version. The doubled version of a native method simply forwards its call to the original method. Since the original classes and fields are available, native methods work as expected.
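A source-level sketch of method doubling, under stated assumptions: the `_dbl` suffix, the `Trace` class, and the `Lib` class are hypothetical, and `identity` merely stands in for a native method whose body cannot be instrumented. The actual tool works on class files and may name and sign the doubled methods differently.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical instrumentation runtime: records which doubled methods ran.
final class Trace {
    static final List<String> EVENTS = new ArrayList<>();
    static void record(String method) { EVENTS.add(method); }
}

class Lib {
    // Original methods are left untouched, so uninstrumented callers behave as before.
    static int square(int x) { return x * x; }
    static int sumOfSquares(int a, int b) { return square(a) + square(b); }

    // Stand-in for a native method; treat its body as opaque.
    static int identity(Object o) { return System.identityHashCode(o); }

    // Doubled methods: instrumented copies under unique names.
    static int square_dbl(int x) {
        Trace.record("square");
        return x * x;
    }

    static int sumOfSquares_dbl(int a, int b) {
        Trace.record("sumOfSquares");
        // Calls inside a doubled method are redirected to the doubled callees.
        return square_dbl(a) + square_dbl(b);
    }

    // The doubled version of a native method simply forwards to the original.
    static int identity_dbl(Object o) { return identity(o); }
}
```

Calling `Lib.sumOfSquares(3, 4)` records nothing, while `Lib.sumOfSquares_dbl(3, 4)` returns the same value and records one `sumOfSquares` event followed by two `square` events.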
If the instrumentation requires per-object data, that data is stored separately in a weak identity hash map so that the layout and fields of the original class can be left unchanged.
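The JDK offers `WeakHashMap` (weak keys, compared with `equals`) and `IdentityHashMap` (identity comparison, strong keys), but no class that combines both properties. The following is a minimal sketch of such a map, assuming single-threaded use; `WeakIdentityMap` is a hypothetical name and not necessarily the structure used by the tool.

```java
import java.lang.ref.ReferenceQueue;
import java.lang.ref.WeakReference;
import java.util.HashMap;
import java.util.Map;

// Maps objects to per-object data without keeping the objects alive and
// without calling their equals/hashCode. Not thread-safe.
final class WeakIdentityMap<K, V> {

    // Weak key that hashes and compares by the identity of its referent.
    private static final class Key<T> extends WeakReference<T> {
        private final int hash;

        Key(T referent, ReferenceQueue<? super T> queue) {
            super(referent, queue);
            this.hash = System.identityHashCode(referent);
        }

        @Override public int hashCode() { return hash; }

        @Override public boolean equals(Object other) {
            if (this == other) return true;
            if (!(other instanceof Key)) return false;
            Object mine = get();
            return mine != null && mine == ((Key<?>) other).get();
        }
    }

    private final Map<Key<K>, V> map = new HashMap<>();
    private final ReferenceQueue<K> queue = new ReferenceQueue<>();

    // Drop entries whose keys have been garbage collected.
    private void expunge() {
        Object stale;
        while ((stale = queue.poll()) != null) {
            map.remove(stale);
        }
    }

    V put(K key, V value) {
        expunge();
        return map.put(new Key<>(key, queue), value);
    }

    V get(K key) {
        expunge();
        return map.get(new Key<>(key, null)); // lookup key needs no queue
    }

    int size() {
        expunge();
        return map.size();
    }
}
```

Because keys are compared by identity, two distinct objects that are `equals` to each other get separate entries, and the class being instrumented needs no additional fields.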
It is useful to be able to replace a class with a differently-named class that implements the same interface. For instance, we use this technique to implement Test Factoring. Test factoring replaces classes with new versions that track each interaction with the class. We also plan to use it to check dynamic mutability downcasts. It could also be used for other purposes such as logging or profiling.
Replacing one class by another is straightforward if the original design defined an interface that both classes implement, and the original design always referred to the interface, never to the original class. Such a situation is rarely the case.
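The straightforward case described above can be illustrated with a hand-written interface. All names (`Store`, `MemoryStore`, `RecordingStore`, `Greeter`) are hypothetical; the recording class is only a rough stand-in for a replacement that tracks each interaction.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// The interface that both the original and the replacement implement.
interface Store {
    void put(String key, String value);
    String get(String key);
}

// The original class.
class MemoryStore implements Store {
    private final Map<String, String> data = new HashMap<>();
    @Override public void put(String key, String value) { data.put(key, value); }
    @Override public String get(String key) { return data.get(key); }
}

// A differently-named replacement that tracks each interaction.
class RecordingStore implements Store {
    private final Store delegate;
    final List<String> log = new ArrayList<>();

    RecordingStore(Store delegate) { this.delegate = delegate; }

    @Override public void put(String key, String value) {
        log.add("put(" + key + ", " + value + ")");
        delegate.put(key, value);
    }

    @Override public String get(String key) {
        String result = delegate.get(key);
        log.add("get(" + key + ") -> " + result);
        return result;
    }
}

// Client code refers only to the interface, never to MemoryStore,
// so either implementation can be supplied without changing it.
class Greeter {
    private final Store store;
    Greeter(Store store) { this.store = store; }

    String greet(String user) {
        store.put("last", user);
        return "Hello, " + store.get("last");
    }
}
```

`new Greeter(new RecordingStore(new MemoryStore()))` behaves like `new Greeter(new MemoryStore())`, except that every call through the interface is logged. The substitution works only because `Greeter` never names `MemoryStore` directly.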
Our interfacing technique automatically creates such interfaces for every class, effectively separating type inheritance from implementation inheritance. The new interface consists of all of the methods defined in the concrete class, plus accessors (
Using our doubling and interfacing techniques, we have run instrumented versions of Java programs of up to 300,000 lines. The instrumented version runs about 80% slower than the original version.
Our current implementation does not fully support some Java language constructs. Since we change the signatures and names of methods, reflection requires special support so that the modified names are not visible to the user program. User-defined class loaders must be augmented so that our instrumentation is applied to the loaded classes. Full support of arrays requires a wrapper for each array so that the array can be an interface as well. We plan to complete support for these items and make the resulting tool publicly available.
This research is supported by DARPA contract FA8750-04-2-0254, NSF grants CCR-0133580 and CCR-0234651, the Deshpande Center, NTT, IBM, and the Oxygen project.
M. Factor, Assaf Schuster, and K. Shagin. Instrumentation of Standard Libraries in Object-Oriented Languages: The Twin Class Hierarchy Approach. In Proceedings of the 19th Annual ACM SIGPLAN Conference on Object-Oriented Programming, Systems, Languages, and Applications, pages 288-300, Vancouver, BC, Canada, October 2004.
David Saff, Shay Artzi, Jeff H. Perkins, and Michael D. Ernst. Automatic test factoring for Java. In Proceedings of the 21st Annual International Conference on Automated Software Engineering, pages 114-123, Long Beach, CA, USA, November 2005.