Abstracts - 2006
Dynamic Purity Analysis
Shay Artzi, Adam Kiezun & Michael Ernst
We present a new combined static and dynamic analysis to discover read-only parameters. A method parameter is read-only if the method does not mutate any object transitively reachable from the parameter using the parameter's reference. Knowledge of such parameters can be utilized in specifications [JML], invariant detection [Daikon] and test generation [Palulu]. Several whole-program static analyses (e.g., Salcianu's jppa, Routnev's) exist that address this problem. However, they do not scale to large programs (over 10KLOC) and are overly conservative (suffer from many false negatives), missing many read-only parameters.
We propose to use a combination of dynamic and localized static analysis to create a more scalable solution to the problem of detecting read-only parameters. Our algorithm is more precise than existing static analyses (i.e., detects more read-only parameters). Like any dynamic analysis, our analysis is unsound and may wrongly report as read-only a parameter that may in fact be modified on some future execution. We plan to investigate how such false positives affect the utility of the results.
Our main contribution in this research is the creation of a scalable combined static and dynamic analysis algorithm that correctly identifies all of the read-only parameters (i.e, no false negatives), with only a small percentage of false positives.
We plan to use the results of the analysis as input to Palulu, a model-directed regression test generator, that uses this information to create smaller models (omitting calls that will not change the state of objects). We also plan to use the results as input to Daikon, a dynamic invariant detector, that uses pure methods (methods that do not change any object that exists in the program before the method's invocation) to detect invariants. We want to investigate whether the output of those two tools improves when they are provided with results of our analysis.
The technique has two phases, dynamic and static. Each uses a different analysis to find modified parameters. Any parameter not found to be modified is reported as read-only.
In the first phase, the system uses dynamic information from an example execution to find modified parameters. A parameter is marked as modified if any object transitively reachable from the parameter is modified using the parameter reference.
For example, consider the following method:
If during the example execution, this method was called with b == true, the dynamic analysis would recursively analyze the body of method List.clear() and discover that the method modifies its receiver. Therefore, parameter l1 would be marked as modified. Parameter l2 would not be marked as modified on that execution.
After the dynamic phase many parameters are classified as read-only (such as l2 in the example above) even though a future execution may actually modify them. The second, static, phase is aimed to reduce the number of such false positives.
In the second, static phase of our technique, a simple and scalable static analysis is used. Using a static call-graph, our algorithm constructs a graph of static dependencies between parameters and the call sites in the method. This graph is seeded with information about parameter mutability obtained in the dynamic phase. Then, all parameters reachable from the initial set of modified parameters are also marked as modified. In the example above, parameter l2 is marked as modified by the static phase because a method (clear) that may modify its receiver if it is called using the l2 reference.
We have completed the implementation of the dynamic phase and are finishing the implementation of the static phase.
We conducted an experiment comparing the results from the dynamic analysis phase with Salcianu's static analysis on two example programs, the Java compiler from Eclipse and Daikon, a dynamic invariant detector. For methods covered by the example execution, our analysis detects all of the read-only parameters detected by the static analysis, while also finding additional read-only parameters missed by the static analysis.
For Eclipse, 957 (with 1094 non-primitive parameters) methods were covered by the example execution. Our analysis agreed with the static analysis on 82% of the parameters (i.e., classified parameters as read-only or modified) of the covered methods. For Daikon (3030 methods covered by example execution, 2495 non-primitive parameters), the agreement was on 63% of the parameters. We expect to improve those numbers with the implementation of the second phase (static analysis extension).
We plan to compare our results with those of other tools that aim at finding read-only parameters, e.g., jppa, Routnev's tool and Javarifier (a tool to infer reference immutability).
This work was supported in part by NSF grants CCR-0133580 and CCR-0234651, DARPA contract FA8750-04-2-0254, and gifts from IBM and Microsoft.
[Daikon] The Daikon dynamic invariant detector, website http://pag.csail.mit.edu/daikon
[Palulu] Automatic regression tests generator, website http://pag.csail.mit.edu/6.883/projects/unit-regression-tests.pdf
[Javari] Javari: reference immutability for Java, website http://pag.csail.mit.edu/javari
[JML] JML: Java Modeling Langauge, website http://www.cs.iastate.edu/~leavens/JML
[jppa] jppa: Java pointer and purity analysis tool, website http://jppa.sourceforge.net
[Routnev] Atanas Routnev, Precise Identification of Side-effect-free Methods in Java, ICSM 2004