Abstracts - 2007
Refactoring for Parameterizing Java Classes
Adam Kiezun, Michael Ernst, Frank Tip1 & Robert Fuhrer1
1IBM T.J. Watson Research Center
Generics are now a feature of the Java programming language. They enable the creation of type-safe reusable libraries. Much pre-generic Java code would benefit from being upgraded to use generics, and this task can be viewed as two related technical problems [1,6]:
The parameterization problem subsumes the instantiation problem, because to parameterize a class, references to other, already parametric classes, may need to be instantiated. For example, if the
In our previous work [3,4], we showed that the instantiation problem can be solved using completely automatic and scalable techniques, and the Infer Generic Type Arguments refactoring in Eclipse 3.1 is based on our results. However, thus far, class libraries such as the Java Collections Framework, Apache Collections, etc have been parameterized manually, and developers involved with this task described it as very time-consuming, tedious, and error-prone.
Our contribution in this project is a solution to the parameterization problem with the following salient characteristics:
Our parameterization algorithm is based on type constraints and has 3 steps:
Type constraints and the solving algorithm are discussed below.
Type constraints express subtype relationships between elements in the program. For example, given an assignment x = y, the rules of the Java language specify that in order for the program to be type-correct, the type of the right-hand side of the assignment (i.e., y) must be equal or be a subtype of the type of the right-hand side of the assignment. In our notation, [y] ≤ [x] (square brackets mean "type of", the less-than symbol means "equal or subtype of"). The two "sides" of a type constraint are called constraint variables. A program is type-correct when all type constraints are satisfied.
In addition to preserving type correctness, type constraints are used to preserve the program's behavior. For example, type constraints between parameters of overriding and overridden methods ensure that the overriding relationship is preserved. Without such constraints, the behavior of dynamic dispatch may change, thus changing the program's behavior.
In the presence of generics, a type of a program entity is not necessarily fixed, and may depend on the context in which the entity is referenced. For example, consider the following class declaration:
The return type of
In the parameterization problem, the placement of formal type parameters is initially unknown. The program entities whose types depend on the context are also initially unknown. Our algorithm uses context variables, a special form of constraint variables in which it is possible to express that a variable may or may not be given a type parameter. For example, consider the following call to an initially unparameterized
The type constraint
Solving Type Constraints
Because of dependencies, classes must be parameterized in groups. For example, if the UnionFind class has a member class to iterate over its elements, both classes must be parameterized in order to parameterize UnionFind. It is impractical, and often impossible (because of mutual dependencies) to parameterize classes in the order of dependencies. Some classes, such as abstract classes and interfaces, cannot be parameterized on their own.
Our solving algorithm enables parameterization of multiple interdependent classes at once. Our constraint solver uses a non-standard step that propagates type parameter information between classes. This allows parameterization of interdependent classes.
The constraint solving algorithm works as follows. Every variable in the solver has an assigned type estimate. Type estimate for a variable is a set of types that the variable may legally have. Type estimates contain non-parametric types, type parameters and wildcard types. As solving progresses, type estimates are narrowed, according to the type constraints. When a type estimate for a context variable is narrowed to contain only a type parameter, then the main, non-context part of the variable is assigned a (new) type parameter. This is necessary to preserve type correctness of the resulting program and allows parameterization of interdependent classes.
At the end of solving, type estimates for all variables are singleton sets. Those sets do not contain parametrized types. For example, there may be a variable
We implemented our technique as a refactoring in Eclipse. We evaluated its effectiveness by parameterizing selected classes from 6 already-generic and 2 non-generic libraries. Parameterizing already-generic classes allows direct comparison of the results automatically computed by our analysis with those that were created by a human expert (i.e., the libraries' authors).
For our experiments, we parameterized 148 classes from the selected libraries. Then, we manually compared the results with the expert-made parameterization. In 87% of cases, our tool performed equally well (i.e., the parameterization was exactly the same or equivalent). In 4% of cases, our tool performed worse than the human (i.e., the inferred parameterization, though correct, was not what a skilled programmer would have done). In 9% of cases, our tool performed better than the human expert (i.e., the automatically-inferred parameterization allowed for removal of more casts or more closely followed the style of the standard collections).
We present the results at ICSE'07 .
This work was supported in part by NSF grants CCR-0133580 and CCR-0234651, DARPA contracts NBCH30390004 and FA8750-04-2-0254, and gifts from IBM and Microsoft.
 A. Donovan, M. Ernst. Inference of generic types in Java, MIT Technical Report, 2003
 D. Duggan. Modular type-based reverse engineering of parameterized types in Java code, In The Proceedings of International Conference on Object-Oriented Programming Systems, Languages and Applications (OOPSLA) 1999
 A. Donovan, A. Kiezun, M. Tschantz, M. Ernst. Converting Java programs to use generic libraries, In The Proceedings of International Conference on Object-Oriented Programming Systems, Languages and Applications (OOPSLA) 2004
 R. Fuhrer, F.Tip, A.Kiezun, J.Dolby, M.Keller. Efficiently refactoring Java applications to use generic libraries, In The Proceedings of the European Conference on Object-Oriented Programming (ECOOP) 2005
 A. Kiezun, M. Ernst, F.Tip, R. Fuhrer. Refactoring for parameterizing Java classes, In The Proceedings of International Conference on Software Engineering (ICSE) 2007
 D. von Dincklage, A. Diwan. Converting Java classes to use generics, In The Proceedings of International Conference on Object-Oriented Programming Systems, Languages and Applications (OOPSLA) 2004