MIT CSAIL Research Abstracts

As software systems grow in size and use more third-party libraries and frameworks, the need for developers to understand unfamiliar large codebases is rapidly increasing. We are building a tool, Relo, that supports users' understanding by allowing interactive exploration of code. As the developer explores relationships found in the code, Relo builds and automatically manages a visualization mirroring the developer's mental model, allowing them to group viewed artifacts or use the viewed items to ask the system for further exploration suggestions.

Introduction

Working with the complexity of large software projects is a pervasive and growing problem. Developers face increasing difficulties in comprehending and maintaining a mental model of the code in large codebases. While techniques like object-oriented programming and design patterns have helped control complexity by allowing developers to create and use appropriate abstractions and encapsulate inessential details, these techniques require a developer reading the code to follow many kinds of relationships at once. For example, following a function call, once a simple task, now also requires keeping track of inheritance and polymorphism.

We present a program comprehension tool called Relo, which helps developers understand the roles of the multiple types of relationships in a software system. Relo visualizations start with a single code artifact (such as a package, class, or method), from which a user can browse the different types of relationships to incrementally add more code artifacts. Relo helps maintain context while users manage the visualization by choosing to remove artifacts or group them together. Such visualizations, like concern graphs [5], represent only a small, manageable part of the code and do not include irrelevant details, allowing a user to focus on the important relationships.

Relo visualizations try to be intuitive to end-users, showing code artifacts in diagrams similar to UML class diagrams, while at the same time allowing developers to zoom in to view and edit code using text editors embedded in the graph. Developers can therefore abstract to a high level, or zoom in to see code. Relo further helps maintain users' focus by providing explicit support for exploration while managing the amount and presentation of information shown to the user based on his/her interaction with code elements [1].

Most previous approaches have supported large projects by using multiple distinct views, each supporting only a single predetermined relationship (like inheritance or method-call hierarchy), with the views implemented graphically [6] or, in most IDEs, as tree widgets. To overcome the loss of context in connecting these views, Relo brings the different relationships together in a single view and uses diagrammatic constraints, such as containment or left-to-right ordering, to aid in representing the relationships between elements. While Relo uses visualizations, it focuses on users expanding the diagram across any relationship, instead of just expanding by visual containment as in [2], and tries to minimize the apparent loss of relationships between elements that are not direct descendants, as found in other approaches when following more than one type of relationship [3].

Walkthrough

Figure 1. Relo started by opening EllipseFigure.

Figure 2. Adding a method and clicking on class to show navigation aids.

Figure 3. Clicking on the inheritance navigation aids.

Figure 4. Expanding the class AbstractFigure and the method addFigureChangeListener.

We illustrate how Relo would be used by a developer for a typical comprehension task. For this example, we use a task similar to that used by JQuery [3]. The task involves a developer working with the JHotDraw [7] project, a GUI framework for building drawing applications consisting of figures like rectangles, triangles, ellipses, etc. A developer needing to add a feature that operates on figures would like to understand how to manipulate them. In attempting this task, the developer will try to understand the code, likely by taking a few steps:

A developer following the above steps using a traditional IDE will typically make rapid progress in the first three steps, finding a starting class (using simple heuristics and search queries), examining it, and selecting an appropriate base class. However, at step 4, when the developer selects a method that is called for manipulating figures and tries to examine the callers, he will have difficulty keeping track of the various examined code artifacts. The difficulty arises from the need to maintain context when examining the roles of nodes connected by multiple relationships, in this case the inheritance, containment, and method-call relationships.

This scenario would be simple with Relo. As the developer looks at the code, he will find that JHotDraw has a number of packages, with one being called figures. The developer would look at that package, and find that the class EllipseFigure would be a relevant starting point for his/her exploration. The developer would then just need to select the class, and open it in Relo (as shown in Figure 1).

Figure 1 shows that the class has 15 members, and the developer clicks on the menu to see a list. Considering the method basicMoveBy as interesting, he clicks on the method name in the menu and thereby adds the method to the diagram for future examination. Once added, the developer clicks on the class, and is presented with a navigation aid indicating the class inherits from another class (shown in Figure 2). The developer clicks on this button to show superclasses, and continues his exploration to find a relevant base class by clicking upwards (shown in Figure 3).

Once the developer has an idea of the inheritance tree of figures, he chooses to expand the AbstractFigure class. After double-clicking to see all public methods, the developer removes methods irrelevant to his task (manipulating figures) by clicking on the "x" in the corner, and examines the available methods to select one for expansion. Deciding that the addFigureChangeListener method is part of the general framework for manipulating figures, the developer decides to expand it.

The developer is presented with Figure 4, which shows the implementation of the method. After finding the implementation relevant, the developer will want to find a relevant caller of addFigureChangeListener. The developer collapses the AbstractFigure class and clicks on the caller navigation aid. Relo continues to build the graph (shown in Figure 5) and has begun to act as both a call-hierarchy browser and an inheritance-hierarchy browser.

Once presented with Figure 5, the developer can easily select relevant classes that manipulate figures, and does not have to worry about connecting inheritance, containment, and method-call relationships. As the developer continues with his task, he can build a larger visualization and choose to refine the generated diagram, so that the visualization helps in his understanding of the code base.

Constrained Layout

Relo tries to reduce cognitive overhead by using topological constraints to assist in providing a default layout of code elements, so that elements are found at expected locations. For example, wherever possible, inheritance edges are drawn vertically, method calls horizontally, and containment is shown by visual nesting.

Relo shows children of items in one of three ways. At the most constrained, children of classes are shown using a vertical layout; a more relaxed graph layout engine is used by default on children of packages; and in cases where the containment hierarchy is not important, children are laid out independently of the parent. Relo allows users to select an element and have it break from its current layout into one of the three options.

Exploration Support

Instead of requiring users to navigate property dialogs to configure relationships to be shown or filtered, Relo uses "navigation aids" to allow navigating and extending the visualization with simple clicks. For example, as shown in Figure 1, when the addFigureChangeListener method is selected, it sprouts aids for different relationships that could be followed from the method (calls, called-by). These visual hot-spots appear on the currently selected items and allow clicking to follow the relationship represented by the aid. The aids provide support for a browsing behavior commonly used when trying to "home in" on information based on the surrounding contextual information [4]. They are only shown when clicking on them will result in a modification of the view; for example, a class that is not extended by other classes will not have the extended-by navigation aids. Furthermore, a navigation aid is no longer shown after it has been clicked once, since a second click would not add any more elements; clicking on a relationship results in it being "instantiated" in the form of the drawn relationships. To limit the number of navigation aid types shown to the user and the potential cognitive overhead, only the most common relationships are shown as navigation aids, with the remaining ones being available through context-sensitive menus.
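The rule that an aid appears only when clicking it would change the view can be sketched as follows. This is a minimal illustration rather than Relo's implementation: the class name, the `Relation` enum, and the map-based code model are all assumptions made for the example, and "would change the view" is approximated as "at least one related artifact is not yet visible".

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.Set;

/** Hypothetical sketch: which navigation aids should a selected artifact show? */
final class NavigationAidSketch {

    /** Illustrative relationship kinds (assumed names, not Relo's API). */
    enum Relation { CALLS, CALLED_BY, EXTENDS, EXTENDED_BY }

    /**
     * Returns the relations for which an aid is worth showing on {@code selected}:
     * those with at least one related artifact that is not yet in the diagram.
     * A relation with no related artifacts at all never gets an aid.
     */
    static List<Relation> aidsFor(String selected,
                                  Map<String, Map<Relation, Set<String>>> model,
                                  Set<String> visible) {
        Map<Relation, Set<String>> related = model.getOrDefault(selected, Map.of());
        List<Relation> aids = new ArrayList<>();
        for (Relation relation : Relation.values()) {
            Set<String> targets = related.getOrDefault(relation, Set.of());
            // containsAll is true for an empty target set, so no aid is added.
            if (!visible.containsAll(targets)) {
                aids.add(relation);
            }
        }
        return aids;
    }
}
```

Under these assumptions, a class whose superclass is hidden gets an EXTENDS aid, and the aid goes away once the superclass has been added to the diagram.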

Relo further helps users explore the code by implementing an Autobrowse feature, which effectively does a breadth-first search to find other hidden artifacts that are relevant to the selected items. Since some relationships, like inheritance, are considered more important than others, they are searched first, with the system terminating after an item is added to the view. Users can either select a few items to have Autobrowse run on a smaller set of items, or can use Autobrowse repeatedly to keep adding items. Relo further helps users by automatically adding relationships between shown items. To reduce clutter, Relo uses browsing agents that automatically add relationships to items that have not been "broken", i.e. code artifacts that are not laid out vertically next to each other, such as the default for methods in a class.
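One way to read the Autobrowse description is as a breadth-first search that tries relationship kinds in priority order at each visited artifact and stops at the first hidden artifact it reaches. The sketch below follows that reading. It is not Relo's code: the class name, the two-value `Relation` enum, the map-based graph, and the exact way priority is combined with the search are assumptions.

```java
import java.util.ArrayDeque;
import java.util.Collection;
import java.util.Deque;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Optional;
import java.util.Set;

/** Hypothetical sketch of an Autobrowse-style search. */
final class AutobrowseSketch {

    /** Declared in assumed priority order: earlier kinds are followed first. */
    enum Relation { INHERITANCE, CALL }

    /**
     * Breadth-first search from the selected artifacts. Returns the first
     * artifact reached that is not yet visible, after adding it to
     * {@code visible}; returns empty if every reachable artifact is shown.
     */
    static Optional<String> autobrowse(Collection<String> selected,
                                       Map<String, Map<Relation, List<String>>> graph,
                                       Set<String> visible) {
        Deque<String> queue = new ArrayDeque<>(selected);
        Set<String> seen = new HashSet<>(selected);
        while (!queue.isEmpty()) {
            String current = queue.removeFirst();
            Map<Relation, List<String>> edges = graph.getOrDefault(current, Map.of());
            for (Relation relation : Relation.values()) {
                for (String target : edges.getOrDefault(relation, List.of())) {
                    if (!seen.add(target)) {
                        continue; // already queued or visited
                    }
                    if (!visible.contains(target)) {
                        visible.add(target); // one item added: terminate
                        return Optional.of(target);
                    }
                    queue.addLast(target);
                }
            }
        }
        return Optional.empty();
    }
}
```

Calling `autobrowse` repeatedly with the same `visible` set adds one artifact per call, which matches the described use of running Autobrowse several times to keep adding items. Passing a smaller `selected` collection restricts where the search starts.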

Level of Detail

In order to minimize cognitive overhead on users, every element presented in a Relo visualization defaults to showing as little information as possible. Users can semantically zoom in by double-clicking on an element or selecting the expand navigation aid ("+") to show more details. For classes, this means starting with only the class name, and at the first expansion level showing the child members having public access. For methods, expansion means showing the method implementation in an editor view. Users also have finer-grained control of shown items by using the "more items" menu to get a list of children and add only the relevant items. Similar to expanding, users can also collapse code artifacts by clicking on the "-" navigation aid, or selectively eliminate artifacts by clicking on the "x" navigation aid.

When hiding a shown code artifact that has relationships to other artifacts, the simplest approach is to hide all relationships originating or terminating at the element being hidden. However, Relo implements the more sensible choice of allowing relationships to be transitively visualized. In Figure 1, when removing the AttributeFigure class, Relo will leave a relationship between EllipseFigure and AbstractFigure, converting the inheritance relationship to a transitive inheritance relationship, visualized similarly to the inheritance relationship but with a set of dots ("...") in the middle. For relationships of different types, the conversion happens to the parent type, i.e. dependency for inheritance and method calls.
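The hiding behavior described above can be sketched as a small edge-rewriting step: every pair of edges that met at the hidden artifact is replaced by one transitive edge, keeping the kind when both edges share it and falling back to dependency otherwise. This is an illustrative sketch only. The class, enum, and field names are assumptions, and treating dependency as the common parent of every mixed pair is a simplification of the rule stated in the text.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch: hide an artifact but keep its relationships visible transitively. */
final class TransitiveHideSketch {

    /** Illustrative relationship kinds; DEPENDENCY is the assumed parent kind. */
    enum Kind { INHERITANCE, METHOD_CALL, DEPENDENCY }

    /** A directed, typed edge; {@code transitive} marks edges drawn with "...". */
    static final class Edge {
        final String source;
        final String target;
        final Kind kind;
        final boolean transitive;

        Edge(String source, String target, Kind kind, boolean transitive) {
            this.source = source;
            this.target = target;
            this.kind = kind;
            this.transitive = transitive;
        }
    }

    /** Returns the edges to draw after {@code hidden} is removed from the diagram. */
    static List<Edge> hide(String hidden, List<Edge> edges) {
        List<Edge> kept = new ArrayList<>();
        List<Edge> incoming = new ArrayList<>();
        List<Edge> outgoing = new ArrayList<>();
        for (Edge edge : edges) {
            boolean in = edge.target.equals(hidden);
            boolean out = edge.source.equals(hidden);
            if (in && out) {
                continue; // a self-loop disappears together with the artifact
            } else if (in) {
                incoming.add(edge);
            } else if (out) {
                outgoing.add(edge);
            } else {
                kept.add(edge);
            }
        }
        // Bridge every predecessor of the hidden artifact to every successor.
        for (Edge in : incoming) {
            for (Edge out : outgoing) {
                Kind kind = (in.kind == out.kind) ? in.kind : Kind.DEPENDENCY;
                kept.add(new Edge(in.source, out.target, kind, true));
            }
        }
        return kept;
    }
}
```

With the edges EllipseFigure to AttributeFigure and AttributeFigure to AbstractFigure (both inheritance), hiding AttributeFigure leaves a single transitive inheritance edge from EllipseFigure to AbstractFigure, as in the example in the text.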

Progress and Future Work

Relo is built as an integrated plug-in for the Eclipse development environment, and is freely available from: http://relo.csail.mit.edu. We are conducting evaluations of it, and are finding that developers examine the code using an 'opportunistic strategy', i.e., not examining the code in a systematic manner. Thus, while one developer might find Relo very useful for a task, another developer doing the same task might not find it useful. We aim to characterize these situations and find means for Relo to help developers be more 'opportunistically successful'. Furthermore, Relo builds visualizations similar to UML class diagrams. There are a number of situations, such as when developers try to understand a small set of methods, in which exploring more detailed diagrams similar to UML's interaction diagrams would be helpful. We are currently working on providing multiple types of views of the code for exploration by developers.

Acknowledgements

We would like to thank David Huynh, Derek Rayside, Mike Ernst, and Daniel Jackson for comments and help with this research. This work has been supported by the MIT Oxygen Project.

References

[1] Sinha, V., Miller, R., and Karger, D. R. "Incremental Exploratory Visualization of Relationships in Large Codebases for Program Comprehension", Poster, Demo & ETX workshop, OOPSLA 2005.

[2] Storey, M.-A., Muller, M., and Wong, K. "Manipulating and documenting software structures using SHriMP views", ICSM 1995.

[3] Janzen, D., and Volder, K. D. "Navigating and Querying Code Without Getting Lost", AOSD 2003.

[4] Teevan, J., Alvarado, C., Ackerman, M. S., and Karger, D. R. "The perfect search engine is not enough: a study of orienteering behavior in directed search", CHI 2004.

[5] Robillard, M. P., and Murphy, G. C. "Concern graphs: finding and describing concerns using structural program dependencies", ICSE 2002.

[6] Reiss, S. "Visualization for Software Engineering - Programming Environments", Chapter 18, pages 259-276, in "Software Visualization", ed. Stasko et al.

Relo: Helping Users Manage Context during Interactive Exploratory Visualization of Large Codebases

Vineet Sinha, David Karger & Rob Miller