CSAIL Research Abstracts - 2005

Understanding Logical Models with Non-Examples

Robert M. Seater


"Give me a fruitful error anytime, full of seeds,
     bursting with its own corrections."
                                 --Vilfredo Pareto
Motivation

Suppose you are a software developer and have formulated an important aspect of your program as a logical model. Now you want to make sure that your model is saying what you think it is saying and that you are correctly interpreting the results it gives you. For example, you might ask "What is the role of this line in my model?" or "What might have happened had this line contained an error?". If you are not the author of the model, then these questions are all the more pertinent.

This work develops a notion of the 'role' of a subformula which has intuitive value to the user and formal meaning in the logic underlying the model. Using the technique we describe, it is possible to automatically generate explanations of the form "Had you omitted/altered Line X, the following solutions would have been allowed/disallowed...".

Potential Impact

Our group's previous work on logical models has been about making the impossible possible -- making analysis of models scale to larger problems while supporting more sophisticated language constructs. This project is about making the possible practical -- providing the user with tools and techniques to aid the development, maintenance, and understanding of models. Making models easy to understand and convenient to write is essential to expanding the role of modeling in reliable software.

The immediate impact of this work is to make developing logical models more pleasant, less error-prone, and more fruitful. The two primary benefits of writing a model of part of your system are (1) verifying correctness properties and (2) gaining insight into the key issues of the system, allowing you to produce a more effective design. Answering questions about the role of particular portions of the model aids both endeavors. In addition, models, like programs, contain errors. Checking your intuition about the role of a subformula against that formula's actual role can help identify unexpected behaviors of the model. When such behaviors are found, either a bug has been found in the model or something has been learned about the system being modeled -- either result is beneficial.

The longer-term benefit of tool support for understanding models is wider acceptance of models as part of software development. Currently, modeling is recognized as useful but often goes unused because it is time-consuming or intimidating. Logical modeling lends itself to expressing partial models which capture a key aspect of a system while abstracting away the rest. While this means that the models are much simpler than the code, they can still be too complex to properly understand by hand. After all, if understanding the model were trivial, there would have been little benefit in writing it in the first place. The interesting and useful models have some subtle aspect to them, reflecting a subtlety of the underlying system. Discovering and understanding such subtleties of a system is a step towards making it robust and correct.

Major Findings & Results: The Role of a Subformula

When looking at a logical model, it is natural to ask what role some particular constraint plays in that model. But what sort of an animal is the role of a subformula? Logicians and philosophers have addressed this issue in a number of ways that involve extending the logic with special constructs. We propose non-example generation as an appealing, lightweight alternative. This technique explains the role of a formula by calculating what would happen differently if that formula were absent or altered. We introduce conjunction diagrams as a notation for presenting such information to the user, as well as a useful aid for doing proofs.

Determining the effect of deleting a subformula T is not as simple as removing T and re-analyzing the model. Suppose, after removing T, your checker produces a new solution. Was it a solution before you removed T? Are the old solutions still solutions? What you really want to know is which new solutions were added and which old solutions were inhibited. A more sophisticated approach is required, and we introduce non-example generation as one such approach. The role of a subformula can be described in terms of how the set of solutions to the model would change if that subformula were altered or removed.
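To make the comparison concrete, the sketch below enumerates the solutions of a toy model with and without a chosen conjunct and reports the difference between the two solution sets; that difference is exactly the set of non-examples described above. The encoding (a model as a Python list of predicate functions over three boolean variables) is ours for illustration only, not the representation used by any actual tool.

    from itertools import product

    def solutions(conjuncts, num_vars=3):
        """Enumerate the assignments (tuples of booleans) that satisfy
        every conjunct of the model."""
        return {a for a in product([False, True], repeat=num_vars)
                if all(c(a) for c in conjuncts)}

    # A toy model over (x, y, z): three conjoined constraints.
    model = [
        lambda a: a[0] or a[1],         # "Line 1": x or y
        lambda a: not (a[0] and a[2]),  # "Line 2": not (x and z)
        lambda a: a[1] or not a[1],     # "Line 3": a tautology
    ]

    def deletion_diff(model, i):
        """Solutions admitted only once conjunct i is deleted --
        the non-examples explaining that conjunct's role."""
        before = solutions(model)
        after = solutions(model[:i] + model[i + 1:])
        return after - before

    print(deletion_diff(model, 1))  # the assignments that "Line 2" disallows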

It can be proven that removing a subformula never both adds one solution and removes another. Thus any formula f can be classified as either (a) relaxing (deleting f would reduce the set of solutions), (b) restraining (deleting f would increase the set of solutions), or (c) irrelevant (deleting f would not change the set of solutions).
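Continuing the sketch above, this classification can be computed directly from the two solution sets; by the theorem, at most one of the two set differences is non-empty.

    def classify(model, i):
        """Classify conjunct i by the effect of deleting it. By the
        theorem above, 'added' and 'removed' are never both non-empty."""
        before = solutions(model)
        after = solutions(model[:i] + model[i + 1:])
        added, removed = after - before, before - after
        if added:
            return "restraining"  # deletion admits new solutions
        if removed:
            return "relaxing"     # deletion rules out old solutions
        return "irrelevant"       # deletion changes nothing

    for i in range(len(model)):
        print(i, classify(model, i))  # 0 and 1 are restraining; 2 is irrelevant

Note that in this toy encoding every subformula is a top-level conjunct, so only the restraining and irrelevant cases can arise; a relaxing subformula occurs when the deleted formula sits in a negative position, for instance under a negation.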

One can also ask what would happen to the set of solutions if the subformula in question were altered in an arbitrary way. That is, what is the potential impact of arbitrary errors in that subformula (as opposed to the impact of omitting it)? In the case of a boolean formula (as opposed to a relational expression), it turns out to be sufficient to examine the effect of deleting the formula and the effect of sabotaging it. A formula is sabotaged by replacing it with whichever boolean constant is not equivalent to deleting it. Sabotage is not itself a useful concept for the user; rather, its value is that it can be used to compute the effect of an arbitrary mutation.
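As an illustration in the toy encoding above: for a top-level conjunct, deleting the formula is equivalent to replacing it with true, so sabotage replaces it with false. In that simple positive position, any mutation of the conjunct yields a solution set bounded between the sabotaged and deleted extremes, as the sketch below demonstrates; the mutation shown is a hypothetical erroneous edit, invented for the example.

    TRUE, FALSE = (lambda a: True), (lambda a: False)

    def with_replacement(model, i, g):
        """Solutions of the model with conjunct i replaced by formula g."""
        return solutions(model[:i] + [g] + model[i + 1:])

    # For a top-level conjunct, deletion is equivalent to replacement by
    # True, so sabotage replaces the conjunct with False instead.
    deleted   = with_replacement(model, 1, TRUE)
    sabotaged = with_replacement(model, 1, FALSE)

    # A hypothetical typo in "Line 2"; its solution set lies between
    # the two extremes computed above.
    mutated = with_replacement(model, 1, lambda a: a[2])
    assert sabotaged <= mutated <= deleted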

We have applied non-example generation to several case studies, including a model of the FireWire leader election protocol. Understanding the effect of deletion had immediate benefits, and understanding the effect of arbitrary mutation revealed a more subtle property of the algorithm.

References

Robert M. Seater. Core Extraction and Non-Example Generation: Debugging and Understanding Logical Models. Master's thesis, Massachusetts Institute of Technology, November 2004.

Acknowledgements

This work is supported in part by the NASA High Dependability Computing Program under cooperative agreement NCC-2-1298 and by Grant 0086154 (Design Conformant Software) from the ITR program of the National Science Foundation.
