MIT/LCS/TR-355

DATA STRUCTURE MANAGEMENT
IN A
DATA FLOW COMPUTER SYSTEM

Bhaskar GuhaRoy

May 1985
This blank page was inserted to preserve pagination.
Data Structure Management in a Data Flow Computer System

Bhaskar Guharoy

May 1985

© Bhaskar Guharoy 1985
The author hereby grants to M.I.T. permission to reproduce and distribute copies of this thesis document in whole or in part.

MASSACHUSETTS INSTITUTE OF TECHNOLOGY
Laboratory for Computer Science
Cambridge, MA 02139
This empty page was substituted for a blank page in the original document.
Abstract

DATA STRUCTURE MANAGEMENT
IN A
DATA FLOW COMPUTER SYSTEM
by
Bhaskar Guharoy

Submitted to the Department of Electrical Engineering and Computer Science
on May 28, 1985 in partial fulfillment of the requirements
for the Degree of Master of Science

VIM is an experimental computer system being developed at MIT for supporting functional
programming. The execution mechanism of the computer is based on data flow. This thesis presents
mechanisms for managing data structures in this system. The thesis also develops a methodology for
designing computers, which is based on successive refinement of formal models of the computer.

A formal model L1 of the abstract architecture of VIM is first developed. The behaviour of this model is
described by its operational semantics; L1 is the specification of VIM. L1 is then refined to model hierarchical
physical storage consisting of main memory and disk. This refined model is called L2. The unit of storage
allocation and of data transfer between main store and disk is a chunk. The thesis proposes a new data
structure called VIM-tree which is a tree of chunks. Data structures in VIM (arrays and records) are stored on
VIM-trees. VIM-trees allow efficient applicative operations on data structures and permit a large amount of
sharing. A reference count mechanism is proposed to perform automatic storage reclamation. Special care is
taken to handle operations in L2 on data structures containing early-completion queues and suspensions, which
are distinctive features of VIM. A base language for this machine is outlined in the thesis.

The models L1 and L2 are then shown to be equivalent for the proposed base language. The
equivalence is proved by exhibiting a McGowan mapping between the states of the two models during the
execution of a program written in the base language.

Thesis Supervisor: Jack B. Dennis
Professor of Electrical Engineering and Computer Science

Keywords: - VIM, Dataflow Graphs, Functional Languages, Structure Management, Hierarchical
Storage, Heaps, Tree structures, Early-Completion Structures, Suspensions, Streams, Machine Equivalence,
VIM-Val. Compilation.
This empty page was substituted for a blank page in the original document.
Acknowledgments

There are many people who contributed, directly or indirectly, to the writing of this thesis. I wish to thank Jack Dennis for providing encouragement and support during the long period of gestation of this thesis. Thanks to Suresh Jagannathan for being ever ready to listen to my ideas and coming up with constructive criticism. Much of the formal model L1 was developed jointly with Suresh.

My sincere thanks to David Culler for being such a wonderful friend. Thanks to Sara Mayeno for allowing me to cook for her whenever I visited her and David. Steve Heller provided much of the fun and laughter. My thanks to Keshav Pingali for the many interesting discussions I had with him. Thanks to Andy Boughton, Guang-Rong Gao, Vinod Kathail, Greg Papadopoulos, Earl Waldin and all the other members of CSG and FLA who make the laboratory such a fun place to work in.

My family has always been very encouraging and I cherish their enthusiasm and interest in my work. Finally, my thanks to Marcela for making me finally go crazy, in the right way, and for being the perfect partner in goofiness.

The woods are lovely, dark and deep
But I have promises to keep
And miles to go before I sleep
And miles to go before I sleep.

- "Stopping by woods on a snowy evening"
  Robert Frost.
This empty page was substituted for a blank page in the original document.
This empty page was substituted for a blank page in the original document.
Table of Contents

Chapter One: Introduction 1

1.1 The VIM Project 2
1.2 Background and Previous Work 3
1.3 Outline of the Thesis 6

Chapter Two: The Val Interpretive Machine 8

2.1 The VIM VAL Language 8
2.2 An Example Program in VIM VAL 9
2.3 The Val Interpretive Machine — VIM 12
   2.3.1 Function Application 13
   2.3.2 Early-Completion Queues 17
   2.3.3 Suspensions and Streams 19
2.4 Operational Model for VIM - L1 21
2.5 Summary 38

Chapter Three: Operational Semantics of VIM with Storage 40

3.1 Arrays and VIM-trees 44
3.2 Operational semantics of L2 45
3.3 Discussion 68

Chapter Four: Equivalence of L1 and L2 70

4.1 Proof of Equivalence of L1 and L2 71
4.2 Discussion 81

Chapter Five: A Base Language for VIM 82

5.1 The Let expression 82
5.2 Conditional Expression 83
5.3 The Tagcase Expression 84
5.4 Function Application and Returns 86
5.5 Stream Producers and Consumers 87
5.6 Discussion 90

Chapter Six: Conclusion and Scope for Further Work 92
This empty page was substituted for a blank page in the original document.
Chapter One

Introduction

In recent years data flow computer systems have been the focus of vigorous research, especially in the context of high speed scientific computations. In addition to higher speed, the data flow model of computation appears to provide a more robust programming environment than is available on conventional systems. The VIM project of the Computation Structures Group at MIT is aimed at examining the issues involved in implementing a modern, general-purpose computing environment based on the principles of dataflow that can effectively support such diverse computational applications as database systems, logic programming, etc. The ideas about the VIM system have evolved over the years, drawing much from the works of Dennis [9, 10, 11], Patil [31] and Weng [38].

The VIM system will support functional programming and the execution mechanism is based on data flow. In the world of functional programming all values are treated as mathematical values. This implies that the traditional view of data structures (arrays and records) as modifiable entities is no longer valid — the system must operate so that the user gets the view that a new structure is created from the old one whenever required. In a simplistic implementation, this would lead to a proliferation of copies of data structures, each differing from the others in only a small number of values. It is recognized that sharing of common elements among structures would reduce both the amount of copying and the storage space required to run the program. Various proposals have been made to implement data structures in data flow systems: none of them can be called definitive solutions. Applicative languages are also side-effect free languages and the language constructs provided in the functional language for VIM does not allow the creation of circular structures. Therefore, reference-counted memory management becomes an attractive alternative to traditional mark-and-sweep methods for garbage collection. This thesis proposes a representation for data structures that greatly reduces the amount of copying and describes a reference count mechanism for storage reclamation.

In most currently proposed functional language architectures, an implicit assumption is that the
program and the data which the program operates upon are all located in the main memory\(^1\). Vim has a two-level physical hierarchy of storage consisting of a large, slow disk and a smaller, faster main store. Values in the main-memory can be accessed immediately while values which are resident in the disk must be read into the main memory first.

The problem of storage reclamation on systems with large address spaces is a prickly one; the strategy for garbage collection in Vim is based on reference counting. The architecture of Vim modelled in this thesis consists of a single processor, some main store and disk store. The principal source of parallelism in the single-processor version of Vim stems from the concurrency in the processing of instructions and disk activities.

1.1 The VIM Project

The goal of the VIM project is to develop a computing environment which supports functional programming and provides a large address space and automatic storage reclamation. A two-level physical storage has been chosen to reduce the cost of physical memory. The primary vehicle for programming on this system will be the VimVal language, a functional language that is an extension of the language VAL developed by Ackerman and Dennis [1, 26, 27]. The criteria that have guided the design of the new language are that it should have the following characteristics.

- It should be sufficiently expressive in that it provides language constructs to the programmer to express most application programs that he needs to, without having to appeal to some features of the underlying architecture that are not evidenced in the language.

- A program consists of one or more modules. Modules must be independently compilable. All the independently compiled modules of a program are linked prior to execution by a linker.

- The language must be strongly typed, i.e., if the compiler and the linker certify the program to be legally typed then the program will not encounter any type errors at the time of execution of the program.

- The language must provide constructs to express computations on streams.

\(^1\)The models proposed by Dennis and Weing have a two-level physical storage.
• Non-determinacy must be expressible in the language.

• Higher-order functions must be permitted.

Programs written in VIMVAL will be run on a data flow processor with hierarchical storage. The conceptual framework for the machine was described in [38]. As currently envisioned, the VIM system consists of a single processing element and a two-level physical storage consisting of main memory and disk.

1.2 Background and Previous Work

A number of projects have aimed at providing a coherent and structured programming environment within the framework of a multiprocessor system. Of principal interest from the perspective of this thesis are the Hydra/C.mmp system, the Cm* computer and the SYMBOL computer.

The Hydra/C.mmp was an experimental multiprocessor system [39]. Capabilities were adopted as a mechanism for providing a large and uniform address space and also to control accesses to shared data structures. However, the system fell short of providing a truly integrated interface between the capability architecture and the programming language. The task of processor management was left largely in the hands of the user. The user had to ensure the correct usage of shared data structures by the use of appropriate locking and synchronization primitives, with a resultant decrease in the programmability of the system [23]. However, in spite of these shortcomings the Hydra/C.mmp system represented a significant advance in programmability over the multiprocessor systems then existing.

The Cm* [34, 35] was also a capability based architecture consisting of a large number of processors and memory modules. An underlying goal of the Cm* project was to develop a system that would be scalable, i.e., the computing power of the system would grow in proportion to the number of processors in the system. However, this effort too left the issue of processor management as a user responsibility. Also, since the cost of a memory access was proportional to the distance of the memory cell from the processor, the task of organizing the program so that the number of non-local memory references would be minimal was left to the programmer [22].
The Mu project [18] at MIT was aimed at assessing the importance of programmability in multiprocessor organizations. Though the theoretical framework appears to provide a better working environment than in Cm* and C.mmp, the system was still unable to provide an elegant way of avoiding the need for explicit synchronization mechanisms for shared data that could be updated independently by the processors. It became clear from Halstead's work that the language supported by a multiprocessor is critical to the usability of the system. The difficulty of programming on a multiprocessor can be alleviated if the user can write programs without having to worry about task scheduling, process synchronization and hazards such as read-before-writes, such chores being taken care of by the underlying system automatically.

The SYMBOL computer system [8, 29] was a language based multiprocessor system that allowed the user to program without having to worry about low level considerations like mapping the tasks onto the processors. Each of the processors had a very specific task; however, the task division was so rigid that it ruled out the possibility of scaling the system. Also, the various processors did not aim at solving a single problem in parallel. There was a processor dedicated to compilation, one to memory management, one to I/O management, one that actually executed the compiled program, etc. The parallelism in this system resulted from the fact that memory management, input-output and actual processing could be done in parallel. There was no facility in the system whereby multiple processors could concurrently execute a compiled program.

The SYMBOL was not a true multiprocessor since it was unable to support parallel execution of a program exhibiting a lot of computational parallelism. However, many of the ideas it introduced were far ahead of the times. It was one of the first processors to specialize the memory architecture to support structure memory. The memory representations of data were specialized to reflect the type of data, allowing operations to be performed on such typed data more efficiently. Significant amount of specialized hardware was developed to allow structure operations to be executed fast — a revolutionary approach, considering the cost of hardware in that period.

One of the seminal contributions of the SYMBOL system is that it viewed that the design of the memory management system was an integral part of the multiprocessor system design. The memory management mechanism provided primitives which could support high-level memory abstractions such
as stacks, queues, lists and strings. A specialized processor performed the memory management tasks, exemplifying the philosophy of static load distribution that so characterized the system.

In spite of its failings, the SYMBOL system, which predated the other projects discussed above by a number of years, presented a pointer to the direction in which the development of programming environments for multiprocessor systems ought to proceed — an architecture based on a high-level language that provided a very uniform, integrated environment for programming.

Among the various general purpose computing enviroments available on modern systems, the one on the Lisp machines deserve special mention. Lisp machines are language-based uniprocessors designed at Massachusetts Institute Technology [24, 28, 32, 33, 36]. They provide a uniform programming environment; there is no distinction between the command language used for interaction with the system and the principal programming language supported (Lisp), the hardware is tailored for processing Lisp primitives, high level data structures such as lists and arrays are regarded as data types even at the machine architecture level, and mechanisms for storage reclamation constitute an integral part of the system design.

Lisp machines provide a very large address space which can be effectively used to support a uniform addressing scheme for all objects created in the system. However, the necessity of explicitly “loading” a file containing an object residing in secondary storage before the object can be used detracts from the uniform addressability feature. Once the file is loaded, the object may be placed on the disk by the memory manager; references to this object are handled by the system so that its actual placement in the memory hierarchy is transparent to the user. Ideally, the user should never have to worry about whether the object is in the primary storage or in the secondary; given the name of the object, the system should automatically resolve the references to the object appropriately. In the VIM system, there is no concept of a file — all data structures are persistent in that they continue to exist across sessions, until there exist no references to the structure in the system, in which case they are discarded. This strategy obviates the necessity of “loading” files.

The storage reclamation scheme adopted by the Lisp machine is a variant of the mark and sweep strategy. The process of marking and sweeping the address space starts when the system runs out of storage. In the Lisp machine, the process of garbage collection is overlapped with the processing of
other tasks in the system. This overlap is achieved by frequent switching of the task of reclamation with other activities. In conventional systems (von Neumann derived architectures), the switching of tasks involves the execution of substantial amounts of code to save the state, reducing the efficiency of the process of switching. The data flow model of program execution allows such switching to be performed with only a small overhead. In mark-and-sweep garbage collection, the time required to reclaim storage is proportional to the size of the address space over which the reclamation is to be performed. By contrast, the time required to reclaim the storage occupied by an object by reference-count based reclamation schemes is proportional to the size of the storage occupied by the reclaimed objects.

1.3 Outline of the Thesis

The design of the VIM architecture espouses the following philosophy. An architecture should be developed by successive refinement, starting from an abstract mathematical specification. The extensions and refinements at each of specification are designed to permit more efficient implementation of the machine. By proving that all the models are equivalent\(^2\), one can largely eliminate the unexpected behaviours that one encounters when designing a system in an *ad hoc* basis. This type of top-down approach is especially important to the design of multiprocessor systems, since the possibility for errors of omission and commission is so much greater.

The design of the VIM system started with the design of the language VIMVAL which conformed to the aims outlined earlier. VIMVAL programs are compiled into programs in a base language, a preliminary version of which was proposed by Dennis and Stoy in 1982; a refinement of the base language is presented in this work.

In this thesis, first the operational semantics of an abstract model for VIM is described. The model, called L1, is the basis for specifying the behaviour of VIM, and is a set theoretic characterization of the abstract machine. The execution model is defined by a non-deterministic state-transition function. The set of instructions in this abstract model is an extension of that proposed by Dennis and Stoy. Chapter 2 gives a brief description of the VIMVAL language and then presents the formal model

---

\(^2\) A naive notion of equivalence may be that the models produce the same results. A formal notion of equivalence for the various models of VIM will be defined in Chapter 4.
L1.

The operational model L2 is a refinement of L1 and is obtained by adding the notion of storage. In this model, structure values (such as arrays and records) which were modelled as elements of sets in L1 are viewed as stored values. L2 models a system with a hierarchically organized physical memory consisting of main store and disk. Storage consists of a large collection of equal sized chunks, each of which is an ordered set of words. Structure values are stored in trees of chunks, thus permitting sharing of information. L2 models a strategy for storage reclamation based on a reference count mechanism. The operational semantics of this model is presented in chapter 3.

In accordance with our proposal of designing by refinement, we must next demonstrate that the L2 satisfies the specifications of L1. This is shown by proving that the two machines are computationally equivalent. A formal definition of equivalence is developed in Chapter 4 and the proof of the equivalence of L1 and L2 is presented.

The base language for the machine, which is the target language of the compiler for \textsc{vimval}, is described in Chapter 5. Essentially, the data flow graphs are such that when the computation of a program terminates, the reference counting mechanism would guarantee that if a structure becomes inaccessible in L1, the corresponding element in L2 would have reference count of zero and would thus be reclaimed.

The thesis concludes with a discussion of the relationship between L2 and its physical realization, and a brief list of related problems which are beyond the scope of this thesis and need further investigation.
Chapter Two

The Val Interpretive Machine

The goal of the ValM project is the design and development of a computer system that supports functional programming well. The architecture of the computer is based on data flow principles and the data flow model of program execution is well suited for interpreting functional languages.

The functional language supported by Val is ValVal, which has evolved from Val. ValVal is a textual language and a brief description of it is given in the first section. ValVal programs are compiled into programs in the base language, which consists of a set of data flow graph schemata. Translation from ValVal to the base language is straightforward since each construct in ValVal corresponds to a graph schema in the base language.

Programs in the base language are executed by interpreting the data flow instructions which are the nodes in the data flow graph for the program. Section 2.3 gives an informal description of some of the distinctive mechanisms used in Val. The operational semantics of the abstract model L1 is presented in section 2.4. The model is the specification of Val and all implementations of Val must meet the specifications.

2.1 The ValVal Language

The programming language for the Val system is the ValVal, an applicative language which is a revision and an extension of the Val programming language. The extensions include the addition of stream-types, free variables, recursion and mutual recursion, and higher order functions. A type inference mechanism guarantees type safety even if most type declarations are absent. Type inference is also used to provide polymorphic functions.

The data types of ValVal fall into two classes — simple types and structure types. The simple types include the familiar types integer, real, boolean, character and null. The structure types include array-types, record-types, distinguished unions, stream-types, and functions.
Functions are first-class objects. They may be passed as arguments to and returned as results from functions, and they may be built into data structures. The body of a function definition is an expression. Evaluation of an expression yields a single value or a tuple of values. Forms of expressions include the conditional expression, the `tagcase` expression, and the function invocation. There is no form of expression for expressing iteration, use of recursion being preferred.

2.2 An Example Program in VimVal

A program in VimVal consists of one or more modules. Each module has a header specifying its interface, type declarations, function definitions and one expression which constitutes the body of the module. An example module is shown in Figures 1 and 2. Figure 1 illustrates how the user may define a new data type `List`, which represents a list of integers. The example module defines three simple operations on objects of type `List`. Figure 2 illustrates the use of streams in the language. The functions `car`, `cdr` and `cons` defined by the example programs have the same meaning as in Lisp. The function `ListToStream` creates a stream of integers when it is given a list of integers. `SumOfStream` sums up the elements of a stream of integers.

A module written in VimVal defines a function that may be invoked from within another module or by a user command to the system. A module may contain function definitions — these may be invoked only from within the module unless they are explicitly exported by incorporating them into data structures sent out as module results. The body of a module may use names that are not defined bound to values by definitions in the module. These `free` names must be bound to other modules before the module may be run.

Within a module, type declarations precede the function definitions and the body. Within a function, the type declarations must precede the expression that constitutes the body of the function. An array `A` of integers is declared as follows.

```
A : array[integers] = array(1, 100)
```

The elements of the array are initially undefined. `select(A, i)` returns the Value of the `i`th element of the array. `append(A, i, v)` creates a new array which is identical to the array `A` except that the value of the `i`th element of the new array is `v`. 
module returns record[head, tail, tuple : function]
    type List = oneof [emptylist : nil;
                        atom : integer;
                        pair : record[first, second : List]]

function car(L : List) returns List;
    tagcase L
        tag emptylist : error;
        tag atom : error;
        tag pair : L.first;
    endtag
endfun

function cdr(L : List) returns List;
    tagcase L
        tag emptylist : error;
        tag atom : error;
        tag pair : L.second;
    endtag
endfun

function cons(L1, L2 : List) returns List;
    make List[pair : record[first : L1, second : L2]]
endfun

record[head:car, tail:cdr, tuple:cons]
endmodule

Figure 1: An example program in VIM VAL.

The definition of a record-type is of the form
    type Pair = record[first, second : List]
Records of type Pair have two fields named left and right. The operation
    record[first : vl, second : v2]
constructs a record where vl and v2 are of type List. Record fields are accessed by the select operation,
for instance
    L.first
yields the value of the left field of L, which must be of type List in which the tag is pair. Tagged unions
are used where different choices of representation are appropriate for different cases of a value. For
example, the type List is a tagged union.
function ListToStream(L: List, C: List) returns stream[integer]
tagcase L
  tag empty : tagcase C
    tag empty : stream[];
    tag atom : affix(C, stream[]);
    tag pair : ListToStream(car(L), cdr(L))
  endtag
  tag atom : affix(L, ListToStream(car(L), cons(cdr(L), C)),
    tag pair : ListToStream(car(L), cons(cdr(L), C))
  endtag
endfun;

function SumOfStream(S: stream[integer]) returns integer;
  if isempty(S) then 0
  else first(S) + SumOfStream(rest(S))
endif
endfun;

Figure 2: Continuation of the example.

type List = oneof [emptylist : nil;
  atom : integer;
  pair : record [first, rest : List]]

where the subtypes are distinguished by the tags emptylist, atom and pair. make[atom : 0] creates a
oneof in which the tag field is atom and the associated value is 0. A case expression is used to access
values of a oneof type:
tagcase L
  tag emptylist : expr1;
  tag atom : expr2;
  tag pair : expr3
endtag

A stream is a sequence of values, all of the same type, that are passed in succession, one-at-a-time
between functions. The operations defined on streams are [], first, rest, affix and empty. [] produces an
empty stream. first(S) produces the first element of the stream S. The result of rest(S) is the stream left
after removing the first element of S. affix(v, S) is the stream whose first element is v and whose
remaining elements are the stream S. The result of empty(S) is true if S is an empty stream, false
otherwise.
2.3 The Val Interpretive Machine — VIM

The abstract architecture of VIM uses data driven program execution. A program in the base language consists of one or more functions, each represented by an acyclic, directed data flow graph. The nodes of the graph are the instructions and the arcs between the nodes specify the data dependencies among the instructions. Arcs connecting two nodes may be of two types — value arcs and signal arcs. Values are carried on tokens along the directed value arcs of the graph. A function template is an array of the instructions which belong to the data flow graph corresponding to a function definition in VIMVAL. The size of the array is equal to the number of instructions in the data flow graph and the indexing of the array starts from 1. Instructions are identified by their array indices within a function template.

In VIM, iteration is modelled as recursion, and the chosen method for implementing recursion avoids the use of cyclic graphs\(^3\). Instead, each function application uses a fresh copy of the graph represented by the function template, the copy being called an activation template. An instruction is enabled or ready for firing when a value is available on each input value arc, and a signal has been received on each signal arc. Note that it may happen that some instructions in a template\(^4\) will receive values but will never fire because no signal will ever arrive. In chapter 5 we give rules of graph construction to ensure that this does not happen; otherwise, the storage reclamation scheme will be unable to reclaim all possible structures, leading to degraded memory utilization.

A salient characteristic of VIM is that no arc is ever reused — at most one value or signal will be sent from one instruction to another along a value or signal arc of an activation template, respectively. This is assured by the acyclic nature of the data flow graphs and by the property that each function application produces a new activation template. This is quite different from the data flow models used by the U-Interpreter [2] or the Static Data Flow machine [14, 12]. In the static data flow machine, the data flow graph does not change during program execution. The creation of function activations provides a very natural way of implementing recursion in VIM. VIM is similar to the static machine in that instructions have special fields for holding the operand values. This is quite unlike the mechanism

\(^3\) Turner uses cyclic graphs to implement recursion in [37].

\(^4\) We shall use "template" instead of "activation template" whenever there is no cause for confusion.
used in the U-interpreter where the value is stored in an associative store. Function application in VM expands the execution graph due to the creation of activation templates: the graph contracts whenever a function terminates and the activation is discarded. In the U-interpreter function application results in the creation of a new context, which is a part of the tags on values.

Another feature of VM which distinguishes it from other data flow models such as the U-interpreter or the Static data flow machine is the heap. VM maintains a heap in which all objects except scalars that enter into computation are held. Scalar values are stored in the operand fields of the instructions, and passed around among the instructions on the tokens. The kinds of objects held by the heap include function templates, closures, early-completion queues (described below) and data structures (arrays, records, etc.). Each object on the heap has a unique identifier which permits its selection from among all objects in the heap. Conceptually, the heap is a multi-rooted, directed acyclic graph in which an arc signifies that the target object is a component of its superior.

A distinctive feature of VM is the set of mechanisms designed to support aspects of the VMVAL language; in particular, these include support for function application and tail recursion and computation on streams. These mechanisms are described informally below; a formal description of the mechanisms will be presented in the next section.

2.3.1 Function Application

Function applications are made by the APPLY instruction, which requires two operands — a function closure for the function to be applied, and a data structure containing argument values. The first element of the closure is the uid of the function template which is to be applied; the rest of the closure contains information defining the binding of any free variables of the function. The APPLY instruction creates an activation of the function by copying the function template. It then sends the closure, the argument structure and the return link to the first operand of the first three instructions in the activation template, respectively. The return link consists of the uid of the calling activation and the uid of the destination list of APPLY.

Instructions of the activation are then executed according to the data flow firing rule until the RETURN instruction is enabled. The RETURN instruction uses the return link to send the result of the
function invocation to the recipients. Due to the presence of early-completion structures the RETURN instruction may not be the last instruction to execute in the activation. A separate RELEASE instruction releases the storage occupied by the activation template.

The following notation will be used for drawing data flow graphs. The nodes of a function template are instructions drawn as rectangular boxes. The value arcs connect from bottoms to tops of instruction boxes and convey data values. The signal arcs convey signals that perform control functions such as the release of function templates. The signal arcs connect from right sides to left sides of instruction boxes. Numerals at the left corner of instruction boxes denote the index of the instruction in the activation. A Greek letter next to an instruction box corresponds to the address of the instruction, consisting of the uid of the activation template and the index of the instruction. An open box with two or more values or signals is the merge operator. The graphs are arranged such that exactly one Value or signal will arrive at a merge box. This is merely a notational convenience; in ViM, the signal count and operand counts are set such that the merge occurs naturally.

Figure 3(a) shows a data flow graph which causes a function activation. $\lambda$ is the address of the destination instruction of APPLY, which is sent to the third instruction of the activation created by APPLY. Figure 3(b) shows a typical function template. The RETURN instruction receives the destination list consisting of the address $\lambda$; when it receives the result computed by the function body, it sends the result to the instructions whose addresses are listed in the return link and sends a signal to a RELEASE instruction.

In many cases the value returned by a function $f$ is computed directly by a tail-recursive application of $f$, as shown in Figure 4. In this situation the result to be returned by the caller is exactly that returned from the callee, and the reactivation of the caller is unnecessary. The TAILAPPLY instruction in ViM implements this. It also causes a function activation but is different from APPLY: it has an extra operand, a return link which it passes to the callee instead of generating a new one; also, it sends signals to the instructions whose indices are in the destination list of the TAILAPPLY instruction.

Figure 4(a) shows the TAILAPPLY instruction and illustrates the operands that it needs. Figure 4(b) shows a typical template corresponding to a tail-recursive function. The SWITCH instruction takes two operands: if its second operand, which must be boolean, is true then the first operand is sent to all
the destinations of `switch` whose addresses are marked `true` in its destination list and if it is false then the first operand is sent to the destinations marked `false`. The function body determines if no further applications are to occur, in which `return` is activated, otherwise the FAIL `apply` instruction is enabled.
Figure 4: (a) The `TAILAPPLY` instruction. (b) Typical data flow graph for the body of a tail-recursive function.
2.3.2 Early-Completion Queues

In computations involving data structures, concurrency is increased if a data structure can be made available for access before all the component values have been computed. If instructions are required to receive all their operands before their application, as is usual for the execution of data flow programs, this concurrency of creating and accessing a data structure is not possible.

In VIM there is a special facility called early-completion queue (abbreviated EC-queue) to permit structures to be created before the values of all the components are available. Arrays will be used to describe the early-completion mechanism informally (figure (5)). The behaviour of structures containing EC-queues is specified by the state-transition rules of the MKINARRAYEC, SELECT, APPEND and SET instructions.

An EC-queue is a collection of addresses of instructions. MKINARRAYEC creates an array in which all the elements are EC-queues, all initially empty. This shell of the structure is passed onto consumers of the data structure, and also to producers which replace the EC-queues by values using the SET instruction. If a SELECT tries to access an element which is an EC-queue, its address is added to the EC-queue and the instruction is removed from the set of enabled instructions. Eventually, a SET instruction replaces the EC-queue by a Value and adds the addresses of the instructions in the EC-queue to the set of enabled instructions. When these instructions are attempted for execution again, they would read the value, as desired. Structures with EC-queues provide a powerful mechanism for synchronisation, and is an effective solution to the read-before-write problem [4].

Figure 5 illustrates the early-completion mechanism. Figure 5(a) shows a data flow graph which creates an array of one element which is an empty EC-queue. The array is sent to two consumers whose addresses are α and β, and to a set instruction γ. Figure 5(b) shows how the contents of the array changes when the instructions α, β and γ fire in sequence. If γ fires first, then α and β can access the value in the usual manner: the erstwhile presence of the EC-queue does not affect subsequent accesses after it is replaced.

The early-completion mechanism makes it possible to allow function applications to begin execution before the values of all their arguments have been computed. This is done by packaging the arguments into a record of EC-elements. Similarly, the result values, if there are more than one, may be
returned as a structure of EC-elements so each may be available to the caller without waiting for all the results to be evaluated.

\[\text{Figure 5: (a) Data flow graph showing producer-consumer relationship for structure containing EC-queue. (b) The contents of the array under the firing of } \alpha, \beta \text{ and } \gamma \text{ in sequence. First, the firing of } \alpha \text{ causes } \alpha \text{ to be added to the EC-queue, which was empty. Next } \beta \text{ attempts to access the element and also gets added to the EC-queue. Eventually when } x \text{ is computed, the SET fires. It replaces the EC-queue with } x \text{ and adds } \alpha \text{ and } \beta \text{ to the set of enabled instructions.} \]

The semantics of arrays with ecqueues is very similar to the semantics of I-structures, which were proposed by Arvind and Thomas [5]. An I-structure is a linear contiguous data structure; an element of an I-structure can be written into at most once. Reads occurring before an element has been written into are deferred until the arrival of the value. The pictures construct of Halstead [19] is also of similar flavour.
2.3.3 Suspensions and Streams

Stream structures are an attractive language feature since they permit the producer and consumers of the stream to operate concurrently. VM provides a special mechanism for efficient implementation of streams. A stream is represented in VM as a chain of records of two elements, each of which is an EC-queue: the first element contains an element of the stream and the second element is a pointer to the rest of the stream. In a completely data driven evaluation of a stream, the producer would proceed at its own pace and generate the values to construct the stream. The consumer process accesses the elements of the stream at its own rate, waiting whenever it encounters an EC-queue until the value is supplied.

The problem with this scheme is that it allows the producer to get arbitrarily far ahead of the consumer process. If the consumer needs only a part of stream then substantial computation performed by the producer may wasted. In particular, if streams are evaluated in a data driven manner, then infinite streams cannot be supported on VM. So streams are produced in a data driven manner, allowing the user to write programs in VMVAL which deal with potentially infinite streams.

VM uses suspensions to implement demand-driven evaluation of streams. Suspension mechanisms have been used to implement infinite data structures by Henderson [20], Friedman and Wise [16], etc. In VM, a suspension contains the address of an instruction, consisting of the uid of the activation template of the instruction and its index in the template. When a SELECT instruction tries to access an element which is a suspension, the suspension is replaced by an EC-queue containing the address of the SELECT and a signal is sent to the instruction whose address is found in the suspension. The signalled instruction eventually causes the EC-queue to be eventually replaced by a value and the SELECT instruction gets the value it was trying to access.

Figure 6(a) shows the creation of an array whose only element is an EC-queue. If SELTSLSP fires before SELECT, it finds an empty EC-queue and replaces it by a suspension. When SELECT executes, the suspension is replaced by an EC-queue containing the address β of SELECT and a signal is sent to the instruction indicated in the suspension. If SELECT fires first, it is enqueued in the EC-queue; SELTSLSP executes, finds a non-empty EC-queue, and simply sends a signal. The graph is arranged such that the arrival of the signal initiates a computation that ultimately enables a SET instruction. The SET instruction replaces the EC-queue in this array by a value and services the EC-queue by adding the set of
instructions whose addresses are in the EC-queue to the set of enabled instructions.

Figure 6: (a) Creation of an array with a suspension element. (b) Effect of firing of SETSUSP and SETSUSP instructions in different orders.

The use of suspenions for generating the elements of a stream is shown in Figure 7. The records
which constitute the stream have two fields which are named head and tail. The stream consists of successive integers. Figure 7(a) shows an initial stream whose first element is 1 and the tail component is a suspension. When a consumer α tries to access the tail of the stream, the suspension is replaced by an EC-queue containing the address of the consumer (7(b)). A signal is sent to the suspended instruction which causes the EC-queue to be replaced by new record whose head component contains the next element of the stream and the tail component is a suspension (7(c)). The consumer which tries to access the rest of this stream in turn replaces the suspension by an EC-queue. If there are no consumers which have pointers to the beginning of the stream then the element at the front may be abandoned (7(d)). Suspensions can also be used to advantage for evaluating the elements of arrays in a demand driven manner. The main benefit in doing this would be that array elements which are never read need not be computed, thus reducing the amount of computation performed.

The rest of the chapter gives a mathematical specification of VIM. The operational semantics of the instructions of VIM are presented. The specification will be called L1 in this thesis. L1 will serve as the basis for the development of an operational model for VIM in which storage is modelled; that model will called L2.

2.4 Operational Model for VIM - L1

The VIM interpreter has two components: a function Interp and State. Interp takes two arguments — a State and an enabled instruction (defined later) and produces a new State of the machine. The following notation will be used in the thesis. Sets are denoted by bold font, elements of sets (which may themselves be sets) are denoted by italicised letters and names are indicated in a distinctive font. Thus, This is a set. This ∈ This and This is a name.

The actions of the interpreter are described by state transition rules. A programming language-ish description is used to specify the rules for mapping a set to another set. Mathematical notations such as set unions and differences are used wherever convenient. A rule \( F \) is expressed as follows:

\[
\text{Define } F(A, (i, j), v) \equiv \\
\text{... body of the rule}
\]

The definitions may be recursive.
Figure 7: Demand-driven generation of stream elements. (a) Stream element: the producer is awaiting a demand. (b) The consumer demands the next stream element. (c) The producer generates one stream element and suspends itself. (d) The consumer abandons the previous element and demands another.

\[ \text{VIM} = \langle \text{Interp, State} \rangle \text{ where} \]
\[ \text{Interp} : \text{State} \times \text{EIS} \rightarrow \text{State} \]
\[ \text{State} = \text{Act} \times \text{H} \times \text{EIS}, \]
\[ \text{Act} = \text{U} \rightarrow \text{Function} \]
\[ \text{H} = \text{U} \rightarrow \text{ST} \]
\[ \text{U} = \text{the set of all unique identifiers.} \]
\[ \text{EIS} = \text{the set of all enabled instructions, described later.} \]

\text{Act} \text{ is the set of all activations; an activation is created by the invocation of a function. The heap} \]
\[ \text{H} \text{ contains all structure values and function templates, early-completion queues (discussed later).} \]
function closures and the instructions. Each element on the heap has a unique identifier (uid). Only scalar values and uids are sent on tokens from one data flow actor to another; data structures always reside on the heap.

Scalar values are tagged.

Scalar values are tagged.

- Scalars = Integers ∪ Reals ∪ Booleans ∪ Character ∪ Null
- Integers = \{int\} × (\{undef\} ∪ the set of all integers)
- Reals = \{real\} × (\{undef\} ∪ the set of all reals)
- Booleans = \{bool\} × (\{true, false, undef\})
- Character = \{char\} × (\{undef\} ∪ the set of characters in the machine.)
- Null = \{null\} × \{nil, undef\}

The set ST describes the elements which reside on the heap. Elements of different types are distinguished by their tags.
\[ ST = (\{\text{arr}\} \times (\text{Array} \cup \{\text{undef}\})) \cup (\{\text{fn}\} \times \text{Function}) \]
\[ \cup (\{\text{eq}\} \times \text{ECQ}) \cup (\{\text{inst}\} \times \text{Instruction}) \cup (\{\text{dests}\} \times \text{Dests}) \]
\[ \cup (\{\text{clsr}\} \times \text{Clsr}) \]

\text{Array} = [\text{Integers} \rightarrow (\text{U} \cup \text{Scalars} \cup \text{SUSP})], \text{Integers} \text{ being the set of integers.}

\text{Function} = [\text{N} \rightarrow \text{Instruction}], \text{N} \text{ being the set of natural numbers.}

An early-completion element (EC-element) is a tuple \((u, i)\) where \(u\) is the uid of a function activation and \(i\) is the index of an instruction in the activation. An early-completion queue is a collection of such EC-elements.

\[ \text{ECE} = \text{U} \times \text{N}. \]

Thus \((u, k) \in \text{ECE}\) where \(u\) corresponds to the uid of a function activation and \(N\) is the index of the instruction in the activation template.

The EC-queue is a collection of elements of ECE. All EC-queues are members of the set ECQ which is defined below. The notation \(\mathcal{P}(N)\) denotes the powerset of the set \(N\).

\[ \text{ECQ} = \mathcal{P}(\text{ECE}) \]

A suspension is a member of the set SUSP specified by:

\[ \text{SUSP} = \{\text{susp}\} \times (\text{U} \times \text{N}) \]

An instruction is a seven-tuple:

\[ \text{Instruction} = \text{OPS} \times (\text{U} \cup \text{Scalars})^3 \times \text{N} \times \text{N} \times \text{U} \]

\text{OPS} is the set of opcodes, the next three elements of the tuple refer to the operands, the fifth and sixth elements of the tuple are the operand count and the signal count and the last element is the unique identifier of the list of destinations. Each destination of an instruction is the index of the instruction to which the result is to be sent. The result may be a value or a signal. For \(I \in I\), the elements of the tuple will be denoted by the '.' notation. Thus, \(\text{/opcode}\) is the first element of the tuple, \(\text{/op1}, \text{/op2}\) and \(\text{/op3}\) refer to the second through fourth elements of the tuple, \(\text{/opcnt}, \text{/sugcnt}\) and \(\text{/destlist}\) denote the fifth through seventh elements of the tuple.

\[ \text{OPS} = \{\text{ADD, SUB, MKINTARRAY, MKINTARRAYT, SELECT, APPEND, SET, SUSP, SWITCH, APPLY, TAI(APPLY, RETURN, RELEASE)\} \}

A destination of an instruction consists of the the address of the instruction to which the result is
to be sent, and the operand which is to receive the result value. op1, op2 and op3 denote the first, second and third operand fields in an instruction, respectively. If the result is a signal then no operand number is required. The destination also specifies if the result is to be sent unconditionally or conditionally. For all instructions except for SWITCH, the results of instructions (both values and signals) are sent to the destinations unconditionally.

\[ \text{Dest} = \mathcal{D} \]

\[ \mathcal{D} = \{ \text{unconditional, true, false} \} \times \mathbb{N} \times \{ \text{op1, op2, op3, signal} \} \]

\[ \text{Clr} \] is the set of closure records. The operator (APPLY, TAILAPPLY, STREAM-APPLY) which cause a function activation take a closure as the first argument. The first component of the closure is the uid of a function template; that uid corresponds to the function which is to be called.

\[ \text{Clr} = \{ (\text{FunctionToApply} \cup M) \rightarrow (U \cup \text{Scalars}) \}
\]

where \( C(\text{FunctionToApply}) \in U \) such that \( H(u) \in (\{ fn \} \times \text{Function}) \).

An enabled instruction is an element of \( \text{EI} \subseteq U \times N \). An instruction \( I \) becomes enabled when \( l\text{opcnt} \) and \( l\text{sigcnt} \) both become zero. The set of enabled instructions describes the collection of instructions which are ready to executed because they have received an operand on each of the operand arcs and a signal on each of the signal arcs. The set of enabled instructions is:

\[ \text{EIS} = \mathcal{R}(\text{EI}) \]

The function \( \text{Choice} \) selects an element from a set of enabled instruction. The instruction is then interpreted by the function \( \text{Interp} \). Thus \( \text{Choice} \) is our scheduler.

\[ \text{Choice} : \text{EIS} \rightarrow \text{EI} \]

Functions \( \text{AddToHeap} \) and \( \text{DeleteFromHeap} \) add and delete elements from the heap. \( \text{AddToAct} \) and \( \text{DeleteFromAct} \) which

\[ \text{AddToHeap} : H \times U \times ST \rightarrow H \]

\( \text{AddToHeap}(H, u, v) \) produces a new function \( H' \) such that:

\[ (\forall u' \neq u [H'(u') = H(u')]) \text{ and } H(u) = V \]

\[ \text{DeleteFromHeap} : H \times U \times ST \rightarrow H \text{ DeleteFromHeap}(H, u, V) \) produces a new heap \( H' \) such that the domain of \( H' \) is the domain of \( H \) without the element \( u 
\]

\[ H' \text{ such that } (\forall u' \neq u [H'(u') = H(u')]) \]
AddToHeap(H, u, v) creates a new heap which contains, in addition to the associations between uids and objects in H, a new association between u and v. DeleteFromHeap(H, u, v) creates a new heap which does not contain the uid u in its domain.

Similarly, AddToAct and DeleteFromAct create new activation sets by adding an activation to and deleting an activation from the current set of activations.

\[\text{AddToAct} : \text{Act} \times \text{U} \times \text{Function} \rightarrow \text{Act}\]
\[\text{DeleteFromAct} : \text{Act} \times \text{U} \times \text{Function} \rightarrow \text{Act}\]

Function SendResult is used to dispatch the result of an instruction I to a destination instruction. SendResult models the following actions: the result is stored in the appropriate operand field of the destination instruction, and the operand count and signal count fields are decremented accordingly. Since we are dealing with a mathematical representation of instructions and function activations, this updating is modelled by producing new values that reflect the changes. Thus, \( FA' \) is the new new value of \( FA \) with the updated \( I \) and \( \text{Act} \) is the same as \( \text{Act} \) except that the \( i \)th instruction of activation \( FA \) has received some operand (or signal).

\[\text{SendResult} : \text{Act} \times \text{EIS} \times \text{U} \times \text{D} \times \{\text{value}\} \times \{\text{U} \cup \text{Scalars}\} \cup \{\text{signal}\} \rightarrow \text{Act} \times \text{EIS}\]

define \text{SendResult}(\text{Act}, \text{EIS}, u_{FA}, (dc, i, opnum), \text{result}) \equiv

\begin{align*}
\text{let} \quad FA &= \text{Act}(u_{FA}), \quad \% FA is an activation template. \\
I &= FA(i) \quad \% I is the \( i \)th instruction in the activation template.
\end{align*}

\begin{align*}
\text{in} \\
\text{AddToAct(DeleteFromAct(Act, u_{FA}, FA), u_{FA}, FA')} \\
\text{if } (I \cdot \text{opcnt} = 0) \land (I \cdot \text{sigcnt} = 0) \text{ then } \text{EIS} \cup \{u_{FA}, 0\} \\
\text{else } \text{EIS} \\
\text{endif} \\
\text{endlet} \\
\text{where} \\
FA'(j) &= FA(j), \quad j \neq i. \\
&= I, \quad j = i.
\end{align*}
and \( \Gamma \) \( \in \) Instruction,
\[
\Gamma_{\text{opcode}} = \Gamma_{\text{opcode}},
\Gamma_{\text{op1}} = \text{if opnum} \neq \text{op1} \text{ then } \Gamma_{\text{op1}} \text{ else } \Gamma \text{ where result} = (\text{value}, i)
\Gamma_{\text{op2}} = \text{if opnum} \neq \text{op2} \text{ then } \Gamma_{\text{op2}} \text{ else } \Gamma \text{ where result} = (\text{value}, i)
\Gamma_{\text{op3}} = \text{if opnum} \neq \text{op3} \text{ then } \Gamma_{\text{op3}} \text{ else } \Gamma \text{ where result} = (\text{value}, i)
\Gamma_{\text{opcnt}} = \text{if opnum} \in \{\text{op1}, \text{op2}, \text{op3}\} \text{ then } \Gamma_{\text{opcnt}} \text{ else } \Gamma_{\text{opcnt}} - 1
\Gamma_{\text{sigcnt}} = \text{if opnum} = \text{signal} \text{ then } \Gamma_{\text{sigcnt}} - 1 \text{ else } \Gamma_{\text{sigcnt}}
\Gamma_{\text{destlist}} = \Gamma_{\text{destlist}}
\]

The function \( \text{SendToDestinations} \) sends the result of an instruction to all the destinations of the instruction. It is a simple tail-recursive function which uses \( \text{SendResult} \) repeatedly to send the value or signal to the destination.

\[
\text{SendToDestinations} : \text{Act} \times \text{EIS} \times \text{U} \times \mathcal{S}(\text{D} \times ([\text{value}] \times (\text{U} \cup \text{ Scalars})) \cup \{\text{signal}\}) \\
\rightarrow \text{Act} \times \text{EIS}
\]

\[
\text{define } \text{SendToDestinations} (\text{Act, EIS, u, DestValue}) \equiv \\
\text{if DestValue} = \{\} \text{ then Act, EIS} \\
\text{else} \\
\text{let } ((\text{dc, d, opnum}, v) = e \text{ where } e \in \text{DestValue} \\
\text{in} \\
\text{if } v = \text{signal} \text{ then} \\
\text{SendToDestinations} \\
(\text{SendResult} (\text{Act, EIS, u, (dc, d, signal, signal)}, \text{DestValue} \setminus \{e\}) \\
\text{else} \\
\text{SendToDestinations} \\
(\text{SendResult} (\text{Act, EIS, u, (dc, d, opnum, v)}, \text{DestValue} \setminus \{e\}) \\
\text{endif} \\
\text{endlet} \\
\text{endif}
\]

Execution of a program on \( \text{VIM} \) is initiated by the invocation of a function in the base language. The execution terminates when there are no more enabled instructions. The execution loop may be summed as:

\[
\text{define } \text{MainLoop} (\text{State}) \equiv \\
\text{let (Act, H, EIS) = State} \\
\text{in} \\
\text{if EIS} = \{\} \text{ then halt} \\
\text{else } \text{MainLoop} (\text{Interp(Act, Choice(EIS)))} \\
\text{endif} \\
\text{endlet}
\]
The interpreter is defined by the function \textit{Interp} and \textit{Choice} is a function which selects an element from the set of enabled instructions. This instruction is interpreted by \textit{Interp} in the context of the current state of the machine; the result of the execution of the instruction is a new state. \textit{Choice} is the scheduler in ViM since it makes the choice of the instruction which is to be executed. \textit{Choice}(EIS) where \( EIS \in \text{EIS} \) is the address of an enabled instruction.

\[
\text{Interp} : \text{State} \times \text{Choice} (\text{EIS}) \rightarrow \text{State}
\]

where \( \text{State} = \text{Act} \times \text{H} \times \text{EIS} \).

It is pertinent to point out that ViM is a non-deterministic state transition system; any one of the enabled instructions could be selected for execution and the final result of the computation is independent of the order of execution of the enabled instructions.

The following notation is used to denote that the new state \((\text{Act}', \text{H}', \text{EIS}')\) is produced when the instruction \( e \) is interpreted by \textit{Interp} in the state \((\text{Act}, \text{H}, \text{EIS})\).

\[
(\text{Act}, \text{H}, \text{EIS}) \vdash (\text{Act}', \text{H}', \text{EIS}') \text{ on } e.
\]

Now we can define the interpreter by specifying the state transitions for each of the opcodes. The state transitions for some of the more interesting instructions will be presented below; these serve as the model for specifying the transition rules for the rest of the instruction set. The body of \textit{Interp} is a big conditional statement; the branches of the conditional are based on the opcodes of the instruction being executed. Some general comments are in order here. The result of an instruction is sent to its indicated destinations unconditionally, unless it is a \textit{SWITCH} instruction.

\[
\begin{align*}
\text{define } \text{Interp}(\text{State}, (u_{\text{FA}}, k_{\text{FA}})) \equiv \\
\text{let } \\
\text{FA} = \text{Act}(u_{\text{FA}}), \% \text{ the function activation} \\
I = \text{FA}(k_{\text{FA}}), \% \text{ the instruction} \\
\{(\text{d}_{c_1}, \text{d}_1, \text{opnum}_1), (\text{d}_{c_2}, \text{d}_2, \text{opnum}_2), \ldots, (\text{d}_{c_n}, \text{d}_n, \text{opnum}_n)\} = \text{destlist} \% \text{ the destinations of the instruction } I. \\
\text{in} \\
\text{if } \text{opcode} = \text{SET} \text{ then } \ldots \\
\text{elseif } \text{opcode} = \text{APPLY} \text{ then } \ldots \\
\ldots \\
\text{endif} \\
\text{endlet}
\end{align*}
\]

The specification of the state transition rules for some of the interesting and representative
instructions of VIM will be the subject of the rest of this chapter. Each conditional statement of the form "if \texttt{opcode} = ... then " is an arm of the big conditional statement in \texttt{Interp} above. Thus, the names \texttt{Act}, \texttt{H}, \texttt{EIS}, \texttt{u}_{F_A}, \texttt{k}_{F_A}, (\texttt{d}_1, \texttt{d}_1, \texttt{opnum}_1), ... (\texttt{d}_n, \texttt{d}_n, \texttt{opnum}_n) will have the same bindings as indicated in the body of \texttt{Interp} shown above.

Let us begin with the simple instruction \texttt{IADD} which adds two integers. The operands are read from the operand fields of the instruction and the result of the addition is \texttt{sum}. This value is sent to the listed destinations. Observe that the heap remains unchanged.

\begin{verbatim}
if \texttt{opcode} = \texttt{IADD} then
    let
        (int, m) = \texttt{lop1},
        (int, n) = \texttt{lop2},
        \texttt{sum} = m + n,
        \texttt{Act'}, \texttt{EIS}' =
            \texttt{SendToDestination} (\texttt{Act}, \texttt{EIS}, \texttt{u}_{F_A}, \texttt{k}_{F_A}, \{ \texttt{(unconditional, d}_1, \texttt{opnum}_1, \texttt{a}_1), ... \texttt{(unconditional, d}_n, \texttt{opnum}_n, \texttt{a}_n) \})
            in
        \texttt{Act'}, \texttt{H},
        \texttt{EIS}' - \{ \texttt{u}_{F_A}, \texttt{k}_{F_A} \}
    endif
\end{verbatim}

where \texttt{a}_i \in \{ \texttt{(value, sum), signal} \}

The actions of the interpreter for instructions such as \texttt{ISUB}, \texttt{IMUL}, \texttt{IDIV}, \texttt{IGT}, \texttt{ILT}, etc. are very similar and will not be described.

The instructions which operate on structures produce a new heap. The operations on one type of structures — arrays — are described here; the actions of \texttt{Interp} for instructions which operate on \texttt{record} and \texttt{oneof} types are very similar and are not presented. The array instructions of interest which are discussed below are: \texttt{MKARRAYINT}, \texttt{MKARRAYINTFC}, \texttt{SELECT}, \texttt{APPEND}, \texttt{SET} and \texttt{SETSLSP}.

\texttt{MKARRAYINT} takes two integer operands (\texttt{m} and \texttt{n}) and adds an array with bounds (\texttt{m}, \texttt{n}) to the heap. All the elements of the array are undefined. The uid of the new array is sent as the result to the destinations, along with signals if necessary.
if \texttt{opcode} = \texttt{MKINTARRAY} then
  let
  \((\text{int. } p) = \texttt{lop1},\)
  \((\text{int. } q) = \texttt{lop2}.\) \hspace{1cm} \% \textit{p and q are integers, }p < q.\)
  \(u_v = \text{a new uid in } U.\)
  \(\text{Act}', \text{EIS}' =\)
  \(\text{SendToDestinations}(\text{Act}, \text{EIS}, u_{FA}', \{((\text{unconditional}, d_1, \text{opnum}_1), \alpha_1), \ldots, ((\text{unconditional}, d_n, \text{opnum}_n), \alpha_n)\})\)
  in
  \(\text{Act}', \text{AddToHeap}(H, u_v, (\text{arr. } ([p, q] \rightarrow \text{undef})\)).\)
  \(\text{EIS}' = \{(u_{FA}', k_{FA})\}\)
endlet

where \(\alpha_i\) is either \texttt{(value. } u_v\text{)} or \texttt{signal}.\)

The instruction \texttt{MKINTARRAYEC} works quite similarly except that the elements of the array are all
early-completion queues, all empty.

The \texttt{APPEND} instruction takes three operands — an array \(A\), an index \(i\) and a value \(x\). It creates a
new array \(A'\) which is identical to \(A\) except that the \(i\)th element of \(A'\) has value \(x\). It is important to be
very careful while performing \texttt{APPEND} operations on arrays with \texttt{EC}-elements. If some elements in \(A\)
are \texttt{EC}-elements then the corresponding elements of \(A'\) would also be \texttt{EC}-elements. When a \texttt{SET}
instructions replaces an \texttt{EC}-element in \(A\) by a value, this value must be forwarded to the corresponding
elements in structures which were created by \texttt{APPEND} on \(A\). There may be a cascade of value-
forwarding precipitated by this since the values may also have to be forwarded to arrays created by
\texttt{APPENDS} on structures derived from \(A\). Since suspensions are potential sites for \texttt{EC}-queues, \texttt{APPEND}
operations on arrays containing suspensions introduces a similar need for \texttt{Value} forwarding. This thesis
adopts a simple alternative to the value-forwarding discipline outlined above. An \texttt{APPEND} instruction is
executed provided that there are no \texttt{EC}-elements or suspensions in the array on which the operation is to
be performed. If there is any \texttt{EC}-element in the array then there is no change in the state of the
machine; the \texttt{APPEND} instruction remains in the set of enabled instructions and will be selected for
execution at some future time.
if \( \text{opcode} = \text{APPEND} \) then
    let
        \( u = \text{lop1} \),
        \((\text{arr}, A) = H(u)\),
        \((\text{int}, i) = \text{lop2} \),
        \( x = \text{lop3} \),
        \( u' = \text{new uid from U} \),
        \( \text{Act} \cdot \ EIS' = \)
            \( \text{SendToDestinations(} \text{Act}, \ EIS, \ u'_{F_A'}, \{(\text{unconditional}, d_1, \text{opnum}_1, \alpha_1), \ldots, \)
                \( (\text{unconditional}, d_n, \text{opnum}_n, \alpha_n)\}) \)
        in
    if \( |\{i : A(i) \in U \times \{(\text{ecq}) \times \text{ECQ} \} \cup \{(\text{susp}) \times \text{SUSP}\}| = 0 \) then
        \( \text{Act}', \)
        \( \text{AddToHeap}(H, u', A') \),
        \( EIS' = \{(u'_{F_A'}, k_{FA})\} \)
    else
        \( \text{Act}, \)
        \( H, \)
        \( EIS \)
    endif
endlet
where \( A'(j) = A(j) \quad j \neq i \)
\( = x \quad j = i. \)

and where \( \alpha_i \) is either \( \text{value}, u' \) or \( \text{signal} \).

\( \text{SELECT} \) requires two operands — the uid of an array \( A \) and an integer \( i \), and in the simplest case (the element being accessed is neither an early-completion structure nor a suspension) returns the value associated with the element of \( A \) that has index \( i \). The behaviour of the interpreter is more complex when such is not the case. The state transition is specified below, and the discussion on early-completion elements and suspensions follows. The special value \( \text{est} \) is used for indicating the end of a stream.
if opcode = SELECT then
  let
  \[ u = \text{top1}, \]
  \[ (\text{arr, A}) = H(u), \quad \% (u, (\text{arr, A})) \in H \]
  \[ (\text{int, i}) = \text{top2}, \quad \% i \text{ must be an integer} \]
  \[ t = A(i) \]
  in
  if \[ t = (u, (\text{ecq, Q}) \] then
    \[ \text{Act,} \]
    \[ \text{AddToHeap}(\text{DeleteFromHeap}(H, u, (\text{ecq, Q})), u, (\text{ecq, Q} \cup \{(u_{FA}, k_{FA})\}))) \]
    \[ EIS = \{(u_{FA}, k_{FA})\} \]
  elseif \[ t = (u, (\text{susp, (u', k')})) \] then
    let \[ \text{Act}', EIS' = \text{SendResult}(\text{Act, EIS, u', (unconditional, k', signal, signal)}) \]
    in
    \[ \text{Act}', \]
    \[ \text{AddToHeap}(\text{DeleteFromHeap}(H, u, (\text{susp, u', k')})), u, (\text{ecq, u}_{FA}, k_{FA})) \]
    \[ EIS' = \{(u_{FA}, k_{FA})\} \]
  endlet
else
  let \[ x = \]
  \[ \text{Act}', EIS' = \]
  \[ \text{SendToDestinations}(\text{Act, EIS, u}_{FA}, \{(\text{unconditional, d}_1, \text{opnum}_1, \alpha_1), ..., \}
  \]
  \[ \{(\text{unconditional, d}_n, \text{opnum}_n, \alpha_n)}\}) \]
  in
  \[ \text{Act}', \]
  \[ H, \]
  \[ EIS' = \{(u_{FA}, k_{FA})\} \]
endlet

where \( \alpha_i \) = either (value, x) or signal.
if $\text{opcode} = \text{SET}$ then
\begin{align*}
&\text{let} \\
&\quad u_1 = \text{lop1}, \\
&\quad (\text{arr}, A) = H(u_1), \\
&\quad (\text{int}, i) = \text{lop2}, \\
&\quad v = \text{lop3}, \\
&\quad \% (u, (\text{arr}, A)) \in H \\
&\quad \% i \text{ must be an integer} \\
&\quad \% v \in \text{Scalars} \cup U \\
&\quad \% \text{where the uids must be of records, arrays, oneofs.} \\
&\quad t = A(i), \\
&\quad \% t = (u, (\text{eq}, Q)) \\
&\quad \text{Act}'(\text{EIS}' = \\
&\quad \quad \text{SendToDestination}\text{Act', EIS', } u_{F_A'} \cdot \{ \langle \text{unconditional}, d_1, \text{opnum}_{n_1}, a_1 \rangle, \ldots \text{,} \\
&\quad \quad \langle \text{unconditional}, d_m, \text{opnum}_{n_m}, a_m \rangle \} \\
&\quad \text{in } \text{Act}'. \\
&\quad \% \text{the new set of activations.} \\
&\quad \text{AddToHeap}(\text{DeleteFromHeap } H, u_1, (\text{arr}, A), u_1, (\text{arr}, A')), \\
&\quad \% \text{the new heap reflects the fact that the ith element} \\
&\quad \% \text{of the array with uid } u \text{ has value } v. \\
&\quad (\text{EIS} \; - \{ (u_{F_A'}, k_{F_A'}) \}) \cup Q \\
&\quad \% \text{The new EIS includes the instructions whose addresses were in the} \\
&\quad \% \text{EC-queue. } (u_{F_A'}, k_{F_A'}) \text{ is the address of the current instruction}, \\
&\quad \% \text{which is removed from the set of enabled instructions. When the value} \\
&\quad \% \text{becomes available, this instruction will be added back to the set of enabled} \\
&\quad \% \text{instructions.}
\end{align*}

endlet
endif

where $A(j) = A(i), \quad j \neq i \\
= v, \quad j = i.$

The use of EC-elements in data structures permits the construction of a data structure before the values of all the components have been computed. Suspensions allow demand driven computation. A suspension is a tagged triple — a tag $\text{susp}$, the uid $u$ of some function activation, and $i$ the index of an instruction in the activation. The instruction $\text{SETSUSP}$ takes three arguments : an array $A$, an integer $v_1$ which is an index of the array, and another integer $v_2$ which is the index of an instruction in the template of the $\text{SETSUSP}$ instruction. $\text{SETSUSP}$ sets the $v_1$th element of the array to a suspension of the form $(\text{susp}, u_{F_A'}, v_2)$ where $u_{F_A'}$ is the uid of the activation template of the $\text{SETSUSP}$. When a $\text{SELECT}$ tries to access the element, the suspension is replaced by an EC-queue which contains the address of the $\text{SELECT}$ and a signal is sent to the instruction whose address is found in the suspension. The graph is so arranged that the arrival of the signal enables the instruction $(u_{F_A'}, v_2)$, which initiates the computation of the value of the element.
if \texttt{opcode} = \texttt{SETSUSP} then
   let
      \( u_1 = \texttt{op1} \),
      (\texttt{arr}, A) = H(u_1),
      \( (\texttt{int}, v_1) = \texttt{op2} \) % \((u_i, (\texttt{arr}, A)) \in H\)
      \( (\texttt{int}, v_2) = \texttt{op3} \) % \(v_1, v_2\) must be an integer
      \( u^i(\texttt{ECQ}, Q) = A(i) \) % \(A(i)\) must be an early-completion queue.
   endlet
   \( \text{Act'} \cdot \text{EIS}' = \text{SendToDestinations}(\text{Act}, \text{EIS}, u_F, \{(\text{unconditional}, d_x, \text{signal}), \text{signal}\}, \ldots, \{(\text{unconditional}, d_y, \text{signal}), \text{signal}\}) \)
   in
   \( |Q| = 0 \) then
      % put a suspension in the \textit{i}th element of the structure with uid \( u_F \)
      \( \text{Act}' \cdot \text{AddToHeap}(\text{DeleteFromHeap}(H, u_1, (\texttt{arr}, A), u_1, (\texttt{arr}, A'))) \). \( \text{EIS}' - \{(u_{FA'}, k_{FA'})\} \)
   else
      % just send a signal to the instruction whose index is \( v_2 \)
      let \( \text{Act}'' \cdot \text{EIS}'' = \text{SendResult}(\text{Act}', \text{EIS}', u_{FA'} \{\text{unconditional}, v_2, \text{signal}\), \text{signal}\}) \)
      in
      \( \text{Act}' \cdot \text{H}, \text{EIS}' - \{(u_{FA'}, k_{FA'})\} \)
endlet

endif

The \texttt{SWITCH} operator is used to implement the conditional graph schema. It takes two operands — a value \( v \) and a boolean \( b \). The condition fields of the destinations of the \texttt{SWITCH} operator must have values either \texttt{true} or \texttt{false}. If \( b \) is true, then \( v \) is sent to the destinations which are marked \texttt{true}, otherwise they are sent to the destinations marked \texttt{false}. The destinations (\texttt{true}, \( d_{T}, \text{opnum}_{T} \), \ldots, (\texttt{true}, \( d_{T'}, \text{opnum}_{T'} \)) denote the destinations to which the first operand must be sent if the second operand is true; otherwise the first operand is sent to the destinations (\texttt{false}, \( d_{F}, \text{opnum}_{F} \), \ldots, (\texttt{false}, \( d_{F'}, \text{opnum}_{F'} \)).
if 
   \texttt{opcode = SWITCH} then 
   let 
   \[ u = \texttt{lop1}, \]
   \[ (\texttt{bool, } b) = \texttt{lop2}. \] % \texttt{b} must be a boolean value. 
   \[ (\texttt{true, } d_{1i}, \texttt{opnum}_{1i}), ..., (\texttt{true, } d_{ip}, \texttt{opnum}_{ip}), \] 
   \[ (\texttt{false, } d_{qi}, \texttt{opnum}_{qi}), ..., (\texttt{false, } d_{fq}, \texttt{opnum}_{fq}) = \texttt{destlist}, \] 
   \[ \texttt{Act'}, \texttt{EIS'} = \] 
   if \[ b \] then 
   \[ \texttt{SendToDestinations}(\texttt{Act, EIS, } u'_{FA}, \{((dc_{1i}, \texttt{d}_{1i}, \texttt{opnum}_{1i}), \texttt{a}_{1i}), \ldots, \]
   \[ ((dc_{ip}, \texttt{d}_{ip}, \texttt{opnum}_{ip}), \texttt{a}_{ip})\}) \]
   else 
   \[ \texttt{SendToDestinations}(\texttt{Act, EIS, } u'_{FA}, \{((dc_{qi}, \texttt{d}_{qi}, \texttt{opnum}_{qi}), \texttt{a}_{qi}), \ldots, \]
   \[ ((dc_{fq}, \texttt{d}_{fq}, \texttt{opnum}_{fq}), \texttt{a}_{fq})\}) \] 
   endif 
   in 
   \[ \texttt{Act'}, \] 
   \[ \texttt{H}, \] 
   \[ \texttt{EIS'} \cdot \{u_{FA}', k_{FA}'\} \] 
   endlet 
endif

The \texttt{SWITCH-SIGNAL} operator is very similar. It takes a boolean operand; if the operand has \texttt{true} value then it sends signals to the destinations tagged \texttt{true}, otherwise it sends signals to the destinations marked \texttt{false}.

The \texttt{SIGNAL} instruction requires no operands; it becomes enabled when it receives a signal on each of the signal arcs incident on it. The result of executing the instruction is a signal which is sent to the destinations listed in the \texttt{dests} field of the instruction. The state transition rule for the \texttt{SIGNAL} instruction is very simple and will not be specified here.
if \( \text{opcode} = \text{APPLY} \) then
let
\( (\text{clsr}, C) = H(\text{lop1}) \), \( \% (u_p, \text{clsr}, C) \in H \)
\( u_p = \text{lop2} \), \( \% (u_p, \text{rec}, R) \in H \)
\( f^* = H(C(\text{FunctionToApply})) \), \( \% (u_f, f, F) \in H, u_f = C(\text{FunctionToApply}) \)
\( u' = \text{a new uid from U} \),
\( \text{Act}' = \text{AddToAct}(\text{Act}, u', F) \),
\( \text{Act}'' = \text{SendResult} \)
\( (\text{SendResult} \)
\( (\text{SendResult} \)
\( (\text{Act}', \text{EIS}, u', (\text{unconditional}, 1, \text{op1}), (\text{value, } u_1)) \), \( \% \text{function closure} \)
\( u', (\text{unconditional}, 2, \text{op2}), (\text{value, } u_2)) \), \( \% \text{arguments} \)
\( u', (\text{unconditional}, 3, \text{op1}), (\text{value, } \text{idestlist}) \) \( \% \text{return link} \)
in
\( \text{Act}'', \text{EIS}' = \{(u_{FA}, k_{FA})\} \)
endlet
endif.

The function closure, argument structure and the return link are sent to the first operand of the first, second and third instruction in the activation template of the called function. A \text{RETURN} instruction in the called function will send the result of the function application to the destinations of \text{APPLY}.

The \text{TAILAPPLY} instruction is used whenever tail-recursive function application is performed. It sends the closure, argument structure and the return link to the first operand of the first three instructions in the called activation. It then sends a signal to each of its destinations.
if \( \text{lopcde} = \text{TAILAPPLY} \) then

let

\( (\text{clsr}, C) = H(\text{lop1}). \)  \( \% (u, (\text{clsr}, C)) \in H, \text{the closure} \)

\( u_2 = \text{lop2}. \)  \( \% (u, (\text{rec}, R)) \in H, \text{the arguments} \)

\( u_3 = \text{lop3}. \)  \( \% (u, (\text{dests}, \text{Destinations})) \in H, \text{the return link} \)

\( F = H(C(\text{FunctionToApply})). \)  \( \% (u, (\text{fn}, F)) \in H, \text{f}_j = C(\text{FunctionToApply}) \)

\( u' = \text{a new uid from} \ U, \)

\( \text{Act}' = \text{AddToAct}(\text{Act}, u', F), \)

\( \text{Act}'', \text{EIS}' = \)  % function closure.

\( \text{SendResult} \)

\( (\text{SendResult} \)

\( (\text{Act}', \text{EIS}, u', (\text{unconditional, 1, op1}), (\text{value}, u_2)). \)  \( \% \text{arguments} \)

\( u', (\text{unconditional, 2, op2}), (\text{value}, u_3). \)  \( \% \text{return link}. \)

\( \text{Act}''', \text{EIS}''' = \)  % function closure.

\( \text{SendToDestinations}(\text{Act}, \text{EIS}, u', \{(\text{unconditional, 2, signal}), \text{signal})), \ldots, \)

\( ((\text{unconditional, 3, signal}), \text{signal})\}) \)

in

\( \text{Act}''', \text{H}, \text{EIS}'' - \{(u_{FA}, k_{FA})\} \)

endlet

endif

STREAM-TAILAPPLY is another instruction for function application and is used for tail-recursive evaluation of streams. It takes three arguments — a function closure which contains the stream producer, a record which will contain the next element of the stream, and the argument record. Its semantics are very similar to that of TAILAPPLY and it sends signals to the destinations listed in its \texttt{dests} field. The compilation of functions which generate streams and use this instruction is discussed in chapter 5.

The \texttt{RETURN} instruction is responsible for sending the result of a function activation to the caller. It requires two operands — a return link and a value. It constructs the addresses of the instructions which are to receive the result of the function invocation from the return link and sends the value to each of those destinations. In addition, it sends signals to those instructions in its own activation whose indices appear in the destination list of \texttt{RETURN}.
if opcode = RETURN then
  let
  \{d_{c_1}, d_{c_2}, ..., d_{c_p}, opnum_{c_1}, ..., opnum_{c_p}\} = H(\text{op1}),
  \% the list of destinations to which the value computed
  \% by the function must be forwarded
  u_y = \text{op2}, \% the value to be returned
  Act', EIS' =
  \text{SendToDestinations}(Act, EIS, u_{FA'}, \{((\text{unconditional}, d_{c_1}, \text{signal}), \text{signal}), ..., ((\text{unconditional}, d_{c_p}, \text{signal}), \text{signal})\}),
  \text{Act''}, EIS'' =
  \text{SendToDestinations}(Act, EIS, u_{FA'}, \{((\text{unconditional}, d_{c_1}, opnum_{c_1}), \alpha), ..., ((\text{unconditional}, d_{c_p}, opnum_{c_p}), \alpha)\})
  \text{in}
  \text{Act''},
  H,
  EIS'' \cdot \{u_{FA'}, k_{FA}\}
endlet
endif

The data flow graph of a function is arranged so that RELEASE is the last instruction to be enabled and executed in the activation. The effect of the instruction is to remove the activation it belongs to from the set of current activations in the system. In a "real" system, this would amount to the release of the storage occupied by the activation template to the pool of free storage in the system.

if opcode = RELEASE then
  DeleteFromAct(Act, u_{FA}, FA)
  H,
  EIS \cdot \{u_{FA}, k_{FA}\}
  \text{where FA is the activation template associated with } u_{FA}.
endif

2.5 Summary

In the preceding section a formal specification of VIM was given by defining its operational semantics. A brief description of the functional language VimVAL was given, and some example programs illustrated some features of the language. Some of the key features of Vim were described informally. Next, we developed a formal model of the abstract machine for VIM. An interpreter for executing the data flow instructions was defined. The operational semantics of the data flow instructions of the machine were presented. The state transition rules for operations on early-completion structures and suspensions were formalised.
This operational model is called the specification of VIM. The next chapter defines the operational semantics of L2, in which the machine L1 is refined to include a model for storage engines.
Chapter Three

Operational Semantics of VIM with Storage

The VIM computer system is envisioned to have a hierarchically organized physical storage consisting of main-memory and a disk. Information is brought from the disk into the main memory upon demand. Since the system has finite main storage, objects are periodically moved from the main memory to the disk to create free space in the main store into which objects from the disk are brought in. It is desired that the only information brought into the main memory from the disk are those required by the computation and no other.

The address space of the system is governed by the size of the disk. For the purpose of this thesis it is assumed that a sufficiently large disk is available, thus avoiding the complications of managing the disk space. To facilitate data transfers between the disk and the main store, the address space is divided into equal sized pages, where a page is a set of contiguous words in the address space. Objects are stored in these pages. If the page size is large, it may be necessary to pack multiple objects in a page to reduce internal fragmentation. It is likely that when a page containing an object referenced by the computation is brought into main memory, objects which are not required by the computation would also be brought in since they share the same page. This conflicts with our stated goal of reading in only referenced objects from the disk into the main memory.

In VIM the basic unit of storage allocation and for disk↔main store transfer is a chunk. A chunk is a small page, and is expected to consist of 24-32 words. Each chunk has a unique chunk identifier (cid). Since the pages are small, it is unnecessary to allocate multiple objects on the same chunk. Objects which are larger than one chunk are stored in data structures made of chunks.

The small page size in VIM allows us to allocate at most one object per chunk without causing significant wastage of storage space due to internal fragmentation. When an object residing on the disk is referenced, only that object (or part of the object, if it consists of many chunks) is brought into the disk. Since large data structures are composed of many chunks, choice of a suitable data structure
organization should permit large amount of sharing of information; this sharing is essential for efficient execution of structure operations in an applicative environment.

In L1 we saw that data structures reside on the heap and only the pointers to the structures (their uids) are passed among instructions. In this chapter the operational semantics of an extension of the model L1 is presented. The extended model, called L2, adds the notion of storage to L1 — data structures (arrays, records and oneofs) are stored in chunks. The storage representation of arrays in terms of chunks will be described; the concepts presented may be extended to the storage representation of records and oneofs.

The modelling of arrays as stored values makes it necessary for us to consider the issues of storage allocation and reclamation. The unit of storage allocation in L2 is a chunk. For the purpose of this thesis it is assumed that there is a large pool of free chunks (which are not part of any data structure). The allocation of a storage unit amounts to selecting a cid from this free pool and using the storage corresponding to that cid. In L2, chunks which are modelled as being in the main store are tagged accessible and the chunks which correspond to those resident in the disk are tagged inaccessible. It is assumed that the free cid selected corresponds to a chunk whose storage part is in the main memory (i.e. the chunk is tagged accessible).

A program executing in L2 exhibits dynamically changing storage requirements during the course of the computation. This variable demand arises from the fact that data structures are created and discarded (in the sense that they are not used any more) during program execution. The storage that is discarded can be reused for storing other data structures.

The function of a storage management scheme in a language implementation is two-fold — allocation of memory when the computation demands, and the reclamation of storage which contain the values of to discarded information structures. Reclamation of storage and its subsequent recycling allows the system to satisfy the storage requirements of the program within the existing bounds of the system. even though the total amount of memory (number of free chunks) requested by the computation far exceeds the total storage capacity of the system. There are two principal strategies for storage reclamation — the mark-and-sweep scheme and the reference-count scheme.
Mark and sweep garbage collection is the most common method of automatic storage reclamation. In a simple mark and sweep scheme, the size of the inaccessible information occupying the address space keeps growing until there is no free storage left, at which point all normal processing is suspended. All the units of storage which are in use are marked by tracing down all the structures which are accessible from the current state of the system. Then the entire memory is scanned to identify all the unmarked storage units — these are the discarded memory units which are aggregated into the pool of free chunks and used for reallocation. A drawback of this strategy is that if the address space is very large and the physical store spans disks, the process of garbage collection can be very expensive. Simple implementations of mark-and-sweep suspend all computations other than those for reclamation; a large number of disk accesses will imply that all other computations will remain suspended for a long time. Various algorithms have been proposed which permit concurrent execution of the mark-and-sweep reclamation and other computational activities (also known as real-time garbage collection); however, they are complicated and exhibit questionable performance in real-life situations. Real-time garbage collectors of acceptable performance are difficult to implement even on single-processor systems; how the schemes may be extended to perform garbage collection with acceptable efficiency in a multiprocessor architecture remains an open problem.

The notion of using reference counts on the information structures used by the computation has been around for a long time; however, there are only a scant number of garbage collectors which use reference counts exclusively. The basic idea is very simple. A counter is associated with each information unit; it keeps a count of the number of references to the structure in the system. The counter is incremented whenever a new reference is created and decremented when one is destroyed. When the count becomes zero, the structure becomes inaccessible from the computation and the storage occupied by it may be reclaimed for reallocation. This simple scheme has one major drawback which has prevented its use in any practical garbage collector so far — it cannot reclaim circular structures. However, it has been shown that memory reclamation using reference counts is possible in the presence of certain classes of circular structures [7, 17].

Circular structures can be created only if there are operations which cause side-effect. All operations in ViMsViM are free from side effects and so the user cannot create circular structures. The creation of circular structures by the interpreter (for whatever may be the purpose) has been precluded
by design\textsuperscript{5}. These features of the VIM system make it feasible to use a reference count mechanism for garbage collection. The principal argument against reference counts is that the cost of updating the count every time a structure is manipulated may be unacceptably high. A significant advantage of reference count mechanism for garbage collection is that storage reclamation is done concurrently with the computation. Also, the scheme appears to be more amenable to implementation on a multiprocessor architecture.

In L2, two counts — \texttt{refcnt} and \texttt{setcnt} are associated with each chunk. \texttt{refcnt} contains the count of the number of references to the chunk in the current state of the machine. A chunk is reclaimed when its \texttt{refcnt} field becomes zero. The \texttt{refcnt} fields of all the structures which are pointed at by the chunk whose \texttt{refcnt} field becomes zero are also decremented. The \texttt{setcnt} field of a chunk is used only if it is the root chunk of some structure. It keeps a count of the number of elements in the structure which are EC-queues.

Some simplifications (of "real" life behaviour of computer systems) have been made in developing the formal model L2 to reduce the complexity of the model within manageable limits. In a "real" system, information which is in main memory is immediately accessible; in L2 the chunks which correspond to those resident in the main store are tagged \texttt{accessible}. The chunks which correspond to those on the disk are tagged \texttt{inaccessible}. Instructions are tagged \texttt{executable} if they have received all their operands and signals. Only such an instruction is chosen for execution. The instruction executes ("runs to completion") only if all the chunks that it requires to access are tagged \texttt{accessible}. If such is not the case then the system requests that the chunk be made accessible and the instruction is tagged \texttt{dormant}. Some other \texttt{executable} instruction is then selected and run for execution. Eventually, the requested chunk becomes accessible and the instruction which requested the chunk is made \texttt{executable}.

Data flow graphs usually expose a high degree of concurrency in most programs. It is hoped that by using suitable program transformation techniques, the number of enabled instructions at any time during the execution of the program will be very large. The overlap between instruction execution and

\textsuperscript{5}In an implementation for his combinator reduction language, Turner [37] uses circular structures to model recursion.
disk accesses is the principal source of concurrency in the system. I expect that in an actual implementation the paging algorithms and instruction scheduling can be so ordered that the system seldom has to wait for a disk access to complete before it can execute an instruction. This translates into the following caveat — there must be at least one executable instruction during most of the time of the program execution.

3.1 Arrays and VIM-trees

We now examine a special kind of data structure called a VIM-tree, which is used to store the elements of an array in L2.

A positional $k$-tree is a directed tree with the following property: Each edge out of a node $v$ is associated with one of the numbers in $\{0, 1, \ldots, k-1\}$; different edges, out of $v$, are associated with different numbers. It follows that the number of edges out of a node is at most $k$, but may be less; in fact, a leaf has none.

We associate with each leaf node $v$ in a positional $k$-tree $V$ the word consisting of the sequence of numbers associated with the edges on the path from the root $r$ to the node $v$. This sequence is called the index word of node $v$. The index word also represents an integer in base $k$. The height of the tree is the length of longest index word in the tree [15].

A VIM-tree is a positional $k$-tree in which every node is associated with a chunk. The chunk associated with the root node is called the root chunk. A chunk has two parts — header and chunkstore. The header part of the chunk contains some book-keeping information about the chunk; the chunkstore contains the actual data values (or pointers to other chunks). Elements of an array are stored in the chunkstore part of some of the chunks associated with the leaf nodes. Those leaf chunks which contain elements of the array are called value chunks. All value chunks have index words of the same length. For convenience, the terms "value chunk" and "root chunk" will be used in place of "chunk associated with the leaf node" and the chunk associated with the root node". A word in the chunkstore of a value chunk in a VIM-tree is uniquely identified by the base-$k$ integer $w$, where $w$ is the index word of the value chunk and $i$ is the word number in the chunkstore. I shall use the terminology "the $wih$ word in the VIM-tree $I$" to denote the $i$th word in the chunk with index word $w$ in $V$; $wi$ is the word number (in
base \cdot k) for that element of the array in V. All the words of the chunkstore part of a chunk contain the value unallocated by default. If an arc from node A to B be marked i, then the ith word of chunk associated with A contains the cid of the chunk at B. If there is no node corresponding to the ith edge from A, the ith word of A contains the value unallocated.

The refcnt field of a chunk is a count of the number of occurrences of the cid of the chunk in the current state of the machine. This is used for reference counted automatic storage reclamation. The sectnt field of the root chunk contains the number of elements of the array which are currently EC-queues. This field is used to determine if an APPEND operation should be performed. The depth field of the root chunk has value d, where the height of the tree is d-1. It is used to construct the index word to the value chunk. The number of array elements that can be stored in a full VIM-tree of height d-1 is \( k^d \).

The lo and hi fields of the root chunk of VIM-tree V contain the low and high indices of the array whose elements are stored in the tree. If the low index of the array is \( p \) and the word number for the \( p \)th element in V is \( s_1 s_2 \ldots s_{d-1} s_0 \), then the \( m_{min} \) field of the root chunk is \( (p \cdot s_1 s_2 \ldots s_{d-1} s_0) \), all arithmetic being done in base-k. The \( m_{min} \) field is used to determine the index word of the Value chunk in which the element which is to be accessed resides.

Let an array \( A \) with index bounds \( p \) and \( q \) be stored in a VIM tree \( V \) in which the \( m_{min} \) and depth fields of the root chunk have values \( m_{min} \) and d. For every i within the bounds, \( (i \cdot m_{min}) \) is the word number corresponding to the ith element of the array \( A \). The word corresponding to the word number \( w_i \) in \( V \) is said to be shared if there is a chunk corresponding to the the index word \( w_i \) and the refcnt field of some chunk along the path from the root determined by the word index has value greater than 1. Otherwise, the word is unshared.

### 3.2 Operational semantics of L2

In 1.1 we saw that arrays are represented as abstract mathematical entities – as functions mapping integers to values. In 1.2 we augment 1.1 by introducing a storage model for arrays. This chapter focuses on the operations on arrays in 1.2. The operations on records and oneofs are very similar to those on arrays; records and oneofs may be regarded as fixed-length arrays from the point of view of the
implementation of their operations in the machine. Functions, activations, early-completions queues, etc. are still regarded as abstract mathematical entities. This simplifies the presentation and keeps the complexity of the model within reasonable bounds. The techniques illustrated in this thesis may be used to develop a model in which the aforesaid entities are also data structures, rather than sets and functions.

The rest of this chapter is a description of the model $L_2$. The notation used is the same as the one used in presenting $L_1$. Sets are denoted by bold font, elements of sets are denoted by italicised letters and names are written in a special font. Thus, This is a set, $This \in This$ and This is a name.

A State $S$ in $L_2$ is a four-tuple consisting of Act (the current set of function activations), $H$ (the heap), $EIS$ (the set of enabled instructions) and $C$ (the set of chunks which are currently in use to store the arrays on the heap).

$$\text{VIM} = \langle \text{Interp}, \text{State} \rangle \text{ where}$$

$$\text{State} = \text{Act} \times H \times EIS \times C$$

$$\text{Act} = U \rightarrow [N \rightarrow \text{Instruction}]$$

$$\text{H} = U \rightarrow [(\{fn\} \times \text{Function}) \cup (\{ecq\} \times ECQ)$$

$$\cup (\{instr\} \times \text{Instruction}) \cup (\{dests\} \times Dests)$$

$$\cup (\{clsr\} \times Clsr) \cup (\{arr\} \times \text{Structure})$$

$\text{Cid} = \text{the set of unique names of chunks}.$

The functions $\text{AddToHeap}$, $\text{DeleteFromHeap}$, $\text{AddToAct}$ and $\text{DeleteFromAct}$ are the same as defined in $L_1$.

A structure is defined by a $cid$ and a collection of chunks which store the values of the elements of the structure; the $cid$ corresponds to that of the root chunk of the VIM-tree.

$$\text{Structure} = \text{Cid} \times C$$

The set of enabled instructions is partitioned into two subsets depending on the tags on the instructions. When an instruction first becomes enabled it is tagged executable and added to the set of enabled instructions. Instructions with executable tags are tagged dormant if they attempt to access chunks which are inaccessible. dormant instructions become executable when the chunks they were trying to access become accessible.
EIS = I(EI)
EI = Status × U × N
Status = \{executable\} ∪ \{dormant × Cid\}

Each chunk has a unique name, a tag, and some storage called chunkstore. There is a unique
chunkstore associated with each cid. The storage part consists of some header and k words for values.

C = P(Chunk)
Chunk = Cid × \{accessible, inaccessible\} × Chunkstore
ChunkStore = Header × DataPart
Header = Int^5
DataPart = (Scalar ∪ U ∪ SUSP ∪ Cid ∪ \{unallocated\})^k
Both Header and DataPart are sets of ordered tuples.

The following notation is strictly adhered to in the rest of the presentation. \(Ch_i\) always represents
an element of Chunk. \(c_i\) denotes the unique name of the chunk \(Ch_i\) and \(cs_i\) denotes the chunkstore part
of the chunk \(Ch_i\). Thus, if \(Ch_w\) is a chunk, then \(c_w\) is its cid and \(cs_w\) is the chunkstore part of the chunk.
\(c_w.m_{min}, c_w.lo, c_w.hi, c_w.depth, c_w.rcnt\) and \(c_w.sctnt\) denote the first six elements of the ordered
tuple \(cs_w\). \(c_w[i]\) denotes the \((i + 6)\)th element in the tuple. The notation for drawing a chunk and
specifying the contents of the chunkstore part is given in figure 9.

The chunkgraph of an array \(A\) in L2 is defined as follows. Let \(V\) be the ViM-tree in which the
elements of the array \(A\) are stored. The nodes in the chunk graph correspond to the chunks associated
with the nodes in \(V\); if the edge from node \(A\) to \(B\) in \(V\) is marked \(i\), then an arc is drawn from the box
number \(i\) of the chunk associated with \(A\) to the chunk at \(B\). Chunkgraphs are an efficient and concise
notational convenience for specifying the operations on ViM-trees.

The chunkgraph of a heap is the collection of the chunkgraphs of the structures on the heap. The
chunkgraph of a heap provides an easy way of indicating the sharing of chunks among structures.

Scalar values are as defined in L1.

Scalars = Integers ∪ Reals ∪ Booleans ∪ Character ∪ Null
Integers = \{int\} × (\{undef\} ∪ the set of all integers)
Reals = \{real\} × (\{undef\} ∪ the set of all reals)
Booleans = \{bool\} × (\{true, false, undef\})
Character = \{char\} × (\{undef\} ∪ the set of characters in the machine.)
Null = \{null\} × \{nil, undef\}
The cid of the chunk

\[
\begin{array}{cccccc}
  m_{\text{min}} & \text{lo} & \text{hi} & \text{depth} & \text{setcnt} & \text{reftcnt} \\
\end{array}
\]

\[c \]

\[0 \ 1 \ 2 \ \bullet \ \bullet \ \bullet \ \bullet \ k-2 \ k-1\]

\[k\] words of the data part of the chunk.

The header part of the chunk

where \( c_i \) is the cid of the chunk and \( c_{\text{st}} \) is the chunkstore part of the chunk.

**Figure 9:** Notation for drawing a chunk and specifying its value. The chunkstore part of the chunk is represented by a box, which is divided into two rows of boxes. The upper row is divided into six boxes, one corresponding to each of \( m_{\text{min}}, \text{lo}, \text{hi}, \text{depth}, \text{setcnt} \) and \( \text{reftcnt} \) fields of the header part of the chunk, in that order. The second row contains the \( k \) words of the data part of the chunk. The unique name (cid) of the chunk is indicated at the left of the box. An unsigned integer in the \( \text{reftcnt} \) or \( \text{setcnt} \) field means that the field now has that value. A signed integer in the box implies that the new contents of the box is equal to the old contents added to the signed integer. This notation will be used to specify how the reference counts on chunks is incremented and decremented.

\[\text{ECE} = U \times N\]
\[\text{ECQ} = \mathcal{R}(\text{ECE})\]
\[\text{SUSP} = [U \times N]\]

The definitions of the sets Instruction, Function, ECE, ECQ and Dests are the same as in L1.

\[\text{Instruction} = \text{OPS} \times (U \cup \text{Scalars})^3 \times N^2 \times U\]
\[\text{Dests} = \mathcal{R}(\text{D})\]
\[D = \{\text{unconditional, true, false} \} \times N \times \{\text{op1, op2, op3}\}\]
\[\text{Clsr} = [(\{\text{FunctionToApply}\} \cup M) \rightarrow U]\]

where \( C(\text{FunctionToApply}) \in U \times (\{\text{fn}\} \times \text{Function}) \), \( C \in \text{Clsr} \)

The function \text{SendResult} which is invoked to dispatch the result of an instruction to the destination instructions is almost identical to the one in L1. The only difference between is that in L2 the instructions which become enabled are tagged \text{executable}; in L1 no tagging is done. The function \text{SendToDestinations} is the same as in L1.
SendResult : Act × EIS × U × D × [{value} × (U ∪ Scalars)] ∪ {signal} → Act × EIS

Define SendResult(Act, EIS, u_FA, (dc, i, opnum), result) ≡

let
    FA = Act(u_FA),
    I = FA(i)
in
    AddToAct(DeleteFromAct(Act, u_FA, FA), u_FA', FA') # new set of activations
    if (I.opcnt = 0) ∧ (I.sigcnt = 0) then EIS ∪ {executable, u_FA', i}
    else EIS
endif
endlet
where
    FA(j) = FA(j), j ≠ i, I' = I, j = i.
and I' ∈ Instruction.
    I.op1 = if opnum ≠ op1 then I.op1 else I where result = (value, I)
    I.op2 = if opnum ≠ op2 then I.op2 else I where result = (value, I)
    I.op3 = if opnum ≠ op3 then I.op3 else I where result = (value, I)
    I.opcnt = if opnum ∈ {op1, op2, op3} then (I.opcnt - 1) else I.opcnt
    I.sigcnt = if opnum = signal then (I.sigcnt - 1) else I.sigcnt
    I.destlist = I.destlist
endfun.

The function Interp maps a state and an enabled instruction to a new state. The enabled instruction chosen by Choice must have tag executable.

Interp : State × Choice(EIS) → State
    where State = Act × H × EIS × C.

A practical implementation of VIM would have a finite amount of main store and a very large amount of storage space on the disk. The contents of a chunk cannot be read unless it is present in the main memory. Chunks are read into the main memory from the disk on demand. Eventually there may not be any free storage in the main memory into which chunks may be brought in. The system frees main storage by moving some chunks from the main memory to the disk and declaring the main storage that was occupied by them to be free. New chunks from the disk are placed in this free storage, which is then marked as occupied. Effectively, chunks which are resident in the disk are not directly accessible to the computation. Thus chunks become accessible/inaccessible during the execution of a program, depending on whether they move from the disk to main memory, or from main store to disk.
Chunks which are modelled as being in the main memory are tagged \textit{accessible}, otherwise they are tagged \textit{inaccessible}. To capture this notion of chunks becoming inaccessible during program execution, a function \textit{PageOut} is introduced. \textit{PageOut} selects some chunks in the current state of the machine which have tag \textit{accessible} and marks them as \textit{inaccessible}.

\textit{PageOut} : C \rightarrow C

Every element marked \textit{dormant} in the set \textit{EIS} \in \textit{EIS} contains a pointer to a chunk (the \textit{cid} of the chunk); the tag on the chunk is \textit{inaccessible}. \textit{Fetch} selects some such \textit{cid}, tags it as \textit{accessible} and all the instructions which had become \textit{dormant} trying to access this chunk are tagged \textit{executable}. The action performed by \textit{Fetch} corresponds to the conventional notion of pages being brought into the main memory from the disk.

\textit{Fetch} : \textit{EIS} \times C \rightarrow \textit{EIS} \times C

Let \textit{c} be the \textit{cid} of a chunk which is tagged \textit{inaccessible} and let \( W = \{(u, k) : (\text{dormant}, c, (u, k)) \in \text{EIS}\}, W \neq \emptyset \), in some state (\textit{Act, H, EIS, C}) of the machine. \textit{Fetch}(\textit{EIS, C}) returns a new set of enabled instructions given by \( \text{EIS} \cup \{(\text{executable}, u, k) : (u, k) \in W)\} - W \), and a new set of chunks specified by \( \text{C} \cup \{c, \text{accessible, cs}\} - \{(c, \text{inaccessible, cs})\} \) where \textit{cs} is the chunkstore associated with the chunk with name \textit{c}.

The main loop of the machine is defined by the following tail-recursive function. The machine executes an instruction, makes some chunks inaccessible and then makes some of the demanded chunks accessible.
Define $MainLoop(S: State) \equiv$

let
  $(Act_1, H_1, EIS_1, C_1) = S$
in
  if $EIS = \{\}$ then halt
else
  let
    $e = \text{Choice}(EIS)$ where $e = (\text{executable}, u_{FA}, k_{FA})$,
    $EIS, C = \text{Fetch}(EIS_1, \text{PageOut}(C_1))$, 
    $H = H_1$, 
    $Act = Act_1$
  in
    $MainLoop(\text{Interp}(Act, H, EIS, C), e)$
endlet
endif
endlet;

$\text{Interp}$ defines the manner in which the state transitions are made, depending on the instruction which is being executed.

Define $\text{Interp}((Act, H, EIS, C), e) \equiv$

let
  $(\text{status}, u, k) = e$, 
  \text{The instruction must have tag executable.}
  $FA = Act(u)$, 
  $l = FA(k)$
in

if $\text{opcode} = \text{IADD}$ then ...
elseif $\text{opcode} = \text{MKINTARRAY}$ then ...
...
elseif $\text{opcode} = \text{APPLY}$ then ...
...
endif
endif
endlet

The notation 

$(Act, H, EIS, C) \leftarrow (Act', H', EIS', C')$ on $e$

denotes that if the state of the machine given by $(Act, H, EIS, C)$ is the argument to $MainLoop$ then $(Act', H', EIS', C')$ is the result of executing the instruction $e$ chosen by $\text{Choice}$, and invoking $\text{PageOut}$ and $\text{Fetch}$ in sequence.

We are now equipped to describe the actions of the interpreter. The scalar operations do not
affect the heap and so the C component of the state is unaffected. The state transition rule of the
instruction is almost identical to that in L1, the only difference being the introduction of the fourth
component in the state (which remains unchanged).

\[
\text{if } l.o.p.c.d.e = IADD \text{ then}
\]
\[
\text{let}
\]
\[
\begin{align*}
(int, m) &= l.o.p.1, \\
(int, n) &= l.o.p.2, \\
sum &= m + n,
\end{align*}
\]
\[
Act', EIS' =
\]
\[
\text{SendResult}( \\
\text{SendResult}(...) \\
\text{SendResult}(Act, EIS, (u_{FA}, (\text{unconditional}, d_1, opnum_1)), \alpha_1), ... ..., \\
(u_{FA}, (\text{unconditional}, d_n, opnum_n)), \alpha_n))
\]
\[
in \\
Act', \\
H, \\
EIS' = \{(\text{executable}, u_{FA}, k_{FA})\}, \\
C
\]
\[
\text{endif}
\]
\[
\text{where } \alpha_i \in \{(\text{value, sum}), \text{signal}\}
\]

Other scalar operations have very similar state transition rules and do not affect the heap or the set
of chunks.

Operations on arrays will now be described. Figures are used to explain the algorithms for
building and manipulating the trees of chunks that store the contents of the arrays.

The state resulting from the execution of the MKINTARRAY instruction is described below.
if \( \text{opcode} = \text{MKINTARRAY} \) then

let
(\text{int}, p) = \text{lo},
(\text{int}, q) = \text{lo2},
\( n = \|\text{dests}\| \),
(\text{c}_1, \text{accessible}, \text{cs}_1) = \text{a new chunk},
u = \text{a new uid},
A = (u_\text{arr}, c_1, ((\text{accessible}, \text{cs}_1)))

\text{Act}', \text{EIS}' =

\text{SendToDestinations}(\text{Act}, \text{EIS}, u_{F_A'}, \{(\text{unconditional}, d_1, \text{opnum}_1), \alpha_1 \}, ..., 
((\text{unconditional}, d_n, \text{opnum}_n), \alpha_n))

in

\text{Act}', \text{EIS}'

\text{AddToHeap}(H, u, A),
\text{EIS}' - \{(\text{executable}, u_{F_A'}, k_{F_A'})\},
C \cup \{(c_1, \text{accessible}, \text{cs}_1)\}

endlet
endif

where \( \alpha_j \) is either \((\text{value}, u)\) or \text{signal}.

As mentioned earlier, it is assumed that an accessible free chunk is available. The contents of the chunk with \text{cid} \text{c}_1 is in figure 10. The contents of this chunk along with the definition of the \text{SELECT} operation ensure that any \text{SELECT} operation on this tree produces the an undefined value.

---

The cid of the chunk

\[
\begin{array}{cccccccc}
\text{m}_{\text{min}} & \text{lo} & \text{hi} & \text{depth} & \text{setcnt} & \text{reftcnt} \\
\hline
\text{p} & \text{p} & \text{q} & d & 0 & n \\
\hline
\text{UA} & \text{UA} & \text{UA} & \bullet & \bullet & \bullet & \text{UA} & \text{UA} & \text{UA} \\
\hline
0 & 1 & 2 & \bullet & \bullet & \bullet & k-2 & k-1
\end{array}
\]

\text{k words of the data part of the chunk.}

The header part of the chunk

\text{Figure 10:} The chunk structure created by the \text{MKINTARRAY} instruction. The entire tree of chunks is not created; only the root chunk is allocated. The symbol \text{UA} stands for unallocated. The \text{depth} field is set to the value \( d = \lfloor \log_k (u \cdot p + 1) \rfloor \).
Let us now consider the effect of the instruction \texttt{MKINTARRAYEC} on the state of the machine. It acquires a collection of chunks which are not members of the set \( C \) in the current state of the machine. These chunks are used to build a tree which is illustrated in figure 11.

\[
\text{if } \texttt{opcode} = \texttt{MKINTARRAYEC} \text{ then} \\
\text{let} \\
\quad (\text{int}, p) = \texttt{op1}, \\
\quad (\text{int}, q) = \texttt{op2}, \\
\quad n = |\texttt{dests}|, \\
\quad M = q-p+1, \\
\quad N = M(k^d+1)/(k-1), \\
\quad \{(c_1, \texttt{accessible}, cs_1), \ldots, (c_N, \texttt{accessible}, cs_N)\} = N \text{ free accessible chunks}, \\
\quad u = \text{a new uid}, \\
\quad A = (u, (\text{arr, } c_1, \{(c_1, \texttt{accessible}, cs_1), \ldots, (c_N, \texttt{accessible}, cs_N)\})), \\
\quad \text{Act}', \text{EIS}' = \\
\quad \text{SendToDestinations}(\text{Act}, \text{EIS}, u, \mathcal{A}', \{(\texttt{unconditional, } d_1, \text{opnum}_1, \alpha_1), \ldots, \\
\quad \texttt{unconditional, } d_n, \text{opnum}_n, \alpha_n)\}) \\
\quad \text{in} \\
\quad \text{Act}', \\
\quad \text{AddToHeap}(H, u, A), \\
\quad C \cup \{(c_1, \texttt{accessible}, cs_1), \ldots, (c_N, \texttt{accessible}, cs_N)\} \\
\quad \text{endlet} \\
\text{endif}
\]

where \( \alpha \) is either \texttt{signal} or \{(\texttt{value, } u)\}, the contents of the chunks is shown in figure 11 and the resulting heap \( H' \) is specified by augmenting the chunkgraph of the heap \( H \) by the chunkgraph shown in figure 11.

The \texttt{APPEND} operation is by far the most complex operation. It requires three arguments — an array, an integer index and a value (uid of some other structure or a scalar value). Recall that the \texttt{APPEND} operation creates a new array only if its argument \texttt{array} (first operand) does not have any \texttt{EC}-queues; otherwise it is attempted for reexecution at some later time. In L2 the \texttt{setcnt} field of the root chunk of the first operand (which must be an array) contains number of \texttt{EC}-queues in the structure. If the \texttt{setcnt} field is not zero, then no change occurs in the state. Instead, some other instruction is selected by the \texttt{Choice} function in the next iteration of \texttt{MainLoop}. The \texttt{APPEND} instruction remains in the set of enabled instructions and will be eventually executed, when all the \texttt{EC}-queues have been replaced by values by \texttt{SET} instructions.

If the first operand to \texttt{APPEND} is a shared structure (many pointers to it exist in the current state
of the system), then a new array is created which shares a number of chunks with the argument array. If the value of the refcnt field of the root chunk of the first operand to APPEND is one, then it is possible to perform an in place update. This condition under which no copying need be performed at all to implement the APPEND operation can be taken advantage of by the compiler. If the integer index (the second operand) is outside the bounds of the argument array, APPEND creates a larger array. The transition rules are specified in the following pages.

Let the array on which the APPEND instruction is to performed have the chunkgraph shown in Figure 12.
if `opcode = APPEND` then
  let
    \( u_1 = \text{op1} \),
    \((\text{arr}, c, \{\{c_1, a f_1, c s_1\}, \ldots, (c_N, a f_N, c s_N)\}) = H(u_1)\),
    \((\text{int}, o) = \text{op2} \),
    \( x = \text{op3} \),
    \( p = c_{1.\text{lo}} \),
    \( q = c_{1.\text{hi}} \),
    \( d = c_{1.\text{depth}} \),
    \( n = |/\text{dests}|\),
    \( u_2 = \text{new uid from } U \)
  in
  if \(|i|: \text{element with word index } i \text{ of the array is a suspension}
    \text{ or an } FC\text{-queue} | > 0 \)
    then \text{Act. } H, \text{FIS, } C
      \% There are FC-queues in the structure - APPEND does not execute.
    else if \( p \leq i \leq q \)
      then if (word number \((i-c_{1,m_{\text{mon}}})\) in VIM-structure with root \(c_1\) is unshared)
        then ...
else ...
\hspace{1em} % The element is shared.
\hspace{1em} endif
\hspace{1em} elsif \(c_i.m_{\text{min}} < i < p\) then ...
\hspace{1em} elsif \(i < c_i.m_{\text{min}}\) then ...
\hspace{1em} elsif \(q < i < c_i + k^d\) then ...
\hspace{1em} else ...
\hspace{1em} % \(i > c_i + k^d\) ...
\hspace{1em} endif
\hspace{1em} endlet
\hspace{1em} endif

First consider the case in which the \(i\)th element of the array is not shared among any other structures. The APPEND instruction performs an in place update by replacing the value of the \(i\)th element of the array by the new value \(x\). The Vim-tree represents a new array, which is reflected by the new uid that is associated with the structure. The set of chunks \(C\) remains unchanged from the previous state.

If any chunk with \(cid \ c\) whose contents need to be accessed during the execution of the APPEND instruction is found to have an inaccessible tag, the state \(S_f = (Act_f, H_f, EIS_f, C_f)\) resulting from the execution of the APPEND instruction is defined below.

\[
\begin{align*}
Act_f &= Act \\
H_f &= H \\
EIS_f &= (EIS \cup \{(dormant, c), u_{FA}, k_{FA}\}) - \{(executable, u_{FA}, k_{FA})\} \\
C_f &= C
\end{align*}
\]

The instruction thus remains in the set of enabled instructions. Eventually when the chunk \(c\) becomes accessible (caused by the function \(Fetch\)), it will again become executable.

The actions of the interpreter for the APPEND operation will now be presented. Consider first the case when the \(i\)th element of the array on which the APPEND is performed is an unshared element and \(i\)th lies within the bounds of the array\(^6\).

\(^6\)A simplifying assumption made here is that there is a leaf chunk corresponding to the \(i\)th element of the array. If that is not the case then new chunks are added to the set of chunks in the state and the set \(C\) is augmented suitably.
if $p \leq i \leq q$ and (word number $(i - c_i.m_{min})$ in VIM-structure with root $c_i$ is unshared) then

let

$$Act', EIS' =$$

$$SendResult(\)$$

$$SendResult(...)$$

$$SendResult(Act, EIS, (u_{FA}, (\text{unconditional}, d_i, opnum_i)), a_1, ...) ...,$$

$$(u_{FA'}, (\text{unconditional}, d_n, opnum_n)), a_{n'})$$

in $Act'$,

$$(H \cdot (u_1, (\text{arr}, c_1, \{ (c_1, \text{accessible}, cs_1), ..., (c_N, \text{accessible}, cs_N) \})))$$

$$\cup \{ (u_n, (\text{arr}, c_1, \{ (c_1, \text{accessible}, cs_1), ..., (c_N, \text{accessible}, cs_N) \}) \}$$

$$EIS' \cdot \{ (\text{executable}, u_{FA}, k_{FA}) \}$$,

$C'$ where $C'$ reflects the fact that an in situ change has been made to the chunks of the structure $A$ as shown in figure 13.

endif
structure is created by copying chunks along the access path. The reference count of the argument structure \( A \) is decremented by 1. If \( \texttt{refcnt} \) becomes zero, the node is deleted from the heap and the reference counts of all the chunks that \( c_i \) points at are also decremented. This may cause a cascade of deletion of nodes from the heap. If any of the chunks whose contents needs to be accessed (for decrementing its reference count field or for reading the contents of its chunkstore) is tagged \textbf{inaccessible} then the instruction is made dormant and no changes are made to the \textit{Act}, \textit{H} and \textit{C} components of the state. The state transition rule for such a case has already been described.

\[
\text{if } p \leq i \leq q \text{ and (word number } i-c_i \text{ in VIM-structure with root } c_i \text{ is shared) then let}
\]

\[
\text{Act}', \ EIS' =
\text{SendResult(), SendResult}(...)
\text{SendResult(Act, EIS, (unconditional, } d_i \text{, opnum}_i, a_i), ...)
\text{SendResult( (unconditional, } d_n \text{, opnum}_n, a_n), ...)
\text{(c', accessible, cs', ...)} = \text{new chunks unused in C,} \\
H' = H \cup \{u_2, \text{ (arr, } c_1, \text{ chunks shared between argument and result arrays)} \cup \{(c'_1, \text{ accessible, cs'}_1), ...,(c'_n, \text{ accessible, cs'}_n)\} \\
\{\text{structures whose root chunk has refcnt field has value zero}\}
\]
Figure 14: Chunkgraph produced by APPEND on the *th element of the structure whose chunkgraph is shown in figure 12. This is the case when \( p \leq i \leq q \) and the element is a shared element.

Chunks along the access path are copied, while others are shared as shown in figure 16. The \( \text{lo} \) field of the root chunk of the VIM-tree for the result array is set to \( i \). No change is made to the value represented by the argument structure. The \( \text{refcnt} \) field of \( c_1 \) is decremented. If that becomes zero, then the usual process of decrementing reference counts is started. The deletion of structures must be reflected in the resulting heap and set of chunks in the state.
if $c_i, m_{min} < i < p$ then
let
  $Act', EIS' = ...$
  in
  ...
endlet
endif

The action of the interpreter on this occasion is similar to that in the previous cases. See figure 16 for the chunkgraph of the part of the heap which is affected.

---

**Figure 15:** Chunkgraph for result of `APPEND` when $c_i, m_{min} < i < p$. 

---
Now we consider the case in which the height of the VIM-tree resulting from the APPEND on $A$ is greater than the VIM-tree for $A$. This happens when $i < A.m_{min}$. The VIM-tree for the result array $A'$ created by APPEND is of the minimum such height that $A'.m_{min} < i$, and $\text{SELECT}(A, A.m_{min}) = \text{SELECT}(A', A.m_{min})$. The semantics of the APPEND operation (as specified in 1.1) is preserved; a formal proof of this assertion will be presented in the next chapter. The reference count of $A$ does not change since one reference to it is consumed by APPEND while a new one is inserted in $A'$.

Given $i < c_{1}.m_{min}$, a tree of the minimum possible height must be constructed such that if $c_{1}'$ be the root of the resulting tree then $i \cdot c_{1}'.m_{min} > 0$. The appropriate height can be computed as follows.

Let $d'$ be the height of the resulting tree, which is shown in figure 17. Clearly, the word number of $c_{1}.m_{min}$ in this tree is 100..00 such the length of the sequence is $d'$.

\[
\cdot \cdot c_{1}.m_{min} \cdot c_{1}'.m_{min} = 100..00
\Rightarrow c_{1}'.m_{min} = c_{1}.m_{min} \cdot (100..00), \text{ in base } k
\Rightarrow c_{1}'.m_{min} = c_{1}.m_{min} - k^{d'-1}.
\]

Now, $i \cdot c_{1}'.m_{min} \geq 0$

\[
\Rightarrow i \cdot c_{1}.m_{min} + k^{d'-1} \geq 0
\Rightarrow k^{d'-1} \geq c_{1}.m_{min} \cdot i
\Rightarrow d'-1 \geq \log_k (c_{1}.m_{min} \cdot i)
\Rightarrow d' \geq \log_k (c_{1}.m_{min} \cdot i) + 1
\Rightarrow d' = \lceil \log_k (c_{1}.m_{min} \cdot 0) \rceil + 1.
\]

It is of interest to note that the word number of $c_{1}.m_{min}$ could also be 8000..00, where $1 \leq i \leq k$. The arithmetic would be vary similar to the above case.

The instruction becomes dormant if any of the required chunks is tagged inaccessible. The reflcnt field of the argument array need not be decremented because even though the instruction consumes a reference to athe array, a new one is created in the new structure. Therefore, in this case no storage reclamation will be triggered.
if \( i < c_1.m_{min} \) then
let
\[ \text{Act'}, EIS' = \ldots \]
\[ \ldots \]
endlet
endif

see figure 17.

The cases for the rest of the two cases are similar to the previous two cases and are not discussed.

The state transition rule for the SELECT operation is specified below. The select operation decrements the reference count of the argument structure. If the element which is being accessed by SELECT is an EC-queue then the address of SELECT is placed on the EC-queue and the instruction is removed from EIS. If the element is a suspension then the suspension is replaced by an EC-queue containing the address of SELECT and a signal is sent to the instruction whose address was specified in the suspension. If the element is a scalar value then the value is simply dispatched to the destinations. If the result of SELECT is a structure, then its refcnt field is incremented by \( n \), the number of destinations of SELECT. If \( A_{\text{refcnt}} \) becomes zero due to the decrementing of the reference count it is deleted from the heap. The refcnt fields of the chunks that the root chunk \( c_1 \) of \( A \) points at are also decremented and the chunk \( c_1 \) reclaimed. The decrementing of refcnt fields may trigger more reclamation. If any of the chunks which needs to be accessed is tagged inaccessible then SELECT is made dormant; no changes are made to the Act, H and C components of the state.
Figure 16: Chunkgraph for result of `APPND` when \( i < c_1 m_{\text{min}} \). The height of the result VIM-tree is \( d = \lceil \log_k (c_1 m_{\text{min}} + 1) \rceil + 1 \). \( a = c_1 m_{\text{min}} - k^{d-1} \).
if \textit{opcode} = \textit{SELECT} then
\begin{itemize}
  \item \textit{let}
  \begin{itemize}
    \item $u_1 = \textit{op1}$,
    \item $(\text{arr}, c_1, \{(c_1, a_1, c_{s_1}), ..., (c_{N'}, a_{N'}, c_{s_{N'}})\}) = H((u_1),$
    \item $(\text{int}, i) = \textit{op2}$.
  \end{itemize}
\end{itemize}
\begin{itemize}
  \item \textit{in}
  \begin{itemize}
    \item if $(i < c_1.l_0)$ or $(i > c_1.h_1)$ then \textit{undef}
    \item else
    \begin{itemize}
      \item \textit{let}
      \begin{itemize}
        \item $x =$ contents of word number $(i - c_1.m_{\text{min}})$ in the VIM-tree with root $c_1$
      \end{itemize}
      \item \textit{in}
      \begin{itemize}
        \item if $x$ is a scalar then ...
        \item elseif $x$ is a uid $u_B$ corresponding to structure $B$ then ...
          \begin{itemize}
            \item \% send the value and increment \textit{refcnt}
            \item \% field of the root chunk of the VIM-tree for $B$.\end{itemize}
        \item elseif $x$ is an EC-queue then add $(u_{FA}, k_{FA})$ to the queue, etc.
        \item elseif $x$ is a suspension then \{replace suspension by EC-queue with $(u_{FA}, k_{FA})$\}
          \begin{itemize}
            \item increment $c_1.setcnt$, decrement $c_1.refcnt$, etc.\end{itemize}
      \end{itemize}
    \end{itemize}
  \end{itemize}
\end{itemize}
\end{itemize}
endif
endlet
endif
endlet
endif
endif

Observe that if $c_1.refcnt$ becomes zero after \textit{SELECT} executes, the node corresponding to the uid $u_1$ is deleted from the resulting heap and the \textit{refcnt} fields of all the structures that it points to are decremented. If any chunk required by \textit{SELECT} (either for accessing the element or for updating the \textit{refcnt} fields) is tagged \textit{inaccessible}, the instruction is made dormant, thus changing \textit{EIS}; the other components of the state remain unaffected. When the chunk becomes \textit{accessible} later, the \textit{SELECT} instruction becomes \textit{executable}.

The transition rule for \textit{SET} is given below.
if $\textit{opcode} = \text{SET}$ then
  let
  $(u_1 = \text{\textit{op1}},
  \text{\textit{arr}}, c_1, \{(c_1, a_{f_1}, c_{s_1}), \ldots, (c_N, a_{f_N}, c_{s_N})\}) = H(u_1),
  (\text{\textit{int}}, i) = \text{\textit{op2}},
  x = \text{\textit{op3}}$
  in
  if the $(i-c_1,m_{min})$th word of the VIM-tree with root $c_1$ is an EC-queue $Q$ then
    let
    $\text{\textit{Act}}, EIS' = \text{send signals to destinations of } e,$
    $EIS'' = EIS' \cup \{\text{\textit{executable}}, u', k' : (u', k') \in Q\},$
    $H' = \text{the old } H \text{ plus the change in the contents of the chunk of } A \text{ is reflected},$
    $C' = \text{same as } C \text{ except that the chunk containing the EC-queue contains } x$
    and decrement $\text{refcnt}$ and $\text{setcnt}$ fields of $c_1$
  in
  $\text{\textit{Act}}, H', EIS'', C'$
  endlet
else
  let
  $\text{\textit{Act}}, EIS' = \text{send signals to destinations of } e$
  in
  $\text{\textit{Act}},$
  $H,$
  $EIS',$
  $C$
  endlet
endlet
endlet
end

The transition rule for \texttt{SETSUSP} is defined as follows.
if \( \text{opcode} = \text{SETSUSP} \) then

let
\[
(u_1, (\text{arr}, c_1, \{(c_1, af_1, cs_1), \ldots, (c_N, af_N, cs_N)\}) = \text{lop1},
\]
\[
(\text{int}, i) = \text{lop2},
\]
\[
(\text{int}, m) = \text{lop3}
\]
in
if word number \((i \cdot c_1, m_{min})\) in \(\text{VIM-tree}\) with root \(c_1\) is an empty EC-queue then

let
\[\text{Act}', \text{EIS}' = \text{send signals to destinations of } e,\]
\[H' = \text{reflect the fact that the } i\text{th element of the array is a suspension and the EC-queue at the element is deleted from heap and elements deleted from heap due to refcnt becoming zero},\]
\[C' = \text{different from } C \text{ in that the chunk containing the } i\text{th element now contains a suspension and decrement refcnt field of } c_1\]
in
\[\text{Act}', \]
\[H',\]
\[\text{EIS}',\]
\[C'\]
endlet
elsif the element is an non-empty EC-queue then

let
\[\text{Act}', \text{EIS}' = \text{send signals to destinations of } e' \text{ and to } (u_{E_0}, m),\]
\[C' = \text{decrement refcnt fields of } c_1 \text{ and reclaim, if applicable},\]
\[H' = \text{reflect the changes due to changes in } C \text{ and due to reclamation}\]
in
\[\text{Act}', \]
\[H', \]
\[\text{EIS}', \]
\[C'\]
endlet
else

let
\[\text{Act}', \text{EIS}' = \text{send signals to destinations of } e',\]
\[C' = \text{decrement refcnt fields of } c_1 \text{ and reclaim, if applicable},\]
\[H' = \text{reflect the changes due to changes in } C \text{ and due to reclamation}\]
in
\[\text{Act}', \]
\[H', \]
\[\text{EIS}', \]
\[C'\]
endlet

endif
endlet
endif

The set of instructions related to function application (\texttt{APPLY}, \texttt{TAILAPPLY}, \texttt{STREAM-TAILAPPLY}, \texttt{RETURN}) also do not affect the heap or the set of chunks and their transition rules may thus be directly adapted from 1.1.
3.3 Discussion

The problem of efficiently implementing data structures in functional languages is of long standing. Various solutions have been proposed by researchers so that APPEND type operations can be done on such aggregates of values without copying the entire array.

I-structures proposed by Arvind are write-once structures. I-structures solves the problem of read-before write synchronization in a concurrent system; however, the implementation described causes the structure to be copied when an append operation is done on it.

Myers proposed an implementation of applicative lists on AVL-trees [30]. The paper describes a generalization of an AVL-tree, called an AVL-dag, which is used as a representation for linear lists. He presents algorithms which oerform applicative manipulation of linear lists in time that is proportional to the logarithm of the length of the list. He also gives algorithms that perform SELECT and APPEND operations on fixed sized arrays with \( N \) elements in time \( O(KN^{1/K}) \), where \( K \) can be chosen arbitrarily.

Hudak and Bloss [21] have recently proposed schemes for statically inferring situations in which an APPEND may be implemented as place updates. Failing this, they propose that reference counts be maintained on the structures; the entire structure is copied if an APPEND occurs on a structure with its reference count greater than one.

None of the above solutions satisfactorily address the issue of sharing information among structures. The new data structure VIM-tree proposed in this thesis allows APPEND and SELECT operations to be performed in logarithmic time. Moreover, the algorithms common information to be shared among structures, so that the storage requirements of programs is reduced significantly. The following factors influenced the design of the data structure for representing arrays: sharing, fast SELECT and APPEND operations, and the constraint that the storage has a physical hierarchy. B-trees and AVL-trees were considered candidates for representing arrays; however, analysis indicated that the amount of processing required to balance the trees is substantially more than that for VIM-trees. It was found that balancing a \( k \)-way AVL-tree or B-tree would require a larger number of chunks (than for a VIM-tree) to be accessed, many of which might not be in the main memory. It was desirable to keep the branching factor of the trees quite high so that the depth of the tree that had to be traversed to perform the frequent SELECT operations would be quite low. For example, an array of 4096 elements can be
stored in a 16-way VIM-tree of height three. The use of trees of chunks to store data structures also eliminates the problem of compaction.

This chapter also described a reference count mechanism which is used for reclamation of chunks. Reference counting permits real-time garbage collection in VIM. The operational semantics of the instructions are defined such that if a chunk required to be accessed by an instruction is not in the main memory, then the instruction is not executed. In a more detailed model, it may be possible (and maybe worthwhile in an actual implementation) to consider partial execution of instructions, so that an instruction is removed from the set of enabled instructions once it has been chosen for execution by the scheduler.
Chapter Four

Equivalence of L1 and L2

In chapter 2 the operational semantics of L1 was presented. L1 was then refined to model hierarchical physical stage consisting of main memory and disk and a representation for data structures of the storage model was described in Chapter 3. As mentioned earlier, L1 provides the specification for any implementation of an architecture for VIM. We desire that programs executed on L1 produce the same result as that produced by running the program on L2, and vice versa. This would demonstrate that L2 indeed satisfies the specifications of L1. A formal proof of the equivalence of L1 and L2 is presented in this chapter.

Let $P$ be a program written in the base language. Let $\text{Translate}_{L1}$ translate a program in the base language to some initial state for L1. Similarly, $\text{Translate}_{L2}$ produces an initial state in L2 for a program in the base language.

The heap of L1 is a directed, acyclic graph in which the nodes are the arrays, closures, etc., and an arc between two nodes in the graph denotes that one structure is a component of the other. We want to capture the concept of a node in the heap being accessible (not to be confused with tags accessible) from the current State of the computation. If the node is not accessible then the structure associated with the node is not usable, and is garbage.

Definition 4-1: Let $S = (Act, H, EIS)$ be a state of machine L1. A node on $H$ in $S$ is reachable if one of the following two conditions are satisfied:

1. There is some unexecuted instruction in $Act$ which has a pointer to the node (holds the uid of the node), or

2. It is a component of some structure on $H$, and the node corresponding to that structure is reachable in $S$.

The preceding chapter gave an informal definition of the chunkgraph of a heap. The chunkgraph of a heap in some state is now formally defined in graph-theoretic terms.

Definition 4-2: Let $S' = (Act', H', EIS', C') \in \text{State in L2}$, let $C = \{C_1, C_2, ..., C_n\}$, where $C_i$ is of the form $(c_i, af_i, cs_i)$ and $af_i$ is either accessible or inaccessible.
The chunkgraph of $H$ in $S$ is a graph $G = (V, E)$ where $V = \{Ch_1, Ch_2, ..., Ch_n\}$ and $E = \{(c_i, c_j) : (\exists p \in \{0, 1, ..., k-1\}) c[p] = c\}$. Recall that the $c[p]$ denotes the contents of the $n$th word in the chunkstore part of the chunk whose $cid$ is $c$.

It is possible that two different states in L2 represent the same set of values, the only difference being that the set of chunks used to store the elements of arrays in the two states are different. Two such states are said to be similar, the formal definition of the similar relation is given below.

Define two functions $StValue_1$ and $StValue_2$ for L1 and L2, respectively, as follows. If $u$ be a uid and $H$ be a heap in some state in L1 then $StValue_1(u, H)$ returns an ordered set; the elements of the set are the results of SELECT operations done on the array associated with $u$, SELECT being performed for all integer indices. If there is no structure in L1 with uid $u$, the result of $StValue_1(u, H)$ is the null set. $StValue_2$ is also defined in the same manner, except that the heap must be in some state in L2.

**Definition 4-3:** Let $S_1, S_2 \in \text{State}$ in L2, where $S_1 = (\text{Act}_1, H_1, EIS_1, C_1)$ and $S_2 = (\text{Act}_2, H_2, EIS_2, C_2)$.

$S_1$ and $S_2$ are similar if

1. $\text{Act}_1 = \text{Act}_2$
2. $\forall u \in U [StValue_2(u, H_2) = StValue_2(u, H_2)]$
3. $(\forall u \in U)(\forall k \in N)(\forall s \in \text{Status}) [(u, k, s) \in EIS_1 \Rightarrow [(3 s' \in \text{Status}) : (u, k, s') \in EIS_2]]$
4. The chunkgraphs for $H_1$ in $S_1$ and $H_2$ in $S_2$ are isomorphic.

The last condition simply indicates that in two similar states the chunks exhibit the same sharing relationships, except that the chunkids are different.

It is easily seen that the similar relation is an equivalence relation, and the set of similar states constitutes an equivalence class.

**4.1 Proof of Equivalence of L1 and L2**

Informally, two machines are equivalent if they produce identical results for a given program. Moreover, there is a correspondence between computational states that each machine goes through.
We shall use an adaptation of the McGowan mapping technique for proving the equivalence of two machines [25]. A similar technique was used by Berry [6] to prove the equivalence of two information structure models of block structured languages.

Let $P$ be a program in the base language. Schematically, the computation of $P$ on L1 and L2 is shown in the Figure 18. A computation is a sequence of states starting with some initial state; if the computation terminates, then the final state contains the result. $\text{Translate}_{M}(P)$ produces an initial state for the program $P$ on the machine $M$. $S_n$ is the final state of L1 on computation of $P$ and $\text{ResultValue}_{M}(S_n)$ prints the value of the node on the heap which contains the result of the computation. Let $S_0$ be some initial state of machine L1. The computation of the machine L1 starting at the initial $S_0$ is denoted by $\text{MainLoop}_{L1}(S_0)$. Function $\text{Final}(\text{MainLoop}_{M}(S_0))$ gives the final state of the machine in a computation, if it halts. Similar notation is used for the machine L2. In the following discussion, rule definitions which have the same name in both L1 and L2 are distinguished by subscripted L1 or L2.

\[ \begin{array}{c}
\overset{\text{Translate}_{L1}}{S_0} \quad \overset{\text{Translate}_{L2}}{S'_0} \\
\varphi \quad \varphi \quad \varphi \quad \varphi \\
S_1 \quad S'_1 \quad S_k \quad S'_{n-1} \\
\rho \quad \rho \quad \rho \quad \rho \\
S_{n-1} \quad S_n \\
\overset{\text{ResultValue}_{L1}}{} \\
\overset{\text{ResultValue}_{L2}}{}
\end{array} \]

Figure 17: McGowan mapping of states of L1 and L2.

The mappings $\varphi$ and $\rho$ provide a map between the states of the machines L1 and L2 on the computation $P$. To show that L1 and L2 are equivalent for programs in the base language, it suffices to construct mappings $\varphi$ and $\rho$

\[
\forall S \quad \varphi : S \to S \\
\forall S \quad \rho : S \to S'
\]
where $S' \in \text{State}_{L_2}$ and $S \in \text{State}_{L_1}$ such that for all programs $P$ in the base language the following holds:

Let $S_0 = \text{Translate}_{L_1}(P)$ and $S_0' = \text{Translate}_{L_2}(P)$. $\text{MainLoop}_{L_1}(S_0)$ produces in succession the states $S_0, S_1, ...$ and $\text{MainLoop}_{L_2}(S_0')$ produces in succession $S_0', S_1', ...$

Then

1. $\rho(S_0) = S_0'$
   $\varphi(S_0') = S_0'$

2. If $\rho(S_i) = S_i'$ and $S_i \neq \text{Final}(\text{MainLoop}_{L_1}(S_i))$ then $\rho(S_{i+1}) = S_{i+1}'$.

   If $\varphi(S_i') = S_i$ and $S_i' \neq \text{Final}(\text{MainLoop}_{L_2}(S_i'))$ then $\varphi(S_{i+1}') = S_{i+1}$.

3. If $S_n = \text{Final}(\text{MainLoop}_{L_1}(S_n))$ then $\rho(S_n) = \text{Final}(\text{MainLoop}_{L_2}(S_n'))$

   If $S_n' = \text{Final}(\text{MainLoop}_{L_2}(S_n'))$ then $\varphi(S_n') = \text{Final}(\text{MainLoop}_{L_1}(S_n))$

4. $\text{ResultValue}_{L_2}(S_n') = \text{ResultValue}_{L_1}(\varphi(S_n'))$ and $\text{ResultValue}_{L_1}(S_n) = \text{ResultValue}_{L_2}(\rho(S_n))$ where $S_n = \text{Final}(\text{MainLoop}_{L_2}(S_n'))$ and $S_n' = \text{Final}(\text{MainLoop}_{L_1}(S_n))$

Theorem 4.4: $L_1$ is equivalent to $L_2$ for the base language.

Proof: We exhibit McGowan mappings $\varphi$ and $\rho$ to prove the equivalence. $\varphi$ maps a state $S'$ of $L_2$ to a corresponding state $S$ of $L_1$, and $\rho$ maps a state $S$ of $L_1$ to a corresponding state $S'$ of $L_2$.

First we describe mapping $\varphi(S') = S$.

Let $S' = (\text{Act}', H', \text{EIS'}, C')$.

1. Construct $\text{Act} = \text{Act}'$.

2. Construct $H$ as follows. Let $U_H = \{u : u$ is the uid of node on $H'\}$.

   $\forall u \in U_H \ [\text{if } H(u) \in \text{Structure} \text{ then } (u, A) \in H \text{ such that } A \in \text{Array} \text{ and } \text{SValue}_{L_1}(u, H) = \text{SValue}_{L_2}(u, H')$

   otherwise $(u, H'(u)) \in H]$

   Clearly, if the reference counting is done correctly, then the following holds:

   $\forall u
   \{i : 3\{u\} \times \{\text{arr}\} \times \text{Array} \in H \text{ and Select}(u', d) = u\} = A\text{.refcnt where } (\text{arr}, A) = H(u) \text{ and } A = (u, c_1, \{(c_1, \text{of}_{c_1}, c_3), ..., (c_N, \text{of}_{c_N}, c_N)\})$.

3. Construct $\text{EIS}$ as follows.

   $\text{EIS} = \{(u, k) : (\text{Status} \times \{(u, k)\}) \in EIS'\}$
Secondly, we describe the mapping $\rho : S \rightarrow S'$ such that $S' = \rho(S)$.

Let $S = (Act, H, EIS)$.

1. Construct $Act' = Act$

2. Construct $H'$ as follows.

   Let $U_R = \{u : u$ is the uid of a reachable node in $H\}$

   For $u \in U_R$, if $u$ is the uid of an array then $(u, c_1, \{c_1, af_1, cs_1\}, \ldots, (c_n, af_n, cs_n)) \in H'$ such that $StValue_{L_2}(u, H') = StValue_{L_1}(u, H)$. Also, $c_1.refcnt = (The\ number\ of\ occurrences\ of\ the\ uid\ u\ in\ unexecuted\ instructions\ of\ Act\ or\ in\ reachable\ nodes\ in\ H)$ and $c_1.setcnt = (The\ number\ of\ suspensions\ or\ EC-queues\ in\ the\ structure\ with\ uid\ u)$.

   If $u$ is not the uid of an array (or record or oneof) then $(u, H(u)) \in H'$.

3. $EIS' = \{(executable, u, k) : (u, k) \in EIS\}$

4. $C' = \{(c, af, cs) : \exists u \exists c_1 \exists C : (u, c_1, C) \in H' \text{ and } (c, af, cs) \in C\}$.

Observe that $S'$ is a member of an equivalence class under the similar relation.

Thirdly, we show that $\varphi$ and $\rho$ meet the requisite conditions.

Condition 1:
If $\varphi(S'_0) = S_0$ then $\rho(S'_0) = S''_0$, where $S''_0$ is a member of an equivalence class under the similar relation. Also, $\varphi(S''_0) = S_0$. By judicious of chunks, we can obtain $\rho(S'_0) = S''_0$.

Condition 2:
If $\varphi(S'_i) = S_i$ and $S'_i$ is not a final state then $\varphi(S'_{i+1}) = S'_{i+1}$. We shall consider each instruction and show that the maps hold after the execution of the instruction provided the map was correct prior to the execution of the instruction.

Let $\epsilon' = (executable, u, k) \in EIS'$, which is executed. Assume that all the chunks needed by $Interp_1$, during the execution of $\epsilon'$ are tagged accessible by some omniscient Fetch and PageOut functions. Essentially, we are ensuring that an instruction in L2 does not become dormant during execution. (We shall later see that this assumption places no restriction on the generality of the result).

For ease of exposition, define three functions $\varphi_{Act'}, \varphi_{EIS'}$ and $\varphi_{II}$ such that $\varphi(S') = \varphi((Act', H', EIS')) = (\varphi_{Act'}(Act'), \varphi_{II}(H'), \varphi_{EIS'}(EIS'))$.

1. Scalar instructions:
   The result of the instruction is sent to instructions in $Act'_{i+1}$, the resulting activation being $Act'_{i+1}$.
The same set of instructions in $Act_i$ receive the result of the scalar instruction in L1, to produce $Act_{i+1}$.

The actions of $SendResult_{L1}$ and $SendResult_{L2}$ on $Act_i$ and $Act'$ is the same, as is easily seen by looking at their definitions. Therefore

$$\varphi_{Act}(Act'_{i+1}) = Act_{i+1}$$

$$EIS'_{i+1} = (EIS_i' - \{e'\}) \cup \{(executable, u, k): (u, k) is a destination of e'$$
and $l.opcnt = 0$ and $l.sigcnt = 0\}$, where $l = (Act(u))(k)$.

Also, $EIS_{i+1} = (EIS_i - \{e\}) \cup \{(u, k): (u, k) is a destination of e' and l.opcnt = 0$ and $l.sigcnt = 0\}$, where $l = (Act(u))(k)$.

Since $\varphi_{EIS}(EIS') = EIS_i$, it is clear that $\varphi_{EIS}(EIS'_{i+1}) = EIS_{i+1}$.

The heap is unaffected. Therefore $H'_{i+1} = H_i$ and $H_{i+1} = H_i$. Since $\varphi_{H}(H') = H$, it is obvious that $\varphi_{H}(H'_{i+1}) = H_{i+1}$.

Thus for scalar instructions the map $\varphi$ holds.

2. APPEND instruction

By the same reasoning as used for scalars, in the state produced by APPEND, $\varphi_{Act}(Act'_{i+1}) = Act_{i+1}$, under the stated preconditions.

We have $\varphi_{H} = H$. We will look at the different cases that arise during the execution of APPEND.

Let $u_1$ be the uid of the array $A$; $u_1$ is the first argument to APPEND.

a. Number of EC-elements or suspensions in the array $> 0$ : $EIS'_{i+1} = EIS_i$ in L2.

By precondition and the definition of $\varphi$ and $p$, in L1, this corresponds to the case that $\{|i: element i of A in H is a suspension or an EC-queue\}| > 1$. Therefore $EIS_{i+1} = EIS_i$. Therefore $\varphi_{EIS}(EIS'_{i+1}) = EIS_{i+1}$.

The heap is unaffected since the instruction does not execute. Thus $\varphi_{H}(H'_{i+1}) = H_{i+1}$ since $\varphi_{H}(H') = H_i$.

Therefore $\varphi$ holds for this case of APPEND.

b. No suspensions or EC-elements and $A.refcnt = 1$.

As before, $\varphi_{Act}(Act'_{i+1}) = Act_{i+1}$ and $\varphi_{EIS}(EIS'_{i+1}) = EIS_{i+1}$.

After $e'$ executes, $A$ is deleted and is absent in $H'_{i+1}$; a new array is placed on $H'_{i+1}$. All structures which were pointed at by $A$ have their refcnt field decremented. If they are pointed at only by $A$ then their refcnt will
become zero and they will be absent in $H'_{i+1}$.

Since $\varphi_{S'} = S_j$ and $A.refcnt = 1$, $A$ must be reachable from $H'_j$ only from $e$. Therefore, after $e$ executes, $A$ is no longer reachable in the new heap $H'_{i+1}$ and a new heap is added to this heap for the new array. All nodes in $H'_j$ which were reachable only through $A$ will also become unreachable in $H'_{i+1}$.

Therefore under the given preconditions, $\varphi_H(H'_{i+1}) = H_{i+1}$.

Thus, $\varphi(S'_{i+1}) = S_{i+1}$ for this case of APPEND.

c. No suspensions or FC-elements and $A.refcnt > 1$

As in the previous case, $\varphi_{Act}(Act'_{i+1} = Act_{i+1}$ and $\varphi_{EIS}(EIS'_{i+1}) = EIS_{i+1}$.

After $e'$ executes, $A$ remains on the heap since its refcnt field does not become zero. A new array is added to the heap $H'_{i+1}$ and the value of its refcnt field is equal to the number of destination instructions of $e'$.

Since $\varphi_H(H'_{i}) = H'_j$, $A$ is pointed at by many objects in the current State, and is reachable from more than one instruction or reachable structure. After $e$ executes, $A$ becomes unreachable in $H'_{i+1}$ through $e$. However, $A$ remains reachable on the heap.

Therefore, given that $\varphi(S'_j) = S_j$, we have $\varphi_H(H'_{i+1}) = H_{i+1}$ for this case of APPEND.

This finally yields $\varphi(S'_{i+1}) = S_{i+1}$ for the APPEND instruction under the given preconditions.

3. SELECT instruction

Let $A$ and $j$ be the first and second operands of SELECT, respectively. We shall prove that $\varphi$ holds for SELECT by considering the various cases that arise during the execution of SELECT.

a. Element being accessed is not FC-queue or suspension: By reasoning as for scalars, $\varphi_{Act}(Act'_{i+1}) = Act_{i+1}$ and $\varphi_{EIS}(EIS'_{i+1}) = EIS_{i+1}$.

In L2, if the result of SELECT is a structure $B$ then $B.refcnt = n$, the number of destinations of $e$. If $A.refcnt = 1$ then $A$ is deleted from $H'_{i+1}$. If $A$ is deleted from the resulting heap, the refcnt fields of the structures it points to must also be decremented.

In L1, if the result of the instruction is a structure $B$, it is reachable in $H'_{i+1}$ from the instructions which are the destinations of $e$. If $A$ were reachable only through $e$ then it becomes reachable unreachable in $H'_{i+1}$.
as do all structures which were reachable only through \( A \) in \( H_f \).

Thus we get that \( \varphi_{H_f}(H_{i+1}) = H_{i+1} \) showing that \( \varphi(S'_{i+1}) = S_{i+1} \) for this case of SELECT.

b. **Element being accessed is an EC-queue**

\[ EIS'_{i+1} = EIS_i - \{ e' \} \]

Also, the execution of \( e \) in L1 yields \( EIS_{i+1} = EIS_i - \{ e \} \).

Since \( \varphi_{EIS}(EIS') = EIS_i \), it is obvious that \( \varphi_{EIS}(EIS'_{i+1}) = EIS_{i+1} \).

\[ Act'_{i+1} = Act'_{i} = Act_i = Act_{i+1} \]

so that \( \varphi_{Act}(Act'_{i+1}) = Act_{i+1} \).

On executing \( e' \), the heap \( H'_{i+1} \) is the same as \( H_i \) except that the EC-queue has an additional element \((u, k)\). On executing \( e \) in L1, the resulting heap \( H_{i+1} \) is the same as \( H_i \) except that the EC-queue has an additional element \((u, k)\).

Since \( \varphi_{H}(H') = H \), clearly it is the case that \( \varphi_{H}(H'_{i+1}) = H_{i+1} \).

Therefore, under the given preconditions, \( \varphi(S'_{i+1}) = S_{i+1} \) for this case of SELECT.

c. **The element is a suspension**

The execution of \( e' \) in L2 causes its removal from \( EIS'_{i+1} \). A signal is sent to the suspended instruction whose address is found in the suspension, which may become enabled.

The actions of L1 are the same so that \( \varphi_{EIS}(EIS'_{i+1}) = EIS_{i+1} \) and \( \varphi_{Act}(Act'_{i+1}) = Act_{i+1} \) since \( \varphi(S') = S_f \).

In L2, if the \( i \)th element is a non-empty EC-queue then \((u, k)\) is added to the EC-queue and this change is reflected in the heap \( H'_{i+1} \). The same actions occur in L1 to yield \( H_{i+1} \).

In L2, if the \( i \)th element is an empty EC-queue then it is replaced by an EC-queue containing only \((u, k)\) and the EC-queue is added to the heap \( H'_{i+1} \). The same action occurs in L1.

Therefore, given that \( \varphi_{H_f}(H') = H \), it is the case that \( \varphi_{H_f}(H'_{i+1}) = H_{i+1} \).

Therefore, given the \( \varphi(S'_i) = S_{i+1} \) and \( S'_i \) is not a final state, \( \varphi(S'_{i+1}) = S_{i+1} \) after the SITE instruction is executed.

4. **SITE instruction**

Let \( u_i \) be the uid of the array \( A \) which is the first argument of SITE, and let \( j \) and \( x \).
be the second and third operands.

Signals are sent to the destinations of \( e' \), producing \( Act'_{i+1} \). In L1, signals are set to the corresponding destinations of \( e \) in \( Act_r \) yielding \( Act_{i+1} \).

Since \( \varphi_{Act'}(Act') = Act_r \), we get \( \varphi_{Act}(Act'_{i+1}) = Act_{i+1} \) after SET executes.

After \( e' \) executes it is removed from \( EIS'_{i+1} \). The addresses of the instructions in the EC-queue for the \( i \)th element of \( A \) are added to the set of enabled instructions. The instructions which receive signal from set and thus become enabled are added to the enabled instruction set, too, to yield \( EIS'_{i+1} \). The corresponding actions occur in L1; if \( (u', k') \) is an entry in the EC-queue then \((\text{executable}, u', k')\) is added to the set of enabled instructions in L1.

Therefore, given that \( \varphi_{EIS}(EIS') = EIS_r \), we get \( \varphi_{EIS}(EIS'_{i+1}) = EIS_{i+1} \).

The heaps are affected in the following way. The EC-queue at the \( i \)th element \( A \) is replaced by \( x \). If the \( i \)th element is an EC-queue then decrement \( \text{Asetcnt} \).

The corresponding actions occur in L1 with heap \( H_r \). The EC-queue at the \( i \)th element of \( A \) becomes unreachable in \( H_{i+1} \). \( A \) now has one less EC-queue, if the \( i \)th element was an EC-queue.

Therefore, given that \( \varphi_{H}(H') = H_i \) we get \( \varphi_{H}(H'_{i+1}) = H_{i+1} \).

Therefore, given the condition that \( \varphi(S_i') = S_i \) and \( S_i' \) is not the final state, \( \varphi(S'_{i+1}) = S_{i+1} \) for the SET instruction.

5. **SETSUSP instruction**

Let the three arguments to the instruction be \( u_1, i \) and \( m \). Let \( u_1 \) be the uid of the array \( A \).

After \( e' \) executes, it disappears from \( EIS'_{i+1} \). If the \( i \)th element of \( A \) is not an empty EC-queue then a signal is sent to the destination instruction \((u, m)\). Corresponding action occurs in L1. Also, signals are sent to the destinations of \( e' \) and \( e \) in L2 and L1, respectively. Thus the corresponding set of instructions get enabled in the two machines.

\( \varphi(S_i') = S_i \) implies that after setsusp executes \( \varphi_{EIS}(EIS'_{i+1}) = EIS_{i+1} \) and \( \varphi_{Act}(Act'_{i+1}) = Act_{i+1} \).

If the \( i \)th element of \( A \) is an empty EC-queue, then it is replaced by a suspension. The EC-queue is deleted from the heap \( H_{i+1} \). The same is done in L1 to produce the heap \( H_{i+1} \).

If the \( i \)th element of \( A \) is a non-empty EC-queue then \( H_{i+1} \) and \( H_{i+1} \) are identical to \( H_{i+1} \) and \( H_{i+1} \) respectively.
Thus for both cases, given that $\varphi_{H}(H_f) = H_i$, we conclude that $\varphi_{H}(H_{i+1}) = H_{i+1}$.

From the above analysis we conclude that given $\varphi_{s} = S_i$ and $S'_{i+1}$ is not the final state, $\varphi(S_{i+1}) = S_{i+1}$ for the SET instruction.

6. MKINTARRAY and MKINTARRAYEC instructions

After $e'$ executes it disappears from $EIS'_{i+1}$. The uid of the result of the instruction (or a signal) is sent to the destinations of $e'$, which may then become enabled and so would be added to the set of enabled instructions to yield $EIS'_{i+1}$. Corresponding actions are performed by the L1 machine. Hence, given that $\varphi(S') = S_i$ we conclude that $\varphi_{EIS}(EIS'_{i+1}) = EIS_{i+1}$ and $\varphi_{Act}(Act'_{i+1}) = Act'_{i+1}$.

Suppose the instruction is MKINTARRAY. The execution of $e'$ causes a new node corresponding to an array to be added to the heap $H_{i+1}$. Any SELECT operation on the array produces an undefined value. In L1, a new node is added to the heap corresponding to an array due to the execution of $e$. Recall that in L1 an array is represented as a function mapping indices to values; the array added to the heap maps all indices to the undefined value.

Suppose the instruction is MKINTARRAYEC. $e'$ adds a new node to the heap such that an element within the bounds of the array points at an empty EC-queue. Any SELECT operation which tries to access an element outside the bounds of the array would produce an undefined value. In L1, the function is defined such that the indices within the specified bounds map to empty EC-queues and indices outside the bounds are mapped to undefined values.

Therefore, given the $\varphi_{H}(H_f) = H_f$, we get $\varphi_{H}(H_{i+1}) = H_{i+1}$.

Hence, given $\varphi(S') = S_i$ and $S'_{i+1}$ is not the final state, $\varphi(S_{i+1}) = S_{i+1}$ for the MKINTARRAY and MKINTARRAYEC instructions.

7. APPLY instruction

A new function activation is added to $Act'_{i+1}$ as a result of the execution of $e'$. The first three instructions of the new activation receive the closure, argument list and return link, respectively. Corresponding action occurs in L1. Hence, under the given preconditions, $\varphi_{Act}(Act'_{i+1}) = Act'_{i+1}$.

$e'$ is deleted from $EIS'_{i}$ and instructions which received the closure, argument list or return link, if they become enabled, are added to the set of enabled instructions to yield $EIS'_{i+1}$. L1 acts in the same manner to yield $EIS'_{i+1}$.

Therefore given that $\varphi_{H}(EIS'_{i}) = EIS_{f}$, we get $\varphi_{EIS}(EIS'_{i+1}) = EIS_{i+1}$, provided that $S_i$ is not a final state.

The heap is unaffected, so that $H_f = H_{i+1}$ and $H_i = H_{i+1}$. Since we have $\varphi_{H}(H_f) = H_f$, it is trivially true that $\varphi_{H}(H_{i+1}) = H_{i+1}$.
under the given preconditions, \( \varphi(S_{i+1}^{'}) = S_{i+1} \).

The cases for \text{TAILAPPLY} and \text{STREAM-TAILAPPLY} are similar and are left as an exercise.

8. \text{RETURN} instruction

A glance at the transition rules for the \text{RETURN} instruction in L1 and L2 tells us that there is no difference between them, except that the instructions which become enabled in L2 are tagged \textbf{executable}. The heap is unaffected. Thus we can conclude that \( \varphi(S_{i+1}^{'}) = S_{i} \) given that \( \varphi(S_{i}^{'}) = S_{i} \) and that \( S_{i}^{'} \) is not a final state.

9. \text{RELEASE} instruction

The execution of \( e' \) deletes the activation of \( e' \) from \( Act_{i}^{'} \) to produce \( Act_{i+1}^{'} \). Similarly, the activation of \( e \) disappears from \( Act_{i} \) to produce \( Act_{i+1} \). Since \( \varphi(S_{i}^{'}) = S_{i} \), we conclude that \( \varphi(Act_{i+1}^{'}) = Act_{i+1} \).

\[ EIS_{i+1}^{'} = EIS_{i}^{'} - \{ e' \} \] \[ EIS_{i+1} = EIS_{i} - \{ e \} \]

Therefore, under the stated preconditions, \( \varphi_{EIS}(EIS_{i+1}^{'}) = EIS_{i+1} \).

The assertions of the validity of the mapping for the heap needs some comments. The transition rule for this instruction is identical in both machines. In L1, all nodes in the heap \( H_{i} \), which were reachable only through instructions in the activation to which \text{RELEASE} belongs become unreachable.

The base language ensures that a \text{RELEASE} instruction is enabled only after all the instructions which have pointers to nodes corresponding to structures — arrays, records, etc. on the heap have been executed. This is done by arranging the data flow graph for a function such that the \text{RELEASE} instruction is the last instruction to become enabled. This ensures that when \text{RELEASE} is enabled, the instructions which received structure operands have already executed; if the structures were pointed at only by the instructions, then their \text{relevant} field must be 1 in \( H_{i} \) and would become zero after the execution of \( e' \) and would be deleted from the heap. Therefore, the mapping \( \varphi_{H} \) still holds.

Thus we conclude that if \( S_{i} \) is not a final state and \( \varphi(S_{i}^{'}) = S_{i} \) then after \text{RELEASE} executes, \( \varphi(S_{i+1}^{'}) = S_{i+1} \).

We demonstrated that the \( \varphi \) indeed provides the desired mapping. Similar reasoning may be used to show that the mapping \( \rho \) is preserved by instruction execution. The map \( \rho(S) \) is a member of an equivalence class under the similar relation. This is an artifact due to the different chunks which may be used to represent the same data structure, and hence, the heap.

Thirdly, we prove that \( \varphi \) and \( \rho \) satisfy the third condition for equivalence. Clearly, if \( S_{i} \) is \text{final} then \( \varphi(S_{i}^{'}) \) is final both will have \( EIS_{i} = EIS_{i}^{'} = \{ \} \). Similarly, if \( S_{i}^{'} \) is final then \( \rho(S_{i}^{'} \) is final.
Finally, let $S'_n$ and $S''_n$ be the final states of machines L1 and L2, respectively. By the previous two conditions, $\varphi(S''_n) = S'_n$ and $\rho(S'_n) = S''_n$, by a judicious choice of chunks.

We proved that at every step during the execution of program $P$ the set of values represented on the heap were the same. Therefore, $\text{ResultValue}_{L1}(S'_n) = \text{ResultValue}_{L2}(S''_n)$.

We now argue informally that even if the Fetch function is not omniscient, that is, it does not mark as accessible all the chunks required during the execution of the instruction, the result produced by the computation is the same as would be if it were omniscient. Recall that if a chunk which is required to be accessed during the execution of an instruction is marked inaccessible, the instruction is made dormant and no change is made to the Act, H and C components of the machine. The state of the machine is changed only by the fact that the status of the instruction changes in the EIS component of the state. Some other instruction may be selected for execution; eventually when the chunk becomes accessible the instruction becomes executable. The order of execution of instructions is thus different. Let the sequence of states during the computation of program $P$ on L2 (in which the instructions may become dormant) be $S''_1 S''_2 \ldots S''_n$. The map $\varphi$ gives us a corresponding sequence of states in L1. Since this is a non-deterministic state transition system, it can be shown that the result of this computation is the same as for the original sequence on L1, and so the result is the same.

4.2 Discussion

It was shown that L2 satisfies the specifications of VIM by proving that L1 and L2 are equivalent. This technique of taking a component of the VIM system (in this case, it is the structure memory) and designing an implementation which is formally proved to satisfy the specifications given by L1 can be extended to design the entire machine. At each step of the implementation, the resulting formal model must be proven to satisfy the specifications of L1. This methodology of designing by successive refinement with formal proofs of equivalence between the original model and its refinement will be very useful in the design of more complex implementations of VIM, such as in a computer system with multiple processing elements.
Chapter Five

A Base Language for VIM

I shall now present a base language for VIM; I shall do this by describing the base language graph corresponding to the VIMVAL language constructs. These graphs are characterised by a property called safety. A graph is safe if every instruction (in the graph) which receives a structure operand becomes enabled before the activation, of which the instruction is a part, is removed from the set of activations by the RELEASE instruction. The firing of the instruction would cause the reference counts of the the operand structures to be appropriately decremented. The discussion in chapter 4 described why the two machines are equivalent for this base language. In terms of pragmatics, the use of safe graphs would ensure that the chunks which contain garbage values are reclaimed in the actual implementation of VIM. The base language described is not the only possible one; the intent of this chapter is to give the reader a flavor of how one might go about designing the language. No formal proof will be given to demonstrate that the graphs are indeed safe and that a compiler using the base language as its target language will always generate safe graphs; the interested reader may convince himself of the safety of programs written in this base language by examining each graph and the operational semantics of each of its instructions.

5.1 The Let expression

The let construct permits local bindings to be expressed and is of the form:

\[
\begin{align*}
\text{let} & \quad x_1 = E_1, \\
& \quad x_2 = E_2, \\
& \quad \ldots \\
& \quad x_n = E_n \\
\text{in} & \quad f(x_1, x_2, \ldots, x_n) \\
\text{endlet}
\end{align*}
\]

where the \( f \)'s all represent data flow graphs.

The base language graph for the above let expression is shown in figure 19.
5.2 Conditional Expression

The conditional expression in VIMVAL is of the form:

\[
\begin{align*}
\text{if } f & \text{then } g(x_1, x_2, \ldots, x_n) \\
\text{else } & h(x_1, x_2, \ldots, x_n)
\end{align*}
\]

where \( g \) and \( h \) are data flow graphs and \( f \) denotes a graph that computes a boolean result. In some graphs, the values of all the operands of an instruction are known at the time of compilation. It is necessary to send a signal to such instructions to enable them. If there is such an instruction in either of graphs \( g \) and \( h \) and the instruction is not in the graph of a let or tagcase expression inside \( g \) or \( h \), then the SWITCH-SIGNAL instruction is used; it is omitted otherwise. Each branch of the if expression must produce the same number of signals; this is accomplished by feeding all the signals in an arm into a SIGNAL instruction which generates one signal. If neither of the arms produces signals, then the parts of the graph that deal with them may be omitted. If the RELEASE instruction is triggered by the signal produced by the if expression, and \( f, g \) and \( h \) are all safe graphs, then it is the case that the graph for the conditional expression is also safe.
5.3 The Tagcase Expression

Let \( T \) be a oneof with tags \( t_1, t_2, \ldots, t_n \). The tagcase expression is:

\[
\text{tag} t_1 : E_1(u_1, u_2, \ldots, u_d);
\]
\[
\text{tag} t_2 : E_2(v_1, v_2, \ldots, v_p);
\]
\[
\vdots
\]
\[
\text{tag} t_n : E_n(z_1, z_2, \ldots, z_m)
\]
\text{endtag}

Figure 19: Translation of a conditional expression
where the $E$'s are data flow graphs. The use of the \texttt{SWITCH-SIGNAL} and \texttt{SIGNAL} instructions is governed by the same considerations as for the conditional expression.

\begin{figure}
\begin{center}
\includegraphics[width=\textwidth]{figure20}
\end{center}
\caption{Translation of a \texttt{tagcase} expression}
\end{figure}
5.4 Function Application and Returns

The syntax for function application in #vimval# is:

\[ f(x_1, x_2, \ldots, x_n) \]

where \( f \) is the name of a function. In most cases, the function application is performed by the \texttt{APPLY} operator. The arguments \( x_1, \ldots, x_n \) are collected into a single record all of whose elements are initially \texttt{EC}-elements. The code is shown in figure 22. The \texttt{TAILAPPLY} instruction is used in place of the \texttt{APPLY} whenever the compiler can recognize that a function application is tail-recursive. The instruction \texttt{STREAM-TAILAPPLY} is used only in the body of stream producers and will be discussed in the next section.

\begin{figure}[h]
\centering
\includegraphics[width=\textwidth]{figure21}
\caption{Code for creating the argument record for function activation.}
\end{figure}

The \texttt{RETURN} instruction is used to dispatch the results computed by the function to the destinations listed in the return link, which is its first operand. The return-link was received by the template from the \texttt{APPLY} instruction. The results computed by the activation are packed into a record, the fields of which are \texttt{EC}-elements. Values returned from a function are similarly packed into a record. In most cases, one of the destinations for the signal produced by the \texttt{RETURN} instruction is the \texttt{RELEASE} instruction for the activation.
5.5 Stream Producers and Consumers

A stream is a sequence of values, all of the same type, that are passed in succession, one-at-a-time between functions. The operations on values of type stream of type $T$ are defined below where $S$ and $S'$ are streams, and $v$ is a value of type $T$.

1. $[[T]]$: returns an empty stream (of elements of type $T$) which is the sequence of length zero.

2. first($S$): The result is the value $v$ which is the first element of the stream $S$. If $S = []$ (the empty stream), then the result is $undef$.

3. rest($S$): The result is the stream left after removing the first element of $S$. If $S = []$ the result is $undef$.

4. affix($v$, $S$): The result is the stream $S'$ whose first element is $v$ and whose remaining elements are the stream $S$.

5. empty($S$): The result is $true$ if $S = []$, $false$ otherwise.
For a non-empty stream \( S \), the following property is satisfied:
\[
S = \text{affix}(\text{first}(S), \text{rest}(S))
\]

In Vim the storage representation for a stream is a chain of oneofs. Operations on streams are expressed in terms of operations on the components of the oneofs. The data structure for a stream whose elements are of type \( T \) is:

\[
\text{stream}[T] = \text{oneof}[\begin{array}{c}
\text{empty: null;}
\text{nonempty: record}[\begin{array}{c}
\text{first: } T;
\text{second: stream}[T]]
\end{array}\end{array}]
\]

The following discussion describes the rules using which the compiler can translate the VimVal text into data flow graph. The translation rules specified are by no means complete; only the simpler cases are dealt with in this thesis and the more complex cases need further investigation.

The expression [] for creating an empty stream is translated into an expression for creating a oneof with tag empty.

\[
\text{make}[\begin{array}{c}
\text{empty: nil}
\end{array}]
\]

\( \text{first}(S) \) is translated into the following code:

\[
\text{tagcase}(S)
\begin{array}{c}
\text{tag empty: undef;}
\text{tag nonempty: } S.\text{first}
\end{array}
\text{endtag}
\]

\( \text{rest}(S) \) is translated into the following code:

\[
\text{tagcase}(S)
\begin{array}{c}
\text{tag empty: undef;}
\text{tag nonempty: } S.\text{rest}
\end{array}
\text{endtag}
\]

The code generated for the \( \text{affix}(v, S) \) is shown below in figure 24. The instructions are so organized that the computation of the rest of the stream is suspended until some instruction demands it. When some computation attempts to perform the \( \text{rest} \) operation on the resulting stream, the suspension is replaced by a pointer to the next element of the stream and a signal is sent to the instruction which initiates the computation of the next element.
The translation for self-tail-recursive stream producers is quite interesting. I assume that the compiler can recognize stream producers which are self-tail-recursive. The use of tail recursion allows the activation template of the caller to be released before the computation of the callee is completed. This is a significant optimization since it results in a much lower amount of storage that is required for the computation. Mutually tail-recursive programs are translated naively, using simple APPLY and RETURN instructions without taking advantage of the STREAM-TAILAPPLY instruction.

Let \( f \) be a self-tail-recursive function that requires \( n \) arguments and produces a stream. Let the function be of the form:

```
function \( f(x_1, \ldots, x_n) \) returns stream[7];
body of function
endfun;
```

The compiler generates an auxiliary function \( f' \) from \( f \). The code for the body of the function \( f' \) is generated using the rules specified above, except that every instance of \( \text{affix}(v, f(...)) \) is translated into the graph in figure 25.

The \texttt{VIMVAL} text of \( f' \) is the same as that of the function \( f \). The only difference is that each
The correctness and generality of these translations are under current investigation and will be the subject of another treatise.

5.6 Discussion

We described a base language such that machines L1 and L2 are equivalent for all programs written in this language. Stated in another way, programs in this base language ensure that when a program halts, the only elements on the heap are those that represent the result value of the computation. Since the data structures are stored in chunks, this ensures that all chunks which were acquired during the computation and which are not part of the result structure are reclaimed.

The base language uses early-completion elements for creating argument lists for function invocations and for creating records in which the result of a function invocation is returned. The use of early-completion elements in these records allows a function to be invoked even if all the arguments
have not been evaluated. Similarly, early-completion queues in return records allows values to be returned to the caller even if the computation of all the values which are to be returned has not completed. The other use of the early-completion structures is in the construction of streams.

A major use of the early-completion feature of the language is in the construction of arrays. Creation of arrays whose elements are then initialized by large computations can benefit from the use of EC-queues. Such techniques for increasing the amount of parallelism in programs are the subject of ongoing research.
Chapter Six

Conclusion and Scope for Further Work

The objective of the thesis was to develop a storage management strategy for Vim. An abstract architecture for Vim was informally discussed and some of the distinctive features of Vim explained in an informal manner. This was followed by a formal model L1 of the abstract architecture. The thesis then went on to refine the model L1 to include hierarchical storage consisting of main memory and disk. Chunks, which are the unit of storage allocation and reclamation of storage and the unit of data transfer between main store and disk, were used as the constituent of a new data structure called a Vim-tree. Vim-trees are used to represent structure values (arrays, records and oneofs). An automatic storage reclamation strategy was developed using reference counting. Particular attention was paid to ensure that the machine L2 exhibited desirable behaviour in the presence of EC-queues and suspensions.

A concurrent objective of the thesis was to demonstrate the usefulness of the methodology of computer design by successive refinement. We started with an abstract machine and developed a formal specification for it. The machine model was then refined to include a storage model. In order to show that the refined model L2 (with hierarchical memory, paging, dormant instructions and tree structures for storing arrays, etc.) exhibited the same behaviour as L1 for programs written in the base language discussed in the thesis, we proceeded to prove the equivalence of the two machines. A modification of the McGowan mapping was used to accomplish this.

L2 represents a machine which is closer to an envisioned implementation. L2 may now be refined so that EC-queues, Function templates and activation templates would also be represented as data structures. This new model, say L3, may then be proved to be equivalent to L2, and hence to L1, using the kind of technique described in this thesis. Such successive refinement would finally yield a machine model which can be directly implemented to construct a real machine.
Future Research

There are a number of topics which are natural extensions of the ideas and issues addressed in this thesis.

1. **An implementation of VIM**: The abstract architecture may be successively refined to produce a model which reflects the characteristics of the physical elements of a machine — disk interaction, paging algorithms, process priorities, non-terminating computations, faults and exception handling, etc. Each model must be shown to be equivalent to the preceding model, and thus the final implementation would satisfy the specifications of L1 and the two would be computationally equivalent. It is a matter of conjecture as to how far this process of refinement can be performed before the designer is overwhelmed by the details of the machine formalism.

2. **Storage Management and Guardians**: Guardians are a special construct proposed in VIMVAL [13] which allows the programmer to express indeterminacy in computations. They are similar to the manager construct in Ld [3] and allow the programmer to write programs for, say, data base transactions. It remains to be investigated how the incorporation of guardians in the abstract model would affect the reference counting scheme.

3. **Storage Management in Multiprocessors**: It would be interesting to develop a model of VIM which has multiple processors and prove the equivalence of this model to L1 for the base language in consideration in this thesis. The issues of storage allocation and reclamation and instruction scheduling can be formally addressed in this model.

4. **Extending the Base Language**: The base language presented in this thesis is being extended to express efficient computations on arrays. Judicious use of EC-queues should significantly increase the amount of concurrency in the program. This increase in parallelism can be exploited to overlap disk activities in a single processor implementation of VIM, or by multiple processors. VIMVAL constructs which correspond to these new base language constructs must be developed; preliminary research shows that naive extensions of VIMVAL introduces in the type system of VIMVAL.
References


