
1. Introduction

When orthogonal persistence is introduced into a programming language, several requirements emerge that affect code generation:

  1. data in the system may persist,
  2. code in the system may persist, and
  3. the dynamic state of the system may persist.

Since all data may potentially persist, it must be held in a suitable form. Typically, a persistent object store will support one or more object formats onto which all data must be mapped. For example, objects must be self-describing to support automatic garbage collection and persistent object management. In particular, it must be possible to discover the location of all inter-object pointers contained in an arbitrary object. As a consequence, the code generation techniques employed must ensure that the objects constructed by the code conform to the appropriate object formats.
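The idea of a self-describing object can be illustrated with a minimal sketch. The layout assumed here (a header that counts the pointer fields, with all pointers stored before any scalar words) is a hypothetical format chosen for illustration and is not taken from any particular object store; the names `PersistentObject`, `pointer_count` and `reachable` are likewise invented for this example.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Set


@dataclass
class PersistentObject:
    """Hypothetical object format: the header records how many of the
    leading words are inter-object pointers; the rest are scalars."""
    pointer_count: int
    words: List[int] = field(default_factory=list)

    def pointers(self) -> List[int]:
        # The header alone identifies which words are pointers.
        return self.words[:self.pointer_count]


def reachable(store: Dict[int, PersistentObject], root: int) -> Set[int]:
    """Trace every object reachable from root, using only header data."""
    seen: Set[int] = set()
    pending = [root]
    while pending:
        oid = pending.pop()
        if oid in seen:
            continue
        seen.add(oid)
        pending.extend(store[oid].pointers())
    return seen


# A small store keyed by (assumed) object identifiers.
store = {
    1: PersistentObject(2, [2, 3, 99]),  # two pointers, then the scalar 99
    2: PersistentObject(0, [42]),        # scalars only
    3: PersistentObject(1, [2]),         # one pointer
    4: PersistentObject(0, [7]),         # not referenced by any object
}
live = reachable(store, 1)
```

Because the scalar 99 lies beyond the pointer count, the trace never mistakes it for a reference. A garbage collector built on such a format needs no knowledge of the code that created the object.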

In languages that support first class functions and procedures, a further consequence of persistence is that these values may also persist. This implies that executable code must be mapped onto persistent objects. This requirement defeats most traditional code generation techniques, since the traditional link phase links together all the procedural values contained in a single compilation unit using relative addresses. If all code resides in relocatable persistent objects, the compiler and linker cannot determine in advance the relative positions that code segments will occupy at run-time. Furthermore, facilities such as garbage collection and persistent object management may result in code segments moving during execution.
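One way to picture linkage that survives relocation is to have each code object name its callees by object identifier rather than by relative address. The sketch below is an assumption-laden illustration, not the mechanism of any specific system: `ObjectStore`, `CodeObject`, `move` and the identifiers 10 and 11 are all hypothetical, and Python closures stand in for generated code.

```python
from typing import Callable, Dict, List


class CodeObject:
    """Hypothetical code object: a body plus a table of callee identifiers."""

    def __init__(self, body: Callable[["ObjectStore", List[int], int], int],
                 links: List[int]) -> None:
        self.body = body
        self.links = links  # object identifiers, not addresses

    def call(self, store: "ObjectStore", arg: int) -> int:
        return self.body(store, self.links, arg)


class ObjectStore:
    """Hypothetical store mapping stable identifiers to movable addresses."""

    def __init__(self) -> None:
        self._address_of: Dict[int, int] = {}
        self._memory: Dict[int, CodeObject] = {}
        self._next_address = 0

    def allocate(self, oid: int, obj: CodeObject) -> None:
        self._address_of[oid] = self._next_address
        self._memory[self._next_address] = obj
        self._next_address += 1

    def move(self, oid: int) -> None:
        # Relocate the object, as a garbage collector might.
        obj = self._memory.pop(self._address_of[oid])
        self._address_of[oid] = self._next_address
        self._memory[self._next_address] = obj
        self._next_address += 1

    def address(self, oid: int) -> int:
        return self._address_of[oid]

    def fetch(self, oid: int) -> CodeObject:
        return self._memory[self._address_of[oid]]


store = ObjectStore()
# Callee: doubles its argument and links to nothing.
store.allocate(10, CodeObject(lambda s, links, x: 2 * x, []))
# Caller: adds one, then calls whatever object its first link names.
store.allocate(11, CodeObject(
    lambda s, links, x: s.fetch(links[0]).call(s, x + 1), [10]))

address_before = store.address(10)
result_before = store.fetch(11).call(store, 3)
store.move(10)  # the callee changes address
result_after = store.fetch(11).call(store, 3)
```

The caller produces the same result before and after the callee is moved, because the link is resolved through the store at call time. The cost of this indirection is the kind of trade-off a native code generator for a persistent language has to weigh.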

Persistent systems support potentially long-lived applications whose functionality may evolve over time. To accommodate this, many persistent systems provide facilities to dynamically generate new source code which is compiled and linked into the running system. This facility may be provided by making the compiler a persistent procedure [6,7].

In order to provide resilience to failure, many persistent systems periodically take snapshots. A system snapshot contains at least the passive data within the system but may also include the dynamic state of all executing programs. If a failure occurs, the data is restored from the last snapshot and, if the dynamic state was saved, the system resumes execution. To support this, it must be possible to preserve the state of a program automatically at some arbitrary point in its execution and to resume it later. This gives rise to the problem of determining what constitutes the dynamic state of a program. For example, traditional code generation techniques rely on a run-time stack containing return addresses, saved register values and expression temporaries. The task of saving state must establish what information on the stack should be saved and how it should be saved in order to support rebuilding the stack when the system is restarted.
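As a rough illustration of the stack problem, the sketch below records each return address as a pair of code object identifier and offset within that object, so that a saved stack stays meaningful even if code is at a different address after restart. This representation is an assumption made for the example; `Frame`, `checkpoint` and `restore` are hypothetical names, and JSON stands in for whatever snapshot format a real store would use.

```python
import json
from dataclasses import asdict, dataclass
from typing import List


@dataclass
class Frame:
    code_oid: int            # identifier of the code object being executed
    return_offset: int       # resume point relative to the start of that object
    saved_values: List[int]  # saved registers and expression temporaries


def checkpoint(stack: List[Frame]) -> str:
    """Flatten the stack into a form that holds no absolute addresses."""
    return json.dumps([asdict(frame) for frame in stack])


def restore(snapshot: str) -> List[Frame]:
    """Rebuild the stack from a snapshot produced by checkpoint."""
    return [Frame(**entry) for entry in json.loads(snapshot)]


stack = [Frame(10, 4, [1, 2]), Frame(11, 12, [3])]
snapshot = checkpoint(stack)
rebuilt = restore(snapshot)
```

The sketch treats every saved value as a plain integer. A real system would also have to distinguish pointers from scalars among the saved values, for the same reasons that objects must be self-describing.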

In summary, code generation for a persistent programming language must address the following issues:

  1. mapping generated code onto relocatable persistent objects,
  2. linking generated code to the necessary run-time support,
  3. linking generated code to other generated code,
  4. preserving pointer values, including code linkage, over garbage collections,
  5. run-time compilation and execution of dynamically generated source, and
  6. preserving the dynamic state over checkpoint operations.

In this paper, the techniques employed to generate native code for the persistent programming language Napier88 [11] are presented. Napier88 is a persistent programming language which supports first class procedures, parametric polymorphism and abstract data types. The Napier88 system provides an orthogonally persistent integrated mono-lingual programming environment. The techniques described may be applied to any language with the features described above or a subset of these features.

