When orthogonal persistence is introduced to a programming language, several requirements emerge which affect code generation:
Since any data may persist, all data must be held in a suitable form. Typically, a persistent object store supports one or more object formats onto which all data must be mapped. For example, objects must be self-describing to support automatic garbage collection and persistent object management. In particular, it must be possible to discover the location of all inter-object pointers contained in an arbitrary object. As a consequence, the code generation techniques employed must ensure that the objects constructed by the generated code conform to the appropriate object formats.
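To make the self-describing requirement concrete, the following is a minimal sketch rather than the format of any particular object store. It assumes a hypothetical layout in which a header records how many of an object's leading fields are pointers; the names `PObject` and `reachable` are illustrative only. With such a header, a collector can locate every inter-object pointer without knowing the object's type.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Set


@dataclass
class PObject:
    """Hypothetical format: a header giving the pointer count, followed by
    the pointer fields and then the non-pointer (scalar) fields."""
    n_pointers: int                      # header: number of leading pointer fields
    fields: List[int] = field(default_factory=list)

    def pointers(self) -> List[int]:
        # The header alone is enough to locate every inter-object pointer.
        return self.fields[: self.n_pointers]


def reachable(store: Dict[int, PObject], root: int) -> Set[int]:
    """Trace the object graph from root using only the object headers."""
    seen: Set[int] = set()
    pending = [root]
    while pending:
        oid = pending.pop()
        if oid in seen:
            continue
        seen.add(oid)
        pending.extend(store[oid].pointers())
    return seen


store = {
    1: PObject(2, [2, 3, 42]),   # two pointers (to 2 and 3), one scalar
    2: PObject(0, [7, 8]),       # scalars only; 7 and 8 are not object ids
    3: PObject(1, [1, 99]),      # pointer back to object 1 (a cycle)
    4: PObject(0, [5]),          # not referenced from object 1
}
live = reachable(store, 1)
```

Object 2 holds the scalars 7 and 8, which the trace never mistakes for object identifiers because the header says the object contains no pointers. Generated code that built an object without a correct header would break this property.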
In languages that support first-class functions and procedures, a further consequence of persistence is that these values may also persist. This implies that executable code must be mapped onto persistent objects. This requirement defeats most traditional code generation techniques, since the traditional link phase binds together all the procedural values contained in a single compilation unit using relative addresses. If all code resides in relocatable persistent objects, then the compiler and linker cannot determine in advance the relative positions that code segments will occupy at run-time. Furthermore, facilities such as garbage collection and persistent object management may cause code segments to move during execution.
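One way to picture the relocation problem is to have code objects refer to one another through stable object identifiers that are resolved at call time, instead of through relative addresses fixed at link time. The sketch below assumes this indirection; the `CodeStore` class, its object table and the identifiers `SQUARE` and `SUM_SQ` are hypothetical and are not taken from any specific persistent system.

```python
from typing import Callable, Dict, List, Optional


class CodeStore:
    """Hypothetical store: code objects live in movable heap slots and are
    reached only through stable object identifiers (oids)."""

    def __init__(self, size: int) -> None:
        self.heap: List[Optional[Callable]] = [None] * size
        self.table: Dict[int, int] = {}   # oid -> current slot in the heap

    def install(self, oid: int, slot: int, code: Callable) -> None:
        self.heap[slot] = code
        self.table[oid] = slot

    def move(self, oid: int, new_slot: int) -> None:
        # What a compacting collector might do: relocate the code object.
        old_slot = self.table[oid]
        self.heap[new_slot] = self.heap[old_slot]
        self.heap[old_slot] = None
        self.table[oid] = new_slot

    def call(self, oid: int, *args):
        # Resolve the oid at call time; no address is baked into the caller.
        return self.heap[self.table[oid]](self, *args)


SQUARE, SUM_SQ = 1, 2   # oids chosen for illustration


def square(store, x):
    return x * x


def sum_of_squares(store, a, b):
    # Refers to `square` by oid rather than by a relative address.
    return store.call(SQUARE, a) + store.call(SQUARE, b)


store = CodeStore(size=8)
store.install(SQUARE, 0, square)
store.install(SUM_SQ, 1, sum_of_squares)

before = store.call(SUM_SQ, 3, 4)
store.move(SQUARE, 5)              # the code object moves between calls
after = store.call(SUM_SQ, 3, 4)
```

Because `sum_of_squares` holds only an identifier, it continues to work after `square` has been moved. The cost of this design is an extra lookup on every call, which is the kind of trade-off a code generator for a persistent language has to weigh.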
Persistent systems support potentially long-lived applications whose functionality may evolve over time. To accommodate this, many persistent systems provide facilities to dynamically generate new source code which is compiled and linked into the running system. This facility may be provided by making the compiler a persistent procedure [6,7].
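As a rough illustration of a compiler that is itself a value in the store, the sketch below uses Python's built-in `compile` and `exec` and a plain dictionary standing in for the persistent store. The function `compile_procedure` and the store layout are assumptions made for this example; a real persistent system would bind the result into its own object store.

```python
from typing import Callable, Dict


def compile_procedure(source: str, name: str) -> Callable:
    """Compile source text and return the procedure it defines."""
    namespace: Dict[str, object] = {}
    exec(compile(source, "<generated>", "exec"), namespace)
    return namespace[name]


# A dictionary stands in for the persistent store (an assumption).
store: Dict[str, Callable] = {"compiler": compile_procedure}

# A running program generates new source, compiles it with the compiler
# found in the store, and links the result back into the store.
new_source = "def double(x):\n    return 2 * x\n"
store["double"] = store["compiler"](new_source, "double")
result = store["double"](21)
```

The new procedure becomes available without stopping the running program, which is the property that long-lived, evolving applications rely on.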
In order to provide resilience to failure, many persistent systems periodically take snapshots. A system snapshot contains at least the passive data within the system but may also include the dynamic state of all executing programs. If a failure occurs, the data is restored from the last snapshot and, if the dynamic state was saved, the system resumes execution from that point. To support this, it must be possible to automatically preserve the state of a program at an arbitrary point in its execution and resume it later. This gives rise to the problem of determining what constitutes the dynamic state of a program. For example, traditional code generation techniques use a run-time stack containing return addresses, saved register values and expression temporaries. The mechanism that saves state must establish which information on the stack should be saved, and in what form, so that the stack can be rebuilt when the system is restarted.
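The sketch below shows one possible shape for saved dynamic state, under stated assumptions: each stack frame separates pointer slots from scalar slots, and a return address is recorded as a code object identifier plus an offset within that object, so that it remains meaningful if the code is loaded at a different address after a restart. The `Frame` layout and the use of JSON as the snapshot encoding are illustrative choices only.

```python
import json
from dataclasses import asdict, dataclass
from typing import List


@dataclass
class Frame:
    code_oid: int              # code object the frame is executing
    resume_offset: int         # return address, relative to that code object
    pointer_slots: List[int]   # object identifiers held by the frame
    scalar_slots: List[int]    # expression temporaries and saved scalar values


def snapshot(stack: List[Frame]) -> str:
    """Encode the stack in a position-independent form."""
    return json.dumps([asdict(frame) for frame in stack])


def restore(image: str) -> List[Frame]:
    """Rebuild the stack from a snapshot image."""
    return [Frame(**record) for record in json.loads(image)]


stack = [
    Frame(code_oid=10, resume_offset=0, pointer_slots=[1], scalar_slots=[]),
    Frame(code_oid=11, resume_offset=24, pointer_slots=[2, 3], scalar_slots=[7]),
]
image = snapshot(stack)
rebuilt = restore(image)
```

Nothing in the image depends on absolute machine addresses, so the rebuilt stack is equal to the original. Keeping pointer slots separate also lets the snapshot mechanism treat the objects they reference as live data.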
In summary, code generation for a persistent programming language must address the following issues: