Unique virtual addresses - Type secure persistent object stores


2.2.6 Unique virtual addresses

There is a wide range of garbage collection techniques that may be successfully employed for a particular pattern of data use or scale of system. However, in certain situations it may be necessary to introduce additional mechanisms to support the illusion of unbounded size. An example is the implementation of a large scale distributed system. In such a system, a garbage collection that scans the entire distributed system may be too costly. Hence, localised garbage collection techniques similar to the scavenging technique would have to be employed. This would allow the majority of unusable objects to be deleted automatically, with the exception of those that are reachable from a remote object. Eventually, individual systems within the distributed system may become full of unreachable data. This spurious data would have to be deleted in order to allow the affected systems to function normally.

There are two techniques that can be employed when a system becomes full of unreachable data. One possibility is to extend the system's available storage by adding further disks. Depending on the organisation of the system, this may require the entire storage to be restructured, in addition to any costs involved in modifying the system's hardware. The total cost of this solution may prove prohibitive.

A second solution to the problem is to allow data objects to be explicitly deleted. This could be achieved by asking users to remove references to unnecessary data. A local garbage collection may then recover sufficient storage to allow the system to function correctly. Alternatively, the system could provide a system administrator with tools that can identify objects only referenced from a remote system. The system administrator would then have the responsibility of identifying and explicitly deleting any unreachable objects.

The deletion of an object requires all references to that object to be invalidated. This can be achieved either by identifying all references to the object and removing them, or by performing all addressing via an indirection mechanism and invalidating the object's indirection mapping.

Within a very large distributed system, it may be far too costly to scan the entire system to find all references to an object. Consequently, an indirect addressing mechanism must be used. In turn, this requires an object's address to be unique so that a deleted object's address will remain invalid. This can be achieved by making the number of available object addresses so large that they can never be exhausted. An example of a system that uses a combination of garbage collection and a very large number of addresses is given by Hydra[coh76].
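The indirection scheme described above can be illustrated with a minimal sketch. It is not a description of Hydra or of any particular store: the class and method names are hypothetical, and a monotonically increasing counter stands in for an address space assumed to be too large to exhaust.

```python
import itertools


class DeletedObjectError(KeyError):
    """Raised when an address has no valid indirection mapping."""


class IndirectStore:
    """Toy store (hypothetical) in which every access goes through an indirection table."""

    def __init__(self):
        # Addresses are issued once and never reused, so the address of a
        # deleted object stays invalid for the lifetime of the store.
        self._next_address = itertools.count(1)
        self._table = {}  # address -> object: the indirection mapping

    def create(self, value):
        address = next(self._next_address)
        self._table[address] = value
        return address

    def dereference(self, address):
        try:
            return self._table[address]
        except KeyError:
            # No mapping: the object was deleted (or the address was never issued).
            raise DeletedObjectError(address) from None

    def delete(self, address):
        # Removing the single mapping invalidates every reference at once,
        # without scanning the system for the references themselves.
        self._table.pop(address, None)


store = IndirectStore()
first = store.create("report")
store.delete(first)
second = store.create("replacement")
print(first != second)  # the deleted object's address is not handed out again
```

The design choice to note is that deletion touches only the table entry; stale references held elsewhere are detected lazily, when they are next dereferenced.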

2.3 Store speed

Although the provision of an infinitely fast persistent store is not technically feasible, it is possible to give the illusion of such a store by capitalising on potential parallelism.

One area of store design particularly suited to parallelism techniques is virtual address translation. For example, the majority of virtual address translation mechanisms make use of an associative store to support parallel searches for address mappings. On the VAX architecture this takes the form of a small address translation buffer that records the results of the last few translations.
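A translation buffer of this kind can be sketched in software as a small cache in front of the full page table. The sketch below is illustrative only: the page size, the capacity, the least-recently-used replacement policy and all names are assumptions, and a dictionary lookup stands in for the parallel search performed by an associative store in hardware.

```python
from collections import OrderedDict

PAGE_SIZE = 512  # assumed page size in bytes, for illustration only


class TranslationBuffer:
    """Small cache holding the most recently used address translations."""

    def __init__(self, page_table, capacity=4):
        self.page_table = page_table  # virtual page number -> physical frame number
        self.capacity = capacity
        self.entries = OrderedDict()  # cached translations, oldest first
        self.hits = 0
        self.misses = 0

    def translate(self, virtual_address):
        page, offset = divmod(virtual_address, PAGE_SIZE)
        if page in self.entries:
            self.hits += 1
            self.entries.move_to_end(page)  # mark as most recently used
        else:
            self.misses += 1
            # Slow path: consult the full page table, then cache the result.
            self.entries[page] = self.page_table[page]
            if len(self.entries) > self.capacity:
                self.entries.popitem(last=False)  # evict the least recently used entry
        return self.entries[page] * PAGE_SIZE + offset


buffer = TranslationBuffer({0: 7, 1: 3})
buffer.translate(10)   # miss: page 0 is looked up and cached
buffer.translate(20)   # hit: page 0 is already in the buffer
print(buffer.hits, buffer.misses)
```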

Similar high speed address translation techniques have been used in the Atlas, Monads and IBM System/38. The Atlas uses an address translation register for every page frame of main memory to form an associative store that can be searched in parallel, as discussed in Section 2.2.2. The techniques used by the Monads and IBM System/38 are based on hash coding to search the associative store and are described in Section 2.2.4.

Another example of parallel address translation is provided by the SUN workstation[thak86]. The SUN uses dynamic RAM chips to implement its main memory. Access to this DRAM memory is performed in two steps. Firstly, the low order bits of an address are used to cycle a row of main memory and then the remaining address bits are used to cycle a column of main memory. To optimise access to main memory, the first step is performed in parallel with the translation of a virtual address by the SUN's memory management unit. A successful address translation will then result in the appropriate high order address bits being available for the second step of the main memory access. The success of this scheme depends on the number of rows in main memory corresponding to the page size used by the memory management unit.
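The reason the two steps can overlap is that translation changes only the high order bits of an address; the low order bits, which select the row, are the same in the virtual and the physical address. The following sketch shows that dependency in sequential code. The page size and all names are assumptions chosen for illustration, not details of the SUN hardware.

```python
PAGE_BITS = 13                    # assumed: 2**13-byte pages, hence 2**13 rows
PAGE_MASK = (1 << PAGE_BITS) - 1


def split(address):
    """Return (high order bits, low order bits) of an address."""
    return address >> PAGE_BITS, address & PAGE_MASK


def access(virtual_address, page_table):
    """Return the (row, column) pair used to cycle main memory."""
    virtual_page, row = split(virtual_address)
    # Step 1 needs only `row`, which translation does not alter, so in
    # hardware the row can be cycled while the lookup below is in progress.
    column = page_table[virtual_page]  # step 2 needs the translated high order bits
    return row, column


def physical_address(row, column):
    """Recombine the two halves into the full physical address."""
    return (column << PAGE_BITS) | row


row, column = access((5 << PAGE_BITS) | 0x123, {5: 9})
print(hex(row), column)
```

If the number of rows did not match the page size, some row bits would depend on the result of the translation and the first step could not begin until the translation had completed.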

Another area of store design that may permit parallelism is the choice of garbage collection technique. For example, an architecture may be designed in which the available storage is incrementally garbage collected in parallel with active programs. It may be advantageous to dedicate one or more processors to the task of performing this parallel garbage collection. Further parallelism could be achieved by dedicating additional processors to other specific functions within the architecture, such as analysing the current paging strategy or dynamically clustering data on disk storage. An example of an architecture designed specifically with the intention of employing these parallelism techniques is given by Snyder[sny79].

2.4 Store stability

The persistence abstraction attempts to hide all the physical attributes of data. Consequently, the components of a persistent store are also hidden, which requires any failures in those components to be hidden as well. The persistent store is therefore conceptually failure free; that is, it is stable.

The potential failures that may occur within a store can be categorised as either being hard failures or soft failures. A hard failure is a failure that results in physical damage to the store, such as a head crash on a disk. A hard failure destroys data. In contrast, a soft failure is a failure that may cause a system to halt, possibly resulting in some minor corruption of data. In general, it will not result in the destruction of data. The provision of a stable store must address the issues of protecting data from the potential side effects of both hard and soft failures.

In document Persistent object stores (Page 62-65)