
4.3 Ensemble VM Design and Implementation

4.3.5 Memory Model

The EVM is a stack-based VM, like the JVM. An alternative register-based model was rejected because of its larger memory requirement, despite its better potential for optimisations [119]. By comparison, stack-based bytecode tends to be slower but smaller than equivalent register-based code. Using fewer resources is generally beneficial, but limiting memory consumption was a hard requirement on certain hardware platforms (see Section 4.6).

Slot Size

To support the stack-based model, a new call frame or stack frame is allocated for each procedure call. Like the JVM, the Ensemble VM allocates a new frame from the heap as required, as opposed to using a static number of pre-allocated frames.

Every call stack frame has an operand stack consisting of fixed-size slots. Local variables are stored in separate slots of the same size. The JVM specification requires slots to be 32 bits, to match the word size of common desktop CPUs. Values of all data types occupy one slot, except for long and double, which occupy two. However, instructions are defined in terms of the number of slots upon which they operate, with no reference to the actual size of the slots. This means that the slot size can be changed without any change to the bytecode, as long as each data type still occupies the same number of slots. This is relevant for the discussion in Section 4.6.2.

Stack Frames

A minimal stack frame with no slots and no local variables occupies 32 bytes of RAM, with each additional slot or variable requiring four bytes. The number of slots and variables used by a method is known at linktime, so the whole frame can be allocated as a single unit. This has the advantage that only the memory currently required is used by the VM stack, rather than pre-allocating a stack based on the worst case need. Memory must be allocated in advance for an actor’s C stack, which is used to run the interpreter and native methods. This is on the order of a few hundred bytes per actor when using InceOS, and is an internal default value when using the Pthread implementation on Linux.
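The frame size arithmetic described above can be sketched as follows. This is a minimal illustration, not the EVM's actual allocator: the 32-byte base and the four bytes per slot or variable come from the text, while the names `frame_size_bytes` and `frame_alloc` and the use of `calloc` are assumptions made for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Figures from the text: a 32-byte minimal frame, plus 4 bytes for each
 * operand stack slot or local variable. */
enum { FRAME_BASE_BYTES = 32, SLOT_BYTES = 4 };

/* Hypothetical helper: bytes needed for one frame, computable once the
 * slot and variable counts of a method are known. */
static size_t frame_size_bytes(size_t operand_slots, size_t local_vars)
{
    return (size_t)FRAME_BASE_BYTES
         + (size_t)SLOT_BYTES * (operand_slots + local_vars);
}

/* Hypothetical helper: allocate the whole frame as a single heap block.
 * Returns NULL if the heap cannot satisfy the request. */
static void *frame_alloc(size_t operand_slots, size_t local_vars)
{
    size_t n = frame_size_bytes(operand_slots, local_vars);
    assert(n >= (size_t)FRAME_BASE_BYTES); /* catches size_t wrap-around */
    return calloc(1, n);
}
```

Under these assumptions, a method with four operand slots and three local variables needs 32 + 4 × 7 = 60 bytes for its frame.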

A stack frame contains a pointer to the previous stack frame, the return address, a reference to the method being executed by the frame, and arrays for the operand stack and local variables. Additionally, stack frames contain bitmaps used to track which of the operand stack slots and local variables contain references; this is currently required for garbage collection.
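One possible layout for such a frame is sketched below. The field names, the opaque `struct method`, and the limit of 32 tracked entries per bitmap are assumptions made for the example; the text does not specify the actual layout.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

struct method; /* opaque here; hypothetical type for the executing method */

/* Illustrative frame layout following the description in the text. */
struct frame {
    struct frame        *prev;           /* caller's frame */
    const uint8_t       *return_pc;      /* return address in the bytecode */
    const struct method *method;         /* method executed by this frame */
    uint32_t             stack_ref_bits; /* bit i set: stack slot i is a reference */
    uint32_t             local_ref_bits; /* bit i set: local i is a reference */
    uint32_t            *stack;          /* operand stack slots */
    uint32_t            *locals;         /* local variable slots */
};

/* Record whether entry i currently holds a reference. */
static void bitmap_set(uint32_t *bits, unsigned i, bool is_ref)
{
    assert(i < 32); /* assumption: at most 32 tracked entries per bitmap */
    if (is_ref)
        *bits |= (UINT32_C(1) << i);
    else
        *bits &= ~(UINT32_C(1) << i);
}

/* Query whether entry i currently holds a reference. */
static bool bitmap_test(uint32_t bits, unsigned i)
{
    assert(i < 32);
    return ((bits >> i) & UINT32_C(1)) != 0;
}
```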

Objects

All objects are allocated on the heap. A class definition includes a reference to the class’s superclass (as in standard Java, all classes descend ultimately from Object), the size of its fields, and a virtual method table.

When an object is instantiated, space is allocated for its fields. Unlike stack slots, fields can differ in size, and are packed in memory. The size of a field must therefore be known when accessing it. A standard JVM keeps this information in the constant pool, but the EVM instead uses new type-specific versions of the getfield and putfield instructions. These have been introduced for the different field sizes to reduce the amount of information contained in the classfile. The instruction to use in each case is chosen at linktime.
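The idea of encoding the field size in the opcode can be illustrated as follows. The opcode names `GETFIELD_8`, `GETFIELD_16` and `GETFIELD_32` are hypothetical, since the text does not name the actual variants, and the dispatch function is only a sketch of how an interpreter might read a packed field without consulting the constant pool.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical size-specific opcodes, selected by the linker. */
enum opcode { GETFIELD_8, GETFIELD_16, GETFIELD_32 };

/* Read a packed field at a byte offset. memcpy is used because packed
 * fields are not guaranteed to be aligned for their type. */
static int32_t get_field(enum opcode op, const uint8_t *fields, size_t offset)
{
    switch (op) {
    case GETFIELD_8:  { int8_t  v; memcpy(&v, fields + offset, sizeof v); return v; }
    case GETFIELD_16: { int16_t v; memcpy(&v, fields + offset, sizeof v); return v; }
    case GETFIELD_32: { int32_t v; memcpy(&v, fields + offset, sizeof v); return v; }
    }
    assert(!"unknown opcode");
    return 0;
}
```

Here an object with one 8-bit, one 16-bit and one 32-bit field occupies seven bytes of field storage, where a slot-per-field layout with 4-byte slots would need twelve.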

Some classes are treated specially. String contains a pointer to a native string. Arrays contain a pointer to a native array, as well as the size, dimensions, and element class of the array. The class of an array itself is the special placeholder array class, and variants of the instanceof and checkcast instructions have been introduced to test the element type and dimensionality of arrays. Additional space is allocated for these classes by the VM.

Static Fields

Ensemble does not support static fields as they could break the strict encapsulation of actors. They are not allowed in bytecode programs.

Garbage Collection

The VM uses the reference counting garbage collector provided by InceOS. All objects are reference counted. Bytecode instructions which manipulate objects also increment and decrement the reference counts appropriately. Reference counting was chosen because the EVM is built upon InceOS, which was designed for embedded systems, where memory must be returned to the heap as soon as possible. That said, the choice of garbage collection technique is orthogonal to the use of actors.

It is necessary to monitor, at runtime, which slots and local variables in a stack frame currently contain references, so that when a method returns, their reference counts can be decremented appropriately. This is done using bitmaps which are allocated along with the stack frame.
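The following sketch shows how a bitmap might drive the decrements when a frame is torn down. The `struct object` header, the `union slot` representation, the `freed_objects` counter and the use of `free` are all assumptions made for the example; the InceOS collector's real interface is not described in the text.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical object header holding only a reference count. */
struct object { uint32_t refcount; };

/* A slot holds either a primitive value or a reference. */
union slot { int32_t i; struct object *ref; };

static int freed_objects; /* demo counter, not part of any real VM */

/* Drop one reference; reclaim the object when the count reaches zero. */
static void object_release(struct object *o)
{
    assert(o != NULL && o->refcount > 0);
    if (--o->refcount == 0) {
        free(o);
        freed_objects++;
    }
}

/* On method return, release every slot that the bitmap marks as a reference. */
static void release_frame_refs(union slot *slots, uint32_t ref_bits, unsigned count)
{
    assert(count <= 32); /* assumption: one 32-bit bitmap per slot array */
    for (unsigned i = 0; i < count; i++) {
        if ((ref_bits >> i) & UINT32_C(1)) {
            object_release(slots[i].ref);
            slots[i].ref = NULL;
        }
    }
}
```

Slots whose bit is clear are skipped, so an integer that happens to look like a pointer is never treated as one.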

As with any basic reference counting system, the InceOS collector cannot handle cycles in the object graph. To some extent the Ensemble language mitigates this by always duplicating complex data types which are sent over channels; however, as structures may reference each other, cycles are possible. Also, should the user circumvent the language rules (e.g. by providing hand-written Java code to javac), then the system cannot guarantee that objects will be collected. The presence of cycles can be mitigated through the use of cycle detection [120] or the use of a tracing collector. This would require either a modification of the existing mechanism or the implementation of a new garbage collector, respectively.
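The cycle problem can be demonstrated with a toy reference counter. This is a generic illustration rather than the InceOS collector: `struct node`, `retain` and `release` are hypothetical names, and "reclaiming" an object is only counted, not performed.

```c
#include <assert.h>
#include <stddef.h>

/* Toy object with a count and a single outgoing reference. */
struct node { unsigned refcount; struct node *other; };

static void retain(struct node *n) { if (n != NULL) n->refcount++; }

/* Drop one reference. Returns how many objects became reclaimable,
 * following the outgoing reference of each object that dies. */
static int release(struct node *n)
{
    if (n == NULL) return 0;
    assert(n->refcount > 0);
    if (--n->refcount > 0) return 0;
    return 1 + release(n->other);
}

/* a -> b with no back edge: both objects are reclaimed. */
static int chain_demo(void)
{
    struct node a = {1, NULL}, b = {1, NULL}; /* one external reference each */
    a.other = &b; retain(&b);
    int reclaimed = release(&b); /* drop external reference to b */
    reclaimed += release(&a);    /* drop external reference to a */
    return reclaimed;
}

/* a <-> b: each keeps the other's count at one, so nothing is reclaimed. */
static int cycle_demo(void)
{
    struct node a = {1, NULL}, b = {1, NULL};
    a.other = &b; retain(&b);
    b.other = &a; retain(&a);
    int reclaimed = release(&a);
    reclaimed += release(&b);
    return reclaimed;
}
```

In the cyclic case both counts remain at one after the external references are dropped, which is exactly the garbage that a cycle detector or tracing collector would have to find.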

In the Ensemble VM, the use of reference counting influences how out-of-memory conditions are handled. In most reference counting systems, a tracing collector is present as a backup. This is run when there is not enough memory to service an allocation request, so that any cycles no longer needed can be collected. Only if there is still insufficient memory is an out-of-memory error signalled. In the Ensemble VM, however, no such backup collector is currently present. An out-of-memory condition results in an exception being thrown; if this is not handled, the actor is restarted. If the VM itself has insufficient memory to generate the error, it will fail. This is particularly problematic on embedded systems, where failure is hopefully indicated by a flashing LED.

Movability

The use of movability in Ensemble was primarily designed for highly resource-constrained platforms, as the increased heap usage and fragmentation can represent a non-trivial reduction in the amount of available RAM. Consequently, it was important that the correctness of movability be determined at compiletime, rather than runtime. As a result, the only manifestation of movability at runtime is that the compiler will not generate code to duplicate data allocated from the movable heap before being sent over channels.

Also, even though the language model describes two heap spaces, there is only a single heap from which all data is allocated. The compiletime analysis ensures that data from the two conceptual heaps will not interact. Movable channels sent between actors must still be adopted upon receipt; even without duplication, they must transfer ownership from the sender to the receiver.
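The runtime difference between the two kinds of send can be sketched as below. This is a loose illustration under stated assumptions: `struct message`, the integer `owner` field, and the functions `send_copy` and `send_move` are hypothetical, and the real Ensemble runtime may represent ownership and adoption quite differently.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical heap value tagged with the actor that currently owns it. */
struct message { int owner; size_t len; unsigned char *data; };

/* Non-movable data: the value is duplicated before it is sent, so the
 * sender and receiver never share storage. */
static struct message *send_copy(const struct message *m, int receiver)
{
    struct message *dup = malloc(sizeof *dup);
    assert(dup != NULL);
    dup->owner = receiver;
    dup->len = m->len;
    dup->data = malloc(m->len ? m->len : 1);
    assert(dup->data != NULL);
    if (m->len > 0)
        memcpy(dup->data, m->data, m->len);
    return dup;
}

/* Movable data: no duplication; the same block is handed over and the
 * receiver adopts it by becoming its owner. */
static struct message *send_move(struct message *m, int receiver)
{
    m->owner = receiver;
    return m;
}
```

Both functions allocate from the same C heap, mirroring the point that the two heap spaces are conceptual rather than physical.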