VMs provide a higher-level interface to the actual target machines. They provide an intermediate language, most often in the form of bytecodes, that is targeted by compiler builders of higher-level languages. This facilitates implementing programming languages, since compiler builders only have to go halfway. Nevertheless, there are several problems with using a precompiled VM.
VMs are closed for extension. The Common Language Runtime is a VM for a whole family of languages [102]. It shows how a unified infrastructure can ease the development and maintenance of language implementations. Nevertheless, a single VM cannot anticipate all needs of languages it was not initially designed to support. For example, the JVM initially did not support dynamic method invocations. Unfortunately, VMs are classically black boxes: they define fixed interfaces (i.e., bytecode and an API) for accessing the features they provide. To extend a VM, one must open the black box and define a custom adaptation. Custom VMs, however, introduce branches in the VM implementation, sacrificing compatibility and risking rapid obsolescence. Additionally, users are forced to choose between features added by
different custom VMs, or they have to combine them into yet another custom VM. Examples of now-incompatible VMs include the object-flow VM [93], built as an extension to Squeak, and Iguana/J [120], which introduced fine-grained MOP extensions to the JVM.
Host language bias or lock-in. VMs are usually tailored towards the language for which they were initially developed. Since the JVM is built for Java, an object-oriented language, it lacks features required by other languages that target it. For example, supporting languages from the LISP family requires tail call elimination. Since this feature is not available in the JVM, Clojure [72] does not try to work around this limitation but instead introduces an explicit new language feature that implements iteration. The main problem is that language developers cannot reuse parts of the VM and replace others with their own code.
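The cost of a missing tail call elimination can be sketched in a few lines. The example below uses Python rather than the JVM, because CPython likewise does not eliminate tail calls; it is an analogy under that assumption, not a description of how Clojure or the JVM behave internally. All function names are hypothetical and chosen for illustration only.

```python
import sys

def sum_to_recursive(n, acc=0):
    """Tail-recursive sum of 1..n; every call still consumes a stack frame."""
    if n == 0:
        return acc
    return sum_to_recursive(n - 1, acc + n)  # a tail call, but not eliminated

def sum_to_loop(n):
    """Same computation written as explicit iteration; constant stack depth."""
    acc = 0
    while n > 0:
        acc += n
        n -= 1
    return acc

def recursion_overflows(n):
    """Return True if the tail-recursive version exhausts the call stack."""
    try:
        sum_to_recursive(n)
        return False
    except RecursionError:
        return True

# Pick a depth well beyond what the runtime allows for nested calls.
deep = sys.getrecursionlimit() * 10

print(sum_to_loop(deep) == deep * (deep + 1) // 2)  # the loop always completes
print(recursion_overflows(deep))                    # the recursion runs out of stack
```

Without help from the runtime, the language implementer has to expose the loop form to the programmer, which is the kind of trade-off the paragraph above attributes to Clojure.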
Reflection and metaprogramming restrictions. A VM implements the meta-level behavior of a programming language. As a consequence, support for reflection and metaprogramming must be provided by a MOP. Reflective extensions to the VM need to be supported up-front. For example, to allow arbitrary objects to be treated like regular method dictionaries, VMs generally include manually written tests:
    if (method_dictionary.class == MethodDictionary) {
        Dictionary_at(method_dictionary, selector);
    } else {
        send(method_dictionary, "at:", selector);
    }
The code checks whether the method dictionary is of the type known to the VM. If it is, the method dictionary is accessed directly. If it is another type of object, however, the VM invokes the method at: on the object, passing the selector as argument. This approach is inconsistent with the polymorphic behavior normally exhibited by the language [32]. Rather than sending a polymorphic message, the VM developer needs to manually insert these extension points wherever they see fit. After compilation, the extension points are hard-wired into the runtime and inaccessible. It is not possible for a user to introduce unforeseen reflective capabilities to the system. Many Smalltalk VMs provide reflective access to the method dictionaries in classes, but do not support custom method dictionaries at runtime. Those VMs crash when an instance of a customized class receives a message, since they violate the encapsulation of the meta-level object by grabbing the method directly out of the dictionary’s memory.
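The three lookup strategies discussed above can be contrasted in a small sketch. It is written in Python rather than in a VM implementation language, and every name in it (`MethodDictionary`, `LoggingDictionary`, `lookup_unchecked`, `lookup_hardwired`, `lookup_polymorphic`) is hypothetical; it models the idea only and does not reproduce the code of any actual Smalltalk VM.

```python
class MethodDictionary:
    """The dictionary layout that the (hypothetical) VM knows about."""
    def __init__(self, methods):
        self._table = dict(methods)  # internal storage the VM may peek into

    def at(self, selector):
        return self._table[selector]

class LoggingDictionary:
    """A user-defined method dictionary with a different internal layout."""
    def __init__(self, methods):
        self._entries = list(methods.items())
        self.lookups = []

    def at(self, selector):
        self.lookups.append(selector)  # custom behavior the VM never foresaw
        for key, method in self._entries:
            if key == selector:
                return method
        raise KeyError(selector)

def lookup_unchecked(method_dictionary, selector):
    # Reaches straight into the known layout; fails on any other object.
    return method_dictionary._table[selector]

def lookup_hardwired(method_dictionary, selector):
    # Mirrors the VM test: fast path for the known type, a send otherwise.
    if type(method_dictionary) is MethodDictionary:
        return method_dictionary._table[selector]
    return method_dictionary.at(selector)

def lookup_polymorphic(method_dictionary, selector):
    # Uniform message send; any object that understands `at` works.
    return method_dictionary.at(selector)

methods = {"size": "size-method"}
custom = LoggingDictionary(methods)

print(lookup_polymorphic(custom, "size"))
print(lookup_hardwired(custom, "size"))
try:
    lookup_unchecked(custom, "size")
except AttributeError:
    print("unchecked lookup failed on the custom dictionary")
```

In this sketch the hard-wired variant works only because its author anticipated the extension point, whereas the unchecked variant corresponds to the VMs that read the method directly out of the dictionary's memory.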
Douance et al. [44] claim that it is useful to build custom interpreters that embed new reflective capabilities in an effort to optimize the amount of information that is actually reified. They propose to implement specific changes by modifying a meta-circular interpreter that is compiled to a new interpreter for each specific metaobject protocol.
Chapter 3. Background and Problems
Language interoperability issues. Interoperability is a key requirement for the evolution of a language. Even though we might not care about interoperating with other languages, it is vital that new versions of a language can interact with older versions of the same language.
While many modern programming languages differ only slightly in semantics from one another (especially in the case of different versions of the same language), they are generally not compatible in terms of libraries, runtimes and tools. To run a second language in a runtime built for a first language, the second language needs to adapt as well as possible to the first for performance and interoperability. Even if the second language has a better performance potential than the first, it is rather difficult for the second to surpass the performance of the first. Feature sharing between both languages is often problematic because of mismatches in object model and execution semantics. If a third language is implemented on top of the first language, interoperability between the second and third language is even less likely to work out of the box.