Non-WSN Computing - A linguistic approach to concurrent, distributed, and adaptive programming

Xen [51] is a virtual machine monitor (or hypervisor) that allows a number of operating systems to run on a single machine simultaneously; Xen itself is the only element to run directly on the hardware. The other operating systems run within virtual machines known as domains. Xen manages each domain’s access to the physical resources and prevents different domains from interfering with each other, e.g. by two domains trying to concurrently access and modify the same area on disk.

Xen uses a virtualisation technique known as paravirtualisation. This is in contrast to full virtualisation, as used by VMware [52]. No changes are required for an operating system to work with VMware, whereas to work with Xen an operating system must be ported, in much the same way as it would be ported to a new piece of hardware. While this requires more work for the developer, the advantage is better runtime performance.

Within Xen it is possible to pause, save, and replay a Xen domain, and to move it from one machine to another. Further work has shown that this can be done at runtime with very small overhead and downtime [53]. This work shows that domains can be migrated not only as a proof of concept but also in real-world examples. This approach is designed specifically for the administration of clusters of computers. As adaptation is at the OS level, it is a heavy-weight and coarse-grained approach.

Mirage [54] proposes an alternative application of Xen to large-scale, distributed computing, sometimes referred to as cloud computing. Mirage is a programming framework that enables a user to write an application in a dialect of the Objective Caml language. Each application is then compiled into a custom, standalone Xen domain. This has the advantage of removing the OS and many layers of software required in a traditional Xen domain, leaving an application-specific binary. This is in the same spirit as conditional compilation.

In terms of performance, testing against custom database benchmarks has shown that Mirage performs better than an equivalent application running on Linux at scale. However, certain details are missing from the description of the testing. Specifically, it is claimed that Mirage performs better as the scale of the application grows, but there is no discussion of the number of instances used or how these are allocated. The minimum binary size of a Mirage instance was 600KB, two orders of magnitude smaller than the Linux equivalent and several orders smaller than the Windows equivalent, but still at least one order of magnitude too big for a sensor mote.

The work of Giurgiu et al. [55] explores the migration of sections of an application between mobile devices and the cloud, using Java and OSGi component modules. The work addresses the issue that static partitioning of applications for code offloading on mobile phones is not adequate. Instead, they have implemented a system which dynamically profiles an application and decides what code to offload to the cloud, and when. The system is built upon R-OSGi [56]. As applications in this system are written in Java, there is no linguistic mechanism to enforce loose coupling, hence the developer must be relied upon to create suitably partitioned code.

MPI [57] is a well-known framework for message passing communication which is supported by a number of different programming languages, and has been used by many systems, particularly in high performance computing. The system is accessed via a per-language API. In MPI different loci of execution are known as processes, where processes are assigned to CPU cores. These processes can be spawned across local or remote machines, where each process is uniquely identified by its rank. The total number of processes and their ranks are determined when an application is launched with MPI.

MPI supports remote creation of processes and location-transparent communication, although users must manually marshal/demarshal complex data types in the language. Furthermore, there is no compile-time support for type checking the two ends of the communication pipeline, although session types can be used to help address this issue (see Section 6.3.1). MPI supports static hardware discovery based on predefined configuration files; however, it is not designed for a dynamic execution environment, where nodes come and go. MPI does not support transparent process migration. If desired, users must use a manual checkpoint and restart mechanism at the language level. Finally, due to the API-based nature of MPI, it leads to low-level, verbose applications [58]. While this gives fine-grained control over application development, it can act as a barrier to non-expert programmers.
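
The cost of manual marshalling can be illustrated with a minimal sketch. It does not use MPI; Python's standard `struct` module stands in for the hand-written packing code a user of a message-passing API would typically write. The `Reading` record, its wire layout, and the `marshal`/`demarshal` helpers are assumptions made purely for illustration.

```python
# Sketch of manual marshalling for a message-passing API (not MPI itself).
# The record type and wire layout below are hypothetical.
import struct
from typing import NamedTuple


class Reading(NamedTuple):
    """Hypothetical complex data type exchanged between two processes."""
    node_id: int
    temperature: float
    label: str


# Wire layout agreed by convention only:
# little-endian uint32, float64, then a uint16 length prefix for the label.
_HEADER = struct.Struct("<IdH")


def marshal(reading: Reading) -> bytes:
    """Flatten a Reading into bytes by hand, field by field."""
    label = reading.label.encode("utf-8")
    return _HEADER.pack(reading.node_id, reading.temperature, len(label)) + label


def demarshal(payload: bytes) -> Reading:
    """Rebuild a Reading; nothing checks that the sender used the same layout."""
    node_id, temperature, length = _HEADER.unpack_from(payload, 0)
    start = _HEADER.size
    label = payload[start:start + length].decode("utf-8")
    return Reading(node_id, temperature, label)
```

Both ends must agree on the format string by convention. A receiver that unpacked the same bytes with a different layout of equal size would decode meaningless values without any error being raised, which is the kind of mismatch that compile-time checking of both ends would rule out.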

An alternative to Java and the JVM is Forth [59, 60]. Forth is a language and runtime which is composed of words, symbols such as “+” or “-”, and numbers. These words can either be well known or defined by the developer. Words are kept in a dictionary which is consulted at runtime to find the definition of a word. A program or word is expressed in reverse Polish notation and consists of words, symbols and numbers. Forth interpreters are very simple, small, and easily extensible, as most of the work is done in specifying the words for the dictionary. Forth code can either be interpreted or compiled. Different interpreters deal with missing words in different ways, and there is no single agreed strategy: some will store missing words to be populated later, while others throw errors. One advantage that Forth has over the JVM is that new words can easily be added to the dictionary at runtime, so new functionality is very easily made available. By contrast, the Java VM itself would need to be modified, recompiled and reinstalled in order to add a new bytecode.
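
The dictionary mechanism described above can be sketched in a few lines. The following is an illustrative toy written in Python, not a real Forth implementation; the class `MiniForth`, its method names, and the choice to raise an error on a missing word are all assumptions made for this example.

```python
# Minimal Forth-like interpreter sketch (illustrative only, not a real Forth).
from typing import Callable, Dict, List, Union

Stack = List[int]
# A dictionary entry is either a built-in (Python callable) or a list of tokens.
Entry = Union[Callable[[Stack], None], List[str]]


def _binary(op: Callable[[int, int], int]) -> Callable[[Stack], None]:
    """Wrap a two-argument function as a stack operation."""
    def word(stack: Stack) -> None:
        b = stack.pop()
        a = stack.pop()
        stack.append(op(a, b))
    return word


def _dup(stack: Stack) -> None:
    """Duplicate the top of the stack."""
    stack.append(stack[-1])


class MiniForth:
    def __init__(self) -> None:
        self.stack: Stack = []
        # The dictionary is consulted at runtime for every word.
        self.dictionary: Dict[str, Entry] = {
            "+": _binary(lambda a, b: a + b),
            "-": _binary(lambda a, b: a - b),
            "*": _binary(lambda a, b: a * b),
            "dup": _dup,
        }

    def define(self, name: str, body: str) -> None:
        """Add a new word at runtime; its body is resolved only when it runs."""
        self.dictionary[name] = body.split()

    def run(self, source: str) -> Stack:
        """Execute whitespace-separated tokens in reverse Polish order."""
        for token in source.split():
            self._execute(token)
        return self.stack

    def _execute(self, token: str) -> None:
        entry = self.dictionary.get(token)
        if entry is None:
            try:
                self.stack.append(int(token))  # numbers are pushed
            except ValueError:
                # This sketch chooses to throw an error on a missing word.
                raise KeyError(f"unknown word: {token}") from None
        elif callable(entry):
            entry(self.stack)
        else:
            for inner in entry:
                self._execute(inner)


forth = MiniForth()
forth.run("2 3 +")               # stack now holds 5
forth.define("square", "dup *")  # new word added at runtime
forth.run("square")              # stack now holds 25
```

Because `define` only stores the token list, a word may refer to other words that are added later; this sketch reports an error only if such a word is still missing when it is executed.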

In a similar ethos, both Python [61] and Ruby support dynamic code generation and execution at runtime, although this is done at a comparatively higher level within the programming language. While these would support the remote creation of code instances quite easily, there are a number of drawbacks. The runtimes for these systems are comparatively large, requiring 13.1 MB and 6.1 MB for Python (3.2 minimal) and Ruby (1.9.1), respectively, to support a hello world application. These sizes are larger than the storage capacity of some of the embedded devices targeted in this work. This is not to say that a smaller runtime could not be created, but a lower-level intermediate representation offers a simpler runtime, requires less data during remote communications, and potentially provides more scope for compile-time and runtime optimisations.
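
As a minimal sketch of what dynamic code generation and execution looks like at this level, the following uses Python's built-in `compile` and `exec` to turn source text into a callable at runtime. The helper `load_function` and the example source string are hypothetical; in a distributed setting the text would arrive over the network rather than being a literal.

```python
# Sketch: turning source text received at runtime into a callable.
# `load_function` and the example source are hypothetical.
from types import FunctionType
from typing import Any, Callable, Dict


def load_function(source: str, name: str) -> Callable[..., Any]:
    """Compile `source` at runtime and return the function called `name`."""
    namespace: Dict[str, Any] = {}
    code = compile(source, "<remote>", "exec")
    exec(code, namespace)  # runs arbitrary code: only use with trusted input
    func = namespace.get(name)
    if not isinstance(func, FunctionType):
        raise ValueError(f"source does not define a function named {name!r}")
    return func


# Source text as it might arrive from another node (here just a literal).
received = "def scale(reading, factor=2):\n    return reading * factor\n"
scale = load_function(received, "scale")
```

Note that what is shipped here is full source text, and the receiver needs the complete language runtime to compile it, which reflects the size and communication costs discussed above.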

Like the JVM, the common language runtime (CLR) [62] is a virtual machine providing services such as memory management, security, and exception handling, and is also designed to execute a common intermediate language. The CLR is a part of the .NET framework. Unlike the JVM, which has only over time become the target of many different languages [63], the CLR was designed from the outset to execute multiple languages. Mono is an open source version of the .NET framework.

Given the existing support for Java, a subset of the Java bytecodes was chosen as the starting point for the Intermediate Representation (IR) of the language. These bytecodes were then extended, and a custom VM was implemented across the scale space. The modifications made to the bytecodes and the custom VM are discussed in Chapter 4.
