5.3 The Actor as the Abstraction for Adaptive Programming

5.3.5 Media Player

To be useful as a media player, the migration delay of an actor within the media application must be small enough that the user does not notice the transition. For video playback, a succession of images is displayed at a rate of 24 frames per second to provide the illusion of movement. This displays a frame approximately every 41.7 ms, hence the migration time must be less than this. Figure 5.14b shows the time taken to migrate the display actor between a desktop machine and a Raspberry Pi, including any images currently referenced by the actor. The worst-case delay is approximately 29 ms, with the average time being approximately 18 ms. This delay includes the time to reconstruct the actor, its state, and all channel connections, meaning that after this time the actor is able to continue operation. Six images are used in the media player example; the size of each image is shown in Table 5.7. Figure 5.14b also shows the time taken to send the largest image remotely to the migrated actor. The combination of the average migration time (18 ms) with the average transmission time (5 ms) is 23 ms, which is lower than the frame interval of approximately 41.7 ms. It is important to note that this transmission time is less than that seen in Figure C.8 in Appendix C for a similar data size. This is because the latter figure uses a more complex data type, which takes longer to marshal and demarshal.
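The timing argument above can be checked with a few lines of arithmetic. The following Python sketch is illustrative only: the constant and function names are hypothetical, the figures are the ones quoted in the text, and the "worst-case migration plus average transmission" combination is an assumption made here for illustration, since the text does not report a worst-case transmission time.

```python
# Frame budget check using the figures quoted in the text.
FRAME_RATE_FPS = 24
AVG_MIGRATION_MS = 18.0    # average migration time (Figure 5.14b)
WORST_MIGRATION_MS = 29.0  # worst-case migration time (Figure 5.14b)
AVG_TRANSMIT_MS = 5.0      # average time to send the largest image


def frame_interval_ms(fps: int) -> float:
    """Time between consecutive frames, in milliseconds."""
    return 1000.0 / fps


def fits_in_frame(total_ms: float, fps: int = FRAME_RATE_FPS) -> bool:
    """True if an operation completes before the next frame is due."""
    return total_ms < frame_interval_ms(fps)


budget = frame_interval_ms(FRAME_RATE_FPS)
average_total = AVG_MIGRATION_MS + AVG_TRANSMIT_MS
# Assumption: pair the worst-case migration with the average transmission.
worst_total = WORST_MIGRATION_MS + AVG_TRANSMIT_MS

print(f"frame interval: {budget:.1f} ms")
print(f"average case: {average_total:.0f} ms, fits: {fits_in_frame(average_total)}")
print(f"worst migration + average transmit: {worst_total:.0f} ms, "
      f"fits: {fits_in_frame(worst_total)}")
```

Under these figures both totals stay below the interval between two frames at 24 frames per second.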

5.4 Summary

Given the large number of heterogeneous computing platforms which are currently available, a number of esoteric programming styles have emerged. The work in this chapter has argued that the actor model of computation can be used to present a simpler, homogeneous programming model both within and across different computing domains, and that executing such a model on a VM does not compromise performance and enables fine-grained adaptation.

Through comparison with nesC and TinyOS, the current de facto choice in WSN programming, Section 5.1 shows that the actor-based model of computation is both simpler and as performant on highly resource-constrained embedded hardware, across a number of representative and realistic applications. The execution of Ensemble applications on the Ensemble VM facilitates this without being onerous in either space or time. The use of a movable memory space enables the shared-nothing semantics of the actor model and automated garbage collection, without incurring increased memory consumption and fragmentation.

By noting the parallels between shared-nothing actors, which communicate via explicit message-passing, and accelerator-based computation, which requires explicit data movement, actors have been used as an abstraction for accelerator-based programming. Section 5.2 describes how, by marking an actor as a kernel, with its behaviour clause becoming the logic for the kernel, and using channels to convey data between the kernel-actor and other actors, the actor model can completely abstract the large amount of boilerplate code required for OpenCL, and provide a more natural programming model for such computation. The use of movability provides a type-safe way for developers to take advantage of common GPU programming optimisations, while again maintaining the encapsulation of actors. The performance of this approach is comparable to hand-written C code using OpenCL.

Finally, given the numerous operating conditions which are presented by heterogeneous hardware platforms connected by different network types, it is no longer sufficient to think in terms of static software configurations for homogeneous devices. By using the actor as the unit of adaptation, Section 5.3 has described the performance cost of adapting actor-based applications across a set of heterogeneous devices. These results show that this is possible on highly resource-constrained devices, as well as showing minimal cost for medium to highly provisioned devices, including GPUs.

Chapter 6

Conclusions and Future Work

Two major trends in computing hardware in the last decade are the increasing number of processing cores, both in single CPU chips and in dedicated peripheral devices such as GPUs and co-processors, and the increase in ubiquitous heterogeneous distributed systems. While these advances present significant potential benefits to performance and enable new forms of digital interaction over traditional desktop computing, programming these devices is challenging at best. To aid the use of these platforms by non-experts or non-computing scientists, who stand to benefit greatly from these hardware advances, it is essential to provide better programming abstractions and runtime support.

The goal of this dissertation is to provide such support by showing that an actor-based programming abstraction can greatly simplify programming such hardware devices and systems, without incurring any notable performance penalty, enabling developers to focus on solving problems.

6.1 Thesis Statement Revisited

This section revisits the thesis statement presented at the start of the dissertation to assess the impact of the work presented. The thesis is as follows:

The use of encapsulated, shared-nothing loci of computation and explicit message passing, found in the actor programming model, will both enable and simplify the programming of concurrent, distributed, and adaptive applications across heterogeneous platforms at different levels of computing scale.

To prove this assertion, the following work has been done:

Chapter 3 describes the design of an actor-based programming language. The decision to create a new language was due to the limitations of other actor-based approaches. This language supports the creation of applications based on shared-nothing loci of computation which interact via explicit, typed message-passing. As well as automated garbage collection of heap-allocated memory, the language supports the idea of a simple movable heap space to address the increased heap usage and fragmentation caused by the use of shared-nothing semantics and automated garbage collection. Furthermore, this chapter describes language support for accelerator-based computation via an actor-based abstraction. This model fits well due to the parallels between these two idioms.
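As a rough illustration of the shared-nothing, message-passing style described above, the following is a minimal Python sketch. It is not Ensemble syntax and does not reflect the Ensemble runtime: the `Actor` and `Accumulator` classes are hypothetical, messages are untyped, and copying each message on send is simply one assumed way to ensure that sender and receiver never share memory.

```python
import copy
import queue
import threading


class Actor(threading.Thread):
    """Hypothetical actor: private state, one inbound channel, no shared memory."""

    def __init__(self):
        super().__init__()
        self.inbox = queue.Queue()  # channel carrying messages to this actor

    def send(self, message):
        # Deep-copy so sender and receiver never alias the same object.
        self.inbox.put(copy.deepcopy(message))


class Accumulator(Actor):
    """Sums every list it receives; reports the total when sent None."""

    def __init__(self, reply_to):
        super().__init__()
        self.total = 0            # state private to this actor
        self.reply_to = reply_to  # outbound channel for the result

    def run(self):
        while True:
            message = self.inbox.get()
            if message is None:   # sentinel: publish the result and stop
                self.reply_to.put(self.total)
                return
            self.total += sum(message)


results = queue.Queue()
acc = Accumulator(results)
acc.start()

data = [1, 2, 3]
acc.send(data)     # the actor receives its own copy: [1, 2, 3]
data.append(100)   # later mutation by the sender does not affect that copy
acc.send(data)     # second copy: [1, 2, 3, 100]
acc.send(None)
acc.join()

total = results.get()
print(total)  # 6 + 106 = 112
```

The sender mutates `data` after the first send, yet the first message the actor sees is unchanged, which is the property that shared-nothing semantics are intended to guarantee.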

Chapter 4 describes the process of compiling Ensemble applications into a custom class file format, as well as the implementation of a runtime which interprets these applications. The runtime natively supports the concepts expressed in the language, such as actors and channel-based communication, as well as the actor-based abstraction and execution of OpenCL kernels, the discovery and runtime adaptation of actors and stages, and a channel-based abstraction of network communication. There is also a discussion of porting the runtime to a number of different hardware platforms, including resource-constrained embedded systems.

The main justification for using an actor-based abstraction for programming concurrent, distributed, embedded, and adaptive applications is made in Chapter 5. The chapter is split into three sections:

Firstly, a justification of the actor as the unit of abstraction for embedded programming is made. To show this, a number of applications covering the different equivalence classes of activity found in embedded applications were used to compare this work with the popular TinyOS/nesC system. In terms of linguistic complexity, Ensemble applications are much simpler than their functionally equivalent nesC counterparts. Performance comparisons show that when Ensemble is compiled to C code directly, it provides at least equivalent performance to TinyOS, and still leaves plenty of RAM and ROM available on the embedded hardware. This section also discusses the runtime cost of interpreting Ensemble applications on the custom Ensemble VM on resource-constrained hardware. The results show performance comparable to native execution for typical embedded applications, and do not show excessive resource consumption. Hence, interpreted applications are a valid base from which to explore runtime adaptation on embedded devices. Also, this section shows that the movable memory space can be used to reduce the increased heap fragmentation and allocation costs incurred by the use of automatic garbage collection and shared-nothing semantics.

Secondly, a justification for using an actor-based abstraction for programming accelerator-based concurrency is made. Specifically, by noting the parallels between the two programming models, an actor is used to abstract the representation of a kernel, and channel-based communication abstracts the explicit data movements between actors, kernel or otherwise. The use of objective software complexity metrics has shown that the actor-based abstraction is significantly simpler when compared to equivalent C and simpler when compared to equivalent OpenACC implementations across a range of applications. Also, performance results show significant improvements when compared to OpenACC, and comparable performance to C. By using the movable heap space, developers are able to leave data on an accelerator (a common optimisation) to improve performance, without violating the shared-nothing semantics of the actor model.

Thirdly, a justification for using actors as an abstraction for adaptive computation is made. Given the description in previous chapters of how adaptation, as expressed in Ensemble, overcomes or addresses the limitations in other actor-based languages, this section focuses on performance results. There are performance results for an embedded device communicating over Bluetooth to show that such adaptation is both possible and feasible; limitations in the hardware prevented a full set of experiments. Also, there was reference to a suite of micro-benchmarks in Appendix C showing the minimal cost of individual adaptation operations, as well as a discussion and results showing the ease of applying adaptation to kernel-actors and the limited cost of doing so. To motivate the argument for the actor as the unit of adaptation, two applications were created to represent a number of use cases for adaptation: a draughts computer game and a mobile media player. The results showed that little effort was required to apply adaptation to these applications, and that the performance costs for discovery, spawn, or migration were minimal compared to the potential performance improvements that runtime adaptation offered.