The Mu IR program which the micro virtual machine compiles and executes comes from the client.
6.2.1 Bundle as the Unit of Loading
The unit of Mu IR code loading is the bundle. A bundle is the counterpart of a JVM .class file or an LLVM module. As shown in Figure 5.1, a Mu IR bundle contains many top-level definitions, which are types, function signatures, constants, static cells, functions, function versions and exposed functions. The client constructs Mu IR code bundles and submits them to Mu via the API.
Conceptually, a Mu instance has one global bundle which is initially empty. Every time the client loads a bundle, all of its top-level definitions are merged into the global bundle. Therefore, at any time, the global bundle contains all of the top-level definitions, such as types and functions, that the client has ever submitted to Mu. Although Mu may implement parallel bundle loading, the Mu specification requires the loading of all bundles to be serialised, so that all bundles appear to be loaded in one particular order. Therefore, the code in each bundle may only use top-level definitions, such as types and functions, defined in the current bundle or in any previously loaded bundle, but not those from bundles loaded in the future.
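The merging rule can be illustrated with a small model. The following is only a sketch under stated assumptions, not part of the Mu API: the class `GlobalBundle`, its `load` method and the definition names are hypothetical, a bundle is reduced to a mapping from each definition's name to the names it refers to, and the redefinition of existing names is not modelled.

```python
class GlobalBundle:
    """Toy model of the single global bundle of a Mu instance (hypothetical)."""

    def __init__(self):
        self.definitions = {}  # name -> set of names the definition refers to
        self.load_order = []   # labels of loaded bundles, in serialised order

    def load(self, label, bundle):
        """Merge `bundle`, a dict mapping each name to the names it refers to."""
        # A definition may refer to anything already loaded or anything
        # defined in the bundle that is being loaded right now.
        visible = set(self.definitions) | set(bundle)
        for name, refs in bundle.items():
            missing = set(refs) - visible
            if missing:
                # Nothing has been merged yet, so the global bundle is unchanged.
                raise LookupError(f"{name} refers to unknown {sorted(missing)}")
        for name, refs in bundle.items():
            self.definitions[name] = set(refs)
        self.load_order.append(label)


mu = GlobalBundle()
mu.load("b1", {"@i64": [], "@sig": ["@i64"]})
# Mutual references inside one bundle are fine.
mu.load("b2", {"@f": ["@sig", "@g"], "@g": ["@sig", "@f"]})
try:
    # "@later" is only defined by some future bundle, so this is rejected.
    mu.load("b3", {"@h": ["@later"]})
    rejected = False
except LookupError:
    rejected = True
```

In this model the third bundle is rejected as a whole, and the global bundle still contains exactly the definitions of the first two.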
This model is very different from that of the C programming language, where all source files are ‘parallel’: they are compiled independently and linked together, and each file can still refer to symbols defined elsewhere because the linker resolves the inter-dependency. C and Java programmers may find the Mu bundle loading model counter-intuitive, thinking that Mu bundles should mirror the high-level C source files or JVM .class files. However, if we think from the perspective of Mu, the design is logical. As shown in Figure 6.1, when we observe from Mu’s perspective, the exact organisation of language-level modules is an implementation detail of the client, of which Mu is oblivious. The only relation between bundles is the order in which they are loaded from the unknown outside world called the client. In this way, we see the process of
[Figure 6.1, panel (a), the client’s perspective: modules in the client, related by their logical structure, are loaded into Mu. Panel (b), Mu’s perspective: bundles from the client are loaded into Mu in temporal order.]
Figure 6.1: Bundle loading from different perspectives. If we focus on the client as in Figure 6.1(a), the modules should represent the structure of the high-level program, and Mu is just a destination of those modules. But if we focus on Mu as in Figure 6.1(b), then all details inside the client are beyond the concern of Mu. Mu only sees many bundles loaded from the client one after another, and the only relation between bundles is the temporal order of loading.
loading one bundle after another as the process in which Mu gradually gains more knowledge about the program that the client intends to execute. Naturally, to simplify the implementation of Mu, we require each bundle to refer only to knowledge, i.e. top-level definitions, which Mu has already gained (in previously loaded bundles) or is about to gain (in the current bundle), and we do not require Mu to keep track of unresolved top-level definitions.
A bundle is the unit of loading. It does not need to match the logical module of the language the client is implementing. A bundle may be as small as a single function the client has just optimised. A bundle may also be as big as the amalgamation of several inter-dependent modules. There is no restriction on size, so we expect the client to build and load a bundle whenever it needs to submit any code.
6.2.2 The IR-building API
The client builds Mu IR bundles using the API.
IR inside Mu. The client can order Mu to load the bundle when it is completely built, or abort the IR-building process at any time. The irbuilderref opaque reference type refers to an IR builder object, which holds temporary AST nodes while building a bundle.
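The life cycle of an IR builder can be sketched as follows. This is a minimal illustration and not the actual Mu API: the class `IRBuilder`, the methods `new_node`, `load` and `abort`, and the textual node descriptions are hypothetical names chosen for this example. It also assumes that a builder can no longer be used once it has been loaded or aborted.

```python
class IRBuilder:
    """Toy IR builder holding temporary nodes until load or abort (hypothetical)."""

    def __init__(self, global_bundle):
        self._global = global_bundle  # dict standing in for Mu's global bundle
        self._nodes = {}              # temporary nodes, keyed by ID
        self._finished = False

    def _check_open(self):
        if self._finished:
            raise RuntimeError("builder already loaded or aborted")

    def new_node(self, node_id, description):
        """Record a temporary node; nothing is visible to Mu yet."""
        self._check_open()
        self._nodes[node_id] = description

    def load(self):
        """Hand the completed bundle over to the global bundle."""
        self._check_open()
        self._global.update(self._nodes)
        self._nodes.clear()
        self._finished = True

    def abort(self):
        """Discard all temporary nodes without loading anything."""
        self._check_open()
        self._nodes.clear()
        self._finished = True


global_bundle = {}

b1 = IRBuilder(global_bundle)
b1.new_node(1, "type int<64>")
b1.load()    # node 1 is now part of the global bundle

b2 = IRBuilder(global_bundle)
b2.new_node(2, "half-built function")
b2.abort()   # node 2 is thrown away
```

Only the nodes of the first builder reach the global bundle; the aborted builder leaves no trace.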
AST nodes refer to each other by symbolic IDs rather than direct pointers. The Mu IR inevitably contains cyclic references between nodes. For example, a basic block refers to a list of instructions, and the BRANCH instruction refers to a basic block. Functional languages may have difficulty representing cyclic references directly, so referring to nodes by ID keeps the API (and Mu itself) implementable in both imperative and functional languages.
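The following sketch shows why IDs make cycles unproblematic. It is an illustration only: the node classes, field names and ID values are assumptions made for this example and are not the Mu data structures. Both nodes are immutable, as they would be in a functional language, yet the basic block and its branch still refer to each other, because each reference is a plain integer that is resolved through a lookup table.

```python
from typing import NamedTuple, Tuple


class BasicBlock(NamedTuple):
    id: int
    inst_ids: Tuple[int, ...]  # IDs of the instructions in this block


class Branch(NamedTuple):
    id: int
    dest_id: int               # ID of the destination basic block


nodes = {}  # symbolic ID -> node


def add(node):
    nodes[node.id] = node
    return node.id


BB_ID, BR_ID = 1, 2

# The branch can name its destination before that block exists,
# because it only stores the ID.
add(Branch(id=BR_ID, dest_id=BB_ID))
add(BasicBlock(id=BB_ID, inst_ids=(BR_ID,)))


def branch_target(block_id):
    """Resolve the block that the last instruction of `block_id` branches to."""
    block = nodes[block_id]
    last = nodes[block.inst_ids[-1]]
    return nodes[last.dest_id]
```

Here `branch_target(BB_ID)` resolves back to the same block, a self-loop that is built without ever mutating a node after its creation.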
The purpose of the IR-building API is to communicate Mu IR code between the client and Mu as efficiently as possible. It is not intended to be used as a data structure for the client to perform code transformation, which happens to be the purpose of the LLVM IR. Adapting our API for transformation would greatly complicate it by adding more functions for modification and query, which would increase the burden on Mu, which should be kept minimalist. To compensate for this, client-side libraries can be developed for the client to perform Mu-IR-to-Mu-IR optimisations.