Intrinsics - Instruction Set - Micro Virtual Machines: A Solid Foundation for Managed Language

5.3 Instruction Set

5.3.8 Intrinsics

With the design of Mu continuing to evolve, more and more primitive operations, such as object pinning, are added to the Mu instruction set. These primitive operations cannot be expressed with existing instructions, but inventing a new instruction for every new operation will cause the instruction set to explode.

Mu has a mechanism, intrinsics,²¹ which allows the instruction set to be extended without inventing new instruction formats. All intrinsics are encoded in a common format: ‘%rv = INTRINSIC @name <@T1 @T2 ...> (%arg1 %arg2 ...)’. New intrinsics can be added simply by inventing new names.

This mechanism is similar to intrinsic functions in C and LLVM. However, unlike intrinsic functions provided by C implementations, Mu intrinsics are fully standardised by the Mu specification and understood by all Mu implementations. Unlike LLVM intrinsics, Mu intrinsics have a richer syntax than function calls. For example, Mu intrinsics may accept type arguments in addition to value arguments, making type-polymorphic intrinsics easy to encode, whereas LLVM intrinsic functions have to encode the type arguments as part of the function name, such as the llvm.sqrt.f32 and llvm.sqrt.f64 intrinsic functions for different floating point types, which may be ugly when the types are complex.
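The difference between the two encodings can be illustrated with a minimal Python sketch. It only renders text: `mu_intrinsic` follows the common format quoted above, and `llvm_intrinsic_name` mimics the name-mangling style of llvm.sqrt.f32. The intrinsic name `@uvm.example.sqrt` and the type names `@float` and `@double` are hypothetical and chosen purely for illustration; they are not taken from the Mu specification.

```python
def mu_intrinsic(result, name, type_args=(), value_args=()):
    """Render one instruction in the common INTRINSIC format."""
    parts = [f"{result} = INTRINSIC {name}"]
    if type_args:
        # Type arguments travel in their own <...> list.
        parts.append("<" + " ".join(type_args) + ">")
    if value_args:
        parts.append("(" + " ".join(value_args) + ")")
    return " ".join(parts)


def llvm_intrinsic_name(base, type_suffixes):
    """Mangle type arguments into the function name, LLVM style."""
    return ".".join([base, *type_suffixes])


# Hypothetical intrinsic and type names, used only for illustration.
mu_f32 = mu_intrinsic("%r", "@uvm.example.sqrt", ["@float"], ["%x"])
mu_f64 = mu_intrinsic("%r", "@uvm.example.sqrt", ["@double"], ["%x"])

# One name per type in the LLVM style.
llvm_f32 = llvm_intrinsic_name("llvm.sqrt", ["f32"])
llvm_f64 = llvm_intrinsic_name("llvm.sqrt", ["f64"])
```

In this sketch the Mu-style name stays the same for both floating point types and only the type argument list changes, whereas the LLVM-style encoding produces a distinct name for each type.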

All functions in the Mu client API are also available as intrinsics. This allows meta-circular clients, i.e. Mu clients which are themselves Mu IR programs, to invoke the API via intrinsics without resorting to the native interface.

5.4 Summary

This chapter presented the Mu intermediate representation (IR). The Mu IR is designed using LLVM as a frame of reference. The Mu type system is low-level, but has built-in support for garbage-collected reference types. The Mu instruction set addresses the major concerns of Mu, namely execution, concurrency and garbage collection, and also allows direct interaction with native programs.

In the next chapter, we will introduce the Mu client interface, the API via which the client communicates with Mu.

²¹ The current Mu specification calls them ‘common instructions’ because all such instructions have a common encoding. This name is misleading because these instructions are much less common than other instructions, such as ADD and SUB. We plan to change this name to ‘intrinsics’.


    .global @CookieToObjectMap <@refToHashMap>

    .funcdef @foo VERSION %1 <@foo.sig> {
      %entry():
        %v = INTRINSIC @uvm.native.get_cookie
        %ctx_obj = ??? // TODO: get the object from %v
        // ...
    }

    .const @MY_COOKIE1 <@i64> = 100
    .const @MY_COOKIE2 <@i64> = 200

    .expose @nativefoo1 = @foo #DEFAULT @MY_COOKIE1
    .expose @nativefoo2 = @foo #DEFAULT @MY_COOKIE2

(a) Mu IR

    foo:
      // prologue
      push rbp
      mov rbp, rsp

      // cookie is in rax
      // foo body continues here

    nativefoo1:
      mov rax, 100   // load the cookie value
      jmp foo        // jump to the actual foo

    nativefoo2:
      mov rax, 200
      jmp foo

(b) x64 Assembly

Figure 5.8: Cookies of exposed functions. Figure 5.8(a) shows a Mu function @foo which is exposed as two function pointers, @nativefoo1 and @nativefoo2, with the same default C calling convention but different cookies. When the native program calls either exposed function, @foo will be executed, but the value of %v will be 100 if called via @nativefoo1, and 200 if called via @nativefoo2. This value can be used to look up the context object (such as the object of the method) using a client-designed global map data structure. Figure 5.8(b) shows one possible implementation on x64. The register rax is reserved for the cookie as it is not used by the C calling convention. The exposed functions load the literal value 100 or 200 into rax before jumping to the actual @foo, where the cookie is available in the rax register. Mu can bulk-allocate arrays of such mov-jmp sequences and reuse them when Mu functions are exposed and unexposed at run time.
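The cookie lookup described in the caption can be mimicked with a short Python sketch. This is an analogy under stated assumptions, not Mu code: a dictionary plays the role of the client-designed global map, a closure plays the role of each mov-jmp stub, and the strings "object A" and "object B" are hypothetical context objects.

```python
# Client-designed global map from cookie value to context object
# (plays the role of @CookieToObjectMap in Figure 5.8(a)).
cookie_to_object = {}


def foo(cookie):
    """Shared function body: find the context object selected by the cookie."""
    ctx_obj = cookie_to_object[cookie]
    return f"foo called on {ctx_obj}"


def expose(func, cookie):
    """Return an entry point that supplies the cookie and then runs func,
    analogous to the mov-jmp stubs in Figure 5.8(b)."""
    def stub():
        return func(cookie)
    return stub


MY_COOKIE1, MY_COOKIE2 = 100, 200
cookie_to_object[MY_COOKIE1] = "object A"  # hypothetical context objects
cookie_to_object[MY_COOKIE2] = "object B"

# Two entry points, one shared body, distinguished only by the cookie.
nativefoo1 = expose(foo, MY_COOKIE1)
nativefoo2 = expose(foo, MY_COOKIE2)
```

Both entry points run the same body, so exposing a function again with a new cookie only requires a new stub and a new map entry.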

Chapter 6

Mu’s Client Interface

The preceding chapter presented the Mu intermediate representation for programs executed on Mu. This chapter presents the Mu client interface (API), which allows the language client to control Mu and handle trap events at run time.

This chapter is structured around the use of the Mu API. Section 6.1 presents a high-level overview of the API design; Section 6.2 discusses the API for the run-time loading of Mu bundles; Section 6.3 discusses the API for trap handling and run-time optimisation; Section 6.4 summarises this chapter.

6.1 Overview

Mu provides a bi-directional API to communicate with its client. The client can send messages to Mu for the purposes of: (1) building and loading Mu IR code bundles, (2) accessing Mu memory, and (3) introspecting and manipulating the state of Mu threads and stacks. Mu sends messages to the client if a TRAP or WATCHPOINT instruction is executed.
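The two directions of communication can be pictured with a toy Python model. Everything in it is an assumption made for illustration: the class `ToyMicroVM` and the methods `load_bundle`, `set_trap_handler` and `on_trap` are hypothetical names and do not correspond to functions in the actual Mu client API, which is specified as a C header.

```python
class ToyMicroVM:
    """Toy stand-in for Mu; every method name here is hypothetical."""

    def __init__(self):
        self.bundles = []
        self.trap_handler = None

    # Client -> Mu: load a code bundle.
    def load_bundle(self, bundle):
        self.bundles.append(bundle)

    # Client -> Mu: register the callback used for Mu -> client messages.
    def set_trap_handler(self, handler):
        self.trap_handler = handler

    # Mu -> client: invoked when a TRAP or WATCHPOINT instruction executes.
    def on_trap(self, instruction, thread_id):
        if self.trap_handler is None:
            raise RuntimeError("no trap handler registered")
        return self.trap_handler(instruction, thread_id)


def client_trap_handler(instruction, thread_id):
    """Client-side decision; this toy handler always asks to resume."""
    return f"resume thread {thread_id} after {instruction}"


vm = ToyMicroVM()
vm.load_bundle(".funcdef @foo ...")  # placeholder bundle text
vm.set_trap_handler(client_trap_handler)
reply = vm.on_trap("TRAP", thread_id=1)
```

The client drives the first two calls, while the last call models Mu handing control back to the client when a trap is reached.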

The client API is expressed in the specification in the form of a header in the C language. This makes C the canonical language for the interface between Mu and the client, but language bindings for other languages can be created.

The client API is different from the ‘unsafe native interface’ introduced in Section 5.3.7. The purpose of the native interface is interacting with native libraries, while the purpose of this API is the communication between the micro virtual machine and the client.

How tightly the client is coupled with Mu is not specified. The client may be a meta-circular client which itself is a Mu IR program. The client may be a C program running in the same process as Mu. It can also run in a different process, or on a different machine, and control the micro virtual machine remotely.

Like JNI, the API lets the client hold Mu values, including traced references, via opaque handles tracked by Mu. This hides the representation of Mu values, especially opaque reference types, from the client.

The API can create many client contexts. Each context is an entity in Mu that holds handles on behalf of the client, and the garbage collector may trace all references held by all contexts in order to perform exact GC. The context also lets the client allocate objects in the Mu memory, access the Mu memory (including the heap), and create other objects such as Mu threads and stacks. For efficient implementation, the context is intentionally not thread-safe. Each context should be used by at most one client thread at a time, allowing operations on the context to be implemented without excessive synchronisation. For example, each context might have its own heap allocation buffer, allowing the memory allocation fast-path to avoid taking a lock on the entire heap.
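The role of a client context as a holder of handles can be sketched in Python as follows. This is a toy model under stated assumptions: the class `ClientContext` and its methods `new_handle`, `delete_handle`, `close` and `gc_roots` are hypothetical names, handles are modelled as plain integers, and the held values are ordinary Python objects standing in for Mu values.

```python
import itertools


class ClientContext:
    """Holds values on behalf of the client behind opaque handles.
    Not thread-safe by design: use from one client thread at a time."""

    _all_contexts = []          # toy stand-in for Mu's registry of contexts
    _ids = itertools.count(1)   # source of fresh handle ids

    def __init__(self):
        self._handles = {}      # handle id -> held value
        ClientContext._all_contexts.append(self)

    def new_handle(self, value):
        handle = next(ClientContext._ids)
        self._handles[handle] = value
        return handle           # the client only ever sees this id

    def delete_handle(self, handle):
        del self._handles[handle]

    def close(self):
        self._handles.clear()
        ClientContext._all_contexts.remove(self)

    @classmethod
    def gc_roots(cls):
        """Values the collector must treat as live because a client holds them."""
        return [v for ctx in cls._all_contexts for v in ctx._handles.values()]
```

Because each context keeps its own handle table, a client thread can create and delete handles in this sketch without locking state shared with other contexts, while the collector can still enumerate every held value by walking all contexts.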

The states held by client contexts are similar to those held by Mu threads. A Mu thread holds many Mu values as local variables on the stack, whereas a client context holds many Mu values as handles. Both Mu threads and client contexts may have local GC allocation buffers. In some sense, a client context enables a client thread to perform operations that otherwise could only be performed by Mu threads.
