Mu is a micro virtual machine designed to support high-level programming languages. It focuses on three basic concerns:
- garbage collection
- concurrency
- just-in-time compiling
Many programming languages are implemented on virtual machines.
There are many aspects in a language implementation. There are high-level aspects which are usually language-specific, including:
- parsing high-level language programs, or loading high-level byte codes
- object oriented programming, including classes, inheritance and polymorphism (if applicable)
- functional programming, including high-order functions, pattern matching, etc. (if applicable)
- eager or lazy code/class loading
- high-level optimisation
- a comprehensive standard library
as well as low-level aspects which are usually language-neutral, including:
- an execution engine (for example, JIT compiling)
- a model of threads and a memory model
- garbage collection
A "monolithic" VM implements everything listed above. JVM is one of such VMs. Creating such a VM is a huge amount of work. It takes two decades and billions of dollars for the JVM to have a high-quality implementation. Such man-power and investment is usually unavailable for other languages.
We coined the term "micro virtual machine". It is an analogue to the term "microkernel" in the operating system context. A micro virtual machine, like a micro kernel, only does what absolutely needs to be done in the micro virtual machine, and pushes most high-level aspects to its client, the counterpart of the "services" of a microkernel.
In a language implementation with the presence of a micro virtual machine, the micro virtual machine shall handle those three low-level aspects, namely concurrency, JIT and GC, and the client handles all other high-level aspects.
Take JVM as an example. If JVM were implemented as a client of a micro virtual machine, it only needs to handle JVM-specific features, including the byte-code format, class loading and aspects of object-oriented programming.
Traditional JVM +-------------------+ +---------------------------+ | | | | | *JVM* | | *Java Client* | | byte code format | | byte-code format | | class loading | | class loading | | OOP | | OOP | | GC | | | | concurrenty | +---------------------------+ | JIT compiling | | *micro virtual machine* | | | | GC,concurrency,JIT | +-------------------+ +---------------------------+ | *OS* | | *OS* | +-------------------+ +---------------------------+
Mu is a concrete micro virtual machine.
The main part of this project is this specification which defines the behaviour of Mu and the interaction with the client. This allows multiple compliant implementations.
The specification mainly includes the type system, the instruction set and the Mu client interface (sometimes called "the API").
The whole system is divided into a language-specific client and a language-neutral micro virtual machine (in this case, it is Mu).
| source code or byte code v +-----------------+ | client | +-----------------+ | ^ Mu IR / | | traps/watchpoints/ API call | | other events v | +-------------------+ manages +---------+ | Mu (the micro VM) |----------->| Mu heap | +-------------------+ +---------+
A typical client implements a high-level language (e.g. Python or Lua). Such a client would be responsible for loading, parsing and executing the source code or byte code.
The client submits programs to Mu in a language called Mu Intermediate Representation, a.k.a. Mu IR. The Mu IR code is then executed on Mu.
The client can directly manipulate the states of Mu using the Mu client Interface, a.k.a. the API. The API can access the Mu memory (including the heap), create Mu threads, stacks, introspect stack states and so on. The Mu IR code mentioned above is submitted via the API, too.
There are events which Mu cannot handle alone. These include lazy code loading, requesting for optimisation/deoptimisation and so on. In these cases, Mu generates events to be handled by the client.
Mu handles garbage collection internally. Mu can identify all references held inside Mu and also tracks all references held by the client. So exact GC is possible in Mu without the intervention from the client.
The Mu type system has scalar and vector integer and floating point types, aggregate types including structs, arrays and hybrids, as well as reference types. The type system is low level, similar to the level of C, but natively supports reference types.
Mu is agnostic of the type hierarchy in high-level languages, but the client can implement its language-specific type system and run-time type information on top of the Mu type system.
See Type System for more details.
The Mu instruction set is similar to (and is actually inspired by) the LLVM's instruction set. There are primitive arithmetic/logical/relational/conversion operations and control flow instructions.
Mu has its own exception handling, not depending on system libraries as C++ does. Mu IR programs can throw and catch exceptions, but the client needs to implement its own exception type hierarchy if applicable.
There are garbage-collection-aware memory operations, including memory allocation, addressing and accessing. The client does not need to implement garbage collection algorithms; it only needs to use reference types and related instructions and Mu handles the rest.
Trap instructions let Mu IR programs talk back to the client for events it cannot handle.
There are also instructions for handling stack and threads.
See Instruction Set and Common Instructions for more details.
The Mu client interface (API) allows the client to directly manipulate the state of Mu.
The API can load Mu IR code.
The API can create threads and stacks. The usual way to start a Mu program is to create a new stack with a function on the bottom of the stack and create a new thread on it to start execution. (The concept of threads and stacks are discussed later.)
The API can directly allocate and access the Mu memory. References are indirectly exposed to the client as handles rather than raw pointers for the ease of garbage collection. (The JVM takes the same approach.)
The client also handles trap events generated by the Mu IR code. The client can introspect the selected local variables on the stack and perform on-stack replacement (i.e. OSR. Discussed later.)
See Client Interface for more details.
The (unsafe) native interface is designed to directly interact with native programs. It gives Mu program direct access to the memory via pointers, and allows pinning Mu objects so that they can be accessed by native programs. It also allows Mu to call a native (usually C) function directly, and allows native programs to call back to selected Mu functions. (The .NET CLR takes similar approach, i.e. giving the high-level program "unsafe" access to the lower level.)
This interface is different from the client API. The main purpose is to implement the system-interfacing part of the high-level language, such as the IO and the networking library.
See Native Interface for more details.
Mu supports threads. Mu threads are usually implemented with native OS threads, but this specification does not enforce this. Multiple Mu threads may execute simultaneously.
Mu has a C11/C++11-like memory model. There are atomic memory access with different memory orders. The client should generate code with the appropriate memory order for its high-level language.
Mu provides a Futex-like mechanism similar to the counterpart provided by the Linux kernel. It is the client's responsibility to implement mutex locks, semaphores, conditions, barriers and so on using atomic memory accesses and the futex.
See Threads and Stacks for details about threads and Memory Model for the Mu memory model.
Mu distinguishes between threads and stack. In Mu, a thread is the unit of CPU scheduling and a stack is the context in which a thread executes. An analogy is "workers and jobs".
A stack has multiple frames, each of which is a context of a function activation, including local variables and the current instruction.
A swap-stack operation unbinds a thread from a stack (the old context) and bind to another stack (the new context). As a result, the old context of execution is paused and can be continued when another swap-stack operation binds another thread (may not be the same old thread) to that stack. This is similar to letting a worker stop doing one job and continue with another job.
The swap-stack operation is essentially a model of symmetric coroutines. It allows the client to implement coroutines in high-level languages (e.g. Ruby, Lua, Go as well as Python and ECMAScript 6).
It also allows the client to implement its own light-weight thread. This is particularly useful for languages with massively many threads (e.g. Erlang).
See Threads and Stacks for details.
It is a common strategy to use a fast compiler to compile high-level programs to suboptimal low-level code, and only optimise when the implementation decided at run time that a function (or loop) is hot. Then an optimised version is compiled. Optimising compilation usually takes longer, but the code runs faster.
In Mu, a function can have zero or more versions. When a function is called, it always calls the newest version.
The semantic of function definition in Mu is to create a new version of a function. What the client should do is to generate an optimised version of the high-level code in Mu IR and submit it to Mu. All call sites and function references are automatically updated.
If a function has zero versions, it is "undefined". Calling such a function will "trap" to the client. Such functions behave like "stubs" and this gives the client a chance to implement lazy code/class loading.
See Intermediate Representation for the definition of functions and versions, and see Client Interface for the code loading interface.
At the same time when an optimised version of a function is compiled, there are existing activations on the stack still running the old version. On-stack Replacement (OSR) is the operation to replace an existing stack frame with another frame.
Mu provides two primitives in its API:
- Pop the top frame of a stack.
- Given a function and its arguments, create a new frame and push it on the top of a stack.
Note that Mu is oblivious about whether the new version is "equivalent to" or "better than" the old version. The responsibility of optimisation is pushed to the client.
See Client Interface for more details.
The Memory chapter provides more detail about garbage collection and memory allocation/accessing.
The Portability chapter describes the requirements of implementations. It summarises corner cases which may result in different or undefined behaviours in different platforms.