This repository has been archived by the owner on Aug 2, 2019. It is now read-only.

Common Instructions

This document specifies Common Instructions.

Common Instructions are instructions that have a common format and are used with the COMMINST super instruction. They have:

  1. An ID and a name. (This means, they are identified. See uvm-ir.rest.)
  2. A flag list.
  3. A type parameter list.
  4. A value parameter list.
  5. An optional exception clause.
  6. A possibly empty (which means optional) keep-alive clause.

Common instructions are a mechanism to extend the Mu IR without adding new instructions or changing the grammar.

NOTE: Common instructions were named "intrinsic functions" in previous versions of this document. The name was borrowed from LLVM. However, the common instructions in Mu are quite different from the usual concept of intrinsic functions.

Intrinsic functions usually mean a kind of function that is understood directly by the compiler. The C function memcpy is considered an intrinsic function by some compilers. In JikesRVM, methods of the Magic class are a kind of intrinsic function. They appear like ordinary functions in the language and bypass all front-end tools, including the C parser and javac, but they are understood by the backend. Their purpose is to perform tasks that cannot be expressed in the high-level programming language, such as direct raw memory access in Java.

Common instructions only differ from ordinary Mu instructions in that they have a common format and are called by the COMMINST super instruction. The purpose is to add more instructions to the Mu IR without having to modify the parser.

Common instructions are not Mu functions and cannot be called by the CALL instruction, nor can they be used directly from the high-level language that the client implements. The Mu client must understand common instructions because it is the only source of Mu IR code. That is to say, no higher-level program can express anything that Mu knows but the client does not. For special high-level language functions that cannot be implemented directly in the high-level programming language, such as the methods of the java.lang.Thread class, the client must implement them in "ordinary" Mu IR code, which may or may not involve common instructions. For example, creating a thread is "magic" in Java, but in Mu it is no more special than executing an instruction (NEWTHREAD). Some Java libraries require Mu to make a CCALL to C functions provided by the JVM, and these slip under the level of Mu. But Mu and the client always know that "it calls a C function", so it is not magic.

This document uses the following notation:

[id]@name [F1 F2 ...] < T1 T2 ... > <[ sig1 sig2 ... ]> ( p1:t1, p2:t2, ... ) excClause KEEPALIVE -> RTs
  • id is the ID and @name is the name.
  • [F1 F2 ...] is a list of flag parameters.
  • < T1 T2 ... > is a list of type parameters. The users pass types into the common instruction via this list.
  • <[ sig1 sig2 ... ]> is a list of function signature parameters.
  • (p1:t1, p2:t2, ...) is a list of pairs of symbolic name and type. It is the value parameter list with the type of each parameter. The user only passes values via this list, and the types are only parts of the documentation.

If any of the above lists is omitted in this document, the respective common instruction does not take that kind of parameter.

If excClause or KEEPALIVE is present, the common instruction accepts an exception clause or a keep-alive clause, respectively. Otherwise the common instruction neither branches to exception destinations nor supports keep-alive variables.

RTs are the return types. If the return type is omitted, it means it produces no results (equivalent to -> ()).

The names of many common instructions are grouped by prefixes, such as "@uvm.tr64.". In this document, a common prefix may be omitted from a description when the meaning is unambiguous.
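The notation can be modelled as a small record that renders itself back into the same textual form. The following is only a sketch: the class CommInstSig and its field names are hypothetical and are not part of Mu or its API.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class CommInstSig:
    """Hypothetical descriptor of one common instruction (illustration only)."""
    id: int
    name: str
    flags: List[str] = field(default_factory=list)
    type_params: List[str] = field(default_factory=list)
    sig_params: List[str] = field(default_factory=list)
    value_params: List[Tuple[str, str]] = field(default_factory=list)
    has_exc_clause: bool = False
    has_keepalive: bool = False
    return_types: List[str] = field(default_factory=list)

    def render(self) -> str:
        """Render in the notation of this document, omitting empty lists."""
        parts = [f"[{self.id:#x}]{self.name}"]
        if self.flags:
            parts.append("[" + " ".join(self.flags) + "]")
        if self.type_params:
            parts.append("<" + " ".join(self.type_params) + ">")
        if self.sig_params:
            parts.append("<[" + " ".join(self.sig_params) + "]>")
        if self.value_params:
            params = ", ".join(f"{n}: {t}" for n, t in self.value_params)
            parts.append("(" + params + ")")
        if self.has_exc_clause:
            parts.append("excClause")
        if self.has_keepalive:
            parts.append("KEEPALIVE")
        if self.return_types:
            parts.append("-> " + " ".join(self.return_types))
        return " ".join(parts)


new_stack = CommInstSig(
    0x201, "@uvm.new_stack",
    sig_params=["sig"],
    value_params=[("%func", "funcref<sig>")],
    return_types=["stackref"],
)
print(new_stack.render())
```

An omitted list simply produces no text, which mirrors the rule that an omitted list means the instruction takes no parameters of that kind.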

Thread and Stack operations

[0x201]@uvm.new_stack <[sig]> (%func: funcref<sig>) -> stackref

Create a new stack with %func as the stack-bottom function. %func must have signature sig. Returns the stack reference to the new stack.

The stack-bottom frame is in the state READY<Ts>, where Ts are the parameter types of %func.

This instruction continues exceptionally if Mu failed to create the stack. The exception parameter receives NULL.

[0x202]@uvm.kill_stack (%s: stackref)

Destroy the given stack %s. The stack %s must be in the READY state and will enter the DEAD state.

[0x203]@uvm.thread_exit

Stop the current thread and kill the current stack. The current stack enters the DEAD state.

[0x204]@uvm.current_stack -> stackref

Return the current stack.

[0x205]@uvm.set_threadlocal (%ref: ref<void>)

Set the thread-local object reference of the current thread to %ref.

[0x206]@uvm.get_threadlocal -> ref<void>

Return the current thread-local object reference of the current thread.

64-bit Tagged Reference

[0x211]@uvm.tr64.is_fp  (%tr: tagref64) -> int<1>
[0x212]@uvm.tr64.is_int (%tr: tagref64) -> int<1>
[0x213]@uvm.tr64.is_ref (%tr: tagref64) -> int<1>
  • is_fp checks if %tr holds an FP number.
  • is_int checks if %tr holds an integer.
  • is_ref checks if %tr holds a reference.

Return 1 or 0 for true or false.

[0x214]@uvm.tr64.from_fp  (%val: double) -> tagref64
[0x215]@uvm.tr64.from_int (%val: int<52>) -> tagref64
[0x216]@uvm.tr64.from_ref (%ref: ref<void>, %tag: int<6>) -> tagref64
  • from_fp creates a tagref64 value from an FP number %val.
  • from_int creates a tagref64 value from an integer %val.
  • from_ref creates a tagref64 value from a reference %ref and the integer tag %tag.

Return the created tagref64 value.

[0x217]@uvm.tr64.to_fp  (%tr: tagref64) -> double
[0x218]@uvm.tr64.to_int (%tr: tagref64) -> int<52>
[0x219]@uvm.tr64.to_ref (%tr: tagref64) -> ref<void>
[0x21a]@uvm.tr64.to_tag (%tr: tagref64) -> int<6>
  • to_fp returns the FP number held by %tr.
  • to_int returns the integer held by %tr.
  • to_ref returns the reference held by %tr.
  • to_tag returns the integer tag held by %tr that accompanies the reference.

These instructions have undefined behaviour if %tr does not hold a value of the expected type.
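The relationship between the is_*, from_* and to_* instructions can be illustrated with an abstract model. This sketch does not reproduce the real bit-level encoding of tagref64, which this document does not describe. The class TagRef64 and all function bodies are assumptions made for illustration, and raising an exception is merely one way to make the undefined behaviour visible.

```python
from dataclasses import dataclass
from typing import Optional

INT52_MASK = (1 << 52) - 1  # int<52> payload
TAG6_MASK = (1 << 6) - 1    # int<6> tag


@dataclass(frozen=True)
class TagRef64:
    """Abstract stand-in for a tagref64 value (not the real encoding)."""
    kind: str                     # "fp", "int" or "ref"
    fp: float = 0.0
    int_val: int = 0
    ref: Optional[object] = None
    tag: int = 0


def from_fp(val: float) -> TagRef64:
    return TagRef64("fp", fp=float(val))


def from_int(val: int) -> TagRef64:
    return TagRef64("int", int_val=val & INT52_MASK)


def from_ref(ref: Optional[object], tag: int) -> TagRef64:
    return TagRef64("ref", ref=ref, tag=tag & TAG6_MASK)


def is_fp(tr: TagRef64) -> int:
    return 1 if tr.kind == "fp" else 0


def is_int(tr: TagRef64) -> int:
    return 1 if tr.kind == "int" else 0


def is_ref(tr: TagRef64) -> int:
    return 1 if tr.kind == "ref" else 0


def _expect(tr: TagRef64, kind: str) -> None:
    # Mu leaves this case undefined; the model raises so misuse is visible.
    if tr.kind != kind:
        raise ValueError(f"tagref64 holds {tr.kind}, expected {kind}")


def to_fp(tr: TagRef64) -> float:
    _expect(tr, "fp")
    return tr.fp


def to_int(tr: TagRef64) -> int:
    _expect(tr, "int")
    return tr.int_val


def to_ref(tr: TagRef64) -> Optional[object]:
    _expect(tr, "ref")
    return tr.ref


def to_tag(tr: TagRef64) -> int:
    _expect(tr, "ref")
    return tr.tag
```

A client would normally test the value with is_fp, is_int or is_ref before calling the matching to_* instruction.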

Math Instructions

TODO: Should provide enough math functions to support:

  1. Ordinary arithmetic and logical operations that throw exceptions on overflow. Examples: C# in checked mode, java.lang.Math.addExact added in Java 8.
  2. Floating point math functions. Example: trigonometric functions, testing NaN, fused multiply-add, ...

Deciding on a complete list of such functions requires further work. As a workaround for now, call native functions in libc or libm using CCALL.

Futex Instructions

See threads-stacks.rest for high-level descriptions about Futex.

Wait

[0x220]@uvm.futex.wait <T> (%loc: iref<T>, %val: T) -> int<32>
[0x221]@uvm.futex.wait_timeout <T> (%loc: iref<T>, %val: T, %timeout: int<64>) -> int<32>

T must be an integer type.

wait and wait_timeout check whether the memory location %loc still contains the value %val and, if so, put the current thread into the waiting queue of %loc. If %loc does not contain %val, they return immediately. These instructions are atomic.

  • wait waits indefinitely.
  • wait_timeout has an extra %timeout parameter which is a 64-bit unsigned integer that represents a time in nanoseconds. It specifies the duration of the wait.

Both instructions are allowed to spuriously wake up.

They return a signed integer which indicates the result of this call:

  • 0: the current thread is woken.
  • -1: the memory location %loc does not contain the value %val.
  • -2: spurious wakeup.
  • -3: timeout during waiting (wait_timeout only).

Wake

[0x222]@uvm.futex.wake <T> (%loc: iref<T>, %nthread: int<32>) -> int<32>

T must be an integer type.

wake wakes N threads in the waiting queue of the memory location %loc. This instruction is atomic.

N is the minimum value of %nthread and the actual number of threads in the waiting queue of %loc. %nthread is signed. Negative %nthread has undefined behaviour.

It returns the number of threads woken up.

Requeue

[0x223]@uvm.futex.cmp_requeue <T> (%loc_src: iref<T>, %loc_dst: iref<T>, %expected: T, %nthread: int<32>) -> int<32>

T must be an integer type.

cmp_requeue checks whether the memory location %loc_src still contains the value %expected and, if so, wakes up N threads from the waiting queue of %loc_src and moves all other threads in the waiting queue of %loc_src to the waiting queue of %loc_dst. If %loc_src does not contain the value %expected, it returns immediately. This instruction is atomic.

N is the minimum value of %nthread and the actual number of threads in the waiting queue of %loc_src. %nthread is signed. Negative %nthread has undefined behaviour.

It returns a signed integer. If %loc_src contains the value %expected, it returns the number of threads woken up; otherwise it returns -1.
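The queueing rules of wait, wake and cmp_requeue can be sketched as a single-threaded simulation. This is only a model under stated assumptions: the class FutexModel is hypothetical, threads are plain strings, waking in FIFO order is an assumption (this document does not specify an order), and blocking, timeouts and spurious wakeups are not simulated.

```python
from collections import defaultdict, deque

VALUE_MISMATCH = -1  # the location does not contain the expected value


class FutexModel:
    """Hypothetical single-threaded model of the futex waiting queues."""

    def __init__(self):
        self.memory = {}                  # location name -> integer value
        self.queues = defaultdict(deque)  # location name -> waiting threads
        self.woken = []                   # threads woken so far, in order

    def wait(self, thread, loc, val):
        """Enqueue thread on loc if memory[loc] == val, else return -1.

        Returns None when the thread is enqueued: in real Mu the thread
        would block here and get its result code when it wakes.
        """
        if self.memory.get(loc) != val:
            return VALUE_MISMATCH
        self.queues[loc].append(thread)
        return None

    def wake(self, loc, nthread):
        """Wake min(nthread, queue length) threads; return how many."""
        if nthread < 0:
            raise ValueError("negative %nthread is undefined behaviour in Mu")
        n = min(nthread, len(self.queues[loc]))
        for _ in range(n):
            self.woken.append(self.queues[loc].popleft())
        return n

    def cmp_requeue(self, loc_src, loc_dst, expected, nthread):
        """Wake up to nthread waiters of loc_src, move the rest to loc_dst."""
        if self.memory.get(loc_src) != expected:
            return VALUE_MISMATCH
        n = self.wake(loc_src, nthread)
        while self.queues[loc_src]:
            self.queues[loc_dst].append(self.queues[loc_src].popleft())
        return n


m = FutexModel()
m.memory["a"] = 0
m.memory["b"] = 0
for t in ("t1", "t2", "t3"):
    m.wait(t, "a", 0)                      # all three are enqueued on "a"
print(m.cmp_requeue("a", "b", 0, 1))       # wakes one, moves two to "b"
```

In the model each operation runs to completion before the next starts, which stands in for the atomicity that the instructions guarantee.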

Miscellaneous Instructions

[0x230]@uvm.kill_dependency <T> (%val: T) -> T

Return the same value as %val, but %val does not carry a dependency to the return value.

NOTE: This is supposed to free the compiler from keeping dependencies in some performance-critical cases.

Native Interface

Object pinning

[0x240]@uvm.native.pin   <T> (%opnd: T) -> uptr<U>
[0x241]@uvm.native.unpin <T> (%opnd: T)

T must be ref<U> or iref<U> for some U.

  • pin adds one instance of the reference %opnd to the pinning multiset of the current thread. Returns the mapped pointer to the bytes for the memory location. If T is ref<U>, it is equivalent to pinning the memory location of the whole object (as returned by the GETIREF instruction). If %opnd is NULL, the result is a null pointer whose address is 0.
  • unpin removes one instance of the reference %opnd from the pinning multiset of the current thread. It has undefined behaviour if no such instance exists.
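The multiset semantics mean that pin and unpin must be balanced: pinning the same reference twice requires two unpins. A minimal sketch of one thread's pinning multiset follows. The class PinningMultiset is hypothetical, references are represented by arbitrary hashable keys, and raising an exception stands in for the undefined behaviour.

```python
from collections import Counter


class PinningMultiset:
    """Hypothetical model of the pinning multiset of one thread."""

    def __init__(self):
        self._pins = Counter()

    def pin(self, ref):
        """Add one instance of ref to the multiset."""
        self._pins[ref] += 1

    def unpin(self, ref):
        """Remove one instance of ref; undefined in Mu if none exists."""
        if self._pins[ref] == 0:
            raise ValueError("unpin without a matching pin")
        self._pins[ref] -= 1
        if self._pins[ref] == 0:
            del self._pins[ref]

    def contains(self, ref):
        """True while at least one instance of ref remains."""
        return self._pins[ref] > 0


pins = PinningMultiset()
pins.pin("obj")
pins.pin("obj")
pins.unpin("obj")
print(pins.contains("obj"))  # one instance is still in the multiset
```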

Mu function exposing

[0x242]@uvm.native.expose [callconv] <[sig]> (%func: funcref<sig>, %cookie: int<64>) -> U

callconv is a platform-specific calling convention flag. U is determined by the calling convention and sig.

expose exposes the Mu function %func as a value according to the calling convention callconv, with the cookie %cookie attached.

Example:

.funcdef @foo VERSION ... <@foo_sig> (...) { ... }

%ev = COMMINST @uvm.native.expose [#DEFAULT] <[@foo_sig]>

[0x243]@uvm.native.unexpose [callconv] (%value: U)

callconv is a platform-specific calling convention flag. U is determined by the calling convention.

unexpose removes the exposed value.

[0x244]@uvm.native.get_cookie () -> int<64>

If a Mu function is called via its exposed value, this instruction returns the attached cookie. Otherwise it returns an arbitrary value.

Metacircular Client Interface

These are additional instructions that enable Mu IR programs to behave like a client.

Some types and signatures are pre-defined. They are always available. Note that the following are not strict text IR syntax because some types are defined in line:

.typedef @uvm.meta.bytes   = hybrid <int<64> int<8>>    // ID: 0x260
.typedef @uvm.meta.bytes.r = ref<@uvm.meta.bytes>       // ID: 0x261
.typedef @uvm.meta.refs    = hybrid <int<64> ref<void>> // ID: 0x262
.typedef @uvm.meta.refs.r  = ref<@uvm.meta.refs>        // ID: 0x263

.funcsig @uvm.meta.trap_handler.sig       = (stackref int<32> ref<void>) -> ()   // ID: 0x264

In bytes and refs, the fixed part is the length of the variable part. bytes represents a byte array. ASCII strings are also represented this way.

ID/name conversion

[0x250]@uvm.meta.id_of (%name: @uvm.meta.bytes.r) -> int<32>
[0x251]@uvm.meta.name_of (%id: int<32>) -> @uvm.meta.bytes.r
  • id_of converts a textual Mu name %name to the numerical ID. The name must be a global name.
  • name_of converts the ID %id to its corresponding name. If the name does not exist, it returns NULL. The returned object must not be modified.

They have undefined behaviour if the name or the ID in the argument does not exist, or if %name is NULL.

Bundle/HAIL loading

[0x252]@uvm.meta.load_bundle (%buf: @uvm.meta.bytes.r)
[0x253]@uvm.meta.load_hail   (%buf: @uvm.meta.bytes.r)

load_bundle and load_hail load Mu IR bundles and HAIL scripts, respectively. %buf is the content.

TODO: These comminsts should be made optional, and the IR Builder API should be provided as comminsts, too.

Stack introspection

[0x254]@uvm.meta.new_cursor         (%stack: stackref) -> framecursorref
[0x255]@uvm.meta.next_frame         (%cursor: framecursorref)
[0x256]@uvm.meta.copy_cursor        (%cursor: framecursorref) -> framecursorref
[0x257]@uvm.meta.close_cursor       (%cursor: framecursorref)

In all cases, %cursor and %stack cannot be NULL.

  • new_cursor allocates a frame cursor, referring to the top frame of %stack. Returns the frame cursor reference.
  • next_frame moves the frame cursor so that it refers to the frame below its current frame.
  • copy_cursor allocates a frame cursor which refers to the same frame as %cursor. Returns the frame cursor reference.
  • close_cursor deallocates the cursor.

[0x258]@uvm.meta.cur_func           (%cursor: framecursorref) -> int<32>
[0x259]@uvm.meta.cur_func_ver       (%cursor: framecursorref) -> int<32>
[0x25a]@uvm.meta.cur_inst           (%cursor: framecursorref) -> int<32>
[0x25b]@uvm.meta.dump_keepalives    (%cursor: framecursorref) -> @uvm.meta.refs.r

These instructions operate on the frame referred to by %cursor. In all cases, %cursor cannot be NULL.

  • cur_func returns the ID of the function of the frame. Returns 0 if the frame is native.
  • cur_func_ver returns the ID of the current function version of the frame. Returns 0 if the frame is native, or the function of the frame is undefined.
  • cur_inst returns the ID of the current instruction of the frame. Returns 0 if the frame is just created, its function is undefined, or the frame is native.
  • dump_keepalives dumps the values of the keep-alive variables of the current instruction in the frame. If the function is undefined, the arguments are the keep-alive variables. Cannot be used on native frames. The return value is a list of object references, each of which refers to an object which has type T and contains value v, where T and v are the type and the value of the corresponding keep-alive variable, respectively.
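The cursor instructions above can be sketched as a walk over a list of frames. The model below is hypothetical: Frame, FrameCursor and function_ids are illustration-only names, and the stack is a Python list whose element 0 is the top frame. This document does not define an instruction for detecting the bottom frame, so the walk relies on the known length of the list, which is an assumption of the model.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Frame:
    func_id: int      # 0 models a native frame
    func_ver_id: int
    inst_id: int


class FrameCursor:
    """Hypothetical frame cursor; index 0 is the top frame."""

    def __init__(self, frames: List[Frame], index: int = 0):
        self.frames = frames
        self.index = index
        self.closed = False


def new_cursor(stack: List[Frame]) -> FrameCursor:
    return FrameCursor(stack, 0)


def next_frame(cursor: FrameCursor) -> None:
    cursor.index += 1  # move to the frame below the current one


def copy_cursor(cursor: FrameCursor) -> FrameCursor:
    return FrameCursor(cursor.frames, cursor.index)


def close_cursor(cursor: FrameCursor) -> None:
    cursor.closed = True


def cur_func(cursor: FrameCursor) -> int:
    return cursor.frames[cursor.index].func_id


def function_ids(stack: List[Frame]) -> List[int]:
    """Collect cur_func for every frame, from top to bottom."""
    cursor = new_cursor(stack)
    ids = []
    for depth in range(len(stack)):
        ids.append(cur_func(cursor))
        if depth + 1 < len(stack):
            next_frame(cursor)
    close_cursor(cursor)
    return ids


stack = [Frame(0x1003, 0x2003, 0x3003), Frame(0, 0, 0), Frame(0x1001, 0x2001, 0x3001)]
print([hex(i) for i in function_ids(stack)])
```

Because copy_cursor creates an independent cursor, a client can keep one cursor at a frame of interest while another continues down the stack.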

On-stack replacement

[0x25c]@uvm.meta.pop_frames_to (%cursor: framecursorref)
[0x25d]@uvm.meta.push_frame <[sig]> (%stack: stackref, %func: funcref<sig>)

%cursor, %stack and %func must not be NULL.

  • pop_frames_to pops all frames above %cursor.
  • push_frame creates a new frame on top of the stack %stack for the current version of the Mu function %func. %func must have the signature sig.
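A typical on-stack replacement pops the frames of the old code and pushes a frame for the replacement function. The sketch below uses the same kind of list model, with element 0 as the top frame. The function names mirror the instructions, but the bodies, the integer cursor and the string frame labels are all assumptions made for illustration.

```python
from typing import List


def pop_frames_to(stack: List[str], cursor_index: int) -> None:
    """Remove all frames above the frame at cursor_index (0 is the top)."""
    del stack[:cursor_index]


def push_frame(stack: List[str], func: str) -> None:
    """Create a new frame for func on top of the stack."""
    stack.insert(0, func)


stack = ["@hot_loop_v1", "@caller", "@main"]
pop_frames_to(stack, 1)            # the cursor refers to "@caller"
push_frame(stack, "@hot_loop_v2")  # the replacement frame is now on top
print(stack)
```

The frame the cursor refers to is kept, so after pop_frames_to it becomes the new top frame until push_frame adds another one.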

Watchpoint operations

[0x25e]@uvm.meta.enable_watchpoint  (%wpid: int<32>)
[0x25f]@uvm.meta.disable_watchpoint (%wpid: int<32>)
  • enable_watchpoint enables all watchpoints of watchpoint ID %wpid.
  • disable_watchpoint disables all watchpoints of watchpoint ID %wpid.

Trap handling

[0x260]@uvm.meta.set_trap_handler (%handler: funcref<@uvm.meta.trap_handler.sig>, %userdata: ref<void>)

This instruction registers a trap handler. %handler is the function to be called and %userdata will be its last argument when called.

This instruction overrides the trap handler registered via the C-based client API.

A trap handler takes three parameters:

  1. The stack where the trap takes place.
  2. The watchpoint ID, or 0 if triggered by the TRAP instruction.
  3. The user data, which is provided when registering.

A trap handler is run by the same Mu thread that caused the trap and is executed on a new stack.

A trap handler usually terminates either by executing the @uvm.thread_exit instruction (probably also killing the old stack before exiting), or by executing SWAPSTACK back to another stack while killing the stack the trap handler was running on.
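The registration and the three handler parameters can be sketched as follows. The class TrapRegistry and its trigger method are hypothetical, and the model only shows how the arguments are passed. It does not simulate running the handler on a new stack or on the thread that caused the trap.

```python
class TrapRegistry:
    """Hypothetical model of trap handler registration and dispatch."""

    def __init__(self):
        self._handler = None
        self._userdata = None

    def set_trap_handler(self, handler, userdata):
        """Register handler; a later registration replaces an earlier one."""
        self._handler = handler
        self._userdata = userdata

    def trigger(self, stack, wpid=0):
        """Call the handler; wpid is 0 when the TRAP instruction triggers it."""
        if self._handler is None:
            raise RuntimeError("no trap handler registered")
        self._handler(stack, wpid, self._userdata)


log = []
registry = TrapRegistry()
registry.set_trap_handler(lambda stack, wpid, userdata: log.append((stack, wpid, userdata)), "my-data")
registry.trigger("stack-A")          # as if caused by a TRAP instruction
registry.trigger("stack-B", wpid=7)  # as if caused by watchpoint 7
print(log)
```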

Notes about dynamism

These additional instructions are not dynamic. Unlike the C-based API, these instructions do not use handles. Arguments, such as the additional arguments of push_frame, are also statically typed. If the client needs dynamically typed handles, it can always make its own. For example, push_frame can be wrapped by a Mu function which takes a dynamic argument list, checks the argument types, and executes a static @uvm.meta.push_frame instruction on the unboxed values.
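The wrapper idea can be sketched as a function that checks boxed arguments against the expected types before delegating to a statically typed operation. Everything here is hypothetical: push_frame_dynamic, the (type name, value) boxing scheme and the static_push callable only illustrate the pattern and are not part of Mu.

```python
from typing import Any, Callable, List, Tuple


def push_frame_dynamic(
    expected_types: List[str],
    boxed_args: List[Tuple[str, Any]],
    static_push: Callable[..., Any],
) -> Any:
    """Check dynamically typed arguments, then call the static operation.

    Each boxed argument is a (type name, value) pair. static_push stands in
    for a statically typed instruction that takes the unboxed values.
    """
    if len(boxed_args) != len(expected_types):
        raise TypeError("wrong number of arguments")
    unboxed = []
    for expected, (actual, value) in zip(expected_types, boxed_args):
        if actual != expected:
            raise TypeError(f"expected {expected}, got {actual}")
        unboxed.append(value)
    return static_push(*unboxed)


result = push_frame_dynamic(
    ["int<64>", "double"],
    [("int<64>", 7), ("double", 1.5)],
    lambda a, b: ("pushed", a, b),
)
print(result)
```

The type check happens once in the wrapper, so the static operation itself never sees a mistyped value.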

Some dynamic lookups, such as looking up constants by ID, are not available either. This can be worked around by maintaining a HashMap<id,value> (in the form of Mu IR programs) which is updated with each bundle loading. In other words, if the client does not maintain such a map, Mu will have to maintain it for the client.
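The map-based workaround can be sketched as a table that the client updates every time it loads a bundle. The class ConstantTable and its methods are hypothetical. In a metacircular client the same structure would be written as Mu IR rather than Python.

```python
from typing import Any, Dict


class ConstantTable:
    """Hypothetical client-side map from constant ID to value."""

    def __init__(self):
        self._by_id: Dict[int, Any] = {}

    def on_bundle_loaded(self, constants: Dict[int, Any]) -> None:
        """Record the constants defined by a newly loaded bundle."""
        self._by_id.update(constants)

    def lookup(self, const_id: int) -> Any:
        """Return the value for const_id; raises KeyError if unknown."""
        return self._by_id[const_id]


table = ConstantTable()
table.on_bundle_loaded({0x10001: 42, 0x10002: 3.14})
table.on_bundle_loaded({0x10003: "hello"})
print(table.lookup(0x10001))
```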