TCG Plugins vs PANDA Plugins

Andrew Fasano edited this page Nov 14, 2022 · 2 revisions

QEMU TCG Plugin Overview

This writeup describes the TCG plugin interface available in QEMU 6+ and discusses how it compares to the PANDA plugin architecture.

Overview

TCG plugins were added to QEMU in v4.2.0 to allow "users to run experiments taking advantage of the total system control emulation can have over a guest". Plugins compile into a single shared object (for all guest architectures) and are compatible across QEMU builds and versions so long as the plugin API version remains unchanged. Since its initial release, the TCG plugin API has expanded significantly, but as of v7.1 it still lacks a few key features.

The TCG plugin system is currently maintained by Alex Bennée. The design and APIs are documented in docs/devel/tcg-plugins.rst.

Building Plugins

Configure QEMU with ./configure --enable-plugins, then build the emulator with make. (Protip: for faster testing, configure from a build directory with ../configure --enable-plugins --target-list=x86_64-softmmu --disable-docs.)

In-tree plugins live in contrib/plugins/. Running make plugins will build all the plugins listed in the NAMES variable defined in contrib/plugins/Makefile. Out-of-tree plugins can also be built against the TCG Plugin APIs.

Running Plugins

After building QEMU with plugin support, run a qemu-system binary with the -plugin argument followed by the path to a plugin. In-tree plugins are built in build/contrib/plugins/plugin_name.so.

For example (from the build directory):

./qemu-system-x86_64 -plugin ./contrib/plugins/my_plugin.so ...

Arguments can be passed by adding a comma and then key=value pairs after the plugin path. For example:

./qemu-system-x86_64 -plugin ./contrib/plugins/my_plugin.so,arg1=value,arg2=anotherval ...

Plugin Architecture

The Plugin API is defined in include/qemu/qemu-plugin.h.

Developing Plugins

Plugin Skeleton

#include <qemu-plugin.h>

QEMU_PLUGIN_EXPORT int qemu_plugin_version = QEMU_PLUGIN_VERSION;

QEMU_PLUGIN_EXPORT
int qemu_plugin_install(qemu_plugin_id_t id, const qemu_info_t *info,
                        int argc, char **argv)
{
    /* Printed only when QEMU is run with -d plugin. */
    const char msg[] = "hello world\n";
    qemu_plugin_outs(msg);
    return 0;
}

The QEMU_PLUGIN_EXPORT macro marks a function or variable as being visible. Functions and variables marked with this can be accessed or called by the emulator core.

All plugins define the API version they were built against with qemu_plugin_version.
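One way to picture the version check is the sketch below. It assumes the loader accepts a plugin whose declared version falls between a minimum and the current API version, matching the minimum/maximum range reported through the info argument of qemu_plugin_install. The struct and function names are hypothetical stand-ins for illustration, not part of the QEMU API.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the supported API version range. */
struct api_version_range {
    int min; /* oldest plugin API version still accepted */
    int cur; /* newest plugin API version this build implements */
};

/* Assumed rule: a plugin loads only if its version is within [min, cur]. */
static bool plugin_version_ok(int plugin_version, struct api_version_range r)
{
    return plugin_version >= r.min && plugin_version <= r.cur;
}
```

Under this assumption, a plugin built against a newer API than the running emulator supports would be rejected at load time rather than failing later.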

qemu_plugin_install

When a plugin is loaded, its qemu_plugin_install function will be run.

  • The id argument contains a unique ID for this plugin.
  • The info argument provides information about QEMU and the guest, such as the architecture, the number of CPUs, and the minimum and maximum supported plugin API versions.
  • The argc count and argv array describe the arguments passed to the plugin.

If this function returns 0, the plugin has loaded successfully, otherwise an error has occurred.
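As a rough sketch of how the argc and argv arguments might be consumed, the helper below parses the arg1 and arg2 options from the command-line example earlier. It assumes each argv entry arrives as a single key=value string; the plugin_opts struct and both helper functions are hypothetical names. It uses only the C standard library so it can be compiled on its own, whereas the in-tree plugins typically lean on GLib for string handling.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical options for an illustrative plugin. */
struct plugin_opts {
    char arg1[32];
    char arg2[32];
};

/* If arg has the form "key=value", copy value into dst and return 1. */
static int take_opt(const char *arg, const char *key, char *dst, size_t len)
{
    size_t klen = strlen(key);

    if (strncmp(arg, key, klen) != 0 || arg[klen] != '=') {
        return 0;
    }
    snprintf(dst, len, "%s", arg + klen + 1);
    return 1;
}

/* Returns 0 on success, -1 if any argument is not recognized. */
static int parse_plugin_args(int argc, char **argv, struct plugin_opts *opts)
{
    for (int i = 0; i < argc; i++) {
        if (!take_opt(argv[i], "arg1", opts->arg1, sizeof(opts->arg1)) &&
            !take_opt(argv[i], "arg2", opts->arg2, sizeof(opts->arg2))) {
            fprintf(stderr, "unknown plugin argument: %s\n", argv[i]);
            return -1;
        }
    }
    return 0;
}
```

A plugin could call parse_plugin_args(argc, argv, &opts) at the top of qemu_plugin_install and return a non-zero value when it fails, so that a mistyped option is reported as a load error.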

qemu_plugin_outs

This function prints the given string (a const char *) if QEMU is run with -d plugin; otherwise it is a no-op.

Callback Registration

...

static void plugin_exit(qemu_plugin_id_t id, void *p)
{
    const char msg[] = "The plugin exit callback has triggered\n";
    qemu_plugin_outs(msg);
}

QEMU_PLUGIN_EXPORT
int qemu_plugin_install(qemu_plugin_id_t id, const qemu_info_t *info,
                        int argc, char **argv)
{
    qemu_plugin_register_atexit_cb(id, plugin_exit, NULL);
    return 0;
}

qemu_plugin_register_atexit_cb

The qemu_plugin_register_*_cb functions register a user-provided function to run on a callback event. This event, atexit, fires at the end of emulation. The registration function takes the plugin ID, the callback function, and a void * of custom user data that is passed to the callback (the same idea as the context argument added to PANDA callbacks in PR 1105).

Currently Supported TCG Callbacks

qemu_plugin_register_atexit_cb;             # Run when execution has finished
qemu_plugin_register_flush_cb;              # On TB flush

qemu_plugin_register_vcpu_exit_cb;          # vCPU exit?
qemu_plugin_register_vcpu_idle_cb;          # vCPU idle
qemu_plugin_register_vcpu_init_cb;          # vCPU created
qemu_plugin_register_vcpu_resume_cb;        # On CPU resume

qemu_plugin_register_vcpu_insn_exec_cb;     # Trigger CB on insn
qemu_plugin_register_vcpu_insn_exec_inline; # Run inline TCG ops on insn

qemu_plugin_register_vcpu_mem_cb;           # Trigger CB on mem access
qemu_plugin_register_vcpu_mem_inline;       # Run inline TCG ops on mem access

qemu_plugin_register_vcpu_tb_exec_cb;       # Trigger CB before block exec
qemu_plugin_register_vcpu_tb_exec_inline;   # Run inline TCG ops before block exec

qemu_plugin_register_vcpu_syscall_cb;      # Before syscall
qemu_plugin_register_vcpu_syscall_ret_cb;  # On syscall return

qemu_plugin_register_vcpu_tb_trans_cb;     # Before block translate

Callback Deep Dive: tb_exec vs BEFORE_BLOCK_EXEC

Consider the following block counting plugin as an example (based on QEMU's bb.c). This is simplified to only support a single-core guest.

#include <glib.h>
#include <qemu-plugin.h>

QEMU_PLUGIN_EXPORT int qemu_plugin_version = QEMU_PLUGIN_VERSION;


static uint64_t bb_count = 0;
static uint64_t insn_count = 0;

static void plugin_exit(qemu_plugin_id_t id, void *p)
{
    // Print results
    g_autoptr(GString) report = g_string_new("");
    g_string_printf(report, "bb's: %" PRIu64 ", insns: %" PRIu64 "\n",
                    bb_count, insn_count);
    qemu_plugin_outs(report->str);
}

static void vcpu_tb_exec(unsigned int cpu_index, void *udata)
{
    // Update counters at the start of every block, size is passed from tb_trans
    uintptr_t n_insns = (uintptr_t)udata;
    insn_count += n_insns;
    bb_count++;
}


static void vcpu_tb_trans(qemu_plugin_id_t id, struct qemu_plugin_tb *tb)
{
    size_t n_insns = qemu_plugin_tb_n_insns(tb);

    // Trigger a callback before this block runs. Mark that this CB will
    // not read/write any guest registers. Pass n_insns as an argument.
    qemu_plugin_register_vcpu_tb_exec_cb(tb, vcpu_tb_exec,
                                         QEMU_PLUGIN_CB_NO_REGS,
                                         (void *)n_insns);
}

QEMU_PLUGIN_EXPORT int qemu_plugin_install(qemu_plugin_id_t id,
                                           const qemu_info_t *info,
                                           int argc, char **argv)
{
    qemu_plugin_register_vcpu_tb_trans_cb(id, vcpu_tb_trans);
    qemu_plugin_register_atexit_cb(id, plugin_exit, NULL);
    return 0;
}

This code uses three callbacks: tb_trans, tb_exec, and atexit. As with PANDA's BEFORE_BLOCK_TRANSLATE, tb_trans is triggered before each block is translated. When this callback triggers, the plugin can calculate the number of instructions in the block (here via an API function; this would be more complex in PANDA).

During block translation, each block is set to trigger the tb_exec function. Information from the tb_trans callback is passed through to the tb_exec function via the void* udata argument. This happens here because most of the TCG plugin logic is implemented through changing the TCG stream: when a block is translated to TCG, the calls to the user-provided tb_exec function are added to the IR.
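The udata hand-off relies on storing a small integer directly in the pointer value instead of pointing at allocated memory. A minimal sketch of that round trip is below; the two helper names are hypothetical, and the casts go through uintptr_t so the conversion is well defined on platforms where pointers and size_t differ in width.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: encode an instruction count in the udata pointer. */
static void *pack_insn_count(size_t n_insns)
{
    return (void *)(uintptr_t)n_insns;
}

/* Hypothetical helper: recover the count inside the exec callback. */
static size_t unpack_insn_count(void *udata)
{
    return (size_t)(uintptr_t)udata;
}
```

Because nothing is allocated, there is nothing to free when the translated block is discarded. Data larger than a pointer would instead need storage that outlives the translation callback.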

During each tb_exec callback, the counters are updated. In the full bb.c example plugin, this callback properly handles multiple guest CPUs and uses mutexes to ensure the results are calculated correctly. This example skips that for simplicity.
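For a sense of what multi-vCPU safety could look like, here is a sketch that swaps in C11 atomics in place of the mutex approach bb.c takes. The callback has the same signature as vcpu_tb_exec above; the counter and function names are hypothetical, and the block is written without the plugin header so it compiles standalone.

```c
#include <stdatomic.h>
#include <stdint.h>

/* Shared totals; static atomics start at zero. */
static atomic_uint_fast64_t bb_total;
static atomic_uint_fast64_t insn_total;

/* Safe to call concurrently from several vCPU threads. */
static void vcpu_tb_exec_atomic(unsigned int cpu_index, void *udata)
{
    (void)cpu_index; /* unused: all vCPUs share one pair of counters */
    atomic_fetch_add(&insn_total, (uintptr_t)udata);
    atomic_fetch_add(&bb_total, 1);
}
```

Each increment is atomic on its own, but the two counters are not updated as a pair, so a reader racing with a writer could see one updated before the other. That is harmless when the totals are only read in the atexit callback.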

Finally, in the atexit callback, the results are printed. Note that this information will only be shown if QEMU is run with the -d plugin flag.

Missing Features

Plugins cannot directly interact with core QEMU internals; they must make requests through the TCG Plugin API. The following features, which PANDA plugins often use, are currently missing:

  • Reading guest registers and memory. Alex Bennée is working on a branch with support for this.
  • Writing guest registers and memory.
  • Inter-plugin interactions (e.g., PANDA's PPP). I've submitted an RFC about this and gotten good feedback from Alex Bennée; I plan to address the suggested changes and try to get an interface for this merged soon.

Design of TCG Plugins vs PANDA Plugins

  • TCG plugins compile to a single shared object instead of one per-architecture.
  • TCG plugins must use the provided (versioned) API to interact with the emulated guest; plugins cannot access CPUState directly as they do in PANDA. As such, plugins are portable across emulator versions so long as the API version remains stable.
  • TCG plugins are mostly based around inserting new logic into the TCG stream rather than into core emulation logic. There is also some support for injecting a plugin's analysis directly into the TCG stream as inline operations instead of calling into the plugin's compiled code (this looks promising).

Differences Between Plugin Systems

  • TCG plugins support multi-core guests. PANDA is limited to single core.
  • TCG plugins support both system and user-mode emulation. PANDA only supports system mode.
  • TCG plugins have 15 callbacks. PANDA has ~40.
  • TCG plugins cannot currently read/write guest registers/memory.
  • TCG plugins cannot currently interact with other plugins.

Andrew's Recommendation

The TCG plugin system is a well-architected, stable interface for plugins to introspect on QEMU guests. Plugins built atop this interface are generally portable across QEMU versions. Providing PANDA analyses as a set of TCG plugins would ensure that these analyses can run with modern QEMU with a minimal maintenance burden for us.

The TCG plugin interface needs a few more APIs and callbacks before we could port most PANDA analyses to it. If we were to work with upstream to address these shortcomings, the long term maintenance burden on us would be much lower. We could try merging PANDA plugins in tree or keep them in a fork. At a minimum, I suspect we'll want to merge a few to demonstrate how to use the APIs we'd add. If any plugins live out of tree, we would need to update them whenever the TCG API version changes. If they live in tree, we may be responsible for their maintenance though I expect this to be relatively little work.

If the upstream maintainers are unwilling to merge one or more API functions we require (perhaps they'll be opposed to modifying guest state from a plugin?), we could maintain a QEMU patch to add these APIs. This patch would be relatively small but would require updates to keep in sync with upstream. We wouldn't be able to merge any plugins that depend on this patch into QEMU.

I think we should work with upstream to expand the TCG plugin APIs as necessary and port PANDA plugins to use these interfaces. I think we should try getting as many of our plugins merged into upstream as they'll take. This will require some up-front work but provide many long term benefits such as reducing our maintenance burden, getting our analyses integrated with a much larger community, and keeping our analyses up to date with QEMU in the future.

Limitations of this analysis

The following PANDA features are unrelated to this discussion and wouldn't benefit from moving PANDA analyses to the TCG system: PANDA's record/replay system, PANDA's LLVM mode (as required by our taint plugins), and PyPanda.