Skip to content

Profiles and new platform support

Chris Hathhorn edited this page Jun 4, 2019 · 12 revisions

Match/C-semantics profiles

If a program is strictly-conforming, then the basic C semantics with no extensions should be enough to interpret that program and detect undefined behavior. But the reality of how C is actually used means it is much more common for programs to depend on implementation-specific details of particular hardware or software platforms. Profiles are meant to allow the interpreter generated from the C semantics to analyze programs in the context of such platform-specific behavior.

Platform-specific extensions

Platform-specific extensions and behavior should optimally be confined to a profile. These behaviors can broadly be understood as behaviors originating from the compiler, the operating system, or the underlying hardware. We have broken them down into more specific categories below.

  • Command-line interface/flags -- the specifics of the interface a particular compiler uses need to be supported for RV-Match to be able to replace that compiler in a build system.
  • Preprocessor macros and extensions -- the specific preprocessor used by an implementation and the value of feature-test macros.
  • Implementation-specific choices -- the C standard leaves implementations with several choices, such as the size of int, that need to be made in order to interpret programs.
  • Language extensions: pragmas, attributes -- language extensions that generally don't require extra support by the parser.
  • Language extensions: built-in functions -- language extensions that generally behave like function calls, so don't require extra support by the parser.
  • Language extensions: new syntax -- language extensions involving novel syntax that requires extra support by the parser (e.g., gcc's statement expressions: https://gcc.gnu.org/onlinedocs/gcc/Statement-Exprs.html#Statement-Exprs).
  • Operating system interface and libraries -- e.g., the Linux system call interface or POSIX library functions.
  • Native execution/FFI support -- the components necessary for executing functions on the native platform when they can't be analyzed by Match (e.g., when they contain inline assembly).

Current profiles

The list of installed profiles can be queried with kcc -version. We currently support a few different profiles, all based on gcc and x86_64:

  • x86_64-linux-gcc-glibc -- the default profile intended to act as a drop-in replacement for gcc with -std=c11 set.
  • x86-gcc-limited-libc -- like the x86_64-linux-gcc-glibc profile, except that it has limited support for the C standard library because it does not use the standard library implementation installed on the system. This is similar to the profile the open source semantics uses: https://github.com/kframework/c-semantics/tree/master/profiles/x86-gcc-limited-libc
  • x86_64-linux-gcc-glibc-reverse-eval-order -- like the above x86_64-linux-gcc-glibc profile, except rules are evaluated in the reverse order, which can sometimes find errors due to non-determinately sequenced expressions.
  • x86_64-linux-gcc-glibc-gnuc and x86_64-linux-gcc-glibc-gnuc-reverse-eval-order -- these are profiles are both analogous to the similarly-named ones above, except they are meant to behave like gcc with -std=gnu11, which is the default behavior of gcc and might be required in certain cases. Functionally, these profiles provide the same GNUC language extensions as the ones above, the only difference is the value of feature-test macros.

In the past, we have also experimented with an 78k0r profile.

All of the above profiles are based on the behavior of gcc. Here we outline the extensions to the basic C semantics these profiles support in each of the categories mentioned above.

Command-line interface/flags

Implemented as: extensions to the kcc script, not in the profile.

The command line interface to Match is provided by the kcc script, which is intended to be compatible with gcc. It supports many of gcc's most commonly-used flags in addition to flags specific to Match. Supporting other another compiler's interface would generally require writing a similar wrapper script.

The current state of the command line interface supported by kcc is described by kcc --help. This piece of platform-specific behavior is not kept in profiles and is the same for all of them (though some flags are not supported by the open source profile). The actual command line parsing done by kcc happens here:

https://github.com/kframework/c-semantics/blob/master/scripts/RV_Kcc/Opts.pm

Extending kcc to support more flags would start by updating this file with the new flags.

Preprocessor macros and extensions

Implemented as: preprocessor script in the profile.

Currently, kcc uses gcc as a C preprocessor. The default profile also unsets the __GNUC__ feature-test macro in order to discourage the use of GNU language extensions.

The preprocessor used by Match is set by the profile. E.g., this is the preprocessor used by the open source profile:

https://github.com/kframework/c-semantics/blob/master/profiles/x86-gcc-limited-libc/pp

Implementation-specific choices

Implemented as: semantic extensions in the profile.

Implementation-specific behavior described in the standard are set in various *-settings.k files in the profile, e.g.: https://github.com/kframework/c-semantics/blob/master/profiles/x86-gcc-limited-libc/semantics/c-settings.k

For example:

     rule cfg:sizeut => unsigned-long-int
     rule cfg:wintut => unsigned-int
     rule cfg:wcharut => int
     rule cfg:ptrsize => 4
     rule cfg:ptrdiffut => int

     rule cfg:alignofBool => 1
     rule cfg:alignofSignedChar => 1
     rule cfg:alignofShortInt => 2
     rule cfg:alignofInt => 4
     rule cfg:alignofLongInt => 4
     rule cfg:alignofLongLongInt => 8
     rule cfg:alignofOversizedInt => 16
     rule cfg:alignofFloat => 4
     rule cfg:alignofDouble => 8
     rule cfg:alignofLongDouble => 16
     rule cfg:alignofOversizedFloat => 16
     rule cfg:alignofPointer => 4
     // Max of the above.
     rule cfg:alignofMalloc => 16

Language extensions: pragmas, attributes

Implemented as: semantic extensions in the profile.

Some gcc attributes we support:

  • __attribute__((packed)), __attribute__((__packed__))

Some kcc attributes we support:

  • kcc_absolute
  • kcc_extends_tu_of

Language extensions: built-in functions

Implemented as: semantic extensions (and sometimes C source) in the profile.

List of gcc built-in functions for which we have some explicit semantic support:

__builtin_abort
__builtin_aligned_alloc
__builtin_calloc
__builtin_exit
__builtin_free
__builtin_malloc
__builtin_realloc
__builtin_asin
__builtin_atan2
__builtin_atan
__builtin_cos
__builtin_exp
__builtin_floor
__builtin_fmod
__builtin_log
__builtin_sin
__builtin_sqrt
__builtin_tan
__builtin_longjmp
__builtin_setjmp
__builtin_fprintf
__builtin_fputc
__builtin_fputs
__builtin_fscanf
__builtin_fwrite
__builtin_printf
__builtin_putc
__builtin_putchar
__builtin_puts
__builtin_scanf
__builtin_snprintf
__builtin_sprintf
__builtin_sscanf
__builtin_vfprintf
__builtin_vfscanf
__builtin_vprintf
__builtin_vscanf
__builtin_vsnprintf
__builtin_vsprintf
__builtin_vsscanf
__builtin_memset
__builtin_strcpy
__builtin_strlen
__builtin_alloca
__builtin_offsetof

Some kcc built-ins (also, generally any gcc built-in with the __builtin_ prefix replaced with __kcc_):

__kcc_breakpoint()

Additionally, we auto-generate wrappers to implement many of the gcc builtin functions we do not have explicit semantics for. An incomplete list of gcc builtin functions here:

https://gcc.gnu.org/onlinedocs/gcc/Other-Builtins.html

Language extensions: new syntax

Implemented as: semantic extensions in the profile and parser extensions (outside profiles).

Some gcc language extensions we support:

  • Case ranges.
  • Statement expressions.
  • typeof
  • Pointer arithmetic on void and function pointers.
  • Zero length arrays.
  • __int128, __int128_t, __uint128_t
  • __float128
  • auto type.

Also, types, misc kcc stuff:

  • __kcc_int{8,16,32,64}_t, __kcc_uint{8,16,32,64}_t

List of gcc language extensions here:

https://gcc.gnu.org/onlinedocs/gcc/C-Extensions.html

Operating system API and libraries

Implemented as: semantic extensions in the profile.

For Match to detect errors or misuse of APIs, it needs semantics. In addition to part of the C standard library (the semantics of which is https://github.com/kframework/c-semantics/tree/master/semantics/c/library), the proprietary extensions to the C semantics include (at least partial) semantics for the POSIX Threads API.

Native execution/FFI support

Implemented as: part of the profile; a natively-compiled program running on the target platform.

The native server is a program intended to run on the native platform. It accepts compiled, executable program fragments from Match, executes the fragment, and returns the result to Match. We have an example implementation, with documentation, here:

https://github.com/kframework/c-semantics/tree/master/native-server

Clone this wiki locally