-
Notifications
You must be signed in to change notification settings - Fork 41
Profiles and new platform support
If a program is strictly-conforming, then the basic C semantics with no extensions should be enough to interpret that program and detect undefined behavior. But the reality of how C is actually used means it is much more common for programs to depend on implementation-specific details of particular hardware or software platforms. Profiles are meant to allow the interpreter generated from the C semantics to analyze programs in the context of such platform-specific behavior.
Platform-specific extensions and behavior should optimally be confined to a profile. These behaviors can broadly be understood as behaviors originating from the compiler, the operating system, or the underlying hardware. We have broken them down into more specific categories below.
- Command-line interface/flags -- the specifics of the interface a particular compiler uses need to be supported for RV-Match to be able to replace that compiler in a build system.
- Preprocessor macros and extensions -- the specific preprocessor used by an implementation and the value of feature-test macros.
- Implementation-specific choices -- the C standard leaves implementations with several choices, such as the size of
int
, that need to be made in order to interpret programs. - Language extensions: pragmas, attributes -- language extensions that generally don't require extra support by the parser.
- Language extensions: built-in functions -- language extensions that generally behave like function calls, so don't require extra support by the parser.
- Language extensions: new syntax -- language extensions involving novel syntax that requires extra support by the parser (e.g.,
gcc
's statement expressions: https://gcc.gnu.org/onlinedocs/gcc/Statement-Exprs.html#Statement-Exprs). - Operating system interface and libraries -- e.g., the Linux system call interface or POSIX library functions.
- Native execution/FFI support -- the components necessary for executing functions on the native platform when they can't be analyzed by Match (e.g., when they contain inline assembly).
The list of installed profiles can be queried with kcc -version
. We currently support a few different profiles, all based on gcc and x86_64:
-
x86_64-linux-gcc-glibc
-- the default profile intended to act as a drop-in replacement for gcc with-std=c11
set. -
x86-gcc-limited-libc
-- like thex86_64-linux-gcc-glibc
profile, except that it has limited support for the C standard library because it does not use the standard library implementation installed on the system. This is similar to the profile the open source semantics uses: https://github.com/kframework/c-semantics/tree/master/profiles/x86-gcc-limited-libc -
x86_64-linux-gcc-glibc-reverse-eval-order
-- like the abovex86_64-linux-gcc-glibc
profile, except rules are evaluated in the reverse order, which can sometimes find errors due to non-determinately sequenced expressions. -
x86_64-linux-gcc-glibc-gnuc
andx86_64-linux-gcc-glibc-gnuc-reverse-eval-order
-- these are profiles are both analogous to the similarly-named ones above, except they are meant to behave like gcc with-std=gnu11
, which is the default behavior of gcc and might be required in certain cases. Functionally, these profiles provide the same GNUC language extensions as the ones above, the only difference is the value of feature-test macros.
In the past, we have also experimented with an 78k0r profile.
All of the above profiles are based on the behavior of gcc. Here we outline the extensions to the basic C semantics these profiles support in each of the categories mentioned above.
Implemented as: extensions to the kcc
script, not in the profile.
The command line interface to Match is provided by the kcc
script, which is intended to be compatible with gcc
. It supports many of gcc's most commonly-used flags in addition to flags specific to Match. Supporting other another compiler's interface would generally require writing a similar wrapper script.
The current state of the command line interface supported by kcc
is described by kcc --help
. This piece of platform-specific behavior is not kept in profiles and is the same for all of them (though some flags are not supported by the open source profile). The actual command line parsing done by kcc
happens here:
https://github.com/kframework/c-semantics/blob/master/scripts/RV_Kcc/Opts.pm
Extending kcc
to support more flags would start by updating this file with the new flags.
Implemented as: preprocessor script in the profile.
Currently, kcc
uses gcc
as a C preprocessor. The default profile also unsets the __GNUC__
feature-test macro in order to discourage the use of GNU language extensions.
The preprocessor used by Match is set by the profile. E.g., this is the preprocessor used by the open source profile:
https://github.com/kframework/c-semantics/blob/master/profiles/x86-gcc-limited-libc/pp
Implemented as: semantic extensions in the profile.
Implementation-specific behavior described in the standard are set in various *-settings.k
files in the profile, e.g.: https://github.com/kframework/c-semantics/blob/master/profiles/x86-gcc-limited-libc/semantics/c-settings.k
For example:
rule cfg:sizeut => unsigned-long-int
rule cfg:wintut => unsigned-int
rule cfg:wcharut => int
rule cfg:ptrsize => 4
rule cfg:ptrdiffut => int
rule cfg:alignofBool => 1
rule cfg:alignofSignedChar => 1
rule cfg:alignofShortInt => 2
rule cfg:alignofInt => 4
rule cfg:alignofLongInt => 4
rule cfg:alignofLongLongInt => 8
rule cfg:alignofOversizedInt => 16
rule cfg:alignofFloat => 4
rule cfg:alignofDouble => 8
rule cfg:alignofLongDouble => 16
rule cfg:alignofOversizedFloat => 16
rule cfg:alignofPointer => 4
// Max of the above.
rule cfg:alignofMalloc => 16
Implemented as: semantic extensions in the profile.
Some gcc
attributes we support:
-
__attribute__((packed))
,__attribute__((__packed__))
Some kcc
attributes we support:
kcc_absolute
kcc_extends_tu_of
Implemented as: semantic extensions (and sometimes C source) in the profile.
List of gcc
built-in functions for which we have some explicit semantic support:
__builtin_abort
__builtin_aligned_alloc
__builtin_calloc
__builtin_exit
__builtin_free
__builtin_malloc
__builtin_realloc
__builtin_asin
__builtin_atan2
__builtin_atan
__builtin_cos
__builtin_exp
__builtin_floor
__builtin_fmod
__builtin_log
__builtin_sin
__builtin_sqrt
__builtin_tan
__builtin_longjmp
__builtin_setjmp
__builtin_fprintf
__builtin_fputc
__builtin_fputs
__builtin_fscanf
__builtin_fwrite
__builtin_printf
__builtin_putc
__builtin_putchar
__builtin_puts
__builtin_scanf
__builtin_snprintf
__builtin_sprintf
__builtin_sscanf
__builtin_vfprintf
__builtin_vfscanf
__builtin_vprintf
__builtin_vscanf
__builtin_vsnprintf
__builtin_vsprintf
__builtin_vsscanf
__builtin_memset
__builtin_strcpy
__builtin_strlen
__builtin_alloca
__builtin_offsetof
Some kcc
built-ins (also, generally any gcc
built-in with the __builtin_
prefix replaced with __kcc_
):
__kcc_breakpoint()
Additionally, we auto-generate wrappers to implement many of the gcc builtin functions we do not have explicit semantics for. An incomplete list of gcc
builtin functions here:
https://gcc.gnu.org/onlinedocs/gcc/Other-Builtins.html
Implemented as: semantic extensions in the profile and parser extensions (outside profiles).
Some gcc
language extensions we support:
- Case ranges.
- Statement expressions.
typeof
- Pointer arithmetic on
void
and function pointers. - Zero length arrays.
-
__int128
,__int128_t
,__uint128_t
__float128
-
auto
type.
Also, types, misc kcc
stuff:
-
__kcc_int{8,16,32,64}_t
,__kcc_uint{8,16,32,64}_t
List of gcc
language extensions here:
https://gcc.gnu.org/onlinedocs/gcc/C-Extensions.html
Implemented as: semantic extensions in the profile.
For Match to detect errors or misuse of APIs, it needs semantics. In addition to part of the C standard library (the semantics of which is https://github.com/kframework/c-semantics/tree/master/semantics/c/library), the proprietary extensions to the C semantics include (at least partial) semantics for the POSIX Threads API.
Implemented as: part of the profile; a natively-compiled program running on the target platform.