-
Notifications
You must be signed in to change notification settings - Fork 6
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
memcached #40
Comments
The vectorization and intrinsics only seem to be used by the vendored xxHash library that provides alternative intrisincs-free implementations that we could analyze instead. |
On tag ./autogen.sh
./configure
bear -- make
#if defined(__GNUC__)
# if defined(__AVX2__)
# include <immintrin.h>
# elif defined(__SSE2__)
# include <emmintrin.h>
# elif defined(__ARM_NEON__) || defined(__ARM_NEON)
# define inline __inline__ /* circumvent a clang bug */
# include <arm_neon.h>
# undef inline
# endif
# elif defined(_MSC_VER)
# include <intrin.h>
#endif With option goblint compile_commands.json --conf goblint.json With this conf{
"ana": {
"activated": ["base", "mallocWrapper", "escape", "mutex", "mutexEvents", "access", "race", "assert"],
"base": {
"privatization": "none",
"context": {
"non-ptr": false
}
},
"thread": {
"domain": "plain",
"include-node": false
},
"race": {
"free": false
}
},
"sem": {
"unknown_function": {
"spawn": false,
"invalidate": {
"globals": false
}
}
},
"exp": {
"earlyglobs": true
},
"warn": {
"assert": false,
"behavior": false,
"integer": false,
"cast": false,
"race": true,
"deadcode": false,
"analyzer": false,
"unsound": true,
"program": false,
"imprecise": true,
"unknown": false,
"error": true,
"warning": true,
"info": false,
"debug": false,
"success": false
},
"pre": {
"cppflags": [
"-DXXH_VECTOR=0"
]
}
} analyzed for 2h13min, produced an output file of 1.39GB with 775 race warnings 🙃. First time I tried with the Also, the race warnings are awfully long due to the locksets, e.g.:
|
The locksets look a bit curious: Is it plausible that there a code paths which acquire 255 distinct locks and never release any of them? |
Other thing: It might be worth trying with at least a simple thread analysis: Maybe the first two accesses can be excluded based on MHP, and then the other ones would be fine. |
I was able to isolate an example of why these awfully long locksets happen: Example#include <goblint.h>
#include <pthread.h>
/* Slab sizing definitions. */
#define POWER_SMALLEST 1
#define POWER_LARGEST 16 /* actual cap is 255 */
#define LARGEST_ID POWER_LARGEST
/* Locks for cache LRU operations */
pthread_mutex_t lru_locks[POWER_LARGEST];
static pthread_t item_crawler_tid;
int myglobal;
void memcached_thread_init(int nthreads, void *arg) {
int i;
for (i = 0; i < POWER_LARGEST; i++) {
pthread_mutex_init(&lru_locks[i], NULL);
}
}
void *item_crawler_thread(void *arg) {
int i;
for (i = POWER_SMALLEST; i < LARGEST_ID; i++) {
pthread_mutex_lock(&lru_locks[i]);
myglobal = myglobal + 1; // RACE!
}
return NULL;
}
int main(void) {
int i;
int num_threads = rand();
memcached_thread_init(num_threads, NULL);
pthread_create(&item_crawler_tid, NULL, item_crawler_thread, NULL);
for (i = POWER_SMALLEST; i < LARGEST_ID; i++) {
pthread_mutex_lock(&lru_locks[i]);
myglobal = myglobal + 1; // RACE!
pthread_mutex_unlock(&lru_locks[i]);
}
pthread_join(&item_crawler_tid, NULL);
return 0;
} It happens when one forgets to unlock a mutex in a loop. In void *item_crawler_thread(void *arg) {
int i, r1, r2;
for (i = POWER_SMALLEST; i < LARGEST_ID; i++) {
pthread_mutex_lock(&lru_locks[i]);
myglobal = myglobal + 1; // RACE!
if (r1) {
pthread_mutex_unlock(&lru_locks[i]);
continue;
}
if (r2) {
int x = 0;
}
}
return NULL;
} Also, we might want to add a warning similar to the one in |
The place where these locksets arise involves a loop with lots of different if (!active_crawler_mod.mod->needs_lock) {
pthread_mutex_unlock(&lru_locks[i]);
}
active_crawler_mod.mod->eval(&active_crawler_mod, search, hv, i);
if (hold_lock)
item_trylock_unlock(hold_lock);
if (active_crawler_mod.mod->needs_lock) {
pthread_mutex_unlock(&lru_locks[i]);
} Depending on the value of the global So it's a weird combination of our imprecision on values (not refining globals) and precision on mutexes (path-sensitivity) that leads to such explosion. It doesn't have an easy solution though, because it boils down to goblint/analyzer#101. Path-sensitivity would need a proper widening that can widen growing cardinality of path sets like this. Currently there's no widening across path buckets. |
Some more findings. I was curious how the analysis would behave when, instead of analyzing based on a compilation database, we would analyze a program that has been merged with CIL (similar to the Concrat benchmarks). I merged the project with: Then ran Goblint with: goblint compile_commands.json --conf examples/large-program.json
--set pre.cppflags[+] "-DXXH_VECTOR=0" --enable warn.deterministic
--disable sem.unknown_function.spawn --disable sem.unknown_function.invalidate.globals
|& tee nonmerged.txt and goblint memcachedcil.c --conf examples/large-program.json
--set pre.cppflags[+] "-DXXH_VECTOR=0" --enable warn.deterministic
--disable sem.unknown_function.spawn --disable sem.unknown_function.invalidate.globals
|& tee merged.txt The analysis with the compilation database took 1h and 15 min, and the analysis on CIL merged took 1h and 3 min. There was no significant performance gain. However, it is a bit suspicious that there is a slightly different number of warnings.
Some of the differences can be since for analysis with compilation DB, there are many warnings of the following type:
but analyzing the merged one does not have them. This might be that CIL actually renames these, and the latter could be more precise? At the same time, @sim642 implemented an option not to track the indexes of locks, which made the analysis so much faster. The analysis for both runs was completed with 30sec 🤯. Although this option breaks one regression test, the number of warnings in this project stayed the same. A big win. There are also a bunch of unknown functions, making the analysis imprecise and due to the time restrictions so far, I have only used a minimal set (the one in large-program conf) of analyses. I will now try and configure a bit to see how much we can decrease the number of warnings with easy stuff. |
https://github.com/memcached/memcached
Initial attempt
Goblint version: heads/master-0-ge223b4f36-dirty.
Checked out git tag
1.6.17
and executed:Crashes due to parsing errors in intrinsics headers:
The text was updated successfully, but these errors were encountered: