Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Generalize handling of .d files: Accept .d files with multiple targets, eat -MF arguments to preprocessor #557

Closed
wants to merge 3 commits into from

Conversation

michael-schwarz
Copy link
Member

Skip instead of failing if there is no .d file for some entry in the compilation database.
Needed for goblint/bench#9 after #535

@sim642
Copy link
Member

sim642 commented Jan 21, 2022

What type of files does this happen for?

I'm thinking maybe we should instead of #530 only include .c files and skip anything else. Because if they're some non-C stuff, we won't be able to parse them anyway.

@michael-schwarz
Copy link
Member Author

It is a perfectly normal .i file that we get as input, the source file is .c.

@sim642
Copy link
Member

sim642 commented Jan 21, 2022

I think then we should actually add support such that .i files bypass preprocessing and are directly returned as having been preprocessed. This also makes sense in SV-COMP to avoid unnecessarily calling cpp on everything with no effect.

@michael-schwarz
Copy link
Member Author

I think we might have misunderstood each other. What happens is:

Preprocessing logger.c
  to /home/michael/Documents/goblint-cil/analyzer/.goblint/preprocessed/logger.i
  using 'gcc' '-I' '/home/michael/Documents/goblint-cil/analyzer/includes' '-E' '-MMD' '-MT' 'logger.c' '-DHAVE_CONFIG_H' '-I.' '-I..' '-fPIC' '-I../include/wget' '-I.' '-I../lib' '-I../lib' '-fvisibility=hidden' '-DBUILDING_LIBWGET' '-DWGETVER_FILE="../include/wget/wgetver.h"' '-DBUILDING_LIBWGET' '-DNDEBUG' '-g' '-O2' '-MT' 'libwget_logger_la-logger.lo' '-MD' '-MP' '-MF' '.deps/libwget_logger_la-logger.Tpo' '-c' '-o' '/home/michael/Documents/goblint-cil/analyzer/.goblint/preprocessed/logger.i' 'logger.c'
  in /home/michael/Documents/bench-repos/wget2/libwget

@sim642
Copy link
Member

sim642 commented Jan 21, 2022

Oh, I see what the real problem is there: the command line already provides some -MD etc options, which I guess override ours and we never see the dependency file.
Just giving up in that case will yield to our dependency information being silently incomplete.

@michael-schwarz michael-schwarz changed the title Skip processing .d file if there is no .d file for an entry in the compilation database Generalize handling of .d files: Accept .d files with multiple targets, eat -MF arguments to preprocessor Jan 27, 2022
@michael-schwarz
Copy link
Member Author

Now, we try to fix the call to the compiler if it contains -MF path arguments that would redirect the dependency output. Also we accept more general .d files, such as the one we get for zstd:

../lib/common/debug.c obj/conf_16d91981455463eb5dee29f585802f4a/debug.o: \
 ../lib/common/debug.c ../lib/common/debug.h

../lib/common/debug.h:

@sim642
Copy link
Member

sim642 commented Jan 27, 2022

I'm starting to doubt this approach. You added -MF removal to the arguments based entry, but the same logic would have to be implemented for the command based entry and also the basic preprocessing (and possibly even the basic non-compilation database preprocessing).

The extended Makefile parsing is also becoming more of a hack because the command line still contains other cpp Makefile arguments. For example from the example you gave above:

 '-MT' 'libwget_logger_la-logger.lo' '-MD' '-MP' '-MF' '.deps/libwget_logger_la-logger.Tpo'

All these -M... options change the produced Makefile, not just -MF. Moreover, some of them have arguments, some don't, so it's not so easy to filter them out as well, because that'd be approaching implementing all cpp argument parsing logic to know which has what argument and to even do it in the command approach.

I think the other approach I suggested in #535 would be much more robust and less error-prone: extend Cil.file to contain a list of filenames CIL saw in #line pragmas while parsing it. I think we should instead implement that and revert #535.

@michael-schwarz
Copy link
Member Author

I think the other approach I suggested in #535 would be much more robust and less error-prone: extend Cil.file to contain a list of filenames CIL saw in #line pragmas while parsing it. I think we should instead implement that and revert #535.

I am a bit skeptical whether this is the general solution it appears to be though: If a header file itself contains no declarations but only includes other header files, we would not see it in this process at all. However, if this file changes to now include different things, we would have to re-run the preprocessor anyway. Or am I missing something here?

@sim642
Copy link
Member

sim642 commented Jan 31, 2022

If a header file itself contains no declarations but only includes other header files, we would not see it in this process at all.

Why not? If something is included, there will be a #line for it in the preprocessed output. That would be the whole point of adding the extra collecting of filenames from #line pragmas. The dependencies that -MD outputs are exactly those, but getting them out indirectly like that has become too much of a hack.

What wouldn't work is iterating over all the globals in the preprocessed file and collecting the filenames from their locations.

@michael-schwarz
Copy link
Member Author

The case that I was worried about was the one similar to the following one.

File 1.c

#include "1.h"
int main() {
    int i =0;
    int j = 0;
    for(; i<5; i++) {
        j += i*i;
    }
    assert(i == 5);
  }

File 1.h

#include "2.h"

File 2.h

int blarg();

Since 1.h contains nothing but includes, I was wondering if it will show up at all in the pre-processed output. If it wouldn't that would be a problem as someone could change 1.h to include something different from 2.h without us being able to tell.

But apparently it does:

michael@michael-ThinkPad-X1-Carbon-6th:~/Documents/goblint-cil/analyzer$ cpp 1.c
# 1 "1.c"
# 1 "<built-in>"
# 1 "<command-line>"
# 31 "<command-line>"
# 1 "/usr/include/stdc-predef.h" 1 3 4
# 32 "<command-line>" 2
# 1 "1.c"
# 1 "1.h" 1
# 1 "2.h" 1
int blarg();
# 1 "1.h" 2
# 2 "1.c" 2
int main() {
    int i =0;
    int j = 0;
    for(; i<5; i++) {
        j += i*i;
    }
    assert(i == 5);
  }

@sim642
Copy link
Member

sim642 commented Feb 4, 2022

Closing in favor of #587.

@sim642 sim642 closed this Feb 4, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants