False hits in test coverage from gcov #6958

gilles-peskine-arm · 2023-01-23T20:51:13Z

Our test coverage reports appear to be overly optimistic: they report certain lines as hit when those lines are never actually executed. We initially thought this was an issue in lcov (the coverage reporting tool), but it's actually due to gcov (the coverage measurement tool), which is part of the GCC toolchain.

It appears that clang/LLVM works better. @bensze01 writes:

First, apply this patch:

diff --git a/scripts/gcov_tool.sh b/scripts/gcov_tool.sh
new file mode 100755
index 000000000..7c663d8da
--- /dev/null
+++ b/scripts/gcov_tool.sh
@@ -0,0 +1,2 @@
+#!/bin/sh
+exec ${GCOV_TOOL:-gcov} "$@"
diff --git a/scripts/lcov.sh b/scripts/lcov.sh
index 8d141eedf..02e9c25ec 100755
--- a/scripts/lcov.sh
+++ b/scripts/lcov.sh
@@ -46,10 +46,10 @@ set -eu
 lcov_library_report () {
     rm -rf Coverage
     mkdir Coverage Coverage/tmp
-    lcov --capture --initial --directory library -o Coverage/tmp/files.info
-    lcov --rc lcov_branch_coverage=1 --capture --directory library -o Coverage/tmp/tests.info
-    lcov --rc lcov_branch_coverage=1 --add-tracefile Coverage/tmp/files.info --add-tracefile Coverage/tmp/tests.info -o Coverage/tmp/all.info
-    lcov --rc lcov_branch_coverage=1 --remove Coverage/tmp/all.info -o Coverage/tmp/final.info '*.h'
+    lcov --gcov-tool "$PWD/scripts/gcov_tool.sh" --capture --initial --directory library -o Coverage/tmp/files.info
+    lcov --gcov-tool "$PWD/scripts/gcov_tool.sh" --rc lcov_branch_coverage=1 --capture --directory library -o Coverage/tmp/tests.info
+    lcov --gcov-tool "$PWD/scripts/gcov_tool.sh" --rc lcov_branch_coverage=1 --add-tracefile Coverage/tmp/files.info --add-tracefile Coverage/tmp/tests.info -o Coverage/tmp/all.info
+    lcov --gcov-tool "$PWD/scripts/gcov_tool.sh" --rc lcov_branch_coverage=1 --remove Coverage/tmp/all.info -o Coverage/tmp/final.info '*.h'
     gendesc tests/Descriptions.txt -o Coverage/tmp/descriptions
     genhtml --title "mbed TLS" --description-file Coverage/tmp/descriptions --keep-descriptions --legend --branch-coverage -o Coverage Coverage/tmp/final.info
     rm -f Coverage/tmp/*.info Coverage/tmp/descriptions

Then build the coverage data like this:

make CC=clang-10 CFLAGS='--coverage -g3 -O0' LDFLAGS='--coverage' -C tests test_suite_x509parse
(cd tests && ./test_suite_x509parse)
make lcov GCOV_TOOL="llvm-cov-10 gcov"

(replace clang-10 and llvm-cov-10 with the version you have installed)

The goal of this task is to investigate the coverage measurement differences between different versions of GCC and LLVM. In particular:

Is this a bug that is fixed in the latest GCC?
Was this bug already present in older GCC?
Is this a known bug in GCC?
Where there are differences, is LLVM always correct?

If LLVM is more reliable than GCC, we'll switch our scripts to use it.


AndrzejKurek · 2023-01-26T11:36:58Z

I don't think this will be an s-sized task. A basic comparison of the results for just one test case:

here for gcov (the updated version mentioned in "Inconsistent/wrong line coverage", linux-test-project/lcov#185), vs
here for llvm-cov

shows that llvm-cov also exhibits some faulty behaviour:

it reports coverage for line 789 of x509.c even though the line isn't executed. gcov coverage works fine in this particular case.

lcov with gcov, on the other hand, has some workarounds that seem to work fine (in the limited scope I investigated), but I haven't studied the impact when running all tests.

Run commands:
gcov version:

make CFLAGS='--coverage -g3 -O0' LDFLAGS='--coverage' -C tests test_suite_x509parse
(cd tests && ./test_suite_x509parse)
make lcov
firefox Coverage/index.html

llvm-cov version:

make CC=clang-14 CFLAGS='--coverage -g3 -O0' LDFLAGS='--coverage' -C tests test_suite_x509parse
(cd tests && ./test_suite_x509parse)
make lcov GCOV_TOOL="llvm-cov-14 gcov"

Diff generation (requires first removing the "rm -f Coverage/tmp/*.info Coverage/tmp/descriptions" line from scripts/lcov.sh):

genhtml -o upd_gcov_opt_vs_llvm_diff --baseline-file ./Coverage_upd_gcov_opt_on/tmp/final.info ./Coverage_upd_llvm_cov/tmp/final.info --branch-coverage --function-coverage --filter line,branch,function

Results of the coverage runs (plus two diffs; the llvm-cov vs updated gcov diff is the one that interests us):
coverage_tests.tar.gz

Please note that the diff's LBC (lost baseline coverage) category might also be inconsistent, as shown here.
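Since the genhtml diff itself may not be fully trustworthy, a rough cross-check is to compare the two final.info tracefiles directly. The following is only a minimal sketch, not part of our scripts: the function names are hypothetical, it relies on the lcov tracefile records SF:<path>, DA:<line>,<count>[,<checksum>] and end_of_record, and it assumes both reports were generated from the same source tree so that the SF: paths match. It only looks at lines that both tools instrument.

```python
def parse_tracefile(text):
    """Map (source_file, line_number) -> execution count from lcov tracefile text."""
    counts = {}
    source = None
    for raw in text.splitlines():
        line = raw.strip()
        if line.startswith("SF:"):
            source = line[3:]
        elif line == "end_of_record":
            source = None
        elif line.startswith("DA:") and source is not None:
            # DA:<line number>,<execution count>[,<checksum>]
            fields = line[3:].split(",")
            key = (source, int(fields[0]))
            counts[key] = counts.get(key, 0) + int(fields[1])
    return counts


def hit_disagreements(first, second):
    """List lines present in both reports that only one of them marks as hit."""
    result = []
    for key in sorted(first.keys() & second.keys()):
        if (first[key] > 0) != (second[key] > 0):
            result.append((key[0], key[1], first[key], second[key]))
    return result


def compare_files(first_path, second_path):
    """Read two tracefiles from disk and return their hit/not-hit disagreements."""
    reports = []
    for path in (first_path, second_path):
        with open(path) as f:
            reports.append(parse_tracefile(f.read()))
    return hit_disagreements(reports[0], reports[1])
```

Calling compare_files("./Coverage_upd_gcov_opt_on/tmp/final.info", "./Coverage_upd_llvm_cov/tmp/final.info") would return tuples of (file, line, gcov count, llvm-cov count) for every line where the two tools disagree on hit vs not hit. Each of those lines still needs manual inspection, since this does not say which tool is right.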

It seems that, for some reason, it's hard to get flawless results with our code. Are these bugs in the tools? Are there options that we should be using but aren't? I don't have the answers for now.

gilles-peskine-arm added the bug, size-s (Estimated task size: small (~2d)) and component-test (Test framework and CI scripts) labels on Jan 23, 2023

gilles-peskine-arm removed the size-s (Estimated task size: small (~2d)) label on Jan 26, 2023
