# `kani-cov`: A coverage tool for Kani (#3121)
Are we planning to release this tool or is it only for developers?
The plan is to release the tool. More details in the upcoming RFC.
Note to self: Consider highlighting uncovered spans in red as the default behavior. At the very least, discuss in the RFC.
This PR replaces the line-based coverage instrumentation we introduced in #2609 with the standard source-based code coverage instrumentation performed by the Rust compiler. As a result, we now insert code coverage checks in the `StatementKind::Coverage(..)` statements produced by the Rust compiler during compilation. These checks include coverage-relevant information[^note-internal] such as the coverage counter/expression they represent[^note-instrument]. Both the coverage metadata (`kanimap`) and the coverage results (`kaniraw`) are saved into files after the verification stage.

Unfortunately, we currently have a chicken-and-egg problem between this PR and #3121, where we introduce a tool named `kani-cov` to postprocess coverage results. As explained in #3143, `kani-cov` is expected to be an alias for the `cov` subcommand and to provide most of the postprocessing features for coverage-related purposes. However, the tool will likely be introduced after this change. Therefore, we propose to temporarily print a list of the regions in each function with their associated coverage status (i.e., `COVERED` or `UNCOVERED`).

### Source-based code coverage: An example

The main advantage of source-based coverage results is their precision with respect to the source code. The [Source-based Code Coverage](https://clang.llvm.org/docs/SourceBasedCodeCoverage.html) documentation explains the LLVM coverage workflow and its different options in more detail.
For example, let's take this Rust code:

```rust
 1  fn _other_function() {
 2      println!("Hello, world!");
 3  }
 4
 5  fn test_cov(val: u32) -> bool {
 6      if val < 3 || val == 42 {
 7          true
 8      } else {
 9          false
10      }
11  }
12
13  #[cfg_attr(kani, kani::proof)]
14  fn main() {
15      let test1 = test_cov(1);
16      let test2 = test_cov(2);
17      assert!(test1);
18      assert!(test2);
19  }
```

Compiling and running the program with `rustc` and the `-C instrument-coverage` flag, and then using the LLVM tools, can get us the following coverage result:

![Image](https://github.com/model-checking/kani/assets/73246657/9070e390-6e0b-4add-828d-d9f9caacad07)

In contrast, the `cargo kani --coverage -Zsource-coverage` command currently generates:

```
src/main.rs (main)
 * 14:1 - 19:2 COVERED

src/main.rs (test_cov)
 * 5:1 - 6:15 COVERED
 * 6:19 - 6:28 UNCOVERED
 * 7:9 - 7:13 COVERED
 * 9:9 - 9:14 UNCOVERED
 * 11:1 - 11:2 COVERED
```

which is a verification-based coverage result almost equivalent to the runtime coverage result.

### Benchmarking

We have evaluated the performance impact of the instrumentation using the `kani-perf.sh` suite (14 benchmarks). For each test, we compare the average time to run standard verification against the average time to run verification with the source-based code coverage feature enabled[^note-line-evaluation]. The evaluation was performed on an EC2 `m5a.4xlarge` instance running Ubuntu 22.04. The experimental data was obtained by running the `kani-perf.sh` script 10 times for each version (`only verification` and `verification + coverage`) and computing the average and standard deviation. We've split this data into `small` (tests taking 60s or less) and `large` (tests taking more than 60s) benchmarks and drawn the two graphs below.
#### Performance comparison - `small` benchmarks

![performance_comparison_small](https://github.com/user-attachments/assets/679cf412-0193-4b0c-a78c-2d0fb702706f)

#### Performance comparison - `large` benchmarks

![performance_comparison_large](https://github.com/user-attachments/assets/4bb5a895-7f57-49e0-86b5-5fea67fad939)

#### Comments on performance

Looking at the small tests, the performance impact of coverage seems negligible. The difference is more noticeable in the large tests, where verification with coverage can take 2x as long or more. It wouldn't be surprising if, as programs become larger, the complexity of the coverage checking grows exponentially as well. However, since most verification jobs don't take longer than 30min (1800s), it's fair to say that coverage checking represents a 100-200% slowdown in the worst case w.r.t. standard verification.

It's also worth noting a few other things:

 * The standard deviation remains similar in most cases, meaning that the coverage feature doesn't have an impact on the stability of the benchmarks.
 * We haven't tried any SAT solvers other than the ones used by default for each benchmark. It's possible that other solvers perform better/worse with the coverage feature enabled.

### Call-outs

 * The soundness issue documented in #3441.
 * The issue with saving coverage mappings for non-reachable functions documented in #3445.
 * I've modified the test cases in `tests/coverage/` to test this feature. Since this technique is simpler, we don't need as many test cases. However, it's possible I've left some test cases which don't contribute much. Please let me know if you'd like to add/remove a test case.

[^note-internal]: The coverage mappings can't be accessed through the StableMIR interface, so we retrieve them through the internal API.

[^note-instrument]: The instrumentation replaces certain counters with expressions based on other counters when possible, to avoid part of the runtime overhead.
More details can be found [here](https://github.com/rust-lang/rustc-dev-guide/blob/master/src/llvm-coverage-instrumentation.md#mir-pass-instrumentcoverage). Unfortunately, we can't avoid instrumenting expressions at the moment.

[^note-line-evaluation]: We have not compared performance against the line-based code coverage feature because it doesn't seem worthwhile: the line-based coverage feature is guaranteed to include more coverage checks than the source-based one for any given function. In addition, source-based results are more precise than line-based ones, so this change represents both a quantitative and qualitative improvement.

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 and MIT licenses.
Thanks for implementing all my comments. I have one last question before approving it: why is issue #3543 now classified as a soundness issue? As far as I understood, the coverage report feature will crash in the cases you mentioned in the issue. Will it report incorrect coverage results in any case? This is important to fix before stabilization, but it seems unnecessary to me to classify it as a soundness issue.
I'd recommend making the type aliases public and incorporating them throughout `merge.rs` and `summary.rs` as well, but that's not a blocker for this PR. Excited to use this feature!
The reason #3543 was classified as a soundness issue is that we were filtering out the results that could lead to a crash. Regardless, I've changed how we handle the checks with these out-of-bounds regions so that we replicate the spurious failure instead. Since we only need to extend the line by one space to represent the check in such cases, we just do that now. In other words, we now report the same spurious failures, which may be more confusing for users but avoids soundness concerns. For example, for one of the affected tests:

Note the single region represented at the end of line 13? In the original source code, that space doesn't exist, but we can create it in this representation and report the same result that the instrumentation gives us. It's less noticeable in the terminal output, but we also "draw" it there.

In addition to this change, I've added some more documentation to the code handling this special case and updated #3543 (now labeled as "spurious failure"). Please don't hesitate to ask for more clarifications if you need them.

@carolynzech I've also implemented your suggestions to use the type aliases in other modules of `kani-cov`.
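The "extend the line by one space" workaround described above can be sketched as follows. This is a rough illustration only: the actual representation used by `kani-cov` is not shown in this conversation, so the function name and column convention here are assumptions.

```python
# Illustration of the workaround discussed above: if a coverage region's
# start column points one character past the end of the source line, extend
# the line with a single space so the region can still be drawn (and the
# spurious result reported faithfully). Columns are assumed 1-based.
def extend_for_region(line: str, region_start_col: int) -> str:
    if region_start_col == len(line) + 1:
        return line + " "
    return line

# A region starting just past the closing brace forces a one-space extension.
print(repr(extend_for_region("}", 2)))   # the line gains a trailing space
print(repr(extend_for_region("}", 1)))   # in-bounds regions leave it unchanged
```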
This PR introduces `kani-cov`, a coverage-oriented tool for Kani. This new tool will help users aggregate raw coverage results produced by Kani and generate human-readable coverage summaries and reports. `kani-cov` offers three subcommands: `merge`, `summary` and `report`. In the following, we show how each of these commands can be used.

### The `merge` subcommand

The `merge` subcommand takes raw coverage results from Kani (AKA "kaniraw" files) and generates aggregated coverage results, which are stored in another file (AKA a "kanicov" file). This new file can then be used for generating summaries or reports.

For example, assume some Rust code is in `main.rs` and we execute Kani with coverage enabled; the raw coverage results will be saved to a folder. We can now aggregate those results with the `merge` subcommand, which by default produces a `default_kanicov.json` file. We can also use the `--output <path>` option to change the name of the file.

The file we just produced is known as the "profile" or "kanicov" file, and it contains aggregated coverage results which may have originated in multiple verification sessions. This file is a required input for the other two subcommands.
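Conceptually, the merge step can be sketched as below. Note this is not `kani-cov`'s implementation, and the "kaniraw" schema used here (function name mapped to per-region hit counts) is an assumption for illustration only.

```python
import glob
import json

# Hypothetical sketch of what a merge step does conceptually: combine several
# per-session "kaniraw" files into one "kanicov" profile. The JSON layout
# (function -> {region: hit_count}) is an assumed format, not kani-cov's.
def merge_raw_files(paths, output="default_kanicov.json"):
    profile = {}
    for path in paths:
        with open(path) as f:
            raw = json.load(f)
        for function, regions in raw.items():
            merged = profile.setdefault(function, {})
            for region, count in regions.items():
                # Hit counts from different sessions accumulate; a region is
                # covered overall if any session covered it.
                merged[region] = merged.get(region, 0) + count
    with open(output, "w") as f:
        json.dump(profile, f, indent=2)
    return profile

# Hypothetical usage over a folder of raw results:
# merge_raw_files(glob.glob("coverage_results/*kaniraw.json"))
```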
### The `summary` subcommand

Now that we have both the "kanicov" file and the "kanimap" file, we are ready to produce coverage metrics with the `summary` subcommand. This outputs a table including coverage metrics for functions, lines and regions for each file referenced in the coverage mappings.

Since the default format is Markdown, the table can also be rendered as in this PR description. At present, no other formats are available, but JSON and CSV formats would be nice to have and easy to add.
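The metric computation behind such a table can be sketched like this. The input structure (file name mapped to a list of region statuses) and the table columns are assumptions for illustration; `kani-cov` additionally reports function and line metrics.

```python
# Sketch of a summary-style computation: count covered regions per file and
# emit a Markdown table, similar in spirit to `kani-cov summary` output.
def region_summary(files):
    rows = ["| File | Regions covered | Total | % |",
            "|------|-----------------|-------|---|"]
    for name, statuses in files.items():
        covered = sum(1 for s in statuses if s == "COVERED")
        total = len(statuses)
        pct = 100 * covered / total if total else 0.0
        rows.append(f"| {name} | {covered} | {total} | {pct:.1f} |")
    return "\n".join(rows)

# Region statuses taken from the example output in the companion PR.
print(region_summary({
    "src/main.rs": ["COVERED", "COVERED", "UNCOVERED", "COVERED",
                    "UNCOVERED", "COVERED"],
}))
```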
### The `report` subcommand

Coverage reports go a little further and allow users to visualize coverage information on the source code. The `report` subcommand requires the same input files as the `summary` subcommand. When calling this command from a terminal, it renders the coverage highlights directly in the terminal output.

However, if you're redirecting the output to a file, it falls back to another text-based mode which simply uses `` ``` `` and `'''` as opening and closing escapes for the coverage regions. This is another format, called `escapes`, which can also be requested explicitly.

Similar to the `summary` command, this will iterate over the files in the coverage mappings, compute coverage information and highlight any uncovered regions (except for the case in #3543).
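The `escapes` mode can be illustrated with a small sketch that wraps uncovered column ranges of a source line in the opening/closing markers. The column convention (1-based, inclusive) is an assumption here, as is the region data structure; only the marker strings come from the description above.

```python
# Illustration of the text-based `escapes` mode: wrap each uncovered region
# of a line in ``` ... ''' markers. Regions are (start_col, end_col) pairs,
# assumed 1-based and inclusive; processing right-to-left keeps earlier
# column numbers valid as markers are inserted.
def mark_uncovered(line, regions):
    for start, end in sorted(regions, reverse=True):
        line = line[:start - 1] + "```" + line[start - 1:end] + "'''" + line[end:]
    return line

# Columns 19-27 of this line ("val == 42") were UNCOVERED in the example
# from the companion PR.
print(mark_uncovered("    if val < 3 || val == 42 {", [(19, 27)]))
```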
### Testing

This PR also changes `compiletest` to automatically call the `merge` and `report` subcommands on the `coverage` test suite. The tests have been blessed using the `--fix-expected` option and a Python 3 script that removes the first line (the path to the file) and adds `\` after each line, so an exact match is expected.
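The blessing helper described above can be reconstructed roughly as follows. This is an approximation of the script's behavior as described, not the actual script from the PR.

```python
# Rough reconstruction of the blessing helper: drop the first line of the
# expected output (the path to the file) and append a `\` to every remaining
# line so compiletest requires an exact match.
def bless(text: str) -> str:
    lines = text.splitlines()[1:]  # drop the first line (file path)
    return "\n".join(line + "\\" for line in lines)

# Hypothetical expected-output snippet with a leading file path.
print(bless("src/main.rs\nfn test_cov(val: u32) -> bool {\n    if val < 3 || val == 42 {"))
```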
### Callouts / Known issues

This PR only introduces `kani-cov`. It doesn't plug `kani-cov` into `kani-driver` so that it's automatically run when using the `--coverage` option, nor does it prepare the `kani-cov` binary for distribution. These are better addressed later.

This PR also spawns a few issues. For example, #3541 is about avoiding the duplication of certain data structures that currently live in `kani-driver`. We should probably have a separate crate for these structures and import them from both `kani-driver` and `kani-cov`, but this is more of a distribution problem that doesn't need to be tackled here.

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 and MIT licenses.