Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Support coverage-guided generation #84

Open
vlsi opened this issue Jan 19, 2020 · 74 comments
Open

Support coverage-guided generation #84

vlsi opened this issue Jan 19, 2020 · 74 comments

Comments

@vlsi
Copy link
Contributor

vlsi commented Jan 19, 2020

Hi,

There's https://github.com/rohanpadhye/jqf by @rohanpadhye
The idea here is that a randomized-input generation can be guided based on the feedback from the test execution.

For instance, guidance can observe code coverage and it could attempt to produce inputs that explore uncovered branches.
Another example might be to measure the performance (e.g. performance counters or just heap allocation) and attempt to produce an input that triggers performance issues.

Relevant information

https://www.fuzzingbook.org/html/MutationFuzzer.html#Guiding-by-Coverage (and other parts of the book)

Suggested Solution

If we take Zest (which is coverage-guided fuzzer), then ZestGuidance provides an input stream that can be considered as the source of randomness: https://github.com/rohanpadhye/jqf/blob/48d3b663ad68a7b615c2a8b9716da0ca8b6ef4e6/fuzz/src/main/java/edu/berkeley/cs/jqf/fuzz/junit/quickcheck/FuzzStatement.java#L136

As far as I can see, the current RandomizedShrinkablesGenerator never exposes Random instance, so there's no way to take a random stream input from the guidance and pass it to RandomizedShrinkablesGenerator (see https://github.com/jlink/jqwik/blob/2517d6d8cd612137fcff730a9114169260fad4bf/engine/src/main/java/net/jqwik/engine/execution/CheckedProperty.java#L155 )

As far as I understand, jqwik does not provide a way to plug guided fuzzing: https://github.com/jlink/jqwik/blob/5632ef8ca3d51ff083380257fb0b2b9bd7383920/engine/src/main/java/net/jqwik/engine/properties/GenericProperty.java#L38

So I would like to hear your opinion on the way to integrate guided fuzzing.
It looks like it would be nice to be able to use pluggable guidances, so what do you think if the random in question was taken from a Guidance instance and the guidance get notified on the test outcomes?

PS. I wonder if jqwilk engine can be integrated with JUnit5's default one.

I know JUnit5's default engine does not suit very well for property-based tests, however, can you please clarify what are the major issues that prevent the use of JUnit5 engine for jqwik?

For instance, JUnit5 (e.g. TestTemplate) keeps all the test outcomes (which results in OOM), however, in property-based tests we don't need that, and we need to just count the number of passed tests.
On the other hand, it looks like that can be improved in JUnit (e.g. by adding @AggregateSuccessfulExecutionsAsCount or something like that).

I just thought that the default JUnit5 engine would make the tests easier to write as the documentation would be consolidated.
For instance, there's net.jqwik.api.Disabled, and there's org.junit.jupiter.api.Disabled which are more or less the same thing. It is unfortunate that regular annotations do not work with jqwik. Of course, it would be illegal to mix @jupiter.api.Test with @Property, however, that looks like a corner case that can be validated.

@jlink
Copy link
Collaborator

jlink commented Jan 20, 2020

@vlsi Thanks for the suggestion. Guided generation looks interesting. Can you give a concrete example how you imagine the integration of ZEST with jqwik could look like for the user?

As for the integration, it would most probably require a new kind of generation mode and a way to inject the random stream and the guidance. My intuition says it should be possible but it will require deeper understanding on my side how Zest and JQF work. If you can convince me with good examples from user perspective I'm absolutely willing to tackle it :-)

@vlsi
Copy link
Contributor Author

vlsi commented Jan 20, 2020

Can you give a concrete example how you imagine the integration of ZEST with jqwik could look like for the user?

Let us take Apache Calcite as an example.
It is SQL optimization and execution engine.

SQL has expressions (e.g. foo is null and foo >0), and, as you might guess, Calcite uses those expressions all over the place.

Of course, there's an implementation of the expression optimizer.
Here's a test for that: https://github.com/apache/calcite/blob/3acb30875525e029be96726a357ee9950cce3310/core/src/test/java/org/apache/calcite/rex/RexProgramTest.java#L1734-L1738
It verifies that NOT COALESCE(null, true) is simplified to false.

So the optimizer has properties to be verified. For instance, the optimizer must not throw exceptions while doing its optimization.

The current jqwik / junit-quickcheck approach would be to generate expression trees at random, and pass them to the optimizer.
However, ZEST allows to pick input tress better, so it skews input generation in such a way, so it increases the probability to trigger a bug.

This case for the end user could look as follows:

public class RexFuzzerTest {
  private RexSimplify rexOptimizer;

  public JqfFuzzerTest() {
    rexOptimizer = null; // setup the optimizer appropriately
  }

  @Property
  // or @Property(coverageGuidance=true)
  // or @Property @Guidance(type=ZEST) @ZestParams(include="org.apache.calcite.*")
  public void hello(@ForAll RexNode value) {
    rexOptimizer.simplify(value); // we check that optimizer does not throw exceptions 
  }
}

Note: I am not sure regarding the way guidance should be configured. At the end of the day, I don't think you can (easily) add a javaagent via @Annotation :)

The user would have to enable JQF's bytecode instrumentation somehow. It is available as a custom classloader or as a javaagent.
It would be fine to leave javaagent configuration to the build systems. I think it would be more-or-less manageable.

Note: currently JQF is integrated with junit-quickcheck, and JQF itself has only one annotation property

Does that answer your question?

In other words, from an end-user perspective, the very basic case is the same as with simple jqwik.

What ZEST does, it analyzes the execution outcomes, and it provides you with a random bytes, so the next randomized inputs would explore uncovered branches better.

If you ask for more advanced cases then it would be nice to be able to see extra statistics from ZEST itself. For instance, it would be nice to see coverage% in the real-time, so the user can see if the fuzzer is healthy or not.

it will require deeper understanding on my side how Zest and JQF work

You might want to check Guidance interface. It is quite small.

https://github.com/rohanpadhye/jqf/blob/06376192315a80309d6bd298b7517ef50f5bdb16/fuzz/src/main/java/edu/berkeley/cs/jqf/fuzz/guidance/Guidance.java#L53-L72

@jlink
Copy link
Collaborator

jlink commented Jan 20, 2020

@vlsi As for your second question:

I wonder if jqwilk engine can be integrated with JUnit5's default one.

I'm probably never going to do that. Here's my argument:

Jupiter, the default engine, is a jack-of-all-trades, which in turn means that it's not well suited for special needs. Jupiter exentsion points have been designed to be composable, that's e.g. the reason why JUnit4's model of wrapping the execution of tests and suites was given up and replaced by before and after hooks. Nevertheless, generic composability is difficult to get right, and it requires to stick with some limitations. Jupiter's lifecycle is not really well suited for running an indeterminate number of tests scheduled by the feedback coming from previous test runs. Dynamic test generation allows that to some degree but then you will miss out on shrinking, statistics reporting and a few other subtleties that make developer experience of jqwik the way it is now - and as I like it.

That said, it would probably be possible to come up with a jqwik extension for Jupiter that could inject parameters, run the individual tries as dynamic tests and even do standard shrinking. But it would

  • Not offer the same developer experience as for the lifecycle (rerun with previous failure, run failing props first, switch platform reporting on/off, etc.)
  • Be (much) harder to integrate in Jupiter's lifecycle and make it usable/composable with most other extensions. Just think of the effect that other injected parameters can have on the predictability for shrinking results. And if you take composability away then a Jupiter integration has IMO no additional benefit.

Junit platform test engines - as jqwik is an example - have been introduced to fill the hole that Jupiter extensions leave empty: Different syntax to specify tests, different test lifecycle, difficulties with composability and side effects. BTW, there's a reason that junit-quickcheck has its own JUnit4 runner and is not just a set of JUnit 4 rules: Runners in JUnit 4 are (to some degree) what engines are in JUnit 5; they are not composable.

As you mention with @Disabled the engine approach has one major drawback: some functionality that could be considered cross-engine must be implemented multiple times and might function inconsistently/differently from engine to engine. Since I am still - despite my exit from the JUnit 5 team - in good contact with those folks we've always been looking for opportunities to extract common functionality from individual engines into the platform. Sometimes we find a reasonable solution (e.g. testing kit, annotation discovery), sometimes we don't.

@jlink
Copy link
Collaborator

jlink commented Jan 20, 2020

The user would have to enable JQF's bytecode instrumentation somehow. It is available as a custom classloader or as a javaagent.
It would be fine to leave javaagent configuration to the build systems. I think it would be more-or-less manageable.

javaagent configuration is mostly done by the IDE or the build system, so this should be doable.

I still don't get how providing random bytes can direct the generation of certain values since the values generated from certain random bytes cannot really be predicted without knowing the implementation of all generators involved - and that can be many. What do I miss or fail to understand?

@vlsi
Copy link
Contributor Author

vlsi commented Jan 20, 2020

I still don't get how providing random bytes can direct the generation of certain values since the values generated from certain random bytes cannot really be predicted without knowing the implementation of all generators involved - and that can be many

I guess it is described in zest paper (see 3.1 Parametric Generators)

direct

AFAIU, the idea is pretty much the same as the mutation and selection in a typical genetic optimizer directs the population to the desired goal.
In the same way, Zest mutates the input sequence, and it measures the resulting coverage.

Genetic optimizers do not need to know the implementation of the underlying systems.

cannot really be predicted without knowing the implementation of all generators involved

It does not need to predict the output.
For instance, it might learn that mutating bits 0..32 is very bad for coverage (==the first 32bits are magic sequence), so it might focus on mutating other bits. It does not require to know the implementation of the generators as long as they are predictable given the source of randomness.

@luvarqpp
Copy link
Contributor

luvarqpp commented Jan 20, 2020

I would like to see annotation in line with existing approach. For example:

@Property(generation = GenerationMode.ZEST_DRIVEN, shrinking = ShrinkingMode.FULL)
@Report(Reporting.COVERAGE)
void arbitraryUserIsAlwaysValidForSerDes(@ForAll("arbitraryUser") UserForTests arbitraryUser) {
}

PS: I like proposed idea. Does it mean that tuning properties to coverage various cases would not be needed generally? (genetics programming can be good in many cases)

@vlsi
Copy link
Contributor Author

vlsi commented Jan 20, 2020

Does it mean that tuning properties to coverage various cases would not be needed generally?

Frankly speaking, I have no idea at this point. In general, it looks to be useful even without tuning knobs.
ZestGuidance itself has multiple hard-coded parameters that might need to be configurable in the future.
I want to try the thing with real code, and then we could see what needs to be tuned.

The use cases I have in mind are Apache Calcite (RexNode fuzzing or even full SQL fuzzing), and PostgreSQL JDBC driver (e.g. JDBC API fuzzing).

PS. It would probably be fun to have Arbitrary implementation based on JavaCC grammars.

@luvarqpp
Copy link
Contributor

Does it mean that tuning properties to coverage various cases would not be needed generally?

Frankly speaking, I have no idea at this point. In general, it looks to be useful even without tuning knobs.
ZestGuidance itself has multiple hard-coded parameters that might need to be configurable in the future.
I want to try the thing with real code, and then we could see what needs to be tuned.

Thanks for info. I can use it (when integrated with jqwik) as soon as there is something to test. I have spring-data-rest real world project (in alpha version through) which will reach its customers this year and testing is one of our priority.

The use cases I have in mind are Apache Calcite (RexNode fuzzing or even full SQL fuzzing), and PostgreSQL JDBC driver (e.g. JDBC API fuzzing).

To find bugs more easily, you can try also "reactive jdbc driver". See its postgresql client code for example: https://github.com/r2dbc/r2dbc-postgresql

@vlsi
Copy link
Contributor Author

vlsi commented Jan 20, 2020

I have spring-data-rest real world project (in alpha version through) which will reach its customers this year and testing is one of our priority.

I'm not sure the current Zest/JQF supports multi-threaded applications :-/

@luvarqpp
Copy link
Contributor

I have spring-data-rest real world project (in alpha version through) which will reach its customers this year and testing is one of our priority.

I'm not sure the current Zest/JQF supports multi-threaded applications :-/

Jop, you are right, through using it when business logic is in main thread only, would be OK. See rohanpadhye/JQF#41. On the other side, this is not my case. In spring-data-rest there is some tomcat in main thread (perhaps), or main thread is my web test client (http client, which is sending http requests from jqwik test).

@jlink
Copy link
Collaborator

jlink commented Jan 20, 2020

@vlsi I looked at the examples coming with JQF. They don't seem to define any special case guidance which confuses me. Take this simple example:

@Fuzz
public void insert(int @Size(min=100, max=100)[] elements) {
     BinaryTree b = new BinaryTree();
     for (int e : elements) {
         b.insert(e);
     }
}

What difference in generated data would you expect - if any - compared to

@Property
public void insert(int @Size(min=100, max=100)[] elements) {
     BinaryTree b = new BinaryTree();
     for (int e : elements) {
         b.insert(e);
     }
}

running on Vanilla junit-quickcheck?

@vlsi
Copy link
Contributor Author

vlsi commented Jan 20, 2020

I looked at the examples coming with JQF. They don't seem to define any special case guidance which confuses me

PS. We can have a small conversation (e.g. Zoom, or https://meet.jit.si/) so screen sharing+voice might be more productive.

JQF implements its own JUnit runner: https://github.com/rohanpadhye/jqf/blob/f0979677984bf61e505104fcbda27b442cc561a1/fuzz/src/main/java/edu/berkeley/cs/jqf/fuzz/JQF.java#L50

which uses FuzzStatement for test execution: https://github.com/rohanpadhye/jqf/blob/f0979677984bf61e505104fcbda27b442cc561a1/fuzz/src/main/java/edu/berkeley/cs/jqf/fuzz/JQF.java#L87

FuzzStatement itself asks Guidance for the next input data to try (==next random bytes): https://github.com/rohanpadhye/jqf/blob/48d3b663ad68a7b615c2a8b9716da0ca8b6ef4e6/fuzz/src/main/java/edu/berkeley/cs/jqf/fuzz/junit/quickcheck/FuzzStatement.java#L136-L137

And it calls generators with the given random (it is the place where usual generators are made to be guided ones): https://github.com/rohanpadhye/jqf/blob/48d3b663ad68a7b615c2a8b9716da0ca8b6ef4e6/fuzz/src/main/java/edu/berkeley/cs/jqf/fuzz/junit/quickcheck/FuzzStatement.java#L139-L141

Then it informs Guidance on the actual test outcome (pass/fail/assumption violation/etc): https://github.com/rohanpadhye/jqf/blob/48d3b663ad68a7b615c2a8b9716da0ca8b6ef4e6/fuzz/src/main/java/edu/berkeley/cs/jqf/fuzz/junit/quickcheck/FuzzStatement.java#L194

@vlsi
Copy link
Contributor Author

vlsi commented Jan 20, 2020

Note: ZestGuidance is instantiated and passed via static field in GuidedFuzzing
I don't quite like that approach, but it works (tm).

Frankly speaking, the current Zest / JQF code seems to be focused to be used as public static void main application rather than to be used as a library.
For instance, ZestGuidance is instantiated in ZestCli

Is it what you are looking for?

@jlink
Copy link
Collaborator

jlink commented Jan 20, 2020

PS. We can have a small conversation (e.g. Zoom, or https://meet.jit.si/) so screen sharing+voice might be more productive.

Would be cool. When do you have time this week?

@vlsi
Copy link
Contributor Author

vlsi commented Jan 20, 2020

Would be cool. When do you have time this week?

Feel free to select a suitable timeslot (15-20-30 min?) at https://doodle.com/vlsi

@luvarqpp
Copy link
Contributor

Would be cool. When do you have time this week?

Feel free to select a suitable timeslot (15-20-30 min?) at https://doodle.com/vlsi

Please, post some "key points" as result of talk if possible. It is nice reading and interesting approach getting alive :)

@rohanpadhye
Copy link

Hi! I am the main developer of JQF.

@jlink

What difference in generated data would you expect

I added an example in the README to demonstrate a good use case. I can copy that here:

    @Fuzz  /* The args to this method will be generated automatically by JQF */
    public void testMap2Trie(Map<String, Integer> map, String key) {
        // Key should exist in map
        assumeTrue(map.containsKey(key));   // the test is invalid if this predicate is not true

        // Create new trie with input `map`
        Trie trie = new PatriciaTrie(map);

        // The key should exist in the trie as well
        assertTrue(trie.containsKey(key));  // fails when map = {"x": 1, "x\0": 2} and key = "x"
    }

Running only random sampling reveals no assertion violations even after hours of input generation. In fact, most inputs do not even satisfy the assume() statement. With JQF we can find an assertion violation (and thus the bug COLLECTIONS-714) in just a few seconds. JQF uses feedback from the result of assume as well as from the code coverage of classes such asPatriciaTrie.java.

(You might argue that this property test can be written in a better way that might help find the bug via random sampling alone, but this is just an extreme example showing the utility of coverage guidance)

@vlsi

Note: ZestGuidance is instantiated and passed via static field in GuidedFuzzing
I don't quite like that approach, but it works (tm).

Just to clarify, this is so that instrumented class files can report coverage events to the unique guidance when executing branches and method calls. This is similar to coverage tools such as JaCoCo, which use static fields to collect code coverage. Having multiple guidances live simultaneously would require multiple versions of an instrumented class. Although it could be possible to bind guidances to class-loader instances, there hasn't been a need for such a setup as of yet.

@vlsi

Frankly speaking, the current Zest / JQF code seems to be focused to be used as public static void main application rather than to be used as a library. For instance, ZestGuidance is instantiated in ZestCli

JQF has a Maven plugin, which lets you run (Zest by default) via mvn jqf:fuzz, though it can also be run via CLI. It may be possible to make guided fuzzing run as part of the standard test suite (e.g. mvn test in Maven), as long as it is possible to specify a classloader that instruments application classes to report coverage information.

@vlsi
Copy link
Contributor Author

vlsi commented Jan 21, 2020

Rohan: Just to clarify, this is so that instrumented class files can report coverage events to the unique guidance when executing branches and method calls. This is similar to coverage tools such as JaCoCo, which use static fields to collect code coverage. Having multiple guidances live simultaneously would require multiple versions of an instrumented class

Thanks, that is useful, however, it looks like you answer to a slightly different question.

I agree you need to talk to the instrumentation engine.
And you do that by SingleSnoop.setCallbackGenerator(guidance::generateCallBack) in https://github.com/rohanpadhye/jqf/blob/47d20a9e16decc0800227247f029c0f1696afd02/fuzz/src/main/java/edu/berkeley/cs/jqf/fuzz/junit/GuidedFuzzing.java#L163

That is fine, and you call instrumentation API (which is SingleSnoop) and ask it to provide events.

However, GuidedFuzzing#setGuidance and GuidedFuzzing#getCurrentGuidance are never used in the instrumentation. The guidance is passed as a static field GuidedFuzzing#guidance which makes it harder to understand the way instantiation of zest guidance is connected with its use in the fuzz junit4 runner.

I agree JUnit4 runner does not have room for passing guiding instance, however, I guess it would make code easier to follow if you called GuidedFuzzing#getCurrentGuidance in JQF.java and passed it to new FuzzStatement(...).
In other words, FuzzStatement obviously requires knowledge of the coverage information. However, FuzzStatement constructor receives nothing related to coverage. That implies FuzzStatement is doing some magic in the implementation, which makes the thing hard to understand.

Of course, JQF.java and FuzzStatement can't really be reused in jqwik, so the discussion here is for understanding purposes only.

Rohan: JQF has a Maven plugin, which lets you run (Zest by default) via mvn jqf:fuzz, though it can also be run via CLI. It may be possible

@rohanpadhye , What I mean by library is as follows: https://www.programcreek.com/2011/09/what-is-the-difference-between-a-java-library-and-a-framework/
Currently, JQF/Zest is inclined towards the framework rather than a library, which makes it harder to integrate it into other libraries and applications.

The only use of mvn jqf:fuzz for jqwik is it can be used as a source of inspiration.
It does not mean jqf-maven-plugin is bad. It means just that zest/jqf as a library use case is not documented/tested.

@jlink
Copy link
Collaborator

jlink commented Jan 21, 2020

@rohanpadhye

Hi! I am the main developer of JQF.

Thanks for chiming in. Much appreciated.

The example is a motivating one that I could strive for to get running with jqwik as a first step. When I try to run it with mvn jqf:fuzz but my Maven won't find the jqf plugin. Any hint on that?

As for the instrumenting classloader, would it suffice to load the test container class through this classloader? I'm wondering which kind of lifecycle hook might be needed.

Last question: Would you be willing to move all junit4/junit-quickcheck related code in a separate artefact - or the other way round to create a jqf-core module?

@rohanpadhye
Copy link

@vlsi

I guess it would make code easier to follow if you called GuidedFuzzing#getCurrentGuidance in JQF.java and passed it to new FuzzStatement(...).

That's a reasonable suggestion and I'll open an issue to make this change.

Currently, JQF/Zest is inclined towards the framework rather than a library, which makes it harder to integrate it into other libraries and applications.

Yes, that is correct. There wasn't a need for exposing the fuzzing engine as a library as of now. However, if separating components into reusable library-like packages helps projects like jqwik, I am fine with refactoring.

@jlink

When I try to run it with mvn jqf:fuzz but my Maven won't find the jqf plugin. Any hint on that?

I created a standalone example here: https://github.com/rohanpadhye/jqf-zest-example. Let me know if following the README works.

As for the instrumenting classloader, would it suffice to load the test container class through this classloader? I'm wondering which kind of lifecycle hook might be needed.

I am not sure of this. JQF currently uses a separate plugin because of the control of specifying a custom classloader. I am not sure how to change the classloader used by the default test runner.

Would you be willing to move all junit4/junit-quickcheck related code in a separate artefact - or the other way round to create a jqf-core module?

Handling the dependency on junit-quickcheck should be straightforward, since all the dependent classes are already isolated into one directory. Regarding dependency on JUnit4, I will have to investigate further as to whether there is anything outside of this package that depends on JUnit classes. @vlsi already opened an issue in rohanpadhye/JQF#80 for this, which can be used if there is a need for this refactoring.

@rohanpadhye
Copy link

I am not sure of this. JQF currently uses a separate plugin because of the control of specifying a custom classloader. I am not sure how to change the classloader used by the default test runner.

To clarify, this is what the JQF Maven plugin currently does:

https://github.com/rohanpadhye/jqf/blob/143e42f3e070492de8ee34f2a94d4fdac0cb2eeb/maven-plugin/src/main/java/edu/berkeley/cs/jqf/plugin/FuzzGoal.java#L250-L254

I am not sure how to get this effect without using a custom plugin (i.e., how to change the classloader used by mvn test).

@jlink
Copy link
Collaborator

jlink commented Jan 21, 2020

@rohanpadhye I could get the isolated example to run (many thanks!). It does not find a unique failure, though. Using mvn:repro fails with

Cannot find or open file target/fuzz-results/examples.PatriciaTrieTest/testMap2Trie/failures/id_000000

BTW, the following jqwik property is able to detect the PatriciaTrie bug:

@Property(tries = 10000, maxDiscardRatio = 100)
void testMap2Trie(
		@ForAll Map<@StringLength(min = 1, max = 10) String, Integer> map,
		@ForAll @StringLength(min = 1, max = 10) String key
) {

	// Key should exist in map
	Assume.that(map.containsKey(key));   // the test is invalid if this predicate is not true

	// Create new trie with input `map`
	Trie trie = new PatriciaTrie(map);

	// The key should exist in the trie as well
	assertTrue(trie.containsKey(key));  // fails when map = {"x": 1, "x\0": 2} and key = "x"
}

You have to use the latest snapshot version "1.2.3-SNAPSHOT" though, because I had forgotten to add \u0000 as default edge case for character generation.

I am not sure how to get this effect without using a custom plugin (i.e., how to change the classloader used by mvn test).

This will require some experimentation to find out. In the best case an annotation will be enough to swap out classloaders for parts of the system. In a worse case a javaagent corresponding to your class loader might be required. In the very worst case both a Maven and Gradle plugin will be necessary.

@jlink
Copy link
Collaborator

jlink commented Jan 21, 2020

@rohanpadhye Tried the example a few more times. Eventually I got a unique failure.

After some code reading I guess that integrating JQF with jqwik will require an agent similar to what QuickTheories is doing. Here's an excerpt from https://github.com/quicktheories/QuickTheories#coverage-guidance:

There are some disadvantages to coverage guidance. In order to measure coverage QuickTheories must attach an agent to the JVM. The agent will be active from the moment it is installed until the JVM exits - this means it may be active while non QuickTheory tests are running. This will result in an ~10% reduction in performance, and may interfere with JaCoCo and other coverage systems if they are also active.

This probably means that any kind of integration should be in an optional artefact.

@vlsi
Copy link
Contributor Author

vlsi commented Jan 21, 2020

@jlink

This probably means that any kind of integration should be in an optional artefact

There might be other guidances.
For instance, one might want to find an input that triggers pathological memory allocation.

Luckily, there's ThreadMXBean.getThreadAllocatedBytes that can provide that information, and it does not really need javaagent or other instrumentation. It is already present in JVM.

So I would expect that jqwik should provide pluggable points for implementation of guided fuzzing, and implementation of zest-jqwik might indeed be located in a different artifact (or even in a different repository).

@jlink
Copy link
Collaborator

jlink commented Jan 21, 2020

@vlsi Even then. The dependency on JQF (without Zest) would be too strong IMO to put it into jqwik's core. Alternatively jqwik had to duplicate a lot of what JQF is bringing to the table. What jqwik has to offer in its core is the ability to change the stream of random bytes and to trigger some behaviour after each execution of a property. Given that I'd hope that the rest can be done as a module on top.

@vlsi
Copy link
Contributor Author

vlsi commented Jan 21, 2020

The dependency on JQF (without Zest) would be too strong IMO to put it into jqwik's core

No-one suggested adding a dependency on JQF I guess :)

@jlink
Copy link
Collaborator

jlink commented Jan 22, 2020

@vlsi To summarize our talk:

  1. We'll together work out a first attempt at a generic GenerationGuidance interface
  2. I'll implement it straightforwardly in a dedicated branch
  3. We'll try it out and iterate through implementations that do not require instrumentation, e.g.
    • Performance guidance: Search for long lasting program runs
    • Memory allocation guidance: Search for program runs that consume more memory
    • Instrument some code manually to provide coverage data, e.g. FizzBuzz

@jlink
Copy link
Collaborator

jlink commented Feb 5, 2020

Here's my current take considering all your input:

interface GenerationGuidance {

	/**
	 * Returns a reference to an iterator that will deliver
	 * integer values to feed the pseudo-random number generator for the next try.
	 *
	 * @throws IllegalStateException if there is no next try available
	 */
	Iterator<Integer> nextTry();

	/**
	 * Decide if another sample can be tried.
	 *
	 * Method could potentially block to wait for guiding algorithm to finish.
	 *
	 * If it returns false generation will be finished.
	 */
	boolean hasNextTry();

	/**
	 * Callback for observing actual generated sample passed to the property method.
	 */
	void observeGeneratedSample(List<Object> sample);

	/**
	 * Handles the result of a property try.
	 */
	void handleResult(TryResult result);

}

interface TryResult {
	enum Status {
		SATISFIED,
		FALSIFIED,
		INVALID
	}

	Status status();

	Optional<Throwable> throwable();
}

@vlsi
Copy link
Contributor Author

vlsi commented Feb 5, 2020

A guidance might want to check if a different sequence leads to a change in generated values at all.

Generated values do not always have equals/hashCode, so there's no much guidance can do to compare the values.

JQF itself does not use observeGeneratedArgs method, so I think we don't need it in the first versions.

@vlsi
Copy link
Contributor Author

vlsi commented Feb 5, 2020

#84 (comment) looks good, except I would skip observeGeneratedSample.

Iterator<Integer> nextTry();

It might be slightly better to have Stream<Integer> (or even java.util.stream.IntStream) there as Stream#close() would enable implementations to close the relevant resources if any.


I wonder if it makes sense to unify the result type of nextTry with String PropertyCheckResult#randomSeed().
For instance, add a common interface for seed and series of ints.
It is probably not that important for guidance, however, the reported seed for guided execution should include the list of ints somehow, and the user should be able to re-execute it later.


I'm not sure, however, void handleResult(TryResult result); might belong to another interface.

In other words, something behind the lines of:
A) RandomSource == nextTry + hasNextTry. It produces enough information for jqwik to instantiate a Random (or similar). It looks like interface RandomSource extends Stream<IntStream>.
B) AfterTry listener with handleResult. It might be useful even without guided generation.
C) GuidedGenerator == RandomSource + AfterTry combined.

Then, if the user wants to reproduce the failure (or hard-code a single test case), they might use a non-guided RandomSource which takes the list of ids from the failure case.

WDYT?

https://github.com/jlink/jqwik/blob/4eca4b4292cd97a4149cf87ce361a59e45319722/engine/src/main/java/net/jqwik/engine/properties/PropertyCheckResult.java#L32

@jlink
Copy link
Collaborator

jlink commented Feb 6, 2020

It might be slightly better to have Stream<Integer> (or even java.util.stream.IntStream) there as Stream#close() would enable implementations to close the relevant resources if any.

As far as I know Stream in Java is always lazy so it's not possible (or at least not straightforward) to consume one integer after the other until no longer needed. Closing behaviour - when necessary - can be done when handleResult is being called.

I'm not sure, however, void handleResult(TryResult result); might belong to another interface.

Maybe, maybe not. Too much speculation for the time being. I'd rather start with a proof of concept and then learn from that.

Then, if the user wants to reproduce the failure (or hard-code a single test case), they might
use a non-guided RandomSource which takes the list of ids from the failure case.

There's already a different way to reproduce a failing sample so that's covered for the time being.

@jlink
Copy link
Collaborator

jlink commented Feb 6, 2020

So here's the current state. I removed the observe... method and renamed TryResult to TryExecutionResult because it fits better to what I'm currently doing in the code base.

interface GenerationGuidance {

	/**
	 * Returns a reference to an iterator that will deliver
	 * integer values to feed the pseudo-random number generator for the next try.
	 *
	 * @throws IllegalStateException if there is no next try available
	 */
	Iterator<Integer> nextTry();

	/**
	 * Decide if another sample can be tried.
	 *
	 * Method could potentially block to wait for guiding algorithm to finish.
	 *
	 * If it returns false generation will be finished.
	 */
	boolean hasNextTry();

	/**
	 * Handles the result of a property try.
	 */
	void handleResult(TryExecutionResult result);

}

interface TryExecutionResult {
	enum Status {
		SATISFIED,
		FALSIFIED,
		INVALID
	}

	Status status();

	Optional<Throwable> throwable();
}

@jlink
Copy link
Collaborator

jlink commented Feb 6, 2020

I'm wondering if InputStream isn't as bad as I initially thought since it is Closeable. Or maybe introduce a dedicated byte or integer based stream:

interface ByteStream implements Closeable {
   byte next();
   boolean hasNext();
}

@vlsi
Copy link
Contributor Author

vlsi commented Feb 6, 2020

Stream in Java is always lazy so it's not possible (or at least not straightforward) to consume one integer after the other until no longer needed

There's

java.util.stream.Stream#iterator.

IntStream is specialized as PrimitiveIterator.OfInt iterator();, so it is more-or-less the same as regular iterator, yet is has no boxing overhead.
So IntStream seems to convey the proper semantics, it has close(), and it is easy to instantiate in the client code (e.g. Arrays.stream(int[]))


Just in case: Stream#spliterator produces a Spliterator which can be processed item by item via Spliterator#tryAdvance. In other words, Stream should be fine for item-by-item pull processing.

@jlink
Copy link
Collaborator

jlink commented Feb 6, 2020

That's what I meant with "not straightforward". I don't think Stream should be used that way. It's against what I'd call the assumed lazyness of a stream.

I'll go with either Iterator<Integer> or a specialized type like ByteStream as described above.

@vlsi
Copy link
Contributor Author

vlsi commented Feb 6, 2020

I don't think it is worth spending time on the discussion like this, however, I truly do not see why assumed lazyness prevents the use of IntStream as an API.

Stream does support the notion of java.util.Spliterator#ORDERED.
Java 1.8 has even java.util.Random#ints() that returns java.util.stream.IntStream.
Spliterator has tryAdvance which is designed to consume elements one by one.

IntSteram is easy to instantiate (e.g. Arrays.stream(int[]), Random#ints(), IntStream.of() and so on), while specialized type would couple Guidance implementation to jqwik APIs.

Iterator<Integer> would involve box-unbox of the integers, and it might easily impact performance.
There's java.util.PrimitiveIterator.OfInt interface which is iterator over ints without boxing, however, I still think IntStream makes more sense there.

In other words, Java architects were more or less fine with using IntStream for the stream of ints.

I don't see why it does not fit for jqwik.

@jlink
Copy link
Collaborator

jlink commented Feb 6, 2020

I don't think it is worth spending time on the discussion like this

Absolutely not worth it. We just don't agree here.

@vlsi
Copy link
Contributor Author

vlsi commented Feb 6, 2020

Just in case:

val iter = IntStream.of(1, 2, 3, 4).iterator()
while (iter.hasNext()) {
    println(iter.nextInt())
}

^^^ the above is straightforward stream processing to me

@jlink
Copy link
Collaborator

jlink commented Feb 6, 2020

It's creating an iterator again. And creation would probably have to go through IntStream.iterate() or IntStream.generate(). Let it be.

@jlink
Copy link
Collaborator

jlink commented Feb 7, 2020

@vlsi Now that we must accept that we won't agree on every design aspect, do you still consider it worthwhile going forward with a proof of concept?

@jlink
Copy link
Collaborator

jlink commented Feb 7, 2020

This is my current best bet:

interface GenerationGuidance {

	/**
	 * Returns a reference to a source that will deliver
	 * integer values to feed the pseudo-random number generator for the next try.
	 *
	 * @throws IllegalStateException if there is no next try available
	 */
	TryGenerationSource nextTry();

	/**
	 * Decide if another sample can be tried.
	 * <p>
	 * Method could potentially block to wait for guiding algorithm to finish.
	 * <p>
	 * If it returns false generation will be finished.
	 */
	boolean hasNextTry();

	/**
	 * Handles the result of a property try.
	 */
	void handleResult(TryExecutionResult result);

}

interface TryExecutionResult {
	enum Status {
		SATISFIED,
		FALSIFIED,
		INVALID
	}

	Status status();

	Optional<Throwable> throwable();
}

/**
 * Source for providing integer values.
 */
interface TryGenerationSource extends AutoCloseable {
	int next();

	boolean hasNext();

	/**
	 * Will be called when no more values are necessary for
	 * generating the parameters of the current try.
	 */
	@Override
	default void close() {
		// Optional
	}
}

@vlsi
Copy link
Contributor Author

vlsi commented Feb 7, 2020

This is my current best bet:

LGTM

@jlink
Copy link
Collaborator

jlink commented Feb 7, 2020

I wonder if TryGenerationSource could be simplified to

/**
 * Source for providing integer values.
 */
interface TryGenerationSource extends AutoCloseable {
	
	/** There must always be another value to feed generation */
	int more();

	/**
	 * Will be called when no more values are necessary for
	 * generating the parameters of the current try.
	 */
	@Override
	default void close() {
		// Optional
	}
}

@vlsi
Copy link
Contributor Author

vlsi commented Feb 7, 2020

I wonder if TryGenerationSource could be simplified to

I guess both of them are OK provided there's the following factory method :)

interface TryGenerationSource extends AutoCloseable {
    static TryGenerationSource of(IntStream source);
}

@vlsi
Copy link
Contributor Author

vlsi commented Feb 7, 2020

However, I agree the random generator should better be treated as an infinite source.
I don't think the test engine can handle out of randomness exception.

@jlink
Copy link
Collaborator

jlink commented Jun 27, 2020

@vlsi Are you still interested in the feature and would have time to experiment with it? If so I'll move (a branch with) the basic mechanism towards the top of my todo list.

@vlsi
Copy link
Contributor Author

vlsi commented Jun 27, 2020

@jlink , that's on my list, however, I don't think I would be able to pick it in the nearest month :-/

@jlink
Copy link
Collaborator

jlink commented Jun 27, 2020

@vlsi OK. Then I'll leave it in my plans for the summer.

@jlink
Copy link
Collaborator

jlink commented Aug 27, 2020

This paper propagates coverage guided property testing as a more effective variant of what AFL does: https://www.cs.umd.edu/~mwh/papers/fuzzchick.pdf
Might be interesting to consider.

@fduminy
Copy link

fduminy commented Mar 10, 2021

any news on that issue ?

@jlink
Copy link
Collaborator

jlink commented Mar 11, 2021

Implementing the basic hook in jqwik is probably possible with reasonable effort. But someone - not me since I’m short on time - would then have to use it for adding AFL-like mutation and coverage it.

Any volunteers? I might push it up in the prioritised todo list.

@vlsi
Copy link
Contributor Author

vlsi commented Jan 9, 2022

While I do not have immediate plans to implement coverage-guided generation yet, however, the following looks useful and relevant: https://tiemoko.com/blog/diff-fuzz/

@vlsi
Copy link
Contributor Author

vlsi commented May 14, 2022

Here's an article that suggests to split random generators that affect structure of the input from the ones that affect data.

https://twitter.com/cestlemieux/status/1524806183466541056

@adam-waldenberg
Copy link

adam-waldenberg commented Mar 8, 2023

Any more thoughts on this? I think this would have been very cool in JQWik. Reading the papers and the resources the Zest algorithm sounds particularly interesting.

I'm just wondering what the best way to get something like this implemented is ? It requires instrumentation of the classes being tested, so coverage can actually be tracked. For reference I'm attaching a picture here of the Zest algo.

image
(Source: https://rohan.padhye.org/files/zest-issta19.pdf)

This might be a fun exercise to implement at some point.... We could probably rip the instrumentation project from JQF and then implement the above logic in a hook that intercepts each try.

Some questions come up though,

How would we deal with @Provide methods? Or even @Provide methods with nested arbitraries...?
How would we deal with the generated sets - we don't want to generate series of values that we never get to use.

@jlink
Copy link
Collaborator

jlink commented Mar 9, 2023

There's already a Java PBT lib that supports coverage-guided generation: https://github.com/quicktheories/QuickTheories. QT also shows that instrumentation through an agent is feasible.
As for plans to implement it, it will probably have to wait for jqwik 2, which I want to start working on this year.

I don't understand the point about @Provide methods - which are not different than any other arbitrary by the way. Generation always happens one by one, i.e. only if another set of values is needed; there's no eager generation of values. But I propbably missed your point.

@vlsi
Copy link
Contributor Author

vlsi commented Mar 9, 2023

I guess #424 might be relevant as it would be hard to introduce "coverage-guided" as long as jqwik uses java.util.Random.

@adam-waldenberg
Copy link

@jlink Cool. Last time I played with it, it didn't have this feature.

As for @Provide, I have noticed that a provider will be called quite a few times before a property is even executed the first time - with many combinations it can take quite a while. I guess I falsely assumed this was some kind of eager step it was doing. There has not been any point for me to take a closer look at it yet.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

6 participants