Commit
1 parent 915661b, commit bdd0c7f
Showing 24 changed files with 939 additions and 27 deletions.
@@ -0,0 +1,91 @@
# Fuzzing

Formality programs can be fuzzed with [bolero](https://crates.io/crates/bolero), but as of now the integration is fairly weak and requires some manual intervention. One challenge is that bolero's traits do not allow us to easily thread context through, so we must rely on static variables. The other is that we wish to steer the fuzzer to generate "mostly valid" programs, which means that the integration cannot be fully auto-generated.

## Auto-derive and when to use it

Enabling most types for use with bolero is as simple as annotating types with `#[derive(bolero::TypeGenerator)]`.
This works great as long as any value for the fields is potentially valid.
But when you have fields that are meant to be references to other items in the program, you are likely to get nonsense:
not necessarily a *problem*, but likely a waste of fuzzing time and effort.
Navigating this requires writing custom fuzzing implementations.

## Common reference types

The two most common "references" are identifiers (e.g., names of structs) and type variables.
As these are part of formality-core, we have some built-in support for generating them.

### FuzzSingleton

The `formality_core::fuzz::FuzzSingleton` is a useful type for setting up global context that can be accessed from `TypeGenerator` implementations. To use it, declare a static like this:

```rust
static F: FuzzSingleton<Vec<u32>> = FuzzSingleton::new();
```

You can then use `F.get()` to read the current value (initialized with `Default::default`).

You can modify the value with methods like `F.set()` and `F.fuzz_push()`. These return a "guard" value that undoes the change when dropped: the guard from `set` resets the singleton to its default value, and the guard from `fuzz_push` pops the entries that were pushed. Make sure you drop that guard at the appropriate point, typically at the end of the block:

```rust
let _guard = F.set(new_value);
```

The `set` method is intended to be used once per fuzzing session and hence asserts that the value is currently at its default value.

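To make the guard behaviour concrete, here is a minimal, std-only sketch of the pattern. It is a simplified stand-in, not the `FuzzSingleton` implementation itself: `Singleton`, `ResetGuard`, `NAMES`, and `demo` are hypothetical names, and `get` returns a clone rather than a lock guard to keep the example short.

```rust
use std::sync::{OnceLock, RwLock};

/// Hypothetical, simplified stand-in for a fuzzing singleton (std only).
pub struct Singleton<T> {
    data: OnceLock<RwLock<T>>,
}

/// Resets the singleton to `T::default()` when dropped.
pub struct ResetGuard<'s, T: Default> {
    owner: &'s Singleton<T>,
}

impl<T: Default + PartialEq + Clone> Singleton<T> {
    /// `const` so that the value can live in a `static`.
    pub const fn new() -> Self {
        Self { data: OnceLock::new() }
    }

    /// Access the lock, initializing the value to its default on first use.
    fn lock(&self) -> &RwLock<T> {
        self.data.get_or_init(|| RwLock::new(T::default()))
    }

    /// Install a value; panics if an earlier value is still installed.
    pub fn set(&self, value: T) -> ResetGuard<'_, T> {
        let mut data = self.lock().write().unwrap();
        assert!(*data == T::default(), "singleton is already set");
        *data = value;
        ResetGuard { owner: self }
    }

    /// Read a copy of the current value.
    pub fn get(&self) -> T {
        self.lock().read().unwrap().clone()
    }
}

impl<T: Default> Drop for ResetGuard<'_, T> {
    fn drop(&mut self) {
        if let Some(lock) = self.owner.data.get() {
            *lock.write().unwrap() = T::default();
        }
    }
}

static NAMES: Singleton<Vec<String>> = Singleton::new();

/// Returns the value seen while the guard is alive and after it is dropped.
pub fn demo() -> (Vec<String>, Vec<String>) {
    let during = {
        let _guard = NAMES.set(vec!["Foo".to_string()]);
        NAMES.get()
    }; // `_guard` is dropped here, resetting `NAMES` to its default
    let after = NAMES.get();
    (during, after)
}
```

Because the guard resets the value on drop, `demo` can be called repeatedly without tripping the assertion in `set`.
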
### Identifiers

When you declare an identifier type `SomeId` with the `id!` macro,
it also creates an associated "fuzzing pool" accessible via `SomeId::fuzz_pool()`.
This is a static vector of identifiers you can use.
When an identifier is fuzzed, it will pick a value from the fuzzing pool.

The fuzzing pool starts empty. You can add entries to it by invoking `fuzz_push` on the fuzzing pool, which returns a guard that removes them again when dropped.

A common pattern is to generate a set of identifiers early on (e.g., a set of struct names your program will have) and then set the fuzzing pool to contain those identifiers.

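The pool pattern described above can be sketched with the standard library alone. Everything here is an assumption made for illustration: `POOL`, `PopGuard`, `push_names`, and `pick_name` are hypothetical names, and a single input byte stands in for the fuzzer's driver.

```rust
use std::sync::Mutex;

/// Hypothetical stand-in for an identifier type's fuzzing pool.
static POOL: Mutex<Vec<String>> = Mutex::new(Vec::new());

/// Truncates the pool back to its earlier length when dropped.
pub struct PopGuard {
    len: usize,
}

impl Drop for PopGuard {
    fn drop(&mut self) {
        POOL.lock().unwrap().truncate(self.len);
    }
}

/// Add names to the pool; they are removed when the guard is dropped.
pub fn push_names(names: &[&str]) -> PopGuard {
    let mut pool = POOL.lock().unwrap();
    let len = pool.len();
    pool.extend(names.iter().map(|n| n.to_string()));
    PopGuard { len }
}

/// Pick a name using one byte of fuzzer input, or `None` if the pool is empty.
pub fn pick_name(input: u8) -> Option<String> {
    let pool = POOL.lock().unwrap();
    if pool.is_empty() {
        return None;
    }
    Some(pool[input as usize % pool.len()].clone())
}
```

Picking only from the pool means every generated identifier refers to something that was declared earlier, which is the point of the pattern.
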
### Bound variables

The built-in variable types (universal, existential, etc.) have fuzzing implementations.
The intention is for the fuzzer to generate only closed terms, with no free variables.

Every formality language `L` has a fuzzing pool of variables in scope that can be referenced.
This begins as empty.
You can push new entries onto it with `L::open_fuzz_binder(kinds)`, which will create a set of variables `V` with the given kinds.
It returns a guard that will remove those variables from scope; the guard also has a method `into_binder`
that can be used to close over the variables `V` and create a `Binder<T>`.

Fuzzing a variable generates a reference to one of the variables pushed by `open_fuzz_binder`.
These are `BoundVariable` elements with those kinds and with depth set to `None`, just as you get with `Binder::open`.
They are meant to be enclosed later in a binder with the `into_binder` method.

When you fuzz a `Binder<T>`, it follows this sequence:

* Fuzz some set of kinds `K`
* Invoke `guard = L::open_fuzz_binder(K)` to push a set of variables `V` in scope
* Fuzz a `T` that will reference variables in `V` (and possibly others that are already in scope)
* Close over the `T` with `guard.into_binder(T)`, returning a `Binder<T>` where each reference to a variable in `V` now refers to an element in the binder. The variables in `V` are also removed from scope.

If you wish to fuzz a value that references a known set of variables, you can do so by invoking `L::open_fuzz_binder` yourself.

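The four-step sequence above can be illustrated with a toy model. This is a sketch under loud assumptions, not the formality-core API: variables are plain `u32` ids, the "kinds" are reduced to a count, the body is a flat list of variable references, and `SCOPE`, `Var`, `Binder`, `ScopeGuard`, `open_binder`, `pick_var`, and `fuzz_binder` are all hypothetical names.

```rust
use std::cell::{Cell, RefCell};

thread_local! {
    /// Variables currently in scope (toy stand-in for the language's pool).
    static SCOPE: RefCell<Vec<u32>> = RefCell::new(Vec::new());
    /// Counter used to mint fresh variable ids.
    static NEXT_ID: Cell<u32> = Cell::new(0);
}

#[derive(Clone, Copy, Debug, PartialEq)]
pub enum Var {
    /// A fresh variable that has not been closed over yet.
    Fresh(u32),
    /// A reference to the binder's parameter at this index.
    Bound(usize),
}

#[derive(Debug, PartialEq)]
pub struct Binder {
    pub arity: usize,
    pub body: Vec<Var>,
}

/// Removes the opened variables from scope when dropped.
pub struct ScopeGuard {
    vars: Vec<u32>,
    len: usize,
}

/// Create `count` fresh variables and push them into scope.
pub fn open_binder(count: usize) -> ScopeGuard {
    let vars: Vec<u32> = (0..count)
        .map(|_| NEXT_ID.with(|n| { let id = n.get(); n.set(id + 1); id }))
        .collect();
    let len = SCOPE.with(|s| {
        let mut scope = s.borrow_mut();
        let len = scope.len();
        scope.extend(vars.iter().copied());
        len
    });
    ScopeGuard { vars, len }
}

/// Pick an in-scope variable from one input byte, or `None` if none exist.
pub fn pick_var(input: u8) -> Option<Var> {
    SCOPE.with(|s| {
        let scope = s.borrow();
        if scope.is_empty() {
            return None;
        }
        Some(Var::Fresh(scope[input as usize % scope.len()]))
    })
}

impl ScopeGuard {
    /// Close over the body: references to this guard's variables become `Bound`.
    pub fn into_binder(self, body: Vec<Var>) -> Binder {
        let body: Vec<Var> = body
            .into_iter()
            .map(|v| match v {
                Var::Fresh(id) => match self.vars.iter().position(|&x| x == id) {
                    Some(index) => Var::Bound(index),
                    None => v,
                },
                Var::Bound(_) => v,
            })
            .collect();
        Binder { arity: self.vars.len(), body }
    } // `self` is dropped here, which pops the variables from scope
}

impl Drop for ScopeGuard {
    fn drop(&mut self) {
        SCOPE.with(|s| s.borrow_mut().truncate(self.len));
    }
}

/// Choose a count, open the binder, fuzz the body, then close over it.
pub fn fuzz_binder(input: &[u8]) -> Option<Binder> {
    let (&count, rest) = input.split_first()?;
    let guard = open_binder(count as usize % 3);
    let body: Option<Vec<Var>> = rest.iter().map(|&b| pick_var(b)).collect();
    Some(guard.into_binder(body?))
}
```

If the body cannot be generated, the early return drops the guard, so the scope is restored on that path as well.
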
## How formality-Rust fuzzing works

The formality Rust fuzzer tries to generate "mostly well-kinded" programs. The pattern for generating structs is as follows:

* Generate a set of struct names and their generic parameters, effectively a set of `(String, Vec<ParameterKind>)` tuples.
    * Note that a **string** is used to let the fuzzer generate fresh names.
* Push each of the struct names into `AdtId::fuzz_pool()` as an available name for reference.
    * Also store the kinds into a `FuzzSingleton<Map<AdtId, Vec<ParameterKind>>>`
* Generate struct definitions as follows. For each defined struct `(S, K)` with name `S` and kinds `K`:
    * Manually invoke `let guard = L::open_fuzz_binder(K)` to bring those parameters into scope
    * Generate the "body" of the struct (`StructBoundData`)
    * Close over `guard.into_binder(body)` to get a `Binder<StructBoundData>` that can be used in the struct definition
* Generate references to structs as follows. Whenever we fuzz a type, provide a custom `TypeGenerator` impl that will
    * Fuzz an `AdtId`, picking a struct name from the available list
    * Look up its kinds from the map
    * Generate a suitable set of parameters that match those kinds

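The last step, generating a well-kinded reference to a declared struct, might look roughly like the following toy sketch. It does not use the real formality-Rust types: `Kind`, `Arg`, `StructRef`, and `fuzz_struct_ref` are hypothetical, a `BTreeMap` stands in for the kinds map, and a byte iterator stands in for the fuzzer's driver.

```rust
use std::collections::BTreeMap;

/// Toy parameter kinds (hypothetical stand-in for `ParameterKind`).
#[derive(Clone, Copy, Debug, PartialEq)]
pub enum Kind {
    Ty,
    Lt,
}

/// Toy generic argument, tagged with the kind it satisfies.
#[derive(Debug, PartialEq)]
pub enum Arg {
    Ty(String),
    Lt(String),
}

#[derive(Debug, PartialEq)]
pub struct StructRef {
    pub name: String,
    pub args: Vec<Arg>,
}

/// Pick a declared struct, look up its kinds, and build one argument per kind.
/// Returns `None` if nothing is declared or the input runs out.
pub fn fuzz_struct_ref(
    declared: &BTreeMap<String, Vec<Kind>>,
    input: &mut impl Iterator<Item = u8>,
) -> Option<StructRef> {
    if declared.is_empty() {
        return None;
    }
    let index = input.next()? as usize % declared.len();
    let (name, kinds) = declared.iter().nth(index)?;
    let mut args = Vec::new();
    for kind in kinds {
        let byte = input.next()?;
        args.push(match kind {
            Kind::Ty => Arg::Ty(format!("T{byte}")),
            Kind::Lt => Arg::Lt(format!("'l{byte}")),
        });
    }
    Some(StructRef { name: name.clone(), args })
}
```

Driving the argument list from the stored kinds is what keeps references "mostly well-kinded": the fuzzer still chooses which struct and which arguments, but never the wrong number or kind of parameters.
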
@@ -0,0 +1,60 @@
use crate::{fold::CoreFold, fuzz::PushGuard, language::Language, variable::CoreBoundVar};

use super::{fresh_bound_var, CoreBinder};

/// Brings new variables into scope for fuzzing.
/// Don't invoke directly, instead call `L::open_fuzz_binder`.
pub(crate) fn open_fuzz_binder_impl<L>(kinds: &[L::Kind]) -> PushKindGuard<L>
where
    L: Language,
{
    let variables: Vec<_> = kinds.iter().map(|k| fresh_bound_var(*k)).collect();
    PushKindGuard {
        guard: L::fuzz_free_variables().fuzz_push(variables.clone()),
        variables,
    }
}

/// The guard returned when you open a binder for fuzzing.
/// It references a set of variables `V` that were brought into scope.
/// You should invoke `self.into_binder(t)` on the fuzzed term `t` to create a `Binder<T>`
/// where each variable in `V` is converted to a reference to the binder.
///
/// See the Formality Book [chapter on fuzzing][f] for more details.
///
/// [f]: https://rust-lang.github.io/a-mir-formality/formality_core/fuzzing.html
#[must_use]
pub struct PushKindGuard<L: Language> {
    #[allow(dead_code)] // the point of this field is to run the destructor
    guard: PushGuard<'static, CoreBoundVar<L>>,
    variables: Vec<CoreBoundVar<L>>,
}

impl<L: Language> PushKindGuard<L> {
    /// Access the variables that were brought into scope.
    pub fn variables(&self) -> &Vec<CoreBoundVar<L>> {
        &self.variables
    }

    /// Convert into a binder.
    pub fn into_binder<T>(self, bound_term: T) -> CoreBinder<L, T>
    where
        T: CoreFold<L>,
    {
        CoreBinder::new(self.variables, bound_term)
    }
}

impl<L: Language, T> bolero::TypeGenerator for CoreBinder<L, T>
where
    T: bolero::TypeGenerator + CoreFold<L>,
    L::Kind: bolero::TypeGenerator,
{
    /// Generate a binder with some fresh data inside.
    fn generate<D: bolero::Driver>(driver: &mut D) -> Option<Self> {
        let kinds: Vec<L::Kind> = driver.gen()?;
        let guard = L::open_fuzz_binder(&kinds);
        let bound_term: T = driver.gen()?;
        Some(guard.into_binder(bound_term))
    }
}
@@ -0,0 +1,135 @@
//! See the Formality Book [chapter on fuzzing][f] for more details.
//!
//! [f]: https://rust-lang.github.io/a-mir-formality/formality_core/fuzzing.html
use std::fmt::Debug;
use std::sync::OnceLock;
use std::{ops::Deref, sync::RwLock};

use crate::Map;

/// A global singleton accessible from anywhere.
/// Used to thread state for fuzzing.
///
/// See the Formality Book [chapter on fuzzing][f] for more details.
///
/// [f]: https://rust-lang.github.io/a-mir-formality/formality_core/fuzzing.html
pub struct FuzzSingleton<I> {
    data: OnceLock<RwLock<I>>,
}

/// Trait for clearing the value of a singleton.
/// Not invoked from outside.
trait FuzzClear {
    fn clear(&self);
}

/// Guard that clears the singleton back to its default value after `set` is called.
pub struct SetGuard<'s> {
    pool: &'s dyn FuzzClear,
}

impl<I> FuzzSingleton<I>
where
    I: Debug + Default + Eq,
{
    /// Create a new instance; this is a `const fn` so that the value can be stored in a `static`.
    pub const fn new() -> Self {
        Self {
            data: OnceLock::new(),
        }
    }

    /// Internal method: access the data, initializing if needed.
    fn data(&self) -> &RwLock<I> {
        self.data.get_or_init(|| RwLock::new(I::default()))
    }

    /// Set the value of the singleton.
    ///
    /// This is intended to be called once per fuzzing session,
    /// so it will panic if the value has already been "set" from its default value.
    ///
    /// Returns a guard that will restore the value to the default when dropped.
    /// Once this guard is dropped, you can call `set` again.
    pub fn set(&self, new_data: I) -> SetGuard<'_> {
        let mut data = self.data().write().unwrap();
        let old_data = std::mem::replace(&mut *data, new_data);
        assert!(
            I::default() == old_data,
            "cannot fuzz more than one program at a time, already fuzzing `{old_data:?}`"
        );
        SetGuard { pool: self }
    }

    /// Access the data. The returned value holds a read lock, so calling `set` before it is dropped can deadlock.
    pub fn get(&self) -> impl Deref<Target = I> + '_ {
        self.data().read().unwrap()
    }
}

impl<I> FuzzClear for FuzzSingleton<I>
where
    I: Debug + Default + Eq,
{
    /// Reset the value to its default.
    /// Invoked by the guard returned from `set`.
    fn clear(&self) {
        *self.data().write().unwrap() = Default::default();
    }
}

impl<K, V> FuzzSingleton<Map<K, V>>
where
    K: Debug + Ord,
    V: Debug + Eq + Clone,
{
    pub fn get_key(&self, key: K) -> V {
        self.get().get(&key).cloned().unwrap()
    }
}

impl<E: Clone + Debug + Eq> FuzzSingleton<Vec<E>> {
    /// Pick one of the available names from the fuzzer,
    /// returning `None` if there are no available names or the fuzzer
    /// ran out of data.
    pub fn fuzz_pick(&self, driver: &mut impl bolero::Driver) -> Option<E> {
        let data = self.get();
        if data.is_empty() {
            return None;
        }
        let i = driver.gen_variant(data.len(), 0)?;
        Some(data[i].clone())
    }

    /// Push names into the pool of available values that will be used later
    /// by `fuzz_pick`. Returns a guard that will pop them from the pool.
    pub fn fuzz_push(&self, values: Vec<E>) -> PushGuard<'_, E> {
        let mut data = self.data().write().unwrap();
        let len = data.len();
        data.extend(values);
        PushGuard { pool: self, len }
    }
}

/// Guard to pop names added by `fuzz_push`.
pub struct PushGuard<'s, E: Clone + Debug + Eq> {
    pool: &'s FuzzSingleton<Vec<E>>,
    len: usize,
}

impl<E> Drop for PushGuard<'_, E>
where
    E: Clone + Debug + Eq,
{
    fn drop(&mut self) {
        let mut data = self.pool.data().write().unwrap();
        data.truncate(self.len);
    }
}

impl Drop for SetGuard<'_> {
    fn drop(&mut self) {
        self.pool.clear();
    }
}