From 1947104a126107303b4f923dcea639de7cabb93a Mon Sep 17 00:00:00 2001 From: Caio Date: Fri, 20 Nov 2020 15:46:27 -0300 Subject: [PATCH 1/8] A new stack-based vector --- text/2978-stack_based_vec.md | 281 +++++++++++++++++++++++++++++++++++ 1 file changed, 281 insertions(+) create mode 100644 text/2978-stack_based_vec.md diff --git a/text/2978-stack_based_vec.md b/text/2978-stack_based_vec.md new file mode 100644 index 00000000000..2558bc871bc --- /dev/null +++ b/text/2978-stack_based_vec.md @@ -0,0 +1,281 @@ +- Feature Name: `stack_based_vec` +- Start Date: 2020-09-27 +- RFC PR: [rust-lang/rfcs#2990](https://github.com/rust-lang/rfcs/pull/2990) +- Rust Issue: [rust-lang/rust#0000](https://github.com/rust-lang/rust/issues/0000) + +# Summary +[summary]: #summary + +This RFC, which depends and takes advantage of the upcoming stabilization of constant generics (min_const_generics), tries to propose the creation of a new "growable" vector named `ArrayVec` that manages stack memory and can be seen as an alternative for the built-in structure that handles heap-allocated memory, aka `alloc::vec::Vec`. + +# Motivation +[motivation]: #motivation + +`core::collections::ArrayVec` has several use-cases and should be conveniently added into the standard library due to its importance. + +### Unification + +There are a lot of different crates about the subject that tries to do roughly the same thing, a centralized implementation would stop the current fragmentation. + +### Optimization + +Stack-based allocation is generally faster than heap-based allocation and can be used as an optimization in places that otherwise would have to call an allocator. Some resource-constrained embedded devices can also benefit from it. + +### Building block + +Just like `Vec`, `ArrayVec` is also a primitive vector where high-level structures can use it as a building block. For example, a stack-based matrix or binary heap. 
+ +### Useful in the real world + +`arrayvec` is one of the most downloaded project of `crates.io` and is used by thousand of projects, including Rustc itself. + + +# Guide-level explanation +[guide-level-explanation]: #guide-level-explanation + +`ArrayVec` is a container that encapsulates fixed size buffers. + +```rust +let mut v: ArrayVec = ArrayVec::new(); +let _ = v.push(1); +let _ = v.push(2); + +assert_eq!(v.len(), 2); +assert_eq!(v[0], 1); + +assert_eq!(v.pop(), Some(2)); +assert_eq!(v.len(), 1); + +v[0] = 7; +assert_eq!(v[0], 7); + +v.extend([1, 2, 3].iter().copied()); + +for element in &v { + println!("{}", element); +} +assert_eq!(v, [7, 1, 2, 3]); +``` + +Instead of relying on a heap-allocator, stack-based memory area is added and removed on-demand in a last-in-first-out (LIFO) order according to the calling workflow of a program. `ArrayVec` takes advantage of this predictable behavior to reserve an exactly amount of uninitialized bytes up-front and these bytes form a buffer where elements can be included dynamically. + +```rust +// `array_vec` can store up to 64 elements +let mut array_vec: ArrayVec = ArrayVec::new(); +``` + +Of course, fixed buffers lead to inflexibility because unlike `Vec`, the underlying capacity can not expand at run-time and there will never be more than 64 elements in the above example. + +```rust +// This vector can store up to 0 elements, therefore, nothing at all +let mut array_vec: ArrayVec = ArrayVec::new(); +let push_result = array_vec.push(1); +// Ooppss... Our push operation wasn't successful +assert!(push_result.is_err()); +``` + +A good question is: Should I use `core::collections::ArrayVec` or `alloc::collections::Vec`? Well, `Vec` is already good enough for most situations while stack allocation usually shines for small sizes. + +* Do you have a known upper bound? + +* How much memory are you going to allocate for your program? The default values of `RUST_MIN_STACK` or `ulimit -s` might not be enough. 
+ +* Are you using nested `Vec`s? `Vec>` might be better than `Vec>`. + +Each use-case is different and should be pondered individually. In case of doubt, stick with `Vec`. + +For a more technical overview, take a look at the following operations: + +```rust +// `array_vec` has a pre-allocated memory of 2048 bits (32 * 64) that can store up +// to 64 decimals. +let mut array_vec: ArrayVec = ArrayVec::new(); + +// Although reserved, there isn't anything explicitly stored yet +assert_eq!(array_vec.len(), 0); + +// Initializes the first 32 bits with a simple '1' decimal or +// 00000000 00000000 00000000 00000001 bits +array_vec.push(1); + +// Our vector memory is now split into a 32/2016 pair of initialized and +// uninitialized memory respectively +assert_eq!(array_vec.len(), 1); +``` + +# Reference-level explanation +[reference-level-explanation]: #reference-level-explanation + +`ArrayVec` is a contiguous memory block where elements can be collected, therefore, a collection by definition and even though `core::collections` doesn't exist, it is the most natural module placement. + +The API basically mimics most of the current `Vec` surface with some tweaks to manage capacity. + +Notably, these tweaked methods are checked (out-of-bound inputs or invalid capacity) versions of some well-known functions like `push` that will return `Result` instead of panicking at run-time. Since the upper capacity bound is known at compile-time and the majority of methods are `#[inline]`, the compiler is likely going to remove most of the conditional bounding checking. + +```rust +// Please, bare in mind that these methods are simply suggestions. Discussions about the +// API should probably take place elsewhere. 
+ +pub struct ArrayVec { + data: MaybeUninit<[T; N]>, + len: usize, +} + +impl ArrayVec { + // Constructors + + pub const fn from_array(array: [T; N]) -> Self; + + pub const fn from_array_and_len(array: [T; N], len: usize) -> Self; + + pub const fn new() -> Self; + + // Methods + + pub const fn as_mut_ptr(&mut self) -> *mut T; + + pub const fn as_mut_slice(&mut self) -> &mut [T]; + + pub const fn as_ptr(&self) -> *const T; + + pub const fn as_slice(&self) -> &[T]; + + pub const fn capacity(&self) -> usize; + + pub fn clear(&mut self); + + pub fn dedup(&mut self) + where + T: PartialEq; + + pub fn dedup_by(&mut self, same_bucket: F) + where + F: FnMut(&mut T, &mut T) -> bool; + + pub fn dedup_by_key(&mut self, mut key: F) + where + F: FnMut(&mut T) -> K, + K: PartialEq; + + pub fn drain(&mut self, range: R) -> Option> + where + R: RangeBounds; + + pub fn extend_from_cloneable_slice<'a>(&mut self, other: &'a [T]) -> Result<(), &'a [T]> + where + T: Clone; + + pub fn extend_from_copyable_slice<'a>(&mut self, other: &'a [T]) -> Result<(), &'a [T]> + where + T: Copy; + + pub fn insert(&mut self, idx: usize, element: T) -> Result<(), T>; + + pub const fn is_empty(&self) -> bool; + + pub const fn len(&self) -> usize; + + pub fn pop(&mut self) -> Option; + + pub fn push(&mut self, element: T) -> Result<(), T>; + + pub fn remove(&mut self, idx: usize) -> Option; + + pub fn retain(&mut self, mut f: F) + where + F: FnMut(&mut T) -> bool; + + pub fn splice(&mut self, range: R, replace_with: I) -> Option> + where + I: IntoIterator, + R: RangeBounds; + + pub fn split_off(&mut self, at: usize) -> Option; + + pub fn swap_remove(&mut self, idx: usize) -> Option; + + pub fn truncate(&mut self, len: usize); +} +``` + +Meaningless, unstable and deprecated methods like `reserve` or `drain_filter` weren't considered. A concrete implementation is available at https://github.com/c410-f3r/stack-based-vec. 
+ +# Drawbacks +[drawbacks]: #drawbacks + +### Additional complexity + +New and existing users are likely to find it difficult to differentiate the purpose of each vector type, especially people that don't have a theoretical background in memory management. + +### The current ecosystem is fine + +`ArrayVec` might be an overkill in certain situations. If someone wants to use stack memory in a specific application, then it is just a matter of grabbing the appropriated crate. + +# Prior art +[prior-art]: #prior-art + +These are the most known structures: + + * `arrayvec::ArrayVec`: Uses declarative macros and an `Array` trait for implementations but lacks support for arbitrary sizes. + * `heapless::Vec`: With the usage of `typenum`, can support arbitrary sizes without a nightly compiler. + * `staticvec::StaticVec`: Uses unstable constant generics for arrays of arbitrary sizes. + * `tinyvec::ArrayVec`: Supports fixed and arbitrary (unstable feature) sizes but requires `T: Default` for security reasons. + +As seen, there isn't an implementation that stands out among the others because all of them roughly share the same purpose and functionality. Noteworthy is the usage of constant generics that makes it possible to create an efficient and unified approach for arbitrary array sizes. + +# Unresolved questions +[unresolved-questions]: #unresolved-questions + +### Nomenclature + +`ArrayVec` will conflict with `arrayvec::ArrayVec` and `tinyvec::ArrayVec`. + +### Prelude + +Should it be included in the prelude? + +### Macros + +```rust +// Instance with 1i32, 2i32 and 3i32 +let _: ArrayVec = array_vec![1, 2, 3]; + +// Instance with 1i32 and 1i32 +let _: ArrayVec = array_vec![1; 2]; +``` + +# Future possibilities +[future-possibilities]: #future-possibilities + +### Dynamic array + +An hydric approach between heap and stack memory could also be provided natively in the future. 
+ +```rust +pub struct DynVec { + // Hides internal implementation + data: DynVecData, +} + +impl DynVec { + // Much of the `Vec` API goes here +} + +// This is just an example. `Vec` could be `Box` and `enum` an `union`. +enum DynVecData { + Heap(Vec), + Inline(ArrayVec), +} +``` + +The above description is very similar to what `smallvec` already does. + +### Generic collections and generic strings + +Many structures that use `alloc::vec::Vec` as the underlying storage can also use stack or hybrid memory, for example, an hypothetical `GenericString`, where `S` is the storage, could be split into: + +```rust +type DynString = GenericString>; +type HeapString = GenericString>; +type StackString = GenericString>; +``` From 126d24b7725148dc0273e6f2e03dc22a7c3f58c9 Mon Sep 17 00:00:00 2001 From: Caio Date: Sat, 16 Jan 2021 14:47:34 -0300 Subject: [PATCH 2/8] Address some comments There are still things to review. Just need some time to address everything --- text/2978-stack_based_vec.md | 52 ++++++++++++++++++++++-------------- 1 file changed, 32 insertions(+), 20 deletions(-) diff --git a/text/2978-stack_based_vec.md b/text/2978-stack_based_vec.md index 2558bc871bc..6462086be8d 100644 --- a/text/2978-stack_based_vec.md +++ b/text/2978-stack_based_vec.md @@ -11,24 +11,27 @@ This RFC, which depends and takes advantage of the upcoming stabilization of con # Motivation [motivation]: #motivation -`core::collections::ArrayVec` has several use-cases and should be conveniently added into the standard library due to its importance. - -### Unification - -There are a lot of different crates about the subject that tries to do roughly the same thing, a centralized implementation would stop the current fragmentation. +`core::collections::ArrayVec` should be conveniently added into the standard library due to its importance and potential. 
### Optimization Stack-based allocation is generally faster than heap-based allocation and can be used as an optimization in places that otherwise would have to call an allocator. Some resource-constrained embedded devices can also benefit from it. -### Building block +### Unstable features and constant functions -Just like `Vec`, `ArrayVec` is also a primitive vector where high-level structures can use it as a building block. For example, a stack-based matrix or binary heap. +By adding `ArrayVec` into the standard library, it will be possible to use internal unstable features to optimize machine code generation and expose public constant functions without the need of a nightly compiler. ### Useful in the real world -`arrayvec` is one of the most downloaded project of `crates.io` and is used by thousand of projects, including Rustc itself. +`arrayvec` is one of the most downloaded project of `crates.io` and is used by thousand of projects, including Rustc itself. Currently ranks ninth in the "Data structures" category and seventy-fifth in the "All Crate" category. +### Building block + +Just like `Vec`, `ArrayVec` is also a primitive vector where high-level structures can use it as a building block. For example, a stack-based matrix or binary heap. + +### Unification + +There are a lot of different crates about the subject that tries to do roughly the same thing, a centralized implementation would stop the current fragmentation. # Guide-level explanation [guide-level-explanation]: #guide-level-explanation @@ -57,13 +60,26 @@ for element in &v { assert_eq!(v, [7, 1, 2, 3]); ``` -Instead of relying on a heap-allocator, stack-based memory area is added and removed on-demand in a last-in-first-out (LIFO) order according to the calling workflow of a program. `ArrayVec` takes advantage of this predictable behavior to reserve an exactly amount of uninitialized bytes up-front and these bytes form a buffer where elements can be included dynamically. 
+Instead of relying on a heap-allocator, stack-based memory is added and removed on-demand in a last-in-first-out (LIFO) order according to the calling workflow of a program. `ArrayVec` takes advantage of this predictable behavior to reserve an exactly amount of uninitialized bytes up-front and these bytes form a buffer where elements can be included dynamically. ```rust // `array_vec` can store up to 64 elements let mut array_vec: ArrayVec = ArrayVec::new(); ``` +Another potential use-case is the usage within constant environments: + +```rust +const MY_CONST_ARRAY_VEC: ArrayVec = { + let mut v = ArrayVec::new(); + let _ = v.try_push(1); + let _ = v.try_push(2); + let _ = v.try_push(3); + let _ = v.try_push(4); + v +}; +``` + Of course, fixed buffers lead to inflexibility because unlike `Vec`, the underlying capacity can not expand at run-time and there will never be more than 64 elements in the above example. ```rust @@ -74,13 +90,13 @@ let push_result = array_vec.push(1); assert!(push_result.is_err()); ``` -A good question is: Should I use `core::collections::ArrayVec` or `alloc::collections::Vec`? Well, `Vec` is already good enough for most situations while stack allocation usually shines for small sizes. +A good question is: Should I use `core::collections::ArrayVec` or `alloc::vec::Vec`? Well, `Vec` is already good enough for most situations while stack allocation usually shines for small sizes. * Do you have a known upper bound? * How much memory are you going to allocate for your program? The default values of `RUST_MIN_STACK` or `ulimit -s` might not be enough. -* Are you using nested `Vec`s? `Vec>` might be better than `Vec>`. +* Are you using nested `Vec`s? `Vec>` might be better than `Vec>` because the heap-allocator is only called once instead of the `N` nested calls. Each use-case is different and should be pondered individually. In case of doubt, stick with `Vec`. 
@@ -88,13 +104,13 @@ For a more technical overview, take a look at the following operations: ```rust // `array_vec` has a pre-allocated memory of 2048 bits (32 * 64) that can store up -// to 64 decimals. +// to 64 signed integers. let mut array_vec: ArrayVec = ArrayVec::new(); // Although reserved, there isn't anything explicitly stored yet assert_eq!(array_vec.len(), 0); -// Initializes the first 32 bits with a simple '1' decimal or +// Initializes the first 32 bits with a simple '1' integer or // 00000000 00000000 00000000 00000001 bits array_vec.push(1); @@ -108,9 +124,9 @@ assert_eq!(array_vec.len(), 1); `ArrayVec` is a contiguous memory block where elements can be collected, therefore, a collection by definition and even though `core::collections` doesn't exist, it is the most natural module placement. -The API basically mimics most of the current `Vec` surface with some tweaks to manage capacity. +The API mimics most of the current `Vec` surface with some additional methods to manage capacity. -Notably, these tweaked methods are checked (out-of-bound inputs or invalid capacity) versions of some well-known functions like `push` that will return `Result` instead of panicking at run-time. Since the upper capacity bound is known at compile-time and the majority of methods are `#[inline]`, the compiler is likely going to remove most of the conditional bounding checking. +Notably, these additional methods are verifiable (out-of-bound inputs or invalid capacity) versions of some well-known functions like `push` that will return `Result` instead of panicking at run-time. Since the upper capacity bound is known at compile-time and the majority of methods are `#[inline]`, the compiler is likely going to remove most of the conditional bounding checking. ```rust // Please, bare in mind that these methods are simply suggestions. 
Discussions about the @@ -124,10 +140,6 @@ pub struct ArrayVec { impl ArrayVec { // Constructors - pub const fn from_array(array: [T; N]) -> Self; - - pub const fn from_array_and_len(array: [T; N], len: usize) -> Self; - pub const fn new() -> Self; // Methods @@ -249,7 +261,7 @@ let _: ArrayVec = array_vec![1; 2]; ### Dynamic array -An hydric approach between heap and stack memory could also be provided natively in the future. +An hybrid approach between heap and stack memory could also be provided natively in the future. ```rust pub struct DynVec { From 7ebb18920174800b68e5b4a6d9961272e752b82e Mon Sep 17 00:00:00 2001 From: Caio Date: Tue, 19 Jan 2021 20:07:20 -0300 Subject: [PATCH 3/8] Address more comments --- text/2978-stack_based_vec.md | 161 +++++++++++++++++++++++++---------- 1 file changed, 116 insertions(+), 45 deletions(-) diff --git a/text/2978-stack_based_vec.md b/text/2978-stack_based_vec.md index 6462086be8d..fdd7ff56ce7 100644 --- a/text/2978-stack_based_vec.md +++ b/text/2978-stack_based_vec.md @@ -6,7 +6,7 @@ # Summary [summary]: #summary -This RFC, which depends and takes advantage of the upcoming stabilization of constant generics (min_const_generics), tries to propose the creation of a new "growable" vector named `ArrayVec` that manages stack memory and can be seen as an alternative for the built-in structure that handles heap-allocated memory, aka `alloc::vec::Vec`. +This RFC, which depends and takes advantage of the upcoming stabilization of constant generics (min_const_generics), tries to propose the creation of a new vector named `ArrayVec` that manages stack memory and can be seen as an alternative for the built-in structure that handles heap-allocated memory, aka `alloc::vec::Vec`. 
# Motivation [motivation]: #motivation @@ -40,8 +40,8 @@ There are a lot of different crates about the subject that tries to do roughly t ```rust let mut v: ArrayVec = ArrayVec::new(); -let _ = v.push(1); -let _ = v.push(2); +v.push(1); +v.push(2); assert_eq!(v.len(), 2); assert_eq!(v[0], 1); @@ -60,7 +60,7 @@ for element in &v { assert_eq!(v, [7, 1, 2, 3]); ``` -Instead of relying on a heap-allocator, stack-based memory is added and removed on-demand in a last-in-first-out (LIFO) order according to the calling workflow of a program. `ArrayVec` takes advantage of this predictable behavior to reserve an exactly amount of uninitialized bytes up-front and these bytes form a buffer where elements can be included dynamically. +Instead of relying on a heap-allocator, stack-based memory is added and removed on-demand in a last-in-first-out (LIFO) order according to the calling workflow of a program. `ArrayVec` takes advantage of this predictable behavior to reserve an exactly amount of uninitialized bytes up-front to form an internal buffer. ```rust // `array_vec` can store up to 64 elements @@ -72,10 +72,10 @@ Another potential use-case is the usage within constant environments: ```rust const MY_CONST_ARRAY_VEC: ArrayVec = { let mut v = ArrayVec::new(); - let _ = v.try_push(1); - let _ = v.try_push(2); - let _ = v.try_push(3); - let _ = v.try_push(4); + v.push(1); + v.push(2); + v.push(3); + v.push(4); v }; ``` @@ -85,9 +85,7 @@ Of course, fixed buffers lead to inflexibility because unlike `Vec`, the underly ```rust // This vector can store up to 0 elements, therefore, nothing at all let mut array_vec: ArrayVec = ArrayVec::new(); -let push_result = array_vec.push(1); -// Ooppss... Our push operation wasn't successful -assert!(push_result.is_err()); +array_vec.push(1); // Error! ``` A good question is: Should I use `core::collections::ArrayVec` or `alloc::vec::Vec`? 
Well, `Vec` is already good enough for most situations while stack allocation usually shines for small sizes. @@ -96,7 +94,50 @@ A good question is: Should I use `core::collections::ArrayVec` or `alloc::vec * How much memory are you going to allocate for your program? The default values of `RUST_MIN_STACK` or `ulimit -s` might not be enough. -* Are you using nested `Vec`s? `Vec>` might be better than `Vec>` because the heap-allocator is only called once instead of the `N` nested calls. +* Are you using nested `Vec`s? `Vec>` might be better than `Vec>`. + +``` +let _: Vec> = vec![vec![1, 2, 3], vec![4, 5]]; + + +-----+-----+-----+ + | ptr | len | cap | + +--|--+-----+-----+ + | + | +---------------------+---------------------+----------+ + | | Vec | Vec | | + | | +-----+-----+-----+ | +-----+-----+-----+ | Unused | + '-> | | ptr | len | cap | | | ptr | len | cap | | capacity | + | +--|--+-----+-----+ | +--|--+-----+-----+ | | + +----|----------------+----|----------------+----------+ + | | + | | +---+---+--------+ + | '-> | 4 | 5 | Unused | + | +---+---+--------+ + | +---+---+---+--------+ + '-> | 1 | 2 | 3 | Unused | + +---+---+---+--------+ + +Illustration credits: @mbartlett21 +``` + +Can you see the `N`, where `N` is length of the external `Vec`, calls to the heap allocator? In the following illustration, the internal `ArrayVec`s are placed contiguously in the same space. 
+ +```txt +let _: Vec> = vec![array_vec![1, 2, 3], array_vec![4, 5]]; + + +-----+-----+-----+ + | ptr | len | cap | + +--|--+-----+-----+ + | + | +------------------------------+--------------------------+----------+ + | | ArrayVec | Arrayvec | | + | | +-----+---+---+---+--------+ | +-----+---+---+--------+ | Unused | + '-> | | len | 1 | 2 | 3 | Unused | | | len | 4 | 5 | Unused | | capacity | + | +-----+---+---+---+--------+ | +-----+---+---+--------+ | | + +------------------------------+--------------------------+----------+ + +Illustration credits: @mbartlett21 +``` Each use-case is different and should be pondered individually. In case of doubt, stick with `Vec`. @@ -124,9 +165,7 @@ assert_eq!(array_vec.len(), 1); `ArrayVec` is a contiguous memory block where elements can be collected, therefore, a collection by definition and even though `core::collections` doesn't exist, it is the most natural module placement. -The API mimics most of the current `Vec` surface with some additional methods to manage capacity. - -Notably, these additional methods are verifiable (out-of-bound inputs or invalid capacity) versions of some well-known functions like `push` that will return `Result` instead of panicking at run-time. Since the upper capacity bound is known at compile-time and the majority of methods are `#[inline]`, the compiler is likely going to remove most of the conditional bounding checking. +To avoid length and conflicting conversations, the API will mimic most of the current `Vec` surface, which also means that all methods that depend on valid user input or valid internal capacity will panic at run-time when something goes wrong. For example, removing an element that is out of bounds. ```rust // Please, bare in mind that these methods are simply suggestions. 
Discussions about the @@ -142,7 +181,7 @@ impl ArrayVec { pub const fn new() -> Self; - // Methods + // Infallible Methods pub const fn as_mut_ptr(&mut self) -> *mut T; @@ -156,57 +195,48 @@ impl ArrayVec { pub fn clear(&mut self); - pub fn dedup(&mut self) - where - T: PartialEq; + pub const fn is_empty(&self) -> bool; - pub fn dedup_by(&mut self, same_bucket: F) - where - F: FnMut(&mut T, &mut T) -> bool; + pub const fn len(&self) -> usize; - pub fn dedup_by_key(&mut self, mut key: F) + pub fn retain(&mut self, mut f: F) where - F: FnMut(&mut T) -> K, - K: PartialEq; + F: FnMut(&mut T) -> bool; + + pub fn truncate(&mut self, len: usize); + + // Methods that can panic at run-time - pub fn drain(&mut self, range: R) -> Option> + pub fn drain(&mut self, range: R) -> Drain<'_, T, N> where R: RangeBounds; - pub fn extend_from_cloneable_slice<'a>(&mut self, other: &'a [T]) -> Result<(), &'a [T]> + pub fn extend_from_cloneable_slice<'a>(&mut self, other: &'a [T]) where T: Clone; - pub fn extend_from_copyable_slice<'a>(&mut self, other: &'a [T]) -> Result<(), &'a [T]> + pub fn extend_from_copyable_slice<'a>(&mut self, other: &'a [T]) where T: Copy; - pub fn insert(&mut self, idx: usize, element: T) -> Result<(), T>; - - pub const fn is_empty(&self) -> bool; - - pub const fn len(&self) -> usize; - - pub fn pop(&mut self) -> Option; + pub const fn insert(&mut self, idx: usize, element: T); - pub fn push(&mut self, element: T) -> Result<(), T>; + pub const fn push(&mut self, element: T); - pub fn remove(&mut self, idx: usize) -> Option; + pub const fn remove(&mut self, idx: usize) -> T; - pub fn retain(&mut self, mut f: F) - where - F: FnMut(&mut T) -> bool; - - pub fn splice(&mut self, range: R, replace_with: I) -> Option> + pub fn splice(&mut self, range: R, replace_with: I) -> Splice<'_, I::IntoIter, N> where I: IntoIterator, R: RangeBounds; - pub fn split_off(&mut self, at: usize) -> Option; + pub fn split_off(&mut self, at: usize) -> Self; - pub fn swap_remove(&mut 
self, idx: usize) -> Option; + pub fn swap_remove(&mut self, idx: usize) -> T; - pub fn truncate(&mut self, len: usize); + // Verifiable methods + + pub const fn pop(&mut self) -> Option; } ``` @@ -238,6 +268,47 @@ As seen, there isn't an implementation that stands out among the others because # Unresolved questions [unresolved-questions]: #unresolved-questions +### Verifiable methods + +Unlike methods that will abort the current thread execution, verifiable methods will signal that something has gone wrong or is missing. This approach has two major benefits: + +- `Security`: The user is forced to handle possible variants or corner-cases and enables graceful program shutdown by wrapping everything until `fn main() -> Result<(), MyCustomErrors>` is reached. + +- `Flexibility`: Gives freedom to users because it is possible to choose between, for example, `my_full_array_vec.push(100)?` (check), `my_full_array_vec.push(100).unwrap()` (panic) or `let _ = my_full_array_vec.push(100);` (ignore). + +In regards to performance, since the upper capacity bound is known at compile-time and the majority of methods are `#[inline]`, the compiler will probably have the necessary information to remove most of the conditional bounding checking when producing optimized machine code. 
+ +```rust +pub fn drain(&mut self, range: R) -> Option> +where + R: RangeBounds; + +pub fn extend_from_cloneable_slice<'a>(&mut self, other: &'a [T]) -> Result<(), &'a [T]> +where + T: Clone; + +pub fn extend_from_copyable_slice<'a>(&mut self, other: &'a [T]) -> Result<(), &'a [T]> +where + T: Copy; + +pub const fn insert(&mut self, idx: usize, element: T) -> Result<(), T>; + +pub const fn push(&mut self, element: T) -> Result<(), T>; + +pub const fn remove(&mut self, idx: usize) -> Option; + +pub fn splice(&mut self, range: R, replace_with: I) -> Option> +where + I: IntoIterator, + R: RangeBounds; + +pub fn split_off(&mut self, at: usize) -> Option; + +pub fn swap_remove(&mut self, idx: usize) -> Option; +``` + +In my opinion, every fallible method should either return `Option` or `Result` instead of panicking at run-time. Although the future addition of `try_*` variants can mitigate this situation, it will also bring additional maintenance burden. + ### Nomenclature `ArrayVec` will conflict with `arrayvec::ArrayVec` and `tinyvec::ArrayVec`. 
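The three call-site styles listed under "Verifiable methods" in PATCH 3/8 (check, panic, ignore) can be sketched with a small stand-in type. This is a minimal illustration under stated assumptions, not the RFC's implementation: the names `SketchVec` and `push_all` are hypothetical, the buffer is stored as `[MaybeUninit<T>; N]` rather than the `MaybeUninit<[T; N]>` field shown in the RFC's struct, and only `new`, `push`, `pop`, `len` and `capacity` are modeled.

```rust
use core::mem::MaybeUninit;

/// Hypothetical stand-in for the proposed `ArrayVec` (assumption: not the RFC's code).
pub struct SketchVec<T, const N: usize> {
    data: [MaybeUninit<T>; N],
    len: usize,
}

impl<T, const N: usize> SketchVec<T, N> {
    pub fn new() -> Self {
        // Every slot starts uninitialized; `len` tracks how many are live.
        Self {
            data: core::array::from_fn(|_| MaybeUninit::uninit()),
            len: 0,
        }
    }

    pub fn capacity(&self) -> usize {
        N
    }

    pub fn len(&self) -> usize {
        self.len
    }

    /// "Verifiable" push: hands the element back instead of panicking when full.
    pub fn push(&mut self, element: T) -> Result<(), T> {
        if self.len == N {
            return Err(element);
        }
        self.data[self.len].write(element);
        self.len += 1;
        Ok(())
    }

    pub fn pop(&mut self) -> Option<T> {
        if self.len == 0 {
            return None;
        }
        self.len -= 1;
        // SAFETY: slot `self.len` was written by `push` and is no longer
        // counted as live, so it is read out exactly once.
        Some(unsafe { self.data[self.len].assume_init_read() })
    }
}

impl<T, const N: usize> Drop for SketchVec<T, N> {
    fn drop(&mut self) {
        // Drop the remaining live elements one by one.
        while self.pop().is_some() {}
    }
}

/// Check style: propagate a full-buffer error to the caller with `?`.
/// On failure, the rejected item is returned as the error value.
pub fn push_all<const N: usize>(v: &mut SketchVec<i32, N>, items: &[i32]) -> Result<(), i32> {
    for &item in items {
        v.push(item)?;
    }
    Ok(())
}
```

With a `Result`-returning `push` of this shape, `v.push(x)?` checks, `v.push(x).unwrap()` panics, and `let _ = v.push(x);` ignores. A `push` that panics internally, as in the reference-level API of the same patch, leaves callers only the second behavior.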
From c6706710a203ed5d166a34c5b8a354c08b455959 Mon Sep 17 00:00:00 2001 From: Trevor Gross Date: Tue, 13 Sep 2022 22:40:46 -0400 Subject: [PATCH 4/8] Reset some PR-tied tags to 0000 --- ...8-stack_based_vec.md => 0000-array_vec.md} | 128 +++++++++++++----- 1 file changed, 95 insertions(+), 33 deletions(-) rename text/{2978-stack_based_vec.md => 0000-array_vec.md} (69%) diff --git a/text/2978-stack_based_vec.md b/text/0000-array_vec.md similarity index 69% rename from text/2978-stack_based_vec.md rename to text/0000-array_vec.md index fdd7ff56ce7..28f0cfca571 100644 --- a/text/2978-stack_based_vec.md +++ b/text/0000-array_vec.md @@ -1,37 +1,53 @@ -- Feature Name: `stack_based_vec` +- Feature Name: `array_vec` - Start Date: 2020-09-27 -- RFC PR: [rust-lang/rfcs#2990](https://github.com/rust-lang/rfcs/pull/2990) +- RFC PR: [rust-lang/rfcs#0000](https://github.com/rust-lang/rfcs/pull/0000) - Rust Issue: [rust-lang/rust#0000](https://github.com/rust-lang/rust/issues/0000) +- Original PR: [rust-lang/rfcs#2990](https://github.com/rust-lang/rfcs/pull/2990) # Summary [summary]: #summary -This RFC, which depends and takes advantage of the upcoming stabilization of constant generics (min_const_generics), tries to propose the creation of a new vector named `ArrayVec` that manages stack memory and can be seen as an alternative for the built-in structure that handles heap-allocated memory, aka `alloc::vec::Vec`. +This RFC, which depends and takes advantage of the upcoming stabilization of +constant generics (min_const_generics), tries to propose the creation of a new +vector named `ArrayVec` that manages stack memory and can be seen as an +alternative for the built-in structure that handles heap-allocated memory, aka +`alloc::vec::Vec`. # Motivation [motivation]: #motivation -`core::collections::ArrayVec` should be conveniently added into the standard library due to its importance and potential. 
+`core::collections::ArrayVec` should be conveniently added into the standard +library due to its importance and potential. ### Optimization -Stack-based allocation is generally faster than heap-based allocation and can be used as an optimization in places that otherwise would have to call an allocator. Some resource-constrained embedded devices can also benefit from it. +Stack-based allocation is generally faster than heap-based allocation and can be +used as an optimization in places that otherwise would have to call an +allocator. Some resource-constrained embedded devices can also benefit from it. ### Unstable features and constant functions -By adding `ArrayVec` into the standard library, it will be possible to use internal unstable features to optimize machine code generation and expose public constant functions without the need of a nightly compiler. +By adding `ArrayVec` into the standard library, it will be possible to use +internal unstable features to optimize machine code generation and expose public +constant functions without the need of a nightly compiler. ### Useful in the real world -`arrayvec` is one of the most downloaded project of `crates.io` and is used by thousand of projects, including Rustc itself. Currently ranks ninth in the "Data structures" category and seventy-fifth in the "All Crate" category. +`arrayvec` is one of the most downloaded project of `crates.io` and is used by +thousand of projects, including Rustc itself. Currently ranks ninth in the "Data +structures" category and seventy-fifth in the "All Crate" category. ### Building block -Just like `Vec`, `ArrayVec` is also a primitive vector where high-level structures can use it as a building block. For example, a stack-based matrix or binary heap. +Just like `Vec`, `ArrayVec` is also a primitive vector where high-level +structures can use it as a building block. For example, a stack-based matrix or +binary heap. 
### Unification -There are a lot of different crates about the subject that tries to do roughly the same thing, a centralized implementation would stop the current fragmentation. +There are a lot of different crates about the subject that tries to do roughly +the same thing, a centralized implementation would stop the current +fragmentation. # Guide-level explanation [guide-level-explanation]: #guide-level-explanation @@ -60,7 +76,10 @@ for element in &v { assert_eq!(v, [7, 1, 2, 3]); ``` -Instead of relying on a heap-allocator, stack-based memory is added and removed on-demand in a last-in-first-out (LIFO) order according to the calling workflow of a program. `ArrayVec` takes advantage of this predictable behavior to reserve an exactly amount of uninitialized bytes up-front to form an internal buffer. +Instead of relying on a heap-allocator, stack-based memory is added and removed +on-demand in a last-in-first-out (LIFO) order according to the calling workflow +of a program. `ArrayVec` takes advantage of this predictable behavior to reserve +an exactly amount of uninitialized bytes up-front to form an internal buffer. ```rust // `array_vec` can store up to 64 elements @@ -80,7 +99,9 @@ const MY_CONST_ARRAY_VEC: ArrayVec = { }; ``` -Of course, fixed buffers lead to inflexibility because unlike `Vec`, the underlying capacity can not expand at run-time and there will never be more than 64 elements in the above example. +Of course, fixed buffers lead to inflexibility because unlike `Vec`, the +underlying capacity can not expand at run-time and there will never be more than +64 elements in the above example. ```rust // This vector can store up to 0 elements, therefore, nothing at all @@ -88,13 +109,17 @@ let mut array_vec: ArrayVec = ArrayVec::new(); array_vec.push(1); // Error! ``` -A good question is: Should I use `core::collections::ArrayVec` or `alloc::vec::Vec`? 
Well, `Vec` is already good enough for most situations while stack allocation usually shines for small sizes. +A good question is: Should I use `core::collections::ArrayVec` or +`alloc::vec::Vec`? Well, `Vec` is already good enough for most situations +while stack allocation usually shines for small sizes. * Do you have a known upper bound? -* How much memory are you going to allocate for your program? The default values of `RUST_MIN_STACK` or `ulimit -s` might not be enough. +* How much memory are you going to allocate for your program? The default values + of `RUST_MIN_STACK` or `ulimit -s` might not be enough. -* Are you using nested `Vec`s? `Vec>` might be better than `Vec>`. +* Are you using nested `Vec`s? `Vec>` might be better than + `Vec>`. ``` let _: Vec> = vec![vec![1, 2, 3], vec![4, 5]]; @@ -120,7 +145,9 @@ let _: Vec> = vec![vec![1, 2, 3], vec![4, 5]]; Illustration credits: @mbartlett21 ``` -Can you see the `N`, where `N` is length of the external `Vec`, calls to the heap allocator? In the following illustration, the internal `ArrayVec`s are placed contiguously in the same space. +Can you see the `N`, where `N` is length of the external `Vec`, calls to the +heap allocator? In the following illustration, the internal `ArrayVec`s are +placed contiguously in the same space. ```txt let _: Vec> = vec![array_vec![1, 2, 3], array_vec![4, 5]]; @@ -139,7 +166,8 @@ let _: Vec> = vec![array_vec![1, 2, 3], array_vec![4, 5]]; Illustration credits: @mbartlett21 ``` -Each use-case is different and should be pondered individually. In case of doubt, stick with `Vec`. +Each use-case is different and should be pondered individually. In case of +doubt, stick with `Vec`. 
For a more technical overview, take a look at the following operations: @@ -163,9 +191,14 @@ assert_eq!(array_vec.len(), 1); # Reference-level explanation [reference-level-explanation]: #reference-level-explanation -`ArrayVec` is a contiguous memory block where elements can be collected, therefore, a collection by definition and even though `core::collections` doesn't exist, it is the most natural module placement. +`ArrayVec` is a contiguous memory block where elements can be collected, +therefore, a collection by definition and even though `core::collections` +doesn't exist, it is the most natural module placement. -To avoid length and conflicting conversations, the API will mimic most of the current `Vec` surface, which also means that all methods that depend on valid user input or valid internal capacity will panic at run-time when something goes wrong. For example, removing an element that is out of bounds. +To avoid length and conflicting conversations, the API will mimic most of the +current `Vec` surface, which also means that all methods that depend on valid +user input or valid internal capacity will panic at run-time when something goes +wrong. For example, removing an element that is out of bounds. ```rust // Please, bare in mind that these methods are simply suggestions. Discussions about the @@ -240,43 +273,66 @@ impl ArrayVec { } ``` -Meaningless, unstable and deprecated methods like `reserve` or `drain_filter` weren't considered. A concrete implementation is available at https://github.com/c410-f3r/stack-based-vec. +Meaningless, unstable and deprecated methods like `reserve` or `drain_filter` +weren't considered. A concrete implementation is available at +https://github.com/c410-f3r/stack-based-vec. # Drawbacks [drawbacks]: #drawbacks ### Additional complexity -New and existing users are likely to find it difficult to differentiate the purpose of each vector type, especially people that don't have a theoretical background in memory management. 
+New and existing users are likely to find it difficult to differentiate the +purpose of each vector type, especially people that don't have a theoretical +background in memory management. ### The current ecosystem is fine -`ArrayVec` might be an overkill in certain situations. If someone wants to use stack memory in a specific application, then it is just a matter of grabbing the appropriated crate. +`ArrayVec` might be an overkill in certain situations. If someone wants to use +stack memory in a specific application, then it is just a matter of grabbing the +appropriated crate. # Prior art [prior-art]: #prior-art These are the most known structures: - * `arrayvec::ArrayVec`: Uses declarative macros and an `Array` trait for implementations but lacks support for arbitrary sizes. - * `heapless::Vec`: With the usage of `typenum`, can support arbitrary sizes without a nightly compiler. - * `staticvec::StaticVec`: Uses unstable constant generics for arrays of arbitrary sizes. - * `tinyvec::ArrayVec`: Supports fixed and arbitrary (unstable feature) sizes but requires `T: Default` for security reasons. + * `arrayvec::ArrayVec`: Uses declarative macros and an `Array` trait for + implementations but lacks support for arbitrary sizes. + * `heapless::Vec`: With the usage of `typenum`, can support arbitrary sizes + without a nightly compiler. + * `staticvec::StaticVec`: Uses unstable constant generics for arrays of + arbitrary sizes. + * `tinyvec::ArrayVec`: Supports fixed and arbitrary (unstable feature) sizes + but requires `T: Default` for security reasons. -As seen, there isn't an implementation that stands out among the others because all of them roughly share the same purpose and functionality. Noteworthy is the usage of constant generics that makes it possible to create an efficient and unified approach for arbitrary array sizes. 
+As seen, there isn't an implementation that stands out among the others because +all of them roughly share the same purpose and functionality. Noteworthy is the +usage of constant generics that makes it possible to create an efficient and +unified approach for arbitrary array sizes. # Unresolved questions [unresolved-questions]: #unresolved-questions ### Verifiable methods -Unlike methods that will abort the current thread execution, verifiable methods will signal that something has gone wrong or is missing. This approach has two major benefits: +Unlike methods that will abort the current thread execution, verifiable methods +will signal that something has gone wrong or is missing. This approach has two +major benefits: -- `Security`: The user is forced to handle possible variants or corner-cases and enables graceful program shutdown by wrapping everything until `fn main() -> Result<(), MyCustomErrors>` is reached. +- `Security`: The user is forced to handle possible variants or corner-cases and + enables graceful program shutdown by wrapping everything until `fn main() -> + Result<(), MyCustomErrors>` is reached. -- `Flexibility`: Gives freedom to users because it is possible to choose between, for example, `my_full_array_vec.push(100)?` (check), `my_full_array_vec.push(100).unwrap()` (panic) or `let _ = my_full_array_vec.push(100);` (ignore). +- `Flexibility`: Gives freedom to users because it is possible to choose + between, for example, `my_full_array_vec.push(100)?` (check), + `my_full_array_vec.push(100).unwrap()` (panic) or `let _ = + my_full_array_vec.push(100);` (ignore). -In regards to performance, since the upper capacity bound is known at compile-time and the majority of methods are `#[inline]`, the compiler will probably have the necessary information to remove most of the conditional bounding checking when producing optimized machine code. 
+In regards to performance, since the upper capacity bound is known at +compile-time and the majority of methods are `#[inline]`, the compiler will +probably have the necessary information to remove most of the conditional +bounding checking when producing optimized machine code. ```rust pub fn drain(&mut self, range: R) -> Option> @@ -307,7 +363,10 @@ pub fn split_off(&mut self, at: usize) -> Option; pub fn swap_remove(&mut self, idx: usize) -> Option; ``` -In my opinion, every fallible method should either return `Option` or `Result` instead of panicking at run-time. Although the future addition of `try_*` variants can mitigate this situation, it will also bring additional maintenance burden. +In my opinion, every fallible method should either return `Option` or `Result` +instead of panicking at run-time. Although the future addition of `try_*` +variants can mitigate this situation, it will also bring additional maintenance +burden. ### Nomenclature @@ -332,7 +391,8 @@ let _: ArrayVec = array_vec![1; 2]; ### Dynamic array -An hybrid approach between heap and stack memory could also be provided natively in the future. +An hybrid approach between heap and stack memory could also be provided natively +in the future. ```rust pub struct DynVec { @@ -355,7 +415,9 @@ The above description is very similar to what `smallvec` already does. 
### Generic collections and generic strings -Many structures that use `alloc::vec::Vec` as the underlying storage can also use stack or hybrid memory, for example, an hypothetical `GenericString`, where `S` is the storage, could be split into: +Many structures that use `alloc::vec::Vec` as the underlying storage can also +use stack or hybrid memory, for example, an hypothetical `GenericString`, +where `S` is the storage, could be split into: ```rust type DynString = GenericString>; From c75c586896f192c32a6c3e2784965361a504c95e Mon Sep 17 00:00:00 2001 From: Trevor Gross Date: Wed, 14 Sep 2022 01:37:39 -0400 Subject: [PATCH 5/8] Rewrote large sections of the text, requires further work --- text/0000-array_vec.md | 421 ++++++++++++++++------------------------- 1 file changed, 167 insertions(+), 254 deletions(-) diff --git a/text/0000-array_vec.md b/text/0000-array_vec.md index 28f0cfca571..1d2e9afbfca 100644 --- a/text/0000-array_vec.md +++ b/text/0000-array_vec.md @@ -4,70 +4,114 @@ - Rust Issue: [rust-lang/rust#0000](https://github.com/rust-lang/rust/issues/0000) - Original PR: [rust-lang/rfcs#2990](https://github.com/rust-lang/rfcs/pull/2990) +**RFC TODO** _Would a name like `BufVec`/`BufferVec` be better? This is sort of +generic across both `array`s and buffers for things like MMIO that may benefit +from the structure._ + # Summary [summary]: #summary -This RFC, which depends and takes advantage of the upcoming stabilization of -constant generics (min_const_generics), tries to propose the creation of a new -vector named `ArrayVec` that manages stack memory and can be seen as an -alternative for the built-in structure that handles heap-allocated memory, aka -`alloc::vec::Vec`. +This RFC proposes the creation of an object to represent variable-length data +within a fixed size memory buffer, with associated methods to easily manipulate +it. 
The interface will mimic the most common methods of `Vec`, and this memory +buffer is an array; hence, the selected name of `ArrayVec`. This will provide +Rust with a representation of a very prevelant programming concept to enable +higher-level data manipulation without heap reliance. + # Motivation [motivation]: #motivation -`core::collections::ArrayVec` should be conveniently added into the standard -library due to its importance and potential. - -### Optimization - -Stack-based allocation is generally faster than heap-based allocation and can be -used as an optimization in places that otherwise would have to call an -allocator. Some resource-constrained embedded devices can also benefit from it. - -### Unstable features and constant functions - -By adding `ArrayVec` into the standard library, it will be possible to use -internal unstable features to optimize machine code generation and expose public -constant functions without the need of a nightly compiler. - -### Useful in the real world - -`arrayvec` is one of the most downloaded project of `crates.io` and is used by -thousand of projects, including Rustc itself. Currently ranks ninth in the "Data -structures" category and seventy-fifth in the "All Crate" category. - -### Building block - -Just like `Vec`, `ArrayVec` is also a primitive vector where high-level -structures can use it as a building block. For example, a stack-based matrix or -binary heap. +Vectors provide one of the easiest ways to work with data that may change its +length, and this is provided in Rust via `std::vec::Vec`. However, this requires +heap allocations, and this may not always be desirable in cases where: + +- An allocator is not available. This is typically `no_std` environments like + embedded, kernel, or safety-critical applications. +- A previous stack frame provides a buffer for data, and heap allocating would + be redundant. (This is very pervasive in C which has no vector representation, + which extends to Rust's FFI. 
Instead of vectors, function signatures like + `void somefunc(buf [BUF_SIZE], int* len)` are used when a function must return + variable-length data.) +- `Vec`-style data structures are required in `const` scopes +- Small or short-lived representations of variable data are preferred for + performance or memory optimization +- The buffer does not represent memory, e.g. memory-mapped I/O **RFC TODO** _is + this even worth mentioning? Could we guarantee anything that would make this + useful in MMIO? Would it be good/better to provide a trait for `push`, `pop`, + etc that would apply for this, some custom MMIO implementation, and `Vec`?_ + +While this sort of datastructure is likely to usually reside on the stack, it is +entirely possible to reside in some form on the heap within a `Box`, `Vec`, or +other structure. + +Possibly the most persuasive argument for why `ArrayVec` belongs in Rust's +`core` is that bits and pieces of the language already use it. Additionally, it +would provide a pathway for easing future development instead of piecewise +re-implementing the concept as needed. Some examples: + +- [`try_collect_into_array`][try_collect_arr] and its variants are used + internally. This function wraps a `Guard` struct containing an array and a + length that it initializes item by item. Essentially, _this is the fundamental + structure of `ArrayVec`_, it is just not made public. Having `ArrayVec` would + allow simplifying this function. +- The much-requested feature of some way to collect into arrays would have a + more clear path +- Constructing a `core::ffi::CStr` is not directly possible from `&str` due to + the extra bit. `ArrayVec` would allow for a more clear way to perform this + common operation in `no_std` environments. 
+- A structure such as `ArrayString` would be possible to enable easier string + manipulation in `no_std` environments + +In short, the benefits to an `ArrayVec` concept are notable enough that there +are already parts of the implementation in core, and there are a handful of top +100 crates that provide similar functionality. Exposing a public `ArrayVec` in +`core` would help reduce fragmentation, provide a pathway for future language +features, and give users a builtin tool for a common form of data manipulation. + + +[try_collect_arr]: https://github.com/rust-lang/rust/blob/17cbdfd07178349d0a3cecb8e7dde8f915666ced/library/core/src/array/mod.rs#L804 -### Unification - -There are a lot of different crates about the subject that tries to do roughly -the same thing, a centralized implementation would stop the current -fragmentation. # Guide-level explanation [guide-level-explanation]: #guide-level-explanation -`ArrayVec` is a container that encapsulates fixed size buffers. +`ArrayVec` is a simple data structure, represented internally with a fixed-size +memory buffer (an array) and a length. It should feel very familiar to `Vec`. +The main difference from `Vec` is that the maximum capacity of that memory buffer +must be known at compile time, and is specified through a generic parameter. See +the comments in this example: ```rust +// RFC TODO: core::collections and core::ext have also been proposed +use core::array::ArrayVec; + +// We are creating an `ArrayVec` here that contains an i32, and has a capacity +// of 4. That capacity cannot be changed during runtime let mut v: ArrayVec<i32, 4> = ArrayVec::new(); -v.push(1); -v.push(2); +// Adding values to this `ArrayVec` works almost as you would expect +// One difference is, `push()` returns a `Result<(), InsertionError>`.
+// This is because there is a higher chance that the insertion may fail at runtime, +// compared to `Vec` +v.push(1).unwrap(); +v.push(2).unwrap(); + +// Length, indexing, and end access work similarly to other data structures assert_eq!(v.len(), 2); assert_eq!(v[0], 1); assert_eq!(v.pop(), Some(2)); assert_eq!(v.len(), 1); +// Indexed assignment works as expected +// **RFC TODO** what is the safe/checked way to perform assignment? It seems +// like there should be a `.set(&T) -> Result` method to go along with `get`, +// but I don't know what it is v[0] = 7; assert_eq!(v[0], 7); +// Many higher-order concepts from `Vec` work as well v.extend([1, 2, 3].iter().copied()); for element in &v { @@ -76,134 +120,68 @@ for element in &v { assert_eq!(v, [7, 1, 2, 3]); ``` -Instead of relying on a heap-allocator, stack-based memory is added and removed -on-demand in a last-in-first-out (LIFO) order according to the calling workflow -of a program. `ArrayVec` takes advantage of this predictable behavior to reserve -an exactly amount of uninitialized bytes up-front to form an internal buffer. +In the above example, the `ArrayVec` is allocated on the stack, which is its +usual home (though one can be present on the heap within another type). There +are advantages and disadvantages to this, but the main thing is that the maximum +capacity of the `ArrayVec` must be known at compile time.
```rust -// `array_vec` can store up to 64 elements -let mut array_vec: ArrayVec = ArrayVec::new(); +// `v` can store up to 64 elements +let mut v: ArrayVec = ArrayVec::new(); ``` -Another potential use-case is the usage within constant environments: +As its size is known at compile time, `ArrayVec` can also be initialized within +const environments: ```rust const MY_CONST_ARRAY_VEC: ArrayVec = { let mut v = ArrayVec::new(); - v.push(1); - v.push(2); - v.push(3); - v.push(4); + v.push(1).unwrap(); + v.push(2).unwrap(); + v.push(3).unwrap(); + v.push(4).unwrap(); v }; ``` -Of course, fixed buffers lead to inflexibility because unlike `Vec`, the -underlying capacity can not expand at run-time and there will never be more than -64 elements in the above example. +The biggest downside to `ArrayVec` is, as mentioned, that its capacity cannot be +changed at runtime. For this reason, `Vec` is generally preferable unless you +know you have a case that requires `ArrayVec`. ```rust -// This vector can store up to 0 elements, therefore, nothing at all -let mut array_vec: ArrayVec = ArrayVec::new(); -array_vec.push(1); // Error! -``` - -A good question is: Should I use `core::collections::ArrayVec` or -`alloc::vec::Vec`? Well, `Vec` is already good enough for most situations -while stack allocation usually shines for small sizes. - -* Do you have a known upper bound? - -* How much memory are you going to allocate for your program? The default values - of `RUST_MIN_STACK` or `ulimit -s` might not be enough. - -* Are you using nested `Vec`s? `Vec>` might be better than - `Vec>`. - +// An example attempting to push more than 2 elements +let mut array_vec: ArrayVec = ArrayVec::new(); +array_vec.push(1).unwrap(); // Ok +array_vec.push(1).unwrap(); // Ok +array_vec.push(1).unwrap(); // Error!
``` -let _: Vec> = vec![vec![1, 2, 3], vec![4, 5]]; - - +-----+-----+-----+ - | ptr | len | cap | - +--|--+-----+-----+ - | - | +---------------------+---------------------+----------+ - | | Vec | Vec | | - | | +-----+-----+-----+ | +-----+-----+-----+ | Unused | - '-> | | ptr | len | cap | | | ptr | len | cap | | capacity | - | +--|--+-----+-----+ | +--|--+-----+-----+ | | - +----|----------------+----|----------------+----------+ - | | - | | +---+---+--------+ - | '-> | 4 | 5 | Unused | - | +---+---+--------+ - | +---+---+---+--------+ - '-> | 1 | 2 | 3 | Unused | - +---+---+---+--------+ - -Illustration credits: @mbartlett21 -``` - -Can you see the `N`, where `N` is length of the external `Vec`, calls to the -heap allocator? In the following illustration, the internal `ArrayVec`s are -placed contiguously in the same space. - -```txt -let _: Vec> = vec![array_vec![1, 2, 3], array_vec![4, 5]]; - - +-----+-----+-----+ - | ptr | len | cap | - +--|--+-----+-----+ - | - | +------------------------------+--------------------------+----------+ - | | ArrayVec | Arrayvec | | - | | +-----+---+---+---+--------+ | +-----+---+---+--------+ | Unused | - '-> | | len | 1 | 2 | 3 | Unused | | | len | 4 | 5 | Unused | | capacity | - | +-----+---+---+---+--------+ | +-----+---+---+--------+ | | - +------------------------------+--------------------------+----------+ - -Illustration credits: @mbartlett21 -``` - -Each use-case is different and should be pondered individually. In case of -doubt, stick with `Vec`. - -For a more technical overview, take a look at the following operations: -```rust -// `array_vec` has a pre-allocated memory of 2048 bits (32 * 64) that can store up -// to 64 signed integers. -let mut array_vec: ArrayVec = ArrayVec::new(); - -// Although reserved, there isn't anything explicitly stored yet -assert_eq!(array_vec.len(), 0); +In the above example, the `push()` fails because the `ArrayVec` is already full. 
-// Initializes the first 32 bits with a simple '1' integer or -// 00000000 00000000 00000000 00000001 bits -array_vec.push(1); - -// Our vector memory is now split into a 32/2016 pair of initialized and -// uninitialized memory respectively -assert_eq!(array_vec.len(), 1); -``` +**RFC TODO** _I will add some more here_. # Reference-level explanation [reference-level-explanation]: #reference-level-explanation -`ArrayVec` is a contiguous memory block where elements can be collected, -therefore, a collection by definition and even though `core::collections` -doesn't exist, it is the most natural module placement. +`ArrayVec` represents a higher-level concept that is essentially a type of +collection that should be available without `std`. For this reason, +`core::collections` was chosen as its home. This does not exist yet, but may be +created with the intent that future collections may arise. + +In general, the API mimics that of `Vec` for simplicity of use. However, it is +expected that there is a relatively high chance of failure when pushing to a +fixed-length `ArrayVec`, compared to the chance of an allocation failure for +pushing to a `Vec`. For that reason, fallible methods return a `Result`. -To avoid length and conflicting conversations, the API will mimic most of the -current `Vec` surface, which also means that all methods that depend on valid -user input or valid internal capacity will panic at run-time when something goes -wrong. For example, removing an element that is out of bounds. +The reason behind this decision (instead of `panic!`ing) is that `ArrayVec` will +likely find common use in `no_std` systems like bare metal and kernel code. In +these environments, panicking is generally unacceptable (it is not undefined +behavior, but there is often no graceful way to recover from it), so it makes +sense to guide the user toward non-panicking methods. (`unwrap` can easily be +used to change this behavior, at user discretion). ```rust -// Please, bare in mind that these methods are simply suggestions.
Discussions about the -// API should probably take place elsewhere. +// The actual internal representation may vary, and should vary pub struct ArrayVec { data: MaybeUninit<[T; N]>, len: usize, @@ -214,8 +192,27 @@ impl ArrayVec { pub const fn new() -> Self; - // Infallible Methods + // Basic methods + pub const fn insert(&mut self, idx: usize, element: T) -> Result<(), InsertionError>; + + pub const fn push(&mut self, element: T) -> Result<(), InsertionError>; + + pub const fn remove(&mut self, idx: usize) -> Result ; + + pub const fn pop(&mut self) -> Option; + + pub const fn get(&mut self, idx: usize) -> Option; + + pub const fn first(&self) -> Option<&T> + + pub const fn first_mut(&self) -> Option<&mut T> + + pub const fn last(&self) -> Option<&T> + + pub const fn last_mut(&self) -> Option<&mut T> + // General methods + // **RFC TODO** verify what makes sense to return a `Result` pub const fn as_mut_ptr(&mut self) -> *mut T; pub const fn as_mut_slice(&mut self) -> &mut [T]; @@ -238,44 +235,37 @@ impl ArrayVec { pub fn truncate(&mut self, len: usize); - // Methods that can panic at run-time - - pub fn drain(&mut self, range: R) -> Drain<'_, T, N> + pub fn drain(&mut self, range: R) -> Result, IndexError> where R: RangeBounds; - pub fn extend_from_cloneable_slice<'a>(&mut self, other: &'a [T]) + pub fn extend_from_cloneable_slice<'a>(&mut self, other: &'a [T]) -> Result<(), &'a [T]> where T: Clone; - pub fn extend_from_copyable_slice<'a>(&mut self, other: &'a [T]) + pub fn extend_from_copyable_slice<'a>(&mut self, other: &'a [T]) -> Result<(), &'a [T]> where T: Copy; - pub const fn insert(&mut self, idx: usize, element: T); - - pub const fn push(&mut self, element: T); - - pub const fn remove(&mut self, idx: usize) -> T; - - pub fn splice(&mut self, range: R, replace_with: I) -> Splice<'_, I::IntoIter, N> + pub fn splice(&mut self, range: R, replace_with: I) -> Result, IndexError> where I: IntoIterator, R: RangeBounds; - pub fn split_off(&mut self, at: usize) -> 
Self; + pub fn split_off(&mut self, at: usize) -> Result; - pub fn swap_remove(&mut self, idx: usize) -> T; + pub fn swap_remove(&mut self, idx: usize) -> Result; - // Verifiable methods - - pub const fn pop(&mut self) -> Option; + // Maybe needed: Some sort of `from_ptr(*ptr, len)` that would ease FFI use } ``` -Meaningless, unstable and deprecated methods like `reserve` or `drain_filter` -weren't considered. A concrete implementation is available at -https://github.com/c410-f3r/stack-based-vec. +Traits that are implemented for `Vec` and `array` will be implemented for +`ArrayVec`, as applicable. Methods that are meaningless for a fixed capacity, +unstable, or deprecated (such as `reserve` or `drain_filter`) weren't considered. + +**RFC TODO** _We need a discussion on `FromIterator`. I don't know whether it +belongs in this RFC, or would be better to mention as a future use case_ # Drawbacks [drawbacks]: #drawbacks @@ -283,28 +273,29 @@ https://github.com/c410-f3r/stack-based-vec. ### Additional complexity New and existing users are likely to find it difficult to differentiate the -purpose of each vector type, especially people that don't have a theoretical -background in memory management. +purpose of each vector type, especially those who don't have a theoretical +background in memory management. This can be mitigated by providing coherent +docs in `ArrayVec`. ### The current ecosystem is fine -`ArrayVec` might be an overkill in certain situations. If someone wants to use -stack memory in a specific application, then it is just a matter of grabbing the -appropriated crate. +`ArrayVec` is arguably not needed in `core`, as there are a handful of existing +crates to handle the problem. However, being available in `core` will add the +possibility of Rust using the feature, which otherwise wouldn't be an option.
# Prior art [prior-art]: #prior-art These are the most known structures: - * `arrayvec::ArrayVec`: Uses declarative macros and an `Array` trait for +- `arrayvec::ArrayVec`: Uses declarative macros and an `Array` trait for implementations but lacks support for arbitrary sizes. - * `heapless::Vec`: With the usage of `typenum`, can support arbitrary sizes +- `heapless::Vec`: With the usage of `typenum`, can support arbitrary sizes without a nightly compiler. - * `staticvec::StaticVec`: Uses unstable constant generics for arrays of +- `staticvec::StaticVec`: Uses unstable constant generics for arrays of arbitrary sizes. - * `tinyvec::ArrayVec`: Supports fixed and arbitrary (unstable feature) sizes - but requires `T: Default` for security reasons. +- `tinyvec::ArrayVec`: Supports fixed and arbitrary (unstable feature) sizes + but requires `T: Default` to avoid unsafe `MaybeUninit`. As seen, there isn't an implementation that stands out among the others because all of them roughly share the same purpose and functionality. Noteworthy is the @@ -314,70 +305,15 @@ unified approach for arbitrary array sizes. # Unresolved questions [unresolved-questions]: #unresolved-questions -### Verifiable methods - -Unlike methods that will abort the current thread execution, verifiable methods -will signal that something has gone wrong or is missing. This approach has two -major benefits: - -- `Security`: The user is forced to handle possible variants or corner-cases and - enables graceful program shutdown by wrapping everything until `fn main() -> - Result<(), MyCustomErrors>` is reached. - -- `Flexibility`: Gives freedom to users because it is possible to choose - between, for example, `my_full_array_vec.push(100)?` (check), - `my_full_array_vec.push(100).unwrap()` (panic) or `let _ = - my_full_array_vec.push(100);` (ignore). 
- -In regards to performance, since the upper capacity bound is known at -compile-time and the majority of methods are `#[inline]`, the compiler will -probably have the necessary information to remove most of the conditional -bounding checking when producing optimized machine code. - -```rust -pub fn drain(&mut self, range: R) -> Option> -where - R: RangeBounds; - -pub fn extend_from_cloneable_slice<'a>(&mut self, other: &'a [T]) -> Result<(), &'a [T]> -where - T: Clone; - -pub fn extend_from_copyable_slice<'a>(&mut self, other: &'a [T]) -> Result<(), &'a [T]> -where - T: Copy; - -pub const fn insert(&mut self, idx: usize, element: T) -> Result<(), T>; - -pub const fn push(&mut self, element: T) -> Result<(), T>; - -pub const fn remove(&mut self, idx: usize) -> Option; - -pub fn splice(&mut self, range: R, replace_with: I) -> Option> -where - I: IntoIterator, - R: RangeBounds; - -pub fn split_off(&mut self, at: usize) -> Option; - -pub fn swap_remove(&mut self, idx: usize) -> Option; -``` - -In my opinion, every fallible method should either return `Option` or `Result` -instead of panicking at run-time. Although the future addition of `try_*` -variants can mitigate this situation, it will also bring additional maintenance -burden. - ### Nomenclature `ArrayVec` will conflict with `arrayvec::ArrayVec` and `tinyvec::ArrayVec`. - -### Prelude - -Should it be included in the prelude? +`BufVec` or `BufferVec` may be alternatives. ### Macros +Macros should likely mimic `vec!`. + ```rust // Instance with 1i32, 2i32 and 3i32 let _: ArrayVec = array_vec![1, 2, 3]; @@ -389,29 +325,6 @@ let _: ArrayVec = array_vec![1; 2]; # Future possibilities [future-possibilities]: #future-possibilities -### Dynamic array - -An hybrid approach between heap and stack memory could also be provided natively -in the future. 
- -```rust -pub struct DynVec { - // Hides internal implementation - data: DynVecData, -} - -impl DynVec { - // Much of the `Vec` API goes here -} - -// This is just an example. `Vec` could be `Box` and `enum` an `union`. -enum DynVecData { - Heap(Vec), - Inline(ArrayVec), -} -``` - -The above description is very similar to what `smallvec` already does. ### Generic collections and generic strings From b4ae425419dfe34ef7cb0e3cff39ef29fb86c1fc Mon Sep 17 00:00:00 2001 From: Trevor Gross Date: Thu, 15 Sep 2022 00:19:22 -0400 Subject: [PATCH 6/8] Updates to all sections, ready for draft PR --- text/0000-array_vec.md | 267 ++++++++++++++++++++++++++--------------- 1 file changed, 168 insertions(+), 99 deletions(-) diff --git a/text/0000-array_vec.md b/text/0000-array_vec.md index 1d2e9afbfca..2b9d46c0ae6 100644 --- a/text/0000-array_vec.md +++ b/text/0000-array_vec.md @@ -4,9 +4,6 @@ - Rust Issue: [rust-lang/rust#0000](https://github.com/rust-lang/rust/issues/0000) - Original PR: [rust-lang/rfcs#2990](https://github.com/rust-lang/rfcs/pull/2990) -**RFC TODO** _Would a name like `BufVec`/`BufferVec` be better? This is sort of -generic across both `array`s and buffers for things like MMIO that may benefit -from the structure._ # Summary [summary]: #summary @@ -16,34 +13,39 @@ within a fixed size memory buffer, with associated methods to easily manipulate it. The interface will mimic the most common methods of `Vec`, and this memory buffer is an array; hence, the selected name of `ArrayVec`. This will provide Rust with a representation of a very prevelant programming concept to enable -higher-level data manipulation without heap reliance. +higher-level data manipulation without heap reliance, as well as create a +backend to simplify implementation of various concepts in `core`. # Motivation [motivation]: #motivation -Vectors provide one of the easiest ways to work with data that may change its -length, and this is provided in Rust via `std::vec::Vec`. 
However, this requires -heap allocations, and this may not always be desirable in cases where: +Vectors provide one of the easiest ways to work with data that does not have a +fixed length, and an API to do this is provided in Rust via `std::vec::Vec`. +However, vectors require heap allocations that may not be desirable (or +possible) in cases where: - An allocator is not available. This is typically `no_std` environments like embedded, kernel, or safety-critical applications. - A previous stack frame provides a buffer for data, and heap allocating would - be redundant. (This is very pervasive in C which has no vector representation, - which extends to Rust's FFI. Instead of vectors, function signatures like - `void somefunc(buf [BUF_SIZE], int* len)` are used when a function must return - variable-length data.) + be redundant. This is common in C, which has no vector type, so function + signatures like `void somefunc(int buf[BUF_SIZE], int* len)` are used when a + function must return variable-length data. `ArrayVec` would provide a + convenient wrapper for these representations, bolstering the ease of use of + Rust's FFI. - `Vec`-style data structures are required in `const` scopes - Small or short-lived representations of variable data are preferred for performance or memory optimization -- The buffer does not represent memory, e.g. memory-mapped I/O **RFC TODO** _is - this even worth mentioning? Could we guarantee anything that would make this - useful in MMIO? Would it be good/better to provide a trait for `push`, `pop`, - etc that would apply for this, some custom MMIO implementation, and `Vec`?_ +- The buffer does not represent actual memory, e.g. memory-mapped I/O -While this sort of datastructure is likely to usually reside on the stack, it is -entirely possible to reside in some form on the heap within a `Box`, `Vec`, or -other structure. +_(**RFC TODO** is the last item even worth mentioning?
Could we guarantee +anything that would make this useful in MMIO? Would it be good/better to provide +a trait for `push`, `pop`, etc that would apply for this, some custom MMIO +implementation, and `Vec`?)_ + +While this sort of datastructure will usually reside on the stack, it is +entirely possible to be placed on the heap within something like a `Box` or +`Vec`. Possibly the most persuasive argument for why `ArrayVec` belongs in Rust's `core` is that bits and pieces of the language already use it. Additionally, it @@ -51,26 +53,26 @@ would provide a pathway for easing future development instead of piecewise re-implementing the concept as needed. Some examples: - [`try_collect_into_array`][try_collect_arr] and its variants are used - internally. This function wraps a `Guard` struct containing an array and a - length that it initializes item by item. Essentially, _this is the fundamental - structure of `ArrayVec`_, it is just not made public. Having `ArrayVec` would - allow simplifying this function. + internally. This function wraps a `Guard` struct containing a `MaybeUninit` + array and a length that it initializes item by item. Essentially, _this is the + fundamental structure of `ArrayVec`_, it is just not made public. Having + `ArrayVec` would allow simplifying this function and others like it. - The much-requested feature of some way to collect into arrays would have a - more clear path + more clear path, potentially by making `try_collect_into_array` public - Constructing a `core::ffi::CStr` is not directly possible from `&str` due to - the extra bit. `ArrayVec` would allow for a more clear way to perform this - common operation in `no_std` environments. + the extra bit needed. `ArrayVec` would allow for a more clear way to perform + this common operation in `no_std` environments. 
- A structure such as `ArrayString` would be posssible to enable easier string manipulation in `no_std` environments In short, the benefits to an `ArrayVec` concept are notable enough that there are already parts of the implementation in core, and there are a handful of top 100 crates that provide similar functionality. Exporsing a public `ArrayVec` in -`core` would help fragmentation, provide a pathway for future language features, -and give users a builtin tool for a common form of data manipulation. +`core` would help reduce fragmentation, provide a pathway for future language +features, and give users a builtin tool for a common form of data manipulation. -[try_collect_arr]: https://github.com/rust-lang/rust/blob/17cbdfd07178349d0a3cecb8e7dde8f915666ced/library/core/src/array/mod.rs#L804) +[try_collect_arr]: https://github.com/rust-lang/rust/blob/17cbdfd07178349d0a3cecb8e7dde8f915666ced/library/core/src/array/mod.rs#L804 # Guide-level explanation @@ -83,47 +85,55 @@ be known at compile time, and is specified through a generic paramter. See the comments in this example: ```rust -// RFC TODO: core::collections and core::ext have also been proposed -use core::array::ArrayVec; +use core::collections::ArrayVec; + +const ARR_LEN: usize = 4; -// We are creating an `ArrayVec` here that contains an i32, and has a capacity +// We are creating an `ArrayVec` here that contains instances of i32, and has a capacity // of 4. That capacity cannot be changed during runtime -let mut v: ArrayVec = ArrayVec::new(); +let mut v: ArrayVec = ArrayVec::new(); // Adding values to this `ArrayVec` works almost as you would expect // One difference is, `push()` returns a `Result<(), InsertionError>`. 
-// This is because there is a higher chance that the insertion may fail at runtime, -// compared to `Vec` +// This is because there is a much higher chance that the insertion may fail at +// runtime (running out of space on the buffer) compared to `Vec` v.push(1).unwrap(); v.push(2).unwrap(); -// Length, indexing, and end access work similarly to other data structures +// Length and indexing work similarly to other data structures assert_eq!(v.len(), 2); assert_eq!(v[0], 1); - assert_eq!(v.pop(), Some(2)); assert_eq!(v.len(), 1); // Indexed assignment works as expected // **RFC TODO** what is the safe/checked way to perform assignment? It seems // like there should be a `.set(&T) -> Result` method to go along with `get`, -// but I don't know what it is +// but I don't know what it is (probably just missing something) v[0] = 7; assert_eq!(v[0], 7); -// Many higher-order concepts work from `Vec` as well +// Many higher-order concepts from `Vec` work as well v.extend([1, 2, 3].iter().copied()); -for element in &v { +// `ArrayVec` can also be iterated +for element in v { println!("{}", element); } + +v.iter_mut().for_each(|x| *x += 2); + +// And can be cloned and compared +assert_eq!(v, v.clone()); + +// Comparisons to standard arrays also work assert_eq!(v, [7, 1, 2, 3]); ``` In the above example, the `ArrayVec` is allocated on the stack, which is its usual home (though one can be present on the heap within another type). There -are advantages and disadvantages to this, but the main thing is that the maximum -capacity of the `ArrayVec` must be known at compile time. +are advantages and disadvantages to this, but the main thing to keep in mind is +that the maximum capacity of the `ArrayVec` must be known at compile time. 
```rust // `av` can store up to 64 elements @@ -144,6 +154,16 @@ const MY_CONST_ARRAY_VEC: ArrayVec = { }; ``` +This will also implement macros that mirror `vec!`: + +```rust +// Instantiate an i32 ArrayVec with capacity 40 and elements 1, 2, and 3 +let _: ArrayVec = array_vec![1, 2, 3]; + +// Instantiate an i32 ArrayVec with capacity 64, and 4 instances of `1` +let _: ArrayVec = array_vec![1; 4]; +``` + The biggest downside to `ArrayVec` is, as mentioned, that its capacity cannot be changed at runtime. For this reason, `Vec` is generally preferable unless you know you have a case that requires `ArrayVec`. @@ -158,7 +178,6 @@ array_vec.push(1).unwrap(); // Error! In the above example, the `push()` fails because the `ArrayVec` is already full. -**RFC TODO** _I will add some more here_. # Reference-level explanation [reference-level-explanation]: #reference-level-explanation @@ -192,7 +211,7 @@ impl ArrayVec { pub const fn new() -> Self; - // Basic methods + // Basic methods similar to `Vec` pub const fn insert(&mut self, idx: usize, element: T) -> Result<(), InsertionError>; pub const fn push(&mut self, element: T) -> Result<(), InsertionError>; @@ -211,20 +230,19 @@ impl ArrayVec { pub const fn last_mut(&self) -> Option<&mut T> - // General methods - // **RFC TODO** verify what makes sense to return a `Result` - pub const fn as_mut_ptr(&mut self) -> *mut T; + pub fn clear(&mut self); + // General methods + pub const fn as_slice(&self) -> &[T]; + pub const fn as_mut_slice(&mut self) -> &mut [T]; - + pub const fn as_ptr(&self) -> *const T; - pub const fn as_slice(&self) -> &[T]; + pub const fn as_mut_ptr(&mut self) -> *mut T; pub const fn capacity(&self) -> usize; - pub fn clear(&mut self); - pub const fn is_empty(&self) -> bool; pub const fn len(&self) -> usize; @@ -256,84 +274,135 @@ impl ArrayVec { pub fn swap_remove(&mut self, idx: usize) -> Result; - // Maybe needed: Some sort of `from_ptr(*ptr, len)` that would ease FFI use + // **RFC TODO** Is this name 
appropriate? Should const + // This is designed to easily map to C's `void somefunc(int buf[BUF_SIZE], int* len)` + pub unsafe fn from_raw_parts_mut<'a, T, const N: usize>( + data: *mut [T; N], + len: usize, + ) -> &'a mut Self + } ``` Traits that are implemented for `Vec` and `array` will be implemented for -`ArrayVec`, as is applicable. Unstable and deprecated methods like `reserve` or -`drain_filter` weren't considered. - -**RFC todo** _We need a discussion on `FromIter`. I don't know whether it belongs -in this RFC, or would be better to mention as a future use case_ +`ArrayVec`, as is applicable. These may include: + +- `AsMut<[T]>` +- `AsRef<[T]>` +- `Borrow<[T]>` +- `Clone` +- `Debug` (creates an empty `ArrayVec`) +- `Deref` +- `DerefMut` +- `Drop` +- `Extend` +- `FromIterator` _**RFC Note** this needs discussion_ +- `Hash` +- `Index` +- `IndexMut` +- `IntoIterator` +- `Ord` +- `PartialEq` +- `TryFrom` _**RFC Note** probably need to use this wherever possible instead of `From`_ + +The list of traits above is a tall list, and it is likely to require some +pruning based on what is possible, and what takes priority. # Drawbacks [drawbacks]: #drawbacks -### Additional complexity +One drawback is that new (and existing) users are likely to find it difficult to +differentiate the purpose of each vector type, especially those that don't have +a theoretical background in memory management. This can be mitigated by +providing coherent docs in `ArrayVec` that indicate `Vec` is to be preferred. -New and existing users are likely to find it difficult to differentiate the -purpose of each vector type, especially those that don't have a theoretical -background in memory management. This can be mitigated by providing coherent -docs in `ArrayVec`. +The main drawback with anything new is that adding _any_ code adds a maintenance +overhead.
The authors of the RFC consider this to nevertheless be a worthwhile +addition because it simplifies design patterns used not only by external users, +but also by `Rust` itself. -### The current ecosystem is fine -`ArrayVec` is arguably not needed in `core`, as there are a handful of existing -crates to handle the problem. However, being available in `core` will add the -possiblity of Rust using the feature, which otherwise wouldn't be an option. +# Rationale and alternatives +[rationale-and-alternatives]: #rationale-and-alternatives + +_**RFC TODO** More can be added to this section once an interface is decided +upon_ # Prior art [prior-art]: #prior-art -These are the most known structures: - -- `arrayvec::ArrayVec`: Uses declarative macros and an `Array` trait for - implementations but lacks support for arbitrary sizes. -- `heapless::Vec`: With the usage of `typenum`, can support arbitrary sizes - without a nightly compiler. -- `staticvec::StaticVec`: Uses unstable constant generics for arrays of - arbitrary sizes. -- `tinyvec::ArrayVec`: Supports fixed and arbitrary (unstable feature) sizes - but requires `T: Default` to avoid unsafe `MaybeUninit`. - -As seen, there isn't an implementation that stands out among the others because -all of them roughly share the same purpose and functionality. Noteworthy is the -usage of constant generics that makes it possible to create an efficient and -unified approach for arbitrary array sizes. +Similar concepts have been implemented in crates: + +- `smallvec::SmallVec` Smallvec uses an interface that predates const generics, + i.e. `SmallVec` with `Array` implemented for `[T; 0]`, `[T; 1]`, + etc. It allows overflowing of the data onto the heap. +- `arrayvec::ArrayVec`: Similar to the implementation described here, quite + popular but unfortunately about a year out of maintenance +- `heapless::Vec`: Similar to the implementation described here, also includes + many other nonallocating collections. 
+- `tinyvec::ArrayVec`: Provides similar features to the described + implementation using only safe code. Has features +- `staticvec::StaticVec`: Similar features to the described implementation, + generally regarded as the most performant crate (so should be observed for + implementation guidelines) +- Work in `core` as described in [motivation](#motivation) # Unresolved questions [unresolved-questions]: #unresolved-questions -### Nomenclature +### Generic Interface -`ArrayVec` will conflict with `arrayvec::ArrayVec` and `tinyvec::ArrayVec`. -`BufVec` or `BufferVec` may be alternatives. +Thom Chiovoloni suggested an alternate interface based around `ArrayVec<[T]>` +and `ArrayVec<[T; N]>` [in a comment on the original pull +request](https://github.com/rust-lang/rfcs/pull/2990#issuecomment-848962572) +that allows for some code deduplication in assembly. The generics API is not +quite as elegant or clear as ``, but the benefits are worth +investigating. -### Macros +### Slice Backing -Macros should likely mimic `vec!`. +A more generic representation of an `ArrayVec` could be something that is +slice-backed, rather than array-backed. This could be quite powerful and is +worth looking into because it would allow using the `ArrayVec` API in places +where size is not known at compile time. For example: allocations in heap, +chunks of a stack-based buffer, FFI buffers where length is passed as an +argument, or really any arbitrary slice. -```rust -// Instance with 1i32, 2i32 and 3i32 -let _: ArrayVec = array_vec![1, 2, 3]; +Developing a clean API for this concept that still allows initializing as an +array is difficult. The "Generic Interface" suggestion directly above may +provide a solution via `ArrayVec<&[T]>`, or perhaps an enum-based option could +work. -// Instance with 1i32 and 1i32 -let _: ArrayVec = array_vec![1; 2]; +If slices are acceptable backing, something like `BufVec` would likely be a +better name. 
Additional methods along the lines of the following could be added +(generics and lifetimes omitted for brevity): + +```rust +/// Creates a zero-length ArrayVec that will be located on the provided slice +fn from_slice(buf: &mut [T]) -> Self + +/// Create an `ArrayVec` on a slice with a specified length of elements +/// +/// Safety: this function is not unsafe within Rust as all slices always contain +/// valid data. However, if the slice is coming from an external FFI, note that +/// the first `len` items of `buf` _must_ contain valid data, otherwise undefined +/// behavior is possible. +fn from_slice_with_len(buf: &mut [T], len: usize) -> Self ``` # Future possibilities [future-possibilities]: #future-possibilities -### Generic collections and generic strings +### `ArrayVec`-backed `StringVec` -Many structures that use `alloc::vec::Vec` as the underlying storage can also -use stack or hybrid memory, for example, an hypothetical `GenericString`, -where `S` is the storage, could be split into: +A simple extension of `ArrayVec` would be `StringVec`, an array of +`u8`s. This would greatly simplify string manipulation options when a heap is +not available. -```rust -type DynString = GenericString>; -type HeapString = GenericString>; -type StackString = GenericString>; -``` + +### Easier interface between `CStr` and `&str` + +`ArrayVec` would allow for a function that enables converting between `&str` and +`CStr` by providing a fixed-size buffer to write the `&str` and the terminating +`\0`.
From a7b381b12224c6fe8a40d14a0695a72e3134ec18 Mon Sep 17 00:00:00 2001 From: Trevor Gross Date: Thu, 15 Sep 2022 00:44:58 -0400 Subject: [PATCH 7/8] Update with PR number --- text/{0000-array_vec.md => 3316-array-vec.md} | 5 ++--- 1 file changed, 2 insertions(+), 3 deletions(-) rename text/{0000-array_vec.md => 3316-array-vec.md} (98%) diff --git a/text/0000-array_vec.md b/text/3316-array-vec.md similarity index 98% rename from text/0000-array_vec.md rename to text/3316-array-vec.md index 2b9d46c0ae6..d8a6158c5af 100644 --- a/text/0000-array_vec.md +++ b/text/3316-array-vec.md @@ -1,8 +1,7 @@ - Feature Name: `array_vec` - Start Date: 2020-09-27 -- RFC PR: [rust-lang/rfcs#0000](https://github.com/rust-lang/rfcs/pull/0000) -- Rust Issue: [rust-lang/rust#0000](https://github.com/rust-lang/rust/issues/0000) -- Original PR: [rust-lang/rfcs#2990](https://github.com/rust-lang/rfcs/pull/2990) +- RFC PR: [rust-lang/rfcs#3316](https://github.com/rust-lang/rfcs/pull/3316) +- Rust Issue: [rust-lang/rust#3316](https://github.com/rust-lang/rust/issues/3316) # Summary From 6a8f08d9640cc88c28eaa8dda998795668c857e8 Mon Sep 17 00:00:00 2001 From: Trevor Gross Date: Thu, 15 Sep 2022 01:13:43 -0400 Subject: [PATCH 8/8] Update rust issue number --- text/3316-array-vec.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/text/3316-array-vec.md b/text/3316-array-vec.md index d8a6158c5af..f263f9f30bf 100644 --- a/text/3316-array-vec.md +++ b/text/3316-array-vec.md @@ -1,7 +1,7 @@ - Feature Name: `array_vec` - Start Date: 2020-09-27 - RFC PR: [rust-lang/rfcs#3316](https://github.com/rust-lang/rfcs/pull/3316) -- Rust Issue: [rust-lang/rust#3316](https://github.com/rust-lang/rust/issues/3316) +- Rust Issue: [rust-lang/rust#0000](https://github.com/rust-lang/rust/issues/0000) # Summary