Tracking Issue for Vec::push_within_capacity #100486

the8472 · 2022-08-13T08:55:19Z

Feature gate: #![feature(vec_push_within_capacity)]

This is a tracking issue for Vec::push_within_capacity

It enables pushing into a Vec if there is any capacity left and otherwise returns the value. I.e. it avoids implicit resizing. This can be useful when one wants to explicitly control it (e.g. via try_reserve) or uses a Vec in an unsafe manner where moving the elements would be a problem.

Public API

// alloc::Vec

impl Vec<T, A> {
    pub fn push_within_capacity(&mut self, value: T) -> Result<(), T>;
}

Steps / History

Implementation: add Vec::push_within_capacity - fallible, does not allocate #89123
Final comment period (FCP)¹
Stabilization PR

Unresolved Questions

naming (e.g. push_in_capacity)
may become redundant due to larger-scale work in RFC: Add more support for fallible allocations in Vec rfcs#3271

https://std-dev-guide.rust-lang.org/feature-lifecycle/stabilization.html ↩

The text was updated successfully, but these errors were encountered:

em-tg · 2023-03-19T15:20:45Z

Would it be possible for push_within_capacity to take self by shared reference? If the backing store is never re-allocated then calling this wouldn't invalidate any references to the Vec's contents, I'd think.

the8472 · 2023-03-19T22:57:39Z

It still needs to bump the length field.

em-tg · 2023-03-19T23:47:38Z

Wouldn't wrapping the length in UnsafeCell work for that? The public api only exposes the len() getter.

em-tg · 2023-03-20T00:27:24Z

Ah wait I think I see the issue. You'd lose Sync by doing that. Sorry, still pretty new to Rust and just thought I'd ask!

tiby312 · 2023-06-19T23:44:53Z

I think this is a great feature. I often find myself just assuming my buffer reserving code is working since it is a hard bug to catch when it's not working given push automatically reserves.

krtab · 2023-10-27T09:05:34Z

Hi!

I think this could enter its FCP:

RFC: Add more support for fallible allocations in Vec rfcs#3271 is not merged. Parts of it were released as a crate, part of it has been delegated to keyword genericity (so, pretty far away)
The code of add Vec::push_within_capacity - fallible, does not allocate #89123 has been in nightly for a year and received no criticisms afaict
This API can prove very useful to add optimization hints by marking its Err return value as unreachable

kornelski · 2024-02-11T21:34:30Z

I find it very useful, and have similar helpers in my codebases. It generates much leaner code than the usual push when the optimizer can't be convinced that the capacity is sufficient (e.g. when it's calculated in a non-trivial way, or the vec-appending code isn't all inlined). It has a useful side effect of verifying that a sufficient capacity has been precalculated, and there aren't unexpected reallocations.

However, just push is not enough. I'm missing similar functionality for extend_from_slice — _within_capacity.

elichai · 2024-02-11T22:12:52Z

I hope this isn't completely useful,
But is there some kind of mechanism we can use to allow all these try_ functionality without adding dozens of try_ in all collections, could being generic over the effect of try allow something like this?

Darksonn · 2024-05-01T13:56:29Z

How about insert_within_capacity? It's very non-trivial to reimplement insert without an allocation codepath.

the8472 · 2024-05-01T14:12:09Z

Sounds interesting, though a bit niche (just like this method). If you have a use-case you should file an ACP for it.
Ditto @kornelski

If you several a bunch of methods like this in mind then we could perhaps bundle them to see if there are some overarching things to consider.

zeroflaw · 2024-05-09T12:49:23Z

Now that it's possible to do a try_reserve, I find myself using these extension methods a lot more often. push_within_capacity here gives us an escape hatch to do an unchecked_push.
I included an example of insert_within_capacity and try_insert_within_capacity

// ===== trait Ext =====

pub trait Ext<T> {
    /// Appends an element to the back of a collection without checking
    /// if the collection has enough capacity.
    ///
    /// # Safety
    /// The user must ensure that `self.len() < self.capacity()`.
    ///
    unsafe fn unchecked_push(&mut self, value: T);

    /// Removes the last element from a vector without checking
    /// if the collection is empty.
    ///
    /// # Safety
    /// The user must ensure that `self.len() != 0`.
    ///
    unsafe fn unchecked_pop(&mut self) -> T;

    /// Inserts an element at position index within the vector, 
    /// without checking if the collection has enough capacity, 
    /// or the position index is out of bounds.
    /// all elements after the position index are shifted to the right. 
    /// 
    /// # Safety
    /// The user must ensure that `index <= self.len()` and `self.len() < self.capacity()`
    ///
    unsafe fn unchecked_insert(&mut self, index: usize, value: T);


    /// Inserts an element at position index within the vector, 
    /// if there is sufficient spare capacity, otherwise an error is returned with the element.
    /// 
    /// # Errors
    /// if insufficient capacity, an error is returned with the element
    /// 
    /// # Panics
    /// Panics if index > len.
    /// 
    fn insert_within_capacity(&mut self, index: usize, element: T) -> Result<(), T>;

    /// Inserts an element at position index within the vector, 
    /// if there is sufficient spare capacity, and index is within bounds,
    /// otherwise an error is returned with the element.
    /// 
    /// # Errors
    /// if insufficient capacity, or the index is out of bounds, an error is returned with the element
    /// 
    fn try_insert_within_capacity(&mut self, index: usize, element: T) -> Result<(), T>;
}

// ===== impl Vec =====

impl<T> Ext<T> for Vec<T> {
    unsafe fn unchecked_push(&mut self, value: T) {
        std::hint::assert_unchecked(self.len() < self.capacity());
        self.push_within_capacity(value).unwrap_unchecked();
    }

    unsafe fn unchecked_pop(&mut self) -> T {
        std::hint::assert_unchecked(!self.is_empty());
        self.pop().unwrap_unchecked()
    }
    
    unsafe fn unchecked_insert(&mut self, index: usize, element: T) {
        std::hint::assert_unchecked(index <= self.len());
        std::hint::assert_unchecked(self.len() < self.capacity());
        self.insert(index, element);
    }
    
    fn insert_within_capacity(&mut self, index: usize, element: T) -> Result<(), T> {
        if self.len() == self.capacity() {
            return Err(element);
        }
        unsafe { std::hint::assert_unchecked(self.len() < self.capacity()) };
        self.insert(index, element);
        Ok(())
    }
    
    fn try_insert_within_capacity(&mut self, index: usize, element: T) -> Result<(), T> {
        if self.len() == self.capacity() || index > self.len() {
            return Err(element);
        }
        unsafe { self.unchecked_insert(index, element); }
        Ok(())
    }
}

kornelski · 2024-05-09T21:43:02Z

There are two ways to implement such functionality, and I'm not sure which way to propose:

Add more methods on Vec with a _within_capacity version. I think this would work well, but I've got a lot of pushback about adding try_ versions of methods. Rust doesn't have a feature like Swift's function naming/overloading to keep all of these functionality tweaks elegantly together.
A reference type like FixedCapacityVecView<'_> that imitates interface of Vec, but without reallocation. This seems like a big addition, especially if it will duplicate most of Vec's interface, but most users will probably only need push and extend.

GoldsteinE · 2024-06-05T14:36:37Z

This FixedCapacityVecView<'_> is kinda similar to std::io::BorrowedBuf<'_> in both its API and idea.

kornelski · 2024-06-05T15:48:50Z

Although it's somewhat similar, BorrowedBuf is not a replacement:

it only supports u8, no other types.
it tracks initialized and filled levels separately, which is unnecessary for Vec. Just an extra field may seem like a small issue, but Vec had issues with missed optimizations and bloated code when LLVM lost track of len <= capacity relationship, and I'm afraid that tracking relationships betwee len, capacity, spare_capacity_slice.len(), filled, and initialized all at once may be hard for LLVM in hot loops. I use push in capacity mainly to reduce branches and code size.
it doesn't seem to have a way to integrate with a Vec in safe code. User needs to use vec.set_len() to update it with the filled len. This actually seems like a flaw in BorrowedVec to me.

the8472 · 2024-06-05T17:27:16Z

I assume that was only intended to point at some similar-not-identical prior art, not suggesting that BorrowedBuf can be reused for the purpose.

Add more methods on Vec with a within_capacity version. I think this would work well, but I've got a lot of pushback about adding try versions of methods.

Yeah, imo this is the biggest obstacle to getting anything done. It's either a lot new awkwardly named methods, view types or some generics/associated type shenanigans.
And some of the APIs require new design because they're not just 1:1 changes. Note how push_within_capacity has a return type, unlike push.

Thinking about design I now realize that there's an alternative, it could take a &mut MaybeUninit<T> and not move the value if the capacity is full. Would that bring any perf benefits, maybe for large values?
And that's just an alternative for a single method. This kind of exploration is why it's often suggested to test these thing as a crate outside of std first.

mcroomp · 2024-12-19T17:54:07Z

How about something like this:

pub fn needs_to_grow<T>(&self, additional: usize) -> bool {
    additional > self.capacity().wrapping_sub(self.len())
}

it makes it easy to solve both problems... the trick is that the implementation must match exactly to what RawVec does, otherwise the optimizer doesn't realize that the operations are identical.

kornelski · 2024-12-19T23:19:49Z

I don't think it's important to return elements that couldn't fit in the capacity. This isn't like I/O functions or concurrent queues where you can't predict whether insert will succeed or not. When you want to keep the elements that can't be inserted, it's sufficient to check the capacity before push/extend.

mcroomp · 2024-12-20T06:20:06Z

I don't think it's important to return elements that couldn't fit in the capacity. This isn't like I/O functions or concurrent queues where you can't predict whether insert will succeed or not. When you want to keep the elements that can't be inserted, it's sufficient to check the capacity before push/extend.

The main problem is the that obvious x.len() + additional <= x.capacity() will not convince the optimizer not to check twice, you have to do additional > self.capacity().wrapping_sub(self.len()) to match the check that RawVec makes, which is why it would probably be better to have a utility method in Vec to avoid taking a dependency on the internal implementation.

hanna-kruppe · 2024-12-20T10:10:56Z

Returning the value that couldn’t be pushed/inserted/etc. only seems to matter if:

The element type isn’t Copy, and
The code using *_within_capacity can do something useful with the excess value after failing to put it in the Vec, and
It’s important that the capacity is only checked once for runtime efficiency or binary size.

I struggle to imagine a use case with these properties that isn’t isomorphic to “generic try_{push,insert,…} that handles OOM by returning Err”. The other uses I know of only involve Copy types (e.g., T = u8 for byte centric I/O without std::io), or have sufficient capacity as logical invariant (i.e., if the within_capacity call fails, that’s not meaningfully recoverable).

I guess if you use a Vec as a fixed size buffer of non-Copy elements and want to flush the buffer to another stage of processing whenever it gets full? This needs push_within_capacity (or eat the potential cost of a manual capacity check) but not any other methods (that imply a capacity check). A moving extend_from_slice analogue is sometimes useful, but if you want to do better than pushing one element at a time then the technical challenge isn’t “de-duplicate a capacity check” but rather arranging a right-sized memcpy from the source into your buffer (repeatedly until the source is exhausted). This can be done today with spare_capacity_mut without having to assume anything undocumented about Vec.

If any use case for more than push_within_capacity amounts to tackling the “zoo of Vec::try_*” challenge, then I’d rather see just push_within_capacity stabilized on its own merits.

the8472 added T-libs-api Relevant to the library API team, which will review and decide on the PR/issue. C-tracking-issue Category: An issue tracking the progress of sth. like the implementation of an RFC labels Aug 13, 2022

VictorKoenders mentioned this issue Mar 30, 2023

[WIP] Deny OOM, embrace try_reserve bincode-org/bincode#448

Open

10 tasks

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Tracking Issue for Vec::push_within_capacity #100486

Tracking Issue for Vec::push_within_capacity #100486

the8472 commented Aug 13, 2022 •

edited

Loading

em-tg commented Mar 19, 2023

the8472 commented Mar 19, 2023

em-tg commented Mar 19, 2023

em-tg commented Mar 20, 2023

tiby312 commented Jun 19, 2023

krtab commented Oct 27, 2023

kornelski commented Feb 11, 2024

elichai commented Feb 11, 2024

Darksonn commented May 1, 2024

the8472 commented May 1, 2024

zeroflaw commented May 9, 2024

kornelski commented May 9, 2024

GoldsteinE commented Jun 5, 2024

kornelski commented Jun 5, 2024

the8472 commented Jun 5, 2024 •

edited

Loading

mcroomp commented Dec 19, 2024

kornelski commented Dec 19, 2024

mcroomp commented Dec 20, 2024 •

edited

Loading

hanna-kruppe commented Dec 20, 2024

Tracking Issue for Vec::push_within_capacity #100486

Tracking Issue for Vec::push_within_capacity #100486

Comments

the8472 commented Aug 13, 2022 • edited Loading

Public API

Steps / History

Unresolved Questions

Footnotes

em-tg commented Mar 19, 2023

the8472 commented Mar 19, 2023

em-tg commented Mar 19, 2023

em-tg commented Mar 20, 2023

tiby312 commented Jun 19, 2023

krtab commented Oct 27, 2023

kornelski commented Feb 11, 2024

elichai commented Feb 11, 2024

Darksonn commented May 1, 2024

the8472 commented May 1, 2024

zeroflaw commented May 9, 2024

kornelski commented May 9, 2024

GoldsteinE commented Jun 5, 2024

kornelski commented Jun 5, 2024

the8472 commented Jun 5, 2024 • edited Loading

mcroomp commented Dec 19, 2024

kornelski commented Dec 19, 2024

mcroomp commented Dec 20, 2024 • edited Loading

hanna-kruppe commented Dec 20, 2024

the8472 commented Aug 13, 2022 •

edited

Loading

the8472 commented Jun 5, 2024 •

edited

Loading

mcroomp commented Dec 20, 2024 •

edited

Loading