Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Tracking Issue for MaybeUninit methods for arrays #96097

Open
4 of 6 tasks
clarfonthey opened this issue Apr 16, 2022 · 34 comments
Open
4 of 6 tasks

Tracking Issue for MaybeUninit methods for arrays #96097

clarfonthey opened this issue Apr 16, 2022 · 34 comments
Labels
C-tracking-issue Category: An issue tracking the progress of sth. like the implementation of an RFC disposition-close This PR / issue is in PFCP or FCP with a disposition to close it. finished-final-comment-period The final comment period is finished for this PR / Issue. T-libs-api Relevant to the library API team, which will review and decide on the PR/issue.

Comments

@clarfonthey
Copy link
Contributor

clarfonthey commented Apr 16, 2022

This is a meta-tracking issue for multiple APIs that are linked across multiple issues. Right now it only includes two methods, but since there seems to be a desire to add more, this issue can be used as a placeholder for those discussions until those methods are added.

Public API

impl<T> MaybeUninit<T> {
    pub const fn uninit_array<const N: usize>() -> [Self; N];
    pub const fn array_assume_init<const N: usize>(array: [Self; N]) -> [T; N];
}

impl<T, const N: usize> MaybeUninit<[T; N]> {
    pub const fn transpose(self) -> [MaybeUninit<T>; N];
}

impl<T, const N: usize> [MaybeUninit<T>; N] {
    pub const fn transpose(self) -> MaybeUninit<[T; N]>;
}

Steps / History

Relevant Links

Unresolved Questions

  • Should MaybeUninit::uninit_array::<LEN>() be stabilised if it can be replaced by [const { MaybeUninit::uninit() }; LEN] ?
  • What other APIs should be added for arrays?
  • Is array_assume_init the right pattern, or should we convert from [MaybeUninit<T>, N] back to MaybeUninit<[T; N]> first?
@clarfonthey clarfonthey added C-tracking-issue Category: An issue tracking the progress of sth. like the implementation of an RFC T-libs-api Relevant to the library API team, which will review and decide on the PR/issue. labels Apr 16, 2022
Dylan-DPC added a commit to Dylan-DPC/rust that referenced this issue Apr 16, 2022
…up, r=dtolnay

MaybeUninit array cleanup

* Links `MaybeUninit::uninit_array` to meta-tracking issue
* Links `MaybeUninit::array_assume_init` to meta-tracking issue
* Unstably constifies `MaybeUninit::array_assume_init`

Another thing worth mentioning: this splits the const feature flag for `maybe_uninit_uninit_array` into `const_maybe_uninit_uninit_array` to avoid weird cases where only one gets stabilised.

Note that it may be desired to keep the `array_assume_init` method linked to its dedicated issue, but at least for now, I decided to link to the meta-tracking issue so that all of the methods lead users to the same place. But I can revert that bit if desired.

The meta-tracking issue that I filed is rust-lang#96097.
@joboet
Copy link
Member

joboet commented Apr 16, 2022

I'm a bit concerned about the size and inconsistency of this API, as the corresponding functions for single values are all methods, while for arrays and slices we have associated functions. For example, array_assume_init fundamentally does the same thing as assume_init, but is called in a different way, with a different name. If the API was complete (zeroed_array is currently missing, among others), we would have 8 or so functions for arrays alone (currently only two, but a lot of functionality still unnecessarily requires unsafe).

I would rather prefer a trait-based API like this:

unsafe trait Uninitialized {
    type Initialized: ?Sized;

    fn uninit() -> Self
    where Self: Sized;
    fn zeroed() -> Self
    where Self: Sized;
    
    unsafe fn assume_init(self) -> Self::Initialized
    where Self: Sized;
    unsafe fn assume_init_ref(&self) -> &Self::Initialized;

    ...
}

unsafe impl<T> Uninitialized for MaybeUninit {...}
unsafe impl<T, const N: usize> Uninitialized for [T; N] {...}
unsafe impl<T> Uninitialized for [T] {...}

or with a marker trait like I proposed here.

@Noratrieb
Copy link
Member

I don't think it's worth making a trait for this. I've had many use cases for uninit_array before, and none for a generic solution. MaybeUninit is often used with arrays, since that's where initialization tends to be the most expensive.

While it is a trivial function to write yourself, I do think that it's worth it to stabilize maybe_uninit_uninit_array. It's a very common use case, and the alternative is to write am unsettling MaybeUninit::uninit ().assume_init() which is not very nice and makes it harder to audit the code.

@clarfonthey
Copy link
Contributor Author

clarfonthey commented Sep 10, 2022

Another thing worth asking: why not have methods directly implemented on arrays? You'd still need some form of uninit_array, but instead of MaybeUninit::array_assume_init(arr), you'd just do arr.assume_init(), and the implementation is on [MaybeUninit<T>; N].

This seems possible specifically because the types exist in core, but open to other interpretations.

I also suggested perhaps just adding Index and IndexMut implementations for MaybeUninit<[T]> that return MaybeUninit<T> and MaybeUninit<[T]> references, then removing the transposed versions of [MaybeUninit<T>] methods.

Of course, the weirdness of MaybeUninit<[T]> is that the length of the slice isn't uninit, just the data in the slice itself, since the pointer metadata is totally separate. But I feel like this is only a weird quirk that quickly becomes natural after a short explanation. Plus, this applies already to other DSTs like dyn Trait, even though a lot of methods require Sized. In principle you could coerce &mut MaybeUninit<T> into &mut MaybeUninit<dyn Trait> if it were more idiomatic and that would make sense as a type, even though you couldn't do much with it.

@WaffleLapkin
Copy link
Member

@clarfonthey with the current implementation, MaybeUninit<[T]> is not possible at all, MaybeUninit requires T: Sized.

@SUPERCILEX
Copy link
Contributor

SUPERCILEX commented Oct 15, 2022

Docs patch extracted from #102023:

Subject: [PATCH] Add MaybeUninit array transpose impls

Signed-off-by: Alex Saveau <[email protected]>
---
Index: library/core/src/mem/maybe_uninit.rs
IDEA additional info:
Subsystem: com.intellij.openapi.diff.impl.patch.CharsetEP
<+>UTF-8
===================================================================
diff --git a/library/core/src/mem/maybe_uninit.rs b/library/core/src/mem/maybe_uninit.rs
--- a/library/core/src/mem/maybe_uninit.rs	(revision 8147e6e427a1b3c4aedcd9fd85bd457888f80972)
+++ b/library/core/src/mem/maybe_uninit.rs	(date 1665873934720)
@@ -117,15 +117,12 @@
 /// `MaybeUninit<T>` can be used to initialize a large array element-by-element:
 ///
 /// ```
-/// use std::mem::{self, MaybeUninit};
+/// use std::mem::MaybeUninit;
 ///
 /// let data = {
-///     // Create an uninitialized array of `MaybeUninit`. The `assume_init` is
-///     // safe because the type we are claiming to have initialized here is a
-///     // bunch of `MaybeUninit`s, which do not require initialization.
-///     let mut data: [MaybeUninit<Vec<u32>>; 1000] = unsafe {
-///         MaybeUninit::uninit().assume_init()
-///     };
+///     // Create an uninitialized array of `MaybeUninit`.
+///     let mut data: [MaybeUninit<Vec<u32>>; 1000] = MaybeUninit::uninit().transpose();
 ///
 ///     // Dropping a `MaybeUninit` does nothing, so if there is a panic during this loop,
 ///     // we have a memory leak, but there is no memory safety issue.
@@ -133,25 +130,23 @@
 ///         elem.write(vec![42]);
 ///     }
 ///
-///     // Everything is initialized. Transmute the array to the
-///     // initialized type.
-///     unsafe { mem::transmute::<_, [Vec<u32>; 1000]>(data) }
+///     // Everything is initialized. Convert the array to the initialized type.
+///     unsafe { MaybeUninit::<[Vec<_>; 1000]>::assume_init(data.transpose()) }
 /// };
 ///
-/// assert_eq!(&data[0], &[42]);
+/// assert_eq!(&data[42], &[42]);
 /// ```
 ///
 /// You can also work with partially initialized arrays, which could
 /// be found in low-level datastructures.
 ///
 /// ```
 /// use std::mem::MaybeUninit;
 /// use std::ptr;
 ///
-/// // Create an uninitialized array of `MaybeUninit`. The `assume_init` is
-/// // safe because the type we are claiming to have initialized here is a
-/// // bunch of `MaybeUninit`s, which do not require initialization.
-/// let mut data: [MaybeUninit<String>; 1000] = unsafe { MaybeUninit::uninit().assume_init() };
+/// // Create an uninitialized array of `MaybeUninit`.
+/// let mut data: [MaybeUninit<String>; 1000] = MaybeUninit::uninit().transpose();
 /// // Count the number of elements we have assigned.
 /// let mut data_len: usize = 0;
 ///

JohnTitor added a commit to JohnTitor/rust that referenced this issue Oct 16, 2022
…r=scottmcm

Add MaybeUninit array transpose From impls

See discussion in rust-lang#101179 and rust-lang#96097. I believe this solution offers the simplest implementation with minimal future API regret.

`@RalfJung` mind doing a correctness review?
thomcc pushed a commit to tcdi/postgrestd that referenced this issue Feb 10, 2023
Add MaybeUninit array transpose From impls

See discussion in rust-lang/rust#101179 and rust-lang/rust#96097. I believe this solution offers the simplest implementation with minimal future API regret.

`@RalfJung` mind doing a correctness review?
@SimonSapin
Copy link
Contributor

SimonSapin commented Apr 21, 2023

Should MaybeUninit::uninit_array::<LEN>() be stabilised if it can be replaced by [const { MaybeUninit::uninit() }; LEN] ?

I’d like to argue for: yes. Stabilizing now can allow users to remove some unsafe blocks soon, whereas we don’t know how long it’s gonna be until const {…} blocks are stable and powerful enough for this use case. If and when they eventually are, we’d end up with two ways of doing the same thing but that’s not really harmful. At that point we can soft-deprecate the constructor by pointing out the new pattern in its docs. (Emitting an actual deprecation warning will not worth be the churn IMO but will be a possibility.)

What other APIs should be added for arrays?

New proposals can be made at any time. The methods already in Nightly don’t need to be blocked on exhausting the design space for other thematically-related methods.

@nbdd0121
Copy link
Contributor

What's the rationale behind separate feature flag for const? Is there any reason that, if we stabilise this function, we want it to be non-const?

I think we should merge the feature flags and stabilise it in one go.

@safinaskar
Copy link
Contributor

I don't like name of this method. transpose should be reserved for actual matrix transpose

@andylizi
Copy link
Contributor

andylizi commented Jun 2, 2023

I don't like name of this method. transpose should be reserved for actual matrix transpose

That ship has already sailed with Option::transpose and Result::transpose. In this case, I feel it's somewhat unlikely that people doing high-level matrix operations would need to use the low-level MaybeUninit at the same time.

Here's the original discussion for transpose naming options: #47338 (comment)

@AlexTMjugador
Copy link

AlexTMjugador commented Jul 7, 2023

Inline const expressions are said to render this method less necessary, but such expressions aren't even necessary for Copy types on stable Rust today:

use std::mem::{MaybeUninit, transmute};

fn main() {
    let mut arr = [MaybeUninit::uninit(); 4];

    for i in 0..arr.len() {
        unsafe { *arr[i].as_mut_ptr() = i as u8; }
    }

    // Prints "[0, 1, 2, 3]"
    println!("{:?}", unsafe { transmute::<_, [u8; 4]>(arr) });
}

What's then the point of this method for these cases? Does it have any codegen advantage? I checked on godbolt.org and the generated assembly for this code didn't initialize the array elements. (Edit: even after tweaking this code to initialize the array positions to std::env::args().count() so that the compiler could not get too smart I couldn't get it to emit initialization code for the array positions.)

Edit: given that for non-Copy types you would need to resort to nightly features anyway, I'd prefer to just use inline const expressions instead, but there is some digression on this opinion.

Edit 2: actually, it looks like even for non-Copy types you don't need to resort to nightly features when using the inline-const crate.

@Lokathor
Copy link
Contributor

Lokathor commented Jul 7, 2023

Because it labels the action you're intending to do, without the reader having to puzzle out "oh, this time the transmute was to initialize the data within the type system".

Exactly like f32::to_bits and Option::unwrap and dozens of other small helper functions. Putting the name on the action is the value, not because there's better codegen than you could write yourself.

@AlexTMjugador
Copy link

Thanks - I should clarify that I was referring to uninit_array in my comment, but I indeed see some readability advantages for adding array_assume_init.

@Lokathor
Copy link
Contributor

Lokathor commented Jul 7, 2023

Ah, with uninit_array there is a small but temporary benefit other than the "name the action" argument. SimonSapin mentioned it a little bit up in the thread: Currently the array size of an output array can be inferred but the length of an array expression cannot.

@clarfonthey
Copy link
Contributor Author

clarfonthey commented Jul 8, 2023

Another thing which is arguably inconsequential is that the intent is a bit more clear: even though I know that the compiler isn't really copying the same uninitialised data N times into the array, it feels better to explicitly say "I want an uninitialised array" rather than "I want to initialise an array with each element uninitialised."

This is actually a reason why I argue that MaybeUninit<[T; N]> being easier to use feels more natural, although due to concerns with how MaybeUninit<[T]> would work (it doesn't at all right now), this isn't really being pursued.

@SimonSapin
Copy link
Contributor

I believe [MaybeUninit::uninit(); 4] didn't work yet when adding uninit_array, it wasn't about "more explicit".

@SUPERCILEX
Copy link
Contributor

Godbolt says they are optimized identically.

Replacing u64 with a struct (a more realistic thing to use with MaybeUninit makes that false: rust.godbolt.org/z/o6ecsxjGa

Dang, that's disappointing.

@dtolnay dtolnay changed the title Meta tracking Issue for MaybeUninit methods for arrays Tracking Issue for MaybeUninit methods for arrays Nov 23, 2023
@mcronce
Copy link

mcronce commented Dec 3, 2023

Just realized, uninit_array can also be written with array::from_fn.

I'm glad you posted this, it's much nicer than the [(); N].map(|_| MaybeUninit::uninit()) that I was just writing 😂

@rdrpenguin04
Copy link

I agree that it is possible to obtain an uninitialized array other ways, but uninit_array is the most obvious way to do it, and it's the cleanest solution I've found too. Aside from that, what's blocking stabilization?

@nazar-pc
Copy link

I am interested in <&MaybeUninit<[T; N]>>::transpose() and <&mut MaybeUninit<[T; N]>>::transpose(). Specifically this is helpful when allocating Box<[T; N]> where [T; N] is too large to fit on the stack, it would make it possible to use with iterators.

Would it be considered related to this issue?

@dtolnay
Copy link
Member

dtolnay commented Jun 5, 2024

@rust-lang/libs-api:
@rfcbot fcp close

(Only in regard to MaybeUninit::uninit_array, not the other unstable APIs still tracked by this issue.)

I propose that we accept #125082 to remove MaybeUninit::uninit_array() in favor of having callers use [MaybeUninit::uninit(); N] and [const { MaybeUninit::uninit() }; N]. (The const block is required for T: !Copy.)

When uninit_array was originally introduced, it was useful because it was a safe wrapper around an unsafe implementation: unsafe { MaybeUninit::<[MaybeUninit<T>; N]>::uninit().assume_init() }. The best equivalent safe way to express what this does used to be significantly more verbose:

fn workaround<T, const N: usize>() -> [MaybeUninit<T>; N] {
    trait Workaround: Sized {
        const UNINIT: MaybeUninit<Self>;
    }
    impl<T> Workaround for T {
        const UNINIT: MaybeUninit<Self> = MaybeUninit::uninit();
    }
    [<T as Workaround>::UNINIT; N]
}

These days, with const {…} expressions stabilizing in Rust 1.79 (#104087), the justification for a dedicated uninit_array is weaker and limited to convenience and discoverability.

My opinion aligns with @kpreid's characterization in the PR description:

The only remaining question is whether it is an important enough convenience to keep it around.

I believe it is net good to remove this function, on the principle that it is better to compose two orthogonal features (MaybeUninit and array construction) than to have a specific function for the specific combination, now that that is possible.

The counter perspective is the one in #125082 (comment).

I still prefer MaybeUninit::uninit_array() to [const { MaybeUninit::uninit() }; N], it's shorter and more readable IMHO.

I do not dispute that it is 25% fewer characters.

For the common case of copyable contents like an uninit u8 buffer, [MaybeUninit::uninit(); N] is even shorter than that.

Regarding readability, the comparison assumes one has read and retained this part of https://doc.rust-lang.org/std/mem/union.MaybeUninit.html, which is not free. Superseding a part of the extensive, ad-hoc API of MaybeUninit with better composable semantics is probably good for readability.

For discoverability, I'd expect [MaybeUninit::uninit(); N] is more discoverable than MaybeUninit::uninit_array(). From there, a compiler diagnostic can hint to add const if dealing with a non-Copy element type.

@dtolnay dtolnay closed this as completed Jun 5, 2024
@dtolnay dtolnay reopened this Jun 5, 2024
@dtolnay
Copy link
Member

dtolnay commented Jun 5, 2024

@rfcbot fcp close

@rfcbot
Copy link

rfcbot commented Jun 6, 2024

Team member @dtolnay has proposed to close this. The next step is review by the rest of the tagged team members:

No concerns currently listed.

Once a majority of reviewers approve (and at most 2 approvals are outstanding), this will enter its final comment period. If you spot a major issue that hasn't been raised at any point in this process, please speak up!

See this document for info about what commands tagged team members can give me.

@rfcbot rfcbot added proposed-final-comment-period Proposed to merge/close by relevant subteam, see T-<team> label. Will enter FCP once signed off. disposition-close This PR / issue is in PFCP or FCP with a disposition to close it. labels Jun 6, 2024
@dtolnay
Copy link
Member

dtolnay commented Jun 11, 2024

I realized I'll need to file a compiler diagnostics bug. Currently on nightly:

use std::mem::MaybeUninit;

fn main() {
    let _: [MaybeUninit<String>; 2] = [MaybeUninit::uninit(); 2];
}
error[E0277]: the trait bound `String: Copy` is not satisfied
 --> src/main.rs:4:40
  |
4 |     let _: [MaybeUninit<String>; 2] = [MaybeUninit::uninit(); 2];
  |                                        ^^^^^^^^^^^^^^^^^^^^^ the trait `Copy` is not implemented for `String`, which is required by `MaybeUninit<String>: Copy`
  |
  = note: required for `MaybeUninit<String>` to implement `Copy`
  = note: the `Copy` trait is required because this value will be copied for each element of the array
  = help: create an inline `const` block, see RFC #2920 <https://github.com/rust-lang/rfcs/pull/2920> for more information
help: consider creating a new `const` item and initializing it with the result of the function call to be used in the repeat position
  |
4 ~     const ARRAY_REPEAT_VALUE: MaybeUninit<String> = MaybeUninit::uninit();
5 ~     let _: [MaybeUninit<String>; 2] = [ARRAY_REPEAT_VALUE; 2];
  |

We should change this to suggest [const { MaybeUninit::uninit() }; 2], instead of the current suggestion which is:

const ARRAY_REPEAT_VALUE: MaybeUninit<String> = MaybeUninit::uninit();
let _: [MaybeUninit<String>; 2] = [ARRAY_REPEAT_VALUE; 2];

@rfcbot rfcbot added the final-comment-period In the final comment period and will be merged soon unless new substantive objections are raised. label Jun 11, 2024
@rfcbot
Copy link

rfcbot commented Jun 11, 2024

🔔 This is now entering its final comment period, as per the review above. 🔔

@rfcbot rfcbot removed the proposed-final-comment-period Proposed to merge/close by relevant subteam, see T-<team> label. Will enter FCP once signed off. label Jun 11, 2024
@knickish
Copy link
Contributor

Is the intention that a new tracking issue be opened for the remaining methods?

@kpreid
Copy link
Contributor

kpreid commented Jun 11, 2024

A FCP to close doesn't mean that the issue has to be actually closed after it completes. dtolnay wrote

(Only in regard to MaybeUninit::uninit_array, not the other unstable APIs still tracked by this issue.)

@rfcbot rfcbot added finished-final-comment-period The final comment period is finished for this PR / Issue. to-announce Announce this issue on triage meeting and removed final-comment-period In the final comment period and will be merged soon unless new substantive objections are raised. labels Jun 21, 2024
@rfcbot
Copy link

rfcbot commented Jun 21, 2024

The final comment period, with a disposition to close, as per the review above, is now complete.

As the automated representative of the governance process, I would like to thank the author for their work and everyone else who contributed.

@Ciel-MC
Copy link
Contributor

Ciel-MC commented Jul 21, 2024

This should be closed since inline const blocks has already landed in stable right? Or is that only part of this issue? Just going thru MaybeUninit and seeing the new uninit array method is still on nightly

@RalfJung
Copy link
Member

See here.

uninit_array should be removed, but the issue still tracks other methods.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
C-tracking-issue Category: An issue tracking the progress of sth. like the implementation of an RFC disposition-close This PR / issue is in PFCP or FCP with a disposition to close it. finished-final-comment-period The final comment period is finished for this PR / Issue. T-libs-api Relevant to the library API team, which will review and decide on the PR/issue.
Projects
None yet
Development

No branches or pull requests