-
Notifications
You must be signed in to change notification settings - Fork 13k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
containers should provide some way to not panic on failed allocations #29802
Comments
I think this is pending the custom allocator work. |
Yes, in general, the standard library assumes that you can't really handle OOM. Using libcore and custom allocators, you could implement whatever behavior you'd like, but you'd still need to re-implement this kind of thing on top of it. The standard library is going to operate this way for the forseeable future, as it would require changing the signatures of stable things (like |
So I should have been clearer about this: I don't think that all operations would need to return To my mind, it'd be better for the standard library to have this sort of thing, rather than folks having to write their own crate for Does that seem reasonable, @steveklabnik ? I'm afraid I'm ignorant of the custom allocator bits cited herein, so I can't really tell if that's a way to modify the allocation behavior of |
Independent of the custom allocators, i.e. |
Yeah this is a tricky problem. I'd really not like to bloat up the collections APIs with _fallible variants if we can avoid it. However custom allocators will do nothing for this, as apasel notes. To be exhaustive, we'd need to provide fallible versions of: Vec:
Maps:
Unless you believe that only a subset is "truly" necessary? |
Waking up, and this is getting more in cache. Part of this problem is the fuzzy nature of OOM on modern platforms (at least, Linux). It's my understanding that you can happily allocate some eggregious amount of memory from a bad image, only to get smacked down by the OOM killer when you really start trying to use the memory, and there's no way for us to deal with that. Is there any problem with just having an incredibly reasonable policy like "no images bigger than 100MB"? |
I assume this wouldn't be backwards-compatible, but it'd be nice to avoid the proliferation of methods with something like: #![feature(default_type_parameter_fallback)]
pub trait AllocResult<T = (), E = ()> {
fn ok(value: T) -> Self;
fn err(err: E) -> Self;
}
impl<T, E> AllocResult<T, E> for T {
fn ok(value: T) -> T { value }
fn err(_err: E) -> T { oom(); }
}
impl<T, E> AllocResult<T, E> for Result<T, E> {
fn ok(value: T) -> Self { Ok(value) }
fn err(err: E) -> Self { Err(err) }
}
impl<T> Vec<T> {
pub fn with_capacity<T, R: AllocResult<Self> = Self>(cap: usize) -> R {
let ptr = alloc::heap::allocate(...);
if ptr.is_null() { R::err(()) } else { R::ok(Vec { ... }) }
}
pub fn push<R: AllocResult = ()>(&mut self, item: T) -> R {
if self.needs_reallocate() {
let ptr = alloc::heap::reallocate(...);
if ptr.is_null() { return R::err(()); }
...
}
...
R::ok(())
}
}
#[test]
fn test() {
let vec: Vec<i32> = Vec::with_capacity(5); // aborts on OOM
let mut vec: Vec<i32> = match Vec::with_capacity(5) { // returns `Err` on OOM
Ok(vec) => vec,
Err(_) => panic!("OOM"),
};
vec.push(1); // aborts on OOM
match vec.push(1) { // returns `Err` on OOM
Ok(_) => {}
Err(_) => panic!("OOM"),
}
} But this is clearly less discoverable and comprehensible for users. |
That's a reasonable point, but on Windows the memory allocator really can fail, and Rust should be able to deal with that relatively gracefully. I don't know about OS X; I'm guessing it's more like Windows than Linux in this regard?
A 100MB image is just 5k x 5k x 32bpp (raw pixel data), which is pretty big (roughly 3x 4k monitors), but doesn't sound like anything too out of the ordinary. |
I'm having trouble finding details TBH. It would appear that FreeBSD supports overcommit, and by extension OSX/iOS seem to as well, at least to a limited extent. It certainly seems that they will do implicit deduplication and then copy-on-write transparently. Since the COW can presumably fail to find any physical memory (otherwise why bother? cache?), that opens up code to the same implicit failure state. iOS at least will send your application a warning that the OOM killer is eyeing it, giving it a chance to make amends (but this may be futile if other memory-starved applications are eating all the memory you're giving up). |
Good old SIGDANGER |
IIRC there was some discussion about allowing custom allocators someday. In combination with specialization, could an allocator + |
cc me |
One could indeed expose different ops/signatures for custom allocators. In fact I believe it could all be hacked on top of associated types, where |
Wouldn't it be more general to leave everything as it is and offer a way to locally "catch" oom? Similar to catch_panic? |
CC @pnkfelix, who has been working on allocator stuff. |
That's correct. There's also a readability argument to be made: it's easier to tell whether something can fail if you read: if !v.append_fallible(...) { ... } or whether something is written incorrectly: // Forgot to check the return value!
v.append_fallible(...) { ... } vs. // Is v an infallible vector or did I forget to check the return value?
v.append(...); You could address this with Having two different types for vectors can also make for some awkward code: why do I care about the fallibility of allocating operations if the code that I'm writing doesn't actually care about doing allocation on the vector? You could cleverly address this problem with traits or templates, I suppose. In any event, Firefox is currently in the process of ripping that type distinction out and using separate methods instead. We think it's much more valuable to see the fallibility of the operation at the call site, rather than having it reside on the type of something, since the type may not be visible/obvious at the call site. |
Yes it should be obvious at the call site and not create type-clutter. But duplicating all the functions is just the same, but instead of having two types, you end up with loads of functions that almost do the same. With A solution similar to if let Ok(()) = catch_oom(|| v.push(42)) {
// do something
} if you forget to use the return value, the normal if such a wrapper is too much, you can always use a macro to make it less verbose: |
Please no catching of oom. I think panic on oom is untenable for the same reasons, more complicated exception safety issues for lots of You can even add the fallible allocation in user code (example) using unstable features, although I guess it's debatable if rust provides a guarantee that it will work. @froydnj |
@bluss, could you elaborate how a failed 1GB allocation (say, for a large photograph) would prevent unwinding later? My understanding is that there are many myths around overcommit, and that the system can actually tell prior to memory access whether an allocation is likely possible. |
There will be cases due to overcommit where the application receives a "successful" allocation, and it crashes when trying to use it later. However, that doesn't seem to be the problem, because we can't do much about that. @nathanaeljones I don't understand the scenario, I don't see why a failed allocation would prevent unwinding later. I may be missing something. The only Rust code that should have to worry about unsoundness in the presence of unwinding is It does seem unlikely, and I can't find any problematic code with a quick search, but introducing new panics in stable API is a hazard if there is code that relied on it to never panic. |
I was just wondering about this issue and @froydnj pointed me at this PR. It is a common occurrence in Firefox that we get crash reports that are triggered by OOMs caused by very large allocation requests (e.g. 100 MB or more). Our standard response to these is to make the allocation fallible. Firefox's data structures support this fairly well so it's usually a straightforward fix. Although aborting on OOM is a reasonable default, I'm worried that Rust doesn't have a graceful way of recovering from a 100 MB+ allocation request failing. |
As for Linux overcommit: at least for Firefox, to a first approximation, Windows is the only platform that matters. So I wouldn't focus too much on overcommit. |
FYI: in Firefox we are planning to implement FallibleVec and FallibleHashMap classes. This will likely involve copy-pasting the Vec and HashMap code and then adding the missing fallible operations. Which is awful, but necessary. (Strings might be necessary, too.) We regularly change allocation sites in Firefox from infallible (which is the default) to fallible in response to crash reports that show OOM failures due to large allocation sizes. These are typically on Win32 where OOM due to address space exhaustion is common when doing large allocations. See https://bugzilla.mozilla.org/show_bug.cgi?id=1329171 for an example. We need to be able to do the same thing in Rust code. |
Hey all, this has become high priority for the Firefox devs (which includes me) as they integrate more major Rust components into Gecko, so I'm going to start drafting up a proposal for providing some version of fallible collection methods to avoid fragmenting the ecosystem over this matter. Some quick thoughts: This is impossible, overcommit!Don't care. This can make software more reliable when it works. Sometimes it won't. Oh well. Types vs MethodsI firmly believe that a type-level distinction would be bad to have by default. Fallibility is fairly fluid in practice, and is nice to be able to integrate incrementally. Also if we provide fallible methods it's not a big deal for someone to newtype std collections to only expose the fallible APIs. API SurfaceThere are three tacts we can take here:
I'm going to push for minimal for now, with an intent to propose maximal when we have experience with these APIs under our belts. The APItry_* methods will return a Result, with an Error format based on the new Allocator APIs. I need to review that RFC and the format, which largely went down while I was absent from the community last year. I expect methods like try_push will need to return the pushed element on Err; I don't want to have to propose a callback-based or Entry-style API for this. Capacity overflow panics -- might make them a state in the Err? TBD! |
I'm strongly considering the as_fallible approach right now, although I still don't think we want to make Vec generic, because it requires significant language changes that likely won't ever happen. |
We could consider adapting the placement-in API to incorporate the notion of fallibility. |
How would that work? |
Make See also brain dump on IRC. |
|
@gankro I'm really glad to have you taking this on! A few thoughts:
|
@aturon et al We've discussed this with @gankro on IRC, but for the purpose of visibility I'll go on record here: I believe @gankro's proposal will handle the majority (if not virtually all) embedded use cases, which progressively fall back to a more complex error handling strategy based on domain requirements:
This also handles the most obvious hosted use case, libraries that cannot afford to bring down their host process on OOM, e.g. those handling user-controlled content such as zip or XML.
Why not just replace the implementation of |
I think we can do it at the type level while still not using Result everywhere in the infallible case and being backwards-compatible. You just need ATC and improved ergonomics of default type parameters: trait Allocator {
type Result<T>;
}
impl Allocator for InfallibleAllocator {
type Result<T> = T;
}
impl Allocator for FallibleAllocator {
type Result<T> = Result<T, AllocatorError>;
}
impl Vec<T, A: Allocator = InfallibleAllocator> {
fn push(v: T) -> A::Result<()>
}
I think that's the point, you'd want a macro to generate all those infallible wrappers |
We don't OOM on capacity overflow today, so that would be a change to the public API. If we feed the reason for failure through then it might be fine (we can match and choose the approach). There's also the matter of whether the fallible code optimizes to the same thing as the infallible code (more branches + a payload to manipulate). |
I've always been a fan of the type-based approach with associated The existing methods would require an infallibile allocator, but the default would be the global one with an infallibile wrapper so no compat is broken. |
This is objectively incorrect, as it's literally impossible to implement today, even on nightly. Edit: to be clear, such a design is blocked on generic associated types, which might be implemented and on stable in late 2018. |
@gankro you mean something like #29802 (comment) ? Yeah I don't want to do that. I want to add an associated error type to the allocator trait, make it |
Oh ok, well that one just won't ever be implemented, so it's even less possible to implement today? |
Huh? Lets not confused not wanting to do something with the impossibility of doing something. This is the main benefit of the associated error type, so it's good to consider together. |
Sorry do you mean having (on fn push(&mut self) -> Result<(), Alloc::Err> {
if cap == len { self.alloc.reserve(1)? }
...
} or fn push(&mut self) {
if cap == len { self.alloc.reserve(1).unwrap() }
...
}
fn try_push(&mut self) -> Result<(), Alloc::Err> {
if cap == len { self.alloc.reserve(1)? }
...
} |
Ah, sorry if I was unclear before. Here's a more worked out example: //!
//! in alloc or collections, based on how that shakes out
//!
struct PanicOom<T>(T);
impl<A: Alloc> Alloc for PanicOom<A> {
type Error = !;
// much repetition of
fn something(&mut self) {
self.something().or_else(|_| self.oom())
}
}
// In the future, define and use a:
// trait InfalibleAlloc=Alloc<Error=!>;
//!
//! in collections
//!
impl<T, A: Alloc> Vec<T, A> {
fn try_push(&mut self) -> Result<(), A::Err> {
if cap == len { self.alloc.reserve(1)? }
...
}
}
// in the future, use `InfalibleAlloc` for more readable bound
impl<T, A: Alloc<Error=!>> Vec<T, A> {
fn push(&mut self) {
self.try_push().void_unwrap()
}
}
//!
//! In std
//!
// stop gap until we have aliases with params with default args
type HashMap<K, V> = HashMap<K, V, PanicOom<GlobalAlloc>>; The idea being one either uses only |
This is an interesting idea, but it seems like a lot of work for a lint that none of the users I've interviewed actually seem interested in. @Ericson2314 do you intend to use this API? What for? (feel free to contact me on irc/email/twitter if you prefer) |
If we're talking about a lint for unhandled allocation failures
(non-fallible allocations), I and many others are very interested. Codec
libraries (and most other libs) are usually required to gracefully handle
any alloc failures.
…On Aug 3, 2017 12:28 PM, "Alexis Beingessner" ***@***.***> wrote:
This is an interesting idea, but it seems like a lot of work for a lint
that none of the users I've interviewed actually seem interested in.
@Ericson2314 <https://github.com/ericson2314> do you intend to use this
API? What for?
—
You are receiving this because you were mentioned.
Reply to this email directly, view it on GitHub
<#29802 (comment)>,
or mute the thread
<https://github.com/notifications/unsubscribe-auth/AAGln17P3erDaE5NL_lOyO6Wg_Eqha-nks5sUhFWgaJpZM4GhKY0>
.
|
Sorry to be clear: there is a strong desire for such a lint, but this is a lot of work for a weak version of it. I could have sworn I linked this thread here, but I don't see it, so here's where I discuss various options and what different users were asking for: https://internals.rust-lang.org/t/notes-interviews-on-fallible-collection-allocations/5619/8 In it I describe hooking into the availability lint system that is intended to solve similar problems like "don't use floats". This would be more robust in that it would also verify you don't call methods which e.g. create their own Vec and infallibly allocate. |
Well, I'm glad you find it interesting :). This is something I've murmored about for a while, but I guess I made the mistake of never actually writing it down. Truth by told, my day job is Haskell so I personally will not have the time to be a serious consumer of the allocation APIs any time soon. I hope to fine time to use it in my tiny toy OS, but that's it for now. My general principle is while I'm sure serious industrial users can get away without any sort of lint, it's a nice thing to have that helps everyone not shoot themselves in the foot and, more importantly, help (no-std, for now) library authors not shoot each other in the foot, so the ecosystem can better trust itself. @aturon mentions the portability lint, but I think that's a rather heavyweight solution for dealing with a pure-Rust interface (as opposed to however allocation is implemented). In particular, no one has proposed a modular way just yet for using it with user-defined CFG tokens. Now, cargo features would be the natural solution, and I suppose we could add a [edited a bunch, a hard habit to kick.] |
@gankro Ah your latest comment appeared as I was writing this. Nor had I read the thread. Glad to here we all do want a lint. In light of that I think my point about making the lint easier to maintain is perhaps the best. Relatedly, for refactoring existing Rust code, e.g. Servo, I'd want
And again in terms of expediency, while the portability lint is not yet implemented, everything I mention an be done today. A road map could be
|
To be clear, we identified that the lint was nice to have but in no way a blocker for our users. As such we planned to punt on it and solve it more robustly with actual lints. In the mean time we could provide the methods, and users could build wrapper types to hide infallibility if they chose want. On the topic of different collections, only BTreeMap would run into issues with a naive translation. It shouldn't be too difficult to pre-allocate the nodes before starting the mutations. |
@gankro I think expecting users to write their own wrapper types is futile because the biggest benefit is trust between libraries, but I can't see any wrapper types outside of the libcollections becoming as standard. I also forgot to mention how error-type polymorphism allows one to convert code bases function by function. Is there anything you dislike about mine, besides it being a lot of work in your eyes? |
RFC: rust-lang/rfcs#2116 |
Since the RFC was merged, and I believe future work in this area will require extensive discussion (either on internals or in RFC(s)) I'm closing this issue. |
I was working on a Rust image library, which has code like:
This code should really be checking for overflow on the multiplications. But doing so only eliminates one class of problems with this code: it's still reasonable for a maliciously crafted image to have large
self.width
andself.height
values whose product doesn't overflowusize
and yet the amount of memory can't be allocated. (I discovered this through an image test suite that has images with...large widths and heights that ought to return errors, but panic'd in Rust.)Looking through the documentation, I didn't see any way of avoiding this panic-on-allocation failure, either at vector creation, or when trying to append elements to a vector.
The text was updated successfully, but these errors were encountered: