-
Notifications
You must be signed in to change notification settings - Fork 9
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Creating graph or mesh should be marked unsafe #11
Comments
Good catch! Thanks for the comprehensive bug report. Your fix looks good but is a breaking change. We could raise the error in |
Actually, there is an issue. We cannot do any meaningful check in This is a pretty weird feature IMO, and I don't know anyone who uses this (let alone with these rust bindings). On the other hand, I don't want to be too opinionated and would rather keep the API close to So we can either
|
I do not think a lot of Rust users use numbering starting from 1.
Please do the second option. Not having Fortran arrays is not a problem. It will also avoid a lot of bugs playing with offsets (my personal experience dealing with offset in Scotch). |
This is true. At least all CSR crates that I know of start from 0 (sprs, nalgebra-sparse and russell_sparse).
METIS plays pretty safe and decrements all values from xadj and adjncy by 1 before partitioning 1. This is a pain for us however because it's not thread safe and requires mutable refs everywhere. If we prevent Fortran numbering maybe we could use shared references instead and reopen #8. However METIS could still be mutating input values in other places. Footnotes |
This fixes the unsound Graph::new and Mesh::new. See the following report for more details: <#11> Co-authored-by: Gavin Rohrer <[email protected]>
Just to add my 2cts, I was experiencing spurious segfaults and heap corruption. I spent a considerable amount of time debugging the issue, only to find out that it was my input that was causing the problem. The cause turned out to be that I was passing a weight of 0 for some edges. I meticulously verified my data, but this was something I overlooked and I figured it out when re-reading the METIS API documentation. It would've been a life saver if this was checked by the crate 😄. I would therefore opt for the first option, raising the error on the part_* calls, because then you'll be able check additional data (in my case, the edge weights) as well. |
I think we have to do a lot better at checking data before calling Metis, that is kind of fragile depending on user input. Raising errors should be generalized to all functions, with unchecked versions as well. As we speak about API changes, we should think about not having the partitioning function tied to the Graph structure and perhaps use a builder pattern to construct the data. |
I've been doing a lot of investigation into the METIS source in my efforts to port it to Rust and there are a whole bunch of other subtle ways you can cause undefined behavior. For example, graphs with degree that is "too large" will overflow the index in a way that is very difficult to predict (for example, in kwayrefine.c). There doesn't seem to be a safe way of doing it short of forcing METIS itself to do a ton more checks. Edit: the example in |
I'm usually a fan of "don't let perfect be the enemy of good enough", so I would opt for adding the checks even if we can't cover them all. However, then calling the variant with the checks 'safe' is a different question altogether, as technically it still isn't. But still better than the current situation, though... |
Yeah. If only there was a clear specification of what METIS accepts as input (whether CSR indices are sorted, value ranges, etc. the manual doesn't specify these if i remember correctly), then we could do those checks in a "safe" wrapper. But the manual is pretty light on details, and with every change upstream can add another way to trigger undefined behavior. Technically As Cédric said we can also do the checks/raise errors early in all A "good" thing about this is that we don't have to add all the checks at once. Once the API is fixed they can be added incrementally i think, at least if all functions return a impl<'a> Graph<'a> {
pub fn new(...) -> Result<Self, ...> { ... }
pub unsafe fn new_unchecked(...) -> Self { ... }
// could also take `self`, but then the user must
// be able to retrieve the graph from the error type
pub fn set_*(&mut self, ...) -> Result<&mut Self, ...> { ... }
pub unsafe fn set_*_unchecked(&mut self, ...) -> &mut Self { ... }
// Take a &mut self instead of owned so that
// Graph can be reused without doing all the checks
pub unsafe fn part_*(&mut self, ...) -> Result<...> { ... }
// no need for part_*_unchecked, if all checks
// happen in set_* and new? we can always
// add it later without breaking compatiblity
}
// same with Mesh
pub unsafe fn mesh_to_dual(...) -> Result<Dual, ...> { ... } I currently don't have time to continue work on #13 so feel free to take it from there if you want to. |
The problem
All partition functions in METIS will cause undefined behavior if given not well formed CSR graphs.
For example:
This is expected as METIS does unchecked array indexing using values in
adjncy
in many places.Because of this, the API is unsound and should be changed.
How can this be fixed
According to the Rust API Guidelines, the way to fix this and keep the signatures the same will be to add a validation function that checks every value of
adjncy
that it is less thanxadj.len() - 1
. The current implementations could be moved to an_unchecked
version of the function.The text was updated successfully, but these errors were encountered: