Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Wrong usage of unsafe Vec::from_raw_parts in TensorData #2375

Open
WorldSEnder opened this issue Oct 15, 2024 · 3 comments · May be fixed by #2416
Open

Wrong usage of unsafe Vec::from_raw_parts in TensorData #2375

WorldSEnder opened this issue Oct 15, 2024 · 3 comments · May be fixed by #2416

Comments

@WorldSEnder
Copy link
Contributor

WorldSEnder commented Oct 15, 2024

let bytes = unsafe { Vec::from_raw_parts(ptr as *mut u8, len, capacity) };

This unsafe line has an incorrect usage of Vec::from_raw_parts due to re-interpreting a pointer to some E as a pointer to u8. Note that the docs state in the safety requirements:

T needs to have the same alignment as what ptr was allocated with.

This is because some allocators care about the alignment of these types, hence the buffer that was initially allocated with alignment for the arbitrary (even possibly user-defined) type E will be deallocated with alignment of 1. Some allocators do not care about alignment where this bug does not surface. In any case, miri will complain about this minimal test case:

#[cfg(test)]
#[test]
fn bug() {
    let _ = burn::tensor::TensorData::new(vec![0xdeadu32], [1]);
}

incorrect layout on deallocation: alloc80436 has size 4 and alignment 4, but gave size 4 and alignment 1


Comment: The asserts at the top of the function is pointless, since size_of::<u8>() is guaranteed to return 1.
Comment2: The unconditional truncation of the input vector was a bit surprising - instead of an error due to length mismatch as in the case when the input vector is too short - and made me look closer into the function in the first place ;)

@WorldSEnder
Copy link
Contributor Author

bytemuck has a BoxBytes struct that could contain the allocation in an erased form, but it doesn't implement Clone and Debug.

@nathanielsimard
Copy link
Member

For now I think all Element types are valid in terms of alignment, making the checks quite useless and confusing.

@WorldSEnder
Copy link
Contributor Author

WorldSEnder commented Oct 23, 2024

Wrestling with a fix to this (and being stuck on incompatible serialization), there is one insight: the check does guard against size_of::<E> == 0.

I'm a bit surprised there haven't been more problems due to misaligned Vec<u8>s interpreted as some bigger element type, i.e. f32 and so on, leading to unaligned reads. It seems that most allocators overalign the data anyway.

In any case, this must be a breaking change in burn-tensor as this publicly exposes its data as a Vec<u8> which is not quite possible as explained above.

@WorldSEnder WorldSEnder linked a pull request Oct 24, 2024 that will close this issue
2 tasks
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants