-
Notifications
You must be signed in to change notification settings - Fork 118
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
perf(decode): optimize Vec alloc/resize #179
base: master
Are you sure you want to change the base?
Conversation
Avoid calloc overhead of initializing buffer we'll write into Unfortunately requires unsafe Vec::set_len so we can get a mutable ref to the uninit portion of the Vec's buffer
FYI this also fixes a "bug" in let mut buffer = Vec::<u8>::with_capacity(input.as_ref().len() * 4 / 3); This approximately allocates with a ratio of 4/3 instead of 3/4 (without accounting for padding, etc...) |
buffer.reserve(len_estimate); | ||
unsafe { | ||
buffer.set_len(total_len_estimate); | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The safety documentation of set_len
is pretty clear that this is UB. Namely it states
The elements at
old_len..new_len
must be initialized.
But total_len_estimate = starting_output_len + len_estimate
, which means the last len_estimate
elements are not properly initialized.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@jonasbb It's not UB or unsafe in aggregate since write into that "unintialized" portion of the buffer and then truncate to the true length.
This avoids the overhead of callocing/initializing a buffer that we'll immediately overwrite
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The documentation mentions that as a safety requirement, and this code is not upholding it. Later, the code also creates a reference to this uninitialized memory, which is also not allowed.
Some functions like decode
could also panic and unwind before truncate could set the correct length. Now, the drop implementation of Vec
might potentially read the uninitialized values.
If you really need uninitialized memory you need to write to it via a pointer and update the length only after writing. Alternatively, using MaybeUninit
is an option. If the buffer would be of type Vec<MaybeUninit<u8>>
, then I think the set_len
would be ok, but it isn't.
Thanks for this work. However, I'm not sure who the target user would be -- if you're ok with unsafe, the forthcoming AVX2 version will surely outperform it, and if you're not, you'd probably want the existing safe version. 🤔 Good catch on the over-allocation, too! |
Hi. There is another crate base64-simd for performance. It is highly unsafe but much faster than |
Avoid calloc overhead of initializing buffer we'll write into, improving
decode_small_input/decode/3
by -33%Unfortunately requires unsafe Vec::set_len so we can get a mutable ref to the uninit portion of the Vec's buffer
Before
After