Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Improve calculation for SerializedValues::with_capacity #809

Merged
merged 1 commit into from
Sep 20, 2023

Conversation

cstyles
Copy link
Contributor

@cstyles cstyles commented Sep 10, 2023

It seems that currently the argument provided to this function is being treated as the number of items that we expect to serialize. Normally that would make sense because the argument is passed along to Vec::with_capacity which provisions space for the inner serialized_values field. However, that field is a Vec<u8> so really the argument corresponds to the total size of the serialized values in bytes.

This commit tries to be smarter about reserving space by taking into account the actual size in memory of the values to be serialized. It's not perfect (e.g., Strings will always reserve the same amount of space, regardless of how long their actual contents are) but hopefully it will still result in fewer re-allocations.

Pre-review checklist

  • I have split my patch into logically separate commits.
  • All commit messages clearly explain what they change and why.
  • I added relevant tests for new features and bug fixes.
  • All commits compile, pass static checks and pass test.
  • PR description sums up the changes and reasons why they should be introduced.
  • I have provided docstrings for the public items that I want to introduce.
  • I have adjusted the documentation in ./docs/source/.
  • I added appropriate Fixes: annotations to PR description.

Copy link
Collaborator

@piodul piodul left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. Thanks for the contribution!

In the next release, we are planning to introduce new serialization traits (#463) which should make it possible to calculate required size exactly - however, we will keep the old traits in parallel for some time to make it possible to gradually transition to the new traits, so your contribution will stil be useful.

Please rebase on the newest main branch, the issue with failing min_rust check is fixed there.

It seems that currently the argument provided to this function is being
treated as the _number_ of items that we expect to serialize. Normally
that would make sense because the argument is passed along to
`Vec::with_capacity` which provisions space for the inner
`serialized_values` field. However, that field is a `Vec<u8>` so really
the argument corresponds to the total _size_ of the serialized values in
bytes.

This commit tries to be smarter about reserving space by taking into
account the actual size in memory of the values to be serialized. It's
not perfect (e.g., `String`s will always reserve the same amount of
space, regardless of how long their actual contents are) but hopefully
it will still result in fewer re-allocations.
@cstyles cstyles force-pushed the improve-calculation-for-with-capacity branch from f1c9063 to 389c772 Compare September 20, 2023 15:40
@piodul piodul merged commit 1854c96 into scylladb:main Sep 20, 2023
8 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants