Lazy parameters adaptation - buffered mode #3029
Draft
This is a follow-up to #2974, which was limited to `ZSTD_stableInBuffer` mode.
In this PR, input data is simply buffered until a full block is reached or a `flush()` or `end()` order arrives.
This makes it possible to adapt parameters late when the input is smaller than a full block, which is especially important for very small data, since it allows using fewer resources and more appropriate parameters.
This (generally) results in a better compression ratio and also better speed, especially for newly allocated contexts, and as such can be seen as a follow-up to #2969.
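To make the buffering idea concrete, here is a minimal, self-contained sketch. It is not the code of this PR: `LazyStream`, `Directive`, `lazy_reset()`, `lazy_init()` and `lazy_feed()` are hypothetical names, the 128 KB block size is taken as the full-block limit, and the sketch assumes the total input size is only treated as known when an end order arrives before the first block fills up.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define BLOCK_SIZE_MAX ((size_t)128 * 1024)   /* assumed full-block size */

/* Hypothetical stand-ins for the continue / flush / end orders. */
typedef enum { OP_CONTINUE, OP_FLUSH, OP_END } Directive;

typedef struct {
    unsigned char staging[BLOCK_SIZE_MAX]; /* holds the first block before init */
    size_t staged;       /* bytes currently held in staging */
    int    initialized;  /* 1 once parameters have been chosen */
    int    sizeKnown;    /* 1 if the total input size was known at init */
    size_t knownSize;    /* total input size, valid only if sizeKnown */
} LazyStream;

static void lazy_reset(LazyStream* s)
{
    s->staged = 0;
    s->initialized = 0;
    s->sizeKnown = 0;
    s->knownSize = 0;
}

/* Stand-in for context initialization: a real implementation would
 * size its tables and pick parameters here, using the size hint. */
static void lazy_init(LazyStream* s, int sizeKnown, size_t size)
{
    s->initialized = 1;
    s->sizeKnown = sizeKnown;
    s->knownSize = sizeKnown ? size : 0;
}

/* Accepts input and returns the number of bytes consumed.
 * Initialization is delayed until a full block is staged,
 * or until a flush or end order arrives. */
static size_t lazy_feed(LazyStream* s, const void* src, size_t srcSize, Directive op)
{
    assert(s != NULL);
    if (s->initialized) {
        return srcSize;   /* regular streaming path, not modeled here */
    }
    {   size_t const room = BLOCK_SIZE_MAX - s->staged;
        size_t const toCopy = (srcSize < room) ? srcSize : room;
        if (toCopy > 0) {
            memcpy(s->staging + s->staged, src, toCopy);
            s->staged += toCopy;
        }
        if (s->staged == BLOCK_SIZE_MAX) {
            lazy_init(s, 0, 0);           /* full block: total size still unknown */
        } else if (op == OP_END) {
            lazy_init(s, 1, s->staged);   /* small input: exact size is known */
        } else if (op == OP_FLUSH) {
            lazy_init(s, 0, 0);           /* must emit now; more data may follow */
        }
        return toCopy;
    }
}
```

For example, feeding 3 bytes with `OP_CONTINUE` and then 2 bytes with `OP_END` leaves the stream uninitialized after the first call, and initializes it with a known size of 5 after the second, which is the point where parameters suited to a 5-byte input could be selected.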
This proposal uses an additional buffer to hold the first block before context initialization.
The issue with this proposal is that it is not compatible with `initStatic`.
Using the existing context for buffering would be another possibility, but it is more complex when the context doesn't exist or is too small. There is also the tricky matter of transferring data from the old context to the new one at initialization time, which is especially dangerous when the two overlap, as happens with `initStatic`, and will likely require some dedicated code. So there is a balance to find with complexity.
Anyway, this PR is at least good enough to start a discussion.
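As a small illustration of the overlap hazard mentioned above, consider staged data and its destination both living in the same caller-provided workspace. This is only a sketch under that assumption: `transfer_staged()` and its offsets are hypothetical and do not reflect how the context is actually laid out. The one firm point is that `memcpy()` has undefined behavior on overlapping regions, while `memmove()` is defined for them.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: relocate `len` staged bytes inside one shared
 * workspace, from offset `oldOff` to offset `newOff`.
 * The two ranges may overlap, so memmove() is used instead of memcpy(). */
static void transfer_staged(unsigned char* workspace, size_t workspaceSize,
                            size_t oldOff, size_t newOff, size_t len)
{
    assert(workspace != NULL);
    assert(oldOff <= workspaceSize && len <= workspaceSize - oldOff);
    assert(newOff <= workspaceSize && len <= workspaceSize - newOff);
    memmove(workspace + newOff, workspace + oldOff, len);
}

/* Demo: shifts 6 bytes toward the start of a 10-byte workspace.
 * Returns 1 if the moved bytes are intact, 0 otherwise. */
static int transfer_demo(void)
{
    unsigned char ws[10];
    memcpy(ws, "0123456789", 10);
    transfer_staged(ws, sizeof(ws), 2, 0, 6);   /* source and destination overlap */
    return memcmp(ws, "2345676789", 10) == 0;
}
```

Even with a safe copy, the transfer has to happen before anything in the new layout overwrites the staged bytes, which is part of why dedicated code would likely be needed.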