Reclaim space more aggressively for tables with vnode table watermark specified (tables with range delete) #17157

Closed
hzxa21 opened this issue Jun 6, 2024 · 0 comments · Fixed by #17372

hzxa21 commented Jun 6, 2024

After switching to vnode table watermarks (#13148) for range delete, range deletion is guaranteed to be monotonic: the per-vnode deletion watermark either only increases or only decreases. This property has let us simplify the range delete implementation in many storage code paths, but the compaction strategy is still agnostic to the deletion watermark.

The current status:

  • The compaction strategy generates compaction tasks without considering the per-vnode table watermark. Only LSM balance and the point-deletion tombstone ratio are considered.
  • The compactor takes the per-vnode table watermark into account while executing a compaction task. That means if a task includes SSTs falling below the vnode deletion watermark, the corresponding data will be reclaimed after the compaction is done.
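
To make the second point concrete, here is a minimal sketch of watermark-aware filtering during compaction. It is not the actual compactor code: `WatermarkDirection`, `TableWatermarks`, and `retain_live` are hypothetical names, keys are plain byte strings, and it assumes that an ascending watermark deletes keys strictly below it (and a descending one deletes keys strictly above it).

```rust
use std::collections::HashMap;

/// Direction in which a vnode's deletion watermark moves (hypothetical type).
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum WatermarkDirection {
    /// Assumption: keys strictly below the watermark are deleted.
    Ascending,
    /// Assumption: keys strictly above the watermark are deleted.
    Descending,
}

/// Simplified per-vnode table watermarks for a single table (hypothetical).
pub struct TableWatermarks {
    pub direction: WatermarkDirection,
    /// vnode id -> current watermark key for that vnode.
    pub per_vnode: HashMap<u16, Vec<u8>>,
}

impl TableWatermarks {
    /// Returns true if `key` in `vnode` has already been range-deleted.
    /// A vnode without a watermark has nothing deleted.
    pub fn is_deleted(&self, vnode: u16, key: &[u8]) -> bool {
        match self.per_vnode.get(&vnode) {
            None => false,
            Some(wm) => match self.direction {
                // Byte slices compare lexicographically.
                WatermarkDirection::Ascending => key < wm.as_slice(),
                WatermarkDirection::Descending => key > wm.as_slice(),
            },
        }
    }
}

/// Keeps only the `(vnode, key)` entries that survive the watermark,
/// roughly what a compactor would do while rewriting the input SSTs.
pub fn retain_live(
    entries: Vec<(u16, Vec<u8>)>,
    watermarks: &TableWatermarks,
) -> Vec<(u16, Vec<u8>)> {
    entries
        .into_iter()
        .filter(|(vnode, key)| !watermarks.is_deleted(*vnode, key))
        .collect()
}
```

The point of the sketch is that reclamation only happens for SSTs that some task happens to include: nothing here decides which SSTs get picked.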

Potential issues with the current status:

  • There is no bound on when data deleted by range delete is reclaimed. The space used by such tables can keep growing even when range deletions happen more frequently than writes.
  • For append-only state tables with range deletion (e.g. the log store state table), trivial-move compaction tasks can be the majority, in which case almost no space is reclaimed in time.

Ideas:

  • Given that the per-vnode table watermark is monotonic, we can add a simple picker that picks SSTs whose key range is fully covered by the watermark for space reclaim. These are trivial space reclaim tasks with no actual compaction execution involved.
  • For simplicity, this picker can consider only the bottommost level.
  • To extend the idea above, we can generalize the picker to pick SSTs with >X% of their key range covered by the watermark for space reclaim. This would generate actual compaction tasks to execute.
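
A rough sketch of what such a picker could look like, under heavy simplifications: every name below (`SstInfo`, `Direction`, `covered_fraction`, `pick`, `Picked`) is hypothetical, each SST is assumed to belong to a single vnode, keys are modelled as `u64` so that "fraction of key range covered" is easy to compute, and `smallest <= largest` is assumed for every SST. An ascending watermark is assumed to delete keys strictly below it. A real picker would have to handle SSTs spanning several vnodes and arbitrary key encodings.

```rust
use std::collections::HashMap;

/// Direction of the per-vnode deletion watermark (hypothetical type).
pub enum Direction {
    /// Assumption: keys strictly below the watermark are deleted.
    Ascending,
    /// Assumption: keys strictly above the watermark are deleted.
    Descending,
}

/// Simplified SST metadata: one vnode per SST, inclusive `u64` key range.
pub struct SstInfo {
    pub id: u64,
    pub vnode: u16,
    pub smallest: u64,
    pub largest: u64,
}

/// Output of the picker, as lists of SST ids.
pub struct Picked {
    /// Fully covered by the watermark: can be dropped without compaction.
    pub trivial_reclaim: Vec<u64>,
    /// Covered above the threshold: worth an actual compaction task.
    pub compact: Vec<u64>,
}

/// Fraction (0.0..=1.0) of the SST's key range already range-deleted.
pub fn covered_fraction(sst: &SstInfo, watermark: u64, dir: &Direction) -> f64 {
    // Number of keys in the inclusive range, as f64 to avoid overflow.
    let total = (sst.largest - sst.smallest) as f64 + 1.0;
    let covered = match dir {
        Direction::Ascending => watermark.saturating_sub(sst.smallest) as f64,
        Direction::Descending => sst.largest.saturating_sub(watermark) as f64,
    };
    covered.min(total) / total
}

/// Scans the bottommost level and classifies SSTs by watermark coverage.
/// `threshold` plays the role of X% above, expressed as a fraction.
pub fn pick(
    bottom_level: &[SstInfo],
    watermarks: &HashMap<u16, u64>,
    dir: &Direction,
    threshold: f64,
) -> Picked {
    let mut picked = Picked { trivial_reclaim: Vec::new(), compact: Vec::new() };
    for sst in bottom_level {
        // SSTs of vnodes without a watermark are left alone.
        if let Some(&wm) = watermarks.get(&sst.vnode) {
            let frac = covered_fraction(sst, wm, dir);
            if frac >= 1.0 {
                picked.trivial_reclaim.push(sst.id);
            } else if frac > threshold {
                picked.compact.push(sst.id);
            }
        }
    }
    picked
}
```

Because the watermark is monotonic, an SST classified as fully covered stays fully covered, so the trivial reclaim decision never needs to be revisited.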