Enhance(compaction): Optimize the compaction task based on the state of the LSM #19531

Open
Li0k opened this issue Nov 21, 2024 · 0 comments
Labels
component/storage Storage type/enhancement Improvements to existing implementation.

Li0k commented Nov 21, 2024

Hummock picks compaction tasks according to an almost fixed “Rule” and deviates from it only in extreme cases (write-stop). This approach is reliable in most cases.

However, it is not always efficient. Hummock's compaction Rule is write-friendly. For example, it aims to:

  • Batch according to configuration.
  • Minimize write amplification.
  • Use an appropriate compaction task size and target_file size to improve parallelism.

In extreme cases, the above rules are not efficient, such as:

  • Read-sensitive workloads (need more timely L0 compaction)
  • Large L0 stacks due to a huge checkpoint (need more aggressive L0 batching)
  • Space-sensitive workloads (need more timely high-level compaction)
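
To make the three scenarios concrete, here is a minimal sketch of how an LSM state snapshot could be mapped to them. Everything in it is an assumption for illustration: `LsmState`, `Pressure`, `classify`, and both thresholds are hypothetical names and values, not Hummock types or defaults.

```python
from dataclasses import dataclass
from enum import Enum, auto
from typing import Set


class Pressure(Enum):
    """Hypothetical labels for the three scenarios listed above."""
    READ_SENSITIVE = auto()   # wants more timely L0 compaction
    LARGE_L0_STACK = auto()   # wants more aggressive L0 batching
    SPACE_SENSITIVE = auto()  # wants more timely high-level compaction


@dataclass
class LsmState:
    """Illustrative snapshot of an LSM tree; not a Hummock type."""
    l0_stack_count: int          # number of overlapping L0 stacks
    read_heavy: bool             # assumed to come from a workload hint
    space_amplification: float   # total size / estimated live data size


def classify(state: LsmState,
             l0_stack_threshold: int = 100,
             space_amp_threshold: float = 1.5) -> Set[Pressure]:
    """Return every pressure that applies.

    An empty set means the default write-friendly Rule is good enough.
    The thresholds are placeholder values chosen for the example.
    """
    pressures: Set[Pressure] = set()
    if state.read_heavy:
        pressures.add(Pressure.READ_SENSITIVE)
    if state.l0_stack_count > l0_stack_threshold:
        pressures.add(Pressure.LARGE_L0_STACK)
    if state.space_amplification > space_amp_threshold:
        pressures.add(Pressure.SPACE_SENSITIVE)
    return pressures
```

A state can fall into several scenarios at once, which is why the sketch returns a set rather than a single label.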

In fact, the above scenarios can be divided into read problems and write problems. Read performance, however, is affected by more factors, such as the operator cache, block cache, file cache, and the cache refill/evict policy. Therefore, I would like to focus on optimizing the write problem first.

For the write scenario, we can optimize the known cases by selecting the task through a Rule:

  1. Optimize the task's output SST partition by write throughput (already implemented).
  2. Adjust the selection rule and output of the L0 task by the number of L0 stacks.
    • Use more aggressive batch parameters: select a larger level count / SST count / max compaction size.
    • Adjust the output SST size to reduce the SST count and the pressure on the meta.
  3. Optimize trivial-move commits: commit more trivial-move tasks in one commit.
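
The last two ideas in the list can be sketched as follows. This is only an illustration under stated assumptions: `L0PickerParams`, `adjust_for_l0_stacks`, `batch_trivial_moves`, the `normal_stack_count` baseline, and the `max_scale` cap are hypothetical and do not correspond to Hummock's actual configuration or code.

```python
from dataclasses import dataclass, replace
from typing import List


@dataclass(frozen=True)
class L0PickerParams:
    """Hypothetical knobs; the names only mirror the wording of this issue."""
    max_level_count: int       # how many L0 stacks one task may select
    max_sst_count: int         # how many input SSTs one task may select
    max_compaction_bytes: int  # input size cap per task
    target_file_size: int      # output SST size in bytes


def adjust_for_l0_stacks(base: L0PickerParams,
                         l0_stack_count: int,
                         normal_stack_count: int = 32,
                         max_scale: int = 4) -> L0PickerParams:
    """Scale batch parameters and output SST size with the L0 backlog."""
    if l0_stack_count <= normal_stack_count:
        return base  # normal state: keep the default write-friendly Rule
    # Ceiling division gives an integer scale factor; the cap keeps a
    # single task from growing without bound.
    scale = min(max_scale, -(-l0_stack_count // normal_stack_count))
    return replace(
        base,
        max_level_count=base.max_level_count * scale,
        max_sst_count=base.max_sst_count * scale,
        max_compaction_bytes=base.max_compaction_bytes * scale,
        # Larger output files mean fewer SSTs for the meta to track.
        target_file_size=base.target_file_size * scale,
    )


def batch_trivial_moves(task_ids: List[int],
                        max_per_commit: int) -> List[List[int]]:
    """Group trivial-move tasks so one commit carries up to max_per_commit."""
    if max_per_commit <= 0:
        raise ValueError("max_per_commit must be positive")
    return [task_ids[i:i + max_per_commit]
            for i in range(0, len(task_ids), max_per_commit)]
```

With a baseline of 32 stacks, a backlog of 100 stacks scales every parameter by 4, while a backlog of 10 stacks leaves the defaults untouched. A real implementation would likely need a smoother scaling curve and an upper bound tied to compactor memory.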