Hummock has an almost fixed "Rule" for picking compaction tasks, and only violates the "Rule" in extreme cases (write stop), which makes it reliable in most cases.
However, it is not always efficient. Hummock's compaction rule is write-friendly; for example, it:
- batches input SSTs according to the configuration;
- minimizes write amplification;
- chooses an appropriate compaction task size and `target_file_size` to improve parallelism.
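As a rough illustration of the trade-off these rules balance, here is a minimal sketch; the function names and formulas below are assumptions for illustration, not Hummock's actual metrics or config fields:

```rust
/// Write amplification of one compaction task: total bytes the task
/// rewrites per byte of L0 data it retires. Lower is more write-friendly.
/// (Illustrative definition, not Hummock's exact accounting.)
fn write_amplification(total_input_bytes: u64, l0_bytes_retired: u64) -> f64 {
    total_input_bytes as f64 / l0_bytes_retired as f64
}

/// Rough parallelism of one task: if the output is split into SSTs of
/// about `target_file_size` bytes, the task can fan out to roughly this
/// many independent sub-tasks over disjoint key ranges.
fn estimated_parallelism(task_input_bytes: u64, target_file_size: u64) -> u64 {
    ((task_input_bytes + target_file_size - 1) / target_file_size).max(1)
}
```

Picking a larger task lowers per-byte overhead, but it needs a matching `target_file_size` to keep the fan-out high, which is why the rule tunes both together.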
In extreme cases, the above rules are not efficient, for example:

- read-sensitive workloads (need more timely L0 compaction);
- a large L0 stack caused by a huge checkpoint (needs more aggressive L0 batching);
- space-sensitive workloads (need more timely compaction of the higher levels).
In fact, the above scenarios can be divided into read problems and write problems, but the read side is affected by more factors, such as the operator cache, block cache, file cache, and the cache refill/evict policy. Therefore, I would like to focus on optimizing the write problem first.
For the write scenario, we can optimize the known cases by selecting tasks through a rule:

1. Optimize the task's output SST partition by write throughput (already implemented).
2. Adjust the selection rule and output of the L0 task by the number of stacked L0 sub-levels (see the sketch after this list):
   - use more aggressive batch parameters, i.e. select a larger level count / SST count / max compaction size;
   - adjust the output SST size to reduce the SST count and relieve the pressure on the meta node.
3. Optimize trivial-move commits: commit more trivial-move tasks in one commit.
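A minimal sketch of points 2 and 3, assuming a hand-rolled `L0PickParams` struct and made-up thresholds; none of these names are the actual Hummock APIs:

```rust
/// Hypothetical batch parameters for one L0 compaction pick.
#[derive(Debug)]
struct L0PickParams {
    /// How many stacked L0 sub-levels one task may batch together.
    max_sub_level_count: usize,
    /// Upper bound on total input bytes for one task.
    max_compaction_bytes: u64,
    /// Target size of each output SST; larger outputs mean fewer SSTs
    /// and less metadata pressure on the meta node.
    target_sst_size: u64,
}

/// Point 2: scale the batch parameters with the current L0 backlog.
/// The deeper the L0 stack, the more aggressively we batch and the
/// larger we make each output SST. All thresholds are made up.
fn pick_l0_params(l0_sub_level_count: usize) -> L0PickParams {
    let pressure = (l0_sub_level_count as f64 / 16.0).clamp(1.0, 4.0);
    L0PickParams {
        max_sub_level_count: (8.0 * pressure) as usize,
        max_compaction_bytes: ((512u64 << 20) as f64 * pressure) as u64, // 512 MiB base
        target_sst_size: ((64u64 << 20) as f64 * pressure) as u64,      // 64 MiB base
    }
}

/// Point 3: a trivial move only re-links an SST to a lower level and
/// rewrites no data, so many of them can share one version commit.
struct TrivialMove {
    sst_id: u64,
    target_level: u32,
}

/// Flush accumulated trivial moves as a single commit once enough of
/// them have piled up, instead of issuing one commit per task.
fn commit_trivial_moves(pending: &mut Vec<TrivialMove>, batch_size: usize) {
    if pending.len() >= batch_size {
        let batch: Vec<TrivialMove> = pending.drain(..).collect();
        // Hypothetical meta call: one version delta for the whole batch.
        println!("commit {} trivial moves in one version delta", batch.len());
    }
}
```

For instance, `pick_l0_params(48)` yields roughly 3x more aggressive batching than the base case, which matches the "huge ckpt" scenario above.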