perf(storage): Improve data alignment for multi-table compaction groups #13037

Open
Li0k opened this issue Oct 24, 2023 · 4 comments

Li0k commented Oct 24, 2023

After #11826, we avoid splitting out creating tables into dedicated compaction groups.

Backfill snapshot reads can give tables a large write throughput during MV creation and cause excessive compaction groups to be created, even though the streaming throughput can be low after MV creation completes.

Currently, we have not implemented compaction group merge, so in the above scenario we may waste more IOPS. However, placing high write-throughput tables in the default compaction group does not let us utilize parallel base compaction to improve compaction efficiency, because the key ranges are not aligned.

To solve the data alignment problem, I propose a simple solution to improve compaction parallelism by performing some data alignment operations on the default compaction group, which may improve backfill performance, reduce the stacking of L0 sub-levels, and make compaction more efficient:

  1. In the default compaction group, track per-table write throughput during the creating phase (this logic already exists).
  2. Cut the data of high-throughput tables by table_id and vnode to achieve data alignment (like the dedicated compaction group); see the sketch after this list.
  3. After backfill completes, restore the default logic to reduce the IOPS of the default compaction group.
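
As a rough illustration of step 2, here is a minimal sketch (not the actual RisingWave implementation; the key layout, helper names, and the 64-vnode granularity are assumptions) of how an SST builder could decide to seal the current file on (table_id, vnode) boundaries for high-throughput creating tables, so that the output SSTs of the default compaction group stay key-range aligned:

```rust
use std::collections::HashSet;

/// Hypothetical alignment granularity: cut a new SST every 64 vnodes.
const VNODE_ALIGN_GRANULARITY: u16 = 64;

/// Assumed key layout: table_id (u32, big-endian) ++ vnode (u16, big-endian) ++ rest.
fn table_id(full_key: &[u8]) -> u32 {
    u32::from_be_bytes(full_key[0..4].try_into().unwrap())
}

fn vnode(full_key: &[u8]) -> u16 {
    u16::from_be_bytes(full_key[4..6].try_into().unwrap())
}

/// Returns true if the builder should seal the current SST before writing `next_key`:
/// always on a table_id boundary (like the dedicated compaction group), and additionally
/// on a vnode-bucket boundary for tables flagged as high-throughput during creation.
fn should_seal(last_key: &[u8], next_key: &[u8], high_throughput_tables: &HashSet<u32>) -> bool {
    let (last_table, next_table) = (table_id(last_key), table_id(next_key));
    if last_table != next_table {
        return true;
    }
    if high_throughput_tables.contains(&next_table) {
        return vnode(last_key) / VNODE_ALIGN_GRANULARITY != vnode(next_key) / VNODE_ALIGN_GRANULARITY;
    }
    false
}
```

With SSTs cut on these boundaries, base compaction can be split into independent sub-tasks per table/vnode range, which is what enables the higher parallelism measured below.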

hzxa21 commented Nov 8, 2023

#13075


Li0k commented Nov 13, 2023

Backfill Test

Resource

  • compute node = 8c_32g * 3
  • compactor node = 16c_4g * 1

Background

We test the behavior of compaction and backfill under different policies by creating MVs on a mirror cluster. The MVs contain multiple state tables, and a few of those state tables have high write throughput. We compare the old and new policies:

Result

CPU

(screenshots comparing main and branch omitted)

The branch's compactor CPU utilization has increased, which indirectly indicates an increase in parallelism.

Barrier Latency

(screenshots comparing main and branch omitted)

Read Duration - iter

(screenshots comparing main and branch omitted)

SSTable Count

(screenshots comparing main and branch omitted)

SSTable Size

(screenshots comparing main and branch omitted)

cg2 and cg3 have less stacked L0 and base-level data:

  • cg2: 75 GB vs. 65 GB
  • cg3: 32 GB vs. 24 GB

Compaction Skip Count

(screenshots comparing main and branch omitted)

cg2 and cg3 have fewer skip counts caused by pending files.

Compaction Task

(screenshots comparing main and branch omitted)

From the analysis of CompactTask's properties, we find that the branch's tasks can eliminate more sub_levels, the size of each task is controlled to around 2 GB, and the number of files is kept below 100. Therefore, we can maintain a stable running task count and improve compactor CPU utilization. A rough sketch of this sizing policy follows.
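
The following is a minimal sketch of the sizing behavior described above (the ~2 GB and 100-file thresholds are taken from this observation; the struct and function names are made up for illustration, not RisingWave's actual picker): a task picker greedily accumulates input SSTs until adding another file would exceed either limit.

```rust
/// Thresholds observed in the test above; treat them as illustrative constants.
const MAX_TASK_SIZE_BYTES: u64 = 2 * 1024 * 1024 * 1024; // roughly 2 GB per task
const MAX_TASK_FILE_COUNT: usize = 100;                   // roughly 100 files per task

#[derive(Clone)]
struct SstInfo {
    file_size: u64,
}

/// Greedily take candidate SSTs (already key-range aligned, so tasks over disjoint
/// ranges can run in parallel) until adding another file would exceed either limit.
fn pick_task_inputs(candidates: &[SstInfo]) -> Vec<SstInfo> {
    let mut picked = Vec::new();
    let mut total_size = 0u64;
    for sst in candidates {
        if picked.len() >= MAX_TASK_FILE_COUNT || total_size + sst.file_size > MAX_TASK_SIZE_BYTES {
            break;
        }
        total_size += sst.file_size;
        picked.push(sst.clone());
    }
    picked
}
```

Keeping each task bounded this way is what keeps the running task count stable while letting more tasks run concurrently.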

Compacting Task count

(screenshots comparing main and branch omitted)

It is intuitively obvious that the branch's base-level compaction tasks have higher parallelism.

LSM Compact Pending Bytes

(screenshots comparing main and branch omitted)

Conclusion

Data alignment does bring some compaction benefits: it improves compactor utilization and therefore alleviates data buildup in the LSM tree. However, in the current tests the backfill times are short, so there is no significant end-to-end time improvement, and barrier latency is somewhat jittery due to more frequent compactions.

Li0k modified the milestones: release-1.5, release-1.6 Dec 6, 2023
Li0k modified the milestones: release-1.6, release-1.8 Mar 6, 2024
Li0k modified the milestones: release-1.8, release-1.9 Apr 8, 2024

Li0k commented Apr 8, 2024

Related to #15291: we will introduce new strategies to perform data alignment and split.

This issue has been open for 60 days with no activity. Could you please update the status? Feel free to continue discussion or close as not planned.
