arrangement backfill can be slow when there are consecutive tombstones in upstream table #17267

hzxa21 · 2024-06-14T17:25:16Z

Describe the bug

Assume there are only two vnodes and the upstream MV/Table state contains consecutive tombstones, which can happen when the upstream has a temporal filter:

vnode_1:
pk_1 -> tomb
pk_2 -> tomb
...
pk_N -> tomb
pk_(N+1) -> row
pk_(N+2) -> row
...

vnode_2:
pk_1 -> tomb
pk_2 -> tomb
...
pk_M -> tomb
pk_(M+1) -> row
pk_(M+2) -> row
...

With arrangement backfill, vnode_1 and vnode_2 are iterated independently. On seeing a barrier, backfill stops for the current epoch as long as at least one visible row has been emitted by either of the two vnode iterators. As a result, the slower vnode may never update its current position and can hardly make progress in the next epoch, because the consecutive tombstones are scanned again each time. Consider the following case:

In epoch1, vnode_1 and vnode_2 start the backfill snapshot read by scanning the upstream table independently.
vnode_1 scans [pk_1, pk_(N+1)] and emits pk_(N+2) -> row.
vnode_2 is slightly slower and only scans [pk_1, pk_(M-1)], which are all tombstones.
epoch2 arrives and backfill is interrupted. vnode_1's position is updated to pk_(N+2), while vnode_2's position remains unbounded.
In epoch2, vnode_2 therefore starts again from pk_1 and rescans the same tombstones.
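
The case above can be reproduced with a small toy model. This is only a sketch under stated assumptions, not RisingWave code: `VnodeScan`, `make_vnode`, `simulate`, and the per-epoch `budget` (how many entries an iterator reaches before the barrier interrupts it) are all hypothetical names introduced for illustration. The only behavior taken from the description is that a vnode's position moves forward only when a visible row is emitted.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple

TOMB = None  # a value of None marks a tombstone (hypothetical encoding)


@dataclass
class VnodeScan:
    # Entries sorted by pk; pks are 1-based, so the entry for pk sits at index pk - 1.
    entries: List[Tuple[int, Optional[str]]]
    position: Optional[int] = None  # last emitted pk; None means "unbounded"
    scanned: int = 0                # total entries touched, including rescans

    def run_epoch(self, budget: int) -> List[int]:
        """Scan at most `budget` entries after `position`; return the emitted pks."""
        start = 0 if self.position is None else self.position
        emitted = []
        for pk, value in self.entries[start:start + budget]:
            self.scanned += 1
            if value is not TOMB:
                emitted.append(pk)
                self.position = pk  # progress is recorded only for visible rows
        return emitted


def make_vnode(num_tombstones: int, num_rows: int) -> VnodeScan:
    tombs = [(pk, TOMB) for pk in range(1, num_tombstones + 1)]
    rows = [(pk, f"row_{pk}")
            for pk in range(num_tombstones + 1, num_tombstones + num_rows + 1)]
    return VnodeScan(entries=tombs + rows)


def simulate(epochs: int) -> Tuple[VnodeScan, VnodeScan]:
    fast = make_vnode(num_tombstones=3, num_rows=10)    # vnode_1, N = 3
    slow = make_vnode(num_tombstones=100, num_rows=10)  # vnode_2, M = 100
    for _ in range(epochs):
        fast.run_epoch(budget=5)   # gets past its tombstones and emits rows
        slow.run_epoch(budget=99)  # interrupted at pk_(M-1), only tombstones seen
    return fast, slow


if __name__ == "__main__":
    fast, slow = simulate(epochs=3)
    print("vnode_1 position:", fast.position, "scanned:", fast.scanned)
    print("vnode_2 position:", slow.position, "scanned:", slow.scanned)
```

In this model vnode_2 touches the same 99 tombstones in every epoch and its position stays unbounded, while vnode_1 finishes its range. The real executor is interrupted by barriers rather than by a fixed budget, so the numbers are illustrative only.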

Error message/log

No response

To Reproduce

No response

Expected behavior

No response

How did you deploy RisingWave?

No response

The version of RisingWave

No response

Additional context

No response


hzxa21 · 2024-07-10T09:35:43Z

Given that we are targeting v1.11 for the first version of serverless backfill, which can fundamentally solve this issue because backfill will no longer be interrupted, I wonder whether we still need to fix arrangement backfill for this corner case.

hzxa21 added the type/bug Something isn't working label Jun 14, 2024

github-actions bot added this to the release-1.10 milestone Jun 14, 2024

hzxa21 mentioned this issue Jun 14, 2024

fix(backfill): make tombstone iteration progress across all vnodes per epoch #17266

Closed

9 tasks

lmatz added the block-release-v1.10 label Jun 17, 2024

hzxa21 mentioned this issue Jun 24, 2024

Expose internal key in storage iterator for backfill #17427

Open

fuyufjh removed this from the release-1.10 milestone Jul 10, 2024

lmatz removed the block-release-v1.10 label Jul 26, 2024

kwannoel mentioned this issue Oct 8, 2024

CI: test_backfill_tombstone takes long time #17365

Open
