disable conflict check for those row_id generated by rowIdGenerator #14796

Open
st1page opened this issue Jan 25, 2024 · 5 comments

Comments

@st1page
Contributor

st1page commented Jan 25, 2024

We have found that the conflict check can be a bottleneck in #14635, even after optimizing it with a bloom filter and the operator cache.

For an append-only table without a primary key, the conflict check is unnecessary because every row_id is generated by the rowIdGenerator.

dev=> explain create table t(v int) append only;
                                                       QUERY PLAN                                                        
-------------------------------------------------------------------------------------------------------------------------
 StreamMaterialize { columns: [v, _row_id(hidden)], stream_key: [_row_id], pk_columns: [_row_id], pk_conflict: NoCheck }
 └─StreamRowIdGen { row_id_index: 1 }
   └─StreamUnion { all: true }
     └─StreamExchange [no_shuffle] { dist: SomeShard }
       └─StreamDml { columns: [v, _row_id] }
         └─StreamSource
(6 rows)

When the table is not append-only, we still need the check for delete and update DML statements. However, most of the rows come from inserts or from the table's upstream connector such as Kafka. So if we can find a way to determine which rows have a row_id generated by the rowIdGenerator, we can skip the conflict check for them.

dev=> explain create table t(v int);
                                                        QUERY PLAN                                                         
---------------------------------------------------------------------------------------------------------------------------
 StreamMaterialize { columns: [v, _row_id(hidden)], stream_key: [_row_id], pk_columns: [_row_id], pk_conflict: Overwrite }
 └─StreamRowIdGen { row_id_index: 1 }
   └─StreamUnion { all: true }
     └─StreamExchange { dist: HashShard(_row_id) }
       └─StreamDml { columns: [v, _row_id] }
         └─StreamSource
(6 rows)
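
A minimal sketch of the idea, using hypothetical names rather than RisingWave's actual executor types: if each row carried a marker saying whether its _row_id was just filled in by StreamRowIdGen, the materialize step could do a blind write for those rows and keep the conflict check only for rows that reference an existing _row_id.

// Hypothetical sketch of the per-row decision in the materialize step.
// Row, Op, and the generated_row_id flag are illustrative, not RisingWave APIs.

#[derive(Clone, Copy)]
enum Op {
    Insert,
    Delete,
    UpdateDelete,
    UpdateInsert,
}

struct Row {
    op: Op,
    // Set by the row-id generator when it fills in _row_id for this row.
    generated_row_id: bool,
}

// A generator-produced _row_id is unique by construction, so only rows that
// reference an existing _row_id (delete/update DML) still need the lookup.
fn needs_conflict_check(row: &Row) -> bool {
    match row.op {
        Op::Delete | Op::UpdateDelete | Op::UpdateInsert => true,
        Op::Insert => !row.generated_row_id,
    }
}

fn main() {
    let from_source = Row { op: Op::Insert, generated_row_id: true };
    let from_delete_dml = Row { op: Op::Delete, generated_row_id: false };
    assert!(!needs_conflict_check(&from_source));
    assert!(needs_conflict_check(&from_delete_dml));
}

How such a marker would be plumbed from StreamRowIdGen down to StreamMaterialize is exactly the open design question of this issue.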
@github-actions github-actions bot added this to the release-1.7 milestone Jan 25, 2024
@chenzl25
Contributor

Is it possible that a _row_id is deleted via DML?

@st1page
Contributor Author

st1page commented Jan 26, 2024

Is it possible that a _row_id is deleted via DML?

Yes. So we need some way to let the MaterializeExecutor know whether a row needs to be checked (i.e., whether it comes from an update/delete DML statement).
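
One possible way to carry that information, again as a hypothetical sketch and not RisingWave's real chunk format: the DML path could attach a per-row "needs check" flag that is only set for rows produced by update/delete statements, and the MaterializeExecutor would consult it before touching the state table.

// Hypothetical per-row hint set on the DML path and read by materialize.
// DmlChunk and apply_to_state_table are illustrative; the real chunk is columnar.

struct DmlChunk {
    row_ids: Vec<i64>,
    // One flag per row: true only for rows coming from update/delete DML.
    needs_check: Vec<bool>,
}

fn apply_to_state_table(chunk: &DmlChunk) {
    for (row_id, &check) in chunk.row_ids.iter().zip(&chunk.needs_check) {
        if check {
            // Delete/update: look up the old row and resolve the conflict.
            println!("checked write for _row_id {row_id}");
        } else {
            // Generator-produced _row_id: blind write, no state-table lookup.
            println!("blind write for _row_id {row_id}");
        }
    }
}

fn main() {
    let chunk = DmlChunk {
        row_ids: vec![100, 101, 102],
        needs_check: vec![false, true, false],
    };
    apply_to_state_table(&chunk);
}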

@st1page st1page modified the milestones: release-1.7, release-1.8 Mar 6, 2024
@st1page st1page modified the milestones: release-1.8, release-1.9 Apr 8, 2024
@st1page st1page self-assigned this Apr 8, 2024
@lmatz
Contributor

lmatz commented Apr 18, 2024

This only applies to the case where there is no downstream MV or index, right?

With MV or index, we have to do the checks for inserts anyways.

@st1page
Contributor Author

st1page commented Apr 18, 2024

This only applies to the case where there is no downstream MV or index, right?

With MV or index, we have to do the checks for inserts anyways.

It can be optimized for all tables without a primary key, whether or not there are MVs downstream.

@st1page st1page modified the milestones: release-1.9, release-1.10 May 14, 2024
@st1page st1page removed their assignment May 14, 2024
@st1page st1page modified the milestones: release-1.10, release-1.9 May 14, 2024
@fuyufjh fuyufjh removed this from the release-1.9 milestone May 14, 2024

github-actions bot commented Aug 1, 2024

This issue has been open for 60 days with no activity.

If you think it is still relevant today, and needs to be done in the near future, you can comment to update the status, or just manually remove the no-issue-activity label.

You can also confidently close this issue as not planned to keep our backlog clean.
Don't worry if you think the issue is still valuable to continue in the future.
It's searchable and can be reopened when it's time. 😄
