Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

optimizer: introduce 2 new operators for scanning sys table / CDC table #13146

Closed
fuyufjh opened this issue Oct 30, 2023 · 6 comments
Closed
Assignees
Milestone

Comments

@fuyufjh
Copy link
Member

fuyufjh commented Oct 30, 2023

Recommend to introduce a new physical operator CdcTableScan, because scanning from RW table, sys table or CDC table are completely different in terms of implementation.

Here the if self.predicate().always_true() actaully implies a huge difference in implementation, which should be reflected by different phyiscal operators, instead of such if-else branch.

I guess this might be introduced by sys table before, so it's acceptable for me too keep it as is now and do refactor later.

Originally posted by @fuyufjh in #12535 (comment)

@kwannoel
Copy link
Contributor

kwannoel commented Nov 7, 2023

+1. Arrangement backfill now requires full TableCatalog, not just the TableDesc. This creates yet another branch between RW Table (now requires full TableCatalog), Sys Table (which just has TableDesc).

I think we should refactor it before adding full TableCatalog for arrangement backfill to avoid making it even more complicated.

@kwannoel
Copy link
Contributor

kwannoel commented Nov 7, 2023

Additionally, they should have their own respective proto definitions as well, rather than rely on the same Chain definition.

@kwannoel
Copy link
Contributor

kwannoel commented Nov 10, 2023

Plan:

Misc:

@st1page
Copy link
Contributor

st1page commented Nov 16, 2023

Separate logical scan for cdc table.

I guess it is not needed for the logical scan, actually, I do not think there are many branches in the logical phase currently. I think just separate the physical operator is enough

after checking the current implementation, LGTM. because we do not have many common logic such as predicate push down and column pruning on the CDC table's backfill and stream scan...

Copy link
Contributor

This issue has been open for 60 days with no activity. Could you please update the status? Feel free to continue discussion or close as not planned.

@kwannoel
Copy link
Contributor

From #13146 (comment)

Unify some common functionality between different scan nodes:

It does not seem there's much in common, both between stream_table_scan and stream_cdc_scan, and also their core parts (generic::TableScan, generic::CdcScan).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants