feat/perf: make UDF invoked only once if the updates do not actually change value #15877

Closed
lmatz opened this issue Mar 25, 2024 · 2 comments

@lmatz
Contributor

lmatz commented Mar 25, 2024

A request from a user:

I am not so sure how many times my UDF will be called though
On every new row I think yes but how do I make sure it is called on an existing row only if it changed ?

Answer:

An Update operation might become 2 rows (UpdateDelete + UpdateInsert), so the UDF is called twice.

In theory, we can detect such no-op updates and compact them before invoking the UDF,
but since a UDF is treated as a normal expression and can appear anywhere, it's hard to do right now.
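
To make the idea concrete, here is a minimal sketch of what "detect and compact no-op updates before invoking the UDF" could look like. It is not RisingWave's implementation: the `Op` enum, the tuple-based chunk layout, and the helpers `compact_noop_updates` and `count_udf_calls` are all hypothetical names chosen for illustration. It assumes an update arrives as an adjacent UpdateDelete/UpdateInsert pair and treats a pair as a no-op only when the old and new rows are fully identical.

```python
from enum import Enum
from typing import Any, Callable, List, Tuple


class Op(Enum):
    # Hypothetical change-log operations, named after the ones in this thread.
    INSERT = "Insert"
    DELETE = "Delete"
    UPDATE_DELETE = "UpdateDelete"
    UPDATE_INSERT = "UpdateInsert"


Row = Tuple[Any, ...]
Change = Tuple[Op, Row]


def compact_noop_updates(chunk: List[Change]) -> List[Change]:
    """Drop adjacent UpdateDelete/UpdateInsert pairs whose rows are identical."""
    out: List[Change] = []
    i = 0
    while i < len(chunk):
        op, row = chunk[i]
        is_update_pair = (
            op is Op.UPDATE_DELETE
            and i + 1 < len(chunk)
            and chunk[i + 1][0] is Op.UPDATE_INSERT
        )
        if is_update_pair and chunk[i + 1][1] == row:
            i += 2  # old row == new row: nothing changed, skip both halves
            continue
        out.append(chunk[i])
        i += 1
    return out


def count_udf_calls(chunk: List[Change], udf: Callable[[Any], Any], col: int) -> int:
    """Evaluate the UDF once per row in the chunk and return the call count."""
    calls = 0
    for _op, row in chunk:
        udf(row[col])
        calls += 1
    return calls


def double(x: int) -> int:
    # Stand-in for a user-defined function (assumed pure).
    return x * 2


chunk: List[Change] = [
    (Op.INSERT, (1, 10)),
    (Op.UPDATE_DELETE, (2, 20)),
    (Op.UPDATE_INSERT, (2, 20)),  # no-op update: row is unchanged
    (Op.UPDATE_DELETE, (3, 30)),
    (Op.UPDATE_INSERT, (3, 31)),  # real update: value changed
]

calls_without_compaction = count_udf_calls(chunk, double, col=1)
calls_with_compaction = count_udf_calls(compact_noop_updates(chunk), double, col=1)
```

In this sketch the uncompacted chunk evaluates the UDF on every row, including both halves of the no-op update, while the compacted chunk skips that pair entirely. A real update still costs two calls, because both the old and the new row pass through the expression.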

Link: #14855 (comment)

IIRC, the join operator always detects and compacts because the join operator may yield too many intermediate results, which can be very bad. (will double check)

@github-actions github-actions bot added this to the release-1.8 milestone Mar 25, 2024
@BugenZhao
Member

Is this exactly #12201?

@lmatz lmatz closed this as not planned Mar 25, 2024
@xiangjinwu
Contributor

IIRC, the join operator always detects and compacts because the join operator may yield too many intermediate results, which can be very bad. (will double check)

may be related: #14835
