[FEA] Filter that also drops columns #9337

Open
jlowe opened this issue Sep 28, 2023 · 1 comment
Labels
feature request: New feature or request
performance: A performance related task/issue
reliability: Features to improve reliability or bugs that severely impact the reliability of the plugin

Comments

jlowe (Member) commented Sep 28, 2023

A somewhat common occurrence in queries is a filter on one or more columns followed by a project that drops the columns that were filtered on (i.e., the columns were only needed to perform the filter). Currently we materialize the filtered result for those columns only to drop them in the subsequent project. It would save time and GPU memory if we could transform the Filter --> Project into a FilterWithDrop that does not materialize the filter result for columns that are no longer needed after the filter.
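
To make the request concrete, here is a minimal sketch of the idea in plain Python. It is not the plugin's implementation and does not use cuDF. The names `Table`, `build_mask`, `gather`, `filter_then_project`, and `filter_with_drop` are hypothetical, and a table is assumed to be a simple mapping from column name to a list of values. The baseline gathers every column and then drops some; the fused version computes the same mask but gathers only the columns that survive the project.

```python
from typing import Any, Callable, Dict, List, Sequence

# Hypothetical columnar batch: column name -> list of values (all the same length).
Table = Dict[str, List[Any]]
Row = Dict[str, Any]


def build_mask(table: Table, predicate: Callable[[Row], bool]) -> List[bool]:
    """Evaluate the filter predicate once per row and return a boolean mask."""
    num_rows = len(next(iter(table.values()))) if table else 0
    return [
        predicate({name: col[i] for name, col in table.items()})
        for i in range(num_rows)
    ]


def gather(column: List[Any], mask: List[bool]) -> List[Any]:
    """Materialize only the rows of one column that pass the mask."""
    return [value for value, passed in zip(column, mask) if passed]


def filter_then_project(
    table: Table, predicate: Callable[[Row], bool], keep: Sequence[str]
) -> Table:
    """Baseline: gather every column, then drop the ones the project omits."""
    mask = build_mask(table, predicate)
    filtered = {name: gather(col, mask) for name, col in table.items()}
    return {name: filtered[name] for name in keep}


def filter_with_drop(
    table: Table, predicate: Callable[[Row], bool], keep: Sequence[str]
) -> Table:
    """Fused sketch: compute the mask, then gather only the kept columns."""
    mask = build_mask(table, predicate)
    return {name: gather(table[name], mask) for name in keep}


# Illustrative data: "price" is needed only by the filter, not by the output.
batch = {
    "id": [1, 2, 3, 4],
    "price": [5.0, 20.0, 7.5, 30.0],
    "name": ["a", "b", "c", "d"],
}
baseline = filter_then_project(batch, lambda row: row["price"] > 10, ["id", "name"])
fused = filter_with_drop(batch, lambda row: row["price"] > 10, ["id", "name"])
```

Both functions return the same table. The difference is that the fused version never builds the filtered `price` column, which is the time and memory saving described above.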

jlowe added the feature request, ? - Needs Triage, and performance labels Sep 28, 2023
revans2 added the reliability label Sep 29, 2023
revans2 (Collaborator) commented Sep 29, 2023

This is very similar to #8831. We might also want to look into having window drop columns. I know that it can happen, especially if a column is used only to create the window and is not needed afterwards. I don't think this would be a performance improvement for window, because we are not going to gather the data for window. But dropping the column slightly earlier could reduce memory use, although it may not be that big of a win.

The only other operator I can think of that might drop a column after it is used, and where we could avoid a gather, is hash aggregate, but that would require changes to cuDF, and I don't think it is that common of an operation.

mattahrens removed the ? - Needs Triage label Oct 4, 2023