Refactor full join iterator to allow access to build tracker #10246

Merged (3 commits) on Jan 24, 2024

Conversation

jlowe (Contributor) commented Jan 22, 2024

Relates to #10240. When looping over build partitions for each stream batch, a full outer join must keep track of which rows in the build-side partitions have been referenced across all stream batches. Once every stream batch has been processed across all sub-partitions, the per-partition build-side row tracking data can be used to perform the anti-join that completes the results of the full outer join.
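To make the tracking requirement concrete, here is a minimal CPU-side sketch in Python. It is not the plugin's code: the function name `full_outer_join`, the `(key, payload)` row shape, and the nested-loop matching are all assumptions made for illustration. It only shows why the tracking data must survive across every stream batch before the final anti-join can run.

```python
from typing import List, Optional, Tuple

Row = Tuple[int, str]                           # hypothetical (join key, payload)
Joined = Tuple[Optional[Row], Optional[Row]]    # (stream row, build row)


def full_outer_join(build_partitions: List[List[Row]],
                    stream_batches: List[List[Row]]) -> List[Joined]:
    # One boolean per build row, kept per partition and shared by all batches.
    trackers = [[False] * len(part) for part in build_partitions]
    out: List[Joined] = []

    # Outer-join phase: every stream batch visits every build partition.
    for batch in stream_batches:
        for s in batch:
            found = False
            for part, tracker in zip(build_partitions, trackers):
                for i, b in enumerate(part):
                    if b[0] == s[0]:
                        tracker[i] = True       # build row has been referenced
                        found = True
                        out.append((s, b))
            if not found:
                out.append((s, None))           # stream row with no build match

    # Anti-join phase: only valid after ALL stream batches have been seen.
    for part, tracker in zip(build_partitions, trackers):
        for b, seen in zip(part, tracker):
            if not seen:
                out.append((None, b))
    return out
```

If the anti-join ran after each stream batch instead, a build row that only matches a later batch would be emitted once as unmatched and again as matched, which is why the tracking state has to outlive any single batch.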

This refactors the full outer join iterator to use a sub-iterator that performs the left or right outer join and tracks the build-side rows as it goes. Once it has finished iterating, callers can release the tracking data. This removes the need for a final-batch concept in the abstract join iterator, but the abstract iterator still needs to know whether it is safe to close the iterator early when hasNext returns false.
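The shape of that refactor can be sketched roughly as follows, again in Python rather than the plugin's own code. The class name `TrackingOuterJoinIterator`, the methods `has_next`, `next`, and `release_tracker`, and the wrapper `full_outer_join` are hypothetical names chosen for this example. The point is that the sub-iterator keeps its tracker alive after `has_next` returns false, so the caller can take it and run the anti-join itself.

```python
from typing import Iterable, List, Optional, Tuple

Row = Tuple[int, str]                           # hypothetical (join key, payload)
Joined = Tuple[Optional[Row], Optional[Row]]    # (stream row, build row)


class TrackingOuterJoinIterator:
    """Outer join that preserves stream rows and records build-row matches."""

    def __init__(self, build_rows: List[Row],
                 stream_batches: Iterable[List[Row]]) -> None:
        self._build = build_rows
        self._batches = iter(stream_batches)
        self._tracker: Optional[List[bool]] = [False] * len(build_rows)
        self._pending: Optional[List[Joined]] = None
        self._done = False

    def has_next(self) -> bool:
        if self._pending is None and not self._done:
            batch = next(self._batches, None)
            if batch is None:
                # Exhausted, but the tracker is deliberately NOT freed here.
                self._done = True
            else:
                self._pending = self._join(batch)
        return self._pending is not None

    def next(self) -> List[Joined]:
        if not self.has_next():
            raise StopIteration
        out, self._pending = self._pending, None
        return out

    def _join(self, batch: List[Row]) -> List[Joined]:
        out: List[Joined] = []
        for s in batch:
            hits = [i for i, b in enumerate(self._build) if b[0] == s[0]]
            for i in hits:
                self._tracker[i] = True
                out.append((s, self._build[i]))
            if not hits:
                out.append((s, None))
        return out

    def release_tracker(self) -> List[bool]:
        """Hand ownership of the tracking data to the caller."""
        if not self._done or self._tracker is None:
            raise RuntimeError("tracker unavailable: not exhausted or already released")
        tracker, self._tracker = self._tracker, None
        return tracker


def full_outer_join(build_rows: List[Row],
                    stream_batches: Iterable[List[Row]]) -> List[List[Joined]]:
    it = TrackingOuterJoinIterator(build_rows, stream_batches)
    batches = []
    while it.has_next():
        batches.append(it.next())
    tracker = it.release_tracker()
    # The caller, not the sub-iterator, produces the unmatched build rows.
    batches.append([(None, b) for b, seen in zip(build_rows, tracker) if not seen])
    return batches
```

In this sketch the outer wrapper needs no special "final batch" hook: it simply drains the sub-iterator, takes the tracker, and emits the anti-join result as one more batch. The cost is that exhausting the sub-iterator cannot free its resources automatically, which mirrors the early-close caveat described above.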

@jlowe jlowe self-assigned this Jan 22, 2024
jlowe (Contributor, Author) commented Jan 22, 2024

build

jlowe (Contributor, Author) commented Jan 23, 2024

build

@jlowe jlowe merged commit 35f64fc into NVIDIA:branch-24.02 Jan 24, 2024
40 checks passed
@jlowe jlowe deleted the full-join-refactor branch January 24, 2024 14:57
@sameerz sameerz added the task Work required that improves the product but is not user facing label Jan 27, 2024
Labels
task Work required that improves the product but is not user facing
3 participants