Left/full outer join incorrect for CollectLeft / broadcast #1055

Open
Dandandan opened this issue Sep 15, 2024 · 3 comments
Labels
bug Something isn't working
Milestone

Comments

@Dandandan
Contributor

Dandandan commented Sep 15, 2024

Describe the bug
See discussion here apache/datafusion#12454

The "broadcast join" (CollectLeft) produces wrong results for join types that emit unmatched left rows, such as left and full outer joins.

To Reproduce
Run a broadcast (CollectLeft) join with a left or full outer join type on more than one node.
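
A toy model of the failure mode, as a minimal sketch in plain Python. This is not DataFusion or Ballista code: the function name `local_left_join` and the sample data are hypothetical, and the model assumes each node receives the full left side while probing only its own partition of the right side.

```python
def local_left_join(left, right_partition):
    """Left outer join of the full (broadcast) left side against one right partition."""
    out = []
    for lk, lv in left:
        matches = [rv for rk, rv in right_partition if rk == lk]
        if matches:
            out.extend((lk, lv, rv) for rv in matches)
        else:
            # This node cannot know whether another node matched this left row.
            out.append((lk, lv, None))
    return out


left = [(1, "a"), (2, "b"), (3, "c")]
right = [(1, "x"), (2, "y")]

# Reference result: one node sees the whole right side.
expected = local_left_join(left, right)

# Hypothetical two-node run: the right side is split, the left side is broadcast.
partitions = [[(1, "x")], [(2, "y")]]
distributed = [row for part in partitions for row in local_left_join(left, part)]

print(len(expected), len(distributed))
```

In this model every node decides "unmatched" from its local view only, so rows matched elsewhere are still emitted with a null right side, and truly unmatched rows are emitted once per node.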

Expected behavior

Additional context

@Dandandan Dandandan added the bug Something isn't working label Sep 15, 2024
@Dandandan Dandandan changed the title from "Left/full outer join incorrect" to "Left/full outer join incorrect for CollectLeft / broadcast" Sep 15, 2024
@Dandandan
Contributor Author

There is a proposal in DataFusion to add a hook that supports sharing the join state: apache/datafusion#12523

We tested this at Coralogix, and it works very well for us.
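
One way to picture what sharing the join state could achieve, as a rough sketch in plain Python. The details of the proposed hook are not described in this thread, so everything here is an assumption: the names `probe` and `left_join_shared_state` are hypothetical, and the model simply merges the per-partition sets of matched left rows before emitting unmatched rows once.

```python
def probe(left, right_partition):
    """Return matched output rows plus the indices of left rows that matched locally."""
    rows, matched = [], set()
    for i, (lk, lv) in enumerate(left):
        for rk, rv in right_partition:
            if rk == lk:
                rows.append((lk, lv, rv))
                matched.add(i)
    return rows, matched


def left_join_shared_state(left, partitions):
    """Left outer join where the matched state is combined across all partitions."""
    rows, matched = [], set()
    for part in partitions:
        part_rows, part_matched = probe(left, part)
        rows.extend(part_rows)
        matched |= part_matched  # shared state: union of matches from every partition
    # Emit unmatched left rows exactly once, after all partitions have been probed.
    rows.extend((lk, lv, None) for i, (lk, lv) in enumerate(left) if i not in matched)
    return rows


left = [(1, "a"), (2, "b"), (3, "c")]
partitions = [[(1, "x")], [(2, "y")]]
result = left_join_shared_state(left, partitions)
print(result)
```

The point of the sketch is only the ordering: unmatched rows are decided from the combined state rather than from each partition's local view.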

@andygrove andygrove added this to the 0.13.0 milestone Nov 20, 2024
@Dandandan
Contributor Author

The broadcast join could also be disabled for these join types, although that will likely hurt performance by quite a bit.

@milenkovicm
Contributor

Should we take this once it gets merged in DataFusion?
