Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Rewrite MultiProduct #835

Merged

Conversation

Philippe-Cholet
Copy link
Member

@Philippe-Cholet Philippe-Cholet commented Jan 3, 2024

Fixes #337. Closes #603 since this is an alternative.

#834 legitimately asked that we work on the bugfix #603. But after rebasing it, I saw that I could finally fuse the iterator (and therefore have a working specialization test). I thought of using #603 as a base but I eventually chose to make my own changes, starting over. I however credited @JakobDegen where it made sense.

I made commits as atomic as I could.

So this fuses the iterator which is a breaking change, which won't affect most people.

Copy link
Member

@phimuemue phimuemue left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you @Philippe-Cholet .

We should do this.

@jswrenn
Copy link
Member

jswrenn commented Jan 3, 2024

Thanks for this! I'll give it a review tomorrow!

@Philippe-Cholet
Copy link
Member Author

Philippe-Cholet commented Jan 4, 2024

TL;DR The choice I made in db5d906 does not seem to affect performance.

I just tried cur: Option<I::Item>, in each MultiProductIter (as before in "master") instead of cur: Option<Vec<I::Item>>, in MultiProductInner (as in this PR), benchmarked both and got similar performances (both a bit slower for next than in "master" +8%, but with bugfix and fused).

EDIT: It's not as polished as this PR but this is the alternative I'm talking about.

Copy link
Member

@jswrenn jswrenn left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks good to me, but needs some more documentation. For starters, could you add some more documentation to the type definitions?

src/adaptors/multi_product.rs Outdated Show resolved Hide resolved
src/adaptors/multi_product.rs Outdated Show resolved Hide resolved
src/adaptors/multi_product.rs Outdated Show resolved Hide resolved
@codecov-commenter
Copy link

codecov-commenter commented Jan 18, 2024

Codecov Report

Attention: 2 lines in your changes are missing coverage. Please review.

Comparison is base (62d2b65) 93.43% compared to head (085990c) 93.69%.

Files Patch % Lines
src/adaptors/multi_product.rs 97.56% 2 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##           master     #835      +/-   ##
==========================================
+ Coverage   93.43%   93.69%   +0.25%     
==========================================
  Files          48       48              
  Lines        6778     6755      -23     
==========================================
- Hits         6333     6329       -4     
+ Misses        445      426      -19     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

@Philippe-Cholet
Copy link
Member Author

I rebased on master and added 3 commits to comment "state and values" with code.

Copy link
Member

@phimuemue phimuemue left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks

Philippe-Cholet and others added 12 commits January 22, 2024 14:07
By wrapping "inner" in an option, I'll be able to fuse the iterator.

Keep the current value of each iterator in `cur`: none at the beginning, some later.
Previously, each `MultiProductIter` remembered its own element. Now, we have a vector of them. That way, we can update `cur` in-place and clone it to generate the item, I think that's simpler (less unwrap) and maybe more efficient. But we have two different vectors instead of a single bigger vector.
Co-Authored-By: Jakob Degen <[email protected]>
An empty product is supposed to generate a single empty vector, test this.

Co-Authored-By: Jakob Degen <[email protected]>
`count` is generally not a cheap operation so I avoid it when the result would be zero anyway.
Similar to `count` but with "size hint" operations.
While adding comments, I realized I could set the iterator as finished in the case of the nullary cartesian product.
It adds a simple invariant. I thought of making a comment but a debug check is better.
@Philippe-Cholet
Copy link
Member Author

I merely rebased (no conflict) to see if the new CI still accepts this.
Do we merge this (and #853) now or in the next release?

@jswrenn
Copy link
Member

jswrenn commented Jan 22, 2024

I'll cut a release with our non-breaking changes first, probably tomorrow.

@Philippe-Cholet Philippe-Cholet removed this from the next milestone Jan 22, 2024
@Philippe-Cholet Philippe-Cholet added this to the next milestone Jan 30, 2024
@Philippe-Cholet Philippe-Cholet added this pull request to the merge queue Jan 30, 2024
@Philippe-Cholet Philippe-Cholet removed this pull request from the merge queue due to a manual request Jan 30, 2024
@Philippe-Cholet
Copy link
Member Author

Now that "0.12.1" is released, I renamed the associated milestone and created a new one.
I think it's time to merge this and approve/merge #853 as well, right?

@Philippe-Cholet Philippe-Cholet added this pull request to the merge queue Jan 30, 2024
Merged via the queue into rust-itertools:master with commit 33541c4 Jan 30, 2024
13 checks passed
@Philippe-Cholet Philippe-Cholet deleted the rewrite-multiproduct branch January 30, 2024 14:12
stdrc added a commit to risingwavelabs/risingwave that referenced this pull request Sep 30, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

multi_cartesian_product of zero iterators should produce one empty Vec
4 participants