fix: tokenization of ArgsKwargsPackedFunction #555

Open
wants to merge 12 commits into
base: main

pfackeldey
Collaborator

Fixes #553.

The token is now generated deterministically, so the dak_cache is hit correctly when dak.map_partitions is invoked repeatedly with the same arguments.

Unfortunately I couldn't use __dask_tokenize__ because one needs to repack the args and kwargs correctly.

I'm not sure how much of a performance gain there is; it would be nice to measure it somehow...
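To illustrate why a deterministic token matters for cache hits, here is a minimal standard-library sketch. It is not the code in this PR and does not use dask's tokenization; `random_token`, `content_token`, and `cached_call` are hypothetical names, and hashing `repr` of the arguments is a simplification that only works for values with a stable `repr`.

```python
import hashlib
import uuid


def random_token(fn, args, kwargs):
    # Hypothetical "bad" tokenizer: a fresh random value on every call,
    # so two identical invocations never share a cache key.
    return uuid.uuid4().hex


def content_token(fn, args, kwargs):
    # Hypothetical "good" tokenizer: hash only what defines the call.
    # kwargs are sorted by key so keyword order cannot change the token.
    parts = [
        f"{fn.__module__}.{fn.__qualname__}",
        repr(args),
        repr(sorted(kwargs.items())),
    ]
    return hashlib.sha256("|".join(parts).encode()).hexdigest()


def cached_call(cache, token_fn, fn, *args, **kwargs):
    # Look the call up by token; compute and store the result on a miss.
    key = token_fn(fn, args, kwargs)
    hit = key in cache
    if not hit:
        cache[key] = fn(*args, **kwargs)
    return cache[key], hit


def add(x, y=0):
    return x + y


random_cache, content_cache = {}, {}
# Record whether each of two identical calls was served from the cache.
random_hits = [cached_call(random_cache, random_token, add, 1, y=2)[1] for _ in range(2)]
content_hits = [cached_call(content_cache, content_token, add, 1, y=2)[1] for _ in range(2)]
```

With the random token, both calls miss the cache; with the content-based token, the second identical call is a hit.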

@lgray lgray closed this Dec 16, 2024
@lgray lgray reopened this Dec 16, 2024
@codecov-commenter
codecov-commenter commented Dec 16, 2024

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 92.49%. Comparing base (8cb8994) to head (d68b470).
Report is 194 commits behind head on main.

Additional details and impacted files
@@            Coverage Diff             @@
##             main     #555      +/-   ##
==========================================
- Coverage   93.06%   92.49%   -0.58%     
==========================================
  Files          23       22       -1     
  Lines        3290     3439     +149     
==========================================
+ Hits         3062     3181     +119     
- Misses        228      258      +30     

@pfackeldey
Collaborator Author

@lgray do you want to add something else here? I would add a test tomorrow, and then the PR is ready from my side 👍

@pfackeldey
Collaborator Author

This one is ready from my side @martindurant and @lgray 👍

@lgray
Collaborator

lgray commented Dec 17, 2024

@pfackeldey with #558 you'll need to map_partitions a function that has structured inputs to get it to pack it into ArgsKwargsPackedFunction. Just want to make sure your test stays pertinent!

@martindurant
Collaborator

+1

@lgray
Collaborator

lgray commented Dec 17, 2024

i.e. you'll have to pass a dict or something like that which contains dask collections.
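As a rough picture of what "packing" a function with structured inputs can mean, the sketch below wraps a function that takes a dict into one that takes flat positional arguments and repacks them. This is an assumption-laden illustration in plain Python, not dask-awkward's `ArgsKwargsPackedFunction`; `pack`, `flat_fn`, and `total` are hypothetical names, and only a single-level dict is handled.

```python
def pack(fn, structure):
    """Return (flat_fn, leaves) such that flat_fn(*leaves) == fn(structure)."""
    # Fix the key order so the flattening is reproducible between calls.
    keys = sorted(structure)
    leaves = [structure[k] for k in keys]

    def flat_fn(*flat_args):
        # Repack the flat positional arguments into the original dict shape.
        return fn(dict(zip(keys, flat_args)))

    return flat_fn, leaves


def total(inputs):
    return inputs["a"] + inputs["b"]


flat_total, leaves = pack(total, {"b": 2, "a": 1})
result = flat_total(*leaves)
```

In this sketch the plain integers stand in for the dask collections a real test would place inside the dict.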

@pfackeldey
Collaborator Author

pfackeldey commented Dec 17, 2024

I hope this is now correct @lgray and @martindurant

edit: I don't know what the failing test is about, but I don't think it's related to my changes? I'm confused...

@lgray lgray closed this Dec 17, 2024
@lgray lgray reopened this Dec 17, 2024
@lgray
Collaborator

lgray commented Dec 17, 2024

Actually @martindurant I'm going to revert that change. Dealing with all the cases is worse than the string manipulation.

@martindurant
Collaborator

Sorry :|

@lgray lgray closed this Dec 17, 2024
@lgray lgray reopened this Dec 17, 2024
@lgray
Collaborator

lgray commented Dec 17, 2024

That one test is weirdly flaky...


Successfully merging this pull request may close these issues.

wrong token generation with dak.map_partitions
4 participants