Add lru_cache to module_available #8716
A significant amount of time in xarray.backends.common.py:AbstractWriteableDataStore.set_variables is spent in common.py:is_dask_collection, which checks for the presence of the dask module. This cost becomes noticeable when writing many small files.
Seems very reasonable. Any objections from anyone? Otherwise I will merge.
I would have assumed that module lookup is super fast but never tested it.
Thanks, looks like a good addition!
This will change the behavior when a module is installed while Python is running (not a blocker, just something to be aware of).
I just checked: installing a package while Python is running both stops import statements from raising ImportError and changes the return value
It isn't unreasonably slow, but in one instance we called
I don't think we should care about someone installing packages while Python is running.
I apologize for the confusion, but this was actually the wrong
@headtr1ck I added some more statistics about what is going on in the other PR.
I apologize for the confusion, but I unfortunately made an error while preparing the PR and didn't include the caching of the correct
Our application creates many small netCDF3 files: https://github.com/equinor/ert/blob/9c2b60099a54eeb5bb40013acef721e30558a86c/src/ert/storage/local_ensemble.py#L593
A significant amount of time in xarray.backends.common.py:AbstractWriteableDataStore.set_variables is spent in common.py:is_dask_collection, which checks for the presence of the dask module; that check takes about 0.3 ms.
This cost becomes significant when writing many small files. This PR uses lru_cache to avoid rechecking for the presence of dask, since it should not change for the lifetime of the application.
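For illustration, the caching idea can be sketched as below. This is a minimal sketch, not the actual xarray implementation: it assumes the availability check is built on `importlib.util.find_spec`, and the single-argument signature shown here is an assumption made for the example.

```python
import importlib.util
from functools import lru_cache


@lru_cache
def module_available(module: str) -> bool:
    """Return True if a top-level module can be found, without importing it.

    Hypothetical signature for illustration; the real helper may take
    additional parameters.
    """
    # find_spec searches the import system for the module. Repeating that
    # search on every call is the cost the cache is meant to avoid.
    return importlib.util.find_spec(module) is not None


# The first call performs the lookup; later calls with the same argument
# are answered from the cache.
module_available("json")
module_available("json")
print(module_available.cache_info())
```

The trade-off raised in the discussion follows directly from this: once a result is cached, installing the package in the running interpreter will not change the returned value unless the cache is reset with `module_available.cache_clear()`.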