Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Torch 2.5.0 crashes ocf-datapipes and fails testing #263

Closed
AUdaltsova opened this issue Oct 18, 2024 · 4 comments
Closed

Torch 2.5.0 crashes ocf-datapipes and fails testing #263

AUdaltsova opened this issue Oct 18, 2024 · 4 comments
Labels
bug Something isn't working

Comments

@AUdaltsova
Copy link
Contributor

Describe the bug

Getting this error from CI tests on a pull request:

    from torch.utils.data.datapipes.iter.combining import T_co, _ChildDataPipe, _ForkerIterDataPipe
E   ImportError: cannot import name 'T_co' from 'torch.utils.data.datapipes.iter.combining' (/opt/hostedtoolcache/Python/3.11.10/x64/lib/python3.11/site-packages/torch/utils/data/datapipes/iter/combining.py)

CI is running with torch 2.5.0:

Requirement already satisfied: torch>=2.0.0 in /opt/hostedtoolcache/Python/3.11.10/x64/lib/python3.11/site-packages (from PVNet==3.0.59) (2.5.0)

As far as I can tell due to this commint in torch 2.5.0 that refactored const T_co to _T_co, causing import to fail. We can refactor ocf-datapipes in the future or just put the requirement torch < 2.5.0 in PVNet for now

The last CI checks that ran before mine ran into this were here and this run loaded torch 2.4.1:

Requirement already satisfied: torch>=2.0.0 in /opt/hostedtoolcache/Python/3.10.15/x64/lib/python3.10/site-packages (from PVNet==3.0.58) (2.4.1)
@AUdaltsova AUdaltsova added the bug Something isn't working label Oct 18, 2024
@AUdaltsova
Copy link
Contributor Author

I've added the requirement to this PR and it started passing the CI tests (uses torch 2.4.1 now). I'm hesitant to refactor ocf-datapipes as I'm not sure what function these parts serve and the refactor should probably involve a bit more looking at the code, so what do we think about the requirement change for now? @peterdudfield @Sukh-P

Has occurred to me that even after refactor we'll need to change dependencies as depending on const names either 2.4.1- or 2.5.0+ are supported, they seem to be mutually exclusive on this

@peterdudfield
Copy link
Contributor

Yea I think its ok to pin for the moment. You can put an future issue in to upgrade pvnet to 2.5.0, and see if someone else wants to do it

@Sukh-P
Copy link
Member

Sukh-P commented Oct 18, 2024

Thanks for investigating this!

So my two cents is that this seems very much linked to the torch datapipes/ocf_datapipes and is only causing an issue in PVNet because of the current ocf_datapipes dependency which we plan to remove when ocf-data-sampler has feature parity with ocf_datapipes. Given this, would it be cleaner to pin torch in ocf_datapipes instead? In the short term it makes no difference as torch 2.4.1 would be chosen by the dependency resolver in PVNet but when ocf_datapipes is removed it means we don't have to change the torch pinning in PVNet again, happy with either way though

@AUdaltsova
Copy link
Contributor Author

Thanks for investigating this!

So my two cents is that these seems very much linked to the torch datapipes/ocf_datapipes and is only causing an issue in PVNet because of the current ocf_datapipes dependency which we plan to remove when ocf-data-sampler has feature parity with ocf_datapipes. Given this, would it be cleaner to pin torch in ocf_datapipes instead? In the short term it makes no difference as torch 2.4.1 would be chosen by the dependency resolver in PVNet but when ocf_datapipes is removed it means we don't have to change the torch pinning in PVNet again, happy with either way though

Yes very good point, just saw that this was a copy-over to support datapipes tooling specifically, so yes, probably best to pinon ocf_datapipes as shoud not affect dsampler

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working
Projects
None yet
Development

No branches or pull requests

3 participants