linux torch builds report an incompatible version string #315
It's not up to me, but I'm going to be the devil's advocate here: I don't understand why this version string is a problem, since packaging parses it just fine:
>>> packaging.version.Version(importlib.metadata.version("torch"))
<Version('2.5.1.post108')>
I would suggest reporting that anyway.
We can try to align with the versions reported in the pip packages. The CPU version is
The CUDA version is
These seem to be semver parsable. But... I agree that semver parsability is not something we strive for; the argument here is rather "matching upstream behavior". In this case they are "aligned", so we would have
But... I swear I tried to match PyPI behavior when I first saw this. So as next steps:
We could... but we would need a champion to investigate this on various platforms, since PyTorch compilation behavior has been hard to predict.
For the record, I can play with it later this week, if you want me to. If we were to go back to 2.4.x, it would probably make sense to backport the fix too.
Those two small fixes would be good to bundle together.
The ask about libtorch_python.so was also from myself. I have two (rather ugly) workarounds for both, so I don't rely on you guys backporting the fix.
Thinking about this, I agree. But to be fair, this package was the very first package to fail the semver test, and our environment is rather large with lots of exotic packages. So let's rephrase it to "matching upstream behavior", and I guess that would be useful.
I created a
Here's the fix:
diff --git a/recipe/build.sh b/recipe/build.sh
index 3bf2963..9593991 100644
--- a/recipe/build.sh
+++ b/recipe/build.sh
@@ -63,7 +63,9 @@ CMAKE_FIND_ROOT_PATH+=";$SRC_DIR"
unset CMAKE_INSTALL_PREFIX
export TH_BINARY_BUILD=1
export PYTORCH_BUILD_VERSION=$PKG_VERSION
-export PYTORCH_BUILD_NUMBER=$PKG_BUILDNUM
+# Always pass 0 to avoid appending ".post" to version string.
+# https://github.com/conda-forge/pytorch-cpu-feedstock/issues/315
+export PYTORCH_BUILD_NUMBER=0
export INSTALL_TEST=0
export BUILD_TEST=0
diff --git a/recipe/bld.bat b/recipe/bld.bat
index 30cc5d4..85d8fd2 100644
--- a/recipe/bld.bat
+++ b/recipe/bld.bat
@@ -2,7 +2,9 @@
set TH_BINARY_BUILD=1
set PYTORCH_BUILD_VERSION=%PKG_VERSION%
-set PYTORCH_BUILD_NUMBER=%PKG_BUILDNUM%
+rem Always pass 0 to avoid appending ".post" to version string.
+rem https://github.com/conda-forge/pytorch-cpu-feedstock/issues/315
+set PYTORCH_BUILD_NUMBER=0
if "%pytorch_variant%" == "gpu" (
set build_with_cuda=1
It's a bit weird to be changing that, given that it's clearly been intended to be used like that upstream. I'm not opening a PR so as not to trigger another CI run; @h-vetinari can probably merge it into one of the existing PRs before merging (or retrying CI again). With it applied, I can confirm:
>>> importlib.metadata.version("torch")
'2.5.1'
(edited the diff to add the Windows change as well)
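For background on why zeroing PYTORCH_BUILD_NUMBER removes the suffix: upstream derives the reported version from these two environment variables and only appends ".post<build number>" when the build number is greater than 1. A minimal sketch of that logic in Python (my paraphrase from memory, not the actual PyTorch source, so details may differ):

import os

def compute_torch_version() -> str:
    # Paraphrase of how torch.__version__ can end up with a ".post" suffix.
    version = os.environ.get("PYTORCH_BUILD_VERSION", "2.5.1")
    build_number = int(os.environ.get("PYTORCH_BUILD_NUMBER", "0"))
    # Build numbers above 1 are appended as a post-release marker;
    # 0 (or 1) leaves the plain "X.Y.Z" string intact.
    if build_number > 1:
        version += ".post" + str(build_number)
    return version

So a conda build number like 108 or 300 yields "2.5.1.post108" / "2.4.1.post300", while PYTORCH_BUILD_NUMBER=0 keeps the bare version.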
Thanks a lot for figuring this out! I'd suggest picking this into the next PR; I'm hoping the current CI in #305 passes and we can merge that as-is.
Do we want to instead match upstream's crazy version strings?
The patch above avoids the .post suffix; the local version specifiers are a separate question. I don't see harm in amending the version strings to match upstream's wheels, but I also don't think it's worth the effort of patching to get to that state.
Agreed.
This is part of #316, and is only waiting for the open-gpu server to come back online.
Closing as fixed as part of #316.
A PR with the associated fixes targeted to the v2.4.x branch will be considered.
Sorry, does that mean you'll be doing it, or that you will consider accepting it when somebody else does? I.e., should I add it to my TODO? :-)
No (to doing it myself), yes (to accepting a PR from somebody else).
I can't answer the "should", but hopefully the answers to the other two questions can guide you on how you spend your own time ^_^
But I also want to encourage others to try as well; it doesn't have to be you.
Well, I have a build setup, resources, and an idea of what needs to be done :-).
@hmaarrfk, @h-vetinari, I see that 2.4.1 was built for CUDA 11.8 and 12.0. Should I modify the recipe to force building for 12.0 again or just update it for 12.6?
No opinion.
I'm going to stay with 12.0 for now, without rerendering, since apparently the branch was somehow made like that.
I'd build it for 11.8 and 12.6 (if you're asking for my preference).
Just FYI: sometimes our tooling will break. There are still some parts of the build process that will pull in the latest versions, so it's not always guaranteed that this works.
In conda-forge, we were originally building with CUDA 12.0 to provide the broadest range of support for CUDA 12 minor versions. After discussion in issue conda-forge/conda-forge-pinning-feedstock#6630, we decided to move to CUDA 12.x, where x is the latest minor version (currently 12.6).
... and part of that discussion was that the move to 12.6 doesn't reduce support in any meaningful way (otherwise I wouldn't have suggested it for the 2.4 builds).
Yes. I wasn't trying to move the decision one way or the other; I just got the impression that Michal was confused about why the change occurred, so I wanted to provide context. Thank you for filling in the bits I missed above 🙏
Solution to issue cannot be found in the documentation.
Issue
When installing pytorch from conda-forge, the version reported through importlib doesn't match the version of the conda-forge package: it contains a version suffix. In the case of 2.4.1, the version reported is:
2.4.1.post300
This causes a number of issues. In "simple" cases, the problem is that semver cannot parse this string.
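(The original traceback isn't preserved in this copy of the issue; a minimal reproduction, assuming the semver 3.x Python API, would look like this:)

import semver

try:
    semver.Version.parse("2.4.1.post300")
except ValueError as exc:
    # ".post300" is not part of the MAJOR.MINOR.PATCH[-prerelease][+build] grammar,
    # so semver raises ValueError here.
    print("semver rejects it:", exc)

print(semver.Version.parse("2.4.1"))  # the plain release string parses fine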
It can also cause issues installing pip packages that depend on specific torch versions. For example, we ran into a case where the package requirements were torch >2.4,<=2.4.1. This fails with this version string.
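To illustrate the failing requirement with the packaging library (which pip uses for version matching; this is a sketch from my own checking, not output taken from the report):

from packaging.version import Version
from packaging.specifiers import SpecifierSet

spec = SpecifierSet(">2.4,<=2.4.1")

# The post-release suffix sorts after 2.4.1, so the upper bound excludes it.
print(Version("2.4.1.post300") in spec)  # False
# The plain release satisfies both bounds.
print(Version("2.4.1") in spec)          # True
# A local tag (e.g. "+cpu", as upstream wheels use) is ignored in ordered
# comparisons, so it would not break this requirement.
print(Version("2.4.1+cpu") in spec)      # True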
It would be fantastic if the conda package could report just 2.4.1, assuming this is feasible.
Knowing that backports are hard, I am fine with this being fixed just for 2.5.