Move libtorch bits to PREFIX & enable cuDSS, cuSPARSELt, CUPTI on win; fix duplicate .pyc versions #328

Open · wants to merge 41 commits into main

Conversation

@mgorny (Contributor) commented Jan 19, 2025

Checklist

  • Used a personal fork of the feedstock to propose changes
  • Bumped the build number (if the version is unchanged)
  • Reset the build number to 0 (if the version changed)
  • Re-rendered with the latest conda-smithy (Use the phrase @conda-forge-admin, please rerender in a comment in this PR for automated rerendering)
  • Ensured the license file is being packaged.

Fixes #319
Fixes #320
Fixes #321
Fixes #327
Fixes #329

@conda-forge-admin (Contributor) commented Jan 19, 2025

Hi! This is the friendly automated conda-forge-linting service.

I just wanted to let you know that I linted all conda-recipes in your PR (recipe/meta.yaml) and found it was in an excellent condition.

I do have some suggestions for making it better though...

For recipe/meta.yaml:

  • ℹ️ The recipe is not parsable by parser conda-souschef (grayskull). This parser is not currently used by conda-forge, but may be in the future. We are collecting information to see which recipes are compatible with grayskull.
  • ℹ️ The recipe is not parsable by parser conda-recipe-manager. The recipe can only be automatically migrated to the new v1 format if it is parseable by conda-recipe-manager.

This message was generated by GitHub Actions workflow run https://github.com/conda-forge/conda-forge-webservices/actions/runs/12939294964. Examine the logs at this URL for more detail.

@h-vetinari (Member)

Would also be good to address #327 in this PR.

@hmaarrfk marked this pull request as draft — January 19, 2025 23:31
@hmaarrfk (Contributor)

(draft until all builds are enabled again)

@h-vetinari (Member)

So the extra CUDA deps on windows seem to work fine 🥳

@mgorny (Contributor, Author) commented Jan 20, 2025

Okay, I'll try adding some logic for that here later today.

@mgorny (Contributor, Author) commented Jan 20, 2025

Ok, I've decided to instead start by adding error handling to .bat, to see which of the existing commands fail.

By the way, any chance you could give me the power to run CI myself?

@h-vetinari (Member)

By the way, any chance you could give me the power to run CI myself?

You already should have AFAIU, and I don't have the power to change it further. This should be a matter of Quansight/open-gpu-server#52 plus a sync of the setup to https://github.com/conda-forge/.cirun (which happened), so I'm really not sure where this goes wrong. Pinging @jaimergp & @aktech

One conceivable angle is that the lack of authorization for the windows agents prevents you from starting any job at all (though IIRC you already weren't able to start things before we merged windows support). For that, you could make a PR like conda-forge/.cirun#27

@h-vetinari (Member) left a review comment

I'd prefer to add

if %ERRORLEVEL% neq 0 exit 1

to calls that can fail, rather than adding || goto error everywhere. There's no functional difference AFAIK, and although both styles have their pros & cons, the goto-variant is quite rare in conda-forge (a quick search shows 170 feedstocks using goto, while ~1.6k feedstocks use an errorlevel variant).
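
For illustration, a minimal sketch of the two styles side by side (the build command is a placeholder, not taken from the actual bld.bat):

rem errorlevel style (the common one in conda-forge): check right after each fallible call
cmake --build . --target install
if %ERRORLEVEL% neq 0 exit 1

rem goto style: jump to a shared error label on failure
cmake --build . --target install || goto :error
exit /b 0
:error
exit /b 1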

Review comment on recipe/bld.bat (outdated, resolved)
@aktech commented Jan 20, 2025

@mgorny You need to add your name here: https://github.com/conda-forge/.cirun/blob/218150c90899c7dfeb771d0bd9b176ca191848f3/.access.yml#L89, like in this PR: conda-forge/.cirun#27. This is subject to @wolfv's approval, I believe, as they (prefix.dev) are the ones sponsoring the Windows machine.

And you need to be in any of these roles for this repo to be able to trigger workflows that involve GPU CI runs. cc @jaimergp

@h-vetinari (Member) left a review comment

We're going to need existence tests for the DLLs to be under %LIBRARY_BIN%.

The tests are already there, and just need to be modified, e.g.

- if not exist %SP_DIR%\torch\lib\{{ each_lib }}.dll exit 1 # [win]

(similar below)
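
Presumably the modification is just a matter of swapping the directory, e.g. (a sketch that keeps the existing Jinja loop variable and selector):

- if not exist %LIBRARY_BIN%\{{ each_lib }}.dll exit 1  # [win]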

@h-vetinari (Member)

Thanks for the quick response @aktech!

and you need to be in any of these roles for this repo to be able to trigger workflows that involves GPU CI runs.

Ah, perhaps it's just a question of making @mgorny a maintainer here? That's a hypothesis we can test easily (given your involvement here, I'm assuming you have nothing against this @mgorny) 😉

@h-vetinari (Member)

By the way, any chance you could give me the power to run CI myself?

You already should have AFAIU, and I don't have the power to change it further.

Well, perhaps I did have the power after all: 4ff92b3 ;-)

We'll find out on your next push here!

@mgorny (Contributor, Author) commented Jan 20, 2025

Ok, I'll replace || goto :error with if errorlevel checks, then push and take a break.

@h-vetinari (Member) commented Jan 20, 2025

We'll find out on your next push here!

Sorry, I was talking about the linux builds. For the windows agents, we'll definitely still need to wait for conda-forge/.cirun#32

@h-vetinari (Member)

Ok, I'll replace || goto :error with if errorlevel checks

You're probably aware, but just for completeness (and because batch is an insane language): if errorlevel is a very different construct from if %errorlevel%. In particular, the more "advanced" relational operators like NEQ are not supported; more details
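
For illustration, a minimal sketch of the difference (the command is a placeholder):

rem "if errorlevel 1" is true whenever the exit code is greater than or equal to 1
ninja install
if errorlevel 1 exit 1

rem "if %ERRORLEVEL% neq 0" expands the variable and allows comparison operators like NEQ
ninja install
if %ERRORLEVEL% neq 0 exit 1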

@mgorny (Contributor, Author) commented Jan 20, 2025

Ah, okay, sorry. In either case, I really need to take a break now, so maybe it'll be done when I get back.

@h-vetinari (Member)

Ah, okay, sorry. In either case, I really need to take a break now, so maybe it'll be done when I get back.

No worries. Perhaps the PR will even be merged by then. ;-)

When you do get back, please squash f9fe7bf & ad726bc to avoid the back-and-forth.

@h-vetinari force-pushed the more-win branch 2 times, most recently from 2934eba to d569bbb — January 23, 2025 03:12
@h-vetinari (Member)

Well, that last commit to see the internals of the failed call to ninja was a wash. Not sure if this is just missing a flush, but in any case, it's not obvious where this is going wrong. I looked at the sources of the two failing tests, and it's not obvious what issues they could encounter - certainly it doesn't look like they're using exotic symbols, but I guess we can always run into missing transitive dependencies that require some of the static libs we've removed (if that is even the problem).

@h-vetinari (Member)

OK, so I downloaded the previous pytorch cpu build, set up the test environment, and was able to run both tests successfully

(test) E:\upstreams\pytorch>python test\test_autograd.py TestAutograd.test_multi_grad_all_hooks
Using C:\Users\[...]\AppData\Local\torch_extensions\torch_extensions\Cache\py312_cpu as PyTorch extensions root...
Emitting ninja build file C:\Users\[...]\AppData\Local\torch_extensions\torch_extensions\Cache\py312_cpu\test_autograd_cpp_node\build.ninja...
Building extension module test_autograd_cpp_node...
Allowing ninja to set a default number of workers... (overridable by setting the environment variable MAX_JOBS=N)
[1/2] cl /showIncludes -DTORCH_EXTENSION_NAME=test_autograd_cpp_node -DTORCH_API_INCLUDE_EXTENSION_H -IE:\miniforge\envs\test\Lib\site-packages\torch\include -IE:\miniforge\envs\test\Lib\site-packages\torch\include\torch\csrc\api\include -IE:\miniforge\envs\test\Lib\site-packages\torch\include\TH -IE:\miniforge\envs\test\Lib\site-packages\torch\include\THC -IE:\miniforge\envs\test\Include -D_GLIBCXX_USE_CXX11_ABI=0 /MD /wd4819 /wd4251 /wd4244 /wd4267 /wd4275 /wd4018 /wd4190 /wd4624 /wd4067 /wd4068 /EHsc /std:c++17 -c C:\Users\[...]\AppData\Local\torch_extensions\torch_extensions\Cache\py312_cpu\test_autograd_cpp_node\main.cpp /Fomain.o
Microsoft (R) C/C++ Optimizing Compiler Version 19.29.30158 for x64
Copyright (C) Microsoft Corporation.  All rights reserved.

[2/2] "C:\Program Files\Microsoft Visual Studio\2022\Community\VC\Tools\MSVC\14.29.30133\bin\HostX64\x64/link.exe" main.o /nologo /DLL c10.lib torch_cpu.lib torch.lib /LIBPATH:E:\miniforge\envs\test\Lib\site-packages\torch\lib torch_python.lib /LIBPATH:E:\miniforge\envs\test\libs /out:test_autograd_cpp_node.pyd
   Creating library test_autograd_cpp_node.lib and object test_autograd_cpp_node.exp
Loading extension module test_autograd_cpp_node...
.
----------------------------------------------------------------------
Ran 1 test in 10.331s

OK

(test) E:\upstreams\pytorch>python test\test_custom_ops.py TestCustomOp.test_autograd_function_backed_op
Using C:\Users\[...]\AppData\Local\torch_extensions\torch_extensions\Cache\py312_cpu as PyTorch extensions root...
Creating extension directory C:\Users\[...]\AppData\Local\torch_extensions\torch_extensions\Cache\py312_cpu\mylib...
Emitting ninja build file C:\Users\[...]\AppData\Local\torch_extensions\torch_extensions\Cache\py312_cpu\mylib\build.ninja...
Building extension module mylib...
Allowing ninja to set a default number of workers... (overridable by setting the environment variable MAX_JOBS=N)
[1/2] cl /showIncludes -DTORCH_EXTENSION_NAME=mylib -DTORCH_API_INCLUDE_EXTENSION_H -IE:\miniforge\envs\test\Lib\site-packages\torch\include -IE:\miniforge\envs\test\Lib\site-packages\torch\include\torch\csrc\api\include -IE:\miniforge\envs\test\Lib\site-packages\torch\include\TH -IE:\miniforge\envs\test\Lib\site-packages\torch\include\THC -IE:\miniforge\envs\test\Include -D_GLIBCXX_USE_CXX11_ABI=0 /MD /wd4819 /wd4251 /wd4244 /wd4267 /wd4275 /wd4018 /wd4190 /wd4624 /wd4067 /wd4068 /EHsc /std:c++17 -c C:\Users\[...]\AppData\Local\torch_extensions\torch_extensions\Cache\py312_cpu\mylib\main.cpp /Fomain.o
Microsoft (R) C/C++ Optimizing Compiler Version 19.29.30158 for x64
Copyright (C) Microsoft Corporation.  All rights reserved.

[2/2] "C:\Program Files\Microsoft Visual Studio\2022\Community\VC\Tools\MSVC\14.29.30133\bin\HostX64\x64/link.exe" main.o /nologo /DLL c10.lib torch_cpu.lib torch.lib /LIBPATH:E:\miniforge\envs\test\Lib\site-packages\torch\lib torch_python.lib /LIBPATH:E:\miniforge\envs\test\libs /out:mylib.pyd
   Creating library mylib.lib and object mylib.exp
Loading extension module mylib...
.
----------------------------------------------------------------------
Ran 1 test in 12.550s

OK

Then I moved all the DLLs to %LIBRARY_PREFIX%\bin, deleted the torch_extensions cache, and ran it again - also ran fine.

Note though the /LIBPATH:E:\miniforge\envs\test\Lib\site-packages\torch\lib for the linker - IOW, this is where the test expects the .lib files. The reason our windows compilers don't check in %LIBRARY_LIB% by default is that this is only done when building packages (which isn't the case during the test phase). This is in contrast to what our unix compilers do, which also explains why moving everything to $PREFIX/lib works there.

For completeness, I also tried again after deleting the static libraries (not the import libraries for the DLLs), and again these two tests ran fine (so deleting fmt.lib wasn't the issue as I had feared).

I think the nicest approach here would be to modify our compiler activation on windows to look into %LIBRARY_LIB% by default. For that I opened conda-forge/vc-feedstock#90, let's see how that goes. If that is not acceptable for some reason, then the easiest will be to just move the DLLs, and leave the .libs in %SP_DIR%\torch\lib (even if that means we might have to patch downstream packages to look there specifically).
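
A minimal sketch of what the simpler fallback (moving only the binaries) could look like in bld.bat; the wildcard moves are an assumption, and the actual recipe may select files individually:

rem move the runtime DLLs next to the other libraries in %LIBRARY_BIN%
move %SP_DIR%\torch\lib\*.dll %LIBRARY_BIN%\
if %ERRORLEVEL% neq 0 exit 1

rem with conda-forge/vc-feedstock#90, the import libraries could go to %LIBRARY_LIB% as well
move %SP_DIR%\torch\lib\*.lib %LIBRARY_LIB%\
if %ERRORLEVEL% neq 0 exit 1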

@h-vetinari (Member)

OK, with conda-forge/vc-feedstock#90, I can now move all the .libs to %LIBRARY_LIB% and the tests still work. However, what didn't work was moving the headers to %LIBRARY_INC%. That's because the torch headers actually have a nested structure:

Lib/site-packages/torch/include/torch/csrc/api/include/torch/all.h
                        ^^^^^^^                ^^^^^^^
                     entry point 1          entry point 2

And just moving torch/csrc/api/include/torch/all.h to %LIBRARY_INC% will not resolve #include <torch/all.h>. Pytorch generally adds both paths to the includes, and so does the test infrastructure.

I don't understand how this works out on unix TBH (where we're moving the headers to $PREFIX/include, but not handling torch/csrc/api/include explicitly AFAICT). Maybe it somehow gets persisted in INTERFACE_INCLUDE_DIRECTORIES?

Anyway, I think I'll try patching this, unless someone has better ideas?

@mgorny (Contributor, Author) commented Jan 23, 2025

I don't understand how this works out on unix TBH (where we're moving the headers to $PREFIX/include, but not handling torch/csrc/api/include explicitly AFAICT). Maybe it somehow gets persisted in INTERFACE_INCLUDE_DIRECTORIES?

On Linux, we still have a bunch of headers in ${SP_DIR}/torch/include and symlinks for the directories moved to ${CONDA_PREFIX}/include.

Anyway, I think I'll try patching this, unless someone has better ideas?

Patching sounds good to me. For CMake, I guess changing TORCH_INSTALL_PREFIX should work. However, given that they explicitly repeat it in tests, I'm worried that other Python packages also hardwire these paths.
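
For reference, a Windows analogue of the Linux symlink trick would be to copy both header entry points into %LIBRARY_INC% — a sketch only, with the caveat that duplicating the headers costs package size, and not necessarily what this PR ends up doing:

rem entry point 1: the top-level include tree
robocopy %SP_DIR%\torch\include %LIBRARY_INC% /E
if %ERRORLEVEL% geq 8 exit 1
rem entry point 2: the nested api include tree (robocopy exit codes below 8 indicate success)
robocopy %SP_DIR%\torch\include\torch\csrc\api\include %LIBRARY_INC% /E
if %ERRORLEVEL% geq 8 exit 1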

@h-vetinari (Member)

symlinks for the directories moved to ${CONDA_PREFIX}/include.

Ah right, forgot the symlink!

For CMake, I guess changing TORCH_INSTALL_PREFIX should work. However, given that they explicitly repeat it in tests, I'm worried that other Python packages also hardwire these paths.

We might need to do both? For now I'd let this run through and see what comes out. Feel free to push more fixes, I'll be offline for the next ~8h.

@mgorny (Contributor, Author) commented Jan 23, 2025

I guess it would make sense to remove the symlinks from Linux too, for consistency.

@hmaarrfk (Contributor)

I'm worried about moving too many files around without patching. I'm too busy in the coming weeks to read through the thread below, but here it is for reference:

pytorch/pytorch#144690

I'll remind us that the main goal of having the .so files in our shared library location is to spare downstream packages from having to think about whether or not they need to add something to the missing DSO whitelist.

@mgorny (Contributor, Author) commented Jan 23, 2025

Okay, I'm not going to touch that part and will try #330 instead.

Commit: Include Python in the build env also when building natively. Otherwise, we end up with Python 3.13 in the build env, and Python files are byte-compiled for both the target Python and 3.13.
@h-vetinari (Member)

Phew, I'm glad this finally passed. #329 still isn't solved, let me reinstate your a514e05.

I'm fine to keep the symlink on linux - it makes life easier, but yeah, the lack of symlinks on windows will probably bite us in a few ways here. I'm mostly keen to get this PR merged and check again what happens when building torchvision etc.

At least for now we've matched the design from the unix side pretty closely (modulo the symlinks) in how the libs are all in %LIBRARY_{LIB,BIN}%, and the ATen, c10, shm, torch headers are in %LIBRARY_INC%, while the rest stays in %SP_DIR%\torch\include.

@mgorny (Contributor, Author) commented Jan 23, 2025

Don't you want to enable the remaining Windows builds? I wanted to add removal of TH_BINARY_BUILD too. If that's fine, I'll push it along with a rerender to restore all build targets.

@h-vetinari (Member)

Don't you want to enable the remaining Windows builds?

Of course, but I wanted to solve all the current issues first, and also ask @hmaarrfk and @isuruf for review, if they have time. CC also @danpetry if interested.

I wanted to add removal of TH_BINARY_BUILD too. If that's fine, I'll push it along with rerender to restore all build targets.

How high is your confidence that this will work first try? I think this PR is already doing a lot, and I'd like to get it out so we can start building dependent packages and test out whether things work as intended. I wouldn't be surprised if follow-ups turn out to be necessary anyway, so I'd prefer to put TH_BINARY_BUILD into the next round.

@h-vetinari changed the title from "Attempt reenabling cuDSS, cuSPARSELt and CUPTI on Windows" to "Move libtorch bits to PREFIX & enable cuDSS, cuSPARSELt, CUPTI on win; fix duplicate .pyc versions" — January 23, 2025
@h-vetinari (Member)

FWIW, #329 is now fixed as well, so I think this is ready (modulo a revert of 3ebf1a9 + a rerender, which I'm holding off on for now, in case we need some more rounds after feedback).

@h-vetinari marked this pull request as ready for review — January 23, 2025 22:05
@h-vetinari (Member)

Never mind, I found an issue with the windows CUDA builds that needs fixing anyway, so I'm pushing a rerendered build, but will cancel everything but the windows+CUDA job. It'll also be good to see if CI on osx passes with the couple of relevant changes here.
