Decouple Filter MP Rules function from cuda imports #117

fabianlim · 2025-01-03T16:21:11Z

The recent PR #106 unintentionally introduced a filter_mp_rules function and placed it in models.utils. Unfortunately, that file imports CUDA kernels, which causes pip install failures on machines with no GPU. This is critical because it prevents building images on CI machines, which typically do not have GPUs.

Failure

#33 15.79 File "/usr/lib64/python3.11/ctypes/__init__.py", line 394, in __getitem__
#33 15.79 func = self._FuncPtr((name_or_ordinal, self))
#33 15.79 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
#33 15.79 AttributeError: /home/tuning/.local/lib/python3.11/site-packages/bitsandbytes/libbitsandbytes_cpu.so: undefined symbol: cdequantize_blockwise_fp32
..
#33 ERROR: process "/bin/sh -c if [[ \"${ENABLE_FMS_ACCELERATION}\" == \"true\" ]]; then python -m pip install --user \"$(head bdist_name)[fms-accel]\"; python -m fms_acceleration.cli install fms_acceleration_peft; python -m fms_acceleration.cli install fms_acceleration_foak; python -m fms_acceleration.cli install fms_acceleration_aadp; fi" did not complete successfully: exit code: 1
------
> importing cache manifest from docker-na-private.artifactory.swg-devops.com/wcp-ai-foundation-team-docker-virtual/sft-trainer-aim:release_ubi9_py311:
------
------
> importing cache manifest from sft-trainer-aim:release_ubi9_py311:
------
------

To fix this, we move filter_mp_rules out of models.utils so that it can be imported without pulling in the CUDA kernel imports. From what we have observed so far, this appears to solve the problem.
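The decoupling idea can be illustrated with a small, self-contained sketch. Everything in it is hypothetical and is not the code from this PR: the package name demo_pkg, the modules kernel_utils and rule_filters, and the filter_rules helper with its toy logic are stand-ins. The sketch only shows that a helper placed in a module without GPU-bound imports stays importable even when a sibling module fails to import.

```python
import importlib
import sys
import tempfile
import textwrap
from pathlib import Path

# Build a throwaway package on disk. All names below are hypothetical.
root = Path(tempfile.mkdtemp())
pkg = root / "demo_pkg"
pkg.mkdir()
(pkg / "__init__.py").write_text("")

# Stand-in for the GPU-bound module: importing it fails on a CPU-only machine.
(pkg / "kernel_utils.py").write_text(textwrap.dedent('''
    raise ImportError("simulated: CUDA kernels unavailable on this machine")
'''))

# The helper lives in its own module with no GPU imports at module level.
(pkg / "rule_filters.py").write_text(textwrap.dedent('''
    def filter_rules(rules, keep):
        """Illustrative logic only: keep rules whose 'kind' is in keep."""
        return [r for r in rules if r["kind"] in keep]
'''))

sys.path.insert(0, str(root))
importlib.invalidate_caches()

# Importing the decoupled helper works without touching kernel_utils.
rule_filters = importlib.import_module("demo_pkg.rule_filters")
kept = rule_filters.filter_rules(
    [{"kind": "attn"}, {"kind": "mlp"}], keep={"attn"}
)

# Importing the GPU-bound module still fails, as it would in a CPU-only CI job.
try:
    importlib.import_module("demo_pkg.kernel_utils")
    gpu_import_ok = True
except ImportError:
    gpu_import_ok = False
```

A check of this shape could run in a CPU-only CI job to catch a helper that has drifted back into a module with GPU-bound imports, assuming the real module paths are substituted for the hypothetical ones.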

Signed-off-by: Yu Chin Fabian Lim <[email protected]>

Abhishek-TAMU

The changes moving filter_mp_rules out of .models.utils look good. This change also fixes the issue, and the image build now passes in GitHub Actions.

move filter_mp functions out

614e7a8

Signed-off-by: Yu Chin Fabian Lim <[email protected]>

fabianlim requested a review from Abhishek-TAMU January 3, 2025 16:24

Abhishek-TAMU approved these changes Jan 3, 2025


fabianlim merged commit f08a886 into main Jan 3, 2025
7 checks passed

fabianlim deleted the fix/foak branch January 3, 2025 16:32
