No access to rocminfo in a production environment - ability to manually set GPU arch. #1444

Open
isaranto opened this issue Dec 11, 2024 · 1 comment
Labels: AMD integration, contributions-welcome, cross-platform

Comments

@isaranto

System Info

Working on a Kubernetes deployment with Debian + PyTorch 2.4.0 + ROCm 6.1.
The deployment uses the multi-backend alpha release available in the parent bitsandbytes repo.

Reproduction

Trying to load a model with bitsandbytes fails because there is no access to rocminfo.

def get_rocm_gpu_arch() -> str:
    logger = logging.getLogger(__name__)
    try:
        if torch.version.hip:
            result = subprocess.run(["rocminfo"], capture_output=True, text=True)
            match = re.search(r"Name:\s+gfx([a-zA-Z\d]+)", result.stdout)

ERROR:bitsandbytes.cuda_specs:Could not detect ROCm GPU architecture: [Errno 2] No such file or directory: 'rocminfo'
WARNING:bitsandbytes.cuda_specs:
ROCm GPU architecture detection failed despite ROCm being available.

https://github.com/ROCm/bitsandbytes/blob/4aad810bc1d93c38a5316ec54c822cd12b1f1cd2/bitsandbytes/cuda_specs.py#L54

Expected behavior

I would prefer to set the architecture via an environment variable, with rocminfo as the fallback option when the env var is not set.
The related code snippet is linked above.
Happy to work on this if other people feel it is a good workaround.
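
A minimal sketch of what this could look like, to make the proposal concrete. The environment variable name `BNB_ROCM_GPU_ARCH`, the helper `parse_rocminfo_arch`, and the `"unknown"` fallback value are assumptions for illustration, not existing bitsandbytes settings or code. The `torch.version.hip` check from the original function is left out so the example runs without PyTorch.

```python
import logging
import os
import re
import shutil
import subprocess
from typing import Mapping, Optional

logger = logging.getLogger(__name__)

# Hypothetical variable name, chosen for illustration only.
ROCM_ARCH_ENV_VAR = "BNB_ROCM_GPU_ARCH"


def parse_rocminfo_arch(rocminfo_output: str) -> Optional[str]:
    """Extract the first gfx architecture name, using the regex quoted in this issue."""
    match = re.search(r"Name:\s+gfx([a-zA-Z\d]+)", rocminfo_output)
    return "gfx" + match.group(1) if match else None


def get_rocm_gpu_arch(env: Optional[Mapping[str, str]] = None) -> str:
    """Return the arch from the env var if set, otherwise fall back to rocminfo."""
    env = os.environ if env is None else env

    # 1. An explicit override wins, so rocminfo is never invoked.
    override = env.get(ROCM_ARCH_ENV_VAR, "").strip()
    if override:
        return override

    # 2. Fallback: only call rocminfo when the binary is on PATH.
    if shutil.which("rocminfo") is None:
        logger.warning("rocminfo not found and %s is not set", ROCM_ARCH_ENV_VAR)
        return "unknown"  # assumed placeholder value

    try:
        result = subprocess.run(["rocminfo"], capture_output=True, text=True)
    except OSError as exc:
        logger.error("Could not detect ROCm GPU architecture: %s", exc)
        return "unknown"
    return parse_rocminfo_arch(result.stdout) or "unknown"
```

With something like this, a deployment without rocminfo could set the variable in the pod spec (for example to `gfx90a`), and environments that do have rocminfo would behave as they do today.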

@matthewdouglas
Member

cc: @pnunna93

@matthewdouglas added the contributions-welcome, AMD integration, and cross-platform labels on Dec 13, 2024