Unable to use multiple GPUs – CUDA Out of Memory issue #105

Open
Devloper-RG opened this issue Sep 18, 2024 · 4 comments

@Devloper-RG commented Sep 18, 2024

While using other models like meta-llama/Meta-Llama-3.1-8B-Instruct, I'm encountering a torch.OutOfMemoryError when trying to load the model on multiple GPUs. I have 4 GPUs, each with 14.57 GiB of memory, but the model fails to allocate memory on GPU 0, even though the other GPUs should share the load.
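A minimal sketch of this failure mode (hypothetical, since the original code wasn't posted): an 8B-parameter model in float32 needs roughly 32 GiB for its weights alone, so any load path that moves the whole model onto a single 14.57 GiB card will raise torch.OutOfMemoryError.

```python
# Hypothetical sketch of the failure mode, not the actual repo code.
import torch
from transformers import AutoModelForCausalLM

model = AutoModelForCausalLM.from_pretrained(
    "meta-llama/Meta-Llama-3.1-8B-Instruct"  # ~32 GiB of float32 weights
)
model.to("cuda")  # "cuda" resolves to cuda:0; the other GPUs stay idle
# -> torch.OutOfMemoryError: CUDA out of memory (GPU 0)
```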

@eustlb (Collaborator) commented Sep 18, 2024

Hey @Devloper-RG, thanks for raising this issue and testing the lib in a multi-GPU setup 🙏
I'd be glad to help with that; can you provide a reproducer?

eustlb self-assigned this Sep 18, 2024
@andimarafioti (Member) commented:

I guess the issue here is that we are pushing to `cuda` as a device, which resolves to a single GPU (cuda:0).
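If that diagnosis is right, the fix would be to let the weights be sharded across GPUs instead of moved wholesale. A minimal sketch using the standard transformers/accelerate path (assuming the repo loads the model via `from_pretrained`; `device_map="auto"` requires the `accelerate` package):

```python
import torch
from transformers import AutoModelForCausalLM

# Sharded load: accelerate assigns layers to every visible GPU instead of
# placing the whole model on cuda:0.
model = AutoModelForCausalLM.from_pretrained(
    "meta-llama/Meta-Llama-3.1-8B-Instruct",
    torch_dtype=torch.float16,
    device_map="auto",
)
print(model.hf_device_map)  # shows which layers landed on which GPU
```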

@Devloper-RG (Author) commented Sep 19, 2024

Hey @eustlb, thanks for getting back to me!

I made some modifications to the code to use the meta-llama/Meta-Llama-3.1-8B-Instruct model by updating the arguments_classes/language_model_arguments.py script. I also adjusted the LLM/language_model.py script so the model could be loaded from Hugging Face.

I then ran the server on a Google Cloud Platform (GCP) VM with 2 NVIDIA T4 GPUs. During testing, I noticed that one of the GPUs consistently overloaded, leading to a torch.OutOfMemoryError.

I tried using the DataParallel method, but it didn’t resolve the issue. I also attempted to run the model in lower precision, which worked on a single GPU, but I’d like to use higher precision models and fully leverage multiple GPUs for better performance.

Any help with getting multi-GPU support working would be greatly appreciated!
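For what it's worth, torch.nn.DataParallel replicates the full model on every GPU, so it cannot reduce per-GPU memory; splitting the layers across devices (model parallelism) is what helps here. A hedged sketch for the 2× T4 setup (the max_memory caps are illustrative, not values from this thread):

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "meta-llama/Meta-Llama-3.1-8B-Instruct"
tokenizer = AutoTokenizer.from_pretrained(model_name)

# float16 halves the weight footprint (~16 GiB for 8B params); capping each
# GPU below its 14.57 GiB ceiling leaves headroom for activations and KV cache.
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    torch_dtype=torch.float16,
    device_map="auto",
    max_memory={0: "13GiB", 1: "13GiB"},  # illustrative caps for two T4s
)

inputs = tokenizer("Hello!", return_tensors="pt").to(model.device)
output = model.generate(**inputs, max_new_tokens=20)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```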

@andimarafioti (Member) commented:

Hi @Devloper-RG, if you can give a snippet with some reproducible code, that would be very helpful. Otherwise, we can't know what your issue is. We have run this on a setup with multiple GPUs without any problems.
