Replies: 3 comments 13 replies
-
Hi @sridhar, what happens if you try running this code? It's possible that the code exits with an error, which is why you would see no process running on the GPU. Looking at your GPU memory usage, it might have to do with your large batch size (the default size is …
-
The code is running (for a day now). I can see all the tqdm status bars and all the batches being processed. It just runs on a single CPU core :(
-
Thanks for such a detailed report @sridhar - the results you get are expected. If you look at the code for negative mining, it really shouldn't use much GPU. You could increase the batch size (say, 16 or 32), but I doubt you can load the GPU much more than you currently do. I am not sure why your adaptation runs for more than a day when the example adaptation above, on 10k documents, takes roughly 15 minutes end-to-end. What happens if you shrink your corpus to 10k documents?
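The comparison suggested above (shrinking the corpus to 10k documents to see whether the runtime matches the example) can be sketched like this. `subsample_documents` is a hypothetical helper written for illustration, not part of Haystack:

```python
import random


def subsample_documents(documents, k=10_000, seed=42):
    """Return a reproducible random subset of at most k documents.

    Using a fixed seed makes the timing comparison repeatable
    across runs with the same corpus.
    """
    if len(documents) <= k:
        return list(documents)
    rng = random.Random(seed)
    return rng.sample(documents, k)


# Example: shrink a 25k-document corpus to 10k before running GPL.
docs = [f"doc-{i}" for i in range(25_000)]
subset = subsample_documents(docs)
print(len(subset))  # 10000
```

If adaptation on the 10k subset finishes in roughly the expected 15 minutes, the slowdown likely scales with corpus size rather than being a GPU-scheduling issue.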
-
Hi all,
I'm trying to train an EmbeddingRetriever using GPL on a Google Cloud VM with an NVIDIA Tesla A100. This is on a 12-core Xeon box.
I can see that the rest of the code is being scheduled on the GPU (tokenizer, QA generation, etc.); however, PseudoLabelGenerator is not. Here's the code that uses it. Am I missing something here?
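For reference, a minimal sketch of the GPL flow being discussed, based on Haystack v1's `PseudoLabelGenerator` API. The model names, document store choice, and `batch_size` value below are assumptions for illustration, not taken from this thread:

```python
# Sketch: GPL domain adaptation of an EmbeddingRetriever (Haystack v1 API).
from haystack.document_stores import InMemoryDocumentStore
from haystack.nodes import EmbeddingRetriever, PseudoLabelGenerator

document_store = InMemoryDocumentStore()
# ... write your corpus into the document store first ...

retriever = EmbeddingRetriever(
    document_store=document_store,
    embedding_model="sentence-transformers/msmarco-distilbert-base-tas-b",  # assumed model
    model_format="sentence_transformers",
    use_gpu=True,
)

# PseudoLabelGenerator generates pseudo queries, mines hard negatives,
# and scores (query, positive, negative) triples with a cross-encoder.
# As noted in the replies, negative mining itself is not GPU-heavy.
psg = PseudoLabelGenerator(
    question_producer="doc2query/msmarco-t5-base-v1",  # assumed query-generation model
    retriever=retriever,
    batch_size=32,  # the reply above suggests trying 16/32
    use_gpu=True,
)
output, _ = psg.run(documents=document_store.get_all_documents())

# Fine-tune the retriever on the mined GPL labels.
retriever.train(output["gpl_labels"])
```

Even with `use_gpu=True`, only the model forward passes (query generation, encoding, cross-encoder scoring) run on the GPU; the surrounding mining bookkeeping is CPU-bound, which matches the behavior described above.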