Save Model after every X Number of Iterations #2090
-
Hello! Hoping anyone can help me out. I am currently working on what I would consider a very large topic model (over 1 million large documents), and it currently takes many hours to run on my university's HPC cluster. If something fails in the middle (or I hit the walltime limit), I would like to be able to pick up training where I left off. So, is there a way to save the model every X number of iterations? I have looked through the discussions and documentation but can't seem to find a good way to make this happen. I am also happy to try to make this change myself if need be. Thanks in advance!
Replies: 1 comment 2 replies
-
If it's a bit over 1 million documents, you could use cuML instead, and I believe it would all run within the hour. Have you checked the guide on GPU acceleration? It demonstrates running BERTopic quite fast on, I believe, 1 million documents.

With respect to your question: there is no notion of "iterations" in BERTopic itself, since its runtime depends heavily on the underlying dimensionality reduction and clustering algorithms.
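Since there are no iterations to checkpoint, one partial workaround (my own suggestion, not a built-in BERTopic feature) is to cache the document embeddings, which are usually the most expensive step on large corpora. If a run dies, the resumed job can load the saved embeddings instead of recomputing them. A minimal sketch, where `EMB_PATH`, `get_embeddings`, and `embed_fn` are hypothetical names:

```python
import os
import numpy as np

EMB_PATH = "doc_embeddings.npy"  # hypothetical checkpoint path on shared storage

def get_embeddings(docs, embed_fn):
    """Load cached embeddings if a checkpoint exists; otherwise compute and save them.

    `embed_fn` is whatever embedding callable you use, e.g. a
    SentenceTransformer's `.encode` method.
    """
    if os.path.exists(EMB_PATH):
        # Resuming: skip the expensive embedding pass entirely.
        return np.load(EMB_PATH)
    embeddings = np.asarray(embed_fn(docs))
    np.save(EMB_PATH, embeddings)  # checkpoint before the rest of the pipeline
    return embeddings
```

On resume you can pass the cached array to `topic_model.fit_transform(docs, embeddings=embeddings)`, so only the (faster) dimensionality reduction and clustering steps are repeated.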