Pre-calculating and loading embedding vectors #806

avafor · 2022-10-25T16:24:14Z

Hi,
I would like to use BERTopic on the fly, so that users can get topics in a reasonable amount of time for a large dataset (over 10k documents). In that case it would make sense to calculate the embedding vectors in advance, save them somewhere, and, when a new query comes in, pass only the embedding vectors to BERTopic. Is such functionality already implemented?
Thank you very much.

MaartenGr · 2022-10-26T04:41:42Z

Maintainer

Yes, you can pre-calculate the embeddings for your documents and then supply them to BERTopic as follows:

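from bertopic import BERTopic

# docs: list of strings
# embeddings: pre-computed array with one row per document, in the same order as docs
topic_model = BERTopic()
topics, probs = topic_model.fit_transform(docs, embeddings)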
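
For the "save them somewhere" part of the question, here is a minimal sketch of one way to cache the vectors on disk with NumPy and reuse them on later requests. The helper names `load_or_compute_embeddings` and `fit_topics`, the cache path, and the `embed_fn` callable are hypothetical and not part of BERTopic; only `BERTopic()` and `fit_transform(docs, embeddings)` come from the answer above.

```python
from pathlib import Path

import numpy as np


def load_or_compute_embeddings(docs, cache_path, embed_fn):
    """Return embeddings for docs, reading them from cache_path when possible.

    embed_fn is any callable that maps a list of strings to a 2-D array-like
    with one row per document (hypothetical helper, supplied by the caller).
    """
    # np.save appends ".npy" when it is missing, so normalise the suffix up front.
    cache_path = Path(cache_path).with_suffix(".npy")

    if cache_path.exists():
        embeddings = np.load(cache_path)
        # Only trust the cache if it has exactly one row per document.
        if embeddings.shape[0] == len(docs):
            return embeddings

    # Cache miss (or stale cache): compute once and store for later requests.
    embeddings = np.asarray(embed_fn(docs), dtype=np.float32)
    np.save(cache_path, embeddings)
    return embeddings


def fit_topics(docs, embeddings):
    """Fit BERTopic on pre-computed embeddings, as in the answer above."""
    # Imported lazily so the caching helper can be used without bertopic installed.
    from bertopic import BERTopic

    topic_model = BERTopic()
    topics, probs = topic_model.fit_transform(docs, embeddings)
    return topic_model, topics, probs
```

As a usage example, assuming the sentence-transformers package is installed, `embed_fn` could be `SentenceTransformer("all-MiniLM-L6-v2").encode`, and the result of `load_or_compute_embeddings(docs, "embeddings.npy", embed_fn)` could then be passed to `fit_topics(docs, embeddings)`. Note that the row-count check is only a rough guard: it does not detect documents that were edited or reordered, so the cache should be rebuilt whenever `docs` changes.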

1 reply

avafor Oct 26, 2022
Author

Perfect! Thank you very much.
