How to view which rows comprise a topic #663

Answered by MaartenGr
ecksteing asked this question in Q&A

BERTopic follows sklearn's API to a certain extent: whenever you call transform on a set of documents, it returns the topics in the same order as the documents you passed in. Let's say you have the following code:

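from bertopic import BERTopic
from sklearn.datasets import fetch_20newsgroups

# Load the raw text of the 20 Newsgroups posts as a list of strings
docs = fetch_20newsgroups(subset='all', remove=('headers', 'footers', 'quotes'))['data']

# Fit the model; topics holds one topic id per document, in input order
topic_model = BERTopic()
topics, probs = topic_model.fit_transform(docs)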

Here, docs is the list of documents on which you train the model. Running .fit_transform(docs) returns the variable topics, which holds the topic assigned to each document. The topic in topics[0] corresponds to the document in docs[0], t…
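Because topics and docs share the same order, the rows that make up a topic can be recovered by pairing the two lists by position. The following is only a minimal sketch of that idea: group_docs_by_topic is a hypothetical helper (not part of BERTopic), and example_docs and example_topics are small stand-ins for the docs and topics variables from the snippet above, so the example runs without training a model.

```python
from collections import defaultdict


def group_docs_by_topic(docs, topics):
    """Map each topic id to the (row index, document) pairs assigned to it.

    Hypothetical helper, not a BERTopic function. It relies only on the
    positional correspondence topics[i] <-> docs[i].
    """
    if len(docs) != len(topics):
        raise ValueError("docs and topics must have the same length")
    grouped = defaultdict(list)
    for row, (doc, topic) in enumerate(zip(docs, topics)):
        grouped[topic].append((row, doc))
    return dict(grouped)


# Stand-in data (assumption): in practice, use docs and topics as returned
# by topic_model.fit_transform(docs).
example_docs = [
    "The team won the hockey game",
    "New graphics card drivers released",
    "Playoff schedule announced for the league",
    "asdf",
]
example_topics = [0, 1, 0, -1]  # BERTopic uses -1 for outlier documents

rows_by_topic = group_docs_by_topic(example_docs, example_topics)

# Print the row index and text of every document assigned to topic 0
for row, doc in rows_by_topic[0]:
    print(row, doc)
```

With real model output, the same call would be group_docs_by_topic(docs, topics), and the row indices point back into the original docs list.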

Answer selected by ecksteing