-
No. When you adjust any HDBSCAN parameter, BERTopic still models all N topics that come out of the clustering using c-TF-IDF. If you have set Do note, though, that HDBSCAN does not allow you to select the number of topics directly.
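To make the "clusters in, topics out" idea concrete, here is a rough pure-Python sketch of class-based TF-IDF. It is a simplification for illustration and not BERTopic's actual code: the function name `c_tf_idf_top_words`, the toy documents, and the hard-coded `labels` (standing in for whatever cluster labels HDBSCAN would return) are all made up. It assumes the weighting `tf * log(1 + A / f)`, where `A` is the average number of words per cluster and `f` is a word's frequency across all clusters.

```python
import math
from collections import Counter, defaultdict


def c_tf_idf_top_words(docs, labels, top_n=3):
    """Return the top_n highest-scoring words for every cluster label."""
    # Merge all documents of a cluster into a single bag of words.
    class_counts = defaultdict(Counter)
    for doc, label in zip(docs, labels):
        class_counts[label].update(doc.lower().split())

    # Frequency of each word across all clusters.
    total_counts = Counter()
    for counts in class_counts.values():
        total_counts.update(counts)

    # Average number of words per cluster (the "A" term).
    avg_words = sum(total_counts.values()) / len(class_counts)

    topics = {}
    for label, counts in class_counts.items():
        n_words = sum(counts.values())
        scores = {
            word: (freq / n_words) * math.log(1 + avg_words / total_counts[word])
            for word, freq in counts.items()
        }
        # Sort by descending score, breaking ties alphabetically.
        ranked = sorted(scores, key=lambda w: (-scores[w], w))
        topics[label] = ranked[:top_n]
    return topics


# Hypothetical input: five documents that a clusterer put into three clusters.
docs = [
    "cats purr cats sleep",
    "cats chase mice",
    "stocks rise stocks fall",
    "investors buy stocks",
    "rain falls rain stops",
]
labels = [0, 0, 1, 1, 2]

topics = c_tf_idf_top_words(docs, labels)
for label, words in sorted(topics.items()):
    print(label, words)
```

The point of the sketch is that one topic is produced per cluster label, so the number of topics follows from the clustering rather than being chosen separately.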
-
Thank you for the answer. One last question: if M > N (i.e., |
-
Hello. I have a quick question about selecting N topics. If I say that I want N topics from my original documents, am I saying, mathematically, that my documents are clustered into N clusters and that c-TF-IDF then selects the tokens with the highest values? If so, should the number of topics N always equal the number of clusters HDBSCAN generates? If I set HDBSCAN to find N clusters and request M topics, are the results compromised?