Replies: 1 comment 5 replies
I believe there might be some code missing, like |
I believe I successfully removed the stop words in the original model, as they do not appear in its topics. However, after reducing the outliers, all of those stop words reappear in the updated topics. It seems the stop word removal is not carried over to the outlier reduction step. I need some help with this!
Below is my code:
Then I used four different strategies to reduce the outliers.
In all cases, when I inspect the topics with `get_topics()`, the stop words that I removed in the original model through `vectorizer_model = CountVectorizer(ngram_range=(1, 2), stop_words=mylist)` reappear in the topics.
Is there any way I can remove these stopwords from the topics generated through outlier reduction? I don't want to remove them before generating embeddings because I'm concerned it might ruin the original meanings of each sentence.
Thanks for your time in advance.
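One possible cause, sketched under assumptions since the code itself is not shown here: if the topic representations are refreshed with `update_topics` after `reduce_outliers`, and no `vectorizer_model` is passed to that call, BERTopic (in the versions I am aware of) falls back to a default `CountVectorizer` with no stop word list, so the removed words come back. The helper below is hypothetical (its name and arguments are not part of BERTopic); it only wraps the two library calls and assumes `topic_model` is an already fitted BERTopic model and `vectorizer_model` is the same `CountVectorizer` that was used when creating it.

```python
def reduce_outliers_keep_vectorizer(topic_model, docs, topics, vectorizer_model,
                                    strategy="c-tf-idf", **reduce_kwargs):
    """Hypothetical helper: reassign outliers, then rebuild topic words
    with the original vectorizer so its stop word list still applies.

    Assumptions: `topic_model` is a fitted BERTopic instance, `topics` are the
    topic assignments returned by fit_transform, and `vectorizer_model` is the
    CountVectorizer (with stop_words=mylist) passed to BERTopic originally.
    """
    # Reassign documents labelled -1 using the chosen strategy.
    # Extra keyword arguments (e.g. probabilities=..., threshold=...) are
    # forwarded unchanged to reduce_outliers.
    new_topics = topic_model.reduce_outliers(
        docs, topics, strategy=strategy, **reduce_kwargs
    )

    # Recompute the topic representations for the new assignments.
    # Passing vectorizer_model explicitly avoids the fallback to a default
    # CountVectorizer, which has no stop word list.
    topic_model.update_topics(
        docs, topics=new_topics, vectorizer_model=vectorizer_model
    )
    return new_topics
```

Usage would look like `new_topics = reduce_outliers_keep_vectorizer(topic_model, docs, topics, vectorizer_model, strategy="embeddings")`, followed by `topic_model.get_topics()` to check whether the stop words are gone. Because the vectorizer only affects how topic words are extracted, this leaves the embeddings untouched, so the original sentences are still embedded in full.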