Can topic labels contain non-words? #930

Answered by MaartenGr
salderma asked this question in Q&A

If 'https' and 'http' do not appear in the documents you trained the model on, they cannot end up in the topic representations. Their presence in your topics therefore suggests that something is going wrong during preprocessing and that the documents you pass to the model still contain these words. I would advise checking the input data and making sure they are not there.

Could you also share the full code you used to train your model? Perhaps we can find something there.
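
As a rough illustration of the input check suggested above, here is a minimal sketch that uses only the standard library. The helper names `find_docs_with_url_tokens` and `strip_urls` and the sample `docs` are hypothetical and not part of BERTopic. It assumes the stray tokens come from raw URLs left in the text, which may not be the cause in your case.

```python
import re

# Hypothetical helpers for illustration; nothing here is part of BERTopic.
URL_TOKEN = re.compile(r"\bhttps?\b", flags=re.IGNORECASE)
URL_PATTERN = re.compile(r"https?://\S+", flags=re.IGNORECASE)


def find_docs_with_url_tokens(docs):
    """Return (index, document) pairs whose text still contains 'http' or 'https'."""
    return [(i, doc) for i, doc in enumerate(docs) if URL_TOKEN.search(doc)]


def strip_urls(docs):
    """Remove whole URLs and collapse the whitespace they leave behind."""
    return [" ".join(URL_PATTERN.sub(" ", doc).split()) for doc in docs]


# Made-up sample documents, standing in for the real training data.
docs = [
    "Great thread on topic models https://example.com/post",
    "No links in this one",
    "See http://example.org for details",
]

offenders = find_docs_with_url_tokens(docs)  # documents that still contain the tokens
cleaned = strip_urls(docs)                   # candidate input for retraining
```

If `offenders` is non-empty for the documents you actually passed to the model, the preprocessing step did not remove the URLs. Retraining on the cleaned documents should then keep those tokens out of the topic representations.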

Answer selected by salderma