
Commit

Added a short explanation of the difference between zeroshot and guided topic modeling to both of the respective documentations so that users immediately know that there are two very similar methods for providing pre-defined topics
janspoerer committed Dec 8, 2024
1 parent c3ec85d commit 2faf380
Showing 2 changed files with 10 additions and 0 deletions.
6 changes: 6 additions & 0 deletions docs/getting_started/guided/guided.md
@@ -1,3 +1,9 @@
!!! Note
Difference between Zero-shot and Guided BERTopic:
Guided BERTopic is similar - yet not equivalent - to [Zero-shot Topic Modeling](https://maartengr.github.io/BERTopic/getting_started/zeroshot/zeroshot.html).
Use Guided BERTopic to boost the importance of certain keywords. Use [Zero-shot Topic Modeling](https://maartengr.github.io/BERTopic/getting_started/zeroshot/zeroshot.html) to try to categorize documents into predefined topics ("zero-shot topics") before clustering the remaining, unclassified documents with the default unsupervised BERTopic topic exploration algorithm.


Guided Topic Modeling or Seeded Topic Modeling is a collection of techniques that guides the topic modeling approach by setting several seed topics to which the model will converge. These techniques allow the user to set a predefined number of topic representations that are sure to be in the documents. For example, take an IT business that has a ticket system for the software its clients use. Those tickets may typically contain information about a specific bug regarding login issues that the IT business is aware of.
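
As a minimal sketch of the login-bug scenario, the snippet below passes a single seed topic to BERTopic through its `seed_topic_list` parameter. The helper name `build_guided_model` and the `docs` variable in the usage comment are hypothetical, and only three illustrative seed words are used; a real seed topic would likely contain more.

```python
# One inner list per seed topic; each inner list holds that topic's seed words.
seed_topic_list = [["bug", "login", "password"]]


def build_guided_model(seed_topics):
    """Return an unfitted BERTopic model nudged toward the given seed topics.

    `build_guided_model` is a hypothetical helper used only for illustration.
    """
    # Imported lazily so the seed list can be inspected without bertopic installed.
    from bertopic import BERTopic

    return BERTopic(seed_topic_list=seed_topics)


# Usage, assuming `docs` is a list of ticket texts (hypothetical name):
# topic_model = build_guided_model(seed_topic_list)
# topics, probs = topic_model.fit_transform(docs)
```

The seed words only steer the model; documents unrelated to any seed topic are still clustered in the usual unsupervised way.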

To model that bug, we can create a seed topic representation containing the words `bug`, `login`, `password`,
4 changes: 4 additions & 0 deletions docs/getting_started/zeroshot/zeroshot.md
@@ -1,3 +1,7 @@
!!! Note
Difference between Zero-shot and Guided BERTopic:
Zero-shot Topic Modeling is similar - yet not equivalent - to [Guided BERTopic](https://maartengr.github.io/BERTopic/getting_started/guided/guided.html). Use [Guided BERTopic](https://maartengr.github.io/BERTopic/getting_started/guided/guided.html) to boost the importance of certain keywords. Use [Zero-shot Topic Modeling](https://maartengr.github.io/BERTopic/getting_started/zeroshot/zeroshot.html) to try to categorize documents into predefined topics ("zero-shot topics") before clustering the remaining, unclassified documents with the default unsupervised BERTopic topic exploration algorithm.

Zero-shot Topic Modeling is a technique that allows you to find predefined topics in large collections of documents. When faced with many documents, you often have an idea of which topics will definitely be in there, whether from simply knowing your data or because a domain expert was involved in defining those topics.

This method allows you to not only find those specific topics but also create new topics for documents that would not fit with your predefined topics.
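
A minimal sketch of this setup is shown below, using BERTopic's `zeroshot_topic_list` and `zeroshot_min_similarity` parameters. The topic labels, the `0.85` threshold, the helper name `build_zeroshot_model`, and the `docs` variable are all assumptions made for illustration, not recommended values.

```python
# Hypothetical predefined topic labels; replace with labels that fit your data.
zeroshot_topic_list = ["login issues", "billing", "performance"]


def build_zeroshot_model(topic_labels, min_similarity=0.85):
    """Return an unfitted BERTopic model that first tries the predefined labels.

    `build_zeroshot_model` is a hypothetical helper used only for illustration.
    """
    # Imported lazily so the label list can be inspected without bertopic installed.
    from bertopic import BERTopic

    return BERTopic(
        zeroshot_topic_list=topic_labels,
        # Documents at least this similar to a label are assigned to it;
        # the rest fall through to regular clustering.
        zeroshot_min_similarity=min_similarity,
    )


# Usage, assuming `docs` is a list of document texts (hypothetical name):
# topic_model = build_zeroshot_model(zeroshot_topic_list)
# topics, probs = topic_model.fit_transform(docs)
```

Lowering the threshold assigns more documents to the predefined topics, while raising it leaves more documents for the unsupervised step.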