Hierarchy Merge and Custom Labelling Using LLMs #2179
stephenhibbert asked this question in Q&A
Doing this for parent topics is definitely possible but not trivial I think. You essentially would have to recalculate representative documents for each of the aggregated topics in order to get nice representations. It would be trivial however if you already merge the topics in the main model and then apply |
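As a rough illustration of what "recalculating representative documents" for an aggregated topic could involve, here is a minimal pure-Python sketch. It is not BERTopic's method: BERTopic derives keywords with c-TF-IDF, whereas this sketch swaps in plain document frequency over the pooled child documents. The `child_docs` mapping, the function names, and the tiny stop-word list are all hypothetical.

```python
import re
from collections import Counter

# Tiny illustrative stop-word list (assumption); use a real one in practice.
STOP_WORDS = {"a", "an", "the", "at", "with", "of", "to", "and"}


def tokenize(text):
    """Lowercase a document and return its distinct non-stop-word terms, sorted."""
    terms = set(re.findall(r"[a-z0-9]+", text.lower()))
    return sorted(terms - STOP_WORDS)


def representative_docs(child_docs, children, top_n_words=4, top_n_docs=2):
    """Pool the documents of the merged child topics and rank them.

    child_docs: hypothetical mapping of child topic id -> list of documents.
    children:   ids of the child topics that share the parent node.
    """
    pooled = [doc for child in children for doc in child_docs[child]]
    # Document frequency over the pooled documents (a stand-in for c-TF-IDF).
    doc_freq = Counter(term for doc in pooled for term in tokenize(doc))
    keywords = [term for term, _ in doc_freq.most_common(top_n_words)]
    keyword_set = set(keywords)
    # Rank documents by how many of the parent keywords they contain.
    ranked = sorted(
        pooled,
        key=lambda doc: len(keyword_set & set(tokenize(doc))),
        reverse=True,
    )
    return keywords, ranked[:top_n_docs]
```

The returned keywords and documents describe the parent node rather than any single child, which is the input a parent-level representation would need.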
I noticed this comment in bertopic/plotting/_hierarchy.py: "NOTE: Custom labels are only generated for the original un-merged topics."
Has anyone implemented an LLM representation for parent topics? I'm looking for any tips on how to merge topics and represent the hierarchy.
To give a concrete example, in the hierarchy below, I would hope to label the highlighted node as "Physical Authentication Systems" or similar. That label abstracts away the specific device or object and represents the more general class of physical things that have authentication.
I imagine a prompt like the following could be used to ask a language model to grasp the higher level of abstraction and produce a good label.
hierarchical_prompt = """
I have multiple topics [CHILD_TOPICS_ARRAY] that are related by a common ancestor. These topics contain the following representative documents:
[DOCUMENTS]
The topics are mutually described by the following keywords: '[KEYWORDS]'.
Based on the information about the topics above, please create a short label for this common ancestor topic. Make sure to only return the label and nothing more.
"""
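To make the idea concrete, here is a minimal sketch of how such a template might be filled in for one parent node before it is sent to a language model. The helper name `build_parent_prompt`, the bullet formatting of the documents, and the example inputs are assumptions for illustration, and the actual LLM call is deliberately left out.

```python
HIERARCHICAL_PROMPT = """
I have multiple topics [CHILD_TOPICS_ARRAY] that are related by a common ancestor. These topics contain the following representative documents:
[DOCUMENTS]
The topics are mutually described by the following keywords: '[KEYWORDS]'.
Based on the information about the topics above, please create a short label for this common ancestor topic. Make sure to only return the label and nothing more.
"""


def build_parent_prompt(child_labels, documents, keywords, template=HIERARCHICAL_PROMPT):
    """Fill the placeholders of the template for a single parent node.

    child_labels: labels of the child topics under the parent (hypothetical input).
    documents:    representative documents pooled from those children.
    keywords:     keywords shared by the children.
    """
    docs_block = "\n".join(f"- {doc}" for doc in documents)
    prompt = template.replace("[CHILD_TOPICS_ARRAY]", ", ".join(child_labels))
    prompt = prompt.replace("[KEYWORDS]", ", ".join(keywords))
    # Documents go in last so their text is never scanned for placeholders.
    return prompt.replace("[DOCUMENTS]", docs_block)


prompt = build_parent_prompt(
    child_labels=["Fingerprint door locks", "Keycard access"],
    documents=["door lock with fingerprint reader", "keycard reader at the door"],
    keywords=["door", "reader", "access"],
)
print(prompt)
```

The resulting string could then be passed to whichever LLM client is in use, with the returned text stored as the label for the parent node.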
custom_labels: If bool, whether to use custom topic labels that were defined using topic_model.set_topic_labels. If str, it uses labels from other aspects, e.g., "Aspect1". NOTE: Custom labels are only generated for the original un-merged topics.
https://github.com/MaartenGr/BERTopic/blob/9518035d41087a801ae39000e6ea1f3641983396/bertopic/plotting/_hierarchy.py#L48C1-L49C41