Introduce a cutoff when searching for relevant memories #7

Open
syntex01 opened this issue Mar 23, 2023 · 8 comments

@syntex01
Contributor

The function that searches for a KB article similar to the prompt could also take a variable that defines a minimum similarity in the embedding space, so that an article is only returned if it clears that threshold. Quite often the closest article has no relationship to the topic at all. This would greatly reduce token use, not only when providing KB articles but also when deciding how many articles to update: that could be done dynamically instead of always trying to update the closest three.

@kyb3r
Owner

kyb3r commented Mar 23, 2023

A threshold of > 0.6 cosine similarity should be OK: very different texts usually have a similarity below 0.5, while similar texts are usually 0.75 and up.
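
Roughly, something like the sketch below, bolted onto the existing lookup. This is only an illustration, not repository code: `query_with_cutoff` is a hypothetical name, the local `cosine_similarity` stands in for the helper already in `memory.py`, and the 0.6 value is just the starting point mentioned above.

```python
import numpy as np

# Stand-in for the cosine_similarity helper already used in memory.py.
def cosine_similarity(a, b):
    a, b = np.asarray(a, dtype=float), np.asarray(b, dtype=float)
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

SIMILARITY_CUTOFF = 0.6  # starting value suggested above; would need tuning

def query_with_cutoff(knowledge_nodes, query_embedding, cutoff=SIMILARITY_CUTOFF):
    """Return the closest knowledge node only if it clears the cutoff, else None."""
    if not knowledge_nodes:
        return None
    best = max(
        knowledge_nodes,
        key=lambda node: cosine_similarity(node.embedding, query_embedding),
    )
    if cosine_similarity(best.embedding, query_embedding) >= cutoff:
        return best
    return None  # nothing in memory is actually related to the query
```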

@syntex01
Contributor Author

Yes, and we can also just experiment to find a value that works well.

@haukepribnow

Out of curiosity: How sure are you that embeddings & cosine similarity will really be enough for successful knowledge linking at a large scale? I've just posted a long post with several questions about this in the discussion section of David's HMCS repository (daveshap/HierarchicalMemoryConsolidationSystem#4) - feel free to chime in there too if you like.

@kyb3r
Owner

kyb3r commented Mar 26, 2023

Right now we don’t use only cosine similarity; we also use ChatGPT to classify whether or not a knowledge base article is relevant to a roll-up summary.

So cosine similarity is kind of like a filter. But because cosine similarity is not always accurate, we use an LLM as a classification function in addition to that.
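
A rough sketch of that two-stage idea (cosine similarity as a cheap pre-filter, then an LLM yes/no classification). The prompt wording and the `is_relevant` helper are illustrative assumptions, not the repository's actual implementation:

```python
import openai  # pre-1.0 SDK style, as used around the time of this thread

def is_relevant(article_text: str, summary: str) -> bool:
    """Ask the chat model for a yes/no judgment on whether the article matches the summary."""
    response = openai.ChatCompletion.create(
        model="gpt-3.5-turbo",
        temperature=0,
        messages=[{
            "role": "user",
            "content": (
                "Does the following knowledge base article cover the same topic as "
                "this summary? Answer only YES or NO.\n\n"
                f"Article:\n{article_text}\n\nSummary:\n{summary}"
            ),
        }],
    )
    answer = response["choices"][0]["message"]["content"].strip().upper()
    return answer.startswith("YES")

# Usage idea: keep only candidates that pass both the similarity filter and the classifier,
# e.g. [n for n in close_nodes if is_relevant(n.content, rollup_summary)]
# (close_nodes / n.content are hypothetical names here).
```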

@haukepribnow

Understood - thanks a lot for your response.

With my question, I was referring less to roll-up summaries and more to the "recall" mechanism. According to the following code...

emergent/emergent/memory.py

Lines 261 to 273 in b59165d

```python
def query(self, query: str) -> KnowledgeNode:
    """
    This method is responsible for querying the memory for a given query.
    """
    query_embedding = get_embedding(query)
    # find the most similar knowledge node
    if self.knowledge_nodes:
        knowledge_node = max(
            self.knowledge_nodes,
            key=lambda node: cosine_similarity(node.embedding, query_embedding),
        )
        return knowledge_node
    return None
```

...a cosine similarity comparison is used to determine the (single) most likely KB article that may be relevant for answering the user input.

I was wondering about your thoughts on that part: How do you plan to further develop this "recall" (/"lookup") mechanism? How do you expect it to perform at scale? And how do you plan to handle the situation where one KB article happens to have an embedding vector more similar to the user input than another, even though the found article is the "wrong" one for the respective situation?

@syntex01
Contributor Author

syntex01 commented Mar 26, 2023

One thing we thought about was introducing an internal dialogue for the chat agent. In this internal dialogue the agent decides whether additional information is needed and, if so, states what that information should be about. We then search for the closest KB articles in embedding space, provide the agent with the headings of the n closest articles, and let it decide which of them are in fact relevant. This method has the drawback of using up quite a few tokens, but I don't see any other way of addressing the problems you also identified.
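
A rough sketch of that flow, assuming the repository's `get_embedding` / `cosine_similarity` helpers and nodes that expose a heading; `ask_agent` and `node.heading` are hypothetical stand-ins for whatever the internal dialogue ends up using:

```python
def recall_candidates(knowledge_nodes, topic, n=5):
    """Return the n knowledge nodes closest to the topic the agent asked about."""
    topic_embedding = get_embedding(topic)  # assumes the repo's embedding helper
    ranked = sorted(
        knowledge_nodes,
        key=lambda node: cosine_similarity(node.embedding, topic_embedding),
        reverse=True,
    )
    return ranked[:n]

def select_relevant(ask_agent, candidates):
    """Show the agent only the headings and let it pick which articles to load."""
    listing = "\n".join(f"{i}. {node.heading}" for i, node in enumerate(candidates))
    reply = ask_agent(
        "Which of these knowledge base articles are relevant to the current topic? "
        "Answer with the numbers only, or NONE.\n" + listing
    )
    if reply.strip().upper() == "NONE":
        return []
    picked = {int(tok) for tok in reply.replace(",", " ").split() if tok.isdigit()}
    return [candidates[i] for i in picked if i < len(candidates)]
```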

@haukepribnow

Ah, that's an interesting approach. Thanks a lot for sharing.

@kyb3r
Owner

kyb3r commented Mar 28, 2023

[screenshot attachment: IMG_0245]

I read this conversation and it's an interesting idea.

The plan is to give the agent access to tools, similar to ChatGPT plugins, so that it will autonomously choose when it needs to retrieve a relevant memory. Hypothetical document embeddings could be one way to make the query the agent issues more accurate.
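
For reference, hypothetical document embeddings (HyDE) roughly mean drafting a plausible answer with the LLM, embedding that draft, and searching memory with it instead of the raw question. The sketch below is only an illustration of the idea, reusing the repository's `get_embedding` helper; the prompt and the `hyde_query_embedding` name are assumptions:

```python
import openai  # pre-1.0 SDK style

def hyde_query_embedding(question: str):
    """Draft a plausible answer with the chat model, then embed that draft."""
    draft = openai.ChatCompletion.create(
        model="gpt-3.5-turbo",
        messages=[{
            "role": "user",
            "content": (
                "Write a short passage that plausibly answers the question, even if "
                "some details have to be invented:\n" + question
            ),
        }],
    )["choices"][0]["message"]["content"]
    # The draft's embedding tends to land closer to real KB articles than the
    # embedding of the bare question, which should sharpen the recall lookup.
    return get_embedding(draft)  # assumes the repo's embedding helper
```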
