Introduce a cutoff when searching for relevant memories #7
Comments
A threshold of > 0.6 cosine similarity should be fine; very different texts usually have a similarity below 0.5, while similar texts are usually 0.75 and upwards.
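A minimal sketch of what such a cutoff might look like, assuming embeddings are plain numpy vectors; the 0.6 value and all names are just the ones floated in this thread, not taken from the repository:

```python
import numpy as np

SIMILARITY_CUTOFF = 0.6  # assumed value from the discussion above

def cosine_similarity(a: np.ndarray, b: np.ndarray) -> float:
    """Cosine similarity between two embedding vectors."""
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

def filter_relevant(query_emb: np.ndarray, kb_embs: list[np.ndarray]) -> list[int]:
    """Return indices of KB articles whose similarity to the query exceeds the cutoff."""
    return [
        i for i, emb in enumerate(kb_embs)
        if cosine_similarity(query_emb, emb) > SIMILARITY_CUTOFF
    ]
```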
Yes, and we can also just experiment to find a value that works well.
Out of curiosity: How sure are you that embeddings & cosine similarity will really be enough for successful knowledge linking at a large scale? I've just posted a long post with several questions about this in the discussion section of David's HMCS repository (daveshap/HierarchicalMemoryConsolidationSystem#4) - feel free to chime in there too if you like.
Right now we don't use only cosine similarity. We also use ChatGPT to classify whether or not a knowledge base article is relevant to a roll-up summary. So cosine similarity acts as a filter, but it is not always accurate, which is why we additionally use an LLM as a classification function.
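A rough sketch of that two-stage filter: cosine similarity narrows the candidates, then an LLM call classifies relevance. `embed` and `ask_llm` are hypothetical stand-ins for the project's embedding and chat-completion helpers, and the prompt wording is only illustrative:

```python
import numpy as np

def cosine_similarity(a, b):
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

def relevant_articles(summary: str, articles: list[dict], embed, ask_llm,
                      cutoff: float = 0.6) -> list[dict]:
    summary_emb = embed(summary)
    # Stage 1: cheap vector filter using the cosine-similarity cutoff.
    candidates = [a for a in articles
                  if cosine_similarity(summary_emb, a["embedding"]) > cutoff]
    # Stage 2: LLM classification of each surviving candidate.
    relevant = []
    for article in candidates:
        answer = ask_llm(
            f"Is the following KB article relevant to this roll-up summary?\n\n"
            f"Summary:\n{summary}\n\nArticle:\n{article['text']}\n\n"
            f"Answer YES or NO."
        )
        if answer.strip().upper().startswith("YES"):
            relevant.append(article)
    return relevant
```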
Understood - thanks a lot for your response. With my question, I was referring less to roll-up summaries and more to the "recall" mechanism, according to the following code (lines 261 to 273 in b59165d).
I was wondering about your thoughts on that part: How do you plan to further develop this "recall" (/ "lookup") mechanism? How do you expect it to perform at scale? And how do you plan on handling the situation where one KB article happens to have a more similar embedding vector to the user input than another, despite the found KB article being the "wrong" one for the respective situation?
One thing we thought about was introducing an internal dialogue for the chat agent. In this internal dialogue the agent decides whether additional information is needed and, if so, states what that information should be about. We then search for the closest KB articles in embedding space, provide the agent with the headings of the n closest articles, and let it decide which are in fact relevant. This method has the drawback of using quite a few tokens, but I don't see any other way of addressing the problems you also identified.
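A hypothetical sketch of that internal-dialogue recall step, under the same assumptions as above (`ask_llm` and `embed` stand in for the project's chat and embedding calls; prompts and data layout are illustrative only):

```python
import numpy as np

def recall_via_internal_dialogue(user_message, kb_articles, embed, ask_llm, n=5):
    # 1. The agent decides whether it needs extra information, and about what.
    need = ask_llm(
        "Do you need additional background knowledge to answer the following "
        f"message? If yes, describe the topic in one sentence; if no, reply NO.\n\n{user_message}"
    )
    if need.strip().upper().startswith("NO"):
        return []

    # 2. Find the n closest KB articles in embedding space to the stated topic.
    topic_emb = embed(need)
    def sim(a):
        emb = a["embedding"]
        return float(np.dot(topic_emb, emb)
                     / (np.linalg.norm(topic_emb) * np.linalg.norm(emb)))
    closest = sorted(kb_articles, key=sim, reverse=True)[:n]

    # 3. Show the agent only the headings and let it pick the truly relevant ones.
    headings = "\n".join(f"{i}: {a['heading']}" for i, a in enumerate(closest))
    choice = ask_llm(
        f"Topic: {need}\nWhich of these article headings are actually relevant? "
        f"Reply with the numbers, comma separated, or NONE.\n{headings}"
    )
    if choice.strip().upper().startswith("NONE"):
        return []
    indices = [int(x) for x in choice.replace(" ", "").split(",") if x.isdigit()]
    return [closest[i] for i in indices if i < len(closest)]
```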
Ah, that's an interesting approach. Thanks a lot for sharing.
I read this conversation and it's an interesting idea. The plan is to give the agent access to tools, similar to ChatGPT plugins, so it will autonomously choose when it needs to retrieve a relevant memory. Hypothetical document embeddings could be a possible way to make the query that the agent makes more accurate.
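A rough sketch of hypothetical document embeddings (HyDE) applied to the recall query, as suggested above; `ask_llm` and `embed` are assumed helpers, not part of the current codebase:

```python
def hyde_query_embedding(query: str, embed, ask_llm):
    # Ask the model to write the KB article it imagines would answer the query...
    hypothetical_doc = ask_llm(
        "Write a short knowledge base article that would answer the following "
        f"question or request:\n\n{query}"
    )
    # ...then search with the embedding of that hypothetical article, which tends
    # to lie closer to real articles in embedding space than the raw query does.
    return embed(hypothetical_doc)
```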
The function that searches for a KB article similar to the prompt could also include a variable that defines a minimum similarity in embedding space that an article must meet before it is returned. Sometimes, and in fact quite often, the closest article has absolutely no relationship to the topic. This would greatly reduce token use, not only when providing KB articles but also when deciding how many articles to update; that could be done dynamically instead of always updating the closest three.
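A minimal sketch of that idea: the search helper takes a minimum-similarity argument so that nothing is returned when no article in the KB is actually close, and the number of results becomes dynamic rather than a fixed three. Names are illustrative, not taken from the repository:

```python
import numpy as np

def find_similar_articles(prompt_emb, kb_articles, min_similarity=0.6, max_results=None):
    """Return KB articles above the similarity cutoff, best matches first."""
    scored = []
    for article in kb_articles:
        emb = article["embedding"]
        sim = float(np.dot(prompt_emb, emb)
                    / (np.linalg.norm(prompt_emb) * np.linalg.norm(emb)))
        if sim >= min_similarity:  # below the cutoff, the article is not returned at all
            scored.append((sim, article))
    scored.sort(key=lambda pair: pair[0], reverse=True)
    results = [article for _, article in scored]
    return results if max_results is None else results[:max_results]
```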