Introduce a cutoff when searching for relevant memories #7
Comments
A threshold of > 0.6 cosine similarity should be fine; very different texts usually have a similarity below 0.5, while similar texts are usually 0.75 and upwards.
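A minimal sketch of what such a cutoff might look like, assuming embeddings are plain numpy vectors; the 0.6 value and all names are just the ones floated in this thread, not taken from the repository:

```python
import numpy as np

SIMILARITY_CUTOFF = 0.6  # assumed value from the discussion above

def cosine_similarity(a: np.ndarray, b: np.ndarray) -> float:
    """Cosine similarity between two embedding vectors."""
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

def filter_relevant(query_emb: np.ndarray, kb_embs: list[np.ndarray]) -> list[int]:
    """Return indices of KB articles whose similarity to the query exceeds the cutoff."""
    return [
        i for i, emb in enumerate(kb_embs)
        if cosine_similarity(query_emb, emb) > SIMILARITY_CUTOFF
    ]
```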
Yes, and we can also just experiment to find a value that works well.
Out of curiosity: How sure are you that embeddings & cosine similarity will really be enough for successful knowledge linking at a large scale? I've just posted a long post with several questions about this in the discussion section of David's HMCS repository (daveshap/HierarchicalMemoryConsolidationSystem#4) - feel free to chime in there too if you like.
Right now we don't use only cosine similarity. We also use ChatGPT to classify whether or not a knowledge base article is relevant to a roll-up summary. So cosine similarity acts as a filter, but it is not always accurate, which is why we additionally use an LLM as a classification function.
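A rough sketch of that two-stage filter: cosine similarity narrows the candidates, then an LLM call classifies relevance. `embed` and `ask_llm` are hypothetical stand-ins for the project's embedding and chat-completion helpers, and the prompt wording is only illustrative:

```python
import numpy as np

def cosine_similarity(a, b):
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

def relevant_articles(summary: str, articles: list[dict], embed, ask_llm,
                      cutoff: float = 0.6) -> list[dict]:
    summary_emb = embed(summary)
    # Stage 1: cheap vector filter using the cosine-similarity cutoff.
    candidates = [a for a in articles
                  if cosine_similarity(summary_emb, a["embedding"]) > cutoff]
    # Stage 2: LLM classification of each surviving candidate.
    relevant = []
    for article in candidates:
        answer = ask_llm(
            f"Is the following KB article relevant to this roll-up summary?\n\n"
            f"Summary:\n{summary}\n\nArticle:\n{article['text']}\n\n"
            f"Answer YES or NO."
        )
        if answer.strip().upper().startswith("YES"):
            relevant.append(article)
    return relevant
```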
Understood - thanks a lot for your response. With my question, I was referring less to roll-up summaries and more to the "recall" mechanism, according to the following code (lines 261 to 273 in b59165d).
I was wondering about your thoughts on that part: How do you plan to further develop this "recall" (/ "lookup") mechanism? How do you expect it to perform at scale? And how do you plan on handling the situation where one KB article happens to have a more similar embedding vector to the user input than another, despite the found KB article being the "wrong" one for the respective situation?
One thing we thought about was introducing an internal dialogue for the chat agent. In this internal dialogue the agent decides whether additional information is needed and, if so, states what that information should be about. We then search for the closest KB articles in embedding space, provide the agent with the headings of the n closest articles, and let it decide which are in fact relevant. This method has the drawback of using quite a few tokens, but I don't see any other way of addressing the problems you also identified.
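A hypothetical sketch of that internal-dialogue recall step, under the same assumptions as above (`ask_llm` and `embed` stand in for the project's chat and embedding calls; prompts and data layout are illustrative only):

```python
import numpy as np

def recall_via_internal_dialogue(user_message, kb_articles, embed, ask_llm, n=5):
    # 1. The agent decides whether it needs extra information, and about what.
    need = ask_llm(
        "Do you need additional background knowledge to answer the following "
        f"message? If yes, describe the topic in one sentence; if no, reply NO.\n\n{user_message}"
    )
    if need.strip().upper().startswith("NO"):
        return []

    # 2. Find the n closest KB articles in embedding space to the stated topic.
    topic_emb = embed(need)
    def sim(a):
        emb = a["embedding"]
        return float(np.dot(topic_emb, emb)
                     / (np.linalg.norm(topic_emb) * np.linalg.norm(emb)))
    closest = sorted(kb_articles, key=sim, reverse=True)[:n]

    # 3. Show the agent only the headings and let it pick the truly relevant ones.
    headings = "\n".join(f"{i}: {a['heading']}" for i, a in enumerate(closest))
    choice = ask_llm(
        f"Topic: {need}\nWhich of these article headings are actually relevant? "
        f"Reply with the numbers, comma separated, or NONE.\n{headings}"
    )
    if choice.strip().upper().startswith("NONE"):
        return []
    indices = [int(x) for x in choice.replace(" ", "").split(",") if x.isdigit()]
    return [closest[i] for i in indices if i < len(closest)]
```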
Ah, that's an interesting approach. Thanks a lot for sharing.
I read this conversation and it's an interesting idea. The plan is to give the agent access to tools, similar to ChatGPT plugins, so it will autonomously choose when it needs to retrieve a relevant memory. Hypothetical document embeddings could be a possible way to make the query that the agent makes more accurate.
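A rough sketch of hypothetical document embeddings (HyDE) applied to the recall query, as suggested above; `ask_llm` and `embed` are assumed helpers, not part of the current codebase:

```python
def hyde_query_embedding(query: str, embed, ask_llm):
    # Ask the model to write the KB article it imagines would answer the query...
    hypothetical_doc = ask_llm(
        "Write a short knowledge base article that would answer the following "
        f"question or request:\n\n{query}"
    )
    # ...then search with the embedding of that hypothetical article, which tends
    # to lie closer to real articles in embedding space than the raw query does.
    return embed(hypothetical_doc)
```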
The function that searches for a KB article similar to the prompt could also include a variable that defines a minimum similarity in embedding space that an article must meet before it is returned. Sometimes, and in fact quite often, the closest article has absolutely no relationship to the topic. This would greatly reduce token use, not only when providing KB articles but also when deciding how many articles to update; that could be done dynamically instead of always updating the closest three.
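A minimal sketch of that idea: the search helper takes a minimum-similarity argument so that nothing is returned when no article in the KB is actually close, and the number of results becomes dynamic rather than a fixed three. Names are illustrative, not taken from the repository:

```python
import numpy as np

def find_similar_articles(prompt_emb, kb_articles, min_similarity=0.6, max_results=None):
    """Return KB articles above the similarity cutoff, best matches first."""
    scored = []
    for article in kb_articles:
        emb = article["embedding"]
        sim = float(np.dot(prompt_emb, emb)
                    / (np.linalg.norm(prompt_emb) * np.linalg.norm(emb)))
        if sim >= min_similarity:  # below the cutoff, the article is not returned at all
            scored.append((sim, article))
    scored.sort(key=lambda pair: pair[0], reverse=True)
    results = [article for _, article in scored]
    return results if max_results is None else results[:max_results]
```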