How to retrieve the source documents actually used for an answer in a generative RAG pipeline? #8441
-
In a basic generative RAG pipeline (cf. https://haystack.deepset.ai/tutorials/27_first_rag_pipeline), how does Haystack support retrieving the documents the LLM actually used to produce the answer? I noticed that adding, for example, an AnswerBuilder makes the pipeline return not only the LLM's answer but also the documents returned by the retriever (without it, only the last component's result, i.e., the LLM's answer, is returned). However, this yields all documents fed into the LLM, i.e., everything retrieved by the DocumentRetriever. How do I get only those that the answer is actually built upon? I noticed that there is a reference_pattern parameter (cf. https://docs.haystack.deepset.ai/docs/answerbuilder) which sounds related, but I couldn't find any information on how to use it (if it is relevant at all).
Replies: 2 comments
-
Hi @fhamborg, you need to specify that in the prompt, here is an example:
The main aspects are: give each document in the prompt a reference (e.g., a number) and instruct the LLM to cite that reference in its answer.
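To make those two aspects concrete, here is a minimal, library-free sketch in plain Python. It is not the example from the original reply; the prompt wording, the `build_prompt` and `referenced_documents` helpers, and the sample reply are all hypothetical. It numbers the documents in the prompt, then uses a regex to pull the cited numbers out of the LLM's reply. My understanding from the AnswerBuilder docs is that `reference_pattern` accepts this kind of regex and treats the matches as 1-based indices into the input documents, but treat that as an assumption and check it against your Haystack version.

```python
import re

# Regex that captures the number inside citations such as "[2]".
REFERENCE_PATTERN = r"\[(\d+)\]"


def build_prompt(question: str, documents: list[str]) -> str:
    """Hypothetical prompt layout: number each document and ask for citations."""
    numbered = "\n".join(f"[{i}] {doc}" for i, doc in enumerate(documents, start=1))
    return (
        "Answer the question using only the documents below. "
        "After each statement, cite the supporting document as [number].\n\n"
        f"Documents:\n{numbered}\n\n"
        f"Question: {question}\nAnswer:"
    )


def referenced_documents(reply: str, documents: list[str]) -> list[str]:
    """Return only the documents whose 1-based number is cited in the reply."""
    cited = sorted({int(n) - 1 for n in re.findall(REFERENCE_PATTERN, reply)})
    # Ignore citations that point outside the document list (LLMs can hallucinate them).
    return [documents[i] for i in cited if 0 <= i < len(documents)]


docs = [
    "Paris is the capital of France.",
    "Berlin is the capital of Germany.",
    "The Seine flows through Paris.",
]
prompt = build_prompt("Which river flows through the capital of France?", docs)

# Made-up LLM reply, for illustration only.
reply = "The capital of France is Paris [1], and the Seine flows through it [3]."
used = referenced_documents(reply, docs)
```

In a Haystack pipeline, the numbering would live in the prompt template and the regex would be passed to the AnswerBuilder instead of being applied by hand. Note that this only reports what the LLM chose to cite, so the result is only as reliable as the model's adherence to the citation instruction.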
-
Thanks a lot :)