How to get the id of the original split Document #3017
Hello! I have been reading the documentation, API reference and Q&A but still could not find a way to split an original Document (the parent), save the resulting split Documents (each with a new id and its own embedding) into Milvus, run a query, and get back the split Documents together with the id of their parent Document. My basic idea is to save smaller fragments of a long text into Milvus and query for the whole Document later. Thanks.
Replies: 1 comment 1 reply
Hi @wilsonlimaneto - sorry it has taken a couple of days to respond. It sounds to me like you might make use of the `PreProcessor`. You could do one of two things:
1. Either use the `split_length` argument when you're first preprocessing the files to create `Documents`, or
2. Since you already seem to have `Documents`, you could use the `split` function. This one works on a single document.

As for their ids, you could make use of the `meta` field to store the parent ids, or you could make use of the `id_hash_keys` to construct one with the parent id included also.

Here is the API reference for that: https://haystack.deepset.ai/reference/preprocessor
Hope this helps. Let me know 😊
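To make the `meta` idea above concrete, here is a minimal sketch in plain Python. It deliberately does not call Haystack or Milvus: documents are plain dicts, and `make_id`, `split_document`, `parents_of`, and the `parent_id` / `split_id` meta keys are hypothetical names chosen for illustration, not part of the library. It only shows the bookkeeping: each chunk gets its own id (derived from the parent id, so it also mirrors the "id with the parent id included" option) and carries the parent id in `meta`.

```python
import hashlib


def make_id(*parts):
    """Build a deterministic id from string parts (hypothetical helper)."""
    joined = "\x00".join(parts).encode("utf-8")
    return hashlib.sha256(joined).hexdigest()[:32]


def split_document(parent, split_length=100):
    """Split a parent document into word-based chunks that remember their parent."""
    words = parent["content"].split()
    chunks = []
    for n, start in enumerate(range(0, len(words), split_length)):
        text = " ".join(words[start:start + split_length])
        chunks.append({
            # The chunk id includes the parent id, so it is unique per parent.
            "id": make_id(parent["id"], str(n), text),
            "content": text,
            # Assumed meta keys: copy the parent's meta and add the link back.
            "meta": {**parent.get("meta", {}), "parent_id": parent["id"], "split_id": n},
        })
    return chunks


def parents_of(hits, parents_by_id):
    """Map retrieved chunks back to their unique parents, keeping hit order."""
    seen = []
    for hit in hits:
        pid = hit["meta"]["parent_id"]
        if pid not in seen:
            seen.append(pid)
    return [parents_by_id[pid] for pid in seen]


parent = {
    "id": "doc-1",
    "content": "one two three four five six seven",
    "meta": {"name": "example.txt"},
}
chunks = split_document(parent, split_length=3)

# Pretend the vector store returned two chunks for a query.
hits = [chunks[2], chunks[0]]
whole_documents = parents_of(hits, {"doc-1": parent})
```

In a real pipeline the chunks would be written to the document store and the hits would come from the retriever; the lookup from `parent_id` back to the full Document would then be a second fetch by id, assuming the parents are stored somewhere you can query.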