How to get the id of the original splitted Document #3017

Hi @wilsonlimaneto - sorry it has taken a couple of days to respond. It sounds to me like you could make use of the PreProcessor. You could do one of two things: 1. Use the split_length argument when you first preprocess the files to create Documents, or 2. Since you already seem to have Documents, use the split function, which works on a single document.

As for their ids, you could use the meta field to store the parent ids, or you could use id_hash_keys to construct an id that also includes the parent id.

Here is the API reference for that: https://haystack.deepset.ai/reference/preprocessor
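To illustrate both id options, here is a minimal plain-Python sketch. It does not call Haystack: `Doc`, `split_with_parent_id`, and the meta keys `parent_id` and `split_no` are all hypothetical names made up for this example, and the word-based splitting is only a stand-in for what the PreProcessor does. With the real PreProcessor the same idea would presumably be to put the parent id into each Document's meta before splitting, assuming the splitter copies meta onto the chunks it produces.

```python
import hashlib
from dataclasses import dataclass, field
from typing import List


@dataclass
class Doc:
    """Hypothetical stand-in for a Haystack Document (not the real class)."""
    content: str
    id: str
    meta: dict = field(default_factory=dict)


def split_with_parent_id(doc: Doc, split_length: int) -> List[Doc]:
    """Split doc into chunks of at most split_length words, each linked to its parent."""
    words = doc.content.split()
    chunks = []
    for split_no, start in enumerate(range(0, len(words), split_length)):
        text = " ".join(words[start:start + split_length])
        # Option 2: build the chunk id from a hash that includes the parent id.
        raw = f"{doc.id}:{split_no}:{text}".encode("utf-8")
        chunk_id = hashlib.md5(raw).hexdigest()
        # Option 1: keep the parent's meta and record the parent id in it.
        meta = {**doc.meta, "parent_id": doc.id, "split_no": split_no}
        chunks.append(Doc(content=text, id=chunk_id, meta=meta))
    return chunks


parent = Doc(content="one two three four five", id="parent-1", meta={"name": "example.txt"})
chunks = split_with_parent_id(parent, split_length=2)
for chunk in chunks:
    print(chunk.meta["parent_id"], chunk.meta["split_no"], chunk.content)
```

Every chunk can then be traced back to its original document through `chunk.meta["parent_id"]`, regardless of how the chunk id itself was generated.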

Hope this helps. Let me know 😊

Answer selected by wilsonlimaneto