Handling duplicates for uploaded files #2808
vivianonlea started this conversation in General
Replies: 1 comment
-
Hello @vivianonlea! Sorry for the late reply. I moved this issue into Discussions, as it sounds more like a general question to me. The behavior you describe is by design. Haystack tries to keep duplicate content out of the document store, so it does its best to detect duplicates and overwrite them when found. In general I recommend sticking with this, as removing duplicates improves query time significantly. If you have a good reason to keep the duplicates, there are workarounds. One is
Hope it helps!
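To make the "detect and overwrite" behavior above a bit more concrete, here is a minimal sketch in plain Python. It is not Haystack code: `doc_id` and `ToyStore` are hypothetical names, and the assumption is only that a store which derives document ids from a hash of the content will treat two uploads with identical text as the same document. Adding metadata (for example a version number) to the hash is one way such a store could keep both copies. As far as I know, Haystack 1.x offers `id_hash_keys` on `Document` and `duplicate_documents` on `write_documents` for this kind of control, but please check the docs for your version.

```python
import hashlib


def doc_id(content, meta=None, hash_keys=("content",)):
    """Hypothetical helper: build an id from the selected fields only."""
    parts = []
    for key in hash_keys:
        if key == "content":
            parts.append(content)
        elif key == "meta":
            # Sort items so the same metadata always hashes the same way.
            parts.append(repr(sorted((meta or {}).items())))
    return hashlib.sha256("|".join(parts).encode("utf-8")).hexdigest()


class ToyStore:
    """Toy in-memory store that overwrites entries sharing an id."""

    def __init__(self):
        self.docs = {}

    def write(self, content, meta=None, hash_keys=("content",)):
        key = doc_id(content, meta, hash_keys)
        self.docs[key] = {"content": content, "meta": meta or {}}
        return key


# Id from content only: the second upload overwrites the first.
store = ToyStore()
store.write("same text", {"name": "report.txt", "version": 1})
store.write("same text", {"name": "report.txt", "version": 2})
count_content_only = len(store.docs)

# Id from content and metadata: both versions are kept.
versioned = ToyStore()
versioned.write("same text", {"name": "report.txt", "version": 1},
                hash_keys=("content", "meta"))
versioned.write("same text", {"name": "report.txt", "version": 2},
                hash_keys=("content", "meta"))
count_with_meta = len(versioned.docs)
```

With the content-only id the store ends up holding a single document, while including the metadata in the hash keeps one entry per version. The trade-off is the one mentioned above: more stored copies means more to search through at query time.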
-
Hello! I have been trying to implement a version control feature for files that have been uploaded to Haystack. For now, here are some of my observations:
I am curious why this is happening, and which part of the code I should modify so that it doesn't replace files with the same name/content by default. Thanks!