
Representative resources and dedicated events for duplicates/collisions #93

Open
kirillt opened this issue Oct 30, 2024 · 0 comments

kirillt commented Oct 30, 2024

We might need separate events to track duplicates, something like DuplicateAdded(id, path) and DuplicateRemoved(id, path). I'm not sure that duplicate removal would be useful, though; duplicate addition may be enough. These events could be used to let the user select the representative manually. Just an idea for the future.
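A minimal sketch of how such events could look and be consumed. It is written in Python purely for illustration; the field types, the `handle` function, and the `duplicates` mapping are assumptions and not part of any existing API.

```python
from dataclasses import dataclass
from pathlib import PurePosixPath


@dataclass(frozen=True)
class DuplicateAdded:
    """Hypothetical event: another path with an already known id appeared."""
    id: str
    path: PurePosixPath


@dataclass(frozen=True)
class DuplicateRemoved:
    """Hypothetical event: a duplicate path of a known id disappeared."""
    id: str
    path: PurePosixPath


def handle(event, duplicates):
    """Keep an id -> set-of-paths mapping up to date from the events.

    An app could use this mapping to offer a manual choice of representative.
    """
    if isinstance(event, DuplicateAdded):
        duplicates.setdefault(event.id, set()).add(event.path)
    elif isinstance(event, DuplicateRemoved):
        paths = duplicates.get(event.id, set())
        paths.discard(event.path)
        if not paths:
            # Drop the entry once no duplicate paths are left for this id.
            duplicates.pop(event.id, None)
    return duplicates
```

If DuplicateRemoved turns out to be unnecessary, the second branch simply goes away and the mapping only ever grows.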

We could simplify the added field of the IndexUpdate structure. From the API point of view, an addition event doesn't need a collection of paths attached to it, only a single path (the representative). Any path can serve as the representative of the group, because we don't distinguish between duplicates. A single representative should be enough for the app to do something with it, e.g. render a thumbnail.
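To illustrate the simplified shape, here is a rough Python sketch. The actual definition of IndexUpdate is not shown in this issue, so the field type below and the `thumbnails_to_render` consumer are assumptions made only for the example.

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class IndexUpdate:
    """Hypothetical simplified update: one representative path per new id."""
    # id -> representative path (a single path instead of a collection)
    added: Dict[str, str] = field(default_factory=dict)


def thumbnails_to_render(update):
    """Example consumer: one thumbnail job per newly added resource."""
    return [(rid, path) for rid, path in sorted(update.added.items())]
```

With one path per id, the consumer never has to pick among paths itself.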

So when a unique resource is detected, we take its path as the representative. When a duplicate appears later, we either skip it or emit its path in a separate event. If several paths are introduced at once for the same new resource, we take the shortest path as the representative. All the other paths should be reported in DuplicateAdded events.
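The selection rule could be sketched roughly as follows. This is Python for illustration only: `process_batch` and its arguments are hypothetical, and "shortest" is assumed here to mean the fewest characters, with ties broken alphabetically, which the issue does not specify.

```python
from collections import defaultdict


def process_batch(known_ids, discovered):
    """Split one scan's (id, path) pairs into representatives and duplicates.

    known_ids:  set of ids already in the index (updated in place)
    discovered: iterable of (id, path) string pairs found in this scan
    Returns (added, duplicates): added maps each new id to its representative
    path; duplicates lists (id, path) pairs for DuplicateAdded events.
    """
    by_id = defaultdict(list)
    for rid, path in discovered:
        by_id[rid].append(path)

    added = {}
    duplicates = []
    for rid, paths in by_id.items():
        # Assumption: shortest = fewest characters, ties broken alphabetically.
        ordered = sorted(paths, key=lambda p: (len(p), p))
        if rid in known_ids:
            # The id already has a representative: every path is a duplicate.
            duplicates.extend((rid, p) for p in ordered)
        else:
            added[rid] = ordered[0]
            duplicates.extend((rid, p) for p in ordered[1:])
            known_ids.add(rid)
    return added, duplicates
```

For example, if `a.txt` and `docs/archive/a.txt` appear together with the same new id, `a.txt` becomes the representative and the longer path is reported as a duplicate.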


We could also use the term "Collision", since with a non-cryptographic hash function distinct byte contents can result in the same id.
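A toy Python example of the difference between a duplicate and a collision. The additive checksum `toy_id` is a deliberately weak stand-in chosen so that a collision is easy to produce; it is not the hash function used by the index, and `classify` is a hypothetical helper.

```python
def toy_id(data: bytes) -> int:
    """Deliberately weak stand-in for a non-cryptographic hash."""
    return sum(data) % 65536


def classify(a: bytes, b: bytes) -> str:
    """Compare two contents by id first, then by bytes."""
    if toy_id(a) != toy_id(b):
        return "distinct"
    # Same id: identical bytes are a true duplicate, otherwise a collision.
    return "duplicate" if a == b else "collision"
```

Here `b"ab"` and `b"ba"` share an id but differ in content, so the pair is a collision rather than a duplicate. Telling the two apart requires comparing the bytes, which an index keyed only by id cannot do on its own.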
