Deduplicate debug file chunk uploads #2279

Open
szokeasaurusrex opened this issue Nov 27, 2024 · 0 comments

szokeasaurusrex commented Nov 27, 2024

When the same chunk appears more than once in a debug file chunk upload, we only need to upload that chunk once. However, we currently upload each duplicate chunk as many times as it appears in the file we are trying to upload. This was observed while implementing the new chunk uploading tests (#2275).

We should deduplicate chunks before upload. Likely, this can be done easily by using a HashSet (just throw all the chunks we want to upload into the HashSet before uploading to eliminate duplicates).
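A minimal sketch of the `HashSet` idea, for illustration only. The `Chunk` struct, its `checksum` field, and the `dedupe_chunks` function are hypothetical stand-ins, not the actual types used in the uploader. It assumes chunks can be identified by a checksum, and it keeps the first occurrence of each one so that the original order is preserved:

```rust
use std::collections::HashSet;

/// Hypothetical stand-in for a chunk: a checksum identifying the content,
/// plus the bytes that would be uploaded. Not the project's real type.
#[derive(Debug, Clone, PartialEq, Eq)]
struct Chunk {
    checksum: String,
    data: Vec<u8>,
}

/// Drops chunks whose checksum has already been seen, keeping the first
/// occurrence of each checksum and preserving the input order.
fn dedupe_chunks(chunks: Vec<Chunk>) -> Vec<Chunk> {
    let mut seen = HashSet::new();
    chunks
        .into_iter()
        // `HashSet::insert` returns `true` only for values not yet present.
        .filter(|chunk| seen.insert(chunk.checksum.clone()))
        .collect()
}
```

If upload order does not matter, collecting the checksums straight into a `HashSet` would also work; the filter form above is just one option that keeps the order stable.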

We should also modify our chunk uploading tests to verify that each unique chunk gets uploaded only once. This will likely require using a multiset.
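As a rough sketch of the test-side check: Rust's standard library has no multiset type, so one option is a `HashMap` from checksum to upload count. The function names below are hypothetical, and the example assumes the test can collect the checksums of all chunks the mock server received:

```rust
use std::collections::HashMap;

/// Counts how often each checksum was uploaded. A map of counts acts as a
/// simple multiset. (Hypothetical helper, not part of the existing tests.)
fn count_uploads<'a, I>(uploaded: I) -> HashMap<&'a str, usize>
where
    I: IntoIterator<Item = &'a str>,
{
    let mut counts = HashMap::new();
    for checksum in uploaded {
        *counts.entry(checksum).or_insert(0) += 1;
    }
    counts
}

/// Returns the checksums that were uploaded more than once, sorted so that
/// the result is deterministic for assertions.
fn duplicated_uploads<'a>(counts: &HashMap<&'a str, usize>) -> Vec<&'a str> {
    let mut dups: Vec<&'a str> = counts
        .iter()
        .filter(|(_, n)| **n > 1)
        .map(|(checksum, _)| *checksum)
        .collect();
    dups.sort_unstable();
    dups
}
```

A test could then assert that `duplicated_uploads` returns an empty list, and that the set of keys matches the expected unique chunks.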
