How to read all blobs in a repository with fastest throughput? #906

Answered by Byron
bradlarsen asked this question in Q&A

Thanks for asking! I am looking forward to seeing Nosey Parker unfold its ultimate performance potential :).

Remembering what I do, I also think that it's probably not necessary, worthwhile, or feasible to maintain a global "seen" set of object ids as is currently done, even though it's probably something to experiment with.

To obtain each and every blob, one would have to efficiently decode the object database and not miss an object. In a probably-more-complex-than-you-need kind of fashion, this is done in gix_odb::Store::verify_integrity(), which traverses each index (and pack), each multi-index (and multiple packs) and each loose object database, across all potentially linked…
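As a rough illustration of just one piece of that traversal, here is a minimal sketch that enumerates the ids of loose objects using only the Rust standard library. It is not how gix does it: the function names `loose_object_ids` and `is_hex` are hypothetical, it relies only on git's on-disk layout of `objects/<2 hex digits>/<remaining hex digits>`, and it neither looks into packs or multi-pack indices nor decodes an object to check whether it is a blob.

```rust
use std::fs;
use std::io;
use std::path::Path;

/// Collect the hex ids of all loose objects below `objects_dir`
/// (for example `.git/objects`). Hypothetical helper, std-only.
fn loose_object_ids(objects_dir: &Path) -> io::Result<Vec<String>> {
    let mut ids = Vec::new();
    for fan_out in fs::read_dir(objects_dir)? {
        let fan_out = fan_out?;
        let prefix = fan_out.file_name().to_string_lossy().into_owned();
        // Loose objects live in directories named after the first two hex
        // digits of the id; this skips `pack`, `info` and anything else.
        if prefix.len() != 2 || !is_hex(&prefix) || !fan_out.file_type()?.is_dir() {
            continue;
        }
        for entry in fs::read_dir(fan_out.path())? {
            let entry = entry?;
            let rest = entry.file_name().to_string_lossy().into_owned();
            // 38 remaining digits for SHA-1 ids, 62 for SHA-256 ids.
            // Temporary files with other names are ignored.
            if (rest.len() == 38 || rest.len() == 62) && is_hex(&rest) {
                ids.push(format!("{}{}", prefix, rest));
            }
        }
    }
    // Sort for a deterministic order; directory iteration order is unspecified.
    ids.sort();
    Ok(ids)
}

/// True if `s` is non-empty and consists only of ASCII hex digits.
fn is_hex(s: &str) -> bool {
    !s.is_empty() && s.bytes().all(|b| b.is_ascii_hexdigit())
}
```

Calling `loose_object_ids(Path::new(".git/objects"))` would list only what has not been packed yet, which in most repositories is a small fraction of all objects. That is why a complete traversal also has to walk every pack index and multi-index, as described above.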

Replies: 1 comment 5 replies

Answer selected by Byron
2 participants