-
We are using the MD5 hashes from the .dvc files to check the consistency of the stored files. When storing entire directories, DVC calculates an MD5 hash with the suffix .dir. How is this MD5 hash calculated? We will need this MD5 hash to check the consistency of the entire directory. Thank you very much.
-
Hey, I think this link should give an answer: https://dvc.org/doc/user-guide/project-structure/internal-files#directories (specifically, the
`.dir` files are JSON files that contain a listing of the files in the directory. The hash is the MD5 of the JSON content. This hash is mainly useful for checking whether you have downloaded the correct JSON content for a given `.dir` file.

How are you using the API to download files?
Also, just to be clear, if you are using DVC 2.x, the MD5 hash computed by DVC is not always the MD5 of the original file content, and you cannot compare the DVC MD5 hash to the results of something like `md5sum` (or Python's `hashlib.md5`).
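
To make the `.dir` part concrete, here is a rough Python sketch of the check described above: hash the raw JSON bytes of a `.dir` file and compare the result with the hash recorded by DVC. The function names are made up for illustration, and the entry keys `md5` and `relpath` are an assumption based on the linked documentation page, so adjust them if your listing looks different. How you obtain the raw bytes (cache path, remote download) is left out on purpose.

```python
import hashlib
import json


def md5_hex(data: bytes) -> str:
    """Return the hex MD5 digest of raw bytes."""
    return hashlib.md5(data).hexdigest()


def check_dir_listing(raw: bytes, recorded_hash: str) -> dict:
    """Verify the JSON content of a .dir file and return its listing.

    raw           -- the exact bytes of the .dir file (hypothetical input;
                     reading it from the cache or a remote is up to you)
    recorded_hash -- the hash from the .dvc file, with or without ".dir"
    """
    expected = recorded_hash
    if expected.endswith(".dir"):
        expected = expected[: -len(".dir")]

    actual = md5_hex(raw)
    if actual != expected:
        raise ValueError(f"listing hash mismatch: {actual} != {expected}")

    # Assumption: each entry is an object with "relpath" and "md5" keys.
    entries = json.loads(raw)
    return {entry["relpath"]: entry["md5"] for entry in entries}
```

This only confirms that the listing itself is the one DVC recorded. Given the DVC 2.x caveat above, the per-file hashes in the listing should not be assumed to match `hashlib.md5` of the file content.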