Description of the bug
I've been having an issue with some files failing at checksum in some studies. Upon investigation, for at least some of these failing samples, it appears to be due to the pipeline not picking the correct MD5 value from the metadata.
For example, manually downloading this file finishes and yields a `7b730` checksum:
However, looking at the `command.sh` file for this operation, the pipeline is comparing with an `a3fcee` checksum:
If we look at the metadata downloaded for this run, we see both checksums being represented, but in different columns:
It appears as if there are three fastq files, and the workflow is grabbing the first one (maybe an index read? it's much smaller than the other two) and renaming it `_1.fastq.gz`, then comparing against the latter's MD5. I haven't looked in the code yet to determine where the logic is that's splitting reads 1 and 2, but it appears that it might be making too liberal an assumption about the structure of the `fastq_ftp` field?
Maybe related to issue #260?
Either way, this is leading to failed downloads, so it seems like it should properly be considered a bug.
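To illustrate what I suspect is going wrong, here is a minimal Python sketch. It is not the pipeline's code. It assumes the `fastq_ftp` and `fastq_md5` metadata fields are semicolon-separated and listed in the same order, and it matches reads 1 and 2 by filename suffix instead of by position. The function name, URLs, and checksum values are hypothetical placeholders.

```python
import re


def pair_reads_with_md5(fastq_ftp: str, fastq_md5: str) -> dict:
    """Map read number ('1' or '2') to (url, md5), ignoring any other files.

    Assumption: both fields are semicolon-separated and positionally aligned.
    """
    urls = [u for u in fastq_ftp.split(";") if u]
    md5s = [m for m in fastq_md5.split(";") if m]
    if len(urls) != len(md5s):
        raise ValueError("fastq_ftp and fastq_md5 have different entry counts")

    paired = {}
    for url, md5 in zip(urls, md5s):
        # Select by the _1/_2 suffix, so a leading extra file (for example an
        # index read) is skipped rather than treated as read 1.
        match = re.search(r"_([12])\.fastq\.gz$", url)
        if match:
            paired[match.group(1)] = (url, md5)
    return paired


# Hypothetical three-file run: the first entry has no _1/_2 suffix.
example_ftp = (
    "ftp.example.org/RUN0/RUN0.fastq.gz;"
    "ftp.example.org/RUN0/RUN0_1.fastq.gz;"
    "ftp.example.org/RUN0/RUN0_2.fastq.gz"
)
example_md5 = "md5_extra;md5_read1;md5_read2"
paired = pair_reads_with_md5(example_ftp, example_md5)
```

Taking the first two entries by position would instead pair the extra file with the read 1 name, which would explain a checksum mismatch like the one above.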
Command used and terminal output
Relevant files
No response
System information
No response