When processing SRA RNA-seq fastq files with Fastool as part of the Trinity package, Fastool appends a /H to the end of sequence ids which then causes errors downstream in Trinity.
Hi,
This is due to the SRA file header: the --illumina-trinity option called by Trinity was meant to be used with Illumina FastQ files, which have their typical header format. In this case a quick workaround would be to first run fastool alone on the R1 and R2 datasets with the options:
Here are the first few lines of an SRA file: https://gist.github.com/tsackton/8c5508a4b60a1e33f6f2
When I run: fastool --to-fasta --illumina-trinity sra_test.fq > sra_test.1.fa, the output headers look like this:
If I remove everything after the first space in the sra example (with seqtk seq -C), the output is normal:
The /H files do not work with Trinity, while the normal files after seqtk seq -C processing do.
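For anyone without seqtk available, the same header cleanup can be roughly sketched in Python. This is only an illustration of the idea (drop everything after the first whitespace in each header line), not what seqtk or fastool does internally. The function name `strip_header_comments` and the sample record are made up for the example, and it assumes strict four-line FASTQ records with no wrapped sequence lines.

```python
import io


def strip_header_comments(fastq_in, fastq_out):
    """Copy FASTQ records, truncating each header at the first whitespace.

    Assumption: every record is exactly four lines (header, sequence,
    '+' separator, qualities). Only the header line is modified.
    """
    for line_number, line in enumerate(fastq_in):
        if line_number % 4 == 0:
            # Header line: keep only the read id before the first space or tab.
            fields = line.split(None, 1)
            line = (fields[0] if fields else "") + "\n"
        fastq_out.write(line)


# Hypothetical SRA-style record, invented for illustration only.
demo = "@SRR000001.1 HYPOTHETICAL_READ_1 length=4\nACGT\n+\nIIII\n"
result = io.StringIO()
strip_header_comments(io.StringIO(demo), result)
cleaned = result.getvalue()
```

With real files, the two arguments would be open file handles for the input and output FASTQ, and the cleaned file could then be passed to the same fastool command as above.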
This was tested with the latest version of fastool, compiled on CentOS 6 with gcc 4.8.2.