Support for pipes #26

gcambray · 2023-06-02T15:00:37Z

Hi Daniel,

Thanks for the tool. I've just started using it instead of UMI-tools, and it is much faster!

At the moment, I'm using it in fastq mode as part of a pipeline in Python.
Since I only want the tool to run on the UMIs, not the read sequences, I create a file containing a FASTQ of the UMIs parsed in a previous step.
Instead of writing/reading all the data to/from a file, I'd rather pipe the data into UMICollapse.
Likewise, I'd rather pipe the data out instead of reading from the generated file and then deleting it.

It would be great to support e.g. the '-' notation for the -i and -o arguments, to specify reading from stdin and writing to stdout respectively.
Would this be possible to implement?

In the meantime, I tried to emulate this by passing /dev/fd/n 'files' (on Unix) as arguments to both -i and -o.
This works great until I use the --tag option, in which case I do not receive anything on the output stream.
If I provide a real file as -i, then I do get output on the stream.
I suspect that internally the input is tagged to produce the output, and that this somehow doesn't work if the input is a stream...

With thanks and best regards

Daniel-Liu-c0deb0t · 2023-06-06T21:34:19Z

For tracking clusters with --tag, two passes need to be made over the input. Therefore, this is only possible with an input file. The reason why UMICollapse is designed this way is to avoid having to load all the reads into memory in one pass. I would suggest using a temporary file as input.
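
To illustrate the constraint described above: a second pass means re-reading the input from the start, which a regular file allows and a pipe does not. The Python sketch below only demonstrates that general behaviour. It is not UMICollapse's actual code, and names such as `can_rewind` are made up for this example.

```python
import os
import tempfile

FASTQ = "@r1\nACGT\n+\nIIII\n"  # one toy record, for illustration only


def can_rewind(stream) -> bool:
    """Return True if the stream can be repositioned to its start."""
    try:
        return stream.seekable()
    except (AttributeError, ValueError):
        return False


# Case 1: a pipe. Data is consumed as it is read, so a second pass sees nothing.
read_fd, write_fd = os.pipe()
with os.fdopen(write_fd, "w") as writer:
    writer.write(FASTQ)
with os.fdopen(read_fd, "r") as pipe_in:
    pipe_seekable = can_rewind(pipe_in)
    pipe_first_pass = pipe_in.read()
    pipe_second_pass = pipe_in.read()  # already at end of stream

# Case 2: a regular (temporary) file. It can be rewound for a second pass.
with tempfile.NamedTemporaryFile("w+", suffix=".fastq") as tmp:
    tmp.write(FASTQ)
    tmp.flush()
    file_seekable = can_rewind(tmp)
    tmp.seek(0)
    file_first_pass = tmp.read()
    tmp.seek(0)
    file_second_pass = tmp.read()
```

Buffering the whole stream in memory would also allow a second pass, but that is exactly the memory cost the two-pass design avoids.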

gcambray · 2023-06-07T12:23:10Z

Many thanks for the explanation. Yes, I did indeed switch to a temporary file as input! Piping the output is still possible. Best, Guillaume
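
For anyone landing here with the same setup, the workaround can be sketched in Python roughly as follows: write the UMIs to a temporary FASTQ file, pass that file as -i, and read the result from the process's stdout. This is only a sketch under assumptions. The executable name `umicollapse`, the use of `/dev/stdout` as the -o target (Unix only), and the helper names are hypothetical and should be adapted to your installation. Only fastq mode, -i, -o and --tag come from this thread.

```python
import os
import subprocess
import sys
import tempfile
from typing import Callable, Iterable, List, Tuple


def run_with_temp_input(
    records: Iterable[Tuple[str, str]],
    build_cmd: Callable[[str], List[str]],
) -> str:
    """Write (name, umi) records to a temporary FASTQ file, run the command
    returned by build_cmd(path) on it, and return the command's stdout."""
    tmp = tempfile.NamedTemporaryFile("w", suffix=".fastq", delete=False)
    try:
        with tmp:
            for name, umi in records:
                # Dummy quality string; only the UMI sequence matters here.
                tmp.write(f"@{name}\n{umi}\n+\n{'I' * len(umi)}\n")
        result = subprocess.run(
            build_cmd(tmp.name), stdout=subprocess.PIPE, check=True, text=True
        )
        return result.stdout
    finally:
        os.unlink(tmp.name)  # remove the temporary input in all cases


def umicollapse_cmd(in_path: str) -> List[str]:
    # ASSUMPTION: executable name and /dev/stdout as output target are guesses;
    # a real input file is used because --tag needs two passes over the input.
    return ["umicollapse", "fastq", "-i", in_path, "-o", "/dev/stdout", "--tag"]


def echo_cmd(in_path: str) -> List[str]:
    # Stand-in command that just prints the file, so the sketch can be
    # exercised without UMICollapse installed.
    code = "import sys; sys.stdout.write(open(sys.argv[1]).read())"
    return [sys.executable, "-c", code, in_path]
```

Calling `run_with_temp_input(umis, umicollapse_cmd)` would then return the collapsed FASTQ as a string, while `echo_cmd` can be swapped in to check the plumbing.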

