OptiType killed OOM #120

Open
Akazhiel opened this issue Jan 19, 2021 · 4 comments

Comments

@Akazhiel

Hello!
I'm trying to run OptiType in the cloud on a machine with 64 GB of RAM, using RNA samples that are around 5 GB each because they are paired-end. The process is killed because it runs out of memory. I've also run OptiType in a pipeline on DNA tumor-normal pair samples, which are HLA-typed in parallel, without memory problems.

My question is: how much memory is needed to run OptiType?

Best regards!

@huguesfontenelle

In my experience, 12 GB is sufficient.
However, I don't feed in the entire FASTQs, only the reads from the relevant HLA region on chr6.

@b-niu

b-niu commented Feb 26, 2022

Hello @Akazhiel, in my case a server with 128 GB of RAM often runs out of memory when processing WES samples whose fastq.gz files are 20 GB in size.
This is really frustrating, and I am looking for a workaround.

Perhaps extracting the chr6 reads from the BAM files and running OptiType on those would help?
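
A minimal sketch of that idea, assuming a coordinate-sorted, indexed BAM and samtools on the PATH. The helper name extract_region_reads and the output file names are made up for illustration, and the region string is left as an argument because the HLA coordinates depend on the genome build and on whether contigs are named chr6 or 6. Reads whose mate maps outside the region become singletons and are discarded here, so the result may differ from a run on the full FASTQs.

```shell
#!/usr/bin/env bash
# Sketch: pull reads from one region of an indexed BAM and write paired FASTQs.
# extract_region_reads is a hypothetical helper name, not part of OptiType.
extract_region_reads() {
    local bam=$1      # coordinate-sorted, indexed BAM
    local region=$2   # e.g. "chr6:START-END"; depends on build and contig naming
    local prefix=$3   # prefix for all output files

    # Keep only alignments overlapping the region.
    samtools view -b -o "${prefix}.region.bam" "$bam" "$region" || return 1

    # Group mates together, then write R1/R2.
    # Singletons (-s) and reads flagged as neither/both mates (-0) are discarded.
    samtools collate -u -O "${prefix}.region.bam" |
        samtools fastq -1 "${prefix}_1.fastq" -2 "${prefix}_2.fastq" \
            -0 /dev/null -s /dev/null -n -
}
```

Hypothetical usage: `extract_region_reads sample.bam chr6:START-END sample_hla`, then pass `sample_hla_1.fastq` and `sample_hla_2.fastq` to OptiType.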

@riasc

riasc commented Nov 4, 2023

Hello, is there a solution for this? I found that files with more than 100,000 reads trigger a kill signal, even though these files are only 400-500 MB. It works well if I stay below that.

@karlestira

karlestira commented Apr 13, 2024

If you are running razers3 and it runs out of memory, you can try splitting the input file.

bgzip -cd [your fastq] | split -l 40000000 -a 5 -d --filter='razers3 -i 97 -m 99999 --distance-range 0 -pa -tc 0 -o $FILE.bam [your ref] /dev/stdin' /dev/stdin [split prefix]
samtools cat [split prefix]*.bam | samtools view -o res.bam

Then give the merged BAM file to OptiType instead of the FASTQs.
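
The value passed to `split -l` is a line count, and a standard FASTQ record is four lines, so the chunk size should stay a multiple of 4 or a record gets cut in half. A small sketch of that arithmetic, assuming unwrapped 4-line FASTQ records and GNU split; reads_to_lines and split_fastq are hypothetical helper names, and this version writes plain-text chunks rather than piping each one into razers3.

```shell
#!/usr/bin/env bash
# Sketch: derive the split -l value from a target number of reads per chunk.
# Assumes standard 4-line FASTQ records (no wrapped sequence lines).
reads_to_lines() {
    local reads=$1
    echo $(( reads * 4 ))
}

# split_fastq is a hypothetical helper; it writes uncompressed chunks named
# <prefix>00000, <prefix>00001, ... using numeric suffixes (-d, GNU split).
split_fastq() {
    local fastq=$1 reads_per_chunk=$2 prefix=$3
    split -l "$(reads_to_lines "$reads_per_chunk")" -a 5 -d "$fastq" "$prefix"
}

reads_to_lines 10000000   # prints 40000000
```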

You can use gzip instead of bgzip (but it is much slower) and adjust the split size (40M lines means 10M reads in my case, which needs about 4 GB of memory during alignment).
I strongly suggest using samtools view to recompress the BAM file produced by samtools cat, because samtools cat uses a tricky method to merge BAM files that may not be supported by older decompressors. And, very importantly, if you write the output of samtools cat directly to a file, make sure the output file is not in the input list, or the file will grow without bound.
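
One way to guard against the output-in-input mistake is to check the file list before merging. This is only a sketch: merge_chunks is a hypothetical helper name, it relies on the `-ef` test supported by bash, and it spells out `-b` and `-` for samtools view so that BAM output and reading from stdin are explicit.

```shell
#!/usr/bin/env bash
# Sketch: refuse to merge when the output file is also one of the inputs.
# merge_chunks is a hypothetical helper name.
merge_chunks() {
    local out=$1; shift
    local f
    for f in "$@"; do
        # -ef is true when both paths refer to the same existing file.
        if [ "$f" -ef "$out" ]; then
            echo "error: output '$out' is also an input" >&2
            return 1
        fi
    done
    # Concatenate the chunks, then recompress through samtools view.
    samtools cat "$@" | samtools view -b -o "$out" -
}
```

Hypothetical usage: `merge_chunks res.bam chunk.*.bam`, which stops with an error if a `res.bam` from an earlier run happens to match the glob.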

Another thing to note: you should force razers3 to run single-threaded (the -tc 0 option in the command above does this). Its multi-threaded mode is inconsistent and can cause small problems in the results, and it seems to give very little speed-up with a small reference.

If you don't think splitting the FASTQ is a good plan, you can also use bowtie2 to filter the FASTQ.

bowtie2 --no-unal --very-sensitive-local --local --omit-sec-seq -p 10 --reorder .....(index and fastq)

Bowtie2 uses about 200 MB of memory (this does not increase as the input gets larger) and produces a filtered BAM with the useless sequences removed. You can then convert that BAM to FASTQ for OptiType. However, I'm not sure this method gives output fully consistent with using the raw FASTQ directly.
