Memory error with many reference genomes as input #58
Comments
How many distinct k-mers are reported at the beginning of the output?
I don't see k-mers reported in the standard output.
Can you please paste the full bcalm output?
I've pasted it below.
Ahh thanks, there is something really odd.
Yes exactly - it's a multifasta of reference genomes.
That sounds quite reasonable, but I never tried that sort of counting. The k-mer counter is really optimized for counting reads or single reference genomes, not collections of reference genomes of various lengths, and its guess for internal memory allocation parameters is probably thrown off by the various genome sizes. What
Yes, I've tried 500GB-650GB, and also various values for max-disk.
Yeah, I'm sorry it didn't work, I could look into it sometime. Meanwhile you could give https://github.com/pmelsted/bifrost a try for outputting unitigs!
Thanks
I'm leaving this open; I would like to make bcalm able to process collections of genomes.
OK - It could be a good idea, as a number of recent k-mer indexing/querying tools recommend using bcalm, not necessarily in the usual context of assembling sequencing reads.
REINDEER is not limited by the bug you point out, as it makes BCALM process one genome (= one color) after the other, i.e. smaller chunks at a time.
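For anyone who wants to try the same "one genome at a time" idea by hand, here is a minimal sketch of splitting a multifasta into one file per record. It is not how REINDEER does it internally. It assumes each FASTA record is one complete genome (not true for multi-contig genomes), and the function name `split_multifasta` and the `genome_NNNNNN.fa` file naming are made up for illustration.

```python
import os


def split_multifasta(fasta_path, out_dir):
    """Write each FASTA record to its own file; return the paths in input order.

    Assumption: one record (one '>' header) corresponds to one genome.
    """
    os.makedirs(out_dir, exist_ok=True)
    paths = []
    out = None
    try:
        with open(fasta_path) as fin:
            for line in fin:
                if line.startswith(">"):
                    # A new header starts a new record: close the previous file.
                    if out is not None:
                        out.close()
                    # Hypothetical naming scheme: genome_000000.fa, genome_000001.fa, ...
                    path = os.path.join(out_dir, f"genome_{len(paths):06d}.fa")
                    out = open(path, "w")
                    paths.append(path)
                # Lines before the first header (if any) are skipped.
                if out is not None:
                    out.write(line)
    finally:
        if out is not None:
            out.close()
    return paths
```

Each resulting file can then be given to BCALM as a separate, much smaller input. The input is streamed line by line, so the split itself needs very little memory even for a large database.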
David, this may be a use case for TwoPaCo:
Thanks! I'll look into it.
I may have identified the issue with BCALM taking too much memory with many reference genomes. It can be temporarily fixed by making this change.
(It reduces memory by 5x on my end.)
I'm trying to run BCALM on a very large database and I keep getting errors that look like:
Is there some way to get around this? More/fewer cores? Lower max-memory (I don't have much more RAM available)? Large value for the -max-disk parameter? Thanks!