Running atlas on a single machine #419
-
Do you use the cluster mode of atlas, i.e. have you installed a cluster profile for it? I'm not sure if I already explained this before, but if you run multiple samples on your 1.5 TB machine and allow each job to take 1.2 TB, you will get an error as soon as more than one sample is executed simultaneously.
You should tell atlas the memory limit of your machine or use a cluster profile; both are explained in the documentation.
Unless you have abnormally big samples, 100 GB of memory should be sufficient.
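A rough sketch of what this looks like in practice, assuming the total memory budget is passed to Snakemake via `--resources mem=...` in GB, as it is later in this thread (the numbers are placeholders, not a recommendation):

```
# On a 1.5 TB machine, two jobs that may each claim 1.2 TB would together
# request 2.4 TB and fail. Capping the schedulable budget below the physical
# RAM lets Snakemake hold back jobs that would not fit:
atlas run all --resources mem=1400 --jobs 32
```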
-
Update: when I run fewer samples it works, however. Is there any possibility to configure one working directory with the whole project, but then run this initial QC step only for smaller groups of samples (e.g. by modifying samples.tsv)? In the config file I am using for the "small machine" I set the threads and memory for jobs needing a high amount of memory (e.g. GTDB-Tk, CheckM or assembly). I thought that this way Snakemake would distribute the resources equally across all samples, but it is not doing this, right? So what would you suggest as a config if I have, for example, 70 samples of 10 M reads each? PS: this machine has 36 CPUs / 500 GB.
-
The memory in the config file is for a single step for a single sample. If you run atlas on a single machine (in contrast to a cluster) you have to tell atlas the maximum memory of your system so that it can schedule jobs within that limit. Please follow the steps here: https://metagenome-atlas.readthedocs.io/en/latest/usage/getting_started.html#single-machine-execution
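For readers without the link at hand, the single-machine setup described there amounts to putting per-step limits in the config and giving atlas an overall budget on the command line. A minimal sketch, using the config keys named elsewhere in this thread and example values for a 500 GB / 36-core machine (check your own installation for the exact key names):

```
# config.yaml (per-step limits, in GB)
threads: 8
mem: 100          # most jobs
large_mem: 250    # memory-hungry jobs such as GTDB-Tk, CheckM, assembly

# command line (overall budget for the whole machine)
atlas run all --jobs 36 --resources mem=450
```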
-
OK, thanks a lot. I think that is the problem, indeed. Now, as there are many samples, I also added --resources mem=1200. However, I have some big samples (100 M reads each), so for these I am trying with 500 GB in the config file (because 100 GB did not work). Let's see if it works.
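For context, the change described here would look roughly like this in the config file. The post does not say which key was raised; these are the candidates named elsewhere in this thread, with the 500 GB value from the post:

```
# config.yaml – raise whichever limit applies to the failing step
mem: 500              # GB, most jobs (BBTools-based preprocessing steps)
assembly_memory: 500  # GB, the assembly itself
```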
-
This worked fine, although it seems there is a lot to optimize. For example, during assembly the rule error_correction (the Tadpole2 step) uses a lot of memory (some huge samples I have, with 120 M reads each, need 700 GB to work), but run_spades (which is the most time-consuming step) uses very little memory. Most of the time is spent in run_spades without being able to start any other sample at the same time (I can only run 2 samples at a time on the 1500 GB machine).
-
Yes, cluster submission is the recommended way. You can run the pipeline up to run_spades if you really want, e.g. with the arguments:
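The arguments themselves were not preserved in this thread. One way to express "run everything up to the assembler" is Snakemake's `--until` flag, which atlas forwards to Snakemake, targeting a rule that precedes run_spades. A sketch, using the error_correction rule mentioned above (check the exact rule names in your atlas version):

```
# run QC, merging and error correction, but stop before SPAdes
atlas run assembly --until error_correction
```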
-
OK, that worked to run the first rules (merge pairs, error_correction, ...; for some of these samples these steps took 900 GB of RAM). Now I lowered the mem in the config file and re-ran with "atlas run assembly --resources mem=1500 --jobs 36". The idea was that multiple run_spades jobs would be launched, but only one started (rule run_spades). I put this in the config file, just as a first try, although it is probably too little memory for the assembly: the threads and memory (GB) for most jobs, especially from BBTools, which are memory demanding, and the threads and memory for jobs needing a high amount of memory (e.g. GTDB-Tk, CheckM or assembly).
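The config fragment got flattened into the prose above; based on the comment lines quoted, it presumably looked roughly like the following (the values are illustrative, not necessarily the poster's actual numbers):

```
# threads and memory (GB) for most jobs especially from BBtools, which are memory demanding
threads: 8
mem: 60

# threads and memory for jobs needing high amount of memory. e.g GTDB-tk, checkm or assembly
large_mem: 250
assembly_threads: 8
assembly_memory: 250
```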
-
I think @Sofie8 has a similar problem running atlas on a single machine. I read somewhere that SPAdes doesn't parallelize well beyond 8 threads. Also, the steps for GTDB and CheckM explode with too many threads. On the other hand, you said you have big samples, so you would need much more memory for SPAdes. I suggest you use something closer to the default settings:
Now you would run atlas like this if you have 1.5 TB:
I'm not sure if you should set the …
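The suggested settings and the run command were not preserved in the thread. A hedged reconstruction that keeps SPAdes, GTDB-Tk and CheckM at 8 threads as recommended above and stays within the 1.5 TB machine (treat all numbers as assumptions, not the maintainer's exact recommendation):

```
# config.yaml
threads: 8
mem: 100
large_mem: 250
assembly_threads: 8
assembly_memory: 500    # large per-sample assemblies

# run, capping the total memory below the machine's 1.5 TB
atlas run all --jobs 36 --resources mem=1400
```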
-
From the Snakemake docs:
If you specify 8 threads in the config and run atlas with or without the --jobs argument, the SPAdes step should run with 8 threads.
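In practice this means the per-rule threads value is what controls SPAdes, while `--jobs` only bounds how much can run in parallel. A small worked example, under the assumption that `--jobs` acts as the local core budget (as it does in the Snakemake versions I know):

```
# config.yaml
assembly_threads: 8

# atlas run assembly --jobs 36
# -> each SPAdes job still gets 8 threads; at most 36 / 8 = 4 assemblies
#    can run side by side, further limited by the memory budget.
```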
-
Yep, that works. My mistake was putting threads: 36 in the config file and then setting --jobs 36, together with assembly_threads: 8.
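For anyone else hitting the same thing, the fix amounts to something like this (only the keys named in this thread):

```
# before: threads: 36 let every rule claim the whole machine
# after: modest per-rule threads; --jobs controls overall parallelism
threads: 8
assembly_threads: 8

# then: atlas run assembly --jobs 36
```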
-
This may be linked to #319.
-
However, I think that when having huge samples like I do (from 50 to 200 M reads each), running it in cluster mode would be the same in terms of SPAdes needing a lot of memory (probably between 500 and 1200 GB) for each sample, right? I mean, what difference would it make to do it in cluster mode? Can the assembly step be split in this case?
-
OK, it seems you have very big samples. If you don't manage to assemble your reads with SPAdes, you should either subsample, split your samples, or use MEGAHIT.
Why use cluster mode: at my institution we have a cluster system with multiple high-memory nodes. If I use atlas in cluster mode, each sample is sent to a different node, so I can assemble ~5-20 samples in parallel, depending on the availability of the cluster nodes. All the other steps before and after the assembly are also executed alongside it. In single-machine mode you submit atlas to one big node, and in your case you will probably only assemble one sample at a time. I also have time constraints on my cluster: e.g. there are many nodes where I can run a job for <12 h and only one high-memory node where I can run atlas for >1 d.
SPAdes has different checkpoints during the assembly and is configured to restart from the last checkpoint.
By the way, where do you work?
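For completeness, cluster mode is driven by a Snakemake cluster profile rather than one big job. A hypothetical invocation (the profile name and how to install it depend on your scheduler; see the atlas documentation on cluster execution):

```
# hypothetical: "cluster" is the name of an installed Snakemake profile for your scheduler;
# each rule (e.g. one assembly per sample) is then submitted as its own cluster job
atlas run all --profile cluster --jobs 20
```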
-
I think I'm in a similar situation here, as the average number of reads across my files is 47 M, with a maximum of 113 M for one sample (and the second in line containing 83 M reads). Do I understand correctly that setting:
worked in this case for both preprocessing and the assembly with SPAdes? Or do you have to increase the memory for preprocessing and then lower it again for SPAdes? And which one should be increased/lowered then: "mem", "large_mem" or "assembly_memory"? I have a single machine with 36 threads and 738 GB, so I was planning to run atlas like so:
Or should I also add --jobs 36?
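Not an authoritative answer from the thread, but a sketch of how the keys asked about above might be set on a 738 GB / 36-thread machine, keeping the total budget below the physical RAM (all values are guesses to adapt to your samples):

```
# config.yaml
threads: 8
mem: 100              # preprocessing / BBTools steps
large_mem: 250        # GTDB-Tk, CheckM
assembly_memory: 500  # for the ~100 M read samples

# --resources caps the summed memory of running jobs, --jobs caps the core budget
atlas run all --resources mem=700 --jobs 36
```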
-
calculate_insert_size_400GB.log
calculate_insert_size_1200GB.log
Hi there!
I got these errors running my samples on the 500 GB / 1500 GB machines. I think it is related to the Tadpole2 memory, but I set 400 GB and 1200 GB respectively in the config file, so I am not sure why it fails. Could it be that there are too many / too big samples and Snakemake somehow does not manage to allocate the available resources correctly? If I run each sample independently, it runs OK. Also, if I run the Tadpole2 step by activating its own environment, it also runs OK for one sample.
Best