metabat2 binning #117

Open
ebustos128 opened this issue Oct 7, 2021 · 3 comments

@ebustos128

Hi,
I'm trying to do a binning with metabat2 using the following config.yaml:

------ Samples ------

samples: '' # specify a list samples to use or '' to use all samples

------ Resources ------

threads : 16 # single task nb threads

------ Assembly parameters ------

data: /home/ebustos/05.PATOS_metagenomes/23.STRONG_Runs/04.Tomeu_metagenomes/02.STRONG_samples # path to data folder

----- Annotation database -----

cog_database: /home/ebustos/05.PATOS_metagenomes/23.STRONG_Runs/cogs/Cog # COG database

----- Binner ------

binner: metabat2

----- Binning parameters ------

contig_size: 1500

read_length: 150
assembly:
assembler: spades
k: [77]
mem: 200000
threads: 16

----- BayesPaths parameters ------

bayespaths:
nb_strains: 16
nmf_runs: 1
max_giter: 1
min_orf_number_to_merge_bins: 10
min_orf_number_to_run_a_bin: 10
percent_unitigs_shared: 0.1

----- DESMAN parameters ------

desman:
execution: 1
nb_haplotypes: 10
nb_repeat: 5
min_cov: 1

----- Evaluation ------

#evaluation:
#execution: 1
# genomes: /home/ebustos/05.PATOS_metagenomes/23.STRONG_Runs/01.PATOS_samples/01.Pond1/01.Cycle1/Eval # path to reference genomes

Do you think this config.yaml is correct? Also, I have checked the snake files and I haven't seen any option to do the binning with metabat2.

Best,
Esteban

@Sebastien-Raguideau
Collaborator

Hi Esteban,
I admit it's hard to say anything from the text pasted here. Can you share your config as a file instead? Important features of .yaml are indentation and the space after each colon. Try a YAML validator to check that your file is valid YAML.
Usually, the simplest way is to take the template config file and complete it, which you seem to have done, so it should be fine.
Another good way to check whether the config file is valid is to try launching STRONG with it; that would be quite fast.
Regarding metabat2, what options were you looking for? Are you saying that you didn't find the part of the code where metabat2 is used, or that you would like to be able to run metabat2 with some options?
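
As an illustration of the indentation point, here is a minimal sketch of a heuristic check, not a YAML parser and not part of STRONG. It flags section keys such as `assembly:` that have no inline value and are followed by an unindented line, which usually means the nesting was lost. The function name `find_flattened_sections` is hypothetical, and a real YAML validator remains the more reliable check.

```python
import re

# A top-level key with no inline value, e.g. "assembly:" or "desman:".
SECTION_HEADER = re.compile(r"^[A-Za-z_][\w-]*\s*:\s*(#.*)?$")


def find_flattened_sections(text):
    """Return (line_number, key) for section headers whose next content
    line is not indented, which suggests the nesting was lost."""
    lines = text.splitlines()
    suspects = []
    for i, line in enumerate(lines):
        if not SECTION_HEADER.match(line):
            continue
        # Look ahead to the next line that is neither blank nor a comment.
        for nxt in lines[i + 1:]:
            if nxt.strip() and not nxt.lstrip().startswith("#"):
                # Indented children are fine; so are unindented "- item"
                # lists, which YAML accepts directly under a key.
                if not nxt.startswith((" ", "\t", "-")):
                    suspects.append((i + 1, line.split(":")[0].strip()))
                break
    return suspects


if __name__ == "__main__":
    flattened = "assembly:\nassembler: spades\nk: [77]\n"
    nested = "assembly:\n    assembler: spades\n    k: [77]\n"
    print(find_flattened_sections(flattened))
    print(find_flattened_sections(nested))
```

An empty result only means this particular symptom was not found; it does not prove the file is valid.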

@ebustos128
Author

Hi,
I had been using an old version of STRONG. Now I'm running STRONG with metabat2 binning, but I get the following error:

[Sun Oct 10 14:25:51 2021]
Error in rule create_bin_folders:
jobid: 156
output: binning/metabat2/list_mags.tsv, binning/metabat2/SCG_table_metabat2.csv
shell:
/gpfs/gpfs1/scratch/c9881009/apps/STRONG/SnakeNest/scripts/SCG_in_Bins.py binning/metabat2/clustering_metabat2.csv annotation/SCG.fna annotation/assembly.bed profile/split.bed /gpfs/gpfs1/scratch/c9881009/apps/STRONG/SnakeNest/scg_data/scg_cogs_to_run.txt -all subgraphs/bin_init/ -l binning/metabat2/list_mags.tsv -T 0.75 -t binning/metabat2/SCG_table_metabat2.csv
(one of the commands exited with non-zero exit code; note that snakemake uses bash strict mode!)

Full Traceback (most recent call last):
File "/home/c988/c9881009/.conda/envs/mamba/envs/STRONG/lib/python3.6/site-packages/snakemake/executors/__init__.py", line 2395, in run_wrapper
basedir,
File "/gpfs/gpfs1/scratch/c9881009/apps/STRONG/SnakeNest/Binning.snake", line 266, in __rule_create_bin_folders
File "/home/c988/c9881009/.conda/envs/mamba/envs/STRONG/lib/python3.6/site-packages/snakemake/shell.py", line 263, in __new__
raise sp.CalledProcessError(retcode, cmd)
subprocess.CalledProcessError: Command '/gpfs/gpfs1/scratch/c9881009/apps/STRONG/SnakeNest/scripts/SCG_in_Bins.py binning/metabat2/clustering_metabat2.csv annotation/SCG.fna annotation/assembly.bed profile/split.bed /gpfs/gpfs1/scratch/c9881009/apps/STRONG/SnakeNest/scg_data/scg_cogs_to_run.txt -all subgraphs/bin_init/ -l binning/metabat2/list_mags.tsv -T 0.75 -t binning/metabat2/SCG_table_metabat2.csv' returned non-zero exit status 1.

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
File "/home/c988/c9881009/.conda/envs/mamba/envs/STRONG/lib/python3.6/site-packages/snakemake/executors/__init__.py", line 592, in _callback
raise ex
File "/home/c988/c9881009/.conda/envs/mamba/envs/STRONG/lib/python3.6/concurrent/futures/thread.py", line 56, in run
result = self.fn(*self.args, **self.kwargs)
File "/home/c988/c9881009/.conda/envs/mamba/envs/STRONG/lib/python3.6/site-packages/snakemake/executors/__init__.py", line 578, in cached_or_run
run_func(*args)
File "/home/c988/c9881009/.conda/envs/mamba/envs/STRONG/lib/python3.6/site-packages/snakemake/executors/__init__.py", line 2407, in run_wrapper
ex, lineno, linemaps=linemaps, snakefile=file, show_traceback=True
snakemake.exceptions.RuleException: CalledProcessError in line 115 of /gpfs/gpfs1/scratch/c9881009/apps/STRONG/SnakeNest/Binning.snake:
Command '/gpfs/gpfs1/scratch/c9881009/apps/STRONG/SnakeNest/scripts/SCG_in_Bins.py binning/metabat2/clustering_metabat2.csv annotation/SCG.fna annotation/assembly.bed profile/split.bed /gpfs/gpfs1/scratch/c9881009/apps/STRONG/SnakeNest/scg_data/scg_cogs_to_run.txt -all subgraphs/bin_init/ -l binning/metabat2/list_mags.tsv -T 0.75 -t binning/metabat2/SCG_table_metabat2.csv' returned non-zero exit status 1.
File "/gpfs/gpfs1/scratch/c9881009/apps/STRONG/SnakeNest/Binning.snake", line 115, in __rule_create_bin_folders

RuleException:
CalledProcessError in line 115 of /gpfs/gpfs1/scratch/c9881009/apps/STRONG/SnakeNest/Binning.snake:
Command '/gpfs/gpfs1/scratch/c9881009/apps/STRONG/SnakeNest/scripts/SCG_in_Bins.py binning/metabat2/clustering_metabat2.csv annotation/SCG.fna annotation/assembly.bed profile/split.bed /gpfs/gpfs1/scratch/c9881009/apps/STRONG/SnakeNest/scg_data/scg_cogs_to_run.txt -all subgraphs/bin_init/ -l binning/metabat2/list_mags.tsv -T 0.75 -t binning/metabat2/SCG_table_metabat2.csv' returned non-zero exit status 1.
File "/gpfs/gpfs1/scratch/c9881009/apps/STRONG/SnakeNest/Binning.snake", line 115, in __rule_create_bin_folders
File "/home/c988/c9881009/.conda/envs/mamba/envs/STRONG/lib/python3.6/concurrent/futures/thread.py", line 56, in run
Job failed, going on with independent jobs.
Exiting because a job execution failed. Look above for error message
Complete log: /gpfs/gpfs1/scratch/c9881009/projects/PATOS/02.Trimmed_files/05.PATOS_STRONG_results/.snakemake/log/2021-10-09T103712.730549.snakemake.log
unlocking
removing lock
removing lock
removed all locks

It seems to fail only when I run with metabat2 binning, because with concoct the whole pipeline works fine!
Best,
Esteban

@Sebastien-Raguideau
Collaborator

Hi Esteban,

Sorry for the delay, I was away for a bit.

Yeah, so this is a downside of snakemake: that log doesn't tell us why it failed. There is no specific log for that script, so the best way to debug it would be for you to rerun the failing command outside of STRONG.
That would be:
/gpfs/gpfs1/scratch/c9881009/apps/STRONG/SnakeNest/scripts/SCG_in_Bins.py binning/metabat2/clustering_metabat2.csv annotation/SCG.fna annotation/assembly.bed profile/split.bed /gpfs/gpfs1/scratch/c9881009/apps/STRONG/SnakeNest/scg_data/scg_cogs_to_run.txt -all subgraphs/bin_init/ -l binning/metabat2/list_mags.tsv -T 0.75 -t binning/metabat2/SCG_table_metabat2.csv
I predict that this will be an issue with one of the files it is using. I'm not sure whether that script works when any of those files is empty.
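
To make that check concrete, here is a minimal sketch, assuming it is run from the STRONG output folder. It reports which inputs of the failing command are missing or empty, and it shows one way to rerun a command while keeping its stderr. The helper names `find_problem_inputs` and `run_and_capture` are hypothetical and not part of STRONG; the relative paths are copied from the command above.

```python
import subprocess
import sys
from pathlib import Path


def find_problem_inputs(paths):
    """Return (path, reason) pairs for inputs that are missing or empty."""
    problems = []
    for p in map(Path, paths):
        if not p.is_file():
            problems.append((str(p), "missing"))
        elif p.stat().st_size == 0:
            problems.append((str(p), "empty"))
    return problems


def run_and_capture(cmd):
    """Run cmd (a list of arguments) and return (exit_code, stderr_text)."""
    # stdout/stderr=PIPE is used instead of capture_output, which needs
    # Python >= 3.7; the traceback above comes from a Python 3.6 environment.
    proc = subprocess.run(cmd, stdout=subprocess.PIPE, stderr=subprocess.PIPE,
                          universal_newlines=True)
    return proc.returncode, proc.stderr


if __name__ == "__main__":
    # Relative input paths copied from the failing command.
    inputs = [
        "binning/metabat2/clustering_metabat2.csv",
        "annotation/SCG.fna",
        "annotation/assembly.bed",
        "profile/split.bed",
    ]
    for path, reason in find_problem_inputs(inputs):
        print(f"{path}: {reason}", file=sys.stderr)
```

Passing the full failing command to `run_and_capture` as a list of arguments would return the script's own error message, which the snakemake log does not show.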

Also, the obligatory PEBKAC question: did you use the same output folder for both the concoct and metabat2 runs? I don't think this is clearly specified in the docs, but you should not do so, as it would be quite problematic for downstream analysis. It is possible, though, to symlink the assembly and profile folders into the other STRONG folder to speed things up.
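
The symlink idea could look roughly like the sketch below, assuming the two runs live in separate output folders and that the shared folders are named `assembly` and `profile` as mentioned above. The function name `link_shared_folders` is hypothetical, and whether snakemake then treats the linked results as up to date is not something this sketch checks.

```python
from pathlib import Path


def link_shared_folders(src_run, dst_run, names=("assembly", "profile")):
    """Symlink the named folders from src_run into dst_run; return links made."""
    src_run, dst_run = Path(src_run).resolve(), Path(dst_run).resolve()
    dst_run.mkdir(parents=True, exist_ok=True)
    created = []
    for name in names:
        target, link = src_run / name, dst_run / name
        if not target.is_dir():
            raise FileNotFoundError(f"{target} is not a directory")
        if link.exists() or link.is_symlink():
            continue  # never overwrite something already in the new run folder
        link.symlink_to(target, target_is_directory=True)
        created.append(link)
    return created
```

For example, `link_shared_folders("run_concoct", "run_metabat2")` (both folder names made up here) would leave the concoct run untouched and give the metabat2 run its own output folder with only the two shared inputs linked in.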

Best,
Seb


2 participants