ValueError: The gene family has not beed associated to a partition. #262

frdel1 · 2024-08-07T11:49:02Z

Hi,
I am experiencing the following error with ppanggolin all:
"ValueError: The gene family has not beed associated to a partition."

Steps to reproduce:

# get a bunch of genomes to create the pangenome
datasets download genome accession GCF_009935005.1 GCF_001015835.1 GCF_009933955.1 GCF_001932715.1 GCF_027863375.1 --include gbff

# create the organism.gbff.list file
# create the pangenome with ppanggolin all
conda activate ppanggolin-2.1.0
ppanggolin all --anno /path/to/organism.gbff.list --cpu 1 --identity 0.8 --output /path/to/output_ppanggolin_all
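The "create the organism.gbff.list file" step is not spelled out above. A minimal sketch, assuming the downloaded ncbi_dataset.zip was unpacked so that each genome sits at <data_dir>/<accession>/genomic.gbff, and assuming the list is a tab-separated file with one genome name and one annotation path per line. The function name make_gbff_list is hypothetical and not part of PPanGGOLiN.

```shell
# Sketch: write one "name<TAB>path" line per downloaded GBFF file.
# Assumption: files are laid out as <data_dir>/<accession>/genomic.gbff.
make_gbff_list() {
    data_dir=$1
    out_file=$2
    : > "$out_file"                      # start from an empty list file
    for gbff in "$data_dir"/*/genomic.gbff; do
        [ -e "$gbff" ] || continue       # skip if the glob matched nothing
        # Use the accession directory name as the genome name.
        accession=$(basename "$(dirname "$gbff")")
        printf '%s\t%s\n' "$accession" "$gbff" >> "$out_file"
    done
}
```

For example, after unzipping the download: make_gbff_list ncbi_dataset/data organism.gbff.list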

Best wishes


jpjarnoux · 2024-08-07T15:41:21Z

Hi !

Sorry to hear about that.
Could you run your command again with the option --verbose 2 and share the output?

Thanks

frdel1 · 2024-08-07T16:36:52Z

consol.out.txt

Sure, here it is.
Are you able to reproduce the bug by downloading the genomes listed in the datasets download genome accession command above and running ppanggolin all?

jpjarnoux · 2024-08-08T10:08:47Z

Hi!

Thanks for the output. As I suspected, you don't have enough genomes in your pangenome.
The partitioning method is based on the NEM algorithm, and to work with the default parameters, we suggest using at least 15 genomes. You can find more information about the PPanGGOLiN method in the publication here.

Yet all is not lost. First, add the -K 2 option to the 'all' command. This option forces PPanGGOLiN to compute only two partitions.
If that does not work, I suggest following the step-by-step pangenome construction in the documentation (skip the workflow part). Alternatively, if you kept your pangenome, you can directly use the command explained here to customize the partitioning. @ggautreau will be of greater help than me at this stage.
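To illustrate the advice above, here is a rough sketch that counts the genomes in the list file and appends -K 2 to the original command when there are fewer than 15. The function name build_ppanggolin_cmd is hypothetical, the threshold of 15 is the suggestion from this thread rather than a hard limit, and the other flags are simply copied from the command in the report. It only prints the command so that it can be inspected before running.

```shell
# Sketch: print a "ppanggolin all" command, adding -K 2 for small genome sets.
# Assumption: the list file has one genome per non-empty line.
build_ppanggolin_cmd() {
    list_file=$1
    out_dir=$2
    # Count non-empty lines; "|| true" keeps going when the count is 0.
    n_genomes=$(grep -c . "$list_file" || true)
    cmd="ppanggolin all --anno $list_file --cpu 1 --identity 0.8 --output $out_dir"
    if [ "$n_genomes" -lt 15 ]; then
        # Fewer genomes than suggested for default partitioning: force two partitions.
        cmd="$cmd -K 2"
    fi
    printf '%s\n' "$cmd"
}
```

For example: build_ppanggolin_cmd /path/to/organism.gbff.list /path/to/output_ppanggolin_all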

frdel1 · 2024-08-08T10:17:09Z

Hi!
Thanks for the explanation and the tips, I will follow your advice and use at least 15 genomes then.

jpjarnoux · 2024-08-08T12:11:28Z

Another tip, if you don't mind me saying so.
You can build a pangenome with all genomes of your species from RefSeq or GenBank, for example, and project the pangenome on your five genomes of interest as explained here.

frdel1 · 2024-08-08T12:21:38Z

Thanks ! I have tried ppanggolin projection already, good stuff :)

JeanMainguy · 2024-08-22T11:59:54Z

Hi,
In version 2.1.1, we've changed the log to show a warning instead of a debug message when the partition step fails, which makes the problem easier to spot.
Ideally, PPanGGOLiN should still work even if partitioning fails, as mentioned in issue #270.

jpjarnoux added the help wanted and question labels Aug 8, 2024

jpjarnoux self-assigned this Aug 8, 2024

JeanMainguy removed the help wanted label Aug 8, 2024

This was referenced Aug 21, 2024

Print a warning log when partition step fails #269

Merged

Let PPanGGOLIN keep running if partition step fails #270

Open

JeanMainguy closed this as completed Aug 22, 2024
