BUG: evaluate-busco breaks when input mags > ~1,300 #142

Merged
merged 14 commits into from
Mar 6, 2024

Sann5
Contributor

@Sann5 Sann5 commented Feb 26, 2024

What's new

  • The plotting library, Vega, limits the size of the plots it can render.
  • A user tried to plot a very large dataset, which caused the evaluate-busco action to error out.
  • This PR fixes the action so that a visualization is produced even for very large datasets.
  • Although the action no longer fails, the plot-size limit itself cannot be circumvented: plotting more than ~1,300 MAGs yields an error image instead of the plot.
  • As a workaround for datasets with more than ~1,300 MAGs, we added a search box to the visualization so that users can display an arbitrary subset of the samples. As long as the subset stays below the limit, a plot is produced.
  • Closes BUG: evaluate-busco breaks when input mags > ~1,300 #140
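
The search-box workaround described above amounts to filtering the samples first and drawing the plot only when the resulting number of MAGs fits under the limit. A minimal Python sketch of that logic follows. It is an illustration only, not the code in this PR: the names `select_samples`, `can_plot`, and `MAX_MAGS`, and the substring-matching rule, are assumptions, and 1,300 is the approximate limit quoted above rather than an exact value.

```python
from typing import Dict, List

# Approximate limit quoted in this PR; the exact value is an assumption.
MAX_MAGS = 1300


def select_samples(
    mags_per_sample: Dict[str, List[str]], query: str
) -> Dict[str, List[str]]:
    """Keep samples whose ID contains any comma-separated search term.

    An empty query keeps every sample (hypothetical behaviour).
    """
    terms = [t.strip().lower() for t in query.split(",") if t.strip()]
    if not terms:
        return dict(mags_per_sample)
    return {
        sample: mags
        for sample, mags in mags_per_sample.items()
        if any(term in sample.lower() for term in terms)
    }


def can_plot(mags_per_sample: Dict[str, List[str]], max_mags: int = MAX_MAGS) -> bool:
    """Return True when the total number of MAGs fits under the limit."""
    return sum(len(mags) for mags in mags_per_sample.values()) <= max_mags


# Toy dataset: 20 samples with 100 MAGs each (2,000 MAGs in total).
data = {
    f"sample{i:02d}": [f"mag{i:02d}_{j}" for j in range(100)] for i in range(20)
}
subset = select_samples(data, "sample01, sample02")

print(can_plot(data))    # the full dataset exceeds the limit
print(can_plot(subset))  # the two-sample subset fits
```

Filtering by sample rather than by individual MAG mirrors the description above, where the search box selects samples and the limit applies to the number of MAGs plotted.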

Run it locally

  1. This assumes you already have a local copy of q2-moshpit installed in your virtual environment.
cd q2-moshpit
gh pr checkout 142
  2. Let's get you some data to play with:
cd <download here>
wget https://polybox.ethz.ch/index.php/s/slGyW1uQbyxmyb9/download -O mags.qza
  3. Test it out!
qiime moshpit evaluate-busco --i-bins mags.qza --verbose --p-lineage-dataset bacteria_odb10 --p-cpu 6 --o-visualization mags.qzv
qiime tools view mags.qzv
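
To try this on your own data, it can help to check first how many MAGs the input holds, since only inputs above roughly 1,300 MAGs hit the plot-size limit. The sketch below counts FASTA files per sample directory after the artifact has been unpacked with `qiime tools export`. The one-directory-per-sample layout, the file extensions, and the helper name `count_mags` are assumptions for illustration and may not match the actual artifact layout.

```python
from pathlib import Path
from typing import Dict

# Assumed FASTA extensions; adjust to what the export actually contains.
FASTA_SUFFIXES = {".fasta", ".fa", ".fna"}


def count_mags(export_dir: str) -> Dict[str, int]:
    """Count FASTA files in each sample sub-directory of an exported artifact.

    Top-level files (for example a manifest) are ignored.
    """
    counts: Dict[str, int] = {}
    for sample_dir in sorted(Path(export_dir).iterdir()):
        if sample_dir.is_dir():
            counts[sample_dir.name] = sum(
                1
                for f in sample_dir.iterdir()
                if f.is_file() and f.suffix in FASTA_SUFFIXES
            )
    return counts


# Usage, after unpacking the artifact into a directory of your choice:
#   qiime tools export --input-path mags.qza --output-path mags_export
#   counts = count_mags("mags_export")
#   print(sum(counts.values()), "MAGs in", len(counts), "samples")
```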

Running the tests

pytest -W ignore -vv --pyargs q2_moshpit

@Sann5 Sann5 added the bug Something isn't working label Feb 26, 2024
@Sann5 Sann5 requested a review from misialq February 26, 2024 10:40
@Sann5 Sann5 self-assigned this Feb 26, 2024
Contributor

@misialq misialq left a comment

Hey @Sann5, thanks for fixing that! I changed the code slightly to remove the dependency on the vegatransformer: with it present, I kept getting errors about packages missing from the environment. Besides, it was only needed to display more than 5k rows, right? So I guess we don't need it anyway, unless you kept it there on purpose? In any case, I'm merging this one now, and we can decide later whether to revisit that transformer.

@misialq misialq merged commit 1e9cc2c into bokulich-lab:main Mar 6, 2024
5 of 7 checks passed
Contributor Author

Sann5 commented Mar 6, 2024

I'm not entirely sure, @michal, but I think the transformer might be needed. I can't remember whether the visualization broke earlier (with a smaller dataset) or whether the transformer was needed for the search box to work correctly. But if you tried this out with a large dataset and it worked, then we are good to go :).
