ENH: Add busco-results input to dereplicate-mags
#213
Codecov Report
All modified and coverable lines are covered by tests ✅
Additional details and impacted files
@@            Coverage Diff             @@
##             main     #213      +/-   ##
==========================================
+ Coverage   95.60%   95.64%   +0.04%
==========================================
  Files          34       34
  Lines        1956     1975      +19
  Branches      226      229       +3
==========================================
+ Hits         1870     1889      +19
  Misses         48       48
  Partials       38       38
☔ View full report in Codecov by Sentry.
Hey @VinzentRisch, nice work, thanks! See some suggestions below :)
Co-authored-by: Michal Ziemski <[email protected]>
Hey @VinzentRisch, seems like there was a coverage failure and this time I think it's correct - could you please investigate and address accordingly? 🙏
…ch/q2-moshpit into 161_dereplicate_busco
Hi Michal
Hey @VinzentRisch, excellent question - thanks for bringing that up! I think there may be some cases like this, although not with BUSCO. For example, CheckM has a contamination score: if a user were to provide a table with those results, they may want to pick the MAGs with the lowest contamination value. So your suggestion makes sense - it would be nice if you introduced one more param and maybe set it to 'max' by default. 🚀
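To illustrate the max/min idea discussed above, here is a minimal sketch. It is not the q2-moshpit implementation: the function name `pick_representative`, the `find_max` parameter, and the scores are all hypothetical, and plain dicts stand in for the metadata column.

```python
def pick_representative(bins, scores, find_max=True):
    """Pick one bin from a cluster by its metadata score.

    Hypothetical helper for illustration only: `find_max=True` keeps the
    bin with the highest score (e.g. completeness), `find_max=False`
    keeps the one with the lowest (e.g. contamination).
    """
    chooser = max if find_max else min
    return chooser(bins, key=lambda bin_id: scores[bin_id])


# Made-up scores for three MAGs that fall into the same cluster.
completeness = {"mag1": 87.5, "mag2": 95.1, "mag3": 90.0}
contamination = {"mag1": 4.2, "mag2": 6.8, "mag3": 1.3}
cluster = ["mag1", "mag2", "mag3"]

best_by_completeness = pick_representative(cluster, completeness)
best_by_contamination = pick_representative(cluster, contamination, find_max=False)
```

With a default of 'max', a completeness-style score keeps working as before, while a contamination-style score only needs the flag flipped.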
Copilot AI left a comment
Copilot reviewed 3 out of 4 changed files in this pull request and generated 1 suggestion.
Files not reviewed (1)
- q2_moshpit/dereplication/tests/data/busco_results.tsv: Language not supported
Comments skipped due to low confidence (3)
q2_moshpit/dereplication/tests/test_dereplication.py:150
- Add a test case to ensure that the dereplicate_mags function works correctly when metadata and metadata_column are provided.
obs_mags, obs_pa = dereplicate_mags(mags, self.dist_matrix, threshold=0.99)
q2_moshpit/dereplication/derep.py:300
- The walrus operator ':=' is used, which is not available in Python versions prior to 3.8. Ensure compatibility with the required Python version or avoid using the walrus operator.
values := metadata_column[bins]
q2_moshpit/dereplication/derep.py:293
- [nitpick] The error message could be more specific. Consider changing it to 'The specified metadata column must contain numerical values.'
raise ValueError('The specified metadata column has to be numerical.')
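For context on the last two skipped comments, here is a rough sketch of what they refer to. It does not reproduce the code in `derep.py`: the helper names are hypothetical and a plain dict stands in for the metadata column. It shows an assignment expression (Python 3.8+) next to an equivalent form without it, plus a numeric check that raises the error message Copilot proposed.

```python
from numbers import Real


def check_numeric(metadata_column):
    """Raise if any value is non-numeric (hypothetical helper)."""
    for value in metadata_column.values():
        # bool is a subclass of int, so it is excluded explicitly.
        if isinstance(value, bool) or not isinstance(value, Real):
            raise ValueError(
                'The specified metadata column must contain numerical values.'
            )


def max_value_with_walrus(metadata_column, bins):
    # Python 3.8+: bind `values` and test it in a single expression.
    if values := [metadata_column[b] for b in bins]:
        return max(values)
    return None


def max_value_without_walrus(metadata_column, bins):
    # Same behaviour, written so that it also parses on Python < 3.8.
    values = [metadata_column[b] for b in bins]
    if values:
        return max(values)
    return None
```

Whether the walrus form is acceptable depends only on the minimum Python version the project supports; the two functions behave identically.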
Co-authored-by: Copilot <[email protected]>
Looks good, thanks @VinzentRisch!
solves #161