Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

BUG: Changed read_in_txt helper function to use unique column as index for allele mapping data #27

Merged
merged 6 commits into from
Jan 22, 2024

Conversation

VinzentRisch
Copy link
Collaborator

@VinzentRisch VinzentRisch commented Jan 10, 2024

This PR was created to fix #26.

  • Added new map_type parameter to read_in_txt.
  • If map_type is set to "allele" the column Reference Sequence is used as the index for the count tables. Reference Sequence values are unique and this solves the issue with the biom table format. Only downside is that the unique identifiers are quite long and not very readable (eg. Prevalence_Sequence_ID:97743|ID:1437|Name:AcrF|ARO:3000502) instead of just AcrF
  • If map_type is set to "gene" the behaviour is as before and the column ARO Term is used as the index.

Set up an environment

CONDA_SUBDIR=osx-64 mamba create -yn q2-amr \
  -c conda-forge -c bioconda -c qiime2 -c defaults \
  -c https://packages.qiime2.org/qiime2/2023.9/shotgun/released/ \
  qiime2 q2cli q2templates q2-types q2-types-genomics rgi

conda activate q2-amr
conda config --env --set subdir osx-64

pip install --no-deps --force-reinstall \
  git+https://github.com/misialq/rgi.git@py38-fix \
  git+https://github.com/bokulich-lab/q2-amr.git

Run it locally

  1. First, clone the repo and checkout the PR branch:
git clone https://github.com/bokulich-lab/q2-amr.git
cd q2-amr
git fetch origin pull/27/head:pr-27
git checkout pr-27
pip install -e .
  1. Download test files reads.qza and card_db.qza

  2. Test it out!
    Takes about 10 min.

qiime amr annotate-reads-card --i-reads 25mb_reads.qza --i-card-db card_db.qza --output-dir test_27 --p-threads 2 --p-include-wildcard

@VinzentRisch VinzentRisch marked this pull request as draft January 10, 2024 11:28
@codecov-commenter
Copy link

codecov-commenter commented Jan 10, 2024

Codecov Report

All modified and coverable lines are covered by tests ✅

Comparison is base (c179878) 93.35% compared to head (2efc691) 93.37%.

Additional details and impacted files
@@            Coverage Diff             @@
##             main      #27      +/-   ##
==========================================
+ Coverage   93.35%   93.37%   +0.01%     
==========================================
  Files          18       18              
  Lines        1054     1056       +2     
==========================================
+ Hits          984      986       +2     
  Misses         70       70              

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

@VinzentRisch VinzentRisch requested a review from misialq January 10, 2024 14:53
@VinzentRisch VinzentRisch marked this pull request as ready for review January 10, 2024 16:46
@VinzentRisch VinzentRisch added the bug Something isn't working label Jan 17, 2024
@VinzentRisch VinzentRisch requested review from misialq and removed request for misialq January 17, 2024 09:51
@misialq
Copy link
Contributor

misialq commented Jan 18, 2024

Hey @ChristosMatzoros, could you please check this one out? Thanks!

Copy link
Contributor

@ChristosMatzoros ChristosMatzoros left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for putting this together @VinzentRisch. I executed both the old and new code and achieved the expected results. You did an excellent job introducing the unique identifier for allele mappings, effectively preventing duplicate IDs in the biom table. I consider this to be the most straightforward and clean solution. Regarding the size of the unique identifiers, I believe they do not detract from readability; quite the opposite, they may provide important information to the user. Everything seems ok from my side!

@VinzentRisch
Copy link
Collaborator Author

Hi @misialq, @ChristosMatzoros just reviewed and approved my changes, but i still can't merge the PR.
I think it is because @ChristosMatzoros lacks the needed permissions to approve my changes.
Can you give @ChristosMatzoros these permissions?

@misialq
Copy link
Contributor

misialq commented Jan 22, 2024

Hey @VinzentRisch, sorry about that - I fixed the permissions and you should be able to merge now (after updating). 🚀

@VinzentRisch VinzentRisch removed the request for review from misialq January 22, 2024 10:50
@VinzentRisch VinzentRisch merged commit 57ea746 into bokulich-lab:main Jan 22, 2024
6 checks passed
@VinzentRisch VinzentRisch deleted the 26_change_read_in_text branch January 22, 2024 13:32
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working
Projects
None yet
Development

Successfully merging this pull request may close these issues.

BUG: "biom.exception.TableException: Duplicate observation IDs" when running annotate-reads-card
4 participants