ENH: add action fetch-busco-db and modify evaluate-busco accordingly #162

Merged
merged 72 commits into from
Jun 7, 2024
6c971e1
implement fetch_busco_db
Sann5 May 10, 2024
7ea1233
implement tests
Sann5 May 10, 2024
61d128d
update plugin_setup
Sann5 May 10, 2024
bdcc7dd
update plugin_setup + function signatures
Sann5 May 10, 2024
9007811
implement offline evaluate-busco
Sann5 May 10, 2024
4b8db4e
implement semantic type
Sann5 May 13, 2024
beee05c
add tests
Sann5 May 13, 2024
e2866d1
merge semantic type branch to fetch db branch
Sann5 May 13, 2024
9df3d03
fix import of semantic types and formats
Sann5 May 13, 2024
a54d8fe
update evaluate-busco API
Sann5 May 13, 2024
b1c06ea
fix imports
Sann5 May 13, 2024
c77be35
re organize the register format comands
Sann5 May 14, 2024
705035a
add dummy validate method to the busco db sematic type
Sann5 May 14, 2024
9b62f7b
type in function must be format not type
Sann5 May 14, 2024
c5400c6
in online mode download db to temp directory
Sann5 May 14, 2024
5ba4eec
add test data
Sann5 May 14, 2024
83193c6
add tests to increase coverage
Sann5 May 14, 2024
b827ab4
include test data
Sann5 May 15, 2024
42524b6
Update q2_moshpit/busco/fetch_busco_db.py
Sann5 May 17, 2024
dbe823f
Update q2_moshpit/busco/fetch_busco_db.py
Sann5 May 17, 2024
7e1a7f0
Update q2_moshpit/busco/fetch_busco_db.py
Sann5 May 17, 2024
c25fc01
Update q2_moshpit/busco/busco.py
Sann5 May 17, 2024
eb9f5bf
Update q2_moshpit/busco/fetch_busco_db.py
Sann5 May 17, 2024
924844b
Update q2_moshpit/plugin_setup.py
Sann5 May 17, 2024
fd6a862
Update q2_moshpit/plugin_setup.py
Sann5 May 17, 2024
23c7888
lint
Sann5 May 17, 2024
ddb5be0
Merge branch 'main' into fetch_busco_db_iss_122_from_main
Sann5 May 17, 2024
71814d0
Merge branch 'main' into fetch_busco_db_iss_122_from_main
Sann5 May 21, 2024
b953ddb
Merge branch 'main' into fetch_busco_db_iss_122_from_main
Sann5 May 21, 2024
8ccc28b
Merge branch 'main' into fetch_busco_db_iss_122_from_main
misialq May 24, 2024
0f113af
Merge branch 'main' into fetch_busco_db_iss_122_from_main
misialq May 24, 2024
6ece14e
CI: trigger [add:q2-demux:4d0d6c4],[add:qiime2:d56401b]
misialq May 24, 2024
651a3df
Trigger CI [add:q2-demux:4d0d6c4],[add:qiime2:d56401b]
misialq May 24, 2024
81ca80e
fix test: specify cwd in test
Sann5 May 27, 2024
bf29e1a
Trigger CI [add:q2-demux:4d0d6c4],[add:qiime2:d56401b]
Sann5 May 27, 2024
2163d29
Merge branch 'fetch_busco_db_iss_122_from_main' of github.com:Sann5/q…
Sann5 May 27, 2024
1dfc3da
Trigger CI [add:q2-demux:4d0d6c4],[add:qiime2:d56401b]
Sann5 May 27, 2024
1e5a753
Trigger CI [add:q2-demux:4d0d6c4],[add:qiime2:d56401b]
Sann5 May 29, 2024
6c9b343
Update CI to use dev/prod tags [add:q2-demux:4d0d6c4],[add:qiime2:d56…
misialq Jun 5, 2024
8082ab3
Merge branch 'main' into fetch_busco_db_iss_122_from_main
Sann5 Jun 5, 2024
2af12cc
fetch_busco_db parameter defaults False
Sann5 Jun 5, 2024
c3943a1
Merge branch 'fetch_busco_db_iss_122_from_main' of github.com:Sann5/q…
Sann5 Jun 5, 2024
e014115
make busco_db mandatory [add:q2-demux:4d0d6c4],[add:qiime2:d56401b],[…
Sann5 Jun 5, 2024
35e697d
Update q2_moshpit/busco/utils.py
Sann5 Jun 5, 2024
81fd6f8
refactor fetch_busco_db.py to database.py
Sann5 Jun 5, 2024
a513c91
Merge branch 'fetch_busco_db_iss_122_from_main' of github.com:Sann5/q…
Sann5 Jun 5, 2024
de392e1
Trigger CI [add:q2-demux:4d0d6c4],[add:qiime2:d56401b],[stable]
misialq Jun 5, 2024
5bf8df9
Split coverage upload into a separate workflow
misialq Jun 5, 2024
4d0f880
Trigger CI [add:q2-demux:4d0d6c4],[add:qiime2:d56401b],[stable]
misialq Jun 5, 2024
0ab9bb7
Merge branch 'main' into fetch_busco_db_iss_122_from_main
misialq Jun 5, 2024
3487c35
Trigger CI [add:q2-demux:4d0d6c4],[add:qiime2:d56401b],[stable]
misialq Jun 5, 2024
16ae882
Update artifact actions to v4 [add:q2-demux:4d0d6c4],[add:qiime2:d564…
misialq Jun 5, 2024
76b5e3f
Update artifact actions to v4 [add:q2-demux:4d0d6c4],[add:qiime2:d564…
misialq Jun 5, 2024
ce046e8
Merge branch 'main' into fetch_busco_db_iss_122_from_main
misialq Jun 6, 2024
76aef52
Merge branch 'main' into fetch_busco_db_iss_122_from_main
Sann5 Jun 6, 2024
b290559
Trigger CI [add:q2-demux:4d0d6c4],[add:qiime2:d56401b],[stable]
misialq Jun 6, 2024
277b9a4
Change the path to the coverage report [add:q2-demux:4d0d6c4],[add:qi…
misialq Jun 6, 2024
215b833
Add workflow ID to the artifact download step [add:q2-demux:4d0d6c4],…
misialq Jun 7, 2024
ec5b8e1
Add GitHub token to the artifact download step [add:q2-demux:4d0d6c4]…
misialq Jun 7, 2024
fbf5b2a
Add personal token to the artifact download step [add:q2-demux:4d0d6c…
misialq Jun 7, 2024
3815eae
Try a different id? [add:q2-demux:4d0d6c4],[add:qiime2:d56401b],[stable]
misialq Jun 7, 2024
90cb58e
Revert the ID? [add:q2-demux:4d0d6c4],[add:qiime2:d56401b],[stable]
misialq Jun 7, 2024
6503f01
Use GtiHub script instead [add:q2-demux:4d0d6c4],[add:qiime2:d56401b]…
misialq Jun 7, 2024
885791b
Use GtiHub script instead [add:q2-demux:4d0d6c4],[add:qiime2:d56401b]…
misialq Jun 7, 2024
cdd688c
Use GtiHub script instead [add:q2-demux:4d0d6c4],[add:qiime2:d56401b]…
misialq Jun 7, 2024
46044bb
Use GtiHub script instead [add:q2-demux:4d0d6c4],[add:qiime2:d56401b]…
misialq Jun 7, 2024
fd5f94d
Merge remote-tracking branch 'upstream/main' into fetch_busco_db_iss_…
misialq Jun 7, 2024
231e678
Trigger CI [add:q2-demux:4d0d6c4],[add:qiime2:d56401b],[stable]
misialq Jun 7, 2024
b30133f
Update the script to V7 [add:q2-demux:4d0d6c4],[add:qiime2:d56401b],[…
misialq Jun 7, 2024
a58e553
Merge remote-tracking branch 'upstream/main' into fetch_busco_db_iss_…
misialq Jun 7, 2024
62c5769
Trigger CI [add:q2-demux:4d0d6c4],[add:qiime2:d56401b],[stable]
misialq Jun 7, 2024
5a3b4b2
Pass PR number to codecov action [add:q2-demux:4d0d6c4],[add:qiime2:d…
misialq Jun 7, 2024
19 changes: 19 additions & 0 deletions .github/workflows/upload-coverage.yaml
@@ -36,10 +36,29 @@ jobs:

       - run: unzip coverage.zip

+      - name: Find associated PR
+        id: pr
+        uses: actions/github-script@v7
+        with:
+          script: |
+            const response = await github.rest.search.issuesAndPullRequests({
+              q: 'repo:${{ github.repository }} is:pr sha:${{ github.event.workflow_run.head_sha }}',
+              per_page: 1,
+            })
+            const items = response.data.items
+            if (items.length < 1) {
+              console.error('No PRs found')
+              return
+            }
+            const pullRequestNumber = items[0].number
+            console.info("Pull request number is", pullRequestNumber)
+            return pullRequestNumber
+
       - uses: codecov/codecov-action@v4
         name: Upload coverage report
         with:
           files: ./coverage.xml
           fail_ci_if_error: true
+          override_pr: ${{ steps.pr.outputs.result }}
         env:
           CODECOV_TOKEN: ${{ secrets.CODECOV_TOKEN }}
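Reviewer note: the "Find associated PR" step searches for pull requests whose head SHA matches the triggering workflow run, takes the first hit, and hands its number to the Codecov upload via `override_pr`. The same selection logic, sketched in Python for readers who do not work with `github-script`. The function name `find_pr_number` and the dictionary payload are assumptions made for illustration; the payload only mimics the `response.data.items` shape used in the script.

```python
from typing import Any, Mapping, Optional


def find_pr_number(search_response: Mapping[str, Any]) -> Optional[int]:
    """Return the number of the first PR in a search response, or None.

    `search_response` is a hypothetical stand-in for `response.data`
    in the workflow script: a mapping with an "items" list.
    """
    items = search_response.get("items", [])
    if not items:
        # Same branch as the workflow step: nothing found, nothing returned.
        return None
    return items[0]["number"]


print(find_pr_number({"items": [{"number": 162}]}))
print(find_pr_number({"items": []}))
```

When no PR is found the step returns no value, so the upload step receives no usable PR number from `steps.pr.outputs.result`. It may be worth confirming that this is the intended behaviour for runs that are not tied to a PR.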
1 change: 0 additions & 1 deletion q2_moshpit/__init__.py
@@ -5,7 +5,6 @@
 #
 # The full license is in the file LICENSE, distributed with this software.
 # ----------------------------------------------------------------------------
-
 from . import busco
 from . import eggnog
 from . import partition
2 changes: 0 additions & 2 deletions q2_moshpit/_utils.py
@@ -72,8 +72,6 @@ def _process_common_input_params(processing_func, params: dict) -> List[str]:
             arg_val
         ):
             processed_args.extend(processing_func(arg_key, arg_val))
-        else:
-            continue

     return processed_args

5 changes: 4 additions & 1 deletion q2_moshpit/busco/__init__.py
@@ -7,5 +7,8 @@
 # ----------------------------------------------------------------------------

 from .busco import evaluate_busco, _evaluate_busco, _visualize_busco
+from .database import fetch_busco_db

-__all__ = ["evaluate_busco", "_evaluate_busco", "_visualize_busco"]
+__all__ = [
+    "evaluate_busco", "_evaluate_busco", "_visualize_busco", "fetch_busco_db"
+]
18 changes: 15 additions & 3 deletions q2_moshpit/busco/busco.py
@@ -23,10 +23,12 @@
 from q2_moshpit.busco.utils import (
     _parse_busco_params, _collect_summaries, _rename_columns,
     _parse_df_columns, _partition_dataframe, _calculate_summary_stats,
-    _get_feature_table, _cleanup_bootstrap, _get_mag_lengths
+    _get_feature_table, _cleanup_bootstrap, _get_mag_lengths,
+    _validate_lineage_dataset_input
 )
 from q2_moshpit._utils import _process_common_input_params, run_command
 from q2_types.per_sample_sequences._format import MultiMAGSequencesDirFmt
+from q2_moshpit.busco.types import BuscoDatabaseDirFmt
 from q2_types.feature_data_mag._format import MAGSequencesDirFmt


@@ -74,7 +76,7 @@ def _run_busco(
             "-o",
             sample
         ])
-        run_command(cmd)
+        run_command(cmd, cwd=os.path.dirname(output_dir))

         path_to_run_summary = os.path.join(
             output_dir, sample, "batch_summary.txt"
@@ -110,6 +112,7 @@ def _busco_helper(bins, common_args):

 def _evaluate_busco(
     bins: Union[MultiMAGSequencesDirFmt, MAGSequencesDirFmt],
+    busco_db: BuscoDatabaseDirFmt,
     mode: str = "genome",
     lineage_dataset: str = None,
     augustus: bool = False,
@@ -131,8 +134,16 @@ def _evaluate_busco(
     scaffold_composition: bool = False,
 ) -> pd.DataFrame:
     kwargs = {
-        k: v for k, v in locals().items() if k not in ["bins",]
+        k: v for k, v in locals().items() if k not in ["bins", "busco_db"]
     }
+    kwargs["offline"] = True
+    kwargs["download_path"] = f"{str(busco_db)}/busco_downloads"
+
+    if lineage_dataset is not None:
+        _validate_lineage_dataset_input(
+            lineage_dataset, auto_lineage, auto_lineage_euk, auto_lineage_prok,
+            busco_db, kwargs  # kwargs may be modified inside this function
+        )

     # Filter out all kwargs that are None, False or 0.0
     common_args = _process_common_input_params(
@@ -249,6 +260,7 @@ def _visualize_busco(output_dir: str, busco_results: pd.DataFrame) -> None:
 def evaluate_busco(
     ctx,
     bins,
+    busco_db,
     mode="genome",
     lineage_dataset=None,
     augustus=False,
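Reviewer note: with this change `_evaluate_busco` always runs BUSCO offline against the supplied database, by adding `offline` and `download_path` to `kwargs` before the None/False/0.0 filter is applied. A minimal sketch of that flow is below. `build_offline_args` is a hypothetical helper written for this note and is not the plugin's `_process_common_input_params` or `_parse_busco_params`, whose bodies are not part of this diff; it only illustrates how the two new entries end up at the tail of the argument list, as the updated test expects.

```python
from typing import Any, List


def build_offline_args(busco_db_path: str, **params: Any) -> List[str]:
    """Hypothetical helper: turn keyword parameters into BUSCO-style CLI args."""
    kwargs = dict(params)
    # These two entries mirror what the hunk above adds to kwargs.
    kwargs["offline"] = True
    kwargs["download_path"] = f"{busco_db_path}/busco_downloads"

    args: List[str] = []
    for key, value in kwargs.items():
        # Drop anything that is None, False or 0.0, as the diff comment says.
        if value is None or value is False or value == 0.0:
            continue
        if value is True:
            # Boolean switch: emit the flag without a value.
            args.append(f"--{key}")
        else:
            args.extend([f"--{key}", str(value)])
    return args


print(build_offline_args("/data/db", mode="genome", lineage_dataset=None, cpu=1))
```

Because the two entries are appended after the user-facing parameters, `--offline` and `--download_path` come last, which matches the order asserted in `test_evaluate_busco_offline`.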
47 changes: 47 additions & 0 deletions q2_moshpit/busco/database.py
@@ -0,0 +1,47 @@
+# ----------------------------------------------------------------------------
+# Copyright (c) 2023, QIIME 2 development team.
+#
+# Distributed under the terms of the Modified BSD License.
+#
+# The full license is in the file LICENSE, distributed with this software.
+# ----------------------------------------------------------------------------
+import subprocess
+from q2_moshpit._utils import colorify, run_command
+from q2_moshpit.busco.types import BuscoDatabaseDirFmt
+
+
+def fetch_busco_db(
+    virus: bool = False,
+    prok: bool = False,
+    euk: bool = False
+) -> BuscoDatabaseDirFmt:
+    busco_db = BuscoDatabaseDirFmt(path=None, mode='w')
+
+    # Parse kwargs
+    if all([virus, prok, euk]):
+        args = ["all"]
+    else:
+        variable_and_flag = [
+            ('virus', virus),
+            ('prokaryota', prok),
+            ('eukaryota', euk)
+        ]
+        args = [name for name, flag in variable_and_flag if flag]
+
+    # Download
+    print(colorify("Downloading BUSCO database..."))
+    try:
+        run_command(cmd=["busco", "--download", *args], cwd=str(busco_db))
+    except subprocess.CalledProcessError as e:
+        raise Exception(
+            f"Error during BUSCO database download: {e.returncode}"
+        )
+
+    # Let user know that the process is complete but it still needs
+    # some time to copy files over.
+    print(colorify(
+        "Download completed. \n"
+        "Copying files from temporary directory to final location..."
+    ))
+
+    return busco_db
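Reviewer note: the flag handling in `fetch_busco_db` can be read in isolation as a small pure function. The sketch below reproduces it under a hypothetical name, `select_download_args`. One thing the code shown here does not cover is the case where all three flags are `False`: `args` is then empty and the command becomes a bare `busco --download`. The `ValueError` guard is an addition of this sketch, not something in the PR, and the case may already be handled elsewhere (for example in `plugin_setup.py`, which is not shown here).

```python
from typing import List


def select_download_args(
    virus: bool = False, prok: bool = False, euk: bool = False
) -> List[str]:
    """Map the three boolean flags to `busco --download` arguments."""
    if all([virus, prok, euk]):
        return ["all"]

    variable_and_flag = [
        ("virus", virus),
        ("prokaryota", prok),
        ("eukaryota", euk),
    ]
    selected = [name for name, flag in variable_and_flag if flag]

    if not selected:
        # Assumption of this sketch only: refuse to build an empty download.
        raise ValueError("Select at least one of: virus, prok, euk.")
    return selected


print(select_download_args(prok=True, euk=True))
```

Keeping the mapping separate from the subprocess call would also let it be tested without patching `subprocess.run`.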
Empty file.

Large diffs are not rendered by default.

7 changes: 6 additions & 1 deletion q2_moshpit/busco/tests/test_busco_feature_data.py
@@ -55,7 +55,7 @@ def test_run_busco(self, mock_run):
             'busco', '--lineage_dataset', 'bacteria_odb10',
             '--cpu', '7', '--in', self.get_data_path('mags/sample1'),
             '--out_path', self.temp_dir.name, '-o', 'sample1'
-        ])
+        ], cwd=os.path.dirname(self.temp_dir.name))

     @patch(
         "q2_moshpit.busco.busco._draw_detailed_plots",
@@ -135,9 +135,14 @@ def test_evaluate_busco_action(self):
             'FeatureData[MAG]',
             self.get_data_path('mags/sample2')
         )
+        busco_db = qiime2.Artifact.import_data(
+            'ReferenceDB[BuscoDB]',
+            self.get_data_path('busco_db')
+        )
         obs = evaluate_busco(
             ctx=mock_ctx,
             bins=mags,
+            busco_db=busco_db,
             num_partitions=2
         )
         exp = ("collated_result", "visualization")
46 changes: 34 additions & 12 deletions q2_moshpit/busco/tests/test_busco_sample_data.py
@@ -17,6 +17,7 @@
 from unittest.mock import patch, ANY, call, MagicMock
 from qiime2.plugin.testing import TestPluginBase
 from q2_types.per_sample_sequences._format import MultiMAGSequencesDirFmt
+from q2_moshpit.busco.types import BuscoDatabaseDirFmt


 class TestBUSCOSampleData(TestPluginBase):
@@ -28,6 +29,10 @@ def setUp(self):
             path=self.get_data_path('mags'),
             mode="r",
         )
+        self.busco_db = BuscoDatabaseDirFmt(
+            path=self.get_data_path("busco_db"),
+            mode="r"
+        )

     def _prepare_summaries(self):
         for s in ['1', '2']:
@@ -56,15 +61,21 @@ def test_run_busco(self, mock_run):
         self.assertDictEqual(obs, exp)
         mock_run.assert_has_calls([
             call(
-                ['busco', '--lineage_dataset', 'bacteria_odb10',
-                 '--cpu', '7', '--in', self.get_data_path('mags/sample1'),
-                 '--out_path', self.temp_dir.name, '-o', 'sample1'],
+                [
+                    'busco', '--lineage_dataset', 'bacteria_odb10',
+                    '--cpu', '7', '--in', self.get_data_path('mags/sample1'),
+                    '--out_path', self.temp_dir.name, '-o', 'sample1'
+                ],
+                cwd=os.path.dirname(self.temp_dir.name)
             ),
             call(
-                ['busco', '--lineage_dataset', 'bacteria_odb10',
-                 '--cpu', '7', '--in', self.get_data_path('mags/sample2'),
-                 '--out_path', self.temp_dir.name, '-o', 'sample2'],
-            )
+                [
+                    'busco', '--lineage_dataset', 'bacteria_odb10',
+                    '--cpu', '7', '--in', self.get_data_path('mags/sample2'),
+                    '--out_path', self.temp_dir.name, '-o', 'sample2'
+                ],
+                cwd=os.path.dirname(self.temp_dir.name)
+            ),
         ])

     @patch('q2_moshpit.busco.busco._run_busco')
@@ -99,15 +110,21 @@ def test_busco_helper(self, mock_len, mock_run):
         )

     @patch("q2_moshpit.busco.busco._busco_helper")
-    def test_evaluate_busco(self, mock_helper):
+    def test_evaluate_busco_offline(self, mock_helper):
         _evaluate_busco(
-            bins=self.mags, mode="some_mode", lineage_dataset="bacteria_odb10"
+            bins=self.mags,
+            busco_db=self.busco_db,
+            mode="some_mode",
+            lineage_dataset="lineage_1"
         )
         mock_helper.assert_called_with(
             self.mags,
-            ['--mode', 'some_mode', '--lineage_dataset', 'bacteria_odb10',
-             '--cpu', '1', '--contig_break', '10', '--evalue', '0.001',
-             '--limit', '3']
+            [
+                '--mode', 'some_mode', '--lineage_dataset', 'lineage_1',
+                '--cpu', '1', '--contig_break', '10', '--evalue', '0.001',
+                '--limit', '3', '--offline', "--download_path",
+                f"{str(self.busco_db)}/busco_downloads"
+            ]
         )

     @patch(
@@ -184,9 +201,14 @@ def test_evaluate_busco_action(self):
             'SampleData[MAGs]',
             self.get_data_path('mags')
         )
+        busco_db = qiime2.Artifact.import_data(
+            'ReferenceDB[BuscoDB]',
+            self.get_data_path('busco_db')
+        )
         obs = evaluate_busco(
             ctx=mock_ctx,
             bins=mags,
+            busco_db=busco_db,
             num_partitions=2
         )
         exp = ("collated_result", "visualization")
38 changes: 38 additions & 0 deletions q2_moshpit/busco/tests/test_fetch_busco.py
@@ -0,0 +1,38 @@
+# ----------------------------------------------------------------------------
+# Copyright (c) 2022-2023, QIIME 2 development team.
+#
+# Distributed under the terms of the Modified BSD License.
+#
+# The full license is in the file LICENSE, distributed with this software.
+# ----------------------------------------------------------------------------
+from q2_moshpit.busco.database import fetch_busco_db
+from unittest.mock import patch
+from qiime2.plugin.testing import TestPluginBase
+
+
+class TestFetchBUSCO(TestPluginBase):
+    package = "q2_moshpit.busco.tests"
+
+    @patch("subprocess.run")
+    def test_fetch_busco_db_virus(self, subp_run):
+        busco_db = fetch_busco_db(virus=True, prok=False, euk=False)
+
+        # Check that command was called in the expected way
+        cmd = ["busco", "--download", "virus"]
+        subp_run.assert_called_once_with(cmd, check=True, cwd=str(busco_db))
+
+    @patch("subprocess.run")
+    def test_fetch_busco_db_prok_euk(self, subp_run):
+        busco_db = fetch_busco_db(virus=False, prok=True, euk=True)
+
+        # Check that command was called in the expected way
+        cmd = ["busco", "--download", "prokaryota", "eukaryota"]
+        subp_run.assert_called_once_with(cmd, check=True, cwd=str(busco_db))
+
+    @patch("subprocess.run")
+    def test_fetch_busco_db_all(self, subp_run):
+        busco_db = fetch_busco_db(virus=True, prok=True, euk=True)
+
+        # Check that command was called in the expected way
+        cmd = ["busco", "--download", "all"]
+        subp_run.assert_called_once_with(cmd, check=True, cwd=str(busco_db))
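Reviewer note: these tests patch `subprocess.run` so that no download happens and then assert on the exact command, `check=True`, and the working directory. The pattern in isolation looks roughly like the sketch below. The `download` function is a hypothetical stand-in for code that shells out to BUSCO, and the assumption that `run_command` forwards to `subprocess.run(cmd, check=True, cwd=...)` is inferred from the assertions above rather than from its source.

```python
import subprocess
from unittest.mock import patch


def download(groups, workdir):
    """Hypothetical stand-in for a function that shells out to BUSCO."""
    subprocess.run(["busco", "--download", *groups], check=True, cwd=workdir)


# While the patch is active, subprocess.run is a mock and nothing is executed.
with patch("subprocess.run") as mocked_run:
    download(["virus"], workdir="/tmp/db")
    mocked_run.assert_called_once_with(
        ["busco", "--download", "virus"], check=True, cwd="/tmp/db"
    )
```

Since `subprocess.run` is mocked, none of the three tests exercise the `CalledProcessError` branch of `fetch_busco_db`. A test that sets `side_effect` on the mock would cover it.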