Integration of BUSCO as a qiime2 visualizer in q2_moshpit #60

Merged on Oct 9, 2023 (92 commits)
Changes from 89 commits
ff17d1a
added assets from q2_checkm
Sann5 Aug 14, 2023
6f5cd83
copypasted and commented plot fucntionality from checkm
Sann5 Aug 14, 2023
7b501b6
started addapting visualization action to busco
Sann5 Aug 15, 2023
a4c2eaa
Generate busco graphs.
Sann5 Aug 16, 2023
ce7f3c2
First draft of BUSCO plugin. Untested.
Sann5 Aug 21, 2023
79b6d69
black and flake8 fomratting, precommit hook
Sann5 Aug 21, 2023
1bcb2d5
Succesfull cache of BUSCO. Still untested.
Sann5 Aug 22, 2023
be706fa
Made BUSCO render HTML and resized plot.
Sann5 Aug 23, 2023
15e5b0d
move auxiliary functions to utils
Sann5 Aug 24, 2023
078ed8f
include seaborn in package build?
Sann5 Aug 24, 2023
422063f
begining test suit for busco
Sann5 Aug 24, 2023
b694ed5
Moves tests to busco folder. Ignore .vscode
Sann5 Aug 24, 2023
1d6f29a
setup.py: exchanged checkm data for ci tests for buscos
Sann5 Aug 28, 2023
f57f549
added busco to list of requires packages for conda installation
Sann5 Aug 28, 2023
a5c1dc5
correction to BUSCO parameters in plugin_setup.py
Sann5 Aug 28, 2023
7600810
Same as last commit
Sann5 Aug 28, 2023
a25d1a6
typo in busco/utils.py
Sann5 Aug 28, 2023
8cc40e4
started developing the test suite for busco
Sann5 Aug 28, 2023
0729bae
Updates to visualization. Tooltips and gapps adressed.
Sann5 Aug 30, 2023
1c53d9c
Update to parameter valid ranges.
Sann5 Aug 31, 2023
920e238
Update to plot description in assests html
Sann5 Aug 31, 2023
eb05a12
paths are absolute, nned for so.path.split
Sann5 Aug 31, 2023
6de3dce
range of BUSCO argument
Sann5 Aug 31, 2023
da4bbfd
test_process_common_input_params new implementation
Sann5 Sep 1, 2023
3858cef
busco/utils.py: Amends to docstings.
Sann5 Sep 1, 2023
8c33680
random amends
Sann5 Sep 1, 2023
e03001d
added data for busco tests. all_run_summeries
Sann5 Sep 1, 2023
40c808e
compleated the test suite for busco
Sann5 Sep 1, 2023
f81ada3
indentation formatting change in ci.yml
Sann5 Sep 5, 2023
7896fdc
plugin_setup.py. added explanation to parameter
Sann5 Sep 5, 2023
57274f7
set absolute paths for test data in busco tests
Sann5 Sep 5, 2023
7c891b6
Merge branch 'main' into busco
Sann5 Sep 5, 2023
71a6cc5
Headre for busco/__init__.py
Sann5 Sep 5, 2023
a99171f
Revert "black and flake8 fomratting, precommit hook"
Sann5 Sep 5, 2023
482ec7b
reformat busoc related files to flake8
Sann5 Sep 5, 2023
94ea4de
adding q2templates to meta.yaml
Sann5 Sep 5, 2023
941248d
added altair to mate.yaml
Sann5 Sep 5, 2023
5a0743b
new way of getting the path to assets
Sann5 Sep 6, 2023
af8f1f5
changed the copytree function, hopeing that this one works
Sann5 Sep 6, 2023
7d8444b
added assests to setup.py
Sann5 Sep 6, 2023
5af7a45
flake8 error. trailing white space removed
Sann5 Sep 6, 2023
316a188
First working draft for secondary plot.
Sann5 Sep 7, 2023
848d84d
irrelevant changes to notebook
Sann5 Sep 8, 2023
26ebf84
Second plot added to busco. working imoplementation.
Sann5 Sep 8, 2023
f475d11
Update q2_moshpit/busco/utils.py
Sann5 Sep 18, 2023
904cbe9
Update q2_moshpit/busco/utils.py
Sann5 Sep 18, 2023
fcca323
Update q2_moshpit/busco/utils.py
Sann5 Sep 18, 2023
f614a83
Revert "indentation formatting change in ci.yml"
Sann5 Sep 18, 2023
675605e
ignore notebooks
Sann5 Sep 21, 2023
3c37c54
indentation on parameter descriptions
Sann5 Sep 21, 2023
1d7f4e8
change to the notebook that i ned to save
Sann5 Sep 21, 2023
bb333d6
moved test from busco to test_utils
Sann5 Sep 21, 2023
bdf5dfc
Bunch of work on reformatting the tests. Some parallel work on the so…
Sann5 Sep 21, 2023
7ab4d0f
seaborn -> matplotlib + busco arg parsing bug
Sann5 Sep 22, 2023
21ec706
Render debugging statement in busco tests
Sann5 Sep 22, 2023
178f4fc
Merge branch 'busco' of github.com:Sann5/q2-moshpit into busco
Sann5 Sep 22, 2023
d4c278c
trailing white spaces
Sann5 Sep 22, 2023
10e5ada
fixed bug on test_process_common_inputs_mix_with_falsy_values
Sann5 Sep 22, 2023
bea1dda
Include manifest files in order for busco tests to work
Sann5 Sep 22, 2023
c9a14e9
updated the dictionary to reflect changes in the html render
Sann5 Sep 22, 2023
bd6ffbf
Update integration test s.t viz output is possible.
Sann5 Oct 2, 2023
355591c
Update html base to work with base not tabbed
Sann5 Oct 2, 2023
f23c80b
Merge branch 'main' into busco
Sann5 Oct 2, 2023
a49e6a5
Reduce the height of each bar in plot from 18 to 9
Sann5 Oct 3, 2023
8289281
Remove notebooks
Sann5 Oct 3, 2023
173415b
Spell out the name fraction in the bottom axis
Sann5 Oct 3, 2023
84f5fed
Remove commented out code in base.html.
Sann5 Oct 3, 2023
9432a41
fix test_draw_busco_plots_for_render
Sann5 Oct 4, 2023
4d5a1a0
Change static func for setUpClass method in busco tests.
Sann5 Oct 4, 2023
d2e3e16
assert_frame_equal in busco test_collect_summaries_and_save
Sann5 Oct 4, 2023
bd433da
removed print statement from busco draw_n_busco_plots
Sann5 Oct 4, 2023
58d98bf
eliminate choices from busco_params "mode"
Sann5 Oct 4, 2023
eb76391
parse df columns function in busco utils + test
Sann5 Oct 4, 2023
1e528ab
check zipfiles with is_zipfile function
Sann5 Oct 4, 2023
dd3daf3
get rid of mock_run_busco, instead as test data
Sann5 Oct 5, 2023
7143a53
Added _parse_busco_params and re ordered the code.
Sann5 Oct 5, 2023
eff9ec4
Regret in last commit
Sann5 Oct 5, 2023
5fca4a8
busco tests replace for self.get_data_path(")
Sann5 Oct 5, 2023
195f6a7
update docstring for _parse_busco_params in utils
Sann5 Oct 5, 2023
8b8e342
change command name to evaluate-busco
Sann5 Oct 5, 2023
58ac8e3
assert calls of patches in busco tests
Sann5 Oct 5, 2023
002d9c2
Merge branch 'main' into busco
Sann5 Oct 5, 2023
b1bf79c
trailing white spaces
Sann5 Oct 5, 2023
c9b94b9
Show full uuid in downloadable plots
Sann5 Oct 6, 2023
ad4261a
Additional busco parsing test.
Sann5 Oct 6, 2023
3936383
Added making columns to parse function and updated test.
Sann5 Oct 6, 2023
db12969
Add package data for mock run busco test
Sann5 Oct 6, 2023
11400ed
Merge branch 'main' into busco
Sann5 Oct 6, 2023
1bba0cd
Add mock.ANY to patch calls.
Sann5 Oct 6, 2023
8a92a5e
ignore notebook files rather than the notebooks dir
Sann5 Oct 9, 2023
529d7dd
fixing spaces in param descriptions
Sann5 Oct 9, 2023
548529a
Put integration test in separate test file.
Sann5 Oct 9, 2023
7 changes: 6 additions & 1 deletion .gitignore
Original file line number Diff line number Diff line change
@@ -25,7 +25,6 @@ share/python-wheels/
 *.egg-info/
 .installed.cfg
 *.egg
-MANIFEST

 # PyInstaller
 # Usually these files are written by a python script from a template
@@ -133,3 +132,9 @@ dmypy.json

 # Mac OS
 .DS_Store
+
+# VS code settings
+.vscode
+
+# Ignore notebooks
+q2_moshpit/notebooks/
3 changes: 3 additions & 0 deletions ci/recipe/meta.yaml
@@ -23,10 +23,13 @@ requirements:
 - samtools
 - qiime2 {{ qiime2_epoch }}.*
 - q2-types-genomics {{ qiime2_epoch }}.*
+- q2templates {{ qiime2_epoch }}.*
 - eggnog-mapper >=2.1.10
 - diamond
 - tqdm
 - xmltodict
+- altair
+- busco >=5.0.0

 test:
 requires:
3 changes: 2 additions & 1 deletion q2_moshpit/__init__.py
@@ -10,6 +10,7 @@
 from .kraken2 import bracken, classification, database
 from .metabat2 import metabat2
 from . import eggnog
+from . import busco


 from ._version import get_versions
@@ -18,5 +19,5 @@

 __all__ = [
     'metabat2', 'bracken', 'classification', 'database',
-    'dereplicate_mags', 'eggnog'
+    'dereplicate_mags', 'eggnog', 'busco',
 ]
14 changes: 10 additions & 4 deletions q2_moshpit/_utils.py
@@ -52,9 +52,15 @@
     """
     processed_args = []
     for arg_key, arg_val in params.items():
-        # bool is a subclass of int so to only reject ints we need to do:
-        if type(arg_val) != int and not arg_val:  # noqa: E721
-            continue
-        else:
+        # This if condition excludes arguments which are falsy
+        # (False, None, "", []), except for integers and floats.
+        if (  # noqa: E721
+                type(arg_val) == int or
+                type(arg_val) == float or
+                arg_val
+        ):
             processed_args.extend(processing_func(arg_key, arg_val))
+        else:
+            continue
+
     return processed_args

Codecov / codecov/patch: check warning on line 64 in q2_moshpit/_utils.py. Added line #L64 (the `continue` in the new `else` branch) was not covered by tests.
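
The filtering rule in this hunk can be tried in isolation. Below is a minimal, self-contained sketch that mirrors the new condition. `process_params` and `to_cli_flag` are hypothetical stand-ins (the latter is not the plugin's `_parse_busco_params`), and the `--flag value` output format is an assumption made only for illustration.

```python
from typing import Any, Callable, Dict, List


def to_cli_flag(key: str, value: Any) -> List[str]:
    """Hypothetical formatter: turn one parameter into CLI tokens."""
    flag = "--" + key.replace("_", "-")
    # Booleans become bare switches; everything else carries its value.
    if isinstance(value, bool):
        return [flag]
    return [flag, str(value)]


def process_params(
    processing_func: Callable[[str, Any], List[str]],
    params: Dict[str, Any],
) -> List[str]:
    """Keep ints, floats and truthy values; drop every other falsy value."""
    processed: List[str] = []
    for key, value in params.items():
        # type(...) == int is deliberate: bool is a subclass of int, so
        # isinstance(False, int) would wrongly keep False.
        if type(value) == int or type(value) == float or value:  # noqa: E721
            processed.extend(processing_func(key, value))
    return processed


args = process_params(
    to_cli_flag,
    {
        "cpu": 0,               # int zero: kept
        "evalue": 0.0,          # float zero: kept
        "force": False,         # dropped
        "config": None,         # dropped
        "lineage_dataset": "",  # dropped
        "download": [],         # dropped
        "offline": True,        # kept
    },
)
print(args)
```

In this sketch a zero-valued `cpu` or `evalue` survives the filter, while `False`, `None`, the empty string and the empty list do not.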
20 changes: 20 additions & 0 deletions q2_moshpit/assets/busco/css/styles.css
@@ -0,0 +1,20 @@
#plot {
margin-top: 50px;
}

.vega-bind {
margin-bottom: 15px;
}

.vega-bind-name {
margin-right: 10px;
white-space: nowrap
}

.header-inline {
display: inline-block;
float: left;
margin-right: 10px;
margin-top: 8px;
margin-bottom: 8px;
}
138 changes: 138 additions & 0 deletions q2_moshpit/assets/busco/index.html
@@ -0,0 +1,138 @@
{% extends "base.html" %} {% block head %}
<title>Embedding Vega-Lite</title>
<script src="js/bootstrapMagic.js" type="text/javascript"></script>
<link href="css/styles.css" rel="stylesheet" />
<script type="text/javascript">
// temporary hack to make it look good with Bootstrap 5
removeBS3refs();
</script>
<script
src="https://cdn.jsdelivr.net/npm//vega@5"
type="text/javascript"
></script>
<script
src="https://cdn.jsdelivr.net/npm//[email protected]"
type="text/javascript"
></script>
<script
src="https://cdn.jsdelivr.net/npm//vega-embed@6"
type="text/javascript"
></script>
<link
crossorigin="anonymous"
href="https://cdn.jsdelivr.net/npm/[email protected]/dist/css/bootstrap.min.css"
integrity="sha256-YvdLHPgkqJ8DVUxjjnGVlMMJtNimJ6dYkowFFvp4kKs="
rel="stylesheet"
/>
{% endblock %} {% block content %}
<script
crossorigin="anonymous"
integrity="sha256-9SEPo+fwJFpMUet/KACSwO+Z/dKMReF9q4zFhU/fT9M="
src="https://cdn.jsdelivr.net/npm/[email protected]/dist/js/bootstrap.bundle.min.js"
></script>

<div class="row row-cols-1 row-cols-md-2 g-4">
<div class="col-lg-12">
<div class="card mt-3 h-100">
<h5 class="card-header">Plot description</h5>
<div class="card-body">
<p>
The left plot shows the results generated by BUSCO for <b>all bins</b> and
<b> samples</b>. "BUSCO attempts to provide a quantitative assessment
of the completeness in terms of the expected gene content of a genome
assembly, transcriptome, or annotated gene set. The results are
simplified into categories of Complete and single-copy, Complete and
duplicated, Fragmented, or Missing BUSCOs. BUSCO completeness results
make sense only in the context of the biology of your organism". Visit the
<a
href="https://busco.ezlab.org/busco_userguide.html#interpreting-the-results"
>
BUSCO User Guide </a
>
for more information.
</p>
<p>
Hover over the graph to obtain information about the lineage dataset
used for each bin, and the number of genes in each BUSCO category.
</p>
<p>
The right barplot shows assembly statistics calculated for each bin using BBTools.
Specifically, it displays the statistics computed by the <b>stats.sh</b> procedure from BBMap.
View the
<a
href="https://github.com/BioInfoTools/BBMap/blob/master/sh/stats.sh"
>
source code and documentation
</a>
of stats.sh for more information.
</p>
<p>
Choose the assembly statistic that you wish to display from the drop-down menu below the graphs.
Hover over the graph to show the numerical values that each bar represents.
</p>

<div style="align-items: center; display: flex">
<span class="header-inline">Downloads</span>
<div class="col-lg-4">
<div
aria-label="Basic outlined example"
class="btn-group"
role="group"
>
<a
class="btn btn-outline-secondary"
href="all_batch_summaries.csv"
>BUSCO batch summary for all samples (csv)</a
>
<a class="btn btn-outline-secondary" href="busco_plots.zip"
>BUSCO plots for all samples (zip)</a
>
</div>
</div>
</div>
</div>
</div>
</div>
</div>

<div class="row">
{% if vega_plots_overview is defined %}
<div class="col-lg-6">
<div id="plot"></div>
<div id="plot-controls"></div>
</div>
{% else %}
<p>Unable to generate the completeness plot</p>
{% endif %}
</div>

{% if vega_plots_overview is defined %}
<script id="spec" type="application/json">
{{
vega_plots_overview
}}
</script>

<script type="text/javascript">
$(document).ready(function () {

const spec = JSON.parse(document.getElementById("spec").innerHTML);

vegaEmbed("#plot", spec)
.then(function (result) {
result.view.logLevel(vega.Warn);
window.v = result.view;

// move the sliders to the right
const controls = document.getElementsByClassName("vega-bindings");
document.getElementById("plot-controls").appendChild(controls[0]);
})
.catch(function (error) {
// From 'js-error-handler.html'
handleErrors([error], $("#plot"));
});
});
</script>

{% endif %} {% endblock %} {% block footer %} {% set loading_selector =
'#loading' %} {% include 'js-error-handler.html' %} {% endblock %}
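
The template above expects `vega_plots_overview` to hold a Vega-Lite specification serialized as JSON, which the page then reads back with `JSON.parse`. A minimal sketch of producing such a string with the standard library only. The `build_spec` helper, the field names and the sample rows are illustrative assumptions, not the plots generated by this PR.

```python
import json


def build_spec(rows):
    """Return a minimal Vega-Lite stacked bar chart spec as a JSON string."""
    spec = {
        "$schema": "https://vega.github.io/schema/vega-lite/v5.json",
        "data": {"values": rows},
        "mark": "bar",
        "encoding": {
            # One horizontal bar per bin, stacked by BUSCO category.
            "y": {"field": "bin", "type": "nominal"},
            "x": {"field": "fraction", "type": "quantitative"},
            "color": {"field": "category", "type": "nominal"},
        },
    }
    return json.dumps(spec)


# Hypothetical rows; real values would come from the BUSCO summaries.
rows = [
    {"bin": "bin1", "category": "single", "fraction": 0.9},
    {"bin": "bin1", "category": "missing", "fraction": 0.1},
]
vega_plots_overview = build_spec(rows)

# json.loads performs the same round trip as JSON.parse in the browser.
round_trip = json.loads(vega_plots_overview)
print(round_trip["mark"], len(round_trip["data"]["values"]))
```

Passing a string of this shape as the `vega_plots_overview` template variable is what the `{% if vega_plots_overview is defined %}` branches rely on.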
31 changes: 31 additions & 0 deletions q2_moshpit/assets/busco/js/bootstrapMagic.js
@@ -0,0 +1,31 @@
function removeBS3refs() {
  // remove Bootstrap 3 CSS/JS references; the live HTMLCollections are
  // copied into arrays first so that removing a node does not skip the next one
  let head = document.getElementsByTagName("head")[0]
  let links = Array.from(head.getElementsByTagName("link"))
  for (let i = 0; i < links.length; i++) {
    if (links[i].href.includes("q2templateassets/css/bootstrap")) {
      links[i].remove()
    }
  }
  let scripts = Array.from(head.getElementsByTagName("script"))
  for (let i = 0; i < scripts.length; i++) {
    if (scripts[i].src.includes("q2templateassets/js/bootstrap")) {
      scripts[i].remove()
    }
  }
}

function adjustTagsToBS3() {
// adjust tags to BS3
let tabs = document.getElementsByClassName("nav nav-tabs")[0].children
for (let i = 0; i < tabs.length; i++) {
let isActive = tabs[i].className.includes("active")
tabs[i].className = "nav-item"
let link = tabs[i].getElementsByTagName("a")[0]
if (isActive) {
link.classList.add("active")
}
link.classList.add("nav-link")

}
}
11 changes: 11 additions & 0 deletions q2_moshpit/busco/__init__.py
@@ -0,0 +1,11 @@
# ----------------------------------------------------------------------------
# Copyright (c) 2022-2023, QIIME 2 development team.
#
# Distributed under the terms of the Modified BSD License.
#
# The full license is in the file LICENSE, distributed with this software.
# ----------------------------------------------------------------------------

from .busco import evaluate_busco

__all__ = ["evaluate_busco"]
121 changes: 121 additions & 0 deletions q2_moshpit/busco/busco.py
@@ -0,0 +1,121 @@
# ----------------------------------------------------------------------------
# Copyright (c) 2023, QIIME 2 development team.
#
# Distributed under the terms of the Modified BSD License.
#
# The full license is in the file LICENSE, distributed with this software.
# ----------------------------------------------------------------------------


import os
import tempfile
import q2_moshpit.busco.utils
from q2_moshpit.busco.utils import (
    _parse_busco_params,
    _render_html,
)
from q2_moshpit._utils import _process_common_input_params
from typing import List
from q2_types_genomics.per_sample_data._format import MultiMAGSequencesDirFmt


def evaluate_busco(
    output_dir: str,
    bins: MultiMAGSequencesDirFmt,
    mode: str = "genome",
    lineage_dataset: str = None,
    augustus: bool = False,
    augustus_parameters: str = None,
    augustus_species: str = None,
    auto_lineage: bool = False,
    auto_lineage_euk: bool = False,
    auto_lineage_prok: bool = False,
    cpu: int = 1,
    config: str = None,
    contig_break: int = 10,
    datasets_version: str = None,
    download: List[str] = None,
    download_base_url: str = None,
    download_path: str = None,
    evalue: float = 1e-03,
    force: bool = False,
    limit: int = 3,
    help: bool = False,
    list_datasets: bool = False,
    long: bool = False,
    metaeuk_parameters: str = None,
    metaeuk_rerun_parameters: str = None,
    miniprot: bool = False,
    offline: bool = False,
    quiet: bool = False,
    restart: bool = False,
    scaffold_composition: bool = False,
    tar: bool = False,
    update_data: bool = False,
    version: bool = False,
) -> None:
    """
    qiime2 visualization for the BUSCO assessment tool
    <https://busco.ezlab.org/>.

    Args:
        see all possible inputs by running
        `qiime moshpit evaluate-busco --help`

    Output:
        all_batch_summaries.csv: batch summaries collected for all samples
        busco_plots.zip: zip file containing all of the busco plots
        qiime_html: html for rendering the output plots
    """

    # Create dictionary with local variables
    # (kwargs passed to the function or their defaults) excluding
    # "output_dir" and "bins"
    kwargs = {
        k: v for k, v in locals().items() if k not in ["output_dir", "bins"]
    }

    # Filter out all kwargs that are falsy (None, False, "", []);
    # integers and floats are kept, even when they are zero
    common_args = _process_common_input_params(
        processing_func=_parse_busco_params, params=kwargs
    )

    # Create a temporary working directory, bound to 'tmp'
    with tempfile.TemporaryDirectory() as tmp:
        # Run busco for every sample. Returns dictionary to report files.
        # Result NOT included in final output
        busco_results_dir = os.path.join(tmp, "busco_output")
        path_to_run_summaries = q2_moshpit.busco.utils._run_busco(
            output_dir=busco_results_dir,
            mags=bins,
            params=common_args,
        )

        # Collect result for each sample and save to file.
        # Result included in final output (file for download)
        all_summaries_path = os.path.join(
            output_dir, "all_batch_summaries.csv"
        )
        all_summaries_df = q2_moshpit.busco.utils._collect_summaries_and_save(
            all_summaries_path=all_summaries_path,
            path_to_run_summaries=path_to_run_summaries,
        )

        # Draw BUSCO plots for all samples
        # Result NOT included in final output
        plots_dir = os.path.join(tmp, "plots")
        paths_to_plots = q2_moshpit.busco.utils._draw_busco_plots(
            path_to_run_summaries=path_to_run_summaries,
            plots_dir=plots_dir
        )

        # Zip graphs for user download
        # Result included in final output (file for download)
        zip_name = os.path.join(output_dir, "busco_plots.zip")
        q2_moshpit.busco.utils._zip_busco_plots(
            paths_to_plots=paths_to_plots,
            zip_path=zip_name
        )

        # Render qiime html report
        # Result included in final output
        _render_html(output_dir, all_summaries_df)
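
The `locals()` idiom used to build `kwargs` in `evaluate_busco` depends on being the first statement in the function body: at that point the only local names are the parameters. A minimal sketch of the pattern and of what happens when another local is bound first. `run_tool`, `run_tool_late` and their parameters are hypothetical and exist only to illustrate the idiom.

```python
def run_tool(output_dir, bins, mode="genome", cpu=1, force=False):
    # locals() is called before any other local name is bound, so it
    # holds exactly the parameters (passed values or their defaults).
    kwargs = {
        k: v for k, v in locals().items() if k not in ["output_dir", "bins"]
    }
    return kwargs


def run_tool_late(output_dir, bins, mode="genome"):
    tmp_name = "scratch"  # bound before locals() is called, so it leaks in
    kwargs = {
        k: v for k, v in locals().items() if k not in ["output_dir", "bins"]
    }
    return kwargs


early = run_tool("out", None, cpu=4)
late = run_tool_late("out", None)
print(early)
print(late)
```

In the second function the stray `tmp_name` ends up among the collected arguments, which is why the dictionary is best built before anything else in the body.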