Showing 5 changed files with 167 additions and 0 deletions.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,53 @@ | ||
--- | ||
jupytext: | ||
formats: md:myst | ||
text_representation: | ||
extension: .md | ||
format_name: myst | ||
format_version: 0.13 | ||
jupytext_version: 1.11.5 | ||
kernelspec: | ||
display_name: Python 3 | ||
language: python | ||
name: python3 | ||
--- | ||
(data-export)= | ||
# Exporting data and connecting with other tools | ||
QIIME 2 offers various ways of visualizing and processing your data further, but sometimes you may want to use other tools | ||
that are not (yet) available through QIIME 2. This is, of course, possible and very easy to do: you can export your data | ||
from any QIIME 2 artifact and use it with any of your other favourite tools, as long as the underlying format is compatible. | ||
The formats that QIIME 2 supports are common and should be readable by most bioinformatics tools - most of the time, the | ||
artifacts will contain data in the original format that the underlying tool uses. Below are some examples of how you can | ||
export data from QIIME 2 and connect it with other tools. | ||
|
||
```{warning} | ||
QIIME 2 does not yet support exporting data from the cache. This means that you will need to manually copy the data from the | ||
cache directory to a location where you can access it with other tools. In our examples, the cache directory is located directly | ||
in the working directory and that is where we will copy the data from. Keep in mind that you should never tamper with the files | ||
in the cache directory directly, as this may lead to broken artifacts and failed analyses. | ||
``` | ||
|
||
## Visualizing Kraken 2 reports with Pavian | ||
If you have used Kraken 2 to [classify your reads](kraken-reads), you can export the resulting reports from the corresponding | ||
QIIME 2 artifact and visualize them with [Pavian](https://github.com/fbreitwieser/pavian), which will allow you to explore the | ||
taxonomic composition of your samples in an interactive way. To export the Kraken 2 reports, you can use the following commands: | ||
```bash | ||
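# Look up the UUID of the reports artifact in its cache key file | ||
UUID=$(grep 'data' ./cache/keys/kraken_reports_reads | awk '{print $2}') | ||
# Copy the artifact's data out of the cache into a new directory | ||
mkdir -p exported_reports | ||
cp -r "./cache/data/$UUID/data/"* exported_reports/ | ||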
``` | ||
This will find the UUID of the reports artifact, use it to locate the data within the cache directory, create a directory | ||
for the exported data and copy the files from the cache into it. You can then use those files (within the `exported_reports` | ||
directory) with Pavian. To give it a quick try, navigate to [Pavian's demo site](https://fbreitwieser.shinyapps.io/pavian/) | ||
and upload the exported files. | ||
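The same steps can also be scripted in Python. The following is a minimal sketch and not part of QIIME 2: the helper name `export_from_cache` is hypothetical, and it assumes that the key file stores the artifact UUID as the token following `data:` and that the artifact's files live under `<cache>/data/<UUID>/data`, mirroring the shell commands above. Like those commands, it only reads from the cache and copies files out of it.

```python
import shutil
from pathlib import Path


def export_from_cache(cache_dir, key, out_dir):
    """Copy the data files of a cached artifact into out_dir.

    Hypothetical helper: assumes the key file lists the artifact UUID
    as the token that follows 'data:'.
    """
    cache = Path(cache_dir)
    tokens = (cache / "keys" / key).read_text().split()
    if "data:" not in tokens or tokens.index("data:") + 1 >= len(tokens):
        raise ValueError(f"No data UUID found for key '{key}'")
    uuid = tokens[tokens.index("data:") + 1]

    source = cache / "data" / uuid / "data"
    target = Path(out_dir)
    # Copy only; the cache itself is never modified.
    shutil.copytree(source, target, dirs_exist_ok=True)
    return target
```

For the example above, this could be called as `export_from_cache("./cache", "kraken_reports_reads", "exported_reports")`.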
|
||
## Microbial pangenomics with Anvi'o | ||
Another suite of tools you may be familiar with is the [Anvi'o](http://anvio.org/) platform. One of the workflows that Anvi'o | ||
provides is the microbial pangenomics analysis, which can be used to explore the gene clusters within your samples. You | ||
could export the MAGs obtained from the [binning step](mag-recovery) and use them as input to the `anvi-pan-genome` workflow, as | ||
described [here](https://merenlab.org/2016/11/08/pangenomics-v2/). To export the MAGs, you can use the following commands: | ||
```bash | ||
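# Look up the UUID of the MAGs artifact in its cache key file | ||
UUID=$(grep 'data' ./cache/keys/mags | awk '{print $2}') | ||
# Copy the artifact's data out of the cache into a new directory | ||
mkdir -p exported_mags | ||
cp -r "./cache/data/$UUID/data/"* exported_mags/ | ||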
``` |
@@ -0,0 +1,89 @@ | ||
--- | ||
jupytext: | ||
formats: md:myst | ||
text_representation: | ||
extension: .md | ||
format_name: myst | ||
format_version: 0.13 | ||
jupytext_version: 1.11.5 | ||
kernelspec: | ||
display_name: Python 3 | ||
language: python | ||
name: python3 | ||
--- | ||
(data-import)= | ||
# Importing data from other tools | ||
The MOSHPIT pipeline allows you to start working directly with the NGS reads, which you can take through various analyses, | ||
like contig assembly, binning, and annotation. However, if you have already performed some of these steps outside of QIIME 2, | ||
you can import the results into an appropriate QIIME 2 artifact and continue from there. Below you can see some examples and | ||
use cases where this may be relevant. | ||
|
||
## Working with existing contigs | ||
In case you already have contigs assembled from your metagenomic data, you can import them into a `SampleData[Contigs]` | ||
artifact. This should not differ much from the typical import process (see [here](https://docs.qiime2.org/2024.10/tutorials/importing/) | ||
for more details on importing data), but the command may look like: | ||
```bash | ||
qiime tools cache-import \ | ||
--cache ./cache \ | ||
--key contigs \ | ||
--type "SampleData[Contigs]" \ | ||
--input-path ./<directory with contig FASTA files> | ||
``` | ||
Some actions in the MOSHPIT pipeline assume that contig IDs are unique across your entire sample set. If this is not the case, | ||
you may use the `qiime assembly rename-contigs` action to rename contigs with unique identifiers: | ||
```bash | ||
qiime assembly rename-contigs \ | ||
--i-contigs ./cache:contigs \ | ||
--p-uuid-type shortuuid \ | ||
--o-renamed-contigs ./cache:contigs_renamed | ||
``` | ||
From here, you should be able to continue with the rest of the MOSHPIT pipeline as described in our tutorials. | ||
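To find out whether renaming is needed at all, you could first check the contig IDs for duplicates. The sketch below is an illustration only: the function name `find_duplicate_contig_ids` and the `*.fa` file pattern are assumptions, and the contig ID is taken to be the first whitespace-delimited word of each FASTA header.

```python
from collections import Counter
from pathlib import Path


def find_duplicate_contig_ids(fasta_dir, pattern="*.fa"):
    """Return contig IDs that occur more than once across all FASTA files."""
    counts = Counter()
    for fasta in sorted(Path(fasta_dir).glob(pattern)):
        with open(fasta) as handle:
            for line in handle:
                if not line.startswith(">"):
                    continue
                # The ID is the first word after the '>' of a header line.
                words = line[1:].split()
                if words:
                    counts[words[0]] += 1
    return sorted(cid for cid, n in counts.items() if n > 1)
```

An empty list means that all contig IDs found in the directory are unique; any IDs returned are shared between at least two records.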
|
||
## Working with existing MAGs | ||
You may also be interested in continuing your analysis with MAGs that you have already recovered using other tools. | ||
In this case, you can import the MAGs into a `SampleData[MAGs]` (non-dereplicated) or `FeatureData[MAG]` (dereplicated) | ||
artifact. Before you do that, you will need to rename each MAG's FASTA file using the [UUID4](https://en.wikipedia.org/wiki/Universally_unique_identifier#Version_4_(random)) | ||
format: this is required to ensure that MAG IDs are unique across your entire sample set. Here is a sample Python script | ||
which could be used for that purpose: | ||
```python | ||
import os | ||
from uuid import uuid4 | ||
path = 'path/to/your/mag/directory/' | ||
|
||
# Rename every file in the directory to "<random UUID4>.fa" | ||
for file in os.listdir(path): | ||
    os.rename(os.path.join(path, file), os.path.join(path, f'{uuid4()}.fa')) | ||
``` | ||
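Before importing, it may be worth confirming that every file name really follows the UUID4 format. The following is a minimal sketch under stated assumptions: the helper names `is_uuid4_name` and `find_invalid_mag_names` are hypothetical, only a single flat directory is inspected, and a name counts as valid only if it is the lowercase, hyphenated form that `str(uuid4())` produces.

```python
from pathlib import Path
from uuid import UUID


def is_uuid4_name(filename):
    """Return True if the file name (without extension) is a canonical UUID4."""
    stem = Path(filename).stem
    try:
        parsed = UUID(stem)
    except ValueError:
        return False
    # UUID() also accepts other spellings (e.g. without hyphens), so compare
    # against the canonical string form and check the version explicitly.
    return parsed.version == 4 and str(parsed) == stem


def find_invalid_mag_names(mag_dir):
    """List files in mag_dir whose names are not UUID4-based."""
    return sorted(
        p.name for p in Path(mag_dir).iterdir()
        if p.is_file() and not is_uuid4_name(p.name)
    )
```

An empty list from `find_invalid_mag_names('path/to/your/mag/directory/')` suggests that the renaming step worked as intended.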
Once you have renamed the MAGs, you can import them into a QIIME 2 artifact: | ||
```bash | ||
qiime tools cache-import \ | ||
--cache ./cache \ | ||
--key mags \ | ||
--type "SampleData[MAGs]" \ | ||
--input-path ./<directory with MAG FASTA files per sample> | ||
``` | ||
for MAGs-per-sample, or: | ||
```bash | ||
qiime tools cache-import \ | ||
--cache ./cache \ | ||
--key mags \ | ||
--type "FeatureData[MAG]" \ | ||
--input-path ./<directory with MAG FASTA files> | ||
``` | ||
for dereplicated MAGs. From here, you should be able to continue with the rest of the MOSHPIT pipeline as described in our tutorials. | ||
|
||
## Importing other data | ||
If you have other data that you would like to import into QIIME 2, you can use the `qiime tools cache-import` command - no | ||
additional steps should be required. For example, you can import a set of Kraken 2 reports into a `SampleData[Kraken2Report % Properties('reads')]` | ||
artifact like this: | ||
```bash | ||
qiime tools cache-import \ | ||
--cache ./cache \ | ||
--key kraken2_reports_reads \ | ||
--type "SampleData[Kraken2Report % Properties('reads')]" \ | ||
--input-path ./<directory with Kraken 2 reports> | ||
``` | ||
|
||
```{note} | ||
Remember: you can import any existing data into QIIME 2 artifacts, as long as it matches the format required by the respective | ||
QIIME 2 semantic type. | ||
``` |
@@ -0,0 +1,18 @@ | ||
--- | ||
jupytext: | ||
formats: md:myst | ||
text_representation: | ||
extension: .md | ||
format_name: myst | ||
format_version: 0.13 | ||
jupytext_version: 1.11.5 | ||
kernelspec: | ||
display_name: Python 3 | ||
language: python | ||
name: python3 | ||
--- | ||
(interoperability)= | ||
# Interoperability with other tools | ||
While most of the typical steps in a metagenomic analysis can be performed within QIIME 2, there are cases where you | ||
might want to use other tools to perform certain tasks. In this chapter, we will show you how you can get data into | ||
and out of QIIME 2 artifacts to continue your analysis workflow elsewhere. |