Generate MAGs from Reads
In this tutorial, we will transform raw sequencing reads into Metagenome-Assembled Genomes (MAGs) using QIIME 2. Before you start, make sure you have a working virtual environment (instructions available here).
Approximate runtime: 5 hours
We will not use real data in this tutorial; instead, we will simulate a small dataset from a set of genomes of known origin. Feel free to skip this step if you already have read data you would like to apply this workflow to. Check the QIIME 2 documentation to find out how to import your reads into QIIME 2.
To simulate reads, we can use the generate-reads command from the q2-assembly plugin.
We can specify how many samples should be generated, how many reads they should contain, and which abundance distribution to use.
For now, let's simulate 2 samples with 500000 reads (uniform abundance distribution for 5 random genomes):
Estimated runtime: 45 minutes
# Download genomes
cd <download_here>
curl -L -o genomes.qza https://github.com/bokulich-lab/q2-moshpit/wiki/genomes.qza
# Simulate reads (in bash, sample{1,2} expands to: sample1 sample2)
qiime assembly generate-reads \
--i-genomes genomes.qza \
--p-sample-names sample{1,2} \
--p-n-reads 500000 \
--p-abundance uniform \
--p-n-genomes 5 \
--p-cpus 7 \
--output-dir reads \
--verbose
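Since the simulation takes a while, it can be worth confirming that the downloaded genomes.qza is a real archive before starting. QIIME 2 artifacts (.qza) and visualizations (.qzv) are zip files, so a cheap check is whether the file starts with the zip signature "PK". The sketch below is only a sanity check, not a full validation, and the helper name looks_like_zip is made up for this example:

```shell
# Hypothetical helper (not part of QIIME 2): succeed if the file is
# non-empty and starts with the zip magic bytes "PK".
looks_like_zip() {
  [ -s "$1" ] && [ "$(head -c 2 "$1")" = "PK" ]
}

# Check the downloaded file, if it is present in the current directory.
if [ -e genomes.qza ]; then
  if looks_like_zip genomes.qza; then
    echo "genomes.qza looks like a zip archive"
  else
    echo "genomes.qza is not a zip archive; re-check the download" >&2
  fi
fi
```

If the check fails, the download most likely returned an error page instead of the artifact.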
We can use the simulated reads to perform metagenome assembly. There are two assemblers available in the q2-assembly plugin: we will use the MEGAHIT assembler in this tutorial, but feel free to try out the MetaSPAdes assembler as well.
Estimated runtime: 60 minutes
qiime assembly assemble-megahit \
--i-seqs reads/reads.qza \
--p-presets meta-sensitive \
--o-contigs contigs.qza \
--verbose \
--p-num-partitions 3 \
--parallel
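Once the assembly has finished, we can evaluate the resulting contigs with the evaluate-contigs action, which writes its report to a visualization:
Estimated runtime: 6 minutes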
qiime assembly evaluate-contigs \
--i-contigs contigs.qza \
--p-min-contig 100 \
--o-visualization contigs.qzv \
--verbose
Before we perform the actual binning (MAG generation), we will need to map the reads to the assembled contigs. The resulting alignment map can then be used directly in the binning action.
We begin by generating a Bowtie2 index of the assembled contigs. This can be achieved by using the index-contigs action from the q2-assembly plugin:
Estimated runtime: 60 seconds
qiime assembly index-contigs \
--i-contigs contigs.qza \
--p-threads 7 \
--p-seed 100 \
--o-index contigs-index.qza \
--verbose
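The commands in this tutorial hard-code 7 threads, which may not match your machine. One way to adapt them is to derive a thread count from the number of available CPUs, as in the sketch below. The variable names THREADS and total_cpus are assumptions made for this example, and leaving one CPU free is a convention rather than a requirement of any tool used here:

```shell
# Detect the number of online CPUs; fall back to 1 if neither tool is available.
if command -v nproc >/dev/null 2>&1; then
  total_cpus=$(nproc)
else
  total_cpus=$(getconf _NPROCESSORS_ONLN 2>/dev/null || echo 1)
fi

# Leave one CPU free for the rest of the system, but never go below 1.
if [ "$total_cpus" -gt 1 ]; then
  THREADS=$((total_cpus - 1))
else
  THREADS=1
fi
echo "Using $THREADS threads"
```

You could then pass "$THREADS" wherever the tutorial uses 7, for example --p-threads "$THREADS".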
Next, we will generate a reads-to-contigs alignment map using the map-reads-to-contigs action from q2-assembly:
Estimated runtime: 60 seconds
qiime assembly map-reads-to-contigs \
--i-indexed-contigs contigs-index.qza \
--i-reads reads/reads.qza \
--p-threads 7 \
--p-seed 100 \
--o-alignment-map reads-to-contigs-aln.qza \
--verbose
Finally, we are ready to perform contig binning using MetaBAT2 through the bin-contigs-metabat action from q2-moshpit:
Estimated runtime: 60 seconds
qiime moshpit bin-contigs-metabat \
--i-contigs contigs.qza \
--i-alignment-maps reads-to-contigs-aln.qza \
--p-num-threads 7 \
--p-seed 100 \
--o-mags mags.qza \
--o-contig-map map.qza \
--o-unbinned-contigs unbinned.qza \
--verbose
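Because the full workflow takes hours, you may want to re-run it after an interruption without repeating steps that already finished. The sketch below shows one possible approach: a small wrapper that runs a command only when its expected output file is missing. The helper name run_step is hypothetical and not part of QIIME 2, and the check is based only on whether the file exists, so a partial output left by a failed run would need to be deleted by hand:

```shell
# Hypothetical helper: run a command only if its expected output is missing.
# Usage: run_step <expected-output-path> <command> [args...]
run_step() {
  expected="$1"
  shift
  if [ -e "$expected" ]; then
    echo "skip: $expected already exists"
    return 0
  fi
  echo "run: $*"
  "$@"
}

# Example (commented out, since it needs the artifacts from this tutorial):
# run_step contigs-index.qza qiime assembly index-contigs \
#   --i-contigs contigs.qza --p-threads 7 --p-seed 100 \
#   --o-index contigs-index.qza --verbose
```

Wrapping each step this way turns the tutorial into a script that can be restarted from the first missing artifact.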
Once you have obtained your MAGs, you can use q2-moshpit for quality control, gene prediction, or functional annotation.