MAG Quality Control

Santiago Castro Dau edited this page Apr 16, 2024 · 5 revisions

Once you have a QIIME 2 artifact containing your MAGs (e.g. mags.qza), you can use BUSCO within q2-moshpit to evaluate their quality. Alternatively, you can use the q2-checkm plugin to evaluate them with CheckM.

You can find out more about how to generate a mags.qza artifact in the Generate MAGs from Reads tutorial.

Evaluate MAGs with BUSCO

First, we need to download the BUSCO database. We can do so using the fetch-busco-db action from q2-moshpit. The following command downloads the BUSCO lineage datasets for all prokaryotes.

Estimated runtime: 45 minutes

qiime moshpit fetch-busco-db \
  --p-virus False \
  --p-prok True \
  --p-euk False \
  --o-busco-db busco_prok_db.qza \
  --verbose

Then we can run BUSCO using the evaluate-busco action. We assume here that you have a valid SampleData[MAGs] artifact (named mags.qza) to use as input.

Estimated runtime: 1.5 minutes

qiime moshpit evaluate-busco \
  --i-bins mags.qza \
  --i-busco-db busco_prok_db.qza \
  --p-lineage-dataset bacteria_odb10 \
  --p-cpu 6 \
  --o-visualization busco.qzv \
  --verbose
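Because the evaluation takes two artifacts as input, it can help to confirm that both files are present before starting it. The following is a minimal sketch of such a check. The helper name `require_files` is hypothetical and not part of QIIME 2 or q2-moshpit; it only tests that each path exists and is non-empty, not that it is a valid artifact.

```shell
# Hypothetical helper (not part of QIIME 2): succeed only if every path
# given as an argument exists and is non-empty.
require_files() {
  local missing=0
  local path
  for path in "$@"; do
    if [ ! -s "$path" ]; then
      # Report each problem on stderr so all missing inputs are listed at once.
      echo "Missing or empty input: $path" >&2
      missing=1
    fi
  done
  return "$missing"
}
```

One way to use it is to run `require_files mags.qza busco_prok_db.qza` first and launch evaluate-busco only if it succeeds.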

Evaluate MAGs with CheckM

We can use the evaluate-bins action from q2-checkm to evaluate the quality of the generated MAGs. This action uses CheckM to estimate their completeness and contamination. Before running the evaluation, we need to download the CheckM database:

Estimated runtime: 2 minutes

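# Download the CheckM reference data archive
curl -L -o checkm-db.tar.gz https://data.ace.uq.edu.au/public/CheckM_databases/checkm_data_2015_01_16.tar.gz
# Extract it into a dedicated directory (-p makes the step safe to re-run)
mkdir -p checkm_db
tar -xzf checkm-db.tar.gz -C checkm_db
# Remove the archive once it has been extracted
rm checkm-db.tar.gz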
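An interrupted download or extraction can leave checkm_db empty, which would only surface later during the evaluation. As a rough sanity check, the sketch below tests that the directory exists and contains at least one file. The helper name `db_dir_ready` is hypothetical, and the check does not verify that the contents are a complete CheckM database.

```shell
# Hypothetical helper: succeed if the directory exists and holds at least
# one regular file anywhere beneath it.
db_dir_ready() {
  local dir="$1"
  [ -d "$dir" ] || return 1
  # find lists regular files; head keeps only the first match.
  [ -n "$(find "$dir" -type f | head -n 1)" ]
}
```

For example, `db_dir_ready checkm_db || echo "checkm_db looks empty" >&2` prints a warning when the extraction did not produce any files.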

Now, run the binning evaluation:

Estimated runtime: 25 minutes

qiime checkm evaluate-bins \
  --i-bins mags.qza \
  --p-db-path ./checkm_db \
  --p-reduced-tree \
  --p-threads 8 \
  --p-pplacer-threads 6 \
  --o-visualization mags.qzv \
  --verbose
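The thread counts in the command above (8 and 6) are fixed values and may exceed the number of cores on a smaller machine. The sketch below shows one possible way to cap a requested thread count at the number of available cores. The helper name `clamp_threads` is hypothetical, and core detection assumes that `getconf _NPROCESSORS_ONLN` is supported on your system; it falls back to 1 otherwise.

```shell
# Hypothetical helper: print the requested thread count, capped at the number
# of available cores. An optional second argument overrides core detection.
clamp_threads() {
  local requested="$1"
  local available="${2:-}"
  if [ -z "$available" ]; then
    # Assumption: getconf reports the online core count; fall back to 1.
    available="$(getconf _NPROCESSORS_ONLN 2>/dev/null || echo 1)"
  fi
  if [ "$requested" -gt "$available" ]; then
    echo "$available"
  else
    echo "$requested"
  fi
}
```

The result can then be substituted into the command, for example `--p-threads "$(clamp_threads 8)"`.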
