The pipeline is designed to analyze real metagenomic data produced by ONT sequencing. Because real metagenomic data are complex, the pipeline takes a two-step approach: first, the assembly is split into Metagenome-Assembled Genomes (MAGs); then stRainy is applied to each MAG individually. Please note that only MAGs with coverage > 30, contamination < 20 and completeness > 80 will be phased.
Use ONT reads as input data.
- flye_output - initial metagenome assembly
- bins - initial MAG binning
- qa_bins - initial MAG quality
- strainy_final - phased MAGs
- transformed_bins - phased MAG FASTA files
- qa_transformed_bins - phased MAG quality
git clone https://github.com/katerinakazantseva/MetagenomeStrainy_ONT_pipeline.git
cd MetagenomeStrainy_ONT_pipeline
- Snakemake conda environment
- stRainy conda environment
- Clair3 conda environment
- CheckM2 conda environment
- MetaBAT2 installed in the Snakemake conda environment
- Build a metagenome assembly with metaFlye
- Call MAGs with MetaBAT2
- Check the quality of MAGs with CheckM2
- Filter MAGs by completeness, coverage and contamination
- Phase each MAG with stRainy
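The filtering step can be sketched in Python as follows. This is a minimal illustration, not the pipeline's actual code: the `MagStats` record, its field names and the helper functions are hypothetical, and only the three thresholds are taken from this README. Strict inequalities are assumed.

```python
from dataclasses import dataclass

# Thresholds stated in this README (strict inequalities assumed).
MIN_COVERAGE = 30.0
MAX_CONTAMINATION = 20.0
MIN_COMPLETENESS = 80.0


@dataclass
class MagStats:
    # Hypothetical record; the pipeline's real data layout may differ.
    name: str
    coverage: float
    completeness: float
    contamination: float


def passes_filter(mag: MagStats) -> bool:
    """Return True if the MAG meets all three phasing criteria."""
    return (
        mag.coverage > MIN_COVERAGE
        and mag.contamination < MAX_CONTAMINATION
        and mag.completeness > MIN_COMPLETENESS
    )


def select_mags(mags):
    """Return the names of the MAGs that would be passed on to phasing."""
    return [m.name for m in mags if passes_filter(m)]


if __name__ == "__main__":
    # Made-up values, for illustration only.
    example = [
        MagStats("bin.1", coverage=45.0, completeness=95.2, contamination=1.3),
        MagStats("bin.2", coverage=12.0, completeness=99.0, contamination=0.5),
    ]
    print(select_mags(example))
```

A MAG that fails any single criterion is skipped, so a highly complete bin with low coverage is not phased.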
Before running the pipeline, please update the following parameters in the snakemake file:
- input_reads - reads path (.fq)
- flye_path - Flye path
- conda_path - Conda envs path (e.g. "~/miniconda3/envs/")
- strainy_path - stRainy path
- clair_model_path - Clair model path (r1041_e82_400bps_sup_v420 is recommended, see https://github.com/nanoporetech/rerio)
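Before launching a long run, it may help to confirm that every configured path exists. The sketch below is a hypothetical helper, not part of the pipeline: `missing_paths` is an illustrative name, and the values in the example dictionary are placeholders to be replaced with the paths set in the snakemake file.

```python
from pathlib import Path


def missing_paths(params):
    """Return the names of parameters whose path does not exist."""
    missing = []
    for name, value in params.items():
        # expanduser() resolves a leading "~" such as in "~/miniconda3/envs/".
        if not Path(value).expanduser().exists():
            missing.append(name)
    return missing


if __name__ == "__main__":
    # Placeholder values, for illustration only.
    params = {
        "input_reads": "reads.fq",
        "flye_path": "Flye",
        "conda_path": "~/miniconda3/envs/",
        "strainy_path": "stRainy",
        "clair_model_path": "r1041_e82_400bps_sup_v420",
    }
    for name in missing_paths(params):
        print(f"{name}: path not found")
```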
conda activate snakemake
snakemake --snakefile snakemake --cores 30 --use-conda
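If the run is launched from a script, the same command can be assembled in Python. This is only a sketch: `build_command` is a hypothetical helper, and the optional `--dry-run` flag (a standard Snakemake option that lists the planned jobs without executing them) is an addition not mentioned in this README.

```python
def build_command(snakefile="snakemake", cores=30, dry_run=False):
    """Assemble the snakemake invocation as an argument list."""
    cmd = [
        "snakemake",
        "--snakefile", snakefile,
        "--cores", str(cores),
        "--use-conda",
    ]
    if dry_run:
        # Assumption: a dry run is wanted before the real one.
        cmd.append("--dry-run")
    return cmd


if __name__ == "__main__":
    # Print the command instead of executing it.
    print(" ".join(build_command(dry_run=True)))
```

Passing the returned list to `subprocess.run(..., check=True)` inside the activated snakemake environment would execute it.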