nf-m6anet is a Nextflow pipeline for m6A detection from Nanopore direct RNA-seq data, based on m6anet. Starting from raw fast5 and fastq files, it aligns the sequencing reads to the transcriptome with minimap2, performs resquiggling with f5c, a re-implementation of Nanopolish, and runs m6anet for m6A detection. It then filters high-quality m6A+ sites and lifts their coordinates over from transcriptome-based to genome-based.
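As a rough illustration of the site-filtering step, the sketch below keeps only the sites whose modification probability reaches a threshold. It is not the pipeline's own code. The column name `probability_modified`, the `>=` comparison, the helper name `filter_sites` and the example values are all assumptions made for illustration.

```python
import csv
import io


def filter_sites(rows, prob_mod_thr, column="probability_modified"):
    """Keep the rows whose modification probability is at least prob_mod_thr.

    The column name and the inclusive comparison are assumptions.
    """
    return [row for row in rows if float(row[column]) >= prob_mod_thr]


# Made-up example values, for illustration only.
table = io.StringIO(
    "transcript_id,transcript_position,probability_modified\n"
    "tx1,100,0.95\n"
    "tx1,250,0.40\n"
    "tx2,75,0.90\n"
)
kept = filter_sites(csv.DictReader(table), prob_mod_thr=0.9)
for row in kept:
    print(row["transcript_id"], row["transcript_position"], row["probability_modified"])
```

A higher threshold gives fewer, more confident m6A+ calls; a lower one gives more calls at the cost of more false positives.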
Prerequisites
Installation
git clone https://github.com/MaestSi/nf-m6anet.git
cd nf-m6anet
chmod 755 *
Before running the nf-m6anet pipeline, open the nf-m6anet.conf configuration file and set the desired options. You can then run the pipeline in either a docker or a singularity environment by specifying the corresponding value for the -profile option.
Usage:
nextflow -c nf-m6anet.conf run nf-m6anet.nf --samples="/path/to/samples.txt" --resultsDir="/path/to/resultsDir" -profile docker
Mandatory argument:
-profile Configuration profile to use. Available: docker, singularity
Other mandatory arguments, which may be specified in the nf-m6anet.conf file:
--samples Path to the tab-separated sample file listing, for each sample, the sample name, condition, path to the fast5 folder and path to the fastq file
--resultsDir Path to the folder where results are stored
--transcriptome_fasta Path to the transcriptome fasta file
--gtf Path to the genome annotation gtf file
--min_mapq Minimum mapping quality
--prob_mod_thr Modification probability threshold for calling a site as m6A+
--optArgs_f5c Optional arguments for f5c, for example "--kmer-model /path/to/rna004.nucleotide.5mer.model"
--optArgs_m6anet Optional arguments for m6anet, for example "--pretrained_model HEK293T_RNA004" or "--pretrained_model arabidopsis_RNA002"
--postprocessingScript Path to the Transcript_to_genome.R script
--bulkLevelScript Path to the Calculate_m6anet_bulk.R script
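Because the --samples file is a plain tab-separated table, a quick sanity check before launching the pipeline can catch formatting mistakes early. The following is a minimal sketch, not part of nf-m6anet. It assumes four columns in the order listed above (sample name, condition, fast5 folder, fastq file) and no header line; the helper name `check_samples` and the example paths are hypothetical.

```python
import csv
import io

# Assumed layout: sample name, condition, fast5 folder, fastq file.
EXPECTED_COLUMNS = 4


def check_samples(handle):
    """Return a list of (line_number, message) problems found in a samples file."""
    problems = []
    for line_number, row in enumerate(csv.reader(handle, delimiter="\t"), start=1):
        if not row or all(not field.strip() for field in row):
            continue  # ignore blank lines
        if len(row) != EXPECTED_COLUMNS:
            problems.append(
                (line_number, f"expected {EXPECTED_COLUMNS} tab-separated fields, found {len(row)}")
            )
            continue
        if any(not field.strip() for field in row):
            problems.append((line_number, "empty field"))
    return problems


# Hypothetical content: the second line uses spaces instead of tabs.
example = io.StringIO(
    "sampleA\tWT\t/path/to/sampleA/fast5\t/path/to/sampleA.fastq\n"
    "sampleB KO /path/to/sampleB/fast5 /path/to/sampleB.fastq\n"
)
for line_number, message in check_samples(example):
    print(f"line {line_number}: {message}")
```

To check a real file, pass an open file handle instead of the in-memory example, for instance `check_samples(open("/path/to/samples.txt", newline=""))`.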
Please refer to the following manuscripts for further information:
Hendra, C., Pratanwanich, P.N., Wan, Y.K. et al. Detection of m6A from direct RNA sequencing using a multiple instance learning framework. Nat Methods (2022). https://doi.org/10.1038/s41592-022-01666-1
Workman, R.E., Tang, A.D., Tang, P.S. et al. Nanopore native RNA sequencing of a human poly(A) transcriptome. Nat Methods 16, 1297-1305 (2019). https://doi.org/10.1038/s41592-019-0617-2
Gamaarachchi, H., Lam, C.W., Jayatilaka, G. et al. GPU accelerated adaptive banded event alignment for rapid comparative nanopore signal analysis. BMC Bioinformatics 21, 343 (2020). https://doi.org/10.1186/s12859-020-03697-x
Li, H. (2018). Minimap2: pairwise alignment for nucleotide sequences. Bioinformatics, 34:3094-3100. doi:10.1093/bioinformatics/bty191