Home
Miles Woodcock-Girard edited this page Jun 22, 2023
·
23 revisions
Semblans is a tool that enables the automatic assembly of de novo transcriptomes for non-model organisms.
By combining several external packages with the performance of C++ data streaming, Semblans streamlines the necessary pre-processing, quality control, assembly, and post-assembly steps, allowing a ‘hands-off’ assembly process without loss of versatility.
Semblans employs a variety of powerful third-party tools in its pipeline. Users are encouraged to read about each of these packages individually. They are listed below:
- Rcorrector
  - Song, L., Florea, L. Rcorrector: efficient and accurate error correction for Illumina RNA-seq reads. GigaSci 4, 48 (2015). https://doi.org/10.1186/s13742-015-0089-y
- Trimmomatic
  - Bolger, A. M., Lohse, M., & Usadel, B. (2014). Trimmomatic: A flexible trimmer for Illumina Sequence Data. Bioinformatics, btu170.
- Kraken2
  - Wood, D.E., Lu, J. & Langmead, B. Improved metagenomic analysis with Kraken 2. Genome Biol 20, 257 (2019). https://doi.org/10.1186/s13059-019-1891-0
- Trinity
  - Haas, B., Papanicolaou, A., Yassour, M. et al. De novo transcript sequence reconstruction from RNA-seq using the Trinity platform for reference generation and analysis. Nat Protoc 8, 1494–1512 (2013). https://doi.org/10.1038/nprot.2013.084
- Salmon
  - Patro, R., Duggal, G., Love, M. I., Irizarry, R. A., & Kingsford, C. (2017). Salmon provides fast and bias-aware quantification of transcript expression. Nature Methods.
- Corset
  - Davidson, N.M., Oshlack, A. Corset: enabling differential gene expression analysis for de novo assembled transcriptomes. Genome Biol 15, 410 (2014). https://doi.org/10.1186/s13059-014-0410-6
- TransDecoder
- SRA-Tools
- BLAST+
  - Camacho C, Coulouris G, Avagyan V, Ma N, Papadopoulos J, Bealer K, Madden TL. BLAST+: architecture and applications. BMC Bioinformatics. 2009 Dec 15;10:421. doi: 10.1186/1471-2105-10-421. PMID: 20003500; PMCID: PMC2803857.
- Diamond
  - Buchfink B, Reuter K, Drost HG, "Sensitive protein alignments at tree-of-life scale using DIAMOND", Nature Methods 18, 366–368 (2021). doi:10.1038/s41592-021-01101-x
- FastQC
  - Andrews, S., https://github.com/s-andrews/FastQC
- HMMER
  - Eddy et al., hmmer.org
- Panther DB, Scoring Tool
  - Paul D. Thomas, Dustin Ebert, Anushya Muruganujan, Tremayne Mushayahama, Laurent-Philippe Albou and Huaiyu Mi. PANTHER: Making genome-scale phylogenetics accessible to all. Protein Science. 2022;31(1):8-22. doi:10.1002/pro.4218
In addition to external packages, Semblans employs external C++ libraries for many of its under-the-hood operations. These are listed below: