An introduction to Genome Annotation of non-model organisms
This is the GitHub repo for the Genome Annotation Workshop. The workshop focuses on annotating genomes of non-model organisms using a custom pipeline of multiple tools. It implements different strategies, such as homology-based, ab initio, and de novo approaches, using a combination of short and long reads (Iso-Seq) available on NCBI. As a demonstration, a single chromosome from each of three different organisms is used.
To install this site locally, run the following commands:
Clone the repo and cd into it:
git clone https://github.com/francicco/GenomeAnnotationWorkshop2024.git
cd GenomeAnnotationWorkshop2024
Install the following software and their dependencies:
STAR | Minimap2 | Samtools | Bedtools | Diamond | Miniprot | BRAKER | Cufflinks | BUSCO | compleasm | Stringtie | Scallop | IsoQuant | Trinity | TransDecoder | Portcullis | Mikado | Miniprot2SplicedNucl.py
| IsoQuantGTF2BED12v0.1.py
| sam2psl.py
| Analyze_Diamond_topHit_coverage.R
| UniProtDB ftp://ftp.uniprot.org/pub/databases/uniprot/current_release/knowledgebase/complete/uniprot_sprot.fasta.gz
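Before starting, it can help to confirm that the tools listed above are reachable on your PATH. The following is a minimal sketch, not part of the workshop scripts: `check_tools` is a hypothetical helper, and the executable names passed to it (for example `braker.pl`, `isoquant.py`, `TransDecoder.LongOrfs`) are assumptions that may differ depending on how each tool was installed.

```shell
#!/usr/bin/env bash
# check_tools: hypothetical helper, not part of the workshop repo.
# Prints one line per executable and returns the number of missing ones.
check_tools() {
    local missing=0
    local tool
    for tool in "$@"; do
        if command -v "$tool" >/dev/null 2>&1; then
            echo "found:   $tool"
        else
            echo "MISSING: $tool"
            missing=$((missing + 1))
        fi
    done
    return "$missing"
}

# Executable names are assumptions; adjust them to match your installation.
check_tools STAR minimap2 samtools bedtools diamond miniprot braker.pl \
    cufflinks busco compleasm stringtie scallop isoquant.py Trinity \
    TransDecoder.LongOrfs portcullis mikado \
    || echo "Some tools are missing from PATH (see the lines marked MISSING)."
```

If the tools live in a conda environment, activate that environment before running the check.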
The workshop is divided into five sections:
- RNA-Seq mapping on the reference genome (short reads & Iso-Seq)
- Homology- and evidence-based prediction of protein-coding genes (PCGs) using BRAKER2
- De novo annotation from short-read RNA-Seq using Trinity
- Ab initio annotation using short-read & long-read RNA-Seq
- Metrics to evaluate annotations, splice-site filtering, and annotation consensus using Mikado
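As a rough illustration of the first section (read mapping), the sketch below prints the kind of STAR and Minimap2 commands such a step typically involves. It is not the workshop's actual pipeline: `map_reads` and `run` are hypothetical helpers, all file names are placeholders, and the chosen options are assumptions that should be checked against the workshop material. By default it only prints the commands (dry run); set `DRY_RUN=0` to execute them.

```shell
#!/usr/bin/env bash
# Dry run by default: commands are printed, not executed.
DRY_RUN="${DRY_RUN:-1}"

# run: hypothetical helper that prints a command line or executes it.
# Note: eval is used so pipes work; paths containing spaces are not supported.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ $1"
    else
        eval "$1"
    fi
}

# map_reads GENOME READS_1 READS_2 ISOSEQ
# Maps paired short reads with STAR and Iso-Seq reads with Minimap2.
map_reads() {
    local genome="$1" r1="$2" r2="$3" isoseq="$4"
    local threads="${THREADS:-4}"

    # Build the STAR index, then align the paired-end short reads.
    run "mkdir -p star_index"
    run "STAR --runMode genomeGenerate --runThreadN $threads --genomeDir star_index --genomeFastaFiles $genome"
    run "STAR --runThreadN $threads --genomeDir star_index --readFilesIn $r1 $r2 --readFilesCommand zcat --outSAMtype BAM SortedByCoordinate --outSAMstrandField intronMotif --outFileNamePrefix shortreads."
    run "samtools index shortreads.Aligned.sortedByCoord.out.bam"

    # Spliced alignment of Iso-Seq reads, sorted and indexed with Samtools.
    run "minimap2 -ax splice:hq -uf $genome $isoseq | samtools sort -@ $threads -o isoseq.bam -"
    run "samtools index isoseq.bam"
}

# Placeholder file names; replace them with the workshop data.
map_reads genome.fa reads_1.fastq.gz reads_2.fastq.gz isoseq.fastq.gz
```

The resulting BAM files are the kind of evidence that the later prediction and assembly steps consume.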
All data is available at this link, but don't forget to set up your environment!!!