Official release of EDTA
After a couple of months of public testing and dozens of resolved issues, EDTA has matured to the point where it merits an official release. Thank you all for your bug reports and feature requests.
Summary of functions and features
- Identify high-quality structurally intact TEs including LTR retrotransposons, TIR transposons, and Helitrons.
- Reduce false classifications and nested insertions between intact TEs to create a homogenized TE library.
- Accept user input TE library to identify novel TEs.
- Accept user input CDS to remove gene sequences in the TE library.
- Exclude user-specified genomic regions (e.g., gene regions) from TE masking.
- Perform whole-genome TE annotation and produce a gff file with both structurally intact and homology-based TE annotations.
- Produce self-evaluation results for users to check annotation consistency.
- Checkpoint automatically, so that an interrupted run resumes from where it stopped.
- Support multithreading. Analyzing a maize genome (2.3 Gb, >85% TE) takes less than a week with 36 threads (-threads 36).
- Include a companion benchmarking pipeline for developers and researchers to test the annotation quality of custom TE libraries.
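The whole-genome annotation is delivered as a gff file, so a quick way to inspect it is to total the annotated bases per feature type. The following is a minimal sketch, assuming the file follows the standard nine-column, tab-separated GFF layout; the function name `summarize_gff` and the example records (sequence names, feature types, coordinates) are hypothetical and not taken from EDTA output.

```python
from collections import defaultdict
import io


def summarize_gff(handle):
    """Return {feature_type: total_bp} for a nine-column GFF stream."""
    totals = defaultdict(int)
    for line in handle:
        # Skip blank lines, directives, and comments.
        if not line.strip() or line.startswith("#"):
            continue
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 9:
            continue  # not a well-formed feature line
        feature_type = fields[2]
        start, end = int(fields[3]), int(fields[4])
        # GFF coordinates are 1-based and inclusive.
        totals[feature_type] += end - start + 1
    return dict(totals)


# Made-up records for illustration only.
example = io.StringIO(
    "##gff-version 3\n"
    "chr1\tEDTA\tLTR_retrotransposon\t101\t600\t.\t+\t.\tID=te1\n"
    "chr1\tEDTA\thelitron\t1001\t1200\t.\t-\t.\tID=te2\n"
    "chr1\tEDTA\tLTR_retrotransposon\t2000\t2999\t.\t+\t.\tID=te3\n"
)
summary = summarize_gff(example)
for feature_type, total_bp in sorted(summary.items()):
    print(f"{feature_type}\t{total_bp}")
```

Overlapping or nested features are counted once per record here, so the totals can exceed the number of distinct bases covered. To run it on a real annotation, pass an open file handle instead of the in-memory example.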
Citation
Please cite our paper if you find EDTA useful:
Ou S., Su W., Liao Y., Chougule K., Ware D., Peterson T., Jiang N.✉, Hirsch C. N.✉ and Hufford M. B.✉ (2019). Benchmarking Transposable Element Annotation Methods for Creation of a Streamlined, Comprehensive Pipeline. Genome Biol. 20(1): 275.