Releases: oushujun/EDTA

v1.8.2

28 Feb 22:44

Update usage and installation instructions; fix a couple of minor bugs.

v1.7.8

01 Feb 20:44
update

Fix small bugs for the conda release

29 Jan 22:48
  1. Added the --convert_seq_name option to EDTA_raw.pl so that it can work with EDTA.pl or independently.
  2. Fixed a small bug in the input genome check.

Get ready for the conda version

28 Jan 21:07

This version has a couple of updates that can potentially bring EDTA into the conda world. All changes are at the code level; results should be unchanged.

  1. Both LTR_retriever and LTR_FINDER were replaced by their respective conda installations, thanks to @Juke34's contribution. The LTR_FINDER_parallel wrapper was updated to v1.1, which works with the conda-installed LTR_FINDER.
  2. EDTA_raw.pl was updated to utilize conda dependencies and is able to convert overly long genome sequence names.

Resolving conda conflicts

22 Jan 17:00

A couple of useful updates

  1. Change to use the ENV default perl instead of using /usr/bin/perl (#47)
  2. Replace the precompiled GenomeTools and GenericRepeatFinder binaries with conda recipes (contributed by @Juke34).
  3. Check sequence names for the input genome. Remove annotations (everything after the first space in the seq ID line) and shorten seq IDs to <= 15 characters. The original genome file is untouched; the modified file is named $genome.mod and is used in the EDTA analysis (#35, #40, #44).
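The sequence-name cleanup in item 3 can be illustrated with a minimal Python sketch. This is not EDTA's own code (EDTA is written in Perl); the function name `simplify_seq_ids` is hypothetical, and shortening by plain truncation with a collision check is an assumption, since the release note does not say how IDs are shortened.

```python
def simplify_seq_ids(fasta_lines, max_len=15):
    """Return FASTA lines with simplified headers.

    Assumption: IDs are shortened by truncation; the release note only
    states that IDs end up <= 15 characters, not how.
    """
    seen = set()
    out = []
    for line in fasta_lines:
        line = line.rstrip("\n")
        if line.startswith(">"):
            # Drop annotations: keep only the text before the first whitespace.
            fields = line[1:].split()
            seq_id = fields[0] if fields else ""
            short_id = seq_id[:max_len]
            # Truncation can make two IDs identical; refuse rather than guess.
            if short_id in seen:
                raise ValueError(f"duplicate sequence ID after shortening: {short_id}")
            seen.add(short_id)
            out.append(">" + short_id)
        else:
            # Sequence lines pass through unchanged.
            out.append(line)
    return out
```

Writing the result to a separate file, rather than overwriting the input, mirrors the note's point that the original genome file stays untouched.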

Some important updates

18 Jan 17:26
c76ff09

This release has a number of important updates:

  1. Further clean the input CDS file based on repetitiveness. If a sequence in the CDS file occurs >= 10 times in the raw TE library, it is likely a repeat and is removed from the CDS file.
  2. Purge gene-contained sequences in intact TE elements.
  3. Use the conda-based Python3 TEsorter to replace the Python2 TEsorter. Thanks to @Juke34's work! (#39)
  4. Use Getopt to read program parameters. Option names now use the long format; e.g., -genome is now --genome. Contributed by @Juke34 (#46)
  5. The docker/singularity version of EDTA has been tested and is available!
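The CDS cleaning rule in item 1 can be sketched in Python as follows. This is an illustration only: the release note does not say how occurrences in the raw TE library are detected, so the sketch assumes a precomputed list of matches (one CDS ID per match). The function name `filter_repetitive_cds` and its parameters are hypothetical.

```python
from collections import Counter

def filter_repetitive_cds(cds_ids, hit_ids, min_hits=10):
    """Split CDS IDs into (kept, removed) by match count.

    cds_ids: IDs of all sequences in the CDS file.
    hit_ids: one CDS ID per match against the raw TE library
             (assumed to be computed elsewhere, e.g. by a sequence search).
    A CDS with >= min_hits matches is treated as a likely repeat.
    """
    cds_ids = list(cds_ids)
    counts = Counter(hit_ids)
    kept = [i for i in cds_ids if counts[i] < min_hits]
    removed = [i for i in cds_ids if counts[i] >= min_hits]
    return kept, removed
```

For example, a CDS with exactly 10 matches would be removed, while one with 9 matches or none would be kept.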

Other updates

  1. Add the GPLv3 license (#39).
  2. Add a patch script /util/patch.pl to allow a quick update of previous results from EDTA 1.6.x to EDTA 1.7.x.
  3. Replace some of the cp commands in TIR-Learner with rsync to avoid errors. (#45)
  4. Add a dependency check --check_dependencies by @Juke34 (#46)

Official release of EDTA

27 Dec 07:53

After a couple of months of public testing and dozens of resolved issues, EDTA has matured to the point where it merits an official release. Thank you all for your bug reports and feature requests.

Summary of functions and features

  1. Identify high-quality structurally intact TEs including LTR retrotransposons, TIR transposons, and Helitrons.
  2. Reduce false classifications and nested insertions between intact TEs to create a homogenized TE library.
  3. Accept user input TE library to identify novel TEs.
  4. Accept user input CDS to remove gene sequences in the TE library.
  5. Exclude user-specified genomic regions (i.e., gene regions) from TE masking.
  6. Perform whole-genome TE annotation and produce a gff file with both structurally intact and homology-based TE annotations.
  7. Produce self-evaluation results for users to check annotation consistency.
  8. Automatic checkpointing, so that EDTA can automatically start from where it was interrupted.
  9. Multithreading-enabled. Analyzing a maize genome (2.3 Gb, >85% TE) takes less than a week (-threads 36).
  10. Include a companion benchmarking pipeline for developers and researchers to test the annotation quality of custom TE libraries.

Citation

Please cite our paper if you find EDTA useful:

Ou S., Su W., Liao Y., Chougule K., Ware D., Peterson T., Jiang N.✉, Hirsch C. N.✉ and Hufford M. B.✉ (2019). Benchmarking Transposable Element Annotation Methods for Creation of a Streamlined, Comprehensive Pipeline. Genome Biol. 20(1): 275.