We provide a fast and free haplogroup classification web service. You can upload your mtDNA profiles aligned to rCRS or RSRS (beta) and receive mitochondrial haplogroups in return. FASTA, VCF and hsd input formats are accepted. So far, HaploGrep and the updated HaploGrep 2 have been cited over 400 times (Google Scholar, June 2018). Please join our HaploGrep Google User Group for future updates and ongoing discussions.
We also provide a command-line version for local usage. Download the latest release (v2.1.15) and run it:
java -jar haplogrep-2.1.15.jar --in <input> --format vcf/fasta/hsd --out haplogroups.txt
HaploGrep requires Java 8 and works on Windows, Linux and Mac.
The recommended input format is VCF or FASTA. For alignment, bwa version 0.7.17 is used.
You can also specify your profiles in hsd format, a simple tab-delimited file format consisting of 4 columns (ID, Range, Haplogroup and Polymorphisms). For readability, the polymorphisms are themselves tab-delimited, so a line usually has more than 4 columns. An hsd example can be found here.
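The four-column hsd layout described above can be sketched in a few lines of Python. This is a minimal illustration, not HaploGrep code: the helper names `to_hsd_line` and `from_hsd_line`, the sample ID, the example polymorphisms and the `?` placeholder for a not-yet-known haplogroup are all assumptions made for the example.

```python
def to_hsd_line(sample_id, ranges, polymorphisms, haplogroup="?"):
    """Build one tab-delimited hsd row.

    Columns: ID, Range, Haplogroup, then one column per polymorphism.
    The "?" default for an unknown haplogroup is an assumption of this sketch.
    """
    return "\t".join([sample_id, ranges, haplogroup, *polymorphisms])


def from_hsd_line(line):
    """Split an hsd row back into its four logical columns."""
    fields = line.rstrip("\n").split("\t")
    return {
        "id": fields[0],
        "range": fields[1],
        "haplogroup": fields[2],
        # Everything after the third column is a polymorphism.
        "polymorphisms": fields[3:],
    }


# Hypothetical sample covering the full 16569 bp mitochondrial genome.
row = to_hsd_line("Sample1", "1-16569", ["73G", "263G", "1438G"])
record = from_hsd_line(row)
```

Because each polymorphism occupies its own column, a row with three polymorphisms has six tab-separated fields rather than four.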
- By default, HaploGrep expects your data to be aligned against rCRS (which is included in the human references hg19 and hg38). If your data is aligned against RSRS, add the --rsrs parameter (Default: off). Please read this blog post carefully before adding this option.
- To change the metric to Hamming Distance or Jaccard, add the --metric parameter (Default: Kulczynski Measure).
- To add additional output columns (e.g. found or remaining polymorphisms), add the --extend-report flag (Default: off).
- The Phylotree version used can be changed with the --phylotree parameter (Default: 17).
- If you are using genotyping arrays, add the --chip parameter to limit the range to array SNPs only (Default: off, VCF only). To get the same behaviour for hsd files, include in the range only the positions that are on the array or in the region you have sequenced (e.g. control region). Range entries are separated by a semicolon (;); both ranges and single positions are allowed (e.g. 1-576; 34).
- To output the complete path from the rCRS root to your input sample, use the --lineage parameter (Default: off). We provide a textual format (*.lineage.txt) and a Graphviz DOT format. You can upload the HaploGrep*.graphviz.txt file here or process it with the Graphviz library.
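To make the range syntax used for hsd files concrete (semicolon-separated entries, each either a range or a single position, e.g. 1-576; 34), here is a small Python sketch that expands such a string into positions. It is not taken from HaploGrep; the function name `parse_range` is hypothetical, and the sketch assumes that ranges are inclusive and that the start is not greater than the end.

```python
def parse_range(range_spec):
    """Expand a range string such as "1-576; 34" into a sorted list of positions.

    Assumptions of this sketch: entries are separated by semicolons, ranges are
    inclusive on both ends, and start <= end within each range.
    """
    positions = set()
    for part in range_spec.split(";"):
        part = part.strip()
        if not part:
            continue  # tolerate a trailing semicolon
        if "-" in part:
            start, end = (int(value) for value in part.split("-", 1))
            positions.update(range(start, end + 1))
        else:
            positions.add(int(part))
    return sorted(positions)


control_region = parse_range("16024-16569; 1-576")
covered = parse_range("1-576; 34")
```

Position 34 already lies inside 1-576, so the second example yields 576 distinct positions; overlapping entries are simply merged.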
Several mtDNA references exist; HaploGrep supports rCRS and RSRS. Please check out our blog post to learn more about this topic.
If you are using HaploGrep for genotyping array data, please have a look at the --chip parameter above.
Heteroplasmies are often stored as heterozygous genotypes (0/1). If an HF field (= Heteroplasmy Frequency of the variant allele; introduced by MToolBox) is specified in the VCF header, we add variants with an HF > 0.96 to the input profile.
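The HF rule above amounts to a simple threshold filter. The following Python sketch illustrates only that rule; the `(variant, hf)` tuple layout, the function name and the example values are hypothetical, and the sketch does not reproduce how HaploGrep actually reads the VCF.

```python
HF_THRESHOLD = 0.96  # threshold stated in this README


def variants_for_profile(heterozygous_calls):
    """Keep heterozygous (0/1) calls whose heteroplasmy frequency exceeds the threshold.

    heterozygous_calls: iterable of (variant, hf) pairs, where hf is the
    frequency of the variant allele (hypothetical layout for this sketch).
    """
    return [variant for variant, hf in heterozygous_calls if hf > HF_THRESHOLD]


# Made-up example values.
calls = [("73G", 0.99), ("152C", 0.40), ("263G", 0.96)]
profile = variants_for_profile(calls)
```

The comparison is strict, so a call with an HF of exactly 0.96 is not added to the profile.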
Please have a look at mtDNA-Server to check for heteroplasmies and contamination in your NGS data.
Check out our blog regarding mtDNA topics.
If you use HaploGrep, please cite our latest HaploGrep2 paper in combination with Phylotree 17. The first HaploGrep paper can be found here.
Sebastian Schoenherr (@seppinho) and Hansi Weissensteiner (@haansi); Division of Genetic Epidemiology, Medical University of Innsbruck;