Annotates variants in a VCF file to study their clinical significance

Sashoss/VariantAnnotator

Installation

Download and extract the zip folder.

Then cd into the extracted directory and run the commands below.

For a Linux shell:

python setup.py sdist bdist_wheel
pip install dist/*whl

For the Windows command line:

python setup.py sdist bdist_wheel
for %i in (dist\*.whl) do pip install %i

Input

The tool takes a VCF file as input.
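For readers unfamiliar with the format, the sketch below shows how the first five of the eight fixed, tab-separated columns of a VCF data line (CHROM, POS, ID, REF, ALT) can be read in plain Python. It is an illustration only and is not the package's own `Parser`: the `VcfRecord` type and `parse_vcf_lines` helper are hypothetical names, and the two sample records are made-up data.

```python
from typing import Iterable, Iterator, NamedTuple, Tuple


class VcfRecord(NamedTuple):
    """Columns that identify a variant (hypothetical type, for illustration)."""
    chrom: str
    pos: int
    id: str
    ref: str
    alts: Tuple[str, ...]


def parse_vcf_lines(lines: Iterable[str]) -> Iterator[VcfRecord]:
    """Yield one record per data line, skipping blank lines and '#' header lines."""
    for line in lines:
        line = line.rstrip("\n")
        if not line or line.startswith("#"):
            continue
        fields = line.split("\t")
        if len(fields) < 5:
            raise ValueError(f"Expected at least 5 tab-separated columns: {line!r}")
        chrom, pos, variant_id, ref, alt = fields[:5]
        # ALT may list several comma-separated alternate alleles.
        yield VcfRecord(chrom, int(pos), variant_id, ref, tuple(alt.split(",")))


# Made-up example content, not real variants.
example = [
    "##fileformat=VCFv4.2",
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO",
    "1\t12345\trs0000001\tA\tG\t50\tPASS\t.",
    "2\t67890\t.\tC\tT,G\t.\t.\t.",
]
records = list(parse_vcf_lines(example))
for record in records:
    print(record)
```

A file that does not follow this layout (for example, space-separated columns) would raise a `ValueError` in this sketch, which may be a useful check to run before annotating.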

Usage

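from VariantAnnotator import *

# Parse the input VCF file into a collection of variants
ParseObj = Parser("yourVCFfile.txt")
variants = ParseObj.parseInput()

# Annotate the variants and write the results to an Excel file
ANNOTATE(variants, output="myOutputFile.xlsx")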

Output

It pulls the following annotations for each variant in the VCF file.

Base Annotations

  1. Gene name
  2. Variant biotype
  3. Regulatory significance: Checks whether the variant lies within the regulatory regions of the gene
  4. Regulatory feature id: Ensembl regulatory feature id
  5. Amino acids: Reports associated amino acid mutation
  6. Codons: Reports associated codon mutation

Database Ids

  1. rsid: Variant dbSNP database id
  2. Uniprot id: Variant UniProt database id
  3. Cosmic id: Variant COSMIC database id
  4. Clinvar id: Variant ClinVar database id
  5. PharmGKB ID: Variant PharmGKB database id

Amino acid mutation scores

  1. SIFT: SIFT score for the amino acid mutation
  2. Polyphen: PolyPhen score for the amino acid mutation
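Both scores range from 0 to 1, but they point in opposite directions: a lower SIFT score suggests the substitution is more likely to affect protein function, while a higher PolyPhen score suggests it is more likely to be damaging. The sketch below turns the raw numbers into labels. The cutoffs are commonly cited conventions and are an assumption here, not something this package defines; the function names are hypothetical.

```python
from typing import Optional

# Assumed cutoffs (commonly cited conventions, not taken from this package).
SIFT_DELETERIOUS_BELOW = 0.05
POLYPHEN_PROBABLY_DAMAGING_ABOVE = 0.908
POLYPHEN_POSSIBLY_DAMAGING_ABOVE = 0.446


def classify_sift(score: Optional[float]) -> str:
    """Label a SIFT score; lower scores are more likely deleterious."""
    if score is None:
        return "unknown"
    return "deleterious" if score < SIFT_DELETERIOUS_BELOW else "tolerated"


def classify_polyphen(score: Optional[float]) -> str:
    """Label a PolyPhen score; higher scores are more likely damaging."""
    if score is None:
        return "unknown"
    if score > POLYPHEN_PROBABLY_DAMAGING_ABOVE:
        return "probably damaging"
    if score > POLYPHEN_POSSIBLY_DAMAGING_ABOVE:
        return "possibly damaging"
    return "benign"


# Made-up scores for illustration.
print(classify_sift(0.01), "/", classify_polyphen(0.95))
print(classify_sift(0.30), "/", classify_polyphen(0.10))
```

Keeping the cutoffs as module-level constants makes them easy to adjust if a different convention is preferred.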

Mutation frequency

  1. American: Frequency of the variant in the American population
  2. South Asian: Frequency of the variant in the South Asian population
  3. East Asian: Frequency of the variant in the East Asian population
  4. African: Frequency of the variant in the African population
  5. European: Frequency of the variant in the European population
  6. Pubmed: PubMed id for the corresponding variant
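One common use of the population frequencies is to flag variants that are rare in every population. The sketch below assumes each annotated variant is available as a dictionary keyed by the population labels listed above, with `None` where no frequency was reported. That representation, the helper names, and the 1% threshold are all assumptions for illustration, and the frequencies are made-up data.

```python
from typing import Dict, Optional

POPULATIONS = ("American", "South Asian", "East Asian", "African", "European")


def max_population_frequency(freqs: Dict[str, Optional[float]]) -> Optional[float]:
    """Return the highest reported frequency, or None if none was reported."""
    known = [freqs[p] for p in POPULATIONS if freqs.get(p) is not None]
    return max(known) if known else None


def is_rare(freqs: Dict[str, Optional[float]], threshold: float = 0.01) -> bool:
    """True when every reported population frequency is below the threshold."""
    highest = max_population_frequency(freqs)
    return highest is not None and highest < threshold


# Made-up frequencies for illustration.
variant_a = {"American": 0.002, "South Asian": None, "East Asian": 0.0,
             "African": 0.004, "European": 0.001}
variant_b = {"American": 0.12, "South Asian": 0.08, "East Asian": 0.15,
             "African": 0.30, "European": 0.10}

print(is_rare(variant_a))
print(is_rare(variant_b))
```

In this sketch a variant with no reported frequencies is not treated as rare, since the absence of data is different from a low frequency.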

An example output Excel sheet is shown in the image below.

plot

Things to do

  1. Error handling
  2. Add protein structure-function annotations
