IndexError: list index out of range while running with Mac-Clinvar #43

Closed
NTNguyen13 opened this issue Apr 4, 2020 · 4 comments

Comments

@NTNguyen13

NTNguyen13 commented Apr 4, 2020

Hi, I'm trying out CharGer to predict and annotate variants in my VCF file (annotated by VEP). My command is as follows:


charger \
    -f test_charger_vep.vcf \
    -o test_charger_vep2.tsv \
    -l -D \
    --mac-clinvar-tsv ~/clinvar/output/b37/single/clinvar_alleles.single.b37.vcf.gz

But it resulted in an error:

charger::getClinVar
Traceback (most recent call last):
  File "~/anaconda3/envs/charger/bin/charger", line 743, in <module>
    main( sys.argv[1:] )
  File "~/anaconda3/envs/charger/bin/charger", line 662, in main
    mutationTypes = mutationTypes , \
  File "~/anaconda3/envs/charger/lib/python2.7/site-packages/charger/charger.py", line 821, in getExternalData
    self.getClinVar( **kwargs )
  File "~/anaconda3/envs/charger/lib/python2.7/site-packages/charger/charger.py", line 842, in getClinVar
    clinvarSet = self.getMacClinVarTSV( macClinVarTSV )
  File "~/anaconda3/envs/charger/lib/python2.7/site-packages/charger/charger.py", line 887, in getMacClinVarTSV
    [ description , status ] = self.parseMacPathogenicity( fields[12:17] )
  File "~/anaconda3/envs/charger/lib/python2.7/site-packages/charger/charger.py", line 909, in parseMacPathogenicity
    named = fields[0]
IndexError: list index out of range

After I removed the --mac-clinvar-tsv option, it ran fine.
I used the single file from the latest Mac ClinVar repository.

Could you please help me with this problem?

P.S. I also want to add HotSpot3D to CharGer. How would I use https://github.com/ding-lab/hotspot3d to generate the cluster file for this task? What input file do I need to get the clusters?

Thank you very much

@ccwang002
Member

--mac-clinvar-tsv accepts a TSV file, not a VCF. Please use clinvar_alleles.single.b37.tsv.gz from their repo (hg19).
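
For anyone hitting the same mix-up, here is a minimal sketch of a pre-flight check that tells a VCF from a tab-separated table before passing a file to --mac-clinvar-tsv. It is not part of CharGer; the function name `sniff_variant_file` is hypothetical. It relies only on the fact that a VCF file starts with a `##fileformat=VCF...` line.

```python
import gzip


def sniff_variant_file(path):
    """Guess whether a (possibly gzipped) file is a VCF or a TSV table.

    Hypothetical helper, not part of CharGer. Returns 'vcf', 'tsv',
    or 'unknown' after looking at the first line only.
    """
    # Pick a gzip-aware opener when the file name ends in .gz.
    opener = gzip.open if path.endswith(".gz") else open
    with opener(path, "rt") as handle:
        first_line = handle.readline()
    # The VCF format requires the file to begin with ##fileformat=VCFv...
    if first_line.startswith("##fileformat=VCF"):
        return "vcf"
    # Otherwise treat a tab-delimited first line as a TSV header row.
    if "\t" in first_line:
        return "tsv"
    return "unknown"
```

Calling `sniff_variant_file` on the .vcf.gz file from the original command should return 'vcf', which signals that the .tsv.gz file is the one to use instead.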

As for the HotSpot3D cluster file, please use this file (hg19) as an example; it was built from TCGA pan-cancer mutations (TCGA MC3). I have created a separate issue, #44, to add instructions for generating this HotSpot3D cluster file (or at least point to the right doc).

@NTNguyen13
Author

Hi, I tried again with the tsv.gz file, and now I get this error:

Traceback (most recent call last):
  File "~/anaconda3/envs/charger/bin/charger", line 743, in <module>
    main( sys.argv[1:] )
  File "~/anaconda3/envs/charger/bin/charger", line 662, in main
    mutationTypes = mutationTypes , \
  File "~/anaconda3/envs/charger/lib/python2.7/site-packages/charger/charger.py", line 821, in getExternalData
    self.getClinVar( **kwargs )
  File "~/anaconda3/envs/charger/lib/python2.7/site-packages/charger/charger.py", line 842, in getClinVar
    clinvarSet = self.getMacClinVarTSV( macClinVarTSV )
  File "~/anaconda3/envs/charger/lib/python2.7/site-packages/charger/charger.py", line 887, in getMacClinVarTSV
    [ description , status ] = self.parseMacPathogenicity( fields[12:17] )
  File "~/anaconda3/envs/charger/lib/python2.7/site-packages/charger/charger.py", line 914, in parseMacPathogenicity
    isPathogenic = int( isPathogenic )
ValueError: invalid literal for int() with base 10: 'NM_005101.3:c.62G>A'

I saw a similar issue here: #4. I thought that was fixed in the latest version. Or do I need to switch to an older version of mac-clinvar?

@NTNguyen13
Author

I tried the older version of mac-clinvar. It produces a lot of warnings like these, but charger still runs:

biomine warning: del not found in conversion tables
biomine::variant::mafvariant Warning: could not find amino acid change or intronic change
  Hint: Is the input amino acid change column correct?
    Problem variant:  RAB39B:X:154490187-154490187A>G::NM_171998.3:c.543A>G::NP_741995.1:p.  --  p.Thr181=
biomine::variant::mafvariant Warning: could not find amino acid change or intronic change
  Hint: Is the input amino acid change column correct?
    Problem variant:  RAB39B:X:154490238-154490238C>T::NM_171998.3:c.492C>T::NP_741995.1:p.  --  p.Phe164=
biomine::variant::mafvariant Warning: could not find amino acid change or intronic change
  Hint: Is the input amino acid change column correct?
    Problem variant:  RAB39B:X:154490457-154490457T>C::NM_171998.3:c.273T>C::NP_741995.1:p.  --  p.Ile91=

@ccwang002
Member

Looks like the new version is not compatible due to the change in the column order. Please ignore the warnings for now. We are fixing it in the 0.6 version.
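
To illustrate why a column reorder breaks positional slicing such as fields[12:17], here is a rough sketch of reading the table by header name instead. This is not CharGer's code and not the planned 0.6 fix; the helper names and the column names used in the demo (symbol, hgvs_c, pathogenic) are assumptions for illustration, and the rows are made-up placeholders.

```python
import csv
import io


def read_rows_by_name(handle, required):
    """Yield one dict per TSV row, keyed by the header names.

    Hypothetical helper. Raises ValueError up front if any column
    listed in `required` is absent from the header row.
    """
    reader = csv.DictReader(handle, delimiter="\t", quoting=csv.QUOTE_NONE)
    header = reader.fieldnames or []
    missing = [name for name in required if name not in header]
    if missing:
        raise ValueError("missing columns: " + ", ".join(missing))
    for row in reader:
        yield row


def parse_flag(value):
    """Convert a '0'/'1' text flag to bool, naming the bad value otherwise."""
    if value not in ("0", "1"):
        raise ValueError("expected 0 or 1, got %r" % (value,))
    return value == "1"


def demo():
    """Parse two made-up layouts that differ only in column order."""
    # Assumed column names and placeholder values, for illustration only.
    layout_a = "symbol\tpathogenic\thgvs_c\nGENE1\t0\tNM_000000.0:c.1A>G\n"
    layout_b = "symbol\thgvs_c\tpathogenic\nGENE1\tNM_000000.0:c.1A>G\t0\n"
    results = []
    for text in (layout_a, layout_b):
        for row in read_rows_by_name(io.StringIO(text), ["symbol", "pathogenic"]):
            results.append((row["symbol"], parse_flag(row["pathogenic"])))
    return results
```

With name-based lookup both layouts in `demo` parse to the same result, whereas a fixed slice would hand an HGVS string to int() and fail the way the traceback above shows.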

Please re-open this issue if there is additional follow-up. Thanks!
