ERROR: LoadError: GeneticVariation.VCF.Reader #103

Open
ionicbond2005 opened this issue May 6, 2021 · 2 comments

Comments

@ionicbond2005

Dear Sir/ Madam:

I was trying to run viva to visualize the read depth of a set of variants that have a read depth over 1000, to see whether those variants are clustered. However, when I ran the program with the command line

viva -f <<some.vcf>> -m read_depth -o <<some.directory>> -t Read-depth-heat-map-for-bump --save_remotely

I got the error message below:

[ Info: This will take a few moments...
ERROR: LoadError: GeneticVariation.VCF.Reader file format error on line 37
Stacktrace:
[1] error(::String, ::Int64) at ./error.jl:42
[2] _readheader!(::GeneticVariation.VCF.Reader, ::BioCore.Ragel.State{BufferedStreams.BufferedInputStream{IOStream}}) at /sysapps/cluster/software/Julia/1.5.3-linux-x86_64/local/share/julia/packages/BioCore/YBJvb/src/ReaderHelper.jl:106
[3] readheader!(::GeneticVariation.VCF.Reader) at /sysapps/cluster/software/Julia/1.5.3-linux-x86_64/local/share/julia/packages/BioCore/YBJvb/src/ReaderHelper.jl:80
[4] Reader at /sysapps/cluster/software/Julia/1.5.3-linux-x86_64/local/share/julia/packages/GeneticVariation/r8DAL/src/vcf/reader.jl:15 [inlined]
[5] GeneticVariation.VCF.Reader(::IOStream) at /sysapps/cluster/software/Julia/1.5.3-linux-x86_64/local/share/julia/packages/GeneticVariation/r8DAL/src/vcf/reader.jl:28
[6] top-level scope at /sysapps/cluster/software/Julia/1.5.3-linux-x86_64/local/share/julia/packages/VariantVisualization/1yoNl/viva:130
[7] include(::Function, ::Module, ::String) at ./Base.jl:380
[8] include(::Module, ::String) at ./Base.jl:368
[9] exec_options(::Base.JLOptions) at ./client.jl:296
[10] _start() at ./client.jl:506
in expression starting at /sysapps/cluster/software/Julia/1.5.3-linux-x86_64/local/share/julia/packages/VariantVisualization/1yoNl/viva:130

Welcome to VIVA.

Loading dependency packages:

...

Finished loading packages!

Reading /hpcdata/pid/list/MS_Mike_Lenardo/2019-05-30_Novogene/VariantCallingGATK/test-selected.vcf ...

===============================================================

Since it reports a file format error on line 37, I pulled out lines 36-37 of the VCF:

head -37 /hpcdata/pid/list/MS_Mike_Lenardo/2019-05-30_Novogene/VariantCallingGATK/test-selected.vcf |tail -2

##GATKCommandLine=<ID=VariantFiltration,CommandLine="VariantFiltration --output /hpcdata/pid/list/MS_Mike_Lenardo/2019-05-30_Novogene/VariantCallingGATK/test-variant-filtered.vcf.gz --filter-expression DP < 5000 --filter-expression DP > 20000 --filter-name Not_highest_DP --filter-name Not_lowest_DP --variant /hpcdata/pid/list/MS_Mike_Lenardo/2019-05-30_Novogene/VariantCallingGATK/Novogene/Novogene.vqsr.seqr.vcf.gz --reference /hpcdata/pid/andrew/data/human_g1k_v37.fasta --cluster-size 3 --cluster-window-size 0 --mask-extension 0 --mask-name Mask --filter-not-in-mask false --missing-values-evaluate-as-failing false --invalidate-previous-filters false --invert-filter-expression false --invert-genotype-filter-expression false --set-filtered-genotype-to-no-call false --apply-allele-specific-filters false --interval-set-rule UNION --interval-padding 0 --interval-exclusion-padding 0 --interval-merging-rule ALL --read-validation-stringency SILENT --seconds-between-progress-updates 10.0 --disable-sequence-dictionary-validation false --create-output-bam-index true --create-output-bam-md5 false --create-output-variant-index true --create-output-variant-md5 false --lenient false --add-output-sam-program-record true --add-output-vcf-command-line true --cloud-prefetch-buffer 40 --cloud-index-prefetch-buffer -1 --disable-bam-index-caching false --sites-only-vcf-output false --help false --version false --showHidden false --verbosity INFO --QUIET false --use-jdk-deflater false --use-jdk-inflater false --gcs-max-retries 20 --gcs-project-for-requester-pays --disable-tool-default-read-filters false",Version="4.2.0.0",Date="April 29, 2021 4:15:29 PM EDT">
##GVCFBlock0-1=minGQ=0(inclusive),maxGQ=1(exclusive)
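A header line can look fine on screen and still contain bytes that a strict parser rejects, such as a carriage return or a non-ASCII character. A minimal sketch for checking one line, assuming a POSIX shell with `sed` and `grep`; the function name `line_has_hidden_bytes` is hypothetical and not part of viva:

```shell
# Hypothetical helper: succeed (exit 0) if line $2 of file $1 contains any
# byte that is neither printable ASCII nor a space/tab.
line_has_hidden_bytes() {
    file=$1
    lineno=$2
    # LC_ALL=C makes the character classes byte-based, so non-ASCII bytes
    # and control characters such as \r count as "hidden".
    sed -n "${lineno}p" "$file" | LC_ALL=C grep -q '[^[:print:][:blank:]]'
}

# Example call with a placeholder path:
# line_has_hidden_bytes some.vcf 37 && echo "line 37 has hidden bytes"
```

If the check finds nothing, the problem is more likely in the visible content of the line than in its encoding.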

===========================================
Any idea on why the error occurs?

Thank you very much for your help

Samuel Li

@gtollefson
Collaborator

gtollefson commented May 6, 2021

@ionicbond2005 Hi Samuel,
Thanks for bringing this issue up. It looks like there is unexpected formatting (probably an unexpected special symbol) on that line. The fastest way to get around this is to remove the line and produce a new "cleaned" file to visualize with a command like the GNU sed command below:

sed '37d' /hpcdata/pid/list/MS_Mike_Lenardo/2019-05-30_Novogene/VariantCallingGATK/test-selected.vcf > /hpcdata/pid/list/MS_Mike_Lenardo/2019-05-30_Novogene/VariantCallingGATK/test-selected_cleaned.vcf

This looks like an issue with the GeneticVariation.jl package, which VIVA depends on to read the VCF file. If you still cannot run it on the cleaned file with the offending line removed, I would open an issue with them so it can be corrected.
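If the reader rejects every `##GVCFBlock` header and not only line 37 (an assumption; GATK gVCF headers can contain several such lines), deleting by line number would have to be repeated for each one. A minimal sketch that drops them all by pattern instead; the function name and the file names in the example call are hypothetical:

```shell
# Hypothetical helper: copy a VCF to a new file, dropping every header line
# that starts with "##GVCFBlock". The input file is left untouched.
strip_gvcfblock_headers() {
    in_vcf=$1
    out_vcf=$2
    sed '/^##GVCFBlock/d' "$in_vcf" > "$out_vcf"
}

# Example call with placeholder paths:
# strip_gvcfblock_headers some.vcf some_cleaned.vcf
```

The pattern only matches meta-information lines beginning with `##GVCFBlock`, so variant records are copied through unchanged.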

Let me know how it goes and if you have any more issues!

George

@ionicbond2005
Author

Dear George:

After removing the ##GVCFBlock line, I reran the program and another problem occurred:

LoadError: BoundsError: attempt to access 1-element Array{SubString{String},1} at index [2]

Any idea on how to fix this problem?
Thank you very much

Samuel Li
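The BoundsError is consistent with a string split that produced one piece where at least two were expected, but the thread does not show which field was being split. One thing worth ruling out (an assumption, not a confirmed cause) is a record that is not tab-separated or whose column count differs from the `#CHROM` header line. A rough sketch, assuming `awk` is available; the function name `check_vcf_columns` is hypothetical:

```shell
# Hypothetical helper: report records whose tab-separated column count
# differs from the #CHROM header line. Exits non-zero if any are found.
# If the file has no #CHROM line, nothing is checked.
check_vcf_columns() {
    awk -F '\t' '
        /^##/     { next }                      # skip meta-information lines
        /^#CHROM/ { expected = NF; next }       # remember the header width
        expected && NF != expected {
            printf "line %d: %d columns, expected %d\n", NR, NF, expected
            bad = 1
        }
        END { exit bad + 0 }
    ' "$1"
}

# Example call with a placeholder path:
# check_vcf_columns some_cleaned.vcf
```

A clean result here does not rule out other causes, such as a FORMAT or INFO field in a shape the reader does not expect.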
