Invalid Data #6
Comments
Hi, thank you for your interest! Regarding the empty output: trgt-denovo produces an empty output for a locus if it encounters any problem whatsoever at that locus. That said, the warning you encountered,
Could you share which TRGT version you used? Additionally, if possible, would you be willing to share the VCFs and BAMs you are using? I can also be reached at [email protected].
Hi,
Great! Yes, just the spanning BAM. The easiest option would be to include only the reads spanning the problematic region (using samtools view, or by running TRGT on only one region).
Hi, when I use trgt-denovo 0.2.0, I get an empty output file and the "Skipping invalid data" warning. Best.
Thanks, I just had a look at it, and it seems that the TRGT BAM files have malformed reads. Specifically, the rq fields seem to have an invalid type. There should be an update of TRGT to 1.4.1 soon; I'll let you know when it is released.
TRGT v1.4.1 was just released (https://github.com/PacificBiosciences/trgt/releases/tag/v1.4.1). Would you be able to try this version and see if it fixes the problem?
Thank you for your efforts. The new version fixed the problem.
To create your own repeat definitions, one option you could try is tr-solve (https://github.com/trgt-paper/tr-solve).
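For anyone following along, here is a minimal sketch of how the subsetting step could be scripted. It only builds the `samtools view` command line and does not run it. The helper names (`trid_to_region`, `samtools_view_command`) and the output file name are hypothetical, and the sketch assumes that the locus ID follows the `chrom_start_end_motif` pattern seen in this thread and that the BAM is sorted and indexed. The coordinates are passed through unchanged, so add padding if reads near the locus boundaries matter.

```python
def trid_to_region(trid: str, padding: int = 0) -> str:
    """Turn an ID like 'chr19_2041258_2041346_GGCCCCAACCA' into 'chr19:2041258-2041346'.

    Assumes the ID is laid out as chrom_start_end_motif (an assumption based on
    the IDs shown in this thread). Splitting from the right keeps contig names
    that themselves contain underscores intact.
    """
    chrom, start, end, _motif = trid.rsplit("_", 3)
    lo = max(1, int(start) - padding)
    hi = int(end) + padding
    return f"{chrom}:{lo}-{hi}"


def samtools_view_command(bam: str, trid: str, out_bam: str, padding: int = 0) -> list[str]:
    """Build (but do not run) a samtools command that extracts reads overlapping the locus."""
    region = trid_to_region(trid, padding)
    # -b writes BAM output, -o names the output file; region queries need an index.
    return ["samtools", "view", "-b", "-o", out_bam, bam, region]


if __name__ == "__main__":
    cmd = samtools_view_command(
        "Sample.spanning.sorted.bam",
        "chr19_2041258_2041346_GGCCCCAACCA",
        "Sample.subset.bam",  # hypothetical output name
    )
    print(" ".join(cmd))
```

The resulting list can be passed to `subprocess.run(cmd, check=True)` on a machine where samtools is installed.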
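A rough way to check this on your own files is to tally the type codes of the `rq` tag in the text output of `samtools view`. The sketch below is only an illustration: the function name `rq_tag_types` is hypothetical, the two example records are made up, and which type trgt-denovo accepts is not stated in this thread, so it only reports what is present.

```python
from collections import Counter
from typing import Iterable


def rq_tag_types(sam_lines: Iterable[str]) -> Counter:
    """Count the SAM type codes used for the 'rq' optional tag.

    Expects text-format SAM records (e.g. piped from `samtools view`), where
    optional fields start at column 12 and look like TAG:TYPE:VALUE.
    """
    counts: Counter = Counter()
    for line in sam_lines:
        if not line.strip() or line.startswith("@"):
            continue  # skip blank lines and header lines
        fields = line.rstrip("\n").split("\t")
        for field in fields[11:]:
            parts = field.split(":", 2)
            if len(parts) == 3 and parts[0] == "rq":
                counts[parts[1]] += 1
    return counts


if __name__ == "__main__":
    # Two made-up records that differ only in the rq type code.
    core = "\t".join(["read", "0", "chr19", "100", "60", "4M", "*", "0", "0", "ACGT", "IIII"])
    example = [core + "\trq:f:0.99", core + "\trq:i:1"]
    print(dict(rq_tag_types(example)))
```

If more than one type code shows up, or the code differs between TRGT versions, that would be consistent with the malformed reads described above.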
Hi,
Thank you for this tool. I tried the pipeline you provided on my data, but during de novo detection I got a lot of warnings:
2024-11-24 20:55:02 [WARN] - Skipping invalid data in M
2024-11-24 20:55:02 [WARN] - Skipping invalid data in C
2024-11-24 20:55:02 [WARN] - Skipping invalid data in F
The files in the input path are named as follows:
Sample.spanning.bam
Sample.vcf.gz
Sample.sorted.vcf.gz
Sample.sorted.vcf.gz.csi
Sample.spanning.sorted.bam
Sample.spanning.sorted.bam.bai
However, the de novo output looks very strange:
trid genotype denovo_coverage allele_coverage allele_ratio child_coverage child_ratio mean_diff_father mean_diff_mother father_dropout_prob mother_dropout_prob allele_origin denovo_status per_allele_reads_father per_allele_reads_mother per_allele_reads_child father_dropout mother_dropout child_dropout index father_MC mother_MC child_MC father_AL mother_AL child_AL father_overlap_coverage mother_overlap_coverage
chr19_47047398_47047538_CCATGTCCTACCGAGTATGGCCTGGGCCAATCCCAC 0 0 0 0.0 0 0.0 0.0 0.0 0.0 0.0 . . . . . . . . 0 . . . . . . . .
chr19_2041258_2041346_GGCCCCAACCA 0 0 0 0.0 0 0.0 0.0 0.0 0.0 0.0 . . . . . . . . 0 . . . . . . . .
All values are set to zero throughout the entire output file.
I checked the VCF files of the trio generated by TRGT; an example is shown below:
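To quantify how many loci are affected, a small script along these lines could list the zeroed rows. This is a sketch only: the function name `zeroed_loci` is hypothetical, and it assumes the output file is tab-separated with the header shown above.

```python
import csv
import io


def zeroed_loci(tsv_text: str,
                columns=("denovo_coverage", "allele_coverage", "child_coverage")) -> list[str]:
    """Return the trid of every row whose listed coverage columns are all zero.

    Assumes a tab-separated table whose header contains 'trid' and the
    given column names, as in the output pasted above.
    """
    reader = csv.DictReader(io.StringIO(tsv_text), delimiter="\t")
    return [
        row["trid"]
        for row in reader
        if all(float(row[col]) == 0 for col in columns)
    ]


if __name__ == "__main__":
    import sys

    # Usage: python check_zeroed.py output.tsv  (file name is hypothetical)
    with open(sys.argv[1]) as handle:
        loci = zeroed_loci(handle.read())
    print(f"{len(loci)} loci have zero coverage in all checked columns")
```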
Father:
chr19 47047398 . CAGCCTCGCCCTTCTTTTCCTTCAAATGCCGCCATCTCCTACCGAGTATGGCCTGGGCCAATCCCATCCATGTCCTACCGAGTATGGCCTGGGCCAATCCCACCCACGTCCGTCCCCATTCACGTCCTTTACAAACAGCCC . 0 . TRID=chr19_47047398_47047538_CCATGTCCTACCGAGTATGGCCTGGGCCAATCCCAC;END=47047538;MOTIFS=CCATGTCCTACCGAGTATGGCCTGGGCCAATCCCAC;STRUC=(CCATGTCCTACCGAGTATGGCCTGGGCCAATCCCAC)n GT:AL:ALLR:SD:MC:MS:AP:AM 0/0:140,140:138-142,140-140:20,20:2,2:0(30-102),0(30-102):0.5,0.5:.,.
Mother:
chr19 47047398 . CAGCCTCGCCCTTCTTTTCCTTCAAATGCCGCCATCTCCTACCGAGTATGGCCTGGGCCAATCCCATCCATGTCCTACCGAGTATGGCCTGGGCCAATCCCACCCACGTCCGTCCCCATTCACGTCCTTTACAAACAGCCC . 0 . TRID=chr19_47047398_47047538_CCATGTCCTACCGAGTATGGCCTGGGCCAATCCCAC;END=47047538;MOTIFS=CCATGTCCTACCGAGTATGGCCTGGGCCAATCCCAC;STRUC=(CCATGTCCTACCGAGTATGGCCTGGGCCAATCCCAC)n GT:AL:ALLR:SD:MC:MS:AP:AM 0/0:140,140:138-142,140-140:19,19:2,2:0(30-102),0(30-102):0.5,0.5:.,.
Child:
chr19 47047398 . CAGCCTCGCCCTTCTTTTCCTTCAAATGCCGCCATCTCCTACCGAGTATGGCCTGGGCCAATCCCATCCATGTCCTACCGAGTATGGCCTGGGCCAATCCCACCCACGTCCGTCCCCATTCACGTCCTTTACAAACAGCCC . 0 . TRID=chr19_47047398_47047538_CCATGTCCTACCGAGTATGGCCTGGGCCAATCCCAC;END=47047538;MOTIFS=CCATGTCCTACCGAGTATGGCCTGGGCCAATCCCAC;STRUC=(CCATGTCCTACCGAGTATGGCCTGGGCCAATCCCAC)n GT:AL:ALLR:SD:MC:MS:AP:AM 0/0:140,140:138-141,140-140:14,14:2,2:0(30-102),0(30-102):0.5,0.5:.,.
Best