Do sequence heads have to match? #3
Comments
Dear Viktoria, yes, you are right: the sequence names need to be standardized. Please let me know if this works for you.
Right, my problem is just that one of the assemblies isn't chromosome-scale. I know my first post mentioned just 2 genomes, but that was to keep the question simple. I actually have about 10 remanei/latens/briggsae species. Most of these are chromosome-scale, but one is pretty fragmented. I turned it around: I annotated TEs in the reference and got a VCF from the reference-query alignment, so all the chromosome names match automatically because the reference was used for both. Previously I was annotating TEs in the query, which, if it isn't chromosome-scale, won't line up with the VCF. Anyway, I'm not sure whether one way is particularly right, but I got a similar pattern using either method. The only difference was that more events were found when using TEs annotated from the reference. But like I said, the pattern is the same and similar to your paper. Thanks, Kevin
Great to hear :-)
I'm comparing two different nematode genomes that are a little fragmented (the main 6 chromosomes plus a few extra contigs), and each has different sequence names and a different number of contigs. Do the sequence names in all the files have to match?
Traceback (most recent call last):
  File "/home/veggers/.conda/envs/transposition_detector_detect/share/TranspositionEventDetector_deTEct/TranspositionDetector.py", line 81, in <module>
    parseSniffles_SVs(seqHeadFile, svFile, ouFile1)
  File "/home/veggers/.conda/envs/transposition_detector_detect/share/TranspositionEventDetector_deTEct/ParserSniffles.py", line 46, in parseSniffles_SVs
    fW.write(sequenceDictB[chrom]+"\t"+"SVIM"+"\t"+"insertion"+"\t"+start+"\t"+end+"\t"+"."+"\t"+"+"+"\t"+"."+"\t"+info)
             ~~~~~~~~~~~~~^^^^^^^
KeyError: 'CM021144.1_356'
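The KeyError means the SV file contains a sequence name that the lookup dictionary does not have. One way to spot such names before running the tool might be to compare the names in a FASTA file with the CHROM column of the VCF. This is only a rough sketch: the function names are made up for illustration, and it assumes the names should match the first word of each FASTA header, which may not be exactly how deTEct builds its dictionary.

```python
def fasta_names(lines):
    """Return the set of sequence names (first word after '>') in FASTA lines."""
    names = set()
    for line in lines:
        if line.startswith(">"):
            fields = line[1:].split()
            if fields:
                names.add(fields[0])
    return names


def vcf_chroms(lines):
    """Return the set of CHROM values (first tab-separated column) in VCF lines."""
    chroms = set()
    for line in lines:
        # Skip meta lines, the header line, and blank lines.
        if line.startswith("#") or not line.strip():
            continue
        chroms.add(line.split("\t", 1)[0].strip())
    return chroms


def unmatched(fasta_lines, vcf_lines):
    """Return the sorted VCF sequence names that have no FASTA header of the same name."""
    return sorted(vcf_chroms(vcf_lines) - fasta_names(fasta_lines))
```

Both functions accept any iterable of lines, so open file handles can be passed directly, for example `unmatched(open(fasta_path), open(vcf_path))` with your own paths. An empty result would suggest the names are already consistent between the two files.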
I guess I could just extract the main chromosomes from all the files and standardize the names if this is the case.
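For what it's worth, extracting the main chromosomes and renaming them could be sketched roughly as below. The function name `standardize_fasta` and the mapping are hypothetical; the mapping from each assembly's original names to the shared names has to be written by hand, and any sequence not listed in it is dropped.

```python
def standardize_fasta(lines, name_map):
    """Yield FASTA lines for the sequences listed in name_map, renamed.

    name_map maps an original sequence name (first word of the header)
    to the standardized name. Sequences not in name_map are dropped.
    """
    keep = False
    for line in lines:
        if line.startswith(">"):
            fields = line[1:].split()
            old = fields[0] if fields else ""
            keep = old in name_map
            if keep:
                yield ">" + name_map[old] + "\n"
        elif keep:
            # Pass sequence lines through, making sure each ends with a newline.
            yield line if line.endswith("\n") else line + "\n"
```

The generator can be written straight to a new file, for example `out.writelines(standardize_fasta(in_handle, name_map))`. The same renaming would presumably also have to be applied to the TE annotation and the VCF, or those files regenerated from the renamed assemblies, so that every file uses the same names.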