Repetitive regions in mitochondrial assembly #138
Unanswered
o-william-white asked this question in Q&A
Replies: 1 comment 5 replies
-
Yeah, it seems to be a problem. Could you please provide the ... We masked a group of repetitive sequences in our default database, mainly to speed things up, not for accuracy purposes.
-
Hello,
I am using GetOrganelle to assemble the mitochondrial genome of Rhagonycha fulva across multiple samples. Note that the sequence data come from historical samples, so we do not expect to recover full organelle genomes given the sample quality. Retrieving barcode sequences is acceptable for this data.
For most samples, I am able to recover mitochondrial contigs and annotations for barcode markers, notably cox1.
However, a couple of samples resulted in a very short circular sequence. Attached are the log output, the selected_graph and the extend-gene graph for one of these samples.
This is what the short circular sequence looks like in the selected graph
However, this is what the extend-gene graph looks like. Note that I am able to annotate sequences from Rhagonycha fulva in these contigs.
Is GetOrganelle selecting the contigs for the selected_path based on coverage? I.e. is the assumption that organelle contigs will have higher coverage than, for example, nuclear contigs?
I think this result can be explained by the fact that I am using a reference assembly of Rhagonycha fulva as a seed sequence, and within this assembly there is a large repetitive region. See dot plot below.
I expect that sequence reads from this repetitive region are assembling together, resulting in contigs with very high coverage compared to other mitochondrial regions.
I thought it might be useful to mask repetitive sequences in the reference sequence (for example using TRF) prior to running GetOrganelle. Does this sound like a sensible idea?
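To make the masking idea concrete, here is a minimal sketch, assuming the repeat coordinates are already known (for example from a repeat finder run). The function name `mask_intervals`, the toy sequence and the coordinates are hypothetical and only for illustration. This is not part of GetOrganelle, and whether N-masked seed sequences behave well in GetOrganelle has not been checked here.

```python
def mask_intervals(seq, intervals, mask_char="N"):
    """Return seq with each 1-based, inclusive (start, end) interval
    replaced by mask_char. Hypothetical helper, not a GetOrganelle function."""
    chars = list(seq)
    for start, end in intervals:
        if start < 1 or end > len(chars) or start > end:
            raise ValueError(f"interval out of range: {(start, end)}")
        # Convert 1-based inclusive coordinates to 0-based Python indices.
        for i in range(start - 1, end):
            chars[i] = mask_char
    return "".join(chars)


# Toy seed sequence with a made-up tandem repeat (ATATATAT) at positions 5-12.
toy_seq = "ACGTATATATATGCGT"
masked = mask_intervals(toy_seq, [(5, 12)])
print(masked)
```

Hard-masking with N keeps the sequence length and coordinates unchanged, so annotations on the reference stay valid. Cutting the repeat out entirely would be an alternative, at the cost of shifting coordinates.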
Thanks for sharing the software!
Best wishes
Ollie
extended_K55.assembly_graph.fastg.extend-gene.fastg.txt
get_org.log.txt
animal_mt.K55.complete.graph1.selected_graph.gfa.txt