Does it work with non-model species? #3
Comments
Unfortunately, the code was designed specifically for diploid genomes. It considers whether a site is homozygous or heterozygous, though it can also handle missing sites. If you fed in only sites with 2 alleles whose frequencies are roughly equal (as a hack), it may provide some results, but I cannot guarantee that the results will make sense. This does have me wondering whether we could create a model that handles genomes of generic ploidy without sacrificing statistical power.
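For anyone who wants to try the hack described above, here is a minimal sketch of the pre-filtering step in Python. It is not part of the tool. It assumes the input VCF carries an `AF` entry in the INFO column, and the 0.4 to 0.6 window for "roughly equal" is an arbitrary illustrative choice; the function names are hypothetical.

```python
def is_balanced_biallelic(vcf_line, low=0.4, high=0.6):
    """Return True for a biallelic record whose ALT allele frequency is near 0.5.

    Assumes a tab-separated VCF data line with an AF=<value> entry in INFO.
    The low/high thresholds are illustrative, not values from the tool.
    """
    fields = vcf_line.rstrip("\n").split("\t")
    if len(fields) < 8:
        return False
    alt, info = fields[4], fields[7]
    # A comma in ALT means more than one alternate allele (multi-allelic site).
    if "," in alt or alt in (".", ""):
        return False
    for entry in info.split(";"):
        if entry.startswith("AF="):
            try:
                af = float(entry[3:])
            except ValueError:
                return False
            return low <= af <= high
    # No AF annotation: the frequency cannot be judged, so drop the record.
    return False


def filter_vcf_lines(lines, low=0.4, high=0.6):
    """Yield header lines unchanged plus the data lines that pass the filter."""
    for line in lines:
        if line.startswith("#") or is_balanced_biallelic(line, low, high):
            yield line
```

Usage would be along the lines of reading the VCF line by line and writing the output of `filter_vcf_lines` to a new file. Whether the filtered sites give meaningful relatedness results for a non-diploid genome remains untested, as noted above.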
Thanks for the reply. Although it is a far-fetched idea, it would be really cool to have this option in the tool, along with handling of multi-allelic sites. To my knowledge, no good tools are available to detect sample swaps in non-model organisms.
I would be interested to know whether the tool gives back any meaningful results in your case if you run it (with the hack). If I were to guess, given enough sites with high enough variability, the worst case is that it will say everything is unrelated, so I don't think it would hurt.
Hi, I have a few questions. First, thanks for fixing the parsing bug. It works now. So, in the following command: Then, in this command: Lastly, can I use a list of raw fastq files instead of writing them one by one in the code below? If yes, what should be the format of the list file? Thank you.
Edit*: Actually, the VCF that is used here doesn't need to be a multisample VCF. It just needs the biallelic variants.
Edit*: The multi VCF file here must be a multisample VCF with reliable genotyping results from a reliable set of samples, so that it captures the population structure. It can be, but does not have to be, the same as above. Also, ideally the multisample VCF should not contain any of the samples used in the downstream sample swap detection process. The
At the moment I don't have support for a file list. However, Unix glob (i.e. wildcards
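Until a file list is supported, one workaround is to expand the wildcards (or a plain list file) into paths yourself and pass them on the command line. The sketch below is only an illustration in Python. The helper name `collect_fastq` and the one-path-per-line list format are assumptions, not something the tool defines, and the tool's own command line is deliberately left out.

```python
import glob


def collect_fastq(patterns=(), list_file=None):
    """Gather fastq paths from glob patterns and/or a one-path-per-line file.

    Hypothetical helper: the list-file format (one path per line, blank lines
    and lines starting with '#' ignored) is an assumption for illustration.
    """
    paths = []
    for pattern in patterns:
        # Sort each expansion so the order is reproducible across runs.
        paths.extend(sorted(glob.glob(pattern)))
    if list_file is not None:
        with open(list_file) as handle:
            for line in handle:
                line = line.strip()
                if line and not line.startswith("#"):
                    paths.append(line)
    # Remove duplicates while keeping first-seen order.
    return list(dict.fromkeys(paths))
```

The returned list can then be appended to whatever argument list you build for the tool, for example with `subprocess.run`.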
Hello,
Thanks for developing the tool. Does this tool work with non-model species of different ploidy?