Add --ploidy to vcfSampleCompare. This will affect 2 things:
--genotype (--nogenotype)
--separation-gap (-a)
Genotype calls can be haploid (ploidy=1), diploid (ploidy=2), etc. They appear in VCF files as a series of slash-delimited digits (0-3 or possibly more). Each digit refers either to the reference state (0) or to one of the comma-delimited variant (ALT) states. For example, with ploidy=1, a genotype call of '0' indicates "same as reference" and a call of '1' indicates the first of the comma-delimited ALT values. With ploidy=2, a call of 0/0 means both alleles are the same as the reference, 1/1 means both alleles are the first variant, and 0/1 is a heterozygous state.
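To make the index-to-allele mapping concrete, here is a minimal Python sketch (not the script's actual code). The function name `decode_genotype` is hypothetical. It also accepts `|` as a separator, which VCF uses for phased calls, and treats `.` as a missing call; both are assumptions beyond what is described above.

```python
import re

def decode_genotype(gt, ref, alts):
    """Translate a VCF GT string such as '0/1' into a list of allele strings.

    Hypothetical helper: index 0 is REF, indexes 1.. are the ALT values
    in the order they appear in the comma-delimited ALT column.
    """
    alleles = [ref] + alts
    decoded = []
    for index in re.split(r"[/|]", gt):
        # '.' marks a missing call; keep it as None rather than guessing
        decoded.append(None if index == "." else alleles[int(index)])
    return decoded

print(decode_genotype("0", "A", ["G", "T"]))    # haploid, same as reference
print(decode_genotype("0/1", "A", ["G", "T"]))  # diploid, heterozygous
print(decode_genotype("1/1", "A", ["G", "T"]))  # diploid, both first variant
```

The length of the returned list is the ploidy implied by the call.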
Setting ploidy should affect the usage of genotype calls in the following way:
Error if the genotype inferred from the data is a different ploidy than was supplied to the script on the command line.
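The check could look something like the following Python sketch. `PloidyMismatchError`, `infer_ploidy`, and `check_ploidy` are hypothetical names, and the error message wording is only illustrative.

```python
import re

class PloidyMismatchError(ValueError):
    """Raised when a genotype call disagrees with the user-supplied ploidy."""

def infer_ploidy(gt):
    # Count the allele indexes in the call, e.g. '0/1' -> 2, '1' -> 1
    return len(re.split(r"[/|]", gt))

def check_ploidy(gt, expected_ploidy):
    """Return the observed ploidy, or raise if it differs from --ploidy."""
    observed = infer_ploidy(gt)
    if observed != expected_ploidy:
        raise PloidyMismatchError(
            f"genotype {gt!r} has ploidy {observed}, "
            f"but --ploidy {expected_ploidy} was supplied"
        )
    return observed

check_ploidy("0/1", 2)      # passes silently
try:
    check_ploidy("1", 2)    # haploid call with --ploidy 2
except PloidyMismatchError as err:
    print(f"ERROR: {err}")
```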
Setting ploidy should affect allelic frequency in the following way:
Separation gap calculation should change
ploidy 1 would cause the calculation/comparison to be abs(AO1/DP1 - AO2/DP2) >= gap, where 1 and 2 denote the two samples being compared
ploidy 2 would cause the calculation/comparison to be
Closest distance to 0.0, 0.5, or 1.0 for each sample
abs(AO1/DP1 - AO2/DP2) >= gap/2 and (distance_to_closest(AO1/DP1,(0,0.5,1)) + distance_to_closest(AO2/DP2,(0,0.5,1))) <= 1/(gap/2), where 1 and 2 denote the two samples, and set to 0 if the closest value is the same for both samples
ploidy 3 - same as 2, but with "3" in place of "2" and (0, 0.33, 0.67, 1) as the expected frequencies
ploidy n...
Instead of distance_to_closest(), I could use the distance to the genotype call
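A rough Python sketch of the building blocks, under stated assumptions: the expected frequencies for ploidy n are taken to be k/n for k = 0..n (which gives (0, 0.5, 1) for ploidy 2 and roughly (0, 0.33, 0.67, 1) for ploidy 3), and the gap is assumed to scale as gap/ploidy. `expected_frequencies` and `passes_gap` are hypothetical names. The second condition (the sum of the distances) is left out because its threshold is still undecided.

```python
def expected_frequencies(ploidy):
    """Allelic frequencies expected for clean genotype calls: k/ploidy."""
    return [k / ploidy for k in range(ploidy + 1)]

def distance_to_closest(freq, levels):
    """Distance from an observed frequency to the nearest expected level."""
    return min(abs(freq - level) for level in levels)

def passes_gap(freq1, freq2, gap, ploidy=1):
    """Separation test between two samples' AO/DP values.

    Assumption: the threshold is gap/ploidy, so ploidy 1 uses the full gap
    and higher ploidies use a proportionally smaller one.
    """
    return abs(freq1 - freq2) >= gap / ploidy

# AO/DP for two samples (made-up read counts)
f1 = 18 / 20   # 0.9
f2 = 10 / 20   # 0.5
print(expected_frequencies(2))
print(distance_to_closest(f1, expected_frequencies(2)))
print(passes_gap(f1, f2, gap=0.6, ploidy=1))
print(passes_gap(f1, f2, gap=0.6, ploidy=2))
```

With these numbers the pair fails the haploid test but passes the diploid one, which matches the expectation that less is filtered at higher ploidy.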
The higher the ploidy, the greater the chance for noise. The gap should therefore be reported as if ploidy is 1, the sort should be as if ploidy is 1, and the filter should be based on the actual ploidy (which should cause less to be filtered, the higher the ploidy).
Sort first on genotype call, then on separation gap
Perhaps I can determine significance given all the data
Sub-sort on the allelic frequency difference (genotype difference being the primary sort). If ploidy is wrong, ignore genotype in sort
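The sort order described above could be sketched in Python as follows. The record fields (`genotypes_differ`, `freq_diff`) and the function name `rank_records` are hypothetical, and putting the largest differences first is an assumption.

```python
def rank_records(records, ploidy_ok=True):
    """Sort on genotype difference first, then on allelic frequency difference.

    If the supplied ploidy does not match the data (ploidy_ok=False), the
    genotype is ignored and only the frequency difference is used.
    """
    def sort_key(record):
        # 0 sorts before 1, so records whose genotype calls differ come first
        genotype_rank = 0 if (ploidy_ok and record["genotypes_differ"]) else 1
        # Negate so that larger frequency differences sort first
        return (genotype_rank, -record["freq_diff"])
    return sorted(records, key=sort_key)

records = [
    {"id": "a", "genotypes_differ": False, "freq_diff": 0.9},
    {"id": "b", "genotypes_differ": True, "freq_diff": 0.4},
    {"id": "c", "genotypes_differ": True, "freq_diff": 0.7},
]
print([r["id"] for r in rank_records(records)])
print([r["id"] for r in rank_records(records, ploidy_ok=False)])
```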