Below is some prototype code. This `convert_gt` runs without error but is untested:

```julia
using BGEN

"""
    convert_gt(b::Bgen, T=Float32)

Imports dosage information and chr/pos/snpID/ref/alt into numeric arrays.

# Input
- `b`: a `Bgen` object
- `T`: Type for genotype array

# Output
- `G`: a `p × n` matrix of type `T`. Each column is one sample's genotype vector
- `Gchr`: Vector of `String`s holding chromosome number for each variant
- `Gpos`: Vector of `Int` holding each variant's position
- `GsnpID`: Vector of `String`s holding variant ID for each variant
- `Gref`: Vector of `String`s holding reference allele for each variant
- `Galt`: Vector of `String`s holding alternate allele for each variant
"""
function convert_gt(b::Bgen, T=Float32)
    n = n_samples(b)
    p = n_variants(b)

    # return arrays
    G = Matrix{T}(undef, p, n)
    Gchr = Vector{String}(undef, p)
    Gpos = Vector{Int}(undef, p)
    GsnpID = Vector{String}(undef, p)
    Gref = Vector{String}(undef, p)
    Galt = Vector{String}(undef, p)

    # loop over each variant
    i = 1
    for v in iterator(b; from_bgen_start=true)
        dose = minor_allele_dosage!(b, v; T=T)
        copyto!(@view(G[i, :]), dose)
        Gchr[i], Gpos[i], GsnpID[i], Gref[i], Galt[i] =
            chrom(v), pos(v), rsid(v), major_allele(v), minor_allele(v)
        i += 1
        clear!(v) # release the variant's decompressed data before moving on
    end
    return G, Gchr, Gpos, GsnpID, Gref, Galt
end
```
This `convert_ht` does not work because `major_allele` and `minor_allele` are not working:

```julia
using BGEN

"""
    convert_ht(b::Bgen)

Import phased haplotypes as a `BitMatrix`, and store chr/pos/snpID/ref/alt.

# Input
- `b`: a `Bgen` object. Each variant must be phased and samples must be diploid

# Output
- `H`: a `p × 2n` `BitMatrix`. Each column is a haplotype.
- `Hchr`: Vector of `String`s holding chromosome number for each variant
- `Hpos`: Vector of `Int` holding each variant's position
- `HsnpID`: Vector of `String`s holding variant ID for each variant
- `Href`: Vector of `String`s holding reference allele for each variant
- `Halt`: Vector of `String`s holding alternate allele for each variant
"""
function convert_ht(b::Bgen)
    n = 2n_samples(b)
    p = n_variants(b)

    # return arrays
    H = BitMatrix(undef, p, n)
    Hchr = Vector{String}(undef, p)
    Hpos = Vector{Int}(undef, p)
    HsnpID = Vector{String}(undef, p)
    Href = Vector{String}(undef, p)
    Halt = Vector{String}(undef, p)

    # loop over each variant
    i = 1
    for v in iterator(b; from_bgen_start=true)
        dose = probabilities!(b, v)
        phased(v) || error("variant $(rsid(v)) at position $(pos(v)) not phased!")
        for j in 1:n_samples(b)
            Hi = @view(dose[:, j])
            H[i, 2j - 1] = read_haplotype1(Hi)
            H[i, 2j] = read_haplotype2(Hi)
        end
        Hchr[i], Hpos[i], HsnpID[i], Href[i], Halt[i] =
            chrom(v), pos(v), rsid(v), major_allele(v), minor_allele(v)
        i += 1
        clear!(v) # release the variant's decompressed data before moving on
    end
    return H, Hchr, Hpos, HsnpID, Href, Halt
end

# A haplotype is `true` when it most likely carries the second allele
read_haplotype1(Hi::AbstractVector) = Hi[2] ≥ 0.5
read_haplotype2(Hi::AbstractVector) = Hi[4] ≥ 0.5
```
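To make the decoding step in `convert_ht` concrete, here is a minimal sketch that runs on a hand-made probability matrix instead of a real BGEN file. It assumes the layout that the indices `Hi[2]` and `Hi[4]` imply for phased, diploid, biallelic variants: four values per sample, ordered as P(haplotype 1 = allele 1), P(haplotype 1 = allele 2), P(haplotype 2 = allele 1), P(haplotype 2 = allele 2). The matrix `probs`, its values, and the helper `decode_haplotypes` are made up for illustration and are not part of BGEN.jl.

```julia
read_haplotype1(Hi::AbstractVector) = Hi[2] ≥ 0.5
read_haplotype2(Hi::AbstractVector) = Hi[4] ≥ 0.5

# Hypothetical helper: decode one variant's 4 × n probability matrix into
# a BitVector of length 2n (two haplotypes per sample), i.e. one row of H.
function decode_haplotypes(probs::AbstractMatrix)
    n = size(probs, 2)
    h = falses(2n)
    for j in 1:n
        Hi = @view probs[:, j]
        h[2j - 1] = read_haplotype1(Hi)
        h[2j] = read_haplotype2(Hi)
    end
    return h
end

# Toy stand-in for the result of probabilities!(b, v): one column per sample.
probs = Float32[
    1.0 0.0 0.0   # P(haplotype 1 carries allele 1)
    0.0 1.0 1.0   # P(haplotype 1 carries allele 2)
    0.0 1.0 0.0   # P(haplotype 2 carries allele 1)
    1.0 0.0 1.0   # P(haplotype 2 carries allele 2)
]

h = decode_haplotypes(probs)
```

With these values, sample 1 decodes to `(false, true)`, sample 2 to `(true, false)`, and sample 3 to `(true, true)`.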
Do you know of a dataset that is available in both VCF and BGEN format? That would be great for testing. Also, I'm guessing `probabilities!` and `minor_allele_dosage!` allocate for every variant. If so, we probably need to modify them so they accept preallocated vectors.
The tool bgenix supports transformation into the VCF format, so maybe we can use that for creating a test dataset. There already is minimal code for reading .gen, .vcf, and .haplotype files in the test code at this time, so maybe you can use that if necessary.
Yes, `probabilities!` and `minor_allele_dosage!` allocate for every variant. For now this can be avoided by using internal functions. I may have to restructure and document these functions to support the last suggestion.
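The preallocation pattern under discussion can be sketched without touching BGEN.jl internals. The function `fill_dosage!` below is hypothetical and not a BGEN.jl function; it only illustrates writing each variant's result into a caller-supplied buffer instead of returning a fresh vector. It assumes unphased, diploid, biallelic probabilities stored as a 3 × n matrix whose rows are P(0 copies), P(1 copy), and P(2 copies) of the second allele, and the input values are made up.

```julia
# Hypothetical helper (not part of BGEN.jl): compute expected allele counts
# into the preallocated vector `out`, so no new array is created per variant.
function fill_dosage!(out::AbstractVector, probs::AbstractMatrix)
    size(probs, 1) == 3 || error("expected 3 genotype probabilities per sample")
    length(out) == size(probs, 2) || error("buffer length must equal sample count")
    for j in 1:size(probs, 2)
        out[j] = probs[2, j] + 2 * probs[3, j]
    end
    return out
end

# Made-up probabilities for 2 variants × 3 samples
toy_variants = [
    Float32[1 0 0; 0 1 0; 0 0 1],
    Float32[0.5 0 0; 0.5 0 1; 0 1 0],
]

buf = Vector{Float32}(undef, 3)                      # allocated once, reused
G = Matrix{Float32}(undef, length(toy_variants), 3)
for (i, probs) in enumerate(toy_variants)
    fill_dosage!(buf, probs)
    G[i, :] .= buf
end
```

The buffer `buf` is allocated once outside the loop and overwritten for each variant, which is the behavior a preallocating variant of `minor_allele_dosage!` would need to provide.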