I'm writing a routine to import a VCF file as a numeric matrix, but I see much higher memory usage than I expected.
As a minimal working example, consider the code below, which simply loops over a VCF file:
using GeneticVariation

function loop_vcf()
    reader = VCF.Reader(open("target.vcf", "r"))
    s = 0
    # count every genotype entry in the file
    for record in reader, geno in record.genotype
        s += 1
    end
    close(reader)
    return s
end
On test data (target.vcf.gz, which must be decompressed first) with 3000 records and 100 samples, I get the following benchmark:
using BenchmarkTools

@benchmark loop_vcf()

BenchmarkTools.Trial:
  memory estimate:  98.64 MiB
  allocs estimate:  941005
  --------------
  minimum time:     62.249 ms (5.75% GC)
  median time:      63.186 ms (5.99% GC)
  mean time:        63.835 ms (6.75% GC)
  maximum time:     79.381 ms (5.22% GC)
  --------------
  samples:          79
  evals/sample:     1
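For scale, that works out to roughly 345 bytes and about 3 allocations per genotype visited, assuming every one of the 3000 records really contributes 100 genotype entries:

n_genotypes = 3_000 * 100          # 300_000 inner-loop iterations
98.64 * 2^20 / n_genotypes         # ≈ 345 bytes allocated per genotype
941_005 / n_genotypes              # ≈ 3.1 allocations per genotype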
Why am I getting such a large memory requirement? My data file target.vcf is only 1.3 MB on disk, so this memory usage seems highly suspicious.
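For comparison, I also sketched a variant that reuses a single pre-allocated record instead of letting the iterator build a new one per row. This assumes the reader supports the usual BioJulia in-place read!(reader, record) interface and an empty VCF.Record() constructor; I haven't verified that this is the intended API for GeneticVariation:

using GeneticVariation

function loop_vcf_inplace()
    reader = VCF.Reader(open("target.vcf", "r"))
    record = VCF.Record()              # buffer reused for every record (assumed API)
    s = 0
    while !eof(reader)
        read!(reader, record)          # fill `record` in place (assumed API)
        for geno in record.genotype
            s += 1
        end
    end
    close(reader)
    return s
end

If this pattern is supported, it would at least tell me whether the allocations come from record construction or from the genotype iteration itself.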