Unexpectedly high memory usage #9
Comments
Allocated memory is not the same as actual memory usage. The Bloom filter uses a BitArray for storage, so it actually takes up less memory than a Dictionary.
[MemoryDiagnoser]
public class Issues9
{
    public int DataSize = 3_000_000;
    private IList<byte[]> filterData;
    private IList<string> dictData;

    [GlobalSetup]
    public void Setup()
    {
        filterData = new List<byte[]>(DataSize);
        dictData = new List<string>(DataSize);
        for (var i = 0; i < DataSize; i++)
        {
            filterData.Add(Encoding.UTF8.GetBytes($"property_{i}_name"));
            dictData.Add($"property_{i}_name");
        }
    }

    [Benchmark]
    public void BloomFilter()
    {
        var filter = FilterBuilder.Build(1000, 0.01);
        for (var i = 0; i < DataSize; i++)
        {
            filter.Add(filterData[i]);
        }
        for (var i = 0; i < DataSize; i++)
        {
            if (!filter.Contains(filterData[i]))
            {
            }
        }
    }

    [Benchmark]
    public void Dictionary()
    {
        var bf = new Dictionary<string, bool>();
        for (var i = 0; i < DataSize; i++)
        {
            bf.Add(dictData[i], true);
        }
        for (var i = 0; i < DataSize; i++)
        {
            if (!bf.ContainsKey(dictData[i]))
            {
            }
        }
    }
}
Hmm, I still can't make sense of the results:
Since a Bloom filter is a probabilistic structure, it should occupy a fraction of the dictionary's memory, unless the filter is configured with storage larger than all 3 million byte arrays combined. Is there any other way to measure the total size of the dictionary and the Bloom filter, short of writing a test application and observing its actual memory usage? I appreciate that allocations are measuring the wrong thing here.
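As a rough sanity check on the expected storage, the textbook sizing formulas for a Bloom filter can be worked through for the numbers in this thread. This is only a sketch: it assumes the library sizes its bit array with the standard formulas, which the thread does not confirm. Here n is the expected element count, p the target false-positive rate, m the number of bits, and k the number of hash functions.

```latex
m = -\frac{n \ln p}{(\ln 2)^2}, \qquad k = \frac{m}{n}\ln 2

% For p = 0.01: m/n = 4.605 / 0.4805 \approx 9.59 bits per element, k \approx 6.64 (rounded up to 7)
% n = 1{,}000:      m \approx 9{,}585 bits \approx 1.2 KB
% n = 3{,}000{,}000: m \approx 28{,}755{,}000 bits \approx 3.6 MB
```

Under that assumption, a filter sized for all 3 million items would need about 3.6 MB of bits, far less than 3 million string keys in a Dictionary. Note that the benchmark above calls FilterBuilder.Build(1000, 0.01), so the filter is sized for 1,000 elements while 3 million are added; its bit array would then be tiny, but the false-positive rate would be far above the 1% target.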
Write a console program that runs the test and stays open after it completes. Then check how much memory the process is using.
The storage is allocated up front, during initialization.
Perhaps I'm missing some configuration options, but I would have expected the Bloom filter to have static memory usage:
I tested with various options, and Dictionary consistently uses less memory.