[Tiered Caching] [Milestone 1] KeyLookup store for disk cache #10309
Labels
enhancement
Enhancement or improvement to existing feature or request
Search
Search query, autocomplete ...etc
Is your feature request related to a problem? Please describe.
Refer main #10024
and #10870
As part of disk based caches, we need to check whether a key is present on disk or not and extract value accordingly. As it can happen that we unnecessarily perform a disk seek even when the key is not present. We can solve this by storing disk cache keys in memory in an optimized way.
Details
As part of tiered caching milestone1, we are looking to add a disk tier. Where essentially we will be spilling evicted items from in-memory cache to this disk tier.
So whenever a request comes, we need to check whether this request is present on either of caches i.e. in-memory or disk. As doing a disk GET(to know whether request is present or not) every time might be expensive and will add to the overall search latency. So to solve this, we are trying to maintain disk keys in-memory itself. To do this, instead of storing the whole key in memory(which can be expensive), we will just store the key hashcode(integer) in memory using this key store implemenation. Obviously there might be false positive cases but it should be few. Also the actual keys will be stored on disk itself.
To implement this key store, we were looking for different data structures like bitset array, roaring bitmap etc. Bitset can be expensive memory wise considering how it works. As for example, if we have only one key which is a large number N, we need to allocate that much memory for bitset array to set Nth bit index for a number N.
On other hand, roaring bitmap is more of a compressed bitset which works better memory wise but relatively slower. Here too we need to be considerate of the max memory roaring bitmap will use, so we have added some logic around that.
Describe the solution you'd like
A clear and concise description of what you want to happen.
Describe alternatives you've considered
A clear and concise description of any alternative solutions or features you've considered.
Additional context
Add any other context or screenshots about the feature request here.
The text was updated successfully, but these errors were encountered: