[Feature Request] Efficient and predictable maps in Move Framework #15330
Labels
enhancement
New feature or request
move-framework
Issues related to the Framework modules/libraries
stale-exempt
Prevents issues from being automatically marked and closed as stale
🚀 Feature Request
Implement in Move Framework efficient maps, with predictable performance on every operation. Both single struct map, as well as unbounded in size, multi-storage-slot, big map.
Motivation and Pitch
Both of the current datastructures - SimpleMap and SmartTable, cannot be improved, because their layout is fixed (becuase they were implemented before enums were available). So we will need new implementations, and those should supersede (i.e. deprecate) the current ones fully. In addition, they should be written in a way that they can be evolved in the future.
Implementation
Evaluation
OrderedMap::SortedVectorMap
Implementation is a SortedVectorMap, but given the efficiency improvements of vector operations with memcpy, it's performance characteristics are extremely good.
Running different sizes tests, here we measure nanoseconds taken for a single pair of insert + remove operation, into a map of varied size:
Basically achieving log(n) performance, with very simple implementation.
BigOrderedMap::BPlusTreeMap
Implementation uses BPlusTreeMap, which is chosen as it is best suited for the onchain storage layout - where majority of cost comes from loading and writing to storage items, and there is no partial read/write of them. It also rebalances in a way that reduces amount writes needed
For performance, we measured two things.
At large scale
Most relevantly - we measured performance at scale, in comparison to SmartTable. So adding 1M keys into each, with making entries be 4KB on each. we get:
This shows BigOrderedMap being more storage-efficient, especially when keys/values are small, due to SmartTable storing hash in addition to key,value. It is also more performant even on 1M keys, when we are optimizing for storage costs (i.e. more elements in a single bucket). As we reduce the size of the bucket, SmartTable becomes more competitive, but the storage costs increase significantly.
Note: Test is compared on the single thread to compare apples to apples, as SmartTable has no parallelism. BigOrderedMap is parallel, and running on more threads gives order of magnitude higher throughput.
At small scale
We also measured performance at small scale, and how much overhead is it to use BigOrderedMap, instead of just OrderedMap, when it is unknown if data will be too large to be stored in a single resource.
Here we measure nanoseconds taken for a single pair of insert + remove operation, into a map of varied size.
Here we can see that inlining of the root node makes the overhead be less than 2 times, and even splitting small maps to achieve higher parallelism - keeps the overhead reasonable (i.e. another 2-3 times). But in all cases it scales extremely well as dataset increases in size.
Alternatives
Unordered maps in general can produce better performance than ordered maps, as they have more freedom. But:
Other alternative would be to implement the whole datastructures within rust (i.e. have them backed by rust structs itself), and provide native calls to the operations. This is quite more complicated, and requires such support for native memory layout within VM, so this has been punted on. We might revisit it at some point in the future, if it can further improve performance significantly.
The text was updated successfully, but these errors were encountered: