The cloud infrastructure motivates disaggregation of monolithic data stores into components that are assembled together based on an application's workload. This study investigates disaggregation of an LSM-tree key-value store into components that communicate using RDMA. These components separate storage from processing, enabling processing components to share storage bandwidth and space. The processing components scatter blocks of a file (SSTable) across an arbitrary number of storage components and balance load across them using power-of-d. They construct ranges dynamically at runtime to parallelize compaction and enhance performance. Each component has configuration knobs that control its scalability. The resulting component-based system, Nova-LSM, is elastic. It outperforms its monolithic counterparts, both LevelDB and RocksDB, by more than an order of magnitude with workloads that exhibit a skewed pattern of access to data.
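The power-of-d placement mentioned above can be illustrated with a minimal sketch. This is not Nova-LSM's implementation: the function names, the use of a per-component block count as the load metric, and the default `d=2` are all assumptions made for illustration. The idea is to sample `d` storage components at random and send the block to the least loaded of them.

```python
import random


def pick_storage_component(loads, d=2, rng=random):
    """Power-of-d choice: sample d distinct components at random and
    return the index of the one with the smallest load.

    `loads` is a hypothetical per-component load metric (here, the number
    of blocks already assigned); the real system may use another signal.
    """
    candidates = rng.sample(range(len(loads)), min(d, len(loads)))
    return min(candidates, key=lambda i: loads[i])


def scatter_blocks(num_blocks, num_components, d=2, seed=0):
    """Assign each block of a file to a storage component using power-of-d.

    Returns (placement, loads): placement[i] is the component chosen for
    block i, and loads[c] is the number of blocks placed on component c.
    """
    rng = random.Random(seed)  # seeded so the sketch is reproducible
    loads = [0] * num_components
    placement = []
    for _ in range(num_blocks):
        target = pick_storage_component(loads, d, rng)
        loads[target] += 1
        placement.append(target)
    return placement, loads


if __name__ == "__main__":
    placement, loads = scatter_blocks(num_blocks=100, num_components=5, d=2)
    print("blocks per storage component:", loads)
```

With `d=1` this degenerates to purely random placement, and with `d` equal to the number of components it always picks the least loaded component. Small values of `d` trade a little balance for fewer load probes.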
Linux. We have tested it on CloudLab R320 and R6220 instances.
NovaLSM requires RDMA packages, gflags, and fmt. You may install all of its required dependencies using
bash scripts/bootstrap/env/install-deps.sh
cmake .
make -j4
https://github.com/HaoyuHuang/NovaLSM-YCSB-Client
https://github.com/HaoyuHuang/NovaLSMSim
CloudLab profile: https://github.com/HaoyuHuang/NovaLSM/blob/master/scripts/bootstrap/cloud_lab_profile.py
- Set up SSH between CloudLab nodes.
bash scripts/bootstrap/env/setup-apt-ssh.sh $number_of_nodes
- Clone the YCSB client binding repo on your CloudLab node.
- SSH to node-0 on CloudLab to install everything required for the experiments. This takes around 15-20 minutes.
bash scripts/bootstrap/env/init.sh $number_of_nodes
We conducted all of our experiments using CloudLab r6220 nodes. The experiment scripts are under "scripts/exp". Before running them, change the directory paths in these scripts to point to the directory that stores your server binaries.