Code implementation for paper "From LeanStore to LearnedStore: Using a Learned Index to Improve Database Index Search"

shubhajeet/learnedStore

LearnedStore

LearnedStore (paper available on IEEE Xplore) uses a learned index to accelerate search in a B-tree-based KV database. The project has been adapted from LeanStore (commit d3d83143ee74c54c901fe5431512a46965377f4e), a high-performance OLTP storage engine optimized for many-core CPUs and NVMe SSDs.

Compiling

Clone the repository and its submodules:

git clone --recurse-submodules git@github.com:shubhajeet/learnedStore.git

Install dependencies:

sudo apt-get install cmake libaio-dev libtbb-dev libsparsehash-dev

Installing submodules

mkdir submodules/instrumentation/build
cd submodules/instrumentation/build
cmake -DCMAKE_BUILD_TYPE=Release .. && make -j

Then return to the repository root and build LearnedStore:

mkdir build_Release && cd build_Release && cmake -DCMAKE_BUILD_TYPE=release .. && make -j

Downloading Dataset

Please refer to the SOSD paper to obtain the SOSD dataset. Benchmarking on the SOSD dataset can be done using the bench_dataset.sh script. The dataset must be stored in the data folder.

Running benchmark

Benchmark scripts are in the testbench folder and are named bench_<>.sh. Config files use the .cfg extension, and some example configs can be found in the repo.

An experiment can be run using the following command:

bench_<>.sh <>.cfg <exp_name>

An appropriate disk file should be created before running the experiments.

  • bench_learnstore.sh :: script to measure leanstore and learnstore on created workload
  • bench_latency.sh :: script to measure the latency
  • bench_dataset.sh :: read only throughput experiments
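
The invocation pattern above (script, then config, then experiment name) can be wrapped in a small pre-flight check. This is only a sketch: `run_bench` is a hypothetical helper that is not part of the repository, and it checks nothing beyond the presence of the script and the config file.

```shell
# Hypothetical wrapper (not part of the repository): verify that the
# benchmark script and config file exist before launching an experiment.
run_bench() {
    if [ "$#" -ne 3 ]; then
        echo "usage: run_bench <bench_script> <config.cfg> <exp_name>" >&2
        return 2
    fi
    script="$1"; cfg="$2"; exp="$3"
    if [ ! -x "$script" ]; then
        echo "missing or non-executable script: $script" >&2
        return 1
    fi
    if [ ! -f "$cfg" ]; then
        echo "missing config file: $cfg" >&2
        return 1
    fi
    # Same argument order as in the README: <script> <config> <exp_name>
    "$script" "$cfg" "$exp"
}
```

For example, `run_bench ./bench_learnstore.sh 200M.cfg read` would run the first Figure 2 experiment listed below after checking that both files exist.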

Experiment Figure Regeneration

  1. Figure 2
./bench_learnstore.sh 200M.cfg read
./bench_learnstore.sh 200M.cfg readseg
  2. Figure 3 : Measuring latency

Create SSD dataset

./bench_learnstore.sh 200M_512b.cfg create

Measure latency

./bench_latency.sh 200M_512b.cfg read | tee /tmp/leanstore_cold_latency.log
./bench_latency.sh 200M_512b.cfg readseg | tee /tmp/learnstore_cold_latency.log

Convert to csv

grep latency: /tmp/leanstore_cold_latency.log | cut -d' ' -f 2 | tee leanstore_cold_latency.csv
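
For a quick look at tail latency before opening the notebook, the extracted values can be summarized directly in the shell. This is a minimal sketch, not the notebook's method: `extract_latency` and `percentile` are hypothetical helpers, the log is assumed to contain lines of the form `latency: <value>` (as the `grep`/`cut` command implies), and the percentile uses the nearest-rank definition.

```shell
# Hypothetical helpers (not part of the repository).

# Print the second space-separated field of every "latency:" line,
# mirroring the grep | cut pipeline above.
extract_latency() {
    grep 'latency:' "$1" | cut -d' ' -f 2
}

# Print the p-th percentile (nearest-rank) of a file with one value per line.
percentile() {
    p="$1"; file="$2"
    sort -n "$file" | awk -v p="$p" '
        { v[NR] = $1 }
        END {
            if (NR == 0) exit 1
            r = p * NR / 100                      # fractional rank
            k = (r == int(r)) ? r : int(r) + 1    # ceil(r)
            if (k < 1) k = 1
            if (k > NR) k = NR
            print v[k]
        }'
}
```

Typical use would be `extract_latency /tmp/leanstore_cold_latency.log > lat.csv` followed by `percentile 99 lat.csv`; the file name `lat.csv` is only an example.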

Create graphs using experiments/percentile.ipynb

  3. Figure 4 : The figure can be created using the dot file: experiment/state.dot
  4. Figure 5 : Read-only workload throughput. Auto train should be disabled.
./bench_dataset.sh 200M.cfg lineargen
./bench_dataset.sh 200M.cfg randomgen
./bench_dataset.sh 200M.cfg pieceLinear
./bench_dataset.sh 200M.cfg amzn
./bench_dataset.sh 200M.cfg fb
./bench_dataset.sh 200M.cfg logn
./bench_dataset.sh 200M.cfg norm
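
The seven runs above differ only in the experiment name, so they can be driven by a loop. A minimal sketch, assuming it is run from the folder containing bench_dataset.sh: `run_all_datasets` and the `BENCH` override are hypothetical conveniences, not part of the repository.

```shell
# Hypothetical helper: run bench_dataset.sh once per dataset name listed above.
# Set BENCH=echo to print the arguments of each run instead of executing it.
run_all_datasets() {
    cfg="${1:-200M.cfg}"
    bench="${BENCH:-./bench_dataset.sh}"
    for exp in lineargen randomgen pieceLinear amzn fb logn norm; do
        # Stop at the first failing run so later results are not mixed
        # with a broken setup.
        "$bench" "$cfg" "$exp" || return 1
    done
}
```

Running `BENCH=echo run_all_datasets 200M.cfg` first is a cheap way to confirm the argument order before starting the real experiments.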

Cite

This repository contains the code we used for our HDIS 2023 paper:

@inproceedings{maharjanLearnedStore,
    author    = {Sujit Maharjan},
    title     = {From LeanStore to LearnedStore: Using a Learned Index to Improve Database Index Search},
    booktitle = {HDIS},
    year      = {2023}
}
