Skip to content

Latest commit

 

History

History
63 lines (33 loc) · 1.67 KB

HashKV-2021-6-4.md

File metadata and controls

63 lines (33 loc) · 1.67 KB

Title: HashKV: Enabling Efficient Updates in KV Storage via Hashing

Source: ATC'18

Authors: Helen H. W. Chan, Yongkun Li, Patrick P. C. Lee Yinlong Xu


Summary

  • The problem the paper aims to solve

    • Key-Value Separated LSM-Stores (WiscKey) introduces extra query and insert overhead while doing Garbage Collection.
    • Unnecessary rewrites for cold KV pairs - Write amplification.
  • How can the paper address the problem? What is the main idea of this paper?

    • Hash-based Data Grouping

      (1) Partition isolation: hash(Key) -> Partition.

      (2) KV pairs with same key are segmented into same group;

    • Group-Based Garbage Collection

      (1) Select the segment group with the largest amount of writes.

      (2) Using hash table (No LSM-tree queries required).

    • Hotness Awareness

      (1) Tagging for Hot-cold value separation.

      (2) Cold values are separately stored.

    • Selective KV Separation

      (1) Large values: KV separation.

      (2) Small values: stored entirely in LSM-tree.

  • Validations

    The results of experimentation and analysis show that through the elimination of mixture of hot and cold data, HashKV can achieve better throughput and reduce write amplification.


Strengthens

  • The idea of splitting KV pairs into segments based on hash of keys is interesting.

Weaknesses

  • For Hash-based Data Grouping, The Hash function is important. But it's not mentioned in the paper.
  • The overhead introduced by HashKV is not evaluated.

Comments

  • The comparison object and organization of the experiment are a bit strange.