mito: Write cache for remote object store #2965

Open
8 of 22 tasks
evenyag opened this issue Dec 20, 2023 · 0 comments
Labels
C-performance Category Performance tracking-issue A tracking issue for a feature.

What type of enhancement is this?

Performance

What does the enhancement do?

Currently, the mito storage engine writes SST files directly to remote object stores.

pub struct ParquetWriter {
    /// SST output file path.
    file_path: String,
    /// Input data source.
    source: Source,
    /// Region metadata of the source and the target SST.
    metadata: RegionMetadataRef,
    /// Object store that the SST file is written to.
    object_store: ObjectStore,
}

As a result, we have to fetch the object from the object store again whenever we want to access the file. If we implement a write-through cache for Parquet files, a local copy is already available after the write and we don't need to download the object again.
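To make the idea concrete, here is a minimal sketch of a write-through cache. It is not the mito implementation: `RemoteStore` and `WriteCache` are hypothetical in-memory stand-ins (the real engine uses `ObjectStore`), and the `downloads` counter exists only to show when a read has to go to the remote store.

```rust
use std::collections::HashMap;

/// Hypothetical stand-in for a remote object store.
/// `downloads` counts how often an object is fetched.
#[derive(Default)]
struct RemoteStore {
    objects: HashMap<String, Vec<u8>>,
    downloads: usize,
}

impl RemoteStore {
    fn put(&mut self, path: &str, data: &[u8]) {
        self.objects.insert(path.to_string(), data.to_vec());
    }

    fn get(&mut self, path: &str) -> Option<Vec<u8>> {
        self.downloads += 1;
        self.objects.get(path).cloned()
    }
}

/// Hypothetical write-through cache: every write lands in the local
/// copy and in the remote store, so later reads can stay local.
#[derive(Default)]
struct WriteCache {
    local: HashMap<String, Vec<u8>>,
    remote: RemoteStore,
}

impl WriteCache {
    /// Writes the file to the local cache and then to the remote store.
    fn write(&mut self, path: &str, data: &[u8]) {
        self.local.insert(path.to_string(), data.to_vec());
        self.remote.put(path, data);
    }

    /// Serves the file from the local cache if present; otherwise
    /// downloads it and keeps a local copy for later reads.
    fn read(&mut self, path: &str) -> Option<Vec<u8>> {
        if let Some(data) = self.local.get(path) {
            return Some(data.clone());
        }
        let data = self.remote.get(path)?;
        self.local.insert(path.to_string(), data.clone());
        Some(data)
    }
}
```

In this sketch, a `read` that follows a `write` of the same path never touches `RemoteStore::get`; only a path missing from the local cache triggers a download.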

Implementation challenges

A write cache might increase the cost of uploading an object and the memory pressure of memtables if a memtable stays in memory until its file is fully uploaded.

  • A better approach is to release the memtable once we flush the file to the write cache.
  • We update the manifest only after the object is fully uploaded to the remote object store.

To implement async upload, we need to store additional metadata, such as the flushed sequence and the region id, for files in the write cache. The region edit also requires memtable ids to remove the flushed memtables. Since the memtable id is incremented globally, we should switch to using the minimum sequence of a memtable to identify it instead.
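The two-phase flow described above could look roughly like the following sketch. All names here (`Memtable`, `PendingUpload`, `RegionEdit`, `Region` and their fields) are hypothetical and simplified for illustration; the actual mito types differ. The point is only which metadata has to travel with a cached file so the manifest can be updated after the upload completes.

```rust
type RegionId = u64;
type SequenceNumber = u64;

/// Hypothetical memtable, identified by its minimum sequence.
struct Memtable {
    min_sequence: SequenceNumber,
    max_sequence: SequenceNumber,
    rows: Vec<String>,
}

/// Metadata stored with a file in the write cache until it is uploaded.
#[derive(Debug, Clone, PartialEq)]
struct PendingUpload {
    region_id: RegionId,
    file_path: String,
    flushed_sequence: SequenceNumber,
    memtable_min_sequence: SequenceNumber,
}

/// Hypothetical region edit recorded in the manifest after upload.
#[derive(Debug, PartialEq)]
struct RegionEdit {
    region_id: RegionId,
    file_to_add: String,
    flushed_sequence: SequenceNumber,
    memtable_to_remove: SequenceNumber,
}

#[derive(Default)]
struct Region {
    id: RegionId,
    memtables: Vec<Memtable>,
    pending: Vec<PendingUpload>,
    manifest: Vec<RegionEdit>,
}

impl Region {
    /// Phase 1: flush the oldest memtable to the write cache and release it.
    /// Writing the rows into the cached file is elided in this sketch.
    fn flush_to_cache(&mut self, file_path: &str) -> bool {
        if self.memtables.is_empty() {
            return false;
        }
        let memtable = self.memtables.remove(0);
        self.pending.push(PendingUpload {
            region_id: self.id,
            file_path: file_path.to_string(),
            flushed_sequence: memtable.max_sequence,
            memtable_min_sequence: memtable.min_sequence,
        });
        // `memtable` is dropped here, which frees its memory.
        true
    }

    /// Phase 2: called once the upload is complete; only now is the
    /// manifest updated, using the metadata saved in phase 1.
    fn finish_upload(&mut self, file_path: &str) -> bool {
        match self.pending.iter().position(|p| p.file_path == file_path) {
            Some(pos) => {
                let p = self.pending.remove(pos);
                self.manifest.push(RegionEdit {
                    region_id: p.region_id,
                    file_to_add: p.file_path,
                    flushed_sequence: p.flushed_sequence,
                    memtable_to_remove: p.memtable_min_sequence,
                });
                true
            }
            None => false,
        }
    }
}
```

Because the memtable is gone by the time the upload finishes, everything the region edit needs has to be captured in `PendingUpload` at flush time.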

For simplicity, we can implement the sync version first, which returns after files are uploaded.
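A sync flush might be sketched as a single blocking call: write to the cache, upload, then update the manifest. `Engine`, `flush_sync`, and the `upload` callback are hypothetical names, and keeping the cached file after a failed upload is an assumption of this sketch rather than a decided behavior.

```rust
use std::collections::HashMap;

/// Hypothetical engine state: a local write cache and a manifest
/// that lists the files visible to readers.
#[derive(Default)]
struct Engine {
    write_cache: HashMap<String, Vec<u8>>,
    manifest: Vec<String>,
}

impl Engine {
    /// Sync flush: returns only after the file has been uploaded.
    /// `upload` stands in for the remote object store write.
    fn flush_sync<F>(&mut self, path: &str, data: Vec<u8>, mut upload: F) -> Result<(), String>
    where
        F: FnMut(&str, &[u8]) -> Result<(), String>,
    {
        // 1. Write the file into the local write cache.
        self.write_cache.insert(path.to_string(), data);
        // 2. Upload the cached bytes and wait for the result.
        let cached = &self.write_cache[path];
        upload(path, cached.as_slice())?;
        // 3. Update the manifest only after the upload succeeded.
        self.manifest.push(path.to_string());
        Ok(())
    }
}
```

If `upload` returns an error, the manifest is left untouched, so readers never see a file that is missing from the remote store.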

Steps

Further discussions

  • If the engine opens a region in a fresh environment without the write cache, replaying the WAL might cause OOM.
  • We can trigger a flush during replay to avoid OOM and upload the file once we have write permission.
  • We can delay uploading level 0 files, since we might compact them later.
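The replay point can be illustrated with a small simulation. This is only a sketch under simplifying assumptions: WAL entries are represented by their sizes in bytes, `replay_with_flush` and `flush_threshold` are hypothetical names, and a "flush" just resets the in-memory size (the file would go to the local write cache, with the upload deferred).

```rust
/// Replays WAL entries, given as their sizes in bytes, and flushes the
/// memtable whenever it reaches `flush_threshold`.
/// Returns (number of flushes, peak memtable size in bytes).
fn replay_with_flush(entry_sizes: &[usize], flush_threshold: usize) -> (usize, usize) {
    let mut memtable_bytes: usize = 0;
    let mut flushes: usize = 0;
    let mut peak: usize = 0;
    for &size in entry_sizes {
        memtable_bytes += size;
        peak = peak.max(memtable_bytes);
        if memtable_bytes >= flush_threshold {
            // Flush to the local write cache; the upload is deferred
            // until the engine is allowed to write to the remote store.
            flushes += 1;
            memtable_bytes = 0;
        }
    }
    (flushes, peak)
}
```

With a threshold in place, the peak memtable size is bounded by roughly the threshold plus one entry, instead of growing with the length of the WAL.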

Related Issues

This should be part of #2516.

Projects
Status: In Progress