Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat(object store): introduce object prefix for opendal object store #16542

Merged
merged 25 commits into from
Jun 17, 2024

Conversation

wcy-fdu
Copy link
Contributor

@wcy-fdu wcy-fdu commented Apr 29, 2024

I hereby agree to the terms of the RisingWave Labs, Inc. Contributor License Agreement.

What's changed and what's your intention?

This PR introduce a new system params use_new_object_prefix_strategy to control whether split object prefix. In general,

  • if it's a new cluster, the use_new_object_prefix_strategy is set to true and all opendal object store will split prefix;
  • if it's an old cluster, the use_new_object_prefix_strategy is set to false and everything keep the same with before.

Thus, we can guarantee backward compatibility.

Once an origin customer find some issue with the performance and want to split prefix for his old cluster, it's required to execute a script to completely rename the data in the object store and modify the system params in some way.

So in a word, this pr will not affect the behavior of an already exist cluster.

Checklist

  • I have written necessary rustdoc comments
  • I have added necessary unit tests and integration tests
  • I have added test labels as necessary. See details.
  • I have added fuzzing tests or opened an issue to track them. (Optional, recommended for new SQL features Sqlsmith: Sql feature generation #7934).
  • My PR contains breaking changes. (If it deprecates some features, please create a tracking issue to remove them in the future).
  • All checks passed in ./risedev check (or alias, ./risedev c)
  • My PR changes performance-critical code. (Please run macro/micro-benchmarks and show the results.)
  • My PR contains critical fixes that are necessary to be merged into the latest release. (Please check out the details)

Documentation

  • My PR needs documentation updates. (Please use the Release note section below to summarize the impact on users)

Release note

If this PR includes changes that directly affect users or other significant modifications relevant to the community, kindly draft a release note to provide a concise summary of these changes. Please prioritize highlighting the impact these changes will have on users.

@wcy-fdu wcy-fdu requested a review from Li0k April 29, 2024 10:25
@wcy-fdu wcy-fdu changed the title feat(object store): add prefix for azblob feat(object store): devide prefix for azblob object store Apr 30, 2024
@wcy-fdu wcy-fdu marked this pull request as draft May 27, 2024 02:00
@wcy-fdu wcy-fdu requested a review from hzxa21 May 28, 2024 07:14
@wcy-fdu wcy-fdu marked this pull request as ready for review May 28, 2024 07:15
@wcy-fdu
Copy link
Contributor Author

wcy-fdu commented May 28, 2024

PTAL❤️cc @hzxa21 @Li0k

@hzxa21 hzxa21 changed the title feat(object store): devide prefix for azblob object store feat(object store): introduce object prefix for opendal object store May 28, 2024
e2e_test/batch/catalog/pg_settings.slt.part Outdated Show resolved Hide resolved
src/common/src/system_param/mod.rs Outdated Show resolved Hide resolved
src/common/src/system_param/mod.rs Outdated Show resolved Hide resolved
src/jni_core/src/hummock_iterator.rs Outdated Show resolved Hide resolved
Comment on lines 233 to 259
if use_new_object_prefix_strategy {
match env.system_params_manager_impl_ref() {
SystemParamsManagerImpl::Kv(mgr) => {
mgr.set_param("use_new_object_prefix_strategy", Some("true".to_owned()))
.await
.unwrap();
}
SystemParamsManagerImpl::Sql(mgr) => {
mgr.set_param("use_new_object_prefix_strategy", Some("true".to_owned()))
.await
.unwrap();
}
};
} else {
match env.system_params_manager_impl_ref() {
SystemParamsManagerImpl::Kv(mgr) => {
mgr.set_param("use_new_object_prefix_strategy", Some("false".to_owned()))
.await
.unwrap();
}
SystemParamsManagerImpl::Sql(mgr) => {
mgr.set_param("use_new_object_prefix_strategy", Some("false".to_owned()))
.await
.unwrap();
}
};
}
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I suggest we move the logic to MetaSrvEnv::new instead here becuase:

  • cluster_first_launch can be obtained directly there without interacting with object store and modifying write_exclusive_cluster_id
  • We can avoid making use_new_object_prefix_strategy mutable and instead pass the init value for use_new_object_prefix_strategy to init system params manager.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

cluster_first_launch can be obtained directly there

For KV meta store, it is, but for SQL meta store, can we know whether it is the first launch here?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

cc @yezizp2012 Can we also judge cluster_first_launch for SQL backend, just like KV backend?
Otherwise we need to modify this system variable later, but "Making the use_new_object_prefix_strategy mutable will be dangerous because user can techinically change it via ALTER in SQL and the after a restart, the whole cluster will crash."

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

After some offline discussion, for SQL backend I use MigratorTrait::get_applied_migrations() to check whether it's a new cluster:

  • if m20230908_072257_init is applied, it's an old cluster
  • if m20230908_072257_init is not applied, it's a new cluster.

Then we can make use_new_object_prefix_strategy immutable.

src/storage/benches/bench_compactor.rs Outdated Show resolved Hide resolved
src/storage/benches/bench_multi_builder.rs Outdated Show resolved Hide resolved
src/storage/hummock_test/src/bin/replay/main.rs Outdated Show resolved Hide resolved
src/storage/src/hummock/iterator/test_utils.rs Outdated Show resolved Hide resolved
src/tests/compaction_test/src/compaction_test_runner.rs Outdated Show resolved Hide resolved
@@ -416,11 +417,25 @@ impl MetaSrvEnv {
.await?
.map(|c| c.cluster_id.to_string().into())
.unwrap();
let migrations = Migrator::get_applied_migrations(&sql_meta_store.conn).await?;
let mut cluster_first_launch = true;
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@yezizp2012 PTAL for this part, thanks!

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should check the status before calling Migrator::up. At this point all migrations has already been applied. Please add a helper function to get whether the cluster is first launched, together with some necessary comments.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should check the status before calling Migrator::up. At this point all migrations has already been applied. Please add a helper function to get whether the cluster is first launched, together with some necessary comments.

+1. If we check get_applied_migrations here, it will contain "m20230908_072257_init" for sure.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed, and since we explicitly specify that use_new_prefix_strategy should be passed in, there is no need to make any further judgment for ctl, right?

@wcy-fdu wcy-fdu requested a review from hzxa21 June 3, 2024 09:13
src/ctl/src/cmd_impl/hummock/sst_dump.rs Outdated Show resolved Hide resolved
src/ctl/src/common/hummock_service.rs Outdated Show resolved Hide resolved
@@ -416,11 +417,25 @@ impl MetaSrvEnv {
.await?
.map(|c| c.cluster_id.to_string().into())
.unwrap();
let migrations = Migrator::get_applied_migrations(&sql_meta_store.conn).await?;
let mut cluster_first_launch = true;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should check the status before calling Migrator::up. At this point all migrations has already been applied. Please add a helper function to get whether the cluster is first launched, together with some necessary comments.

+1. If we check get_applied_migrations here, it will contain "m20230908_072257_init" for sure.

@wcy-fdu wcy-fdu requested a review from hzxa21 June 4, 2024 01:30
@wcy-fdu wcy-fdu requested a review from yezizp2012 June 4, 2024 01:30
src/common/src/system_param/mod.rs Outdated Show resolved Hide resolved
src/common/src/system_param/reader.rs Outdated Show resolved Hide resolved
src/config/example.toml Outdated Show resolved Hide resolved
src/meta/src/manager/env.rs Outdated Show resolved Hide resolved
src/meta/src/manager/env.rs Outdated Show resolved Hide resolved
@wcy-fdu wcy-fdu added the ci/run-backwards-compat-tests Run backwards compatibility tests in your PR. label Jun 12, 2024
@wcy-fdu
Copy link
Contributor Author

wcy-fdu commented Jun 14, 2024

Backward compatibility tests pass.

**
image
**

src/meta/src/manager/env.rs Outdated Show resolved Hide resolved
src/meta/src/manager/env.rs Outdated Show resolved Hide resolved
src/meta/src/manager/env.rs Outdated Show resolved Hide resolved
@wcy-fdu wcy-fdu requested a review from hzxa21 June 17, 2024 07:59
Copy link
Collaborator

@hzxa21 hzxa21 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. Thanks for the PR!

Copy link

gitguardian bot commented Jun 17, 2024

⚠️ GitGuardian has uncovered 1 secret following the scan of your pull request.

Please consider investigating the findings and remediating the incidents. Failure to do so may lead to compromising the associated services or software components.

🔎 Detected hardcoded secret in your pull request
GitGuardian id GitGuardian status Secret Commit Filename
9425213 Triggered Generic Password 6de4852 ci/scripts/e2e-sink-test.sh View secret
🛠 Guidelines to remediate hardcoded secrets
  1. Understand the implications of revoking this secret by investigating where it is used in your code.
  2. Replace and store your secret safely. Learn here the best practices.
  3. Revoke and rotate this secret.
  4. If possible, rewrite git history. Rewriting git history is not a trivial act. You might completely break other contributing developers' workflow and you risk accidentally deleting legitimate data.

To avoid such incidents in the future consider


🦉 GitGuardian detects secrets in your source code to help developers and security teams secure the modern development process. You are seeing this because you or someone else with access to this repository has authorized GitGuardian to scan your pull request.

@wcy-fdu wcy-fdu enabled auto-merge June 17, 2024 09:06
@wcy-fdu wcy-fdu added this pull request to the merge queue Jun 17, 2024
Merged via the queue into main with commit 1b8a563 Jun 17, 2024
33 of 34 checks passed
@wcy-fdu wcy-fdu deleted the wcy/azblob_prefix branch June 17, 2024 09:57
@@ -408,13 +407,6 @@ pub async fn start_service_as_election_leader(
mut svc_shutdown_rx: WatchReceiver<()>,
) -> MetaResult<()> {
tracing::info!("Defining leader services");
if let MetaStoreImpl::Sql(sql_store) = &meta_store_impl {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why remove this code ?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants