feat: add sql migration script and related ci tests #16000

Merged
24 commits merged into main from feat/migration-etcd-2-sql on Apr 1, 2024

Conversation

yezizp2012
Member

@yezizp2012 yezizp2012 commented Mar 28, 2024

I hereby agree to the terms of the RisingWave Labs, Inc. Contributor License Agreement.

What's changed and what's your intention?

Example:

target/debug/risectl meta migration --etcd-endpoints localhost:2388 --sql-endpoint postgres://postgres:@localhost:5432/postgres -f

Checklist

  • I have written necessary rustdoc comments
  • I have added necessary unit tests and integration tests
  • I have added test labels as necessary. See details.
  • All checks passed in ./risedev check (or alias, ./risedev c)
  • My PR contains critical fixes that are necessary to be merged into the latest release. (Please check out the details)

Documentation

  • My PR needs documentation updates. (Please use the Release note section below to summarize the impact on users)

Release note

If this PR includes changes that directly affect users or other significant modifications relevant to the community, kindly draft a release note to provide a concise summary of these changes. Please prioritize highlighting the impact these changes will have on users.

Contributor

@shanicky shanicky left a comment

stream graph part lgtm.

Contributor

@zwang28 zwang28 left a comment

LGTM for the hummock part.

I assume the migration script must be run with all RW nodes offline, to keep data consistency?

println!("system parameters migrated");

// workers.
let workers = model::Worker::list(&meta_store).await?;
Contributor

It's fine not to migrate Worker to the SQL store, because the info is repopulated when the workers join again.

The same goes for HummockPinnedSnapshot and HummockPinnedVersion.

Member Author

Workers are migrated to avoid unnecessary migration or scaling. In particular, when compute nodes (CNs) keep joining one after another, each join may trigger another round of scaling.

}
println!("compaction task migrated");

// hummock sequence
Contributor

For ease of future maintenance, consider making the init value a global constant, because it must be equal to the one in src/meta/src/hummock/manager/sequence.rs.
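
A minimal sketch of what this suggestion could look like, assuming both the sequence logic and the migration script can import one shared constant. The names `SEQ_INIT`, `SequenceGenerator` and `migrated_sequence_start`, as well as the value `1`, are hypothetical placeholders and are not taken from src/meta/src/hummock/manager/sequence.rs.

```rust
/// Hypothetical single source of truth for the initial sequence value.
/// The value 1 is a placeholder, not the project's actual init value.
pub const SEQ_INIT: u64 = 1;

/// Hypothetical stand-in for the sequence logic in the meta service.
pub struct SequenceGenerator {
    next: u64,
}

impl SequenceGenerator {
    /// A fresh generator always starts from the shared constant.
    pub fn new() -> Self {
        Self { next: SEQ_INIT }
    }

    /// Returns the current value and advances the sequence by one.
    pub fn next(&mut self) -> u64 {
        let value = self.next;
        self.next += 1;
        value
    }
}

/// Hypothetical stand-in for the migration step: if the old store holds no
/// value for a sequence, fall back to the same shared constant.
pub fn migrated_sequence_start(stored: Option<u64>) -> u64 {
    stored.unwrap_or(SEQ_INIT)
}

fn main() {
    let mut generator = SequenceGenerator::new();
    // Both code paths agree because they read the same constant.
    assert_eq!(generator.next(), migrated_sequence_start(None));
    // A value already present in the old store is carried over unchanged.
    assert_eq!(migrated_sequence_start(Some(42)), 42);
}
```

With a shared constant, changing the init value in one place cannot silently leave the migration script out of date.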

@zwang28 zwang28 self-requested a review April 1, 2024 08:05
Member

@BugenZhao BugenZhao left a comment

Rest LGTM!

Member

It appears to be hard to cover all metadata here. 😕 For example, the system parameters.

Member Author

Indeed 😕. I haven't thought of a better way to cover all the metadata yet, so for now I can only add them case by case. I will add the system parameters as well.

risingwave_object_store = { workspace = true }
risingwave_pb = { workspace = true }
risingwave_rpc_client = { workspace = true }
risingwave_storage = { workspace = true }
risingwave_stream = { workspace = true }
sea-orm = { version = "0.12.14", features = [
Member

Can we make it a workspace dependency?

Member

May I ask how long we plan to keep both meta backends? Can we somehow force developers to update this file when they introduce new metadata during this period (otherwise it can be really dangerous for upgraders)?
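
One possible way to get such enforcement from the compiler, shown only as a sketch and not as what this PR does: list every kind of metadata in an enum and let the migration code match on it without a wildcard arm. `MetadataKind` and `migrate` are hypothetical names. The first and third messages mirror the `println!` lines visible in the diff, while the other two are made up for illustration.

```rust
/// Hypothetical registry of the metadata kinds kept in the meta store.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum MetadataKind {
    SystemParameters,
    Workers,
    CompactionTasks,
    HummockSequences,
}

/// Hypothetical migration entry point that handles one metadata kind.
pub fn migrate(kind: MetadataKind) -> &'static str {
    // There is deliberately no `_ =>` arm: adding a variant to
    // `MetadataKind` makes this match fail to compile until it is handled.
    match kind {
        MetadataKind::SystemParameters => "system parameters migrated",
        MetadataKind::Workers => "workers migrated",
        MetadataKind::CompactionTasks => "compaction task migrated",
        MetadataKind::HummockSequences => "hummock sequences migrated",
    }
}

fn main() {
    let kinds = [
        MetadataKind::SystemParameters,
        MetadataKind::Workers,
        MetadataKind::CompactionTasks,
        MetadataKind::HummockSequences,
    ];
    for kind in kinds {
        println!("{}", migrate(kind));
    }
}
```

This only helps if every new kind of metadata is registered as a variant. Metadata added without touching the enum would still slip through.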

Member Author

> May I ask how long we plan to keep both meta backends?

I think we are going to keep both of them for two major versions, until the cloud fully integrates the SQL backend.

🤔 If a developer introduces a new meta backend without migration, the cluster will fail to start due to inconsistent cluster id.

Member

> until the cloud fully integrates the SQL backend.

May I kindly ask whether there's an exact schedule for integrating the SQL meta backend into cloud in approximately 2 months? cc @arkbriar @fuyufjh Maintaining both backends will be a great pain for developers.

> without migration, the cluster will fail to start due to inconsistent cluster id

This is true. I was referring to the case where some kind of newly introduced metadata is accidentally left uncovered by the migration script.

Member Author

@yezizp2012 yezizp2012 Apr 1, 2024

> This is true. I was referring to the case where some kind of newly introduced metadata is accidentally left uncovered by the migration script.

Indeed 🥵. Alternatively, for new requirements after 1.8, the etcd backend could simply not support them and return an error suggesting that users upgrade to the SQL backend first. This way, there is no need to keep the migration script in sync.
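
A rough sketch of that idea, assuming a simple guard is called before any feature introduced after 1.8. `MetaBackend`, `UnsupportedOnEtcd`, `ensure_supported` and the feature name are all hypothetical and do not come from the RisingWave codebase.

```rust
use std::fmt;

/// Hypothetical enum of the two meta backends discussed in this thread.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum MetaBackend {
    Etcd,
    Sql,
}

/// Hypothetical error returned when a newer feature is requested on etcd.
#[derive(Debug, PartialEq, Eq)]
pub struct UnsupportedOnEtcd {
    pub feature: &'static str,
}

impl fmt::Display for UnsupportedOnEtcd {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        write!(
            f,
            "`{}` is not supported on the etcd meta backend; please upgrade to the SQL backend first",
            self.feature
        )
    }
}

impl std::error::Error for UnsupportedOnEtcd {}

/// Guard to call at the entry point of a feature added after 1.8.
pub fn ensure_supported(
    backend: MetaBackend,
    feature: &'static str,
) -> Result<(), UnsupportedOnEtcd> {
    match backend {
        MetaBackend::Sql => Ok(()),
        MetaBackend::Etcd => Err(UnsupportedOnEtcd { feature }),
    }
}

fn main() {
    // The SQL backend accepts the new feature.
    assert!(ensure_supported(MetaBackend::Sql, "some_new_feature").is_ok());
    // The etcd backend rejects it with a message pointing at the upgrade path.
    if let Err(err) = ensure_supported(MetaBackend::Etcd, "some_new_feature") {
        println!("{}", err);
    }
}
```

Since such features would never write metadata to etcd, the migration script would not need a matching entry for them.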

Member Author

@yezizp2012 yezizp2012 Apr 1, 2024

Or we can rely on backup and restore to cover this situation: take a backup from the etcd cluster, then restore it to the SQL backend. This functionality is being worked on by @zwang28. Until it is ready, we can help users upgrade through this migration script. I think this is the best way to handle it.

Contributor

> May I kindly ask whether there's an exact schedule for integrating the SQL meta backend into cloud in approximately 2 months? cc @arkbriar @fuyufjh Maintaining both backends will be a great pain for developers.

For now, no. However, it will be considered in the next quarter. I think first of all we need to support SQL backends. Migration is what should be considered afterwards.

@yezizp2012 yezizp2012 requested a review from BugenZhao April 1, 2024 09:14
@yezizp2012 yezizp2012 enabled auto-merge April 1, 2024 09:26
@yezizp2012 yezizp2012 added this pull request to the merge queue Apr 1, 2024
Merged via the queue into main with commit 9d74e10 Apr 1, 2024
29 of 30 checks passed
@yezizp2012 yezizp2012 deleted the feat/migration-etcd-2-sql branch April 1, 2024 10:14
github-merge-queue bot pushed a commit that referenced this pull request Apr 2, 2024
5 participants