Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix(storage): fix spill table id too strict assertion #17736

Merged
merged 1 commit into from
Jul 18, 2024

Conversation

wenym1
Copy link
Contributor

@wenym1 wenym1 commented Jul 18, 2024

I hereby agree to the terms of the RisingWave Labs, Inc. Contributor License Agreement.

What's changed and what's your intention?

Resolve #17734

Update a previous unit test so that the corner case can be covered

Checklist

  • I have written necessary rustdoc comments
  • I have added necessary unit tests and integration tests
  • I have added test labels as necessary. See details.
  • I have added fuzzing tests or opened an issue to track them. (Optional, recommended for new SQL features Sqlsmith: Sql feature generation #7934).
  • My PR contains breaking changes. (If it deprecates some features, please create a tracking issue to remove them in the future).
  • All checks passed in ./risedev check (or alias, ./risedev c)
  • My PR changes performance-critical code. (Please run macro/micro-benchmarks and show the results.)
  • My PR contains critical fixes that are necessary to be merged into the latest release. (Please check out the details)

Documentation

  • My PR needs documentation updates. (Please use the Release note section below to summarize the impact on users)

Release note

If this PR includes changes that directly affect users or other significant modifications relevant to the community, kindly draft a release note to provide a concise summary of these changes. Please prioritize highlighting the impact these changes will have on users.

@github-actions github-actions bot added the type/fix Bug fix label Jul 18, 2024
@wenym1 wenym1 requested review from hzxa21, Li0k, BugenZhao and kwannoel July 18, 2024 06:45
Copy link
Contributor

@Li0k Li0k left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@@ -965,7 +965,12 @@ impl UploaderData {
.map(|task_id| {
let (sst, spill_table_ids) =
self.spilled_data.remove(task_id).expect("should exist");
assert_eq!(spill_table_ids, table_ids);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What are spill_table_ids and table_ids referring to in this context?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

spill_table_ids is the involved table ids in a spill task. table_ids is the table ids to be synced.

@kwannoel
Copy link
Contributor

kwannoel commented Jul 18, 2024

Triggered the main-cron job to run. May wait and see the results first.

@kwannoel
Copy link
Contributor

Oh this is to fix join amplification. Not the integration scale test. May ignore the main-cron result.

@wenym1 wenym1 added this pull request to the merge queue Jul 18, 2024
@kwannoel kwannoel added the need-cherry-pick-release-1.10 Open a cherry-pick PR to branch release-1.10 after the current PR is merged label Jul 18, 2024
@wenym1 wenym1 removed the need-cherry-pick-release-1.10 Open a cherry-pick PR to branch release-1.10 after the current PR is merged label Jul 18, 2024
@wenym1
Copy link
Contributor Author

wenym1 commented Jul 18, 2024

No need to cherry-pick to any release. The bug is merged recently.

Merged via the queue into main with commit 3328025 Jul 18, 2024
42 of 44 checks passed
@wenym1 wenym1 deleted the yiming/fix-spill-table-check branch July 18, 2024 08:08
@PhanSon95
Copy link

I'm staying at v1.9.1. Is this fix apply to that version already ?

@wenym1
Copy link
Contributor Author

wenym1 commented Aug 15, 2024

I'm staying at v1.9.1. Is this fix apply to that version already ?

It seems that v1.9.1 does not even include the code that introduces the too strict assertion.

Did you fail this assertion in your environment?

@PhanSon95
Copy link

Exactly, I'm deploying risingwave to my k8s cluster in distributed mode, and today the compactor component crashed (image attached below)
image

@PhanSon95
Copy link

Btw, I see we already had release v1.10.0 .
The question is, is that fix applied to this release ?

@wenym1
Copy link
Contributor Author

wenym1 commented Aug 15, 2024

Btw, I see we already had release v1.10.0 . The question is, is that fix applied to this release ?

I just checked the code of v1.10.0, and the failed assertion resolved in this PR had not been included in v1.10.0 either.

In the same file there is a similar assertion assert_eq!(epoch, epoch_data.epoch);. Did you fail at this line instead? It will great if you can share the full panic stack trace.

@wenym1
Copy link
Contributor Author

wenym1 commented Aug 15, 2024

Exactly, I'm deploying risingwave to my k8s cluster in distributed mode, and today the compactor component crashed (image attached below) image

Do you mind opening a bug report issue and fill in the related information as much as possible in the issue template so that we can fix the bug with more context? Thanks!

@PhanSon95
Copy link

@wenym1 Ofcourse, let me open bug report issue like you mention.
Here is the link of issue: #18054
Please take a look, thanks

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
5 participants