fix(stream): materialize should not compact input when handling conflict #13351

Merged
merged 4 commits into from
Nov 10, 2023

Conversation

st1page
Contributor

@st1page st1page commented Nov 9, 2023

I hereby agree to the terms of the RisingWave Labs, Inc. Contributor License Agreement.

What's changed and what's your intention?

fix #13346

The bug occurs because we use a `MaterializeBuffer` to compact the input chunk. An upsert stream does not obey the associative property, so compacting it with the rules for a retractable stream is illegal. For example:

INSERT 1, DELETE 1 can be compacted into a no-op, but UPSERT 1, DELETE 1 cannot, because we cannot know whether the record to be deleted was inserted by the upsert operation.
I think we should remove the `MaterializeBuffer` structure and keep only a HashSet that collects all the keys in the StreamChunk so they can be fetched into the cache. All real operations should then be performed in the state cache.

Originally posted by @st1page in #13346 (comment)
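The argument above can be illustrated with a toy model. This is only a sketch under stated assumptions: `Op`, `State`, `apply_in_order`, `compact_as_retractable`, and `touched_keys` are hypothetical names invented for illustration, not the RisingWave types or the code in this PR. It shows that cancelling a write against a later delete changes the result when the key already exists in the table.

```rust
use std::collections::{BTreeMap, HashSet};

/// Hypothetical, simplified operations on an upsert stream keyed by an i32.
#[derive(Clone, Copy, Debug, PartialEq)]
enum Op {
    /// Write (key, value), overwriting any existing row with that key.
    Upsert(i32, i32),
    /// Remove the row with this key, if any.
    Delete(i32),
}

/// Toy stand-in for the materialized state.
type State = BTreeMap<i32, i32>;

/// Apply every operation against the state, one at a time, in input order.
fn apply_in_order(state: &mut State, chunk: &[Op]) {
    for op in chunk {
        match *op {
            Op::Upsert(k, v) => {
                state.insert(k, v);
            }
            Op::Delete(k) => {
                state.remove(&k);
            }
        }
    }
}

/// Compaction that (wrongly, for upserts) borrows the retractable-stream rule:
/// a write followed by a delete of the same key cancels out to nothing.
fn compact_as_retractable(chunk: &[Op]) -> Vec<Op> {
    let mut out: Vec<Op> = Vec::new();
    for &op in chunk {
        if let Op::Delete(k) = op {
            let pending = out
                .iter()
                .rposition(|o| matches!(o, Op::Upsert(key, _) if *key == k));
            if let Some(pos) = pending {
                // Drop both the pending write and this delete.
                out.remove(pos);
                continue;
            }
        }
        out.push(op);
    }
    out
}

/// Collect the keys a chunk touches, e.g. to prefetch them into a cache.
fn touched_keys(chunk: &[Op]) -> HashSet<i32> {
    chunk
        .iter()
        .map(|op| match *op {
            Op::Upsert(k, _) | Op::Delete(k) => k,
        })
        .collect()
}
```

With a state that already holds key 1 and the chunk `[Upsert(1, 20), Delete(1)]`, applying the operations in order leaves the state empty, while applying the compacted chunk (which is empty) leaves the stale row in place.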

Checklist

  • I have written necessary rustdoc comments
  • I have added necessary unit tests and integration tests
  • I have added fuzzing tests or opened an issue to track them. (Optional, recommended for new SQL features; see "Sqlsmith: Sql feature generation" #7934.)
  • My PR contains breaking changes. (If it deprecates some features, please create a tracking issue to remove them in the future).
  • All checks passed in ./risedev check (or alias, ./risedev c)
  • My PR changes performance-critical code. (Please run macro/micro-benchmarks and show the results.)
  • My PR contains critical fixes that are necessary to be merged into the latest release. (Please check out the details)

Documentation

  • My PR needs documentation updates. (Please use the Release note section below to summarize the impact on users)

Release note

If this PR includes changes that directly affect users or other significant modifications relevant to the community, kindly draft a release note to provide a concise summary of these changes. Please prioritize highlighting the impact these changes will have on users.

@github-actions github-actions bot added the type/fix Bug fix label Nov 9, 2023
@st1page st1page requested a review from wcy-fdu November 10, 2023 04:10
Comment on lines -1305 to +1265
- Some(OwnedRow::new(vec![Some(8_i32.into()), Some(3_i32.into())]))
+ Some(OwnedRow::new(vec![Some(8_i32.into()), Some(2_i32.into())]))
Contributor Author

U- 8 1
U+ 8 2
+  8 3

The conflict behavior is ignore. The question is whether we need to treat the update as a single operation. The U- 8 1 tries to delete a non-existent record, so it is ignored. But should the following U+ 8 2 be ignored too?
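To make the question concrete, here is a minimal sketch of one possible reading, in which each row of the chunk is handled independently. The names `RowOp` and `apply_ignore` are hypothetical, and the rules (an insert on an existing key is dropped, a delete is keyed only by primary key and is a no-op on a missing key) are assumptions for illustration rather than the executor's actual logic.

```rust
use std::collections::BTreeMap;

/// Hypothetical row operations, each carrying (primary key, value).
#[derive(Clone, Copy, Debug)]
enum RowOp {
    Insert(i32, i32),
    Delete(i32, i32),
    UpdateDelete(i32, i32),
    UpdateInsert(i32, i32),
}

/// Apply a chunk with an "ignore" conflict policy, treating the two halves
/// of an update as independent rows (assumption for this sketch).
fn apply_ignore(state: &mut BTreeMap<i32, i32>, chunk: &[RowOp]) {
    for op in chunk {
        match *op {
            RowOp::Insert(k, v) | RowOp::UpdateInsert(k, v) => {
                // If the key already exists this is a conflict: keep the old row.
                state.entry(k).or_insert(v);
            }
            RowOp::Delete(k, _) | RowOp::UpdateDelete(k, _) => {
                // Deleting a key that is not present is silently ignored.
                state.remove(&k);
            }
        }
    }
}
```

Under these assumptions, feeding `U- 8 1`, `U+ 8 2`, `+ 8 3` into an empty state leaves key 8 mapped to 2: the `U-` finds nothing to delete, the `U+` is applied as a plain insert, and the final insert is dropped as a conflict. Treating the update as a single unit would instead drop the `U+` as well and leave 3.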

Comment on lines -1184 to +1144
- Some(OwnedRow::new(vec![Some(1_i32.into()), Some(4_i32.into())]))
+ Some(OwnedRow::new(vec![Some(1_i32.into()), Some(3_i32.into())]))
Contributor Author

            " i i
            + 1 3
            + 1 4

I think it should be 3 for the ignore behavior?
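The expectation here is first-write-wins. A minimal sketch, using a hypothetical helper `insert_ignoring_conflicts` over plain (key, value) pairs rather than the real test harness:

```rust
use std::collections::BTreeMap;

/// Insert rows in order; a row whose key is already present is ignored.
fn insert_ignoring_conflicts(rows: &[(i32, i32)]) -> BTreeMap<i32, i32> {
    let mut table = BTreeMap::new();
    for &(key, value) in rows {
        // `or_insert` only writes when the key is vacant.
        table.entry(key).or_insert(value);
    }
    table
}
```

Calling it with `[(1, 3), (1, 4)]` keeps the value 3 for key 1, since the second insert conflicts with the first and is dropped.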

codecov bot commented Nov 10, 2023

Codecov Report

Merging #13351 (b237a0c) into main (492076f) will decrease coverage by 0.07%.
Report is 3 commits behind head on main.
The diff coverage is 93.23%.

@@            Coverage Diff             @@
##             main   #13351      +/-   ##
==========================================
- Coverage   67.81%   67.74%   -0.07%     
==========================================
  Files        1524     1524              
  Lines      259436   259469      +33     
==========================================
- Hits       175934   175787     -147     
- Misses      83502    83682     +180     
Flag Coverage Δ
rust 67.74% <93.23%> (-0.07%) ⬇️

Flags with carried forward coverage won't be shown. Click here to find out more.

Files Coverage Δ
src/stream/src/executor/mview/materialize.rs 90.04% <93.23%> (+0.12%) ⬆️

... and 21 files with indirect coverage changes

Member

@BugenZhao BugenZhao left a comment

Would you mind writing some tests?

Contributor

@wcy-fdu wcy-fdu left a comment

LGTM! Thanks for the quick fix~

Member

Can we leverage the snapshot test here?

/// Similar to [`check_until_pending`], but use a DSL test script as input.
///
/// For each input event, it drives the executor until it is pending.
pub async fn check_with_script<F, Fut>(

Contributor Author

Yes, I will take a look later

@st1page st1page added this pull request to the merge queue Nov 10, 2023
Merged via the queue into main with commit 0b9cb1f Nov 10, 2023
6 of 7 checks passed
@st1page st1page deleted the sts/fix_materialize_should_not_compact_input branch November 10, 2023 09:03
Development

Successfully merging this pull request may close these issues.

Materialize executor unexpectedly compact the UPDATE and DELETE operations from CDC
3 participants