Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

test: snapshot testing for stream executors #9787

Merged
merged 25 commits into from
May 19, 2023
Merged

test: snapshot testing for stream executors #9787

merged 25 commits into from
May 19, 2023

Conversation

xxchan
Copy link
Member

@xxchan xxchan commented May 14, 2023

I hereby agree to the terms of the RisingWave Labs, Inc. Contributor License Agreement.

What's changed and what's your intention?

Introduce snapshot testing to make writing executor tests easier. See the doc in src/stream/tests/it/snapshot.rs for more details.

Checklist For Contributors

  • I have written necessary rustdoc comments
  • I have added necessary unit tests and integration tests
  • [ ] I have added fuzzing tests or opened an issue to track them. (Optional, recommended for new SQL features Sqlsmith: Sql feature generation #7934).
  • I have demonstrated that backward compatibility is not broken by breaking changes and created issues to track deprecated features to be removed in the future. (Please refer to the issue)
  • All checks passed in ./risedev check (or alias, ./risedev c)

Checklist For Reviewers

  • I have requested macro/micro-benchmarks as this PR can affect performance substantially, and the results are shown.

Documentation

  • My PR DOES NOT contain user-facing changes.
Click here for Documentation

Types of user-facing changes

Please keep the types that apply to your changes, and remove the others.

  • Installation and deployment
  • Connector (sources & sinks)
  • SQL commands, functions, and operators
  • RisingWave cluster configuration changes
  • Other (please specify in the release note below)

Release note

@github-actions github-actions bot added the component/test Test related issue. label May 14, 2023
@xxchan xxchan requested review from stdrc, TennyZhuang and BugenZhao May 14, 2023 16:50
@xxchan
Copy link
Member Author

xxchan commented May 14, 2023

PTAL and tell whether you like this idea.

@xxchan xxchan requested a review from st1page May 14, 2023 16:57
@st1page
Copy link
Contributor

st1page commented May 14, 2023

how about splitting src/stream/src/ and src/stream/src/test? In other words, the test in this PR will be in src/stream/src/test/executor/over_window. This arrangement of test may not be able to access some private field, but currently most of our test does not acquire access them at all.

Btw, I remember we have mentioned that we can move the unit test into a separate because our streaming executor's tests do not need access private field and methods. And after that we can depend on more utils such as frontend crate. But I can not find that comments or issue.

@stdrc
Copy link
Member

stdrc commented May 15, 2023

Btw, I remember we have mentioned that we can move the unit test into a separate because our streaming executor's tests do not need access private field and methods. And after that we can depend on more utils such as frontend crate. But I can not find that comments or issue.

Not sure whether it's this comment: #7881 (comment)

Copy link
Member

@BugenZhao BugenZhao left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure whether there'll be many executors that can benefit from this. 🤔

Actually, executor unit tests are not that "snapshot testing" to me. 🤔 The output is supposed to be manually derived by the developer and can be rarely changed. The ideas of simplifying these tests with DSL look good to me though.

src/stream/src/executor/over_window/eowc.rs Outdated Show resolved Hide resolved
@xxchan
Copy link
Member Author

xxchan commented May 15, 2023

I'm not sure whether there'll be many executors that can benefit from this.

I also have similar concerns so created a demo to gain feedbacks :p

I guess most executor tests are simple enough to be tested in this way. The only exceptions in my mind are:

  • binary executors (i.e., JOIN)
  • "control" executors

Some tests are directly manipulating the internal states / require more complicated control flows

Yes, so we don't use snapshot testing for such more fine-grained tests.

Actually, executor unit tests are not that "snapshot testing" to me. 🤔 The output is supposed to be manually derived by the developer and can be rarely changed. The ideas of simplifying these tests with DSL look good to me though.

The real motivation (as discussed with @TennyZhuang) is that manually write these tests are too tiring and the test coverage is low!!! 😄

  • Mainly for adding new tests, instead of replacing current tests
  • Generating outputs is good, and avoiding boilerplates is good.

So we may hope to find a way to write the simple test pattern (overlapping input/output events) easier. The granularity is between e2e and complete unit tests. It's kind of 1-executor slt where we manipulate messages directly (I'm not sure whether it's possible and if so, whether it's easier to use slt for these tests).

@xxchan xxchan requested a review from wangrunji0408 May 15, 2023 13:17
@xxchan xxchan force-pushed the xxchan/snapshot branch from 4b304ef to 74ee568 Compare May 15, 2023 15:21
@wangrunji0408
Copy link
Contributor

I have thought about such test script and it looks pretty cool!
I'm just curious why it is called "snapshot" testing, instead of "executor" testing?

@stdrc
Copy link
Member

stdrc commented May 16, 2023

I'm just curious why it is called "snapshot" testing, instead of "executor" testing?

I agree with the name, it's comparing the output snapshot of a correct (committed) version of executor with latest one.


The idea looks great to me! Follow the same idea, can we also make e2e tests snapshot testing (automatically generate output)?

@xxchan
Copy link
Member Author

xxchan commented May 16, 2023

I'm just curious why it is called "snapshot" testing, instead of "executor" testing?

I agree with the name, it's comparing the output snapshot of a correct (committed) version of executor with latest one.

Yes, that's true. To my understanding, "snapshot testing" is basically "file-based tests". The main idea is just commit the entire output!.

It's also called "golden tests" (and the files are called golden files). Here's a blog introducing it https://www.cs.cornell.edu/~asampson/blog/turnt.html

e2e tests snapshot testing (automatically generate output)?

I also have the same question but not sure. Sqllogictest and planner test both support auto completion, so to me they are already kind of "snapshot testing". The only difference is that they put input/output in the same file. Actually that's desired, but we can still do so (contain input in the generated output) for snapshots, which is already done in this PR.

Technically I think that's doable (and not hard). The only reason not to do it I can come up with is that current situation is good enough. (I think for sqllogictest that's true, but for planner tests we have some pain points and had some discussions about it #8557)

Copy link
Contributor

@soundOfDestiny soundOfDestiny left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. thanks!

Copy link
Contributor

@kwannoel kwannoel left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this will be very useful to improve test coverage. More importantly it can help us locate bug sources at the executor level. Currently I rely on e2e sql to test executor, but that is not ergonomic to use when we want test something specific for the executor.

Generally LGTM, further improvements can be made in subsequent PRs.

@TennyZhuang
Copy link
Contributor

It's exactly what I want, LGTM.

We can replace the current tests step by step.

@xxchan xxchan marked this pull request as ready for review May 17, 2023 08:19
@codecov
Copy link

codecov bot commented May 17, 2023

Codecov Report

Merging #9787 (9c6cd87) into main (539b061) will decrease coverage by 0.04%.
The diff coverage is 75.00%.

@@            Coverage Diff             @@
##             main    #9787      +/-   ##
==========================================
- Coverage   71.14%   71.11%   -0.04%     
==========================================
  Files        1250     1250              
  Lines      209398   209198     -200     
==========================================
- Hits       148970   148761     -209     
- Misses      60428    60437       +9     
Flag Coverage Δ
rust 71.11% <75.00%> (-0.04%) ⬇️

Flags with carried forward coverage won't be shown. Click here to find out more.

Impacted Files Coverage Δ
src/stream/src/executor/mod.rs 50.98% <ø> (ø)
src/stream/src/executor/over_window/eowc.rs 92.21% <ø> (-2.66%) ⬇️
src/stream/src/executor/project_set.rs 68.88% <ø> (-17.52%) ⬇️
src/stream/src/executor/test_utils.rs 91.27% <ø> (ø)
src/common/src/array/stream_chunk.rs 85.61% <75.00%> (-0.15%) ⬇️

... and 8 files with indirect coverage changes

📣 We’re building smart automated test selection to slash your CI/CD build times. Learn more

@risingwavelabs risingwavelabs deleted a comment from github-actions bot May 17, 2023
@stdrc
Copy link
Member

stdrc commented May 17, 2023

This inspired me to come up with another idea: Just between "fully DSL" and "fully hand-written", a combination of both might be interesting: we "run_until_pending" and then print the output in the DSL form (which can be automatically generated).

The idea is interesting! But what if later someday we want to add mock state store that is not in-memory? Then the output is not available at the first pending.

@BugenZhao
Copy link
Member

calling next().await.unwrap().unwrap() repeatedly is mostly just noise for reviewers and annoying for writers.

IIRC, we have ExecutorTestExt for simplifying this. 🥰

@xxchan
Copy link
Member Author

xxchan commented May 18, 2023

calling next().await.unwrap().unwrap() repeatedly is mostly just noise for reviewers and annoying for writers.

IIRC, we have ExecutorTestExt for simplifying this. 🥰

I know that. But what I like most is replacing multiple expect into one large one. e.g., take a look at the changes of TopN. How do you think?

@xxchan
Copy link
Member Author

xxchan commented May 18, 2023

My final decisions:

  • all-in expect_test, because the workflow is simpler and it's good enough.
  • use inline output, because it might be better to review.
  • keep both of the styles (code input and DSL script input), because I can't tell which is better.
  • put the new style only in /tests/it to force you write new tests there... (Consider using integration test instead of unit test #9878)

Not decided:

  • check_until_pending or check_n_steps

This PR is ready to merge. Final comments welcomed.

Copy link
Member

@stdrc stdrc left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So is expect![[]] auto-updated on risedev test? What if developer forgets to run risedev test locally, will CI check that?

Comment on lines 92 to 116
check_with_script(
|| create_executor(calls.clone(), store.clone()),
r###"
- !barrier 1
- !chunk |2
I T I i
+ 1 p1 100 10
+ 1 p1 101 16
+ 4 p2 200 20
- !chunk |2
I T I i
+ 5 p1 102 18
+ 7 p2 201 22
+ 8 p3 300 33
# NOTE: no watermark message here, since watermark(1) was already received
- !barrier 2
- recovery
- !barrier 3
- !chunk |2
I T I i
+ 10 p1 103 13
+ 12 p2 202 28
+ 13 p3 301 39
- !barrier 4
"###,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks a little bit ugly. Can we add indent here?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we add indent here?

Yes

This looks a little bit ugly.

I don't know. In theory this is a yaml, so somebody may argue that no indent make more sense. I have no preference 😇

src/stream/tests/it/main.rs Outdated Show resolved Hide resolved
@xxchan
Copy link
Member Author

xxchan commented May 19, 2023

So is expect![[]] auto-updated on risedev test? What if developer forgets to run risedev test locally, will CI check that?

It's only updated if the envvar is set. It's just like a normal assert_eq otherwise (with diff in the error message). And there will also be a hint in the error message. See https://docs.rs/expect-test/latest/expect_test/ for more info

image

@xxchan xxchan enabled auto-merge May 19, 2023 07:29
@xxchan xxchan added this pull request to the merge queue May 19, 2023
Merged via the queue into main with commit 09d1849 May 19, 2023
@xxchan xxchan deleted the xxchan/snapshot branch May 19, 2023 07:56
@stdrc
Copy link
Member

stdrc commented May 19, 2023

So is expect![[]] auto-updated on risedev test? What if developer forgets to run risedev test locally, will CI check that?

It's only updated if the envvar is set. It's just like a normal assert_eq otherwise (with diff in the error message). And there will also be a hint in the error message. See docs.rs/expect-test/latest/expect_test for more info

Will be nice to have sth like risedev update-integration-tests😁

@xxchan
Copy link
Member Author

xxchan commented May 19, 2023

Will be nice to have sth like risedev update-integration-tests😁

I don't want to add a script for such simple task and feel env var is good enough. But I don't object to adding it neither. 🤪

@xxchan xxchan changed the title test: snapshot testing for unary stream executors test: snapshot testing for stream executors May 19, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
component/test Test related issue.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

8 participants