test: add data-driven Avro decode integration tests #17434

Merged 5 commits into main from 06-20-test_add_codec_integration_tests on Jul 5, 2024

Conversation

xxchan
Member

@xxchan xxchan commented Jun 25, 2024

I hereby agree to the terms of the RisingWave Labs, Inc. Contributor License Agreement.

What's changed and what's your intention?

Background:

This PR proposes to improve the tests with 2 ideas:

  1. Do unit/integration test for encodings.

    Logically, these features have a clear boundary, so there's no reason to start a cluster and write verbose e2e tests for them, except that we don't have a clear boundary in code. But we are improving the code organization in #17002 (refactor: split source parser into separate crate), so the test boundary can be improved now as well.

  2. Data-driven tests.

    Like the planner tests and the execution integration tests, we just write the input data (instead of using code to generate it). The expected result can be generated and is displayed in a human-readable format.
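To illustrate the data-driven idea, here is a minimal sketch using only the standard library. Everything in it is hypothetical: `Value`, `decode_line`, and `render_snapshot` stand in for a real decoder and are not part of this PR, and the input is a toy `key=value` format rather than Avro. The point is only the shape: the test is plain input data, and the decoded result is rendered as readable text that can be compared against (or regenerated into) an expected snapshot.

```rust
use std::fmt::Write;

/// Hypothetical decoded value; a real test would use the connector's own types.
#[derive(Debug, PartialEq)]
enum Value {
    Int(i64),
    Text(String),
}

/// Hypothetical decoder: parses one `key=value` line into a typed value.
fn decode_line(line: &str) -> Result<(String, Value), String> {
    let (key, raw) = line
        .split_once('=')
        .ok_or_else(|| format!("missing '=' in {line:?}"))?;
    let value = match raw.trim().parse::<i64>() {
        Ok(n) => Value::Int(n),
        Err(_) => Value::Text(raw.trim().to_string()),
    };
    Ok((key.trim().to_string(), value))
}

/// Renders the decode result of every non-empty input line into one
/// human-readable snapshot string, including errors.
fn render_snapshot(input: &str) -> String {
    let mut out = String::new();
    for line in input.lines().filter(|l| !l.trim().is_empty()) {
        match decode_line(line) {
            Ok((key, value)) => writeln!(out, "{key}: {value:?}").unwrap(),
            Err(err) => writeln!(out, "error: {err}").unwrap(),
        }
    }
    out
}
```

A test case is then just an input string plus the expected snapshot, for example comparing `render_snapshot("id=42\nname=alice\n")` with the stored text. The sketch uses a plain string comparison; regenerating the stored text automatically is what the expect-style tooling mentioned in the description provides.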

Benefits:

  • Faster iteration loop: faster compilation and fewer steps to run tests. Just run UPDATE_EXPECT=1 cargo test -p risingwave_connector_codec, or use Rust Analyzer to Run test + expect.
  • Tests are easier to write, which should lead to higher coverage.

Limitations:

  • The boundary still isn't clear enough, so we need to re-implement some logic in the tests. For more, see my comments in the code.
  • The Avro Rust library is poor, so the "data-driven" part isn't as good as it could be.

Checklist

  • I have written necessary rustdoc comments
  • I have added necessary unit tests and integration tests
  • I have added test labels as necessary. See details.
  • I have added fuzzing tests or opened an issue to track them. (Optional, recommended for new SQL features Sqlsmith: Sql feature generation #7934).
  • My PR contains breaking changes. (If it deprecates some features, please create a tracking issue to remove them in the future).
  • All checks passed in ./risedev check (or alias, ./risedev c)
  • My PR changes performance-critical code. (Please run macro/micro-benchmarks and show the results.)
  • My PR contains critical fixes that are necessary to be merged into the latest release. (Please check out the details)

Documentation

  • My PR needs documentation updates. (Please use the Release note section below to summarize the impact on users)

Release note

If this PR includes changes that directly affect users or other significant modifications relevant to the community, kindly draft a release note to provide a concise summary of these changes. Please prioritize highlighting the impact these changes will have on users.


@github-actions github-actions bot added the component/test Test related issue. label Jun 25, 2024
@xxchan xxchan changed the title test: add codec integration tests test: add Avro codec integration tests Jun 25, 2024
@xxchan xxchan marked this pull request as ready for review June 25, 2024 05:57
@xxchan xxchan requested a review from a team as a code owner June 25, 2024 05:57
@xxchan xxchan force-pushed the 06-20-test_add_codec_integration_tests branch from f2caca4 to 5f37c9b Compare June 25, 2024 06:01
@xxchan xxchan changed the title test: add Avro codec integration tests test: add data-driven Avro decode integration tests Jun 25, 2024
@xxchan xxchan force-pushed the 06-20-test_add_codec_integration_tests branch from 5f37c9b to 291f561 Compare June 25, 2024 06:35
let record = build_avro_data(conf.schema.original_schema.as_ref());
writer.append(record).unwrap();
let encoded = writer.into_inner().unwrap();
println!("path = {:?}", e2e_file_path("avro_simple_schema_bin.1"));
Member Author

There are still some usages of this data file that need to be cleaned up.

@xxchan xxchan mentioned this pull request Jun 27, 2024
12 tasks
@xxchan xxchan force-pushed the 06-20-test_add_codec_integration_tests branch 3 times, most recently from 9f3b8fc to 3222903 Compare July 2, 2024 15:24
@fuyufjh fuyufjh self-requested a review July 3, 2024 03:01
Member

@fuyufjh fuyufjh left a comment

LGTM, but I didn't really get the motivation. Perhaps @xiangjinwu knows better 😄

Comment on lines -455 to -456
async fn test_avro_union_type() {
let parser = new_avro_parser_from_local("union-schema.avsc")
Member

Why is this case not in the new tests?

Member Author

You are right. It seems to have been missed.

@xxchan xxchan force-pushed the 06-20-test_add_codec_integration_tests branch from 3222903 to 2351718 Compare July 4, 2024 07:00
Contributor

@xiangjinwu xiangjinwu left a comment

LGTM!

Comment on lines 31 to 33
if !f.alternate() || s.len() == 1 {
let (name, ty) = s.iter().next().unwrap();
return write!(f, "Struct {{ {}: {:?} }}", name, &DataTypeTestDisplay(ty));
Contributor

@xiangjinwu xiangjinwu Jul 5, 2024

Why do we only display the first field of a struct when !f.alternate()?

Member Author

😄
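For readers unfamiliar with the flag in question: `f.alternate()` is true when the value is formatted with `{:#?}` and false with `{:?}`. The quoted condition returns after writing only the first field whenever the alternate flag is not set, which is what the reviewer is pointing at. Below is a sketch of one possible shape that prints every field in both modes. It is an assumption for illustration, not the code merged in this PR: `StructDisplay` and its field representation are hypothetical stand-ins for `DataTypeTestDisplay`.

```rust
use std::fmt;

/// Hypothetical stand-in for a struct type: ordered (field name, type name) pairs.
struct StructDisplay<'a>(&'a [(&'a str, &'a str)]);

impl fmt::Debug for StructDisplay<'_> {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        if f.alternate() {
            // `{:#?}`: pretty form, one field per line.
            writeln!(f, "Struct {{")?;
            for (name, ty) in self.0 {
                writeln!(f, "    {name}: {ty},")?;
            }
            write!(f, "}}")
        } else {
            // `{:?}`: compact form, all fields on a single line.
            write!(f, "Struct {{ ")?;
            for (i, (name, ty)) in self.0.iter().enumerate() {
                if i > 0 {
                    write!(f, ", ")?;
                }
                write!(f, "{name}: {ty}")?;
            }
            write!(f, " }}")
        }
    }
}
```

With two fields, `format!("{:?}", ...)` yields a one-line `Struct { ... }` containing both fields, while `format!("{:#?}", ...)` spreads them over multiple lines.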

@xxchan xxchan force-pushed the 06-20-test_add_codec_integration_tests branch from 2351718 to de77653 Compare July 5, 2024 11:39
@xxchan xxchan enabled auto-merge July 5, 2024 11:41
@xxchan xxchan added this pull request to the merge queue Jul 5, 2024
Merged via the queue into main with commit f0fa34c Jul 5, 2024
30 of 31 checks passed