fix(source): resolve avro Ref during avro_schema_to_column_descs without hack #19601

Merged
3 commits merged into main on Nov 28, 2024

xiangjinwu
Contributor

I hereby agree to the terms of the RisingWave Labs, Inc. Contributor License Agreement.

What's changed and what's your intention?

Part of #17020. Avro Ref used to be supported through a hack in our apache_avro fork: each Ref was replaced in place with the schema it refers to, which produced an invalid tree containing duplicate definitions. This PR avoids that hack. The hack also breaks when a Ref appears inside another Ref, leaving either an unresolved Ref or infinite recursion.

This PR corrects only part of the usage: avro_schema_to_column_descs, which derives RisingWave column data types from the Avro schema. There will be a follow-up to correct the usage in convert_to_datum/AvroAccess. Without the latter part, simple data types like int (inside a Ref in another Ref) can already be supported.

Instead of building the expanded-yet-invalid tree as the hack does, this solution passes along a NamesRef obtained from the root schema. The added complexity of its associated lifetime is easy to handle in this context. (By contrast, prost_reflect for protobuf builds the tree with Arc, so there is no lifetime and no duplication. But apache_avro::Schema does not use Arc.)
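To illustrate the idea of resolving Ref through a borrowed name table rather than expanding the tree in place, here is a minimal sketch. It does not use apache_avro: `Schema`, `Names`, `DataType`, `collect_names`, `to_data_type`, and `nested_ref_example` are all hypothetical, simplified stand-ins, and the actual code in this PR may be structured differently.

```rust
use std::collections::HashMap;

/// Hypothetical, simplified stand-in for an Avro schema tree (not `apache_avro::Schema`).
#[derive(Debug)]
enum Schema {
    Int,
    Record { name: String, fields: Vec<(String, Schema)> },
    Ref { name: String },
}

/// Hypothetical name table borrowed from the root: named type -> its single definition.
type Names<'a> = HashMap<&'a str, &'a Schema>;

/// Walk the root once and remember where each named type is defined.
fn collect_names<'a>(schema: &'a Schema, names: &mut Names<'a>) {
    if let Schema::Record { name, fields } = schema {
        names.insert(name.as_str(), schema);
        for (_, field) in fields {
            collect_names(field, names);
        }
    }
}

/// Hypothetical, simplified target column type.
#[derive(Debug, PartialEq)]
enum DataType {
    Int32,
    Struct(Vec<(String, DataType)>),
}

/// Derive a data type, following each `Ref` through the name table.
/// The schema tree itself is never modified, so no definition is duplicated.
/// Note: this sketch does not guard against circular references.
fn to_data_type(schema: &Schema, names: &Names<'_>) -> Result<DataType, String> {
    match schema {
        Schema::Int => Ok(DataType::Int32),
        Schema::Record { fields, .. } => {
            let mut out = Vec::with_capacity(fields.len());
            for (field_name, field_schema) in fields {
                out.push((field_name.clone(), to_data_type(field_schema, names)?));
            }
            Ok(DataType::Struct(out))
        }
        Schema::Ref { name } => {
            let target = names
                .get(name.as_str())
                .ok_or_else(|| format!("unresolved Ref: {name}"))?;
            to_data_type(target, names)
        }
    }
}

/// A `Ref` to `Line`, whose definition itself contains a `Ref` to `Point`.
fn nested_ref_example() -> Schema {
    let int_field = |n: &str| (n.to_string(), Schema::Int);
    let point = Schema::Record {
        name: "Point".to_string(),
        fields: vec![int_field("x"), int_field("y")],
    };
    let line = Schema::Record {
        name: "Line".to_string(),
        fields: vec![
            ("start".to_string(), point),
            ("end".to_string(), Schema::Ref { name: "Point".to_string() }),
        ],
    };
    Schema::Record {
        name: "Shape".to_string(),
        fields: vec![
            ("a".to_string(), line),
            ("b".to_string(), Schema::Ref { name: "Line".to_string() }),
        ],
    }
}
```

In this sketch the only cost of the approach is the `'a` lifetime on the name table, which ties it to the root schema it was collected from. The Ref-inside-Ref case (`b` refers to `Line`, which refers to `Point`) resolves by ordinary recursion.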

The circular reference rejection logic is the same as in #10499 for protobuf. It can be deduplicated (DRY), with an improved error message, later.
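One common way to reject circular references is to keep a trace of the named types currently being expanded and fail when a name reappears on that trace. The sketch below shows that technique on a hypothetical, simplified `Schema` type. It is an assumption about the general shape of such a check, not a copy of the logic in #10499 or in this PR, and `check_acyclic` and `reject_circular` are made-up names.

```rust
use std::collections::HashMap;

/// Hypothetical, simplified stand-in for an Avro schema tree.
enum Schema {
    Int,
    Record { name: String, fields: Vec<(String, Schema)> },
    Ref { name: String },
}

/// Record where each named type is defined in the root tree.
fn collect_names<'a>(schema: &'a Schema, names: &mut HashMap<&'a str, &'a Schema>) {
    if let Schema::Record { name, fields } = schema {
        names.insert(name.as_str(), schema);
        for (_, field) in fields {
            collect_names(field, names);
        }
    }
}

/// `path` holds the names of the records currently being expanded.
/// Seeing a name that is already on the path means the type contains itself.
fn check_acyclic<'a>(
    schema: &'a Schema,
    names: &HashMap<&'a str, &'a Schema>,
    path: &mut Vec<&'a str>,
) -> Result<(), String> {
    match schema {
        Schema::Int => Ok(()),
        Schema::Record { name, fields } => {
            if path.contains(&name.as_str()) {
                return Err(format!(
                    "circular reference: {} -> {}",
                    path.join(" -> "),
                    name
                ));
            }
            path.push(name.as_str());
            for (_, field) in fields {
                check_acyclic(field, names, path)?;
            }
            // Leaving this record: a later sibling may refer to it again safely.
            path.pop();
            Ok(())
        }
        Schema::Ref { name } => {
            let target = names
                .get(name.as_str())
                .copied()
                .ok_or_else(|| format!("unresolved Ref: {name}"))?;
            check_acyclic(target, names, path)
        }
    }
}

/// Convenience wrapper: build the name table from the root, then check it.
fn reject_circular(root: &Schema) -> Result<(), String> {
    let mut names = HashMap::new();
    collect_names(root, &mut names);
    check_acyclic(root, &names, &mut Vec::new())
}
```

Because a name is popped from the trace once its record has been fully expanded, referring to the same type twice from sibling fields is accepted. Only a type that reaches itself while still being expanded is rejected.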

This is intended to be part of v2.2 and will NOT be cherry-picked into earlier versions. A further hack exists for earlier versions and can be cherry-picked on demand. With that hack, the user is responsible for avoiding circular references, which would otherwise cause infinite recursion.

Checklist

  • I have written necessary rustdoc comments
  • I have added necessary unit tests and integration tests
  • I have added test labels as necessary. See details.
  • I have added fuzzing tests or opened an issue to track them. (Optional, recommended for new SQL features Sqlsmith: Sql feature generation #7934).
  • My PR contains breaking changes. (If it deprecates some features, please create a tracking issue to remove them in the future).
  • All checks passed in ./risedev check (or alias, ./risedev c)
  • My PR changes performance-critical code. (Please run macro/micro-benchmarks and show the results.)
  • My PR contains critical fixes that are necessary to be merged into the latest release. (Please check out the details)

Documentation

  • My PR needs documentation updates. (Please use the Release note section below to summarize the impact on users)

Release note

If this PR includes changes that directly affect users or other significant modifications relevant to the community, kindly draft a release note to provide a concise summary of these changes. Please prioritize highlighting the impact these changes will have on users.

@graphite-app graphite-app bot requested a review from a team November 28, 2024 03:09
@github-actions github-actions bot added the type/fix Bug fix label Nov 28, 2024
Contributor

@chenzl25 chenzl25 left a comment

rubber stamp

Member

@xxchan xxchan left a comment

LGTM

@xiangjinwu
Contributor Author

For context, this is the error reported for the added test case without this PR:

ERROR:  Failed to run the query

Caused by these errors (recent errors listed first):
  1: connector error
  2: Feature is not yet implemented: Avro type: Ref { name: Name { name: "Point", namespace: None } }
No tracking issue yet. Feel free to submit a feature request at https://github.com/risingwavelabs/risingwave/issues/new?labels=type%2Ffeature&template=feature_request.yml

There will be a follow-up to correct the usage in convert_to_datum/AvroAccess. Without the latter part, simple data types like int (inside a Ref in another Ref) can already be supported.

Both parts will be included in v2.2. In this intermediate state, a decimal inside a Ref nested in another Ref results in a parsing error and is filled with null, whereas without this first part the creation would be rejected outright.

@xiangjinwu xiangjinwu enabled auto-merge November 28, 2024 10:01
@xiangjinwu xiangjinwu added this pull request to the merge queue Nov 28, 2024
Merged via the queue into main with commit c00fe35 Nov 28, 2024
28 of 29 checks passed
@xiangjinwu xiangjinwu deleted the fix-source-avro-ref-resolve-r1 branch November 28, 2024 11:22
Labels
type/fix Bug fix

3 participants