Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat(cdc): allow decimal to rw_int256 and varchar #16346

Merged
merged 9 commits into from
Apr 19, 2024

Conversation

KeXiangWang
Copy link
Contributor

@KeXiangWang KeXiangWang commented Apr 16, 2024

I hereby agree to the terms of the RisingWave Labs, Inc. Contributor License Agreement.

What's changed and what's your intention?

Resolve #15977. With this PR, numeric in PG can be converted into rw_int256 or varchar implicitly through CDC.
The only thing user need to do is when create table, mark the data type of the column corresponding to numeric as rw_int256 or varchar.

One thing worth mentioning is that the numerics with decimal parts will be considered as NULL if the column is to be converted to rw_int256.

Currently, it's only for postgres. We can add support for mysql later if requested.

Checklist

  • I have written necessary rustdoc comments
  • I have added necessary unit tests and integration tests
  • I have added test labels as necessary. See details.
  • I have added fuzzing tests or opened an issue to track them. (Optional, recommended for new SQL features Sqlsmith: Sql feature generation #7934).
  • My PR contains breaking changes. (If it deprecates some features, please create a tracking issue to remove them in the future).
  • All checks passed in ./risedev check (or alias, ./risedev c)
  • My PR changes performance-critical code. (Please run macro/micro-benchmarks and show the results.)
  • My PR contains critical fixes that are necessary to be merged into the latest release. (Please check out the details)

Documentation

  • My PR needs documentation updates. (Please use the Release note section below to summarize the impact on users)

Release note

If this PR includes changes that directly affect users or other significant modifications relevant to the community, kindly draft a release note to provide a concise summary of these changes. Please prioritize highlighting the impact these changes will have on users.

@KeXiangWang KeXiangWang requested a review from a team as a code owner April 16, 2024 14:44
Copy link

gitguardian bot commented Apr 16, 2024

️✅ There are no secrets present in this pull request anymore.

If these secrets were true positive and are still valid, we highly recommend you to revoke them.
Once a secret has been leaked into a git repository, you should consider it compromised, even if it was deleted immediately.
Find here more information about risks.


🦉 GitGuardian detects secrets in your source code to help developers and security teams secure the modern development process. You are seeing this because you or someone else with access to this repository has authorized GitGuardian to scan your pull request.

Our GitHub checks need improvements? Share your feedbacks!

@KeXiangWang KeXiangWang marked this pull request as draft April 16, 2024 14:44
@KeXiangWang KeXiangWang force-pushed the wkx/cdc-source-int256 branch from c7d4944 to ccc2589 Compare April 16, 2024 14:59
@KeXiangWang KeXiangWang marked this pull request as ready for review April 16, 2024 15:13
@KeXiangWang KeXiangWang force-pushed the wkx/cdc-source-int256 branch 5 times, most recently from a006ada to db6c98f Compare April 16, 2024 19:45
@KeXiangWang KeXiangWang requested review from StrikeW, wenym1 and lmatz April 16, 2024 20:26
Copy link
Contributor

@lmatz lmatz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, thanks!

@lmatz lmatz requested a review from xiangjinwu April 17, 2024 00:50
@KeXiangWang
Copy link
Contributor Author

Also kindly mention @bestmike007

src/connector/src/parser/postgres.rs Outdated Show resolved Hide resolved
src/connector/src/parser/postgres.rs Outdated Show resolved Hide resolved
src/connector/src/parser/postgres.rs Outdated Show resolved Hide resolved
@KeXiangWang KeXiangWang force-pushed the wkx/cdc-source-int256 branch 2 times, most recently from 037c7e8 to be55d48 Compare April 17, 2024 19:02
Comment on lines +96 to +97
107 POSITIVE_INFINITY
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Supposed to be NaN and Infinity?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes. Trying to resolve this issue. If it's too non-trivial, I'll open another PR to fix it. Besides, I also find that in our shared source cdc, the conversion from pg-numeric to rw-numeric will also lose Infinity and -Infinity. Will fix it soon.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🫠Too complicated to fix, as it requires us to know the types in upstream tables when parsing the data in JSON payload. Otherwise, we cannot distinguish whether a string is converted from a numeric or a real varchar.

So in this PR, when mapping pg-numeric to varchar, NaN will be NAN, inf will be POSITIVE_INFINITY , and -inf will be NEGATIVE_INFINITY . At least these behavior is aligned among backfiling and normal data updates. Do you think that's acceptable?

For the bug in the conversion from pg-numeric to rw-numeric, I'll create a PR later to resolve it.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Create a issue here to record.

Copy link
Contributor

@StrikeW StrikeW left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

e2e_test/source/cdc/cdc.share_stream.slt Outdated Show resolved Hide resolved
@KeXiangWang KeXiangWang force-pushed the wkx/cdc-source-int256 branch from be55d48 to 9d4577a Compare April 19, 2024 00:06
@KeXiangWang KeXiangWang added this pull request to the merge queue Apr 19, 2024
Merged via the queue into main with commit 57310af Apr 19, 2024
27 of 28 checks passed
@KeXiangWang KeXiangWang deleted the wkx/cdc-source-int256 branch April 19, 2024 04:52
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Support rw_int256 type in Postgres CDC source
4 participants