Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix(sink): json timestamptz should contain UTC suffix Z #13109

Merged
merged 4 commits into from
Nov 1, 2023

Conversation

xiangjinwu
Copy link
Contributor

@xiangjinwu xiangjinwu commented Oct 27, 2023

I hereby agree to the terms of the RisingWave Labs, Inc. Contributor License Agreement.

What's changed and what's your intention?

In #9755

Also fix the type cast of Timestamptz in datum_to_json_object to support downstream jdbc connector. Now it will generate a correct string value according to Debezium's document. This is partially related to #9376.

That PR changed from 1516991409.453000 to 2018-01-26 18:30:09.453000. This format is the only format accept by an insert into mysql/tidb.

However, the actual debezium output is 2018-01-26T18:30:09.453Z, no matter it is from postgresql or mysql.
https://github.com/debezium/debezium/blob/main/debezium-core/src/main/java/io/debezium/time/ZonedTimestamp.java

Furthermore, RisingWave json source accepts 2018-01-26T18:30:09.453Z but rejects 2018-01-26 18:30:09.453000 since 1.2.

To distinguish it from timestamp (without time zone), align with debezium as well as RisingWave's own json source, this PR changes the output format in the following cases:

  • New kafka / kinesis / pulsar append-only / upsert sinks created without the mode option (see below).
  • Existing and new debezium sinks. This is a breaking change but bug fix.

For compatibility, the old format is still used in the following cases:

  • Existing kafka / kinesis / pulsar append-only / upsert sinks.
  • Existing and new remote sinks.
  • Existing and new nats sinks. This will change and follow kafka / kinesis / pulsar once onboarded to new sink syntax.

It also provides a new option for new kafka / kinesis / pulsar append-only / upsert sinks to explicitly select the old format:

  • timestamptz.handling.mode = 'utc_without_suffix'

Just for a comparison of the different formats:

format T with zone space without zone space with zone
example 2006-01-02T15:04:05Z 2006-01-02 15:04:05 2006-01-02 15:04:05+00
ISO 8601
RFC 3339 Z or +00:00 but not +00
pg/rw psql input ✅ (session tz) rw does not support +00
pg/rw psql output pg +00 but rw +00:00
mysql/tidb/doris input/output ✅ (session tz)
debezium from pg/mysql
pg to_jsonb
rw source json Z or +00:00 but not +00
rw sink json (before PR) ✅ (always UTC)
rw sink json (after PR) ✅ (new frontend default) ✅ (compatible)

Checklist

  • I have written necessary rustdoc comments
  • I have added necessary unit tests and integration tests
  • I have added fuzzing tests or opened an issue to track them. (Optional, recommended for new SQL features Sqlsmith: Sql feature generation #7934).
  • My PR contains breaking changes. (If it deprecates some features, please create a tracking issue to remove them in the future).
  • All checks passed in ./risedev check (or alias, ./risedev c)
  • My PR changes performance-critical code. (Please run macro/micro-benchmarks and show the results.)
  • My PR contains critical fixes that are necessary to be merged into the latest release. (Please check out the details)

Documentation

  • My PR needs documentation updates. (Please use the Release note section below to summarize the impact on users)

Release note

For sinks with timestamptz column using json encoding, the output format changed from 2018-01-26 18:30:09.453000 to 2018-01-26T18:30:09.453000Z.

@github-actions github-actions bot added the type/fix Bug fix label Oct 27, 2023
@codecov
Copy link

codecov bot commented Oct 30, 2023

Codecov Report

Merging #13109 (fd9c604) into main (b7195f8) will decrease coverage by 0.01%.
The diff coverage is 53.98%.

@@            Coverage Diff             @@
##             main   #13109      +/-   ##
==========================================
- Coverage   68.10%   68.10%   -0.01%     
==========================================
  Files        1507     1507              
  Lines      255634   255729      +95     
==========================================
+ Hits       174104   174154      +50     
- Misses      81530    81575      +45     
Flag Coverage Δ
rust 68.10% <53.98%> (-0.01%) ⬇️

Flags with carried forward coverage won't be shown. Click here to find out more.

Files Coverage Δ
src/connector/src/sink/encoder/json.rs 87.16% <88.09%> (-0.07%) ⬇️
src/frontend/src/handler/create_sink.rs 65.70% <0.00%> (-0.97%) ⬇️
src/connector/src/sink/kafka.rs 31.26% <0.00%> (-0.37%) ⬇️
src/connector/src/sink/nats.rs 0.00% <0.00%> (ø)
src/connector/src/sink/remote.rs 35.95% <50.00%> (+0.25%) ⬆️
src/connector/src/sink/formatter/debezium_json.rs 53.10% <46.15%> (-0.31%) ⬇️
src/connector/src/sink/formatter/mod.rs 30.13% <53.33%> (+0.72%) ⬆️
src/connector/src/sink/encoder/mod.rs 73.58% <28.57%> (-16.16%) ⬇️

... and 3 files with indirect coverage changes

📣 We’re building smart automated test selection to slash your CI/CD build times. Learn more

@xiangjinwu xiangjinwu force-pushed the fix-sink-json-timestamptz branch from 55fcf52 to 94e6ab2 Compare October 31, 2023 10:43
@xiangjinwu xiangjinwu force-pushed the fix-sink-json-timestamptz branch from a8a0022 to 22142f9 Compare October 31, 2023 15:28
Copy link
Contributor

@tabVersion tabVersion left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. I did not realize the issue before. 😭

Copy link
Contributor

@StrikeW StrikeW left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@xiangjinwu xiangjinwu added this pull request to the merge queue Nov 1, 2023
@github-merge-queue github-merge-queue bot removed this pull request from the merge queue due to failed status checks Nov 1, 2023
@xiangjinwu xiangjinwu enabled auto-merge November 1, 2023 06:54
@xiangjinwu xiangjinwu added this pull request to the merge queue Nov 1, 2023
Merged via the queue into main with commit edf314e Nov 1, 2023
6 of 7 checks passed
@xiangjinwu xiangjinwu deleted the fix-sink-json-timestamptz branch November 1, 2023 07:19
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
type/fix Bug fix
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants