Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat(sink): use 'create sink ... format ... encode' to create redis sink #13003

Merged
merged 3 commits into from
Oct 24, 2023

Conversation

xxhZs
Copy link
Contributor

@xxhZs xxhZs commented Oct 23, 2023

I hereby agree to the terms of the RisingWave Labs, Inc. Contributor License Agreement.

What's changed and what's your intention?

use 'create sink ... format ... encode' to create redis sink like kafka.
And add Template encode to support custom format for strings input into redis

Checklist

  • I have written necessary rustdoc comments
  • I have added necessary unit tests and integration tests

Documentation

  • My PR needs documentation updates. (Please use the Release note section below to summarize the impact on users)

Release note

create table

create table t1(v1 int, v2 int);

create redis sink with append_only and json

CREATE SINK s1 AS t1 WITH (
    primary_key = 'v1 ',
    connector = 'redis',
    redis.url= 'redis://127.0.0.1:6379/',
)FORMAT PLAIN ENCODE JSON(force_append_only='true');

create redis sink with upsert and json

CREATE SINK s1 AS t1  WITH (
    primary_key = 'v1 ',
    connector = 'redis',
    redis.url= 'redis://127.0.0.1:6379/',
)FORMAT UPSERT ENCODE JSON;

create redis sink with append_only and template

CREATE SINK s1 AS t1  WITH (
    primary_key = 'v1 ',
    connector = 'redis',
    redis.url= 'redis://127.0.0.1:6379/',
)FORMAT PLAIN ENCODE TEMPLATE(force_append_only='true', redis_key_format = 'V1 is:{v1}', redis_value_format = 'V2 is:{v2}');

create redis sink with upsert and template

CREATE SINK s1 AS t1  WITH (
    primary_key = 'v1 ',
    connector = 'redis',
    redis.url= 'redis://127.0.0.1:6379/',
)FORMAT UPSERT ENCODE TEMPLATE(redis_key_format = 'V1 is:{v1}', redis_value_format = 'V2 is:{v2}');

fix

use string for timestamp
@xxhZs xxhZs force-pushed the xxh/redis-sink-format branch from ed5498a to c8d0223 Compare October 23, 2023 08:30
@xxhZs xxhZs marked this pull request as ready for review October 23, 2023 08:30
@xxhZs xxhZs requested review from wenym1, xiangjinwu and hzxa21 October 23, 2023 08:30
@codecov
Copy link

codecov bot commented Oct 23, 2023

Codecov Report

Merging #13003 (712042e) into main (3881de2) will increase coverage by 0.01%.
Report is 14 commits behind head on main.
The diff coverage is 25.92%.

@@            Coverage Diff             @@
##             main   #13003      +/-   ##
==========================================
+ Coverage   68.76%   68.78%   +0.01%     
==========================================
  Files        1496     1496              
  Lines      250508   250555      +47     
==========================================
+ Hits       172268   172343      +75     
+ Misses      78240    78212      -28     
Flag Coverage Δ
rust 68.78% <25.92%> (+0.01%) ⬆️

Flags with carried forward coverage won't be shown. Click here to find out more.

Files Coverage Δ
src/connector/src/sink/catalog/mod.rs 56.38% <0.00%> (-0.51%) ⬇️
src/frontend/src/handler/create_sink.rs 66.66% <0.00%> (-1.50%) ⬇️
src/sqlparser/src/ast/statement.rs 54.69% <0.00%> (-0.14%) ⬇️
src/connector/src/sink/encoder/template.rs 65.95% <0.00%> (-34.05%) ⬇️
src/connector/src/sink/redis.rs 54.38% <42.85%> (+1.14%) ⬆️
src/connector/src/sink/formatter/mod.rs 33.33% <21.73%> (+10.03%) ⬆️

... and 11 files with indirect coverage changes

📣 We’re building smart automated test selection to slash your CI/CD build times. Learn more

src/connector/src/sink/catalog/mod.rs Outdated Show resolved Hide resolved
src/connector/src/sink/redis.rs Outdated Show resolved Hide resolved
Comment on lines +148 to +151
if matches!(
self.format_desc.encode,
super::catalog::SinkEncode::Template
) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's move check_string_format as part of TemplateEncoder::new, and just call SinkFormatterImpl::new_with_redis here.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should be calling TemplateEncoder::new multiple times during the data injection process, but we only need to check the format when creating the sink. So, it might be more appropriate to place it outside.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is called once per actor, and you are right we only need to check the format once. My point was about TemplateEncoder being self-contained. At least we can move the function definition into the TemplateEncoder file.

For more context, proto / avro is doing heavier work in SinkFormatterImpl::new, and we have opened an issue to discuss whether to optimize it from once-per-actor to only once on meta #12982

Copy link
Contributor

@wenym1 wenym1 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

May need to decide the name of this format from product's perspective. The name in current PR is TEMPLATE cc @neverchanje .

)FORMAT PLAIN ENCODE TEMPLATE(force_append_only='true', redis_key_format = 'UserID:{user_id}', redis_value_format = 'TargetID:{target_id},EventTimestamp{event_timestamp}');
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

May not add force_append_only='true' here?

And may not have the redis_ prefix for both key_format and value_format.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

tried it, but without it, success cannot be achieved

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I mean, for the SQL syntax, we should put the force_append_only in the WITH part of the SQL rather than in the properties of encoder.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I mean, for the SQL syntax, we should put the force_append_only in the WITH part of the SQL rather than in the properties of encoder.

cc @xiangjinwu

Copy link
Contributor

@xiangjinwu xiangjinwu Oct 23, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is part of format plain, so placed after it.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

force_append_only is derived in frontend and stored as SinkType separately in SinkDesc, and before the formatter, the sink executor will remove the delete and upsert delete when force_append_only is true. And in the code of frontend, force_append_only is derived only by the with options. Should we fix this? @xiangjinwu

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

force_append_only is derived only by the with options

Check line 236-245 below. It has been updated.

Regarding SinkType, I actually plan to rename it to AppendOnlyState = [APPEND_ONLY, FORCE_APPEND_ONLY, NOT_APPEND_ONLY], given its current actual functionality is not aligned with type = in old syntax / format in new syntax. Its name has been confusing.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see. Good to know that.

src/connector/src/sink/formatter/mod.rs Outdated Show resolved Hide resolved
fix
@xxhZs xxhZs force-pushed the xxh/redis-sink-format branch from de0df1e to 712042e Compare October 24, 2023 07:06
@xxhZs xxhZs added this pull request to the merge queue Oct 24, 2023
@github-merge-queue github-merge-queue bot removed this pull request from the merge queue due to failed status checks Oct 24, 2023
@xxhZs xxhZs added the user-facing-changes Contains changes that are visible to users label Oct 24, 2023
@xxhZs xxhZs added this pull request to the merge queue Oct 24, 2023
Merged via the queue into main with commit 210ae71 Oct 24, 2023
28 of 29 checks passed
@xxhZs xxhZs deleted the xxh/redis-sink-format branch October 24, 2023 16:14
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
type/feature user-facing-changes Contains changes that are visible to users
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants