Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat(sink demo): Add http sink demo #15149

Merged
merged 6 commits into from
Feb 23, 2024
Merged

feat(sink demo): Add http sink demo #15149

merged 6 commits into from
Feb 23, 2024

Conversation

xxhZs
Copy link
Contributor

@xxhZs xxhZs commented Feb 20, 2024

I hereby agree to the terms of the RisingWave Labs, Inc. Contributor License Agreement.

What's changed and what's your intention?

Add an http sink demo

Checklist

  • I have written necessary rustdoc comments
  • I have added necessary unit tests and integration tests
  • I have added test labels as necessary. See details.
  • I have added fuzzing tests or opened an issue to track them. (Optional, recommended for new SQL features Sqlsmith: Sql feature generation #7934).
  • My PR contains breaking changes. (If it deprecates some features, please create a tracking issue to remove them in the future).
  • All checks passed in ./risedev check (or alias, ./risedev c)
  • My PR changes performance-critical code. (Please run macro/micro-benchmarks and show the results.)
  • My PR contains critical fixes that are necessary to be merged into the latest release. (Please check out the details)

Documentation

  • My PR needs documentation updates. (Please use the Release note section below to summarize the impact on users)

Release note

If this PR includes changes that directly affect users or other significant modifications relevant to the community, kindly draft a release note to provide a concise summary of these changes. Please prioritize highlighting the impact these changes will have on users.

@xxhZs xxhZs requested a review from lmatz February 20, 2024 06:56
Copy link
Contributor

@lmatz lmatz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

thanks, so the http server in this demo needs to be activated by the user on their own, right?

cc: @neverchanje

@neverchanje
Copy link
Contributor

neverchanje commented Feb 20, 2024

I was unaware about the feature so I didn't know you have actually implemented some stuff.
Martin is right. https://github.com/getindata/flink-http-connector is merely a thirdparty plugin. The community had discussed the idea before, but no consensus reached yet.

IMO Getindata's http sink has a few limitations:

  • It offers only two options for HTTP method, i.e, PUT and POST. It can be enough for most cases though.
  • It can only execute one request-reply round to the service (session-less). However, some services may require an auth key first, which is obtained from another REST API, before the actual request.
  • It cannot handle status codes in the SQL API.

To conclude, the SQL interface can only work in certain cases where the REST API is session-less and simple.

For more general cases, I would suggest users to try Python UDF at first. In the cloud environment, this solution will only require exposing the UDF server to the public network, which will be more secured.

@xxhZs
Copy link
Contributor Author

xxhZs commented Feb 20, 2024

I was unaware about the feature so I didn't know you have actually implemented some stuff. Martin is right. https://github.com/getindata/flink-http-connector is merely a thirdparty plugin. The community had discussed the idea before, but no consensus reached yet.

IMO Getindata's http sink has a few limitations:

  • It offers only two options for HTTP method, i.e, PUT and POST. It can be enough for most cases though.
  • It can only execute one request-reply round to the service (session-less). However, some services may require an auth key first, which is obtained from another REST API, before the actual request.
  • It cannot handle status codes in the SQL API.

To conclude, the SQL interface can only work in certain cases where the REST API is session-less and simple.

For more general cases, I would suggest users to try Python UDF at first. In the cloud environment, this solution will only require exposing the UDF server to the public network, which will be more secured.

I agree, this feature was just a small example at the time when it was compatible with flink sink, the purpose of this pr was just that we didn't have an example of this feature anywhere in our library, so it was added.

@xxhZs xxhZs enabled auto-merge February 21, 2024 02:54
@hzxa21
Copy link
Collaborator

hzxa21 commented Feb 21, 2024

I agree, this feature was just a small example at the time when it was compatible with flink sink, the purpose of this pr was just that we didn't have an example of this feature anywhere in our library, so it was added.

Can we document this as well as the limitation of the current http sink implementation in the demo code and flink sink code?

@xxhZs xxhZs disabled auto-merge February 21, 2024 08:59
@xxhZs xxhZs enabled auto-merge February 21, 2024 09:40
auto-merge was automatically disabled February 22, 2024 06:09

Merge queue setting changed

@xxhZs xxhZs enabled auto-merge February 22, 2024 07:17
auto-merge was automatically disabled February 23, 2024 01:45

Merge queue setting changed

@xxhZs xxhZs added this pull request to the merge queue Feb 23, 2024
Merged via the queue into main with commit 316f180 Feb 23, 2024
26 of 27 checks passed
@xxhZs xxhZs deleted the xxh/add-http-sink-demo branch February 23, 2024 03:29
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants