feat(sink): Support big query sink #12873

Merged
merged 11 commits into from
Nov 1, 2023

@xxhZs xxhZs commented Oct 16, 2023

I hereby agree to the terms of the RisingWave Labs, Inc. Contributor License Agreement.

What's changed and what's your intention?

Support a BigQuery sink. Because BigQuery has limited support for updates and deletes, the sink currently supports only append-only mode.

Checklist

  • I have written necessary rustdoc comments
  • I have added necessary unit tests and integration tests
  • All checks passed in ./risedev check (or alias, ./risedev c)

Documentation

  • My PR needs documentation updates. (Please use the Release note section below to summarize the impact on users)

Release note

SQL example:

CREATE SINK s1 FROM t1 WITH (
    connector = 'bigquery',
    type = 'append-only',
    bigquery.path = '${bigquery_service_account_json_path}',
    bigquery.project = '${project_id}',
    bigquery.dataset = '${dataset_id}',
    bigquery.table = '${table_id}',
    force_append_only = 'true'
);

bigquery.path: Path to the BigQuery service account JSON file. The file can be obtained from https://console.cloud.google.com/iam-admin/serviceaccounts.
bigquery.project: BigQuery project ID.
bigquery.dataset: BigQuery dataset ID.
bigquery.table: BigQuery table ID.
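
As a rough illustration of how the options above fit together, the following sketch validates a `WITH (...)` option map using only the standard library. `BigQueryOptions` and `parse_options` are hypothetical names chosen for this example and are not taken from the PR; the actual connector may parse and validate its configuration differently.

```rust
use std::collections::HashMap;

/// Hypothetical holder for the four `bigquery.*` options listed above.
#[derive(Debug, PartialEq)]
pub struct BigQueryOptions {
    pub path: String,
    pub project: String,
    pub dataset: String,
    pub table: String,
}

/// Validates an option map (assumed shape: plain string keys and values).
/// The sink type must be `append-only`, and all four `bigquery.*` keys
/// must be present.
pub fn parse_options(with: &HashMap<String, String>) -> Result<BigQueryOptions, String> {
    // Only append-only sinks are accepted, matching the PR description.
    match with.get("type").map(String::as_str) {
        Some("append-only") => {}
        other => return Err(format!("unsupported sink type: {:?}", other)),
    }

    // Look up a required key or report which one is missing.
    let get = |key: &str| {
        with.get(key)
            .cloned()
            .ok_or_else(|| format!("missing required option `{}`", key))
    };

    Ok(BigQueryOptions {
        path: get("bigquery.path")?,
        project: get("bigquery.project")?,
        dataset: get("bigquery.dataset")?,
        table: get("bigquery.table")?,
    })
}
```

Calling `parse_options` with a map that omits `bigquery.dataset`, or that sets `type` to anything other than `append-only`, returns an `Err` describing the problem.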

@xxhZs xxhZs force-pushed the xxh/big_query_sink branch 2 times, most recently from e3ed31d to 5e80162 Compare October 16, 2023 10:01
@xxhZs xxhZs force-pushed the xxh/big_query_sink branch from 5e80162 to 0469554 Compare October 16, 2023 10:06
@xxhZs xxhZs marked this pull request as ready for review October 16, 2023 10:07
@xxhZs xxhZs requested a review from a team as a code owner October 16, 2023 10:07
@xxhZs xxhZs requested review from hzxa21, wenym1 and tabVersion October 16, 2023 10:07

@wenym1 wenym1 left a comment

Rest LGTM. Thanks for the PR!

src/connector/src/sink/mod.rs (resolved)
src/connector/src/sink/encoder/mod.rs (resolved)
src/connector/src/sink/big_query.rs (outdated, resolved)
src/connector/src/sink/big_query.rs (outdated, resolved)
}

async fn append_only(&mut self, chunk: StreamChunk) -> Result<()> {
let mut insert_vec = vec![];

From the current implementation, it seems that each self.insert_request lives only within the scope of append_only. If so, instead of storing it as a field, could we create a new local variable inside each append_only call?

Contributor Author

This variable is used in two places: it is flushed either when the buffered rows exceed the maximum size or when a barrier is encountered, so it needs to outlive a single append_only call.
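
The two flush sites described in this reply can be sketched as follows. This is a minimal toy model, not the code from the PR: `BufferedWriter`, `Row`, `sent_batches`, and `MAX_BUFFERED_ROWS` are hypothetical names, rows are plain strings, a "send" only records the batch size, and flushing as soon as the buffer reaches the cap (rather than strictly exceeding it) is an assumption.

```rust
/// Hypothetical row type; the real sink operates on stream chunks.
type Row = String;

/// Assumed cap, mirroring the value of BIGQUERY_INSERT_MAX_NUMS in the PR.
const MAX_BUFFERED_ROWS: usize = 500;

/// Toy writer. The buffer is a field so that rows left over from one
/// `append_only` call are still available when a barrier arrives later.
pub struct BufferedWriter {
    buffer: Vec<Row>,
    /// Sizes of the batches sent so far (stand-in for real insert requests).
    pub sent_batches: Vec<usize>,
}

impl BufferedWriter {
    pub fn new() -> Self {
        Self {
            buffer: Vec::new(),
            sent_batches: Vec::new(),
        }
    }

    /// Sends whatever is buffered, if anything, and empties the buffer.
    fn flush(&mut self) {
        if !self.buffer.is_empty() {
            self.sent_batches.push(self.buffer.len());
            self.buffer.clear();
        }
    }

    /// Flush site 1: the buffer reached the cap while appending rows.
    pub fn append_only(&mut self, rows: Vec<Row>) {
        for row in rows {
            self.buffer.push(row);
            if self.buffer.len() >= MAX_BUFFERED_ROWS {
                self.flush();
            }
        }
    }

    /// Flush site 2: a barrier arrived, so send the remaining rows.
    pub fn barrier(&mut self) {
        self.flush();
    }
}
```

With a local variable inside `append_only`, the rows still buffered at the end of a call would have to be sent immediately or dropped; keeping the buffer as a field lets them wait for the next barrier.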

src/connector/src/sink/big_query.rs (resolved)
codecov bot commented Oct 23, 2023

Codecov Report

Merging #12873 (0cb4f51) into main (84d27ba) will decrease coverage by 0.07%.
Report is 1 commit behind head on main.
The diff coverage is 11.07%.

@@            Coverage Diff             @@
##             main   #12873      +/-   ##
==========================================
- Coverage   68.23%   68.16%   -0.07%     
==========================================
  Files        1506     1507       +1     
  Lines      255354   255634     +280     
==========================================
+ Hits       174235   174259      +24     
- Misses      81119    81375     +256     
Flag    Coverage Δ
rust    68.16% <11.07%> (-0.07%) ⬇️

Files                                      Coverage Δ
src/connector/src/sink/encoder/mod.rs      89.74% <ø> (ø)
src/connector/src/sink/mod.rs              60.71% <ø> (ø)
src/connector/src/sink/encoder/json.rs     87.23% <0.00%> (-2.09%) ⬇️
src/connector/src/sink/big_query.rs        11.61% <11.61%> (ø)

... and 11 files with indirect coverage changes

@wenym1 wenym1 left a comment

LGTM!

};

pub const BIGQUERY_SINK: &str = "bigquery";
const BIGQUERY_INSERT_MAX_NUMS: usize = 500;

Any reason for setting this value?

Contributor Author

This sets the maximum number of cached rows. Once the cache exceeds this limit, the rows are inserted directly instead of waiting for the barrier, which prevents issues caused by too much data accumulating within a single barrier.

@tabVersion tabVersion left a comment

rest lgtm

src/connector/src/sink/big_query.rs (resolved)
src/connector/src/sink/big_query.rs (outdated, resolved)
@fuyufjh fuyufjh left a comment

LGTM for Cargo.toml

@xxhZs xxhZs added the user-facing-changes label Oct 25, 2023
@xxhZs xxhZs added this pull request to the merge queue Oct 25, 2023
@github-merge-queue github-merge-queue bot removed this pull request from the merge queue due to failed status checks Oct 25, 2023
@xxhZs xxhZs added this pull request to the merge queue Oct 25, 2023
@github-merge-queue github-merge-queue bot removed this pull request from the merge queue due to failed status checks Oct 25, 2023
gitguardian bot commented Oct 25, 2023

⚠️ GitGuardian has uncovered 3 secrets following the scan of your pull request.

Please consider investigating the findings and remediating the incidents. Failure to do so may lead to compromising the associated services or software components.

🔎 Detected hardcoded secrets in your pull request
GitGuardian id   Secret               Commit    Filename
7648795          Generic CLI Secret   34f07d7   integration_tests/iceberg-cdc/run_test.sh
7648795          Generic CLI Secret   0cb4f51   integration_tests/iceberg-cdc/run_test.sh
7648795          Generic CLI Secret   0cb4f51   integration_tests/iceberg-cdc/docker-compose.yml
🛠 Guidelines to remediate hardcoded secrets
  1. Understand the implications of revoking this secret by investigating where it is used in your code.
  2. Replace and store your secrets safely. Learn here the best practices.
  3. Revoke and rotate these secrets.
  4. If possible, rewrite git history. Rewriting git history is not a trivial act. You might completely break other contributing developers' workflow and you risk accidentally deleting legitimate data.

To avoid such incidents in the future consider


@xxhZs xxhZs enabled auto-merge October 25, 2023 15:35
@xxhZs xxhZs disabled auto-merge October 26, 2023 05:56
@xxhZs xxhZs requested a review from tabVersion October 30, 2023 03:22
@xxhZs xxhZs enabled auto-merge November 1, 2023 03:26
@xxhZs xxhZs disabled auto-merge November 1, 2023 03:28
@xxhZs xxhZs enabled auto-merge November 1, 2023 03:28
@xxhZs xxhZs added this pull request to the merge queue Nov 1, 2023
Merged via the queue into main with commit 7f5d3f6 Nov 1, 2023
8 checks passed
@xxhZs xxhZs deleted the xxh/big_query_sink branch November 1, 2023 04:16
Labels
type/feature user-facing-changes Contains changes that are visible to users
4 participants