feat(sink): Support big query sink #12873
Rest LGTM. Thanks for the PR!
src/connector/src/sink/big_query.rs (outdated)
}

async fn append_only(&mut self, chunk: StreamChunk) -> Result<()> {
    let mut insert_vec = vec![];
From the current implementation, it seems that each self.insert_request only lives inside the scope of append_only? If so, instead of storing it as a field, we could create a new local variable inside each append_only call.
We use this variable in two places: when the buffer exceeds the maximum size, and when a barrier is encountered.
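A minimal sketch of why the buffer lives on the writer rather than in a local variable, assuming rows are buffered across calls and flushed either early at a cap or at a barrier. The names `BufferedWriter`, `MAX_BUFFERED_ROWS`, `flushed_batches`, and `barrier` are hypothetical and are not taken from the PR. Whether the early flush triggers at or only above the cap is also an assumption here.

```rust
/// Hypothetical cap; the PR defines its own constant for this purpose.
const MAX_BUFFERED_ROWS: usize = 500;

/// Hypothetical stand-in for the sink writer; rows are plain strings here.
struct BufferedWriter {
    /// Kept as a field so rows survive across `append_only` calls.
    buffer: Vec<String>,
    /// Size of each flushed batch; stands in for sending an insert request.
    flushed_batches: Vec<usize>,
}

impl BufferedWriter {
    fn new() -> Self {
        Self {
            buffer: Vec::new(),
            flushed_batches: Vec::new(),
        }
    }

    /// Buffers one chunk of rows, flushing early once the cap is reached.
    fn append_only(&mut self, chunk: Vec<String>) {
        for row in chunk {
            self.buffer.push(row);
            if self.buffer.len() >= MAX_BUFFERED_ROWS {
                self.flush();
            }
        }
    }

    /// Called when a barrier arrives: send whatever is still buffered.
    fn barrier(&mut self) {
        self.flush();
    }

    /// Empties the buffer and records the batch size; no-op when empty.
    fn flush(&mut self) {
        if self.buffer.is_empty() {
            return;
        }
        let batch = std::mem::take(&mut self.buffer);
        self.flushed_batches.push(batch.len());
    }
}
```

With a local variable inside `append_only`, the rows left over after the last early flush would be lost when the call returns. As a field, they stay buffered until `barrier` flushes them.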
Codecov Report
@@ Coverage Diff @@
## main #12873 +/- ##
==========================================
- Coverage 68.23% 68.16% -0.07%
==========================================
Files 1506 1507 +1
Lines 255354 255634 +280
==========================================
+ Hits 174235 174259 +24
- Misses 81119 81375 +256
... and 11 files with indirect coverage changes
LGTM!
src/connector/src/sink/big_query.rs (outdated)
};

pub const BIGQUERY_SINK: &str = "bigquery";
const BIGQUERY_INSERT_MAX_NUMS: usize = 500;
Any reason for setting this value?
It sets the maximum number of cached rows. Once the cache exceeds this limit, the rows are inserted immediately instead of waiting for the barrier, which prevents issues caused by excessive data within a single barrier.
rest lgtm
LGTM for Cargo.toml
GitGuardian id | Secret | Commit | Filename
---|---|---|---
7648795 | Generic CLI Secret | 34f07d7 | integration_tests/iceberg-cdc/run_test.sh
7648795 | Generic CLI Secret | 0cb4f51 | integration_tests/iceberg-cdc/run_test.sh
7648795 | Generic CLI Secret | 0cb4f51 | integration_tests/iceberg-cdc/docker-compose.yml
🛠 Guidelines to remediate hardcoded secrets
- Understand the implications of revoking this secret by investigating where it is used in your code.
- Replace and store your secrets safely. Learn here the best practices.
- Revoke and rotate these secrets.
- If possible, rewrite git history. Rewriting git history is not a trivial act. You might completely break other contributing developers' workflow and you risk accidentally deleting legitimate data.
To avoid such incidents in the future, consider:
- following best practices for managing and storing secrets, including API keys and other credentials
- installing secret detection on pre-commit to catch secrets before they leave your machine and to ease remediation.
I hereby agree to the terms of the RisingWave Labs, Inc. Contributor License Agreement.
What's changed and what's your intention?
Support a BigQuery sink. Because BigQuery has limited support for updates and deletes, the sink currently supports only append-only mode.
Checklist
./risedev check (or alias, ./risedev c)
Documentation
Release note
SQL example
CREATE SINK s1 FROM t1 WITH (
    connector = 'bigquery',
    type = 'append-only',
    bigquery.path = '${bigquery_service_account_json_path}',
    bigquery.project = '${project_id}',
    bigquery.dataset = '${dataset_id}',
    bigquery.table = '${table_id}',
    force_append_only = 'true'
);
bigquery.path: Path to the BigQuery service account JSON file. The file can be obtained from https://console.cloud.google.com/iam-admin/serviceaccounts.
bigquery.project: BigQuery project ID.
bigquery.dataset: BigQuery dataset ID.
bigquery.table: BigQuery table ID.
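
As an illustration of how the four `bigquery.*` options fit together, here is a minimal sketch that reads them from string key/value pairs. It assumes all four are required and that the WITH clause reaches the sink as a string map. `BigQueryOptions`, `from_with_options`, and `table_path` are hypothetical names, not the PR's actual types.

```rust
use std::collections::HashMap;

/// Hypothetical holder for the `bigquery.*` options in the SQL example.
#[derive(Debug, PartialEq)]
struct BigQueryOptions {
    path: String,
    project: String,
    dataset: String,
    table: String,
}

impl BigQueryOptions {
    /// Builds the options from the `WITH (...)` key/value pairs,
    /// reporting the first missing key (assumes all four are required).
    fn from_with_options(opts: &HashMap<String, String>) -> Result<Self, String> {
        let get = |key: &str| {
            opts.get(key)
                .cloned()
                .ok_or_else(|| format!("missing required option `{key}`"))
        };
        Ok(Self {
            path: get("bigquery.path")?,
            project: get("bigquery.project")?,
            dataset: get("bigquery.dataset")?,
            table: get("bigquery.table")?,
        })
    }

    /// Table name in `project.dataset.table` form.
    fn table_path(&self) -> String {
        format!("{}.{}.{}", self.project, self.dataset, self.table)
    }
}
```

Validating the options up front like this surfaces a missing key when the sink is created rather than on the first write.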