Feature merge sink #705

Closed
wants to merge 21 commits into from


zhuliquan
Contributor

Fixes issue #700: merge sinks that write to the same table. This saves file handles and Kafka connectors. It may also fix a bug where two pipelines write data to the same file.
For example:

CREATE TABLE cars (
        timestamp TIMESTAMP,
        driver_id BIGINT,
        event_type TEXT,
        location TEXT
) WITH (
        connector = 'single_file',
        path = 'cars.json',
        format = 'json',
        type = 'source'
);

CREATE TABLE cars_output (
        timestamp TIMESTAMP,
        driver_id BIGINT,
        event_type TEXT,
        location TEXT
) WITH (
        connector = 'single_file',
        path = 'cars_output.json',
        format = 'json',
        type = 'sink'
);
INSERT INTO cars_output SELECT * FROM cars WHERE driver_id = 100 AND event_type = 'pickup';
INSERT INTO cars_output SELECT * FROM cars WHERE driver_id = 101 AND event_type = 'dropoff';

If we don't merge the sink nodes of the two queries, the results of the first query will be overwritten by the results of the second, because the two sink nodes open two separate file handles for the same file (cars_output.json).
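
The overwrite described above can be reproduced outside the engine. The following is a minimal Python sketch, not the PR's implementation: the function names are hypothetical, and it assumes each unmerged sink opens the output file in truncating write mode with its own independent file offset. Whether the real `single_file` sink opens files this way is an assumption.

```python
import os
import tempfile


def write_with_separate_handles(path, rows_q1, rows_q2):
    """Unmerged sinks: each query opens its own handle on the same file."""
    h1 = open(path, "w")
    h2 = open(path, "w")  # independent handle, also positioned at offset 0
    try:
        for row in rows_q1:
            h1.write(row + "\n")
        h1.flush()
        for row in rows_q2:
            h2.write(row + "\n")  # written over the bytes h1 already wrote
        h2.flush()
    finally:
        h1.close()
        h2.close()


def write_with_merged_sink(path, rows_q1, rows_q2):
    """Merged sink: both queries feed a single handle, so rows are appended in turn."""
    with open(path, "w") as sink:
        for row in list(rows_q1) + list(rows_q2):
            sink.write(row + "\n")


def read_rows(path):
    """Return the file contents as a list of lines without trailing newlines."""
    with open(path) as f:
        return f.read().splitlines()


if __name__ == "__main__":
    path = os.path.join(tempfile.mkdtemp(), "cars_output.json")
    pickups = ['{"driver_id": 100, "event_type": "pickup"}'] * 2
    dropoffs = ['{"driver_id": 101, "event_type": "dropoff"}']

    write_with_separate_handles(path, pickups, dropoffs)
    print("separate handles:", read_rows(path))

    write_with_merged_sink(path, pickups, dropoffs)
    print("merged sink:     ", read_rows(path))
```

With separate handles, the second query's rows land on top of the first query's rows at the start of the file. With a single shared handle, every row from both queries survives.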

@zhuliquan zhuliquan marked this pull request as draft November 26, 2024 09:41
@zhuliquan zhuliquan closed this Nov 26, 2024
@zhuliquan zhuliquan deleted the feature-merge_sink branch November 26, 2024 16:38
2 participants