Refactor: refactor logical source for better batch optimization #16354

Closed
chenzl25 opened this issue Apr 17, 2024 · 1 comment
Comments

@chenzl25

Is your feature request related to a problem? Please describe.

Currently, we use the LogicalSource abstraction for both batch and streaming queries. For batch queries, we support multiple sources such as Kafka, Iceberg, and file system (fs). Each of these sources calls for optimizations tailored to its characteristics. For instance, the Kafka source may need a kafka_timestamp_range so that users can specify a timestamp range, while the Iceberg source benefits from column pruning and time travel.

However, if we add these optimizations directly to the logical source, the Source operator quickly becomes messy. To address this, I propose introducing an additional optimization step that applies only to batch queries. This step transforms the generic LogicalSource into dedicated operators such as LogicalKafkaScan, LogicalIcebergScan, and so forth. Each source type then receives its own optimizations without cluttering the Source operator with numerous conditional branches.
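
To make the proposal concrete, here is a minimal sketch of what such a batch-only rewrite step could look like. All types, fields, and functions below (`Connector`, `BatchPlan`, `rewrite_source_for_batch`, `with_timestamp_range`, `prune_columns`, `snapshot_id`) are hypothetical and heavily simplified for illustration; only the names LogicalSource, LogicalKafkaScan, LogicalIcebergScan, and kafka_timestamp_range come from the description above, and none of this reflects the actual optimizer code.

```rust
// Hypothetical, simplified plan nodes for illustration only.
#[derive(Debug, Clone, Copy, PartialEq)]
enum Connector {
    Kafka,
    Iceberg,
    FileSystem,
}

// The generic operator shared by batch and streaming plans.
#[derive(Debug, Clone, PartialEq)]
struct LogicalSource {
    name: String,
    connector: Connector,
    columns: Vec<String>,
}

// Dedicated batch operator: Kafka-specific state lives here, not in LogicalSource.
#[derive(Debug, Clone, PartialEq)]
struct LogicalKafkaScan {
    source: LogicalSource,
    // Assumed shape: optional (lower, upper) timestamp bounds.
    kafka_timestamp_range: (Option<i64>, Option<i64>),
}

// Dedicated batch operator: Iceberg-specific state lives here.
#[derive(Debug, Clone, PartialEq)]
struct LogicalIcebergScan {
    source: LogicalSource,
    required_columns: Vec<String>, // narrowed by column pruning
    snapshot_id: Option<i64>,      // assumed hook for time travel
}

#[derive(Debug, Clone, PartialEq)]
enum BatchPlan {
    Source(LogicalSource),
    KafkaScan(LogicalKafkaScan),
    IcebergScan(LogicalIcebergScan),
}

// The extra batch-only step: dispatch once on the connector, then let
// source-specific rules operate on the dedicated operators.
fn rewrite_source_for_batch(source: LogicalSource) -> BatchPlan {
    match source.connector {
        Connector::Kafka => BatchPlan::KafkaScan(LogicalKafkaScan {
            source,
            kafka_timestamp_range: (None, None),
        }),
        Connector::Iceberg => {
            let required_columns = source.columns.clone();
            BatchPlan::IcebergScan(LogicalIcebergScan {
                source,
                required_columns,
                snapshot_id: None,
            })
        }
        // Sources without a dedicated operator keep the generic node.
        Connector::FileSystem => BatchPlan::Source(source),
    }
}

impl LogicalKafkaScan {
    // A Kafka-only rule: record the timestamp range requested by the user.
    fn with_timestamp_range(mut self, lower: Option<i64>, upper: Option<i64>) -> Self {
        self.kafka_timestamp_range = (lower, upper);
        self
    }
}

impl LogicalIcebergScan {
    // An Iceberg-only rule: keep only the columns the query needs.
    fn prune_columns(mut self, needed: &[&str]) -> Self {
        self.required_columns.retain(|c| needed.contains(&c.as_str()));
        self
    }
}
```

In this sketch the connector is inspected in exactly one place, so Kafka-specific and Iceberg-specific rules never need to branch on the source type again, and the streaming path can keep using LogicalSource unchanged.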

Describe the solution you'd like

No response

Describe alternatives you've considered

No response

Additional context

No response

github-actions bot commented Aug 1, 2024

This issue has been open for 60 days with no activity.

If you think it is still relevant today, and needs to be done in the near future, you can comment to update the status, or just manually remove the no-issue-activity label.

You can also confidently close this issue as not planned to keep our backlog clean.
Don't worry if you think the issue is still valuable to continue in the future.
It's searchable and can be reopened when it's time. 😄
