fix(optimizer): allow fold_const on timezone-dependent exprs in logical plan #12633

Closed
wants to merge 3 commits into from

Conversation

xiangjinwu
Contributor

@xiangjinwu xiangjinwu commented Oct 5, 2023

I hereby agree to the terms of the RisingWave Labs, Inc. Contributor License Agreement.

What's changed and what's your intention?

WIP: temporal filter tests fail with this change.

Fix #12523.

  • Avoid a try_fold_const panic caused by a missing timezone in LogicalSource::predicate_pushdown.
  • Avoid calling unwrap when try_fold_const evaluates to an error (e.g. 'abc'::timestamptz) in LogicalSource::predicate_pushdown.

The fix for the first issue is not perfect:

  • It always runs two passes of timezone inlining in the optimizer: one at the beginning and one near the end. Previously, stream ran only one pass near the end, while batch ran one in the middle and one near the end.
  • Alternatively, we could add timezone: &str as a required argument of try_fold_const. This guarantees the timezone is always available when we are about to evaluate, but some direct callers do not have this information right now, and we would lose the ability to notify users about the discouraged implicit usage.
  • Alternatively, we could do the timezone rewrite in Binder::bind_expr or FunctionCall::new. These are not practical for similar reasons: we either cannot warn the user, or the direct caller lacks the information.
  • RFC: Opaque context in expression evaluation rfcs#75 could make the timezone available without rewrites. But we acknowledge that a caller could still forget to provide the timezone, just as a caller may call try_fold_const before the rewrite.
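
To make the second alternative concrete, here is a minimal sketch of what a try_fold_const with a required timezone argument could look like. Everything in it is hypothetical: the Expr and FoldError types, the string-based "evaluation", and the Option<Result<..>> return shape are simplified stand-ins chosen for illustration, not the actual frontend types.

```rust
/// Hypothetical, heavily simplified expression type (not the real `ExprImpl`).
#[derive(Debug, Clone)]
enum Expr {
    /// A constant that does not depend on the session timezone.
    Int(i64),
    /// Stand-in for a timezone-dependent cast such as `'...'::timestamptz`.
    CastToTimestamptz(String),
    /// Anything that is not constant, e.g. a column reference.
    Column(usize),
}

/// Hypothetical error type for a failed constant evaluation.
#[derive(Debug, PartialEq)]
enum FoldError {
    InvalidLiteral(String),
}

/// Assumed shape: `None` means "not a constant", `Some(Err(_))` means
/// "constant, but evaluation failed", `Some(Ok(_))` is the folded value.
/// Because `timezone` is a required argument, the evaluator can never be
/// reached without a timezone.
fn try_fold_const(expr: &Expr, timezone: &str) -> Option<Result<String, FoldError>> {
    match expr {
        Expr::Column(_) => None,
        Expr::Int(v) => Some(Ok(v.to_string())),
        Expr::CastToTimestamptz(raw) => {
            // Toy validity check standing in for real timestamp parsing.
            let looks_valid = !raw.is_empty()
                && raw
                    .chars()
                    .all(|c| c.is_ascii_digit() || matches!(c, '-' | ':' | ' '));
            Some(if looks_valid {
                Ok(format!("{raw} [{timezone}]"))
            } else {
                Err(FoldError::InvalidLiteral(raw.clone()))
            })
        }
    }
}
```

The trade-off described in the bullet shows up directly in the signature: every caller must now obtain a timezone string before it can fold anything, including callers that only ever see timezone-independent constants.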

Checklist

  • I have written necessary rustdoc comments
  • I have added necessary unit tests and integration tests
  • I have added fuzzing tests or opened an issue to track them. (Optional, recommended for new SQL features Sqlsmith: Sql feature generation #7934).
  • My PR contains breaking changes. (If it deprecates some features, please create a tracking issue to remove them in the future).
  • All checks passed in ./risedev check (or alias, ./risedev c)
  • My PR changes performance-critical code. (Please run macro/micro-benchmarks and show the results.)
  • My PR contains critical fixes that are necessary to be merged into the latest release. (Please check out the details)

Documentation

  • My PR needs documentation updates. (Please use the Release note section below to summarize the impact on users)

Release note

If this PR includes changes that directly affect users or other significant modifications relevant to the community, kindly draft a release note to provide a concise summary of these changes. Please prioritize highlighting the impact these changes will have on users.

@xiangjinwu xiangjinwu changed the title fix(optimizer): allow fold_const on timezone-dependent exprs in logic… fix(optimizer): allow fold_const on timezone-dependent exprs in logical plan Oct 5, 2023
@github-actions github-actions bot added the type/fix Bug fix label Oct 5, 2023
@@ -369,8 +369,7 @@ fn expr_to_kafka_timestamp_range(

 match &expr {
     ExprImpl::FunctionCall(function_call) => {
-        if let Some((timestampz_literal, reverse)) = extract_timestampz_literal(&expr).unwrap()
-        {
+        if let Ok(Some((timestampz_literal, reverse))) = extract_timestampz_literal(&expr) {

@xiangjinwu xiangjinwu Oct 5, 2023


Another bug fixed: a predicate such as _rw_kafka_timestamp >= 'aa'::timestamptz must not panic on unwrap().
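
A minimal sketch of the pattern, using hypothetical stand-ins: extract_literal plays the role of extract_timestampz_literal, and the "ts:" prefix convention and the string error type are invented for illustration. The point is only that matching on Ok(Some(..)) treats an evaluation error the same as "no literal found", so the predicate is left in place instead of panicking.

```rust
/// Hypothetical stand-in for `extract_timestampz_literal`:
/// `Ok(None)` when the input is not a timestamp comparison at all,
/// `Err(_)` when it is one but the literal cannot be evaluated.
fn extract_literal(input: &str) -> Result<Option<i64>, String> {
    match input.strip_prefix("ts:") {
        None => Ok(None),
        Some(raw) => raw
            .parse::<i64>()
            .map(Some)
            .map_err(|e| format!("invalid literal {raw:?}: {e}")),
    }
}

/// Returns a bound to push down, or `None` when the predicate should be
/// left in place (including the case where evaluation failed).
fn pushdown_lower_bound(input: &str) -> Option<i64> {
    // `extract_literal(input).unwrap()` would panic on "ts:aa";
    // matching on `Ok(Some(..))` skips the pushdown instead.
    if let Ok(Some(ts)) = extract_literal(input) {
        Some(ts)
    } else {
        None
    }
}
```

With this shape the invalid literal is not pushed down; the sketch assumes the error is then reported by whatever evaluates the remaining predicate later.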

@xiangjinwu xiangjinwu force-pushed the fix-12523-fold-const-panic-wo-zone branch from cdde515 to 4d728ff Compare October 5, 2023 10:39
@codecov

codecov bot commented Oct 5, 2023

Codecov Report

Merging #12633 (4d728ff) into main (194b606) will increase coverage by 0.00%.
Report is 4 commits behind head on main.
The diff coverage is 80.88%.

@@           Coverage Diff           @@
##             main   #12633   +/-   ##
=======================================
  Coverage   69.28%   69.29%           
=======================================
  Files        1470     1470           
  Lines      241296   241324   +28     
=======================================
+ Hits       167187   167214   +27     
- Misses      74109    74110    +1     
Flag Coverage Δ
rust 69.29% <80.88%> (+<0.01%) ⬆️

Files Coverage Δ
src/cmd_all/src/common.rs 0.00% <ø> (ø)
src/frontend/src/expr/mod.rs 77.54% <100.00%> (-0.24%) ⬇️
src/frontend/src/expr/session_timezone.rs 72.22% <ø> (+6.25%) ⬆️
src/frontend/src/expr/utils.rs 88.06% <100.00%> (+0.04%) ⬆️
src/frontend/src/optimizer/mod.rs 92.35% <100.00%> (+0.14%) ⬆️
...frontend/src/optimizer/plan_node/logical_source.rs 78.12% <100.00%> (ø)
src/cmd_all/src/standalone.rs 48.48% <75.00%> (+4.89%) ⬆️

... and 3 files with indirect coverage changes

Comment on lines +798 to +801
ExprType::AddWithTimeZone | ExprType::SubtractWithTimeZone => {
    let args = f.inputs();
    args[0].is_now_offset() && args[1].is_const()
}

This implementation is too weak. The following optimizer rule requires is_now_offset to be at least as strict as WatermarkAnalyzer. That held before this PR, but this change breaks it.

if let Some((input_expr, cmp, now_expr)) = expr.as_now_comparison_cond() {
    let now_expr = rewriter.rewrite_expr(now_expr);
    // as a sanity check, ensure that this expression will derive a watermark
    // on the output of the now executor
    debug_assert_eq!(
        try_derive_watermark(&now_expr),
        WatermarkDerivation::Watermark(lhs_len)
    );

For example, interval '1' month is allowed by is_now_offset but rejected by WatermarkAnalyzer. So the following query (inspired by the earlier temporal_filter.slt failure in CI) would fail the debug_assert above:

create table t1 (v1 timestamp);
create materialized view mv1 as select v1 from t1 where v1 between now() and now() + interval '1 month';

It is not hard to update the implementation here to keep that invariant, but I prefer to fix the problem without touching too many different components.
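
For illustration only, a sketch of what keeping the invariant could look like. The Interval and Expr types are hypothetical simplifications, and the rule that the watermark side rejects exactly the intervals with a non-zero month component is an assumption based on the one example named above; the real WatermarkAnalyzer conditions may be stricter.

```rust
/// Hypothetical interval with the usual three components.
#[derive(Debug, Clone, Copy)]
struct Interval {
    months: i32,
    days: i32,
    usecs: i64,
}

/// Hypothetical, simplified expression tree.
enum Expr {
    Now,
    Const(Interval),
    AddWithTimeZone(Box<Expr>, Box<Expr>),
    Column(usize),
}

/// Assumption: the watermark derivation only accepts offsets without a
/// month component, since a month has no fixed length.
fn watermark_accepts_offset(iv: &Interval) -> bool {
    iv.months == 0
}

/// `is_now_offset` accepts an offset only if the watermark side would too,
/// so anything it accepts will also derive a watermark.
fn is_now_offset(expr: &Expr) -> bool {
    match expr {
        Expr::Now => true,
        Expr::AddWithTimeZone(base, delta) => {
            is_now_offset(base)
                && match delta.as_ref() {
                    Expr::Const(iv) => watermark_accepts_offset(iv),
                    _ => false,
                }
        }
        _ => false,
    }
}
```

Under this sketch, now() + interval '1 month' is no longer classified as a now-offset, so the rewrite rule with the debug_assert would not be applied to it.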

@xiangjinwu xiangjinwu closed this Oct 5, 2023
@xiangjinwu xiangjinwu deleted the fix-12523-fold-const-panic-wo-zone branch October 9, 2023 02:01
Development

Successfully merging this pull request may close these issues.

timestamp without timezone panic when selecting from kafka source with _rw_kafka_timestamp
1 participant