Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat(expr): distributed make timestamptz #13647

Merged
merged 9 commits into from
Nov 30, 2023

Conversation

KeXiangWang
Copy link
Contributor

@KeXiangWang KeXiangWang commented Nov 25, 2023

I hereby agree to the terms of the RisingWave Labs, Inc. Contributor License Agreement.

What's changed and what's your intention?

Support distributed make_timestamptz.
CapturedContext is introduced for pass session info like timezone.

There will be two part where we need to utilized the captured context:

  1. When executing the executors, the contexted is used to finish the caluculation.
  2. When building exectuors, we may need to call the execute RPC in task_service to create a task on other CN as data source. At this time, we need to the capture these values and pass these values to the CN throught the execute RPC.

Checklist

  • I have written necessary rustdoc comments
  • I have added necessary unit tests and integration tests
  • I have added test labels as necessary. See details.
  • I have added fuzzing tests or opened an issue to track them. (Optional, recommended for new SQL features Sqlsmith: Sql feature generation #7934).
  • My PR contains breaking changes. (If it deprecates some features, please create a tracking issue to remove them in the future).
  • All checks passed in ./risedev check (or alias, ./risedev c)
  • My PR changes performance-critical code. (Please run macro/micro-benchmarks and show the results.)
  • My PR contains critical fixes that are necessary to be merged into the latest release. (Please check out the details)

Documentation

  • My PR needs documentation updates. (Please use the Release note section below to summarize the impact on users)

Release note

If this PR includes changes that directly affect users or other significant modifications relevant to the community, kindly draft a release note to provide a concise summary of these changes. Please prioritize highlighting the impact these changes will have on users.

@KeXiangWang KeXiangWang requested a review from a team as a code owner November 25, 2023 00:05
Copy link

codecov bot commented Nov 25, 2023

Codecov Report

Attention: 55 lines in your changes are missing coverage. Please review.

Comparison is base (ce0121f) 68.21% compared to head (df139f6) 68.21%.

Files Patch % Lines
src/frontend/src/scheduler/distributed/stage.rs 22.72% 34 Missing ⚠️
src/batch/src/rpc/service/task_service.rs 0.00% 9 Missing ⚠️
src/expr/core/src/captured_execution_context.rs 42.85% 4 Missing ⚠️
src/batch/src/task/task_manager.rs 84.61% 2 Missing ⚠️
src/rpc_client/src/compute_client.rs 0.00% 2 Missing ⚠️
src/batch/src/execution/grpc_exchange.rs 0.00% 1 Missing ⚠️
src/batch/src/task/task_execution.rs 93.75% 1 Missing ⚠️
src/expr/impl/src/scalar/make_timestamptz.rs 0.00% 1 Missing ⚠️
src/expr/macro/src/lib.rs 96.55% 1 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main   #13647      +/-   ##
==========================================
- Coverage   68.21%   68.21%   -0.01%     
==========================================
  Files        1523     1524       +1     
  Lines      262264   262330      +66     
==========================================
+ Hits       178902   178938      +36     
- Misses      83362    83392      +30     
Flag Coverage Δ
rust 68.21% <57.03%> (-0.01%) ⬇️

Flags with carried forward coverage won't be shown. Click here to find out more.

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

Copy link
Member

@fuyufjh fuyufjh left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the work! Generally LGTM.

proto/batch_plan.proto Outdated Show resolved Hide resolved
@@ -359,3 +359,7 @@ message PlanFragment {
PlanNode root = 1;
ExchangeInfo exchange_info = 2;
}

message CapturedContext {
Copy link
Member

@fuyufjh fuyufjh Nov 25, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here the CapturedContext message is defined in batch_plan.proto, while in src/expr/core/src/captured_context.rs, it's said to be "For all execution mode".

So shall we move this to somewhere else like expr.proto or compute.proto or common.proto?

Btw, please add some comments for it.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

CaptureContext is not a good name here. We won't pass every captured context to CN (it's impossible).

A concrete name may be better.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm about to move it stream_plan as we also need this for streaming execution mode.

For naming, what about SessionContext? As this struct is mainly used to capture various sessional values for various execution modes.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would prefer to name it ExecutionContext since it's about one query execution. It may include things even more limited to one execution

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ExecutionContext is good, but it conflicts with an exsiting struct in frontend codes. I'm concerning it will cause confusion.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Or ExecutionContextInfo? In fact I expect the existing struct should contains such info.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Contextand Info together seems to be a little redundant. Maybe CapturedExecutionContext?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

Copy link
Contributor

@xiangjinwu xiangjinwu Nov 30, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm about to move it stream_plan as we also need this for streaming execution mode.

Why not plan_common.proto?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

😂It‘s actually not used in batch_plan.proto, so stream_plan.proto is enough. But plan_common.proto make more sense. I'll move it to plan_common.proto.

src/frontend/src/scheduler/distributed/stage.rs Outdated Show resolved Hide resolved
Copy link
Member

@BugenZhao BugenZhao left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

From my understanding, the contexts were all previously defined in the frontend crate because...

  • We are sure they're evaluated in the frontend (local mode), as they're mostly admin functions.
  • The context types are only available in frontend crate.

According to those, it seems reasonable to extract the TIME_ZONE context out of frontend and into expr crate. As a result, we have to find all paths to provide the context before evaluating. However, I'm considering whether there's really a strong reason for not inlining the timezone into the expression itself (as a constant argument). This is how we handle at_timezone and now before, which is really easy to implement and not invasive at all.

@BugenZhao
Copy link
Member

inlining the timezone into the expression itself

IIUC, simply adding a match arm of ExprType::MakeTimestampTz with 6 arguments and rewrite it to 7 arguments would work. 🤔

@KeXiangWang
Copy link
Contributor Author

IIUC, simply adding a match arm of ExprType::MakeTimestampTz with 6 arguments and rewrite it to 7 arguments would work. 🤔

Yes, current rewrite machism is friendly to expr part but cause some problems on frontend's logics. Thanks to @xiangjinwu for helping find the following issues caused by current rewrite mechanism:

#8415
#8403
The backend constructs its own expr and bypasses the frontend rewrite.

#7175 (comment)
#13640
Some parts' rewrite is missed.

#10086
#9580
#12633
The frontend rewrite can be finished in different stage. And sometimes the rewrite can cause conflict with the logic of is_now_offset and WatermarkAnalyzer.

#12633 (comment)
#12640
Invalid expr caused by not rewrite.

#10410
#13201
Another example of wrong rewrite logics.

In summary, current rewrite mechanism introduce challenges on frontend rewrite. It's easy to miss or incorrectly perform the rewrite. That's why I'm now trying to redesign the implementation of timezone, starting with make_timestamptz func.

This PR's implementation is somehow tricky but avoids the error-prone rewrite.

We can discuss which is better.

@KeXiangWang KeXiangWang force-pushed the wkx/distributed-make-timestamptz branch from b000219 to 42e6765 Compare November 28, 2023 22:57
Copy link
Member

@fuyufjh fuyufjh left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

Copy link
Contributor

@xiangjinwu xiangjinwu left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for establishing the new route!

src/batch/src/task/task_manager.rs Outdated Show resolved Hide resolved

let ctx = quote! { let ctx = #context; };
let mut body = quote! { #closure };
let fields = vec![format_ident!("time_zone")];
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Seems here is some space for improvement, but we can consider them later:

  • Better not hard coding time_zone, and find a way to "reflect" on all fields
  • Or it can be generalized to accept a predefined collection of contexts, so that src/frontend/src/scheduler/local.rs can also benefit from it.

Copy link
Contributor Author

@KeXiangWang KeXiangWang Nov 30, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's leave it to the streaming mode PR.

src/expr/core/src/captured_execution_context.rs Outdated Show resolved Hide resolved
Comment on lines +439 to +440
let exec = captured_execution_context_scope!(
captured_execution_context.clone(),
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As a result, we have to find all paths to provide the context before evaluating. ...(the frontend inline rewriting approach) is really easy to implement and not invasive at all.

We either need to find all paths to provide the context before evaluating, or find all paths to inline the context before evaluating. The primary goal of these make_timestamptz PRs are not to support this new function for users, but to be the pioneer and see if the new approach is better than existing one.

@KeXiangWang How did you figure out these places to provide context in this PR? If some of them slip through, how easy are they exposed as a runtime error? My impression was that there are less places and easier to surface compared to the expr rewrite approach (recent addition #13724). But this can be wrong and simply due to we had more time to receive bugs on the rewrite approach.

I also agree inlining is less invasive, to the extend that it is solely in frontend. But if approach A requires awareness in 10 submodules in 1 module, while approach B requires awareness in 2 submodules in 3 parent modules each, it may also be acceptable to consider A as more invasive because 10 > 2 * 3. Again these numbers are made up arbitrarily and personally I feel the difference between 10 and 6 is not large enough to justify it.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I went through the codes in frontend and run some test examples to help verify whether all of them are found.

The good thing is when there's an expression been evaluated without context, it will throw an error like "context TIME_ZONE not found" and the error is shown in psql console, making it easy to recognize. For streaming mode, it will generate one log with the same error message and when we read from the MV, the result of that row is missed.

Another benefit is that we only need to find all the paths (to provide the context before evaluating) once. Once it's finished correctly with make_timestamptz, unless we have new paths, we don't need to care it in following the implementation of other expressions.

@KeXiangWang KeXiangWang added this pull request to the merge queue Nov 30, 2023
Merged via the queue into main with commit 67c2d70 Nov 30, 2023
29 of 30 checks passed
@KeXiangWang KeXiangWang deleted the wkx/distributed-make-timestamptz branch November 30, 2023 19:52
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants