-
Notifications
You must be signed in to change notification settings - Fork 594
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat(udf): add udf metrics to embeded udf, refactor external udf metrics. #15506
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM!
we decide to count each trigger as only one call. Event it actually do several call, we can still infer this info from the latency metrics.
Actually I am a little worried about one case. When always_retry_on_network_error
is enabled, the call may be retrying forever. In this case we will no longer see any UDF metrics, although error messages can still be seen in the log. Not sure if this would have any impact on diagnosing UDF problems. cc @kwannoel
@@ -17,7 +17,8 @@ | |||
|
|||
mod error; | |||
mod external; | |||
mod metrics; | |||
pub mod metrics; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this module can be moved to expr/core
now.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let me hold this first, I am thinking if we add some specific for in-flight udf, the metrics mod should better stay here. After solve this problem, I can move it in other PR.
src/expr/core/src/expr/expr_udf.rs
Outdated
// batch query does not have a fragment_id | ||
let fragment_id = FRAGMENT_ID::try_with(ToOwned::to_owned).unwrap_or(0); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The comment should be moved together.
src/expr/core/src/expr/expr_udf.rs
Outdated
UdfImpl::External(client) => client.get_addr(), | ||
_ => "", | ||
}; | ||
let labels: &[&str; 3] = &[addr, &self.identifier, fragment_id.as_str()]; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What do you think if we add language
to the labels?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good point, I can add it.
Hmm could we have a way to track in-flight UDF calls? So if the number stays high for a long time, we know the udf server has issues. |
Maybe I can add only udf failure count for |
I hereby agree to the terms of the RisingWave Labs, Inc. Contributor License Agreement.
What's changed and what's your intention?
link
label will be empty ``. Show in following example, the first three record udf is external udf, the last one are embeded udf:language
labels:wasm
,javascript
,python
,external
, for example, the first three isexternal
, the last one isjavascript
Checklist
./risedev check
(or alias,./risedev c
)Documentation
Release note
If this PR includes changes that directly affect users or other significant modifications relevant to the community, kindly draft a release note to provide a concise summary of these changes. Please prioritize highlighting the impact these changes will have on users.