Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat(sql-udf): deep calling stack (recursion) prevention for sql udf #14392

Merged
merged 12 commits into from
Jan 12, 2024

Conversation

xzhseh
Copy link
Contributor

@xzhseh xzhseh commented Jan 5, 2024

I hereby agree to the terms of the RisingWave Labs, Inc. Contributor License Agreement.

What's changed and what's your intention?

As titled, to simulate pg's behavior when dealing with recursive sql udf, a SQL_UDF_MAX_RECURSIVE_DEPTH is set to 16.
The reason for this is that, during local testing, ~60 calls to bind_function will exhaust the stack space for the current running rw thread, which will then be killed by os, and the entire cluster will quit.

And note that some test cases are updated, the previous fragile format check for recursive definition is removed. (i.e., create function foo(INT) returns varchar language sql as $$select 'foo(INT)'$$; is totally valid now)

reference: #14139 (comment). (Please note that the implementation is not exact the same as the proposal in the discussion)

Checklist

  • I have written necessary rustdoc comments
  • I have added necessary unit tests and integration tests
  • I have added test labels as necessary. See details.
  • I have added fuzzing tests or opened an issue to track them. (Optional, recommended for new SQL features Sqlsmith: Sql feature generation #7934).
  • My PR contains breaking changes. (If it deprecates some features, please create a tracking issue to remove them in the future).
  • All checks passed in ./risedev check (or alias, ./risedev c)
  • My PR changes performance-critical code. (Please run macro/micro-benchmarks and show the results.)
  • My PR contains critical fixes that are necessary to be merged into the latest release. (Please check out the details)

Documentation

  • My PR needs documentation updates. (Please use the Release note section below to summarize the impact on users)

Release note

If this PR includes changes that directly affect users or other significant modifications relevant to the community, kindly draft a release note to provide a concise summary of these changes. Please prioritize highlighting the impact these changes will have on users.

Copy link
Member

@BugenZhao BugenZhao left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We're not sure 16 is safe. It depends on OS configuration, optimization level, target arch, and so on. So that's why I mentioned stacker as a fallback in #14139 (comment).

But the overall idea LGTM.

src/frontend/src/binder/mod.rs Outdated Show resolved Hide resolved
src/frontend/src/binder/expr/function.rs Outdated Show resolved Hide resolved
src/frontend/src/binder/mod.rs Outdated Show resolved Hide resolved
src/frontend/src/binder/mod.rs Outdated Show resolved Hide resolved
@xzhseh
Copy link
Contributor Author

xzhseh commented Jan 10, 2024

Test cases are updated (i.e., materialized view, valid recursive definition are covered), current approach is changed to global counting calling stack depth to prevent heavy binding along the way.

Note that this is not just related to potential recursive call, but also normal deep calling chain, we should be able to detect and timely return error for these cases, otherwise the cluster will panic and quit.

Since stacker is not quite useful for this problem, I decided to stick with normal counting approach, 16 allowance for bind_function should be fair and could cover most use cases.

cc @stdrc @xxchan @BugenZhao @wangrunji0408.

P.S. After private discussion, recursive sql udf support is postponed at present, maybe supported in the future.

Copy link
Contributor

@wangrunji0408 wangrunji0408 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Rest LGTM!

Comment on lines +296 to +297
// Update the status for the global counter
self.udf_context.incr_global_count();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It seems not right that this counter never decreases?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The counter will be reset to 0 after a success binding to a specific sql udf. (i.e., when cleaning the udf_context)

// Clean the `udf_context` & `udf_recursive_context` after inlining,
// which makes sure the subsequent binding will not be affected
if clean_flag {
    self.udf_context.clear();
}

Copy link
Contributor

@TennyZhuang TennyZhuang left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

src/frontend/src/binder/mod.rs Show resolved Hide resolved
@xzhseh xzhseh added this pull request to the merge queue Jan 12, 2024
Merged via the queue into main with commit 1afb0ec Jan 12, 2024
26 of 27 checks passed
@xzhseh xzhseh deleted the xzhseh/max-recursive-depth-sql-udf branch January 12, 2024 17:16
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants