-
Notifications
You must be signed in to change notification settings - Fork 590
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat(udf): introduce #[derive(StructType)]
for struct types in Rust UDF
#15372
Conversation
Signed-off-by: Runji Wang <[email protected]>
Signed-off-by: Runji Wang <[email protected]>
Signed-off-by: Runji Wang <[email protected]>
Signed-off-by: Runji Wang <[email protected]>
let script = format!("#[arrow_udf::function(\"{identifier}\")]\n{function_body}"); | ||
body = Some(function_body.clone()); | ||
let script = format!("use arrow_udf::{{function, types::*}};\n{}", script); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Due to the introduction of the new struct syntax, users now need to write #[function(...)] explicitly (previously we would prepend this macro for them).
What about checks whether user uses #[function
in the script, and decide whether to added this for them? 🤡 In this way, we can keep the simple old syntax.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sounds a little hacky. But I like this idea. 🤡
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Final rule:
- if the return type is a struct, they have to write
#[function]
by themselves. - otherwise, we add it for them.
Can you elaborate more on what blocks us to infer the
|
Previously the attribute is: #[function("key_value(varchar) -> struct<key:varchar, value:varchar>")] We have changed it to: #[function("key_value(varchar) -> struct KeyValue")] where If we also support user-defined type in SQL: create type KeyValue as (key varchar, value varchar); The syntax could be more natural and consistent: create function key_value(varchar) returns KeyValue language rust as $$
// generate: #[function("key_value(varchar) -> struct KeyValue")]
fn key_value(kv: &str) -> Option<KeyValue<'_>> {...}
$$;
Yes, it's feasible. While we don't support user-defined type, we can still infer its struct schema from the WASM binary. The binary has stored the schema as a string, like |
Signed-off-by: Runji Wang <[email protected]>
e90b403
to
536a675
Compare
Signed-off-by: Runji Wang <[email protected]>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Generally LGTM
Signed-off-by: Runji Wang <[email protected]>
Signed-off-by: Runji Wang <[email protected]>
Signed-off-by: Runji Wang <[email protected]>
I hereby agree to the terms of the RisingWave Labs, Inc. Contributor License Agreement.
What's changed and what's your intention?
See background in arrow-udf/arrow-udf#2.
This PR changes the syntax for returning struct types in Rust UDFs, making it easier for users to define complex struct type.
It has been released with arrow-udf v0.2. The old syntax in v0.1 is deprecated. But old binaries built with v0.1 are still accepted. We maintain backward compatibility for WASM binaries.
This PR also optimized the speed of the statement above. Previously, it took over 10s because it always required a clean build of the Rust code. Now, we set up a fixed temporary directory for it so that we can reuse the build cache. Building from the second time the can be shortened to about 2s.
Checklist
./risedev check
(or alias,./risedev c
)Documentation
Release note
updates the syntax in #14903