feat: WASM UDF MVP #10910
237d240
to
3039d04
Compare
src/udf/src/wasm.rs
Outdated
    identifier: &str,
) -> WasmUdfResult<InstantiatedComponent> {
    let object_store = get_wasm_storage(wasm_storage_url).await?;
    let serialized_component = object_store.read(&compiled_path(identifier), None).await?;
FIXME: This gets stuck forever when I use S3. WHY??? 🥵
Update: this was because of using the `futures` rt inside tokio `block_on` (or sth). Already fixed by `tokio::task::block_in_place(|| { tokio::runtime::Handle::current().block_on`
b52a004
to
26eaf7a
Compare
commit b52a004 Author: xxchan Date: Thu Oct 26 14:25:18 2023 +0800 update arrow-ipc
commit e94feeb Author: xxchan Date: Thu Oct 26 06:21:34 2023 +0000 Fix "cargo-hakari"
commit 08a5601 Merge: 56e6fc4 942e99d Author: xxchan Date: Thu Oct 26 14:19:34 2023 +0800 Merge branch 'main' into xxchan/wasm-udf
commit 942e99d Author: Yufan Song Date: Wed Oct 25 22:10:31 2023 -0700 fix(nats-connector): change stream into optional string, add replace stream name logic (#13024)
commit 90fb4a3 Author: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> Date: Thu Oct 26 04:25:11 2023 +0000 chore(deps): Bump comfy-table from 7.0.1 to 7.1.0 (#13049) Signed-off-by: dependabot[bot] Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
commit b724be7 Author: jinser Date: Thu Oct 26 00:26:15 2023 +0800 feat: add `comment on` clause support (#12849) Co-authored-by: Richard Chien Co-authored-by: August
commit 7f791d6 Author: August Date: Wed Oct 25 20:29:16 2023 +0800 feat: move model_v2 and model_migration into a separate crates (#13058)
commit 7f82929 Author: Noel Kwan Date: Wed Oct 25 16:57:45 2023 +0800 fix(meta): persist internal tables of `CREATE TABLE` (#13039)
commit 09a67ab Author: Noel Kwan Date: Wed Oct 25 16:49:08 2023 +0800 fix: `WAIT` should return error if timeout (#13045)
commit e48547d Author: Runji Wang Date: Wed Oct 25 16:41:16 2023 +0800 refactor(type): switch jsonb to flat representation (#12952) Signed-off-by: Runji Wang
commit 56e6fc4 Author: xxchan Date: Wed Oct 25 15:33:36 2023 +0800 fix merge issue
commit c644361 Merge: fcd6992 2d428b1 Author: xxchan Date: Wed Oct 25 15:23:44 2023 +0800 Merge remote-tracking branch 'origin/main' into xxchan/wasm-udf
commit fcd6992 Author: xxchan Date: Wed Oct 25 14:28:53 2023 +0800 fix s3 stuck
commit 21e9740 Author: xxchan Date: Wed Oct 25 12:47:24 2023 +0800 Revert "fix s3 stuck (why?)" This reverts commit f19a6b4.
commit f19a6b4 Author: xxchan Date: Wed Sep 13 14:32:28 2023 +0800 fix s3 stuck (why?)
commit 019f309 Author: xxchan Date: Tue Sep 12 15:29:52 2023 +0800 ON_ERROR_STOP=1
commit 6e4ee3c Author: xxchan Date: Tue Sep 12 15:09:58 2023 +0800 generate-config
commit b63a1c3 Merge: 2b0cc96 53611bf Author: xxchan Date: Tue Sep 12 14:53:10 2023 +0800 Merge remote-tracking branch 'origin/main' into xxchan/wasm-udf
commit 2b0cc96 Author: xxchan Date: Sat Sep 9 23:49:43 2023 +0800 fix conflicts
commit 6b13fe3 Author: xxchan Date: Sat Sep 9 23:35:50 2023 +0800 update system param default
commit a273943 Merge: cc34bfe f649aa6 Author: xxchan Date: Sat Sep 9 23:33:38 2023 +0800 Merge remote-tracking branch 'origin/main' into xxchan/wasm-udf
commit cc34bfe Author: xxchan Date: Tue Aug 1 17:47:42 2023 +0200 use count_char as the example
commit f913f63 Merge: 53bf8e0 2637dbd Author: xxchan Date: Tue Aug 1 17:22:13 2023 +0200 Merge branch 'main' into xxchan/wasm-udf
commit 53bf8e0 Author: xxchan Date: Mon Jul 31 14:20:07 2023 +0200 minor update
commit 70cee42 Author: xxchan Date: Mon Jul 17 14:53:29 2023 +0200 fix arrow_schema into -> try_into
commit a7d172d Author: xxchan Date: Fri Jul 14 16:31:20 2023 +0200 buf format
commit 43a3290 Author: xxchan Date: Thu Jul 13 23:04:16 2023 +0200 add tinygo example & turn on wasi support
commit 61a4998 Author: xxchan Date: Wed Jul 12 11:40:56 2023 +0200 cleanup
commit 165d4d9 Author: xxchan Date: Wed Jul 12 11:02:44 2023 +0200 use object store to store wasm
commit 88979e4 Author: xxchan Date: Tue Jul 11 15:32:52 2023 +0200 add wasm_storage_url system param
commit a897320 Author: xxchan Date: Thu Jul 6 20:04:45 2023 +0200 Load compiled wasm module in expr 🚀🚀🚀
commit 63b3523 Author: xxchan Date: Sun Jul 2 19:27:22 2023 +0200 it works (although very slow)
26eaf7a
to
e111248
Compare
69a280b
to
5b53a7c
Compare
Signed-off-by: Runji Wang <[email protected]>
5b53a7c
to
97f64e3
Compare
I have tested and it works! Great job! Next I'm going to review the code and begin to learn wasm. 🤡
src/udf/Cargo.toml
Outdated
cfg-or-panic = "0.2"
futures-util = "0.3.28"
itertools = "0.11"
risingwave_object_store = { workspace = true }
I think it would be better to decouple UDF from object store and put related code into the compute node.
src/storage/hummock_trace/Cargo.toml
Outdated
@@ -25,7 +25,7 @@ tokio = { version = "0.2", package = "madsim-tokio" }
 tracing = "0.1"

 [dev-dependencies]
-itertools = "0.10.5"
+itertools = "0.11"
-itertools = "0.11"
+itertools = "0.12"
src/udf/wit_example/create.sh
Outdated
# debug: 23557258
# release: 12457072
The binary size looks much larger than I imagined. Is there a way to strip the binary? Or is it possible to build for the target `wasm32-unknown-unknown`?
> is it possible to build on target `wasm32-unknown-unknown`?
This is not possible and there's a long story. But I forget the details now. 🤡
> Is there a way to strip the binary?
I think so. I remember there are some tools for this purpose. And IIRC currently debuginfo is also included.
Please tell me when you recall the story. 🤡
Maybe my memory was wrong.
I remember:
- We can definitely compile a wit module to `wasm32-unknown-unknown` and run it.
- There are many things that don't work in `wasm32-unknown-unknown`.
- From the runtime side (host), the APIs for wasi and non-wasi are different.
But I don't remember:
- (Mainly) Whether the WIT UDF example (especially `arrow-rs`) works in `wasm32-unknown-unknown`.
- Whether a non-wasi module can be run with the wasi host API.
Maybe I was using wasi just because I can `println` 🤡
I will also need to relearn it 🤡
WasmUdf(
    #[from]
    #[backtrace]
    risingwave_udf::wasm::WasmUdfError,
Can this error be merged into `risingwave_udf::Error`?
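To make the question concrete, here is a rough sketch of what folding the WASM error into a single crate-level error could look like. It uses only the standard library and hand-written `From`/`Display` impls. Both enums are hypothetical stand-ins (the real variants of `risingwave_udf::Error` and `WasmUdfError` are not shown in this thread), and `Connection` is just a placeholder for whatever the crate-level error already contains.

```rust
use std::error::Error as StdError;
use std::fmt;

/// Hypothetical stand-in for `risingwave_udf::wasm::WasmUdfError`.
#[derive(Debug)]
pub enum WasmUdfError {
    Instantiate(String),
    Eval(String),
}

impl fmt::Display for WasmUdfError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            WasmUdfError::Instantiate(msg) => write!(f, "instantiation failed: {msg}"),
            WasmUdfError::Eval(msg) => write!(f, "eval failed: {msg}"),
        }
    }
}

impl StdError for WasmUdfError {}

/// Hypothetical stand-in for the crate-level `risingwave_udf::Error`.
#[derive(Debug)]
pub enum Error {
    /// Placeholder for the variants the crate-level error already has.
    Connection(String),
    /// The WASM error becomes one more variant of the same type.
    Wasm(WasmUdfError),
}

impl fmt::Display for Error {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            Error::Connection(msg) => write!(f, "connection error: {msg}"),
            Error::Wasm(e) => write!(f, "wasm udf error: {e}"),
        }
    }
}

impl StdError for Error {
    fn source(&self) -> Option<&(dyn StdError + 'static)> {
        match self {
            Error::Wasm(e) => Some(e),
            Error::Connection(_) => None,
        }
    }
}

impl From<WasmUdfError> for Error {
    fn from(e: WasmUdfError) -> Self {
        Error::Wasm(e)
    }
}

/// Simulated WASM call that can fail with the WASM-specific error.
fn run_wasm(fail: bool) -> Result<u32, WasmUdfError> {
    if fail {
        Err(WasmUdfError::Eval("trap".to_string()))
    } else {
        Ok(1)
    }
}

/// Callers only ever see the crate-level `Error`; `?` converts via `From`.
pub fn eval_udf(fail: bool) -> Result<u32, Error> {
    let rows = run_wasm(fail)?;
    Ok(rows)
}
```

With this shape, a downstream error enum would need a single `#[from]` variant for the UDF crate instead of one per backend.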
// for backward compatibility, newly added fields should be optional
pub extra: Option<PbExtra>,
I think this struct can be reorganized without worrying about backward compatibility like protobuf messages.
src/udf/wit_example/rust/Cargo.toml
Outdated
[profile.release]
debug = 1
Without this, the binary size in release mode could be reduced to 3MB.
-[profile.release]
-debug = 1
src/udf/wit_example/rust/src/lib.rs
Outdated
}

impl Udf for CountChar {
    fn eval(batch: RecordBatch) -> Result<RecordBatch, EvalErrno> {
I'm planning to build another `#[function]` macro to generate this code. It would be very similar to our internal `#[function]` macro. The only difference is fitting Arrow arrays.
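As a rough picture of what such generated code has to do, the sketch below lifts a scalar function over nullable columns by hand. It is only an illustration under assumptions: plain slices of `Option` stand in for Arrow arrays, the semantics of `count_char` (counting occurrences of one character) are a guess from the example's name, and `eval_count_char` is a hypothetical name for the wrapper a macro might emit.

```rust
/// Hypothetical scalar kernel: how many times `needle` occurs in `haystack`.
fn count_char(haystack: &str, needle: char) -> i64 {
    haystack.chars().filter(|&c| c == needle).count() as i64
}

/// Hand-written version of the wrapper a macro could generate:
/// apply the scalar function row by row and propagate NULLs.
/// Slices of `Option` stand in for Arrow arrays here.
pub fn eval_count_char(
    haystacks: &[Option<&str>],
    needles: &[Option<char>],
) -> Vec<Option<i64>> {
    haystacks
        .iter()
        .zip(needles)
        .map(|(h, n)| match (h, n) {
            // Both inputs present: evaluate the scalar kernel.
            (Some(h), Some(n)) => Some(count_char(h, *n)),
            // Any NULL input yields a NULL output.
            _ => None,
        })
        .collect()
}
```

The user would write only `count_char`; everything in `eval_count_char` is the boilerplate the macro is meant to remove.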
src/udf/wit/udf.wit
Outdated
// TODO: is schema needed? since record-batch already contains schema.
export input-schema: func() -> schema
export output-schema: func() -> schema
I think it's needed as a signature of the function.
src/udf/wit/udf.wit
Outdated
export input-schema: func() -> schema
export output-schema: func() -> schema

// export init: func(inputs: list<scalar>) -> result<_, init-errno>
What's this method for?
Signed-off-by: Runji Wang <[email protected]>
Signed-off-by: Runji Wang <[email protected]>
f697a5e
to
d5db2b6
Compare
@@ -48,7 +51,7 @@ pub async fn handle_create_function(
         Some(lang) => {
             let lang = lang.real_value().to_lowercase();
             match &*lang {
-                "python" | "java" => lang,
+                "python" | "java" | "wasm_v1" => lang,
The version can be encoded as metadata in wasm binaries so that the language can be always "wasm".
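One way to read this suggestion is to keep the SQL language name as "wasm" and store the ABI version inside the binary, for example in a custom section. The sketch below is only an illustration under assumptions: the section name `udf-abi-version` and both function names are hypothetical, and the parser handles just the core-module framing (an 8-byte preamble, then repeated `id`, LEB128 size, contents), with no validation beyond bounds checks.

```rust
/// Decode an unsigned LEB128 `u32` at `*pos`, advancing `*pos` past it.
fn read_u32_leb128(bytes: &[u8], pos: &mut usize) -> Option<u32> {
    let mut result: u32 = 0;
    let mut shift = 0;
    loop {
        // A u32 needs at most 5 LEB128 bytes (shifts 0, 7, 14, 21, 28).
        if shift > 28 {
            return None;
        }
        let byte = *bytes.get(*pos)?;
        *pos += 1;
        result |= u32::from(byte & 0x7f) << shift;
        if byte & 0x80 == 0 {
            return Some(result);
        }
        shift += 7;
    }
}

/// Return the payload of the first custom section (id 0) called `name`.
pub fn find_custom_section<'a>(wasm: &'a [u8], name: &str) -> Option<&'a [u8]> {
    // Preamble: 4-byte magic "\0asm" followed by 4 version bytes.
    if wasm.len() < 8 || !wasm.starts_with(b"\0asm") {
        return None;
    }
    let mut pos = 8;
    while pos < wasm.len() {
        let id = wasm[pos];
        pos += 1;
        let size = read_u32_leb128(wasm, &mut pos)? as usize;
        let end = pos.checked_add(size)?;
        let body = wasm.get(pos..end)?;
        if id == 0 {
            // Custom section contents: length-prefixed name, then raw payload.
            let mut p = 0;
            let name_len = read_u32_leb128(body, &mut p)? as usize;
            let name_end = p.checked_add(name_len)?;
            if body.get(p..name_end)? == name.as_bytes() {
                return Some(&body[name_end..]);
            }
        }
        pos = end;
    }
    None
}

/// Hypothetical helper: read the UDF ABI version stored as UTF-8 text.
pub fn udf_abi_version(wasm: &[u8]) -> Option<&str> {
    let payload = find_custom_section(wasm, "udf-abi-version")?;
    std::str::from_utf8(payload).ok()
}
```

With something like this, `CREATE FUNCTION` could accept only "wasm" as the language and dispatch on the value returned by `udf_abi_version`.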
I hereby agree to the terms of the RisingWave Labs, Inc. Contributor License Agreement.
What's changed and what's your intention?
To test it, just run:
TODO
Checklist
`./risedev check` (or alias, `./risedev c`)
Documentation
Types of user-facing changes
Release note
Support WebAssembly UDF.
Usage
Users can create a WASM component with the WIT file `src/udf/wit/udf.wit` and Apache Arrow. They can use different programming languages (Rust and Go examples are provided).
To create a function with the compiled WASM component,
See `src/udf/wit_example` for example code & required tools.
Configuration
One additional system parameter, `wasm_storage_url`, is added (defaults to `fs://@/tmp/risingwave`). It is used to store the user-uploaded WASM file and intermediate compilation artifacts.