RFC: lambda expression #69
### level 2: scalar lambda expression

Implement a lambda expression function so that users can define their own `transform` logic for each element of an array.
risingwavelabs/risingwave#11123
Another part of this solution is to implement `transform` and `reduce` functions, whose inputs contain both the data column and the lambda expression.
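As a rough illustration of the `transform` and `reduce` functions mentioned above, here is a minimal Rust sketch. The signatures are assumptions made for illustration only: a plain slice stands in for the data column and a closure stands in for the lambda expression. This is not RisingWave's actual API.

```rust
/// Hypothetical stand-in for the proposed `transform`:
/// applies the lambda `f` to every element of the array.
fn transform<T, U>(array: &[T], f: impl Fn(&T) -> U) -> Vec<U> {
    array.iter().map(f).collect()
}

/// Hypothetical stand-in for the proposed `reduce`:
/// folds the array into a single value, starting from `init`.
fn reduce<T, S>(array: &[T], init: S, f: impl Fn(S, &T) -> S) -> S {
    array.iter().fold(init, f)
}
```

Under these assumptions, `transform(&[1, 2, 3], |x| x * 2)` doubles each element, and `reduce(&[1, 2, 3], 0, |acc, x| acc + x)` sums them.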
I find that we can unnest this kind of subquery in some way. Here is a PR on how to transpose Apply with ProjectSet. risingwavelabs/risingwave#11390.
LGTM, but still not the best because it introduces an unnecessary Join operator 🤔
True. It could be a temporary solution. From a performance perspective, this RFC could do better.
Co-authored-by: TennyZhuang <[email protected]>
Co-authored-by: CAJan93 <[email protected]>
```SQL
select *,
    array_sum(arr_i),
    array_sum(transform(arr_json, v -> (v->'x')::int))
```
It'll be a little confusing to use `->` for two purposes, and it increases the complexity of the parser and binder. I'd propose using `=>` for lambda functions.
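To make the two roles of `->` concrete: in `transform(arr_json, v -> (v->'x')::int)`, the first arrow introduces the lambda and the second is JSON field access. A rough Rust analogue of `array_sum(transform(arr_json, v -> (v->'x')::int))` is sketched below. The `JsonObject` alias and the `sum_field_x` name are hypothetical, a `HashMap` stands in for a JSON object, and a missing key is treated as 0 (an assumption, since SQL would yield NULL there).

```rust
use std::collections::HashMap;

/// Hypothetical stand-in for a JSON object with integer fields.
type JsonObject = HashMap<String, i64>;

/// Rough analogue of `array_sum(transform(arr_json, v -> (v->'x')::int))`.
fn sum_field_x(arr_json: &[JsonObject]) -> i64 {
    arr_json
        .iter()
        // The closure plays the role of the lambda arrow; `get("x")`
        // plays the role of the JSON access `v->'x'`.
        // Assumption: a missing key contributes 0 to the sum.
        .map(|v| v.get("x").copied().unwrap_or(0))
        .sum()
}
```

In Rust the closure syntax and the field lookup are visibly different constructs, which is the separation the comment above is asking for.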
The latest progress: TL;DR
### The lambda syntax

In this RFC, we proposed using the arrow symbol “->” as the syntax for lambda expressions, such as […]. A bad case: how should […] We also proposed the […]

```SQL
-- Lambda function
SELECT f(ARRAY[1,2,3], x => x * 2);
-- Named notation
SELECT abs(a => -1);
-- ???
SELECT f(apply => (x => x * 2), array => ARRAY[1,2,3]);
```

There are definitely many compiler techniques that can solve the problem of ambiguity in the above syntax. After all, it is much simpler compared to C++ syntax. However, as an experimental feature, we do not want to introduce very complex refactors into the parser for it. We have decided to look for a syntax that is not ambiguous and relatively simple to support this feature.

The final choice is a Rust-like syntax […] In the current pg 15, there is no built-in operator that allows the prefix […] Of course, another important reason is that we like Rust.

Choosing the Rust-like syntax in the MVP version does not mean we will not change. Because the cost of implementing it is very low, even if we eventually choose another syntax, we can simply deprecate the Rust-like one and maintain compatibility with it forever without much effort. If PostgreSQL eventually supports the lambda function using another syntax, or our users really like the […] The […]
It is possible to introduce another macro to generate array scalar functions from aggregate definitions. It may look like:

```rust
#[aggregate("sum(int64) -> int64")] // original aggregate
#[array_function("array_sum(int64[]) -> int64")] // the new macro
fn sum(state: i64, input: i64) -> i64 {
    state + input
}
```

This approach requires no extra code for each function; instead, it puts all complexity into the macro. By contrast, manually implementing each […]

```rust
#[function("array_sum(int64[]) -> int64")]
fn array_sum(array: ListRef<'_>) -> i64 {
    array.as_int64().iter().sum()
}
```

So I would also prefer manually implementing each array function in the first step.
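To compare the two approaches without the macros, here is a minimal sketch in plain Rust. It shows roughly what a macro that derives an array function from an aggregate step would have to generate (a fold over the elements), next to the manual version. `array_from_aggregate` and the explicit `init` parameter are hypothetical, and `&[i64]` is used in place of `ListRef` so the example stands alone.

```rust
/// The aggregate step function from the comment above.
fn sum(state: i64, input: i64) -> i64 {
    state + input
}

/// Hypothetical: roughly what a generated array function could do,
/// folding the aggregate step over every element of the array.
fn array_from_aggregate(step: impl Fn(i64, i64) -> i64, init: i64, array: &[i64]) -> i64 {
    array.iter().fold(init, |state, &v| step(state, v))
}

/// The manual version, using a plain slice instead of `ListRef`.
fn array_sum(array: &[i64]) -> i64 {
    array.iter().sum()
}
```

Both paths give the same result for `sum`. The generic path needs an initial state for each aggregate, which is one of the details a macro would have to handle.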
Just ran into this:
use lambda expression to enhance array's complex processing