-
Notifications
You must be signed in to change notification settings - Fork 597
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat(expr): Implement evaluation for CONCAT_WS expression #2470
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
license-eye has totally checked 800 files.
Valid | Invalid | Ignored | Fixed |
---|---|---|---|
798 | 1 | 1 | 0 |
Click to see the invalid file list
- src/expr/src/expr/expr_concat_ws.rs
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
license-eye has totally checked 800 files.
Valid | Invalid | Ignored | Fixed |
---|---|---|---|
798 | 1 | 1 | 0 |
Click to see the invalid file list
- src/expr/src/expr/expr_concat_ws.rs
|
2ab523d
to
bc43bda
Compare
Only wrote unit tests. Are there any integration tests I should be writing? |
Codecov Report
@@ Coverage Diff @@
## main #2470 +/- ##
==========================================
+ Coverage 71.39% 71.43% +0.03%
==========================================
Files 688 689 +1
Lines 88505 88611 +106
==========================================
+ Hits 63191 63298 +107
+ Misses 25314 25313 -1
Flags with carried forward coverage won't be shown. Click here to find out more.
📣 Codecov can now indicate which changes are the most critical in Pull Requests. Learn more |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice job!
src/expr/src/expr/expr_concat_ws.rs
Outdated
let mut builder = Utf8ArrayBuilder::new(input.cardinality())?; | ||
let row_len = string_columns[0].len(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is row_len
the same as input.cardinality()
?
src/expr/src/expr/expr_concat_ws.rs
Outdated
if sep.is_none() { | ||
builder.append(None)?; | ||
continue; | ||
} | ||
let sep = sep.unwrap(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
let sep = match sep {
Some(sep) => sep,
None => {
builder.append(None)?;
continue;
}
}
src/expr/src/expr/expr_concat_ws.rs
Outdated
let mut found_first = false; | ||
|
||
for string_column in &string_columns { | ||
let string_column = string_column.as_utf8(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Call as_utf8
once for each column rather than in the loop.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Need some help regarding this. I tried to call as_utf8
above when I initialized string_columns
: https://github.com/singularity-data/risingwave/pull/2470/files#diff-5c777fd52ffc145fbcdb963a452809c403db833992216a8361b724b555743217R45-R49.
This is the updated code. It does not compile because r.as_utf8()
returns data owned by the closure:
let string_columns = self
.string_exprs
.iter()
.map(|c| {
let r: ArrayRef /* */ = c.eval(input)?;
Ok(r.as_utf8())
})
.collect::<Result<Vec<_>>>()?;
I have also tried into_utf8()
, but that only works if we have ownership of ArrayImpl
, which we don't in this case.
Any other approach you can suggest?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
let string_columns_ref = string_columns.iter().map(|c| {
c.as_utf8()
}).collect::<Vec<_>>();
right after string_columns
is constructed.
Then use for string_column in &string_columns_ref
instead in the loop.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is the updated code, as per your suggestion:
let string_columns = self
.string_exprs
.iter()
.map(|c| c.eval(input))
.collect::<Result<Vec<Arc<ArrayImpl>>>>()?;
let string_columns_ref = string_columns
.iter()
.map(|c| c.as_utf8() )
.collect::<Vec<&Utf8Array>>();
To check my understanding: this works because string_columns
lifetime lasts until the end of the eval
function scope. Is this correct?
src/expr/src/expr/expr_concat_ws.rs
Outdated
} | ||
let sep = sep.unwrap(); | ||
|
||
let mut res = String::from(""); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There is PartialBytesWriter
in utf8_array.rs
that supporting writing directly to the Utf8ArrayBuilder. You can take a look at the doc comments on these types:
BytesWriter
, PartialBytesWriter
, BytesGuard
src/expr/src/expr/expr_concat_ws.rs
Outdated
let array = Utf8Array::from_slice(&[Some(","), None, Some(","), Some(",")]) | ||
.map(|x| Arc::new(x.into())) | ||
.unwrap(); | ||
let col1 = Column::new(array); | ||
let array = Utf8Array::from_slice(&[Some("a"), Some("a"), None, None]) | ||
.map(|x| Arc::new(x.into())) | ||
.unwrap(); | ||
let col2 = Column::new(array); | ||
let array = Utf8Array::from_slice(&[Some("b"), Some("b"), Some("a"), None]) | ||
.map(|x| Arc::new(x.into())) | ||
.unwrap(); | ||
let col3 = Column::new(array); | ||
|
||
let data_chunk = DataChunk::builder().columns(vec![col1, col2, col3]).build(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Try out DataChunk::from_pretty
src/expr/src/expr/expr_concat_ws.rs
Outdated
let array = Utf8Array::from_slice(&[Some(","), None, Some(","), Some(",")]) | ||
.map(|x| Arc::new(x.into())) | ||
.unwrap(); | ||
let col1 = Column::new(array); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
may use column!
macro
src/expr/src/expr/expr_concat_ws.rs
Outdated
for string_column in &string_columns { | ||
let string_column = string_column.as_utf8(); | ||
if let Some(string) = string_column.value_at(row_idx) { | ||
if !found_first { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
we can also move this found_first = true
and res.push_str(string);
out of the for string_column
loop to eliminate the if branch.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
What's changed and what's your intention?
Summarize your change (mandatory)
As per title. Changes for frontend will be implemented separately.
How does this PR work? Need a brief introduction for the changed logic (optional)
Describe clearly one logical change and avoid lazy messages (optional)
Describe any limitations of the current code (optional)
Checklist
Refer to a related PR or issue link (optional)
Part of #2405. Migrated over from #2427 for CI to run correctly. cc @skyzh .