[FEA] kernel for date_trunc and trunc that has a scalar format #11860

revans2 · 2024-12-11T16:00:09Z

Is your feature request related to a problem? Please describe.
The current implementation of trunc and date_trunc uses a kernel where the format is a column.

This works great. In my tests I saw a 180x speedup over the CPU (16 cores). But we could save a lot of memory if the format is a scalar, which I think is the most common case.

In the best case, where the format string is "DD", a new kernel would save about 6 bytes per row of input. For a date, the format column is a 150% increase in memory usage. For a timestamp it is only a 75% increase. In the worst case, the format for a date is "QUARTER", which is a 275% increase in memory usage. For a timestamp it is "MICROSECOND", which is a 187.5% increase. This is probably minor, but it would be nice.
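The percentages above can be reproduced with a quick back-of-the-envelope calculation. This is only a sketch under assumptions that are not stated in the issue: a date is taken as 4 bytes, a timestamp as 8 bytes, and each row of a string format column is taken to cost one 4-byte offset plus the UTF-8 bytes of the format. The function names are hypothetical.

```python
# Assumed sizes (not stated in the issue): date = 4 bytes, timestamp = 8 bytes,
# and one 4-byte offset per row in a strings column.
DATE_BYTES = 4
TIMESTAMP_BYTES = 8
OFFSET_BYTES = 4


def format_bytes_per_row(fmt: str) -> int:
    """Bytes one row of a format column costs: offset plus character data."""
    return OFFSET_BYTES + len(fmt.encode("utf-8"))


def overhead_percent(fmt: str, value_bytes: int) -> float:
    """Extra memory of the format column relative to the input value column."""
    return 100.0 * format_bytes_per_row(fmt) / value_bytes


if __name__ == "__main__":
    for fmt, size, kind in [
        ("DD", DATE_BYTES, "date"),
        ("DD", TIMESTAMP_BYTES, "timestamp"),
        ("QUARTER", DATE_BYTES, "date"),
        ("MICROSECOND", TIMESTAMP_BYTES, "timestamp"),
    ]:
        print(f"{fmt!r} on a {kind}: {overhead_percent(fmt, size)}% extra")
```

Under these assumptions the numbers match the ones quoted: 6 bytes per row for "DD", 11 for "QUARTER", and 15 for "MICROSECOND". The estimate only applies if the format is expanded to one string per input row.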

ttnghia · 2024-12-11T18:42:20Z

The current JNI implementation is already optimized for memory. Although the scalar is promoted into a column, that column has only one row, so there is no change in memory usage.

However, I agree that we can optimize further, but in terms of performance. Currently, the format string is parsed when processing every row, even when there is only one format value (a format column of size one). We can do better by parsing the scalar format once on the host before calling the kernel, so the GPU does not spend time parsing it again for each row. I'll post a JNI PR shortly.
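To illustrate the idea of parsing the scalar format once up front, here is a minimal host-side sketch in plain Python. It is not the JNI or CUDA code; the names `Unit`, `parse_format`, and `date_trunc_scalar` are hypothetical, only a small subset of format strings is handled, and matching formats case-insensitively and returning nulls for an unrecognized format are assumptions made for the example.

```python
from datetime import datetime
from enum import Enum
from typing import List, Optional


class Unit(Enum):
    YEAR = 1
    MONTH = 2
    DAY = 3
    HOUR = 4


# Simplified subset of format aliases, for illustration only.
_ALIASES = {
    "YEAR": Unit.YEAR, "YYYY": Unit.YEAR, "YY": Unit.YEAR,
    "MONTH": Unit.MONTH, "MM": Unit.MONTH, "MON": Unit.MONTH,
    "DAY": Unit.DAY, "DD": Unit.DAY,
    "HOUR": Unit.HOUR,
}


def parse_format(fmt: str) -> Optional[Unit]:
    """Parse the format string once; None means it was not recognized."""
    return _ALIASES.get(fmt.upper())


def truncate(ts: datetime, unit: Unit) -> datetime:
    """Per-row work: no string handling, only a branch on the parsed unit."""
    if unit is Unit.YEAR:
        return ts.replace(month=1, day=1, hour=0, minute=0, second=0, microsecond=0)
    if unit is Unit.MONTH:
        return ts.replace(day=1, hour=0, minute=0, second=0, microsecond=0)
    if unit is Unit.DAY:
        return ts.replace(hour=0, minute=0, second=0, microsecond=0)
    return ts.replace(minute=0, second=0, microsecond=0)  # Unit.HOUR


def date_trunc_scalar(fmt: str, column: List[Optional[datetime]]) -> List[Optional[datetime]]:
    unit = parse_format(fmt)  # parsed once, outside the per-row loop
    if unit is None:
        return [None] * len(column)  # assumed behavior for an invalid format
    return [None if ts is None else truncate(ts, unit) for ts in column]


if __name__ == "__main__":
    rows = [datetime(2024, 12, 11, 16, 0, 9), None]
    print(date_trunc_scalar("dd", rows))
```

The point of the design is that the per-row function receives an already-parsed unit, so the cost of interpreting the format string is paid once per call instead of once per row.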

revans2 added ? - Needs Triage Need team to review and classify feature request New feature or request labels Dec 11, 2024

revans2 mentioned this issue Dec 11, 2024

Support trunc and date_trunc SQL function #11833

Merged

ttnghia linked a pull request Dec 13, 2024 that will close this issue

Support trunc and date_trunc SQL function #11833

Merged

ttnghia closed this as completed in #11833 Dec 14, 2024

sameerz added performance A performance related task/issue and removed feature request New feature or request ? - Needs Triage Need team to review and classify labels Dec 17, 2024
