Add compute_column_expression to pylibcudf for transform.compute_column (#17279)

Follow up to #16760

`transform.compute_column` (backing `.eval`) requires an `Expression` object created by a private routine in cudf Python. Since any user of the public `transform.compute_column` will need this routine, this change moves it to pylibcudf.
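For readers unfamiliar with what such a routine does, here is a minimal sketch of turning an expression string plus a tuple of column names into an expression tree. It is not the pylibcudf implementation: the node classes `ColumnRef`, `Literal`, and `Operation`, the operator labels, and the function `sketch_to_expression` are hypothetical stand-ins, and only a handful of arithmetic operators are handled. The stack-based visitor follows the pattern hinted at by the removed comment in `transform.pyx` ("At the end, all the stack contains is the expression to evaluate").

```python
import ast
from dataclasses import dataclass


# Hypothetical node types for illustration; these are NOT the pylibcudf classes.
@dataclass(frozen=True)
class ColumnRef:
    index: int  # position of the column in the input table


@dataclass(frozen=True)
class Literal:
    value: object


@dataclass(frozen=True)
class Operation:
    op: str
    left: object
    right: object = None  # None for unary operations


# Assumed operator labels; a real backend defines its own operator enum.
_BINARY_OPS = {ast.Add: "ADD", ast.Sub: "SUB", ast.Mult: "MUL", ast.Div: "DIV"}
_UNARY_OPS = {ast.USub: "NEG"}


class ExpressionBuilder(ast.NodeVisitor):
    """Walk a Python AST and build an expression tree on a stack."""

    def __init__(self, column_names):
        self.column_names = tuple(column_names)
        self.stack = []

    @property
    def expression(self):
        # After a successful visit the stack holds exactly the root node.
        (root,) = self.stack
        return root

    def visit_Name(self, node):
        if node.id not in self.column_names:
            raise ValueError(f"Unknown column name: {node.id}")
        self.stack.append(ColumnRef(self.column_names.index(node.id)))

    def visit_Constant(self, node):
        self.stack.append(Literal(node.value))

    def visit_BinOp(self, node):
        op = _BINARY_OPS.get(type(node.op))
        if op is None:
            raise ValueError(f"Unsupported operator: {type(node.op).__name__}")
        self.visit(node.left)
        self.visit(node.right)
        right = self.stack.pop()
        left = self.stack.pop()
        self.stack.append(Operation(op, left, right))

    def visit_UnaryOp(self, node):
        op = _UNARY_OPS.get(type(node.op))
        if op is None:
            raise ValueError(f"Unsupported operator: {type(node.op).__name__}")
        self.visit(node.operand)
        self.stack.append(Operation(op, self.stack.pop()))

    def generic_visit(self, node):
        # Reject anything this sketch does not explicitly handle.
        raise ValueError(f"Unsupported syntax: {type(node).__name__}")


def sketch_to_expression(expr, column_names):
    """Parse `expr` and return the root of the (hypothetical) expression tree."""
    visitor = ExpressionBuilder(column_names)
    visitor.visit(ast.parse(expr, mode="eval").body)
    return visitor.expression
```

Under these assumptions, `sketch_to_expression("a + b * 2", ("a", "b"))` yields an `ADD` node whose right child is a `MUL` node, mirroring Python's operator precedence. Resolving names to column indices at parse time is one possible design choice; it means an unknown name fails early, before any table is touched.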

Authors:
  - Matthew Roeschke (https://github.com/mroeschke)
  - Lawrence Mitchell (https://github.com/wence-)

Approvers:
  - Lawrence Mitchell (https://github.com/wence-)
  - Vyas Ramasubramani (https://github.com/vyasr)

URL: #17279
mroeschke authored Nov 20, 2024
1 parent 56061bd commit 7158ee0
Showing 4 changed files with 226 additions and 237 deletions.
10 changes: 2 additions & 8 deletions python/cudf/cudf/_lib/transform.pyx
@@ -3,12 +3,10 @@
 from numba.np import numpy_support

 import cudf
-from cudf.core._internals.expressions import parse_expression
 from cudf.core.buffer import acquire_spill_lock, as_buffer
 from cudf.utils import cudautils

 from pylibcudf cimport transform as plc_transform
-from pylibcudf.expressions cimport Expression
 from pylibcudf.libcudf.types cimport size_type

 from cudf._lib.column cimport Column
@@ -93,7 +91,7 @@ def one_hot_encode(Column input_column, Column categories):


 @acquire_spill_lock()
-def compute_column(list columns, tuple column_names, expr: str):
+def compute_column(list columns, tuple column_names, str expr):
     """Compute a new column by evaluating an expression on a set of columns.
     Parameters
@@ -108,12 +106,8 @@ def compute_column(list columns, tuple column_names, expr: str):
     expr : str
         The expression to evaluate.
     """
-    visitor = parse_expression(expr, column_names)
-
-    # At the end, all the stack contains is the expression to evaluate.
-    cdef Expression cudf_expr = visitor.expression
     result = plc_transform.compute_column(
         plc.Table([col.to_pylibcudf(mode="read") for col in columns]),
-        cudf_expr,
+        plc.expressions.to_expression(expr, column_names),
     )
     return Column.from_pylibcudf(result)
229 changes: 0 additions & 229 deletions python/cudf/cudf/core/_internals/expressions.py

This file was deleted.

2 changes: 2 additions & 0 deletions python/pylibcudf/pylibcudf/expressions.pyi
@@ -77,3 +77,5 @@ class Operation(Expression):
         left: Expression,
         right: Expression | None = None,
     ): ...
+
+def to_expression(expr: str, column_names: tuple[str, ...]) -> Expression: ...