SNOW-1619160: Support for patching table functions #2055

Open
djfletcher opened this issue Aug 8, 2024 · 1 comment

@djfletcher

What is the current behavior?

Table functions, including built-in Snowpark ones like flatten, raise an exception in local testing mode: NotImplementedError: [Local Testing] table_function.TableFunctionJoin is not supported. Patching does not help either: the patched function is passed a plain Column rather than a ColumnEmulator that carries the underlying series of rows. The job under test:

# path/to/snowpark_job.py
from snowflake.snowpark import Session
from snowflake.snowpark.functions import flatten


def snowpark_job(session: Session):
    df = session.create_dataframe([[[1, 2, 3], [4, 5], []]], schema=["lists"])
    flattened_df = df.select(flatten(df.lists))
    flattened_df.show()

Nor can I patch the built-in table function. As far as I can tell, functions patched through snowflake.snowpark.mock cannot return zero, one, or many rows per input row. More specifically, when I set a breakpoint inside patch_flatten(), it receives a plain Column rather than a ColumnEmulator, so I can't interact with the underlying series of rows.

# path/to/test.py
from unittest import mock

from snowflake.snowpark import Session
from snowflake.snowpark.functions import flatten
from snowflake.snowpark.mock import ColumnEmulator, ColumnType
from snowflake.snowpark.mock import patch as snowpark_patch
from snowflake.snowpark.types import IntegerType

from path.to.snowpark_job import snowpark_job


@snowpark_patch(flatten)
def patch_flatten(column: ColumnEmulator, *args, **kwargs) -> ColumnEmulator:
    ret_data = [integer for row in column for integer in row]
    ret_column = ColumnEmulator(data=ret_data)
    ret_column.sf_type = ColumnType(IntegerType(), True)
    return ret_column


@mock.patch(
    "path.to.snowpark_job.flatten",
    new=patch_flatten,
)
def test_snowpark_job():
    session = Session.builder.config("local_testing", True).create()
    snowpark_job(session)
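
To make the row-count mismatch concrete, here is a minimal plain-Python sketch of what the patch above is trying to compute. It does not use Snowpark at all; `flatten_values` is a hypothetical helper name, and the input lists are assumed to be one list per input row, matching the way `patch_flatten` iterates over its column.

```python
from typing import Any, Iterable, List


def flatten_values(column: Iterable[List[Any]]) -> List[Any]:
    """Emit one output value per element of each input list (hypothetical helper)."""
    return [value for row in column for value in row]


# Assumed input: three rows, each holding one list.
lists = [[1, 2, 3], [4, 5], []]

flattened = flatten_values(lists)
# Number of output rows each input row contributes: many, many, and zero.
rows_per_input = [len(row) for row in lists]

print(flattened)
print(rows_per_input)
```

The output has a different length than the input, which is why a column-in, column-out patch cannot express it even if the patched function did receive a ColumnEmulator.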

What is the desired behavior?

Ideally, built-in table functions like flatten would have local testing implementations. More generally, it might be more practical to support patching table functions.
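
As a rough illustration of what a patchable table function would have to model, the sketch below joins each input row with the zero or more rows a per-row callable produces. This is plain Python under stated assumptions, not an existing or proposed Snowpark API: `Row`, `explode_list`, and `join_table_function` are hypothetical names, and only the INDEX and VALUE columns of FLATTEN's output are modeled.

```python
from typing import Any, Callable, Dict, Iterator, List

# Hypothetical row representation: column name -> value.
Row = Dict[str, Any]


def explode_list(value: List[Any]) -> Iterator[Row]:
    """Stand-in for a flatten-like table function: one output row per element."""
    for index, item in enumerate(value):
        yield {"INDEX": index, "VALUE": item}


def join_table_function(
    rows: List[Row],
    input_column: str,
    table_function: Callable[[Any], Iterator[Row]],
) -> List[Row]:
    """Join every input row with the rows the table function yields for it."""
    joined: List[Row] = []
    for row in rows:
        for produced in table_function(row[input_column]):
            # Input columns are kept and the produced columns are appended.
            joined.append({**row, **produced})
    return joined


rows = [
    {"ID": 1, "LISTS": [1, 2, 3]},
    {"ID": 2, "LISTS": [4, 5]},
    {"ID": 3, "LISTS": []},
]
result = join_table_function(rows, "LISTS", explode_list)
for joined_row in result:
    print(joined_row["ID"], joined_row["INDEX"], joined_row["VALUE"])
```

In this model the row with an empty list contributes nothing to the result, so a patch hook would need to accept a callable that yields rows per input row rather than one that returns a single column.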

If this is not an existing feature in snowflake-snowpark-python, how would this impact/improve non-local testing mode?

Table functions are a fairly common tool for transforming semi-structured data into structured data, so supporting them would make the library more robust.

References, Other Background

@djfletcher djfletcher added feature New feature or request local testing Local Testing issues/PRs labels Aug 8, 2024
@github-actions github-actions bot changed the title Support for patching table functions SNOW-1619160: Support for patching table functions Aug 8, 2024
@flopetegui

Any update on this?
