What is the current behavior?
I am patching the function call_builtin to generate a uuid string (a primary key id) for each row:
# path/to/snowpark_job.py
from snowflake.snowpark import Session
from snowflake.snowpark.functions import call_builtin


def snowpark_job(session: Session, table_name: str):
    df = session.table(table_name)
    df = df.with_column("id", call_builtin("UUID_STRING"))
    df.show()
# path/to/test.py
from unittest import mock
from uuid import uuid4

from snowflake.snowpark import Session
from snowflake.snowpark.functions import call_builtin
from snowflake.snowpark.mock import ColumnEmulator, ColumnType
from snowflake.snowpark.mock import patch as snowpark_patch
from snowflake.snowpark.types import StringType

from path.to.snowpark_job import snowpark_job


@snowpark_patch(call_builtin)
def patch_call_builtin(function_name: str, *args, **kwargs) -> ColumnEmulator:
    if function_name == "UUID_STRING":
        ret_column = ColumnEmulator(data=[str(uuid4()) for _ in range(1000)])
        ret_column.sf_type = ColumnType(StringType(), True)
        return ret_column
    else:
        raise NotImplementedError(
            f"If you want to use the builtin function '{function_name}' then you will need to add a case here to patch it"
        )


@mock.patch(
    "path.to.snowpark_job.call_builtin",
    new=patch_call_builtin,
)
def test_snowpark_job():
    session = Session.builder.config("local_testing", True).create()
    snowpark_job(session, "test_table")
It raises AttributeError: 'ColumnEmulator' object has no attribute 'as_'. I have also tried .alias() and .name() instead of with_column and each raises a similar error.
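The error message suggests that with_column (like .alias() and .name()) ends up calling as_ on whatever object the patched function returned. Below is a minimal, library-free sketch of that failure mode. FakeColumn, FakeEmulator, and this with_column are hypothetical stand-ins written for illustration, not Snowpark classes, and the real internals may differ.

```python
class FakeColumn:
    """Hypothetical stand-in for a column expression that supports aliasing."""

    def __init__(self, expr, name=None):
        self.expr = expr
        self.name = name

    def as_(self, name):
        # Return a renamed copy, leaving the original untouched.
        return FakeColumn(self.expr, name)

    alias = as_


class FakeEmulator(list):
    """Hypothetical stand-in for a plain data container with no as_ method."""


def with_column(name, col):
    # Assumption: the real with_column aliases the column it is handed.
    return col.as_(name)


# A column-like object can be aliased.
aliased = with_column("id", FakeColumn("UUID_STRING()"))

# A bare data container cannot, which mirrors the reported AttributeError.
try:
    with_column("id", FakeEmulator(["a", "b"]))
    error_message = None
except AttributeError as exc:
    error_message = str(exc)
```

If this reading is right, any call path that aliases the patched function's return value would fail the same way, which would explain why all three variants raise a similar error.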
What is the desired behavior?
ColumnEmulator class should support aliasing column names.
Also, somewhat separately, there are no other arguments passed to patch_call_builtin() other than function_name, so I don't know the number of rows to generate uuids for. This is what I see when I put a debugger inside patch_call_builtin()
My solution was to simply generate more than needed (using range() with a larger number than rows in my test dataset) but I'm not sure if that's going to work.
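For reference, the over-generation workaround reduces to the following plain-Python sketch. make_uuid_pool and test_row_count are hypothetical names used only for illustration. Whether local testing mode accepts a returned column that is longer than the table is not something this sketch can confirm.

```python
from uuid import UUID, uuid4


def make_uuid_pool(size):
    """Return `size` random UUID strings, one candidate id per potential row."""
    return [str(uuid4()) for _ in range(size)]


POOL_SIZE = 1000      # same oversized count used in the patch above
test_row_count = 3    # hypothetical number of rows in the test table

pool = make_uuid_pool(POOL_SIZE)

# If the row count were known inside the patch, only this slice would be needed.
ids_for_rows = pool[:test_row_count]
```

The pool only helps while POOL_SIZE is at least the number of rows in the test table, so the constant has to be kept in step with the largest test dataset.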
If this is not an existing feature in snowflake-snowpark-python, how would this impact/improve non local testing mode?
It is extremely common to rename columns during data transformations, especially when using builtin functions. If builtin functions are supposed to be supported in Snowpark local testing then aliasing those column names should also be supported.
References, Other Background
github-actions bot changed the title from "ColumnEmulator does not support aliasing column names" to "SNOW-1617523: ColumnEmulator does not support aliasing column names" on Aug 7, 2024.