Environment data
Data Wrangler Extension version (available under the Extensions sidebar): v1.12.1
Jupyter Extension version (available under the Extensions sidebar): v2024.10.0
Python Extension version (available under the Extensions sidebar): v2024.20.0
OS (Windows | Mac | Linux distro) and version: Windows
Pandas version: None
Python and/or Anaconda version: 3.11.4
Type of virtual environment used (N/A | venv | virtualenv | conda | ...): N/A
Expected behaviour
Data Wrangler can execute Polars code in custom operations window
Actual behaviour
Exception:
I think it is because opening plain csv doesn't have any runtime context, so it defaults to using pandas and doesn't support other libraries.
Steps to reproduce:
Open a CSV file
Click "Open in Data Wrangler"
Write Polars code such as df = df.sort("value", reverse=True) and execute it
Logs
Output for Jupyter in the Output panel (View→Output, then change the drop-down in the upper-right of the Output panel to Jupyter)
AttributeError: 'DataFrame' object has no attribute 'sort'
---------------------------------------------------------------------------
AttributeError Traceback (most recent call last)
~\AppData\Local\Temp\ipykernel_40096\1255212943.py in ?(code, old_ns, new_ns)
38 name = get_ipython().compile.cache(code)
39 except Exception:
40 name = "<string>"
41
---> 42 exec(compile(code, name, 'exec'), session['namespaces']["create"](new_ns, old_ns))
~\AppData\Local\Temp\ipykernel_40096\2332454470.py in ?()
1 # Sort by value in descending order
----> 2 df = df.sort("value", reverse=True)
~\AppData\Roaming\Python\Python311\site-packages\pandas\core\generic.py in ?(self, name)
6295 and name not in self._accessors
6296 and self._info_axis._can_hold_identifiers_and_holds_name(name)
6297 ):
6298 return self[name]
-> 6299 return object.__getattribute__(self, name)
AttributeError: 'DataFrame' object has no attribute 'sort'
Hi @biiiipy, thanks for opening this issue! We don't currently support using Polars code to manipulate the DataFrame (we only support loading from Polars DataFrames by converting them into Pandas).
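Since the DataFrame in the session is a Pandas one, the sort from the report would have to be written with Pandas for now. A minimal sketch, assuming a column named value as in the report; the sample data below is hypothetical and only stands in for the opened CSV:

```python
import pandas as pd

# Hypothetical stand-in for the CSV opened in Data Wrangler
df = pd.DataFrame({"name": ["a", "b", "c"], "value": [2, 9, 5]})

# Pandas equivalent of the Polars call df.sort("value", reverse=True):
# sort by the "value" column in descending order
df = df.sort_values("value", ascending=False)
print(df)
```

In the custom operations window only the sort_values line would be needed, since df is already defined there.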
For your use-case, do you mostly care that the exported code is in Polars? (e.g. you interact with the DataFrame during the interactive Data Wrangler session using the built-in operations UI and Pandas, and we translate the code on export, which can be used in a data pipeline written with Polars)
Or alternatively, is it more important to be able to work directly in Polars (for example, because you work locally with really large files that you are unable to sample effectively)?
The second option would be preferred (although understandably more work from the dev perspective) as it allows the user to stick with using just Polars and avoids any potential translation issues or having to read both Polars and Pandas code.