Python Integration
(valid for Python Integration >= 4.1, 2020-09)
The Python nodes work with local Python installations. The preference settings allow setting the path to a Python executable for Python 3 and Python 2, respectively. In the preferences, the user also has to choose whether to use Python 3 or Python 2.
Required Python packages: pandas >= 0.23.4, pickle, matplotlib
Data is exchanged between the Python node and Python as CSV. The following tables give an overview of data type support. KNIME RowIds are transferred as the pandas data frame index.
Currently, experimental features of pandas 1.x, such as consistent missing value support and string/boolean column types, are not implemented.
| KNIME column => | Pandas column | Representation of missing values |
|---|---|---|
| String (or String compatible) | Object | NaN |
| Double | Float64 | NaN |
| Int | Int64 or Float64 (if it contains missing values) | NaN |
| Long | - | (no way to represent missing values) |
| Bool | Bool | True (bug in pandas import) |
| Date/Time/Durations | | |
| LocalDate | Datetime64[ns] (no extreme value support) | NaT |
| LocalDateTime | Datetime64[ns] (no extreme value support) | NaT |
| LocalTime | Datetime64[ns] (adds the current date to the time!) | NaT |
| Duration | Timedelta64 (ISO 8601 string import with parsing bugs on the pandas side; transferring as String is recommended instead) | NaT |
| Period | Timedelta64 (ISO 8601 string import with parsing bugs on the pandas side; transferring as String is recommended instead) | NaT |

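
The Int row can be reproduced with pandas alone. The snippet below is a minimal sketch that assumes a CSV layout with the row ID in the first column; the file actually written by the node may look different.

```python
import io
import pandas as pd

# Hypothetical CSV content; the first column stands in for the KNIME RowId.
csv_complete = "RowId,count\nRow0,1\nRow1,2\n"
csv_missing = "RowId,count\nRow0,1\nRow1,\n"

complete = pd.read_csv(io.StringIO(csv_complete), index_col=0)
with_missing = pd.read_csv(io.StringIO(csv_missing), index_col=0)

print(complete["count"].dtype)      # integer column without missing values
print(with_missing["count"].dtype)  # a missing value turns the column into floats
print(with_missing["count"].isna().tolist())
```
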
| Pandas column => | Support of missing values | KNIME column |
|---|---|---|
| Object (no transfer of `'\r'` or `'\\'`) | NaN | String |
| Float64 | NaN | Double |
| Int64 | | Long |
| Bool | | Bool |
| datetime/timedelta | | |
| Datetime64 | NaT | LocalDateTime (truncated to microseconds due to a pandas export bug) |
| Timedelta64 | NaT | Duration (parsed ISO 8601 string) |

Every Python node comes with an example script that should work right away. A KNIME input table is made available to Python as a pandas data frame named kIn. A pandas data frame named pyOut is expected to be returned to KNIME.
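
A node script along these lines reads `kIn` and assigns `pyOut`. This is a minimal sketch, not the bundled example script: the column names are hypothetical, and `kIn` is created by hand here only so that the snippet also runs outside KNIME.

```python
import pandas as pd

# Stand-in for the table KNIME provides; inside a node, kIn already exists.
# The column names and row IDs are hypothetical.
kIn = pd.DataFrame(
    {"name": ["a", "b", "c"], "value": [1.0, 2.5, 4.0]},
    index=["Row0", "Row1", "Row2"],  # KNIME RowIds arrive as the index
)

# Work on a copy so the input table stays untouched.
pyOut = kIn.copy()
pyOut["value_doubled"] = pyOut["value"] * 2
```
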
KNIME flow variables can be used with the template 'FLOWVAR(myFlowVariableName)'. Before execution, this part of the code is replaced with the content of the variable.
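
Because the replacement happens before execution, it can be pictured as a plain string substitution on the script text. The function below only illustrates that idea and is not the node's implementation; `substitute_flow_variables`, the regular expression, and the variable names are assumptions made for the example.

```python
import re

# Matches FLOWVAR(name); the exact syntax the node accepts is an assumption.
FLOWVAR_PATTERN = re.compile(r"FLOWVAR\(([^)]+)\)")


def substitute_flow_variables(script, flow_variables):
    """Replace each FLOWVAR(name) with the text of the named variable."""
    def replace(match):
        name = match.group(1).strip()
        return str(flow_variables[name])  # raises KeyError for unknown names
    return FLOWVAR_PATTERN.sub(replace, script)


script = "threshold = FLOWVAR(threshold)\nlabel = 'FLOWVAR(label)'"
print(substitute_flow_variables(script, {"threshold": 0.5, "label": "run_1"}))
```

In this sketch a string value is inserted verbatim, so the script supplies its own quotes around the template.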
Every Python node has the option to push the input table(s) and the given script to an external Python session to provide a way of troubleshooting or prototyping.
There are two ways:
- a command line call of Python (using the executable from the preference settings) with a prepared script that reads in the KNIME input table. The Python code of the node is then available as clipboard content.
- launching a Jupyter notebook that is prepared to read in the KNIME input table from the CSV file and already contains the Python code of the node as well as procedures to export the result to CSV.
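
Both options boil down to reading the input table from CSV, running the node's code, and writing the result back. The sketch below shows this round trip with pandas; the file names and the CSV layout (row ID in the first column) are assumptions, and the prepared script or notebook generated by the node may look different.

```python
import os
import tempfile

import pandas as pd

workdir = tempfile.mkdtemp()
input_csv = os.path.join(workdir, "knime_input.csv")     # hypothetical name
output_csv = os.path.join(workdir, "python_output.csv")  # hypothetical name

# Stand-in for the file the node would write for the external session.
with open(input_csv, "w") as handle:
    handle.write("RowId,x\nRow0,1.5\nRow1,2.5\n")

# Read the input table; the first column becomes the data frame index.
kIn = pd.read_csv(input_csv, index_col=0)

# The node's Python code would go here.
pyOut = kIn.copy()
pyOut["x_squared"] = pyOut["x"] ** 2

# Export the result so that it can be read back.
pyOut.to_csv(output_csv)
```
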
No further setup is needed. A terminal (Mac/Linux) or PowerShell (Windows) window is opened and the selected Python executable is called with the prepared Python script. The Python code of the node is provided as clipboard content.
The Jupyter option will be explained in more detail on a separate wiki page, as it is planned to provide it for the R nodes too.
Jupyter Preferences
The Python Plot nodes provide the created image in at least two ways:
The plot node offers an additional configuration tab to provide basic control over some image features.
Only valid for file export, not for the Image Port or Node View.
Supported formats:
- PNG
- JPEG
- SVG
- TIF
Image width and height in pixels.
Resolution in dots per inch (DPI). Valid for all images (view, port, and exported file).
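
Width, height, and resolution are linked: matplotlib sizes figures in inches, so a pixel size has to be divided by the DPI. A minimal sketch of that arithmetic follows; `figure_size_inches` is a hypothetical helper, not part of the node.

```python
def figure_size_inches(width_px, height_px, dpi):
    """Convert a pixel size to (width, height) in inches for a given DPI."""
    if dpi <= 0:
        raise ValueError("dpi must be positive")
    return (width_px / dpi, height_px / dpi)


# An 800 x 600 pixel image at 100 DPI corresponds to 8 x 6 inches.
size = figure_size_inches(800, 600, 100)
print(size)

# With matplotlib installed, the values can be passed on, for example:
#   fig = matplotlib.pyplot.figure(figsize=size, dpi=100)
#   fig.savefig("plot.png", dpi=100)
```
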
If a filename is given and the box "Write image to file" is checked, the image is exported as a file. If the file already exists, it is overwritten only if the checkbox for overwriting is checked. The filename supports the following templates:
- $$DATE$$ for the current date,
- $$USER$$ for the user name,
- $$WS$$ for the workspace directory, and
- FLOWVAR(variable name) to use flow variable values in the file name.
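
The templates can be thought of as placeholders that are filled in before the file is written. The following sketch illustrates this; `expand_filename` is a hypothetical helper, and the date format (ISO 8601 here) is an assumption, since the format used by the node is not stated.

```python
import re
from datetime import date


def expand_filename(template, user, workspace, flow_variables, today=None):
    """Fill $$DATE$$, $$USER$$, $$WS$$ and FLOWVAR(name) in a file name."""
    today = today or date.today()
    result = (
        template.replace("$$DATE$$", today.isoformat())  # assumed date format
        .replace("$$USER$$", user)
        .replace("$$WS$$", workspace)
    )
    # Replace FLOWVAR(name) with the value of the named flow variable.
    return re.sub(
        r"FLOWVAR\(([^)]+)\)",
        lambda match: str(flow_variables[match.group(1).strip()]),
        result,
    )


name = expand_filename(
    "$$WS$$/plots/$$USER$$_$$DATE$$_FLOWVAR(experiment).png",
    user="alice",
    workspace="/home/alice/knime-workspace",
    flow_variables={"experiment": "exp42"},
    today=date(2020, 9, 1),
)
print(name)
```
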
The view shows a PNG with the given dimensions and DPI. The image dimensions are shown in the lower left corner. If the user resizes the view, the image is rescaled, which may degrade its appearance. Double-clicking the image forces it to be recreated with the new dimensions (which are still shown in the lower left corner).