Home
KNIME provides powerful and flexible means to mine data. However, because many methods are implemented only in data modeling languages such as R or MATLAB, it is crucial to integrate these languages into KNIME. To some extent this is already possible, but our daily work has shown that many users need to run scripts without having any scripting background. We therefore implemented a new open source scripting integration framework for KNIME, based on RGG templates [1]. Its main purpose is to hide script complexity behind a user-friendly graphical interface. Our approach also goes beyond the existing R integration: it provides better and more flexible graphics support, flow variable support, and an easy-to-extend, server-based script template repository.
[1] RGG: A general GUI Framework for R scripts; Ilhami Visne et al., BMC Bioinformatics, 2009, 10:74
The latest version of the scripting tools is available from the KNIME community repository. To install it, add the community update site URL to KNIME and select the scripting extensions from the list.
The R integration uses Rserve as a back-end to communicate with your local R installation or one on a remote server. You can find a detailed description of how to set up the back-end in the R-Server installation instructions.
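Before configuring the R nodes, it can help to confirm that something is listening where KNIME expects the Rserve back-end. The following is a minimal sketch in Python, not part of the scripting extension: the helper name `is_port_open` is hypothetical, and it assumes Rserve runs on its default port 6311. It only tests TCP reachability and does not verify that the listener is actually Rserve.

```python
import socket

# Rserve's default port; an assumption about your setup, change it if
# your server was started on a different port.
RSERVE_DEFAULT_PORT = 6311


def is_port_open(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        # create_connection resolves the host and opens a TCP connection;
        # the context manager closes the socket again right away.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unresolvable hosts.
        return False
```

For a local installation, `is_port_open("localhost", RSERVE_DEFAULT_PORT)` should return `True` once Rserve has been started; for a remote server, replace the host name accordingly.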
The R integration provides nodes that pass the whole R workspace from one node to the next. Descriptions of how to use these nodes can be found here: Generic R.
The Python integration supports Python 2.x (version 2.7 recommended). It runs with either a local Python installation or a remote Python server. Follow the description of how to set up the Python installation.
Groovy is a dynamic language for the Java platform. It runs locally in the same JVM as KNIME itself, so it needs no additional back-end; the nodes alone do the trick.
Since release 2.0.3, the MATLAB scripting integration needs a MATLAB installation on the local machine. In return, neither middleware nor additional configuration is needed anymore.
Spaces in the paths that are passed to MATLAB lead to errors. A quick and dirty workaround is to remove the spaces from the KNIME installation folder names (and to make sure that no other spaces occur in the path).
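To find out which folders are affected, a small check like the following can be run against the KNIME installation path. This is only an illustrative sketch: the function name `parts_with_spaces` and the example paths are hypothetical, and the check simply reports path components that contain a space character.

```python
from pathlib import Path


def parts_with_spaces(path):
    """Return the components of `path` that contain a space character."""
    # Path.parts splits the path into its individual components
    # (drive or root, folders, file name) for the current platform.
    return [part for part in Path(path).parts if " " in part]
```

For example, `parts_with_spaces("/opt/KNIME Analytics/knime")` reports the folder `KNIME Analytics`, which would have to be renamed, while a path without spaces yields an empty list.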
The MATLAB integration uses mpicbg-matlab as a back-end to communicate with your local MATLAB installation or one on a remote server. You can find a detailed description of how to set up the back-end here.
Our scripting framework is based on a template repository that contains RGG templates along with a description, a categorization, and optional previews (see screenshot). In contrast to similar products like RevoDeployR, we have chosen a simpler and, we think, easier-to-use approach to deploying template collections. Templates are stored in plain text files that follow a simple three-part schema: header, description, template code. The template repository can be specified in the KNIME preferences (independently for each scripting language). By default, the nodes link to template files that focus on visualizations and processing snippets used for HCS screening (Example: R plots, Example: R snippets). These templates supplement the HCS Tools. Several template repository files can be hooked in at once.
You can explore the current set of figure templates and snippets:
Here are a couple of example templates.
To use R templates, enter the following links in their respective fields under KNIME > Preferences > KNIME > R Scripting:
https://raw.githubusercontent.com/knime-mpicbg/scripting-templates/master/knime-scripting-templates/R/figure-templates.txt
https://raw.githubusercontent.com/knime-mpicbg/scripting-templates/master/knime-scripting-templates/R/snippet-templates.txt
To use Python templates, enter the following links in their respective fields under KNIME > Preferences > KNIME > Python Scripting:
https://raw.githubusercontent.com/knime-mpicbg/scripting-templates/master/knime-scripting-templates/Python/figure-templates.txt
https://raw.githubusercontent.com/knime-mpicbg/scripting-templates/master/knime-scripting-templates/Python/script-templates.txt
To use MATLAB templates, enter the following links in their respective fields under KNIME > Preferences > KNIME > Matlab Scripting:
https://raw.githubusercontent.com/knime-mpicbg/scripting-templates/master/knime-scripting-templates/Matlab/figure-templates.txt
https://raw.githubusercontent.com/knime-mpicbg/scripting-templates/master/knime-scripting-templates/Matlab/script-templates.txt
To use Groovy templates, enter the following link in the respective field under KNIME > Preferences > KNIME > Groovy Scripting:
https://raw.githubusercontent.com/knime-mpicbg/scripting-templates/master/knime-scripting-templates/Groovy/Groovy-templates.txt
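The template links listed above all follow the same pattern: a common base URL, a language folder, and a file name. The sketch below rebuilds them in Python and offers an optional download helper, which may be handy for checking that a repository file is reachable before entering it in the preferences. The names `BASE_URL`, `TEMPLATE_FILES`, `template_urls`, and `fetch_template` are hypothetical helpers introduced for illustration; only the URLs themselves come from this page, and whether they are currently reachable is not guaranteed.

```python
from urllib.request import urlopen

# Common prefix of all template links listed on this page.
BASE_URL = ("https://raw.githubusercontent.com/knime-mpicbg/scripting-templates/"
            "master/knime-scripting-templates")

# Template files per language folder, exactly as listed on this page.
TEMPLATE_FILES = {
    "R": ["figure-templates.txt", "snippet-templates.txt"],
    "Python": ["figure-templates.txt", "script-templates.txt"],
    "Matlab": ["figure-templates.txt", "script-templates.txt"],
    "Groovy": ["Groovy-templates.txt"],
}


def template_urls(language):
    """Return the template repository URLs for one language folder."""
    return [f"{BASE_URL}/{language}/{name}" for name in TEMPLATE_FILES[language]]


def fetch_template(url, timeout=10):
    """Download one template file and return its text (needs network access)."""
    with urlopen(url, timeout=timeout) as response:
        # Assumes the template files are UTF-8 encoded plain text.
        return response.read().decode("utf-8")
```

Calling `template_urls("R")` returns the two R links in the order given above; `fetch_template` raises an exception from `urllib` if a link cannot be retrieved.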
To write your own templates, you need to understand the concept of the RGG XML. Documentation, examples, and a "How to..." can be found on the RGG page.
KNIME flow variables can also be accessed within templates or R code. A short documentation of how to use flow variables can be found here.
The scripting nodes are released under the GNU General Public License, Version 3 (including certain additional permissions according to Sec. 7 of the GPL Ver. 3).
Source code (SCM) access is granted on request, as is access to the mpicbg-python module (the Python back-end).
The MATLAB back-end, also called mpicbg-matlab, is published under the BSD license. Users must make absolutely sure that they comply with the license agreement they have with MathWorks. Usually, a network concurrent license for MATLAB is required when multiple people access it through the MATLAB nodes.
Feel free to contact us ([email protected]) if you want to contribute to this project, have suggestions, have found bugs, or want to share your vision of how scripting languages might integrate into KNIME.
Please acknowledge or cite this website if you have used this KNIME extension in your work and found it helpful:
https://github.com/knime-mpicbg/knime-scripting/wiki
If you want to mention our institution, use this:
High-Throughput Technology Development Studio (TDS), Max Planck Institute of Molecular Cell Biology and Genetics (MPI-CBG)
We have published a book chapter (Springer, PubMed) on open source software tools for high-content screening, where we introduced the Scripting Integration and HCS-Tools for KNIME. Feel free to add this as a citation. Here is the citation in BibTeX format:
@incollection{
  year={2013},
  isbn={978-1-62703-310-7},
  booktitle={Target Identification and Validation in Drug Discovery},
  volume={986},
  series={Methods in Molecular Biology},
  editor={Moll, Jurgen and Colombo, Riccardo},
  doi={10.1007/978-1-62703-311-4_8},
  title={CellProfiler and KNIME: Open Source Tools for High Content Screening},
  url={http://dx.doi.org/10.1007/978-1-62703-311-4_8},
  publisher={Humana Press},
  keywords={High content screening; Image processing; Statistics; Open Source; CellProfiler; KNIME; Distributed computing},
  author={Stöter, Martin and Niederlein, Antje and Barsacchi, Rico and Meyenhofer, Felix and Brandl, Holger and Bickle, Marc},
  pages={105-122},
  language={English}
}