felix cc edited this page Jul 31, 2018 · 58 revisions

Overview

KNIME provides powerful and flexible means to mine data. However, as many methods are implemented only for data modeling languages like R or MATLAB, it is crucial to integrate these languages into KNIME. To some extent this is already possible. However, from our daily work we've learned that many users need to use scripts without having any background in scripting. We therefore implemented a new open source scripting integration framework for KNIME, which is based on RGG templates [1]. Its main purpose is to hide the script complexity behind a user-friendly graphical interface. Furthermore, our approach goes beyond the existing integration of R, as it provides better and more flexible graphics support, flow variable support, and an easy-to-extend server-based script template repository.

[1] RGG: A general GUI Framework for R scripts; Ilhami Visne, Bioinformatics, 2009, 10:74


Installation

The latest version of the scripting tools is available from the KNIME community repository. To install it, add the community update site URL to KNIME and select the scripting extensions from the list.


Scripting language support

R

The R integration uses Rserve as a back-end to communicate with your local R installation or one on a remote server. You can find a detailed description of how to set up the back-end in the R-Server installation instructions.
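Before configuring the R nodes, it can help to confirm that something is listening where Rserve is expected. The following is a minimal sketch, not part of the scripting extension: the function name `rserve_reachable` is hypothetical, and it assumes Rserve runs on its default TCP port 6311. It only checks that the port accepts connections; it does not verify that the service behind it is actually Rserve.

```python
import socket

# Rserve's default TCP port; change this if your server is configured differently.
RSERVE_DEFAULT_PORT = 6311


def rserve_reachable(host="localhost", port=RSERVE_DEFAULT_PORT, timeout=2.0):
    """Return True if a TCP connection to host:port can be opened.

    This is only a connectivity check: it does not speak the Rserve
    protocol, so any service listening on the port will pass.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and name resolution failures.
        return False


if __name__ == "__main__":
    print("Rserve reachable on localhost:", rserve_reachable())
```

For a remote server, pass its host name as the first argument, for example `rserve_reachable("my-r-server")`, where `my-r-server` is a placeholder.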

Generic R

The R integration provides nodes to pass the whole R workspace from one node to the next. Descriptions of how to use these nodes can be found here: Generic R.

Python

The Python integration supports Python 2.x (recommended version 2.7). It runs either with a local Python installation or on a remote Python server. Follow the description of how to set up the Python installation.

Groovy

Groovy is a dynamic language for the Java platform. It runs locally and uses the same JVM as KNIME itself. In other words, it does not need an additional back-end; the nodes alone do the trick.

MATLAB

Since release 2.0.3, the MATLAB scripting integration needs a MATLAB installation on the local machine. In return, neither middleware nor additional configuration is needed anymore.

Matconsolectl bug and workaround

Spaces in the paths that are passed to MATLAB lead to errors. A quick and dirty workaround is to remove the spaces from the KNIME installation folder names (and subsequently take care that no other spaces are used in the path).
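To find out which folders are affected, a path can be checked component by component. This is a small illustrative sketch; the helper name `components_with_spaces` and the example path are hypothetical and not part of the MATLAB integration.

```python
from pathlib import PurePath


def components_with_spaces(path):
    """Return the components of `path` that contain a space character."""
    return [part for part in PurePath(path).parts if " " in part]


# Hypothetical installation path, used only for illustration.
example_path = "/opt/KNIME Analytics/knime_2.10"
offending = components_with_spaces(example_path)

if offending:
    print("Rename these folders to remove spaces:", offending)
else:
    print("No spaces found in", example_path)
```

An empty result means the path itself is safe; workspace and temporary directories that are passed to MATLAB would need the same check.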

KNIME 2.9 and older

The MATLAB integration uses mpicbg-matlab as a backend to communicate with your local MATLAB installation or one on a remote server. You can find a detailed description of how to set up the backend here.


Scripting Templates

Our scripting framework is based on a template repository that contains RGG templates along with a description, categorization and optional previews (see screenshot). In contrast to similar products like RevoDeployR, we've chosen a simpler and, we think, easier-to-use approach to deploying template collections. Templates are stored in plain text files that follow a simple three-part header-description-template code schema. The desired template repository can be specified in the KNIME preferences (independently for each scripting language). By default the nodes link to template files that focus on visualizations and processing snippets used for HCS screening (Example: R plots, Example: R snippets). These templates supplement the HCS Tools. It is possible to hook in several template repository files at once.



Public Template Repository of the MPI-CBG

The current set of figure templates and snippets is publicly available. The template files listed below serve as examples.
To use R templates, enter the following links in their respective fields under KNIME > Preferences > KNIME > R Scripting:
 https://raw.githubusercontent.com/knime-mpicbg/scripting-templates/master/knime-scripting-templates/R/figure-templates.txt
 https://raw.githubusercontent.com/knime-mpicbg/scripting-templates/master/knime-scripting-templates/R/snippet-templates.txt

To use Python templates, enter the following links in their respective fields under KNIME > Preferences > KNIME > Python Scripting:

 https://raw.githubusercontent.com/knime-mpicbg/scripting-templates/master/knime-scripting-templates/Python/figure-templates.txt
 https://raw.githubusercontent.com/knime-mpicbg/scripting-templates/master/knime-scripting-templates/Python/script-templates.txt

To use MATLAB templates, enter the following links in their respective fields under KNIME > Preferences > KNIME > Matlab Scripting:

 https://raw.githubusercontent.com/knime-mpicbg/scripting-templates/master/knime-scripting-templates/Matlab/figure-templates.txt
 https://raw.githubusercontent.com/knime-mpicbg/scripting-templates/master/knime-scripting-templates/Matlab/script-templates.txt

To use Groovy templates, enter the following link in its field under KNIME > Preferences > KNIME > Groovy Scripting:

 https://raw.githubusercontent.com/knime-mpicbg/scripting-templates/master/knime-scripting-templates/Groovy/Groovy-templates.txt
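The template links above all follow the same pattern, so they can be generated and checked before pasting them into the preferences. The sketch below is only an illustration: the names `BASE_URL`, `TEMPLATE_FILES`, `template_urls` and `is_reachable` are hypothetical helpers, and the file names are copied from the links listed above.

```python
from urllib.request import urlopen

# Common prefix of all template links listed above.
BASE_URL = (
    "https://raw.githubusercontent.com/knime-mpicbg/scripting-templates/"
    "master/knime-scripting-templates"
)

# Template files per language, as given in the links above.
TEMPLATE_FILES = {
    "R": ["figure-templates.txt", "snippet-templates.txt"],
    "Python": ["figure-templates.txt", "script-templates.txt"],
    "Matlab": ["figure-templates.txt", "script-templates.txt"],
    "Groovy": ["Groovy-templates.txt"],
}


def template_urls(language):
    """Return the template URLs for one scripting language."""
    return [f"{BASE_URL}/{language}/{name}" for name in TEMPLATE_FILES[language]]


def is_reachable(url, timeout=10):
    """Return True if the URL answers with HTTP 200 (requires network access)."""
    try:
        with urlopen(url, timeout=timeout) as response:
            return response.status == 200
    except OSError:
        # URLError and HTTPError are subclasses of OSError.
        return False


# Print the links without touching the network.
for language in TEMPLATE_FILES:
    for url in template_urls(language):
        print(language, url)
```

Calling `is_reachable(url)` on each printed link is an optional extra step that confirms the file can be downloaded from the machine running KNIME.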

Creating Your own Templates

To write your own templates you need to understand the concept of RGG-XML. You can find the documentation, examples and a "How to..." on the RGG page.


Using flow variables of KNIME within Templates / R code

Flow variables of KNIME can also be accessed within the templates or R code. A short documentation on how to use flow variables can be found here.


License and Support

The scripting nodes are released under the GNU General Public License, Version 3 (including certain additional permissions according to Sec. 7 of the GPL Ver. 3).
Access to the source code and the SCM is granted on request, as is access to the mpicbg-python module (Python back-end).

The MATLAB back-end, also called mpicbg-matlab, is published under the BSD license. Users must make absolutely sure that they conform to their license agreement with MathWorks. Usually, a network concurrent license for MATLAB is required when multiple people access MATLAB through the nodes.

Feel free to contact us ([email protected]) if you want to contribute to this project, have suggestions, found bugs, or want to tell us about your vision of how scripting languages might integrate into KNIME.

Citations and Acknowledgements

Please acknowledge or cite this website if you have used this KNIME extension in your work and found it helpful:

  https://github.com/knime-mpicbg/knime-scripting/wiki

If you want to mention our institution use this:

  High-Throughput Technology Development Studio (TDS)
  Max Planck Institute of Molecular Cell Biology and Genetics (MPI-CBG)

We have published a book chapter (Springer, PubMed) on open source software tools for high-content screening, where we introduced the Scripting Integration and HCS-Tools for KNIME. Feel free to add this as a citation. Here is the citation in BibTeX format:

  @incollection{Stoeter2013,
  year={2013},
  isbn={978-1-62703-310-7},
  booktitle={Target Identification and Validation in Drug Discovery},
  volume={986},
  series={Methods in Molecular Biology},
  editor={Moll, Jurgen and Colombo, Riccardo},
  doi={10.1007/978-1-62703-311-4_8},
  title={CellProfiler and KNIME: Open Source Tools for High Content Screening},
  url={http://dx.doi.org/10.1007/978-1-62703-311-4_8},
  publisher={Humana Press},
  keywords={High content screening; Image processing; Statistics; Open Source; CellProfiler; KNIME; Distributed computing},
  author={Stöter, Martin and Niederlein, Antje and Barsacchi, Rico and Meyenhofer, Felix and Brandl, Holger and Bickle, Marc},
  pages={105-122},
  language={English}
  }