How to integrate a new operator into Texera
Oliver Chen edited this page Jul 5, 2018 · 3 revisions
This tutorial covers only operators that process data retrieved from a source operator (e.g. sentiment analysis). It does not apply to source operators or result operators.
- The predicate class is used to generate your operator object. It should extend `predicateBase`. Before anything else, add your predicate class to the `predicateBase` file.
- The constructor of the predicate class should take every attribute your operator needs from the web page (e.g. `inputAttributeName`, `resultAttributeName`) and store it.
- For each constructor parameter, add a method that returns the stored value so that your operator class can use it.
- The predicate class should have a `newOperator` method, which constructs and returns a new operator object from this predicate object, e.g. `public dummyOperator newOperator() { return new dummyOperator(this); }`
- Write a `getOperatorMetadata` method that specifies the group and description of your operator. Use `ImmutableMap.<~>builder` to build it.
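The predicate steps above can be sketched roughly as follows. This is a minimal, self-contained illustration and not Texera's actual code: `PredicateBaseSketch`, `DummyOperatorSketch`, and `DummyPredicateSketch` are hypothetical stand-ins for `predicateBase` and your own classes, the metadata keys `group` and `description` and their values are assumed names, and a plain unmodifiable `Map` replaces the `ImmutableMap` builder so the example needs nothing beyond the JDK.

```java
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical stand-in for Texera's predicateBase (the real class is not reproduced here).
abstract class PredicateBaseSketch {
    abstract DummyOperatorSketch newOperator();
}

// Hypothetical stand-in for the operator class; it only stores its predicate.
class DummyOperatorSketch {
    private final DummyPredicateSketch predicate;

    DummyOperatorSketch(DummyPredicateSketch predicate) {
        this.predicate = predicate;
    }

    DummyPredicateSketch getPredicate() {
        return predicate;
    }
}

// The predicate stores the attributes received from the web page.
class DummyPredicateSketch extends PredicateBaseSketch {
    private final String inputAttributeName;
    private final String resultAttributeName;

    DummyPredicateSketch(String inputAttributeName, String resultAttributeName) {
        this.inputAttributeName = inputAttributeName;
        this.resultAttributeName = resultAttributeName;
    }

    // One accessor per constructor parameter, for the operator class to use.
    String getInputAttributeName() {
        return inputAttributeName;
    }

    String getResultAttributeName() {
        return resultAttributeName;
    }

    // Builds a new operator from this predicate.
    @Override
    DummyOperatorSketch newOperator() {
        return new DummyOperatorSketch(this);
    }

    // Assumed key names and values; the real method builds the map with ImmutableMap.
    static Map<String, Object> getOperatorMetadata() {
        Map<String, Object> metadata = new LinkedHashMap<>();
        metadata.put("group", "Analytics");
        metadata.put("description", "A dummy operator used for illustration");
        return Collections.unmodifiableMap(metadata);
    }
}
```

Keeping the predicate immutable (final fields, no setters) means an operator built from it can rely on the attribute names never changing while tuples are being processed.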
- The operator class should extend one of the classes in `api/src/main/java/edu/uci/ics/texera/api/dataflow`.
- It should have a constructor that takes a predicate object and stores it.
- It should have a `setInputOperator` method, which specifies the previous operator (`inputOperator`).
- It should have an `open` method, which sets the operator's `inputSchema` to the `inputOperator`'s `outputSchema` and transforms that `inputSchema` into the appropriate `outputSchema`. `open` then sets the cursor to `OPENED`.
- It should have a `getNextTuple` method, which calls `getNextTuple` on the `inputOperator` to get data from the previous step, performs its operation on that data, appends the generated field to the end of the tuple if necessary, and returns the tuple.
- It should have a `close` method, which sets the cursor to `CLOSED`.
- It should have a `transformToOutputSchema` method, which takes the `inputSchema`, modifies it, and returns the `outputSchema`.
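The operator lifecycle described above might look like the sketch below. Everything here is a simplified assumption made for illustration: `SimpleOperator`, `ListSource`, `LengthPredicate`, and `LengthOperator` are hypothetical names, tuples are modelled as `Map<String, Object>` and schemas as `List<String>` instead of Texera's own tuple and schema types, and the "operation" is just computing the length of a text field.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical, simplified stand-in for an operator type from the api/dataflow package.
interface SimpleOperator {
    void open();
    Map<String, Object> getNextTuple(); // returns null when the input is exhausted
    void close();
    List<String> getOutputSchema();
}

// Hypothetical predicate holding the attribute names from the web page.
class LengthPredicate {
    final String inputAttributeName;
    final String resultAttributeName;

    LengthPredicate(String inputAttributeName, String resultAttributeName) {
        this.inputAttributeName = inputAttributeName;
        this.resultAttributeName = resultAttributeName;
    }
}

// Hypothetical in-memory source, used only to feed the operator below.
class ListSource implements SimpleOperator {
    private final List<String> schema;
    private final List<Map<String, Object>> rows;
    private Iterator<Map<String, Object>> iterator;

    ListSource(List<String> schema, List<Map<String, Object>> rows) {
        this.schema = schema;
        this.rows = rows;
    }

    public void open() { iterator = rows.iterator(); }
    public Map<String, Object> getNextTuple() {
        return (iterator != null && iterator.hasNext()) ? iterator.next() : null;
    }
    public void close() { iterator = null; }
    public List<String> getOutputSchema() { return schema; }
}

class LengthOperator implements SimpleOperator {
    static final int CLOSED = 0;
    static final int OPENED = 1;

    private final LengthPredicate predicate;
    private SimpleOperator inputOperator;
    private List<String> outputSchema;
    private int cursor = CLOSED;

    LengthOperator(LengthPredicate predicate) {
        this.predicate = predicate;
    }

    void setInputOperator(SimpleOperator inputOperator) {
        this.inputOperator = inputOperator;
    }

    // Opens the input, derives the output schema, and marks the cursor OPENED.
    public void open() {
        if (cursor != CLOSED) {
            return;
        }
        if (inputOperator == null) {
            throw new IllegalStateException("input operator is not set");
        }
        inputOperator.open();
        List<String> inputSchema = inputOperator.getOutputSchema();
        outputSchema = transformToOutputSchema(inputSchema);
        cursor = OPENED;
    }

    // Pulls one tuple from the input, appends the generated field, and returns it.
    public Map<String, Object> getNextTuple() {
        if (cursor == CLOSED) {
            return null;
        }
        Map<String, Object> inputTuple = inputOperator.getNextTuple();
        if (inputTuple == null) {
            return null;
        }
        Map<String, Object> outputTuple = new LinkedHashMap<>(inputTuple);
        String text = String.valueOf(inputTuple.get(predicate.inputAttributeName));
        outputTuple.put(predicate.resultAttributeName, text.length());
        return outputTuple;
    }

    // Closes the input and marks the cursor CLOSED.
    public void close() {
        if (cursor == CLOSED) {
            return;
        }
        inputOperator.close();
        cursor = CLOSED;
    }

    // Copies the input schema and appends the result attribute at the end.
    List<String> transformToOutputSchema(List<String> inputSchema) {
        List<String> result = new ArrayList<>(inputSchema);
        result.add(predicate.resultAttributeName);
        return result;
    }

    public List<String> getOutputSchema() {
        return outputSchema;
    }
}
```

A caller would wire the pieces together with `setInputOperator`, call `open`, loop on `getNextTuple` until it returns `null`, and then call `close`. Copying the input tuple before adding the new field keeps the upstream operator's data untouched.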