Users want to submit batches of transfers that are physically present in a local filesystem or a network share under a known path.
The solution that we're considering is to introduce a new Batch API that allows users to submit a new batch with a simple path parameter indicating its physical location. The API triggers a new batch workflow that, using an activity, scans the given location and starts a new processing workflow for each entry found.
POST /batch
Submits a batch.
Takes a path parameter.
GET /batch
Returns workflow status.
(we do something very similar in /collection/bulk)
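As a rough illustration of the two endpoints, here is a minimal Go sketch using only the standard library. Everything in it is an assumption made for the example: the `batchService` type, the JSON payloads, and the status codes (202, 409, 400) are hypothetical, and the in-memory `running` flag merely stands in for starting and querying the real batch workflow.

```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
	"net/http"
	"net/http/httptest"
	"sync"
)

// batchService is a hypothetical stand-in for the code that would start and
// query the batch workflow. Here it only remembers whether a batch is running.
type batchService struct {
	mu      sync.Mutex
	running bool
	path    string
}

// submitRequest is the assumed JSON payload of POST /batch.
type submitRequest struct {
	Path string `json:"path"`
}

// statusResponse is the assumed JSON payload of GET /batch.
type statusResponse struct {
	Running bool   `json:"running"`
	Path    string `json:"path,omitempty"`
}

func (s *batchService) ServeHTTP(w http.ResponseWriter, r *http.Request) {
	switch r.Method {
	case http.MethodPost:
		var req submitRequest
		if err := json.NewDecoder(r.Body).Decode(&req); err != nil || req.Path == "" {
			http.Error(w, "a non-empty path is required", http.StatusBadRequest)
			return
		}
		s.mu.Lock()
		defer s.mu.Unlock()
		if s.running {
			// Assumption: a second submission is rejected while a batch runs.
			http.Error(w, "a batch is already running", http.StatusConflict)
			return
		}
		s.running, s.path = true, req.Path
		w.WriteHeader(http.StatusAccepted)
	case http.MethodGet:
		s.mu.Lock()
		resp := statusResponse{Running: s.running, Path: s.path}
		s.mu.Unlock()
		w.Header().Set("Content-Type", "application/json")
		_ = json.NewEncoder(w).Encode(resp)
	default:
		http.Error(w, "method not allowed", http.StatusMethodNotAllowed)
	}
}

// submit sends a POST /batch request to the handler and returns the status code.
func submit(h http.Handler, path string) int {
	body, _ := json.Marshal(submitRequest{Path: path})
	rec := httptest.NewRecorder()
	h.ServeHTTP(rec, httptest.NewRequest(http.MethodPost, "/batch", bytes.NewReader(body)))
	return rec.Code
}

func main() {
	mux := http.NewServeMux()
	mux.Handle("/batch", &batchService{})
	fmt.Println(submit(mux, "/mnt/share/batch-01")) // first submission
	fmt.Println(submit(mux, "/mnt/share/batch-02")) // submitted while one is running
}
```

In a real implementation the POST handler would presumably start the batch workflow and the GET handler would describe its execution, rather than reading a local flag.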
These are some considerations and/or compromises for this first iteration:
The underlying batch workflow will use a predefined identifier so that only one batch can run at a time. We want to rule out concurrent batches while we explore our solution and its implications.
The batch workflow does not wait for processing workflows to complete, i.e. we're purposely avoiding parent-child workflow relationships because the cardinality is unbounded. What we plan is to write an activity that scans the location and fires the processing workflows using the Cadence client (we think this is not a problem as long as we reuse the same client during the life-cycle of the activity).
We need to process directories found under the given path, but this is going to require some refactoring in the processing workflow and its input parameters.
We do not expect the batch to mutate once it is submitted by the user, i.e. we don't need to track changes during its processing.
One compromise is not to worry about data locality for now: the batch must be locally available wherever Enduro is running, including its activity worker (which is going to be the case because of Problem: workers can't be deployed separately #37). This is obviously going to become a blocker when we want to run Enduro at scale, but that's not currently a priority for our customers.
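The scanning activity described above could look roughly like the following Go sketch. It is not the actual Enduro code: `workflowStarter` is a hypothetical interface standing in for the Cadence client (a real implementation would call the client's workflow-start method and reuse one client for the whole activity), and `recordingStarter` and `demo` exist only so the example runs on its own. Each immediate child of the batch path, file or directory, is treated as one entry.

```go
package main

import (
	"context"
	"fmt"
	"os"
	"path/filepath"
)

// workflowStarter is a hypothetical abstraction over the Cadence client.
// A single instance is reused for every entry during the activity.
type workflowStarter interface {
	StartProcessing(ctx context.Context, entryPath string) error
}

// recordingStarter is a test double that records what would be started.
type recordingStarter struct{ started []string }

func (r *recordingStarter) StartProcessing(_ context.Context, entryPath string) error {
	r.started = append(r.started, entryPath)
	return nil
}

// scanAndStart lists the immediate children of batchPath and fires one
// processing workflow per entry without waiting for any of them to finish.
// It returns the number of workflows started.
func scanAndStart(ctx context.Context, starter workflowStarter, batchPath string) (int, error) {
	entries, err := os.ReadDir(batchPath)
	if err != nil {
		return 0, fmt.Errorf("read batch location: %w", err)
	}
	count := 0
	for _, entry := range entries {
		// Stop early if the activity has been cancelled or timed out.
		if err := ctx.Err(); err != nil {
			return count, err
		}
		entryPath := filepath.Join(batchPath, entry.Name())
		if err := starter.StartProcessing(ctx, entryPath); err != nil {
			return count, fmt.Errorf("start processing %q: %w", entry.Name(), err)
		}
		count++
	}
	return count, nil
}

// demo builds a throwaway batch with one directory and one file, scans it,
// and returns the base names of the entries that would be processed.
func demo() ([]string, error) {
	dir, err := os.MkdirTemp("", "batch-")
	if err != nil {
		return nil, err
	}
	defer os.RemoveAll(dir)
	if err := os.Mkdir(filepath.Join(dir, "transfer-a"), 0o755); err != nil {
		return nil, err
	}
	if err := os.WriteFile(filepath.Join(dir, "transfer-b.zip"), []byte("x"), 0o644); err != nil {
		return nil, err
	}
	rec := &recordingStarter{}
	if _, err := scanAndStart(context.Background(), rec, dir); err != nil {
		return nil, err
	}
	names := make([]string, 0, len(rec.started))
	for _, p := range rec.started {
		names = append(names, filepath.Base(p))
	}
	return names, nil
}

func main() {
	names, err := demo()
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(names)
}
```

Because the function only starts workflows and never waits on them, its cost grows with the number of entries but not with how long each transfer takes to process, which matches the decision to avoid parent-child relationships.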