SWF: resume the execution of a stopped workflow #28

ggreg · 2015-02-06T11:02:00Z

Description

This feature resumes a workflow from the point where it failed, timed out, was terminated, or was cancelled. That is what we mean by "stopped".

Hence, the tasks that completed during the stopped execution are not executed again.

This feature introduces the following problems:

Interface: How to request this behavior? In other words, how does the decider know that it should resume a workflow?
Implementation: How to inject the history of the stopped workflow?

Interface

There are several ways to communicate with a decider:

Through a special task list. This is not convenient because it has an impact on the configuration and the implementation of the decider.
With special values in the workflow's input, for example keys prefixed with _. As the client passes the input as a dict that contains the args and kwargs keys, we could also add a mode key. However, a mode is not explicit and requires additional parameters to reference the workflow execution to resume. I prefer a _previous_workflow_execution parameter with the two attributes workflow_id and run_id, which would allow retrieving the history of the previous workflow execution.
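
A minimal sketch of what such an input could look like, assuming the input is serialized as JSON. The helper names `build_resume_input` and `get_previous_execution` are hypothetical and only illustrate the proposed `_previous_workflow_execution` key; they are not part of the existing code.

```python
import json


def build_resume_input(args, kwargs, workflow_id, run_id):
    """Build a workflow input that asks the decider to resume a stopped execution.

    Hypothetical helper: only the args/kwargs keys and the
    _previous_workflow_execution key come from the proposal above.
    """
    return {
        "args": list(args),
        "kwargs": dict(kwargs),
        "_previous_workflow_execution": {
            "workflow_id": workflow_id,
            "run_id": run_id,
        },
    }


def get_previous_execution(workflow_input):
    """Return (workflow_id, run_id) if a resume was requested, else None."""
    previous = workflow_input.get("_previous_workflow_execution")
    if previous is None:
        return None
    return previous["workflow_id"], previous["run_id"]


# Client side: serialize the input (JSON is an assumption here).
payload = json.dumps(
    build_resume_input([1, 2], {"verbose": True}, "my-workflow", "run-123")
)

# Decider side: detect whether this execution should resume a previous one.
reference = get_previous_execution(json.loads(payload))
```

With this shape, a decider that does not find the key simply runs the workflow from the start, so existing clients would not need to change.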

Implementation

Overview

The two main problems are:

Skipping the tasks that were already completed. Their state is stored in the history of the workflow execution that was stopped.
Reusing the input of the workflow execution that was stopped.

How to not execute already completed tasks

We could:

Retrieve the history of the current workflow execution, parse it, and merge the events to get a snapshot of the state of every task.
Retrieve the history of the stopped workflow execution, parse it, merge the events, and keep only the completed tasks.
Merge these completed tasks into the current task states, overriding tasks that are not completed or not even scheduled in the current execution.
When the future objects are filled, those that back the already completed tasks are in the FINISHED state (failed tasks are discarded and must be executed again).

This approach requires task ids to be consistent between the previous and the current workflow execution, because the id is used to associate a future object with the task it backs.
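
The merge step described above can be sketched as follows. This is an illustration under assumptions, not the actual swf.executor code: task states are modeled as plain dicts keyed by task id, and the state names ("scheduled", "completed", "failed") and function names are hypothetical.

```python
def completed_tasks(task_states):
    """Keep only the tasks whose state is 'completed'."""
    return {
        task_id: state
        for task_id, state in task_states.items()
        if state["state"] == "completed"
    }


def merge_task_states(current, previous):
    """Merge completed tasks of the stopped execution into the current states.

    A task from the previous execution overrides the current one only when
    the current one is not completed (or is missing, i.e. not scheduled yet).
    Failed tasks from the previous execution are discarded.
    """
    merged = dict(current)
    for task_id, state in completed_tasks(previous).items():
        if merged.get(task_id, {}).get("state") != "completed":
            merged[task_id] = state
    return merged


# Task ids must match between both executions for the merge to make sense.
current = {
    "task-1": {"state": "scheduled"},
    "task-2": {"state": "completed", "result": 1},
}
previous = {
    "task-1": {"state": "completed", "result": 10},
    "task-2": {"state": "completed", "result": 2},
    "task-3": {"state": "failed"},
    "task-4": {"state": "completed", "result": 4},
}
merged = merge_task_states(current, previous)
```

Here `task-1` and `task-4` take their result from the stopped execution, `task-2` keeps the result of the current execution, and the failed `task-3` is absent, so it would be scheduled again.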

How to use the input of the stopped workflow execution

Once we get a reference to the previous workflow with the _previous_workflow_execution parameter, we can retrieve its input and inject it into the current workflow history.
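
As a sketch of the retrieval part, assuming the history is available as a list of event dicts shaped like the SWF GetWorkflowExecutionHistory response, the input sits in the attributes of the WorkflowExecutionStarted event. The function name `get_workflow_input` is hypothetical, and how the history is fetched is left out on purpose.

```python
import json


def get_workflow_input(events):
    """Return the raw input string of a workflow execution, or None.

    `events` is assumed to be a list of dicts following the SWF history
    event format, where WorkflowExecutionStarted carries the input.
    """
    for event in events:
        if event["eventType"] == "WorkflowExecutionStarted":
            attributes = event["workflowExecutionStartedEventAttributes"]
            return attributes.get("input")
    return None


# Hand-written sample history of the stopped execution (illustrative data).
previous_events = [
    {
        "eventId": 1,
        "eventType": "WorkflowExecutionStarted",
        "workflowExecutionStartedEventAttributes": {
            "input": json.dumps({"args": [1, 2], "kwargs": {}}),
        },
    },
    {"eventId": 2, "eventType": "DecisionTaskScheduled"},
]

raw_input = get_workflow_input(previous_events)
# Decoding as JSON is an assumption about how the client serialized the input.
previous_input = json.loads(raw_input)
```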

ggreg added the enhancement label Feb 6, 2015

ggreg self-assigned this Feb 6, 2015

ggreg pushed a commit that referenced this issue Feb 6, 2015

Update swf.executor #28: support resuming of a stopped workflow execution

f07bcdc

ggreg pushed a commit that referenced this issue Feb 9, 2015

Update swf.executor #28: support resuming of a stopped workflow execution

436d281

ggreg linked a pull request Feb 10, 2015 that will close this issue

Feature/resume workflow #29

Open
