feat: Provide optional lock to prevent concurrent pipeline execution #105

tombriggsallego · 2022-12-13T18:54:34Z

Meltano Version

2.8.0

Python Version

3.8

Bug scope

CLI (options, error messages, logging, etc.)

Operating System

Linux Ubuntu

Description

If we run meltano run tap-something some-mapper target-something and that pipeline is already running, meltano (correctly!) throws an "already running" error and exits. However, if instead we

1. run meltano run tap-something some-mapper target-something dbt-postgres:run,
2. wait for the tap+mapper+target block to finish and the dbt-postgres portion to start, and then
3. run meltano run tap-something some-mapper target-something dbt-postgres:run again

meltano will run the entire pipeline again, ultimately resulting in multiple copies of the same dbt project running at once. :(

If it matters, we execute meltano via cron. The tap/mapper/target portion usually takes only a few minutes, but dbt often takes 20+ minutes to run. We had been planning to schedule the job every 15 minutes and let meltano block concurrent runs whenever dbt ran long, but unfortunately this behavior prevents that.

Code

No response


tayloramurphy · 2022-12-16T21:58:23Z

@aaronsteers thoughts on how we could help out with this? Likely would just be checking if the same plugin:command is already executing, right?

aaronsteers · 2022-12-16T22:14:37Z

Featurewise, we could declare a new plugin command property that specifies only one copy can run at a time. That limit would need to be per environment, so prod would never be blocked by devtest, for instance. The challenge is that I don't know if the way we are logging commands today would work the same way it does for EL. In theory, though, this definitely could work.

A second approach could be to create a dummy "command" before and after the dbt execution runs. That dummy command would basically "take" a lock and subsequently "release" the lock. You'd probably want to build in a max-age for the lock, so it could self-heal, and you'd probably want an explicit command to "release" the lock in cases where you know its process is not running.
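For illustration only, here is a minimal Python sketch of what such a take/release lock with a max-age might look like. None of this is Meltano's API: the function names, the lock-file layout, and the idea of putting the environment name in the file name (so prod is never blocked by devtest) are all assumptions made for the example. It also assumes every run can see the same local directory, and the stale-lock cleanup is not fully race-free if two processes detect the same stale lock at the same moment.

```python
import json
import os
import time
from pathlib import Path


def lock_path_for(lock_dir: Path, environment: str, name: str) -> Path:
    """Hypothetical naming scheme: one lock file per environment and pipeline."""
    return lock_dir / f"{environment}--{name}.lock"


def acquire_lock(lock_path: Path, max_age_seconds: float) -> bool:
    """Try to take the lock; return True on success, False if it is held."""
    try:
        # Self-healing: a lock older than max_age_seconds is treated as stale.
        if time.time() - lock_path.stat().st_mtime > max_age_seconds:
            lock_path.unlink()
    except FileNotFoundError:
        pass  # no lock yet, or another process just removed it

    try:
        # O_CREAT | O_EXCL makes creation atomic: only one process can win.
        fd = os.open(lock_path, os.O_CREAT | os.O_EXCL | os.O_WRONLY)
    except FileExistsError:
        return False
    with os.fdopen(fd, "w") as handle:
        json.dump({"pid": os.getpid(), "created_at": time.time()}, handle)
    return True


def release_lock(lock_path: Path) -> None:
    """Explicit release; safe to call when no lock exists."""
    try:
        lock_path.unlink()
    except FileNotFoundError:
        pass
```

In a pipeline, the "take" step would call `acquire_lock` and exit non-zero when it returns `False`, and the "release" step would call `release_lock` after dbt finishes. The same `release_lock` call could back a manual release command for cases where the owning process is known to be dead.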

A third option, and I think I like this best, would be to build the second solution into the dbt-ext plugin itself, and/or into the EDK, and have the ability to use prehooks and posthooks to do the same thing inline.
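As a rough sketch of the "inline" variant, the take and release steps can be folded around the command itself so the pipeline needs no extra steps. This is not the EDK's hook API, which is not described in this thread; `exclusive`, `run_exclusively`, and `AlreadyRunning` are hypothetical names, and the lock is again assumed to be a plain file on a shared local path.

```python
import os
import subprocess
from contextlib import contextmanager


class AlreadyRunning(RuntimeError):
    """Raised when another process already holds the lock."""


@contextmanager
def exclusive(lock_path):
    # "Prehook": atomically create the lock file, failing if it exists.
    try:
        fd = os.open(lock_path, os.O_CREAT | os.O_EXCL | os.O_WRONLY)
    except FileExistsError:
        raise AlreadyRunning(str(lock_path)) from None
    os.close(fd)
    try:
        yield
    finally:
        # "Posthook": release the lock even if the wrapped command fails.
        os.remove(lock_path)


def run_exclusively(lock_path, argv):
    """Run argv while holding the lock and return its exit code."""
    with exclusive(lock_path):
        return subprocess.run(argv, check=False).returncode
```

A wrapper like this releases the lock on normal exit and on exceptions, but not if the process is killed outright, which is why a max-age or a manual release would still be useful alongside it.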

The challenge then would be where to store the lock artifact. That could be easy or hard depending on the deployment scenario.

tombriggsallego · 2023-01-11T20:51:23Z

I built a version of @aaronsteers's option 2. It is available here. It's not pretty, but it seems to do the trick. I think option 3 is ultimately the ideal; adding two extra commands to achieve this makes for an ugly pipeline command. :( Extending the EDK is beyond my capabilities at the moment, though. ;)


WillDaSilva transferred this issue from meltano/meltano May 11, 2023

WillDaSilva changed the title ~~bug: Check for already running pipeline doesn't include transformers (or at least dbt...)~~ feat: Provide optional lock to prevent concurrent pipeline execution May 11, 2023
