Understand why pipeline dependencies are confusing for end user #770

sebbacon · 2022-03-31T07:57:08Z

This job failed with a transient DB error, so we asked the user to re-run it. It's the bottom job on this page (link):

The user reports

the failed job was completed successfully...however, its status remains failed ... and that leads to the failure of the notebook action with the following error message: "generate_study_population_hospitalisation_4 failed on a previous run and must be re-run". Just in case, I tried the run_all but it failed as well (see here). My understanding is that a job needs to be run with the failed cohort extractor action and notebook action together at the same time. Am I right?

Why does the user believe the job completed successfully? There's no evidence of that. Perhaps they misinterpreted our instruction that it was safe to re-run?
Or perhaps they did re-run it, and something else has gone wrong?

When we've understood this, consider whether any small UI tweaks or terminology changes could improve intelligibility.

Jongmassey · 2022-03-31T08:50:25Z

The "completed" job refers to this job. I believe @evansd fettled its status following some unspecified internal error: the job didn't actually fail, but it was reported as failed.

sebbacon · 2022-03-31T12:13:14Z

My hypothesis is that the user didn't realise they needed to re-run generate_study_population_hospitalisation_4. Presumably that's because, the last time they experienced a failure and we fixed it, the fix showed that the action that had appeared to fail had not actually failed. This time, they need to start a new job.
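To make the dependency rule concrete, here is a minimal sketch of the behaviour the user appears to have hit. It is not the actual job-runner code. `Action`, `plan_run`, the status strings, and the `generate_notebook` action name are hypothetical; only the wording of the error message and the name generate_study_population_hospitalisation_4 come from the user's report.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Action:
    name: str
    needs: list[str] = field(default_factory=list)  # names of upstream actions


def plan_run(requested, actions, last_status):
    """Return the actions to schedule.

    Assumed rule: a requested action is blocked if any dependency's most
    recent run failed and that dependency is not part of the same request.
    """
    for name in requested:
        for dep in actions[name].needs:
            if last_status.get(dep) == "failed" and dep not in requested:
                raise RuntimeError(
                    f"{dep} failed on a previous run and must be re-run"
                )
    return list(requested)


# Hypothetical two-step pipeline: a cohort extraction feeding a notebook.
ACTIONS = {
    "generate_study_population_hospitalisation_4": Action(
        "generate_study_population_hospitalisation_4"
    ),
    "generate_notebook": Action(
        "generate_notebook",
        needs=["generate_study_population_hospitalisation_4"],
    ),
}
LAST_STATUS = {"generate_study_population_hospitalisation_4": "failed"}
```

Under this assumed rule, requesting only `generate_notebook` raises the error the user quoted, while requesting the failed extraction action alongside it is accepted. The sketch does not explain why the user's run_all attempt also failed.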

sebbacon · 2022-03-31T12:14:43Z

Ah, in fact I now realise we actually advised them incorrectly:

Following this, one step of the job failed with a database connection error after some time. Several other jobs from other projects failed at the same time, and both TPP and the OpenSAFELY tech team are investigating the root cause and implementing mitigations.

The failed job was reset by us to run again and completed successfully.
