Database errors should have their own status_code #768

Open
evansd opened this issue Jan 10, 2025 · 0 comments

evansd (Contributor) commented Jan 10, 2025

We currently append some custom text to the error message when a database job fails with certain exit codes, but we leave the status_code as NONZERO_EXIT, just like any other failed job. That means these failures can't be told apart from ordinary job failures by status_code, which makes monitoring and debugging harder than it needs to be. We should add custom status codes for them, probably one for each specific type of database error.

Related Slack thread:
https://bennettoxford.slack.com/archives/C069YDR4NCA/p1736417917622859

And relevant bits of code:

job-runner/jobrunner/run.py

Lines 448 to 451 in 4fba743

elif job_definition.allow_database_access:
    error_msg = config.DATABASE_EXIT_CODES.get(results.exit_code)
    if error_msg:
        message += f": {error_msg}"

DATABASE_EXIT_CODES = {
    # Custom database-related exit codes return from cohortextractor, see
    # https://github.com/opensafely-core/cohort-extractor/blob/0a314a909817dbcc48907643e0b6eeff319337db/cohortextractor/cohortextractor.py#L787
    3: (
        "A transient database error occurred, your job may run "
        "if you try it again, if it keeps failing then contact tech support"
    ),
    4: "New data is being imported into the database, please try again in a few hours",
    5: "Something went wrong with the database, please contact tech support",
}

# FAILED states
DEPENDENCY_FAILED = "dependency_failed"
NONZERO_EXIT = "nonzero_exit"
CANCELLED_BY_USER = "cancelled_by_user"
UNMATCHED_PATTERNS = "unmatched_patterns"
INTERNAL_ERROR = "internal_error"
KILLED_BY_ADMIN = "killed_by_admin"
STALE_CODELISTS = "stale_codelists"
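
To make the proposal concrete, here is a rough sketch of one way it could look. It is not the job-runner implementation: the `StatusCode` enum here is a cut-down stand-in, the three `DATABASE_*` names and the `classify_failure` helper are hypothetical, and only the exit codes and messages come from the snippets above.

```python
from enum import Enum


class StatusCode(Enum):
    # Existing failure state quoted in this issue.
    NONZERO_EXIT = "nonzero_exit"
    # Hypothetical new states, one per database exit code (names are assumptions).
    DATABASE_TRANSIENT_ERROR = "database_transient_error"
    DATABASE_IMPORT_IN_PROGRESS = "database_import_in_progress"
    DATABASE_ERROR = "database_error"


# Hypothetical reshaping of DATABASE_EXIT_CODES: each exit code maps to a
# (status_code, message) pair rather than to a message alone.
DATABASE_EXIT_CODES = {
    3: (
        StatusCode.DATABASE_TRANSIENT_ERROR,
        "A transient database error occurred, your job may run "
        "if you try it again, if it keeps failing then contact tech support",
    ),
    4: (
        StatusCode.DATABASE_IMPORT_IN_PROGRESS,
        "New data is being imported into the database, please try again in a few hours",
    ),
    5: (
        StatusCode.DATABASE_ERROR,
        "Something went wrong with the database, please contact tech support",
    ),
}


def classify_failure(exit_code, allow_database_access):
    """Return (status_code, extra_message) for a job that exited non-zero.

    extra_message is None when there is no database-specific text to append.
    """
    if allow_database_access and exit_code in DATABASE_EXIT_CODES:
        return DATABASE_EXIT_CODES[exit_code]
    # Anything else stays a plain nonzero exit, as it is today.
    return StatusCode.NONZERO_EXIT, None
```

With something like this, the branch in run.py could set the status code and the message suffix from a single lookup, and jobs without database access would keep getting NONZERO_EXIT even if they happen to exit with 3, 4 or 5.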
