I can confirm this behavior.
What is needed is a database cleanup when ndscheduler starts that changes the status of those executions to "failed", since they most likely did not complete.
I submitted PR #90, which cleans up such interrupted executions in the database.
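To illustrate the idea of a startup cleanup, here is a minimal sketch using the standard `sqlite3` module. It is not the code from the PR: the `executions` table, its `eid` and `state` columns, and the state labels are hypothetical stand-ins, and ndscheduler's real schema and constants may differ.

```python
import sqlite3

# Hypothetical state labels; ndscheduler's real schema and constants may differ.
STATE_RUNNING = "running"
STATE_FAILED = "failed"


def fail_interrupted_executions(conn):
    """Mark every execution still recorded as running as failed.

    Meant to be called once at startup, before any job is scheduled,
    when no execution can legitimately be running yet.
    Returns the number of rows updated.
    """
    with conn:  # commits on success, rolls back on error
        cur = conn.execute(
            "UPDATE executions SET state = ? WHERE state = ?",
            (STATE_FAILED, STATE_RUNNING),
        )
    return cur.rowcount


def demo():
    """Build a throwaway in-memory table and run the cleanup on it."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE executions (eid TEXT PRIMARY KEY, state TEXT)")
    conn.executemany(
        "INSERT INTO executions VALUES (?, ?)",
        [("a", "running"), ("b", "succeeded"), ("c", "running")],
    )
    conn.commit()
    updated = fail_interrupted_executions(conn)
    rows = conn.execute(
        "SELECT eid, state FROM executions ORDER BY eid"
    ).fetchall()
    return updated, rows
```

One caveat with this approach: if several scheduler processes share one database, a blanket update like this would also fail executions that are genuinely running elsewhere, so a real cleanup would probably need to filter further.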
In my case the interruption was caused by running ndscheduler as a systemd unit, which sends SIGTERM on stop/restart rather than the SIGINT that ndscheduler expects. It is possible to change the stop signal used by the systemd unit to SIGINT (for example with `KillSignal=SIGINT`) to ensure a graceful stop of ndscheduler. Another option would be to handle SIGTERM in server.py alongside the existing SIGINT handler.
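The second option could look roughly like the sketch below, which registers one callback for both signals using the standard `signal` module. The names `install_stop_handlers` and `stop_callback` are hypothetical and not taken from ndscheduler's server.py, so treat this as an outline rather than a patch.

```python
import signal


def install_stop_handlers(stop_callback):
    """Register stop_callback for both SIGINT and SIGTERM.

    stop_callback is a hypothetical function that receives the signal
    number and performs the graceful shutdown.
    Must be called from the main thread, as signal.signal requires.
    """
    def _handler(signum, frame):
        stop_callback(signum)

    for sig in (signal.SIGINT, signal.SIGTERM):
        signal.signal(sig, _handler)
```

With this in place, `systemctl stop` (SIGTERM) and Ctrl+C (SIGINT) would both go through the same shutdown path.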
This can happen when the ndscheduler process dies while a job is running.
To reproduce, create a shell job that sleeps for a while, e.g.:
["bash","-c","sleep 3600"]
While the job is running, send a kill signal to the ndscheduler process. The next time ndscheduler starts, the job will be stuck at running.