Bug: update_databases.py does not have handling if job ID doesn't exist #187

Open
hagertnl opened this issue Nov 25, 2024 · 0 comments
Labels
bug Something isn't working

Comments

@hagertnl
Contributor

A cron job running update_databases.py exited this weekend with:

+ update_databases.py --time 3d --user sauetest --machine frontier --loglevel INFO
Enabled 2 database backends
Using machine config: frontier.ini
reading harness config /sw/acceptance/olcf-test-harness-dev/configs/frontier.ini
Traceback (most recent call last):
  File "/sw/acceptance/olcf-test-harness-dev/harness/utilities/update_databases.py", line 291, in <module>
    elif slurm_data[entry['job_id']]['state'] in slurm_job_state_codes['pending']:
KeyError: '2801028'
+ set +x

I suspect this was caused by an sacct issue that left the slurm_data dictionary empty. Either way, update_databases.py should check that a job ID exists in slurm_data before trying to access it.
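
A minimal sketch of what such a guard might look like. This is not the actual code from update_databases.py: the helper name `lookup_job_category`, the logger, and the contents of the state-code table are assumptions made for illustration. Only `slurm_data`, `entry['job_id']`, and `slurm_job_state_codes['pending']` come from the traceback above.

```python
import logging

logger = logging.getLogger(__name__)

# Hypothetical state-code table for illustration only; the real
# slurm_job_state_codes in update_databases.py may differ.
slurm_job_state_codes = {
    "pending": ["PENDING"],
    "running": ["RUNNING"],
}


def lookup_job_category(entry, slurm_data, state_codes):
    """Return the state category for entry's job.

    Returns None if the job ID is missing from slurm_data (for example,
    when sacct returned nothing), and "unknown" if the job is present but
    its state matches no category in state_codes.
    """
    job_id = entry["job_id"]
    # dict.get() returns None for a missing key instead of raising KeyError.
    job = slurm_data.get(job_id)
    if job is None:
        logger.warning("Job ID %s not found in slurm_data; skipping", job_id)
        return None
    for category, states in state_codes.items():
        if job["state"] in states:
            return category
    return "unknown"
```

With a guard like this, an empty slurm_data would produce a warning for each missing job and let the script continue, rather than ending the cron run with a KeyError. Whether a missing job should be skipped, retried, or treated as a hard error is a design decision for the maintainers.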

hagertnl added the bug label Nov 25, 2024