Add SubjectSet completeness metrics #3649
to count the number of retired subjects for a workflow
hook into subject add / remove events to ensure we have an up-to-date count of completeness
Everything seems fine. I had a couple of minor questions about syntax, nothing deal-breaking.
One thing: how does this work with un-retirement? Do we need to call this `SubjectSetCompletenessWorker` in this case as well?
Good catch about unretirement events - I'll update the PR to handle that event :)
Looks good to me! 👍
closes #3450
TODO
`subject_set_completeness_from_read_replica` to read the counts from the replica DB

This PR adds the ability to calculate the completeness of a subject set in the context of a workflow and record these completeness metrics on the `SubjectSet` resource in a JSON blob that is keyed by the workflow id.

As a subject set can belong to many workflows in a project, we need to record the completeness metrics for the SubjectSet in the context of these workflows. Specifically, we need to calculate the ratio of the number of retired subjects in a set to the number of subjects in the set:
`completeness = total_retired_subjects_for_a_workflow_in_a_set / total_subjects_in_a_set`

Finally, we clamp the completeness metric to a sensible range, 0.0 (0%) to 1.0 (100%), to ensure we display sensible numbers and avoid issues with incorrect numerators and denominators in the calculation (described in #2155).
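As a rough illustration of the ratio and clamping described above, here is a minimal Ruby sketch. The method name `subject_set_completeness` and the choice to return 0.0 for an empty set are assumptions made for this example, not code taken from the PR.

```ruby
# Hypothetical helper: computes the clamped completeness ratio for one
# workflow / subject set pair from two pre-computed counts.
def subject_set_completeness(retired_count, total_count)
  # Assumption: an empty (or invalid) set is reported as 0% complete
  # rather than raising a division error.
  return 0.0 if total_count <= 0

  ratio = retired_count.to_f / total_count

  # Clamp to 0.0..1.0 so inconsistent counts never show up as
  # negative values or as more than 100%.
  ratio.clamp(0.0, 1.0)
end

subject_set_completeness(5, 10)  # half of the subjects are retired
subject_set_completeness(15, 10) # over-counted numerator is clamped
```

Clamping after the division keeps the stored value displayable even when the retired count and the set size drift out of sync.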
This PR stores the per-workflow metric in a `subject_set.completeness` jsonb column in the database and uses the atomic jsonb update operators (available since our upgrade to pg v11) to avoid clobbering data already in the jsonb column and/or written by concurrent updates.

Review checklist
`apiary.apib` file?
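For reference, the atomic per-workflow jsonb update mentioned in the description could look roughly like the sketch below. It only builds the SQL string. The helper name `completeness_update_sql`, the `subject_sets` table name, and the use of PostgreSQL's `jsonb_set` function are assumptions for illustration, and the PR may use a different operator or query.

```ruby
# Hypothetical helper: builds an UPDATE that writes a single workflow's
# completeness value without overwriting the other keys in the column.
def completeness_update_sql(subject_set_id, workflow_id, completeness)
  # Coerce inputs to numbers so nothing unexpected is interpolated.
  set_id = Integer(subject_set_id)
  wf_id  = Integer(workflow_id)
  value  = Float(completeness)

  # jsonb_set replaces only the value at the given path, so keys for
  # other workflows are left intact. COALESCE covers a NULL column.
  <<~SQL
    UPDATE subject_sets
    SET completeness = jsonb_set(COALESCE(completeness, '{}'::jsonb), '{#{wf_id}}', '#{value}'::jsonb, true)
    WHERE id = #{set_id}
  SQL
end

puts completeness_update_sql(7, 42, 0.5)
```

Because the merge happens inside a single UPDATE statement, two workers updating different workflow keys on the same subject set should not overwrite each other's values.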