Add SubjectSet completeness metrics (#3649)
* add a subject set workflow counter to count the number of retired subjects for a workflow
* add a completeness json attribute to subject_set
* spec out a subject set completeness worker
* fix linter warnings
* correctly spell the spec file name
* add json field atomic update
* match perform method arg order
* ignore rspec change block syntax in linter
* add specs for clobbering and range clamping
* use the read replica via feature flag
* add subject_set completeness to serializer
* add frozen string literal magic comment
* recalculate the set completeness on subject add / remove: hook into subject add / remove events to ensure we have an up-to-date count of completeness
* run the set completeness worker on each retirement event
* recalculate subject set completeness when unretiring
Showing 15 changed files with 302 additions and 8 deletions.
@@ -0,0 +1,23 @@
# frozen_string_literal: true

class SubjectSetWorkflowCounter
  attr_reader :subject_set_id, :workflow_id

  def initialize(subject_set_id, workflow_id)
    @subject_set_id = subject_set_id
    @workflow_id = workflow_id
  end

  # count the number of subjects in this subject set
  # that have been retired for this workflow
  def retired_subjects
    scope =
      SubjectWorkflowStatus
      .where(workflow: workflow_id)
      .joins(workflow: :subject_sets)
      .where(subject_sets: { id: subject_set_id })
      .retired

    scope.count
  end
end
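The comment on `retired_subjects` describes the intended count: subjects that belong to the subject set and are retired for the workflow. As a minimal sketch of that counting rule, the plain-Ruby model below uses a hypothetical `Status` struct and a hypothetical `retired_subjects_in_set` helper in place of the ActiveRecord models; neither name comes from the commit, and the sketch does not reproduce the SQL the query above generates.

```ruby
# frozen_string_literal: true

# Hypothetical in-memory stand-in for a subject workflow status row.
# This is not the Panoptes model, only an illustration.
Status = Struct.new(:workflow_id, :subject_id, :retired_at)

# Count statuses that are retired for the given workflow and whose
# subject is one of the subject ids belonging to the set.
def retired_subjects_in_set(statuses, workflow_id, set_subject_ids)
  statuses.count do |status|
    status.workflow_id == workflow_id &&
      !status.retired_at.nil? &&
      set_subject_ids.include?(status.subject_id)
  end
end

statuses = [
  Status.new(1, 10, Time.now.utc), # retired, in the set
  Status.new(1, 11, nil),          # in the set, not retired
  Status.new(1, 99, Time.now.utc), # retired, but not in the set
  Status.new(2, 10, Time.now.utc)  # retired for a different workflow
]

puts retired_subjects_in_set(statuses, 1, [10, 11])
```

Only the first status satisfies all three conditions, which mirrors the "returns 1" case in the counter spec.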
@@ -0,0 +1,42 @@
# frozen_string_literal: true

class SubjectSetCompletenessWorker
  include Sidekiq::Worker
  using Refinements::RangeClamping

  sidekiq_options queue: :data_low

  sidekiq_options congestion: {
    interval: 30, # N jobs (below) in each 30s
    max_in_interval: 1, # only 1 job every interval above
    min_delay: 60, # next job can run 60s after the last one
    reject_with: :reschedule, # reschedule the job to run later (avoid db pressure) so we do eventually run all the jobs and the stored metrics eventually align
    key: ->(subject_set_id, workflow_id) { "subject_set_#{subject_set_id}_completeness_#{workflow_id}_worker" }
  }

  sidekiq_options lock: :until_executing

  def perform(subject_set_id, workflow_id)
    subject_set = SubjectSet.find(subject_set_id)
    workflow = Workflow.find_without_json_attrs(workflow_id)

    # find the count of all retired subjects, for a known subject set, in the context of a known workflow
    # using the read replica if the feature flag is enabled
    retired_subjects_completeness = 0.0
    DatabaseReplica.read('subject_set_completeness_from_read_replica') do
      retired_subjects_count = SubjectSetWorkflowCounter.new(subject_set.id, workflow.id).retired_subjects * 1.0
      total_subjects_count = subject_set.set_member_subjects_count * 1.0
      # calculate and clamp the completeness value between 0.0 and 1.0, i.e. 0 to 100%
      retired_subjects_completeness = (0.0..1.0).clamp(retired_subjects_count / total_subjects_count)
    end

    # store this per-workflow completeness metric in a json object keyed by the workflow id
    # use the atomic DB json operator to avoid clobbering data in the jsonb attribute by other updates
    # https://www.postgresql.org/docs/11/functions-json.html
    SubjectSet.where(id: subject_set.id).update_all(
      "completeness = jsonb_set(completeness, '{#{workflow_id}}', '#{retired_subjects_completeness}', true)"
    )
  rescue ActiveRecord::RecordNotFound
    # avoid running sql count queries for subject sets and workflows we can't find
  end
end
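The worker divides the retired count by the total count and clamps the result to the range 0.0 to 1.0. A minimal sketch of that calculation in plain Ruby follows, using the core `Comparable#clamp` instead of the project's `Refinements::RangeClamping` (whose implementation is not shown in this diff). The helper name `completeness_ratio` is hypothetical, and returning 0.0 for an empty set is an assumption made here, not behaviour confirmed by the commit; it is included because Float division by zero in Ruby yields `NaN` or `Infinity` rather than raising.

```ruby
# frozen_string_literal: true

# Hypothetical helper, not part of the commit.
# Returns the fraction of retired subjects, clamped to 0.0..1.0.
def completeness_ratio(retired_count, total_count)
  # Assumption: a set with no subjects is reported as 0% complete,
  # which also avoids a NaN or Infinity result from dividing by zero.
  return 0.0 unless total_count.positive?

  (retired_count.to_f / total_count).clamp(0.0, 1.0)
end

puts completeness_ratio(1, 2) # half of the subjects are retired
puts completeness_ratio(5, 2) # more retired than members, clamped
puts completeness_ratio(0, 0) # empty set, guarded
```

Clamping matters because the two counts come from different sources and can drift apart briefly, for example while subjects are being removed from a set.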
10 changes: 10 additions & 0 deletions
db/migrate/20210729152047_add_workflow_completeness_to_subject_set.rb
@@ -0,0 +1,10 @@
# frozen_string_literal: true

class AddWorkflowCompletenessToSubjectSet < ActiveRecord::Migration
  def change
    # since PG v11+ we can add a new column and a default at the same time
    # https://github.com/ankane/strong_migrations#bad-1
    # https://www.2ndquadrant.com/en/blog/add-new-table-column-default-value-postgresql-11/
    add_column :subject_sets, :completeness, :jsonb, default: {}
  end
end
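The `completeness` column added here is the one the worker updates with `jsonb_set`, one key per workflow id. The sketch below shows one way the SQL fragment could be assembled with both interpolated values coerced to numbers first. The helper name `completeness_update_sql` is hypothetical and the coercion is an addition for illustration; the commit interpolates the values directly.

```ruby
# frozen_string_literal: true

# Hypothetical helper, not part of the commit: builds the SET clause
# passed to update_all, coercing both interpolated values to numbers.
def completeness_update_sql(workflow_id, completeness)
  key = Integer(workflow_id)   # raises ArgumentError on non-numeric input
  value = Float(completeness)  # raises ArgumentError on non-numeric input
  "completeness = jsonb_set(completeness, '{#{key}}', '#{value}', true)"
end

puts completeness_update_sql(42, 0.5)
```

PostgreSQL's `jsonb_set` returns NULL when its target is NULL, so the `default: {}` in the migration is what gives the first update for a subject set an object to write into.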
@@ -0,0 +1,31 @@
# frozen_string_literal: true

require 'spec_helper'

describe SubjectSetWorkflowCounter do
  let(:subject_set) { create(:subject_set_with_subjects, num_workflows: 1, num_subjects: 2) }
  let(:workflow) { subject_set.workflows.first }
  let(:counter) { described_class.new(subject_set.id, workflow.id) }

  describe 'retired_subjects' do
    it 'returns 0 if there are none' do
      expect(counter.retired_subjects).to eq(0)
    end

    context 'with retired_subjects' do
      let(:subject_to_retire) { subject_set.subjects.first }

      before do
        SubjectWorkflowStatus.create(
          workflow_id: workflow.id,
          subject_id: subject_to_retire.id,
          retired_at: Time.now.utc
        )
      end

      it 'returns 1' do
        expect(counter.retired_subjects).to eq(1)
      end
    end
  end
end