[SPIKE] Decouple DSA from Solr #4578
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -2,7 +2,7 @@ | |
|
||
require 'csv' | ||
|
||
# Find items that are goverened by the provided APO and then return all catkeys and refresh status. | ||
# Find items that are governed by the provided APO and then return all catkeys and refresh status. | ||
Typo.
||
# https://github.com/sul-dlss/dor-services-app/issues/4373 | ||
# Invoke via: | ||
# bin/rails r -e production "ApoCatkey.report('druid:bx911tp9024')" | ||
|
@@ -12,17 +12,7 @@ def self.report(apo_druid) | |
|
||
CSV.open(output_file, 'w') do |csv| | ||
csv << %w[druid catkey refresh] | ||
query = "is_governed_by_ssim:\"info:fedora/#{apo_druid}\"&objectType_ssim:\"item\"" | ||
druids = [] | ||
# borrowed from bin/generate-druid-list | ||
loop do | ||
results = SolrService.query('*:*', fl: 'id', rows: 10000, fq: query, start: druids.length, sort: 'id asc') | ||
break if results.empty? | ||
|
||
results.each { |r| druids << r['id'] } | ||
sleep(0.5) | ||
end | ||
|
||
druids = Dro.has_admin_policy(apo_druid).map(&:external_identifier) | ||
num_dros = druids.size | ||
puts "Found #{num_dros} objects that are governed by APO #{apo_druid}" | ||
druids.each_with_index do |druid, i| | ||
|
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -4,7 +4,6 @@ | |
class DeleteService | ||
# Tries to remove any existence of the object in our systems | ||
# Does the following: | ||
# - Removes item from Fedora/Solr | ||
Cruft.
||
# - Removes content from dor workspace | ||
# - Removes content from assembly workspace | ||
# - Removes content from sdr export area | ||
|
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,19 +1,37 @@ | ||
# frozen_string_literal: true | ||
|
||
# Finds the members of a collection by using Solr | ||
# Finds the members of a collection | ||
class MemberService | ||
# @param [String] druid the identifier of the collection | ||
# @param [Boolean] only_published when true, restrict to only published items | ||
# @param [Boolean] exclude_opened when true, exclude opened items | ||
# @return [Array<Hash<String,String>>] the members of this collection | ||
def self.for(druid, only_published: false, exclude_opened: false) | ||
query = "is_member_of_collection_ssim:\"info:fedora/#{druid}\"" | ||
query += ' published_dttsim:[* TO *]' if only_published | ||
query += ' -processing_status_text_ssi:Opened' if exclude_opened | ||
args = { | ||
fl: 'id,objectType_ssim', | ||
rows: 100_000_000 | ||
} | ||
SolrService.query query, args | ||
Dro | ||
.members_of_collection(druid) | ||
.then { |members| reject_opened_members(members, exclude_opened) } | ||
.then { |members| select_published_members(members, only_published) } | ||
.map do |member| | ||
{ | ||
'id' => member.external_identifier, | ||
'objectType' => member.content_type == Cocina::Models::ObjectType.agreement ? 'agreement' : 'item' | ||
} | ||
end | ||
end | ||
|
||
def self.reject_opened_members(members, exclude_opened) | ||
return members unless exclude_opened | ||
|
||
members.reject do |member| | ||
WorkflowClientFactory.build.status(druid: member.external_identifier, version: member.version).display_simplified == 'Opened' | ||
end | ||
end | ||
|
||
def self.select_published_members(members, only_published) | ||
return members unless only_published | ||
|
||
members.select do |member| | ||
WorkflowClientFactory.build.lifecycle(druid: member.external_identifier, milestone_name: 'published', version: member.version).present? | ||
end | ||
Comment on lines +12 to +35: This is the sluggish part.
||
end | ||
end |
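To illustrate the review note about the sluggish part of `MemberService`, here is a toy sketch of why it is slow: every member costs one remote lookup for each enabled filter. `FakeWorkflowClient`, `opened?`, `published?`, and `filter_members` are hypothetical names invented for this example, and the stub only counts calls where the real workflow client would make a network request. The sketch also builds the client once and filters in a single pass, which is one possible shape and not what this PR does.

```ruby
# Hypothetical stand-in for the workflow client. The real client makes a
# network call per lookup; this stub only counts how many would be made.
class FakeWorkflowClient
  attr_reader :calls

  def initialize(statuses)
    @statuses = statuses # { druid => { opened: Boolean, published: Boolean } }
    @calls = 0
  end

  def opened?(druid:, version:)
    @calls += 1
    @statuses.fetch(druid)[:opened]
  end

  def published?(druid:, version:)
    @calls += 1
    @statuses.fetch(druid)[:published]
  end
end

# Minimal stand-in for a Dro record (assumption: only these two fields matter here).
Member = Struct.new(:external_identifier, :version)

# Filters members in one pass with a single shared client.
# A lookup is skipped entirely when its flag is off.
def filter_members(members, client:, only_published: false, exclude_opened: false)
  members.select do |member|
    args = { druid: member.external_identifier, version: member.version }
    next false if exclude_opened && client.opened?(**args)
    next false if only_published && !client.published?(**args)

    true
  end
end

statuses = {
  'druid:aa' => { opened: true,  published: true },
  'druid:bb' => { opened: false, published: true },
  'druid:cc' => { opened: false, published: false }
}
members = statuses.keys.map { |druid| Member.new(druid, 1) }
client = FakeWorkflowClient.new(statuses)

kept = filter_members(members, client: client, only_published: true, exclude_opened: true)
puts kept.map(&:external_identifier).inspect
puts "lookups: #{client.calls}"
```

The number of lookups still grows with collection size, so sharing the client only trims per-call setup. Removing the cost would need the status data to be available without a per-member request.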
This file was deleted.
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -4,6 +4,8 @@ | |
require_relative '../config/environment' | ||
require 'optparse' | ||
|
||
# TODO: Figure out if we still want this or not, given how tightly coupled this functionality is to Solr | ||
See comment.
||
|
||
options = { output: 'druids.txt', quiet: false } | ||
parser = OptionParser.new do |option_parser| | ||
option_parser.banner = 'Usage: bin/generate-druid-list \'<QUERY, e.g., project_tag_ssim:"Naxos : born digital audio">\' [options]' | ||
|
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,5 +1,7 @@ | ||
# frozen_string_literal: true | ||
|
||
# TODO: Figure out if we still want this or not, given how tightly coupled this functionality is to Solr | ||
See comment.
||
|
||
namespace :missing_druids do | ||
desc 'Find unindexed druids' | ||
task unindexed_objects: :environment do | ||
|
If nothing else, I learned a bit more about how to do fancy JSONB queries with our crazy nested JSON.
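For readers unfamiliar with the JSONB queries mentioned here, a minimal sketch of one way to express "is a member of this collection" as a Postgres containment (`@>`) test. The column name `structural`, the key `isMemberOf`, the helper name, and the example druid are all assumptions for illustration. None of them come from this diff, and the actual `Dro.members_of_collection` scope may be written quite differently.

```ruby
require 'json'

# Hypothetical helper: builds the JSON document used on the right-hand side
# of a Postgres JSONB containment (@>) test. The key 'isMemberOf' is assumed.
def collection_membership_filter(collection_druid)
  JSON.generate({ 'isMemberOf' => [collection_druid] })
end

# Placeholder druid, not a real object.
filter = collection_membership_filter('druid:bc123df4567')
puts filter

# Assumed usage inside an ActiveRecord scope (not executed here):
#   where('structural @> ?::jsonb', filter)
# In Postgres, a JSONB array contains another when every element of the
# right-hand array appears in the left-hand one, so this would match rows
# whose isMemberOf list includes the given collection.
```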