Fixes #4459
This is a spike commit towards an SDR Evolution the team has been batting around for a while now, namely severing DSA's dependency on Solr. The spike largely replaces Solr queries with direct DB queries, and for most use cases this works just fine. The key word here is "most..."
* The Solr queries have been replaced with DB queries that reach into JSONB columns, which results in table scans. I tested all of these queries in stage with large-ish (but not prod-huge) data sets (~25K records), and most of them perform fine. That said, we might want to test this with prod-like data and do some benchmarking to determine whether we want to index more of the JSONB data.
* A notable performance outlier is `MemberService.for`, which needs to make a single Workflow API call for *each* member of a virtual object. This is impressively slow for a virtual object with a few thousand members, taking over a minute to complete.
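
To make the `MemberService.for` cost concrete, here is a minimal sketch of the one-call-per-member access pattern. It does not use the real Workflow client: `FakeWorkflowClient`, `status_for`, `member_statuses`, and the 1 ms latency are all hypothetical stand-ins chosen for illustration.

```ruby
require 'benchmark'

# Hypothetical stand-in for the Workflow API client. The real client and its
# method names are not shown in this PR.
class FakeWorkflowClient
  attr_reader :calls

  def initialize(latency_seconds:)
    @latency_seconds = latency_seconds
    @calls = 0
  end

  # Simulates one network round trip for a single druid.
  def status_for(druid)
    @calls += 1
    sleep(@latency_seconds)
    "status-of-#{druid}"
  end
end

# Mirrors the access pattern described above: one API call per member.
def member_statuses(member_druids, client)
  member_druids.map { |druid| client.status_for(druid) }
end

client = FakeWorkflowClient.new(latency_seconds: 0.001)
members = Array.new(50) { |i| format('druid:bb%03dcc0000', i) }

statuses = nil
elapsed = Benchmark.realtime { statuses = member_statuses(members, client) }
puts "#{client.calls} calls took #{elapsed.round(3)}s"
```

Because the calls are sequential, total time grows linearly with member count. Under an assumed (not measured) 20 ms per call, 3,000 members would take about 60 seconds, which is in line with the "over a minute" observation.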
Another question we'd need to answer to take this work forward is what to do about `bin/generate-druid-list`, which allows a user to issue Solr queries directly, and `lib/tasks/missing_druids.rake`, which compares what's in the DSA DB and what's in Solr to determine if any objects need (re-)indexing. Are these still useful? If so, could they live elsewhere or could we solve these problems in a different way? If the answer is no, we may not want to proceed with this decoupling.
**NOTE:** Since this is a spike meant to generate discussion, I have not yet bothered to deal with changing the tests (or caring about linting). That will naturally come later if we decide the idea and implementation have merit.
This is an "SDR evolution" ticket, the intent of which is to reduce dependencies within DSA. (HT to @jcoyne for the idea!)
As far as we know, Solr is used within DSA for:
Since DSA already sits atop the source of truth for Cocina (Postgres), and it's queryable, DSA can get this information directly from Postgres without needing to consult Solr.