What is the reason to use the Postgres id column for the _id field in Elasticsearch? #1947
The Elasticsearch [...] What is the reason for this difference? It makes certain queries against Elasticsearch oddly different (and easy to mess up, if you're not aware of this small detail), as the [...] Would it be possible to change the [...]
Replies: 1 comment 1 reply
As far as I understand from the "How Indexing works" diagram, the _id was used to determine whether the items in the database need to be synced with the Elasticsearch index. We have since migrated to refreshing all data instead of trying to sync only the latest items.

In fact, we have a (now obsolete, I think) comment in ingestion_server/indexer.py describing this process:

openverse/ingestion_server/ingestion_server/indexer.py
Lines 1 to 6 in 4d6e995

It seems that this was the issue for implementing this kind of sync between upstream and Elasticsearch: cc-archive/cccatalog-api#7
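
To illustrate why an incrementing Postgres id is convenient as the Elasticsearch _id for this kind of sync, here is a minimal sketch in plain Python. It is only one plausible reading of the old process, not the actual indexer.py code: db_rows, es_index, rows_to_sync and sync are hypothetical names, and a dict stands in for the Elasticsearch index.

```python
from typing import Dict, List

# Hypothetical stand-ins (assumptions, not Openverse code):
# `db_rows` plays the Postgres table, `es_index` plays the
# Elasticsearch index, keyed by the document `_id`.


def rows_to_sync(db_rows: List[dict], es_index: Dict[int, dict]) -> List[dict]:
    """Return the database rows whose id is newer than anything indexed."""
    # With the Postgres id reused as `_id`, the highest indexed `_id`
    # tells us where the previous sync stopped.
    last_indexed_id = max(es_index, default=0)
    return [row for row in db_rows if row["id"] > last_indexed_id]


def sync(db_rows: List[dict], es_index: Dict[int, dict]) -> int:
    """Copy the missing rows into the index and return how many were added."""
    pending = rows_to_sync(db_rows, es_index)
    for row in pending:
        es_index[row["id"]] = row  # the Postgres id becomes the `_id`
    return len(pending)


db_rows = [
    {"id": 1, "title": "a"},
    {"id": 2, "title": "b"},
    {"id": 3, "title": "c"},
]
es_index = {1: db_rows[0]}  # only the first row has been indexed so far
synced = sync(db_rows, es_index)
```

With a full refresh of all data, this comparison is no longer needed, which is why the choice of _id may now be open to change.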