Merge function inherently flawed #50
Comments
Until this is fixed, we should go back to modifying bdnycdev.db in Dropbox and rely on Dropbox to notify us of conflicted copies. The only people with write access are me, @dr-rodriguez, and @hover2pi.
I don't fully understand the problem. What should happen is that records from the modified SOURCES table are merged into the master SOURCES table and given a new source_id. Then the source_id column is updated in all relevant tables. Is this not what happens? There could be a problem arising from the SOURCES table not having a NOT NULL column requirement since this is what astrodbkit looks for when testing record duplication. For example, if you modify a record in the SOURCES table then merge, it will think they are two separate sources. Is that what you mean?
This workflow is dangerous because the modified database could reference the source_ids in multiple tables that are not merged at the same time as the SOURCES table. In general, the idea of changing primary keys of tables when merging is very unsettling. Every effort should be made to maintain the primary key from the modified table and to give prominent warnings when it will be changed.
Perhaps the source_ids could be a unique identifier instead of an arbitrary sequence of numbers, such as a randomly generated string of numbers and characters of a set length. Or, better, the shortname of the object minus the +/-. For example, 1503+2525 would have source_id = 15032525. Where objects share shortnames we'd need a remedy, but in most cases this would be unique and would not need to be renumbered on merge the way the current workflow does.
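A minimal sketch of the shortname idea, for illustration only. The function name `shortname_to_source_id` is hypothetical and is not part of astrodbkit; it simply strips the sign and reads the remaining digits as an integer.

```python
def shortname_to_source_id(shortname: str) -> int:
    """Derive an integer id from a shortname such as '1503+2525'.

    Hypothetical helper: drops the +/- sign and parses the digits.
    """
    digits = shortname.replace("+", "").replace("-", "")
    if not digits.isdigit():
        raise ValueError(f"unexpected shortname format: {shortname!r}")
    return int(digits)


print(shortname_to_source_id("1503+2525"))  # 15032525
```

Note that dropping the sign means 1503+2525 and 1503-2525 map to the same id, which is one concrete case of the shared-shortname problem that would need a remedy.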
The source_ids are already a unique identifier. I haven't tested if source_ids are being updated, but found the code that's supposed to do it, so I can look into that later. I thought the problem was more with non-source_id tables. For example, updating a publication creates a new publication id rather than replacing the existing one as it has no way of checking for duplicates. This new record is not updated throughout the sources table, or anywhere else.
Just tested and indeed this is what happens. There is no way to check for duplicates in the SOURCES table and so the source_id would not be replaced anywhere. The same thing happens for the PUBLICATIONS table (and likely elsewhere): an updated record becomes a new one whose new id is not reflected in SOURCES.
Right, so the first-order solution is to require a UNIQUE NOT NULL column in addition to the id column. The source_id changing across other tables could be remedied by simply not allowing individual table merges. Modified databases should always be merged completely into the master so that source_ids can be updated in all tables if something in the SOURCES table changes.
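As a rough illustration of why a UNIQUE NOT NULL column helps, here is a sketch using Python's built-in `sqlite3` module rather than astrodbkit's merge code. The table layout, the choice of `shortname` as the unique column, and the helper names `make_db` and `merge_source` are all assumptions made for the example.

```python
import sqlite3


def make_db() -> sqlite3.Connection:
    """Create an in-memory database with an assumed, simplified sources table."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        """CREATE TABLE sources (
               id INTEGER PRIMARY KEY,
               shortname TEXT UNIQUE NOT NULL,  -- the column used to detect duplicates
               ra REAL,
               dec REAL
           )"""
    )
    return conn


def merge_source(master: sqlite3.Connection, record: dict) -> int:
    """Insert a record, or update the existing row with the same shortname.

    Returns the id the record has in the master database.
    """
    row = master.execute(
        "SELECT id FROM sources WHERE shortname = ?", (record["shortname"],)
    ).fetchone()
    if row is None:
        cur = master.execute(
            "INSERT INTO sources (shortname, ra, dec) VALUES (?, ?, ?)",
            (record["shortname"], record["ra"], record["dec"]),
        )
        return cur.lastrowid
    # Same object, modified values: update in place so the id does not change.
    master.execute(
        "UPDATE sources SET ra = ?, dec = ? WHERE id = ?",
        (record["ra"], record["dec"], row[0]),
    )
    return row[0]


master = make_db()
first_id = merge_source(master, {"shortname": "1503+2525", "ra": 225.9, "dec": 25.4})
second_id = merge_source(master, {"shortname": "1503+2525", "ra": 225.8, "dec": 25.4})
print(first_id == second_id)  # True: the edited record reuses the existing id
```

With the unique column in place, an edited record is recognized as the same source instead of becoming a second row with a new id.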
Merge function needs to retain id/primary keys from modified db and not generate new ones by default.
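One possible shape for such a merge, sketched with plain `sqlite3` and not taken from astrodbkit. The two-table schema, the `photometry` table, and the names `new_db` and `merge_all` are hypothetical. The sketch keeps the id from the modified database whenever it is free in the master, warns whenever an id has to change, and updates the dependent table in the same pass.

```python
import sqlite3
import warnings

# Assumed, simplified schema for illustration only.
SCHEMA = """
CREATE TABLE sources (id INTEGER PRIMARY KEY, shortname TEXT UNIQUE NOT NULL);
CREATE TABLE photometry (id INTEGER PRIMARY KEY, source_id INTEGER NOT NULL, magnitude REAL);
"""


def new_db() -> sqlite3.Connection:
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    return conn


def merge_all(master: sqlite3.Connection, modified: sqlite3.Connection) -> dict:
    """Merge every table in one pass; return {modified source id: master source id}."""
    id_map = {}
    rows = modified.execute("SELECT id, shortname FROM sources ORDER BY id").fetchall()
    for old_id, shortname in rows:
        match = master.execute(
            "SELECT id FROM sources WHERE shortname = ?", (shortname,)
        ).fetchone()
        taken = master.execute("SELECT 1 FROM sources WHERE id = ?", (old_id,)).fetchone()
        if match is not None:
            new_id = match[0]  # object already in master: reuse its id
        elif taken is None:
            # Id is free in master: keep the primary key from the modified db.
            master.execute(
                "INSERT INTO sources (id, shortname) VALUES (?, ?)", (old_id, shortname)
            )
            new_id = old_id
        else:
            # Id clash with a different object: let SQLite assign a fresh id.
            new_id = master.execute(
                "INSERT INTO sources (shortname) VALUES (?)", (shortname,)
            ).lastrowid
        if new_id != old_id:
            warnings.warn(f"source_id {old_id} ({shortname}) becomes {new_id} in master")
        id_map[old_id] = new_id

    # Dependent rows are remapped in the same merge, never separately.
    child_rows = modified.execute("SELECT source_id, magnitude FROM photometry").fetchall()
    for source_id, magnitude in child_rows:
        mapped = id_map[source_id]
        exists = master.execute(
            "SELECT 1 FROM photometry WHERE source_id = ? AND magnitude = ?",
            (mapped, magnitude),
        ).fetchone()
        if exists is None:
            master.execute(
                "INSERT INTO photometry (source_id, magnitude) VALUES (?, ?)",
                (mapped, magnitude),
            )
    master.commit()
    return id_map
```

Returning the id mapping makes any renumbering explicit, so a caller can log it or refuse the merge if ids would change.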