Merge function inherently flawed #50

kelle · 2016-05-13T20:56:07Z

Merge function needs to retain id/primary keys from modified db and not generate new ones by default.

kelle · 2016-05-13T20:58:00Z

until this is fixed, we should go back to modifying bdnycdev.db in dropbox and rely on Dropbox notify us of conflicted copies. Only people with write access include me, @dr-rodriguez , and @hover2pi

hover2pi · 2016-05-19T14:48:04Z

I don't fully understand the problem.

What should happen is that records from the modified SOURCES table are merged into the master SOURCES table and given a new source_id. Then the source_id column is updated in all relevant tables.

Is this not what happens?

There could be a problem arising from the SOURCES table not having a NOT NULL column requirement since this is what astrodbkit looks for when testing record duplication. For example, if you modify a record in the SOURCES table then merge, it will think they are two separate sources. Is that what you mean?

kelle · 2016-05-19T18:35:39Z

This workflow is dangerous because the modified database could reference the source_id's in multiple tables which are not merged at the same time as the sources table. In general, the idea of changing primary keys of tables when merging is very unsettling. Every effort should be made to maintain the primary key from the modified table and give big big warnings when it will be changed.

paigegiorlagodfrey · 2016-05-27T15:41:20Z

Perhaps if the source_ids are a unique identifier instead of a random ordered number set. Such as, a randomly generated string of numbers and characters of a set length. OR, and better, the shortname of the object minus the +/-. For example, 1503+2525 would have source_id = 15032525. In the case where objects share shortnames, we'd need a remedy, but in most cases this would be unique and not necessary to renumber when merged like the current workflow.

dr-rodriguez · 2016-05-27T15:56:26Z

The source_ids are already a unique identifier. I haven't tested if source_ids are being updated, but found the code that's supposed to do it, so I can look into that later.

I thought the problem was more with non-source_id tables. For example, updating a publication creates a new publication id rather than replacing the existing one as it has no way of checking for duplicates. This new record is not updated throughout the sources table, or anywhere else.

dr-rodriguez · 2016-05-27T16:42:35Z

For example, if you modify a record in the SOURCES table then merge, it will think they are two separate sources. Is that what you mean?

Just tested and indeed this is what happens. There is no way to check for duplicates in the SOURCES table and so the source_id would not be replaced anywhere. The same thing happens for the PUBLICATIONS table (and likely elsewhere): an updated record becomes a new one whose new id is not reflected in SOURCES.

hover2pi · 2016-06-03T14:52:28Z

Right, so the first order solution is to require a UNIQUE NOT NULL column in addition to the id column.

The source_id changing across other tables could be remedied by just not allowing individual table merges. Modified databases should always be completely merged into the master so source_ids can be updated in all tables if something in the SOURCES table changes.

kelle added the Critical label May 13, 2016

hover2pi self-assigned this Jul 6, 2016

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Merge function inherently flawed #50

Merge function inherently flawed #50

kelle commented May 13, 2016

kelle commented May 13, 2016

hover2pi commented May 19, 2016

kelle commented May 19, 2016

paigegiorlagodfrey commented May 27, 2016

dr-rodriguez commented May 27, 2016

dr-rodriguez commented May 27, 2016

hover2pi commented Jun 3, 2016

Merge function inherently flawed #50

Merge function inherently flawed #50

Comments

kelle commented May 13, 2016

kelle commented May 13, 2016

hover2pi commented May 19, 2016

kelle commented May 19, 2016

paigegiorlagodfrey commented May 27, 2016

dr-rodriguez commented May 27, 2016

dr-rodriguez commented May 27, 2016

hover2pi commented Jun 3, 2016