Keep the purge option when syncing schemas #104

dvergari · 2024-03-06T13:46:47Z

As of now, the only way to drop tables on the RIGHT cluster that do not exist on the LEFT is to use the --sync option. However, when we're converting a legacy managed table to an external one, it does not set the PURGE option, potentially leaving unwanted data on the RIGHT cluster.

dstreev · 2024-04-29T14:29:40Z

Adding the purge back in this scenario might cause issues with a schema update, since that process drops and recreates the table. If the purge flag were set, we'd inadvertently remove the data.

What if we built an extra 'post' run file that issued hdfs dfs -rm -r -f commands when a RIGHT side schema meets this 'drop' scenario?
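
To make the idea concrete, here is a rough sketch of how such a 'post' run file could be generated. It is not how the tool does it today: the function name `build_cleanup_script` and the example locations are hypothetical, and it assumes the storage locations of the dropped RIGHT side tables are already known. It only produces the script text, so the commands can be reviewed before anyone runs them.

```python
import shlex


def build_cleanup_script(locations):
    """Return shell script text with one 'hdfs dfs -rm -r -f' line per location.

    'locations' is a hypothetical list of storage paths for RIGHT side tables
    that met the 'drop' scenario. Nothing is deleted here; only text is built.
    """
    lines = [
        "#!/usr/bin/env bash",
        "# Review before running: these commands delete data.",
    ]
    for loc in locations:
        loc = loc.strip()
        if not loc:
            continue  # skip blank entries rather than emit a bare rm
        # Quote the path so spaces or shell metacharacters stay one argument.
        lines.append("hdfs dfs -rm -r -f " + shlex.quote(loc))
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Example paths are made up for illustration.
    print(build_cleanup_script([
        "/warehouse/tablespace/managed/hive/db.db/old_table",
        "/warehouse/tablespace/external/hive/db.db/old table",
    ]))
```

Keeping the deletes in a separate file, rather than setting the purge flag on the table, means the drop and recreate done by a schema update would never remove data by itself; the cleanup only happens if the file is run.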
