Keep the purge option when syncing schemas #104

Open
dvergari opened this issue Mar 6, 2024 · 1 comment

dvergari commented Mar 6, 2024

Currently, the only way to drop tables on the RIGHT cluster that do not exist on the LEFT is to use the --sync option. However, when a legacy managed table is being converted to an external one, --sync does not set the PURGE option, so unwanted data may be left behind on the RIGHT cluster.

dstreev (Collaborator) commented Apr 29, 2024

Adding the purge back in this scenario might cause issues with a schema update, since that process drops and recreates the table. If the purge flag were set, we'd inadvertently remove the data.

What if we built an extra 'post' run file that issued hdfs dfs -rm -r -f commands when a RIGHT side schema meets this 'drop' scenario?
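
To make the idea concrete, here is a rough sketch of how such a 'post' run file could be generated. It is not code from this project: the function name `build_post_cleanup_script` and the example locations are hypothetical, and it assumes the tool already knows the storage location of each RIGHT side table that falls into the 'drop' scenario. It only builds the text of the file, so the commands can be reviewed before anyone runs them.

```python
import shlex
from typing import Iterable, List


def build_post_cleanup_script(locations: Iterable[str]) -> List[str]:
    """Build the lines of a 'post' run file (hypothetical helper).

    Each location is assumed to be the storage path of a RIGHT side
    table that was dropped without PURGE and whose data should go.
    """
    lines = ["#!/usr/bin/env bash"]
    seen = set()
    for location in locations:
        location = location.strip()
        # Skip blank entries and duplicates so each path is removed once.
        if not location or location in seen:
            continue
        seen.add(location)
        # Quote the path so unusual characters cannot alter the command.
        lines.append(f"hdfs dfs -rm -r -f {shlex.quote(location)}")
    return lines


if __name__ == "__main__":
    # Example locations are made up for illustration only.
    example = ["/warehouse/sales.db/orders_old", "/warehouse/sales.db/tmp_load"]
    print("\n".join(build_post_cleanup_script(example)))
```

Keeping the removals in a separate file, rather than setting PURGE on the table, means a schema update that drops and recreates a table would not touch the data; only tables explicitly listed for cleanup would be affected.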
