Legacy and Older Hive 3 Clusters didn't like the dynamic settings for partition data movement.

dstreev committed May 13, 2022
1 parent 14679c3 commit 547e13f
Showing 4 changed files with 24 additions and 24 deletions.
README.md: 6 changes (0 additions, 6 deletions)
@@ -1225,12 +1225,6 @@ The process is complete with a markdown report location at the bottom of the out

The report location is displayed at the bottom of the application window when it's complete.

### Action Report for LEFT and RIGHT Clusters

Under certain conditions, additional actions may be required to achieve the desired 'mirror' state. An output script for each cluster is created with these actions. These actions are NOT run by `hms-mirror`. They should be reviewed and understood by the owner of the dataset being `mirrored` and run when appropriate.

The locations are displayed at the bottom of the application window when it's complete.

### SQL Execution Output

A SQL script of all the actions taken will be written to the local output directory.
pom.xml: 2 changes (1 addition, 1 deletion)
@@ -22,7 +22,7 @@

<groupId>com.cloudera.utils.hadoop</groupId>
<artifactId>hms-mirror</artifactId>
<version>1.5.1.5-SNAPSHOT</version>
<version>1.5.1.4.5-SNAPSHOT</version>
<name>hms-mirror</name>

<url>https://github.com/dstreev/hms_mirror</url>
src/main/java/com/cloudera/utils/hadoop/hms/stage/Transfer.java: 36 changes (20 additions, 16 deletions)
@@ -269,10 +269,11 @@ protected Boolean doSQL() {
String transferDesc = MessageFormat.format(TableUtils.LOAD_FROM_PARTITIONED_SHADOW_DESC, let.getPartitions().size());
ret.addSql(new Pair(transferDesc, transferSql));
} else {
if (!config.getCluster(Environment.RIGHT).getLegacyHive()) {
ret.addSql("Setting " + MirrorConf.SORT_DYNAMIC_PARTITION, "set " + MirrorConf.SORT_DYNAMIC_PARTITION + "=false");
ret.addSql("Setting " + MirrorConf.SORT_DYNAMIC_PARTITION_THRESHOLD, "set " + MirrorConf.SORT_DYNAMIC_PARTITION_THRESHOLD + "=-1");
}
// Don't set, use system values. May cause issues in some configs and older Hive 3.
// if (!config.getCluster(Environment.RIGHT).getLegacyHive()) {
// ret.addSql("Setting " + MirrorConf.SORT_DYNAMIC_PARTITION, "set " + MirrorConf.SORT_DYNAMIC_PARTITION + "=false");
// ret.addSql("Setting " + MirrorConf.SORT_DYNAMIC_PARTITION_THRESHOLD, "set " + MirrorConf.SORT_DYNAMIC_PARTITION_THRESHOLD + "=-1");
// }
String partElement = TableUtils.getPartitionElements(let);
String transferSql = MessageFormat.format(MirrorConf.SQL_DATA_TRANSFER_WITH_PARTITIONS_PRESCRIPTIVE,
set.getName(), ret.getName(), partElement);
@@ -351,10 +352,11 @@ protected Boolean doIntermediateTransfer() {
String transferDesc = MessageFormat.format(TableUtils.LOAD_FROM_PARTITIONED_SHADOW_DESC, let.getPartitions().size());
let.addSql(new Pair(transferDesc, transferSql));
} else {
if (!config.getCluster(Environment.LEFT).getLegacyHive()) {
let.addSql("Setting " + MirrorConf.SORT_DYNAMIC_PARTITION, "set " + MirrorConf.SORT_DYNAMIC_PARTITION + "=false");
let.addSql("Setting " + MirrorConf.SORT_DYNAMIC_PARTITION_THRESHOLD, "set " + MirrorConf.SORT_DYNAMIC_PARTITION_THRESHOLD + "=-1");
}
// Don't set this. Allow for system values. And old Hive 3 has issues setting this and may cause error.
// if (!config.getCluster(Environment.LEFT).getLegacyHive()) {
// let.addSql("Setting " + MirrorConf.SORT_DYNAMIC_PARTITION, "set " + MirrorConf.SORT_DYNAMIC_PARTITION + "=false");
// let.addSql("Setting " + MirrorConf.SORT_DYNAMIC_PARTITION_THRESHOLD, "set " + MirrorConf.SORT_DYNAMIC_PARTITION_THRESHOLD + "=-1");
// }
String partElement = TableUtils.getPartitionElements(let);
String transferSql = MessageFormat.format(MirrorConf.SQL_DATA_TRANSFER_WITH_PARTITIONS_PRESCRIPTIVE,
let.getName(), tet.getName(), partElement);
@@ -408,10 +410,11 @@ protected Boolean doIntermediateTransfer() {
String transferDesc = MessageFormat.format(TableUtils.LOAD_FROM_PARTITIONED_SHADOW_DESC, let.getPartitions().size());
ret.addSql(new Pair(transferDesc, shadowTransferSql));
} else {
if (!config.getCluster(Environment.RIGHT).getLegacyHive()) {
ret.addSql("Setting " + MirrorConf.SORT_DYNAMIC_PARTITION, "set " + MirrorConf.SORT_DYNAMIC_PARTITION + "=false");
ret.addSql("Setting " + MirrorConf.SORT_DYNAMIC_PARTITION_THRESHOLD, "set " + MirrorConf.SORT_DYNAMIC_PARTITION_THRESHOLD + "=-1");
}
// Don't set, use system settings. Setting values may cause exception in some environments.
// if (!config.getCluster(Environment.RIGHT).getLegacyHive()) {
// ret.addSql("Setting " + MirrorConf.SORT_DYNAMIC_PARTITION, "set " + MirrorConf.SORT_DYNAMIC_PARTITION + "=false");
// ret.addSql("Setting " + MirrorConf.SORT_DYNAMIC_PARTITION_THRESHOLD, "set " + MirrorConf.SORT_DYNAMIC_PARTITION_THRESHOLD + "=-1");
// }
String partElement = TableUtils.getPartitionElements(let);
String shadowTransferSql = MessageFormat.format(MirrorConf.SQL_DATA_TRANSFER_WITH_PARTITIONS_PRESCRIPTIVE,
set.getName(), ret.getName(), partElement);
@@ -560,10 +563,11 @@ protected Boolean doStorageMigrationTransfer() {
String transferDesc = MessageFormat.format(TableUtils.STORAGE_MIGRATION_TRANSFER_DESC, let.getPartitions().size());
ret.addSql(new Pair(transferDesc, transferSql));
} else {
if (!config.getCluster(Environment.LEFT).getLegacyHive()) {
let.addSql("Setting " + MirrorConf.SORT_DYNAMIC_PARTITION, "set " + MirrorConf.SORT_DYNAMIC_PARTITION + "=false");
let.addSql("Setting " + MirrorConf.SORT_DYNAMIC_PARTITION_THRESHOLD, "set " + MirrorConf.SORT_DYNAMIC_PARTITION_THRESHOLD + "=-1");
}
// Don't set, causes issues in older Hive 3.
// if (!config.getCluster(Environment.LEFT).getLegacyHive()) {
// let.addSql("Setting " + MirrorConf.SORT_DYNAMIC_PARTITION, "set " + MirrorConf.SORT_DYNAMIC_PARTITION + "=false");
// let.addSql("Setting " + MirrorConf.SORT_DYNAMIC_PARTITION_THRESHOLD, "set " + MirrorConf.SORT_DYNAMIC_PARTITION_THRESHOLD + "=-1");
// }
String partElement = TableUtils.getPartitionElements(let);
String transferSql = MessageFormat.format(MirrorConf.SQL_DATA_TRANSFER_WITH_PARTITIONS_PRESCRIPTIVE,
let.getName(), ret.getName(), partElement);
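
For context, the lines removed above had `hms-mirror` prepend two Hive session settings to the generated transfer script just before the partitioned `INSERT`. Below is a minimal sketch of what that emitted script likely looked like, assuming `MirrorConf.SORT_DYNAMIC_PARTITION` and `MirrorConf.SORT_DYNAMIC_PARTITION_THRESHOLD` map to the Hive properties `hive.optimize.sort.dynamic.partition` and `hive.optimize.sort.dynamic.partition.threshold`; the table and partition names are illustrative, not actual tool output.

```sql
-- Session overrides the removed code used to emit. They are now omitted so the
-- cluster's system defaults apply; per the commit message, older Hive 3 builds
-- could reject these SET statements at runtime.
SET hive.optimize.sort.dynamic.partition=false;
SET hive.optimize.sort.dynamic.partition.threshold=-1;

-- The prescriptive, partition-aware transfer that follows in the generated script
-- (shape of SQL_DATA_TRANSFER_WITH_PARTITIONS_PRESCRIPTIVE; names are hypothetical).
INSERT OVERWRITE TABLE target_db.web_logs PARTITION (dt)
  SELECT * FROM target_db.web_logs_shadow;
```

With this change, the generated script contains only the `INSERT`, and whatever values the cluster already has configured for those properties take effect.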
strategy_docs/data_movement.md: 4 changes (3 additions, 1 deletion)
@@ -2,6 +2,8 @@

### SQL

The **SQL** data strategy will use Hive SQL to move data between clusters. When the clusters don't have direct line of sight to each other and can NOT be [linked](../README.md#linking-clusters-storage-layers), you can use options like `-cs` or `-is` to bridge the gap.

#### Options

##### `-ma|--migrate-acid` or `-mao|--migrate-acid-only`
@@ -36,7 +38,7 @@ Are used to set the *databases* default locations for managed and external table

##### `-rdl|--reset-to-default-location`

Regardless of where the source data _relative_ location is on the filesystem, this will reset it to the default location.
Regardless of where the source data's _relative_ location was on the filesystem, this will reset it to the default location on the new cluster.

If `-dc|--distcp` is used, then the `warehouse` options are required in order for `hms-mirror` to build the `distcp` workplan.

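
To make the **SQL** strategy note added above concrete, here is a minimal sketch of the kind of Hive SQL a bridged (`-is`/`-cs`) transfer amounts to: stage the data to shared storage from the LEFT cluster, then load it on the RIGHT cluster through a table pointing at that same location. All names, locations, and exact statement shapes are assumptions for illustration; they are not literal `hms-mirror` output.

```sql
-- LEFT cluster: stage the source data into a transfer table on shared/intermediate storage.
SET hive.exec.dynamic.partition.mode=nonstrict;  -- needed for a fully dynamic partition insert
CREATE EXTERNAL TABLE source_db.web_logs_transfer LIKE source_db.web_logs
  LOCATION 's3a://intermediate-bucket/source_db/web_logs';
INSERT OVERWRITE TABLE source_db.web_logs_transfer PARTITION (dt)
  SELECT * FROM source_db.web_logs;

-- RIGHT cluster: read the staged data through a shadow table and load the final table.
SET hive.exec.dynamic.partition.mode=nonstrict;
CREATE EXTERNAL TABLE target_db.web_logs_shadow LIKE target_db.web_logs
  LOCATION 's3a://intermediate-bucket/source_db/web_logs';
INSERT OVERWRITE TABLE target_db.web_logs PARTITION (dt)
  SELECT * FROM target_db.web_logs_shadow;
```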
