Add personalized configuration parameters for each metastore. (ExpediaGroup#315)

* Add personalized configuration parameters for each metastore.

* Add personalized configuration parameters for each metastore

* Recover

* Update junit test

* Update Junit Test

* Update Junit Test

* Update Junit Test

* Format the code and update the readme

* Revert

* Update FederatedMetaStoreTest.java

* Update PrimaryMetaStoreTest.java

* Update AbstractMetaStore.java

using new HashMap so the generated Yaml doesn't generate an anchor (reference &id001)

* Update YamlFederatedMetaStoreStorageTest.java

fixing test

---------

Co-authored-by: yangyx <[email protected]>
Co-authored-by: Patrick Duin <[email protected]>
3 people authored and flaming-archer committed May 28, 2024
1 parent f0460f1 commit bcdd35e
Showing 8 changed files with 43 additions and 14 deletions.
6 changes: 6 additions & 0 deletions README.md
@@ -127,6 +127,8 @@ Example:
- my_writable_db2
- user_db_.*
- ...
configuration-properties:
hive.metastore.kerberos.principal: hive/[email protected]
federated-meta-stores: # List of read only metastores to federate
- remote-meta-store-uris: thrift://10.0.0.1:9083
name: secondary
@@ -147,6 +149,8 @@ Example:
- database: prod_db2
mapped-tables:
- tbl2
configuration-properties:
hive.metastore.kerberos.principal: hive/[email protected]
- ...

The table below describes all the available configuration values for Waggle Dance federations:
@@ -166,6 +170,7 @@ The table below describes all the available configuration values for Waggle Danc
| `primary-meta-store.mapped-tables` | No | List of mappings from databases to tables to federate from the primary metastore, similar to `mapped-databases`. By default, all tables are available. See `mapped-tables` configuration below. |
| `primary-meta-stores.hive-metastore-filter-hook` | No | Name of the class which implements the `MetaStoreFilterHook` interface from Hive. This allows a metastore filter hook to be applied to the corresponding Hive metastore calls. Can be configured with the `configuration-properties` specified in the `waggle-dance-server.yml` configuration. They will be added in the HiveConf object that is given to the constructor of the `MetaStoreFilterHook` implementation you provide. |
| `primary-meta-stores.database-name-mapping` | No | BiDirectional Map of database names and mapped name, where key=`<database name as known in the primary metastore>` and value=`<name that should be shown to a client>`. See the [Database Name Mapping](#database-name-mapping) section.|
| `primary-meta-stores.configuration-properties` | No | Map of properties specific to the primary metastore that will be added to the HiveConf used when creating its Thrift client. They take effect only for this client and take priority over properties of the same name in `waggle-dance-server.yml`. |
| `federated-meta-stores` | No | Possible empty list of read only federated metastores. |
| `federated-meta-stores[n].remote-meta-store-uris` | Yes | Thrift URIs of the federated read-only metastore. |
| `federated-meta-stores[n].name` | Yes | Name that uniquely identifies this metastore. Used internally. Cannot be empty. |
@@ -178,6 +183,7 @@ The table below describes all the available configuration values for Waggle Danc
| `federated-meta-stores[n].hive-metastore-filter-hook` | No | Name of the class which implements the `MetaStoreFilterHook` interface from Hive. This allows a metastore filter hook to be applied to the corresponding Hive metastore calls. Can be configured with the `configuration-properties` specified in the `waggle-dance-server.yml` configuration. They will be added in the HiveConf object that is given to the constructor of the `MetaStoreFilterHook` implementation you provide. |
| `federated-meta-stores[n].database-name-mapping` | No | BiDirectional Map of database names and mapped names where key=`<database name as known in the federated metastore>` and value=`<name that should be shown to a client>`. See the [Database Name Mapping](#database-name-mapping) section.|
| `federated-meta-stores[n].writable-database-white-list` | No | White-list of databases used to verify write access used in conjunction with `federated-meta-stores[n].access-control-type`. The list of databases should be listed without a `federated-meta-stores[n].database-prefix`. This property supports both full database names and (case-insensitive) [Java RegEx patterns](https://docs.oracle.com/javase/8/docs/api/java/util/regex/Pattern.html).|
| `federated-meta-stores[n].configuration-properties` | No | Map of properties specific to this federated metastore that will be added to the HiveConf used when creating its Thrift client. They take effect only for this client and take priority over properties of the same name in `waggle-dance-server.yml`. |
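
The precedence described in the two `configuration-properties` rows can be sketched in Java as a two-step merge, mirroring the `putAll` order this commit adds to the client factory. The class name `ConfigurationPrecedenceSketch`, the `merge` method, and the sample property values are hypothetical and exist only for illustration; this is a minimal sketch under those assumptions, not the project's implementation.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical illustration; not a class from Waggle Dance.
final class ConfigurationPrecedenceSketch {

  // Server-wide properties are copied first and per-metastore properties second,
  // so a per-metastore value replaces a server-wide value that has the same key.
  static Map<String, String> merge(Map<String, String> serverProperties,
      Map<String, String> metaStoreProperties) {
    Map<String, String> merged = new HashMap<>();
    if (serverProperties != null) {
      merged.putAll(serverProperties);
    }
    if (metaStoreProperties != null) {
      merged.putAll(metaStoreProperties);
    }
    return merged;
  }

  public static void main(String[] args) {
    // Assumed sample values standing in for waggle-dance-server.yml.
    Map<String, String> server = new HashMap<>();
    server.put("hive.metastore.kerberos.principal", "hive/_HOST@SERVER.EXAMPLE");
    server.put("hive.metastore.sasl.enabled", "true");

    // Assumed sample values standing in for one metastore's configuration-properties.
    Map<String, String> metaStore = new HashMap<>();
    metaStore.put("hive.metastore.kerberos.principal", "hive/_HOST@REALM");

    System.out.println(merge(server, metaStore));
  }
}
```

With these inputs the merged map keeps the server-wide `hive.metastore.sasl.enabled` entry and takes the Kerberos principal from the per-metastore map.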

#### Metastore tunnel
The table below describes the metastore tunnel configuration values:
@@ -20,6 +20,7 @@

import java.beans.Transient;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

@@ -51,7 +52,7 @@ public abstract class AbstractMetaStore {
private List<String> writableDatabaseWhitelist;
private List<String> mappedDatabases;
private @Valid List<MappedTables> mappedTables;
private Map<String, String> databaseNameMapping = Collections.emptyMap();
private Map<String, String> databaseNameMapping = new HashMap<>();
private @NotBlank String name;
private @NotBlank String remoteMetaStoreUris;
private @Valid MetastoreTunnel metastoreTunnel;
@@ -60,6 +61,8 @@ public abstract class AbstractMetaStore {
private long latency = 0;
private transient @JsonIgnore HashBiMap<String, String> databaseNameBiMapping = HashBiMap.create();
private boolean impersonationEnabled;
private Map<String, String> configurationProperties = new HashMap<>();

public AbstractMetaStore(String name, String remoteMetaStoreUris, AccessControlType accessControlType) {
this.name = name;
this.remoteMetaStoreUris = remoteMetaStoreUris;
@@ -201,6 +204,15 @@ public HashBiMap<String, String> getDatabaseNameBiMapping() {
return databaseNameBiMapping;
}

public Map<String, String> getConfigurationProperties() {
return configurationProperties;
}

public void setConfigurationProperties(
Map<String, String> configurationProperties) {
this.configurationProperties = configurationProperties;
}

@Transient
public MetaStoreStatus getStatus() {
return status;
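
The commit message gives the reason for replacing `Collections.emptyMap()` with `new HashMap<>()`: the generated YAML should not contain an anchor (`&id001`). A plausible explanation, stated here as an assumption, is that `Collections.emptyMap()` hands out a shared instance, so two map fields defaulting to it reference the same object, which a YAML serializer may render as an anchor and alias. The sketch below only demonstrates the identity difference in plain Java; `SharedEmptyMapSketch` is a hypothetical name, and the result for `Collections.emptyMap()` reflects OpenJDK behaviour rather than a documented guarantee.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;

// Hypothetical illustration; not a class from Waggle Dance.
final class SharedEmptyMapSketch {

  // Two calls to Collections.emptyMap() compared by reference.
  static boolean emptyMapsShareInstance() {
    Map<String, String> first = Collections.emptyMap();
    Map<String, String> second = Collections.emptyMap();
    return first == second;
  }

  // Two freshly constructed HashMaps compared by reference.
  static boolean hashMapsShareInstance() {
    Map<String, String> first = new HashMap<>();
    Map<String, String> second = new HashMap<>();
    return first == second;
  }

  public static void main(String[] args) {
    System.out.println("emptyMap shared: " + emptyMapsShareInstance());
    System.out.println("HashMap shared: " + hashMapsShareInstance());
  }
}
```

Giving each field its own `HashMap` means no two fields point at the same object, which is consistent with the fix described in the commit message.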
@@ -72,7 +72,7 @@ public void nullDatabasePrefix() {

@Test
public void toJson() throws Exception {
String expected = "{\"accessControlType\":\"READ_ONLY\",\"connectionType\":\"DIRECT\",\"databaseNameMapping\":{},\"databasePrefix\":\"name_\",\"federationType\":\"FEDERATED\",\"hiveMetastoreFilterHook\":null,\"impersonationEnabled\":false,\"latency\":0,\"mappedDatabases\":null,\"mappedTables\":null,\"metastoreTunnel\":null,\"name\":\"name\",\"remoteMetaStoreUris\":\"uri\",\"status\":\"UNKNOWN\",\"writableDatabaseWhiteList\":[]}";
String expected = "{\"accessControlType\":\"READ_ONLY\",\"configurationProperties\":{},\"connectionType\":\"DIRECT\",\"databaseNameMapping\":{},\"databasePrefix\":\"name_\",\"federationType\":\"FEDERATED\",\"hiveMetastoreFilterHook\":null,\"impersonationEnabled\":false,\"latency\":0,\"mappedDatabases\":null,\"mappedTables\":null,\"metastoreTunnel\":null,\"name\":\"name\",\"remoteMetaStoreUris\":\"uri\",\"status\":\"UNKNOWN\",\"writableDatabaseWhiteList\":[]}";
ObjectMapper mapper = new ObjectMapper();
// Sorting to get deterministic test behaviour
mapper.enable(MapperFeature.SORT_PROPERTIES_ALPHABETICALLY);
@@ -89,7 +89,7 @@ public void nonEmptyDatabasePrefix() {

@Test
public void toJson() throws Exception {
String expected = "{\"accessControlType\":\"READ_ONLY\",\"connectionType\":\"DIRECT\",\"databaseNameMapping\":{},\"databasePrefix\":\"\",\"federationType\":\"PRIMARY\",\"hiveMetastoreFilterHook\":null,\"impersonationEnabled\":false,\"latency\":0,\"mappedDatabases\":null,\"mappedTables\":null,\"metastoreTunnel\":null,\"name\":\"name\",\"remoteMetaStoreUris\":\"uri\",\"status\":\"UNKNOWN\",\"writableDatabaseWhiteList\":[]}";
String expected = "{\"accessControlType\":\"READ_ONLY\",\"configurationProperties\":{},\"connectionType\":\"DIRECT\",\"databaseNameMapping\":{},\"databasePrefix\":\"\",\"federationType\":\"PRIMARY\",\"hiveMetastoreFilterHook\":null,\"impersonationEnabled\":false,\"latency\":0,\"mappedDatabases\":null,\"mappedTables\":null,\"metastoreTunnel\":null,\"name\":\"name\",\"remoteMetaStoreUris\":\"uri\",\"status\":\"UNKNOWN\",\"writableDatabaseWhiteList\":[]}";
ObjectMapper mapper = new ObjectMapper();
// Sorting to get deterministic test behaviour
mapper.enable(MapperFeature.SORT_PROPERTIES_ALPHABETICALLY);
@@ -48,6 +48,10 @@ public CloseableThriftHiveMetastoreIface newInstance(AbstractMetaStore metaStore
if (waggleDanceConfiguration.getConfigurationProperties() != null) {
properties.putAll(waggleDanceConfiguration.getConfigurationProperties());
}
if (metaStore.getConfigurationProperties() != null) {
properties.putAll(metaStore.getConfigurationProperties());
}

return newHiveInstance(metaStore, properties);
}

@@ -25,8 +25,6 @@
import java.util.concurrent.ExecutionException;
import java.util.concurrent.TimeUnit;

import javax.security.sasl.SaslException;

import org.apache.hadoop.hive.conf.HiveConf;
import org.apache.hadoop.hive.conf.HiveConf.ConfVars;
import org.apache.hadoop.hive.metastore.api.MetaException;
@@ -1,5 +1,5 @@
/**
* Copyright (C) 2016-2021 Expedia, Inc.
* Copyright (C) 2016-2024 Expedia, Inc.
*
* Licensed under the Apache License, Version 2.0 (the "License");
* you may not use this file except in compliance with the License.
@@ -24,6 +24,7 @@

import static com.hotels.bdp.waggledance.api.model.AbstractMetaStore.newFederatedInstance;

import java.util.Collections;
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.TimeUnit;
@@ -38,6 +39,7 @@
import org.mockito.junit.MockitoJUnitRunner;

import com.hotels.bdp.waggledance.api.model.AbstractMetaStore;
import com.hotels.bdp.waggledance.api.model.FederatedMetaStore;
import com.hotels.bdp.waggledance.client.tunnelling.TunnelingMetaStoreClientFactory;
import com.hotels.bdp.waggledance.conf.WaggleDanceConfiguration;
import com.hotels.hcommon.hive.metastore.client.tunnelling.MetastoreTunnel;
@@ -68,8 +70,9 @@ public void setUp() {
@Test
public void defaultFactory() {
ArgumentCaptor<HiveConf> hiveConfCaptor = ArgumentCaptor.forClass(HiveConf.class);

factory.newInstance(newFederatedInstance("fed1", THRIFT_URI));
FederatedMetaStore fed1 = newFederatedInstance("fed1", THRIFT_URI);
fed1.setConfigurationProperties(Collections.singletonMap(ConfVars.METASTORE_KERBEROS_PRINCIPAL.varname, "hive/[email protected]"));
factory.newInstance(fed1);
verify(defaultMetaStoreClientFactory).newInstance(hiveConfCaptor.capture(), eq(
"waggledance-fed1"), eq(3), eq(2000));
verifyNoInteractions(tunnelingMetaStoreClientFactory);
@@ -80,6 +83,7 @@ public void defaultFactory() {
assertThat(hiveConf.getTimeVar(ConfVars.METASTORE_CLIENT_CONNECT_RETRY_DELAY, TimeUnit.SECONDS), is(5L));
assertThat(hiveConf.getBoolVar(ConfVars.METASTORE_USE_THRIFT_FRAMED_TRANSPORT), is(true));
assertThat(hiveConf.getBoolVar(ConfVars.METASTORE_USE_THRIFT_COMPACT_PROTOCOL), is(false));
assertThat(hiveConf.getVar(ConfVars.METASTORE_KERBEROS_PRINCIPAL), is("hive/[email protected]"));
}

@Test
@@ -23,6 +23,7 @@
import java.io.File;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.util.Collections;
import java.util.List;

import javax.validation.ConstraintViolationException;
@@ -211,12 +212,13 @@ public void saveFederationWriteFederations() throws Exception {
MappedTables mappedTables2 = new MappedTables("db2", Lists.newArrayList("tbl2"));
newFederatedInstance.setMappedTables(Lists.newArrayList(mappedTables1, mappedTables2));
newFederatedInstance.setHiveMetastoreFilterHook("filter.hook.class");
newFederatedInstance.setConfigurationProperties(Collections.singletonMap("hive.metastore.kerberos.principal", "hive/_HOST@REALM"));
storage.insert(newFederatedInstance);
storage.saveFederation();
List<String> lines = Files.readAllLines(file.toPath(), StandardCharsets.UTF_8);
assertThat(lines.size(), is(26));
assertThat(lines.size(), is(27));
int i = 0;
while (i < 26) {
while (i < lines.size()) {
assertThat(lines.get(i++), is("primary-meta-store:"));
assertThat(lines.get(i++), is(" access-control-type: READ_ONLY"));
assertThat(lines.get(i++), is(" database-prefix: ''"));
@@ -226,7 +228,8 @@ public void saveFederationWriteFederations() throws Exception {
assertThat(lines.get(i++), is(" remote-meta-store-uris: thrift://localhost:19083"));
assertThat(lines.get(i++), is("federated-meta-stores:"));
assertThat(lines.get(i++), is("- access-control-type: READ_ONLY"));
assertThat(lines.get(i++), is(" database-name-mapping: {}"));
assertThat(lines.get(i++), is(" configuration-properties:"));
assertThat(lines.get(i++), is(" hive.metastore.kerberos.principal: hive/_HOST@REALM"));
assertThat(lines.get(i++), is(" database-prefix: hcom_2_"));
assertThat(lines.get(i++), is(" hive-metastore-filter-hook: filter.hook.class"));
assertThat(lines.get(i++), is(" impersonation-enabled: false"));
@@ -297,15 +300,18 @@ public void savePrimaryWriteFederations() throws Exception {
MappedTables mappedTables1 = new MappedTables("db1", Lists.newArrayList("tbl1"));
MappedTables mappedTables2 = new MappedTables("db2", Lists.newArrayList("tbl2"));
primaryMetaStore.setMappedTables(Lists.newArrayList(mappedTables1, mappedTables2));
primaryMetaStore.setConfigurationProperties(Collections.singletonMap("hive.metastore.kerberos.principal", "hive/_HOST@REALM"));
storage.insert(primaryMetaStore);
storage.insert(newFederatedInstance("hcom_2", "thrift://localhost:29083"));
storage.saveFederation();
List<String> lines = Files.readAllLines(file.toPath(), StandardCharsets.UTF_8);
assertThat(lines.size(), is(25));
assertThat(lines.size(), is(26));
int i = 0;
while (i < 25) {
while (i < lines.size()) {
assertThat(lines.get(i++), is("primary-meta-store:"));
assertThat(lines.get(i++), is(" access-control-type: READ_ONLY"));
assertThat(lines.get(i++), is(" configuration-properties:"));
assertThat(lines.get(i++), is(" hive.metastore.kerberos.principal: hive/_HOST@REALM"));
assertThat(lines.get(i++), is(" database-prefix: ''"));
assertThat(lines.get(i++), is(" impersonation-enabled: false"));
assertThat(lines.get(i++), is(" latency: 0"));
@@ -323,7 +329,6 @@ public void savePrimaryWriteFederations() throws Exception {
assertThat(lines.get(i++), is(" remote-meta-store-uris: thrift://localhost:19083"));
assertThat(lines.get(i++), is("federated-meta-stores:"));
assertThat(lines.get(i++), is("- access-control-type: READ_ONLY"));
assertThat(lines.get(i++), is(" database-name-mapping: {}"));
assertThat(lines.get(i++), is(" database-prefix: hcom_2_"));
assertThat(lines.get(i++), is(" impersonation-enabled: false"));
assertThat(lines.get(i++), is(" latency: 0"));