Use different tokens instead of forcing WD and all HMS to use the same delegatetoken in the kerberos environment #313

Merged
merged 9 commits into from
May 29, 2024
109 changes: 45 additions & 64 deletions HowToKerberize.md
@@ -24,83 +24,64 @@ In addition, because Kerberos authentication requires a delegation-token to prox
* Zookeeper to store delegation-token (Recommended)

### Configuration
Waggle Dance `waggle-dance-server.yml` example:

Waggle Dance does not read Hadoop's `core-site.xml` so a general property providing Kerberos auth should be added to
the Hive configuration file `hive-site.xml`:

```
<property>
<name>hadoop.security.authentication</name>
<value>KERBEROS</value>
</property>
```


Waggle Dance also needs a keytab file to communicate with the Metastore so the following properties should be present:
```
<property>
<name>hive.metastore.sasl.enabled</name>
<value>true</value>
</property>
<property>
<name>hive.metastore.kerberos.principal</name>
<value>hive/_HOST@YOUR_REALM.COM</value>
</property>
<property>
<name>hive.metastore.kerberos.keytab.file</name>
<value>/etc/hive.keytab</value>
</property>
port: 9083
verbose: true
#database-resolution: MANUAL
database-resolution: PREFIXED
yaml-storage:
overwrite-config-on-shutdown: false
logging:
config: file:/path/to/log4j2.xml
configuration-properties:
hadoop.security.authentication: KERBEROS
hive.metastore.sasl.enabled: true
hive.metastore.kerberos.principal: hive/[email protected]
hive.metastore.kerberos.keytab.file: /path/to/hive.keytab
hive.cluster.delegation.token.store.class: org.apache.hadoop.hive.thrift.ZooKeeperTokenStore
hive.cluster.delegation.token.store.zookeeper.connectString: zz1:2181,zz2:2181,zz3:2181
hive.cluster.delegation.token.store.zookeeper.znode: /hive/cluster/wd_delegation
hive.server2.authentication: KERBEROS
hive.server2.authentication.kerberos.principal: hive/[email protected]
hive.server2.authentication.kerberos.keytab: /path/to/hive.keytab
hive.server2.authentication.client.kerberos.principal: hive/[email protected]
hadoop.kerberos.keytab.login.autorenewal.enabled : true
hadoop.proxyuser.hive.users: '*'
hadoop.proxyuser.hive.hosts: '*'
```

In addition, all metastores need to use the Zookeeper shared token:
Waggle Dance `waggle-dance-federation.yml` example:
```
<property>
<name>hive.cluster.delegation.token.store.class</name>
<value>org.apache.hadoop.hive.thrift.ZooKeeperTokenStore</value>
</property>
<property>
<name>hive.cluster.delegation.token.store.zookeeper.connectString</name>
<value>zk1:2181,zk2:2181,zk3:2181</value>
</property>
<property>
<name>hive.cluster.delegation.token.store.zookeeper.znode</name>
<value>/hive/token</value>
</property>
primary-meta-store:
database-prefix: ''
name: local
remote-meta-store-uris: thrift://ms1:9083
access-control-type: READ_AND_WRITE_AND_CREATE
impersonation-enabled: true
federated-meta-stores:
- remote-meta-store-uris: thrift://ms2:9083
database-prefix: dw_
name: remote
impersonation-enabled: true
access-control-type: READ_AND_WRITE_ON_DATABASE_WHITELIST
writable-database-white-list:
- .*
```

If you are intending to use a Beeline client, the following properties may be valuable:
To connect to Waggle Dance via Beeline, change `hive.metastore.uris` in the Hive configuration file `hive-site.xml`:
```
<property>
<name>hive.server2.transport.mode</name>
<value>http</value>
</property>
<property>
<name>hive.server2.authentication</name>
<value>KERBEROS</value>
</property>
<property>
<name>hive.server2.authentication.kerberos.principal</name>
<value>hive/_HOST@YOUR_REALM.COM</value>
</property>
<property>
<name>hive.server2.authentication.kerberos.keytab</name>
<value>/etc/hive.keytab</value>
</property>
<property>
<name>hive.server2.enable.doAs</name>
<value>false</value>
<name>hive.metastore.uris</name>
<value>thrift://wd:9083</value>
</property>
```


### Running

Waggle Dance should be started by a privileged user with a fresh keytab.

If Waggle Dance throws a GSS exception, you have a problem with the keytab file.
Try to perform `kdestroy` and `kinit` operations and check the keytab file ownership flags.

If the Metastore throws an exception with code -127, Waggle Dance is probably using the wrong authentication policy.
Check the values in `hive-conf.xml` and make sure that HIVE_HOME and HIVE_CONF_DIR are defined.

Don't forget to restart hive services!
Start the service directly; no `kinit` operation is required,
because the ticket information is held in the JVM rather than in a local file.
The service can therefore renew its ticket automatically, with no extra step to renew a local ticket.
2 changes: 2 additions & 0 deletions README.md
@@ -158,6 +158,7 @@ The table below describes all the available configuration values for Waggle Danc
| `primary-meta-store.name` | Yes | Database name that uniquely identifies this metastore. Used internally. Cannot be empty. |
| `primary-meta-store.database-prefix` | No | Prefix used to access the primary metastore and differentiate databases in it from databases in another metastore. The default prefix (i.e. if this value isn't explicitly set) is empty string.|
| `primary-meta-store.access-control-type` | No | Sets how the client access controls should be handled. Default is `READ_ONLY` Other options `READ_AND_WRITE_AND_CREATE`, `READ_AND_WRITE_ON_DATABASE_WHITELIST` and `READ_AND_WRITE_AND_CREATE_ON_DATABASE_WHITELIST` see Access Control section below. |
| `primary-meta-store.impersonation-enabled` | No | Enable metastore end-user impersonation.|
| `primary-meta-store.writable-database-white-list` | No | White-list of databases used to verify write access used in conjunction with `primary-meta-store.access-control-type`. The list of databases should be listed without any `primary-meta-store.database-prefix`. This property supports both full database names and (case-insensitive) [Java RegEx patterns](https://docs.oracle.com/javase/8/docs/api/java/util/regex/Pattern.html).|
| `primary-meta-store.metastore-tunnel` | No | See metastore tunnel configuration values below. |
| `primary-meta-store.latency` | No | Indicates the acceptable slowness of the metastore in **milliseconds** for increasing the default connection timeout. Default latency is `0` and should be changed if the metastore is particularly slow. If you get an error saying that results were omitted because the metastore was slow, consider changing the latency to a higher number.|
@@ -168,6 +169,7 @@ The table below describes all the available configuration values for Waggle Danc
| `federated-meta-stores` | No | Possible empty list of read only federated metastores. |
| `federated-meta-stores[n].remote-meta-store-uris` | Yes | Thrift URIs of the federated read-only metastore. |
| `federated-meta-stores[n].name` | Yes | Name that uniquely identifies this metastore. Used internally. Cannot be empty. |
| `federated-meta-stores[n].impersonation-enabled` | No | Enable metastore end-user impersonation.|
| `federated-meta-stores[n].database-prefix` | No | Prefix used to access this particular metastore and differentiate databases in it from databases in another metastore. Typically used if databases have the same name across metastores but federated access to them is still needed. The default prefix (i.e. if this value isn't explicitly set) is {federated-meta-stores[n].name} lowercased and postfixed with an underscore. For example if the metastore name was configured as "waggle" and no database prefix was provided but `PREFIXED` database resolution was used then the value of `database-prefix` would be "waggle_". |
| `federated-meta-stores[n].metastore-tunnel` | No | See metastore tunnel configuration values below. |
| `federated-meta-stores[n].latency` | No | Indicates the acceptable slowness of the metastore in **milliseconds** for increasing the default connection timeout. Default latency is `0` and should be changed if the metastore is particularly slow. If you get an error saying that results were omitted because the metastore was slow, consider changing the latency to a higher number.|
Binary file modified kerberos-process.png
@@ -1,5 +1,5 @@
/**
* Copyright (C) 2016-2023 Expedia, Inc.
* Copyright (C) 2016-2024 Expedia, Inc.
*
* Licensed under the Apache License, Version 2.0 (the "License");
* you may not use this file except in compliance with the License.
@@ -59,7 +59,7 @@ public abstract class AbstractMetaStore {
private transient @JsonProperty @NotNull MetaStoreStatus status = MetaStoreStatus.UNKNOWN;
private long latency = 0;
private transient @JsonIgnore HashBiMap<String, String> databaseNameBiMapping = HashBiMap.create();

private boolean impersonationEnabled;
public AbstractMetaStore(String name, String remoteMetaStoreUris, AccessControlType accessControlType) {
this.name = name;
this.remoteMetaStoreUris = remoteMetaStoreUris;
@@ -211,6 +211,14 @@ public void setStatus(MetaStoreStatus status) {
this.status = status;
}

public boolean isImpersonationEnabled() {
return impersonationEnabled;
}

public void setImpersonationEnabled(boolean impersonationEnabled) {
this.impersonationEnabled = impersonationEnabled;
}

@Override
public int hashCode() {
return Objects.hashCode(name);
@@ -242,5 +250,4 @@ public String toString() {
.add("status", status)
.toString();
}

}
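The new `impersonationEnabled` field is a primitive `boolean`, so it starts out as `false` unless a setter is called; a metastore entry that omits `impersonation-enabled` should therefore keep impersonation off. The sketch below shows only that default. `MetaStoreFlagSketch` is a hypothetical stand-alone class, not part of Waggle Dance.

```java
// Hypothetical stand-in for the flag added to AbstractMetaStore; not the real class.
final class MetaStoreFlagSketch {

  // A primitive boolean field is initialised to false by the JVM.
  private boolean impersonationEnabled;

  boolean isImpersonationEnabled() {
    return impersonationEnabled;
  }

  void setImpersonationEnabled(boolean impersonationEnabled) {
    this.impersonationEnabled = impersonationEnabled;
  }
}
```

This is consistent with the `"impersonationEnabled":false` entry that the updated `toJson` tests expect for a metastore created without the flag.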
@@ -1,5 +1,5 @@
/**
* Copyright (C) 2016-2021 Expedia, Inc.
* Copyright (C) 2016-2024 Expedia, Inc.
*
* Licensed under the Apache License, Version 2.0 (the "License");
* you may not use this file except in compliance with the License.
@@ -72,7 +72,7 @@ public void nullDatabasePrefix() {

@Test
public void toJson() throws Exception {
String expected = "{\"accessControlType\":\"READ_ONLY\",\"connectionType\":\"DIRECT\",\"databaseNameMapping\":{},\"databasePrefix\":\"name_\",\"federationType\":\"FEDERATED\",\"hiveMetastoreFilterHook\":null,\"latency\":0,\"mappedDatabases\":null,\"mappedTables\":null,\"metastoreTunnel\":null,\"name\":\"name\",\"remoteMetaStoreUris\":\"uri\",\"status\":\"UNKNOWN\",\"writableDatabaseWhiteList\":[]}";
String expected = "{\"accessControlType\":\"READ_ONLY\",\"connectionType\":\"DIRECT\",\"databaseNameMapping\":{},\"databasePrefix\":\"name_\",\"federationType\":\"FEDERATED\",\"hiveMetastoreFilterHook\":null,\"impersonationEnabled\":false,\"latency\":0,\"mappedDatabases\":null,\"mappedTables\":null,\"metastoreTunnel\":null,\"name\":\"name\",\"remoteMetaStoreUris\":\"uri\",\"status\":\"UNKNOWN\",\"writableDatabaseWhiteList\":[]}";
ObjectMapper mapper = new ObjectMapper();
// Sorting to get deterministic test behaviour
mapper.enable(MapperFeature.SORT_PROPERTIES_ALPHABETICALLY);
@@ -1,5 +1,5 @@
/**
* Copyright (C) 2016-2021 Expedia, Inc.
* Copyright (C) 2016-2024 Expedia, Inc.
*
* Licensed under the Apache License, Version 2.0 (the "License");
* you may not use this file except in compliance with the License.
@@ -89,7 +89,7 @@ public void nonEmptyDatabasePrefix() {

@Test
public void toJson() throws Exception {
String expected = "{\"accessControlType\":\"READ_ONLY\",\"connectionType\":\"DIRECT\",\"databaseNameMapping\":{},\"databasePrefix\":\"\",\"federationType\":\"PRIMARY\",\"hiveMetastoreFilterHook\":null,\"latency\":0,\"mappedDatabases\":null,\"mappedTables\":null,\"metastoreTunnel\":null,\"name\":\"name\",\"remoteMetaStoreUris\":\"uri\",\"status\":\"UNKNOWN\",\"writableDatabaseWhiteList\":[]}";
String expected = "{\"accessControlType\":\"READ_ONLY\",\"connectionType\":\"DIRECT\",\"databaseNameMapping\":{},\"databasePrefix\":\"\",\"federationType\":\"PRIMARY\",\"hiveMetastoreFilterHook\":null,\"impersonationEnabled\":false,\"latency\":0,\"mappedDatabases\":null,\"mappedTables\":null,\"metastoreTunnel\":null,\"name\":\"name\",\"remoteMetaStoreUris\":\"uri\",\"status\":\"UNKNOWN\",\"writableDatabaseWhiteList\":[]}";
ObjectMapper mapper = new ObjectMapper();
// Sorting to get deterministic test behaviour
mapper.enable(MapperFeature.SORT_PROPERTIES_ALPHABETICALLY);
@@ -0,0 +1,174 @@
/**
* Copyright (C) 2016-2024 Expedia, Inc.
*
* Licensed under the Apache License, Version 2.0 (the "License");
* you may not use this file except in compliance with the License.
* You may obtain a copy of the License at
*
* http://www.apache.org/licenses/LICENSE-2.0
*
* Unless required by applicable law or agreed to in writing, software
* distributed under the License is distributed on an "AS IS" BASIS,
* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
* See the License for the specific language governing permissions and
* limitations under the License.
*/
package com.hotels.bdp.waggledance.client;

import java.io.Closeable;
import java.net.URI;
import java.util.Random;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

import org.apache.hadoop.hive.conf.HiveConf;
import org.apache.hadoop.hive.conf.HiveConf.ConfVars;
import org.apache.hadoop.hive.conf.HiveConfUtil;
import org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore;
import org.apache.thrift.TException;
import org.apache.thrift.transport.TTransport;

import lombok.extern.log4j.Log4j2;

import com.hotels.bdp.waggledance.client.compatibility.HiveCompatibleThriftHiveMetastoreIfaceFactory;

@Log4j2
public abstract class AbstractThriftMetastoreClientManager implements Closeable {


static final AtomicInteger CONN_COUNT = new AtomicInteger(0);
final HiveConf conf;
final HiveCompatibleThriftHiveMetastoreIfaceFactory hiveCompatibleThriftHiveMetastoreIfaceFactory;
final URI[] metastoreUris;
ThriftHiveMetastore.Iface client = null;
TTransport transport = null;
boolean isConnected = false;
// for thrift connects
int retries = 5;
long retryDelaySeconds = 0;

final int connectionTimeout;
final String msUri;

AbstractThriftMetastoreClientManager(
HiveConf conf,
HiveCompatibleThriftHiveMetastoreIfaceFactory hiveCompatibleThriftHiveMetastoreIfaceFactory,
int connectionTimeout) {
this.conf = conf;
this.hiveCompatibleThriftHiveMetastoreIfaceFactory = hiveCompatibleThriftHiveMetastoreIfaceFactory;
this.connectionTimeout = connectionTimeout;
msUri = conf.getVar(ConfVars.METASTOREURIS);

if (HiveConfUtil.isEmbeddedMetaStore(msUri)) {
throw new RuntimeException("You can't waggle an embedded metastore");
}

// get the number retries
retries = HiveConf.getIntVar(conf, ConfVars.METASTORETHRIFTCONNECTIONRETRIES);
retryDelaySeconds = conf.getTimeVar(ConfVars.METASTORE_CLIENT_CONNECT_RETRY_DELAY, TimeUnit.SECONDS);

// user wants file store based configuration
if (msUri != null) {
String[] metastoreUrisString = msUri.split(",");
metastoreUris = new URI[metastoreUrisString.length];
try {
int i = 0;
for (String s : metastoreUrisString) {
URI tmpUri = new URI(s);
if (tmpUri.getScheme() == null) {
throw new IllegalArgumentException("URI: " + s + " does not have a scheme");
}
metastoreUris[i++] = tmpUri;
}
} catch (IllegalArgumentException e) {
throw (e);
} catch (Exception e) {
String exInfo = "Got exception: " + e.getClass().getName() + " " + e.getMessage();
log.error(exInfo, e);
throw new RuntimeException(exInfo, e);
}
} else {
log.error("NOT getting uris from conf");
throw new RuntimeException("MetaStoreURIs not found in conf file");
}
}

void open() {
open(null);
}

abstract void open(HiveUgiArgs ugiArgs);

void reconnect(HiveUgiArgs ugiArgs) {
close();
// Swap the first element of the metastoreUris[] with a random element from the rest
// of the array. Rationale being that this method will generally be called when the default
// connection has died and the default connection is likely to be the first array element.
promoteRandomMetaStoreURI();
open(ugiArgs);
}

public String getHiveConfValue(String key, String defaultValue) {
return conf.get(key, defaultValue);
}

public void setHiveConfValue(String key, String value) {
conf.set(key, value);
}

public String generateNewTokenSignature(String defaultTokenSignature) {
String tokenSignature = conf.get(ConfVars.METASTORE_TOKEN_SIGNATURE.varname,
defaultTokenSignature);
conf.set(ConfVars.METASTORE_TOKEN_SIGNATURE.varname,
tokenSignature);
return tokenSignature;
}

public Boolean isSaslEnabled() {
return conf.getBoolVar(ConfVars.METASTORE_USE_THRIFT_SASL);
}

@Override
public void close() {
if (!isConnected) {
return;
}
isConnected = false;
try {
if (client != null) {
client.shutdown();
}
} catch (TException e) {
log.debug("Unable to shutdown metastore client. Will try closing transport directly.", e);
}
// Transport would have got closed via client.shutdown(), so we don't need this, but
// just in case, we make this call.
if ((transport != null) && transport.isOpen()) {
transport.close();
transport = null;
}
log.info("Closed a connection to metastore, current connections: {}", CONN_COUNT.decrementAndGet());
}

boolean isOpen() {
return (transport != null) && transport.isOpen();
}

protected ThriftHiveMetastore.Iface getClient() {
return client;
}

/**
* Swaps the first element of the metastoreUris array with a random element from the remainder of the array.
*/
private void promoteRandomMetaStoreURI() {
if (metastoreUris.length <= 1) {
return;
}
Random rng = new Random();
int index = rng.nextInt(metastoreUris.length - 1) + 1;
URI tmp = metastoreUris[0];
metastoreUris[0] = metastoreUris[index];
metastoreUris[index] = tmp;
}
}
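Three behaviours of this class can be tried in isolation: rejecting metastore URIs that have no scheme, promoting a random fallback URI on reconnect, and keeping an already configured token signature rather than overwriting it with the default. The following is a simplified sketch under stated assumptions, not the PR's code. `ClientManagerSketch` and its method names are hypothetical, a plain `Properties` object stands in for `HiveConf` (so any defaults `HiveConf` may preload are not modelled), and the property key is assumed to be the name behind `ConfVars.METASTORE_TOKEN_SIGNATURE`.

```java
import java.net.URI;
import java.net.URISyntaxException;
import java.util.Properties;
import java.util.Random;

// Hypothetical helper for illustration; not part of Waggle Dance.
final class ClientManagerSketch {

  // Assumed to match the property name behind ConfVars.METASTORE_TOKEN_SIGNATURE.
  static final String TOKEN_SIGNATURE_KEY = "hive.metastore.token.signature";

  // Splits a comma-separated list and rejects any entry without a scheme.
  // Simplification: syntax errors are also reported as IllegalArgumentException.
  static URI[] parseUris(String commaSeparated) {
    String[] parts = commaSeparated.split(",");
    URI[] uris = new URI[parts.length];
    for (int i = 0; i < parts.length; i++) {
      try {
        uris[i] = new URI(parts[i]);
      } catch (URISyntaxException e) {
        throw new IllegalArgumentException("Invalid URI: " + parts[i], e);
      }
      if (uris[i].getScheme() == null) {
        throw new IllegalArgumentException("URI: " + parts[i] + " does not have a scheme");
      }
    }
    return uris;
  }

  // Swaps element 0 with a random element from index 1 onwards, so a retry
  // does not start with the URI that has most likely just failed.
  static void promoteRandom(URI[] uris, Random rng) {
    if (uris.length <= 1) {
      return;
    }
    int index = rng.nextInt(uris.length - 1) + 1;
    URI tmp = uris[0];
    uris[0] = uris[index];
    uris[index] = tmp;
  }

  // Keeps an already configured signature; otherwise stores and returns the default.
  // The default must be non-null because Properties rejects null values.
  static String resolveTokenSignature(Properties conf, String defaultSignature) {
    String signature = conf.getProperty(TOKEN_SIGNATURE_KEY, defaultSignature);
    conf.setProperty(TOKEN_SIGNATURE_KEY, signature);
    return signature;
  }
}
```

Passing the `Random` in as a parameter is a choice made here for testability; the class above creates its own instance inside `promoteRandomMetaStoreURI`.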