
Including SessionToken in url with v2 driver causes SQLException: IAM error retrieving temp credentials #96

Open
Y-Asahi-dev opened this issue Sep 1, 2023 · 4 comments


@Y-Asahi-dev

Driver version

2.1.0.9

Redshift version

PostgreSQL 8.0.2 on i686-pc-linux-gnu, compiled by GCC gcc (GCC) 3.4.2 20041017 (Red Hat 3.4.2-6.fc3), Redshift 1.0.54052

Client Operating System

Amazon EMR version 6.9

OS info:
NAME="Amazon Linux"
VERSION="2"
ID="amzn"
ID_LIKE="centos rhel fedora"
VERSION_ID="2"
PRETTY_NAME="Amazon Linux 2"
ANSI_COLOR="0;33"
CPE_NAME="cpe:2.3:o:amazon:amazon_linux:2"
HOME_URL="https://amazonlinux.com/"

JAVA/JVM version

openjdk version "1.8.0_382"
OpenJDK Runtime Environment Corretto-8.382.05.1 (build 1.8.0_382-b05)
OpenJDK 64-Bit Server VM Corretto-8.382.05.1 (build 25.382-b05, mixed mode)

Table schema

Problem description

I am using the amazon-redshift-jdbc-driver v2 with PySpark (Spark version 3.3.2).
I get a SQLException when I run the code below.
The error seems to occur when the URL contains a SessionToken parameter.
After replacing the JDBC driver with v1, ret.count() completes successfully without any errors.

Has the behavior changed between v1 and v2 when a SessionToken is present?

//-----------------------------------------------------------
from pyspark import SparkContext
from pyspark.sql import SQLContext
import boto3

redshift_cluster_id = 'sample_cluster'
redshift_dbname = 'sample_db'
bucket_name = 'sample_bucket'

# Forward the current AWS credentials to Hadoop for S3 access.
sc = SparkContext.getOrCreate()
credentials = boto3.Session().get_credentials()
region = boto3.Session().region_name
sc._jsc.hadoopConfiguration().set("fs.s3.awsAccessKeyId", credentials.access_key)
sc._jsc.hadoopConfiguration().set("fs.s3.awsSecretAccessKey", credentials.secret_key)

# Build an IAM JDBC URL; the SessionToken parameter appended below triggers the error.
url = f'jdbc:redshift:iam://{redshift_cluster_id}:{region}/{redshift_dbname}?DbUser=test_user&DbGroups=test_users_group&AutoCreate=true&AccessKeyID={credentials.access_key}&SecretAccessKey={credentials.secret_key}&user=&password='
token = credentials.token
sc._jsc.hadoopConfiguration().set("fs.s3.awsSessionToken", token)
url = f'{url}&SessionToken={token}'

sql_context = SQLContext(sc)
dfr = (sql_context.read
    .format('io.github.spark_redshift_community.spark.redshift')
    .option('url', url)
    .option('tempdir', f's3a://{bucket_name}/redshift_temp_dir')
    .option('forward_spark_s3_credentials', 'true')
    .option('fetchsize', 10000))

query = 'select * from sample_table'
ret = dfr.option('query', query).load(schema=None)
ret.count()
-----------------------------------------------------------//

  1. Expected behaviour: the connection succeeds and ret.count() returns the row count, as it does with the v1 driver.
  2. Actual behaviour: Driver.connect throws a SQLException while retrieving temporary IAM credentials.
  3. Error message/stack trace:
    : java.sql.SQLException: IAM error retrieving temp credentials: The security token included in the request is invalid (Service: AmazonRedshift; Status Code: 403; Error Code: InvalidClientTokenId; Request ID: 78fc4d63-34c5-45a1-af4f-454c122d93e5; Proxy: null)
    at com.amazon.redshift.util.RedshiftException.getSQLException(RedshiftException.java:56)
    at com.amazon.redshift.Driver.connect(Driver.java:339)
    at org.apache.spark.sql.execution.datasources.jdbc.DriverWrapper.connect(DriverWrapper.scala:46)
    at io.github.spark_redshift_community.spark.redshift.JDBCWrapper.getConnector(RedshiftJDBCWrapper.scala:270)
    at io.github.spark_redshift_community.spark.redshift.RedshiftRelation.$anonfun$schema$1(RedshiftRelation.scala:69)
    at scala.Option.getOrElse(Option.scala:189)
    at io.github.spark_redshift_community.spark.redshift.RedshiftRelation.schema$lzycompute(RedshiftRelation.scala:66)
    at io.github.spark_redshift_community.spark.redshift.RedshiftRelation.schema(RedshiftRelation.scala:65)
    at org.apache.spark.sql.execution.datasources.DataSource.resolveRelation(DataSource.scala:440)
    at org.apache.spark.sql.DataFrameReader.loadV1Source(DataFrameReader.scala:228)
    at org.apache.spark.sql.DataFrameReader.$anonfun$load$2(DataFrameReader.scala:210)
    at scala.Option.getOrElse(Option.scala:189)
    at org.apache.spark.sql.DataFrameReader.load(DataFrameReader.scala:210)
    at org.apache.spark.sql.DataFrameReader.load(DataFrameReader.scala:171)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:498)
    at py4j.reflection.MethodInvoker.invoke(MethodInvoker.java:244)
    at py4j.reflection.ReflectionEngine.invoke(ReflectionEngine.java:357)
    at py4j.Gateway.invoke(Gateway.java:282)
    at py4j.commands.AbstractCommand.invokeMethod(AbstractCommand.java:132)
    at py4j.commands.CallCommand.execute(CallCommand.java:79)
    at py4j.ClientServerConnection.waitForCommands(ClientServerConnection.java:182)
    at py4j.ClientServerConnection.run(ClientServerConnection.java:106)
    at java.lang.Thread.run(Thread.java:750)
    Caused by: com.amazonaws.services.redshift.model.AmazonRedshiftException: The security token included in the request is invalid (Service: AmazonRedshift; Status Code: 403; Error Code: InvalidClientTokenId; Request ID: 78fc4d63-34c5-45a1-af4f-454c122d93e5; Proxy: null)
    at com.amazonaws.http.AmazonHttpClient$RequestExecutor.handleErrorResponse(AmazonHttpClient.java:1879)
    at com.amazonaws.http.AmazonHttpClient$RequestExecutor.handleServiceErrorResponse(AmazonHttpClient.java:1418)
    at com.amazonaws.http.AmazonHttpClient$RequestExecutor.executeOneRequest(AmazonHttpClient.java:1387)
    at com.amazonaws.http.AmazonHttpClient$RequestExecutor.executeHelper(AmazonHttpClient.java:1157)
    at com.amazonaws.http.AmazonHttpClient$RequestExecutor.doExecute(AmazonHttpClient.java:814)
    at com.amazonaws.http.AmazonHttpClient$RequestExecutor.executeWithTimer(AmazonHttpClient.java:781)
    at com.amazonaws.http.AmazonHttpClient$RequestExecutor.execute(AmazonHttpClient.java:755)
    at com.amazonaws.http.AmazonHttpClient$RequestExecutor.access$500(AmazonHttpClient.java:715)
    at com.amazonaws.http.AmazonHttpClient$RequestExecutionBuilderImpl.execute(AmazonHttpClient.java:697)
    at com.amazonaws.http.AmazonHttpClient.execute(AmazonHttpClient.java:561)
    at com.amazonaws.http.AmazonHttpClient.execute(AmazonHttpClient.java:541)
    at com.amazonaws.services.redshift.AmazonRedshiftClient.doInvoke(AmazonRedshiftClient.java:9234)
    at com.amazonaws.services.redshift.AmazonRedshiftClient.invoke(AmazonRedshiftClient.java:9201)
    at com.amazonaws.services.redshift.AmazonRedshiftClient.invoke(AmazonRedshiftClient.java:9190)
    at com.amazonaws.services.redshift.AmazonRedshiftClient.executeDescribeClusters(AmazonRedshiftClient.java:4581)
    at com.amazonaws.services.redshift.AmazonRedshiftClient.describeClusters(AmazonRedshiftClient.java:4549)
    at com.amazon.redshift.core.IamHelper.setClusterCredentials(IamHelper.java:546)
    at com.amazon.redshift.core.IamHelper.setIAMCredentials(IamHelper.java:518)
    at com.amazon.redshift.core.IamHelper.setIAMProperties(IamHelper.java:279)
    at com.amazon.redshift.jdbc.RedshiftConnectionImpl.<init>(RedshiftConnectionImpl.java:260)
    at com.amazon.redshift.Driver.makeConnection(Driver.java:502)
    at com.amazon.redshift.Driver.connect(Driver.java:315)
    ... 24 more
  4. Any other details that can be helpful:

JDBC trace logs

Reproduction code

@bhvkshah
Contributor

bhvkshah commented Sep 1, 2023

Hi @Y-Asahi-dev , thanks for reporting this issue. Can you try:
1/ with the latest driver, version 2.1.0.18
2/ passing the session token as part of a Properties object, instead of as part of the URL?
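
A minimal sketch of option 2/ in Python, assuming the jaydebeapi package (which turns a Python dict into a java.util.Properties for DriverManager); the jar path, cluster endpoint, and DbUser are illustrative:

//-----------------------------------------------------------
import boto3
import jaydebeapi

credentials = boto3.Session().get_credentials()

# All IAM parameters go into the properties dict; the URL carries none of them.
conn = jaydebeapi.connect(
    'com.amazon.redshift.Driver',
    'jdbc:redshift:iam://sample_cluster:us-east-1/sample_db',
    {
        'DbUser': 'test_user',
        'AccessKeyID': credentials.access_key,
        'SecretAccessKey': credentials.secret_key,
        'SessionToken': credentials.token,  # session token as a property, not a URL parameter
    },
    'redshift-jdbc42-2.1.0.18.jar',  # illustrative path to the driver jar
)
-----------------------------------------------------------//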

@Y-Asahi-dev
Author

Y-Asahi-dev commented Sep 5, 2023

@bhvkshah

1/ with the latest driver, version 2.1.0.18

There was no change in the results.

2/ passing the session token as part of a Properties object, instead of as part of the URL?

https://docs.aws.amazon.com/ja_jp/redshift/latest/mgmt/spark-redshift-connector-other-config.html

I removed the IAM parameters from the URL and instead set

temporary_aws_access_key_id
temporary_aws_secret_access_key
temporary_aws_session_token

in the connector configuration, and the process completed successfully (sketched below). However, the job also completes normally when these values are empty, so they are probably not actually being used. With the IAM parameters removed from the URL, it seems Redshift is simply auto-loading a valid IAM user.
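
For reference, a minimal sketch of the configuration that succeeded, continuing from the variables in the reproduction code above (the plain, non-IAM JDBC URL is my assumption based on removing the IAM parameters; the option names come from the AWS docs linked above):

//-----------------------------------------------------------
# No IAM parameters in the JDBC URL; temporary credentials go through connector options.
url = f'jdbc:redshift://{redshift_cluster_id}:5439/{redshift_dbname}'
dfr = (sql_context.read
    .format('io.github.spark_redshift_community.spark.redshift')
    .option('url', url)
    .option('temporary_aws_access_key_id', credentials.access_key)
    .option('temporary_aws_secret_access_key', credentials.secret_key)
    .option('temporary_aws_session_token', credentials.token)
    .option('tempdir', f's3a://{bucket_name}/redshift_temp_dir'))
-----------------------------------------------------------//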

Thanks.

@ravi-Mode

Hi @bhvkshah
I am also trying to use IAM role-based authentication with the AssumeRole API, using JDBC driver version 2.1.0.30, and I am observing strange behavior. I am able to generate an access key, secret access key, and session token with the AssumeRole API.

But when I pass these as URL parameters, the connection fails with: Caused by: com.amazonaws.services.redshift.model.AmazonRedshiftException: The security token included in the request is invalid

However, when I pass them via a properties file, it gets past that point but then fails with: java.sql.SQLException: FATAL: user "IAM:" does not exist.

The same access key, secret access key, and session token work fine with the AWS CLI and the Redshift Data API:
aws redshift-data execute-statement --sql "select 1" --database dev --cluster-identifier <clusterId>
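
For reference, a minimal sketch of generating such temporary credentials with boto3's AssumeRole call (the role ARN and session name are illustrative):

//-----------------------------------------------------------
import boto3

sts = boto3.client('sts')
resp = sts.assume_role(
    RoleArn='arn:aws:iam::123456789012:role/redshift-access',  # illustrative role ARN
    RoleSessionName='redshift-jdbc-test',
)
creds = resp['Credentials']
# creds['AccessKeyId'], creds['SecretAccessKey'], and creds['SessionToken']
# are the values passed to the driver via the URL or the properties file.
-----------------------------------------------------------//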

@ravi-Mode

Adding to the above: I have granted S3 access as part of this role, and the S3 AWS SDK API works with these credentials, which confirms the generated credentials are valid; something seems to be missing in the JDBC driver. A quick check is sketched below.
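
A minimal sketch of that S3 sanity check with boto3, continuing from the AssumeRole sketch above (the bucket name is illustrative):

//-----------------------------------------------------------
import boto3

s3 = boto3.client(
    's3',
    aws_access_key_id=creds['AccessKeyId'],
    aws_secret_access_key=creds['SecretAccessKey'],
    aws_session_token=creds['SessionToken'],
)
# Listing a single key succeeds with the assumed-role credentials,
# confirming the token itself is valid.
s3.list_objects_v2(Bucket='sample-bucket', MaxKeys=1)
-----------------------------------------------------------//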
