-
Notifications
You must be signed in to change notification settings - Fork 30
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Including SessionToken in url with v2 driver causes SQLException: IAM error retrieving temp credentials #96
Comments
Hi @Y-Asahi-dev , thanks for reporting this issue. Can you try: |
There was no change in the results.
https://docs.aws.amazon.com/ja_jp/redshift/latest/mgmt/spark-redshift-connector-other-config.html Remove the IAM parameters from the URL and temporary_aws_access_key_id When I specified it in the config setting value, the process was successful. thanks. |
Hi @bhvkshah But when I pass this as part of URL string parameter then it is failing with Caused by: com.amazonaws.services.redshift.model.AmazonRedshiftException: The security token included in the request is invalid However when I passed using properties file then it is able to pass this state but now failing with error message java.sql.SQLException: FATAL: user "IAM:" does not exist. The same accessKey, secretAccesskey and session token if used with aws cli with redshift data api it is working fine. |
Adding to above, I have given s3 access as part of this role and when use s3 aws sdk api, then it is working so it is making sure that credential generated is correct there are something missing with jdbc driver. |
Driver version
2.1.0.9
Redshift version
PostgreSQL 8.0.2 on i686-pc-linux-gnu, compiled by GCC gcc (GCC) 3.4.2 20041017 (Red Hat 3.4.2-6.fc3), Redshift 1.0.54052
Client Operating System
Amazon EMR ver6.9
※OS info
NAME="Amazon Linux"
VERSION="2"
ID="amzn"
ID_LIKE="centos rhel fedora"
VERSION_ID="2"
PRETTY_NAME="Amazon Linux 2"
ANSI_COLOR="0;33"
CPE_NAME="cpe:2.3:o:amazon:amazon_linux:2"
HOME_URL="https://amazonlinux.com/"
JAVA/JVM version
openjdk version "1.8.0_382"
OpenJDK Runtime Environment Corretto-8.382.05.1 (build 1.8.0_382-b05)
OpenJDK 64-Bit Server VM Corretto-8.382.05.1 (build 25.382-b05, mixed mode)
Table schema
Problem description
I am using amazon-redshift-jdbc-driver v2 with Pyspark (Spark version 3.3.2).
I get a SQLException when I run the code below.
It seems that an error occurs if the URL has a SessionToken parameter.
After replacing the jdbc-driver with v1 I get ret.count() results successfully without any errors.
Has the behavior changed between v1 and v2 when there is a SessionToken?
//-----------------------------------------------------------
from pyspark import SparkContext
from pyspark.sql import SQLContext
import boto3
redshift_cluster_id = 'sample_cluster'
redshift_dbname = 'sample_db'
bucket_name = 'sample_bucket'
sc = SparkContext.getOrCreate()
credentials = boto3.Session().get_credentials()
region = boto3.Session().region_name
sc._jsc.hadoopConfiguration().set("fs.s3.awsAccessKeyId", credentials.access_key)
sc._jsc.hadoopConfiguration().set("fs.s3.awsSecretAccessKey", credentials.secret_key)
url = f'jdbc:redshift:iam://{redshift_cluster_id}:{region}/{redshift_dbname}?DbUser=test_uesr&DbGroups=test_users_group&AutoCreate=true&AccessKeyID={credentials.access_key}&SecretAccessKey={credentials.secret_key}&user=&password='
token = credentials.token
sc._jsc.hadoopConfiguration().set("fs.s3.awsSessionToken", token)
url = f'{url}&SessionToken={credentials.token}'
sql_context = SQLContext(sc)
dfr = sql_context.read
.format('io.github.spark_redshift_community.spark.redshift')
.option('url', url)
.option('tempdir', f's3a://{bucket_name}/redshift_temp_dir')
.option('forward_spark_s3_credentials', 'true')
.option('fetchsize', 10000)
query = f'select * from sample_table'
ret = dfr.option('query', query).load(schema=None)
ret.count()
-----------------------------------------------------------//
: java.sql.SQLException: IAM error retrieving temp credentials: The security token included in the request is invalid (Service: AmazonRedshift; Status Code: 403; Error Code: InvalidClientTokenId; Request ID: 78fc4d63-34c5-45a1-af4f-454c122d93e5; Proxy: null)
at com.amazon.redshift.util.RedshiftException.getSQLException(RedshiftException.java:56)
at com.amazon.redshift.Driver.connect(Driver.java:339)
at org.apache.spark.sql.execution.datasources.jdbc.DriverWrapper.connect(DriverWrapper.scala:46)
at io.github.spark_redshift_community.spark.redshift.JDBCWrapper.getConnector(RedshiftJDBCWrapper.scala:270)
at io.github.spark_redshift_community.spark.redshift.RedshiftRelation.$anonfun$schema$1(RedshiftRelation.scala:69)
at scala.Option.getOrElse(Option.scala:189)
at io.github.spark_redshift_community.spark.redshift.RedshiftRelation.schema$lzycompute(RedshiftRelation.scala:66)
at io.github.spark_redshift_community.spark.redshift.RedshiftRelation.schema(RedshiftRelation.scala:65)
at org.apache.spark.sql.execution.datasources.DataSource.resolveRelation(DataSource.scala:440)
at org.apache.spark.sql.DataFrameReader.loadV1Source(DataFrameReader.scala:228)
at org.apache.spark.sql.DataFrameReader.$anonfun$load$2(DataFrameReader.scala:210)
at scala.Option.getOrElse(Option.scala:189)
at org.apache.spark.sql.DataFrameReader.load(DataFrameReader.scala:210)
at org.apache.spark.sql.DataFrameReader.load(DataFrameReader.scala:171)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at py4j.reflection.MethodInvoker.invoke(MethodInvoker.java:244)
at py4j.reflection.ReflectionEngine.invoke(ReflectionEngine.java:357)
at py4j.Gateway.invoke(Gateway.java:282)
at py4j.commands.AbstractCommand.invokeMethod(AbstractCommand.java:132)
at py4j.commands.CallCommand.execute(CallCommand.java:79)
at py4j.ClientServerConnection.waitForCommands(ClientServerConnection.java:182)
at py4j.ClientServerConnection.run(ClientServerConnection.java:106)
at java.lang.Thread.run(Thread.java:750)
Caused by: com.amazonaws.services.redshift.model.AmazonRedshiftException: The security token included in the request is invalid (Service: AmazonRedshift; Status Code: 403; Error Code: InvalidClientTokenId; Request ID: 78fc4d63-34c5-45a1-af4f-454c122d93e5; Proxy: null)
at com.amazonaws.http.AmazonHttpClient$RequestExecutor.handleErrorResponse(AmazonHttpClient.java:1879)
at com.amazonaws.http.AmazonHttpClient$RequestExecutor.handleServiceErrorResponse(AmazonHttpClient.java:1418)
at com.amazonaws.http.AmazonHttpClient$RequestExecutor.executeOneRequest(AmazonHttpClient.java:1387)
at com.amazonaws.http.AmazonHttpClient$RequestExecutor.executeHelper(AmazonHttpClient.java:1157)
at com.amazonaws.http.AmazonHttpClient$RequestExecutor.doExecute(AmazonHttpClient.java:814)
at com.amazonaws.http.AmazonHttpClient$RequestExecutor.executeWithTimer(AmazonHttpClient.java:781)
at com.amazonaws.http.AmazonHttpClient$RequestExecutor.execute(AmazonHttpClient.java:755)
at com.amazonaws.http.AmazonHttpClient$RequestExecutor.access$500(AmazonHttpClient.java:715)
at com.amazonaws.http.AmazonHttpClient$RequestExecutionBuilderImpl.execute(AmazonHttpClient.java:697)
at com.amazonaws.http.AmazonHttpClient.execute(AmazonHttpClient.java:561)
at com.amazonaws.http.AmazonHttpClient.execute(AmazonHttpClient.java:541)
at com.amazonaws.services.redshift.AmazonRedshiftClient.doInvoke(AmazonRedshiftClient.java:9234)
at com.amazonaws.services.redshift.AmazonRedshiftClient.invoke(AmazonRedshiftClient.java:9201)
at com.amazonaws.services.redshift.AmazonRedshiftClient.invoke(AmazonRedshiftClient.java:9190)
at com.amazonaws.services.redshift.AmazonRedshiftClient.executeDescribeClusters(AmazonRedshiftClient.java:4581)
at com.amazonaws.services.redshift.AmazonRedshiftClient.describeClusters(AmazonRedshiftClient.java:4549)
at com.amazon.redshift.core.IamHelper.setClusterCredentials(IamHelper.java:546)
at com.amazon.redshift.core.IamHelper.setIAMCredentials(IamHelper.java:518)
at com.amazon.redshift.core.IamHelper.setIAMProperties(IamHelper.java:279)
at com.amazon.redshift.jdbc.RedshiftConnectionImpl.(RedshiftConnectionImpl.java:260)
at com.amazon.redshift.Driver.makeConnection(Driver.java:502)
at com.amazon.redshift.Driver.connect(Driver.java:315)
... 24 more
JDBC trace logs
Reproduction code
The text was updated successfully, but these errors were encountered: