Access to public S3 buckets without credentials (#287)
* Implement aws anonymous mode

* Pk fix

* Fix external refs

* Unit test

* Documentation about public bucket access

* Release date

* Scala formatting

* Fix changes file

* Typo

Co-authored-by: Christoph Pirkl <[email protected]>

* Update doc/user_guide/user_guide.md

Co-authored-by: Christoph Pirkl <[email protected]>

---------

Co-authored-by: Christoph Pirkl <[email protected]>
Shmuma and kaklakariada authored Nov 10, 2023
1 parent 8f45444 commit 5c222f9
Showing 7 changed files with 54 additions and 19 deletions.
1 change: 1 addition & 0 deletions doc/changes/changelog.md

Some generated files are not rendered by default.

11 changes: 11 additions & 0 deletions doc/changes/changes_2.7.8.md
@@ -0,0 +1,11 @@
# Cloud Storage Extension 2.7.8, released 2023-11-10

Code name: Access to public S3 buckets without credentials

## Summary

Implemented an option to access public S3 buckets without credentials.

## Features

* #283: Support publicly available S3 buckets without credentials
24 changes: 14 additions & 10 deletions doc/user_guide/user_guide.md
@@ -150,7 +150,7 @@ downloaded jar file is the same as the checksum provided in the releases.
To check the SHA256 result of the local jar, run the command:

```sh
sha256sum exasol-cloud-storage-extension-2.7.7.jar
sha256sum exasol-cloud-storage-extension-2.7.8.jar
```

### Building From Source
@@ -180,7 +180,7 @@ mvn clean package -DskipTests=true
```

The assembled jar file should be located at
`target/exasol-cloud-storage-extension-2.7.7.jar`.
`target/exasol-cloud-storage-extension-2.7.8.jar`.

### Create an Exasol Bucket

@@ -202,7 +202,7 @@ for the HTTP protocol.
Upload the jar file using curl command:

```sh
curl -X PUT -T exasol-cloud-storage-extension-2.7.7.jar \
curl -X PUT -T exasol-cloud-storage-extension-2.7.8.jar \
http://w:<WRITE_PASSWORD>@exasol.datanode.domain.com:2580/<BUCKET>/
```

@@ -234,7 +234,7 @@ OPEN SCHEMA CLOUD_STORAGE_EXTENSION;

CREATE OR REPLACE JAVA SET SCRIPT IMPORT_PATH(...) EMITS (...) AS
%scriptclass com.exasol.cloudetl.scriptclasses.FilesImportQueryGenerator;
%jar /buckets/bfsdefault/<BUCKET>/exasol-cloud-storage-extension-2.7.7.jar;
%jar /buckets/bfsdefault/<BUCKET>/exasol-cloud-storage-extension-2.7.8.jar;
/

CREATE OR REPLACE JAVA SCALAR SCRIPT IMPORT_METADATA(...) EMITS (
@@ -244,12 +244,12 @@ CREATE OR REPLACE JAVA SCALAR SCRIPT IMPORT_METADATA(...) EMITS (
end_index DECIMAL(36, 0)
) AS
%scriptclass com.exasol.cloudetl.scriptclasses.FilesMetadataReader;
%jar /buckets/bfsdefault/<BUCKET>/exasol-cloud-storage-extension-2.7.7.jar;
%jar /buckets/bfsdefault/<BUCKET>/exasol-cloud-storage-extension-2.7.8.jar;
/

CREATE OR REPLACE JAVA SET SCRIPT IMPORT_FILES(...) EMITS (...) AS
%scriptclass com.exasol.cloudetl.scriptclasses.FilesDataImporter;
%jar /buckets/bfsdefault/<BUCKET>/exasol-cloud-storage-extension-2.7.7.jar;
%jar /buckets/bfsdefault/<BUCKET>/exasol-cloud-storage-extension-2.7.8.jar;
/
```

@@ -268,12 +268,12 @@ OPEN SCHEMA CLOUD_STORAGE_EXTENSION;

CREATE OR REPLACE JAVA SET SCRIPT EXPORT_PATH(...) EMITS (...) AS
%scriptclass com.exasol.cloudetl.scriptclasses.TableExportQueryGenerator;
%jar /buckets/bfsdefault/<BUCKET>/exasol-cloud-storage-extension-2.7.7.jar;
%jar /buckets/bfsdefault/<BUCKET>/exasol-cloud-storage-extension-2.7.8.jar;
/

CREATE OR REPLACE JAVA SET SCRIPT EXPORT_TABLE(...) EMITS (ROWS_AFFECTED INT) AS
%scriptclass com.exasol.cloudetl.scriptclasses.TableDataExporter;
%jar /buckets/bfsdefault/<BUCKET>/exasol-cloud-storage-extension-2.7.7.jar;
%jar /buckets/bfsdefault/<BUCKET>/exasol-cloud-storage-extension-2.7.8.jar;
/
```

@@ -407,13 +407,13 @@ CREATE OR REPLACE JAVA SCALAR SCRIPT IMPORT_METADATA(...) EMITS (
) AS
%jvmoption -DHTTPS_PROXY=http://username:password@10.10.1.10:1180
%scriptclass com.exasol.cloudetl.scriptclasses.FilesMetadataReader;
%jar /buckets/bfsdefault/<BUCKET>/exasol-cloud-storage-extension-2.7.7.jar;
%jar /buckets/bfsdefault/<BUCKET>/exasol-cloud-storage-extension-2.7.8.jar;
/

CREATE OR REPLACE JAVA SET SCRIPT IMPORT_FILES(...) EMITS (...) AS
%jvmoption -DHTTPS_PROXY=http://username:password@10.10.1.10:1180
%scriptclass com.exasol.cloudetl.scriptclasses.FilesDataImporter;
%jar /buckets/bfsdefault/<BUCKET>/exasol-cloud-storage-extension-2.7.7.jar;
%jar /buckets/bfsdefault/<BUCKET>/exasol-cloud-storage-extension-2.7.8.jar;
/
```

@@ -722,6 +722,10 @@ S3_SESSION_TOKEN
Please follow the [Amazon credentials management best practices][aws-creds] when
creating credentials.

If you are accessing a public bucket, you do not need credentials. In that case,
set `S3_ACCESS_KEY` and `S3_SECRET_KEY` to empty values:
`S3_ACCESS_KEY=;S3_SECRET_KEY=`.
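
A minimal sketch of an anonymous setup (the connection name, target table,
bucket path, and region below are hypothetical; adjust `DATA_FORMAT` and the
other import parameters to your data):

```sql
-- Hypothetical connection for a public bucket: both keys are left empty.
CREATE OR REPLACE CONNECTION S3_PUBLIC_CONNECTION
TO ''
USER ''
IDENTIFIED BY 'S3_ACCESS_KEY=;S3_SECRET_KEY=';

-- Import Parquet files from the public bucket using that connection.
IMPORT INTO RETAIL.SALES_POSITIONS
FROM SCRIPT CLOUD_STORAGE_EXTENSION.IMPORT_PATH WITH
  BUCKET_PATH     = 's3a://<PUBLIC_BUCKET>/data/parquet/*'
  DATA_FORMAT     = 'PARQUET'
  S3_ENDPOINT     = 's3.<REGION>.amazonaws.com'
  CONNECTION_NAME = 'S3_PUBLIC_CONNECTION';
```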

[aws-creds]: https://docs.aws.amazon.com/general/latest/gr/aws-sec-cred-types.html

Another required parameter is the S3 endpoint, `S3_ENDPOINT`. An endpoint is the
2 changes: 1 addition & 1 deletion pk_generated_parent.pom

Some generated files are not rendered by default.

4 changes: 2 additions & 2 deletions pom.xml
@@ -3,14 +3,14 @@
<modelVersion>4.0.0</modelVersion>
<groupId>com.exasol</groupId>
<artifactId>cloud-storage-extension</artifactId>
<version>2.7.7</version>
<version>2.7.8</version>
<name>Cloud Storage Extension</name>
<description>Exasol Cloud Storage Import And Export Extension</description>
<url>https://github.com/exasol/cloud-storage-extension/</url>
<parent>
<artifactId>cloud-storage-extension-generated-parent</artifactId>
<groupId>com.exasol</groupId>
<version>2.7.7</version>
<version>2.7.8</version>
<relativePath>pk_generated_parent.pom</relativePath>
</parent>
<properties>
22 changes: 16 additions & 6 deletions src/main/scala/com/exasol/cloudetl/bucket/S3Bucket.scala
@@ -52,6 +52,9 @@ final case class S3Bucket(path: String, params: StorageProperties) extends Bucket
)
}

private[this] def isAnonymousAWSParams(properties: StorageProperties): Boolean =
properties.getString(S3_ACCESS_KEY).isEmpty && properties.getString(S3_SECRET_KEY).isEmpty

/**
* @inheritdoc
*
@@ -83,15 +86,22 @@ final case class S3Bucket(path: String, params: StorageProperties) extends Bucket
properties
}

conf.set("fs.s3a.access.key", mergedProperties.getString(S3_ACCESS_KEY))
conf.set("fs.s3a.secret.key", mergedProperties.getString(S3_SECRET_KEY))

if (mergedProperties.containsKey(S3_SESSION_TOKEN)) {
if (isAnonymousAWSParams(mergedProperties)) {
conf.set(
"fs.s3a.aws.credentials.provider",
classOf[org.apache.hadoop.fs.s3a.TemporaryAWSCredentialsProvider].getName()
classOf[org.apache.hadoop.fs.s3a.AnonymousAWSCredentialsProvider].getName()
)
conf.set("fs.s3a.session.token", mergedProperties.getString(S3_SESSION_TOKEN))
} else {
conf.set("fs.s3a.access.key", mergedProperties.getString(S3_ACCESS_KEY))
conf.set("fs.s3a.secret.key", mergedProperties.getString(S3_SECRET_KEY))

if (mergedProperties.containsKey(S3_SESSION_TOKEN)) {
conf.set(
"fs.s3a.aws.credentials.provider",
classOf[org.apache.hadoop.fs.s3a.TemporaryAWSCredentialsProvider].getName()
)
conf.set("fs.s3a.session.token", mergedProperties.getString(S3_SESSION_TOKEN))
}
}

properties.getProxyHost().foreach { proxyHost =>
9 changes: 9 additions & 0 deletions src/test/scala/com/exasol/cloudetl/bucket/S3BucketTest.scala
@@ -62,6 +62,15 @@ class S3BucketTest extends AbstractBucketTest {
assertConfigurationProperties(bucket, configMappings - "fs.s3a.session.token")
}

test(testName = "apply returns specific credentials provider for public access configuration") {
val exaMetadata = mockConnectionInfo("access", "S3_ACCESS_KEY=;S3_SECRET_KEY=")
val bucket = getBucket(defaultProperties, exaMetadata)
assert(
bucket.getConfiguration().get("fs.s3a.aws.credentials.provider") ==
"org.apache.hadoop.fs.s3a.AnonymousAWSCredentialsProvider"
)
}

test("apply returns S3Bucket with secret and session token from connection") {
val exaMetadata = mockConnectionInfo("access", "S3_SECRET_KEY=secret;S3_SESSION_TOKEN=token")
val bucket = getBucket(defaultProperties, exaMetadata)
