Send data to non-AWS S3 bucket #31

Open
bvarick opened this issue Nov 27, 2024 · 4 comments

Comments

@bvarick

bvarick commented Nov 27, 2024

Is it possible to save to something other than AWS S3, either a different S3-compatible destination or a local directory?

@bvarick
Author

bvarick commented Nov 27, 2024

When I put a MinIO URL in place of the S3 URI in global_config.json, I get this error:

mdb-2130-normalizer-1  | Invalid bucket name "https:": Bucket name must match the regex "^[a-zA-Z0-9.\-_]{1,255}$" or be an ARN matching the regex "^arn:(aws).*:(s3|s3-object-lambda):[a-z\-0-9]*:[0-9]{12}:accesspoint[/:][a-zA-Z0-9\-.]{1,63}$|^arn:(aws).*:s3-outposts:[a-z\-0-9]+:[0-9]{12}:outpost[/:][a-zA-Z0-9\-]{1,63}[/:]accesspoint[/:][a-zA-Z0-9\-]{1,63}$"
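The error text shows that the string `https:` was checked as a bucket name and rejected. As a minimal sketch, the plain bucket-name pattern quoted in that message can be tried directly. The helper name `is_valid_bucket_name` is made up for illustration, and the ARN alternatives from the message are left out:

```python
import re

# Plain bucket-name pattern copied from the error message above
# (the ARN alternatives in that message are omitted here).
BUCKET_NAME = re.compile(r"^[a-zA-Z0-9.\-_]{1,255}$")


def is_valid_bucket_name(name):
    """Hypothetical helper: True if `name` matches the quoted pattern."""
    return BUCKET_NAME.match(name) is not None


# The colon in "https:" is not in the allowed character set.
print(is_valid_bucket_name("https:"))
print(is_valid_bucket_name("my-bucket"))
```

So a full `https://...` URL cannot stand in for the bucket name. The endpoint has to be passed to the client separately.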

botocore has a built-in endpoint_url parameter that I think is for this purpose:
https://boto3.amazonaws.com/v1/documentation/api/latest/reference/core/session.html

Is it as simple as adding that to the global_config.json as a new key and then editing src/normalize/normalize.py and src/normalize/compact.py to use that value instead of a generated value for the s3_bucket_path?
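A rough sketch of that idea, not the code from the eventual pull request: keep the `s3://bucket/prefix` URI as it is and pass the custom endpoint to the client through boto3's `endpoint_url` parameter. The config keys `s3_uri` and `s3_endpoint_url`, the sample values, and the helper names are all assumptions made for illustration. The real keys in global_config.json may differ.

```python
import json
from urllib.parse import urlparse


def split_s3_uri(s3_uri):
    """Split 's3://bucket/prefix' into (bucket, prefix)."""
    parsed = urlparse(s3_uri)
    if parsed.scheme != "s3" or not parsed.netloc:
        raise ValueError(f"expected an s3://bucket/prefix URI, got {s3_uri!r}")
    return parsed.netloc, parsed.path.lstrip("/")


def s3_client_kwargs(config):
    """Build keyword arguments for boto3.client('s3', ...)."""
    kwargs = {}
    # "s3_endpoint_url" is a hypothetical config key, not a documented one.
    endpoint_url = config.get("s3_endpoint_url")
    if endpoint_url:
        kwargs["endpoint_url"] = endpoint_url
    return kwargs


def make_s3_client(config):
    """Create an S3 client, pointed at a custom endpoint when one is configured."""
    import boto3  # imported lazily so the helpers above work without boto3

    return boto3.client("s3", **s3_client_kwargs(config))


# Hypothetical config contents; the bucket and endpoint are placeholders.
config = json.loads(
    '{"s3_uri": "s3://my-bucket/exports",'
    ' "s3_endpoint_url": "http://localhost:9000"}'
)
bucket, prefix = split_s3_uri(config["s3_uri"])
```

With no `s3_endpoint_url` key present, `s3_client_kwargs` returns an empty dict and the client falls back to the default AWS endpoint, so existing configurations would keep working unchanged.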

@AndiLi99
Contributor

Hi, we haven’t tested this with a non-Amazon S3 bucket, but as long as the bucket is S3-compatible and works with the client, I don’t see why that wouldn’t work.

Based on the docs you linked, it sounds like your suggestion should work. I would be interested in hearing how it goes!

@bvarick
Author

bvarick commented Dec 3, 2024

I got it working and submitted a pull request.

@AndiLi99
Contributor

Thank you @bvarick for the PR!
