In 1.x, KinesisConfig (inStream) and S3Config (outStream) each have their own AWS region setting. That means you can run the s3-loader in region A to consume data from a Kinesis data stream (region A) and persist raw/enriched events to S3 in region B.
Since 2.x, region is a global setting outside the "input" and "output" sections. The code always takes the region from there, even when I configure a custom S3 endpoint.
The AWS client SDK provides an interface to turn on global bucket access, but snowplow-s3-loader does not expose this setting in its pipeline configuration. Please review and fix this.
Hi @donnyding would you be able to share a bit more about the use-case you are trying to solve here and why you would want to read from Kinesis in one region and write to S3 in a different region?
As for the feature itself, if you have the bandwidth, we are always happy to review Pull Requests!
hi @jbeemster,
Usage scenario:
To improve HA, we plan to set up similar environments in two AWS regions. The health check API can be used in the traffic routing policy. That means event payloads will be routed across the two regions, with no duplicated data. Ideally, the enrichment processing would persist raw/enriched events to one global S3 storage area.
That's why I consume data from a Kinesis data stream (region A) and sink data to S3 (region B).
As a workaround, I can force global bucket access through the AWS client SDK interface, but it's not a perfect solution.
It would be possible to separate the region setting for the input and output sections of the configuration file, just as Snowplow-OSS does in v1.0, or to add a new configuration item in the output section that lets users enable or disable global bucket access. Does that make sense?
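To make the two options concrete, here is a rough sketch in Python of how the region could be resolved. This is not the loader's actual code or configuration schema: `LoaderConfig`, `SectionConfig`, `force_global_bucket_access`, and both helper functions are hypothetical names used only for illustration. The idea is that each section may carry its own region and falls back to the global one when it is absent, while a separate opt-in flag covers the global bucket access alternative.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional, Tuple


@dataclass(frozen=True)
class SectionConfig:
    """One side of the pipeline ("input" or "output"); the region is optional."""
    region: Optional[str] = None


@dataclass(frozen=True)
class LoaderConfig:
    """Hypothetical shape: a global region plus optional per-section overrides."""
    region: str
    input: SectionConfig = field(default_factory=SectionConfig)
    output: SectionConfig = field(default_factory=SectionConfig)
    # Hypothetical opt-in flag for the second option (global bucket access).
    force_global_bucket_access: bool = False


def resolve_regions(cfg: LoaderConfig) -> Tuple[str, str]:
    """Return (kinesis_region, s3_region), preferring per-section values."""
    kinesis_region = cfg.input.region or cfg.region
    s3_region = cfg.output.region or cfg.region
    return kinesis_region, s3_region


def s3_client_options(cfg: LoaderConfig) -> Dict[str, object]:
    """Collect the settings an S3 client builder would need (illustrative only)."""
    _, s3_region = resolve_regions(cfg)
    return {
        "region": s3_region,
        "force_global_bucket_access": cfg.force_global_bucket_access,
    }
```

With this shape, a 2.x-style config that sets only the global region keeps working unchanged, and a cross-region setup only needs `output.region` set (or the flag turned on).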