Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add environment variable configuration for storage #42

Closed
wants to merge 5 commits into from

Conversation

joao-zanutto
Copy link
Contributor

@joao-zanutto joao-zanutto commented Mar 28, 2024

@wwoytenko I've added some of the variables proposed in #41. I already tested locally that these changes work and if you agree to add this PR to the project, I can also add the related docs to the other PR.

I'm still getting used to Go and the project, but I couldn't figure out where I could add the toggle to switch between directory and s3 storage as an environment variable (the proposed STORAGE_TYPE env var)

@joao-zanutto joao-zanutto changed the title Add minimal environment variable configuration for s3 storage Add environment variable configuration for storage Mar 28, 2024
@joao-zanutto
Copy link
Contributor Author

@wwoytenko I figured how to add the STORAGE_TYPE env var and added the directory env var with a validation function, but couldn't figure out where I should call the Validate() function just by checking the s3 storage code

Copy link
Contributor

@wwoytenko wwoytenko left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Consider mentioned comments

@@ -50,6 +51,9 @@ func NewConfig() *Config {
ForcePathStyle: true,
MaxRetries: defaultMaxRetries,
MaxPartSize: defaultMaxPartSize,
Bucket: os.Getenv("STORAGE_S3_BUCKET_NAME"),
Region: os.Getenv("STORAGE_S3_BUCKET_REGION"),
Prefix: os.Getenv("STORAGE_S3_BUCKET_PREFIX"),
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi! Greenmask employs Viper and Cobra frameworks for managing the configuration. This means you should not hardcode os.Getenv.

Have a look at an example

Copy link
Contributor Author

@joao-zanutto joao-zanutto Mar 30, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@wwoytenko I see, while investigating the code further I realized that Viper would take care of the configuration. As I understand, I believe that it was supposed to load all the configuration from environment variables automatically since viper.AutomaticEnv() is being called in the initConfig() function here.

However, I can't seem to make it work using any environment variables following Viper's docummentation (like STORAGE_DIRECTORY_PATH or COMMON_TMP_DIR). I'm still investigating but I'll surely drop those modifications.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@wwoytenko it seems I was able to fix it, it was simply the lack of the separator replacer. I opened a new PR #43 with the change.

On a side note I would like to propose a couple of other changes (and I'm unsure where would be the ideal location to communicate those, so I'm just adding them here):

  • make config.yaml not required -- this would make it easier to run greenmask as a container without needing to rebuild the image with a config file or mounting a volume with the file
  • set common.tmp_dir default value to linux /tmp
  • change Dockerfile entrypoint to greenmask instead of /bin/bash -- this is more of a matter of following convention of other dockerized tools
  • define a user greenmask on the container to avoid running it as root -- this is a security best practice I believe

I can work on all of those if you agree, I've tested a few of those changes locally already and they are really simple, the first one being the most impactful for my use case. To give you some context of my use case: I'm planning to run greenmask once a day as a ECS container inside a VPC in AWS, export dumps to S3 and also use it in Github actions to restore the dump to a postgres database so we can run database migration tests in our CI. So, having all the configurations as environment variables would be ideal, even the transformer configurations for the dump operation (which I still haven't tested yet).

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@joao-zanutto sure you can start doing it with all the listed fixes/features. Kindly remind you to review the documentation according to the changes will make.

Thank you!

@@ -27,9 +28,10 @@ import (
func GetStorage(ctx context.Context, stCfg *domains.StorageConfig, logCgf *domains.LogConfig) (
storages.Storager, error,
) {
if stCfg.Directory != nil {
envCfg := os.Getenv("STORAGE_TYPE")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is better to implement the configuration in the next way:

storage:
  type: "s3"

   s3_config: 
     bucket: "test"
  
   directory_config: 
     path: "/tmp"

We should introduce the storage.type parameter and depending on the type provided you can use either S3 config or Directory.

Keep in mind that the env vars mapping should be done via viper/cobra usage here

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it is important now, because AutomaticEnv does not work when we provide the variable STORAGE_DIRECTORY_PATH=/tmp. The default config file assembles the S3 storage (not the Directory). It would be fine to introduce the next keys in the config:

  • storage.type
  • storage.s3
  • storage.directory

Then the variables will be dynamically working using the next variable naming:

  • For path setting in Directory STORAGE_DIRECTORY_PATH=/tmp
  • For bucket setting in S3 STORAGE_S3_BUCKET=testbucket

@wwoytenko
Copy link
Contributor

Closing, because implemented in #51. Feel free to create new Merge Requests.

@wwoytenko wwoytenko closed this Apr 4, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants