Use a mono-bucket S3 Storage for multiple database dumps #56
Hi @flazzarini! Thank you for your appreciation, we really value it!

The problem is that transformations are tied to a database, and the config (including connection params) has to point to that specific database. If we introduce transformations for multiple databases in one config file, it may complicate the user experience, because we would be forced to somehow match each transformation with its database.

But I think we can take the developers' experience into account when they are just fetching a database dump from the mono-bucket. In that case it would be fine to be able to get the latest dump for a provided database name.

Assuming we have

The config for the dump might be

Config for foobar

common:
  tmp_dir: ./tmp
dump:
  pg_dump_options:
    dbname: "host=localhost port=50022 user=foobar dbname=foobaz"
storage:
  s3:
    prefix: "foobar"
    endpoint: "http://localhost:50020"
    region: "us-east-1"
    bucket: "coss-db-dumps"
    access_key_id: "full-access-token"
    secret_access_key: "xxxx"

By applying

The next obstacle is how to simplify the config file usage experience, where we need only one config file and a request to restore a specific database. I suspect we could introduce the parameter
What do you think about it? Feel free to share your ideas, concerns, and doubts about the implementation.
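To make the idea concrete, here is a minimal sketch of how "the latest dump for a database" could be resolved when each database has its own prefix in the bucket. This is not Greenmask code: the key layout `<prefix>/<dump_id>/<file>`, the numeric, time-ordered dump IDs, and the `latest_dump_id` helper are all assumptions made for illustration. In practice the keys would come from an S3 listing rather than a hard-coded list.

```python
from typing import Iterable, Optional


def latest_dump_id(keys: Iterable[str], prefix: str) -> Optional[str]:
    """Return the newest dump ID stored under `prefix`, or None if there is none.

    Assumption: keys look like "<prefix>/<dump_id>/<file>" and dump IDs are
    numeric values that grow over time (for example millisecond timestamps).
    """
    # The trailing slash keeps "foobar" from matching "foobarbaz/...".
    base = prefix.strip("/") + "/"
    dump_ids = set()
    for key in keys:
        if not key.startswith(base):
            continue  # key belongs to another database's prefix
        dump_id = key[len(base):].split("/", 1)[0]
        if dump_id.isdigit():
            dump_ids.add(dump_id)
    if not dump_ids:
        return None
    # Compare numerically so IDs of different lengths still sort correctly.
    return max(dump_ids, key=int)


# Hypothetical object keys in a shared bucket, one prefix per database.
keys = [
    "foobar/1708500000000/metadata.json",
    "foobar/1708500000000/toc.dat",
    "foobar/1708600000000/metadata.json",
    "foobaz/1708550000000/metadata.json",
]
print(latest_dump_id(keys, "foobar"))  # prints 1708600000000
```

With a layout like this, a developer would only need to name the database (the prefix) and never look up a dump ID by hand.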
Another problem: currently, if we want to isolate the dumps for specific databases, we have to take into account

I think we should do the following:

I think it should solve your UX problem.
Hi @wwoytenko, yes, this sounds like a valid option. For the use case I mentioned in the initial discussion, I think a command line flag

Additionally, I saw that there is already a merge request related to this, #62, which as far as I understand is a step toward this feature, which is great to see. Thanks for your comments.
Hey @flazzarini and @wwoytenko, while evaluating the possible impact of the implementations suggested in this discussion, I came across a concern: what should be the expected behavior of a command like

Considering this concern, imagine we implement the new suggested storage pattern:

Another concern: what if the user wants to restore a dump in the

With those concerns in mind, can you think of other commands that would be affected by this change? Is there already a behavior you have in mind for those commands?
Hello,
Firstly, I'd like to express my appreciation for this software. The progress achieved with Greenmask is commendable, and its current state is impressive.
I'm exploring a potential use case and seeking clarification on whether Greenmask supports it or if there are alternative approaches available. Here's the scenario:
Use Case Description
I manage a central PostgreSQL database server housing multiple databases. I aim to back up selected databases to a centralized S3 bucket using Greenmask and potentially anonymize certain data within these databases. This backup process should occur regularly, resulting in a collection of database dumps within the S3 bucket. Additionally, I plan to create READ-only access tokens on S3 so that developers from various application databases can restore specific databases to their local machines.
Configuration Files
Backup / Dump Process Configuration for DB-1 (dump-db1.yaml)
Backup / Dump Process Configuration for DB-2 (dump-db2.yaml)
Developer Configuration (developer.yaml)
Dump Process Execution
The following commands are executed regularly to generate new dumps on the S3 storage within the same bucket:
User Perspective
To restore the latest dump of a particular database, a user needs to list all available dumps and select the appropriate ID. For example:
Suppose a developer wishes to restore the latest dump of the 'foobaz' database. In that case, they would initiate a command like this:
Question
Is it feasible to introduce a command that simplifies this workflow? For instance, a way to indicate that the user intends to restore the latest dump of a specific database ('foobaz' in this case)?
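As a rough illustration of what such a shortcut would have to do, the manual "list, then pick an ID" step can be sketched as below. Everything here is hypothetical: the `DumpInfo` record, its fields, and the `"done"` status value are stand-ins for whatever metadata the dump listing exposes, not Greenmask's actual data model.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass(frozen=True)
class DumpInfo:
    dump_id: str     # hypothetical: ID as shown when listing dumps
    database: str    # hypothetical: name of the dumped database
    created_at: int  # hypothetical: creation time in Unix seconds
    status: str      # hypothetical: "done" or "failed"


def pick_latest(dumps: Iterable[DumpInfo], database: str) -> Optional[DumpInfo]:
    """Return the most recent successful dump of `database`, or None."""
    candidates = [
        d for d in dumps if d.database == database and d.status == "done"
    ]
    return max(candidates, key=lambda d: d.created_at, default=None)


# Made-up listing of a shared bucket holding dumps of two databases.
dumps = [
    DumpInfo("101", "foobar", 1_708_500_000, "done"),
    DumpInfo("102", "foobaz", 1_708_550_000, "done"),
    DumpInfo("103", "foobaz", 1_708_600_000, "failed"),
    DumpInfo("104", "foobar", 1_708_650_000, "done"),
]
latest = pick_latest(dumps, "foobaz")
print(latest.dump_id if latest else "no dump found")  # prints 102
```

Skipping failed dumps is a design choice in this sketch: the newest entry for 'foobaz' is ignored because restoring an incomplete dump would likely be worse than restoring a slightly older one.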
Thank you in advance for any suggestions on streamlining this process.