From 765b4058570504858adafc40328364d5c59d93fd Mon Sep 17 00:00:00 2001
From: Vadim Voitenko
Date: Sun, 13 Oct 2024 16:43:37 +0300
Subject: [PATCH] Doc: Added GA support to docs and upgraded mkdocs version

* Added cookie consent
* Upgraded mkdocs material to 9.5.40
* Added GA
* Added feedback question
* Added Google forms feedback form
* fixed docker pull command from ghcr.io in doc
* Moved documentation to docs.greenmask.io
* Revised index page - now it redirects to about.md
* Fixed broken links
---
 README.md                             | 22 ++++----
 docs/about.md                         | 75 +++++++++++++++++++++++++++
 docs/configuration.md                 |  2 +-
 docs/database_subset.md               |  2 +-
 docs/index.md                         | 75 ++-------------------------
 docs/installation.md                  |  2 +-
 docs/release_notes/greenmask_0_2_0.md |  2 +-
 mkdocs.yml                            | 43 +++++++++++++--
 requirements.txt                      |  4 ++
 9 files changed, 137 insertions(+), 90 deletions(-)
 create mode 100644 docs/about.md

diff --git a/README.md b/README.md
index c0cb429d..6700a557 100644
--- a/README.md
+++ b/README.md
@@ -23,7 +23,7 @@ backward-compatible with existing PostgreSQL utilities, fast and reliable.
 
 ## Getting started
 
-Greenmask has a [Playground](https://greenmask.io/latest/playground/) - it is a sandbox environment in Docker with
+Greenmask has a [Playground](https://docs.greenmask.io/latest/playground/) - it is a sandbox environment in Docker with
 sample databases included to help you try Greenmask without any additional actions
 
 1. Clone the `greenmask` repository and navigate to its directory by running the following commands:
@@ -40,20 +40,20 @@ sample databases included to help you try Greenmask without any additional actio
 
 ## Features
 
-* **[Deterministic transformers](https://greenmask.io/latest/built_in_transformers/transformation_engines/#hash-engine)**
+* **[Deterministic transformers](https://docs.greenmask.io/latest/built_in_transformers/transformation_engines/#hash-engine)**
   — deterministic approach to data transformation based on the hash
   functions. This ensures that the same input data will always produce the same output data. Almost each transformer
   supports either `random` or `hash` engine making it universal for any use case.
-* **[Dynamic parameters](https://greenmask.io/latest/built_in_transformers/dynamic_parameters/)** — almost each
+* **[Dynamic parameters](https://docs.greenmask.io/latest/built_in_transformers/dynamic_parameters/)** — almost each
   transformer supports dynamic parameters, allowing to parametrize the
   transformer dynamically from the table column value. This is helpful for resolving the functional dependencies
   between columns and satisfying the constraints.
-* **[Transformation validation and easy maintainable](https://greenmask.io/latest/commands/validate/)** - During
+* **[Transformation validation and easy maintainable](https://docs.greenmask.io/latest/commands/validate/)** - During
   configuration process, Greenmask provides validation
   warnings, data transformation diff and schema diff features, allowing you to monitor and maintain transformations
   effectively
   throughout the software lifecycle. Schema diff helps to avoid data leakage when schema changed.
-* **[Partitioned tables transformation inheritance](https://greenmask.io/latest/configuration/?h=partition#dump-section)**
+* **[Partitioned tables transformation inheritance](https://docs.greenmask.io/latest/configuration/?h=partition#dump-section)**
   — Define transformation configurations once and apply them to all
   partitions within partitioned tables (using `apply_for_inherited` parameter), simplifying the anonymization process.
 * **Stateless** - Greenmask operates as a logical dump and does not impact your existing database schema.
@@ -64,16 +64,16 @@ sample databases included to help you try Greenmask without any additional actio
 * **Backward compatible** - It fully supports the same features and protocols as existing vanilla PostgreSQL utilities.
   Dumps created by Greenmask can be successfully restored using the pg_restore utility.
 * **Extensible** - Users have the flexibility
-  to [implement domain-based transformations](https://greenmask.io/latest/built_in_transformers/standard_transformers/cmd/)
+  to [implement domain-based transformations](https://docs.greenmask.io/latest/built_in_transformers/standard_transformers/cmd/)
   in any programming language or
-  use [predefined templates](https://greenmask.io/latest/built_in_transformers/advanced_transformers/).
+  use [predefined templates](https://docs.greenmask.io/latest/built_in_transformers/advanced_transformers/).
 * **Integrable** - Integrate seamlessly into your CI/CD system for automated database anonymization and
   restoration.
 * **Parallel execution** - Take advantage of parallel dumping and restoration, significantly reducing the time required
   to deliver results.
 * **Provide variety of storages** - offers a variety of storage options for local and remote data storage,
   including directories and S3-like storage solutions.
-* **[Pgzip support for faster compression](https://greenmask.io/latest/commands/dump/?h=pgzip#pgzip-compression)** — by
+* **[Pgzip support for faster compression](https://docs.greenmask.io/latest/commands/dump/?h=pgzip#pgzip-compression)** — by
   setting `--pgzip`, it can speeds up the dump and restoration
   processes through parallel compression.
 
@@ -121,13 +121,13 @@ the schema and data transformations, ensuring the desired outcomes during the An
 #### Customization
 
 If your table schema relies on functional dependencies between columns, you can address this challenge using the
-[Dynamic parameters](https://greenmask.io/latest/built_in_transformers/dynamic_parameters/). By setting dynamic
+[Dynamic parameters](https://docs.greenmask.io/latest/built_in_transformers/dynamic_parameters/). By setting dynamic
 parameters, you can resolve such as created_at and updated_at cases, where the updated_at must be greater or equal than
 the created_at.
 
 If you need to implement custom logic imperatively use
-[TemplateRecord](https://greenmask.io/latest/built_in_transformers/advanced_transformers/template_record/) or
-[Template](https://greenmask.io/latest/built_in_transformers/advanced_transformers/template/) transformers.
+[TemplateRecord](https://docs.greenmask.io/latest/built_in_transformers/advanced_transformers/template_record/) or
+[Template](https://docs.greenmask.io/latest/built_in_transformers/advanced_transformers/template/) transformers.
 
 Greenmask provides a framework for creating your custom transformers, which can be reused efficiently. These
 transformers can be seamlessly integrated without requiring recompilation, thanks to the PIPE (stdin/stdout)
diff --git a/docs/about.md b/docs/about.md
new file mode 100644
index 00000000..89817aa9
--- /dev/null
+++ b/docs/about.md
@@ -0,0 +1,75 @@
+---
+hide:
+  - feedback
+---
+
+# About Greenmask
+
+## Dump anonymization and synthetic data generation tool
+
+**Greenmask** is a powerful open-source utility that is designed for logical database backup dumping,
+anonymization, synthetic data generation and restoration. It has ported PostgreSQL libraries, making it reliable.
+It is stateless and does not require any changes to your database schema. It is designed to be highly customizable and
+backward-compatible with existing PostgreSQL utilities, fast and reliable.
+
+
+## Key features
+
+* **[Deterministic transformers](built_in_transformers/transformation_engines.md/#hash-engine)**
+  — deterministic approach to data transformation based on the hash
+  functions. This ensures that the same input data will always produce the same output data. Almost each transformer
+  supports either `random` or `hash` engine making it universal for any use case.
+* **[Dynamic parameters](built_in_transformers/dynamic_parameters.md)** — almost each
+  transformer supports dynamic parameters, allowing to parametrize the
+  transformer dynamically from the table column value. This is helpful for resolving the functional dependencies
+  between columns and satisfying the constraints.
+* **[Transformation validation and easy maintainable](commands/validate.md)** - During
+  configuration process, Greenmask provides validation
+  warnings, data transformation diff and schema diff features, allowing you to monitor and maintain transformations
+  effectively
+  throughout the software lifecycle. Schema diff helps to avoid data leakage when schema changed.
+* **[Partitioned tables transformation inheritance](configuration.md/?h=partition#dump-section)**
+  — Define transformation configurations once and apply them to all
+  partitions within partitioned tables (using `apply_for_inherited` parameter), simplifying the anonymization process.
+* **Stateless** - Greenmask operates as a logical dump and does not impact your existing database schema.
+* **Cross-platform** - Can be easily built and executed on any platform, thanks to its Go-based architecture,
+  which eliminates platform dependencies.
+* **Database type safe** - Ensures data integrity by validating data and utilizing the database driver for
+  encoding and decoding operations. This approach guarantees the preservation of data formats.
+* **Backward compatible** - It fully supports the same features and protocols as existing vanilla PostgreSQL utilities.
+  Dumps created by Greenmask can be successfully restored using the pg_restore utility.
+* **Extensible** - Users have the flexibility
+  to [implement domain-based transformations](built_in_transformers/standard_transformers/cmd.md/)
+  in any programming language or
+  use [predefined templates](built_in_transformers/advanced_transformers/index.md).
+* **Integrable** - Integrate seamlessly into your CI/CD system for automated database anonymization and
+  restoration.
+* **Parallel execution** - Take advantage of parallel dumping and restoration, significantly reducing the time required
+  to deliver results.
+* **Provide variety of storages** - offers a variety of storage options for local and remote data storage,
+  including directories and S3-like storage solutions.
+* **[Pgzip support for faster compression](commands/dump.md/?h=pgzip#pgzip-compression)** — by
+  setting `--pgzip`, it can speeds up the dump and restoration
+  processes through parallel compression.
+
+
+## Use cases
+
+Greenmask is ideal for various scenarios, including:
+
+* **Backup and restoration**. Use Greenmask for your daily routines involving logical backup dumping and restoration. It
+  seamlessly handles tasks like table restoration after truncation. Its functionality closely mirrors that of pg_dump
+  and pg_restore, making it a straightforward replacement.
+* **Anonymization, transformation, and data masking**. Employ Greenmask for anonymizing, transforming, and masking
+  backups, especially when setting up a staging environment or for analytical purposes. It simplifies the deployment of
+  a pre-production environment with consistently anonymized data, facilitating faster time-to-market in the development
+  lifecycle.
+
+## Links
+
+* [Greenmask Roadmap](https://github.com/orgs/GreenmaskIO/projects/6)
+* [Email](mailto:support@greenmask.io)
+* [Twitter](https://twitter.com/GreenmaskIO)
+* [Telegram](https://t.me/greenmask_community)
+* [Discord](https://discord.gg/tAJegUKSTB)
+* [DockerHub](https://hub.docker.com/r/greenmask/greenmask)
diff --git a/docs/configuration.md b/docs/configuration.md
index 6215c71b..e88b4246 100644
--- a/docs/configuration.md
+++ b/docs/configuration.md
@@ -251,7 +251,7 @@ In the `restore` section of the configuration, you can specify parameters for th
 * `insert_error_exclusions` — a list of error codes that should be ignored during the restoration process. This is
   useful when you want to skip specific errors that are not critical for the restoration process.
 
-As mentioned in [the architecture](architecture.md/#backing-up), a backup contains three sections: pre-data, data, and post-data. The custom script execution allows you to customize and control the restoration process by executing scripts or commands at specific stages. The available restoration stages and their corresponding execution conditions are as follows:
+As mentioned in [the architecture](architecture.md/#backup-process), a backup contains three sections: pre-data, data, and post-data. The custom script execution allows you to customize and control the restoration process by executing scripts or commands at specific stages. The available restoration stages and their corresponding execution conditions are as follows:
 
 * `pre-data` — scripts or commands can be executed before or after restoring the pre-data section
 * `data` — scripts or commands can be executed before or after restoring the data section
diff --git a/docs/database_subset.md b/docs/database_subset.md
index 5ee2bf02..bcf649a2 100644
--- a/docs/database_subset.md
+++ b/docs/database_subset.md
@@ -167,7 +167,7 @@ section.
 !!! info
 
     If you find any issues related to the code or greenmask is not working as expected, do not hesitate to contact us
-    [directly](index.md#links) or by creating an [issue in the repository](https://github.com/GreenmaskIO/greenmask/issues).
+    [directly](about.md#links) or by creating an [issue in the repository](https://github.com/GreenmaskIO/greenmask/issues).
 
 ### ERROR: column reference "id" is ambiguous
 
diff --git a/docs/index.md b/docs/index.md
index 56f33dcf..55f6a033 100644
--- a/docs/index.md
+++ b/docs/index.md
@@ -1,71 +1,6 @@
-# About Greenmask
+---
+hide:
+  - feedback
+---
 
-**Greenmask** is a powerful open-source utility that is designed for logical database backup dumping,
-anonymization, and restoration. It offers extensive functionality for backup, anonymization, and data masking.
-
-Greenmask is written in pure Go and includes ported PostgreSQL libraries that allows for platform independence. This
-tool is stateless and does not require any changes to your database schema. It is designed to be highly customizable and
-backward-compatible with existing PostgreSQL utilities.
-
-## Purpose
-
-The Greenmask utility plays a central role in the Greenmask ecosystem. Our goal is to develop a comprehensive, UI-based
-solution for managing anonymization procedures. We recognize the challenges of maintaining anonymization consistency
-throughout the software lifecycle. Greenmask is dedicated to providing valuable tools and features that ensure the
-anonymization process remains fresh, predictable, and transparent.
-
-## Key features
-
-* **Database subset** - Dumps only the necessary data consistently based on the subset condition, reducing the size
-  of the dump and speeding up the restoration process.
-* **Deterministic transformers** — deterministic approach to data transformation based on the hash
-  functions. This ensures that the same input data will always produce the same output data. Almost each transformer
-  supports either `random` or `hash` engine making it universal for any use case.
-* **Dynamic parameters** — almost each transformer supports dynamic parameters, allowing to parametrize the
-  transformer dynamically from the table column value. This is helpful for resolving the functional dependencies
-  between columns and satisfying the constraints.
-* **Cross-platform** — can be easily built and executed on any platform, thanks to its Go-based architecture,
-  which eliminates platform dependencies.
-* **Database type safe** — ensures data integrity by validating data and utilizing the database driver for
-  encoding and decoding operations. This approach guarantees the preservation of data formats.
-* **Transformation validation and easy maintainable** — during anonymization development, Greenmask provides validation
-  warnings and a transformation diff feature, allowing you to monitor and maintain transformations effectively
-  throughout the software lifecycle.
-* **Partitioned tables transformation inheritance** — define transformation configurations once and apply them to all
-  partitions within partitioned tables, simplifying the anonymization process.
-* **Stateless** — Greenmask operates as a logical dump and does not impact your existing database schema.
-* **Backward compatible** — it fully supports the same features and protocols as existing vanilla PostgreSQL utilities.
-  Dumps created by Greenmask can be successfully restored using the pg_restore utility.
-* **Extensible** — users have the flexibility to implement domain-based transformations in any programming language or
-  use predefined templates.
-* **Declarative** — Greenmask allows you to define configurations in a structured, easily parsed, and recognizable
-  format.
-* **Integrable** — integrate Greenmask seamlessly into your CI/CD system for automated database anonymization and
-  restoration.
-* **Parallel execution** — take advantage of parallel dumping and restoration, significantly reducing the time required
-  to deliver results.
-* **Provide variety of storages** — Greenmask offers a variety of storage options for local and remote data storage,
-  including directories and S3-like storage solutions.
-* **Pgzip support for faster compression** — by setting `--pgzip`, greenmask can speeds up the dump and restoration
-  processes through parallel compression.
-
-## Use cases
-
-Greenmask is ideal for various scenarios, including:
-
-* **Backup and restoration**. Use Greenmask for your daily routines involving logical backup dumping and restoration. It
-  seamlessly handles tasks like table restoration after truncation. Its functionality closely mirrors that of pg_dump
-  and pg_restore, making it a straightforward replacement.
-* **Anonymization, transformation, and data masking**. Employ Greenmask for anonymizing, transforming, and masking
-  backups, especially when setting up a staging environment or for analytical purposes. It simplifies the deployment of
-  a pre-production environment with consistently anonymized data, facilitating faster time-to-market in the development
-  lifecycle.
-
-## Links
-
-* [Greenmask Roadmap](https://github.com/orgs/GreenmaskIO/projects/6)
-* [Email](mailto:support@greenmask.io)
-* [Twitter](https://twitter.com/GreenmaskIO)
-* [Telegram](https://t.me/greenmask_community)
-* [Discord](https://discord.gg/tAJegUKSTB)
-* [DockerHub](https://hub.docker.com/r/greenmask/greenmask)
+
diff --git a/docs/installation.md b/docs/installation.md
index e1506e1c..29b64cdd 100644
--- a/docs/installation.md
+++ b/docs/installation.md
@@ -22,7 +22,7 @@ docker run -it greenmask/greenmask:latest
 To run the greenmask container from Github registry, use the following command:
 
 ```shell
-docker run -it ghcr.io/GreenmaskIO/greenmask:latest
+docker run -it ghcr.io/greenmaskio/greenmask:latest
 ```
 
 !!! info
diff --git a/docs/release_notes/greenmask_0_2_0.md b/docs/release_notes/greenmask_0_2_0.md
index f5072614..32c00721 100644
--- a/docs/release_notes/greenmask_0_2_0.md
+++ b/docs/release_notes/greenmask_0_2_0.md
@@ -37,7 +37,7 @@ security.
   recursive query for the SCC whether it is a single cycle or multiple cycles, making the subset system universal for
   any database schema.
 * **Supports polymorphic relationships** - You can define
-  a [virtual reference for a table with polymorphic references]((../database_subset.md/#troubleshooting))
+  a [virtual reference for a table with polymorphic references](../database_subset.md/#troubleshooting)
   using `polymorphic_exprs` attribute and use greenmask to generate a subset for such tables.
 * **pgzip** support for faster [compression](../commands/dump.md/#pgzip-compression)
 
diff --git a/mkdocs.yml b/mkdocs.yml
index 20c48770..38006dfb 100644
--- a/mkdocs.yml
+++ b/mkdocs.yml
@@ -1,4 +1,4 @@
-site_name: Greenmask — PostgreSQL masking and obfuscation tool
+site_name: Greenmask — PostgreSQL database anonymization and synthetic data generation tool
 
 # Theme
 theme:
@@ -43,8 +43,6 @@ markdown_extensions:
   - tables
 
 nav:
-  - Home:
-    - Home: index.md
   - Documentation:
     - Architecture: architecture.md
     - Playground: playground.md
@@ -124,6 +122,8 @@ nav:
           - built_in_transformers/advanced_transformers/custom_functions/index.md
           - Core custom functions: built_in_transformers/advanced_transformers/custom_functions/core_functions.md
           - Faker function: built_in_transformers/advanced_transformers/custom_functions/faker_function.md
+
+  - About: about.md
   - Release notes:
     - Greenmask 0.2.0: release_notes/greenmask_0_2_0.md
     - Greenmask 0.2.0b2: release_notes/greenmask_0_2_0_b2.md
@@ -147,18 +147,51 @@ nav:
 
 repo_url: https://github.com/GreenmaskIO/greenmask
 repo_name: GreenmaskIO/greenmask
-site_url: https://greenmask.io/
+site_url: https://docs.greenmask.io/
 copyright: Copyright © 2024 Greenmask
 
 extra:
+  consent:
+    title: Cookie consent
+    description: >-
+      We use cookies to recognize your repeated visits and preferences, as well
+      as to measure the effectiveness of our documentation and whether users
+      find what they're searching for. With your consent, you're helping us to
+      make our documentation better.
+  analytics:
+    provider: google
+    property: G-1LGGK7P1GD
+
+    feedback:
+      title: Was this page helpful?
+      ratings:
+        - icon: material/emoticon-happy-outline
+          name: This page was helpful
+          data: 1
+          note: >-
+            Thanks for your feedback!
+        - icon: material/emoticon-sad-outline
+          name: This page could be improved
+          data: 0
+          note: >-
+            Thanks for your feedback! Help us improve this page by using our feedback form
+
+
+
   version:
     provider: mike
   social:
     - icon: fontawesome/brands/x-twitter
       link: https://twitter.com/GreenmaskIO
     - icon: fontawesome/brands/discord
-      link: https://discord.gg/97AKHdGD
+      link: https://discord.com/invite/rKBKvDECfd
     - icon: fontawesome/brands/github
       link: https://github.com/GreenmaskIO/greenmask
diff --git a/requirements.txt b/requirements.txt
index f4321d34..c3b74265 100644
--- a/requirements.txt
+++ b/requirements.txt
@@ -12,6 +12,7 @@ defusedxml==0.7.1
 ghp-import==2.1.0
 gitdb==4.0.11
 GitPython==3.1.43
+hjson==3.1.0
 idna==3.10
 importlib_metadata==8.5.0
 importlib_resources==6.4.5
@@ -26,6 +27,7 @@ mkdocs-get-deps==0.2.0
 mkdocs-git-authors-plugin==0.9.0
 mkdocs-git-committers-plugin-2==2.4.1
 mkdocs-git-revision-date-localized-plugin==1.2.9
+mkdocs-macros-plugin==1.3.5
 mkdocs-material==9.5.40
 mkdocs-material-extensions==1.3.1
 packaging==24.1
@@ -46,6 +48,8 @@ requests==2.32.3
 six==1.16.0
 smmap==5.0.1
 soupsieve==2.6
+super_collections==0.5.3
+termcolor==2.5.0
 tinycss2==1.3.0
 urllib3==2.2.3
 verspec==0.1.0