Commit

Docs: Revised README.md
* Added greenmask design schema
* Revised README.md content
* Added Getting started section
* Added links to docs
* Added badges
* Replace obfuscation keyword to anonymization
wwoytenko committed Oct 13, 2024
1 parent 7e62488 commit afe1315
Showing 8 changed files with 21 additions and 21 deletions.
12 changes: 6 additions & 6 deletions README.md
@@ -33,7 +33,7 @@ backward-compatible with existing PostgreSQL utilities, fast and reliable.
throughout the software lifecycle. Schema diff helps to avoid data leakage when schema changed.
* **[Partitioned tables transformation inheritance](https://greenmask.io/latest/configuration/?h=partition#dump-section)**
— Define transformation configurations once and apply them to all
-partitions within partitioned tables (using `apply_for_inherited` parameter), simplifying the obfuscation process.
+partitions within partitioned tables (using `apply_for_inherited` parameter), simplifying the anonymization process.
* **Stateless** - Greenmask operates as a logical dump and does not impact your existing database schema.
* **Cross-platform** - Can be easily built and executed on any platform, thanks to its Go-based architecture,
which eliminates platform dependencies.
@@ -45,7 +45,7 @@ backward-compatible with existing PostgreSQL utilities, fast and reliable.
to [implement domain-based transformations](https://greenmask.io/latest/built_in_transformers/standard_transformers/cmd/)
in any programming language or
use [predefined templates](https://greenmask.io/latest/built_in_transformers/advanced_transformers/).
-* **Integrable** - Integrate seamlessly into your CI/CD system for automated database obfuscation and
+* **Integrable** - Integrate seamlessly into your CI/CD system for automated database anonymization and
restoration.
* **Parallel execution** - Take advantage of parallel dumping and restoration, significantly reducing the time required
to deliver results.
@@ -96,7 +96,7 @@ maintaining seamless integration with PostgreSQL's standard tools.

Greenmask uses the **directory format** of _pg_dump_ and _pg_restore_. This format is particularly suitable for
parallel execution and partial restoration, and it includes clear metadata files that aid in determining the backup and
-restoration steps. Greenmask has been optimized to work seamlessly with remote storage systems and obfuscation
+restoration steps. Greenmask has been optimized to work seamlessly with remote storage systems and anonymization
procedures.

#### Storage Options
@@ -105,12 +105,12 @@ procedures.
various cloud-based storage solutions.
* **directory** - This is the standard choice, representing the ordinary filesystem directory for local storage.

-## Data Obfuscation and Validation
+## Data Anonymization and Validation

Greenmask works with **COPY lines**, collects schema metadata using the Golang driver, and employs this driver in the
encoding and decoding process. The **validate command** offers a way to assess the impact on both schema
(**validation warnings**) and data (**transformation and displaying differences**). This command allows you to validate
-the schema and data transformations, ensuring the desired outcomes during the obfuscation process.
+the schema and data transformations, ensuring the desired outcomes during the Anonymization process.

## Customization

@@ -128,7 +128,7 @@ transformers can be seamlessly integrated without requiring recompilation, thank
interaction.

Furthermore, Greenmask's architecture is designed to be highly extensible, making it possible to introduce other
-interaction protocols, such as HTTP or Socket, for conducting obfuscation procedures.
+interaction protocols, such as HTTP or Socket, for conducting anonymization procedures.
## PostgreSQL Version Compatibility
8 changes: 4 additions & 4 deletions docs/architecture.md
@@ -15,7 +15,7 @@ The process of backing up PostgreSQL databases is divided into three distinct se
Greenmask focuses exclusively on the data section during runtime. It delegates the handling of the `pre-data` and `post-data` sections to the core PostgreSQL utilities, `pg_dump` and `pg_restore`.

Greenmask employs the directory format of `pg_dump` and `pg_restore`. This format is particularly suitable for
-parallel execution and partial restoration, and it includes clear metadata files that aid in determining the backup and restoration steps. Greenmask has been optimized to work seamlessly with remote storage systems and obfuscation procedures.
+parallel execution and partial restoration, and it includes clear metadata files that aid in determining the backup and restoration steps. Greenmask has been optimized to work seamlessly with remote storage systems and anonymization procedures.

When performing data dumping, Greenmask utilizes the COPY command in TEXT format, maintaining reliability and
compatibility with the vanilla PostgreSQL utilities.
@@ -39,10 +39,10 @@ In the restoration process, Greenmask combines the capabilities of different too

Greenmask also supports **parallel restoration**, which can significantly reduce the time required to complete the restoration process. This parallel execution enhances the efficiency of restoring large datasets.

-## Data obfuscation and validation
+## Data anonymization and validation

Greenmask works with COPY lines, collects schema metadata using the Golang driver, and employs this driver in the encoding and decoding process. The **validate command** offers a way to assess the impact on both schema
-(**validation warnings**) and data (**transformation and displaying differences**). This command allows you to validate the schema and data transformations, ensuring the desired outcomes during the obfuscation process.
+(**validation warnings**) and data (**transformation and displaying differences**). This command allows you to validate the schema and data transformations, ensuring the desired outcomes during the anonymization process.

## Customization

@@ -53,7 +53,7 @@ transformers can be seamlessly integrated without requiring recompilation, thank
interaction.

!!! note
-Furthermore, Greenmask's architecture is designed to be highly extensible, making it possible to introduce other interaction protocols, such as HTTP or Socket, for conducting obfuscation procedures.
+Furthermore, Greenmask's architecture is designed to be highly extensible, making it possible to introduce other interaction protocols, such as HTTP or Socket, for conducting anonymization procedures.

## PostgreSQL version compatibility

2 changes: 1 addition & 1 deletion docs/built_in_transformers/advanced_transformers/index.md
@@ -1,6 +1,6 @@
# Advanced transformers

-Advanced transformers are modifiable obfuscation methods that users can adjust based on their needs by using [custom functions](custom_functions/index.md).
+Advanced transformers are modifiable anonymization methods that users can adjust based on their needs by using [custom functions](custom_functions/index.md).

Below you can find an index of all advanced transformers currently available in Greenmask.

2 changes: 1 addition & 1 deletion docs/built_in_transformers/index.md
@@ -1,6 +1,6 @@
# About transformers

-Transformers in Greenmask are methods which are applied to obfuscate sensitive data. All Greenmask transformers are
+Transformers in Greenmask are methods which are applied to anonymize sensitive data. All Greenmask transformers are
split into the following groups:

- [Transformation engines](transformation_engines.md) — the type of generator used in transformers. Hash (deterministic)
2 changes: 1 addition & 1 deletion docs/built_in_transformers/standard_transformers/hash.md
@@ -6,7 +6,7 @@ Generate a hash of the text value using the `Scrypt` hash function under the hoo
|------------|---------------------------------------------------------------------------------------------------------------------------------------|---------|----------|--------------------|
| column | The name of the column to be affected | | Yes | text, varchar |
| salt | Hex encoded salt string. This value may be provided via environment variable `GREENMASK_GLOBAL_SALT` | | Yes | text, varchar |
-| function | Hash algorithm to obfuscate data. Can be any of `md5`, `sha1`, `sha256`, `sha512`, `sha3-224`, `sha3-254`, `sha3-384`, `sha3-512`. | `sha1` | No | - |
+| function | Hash algorithm to anonymize data. Can be any of `md5`, `sha1`, `sha256`, `sha512`, `sha3-224`, `sha3-254`, `sha3-384`, `sha3-512`. | `sha1` | No | - |
| max_length | Indicates whether to truncate the hash tail and specifies at what length. Can be any integer number, where `0` means "no truncation". | `0` | No | - |

## Example: Generate hash from job title
2 changes: 1 addition & 1 deletion docs/database_subset.md
@@ -25,7 +25,7 @@ subset_conds:
## Use cases
-* Database scale down - create obfuscated dump but for the limited and consistent set of tables
+* Database scale down - create anonymized dump but for the limited and consistent set of tables
* Data migration - migrate only some records from one database to another
* Data anonymization - dump and anonymize only a specific records in the database
* Database catchup - catchup your another instance of database logically by adding a new records. In this case it
12 changes: 6 additions & 6 deletions docs/index.md
@@ -1,7 +1,7 @@
# About Greenmask

**Greenmask** is a powerful open-source utility that is designed for logical database backup dumping,
-obfuscation, and restoration. It offers extensive functionality for backup, anonymization, and data masking.
+anonymization, and restoration. It offers extensive functionality for backup, anonymization, and data masking.

Greenmask is written in pure Go and includes ported PostgreSQL libraries that allows for platform independence. This
tool is stateless and does not require any changes to your database schema. It is designed to be highly customizable and
@@ -10,9 +10,9 @@ backward-compatible with existing PostgreSQL utilities.
## Purpose

The Greenmask utility plays a central role in the Greenmask ecosystem. Our goal is to develop a comprehensive, UI-based
-solution for managing obfuscation procedures. We recognize the challenges of maintaining obfuscation consistency
+solution for managing anonymization procedures. We recognize the challenges of maintaining anonymization consistency
throughout the software lifecycle. Greenmask is dedicated to providing valuable tools and features that ensure the
-obfuscation process remains fresh, predictable, and transparent.
+anonymization process remains fresh, predictable, and transparent.

## Key features

@@ -28,19 +28,19 @@ obfuscation process remains fresh, predictable, and transparent.
which eliminates platform dependencies.
* **Database type safe** — ensures data integrity by validating data and utilizing the database driver for
encoding and decoding operations. This approach guarantees the preservation of data formats.
-* **Transformation validation and easy maintainable** — during obfuscation development, Greenmask provides validation
+* **Transformation validation and easy maintainable** — during anonymization development, Greenmask provides validation
warnings and a transformation diff feature, allowing you to monitor and maintain transformations effectively
throughout the software lifecycle.
* **Partitioned tables transformation inheritance** — define transformation configurations once and apply them to all
-partitions within partitioned tables, simplifying the obfuscation process.
+partitions within partitioned tables, simplifying the anonymization process.
* **Stateless** — Greenmask operates as a logical dump and does not impact your existing database schema.
* **Backward compatible** — it fully supports the same features and protocols as existing vanilla PostgreSQL utilities.
Dumps created by Greenmask can be successfully restored using the pg_restore utility.
* **Extensible** — users have the flexibility to implement domain-based transformations in any programming language or
use predefined templates.
* **Declarative** — Greenmask allows you to define configurations in a structured, easily parsed, and recognizable
format.
-* **Integrable** — integrate Greenmask seamlessly into your CI/CD system for automated database obfuscation and
+* **Integrable** — integrate Greenmask seamlessly into your CI/CD system for automated database anonymization and
restoration.
* **Parallel execution** — take advantage of parallel dumping and restoration, significantly reducing the time required
to deliver results.
2 changes: 1 addition & 1 deletion docs/release_notes/greenmask_0_1_0_beta.md
@@ -1,6 +1,6 @@
# Greenmask 0.0.1 Beta

-We are excited to announce the beta release of Greenmask, a versatile and open-source utility for PostgreSQL logical backup dumping, obfuscation, and restoration. Greenmask is perfect for routine backup and restoration tasks. It facilitates anonymization and data masking for staging environments and analytics.
+We are excited to announce the beta release of Greenmask, a versatile and open-source utility for PostgreSQL logical backup dumping, anonymization, and restoration. Greenmask is perfect for routine backup and restoration tasks. It facilitates anonymization and data masking for staging environments and analytics.

This release introduces a range of features aimed at enhancing database management and security.
