Releases: GreenmaskIO/greenmask

v0.2.6

07 Dec 19:01
cbd4b9b

Greenmask 0.2.6

This release introduces new features and bug fixes.

Changes

  • Introduces the --disable-triggers, --use-session-replication-role-replica and --superuser options
    for the restore command. They allow triggers to be disabled during data section restore #252. Closes feature request #228
  • Fix skipping unknown type when silent is true #251
  • Added sonar qube quality gate badge #250
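
To illustrate what the new restore options control: PostgreSQL has two standard ways to keep ordinary triggers from firing while data is loaded, `ALTER TABLE ... DISABLE TRIGGER ALL` per table and `SET session_replication_role = 'replica'` for the session. The sketch below only builds the SQL text. The helper name `wrap_data_load` and the way the statements wrap a load are assumptions made for illustration, not Greenmask's implementation.

```python
def wrap_data_load(table, load_sql, use_replica_role=False):
    """Return the statements to run, in order, so triggers stay off during the load.

    Hypothetical helper: it mirrors the two PostgreSQL mechanisms, not Greenmask's code.
    """
    if use_replica_role:
        # Session-wide switch; 'origin' is PostgreSQL's default value.
        before = ["SET session_replication_role = 'replica';"]
        after = ["SET session_replication_role = 'origin';"]
    else:
        # Per-table switch, re-enabled once the data is loaded.
        before = [f"ALTER TABLE {table} DISABLE TRIGGER ALL;"]
        after = [f"ALTER TABLE {table} ENABLE TRIGGER ALL;"]
    return before + [load_sql] + after


for stmt in wrap_data_load("public.users", "COPY public.users FROM STDIN;"):
    print(stmt)
```

Both mechanisms normally need elevated privileges in PostgreSQL, which is likely why a --superuser option was introduced alongside them.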

Full Changelog: v0.2.5...v0.2.6

Contributors

@tsg
@wwoytenko

Special thanks

@N-Putting

Links

Feel free to reach out to us if you have any questions or need assistance:

v0.2.5

16 Nov 21:29

Greenmask 0.2.5

This release introduces bug fixes.

Changes

  • Fixed a bug where a subset query was not generated when provided #247. The problem appeared after the RuntimeContext refactoring in v0.2.1. Now covered by a regression test.

Contributors

@wwoytenko

Full Changelog: v0.2.4...v0.2.5

v0.2.4

16 Nov 20:12
09be53a

Greenmask 0.2.4

This release introduces bug fixes.

Changes

  • Fixed a bug #244 that caused incorrect subset and transformer inheritance behavior. See the merge request #245.

Full Changelog: v0.2.3...v0.2.4

Contributors

@janmeier
@wwoytenko

v0.2.3

12 Nov 19:30
1dae625

Greenmask 0.2.3

This release introduces bug fixes.

Changes

  • Fixed an issue where the partitioned table itself was executed in the restore worker, resulting in a "file not found"
    error in storage. Closes bug: restoring partitioned tables
    fails #238 #242.
  • Fixed template function availability #239. Renamed methods
    according to the documentation: GetColumnRawValue is now GetRawColumnValue, and SetColumnRawValue is now
    SetRawColumnValue #242
  • Resolved an issue where Dump.createTocEntries processed partitioned tables as if they were physical entities, despite
    being logical #241
  • Corrected merging in the pre-data, data, and post-data sections, which previously caused a panic in dump command when
    the post-data section was excluded #241
  • Fixed an issue where dumps created with --load-via-partition-root did not use the root partition table in --inserts
    generation during restoration #241

Full Changelog: v0.2.2...v0.2.3

Contributors

@wwoytenko

Special thanks

@janmeier

v0.2.2

09 Nov 17:42
7119261

Greenmask 0.2.2

This release introduces bug fixes.

Changes

  • Fixed a case where apply_for_references validation was applied to all transformations, even those not
    marked as apply_for_references #236.
  • Fixed issue with the latest tag disappearing in the documentation #234.

Full Changelog: v0.2.1...v0.2.2

Contributors

@wwoytenko
@tarbaev-vl

Special thanks

@janmeier

v0.2.1

04 Nov 18:23
9674943

Greenmask 0.2.1

This release introduces two new features: transformation conditions and transformation inheritance for primary and foreign keys. It also includes several bug fixes and improvements.

Changes

  • Feat: Transformation conditions - execute a defined transformation only if a specified condition is met. #133
  • Feat: Transformation inheritance - transformation inheritance for partitioned tables and tables with foreign keys. Define once and apply to all. #229
  • CI/CD: Add golangci-lint job to pull request check #223
  • CI/CD: Deploy development version of the documentation (main branch) and divided jobs into separate blocks and made them reusable #212
  • Fix: Bump go and python dependencies #219
  • Fix: Fatal validation error in playground #224
  • Fix: Code refactoring and fixes for golangci-lint warnings #226
  • Docs: Revised README.md - added badges, updated the description, added getting started section, added greenmask design
    schema #216 #217 #218
  • Docs: main page errors in docs #221
  • Docs: Revised README.md according to the latest changes #225
  • Docs: moved documentation to docs.greenmask.io, added feedback form and GA integration #220
  • Docs: Fixed typo in subset documentation #211

Contributors

@wwoytenko
@tarbaev-vl
@JPrisk

Special thanks

@janmeier
@gregwebs
@SiPaff

Full Changelog: v0.2.0...v0.2.1

v0.2.0

09 Oct 17:40
9e5ca1c

Greenmask 0.2.0

This is one of the biggest releases since Greenmask was founded. We've been in close contact with our users, gathering feedback, and working hard to make Greenmask more flexible, reliable, and user-friendly.

This major release introduces exciting new features such as database subsetting, pgzip support, restoration in topological order, and refactored transformers, significantly enhancing Greenmask's flexibility to better meet business needs. It also includes several fixes and improvements.

Preface

This release is a major milestone that significantly expands Greenmask's functionality, transforming it into a simple, extensible, and reliable solution for database security, data anonymization, and everyday operations. Our goal is to create a core system that can serve as a foundation for comprehensive dynamic staging environments and robust data security.

Notable changes

  • PostgreSQL 17 support - revised ported library to support PostgreSQL 17

  • Database Subset - a new feature that allows you to define a subset of the database, allowing you to scale down the dump size (#110). It serves many purposes and is especially useful for testing and development environments. It supports:

    • References with NULL values - generate the LEFT JOIN query for the FK reference with NULL values to include them in the subset.
    • Supports virtual references (virtual foreign keys) - create a logical FK in Greenmask that will be used for the subset dependency graph. The virtual reference can be defined for a column or an expression, allowing you to get the value from JSON and similar.
    • Supports circular references - Greenmask will automatically resolve
      circular dependencies in the subset by generating a recursive query. The query is generated with integrity checks of the subset ensuring that the data gathered from circular dependencies is consistent.
    • Fully covered with documentation including troubleshooting and examples.
    • Supports FK and PK that have more than one column (or expression).
    • Multi-cycle resolution in one strongly connected component (SCC) is supported - Greenmask will generate a recursive query for the SCC whether it is a single cycle or multiple cycles, making the subset system universal for any database schema.
    • Supports polymorphic relationships - You can define a virtual reference for a table with polymorphic references using polymorphic_exprs attribute and use greenmask to generate a subset for such tables.
  • pgzip support for faster compression and decompression (https://docs.greenmask.io/v0.2.0/commands//restore#pgzip-decompression) - setting --pgzip can speed up the dump and restoration processes through parallel compression. In some tests, it shows up to 5x faster dump and restore operations.

  • Restoration in topological order - This flag ensures that dependent tables are not restored until the tables they depend on have been restored. This is useful when you want to be notified of errors as early as possible, without waiting for the entire table to be restored.

  • Insert format restoration - For a flexible restoration process, Greenmask now supports data restoration in the INSERT format. It generates the insert statements based on COPY records from the dump. You do not need to re-dump your data to use this feature; it can be defined in the restore command. The list of new features related to the INSERT format:

    • Generate INSERT statements with the ON CONFLICT DO NOTHING clause if the flag --on-conflict-do-nothing is set.
    • Error exclusion list in the config to skip certain errors and continue inserting subsequent rows from the dump.
    • Use cases - incremental dump and restoration for logical data. For example, if you have a database, and you want to insert data periodically from another source, this can be used together with the database subset and transformations to catch up the target database.
  • Restore data batching (#173) - By default, the COPY protocol returns the error only on transaction commit. To override this behavior, use the --batch-size flag to specify the number of rows to insert in a single batch during the COPY command. This is useful when you want to control the transaction size and commit.

  • Introduced keep_null parameter for RandomPerson transformer.

  • Introduced dynamic parameters in the transformers

    • Most transformers now support dynamic parameters where applicable.
    • Dynamic parameters are strictly enforced. If you need to cast values to another type, Greenmask provides templates and predefined cast functions accessible via cast_to. These functions cover frequent operations such as UnixTimestampToDate and IntToBool.
  • The transformation logic has been significantly refactored, making transformers more customizable and flexible than before.

  • Introduced transformation engines

    • random - generates transformer values based on pseudo-random algorithms.
    • hash - generates transformer values using hash functions. Currently, it utilizes sha3 hash functions, which are secure but perform slowly. In the stable release, there will be an option to choose between sha3 and SipHash.
  • Introduced static parameters value template

  • Dumps retention management - Introduced retention parameters (#201) for the delete command. Introduced two new statuses: failed and in progress. A dump is considered failed if it lacks a "done" heartbeat or if its last heartbeat is more than 30 minutes old. The delete command now supports the following retention parameters:

    • --dry-run: Runs the deletion operation in test mode with verbose output, without actually deleting anything.
    • --before-date 2024-08-27T23:50:54+00:00: Deletes dumps older than the specified date. The date must be provided
      in RFC3339Nano format, for example: 2021-01-01T00:00:00Z.
    • --retain-recent 10: Retains the N most recent dumps, where N is specified by the user.
    • --retain-for 1w2d3h4m5s6ms7us8ns: Retains dumps for the specified duration. The format supports weeks (w), days (d), hours (h), minutes (m), seconds (s), milliseconds (ms), microseconds (us), and nanoseconds (ns).
    • --prune-failed: Prunes (removes) all dumps that have failed.
    • --prune-unsafe: Prunes dumps with "unknown-or-failed" statuses. This option only works in conjunction with --prune-failed.
  • Docker image mirroring into the GitHub Container Registry
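
The --retain-for duration format listed above (weeks down to nanoseconds) can be parsed along the following lines. This is a minimal sketch under stated assumptions: the function names `parse_retention` and `dumps_older_than` are hypothetical, and Greenmask's own parser may accept or reject inputs differently.

```python
import re
from datetime import datetime, timedelta, timezone

# Nanoseconds per unit; units are the ones named in the release notes.
_UNIT_NS = {
    "w": 7 * 24 * 3600 * 10**9,
    "d": 24 * 3600 * 10**9,
    "h": 3600 * 10**9,
    "m": 60 * 10**9,
    "s": 10**9,
    "ms": 10**6,
    "us": 10**3,
    "ns": 1,
}
# Two-letter units come first so "6ms" is not read as "6m" followed by "s".
_UNITS = r"(?:ns|us|ms|w|d|h|m|s)"
_TOKEN = re.compile(r"(\d+)(ns|us|ms|w|d|h|m|s)")


def parse_retention(text):
    """Convert a string such as '1w2d3h' into a total number of nanoseconds."""
    if not re.fullmatch(rf"(?:\d+{_UNITS})+", text):
        raise ValueError(f"invalid duration: {text!r}")
    return sum(int(amount) * _UNIT_NS[unit] for amount, unit in _TOKEN.findall(text))


def dumps_older_than(dump_times, now, retain_for):
    """Return the dump timestamps that fall outside the retention window."""
    # timedelta has microsecond resolution, so sub-microsecond parts are dropped.
    cutoff = now - timedelta(microseconds=parse_retention(retain_for) // 1000)
    return [t for t in dump_times if t < cutoff]


now = datetime(2024, 8, 27, tzinfo=timezone.utc)
dumps = [now - timedelta(days=10), now - timedelta(days=1)]
print(dumps_older_than(dumps, now, "1w"))
```

Keeping the total as an integer number of nanoseconds avoids losing the ns component, which Python's timedelta cannot represent.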

Core

  • Introduced the Parametrizer interface, now implemented for both dynamic and static parameters.
  • Renamed most of the toolkit types for enhanced clarity and comprehensive documentation coverage.
  • Refactored the Driver initialization logic.
  • Added validation warnings for overridden types in the Driver.
  • Migrated existing built-in transformers to utilize the new Parametrizer interface.
  • Implemented a new abstraction, TransformationContext, as the first step towards enabling the new transformation conditions feature (#34).
  • Optimized most transformers for performance in both dynamic and static modes. While dynamic mode offers flexibility, static mode ensures performance remains high. Using only the necessary transformation features helps keep transformation time predictable.

Transformers

  • RandomEmail - Introduces a new transformer that supports both random and deterministic engines. It allows for flexible email value generation; you can use column values in the template and choose to keep the original domain or select any from the domains parameter.

  • NoiseDate, NoiseFloat, NoiseInt - These transformers support both random and deterministic engines, offering dynamic mode parameters that control the noise thresholds within the min and max range. Unlike previous implementations which used a single ratio parameter, the new release features min_ratio and max_ratio parameters to define noise values more precisely. Utilizing the hash engine in these transformers enhances security by complicating statistical analysis for attackers, especially when the same sal...

v0.2.0b2

30 Aug 11:14
09e8f3b
Pre-release

Greenmask 0.2.0b2

This major beta release introduces new features such as the database subset, pgzip support, restoration in
topological order, and many more. It also includes fixes and improvements.

Preface

This release is a major milestone that significantly expands Greenmask's functionality, transforming it into a simple,
extensible, and reliable solution for database security, data anonymization, and everyday operations. Our goal is to
create a core system that can serve as a foundation for comprehensive dynamic staging environments and robust data
security.

Latest documentation version

Notable changes

  • Database Subset - a new feature that allows you to define a subset of the database,
    allowing you to scale down the dump size (#110). It serves many purposes
    and is especially useful for testing and development environments. It supports:

    • References with NULL values - generate the LEFT JOIN query
      for the FK reference with NULL values to include them in the subset.
    • Supports virtual references (virtual foreign keys) - create a logical
      FK in Greenmask that will be used for the subset dependency graph. The virtual reference can be defined for a column
      or an expression, allowing you to get the value from JSON and similar.
    • Supports circular references - Greenmask will automatically resolve
      circular dependencies in the subset by generating a recursive query. The query is generated with integrity checks
      of the subset ensuring that the data gathered from circular dependencies is consistent.
    • Fully covered with documentation including troubleshooting
      and examples.
    • Supports FK and PK that have more than one column (or expression).
    • Multi-cycle resolution in one strongly connected component (SCC) is supported - Greenmask will generate a
      recursive query for the SCC whether it is a single cycle or multiple cycles, making the subset system universal
      for any database schema.
  • pgzip support for faster compression
    and decompression — setting --pgzip can speed up the dump and
    restoration processes through parallel compression. In some tests, it shows up to 5x faster dump and restore
    operations.

  • Restoration in topological order - This flag ensures
    that dependent tables are not restored until the tables they depend on have been restored. This is useful when you
    want to be notified of errors as early as possible, without waiting for the entire table to be restored.

  • Insert format restoration - For a flexible restoration
    process, Greenmask now supports data restoration in the INSERT format. It generates the insert statements based on
    COPY records from the dump. You do not need to re-dump your data to use this feature; it can be defined in the
    restore command. The list of new features related to the INSERT format:

    • Generate INSERT statements with the ON CONFLICT DO NOTHING clause if the flag --on-conflict-do-nothing
      is set.
    • Error exclusion list in the config to skip
      certain errors and continue inserting subsequent rows from the dump.
    • Use cases - incremental dump and restoration for logical data. For example, if you have a database, and you
      want to insert data periodically from another source, this can be used together with the database subset and
      transformations to catch up the target database.
  • Restore data batching (#173) -
    By default, the COPY protocol returns the error only on the transaction commit. To override this behavior, use the
    --batch-size flag to specify the number of rows to insert in a single batch during the COPY command. This is useful
    when you want to control the transaction size and commit.

  • Introduced keep_null parameter for RandomPerson transformer.
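
The idea behind restoration in topological order can be sketched with Python's standard graphlib module: each table is restored only after every table it references. The table names and the `restore_order` helper are hypothetical examples, and this is not Greenmask's scheduling code.

```python
from graphlib import TopologicalSorter


def restore_order(depends_on):
    """Order tables so that referenced tables come before the tables that reference them.

    depends_on maps each table to the set of tables it references through foreign keys.
    """
    return list(TopologicalSorter(depends_on).static_order())


# Hypothetical schema: order_items -> orders -> customers.
schema = {
    "customers": set(),
    "orders": {"customers"},
    "order_items": {"orders"},
}
print(restore_order(schema))
```

A plain topological sort only works on acyclic dependency graphs. `TopologicalSorter` raises `graphlib.CycleError` when tables reference each other in a loop, so schemas with circular foreign keys need separate handling.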

Fixes and improvements

  • Fixed validate command with the --table flag, which had the
    wrong order of the table name representation {{ table_name }}.{{ schema }} instead of
    {{ schema }}.{{ table_name }}.
  • Fixed Row.SetColumn out of range validation.
  • Fixed restoreWorker panic caused when the worker received an error from pgx.
  • Fixed error handling in the restore command.
  • Restore jobs now start a transaction for each table restoration and commit it after the table restoration is done.
  • Fixed --exit-on-error working incorrectly in the restore command. The flag now works correctly with the
    data section.
  • Fixed transaction rollback in the validate command.
  • Fixed typo in documentation.
  • Fixed a CI/CD bug related to retrieving current tags.
  • Fixed the Docker image tag for latest to exclude specific
    keywords.
  • Fixed a case where the hashing value was not set for each column
    in the RandomPerson transformer.
  • Fixed original email value parsing conditions.
  • Subset docs revision.
  • Fixed a case where data entries were excluded by exclusion
    parameters such as --exclude-table, --table, etc.
  • Fixed zero bytes that were written in the buffer due to the wrong
    buffer limit in the Email transformer.
  • Fixed a case where the overridden type of column via
    columns_type_override did not work.
  • Fixed a case where an unknown option provided in the config was
    just ignored instead of throwing an error.
  • Fixed a case where min and max parameter values were ignored
    in transformers NoiseDate, NoiseNumeric, NoiseFloat, NoiseInt, RandomNumeric, RandomFloat, and
    RandomInt.
  • Fixed TOC entry COPY restoration statement - added missing
    newline and semicolon. Now the call pg_restore 1724504511561 --file 1724504511561.sql is backward
    compatible and works as expected.
  • Fixed a case where dump/restore fails when masking tables with a
    generated column.
  • Updated go version (v1.22) and dependencies
  • Revised installation section of doc
  • A bunch of refactoring and code cleanup to make the codebase more maintainable and readable.

Full Changelog: v0.2.0b1...v0.2.0b2

Contributors

Special thanks

Playground usage for the beta version

If you want to run a Greenmask playground for the beta version v0.2.0b2, execute:

git checkout tags/v0.2.0b2 -b v0.2.0b2
docker-compose run greenmask-from-source

v0.2.0b1

17 May 20:44
e298041
Pre-release

Greenmask 0.2.0b1

This major beta release introduces new features and refactored transformers, significantly enhancing Greenmask's flexibility to better meet business needs.

Playground usage for beta version

If you want to run a Greenmask playground for the beta version, execute:

git checkout tags/v0.2.0b1 -b v0.2.0b1
docker-compose run greenmask-from-source

Changes overview

  • Introduced dynamic parameters in the transformers

    • Most transformers now support dynamic parameters where applicable.
    • Dynamic parameters are strictly enforced. If you need to cast values to another type, Greenmask provides templates and predefined cast functions accessible via cast_to. These functions cover frequent operations such as UnixTimestampToDate and IntToBool.
  • The transformation logic has been significantly refactored, making transformers more customizable and flexible than before.

  • Introduced transformation engines

    • random - generates transformer values based on pseudo-random algorithms.
    • hash - generates transformer values using hash functions. Currently, it utilizes sha3 hash functions, which are secure but perform slowly. In the stable release, there will be an option to choose between sha3 and SipHash.
  • Introduced static parameters value template
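
The difference between the two engines can be illustrated as follows: a hash engine derives the output from the input value and a salt, so the same input always maps to the same output, while a random engine does not. This is a sketch under assumptions. The function names, the salt handling, and the way the digest is reduced to a range are all hypothetical, and only the use of SHA-3 comes from the notes above.

```python
import hashlib
import random


def hash_int(value: bytes, salt: bytes, lo: int, hi: int) -> int:
    """Deterministically map value to an integer in [lo, hi] using SHA3-256."""
    digest = hashlib.sha3_256(salt + value).digest()
    # Reduce the first 8 digest bytes to the requested range (slight modulo bias).
    return lo + int.from_bytes(digest[:8], "big") % (hi - lo + 1)


def random_int(lo: int, hi: int, rng=random) -> int:
    """Pick a pseudo-random integer in [lo, hi]; repeated calls are unrelated."""
    return rng.randint(lo, hi)


salt = b"example-salt"  # hypothetical salt value
print(hash_int(b"alice@example.com", salt, 1, 1000))
print(hash_int(b"alice@example.com", salt, 1, 1000))  # same as the line above
print(random_int(1, 1000))
```

Determinism is what lets a hashed value stay consistent across tables and across repeated dumps, at the cost of the slower hash computation the notes mention.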

Notable changes

Core

  • Introduced the Parametrizer interface, now implemented for both dynamic and static parameters.
  • Renamed most of the toolkit types for enhanced clarity and comprehensive documentation coverage.
  • Refactored the Driver initialization logic.
  • Added validation warnings for overridden types in the Driver.
  • Migrated existing built-in transformers to utilize the new Parametrizer interface.
  • Implemented a new abstraction, TransformationContext, as the first step towards enabling the new transformation conditions feature (#34).
  • Optimized most transformers for performance in both dynamic and static modes. While dynamic mode offers flexibility, static mode ensures performance remains high. Using only the necessary transformation features helps keep transformation time predictable.

Documentation

Documentation has been significantly refactored. New information about features and updates to transformer descriptions have been added.

Transformers

  • RandomEmail - Introduces a new transformer that supports both random and deterministic engines. It allows for flexible email value generation; you can use column values in the template and choose to keep the original domain or select any from the domains parameter.

  • NoiseDate, NoiseFloat, NoiseInt - These transformers support both random and deterministic engines, offering dynamic mode parameters that control the noise thresholds within the min and max range. Unlike previous implementations which used a single ratio parameter, the new release features min_ratio and max_ratio parameters to define noise values more precisely. Utilizing the hash engine in these transformers enhances security by complicating statistical analysis for attackers, especially when the same salt is used consistently over long periods.

  • NoiseNumeric - A newly implemented transformer, sharing features with NoiseInt and NoiseFloat, but specifically designed for numeric values (large integers or floats). It provides a decimal parameter to handle values with fractions.

  • RandomChoice - Now supports the hash engine

  • RandomDate, RandomFloat, RandomInt - Now enhanced with hash engine support. Threshold parameters min and max have been updated to support dynamic mode, allowing for more flexible configurations.

  • RandomNumeric - A new transformer specifically designed for numeric types (large integers or floats), sharing similar features with RandomInt and RandomFloat, but tailored for handling huge numeric values.

  • RandomString - Now supports hash engine mode

  • RandomUnixTimestamp - This new transformer generates Unix timestamps with selectable units (second, millisecond, microsecond, nanosecond). Similar in function to RandomDate, it supports the hash engine and dynamic parameters for min and max thresholds, with the ability to override these units using min_unit and max_unit parameters.

  • RandomUuid - Added hash engine support

  • RandomPerson - Implemented a new transformer that replaces RandomName, RandomLastName, RandomFirstName, RandomFirstNameMale, RandomFirstNameFemale, RandomTitleMale, and RandomTitleFemale. This new transformer offers enhanced customizability while providing functionality similar to the transformers it replaces. It generates personal data such as FirstName, LastName, and Title, based on the provided gender parameter, which now supports dynamic mode. Future minor versions will allow for overriding the default names database.

  • Added tsModify - a new template function for time.Time objects modification

  • Introduced a new RandomIp transformer capable of generating a random IP address based on the specified netmask.

  • Added a new RandomMac transformer for generating random MAC addresses.

  • Deleted transformers include RandomMacAddress, RandomIPv4, RandomIPv6, RandomUnixTime, RandomTitleMale, RandomTitleFemale, RandomFirstName, RandomFirstNameMale, RandomFirstNameFemale, RandomLastName, and RandomName due to the introduction of more flexible and unified options.
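
One plausible reading of the min_ratio and max_ratio parameters described for the Noise transformers above is sketched below: the noise magnitude is a fraction of the original value drawn between the two ratios, and the result is clamped to the min and max thresholds. The function name `noise_int`, the random sign, and the clamping are assumptions for illustration. The release notes do not spell out the exact formula.

```python
import random


def noise_int(value, min_ratio, max_ratio, lo=None, hi=None, rng=random):
    """Shift value up or down by a fraction of itself drawn from [min_ratio, max_ratio]."""
    ratio = rng.uniform(min_ratio, max_ratio)
    sign = rng.choice((-1, 1))
    result = round(value + sign * value * ratio)
    # Optional thresholds keep the noised value inside [lo, hi].
    if lo is not None:
        result = max(result, lo)
    if hi is not None:
        result = min(result, hi)
    return result


rng = random.Random(42)  # seeded only to make the demonstration repeatable
print([noise_int(100, 0.05, 0.2, rng=rng) for _ in range(5)])
```

Compared with a single ratio, a lower bound on the ratio guarantees that every value actually moves by some minimum amount.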

Contributors

@dev-comrade
@wwoytenko

v0.1.14

09 May 17:46
16b1bd6

Greenmask 0.1.14

This release introduces bug fixes.

Changes

  • Fixed panic caused by Large Object dumper

Contributors

@wwoytenko

Special thanks

@TCY16