Guidelines
Starting from version 5.0, the command-line tool returns an error if the configuration file references a column that is not defined in the database.
No breaking changes, but since version 4.2.0, the filters parameter is deprecated.
It will be removed in the next major version.
Use the where parameter instead.
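As a rough illustration of the where parameter, the sketch below restricts the dump of a single table. The table name my_table and the created_at column are hypothetical, and the value is assumed to be written as it would appear in a SQL WHERE clause.

```yaml
tables:
    # Hypothetical table name, for illustration only
    my_table:
        # Assumed to be a plain SQL condition: only matching rows are dumped
        where: 'created_at > date_sub(now(), interval 60 day)'
```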
Breaking changes:
- The following converters were renamed:
  - randomizeDate -> randomDate
  - randomizeDateTime -> randomDateTime
  - addPrefix -> prependText
  - addSuffix -> appendText
- The orderBy parameter was renamed to order_by. This was the only parameter that didn't use snake case.
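A config file written before these renames would need updates along the following lines. This is only a sketch: my_table, id and created_at are hypothetical names, and the order_by value is assumed to be a SQL-style ordering expression.

```yaml
tables:
    # Hypothetical table name
    my_table:
        # Previously: orderBy
        order_by: 'id desc'
        converters:
            # Hypothetical column name
            created_at:
                # Previously: randomizeDate
                converter: 'randomDate'
```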
Since this tool is a pure PHP implementation of a MySQL dumper, it is slower than mysqldump.
If the database to dump has very large tables, it is recommended to use the table filter mechanism.
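One possible way to reduce the size of a very large table is sketched below. It assumes that a limit parameter is available alongside order_by, which should be checked against the configuration reference. The table and column names are hypothetical.

```yaml
tables:
    # Hypothetical large table
    my_log_table:
        # Most recent rows first (assumed SQL-style ordering expression)
        order_by: 'id desc'
        # Assumed parameter: maximum number of rows written to the dump
        limit: 10000
```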
If you want to share your configuration file, don't include the database credentials. Instead, use environment variables.
For example:
database:
    host: '%env(DB_HOST)%'
    user: '%env(DB_USER)%'
    password: '%env(DB_PASSWORD)%'
    name: '%env(DB_NAME)%'
If your database contains personal data, you can use converters to anonymize the data written to the dump file.
Examples of personal data:
- username
- name
- date of birth
- phone number
- address
- IP address
- encrypted password
- payment data
- comment that could contain customer-related information
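A minimal sketch of what anonymizing such columns could look like, using two converters mentioned elsewhere on this page (anonymizeText and randomDate). The table and column names are hypothetical, and the right converter for each column depends on the data it holds.

```yaml
tables:
    # Hypothetical table and column names
    my_customer_table:
        converters:
            name:
                converter: 'anonymizeText'
            date_of_birth:
                converter: 'randomDate'
```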
If you use one of the config templates bundled with this tool (e.g. magento2), the anonymized data is not consistent across tables.
For example, the anonymized customer email won't have the same value between the customer table and the quote table.
You can add data consistency by specifying a cache key. For example, in Magento 2:
tables:
    customer_entity:
        converters:
            email:
                cache_key: 'customer_email'
                unique: true
    customer_flat_grid:
        converters:
            email:
                cache_key: 'customer_email'
                unique: true
    # ... repeat this for each table that stores a customer email
With the above configuration, each table will use the same anonymized email for each customer.
Warning: this consumes a lot of memory (approximately 1 GB for 10 million values).
Performance
In the Magento templates, quote tables are not truncated by default. If these tables contain a lot of rows, truncating or filtering them will speed up dump creation.
For example (Magento 2):
tables:
    quote:
        truncate: true
Admin Accounts
The magento1 and magento2 templates anonymize all admin accounts.
If you want to keep the email/password for some accounts, you can set a condition on the admin_user table.
Example:
tables:
    admin_user:
        skip_conversion_if: '{{username}} === "admin123"'
Payment Data
In Magento 1 and Magento 2, the payment data is partially stored in a column named additional_information.
The data is stored as a serialized array.
Only the CC_CN property is anonymized by the magento1 and magento2 templates.
If this column contains other sensitive data in your project, you must anonymize it in your custom config file. For example, in Magento 1:
tables:
    sales_flat_quote_payment:
        converters:
            additional_information:
                parameters:
                    converters:
                        fieldToAnonymize:
                            converter: 'anonymizeText'
    sales_flat_order_payment:
        converters:
            additional_information:
                parameters:
                    converters:
                        fieldToAnonymize:
                            converter: 'anonymizeText'
In Magento 2:
tables:
    quote_payment:
        converters:
            additional_information:
                parameters:
                    converters:
                        fieldToAnonymize:
                            converter: 'anonymizeText'
    sales_order_payment:
        converters:
            additional_information:
                parameters:
                    converters:
                        fieldToAnonymize:
                            converter: 'anonymizeText'
The fields to anonymize will depend on the payment methods that are used in the project.