Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Non deterministic DDL leads to data inconsistency #366

Open
joffrey92 opened this issue Mar 3, 2020 · 0 comments
Open

Non deterministic DDL leads to data inconsistency #366

joffrey92 opened this issue Mar 3, 2020 · 0 comments

Comments

@joffrey92
Copy link

Hello Codership Team,

While doing an ALTER, I remembered that some DDL commands are non deterministic, and can lead to data difference between the nodes of the same cluster.

Scenario:

  • Assuming there is a table with existing data
  • When adding a column with a non-deterministic default value (ex: @@hostname or UUID())
  • Then the rows will be populated on each Galera node with different (local) data
  • When the new column is configured as part of the primary key
  • Then running an update on any row of the table will lead to only one node remaining in the cluster

To reproduce:
1) Install and Configure Galera with at least 2 nodes (minimal settings).
2) Create a table and insert a row:

MariaDB [oceandba]> CREATE TABLE nd_ddl (id serial, hits tinyint);
Query OK, 0 rows affected (0.041 sec)
MariaDB [oceandba]> INSERT INTO nd_ddl (hits) values (0);
Query OK, 1 row affected (0.008 sec)

3) Perform non-deterministic DDL:

MariaDB [oceandba]> ALTER TABLE nd_ddl ADD COLUMN hostname varchar(30) DEFAULT @@hostname, ADD PRIMARY KEY (id,hostname);
Query OK, 0 rows affected (0.065 sec)
Records: 0  Duplicates: 0  Warnings: 0

4) Verify data difference between nodes:

MariaDB [oceandba]> select * FROM nd_ddl;
+----+------+--------------+
| id | hits | hostname     |
+----+------+--------------+
|  1 |    0 | eaddd670b38b |
+----+------+--------------+
1 row in set (0.000 sec)
MariaDB [oceandba]> select * FROM nd_ddl;
+----+------+--------------+
| id | hits | hostname     |
+----+------+--------------+
|  1 |    0 | e199f61216be |
+----+------+--------------+
1 row in set (0.000 sec)

5) Update existing row and observe behavior:

MariaDB [oceandba]> UPDATE nd_ddl set hits = hits + 1 where id = 1;
Query OK, 1 row affected (0.006 sec)
Rows matched: 1  Changed: 1  Warnings: 0
MariaDB [oceandba]> select * FROM nd_ddl;
+----+------+--------------+
| id | hits | hostname     |
+----+------+--------------+
|  1 |    1 | eaddd670b38a |
+----+------+--------------+
1 row in set (0.000 sec)
MariaDB [oceandba]> select * FROM nd_ddl;
ERROR 1047 (08S01): WSREP has not yet prepared node for application use
MariaDB [(none)]> show global status like 'wsrep%state_%';
+---------------------------+--------------------------------------+
| Variable_name             | Value                                |
+---------------------------+--------------------------------------+
| wsrep_local_state_uuid    | 00000000-0000-0000-0000-000000000000 |
| wsrep_local_state_comment | Inconsistent                         |
| wsrep_cluster_state_uuid  | 87fce3ba-5c9b-11ea-bc33-9b152b8a843a |
+---------------------------+--------------------------------------+
3 rows in set (0.003 sec)

Notes:

  • This was reproduced on MariaDB 10.3 (galera-3) and MariaDB 10.5 (galera-4).
  • This doesn't happen when the "hostname" column is not part of PK
  • I remember Galera has a hook when a new AUTO_INCREMENT column is added (to bypass auto_increment parameters)
  • This should be added to limitations in documentation
  • Possibly a warning could be issued if such DDL is detected ?

Thanks for looking into it
Regards,
Joffrey

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

1 participant