mysql-devops
This library is considered production ready.
mysqlbackup.py
is a wrapper of xtrabackup
.
It provides with a full backup/restore mechanism for both data and binlog.
Basic usage:
mysqlbackup.py --conf-path <conf.yaml> {backup|restore|catchup_binlog}
conf.yaml
is a yaml decribes where the data is and where the backup is
# required:
mysql_base: /usr/local/mysql-5.7.13
# required, the host name or ip, used only for naming the backup tgz file.
host: 127.0.0.1
# required, base dir of running mysql instance data-dir
data_base: /data1
# required, base dir of backup dir.
backup_base: /data1/backup
# required, port of the mysql instance to be backed up.
port: 3309
# required, the instance id in our mysql replication group.
instance_id: 1
# optional, for data completeness test only
sql_test_insert: 'insert into `xp2`.`heartbeat`'
' (`key`, `value`, `ts`)'
' values'
' ("test-backup", "{v}", "{v}")'
sql_test_get_2: 'select `value` from `xp2`.`heartbeat` order by `_id` desc limit 2'
# optional, for backup to s3 storage.
s3_host : "s3.amazonaws.com"
s3_bucket : "mysql-backup"
s3_access_key : "143728432789"
s3_secret_key : "jfkdslfjdsklfjdsklfdjkls"
Additional command line arguments are also supported to override options in
conf.yaml
:
--mysql-base
base dir of mysql executable--host
name of this host, just as identity of backup file name--data-base
base dir of mysql data, like /data1--backup-base
base dir of backup tmp files, like /data1/backup--port
serving port of the mysql instance to backup/restore--instance-id
id in number, as part of backup file name, it should be unique in a replication group--date-str
date in form 2017_01_01. It is used in backup file name, or to specify which backup to use for restore. when absent, use date of today--s3-host
s3 compatible service hostname to store backup--s3-bucket
s3 bucket name--s3-access-key
s3 access key--s3-secret-key
s3 secret key--clean-after-restore
clean backup files after restore--when
no-data-dir|stopped, help='condition that must be satisfied before a command runs
syntax:
mysqlbackup.py backup --conf-path conf.yaml [<other argument>]
To backup, mysqlbackup.py
does following things:
-
Backup data from an active mysql instance.
-
Backup
my.cnf
(It does not backupauto.cnf
, in order to make it possible for a restored instance to accept binlog events generated by itself from remote). -
Backup binlog from an active mysql instance. Binlog may contains more events than it is in the data backup.
-
Pack backup dir into a
tgz.des3
package with des3 encryption. -
Calculate checksum of the
tgz.des3
. -
Upload it to a aws-s3 compatible storage service, if s3 account is provided in conf.
-
Removes temporary backup dir and file.
Normally restore is done in three main steps:
- Rebuild data dir.
- Setup replication, since the backup that is made from other instance may have different replication setup.
- Catchup binlog those are not in backup from other instances.
mysqlbackup.py
can do the first and third part.
But setting up replication is related to upper level business logic.
syntax:
mysqlbackup.py restore --conf-path conf.yaml [<other argument>]
A useful argument is --clean-after-srestore
.
It informs mysqlbackup.py
to remove backup data if restore succeeds.
To restore, mysqlbackup.py
does following things:
-
Downloads backup
<backup_fn>.tgz.des3
from aws-s3 compatible storage service, if s3 account is provided. -
Checks file checksum(sha1 is prefered) against the checksum recorded in meta of object in aws s3.
-
Unpacks
<backup_fn>tgz.des3
. -
Copies data back to the dir specified by the
my.cnf
in the backup. -
Starts mysql instance in read-only mode and applies binlog events from backup to this instance.
-
Shuts it down.
syntax:
mysqlbackup.py catchup_binlog --conf-path conf.yaml [<other argument>]
-
Starts mysql with a temporary
server-id
, and starts binlog replication to let it receive and apply binlog events those are created by itself after backup was made, from other instances in a port group. -
Shuts it down and leave it there as a ready-to-use mysql data directory.
-
--when no-data-dir
: executes(backup/restore/catchup_binlog) only when there is no data dir for the specified port.If condition is not satisfied, exit normally.
-
--when stopped
: executes only when instance is stopped.If condition is not satisfied, exit normally.
innobackupex
is a great tool for backing up data from an active mysql
instance.
But innobackupex
does not backup/restore binlog.
Thus we need to backup/restore binlog manually and keep binlog and data
consistent.
We can NOT restore binlog from remote instance:
There is no easy way to cut binlog at where local data ends.
We can not apply binlog to local, if an event has already been in local data, mysql does not produce a binlog.
We backup data first, then binlog.
Since very likely, the backup procedure may be run on an active mysql instance, there are continuous write operations on this instance. After data was backed up, there could be more events in binlog than there are in backup data.
Thus every time to restore an instance, we need to apply binlog events those are in binlog backup but not in data backup, to the restored mysql instance.
It also backup my.cnf
but it does not backup auto.cnf
(the file in which
the uuid of the mysql instance is stored).
The restore part:
After restoring data, and before apply binlog from remote, a third party script is responsible to setup new replication. Because replication topology might change after backup is made.
With new replication setup, start mysql instance with temporary server-id and catchup binlog from all reachable source, until this instance is less than 1 second behind any source instance.
Then shut down mysql and leave it as a ready-to-use instance.
After restoring data and binlog, we also need to setup binlog replication to sync binlog events from other mysql instances in a same port group, to recovery events those are written after this backup was made.
Setting up replication is done in following steps:
-
resets relay log files and index, since relay logs are not backed up.
flush local relay logs;
-
resets slave status:
reset slave
-
restarts replication.
start slave for channel "**"
Mysql instance must not be stop and restart between the last two steps. Or replication information gets lost.
And
start slave
must be done on every channel explicitly. Juststart slave
does not start replication, although it should, according to mysql ref manual.
But, because mysql just ignores binlog those are created by itself, it is impossible to restore binlog events from other instance to recover events those are created after backup and before crash.
Thus every time restoring a mysql instance, it creates a new uuid(by starting
mysql without auto.cnf
, mysql will create a new one),
and start mysql with a temporary server-id
.
A temporary
server-id
must not be used to generate any binlog events, and not equal to any existentserver-id
.
This way the new instance will accept all binlog event(created by itself or other instances) those it does not have.
We can NOT use --replicate-same-server-id
to force mysql to apply binlog
either.
Because --replicate-same-server-id
can not be used together with
--log-slave-updates
, which is required in our scenario(multiple master with
multiple way replication), to let every mysql instance keeps full history
binlog from several instances.
After syncing enough binlog, stop this temporary mysql instance.
For the next time it starts, with the original server-id
, just like
normally.
The following diagram illustrates how backup and restore works.
There are two instance instace 1
and instance 2
.
instance 1
is backed up.
instance 2
is a remote replication with the same data in it.
Initial: instance 1 instance 2
uuid = aaa: uuid = ccc:
Make backup from 1, without latest binlog gtid:5 in either data or binlog.
Instance 2 has the latest gtid:5
bkp: data: aaa:1-3 ccc: data: aaa:1-5
binlog: aaa:1-4 binlog: aaa:1-5
Restored from backup, with new created uuid = bbb:
bbb: data: aaa:1-3
Apply events in binlog but not in backup-data:
bbb: data: aaa:1-4
Copy back binlog, now data and binlog are consistent but behind ccc:
bbb: data: aaa:1-4
binlog: aaa:1-4
Setup replication and apply binlog gtid:5 from ccc:
bbb: data: aaa:1-5 <- ccc: data: aaa:1-5
binlog: aaa:1-5 binlog: aaa:1-5
If there is new write on bbb, it should be replicated to ccc too:
bbb: data: aaa:1-5, bbb:1-1 -> ccc: data: aaa:1-5, bbb:1-1
binlog: aaa:1-5, bbb:1-1 binlog: aaa:1-5, bbb:1-1
-
xtrabackup,
2.4
or newer.yum install libev-devel libev perl-DBD-MySQL
-
python libs:
pip install MySQLdb pip install boto3 pip install yaml pip install argparse
It also provides with a collection of script for easing daily devops work.
watch-slave.sh <port>
Display replication status on <port>
, refreshing every seconds.
Zhang Yanpo (张炎泼) [email protected]
The MIT License (MIT)
Copyright (c) 2015 Zhang Yanpo (张炎泼) [email protected]