Skip to content

bsc-s2/mysql-devops

Repository files navigation

Table of Content

Name

mysql-devops

Status

This library is considered production ready.

Description

mysqlbackup.py is a wrapper of xtrabackup. It provides with a full backup/restore mechanism for both data and binlog.

Usage

Basic usage:

mysqlbackup.py --conf-path <conf.yaml> {backup|restore|catchup_binlog}

conf.yaml is a yaml decribes where the data is and where the backup is

# required:
mysql_base: /usr/local/mysql-5.7.13

# required, the host name or ip, used only for naming the backup tgz file.
host: 127.0.0.1

# required, base dir of running mysql instance data-dir
data_base:   /data1

# required, base dir of backup dir.
backup_base: /data1/backup

# required, port of the mysql instance to be backed up.
port: 3309

# required, the instance id in our mysql replication group.
instance_id: 1

# optional, for data completeness test only
sql_test_insert: 'insert into `xp2`.`heartbeat`'
                    ' (`key`, `value`, `ts`)'
                    ' values'
                    ' ("test-backup", "{v}", "{v}")'
sql_test_get_2: 'select `value` from `xp2`.`heartbeat` order by `_id` desc limit 2'

# optional, for backup to s3 storage.
s3_host       : "s3.amazonaws.com"
s3_bucket     : "mysql-backup"
s3_access_key : "143728432789"
s3_secret_key : "jfkdslfjdsklfjdsklfdjkls"

Additional command line arguments are also supported to override options in conf.yaml:

  • --mysql-base base dir of mysql executable
  • --host name of this host, just as identity of backup file name
  • --data-base base dir of mysql data, like /data1
  • --backup-base base dir of backup tmp files, like /data1/backup
  • --port serving port of the mysql instance to backup/restore
  • --instance-id id in number, as part of backup file name, it should be unique in a replication group
  • --date-str date in form 2017_01_01. It is used in backup file name, or to specify which backup to use for restore. when absent, use date of today
  • --s3-host s3 compatible service hostname to store backup
  • --s3-bucket s3 bucket name
  • --s3-access-key s3 access key
  • --s3-secret-key s3 secret key
  • --clean-after-restore clean backup files after restore
  • --when no-data-dir|stopped, help='condition that must be satisfied before a command runs

Backup

syntax: mysqlbackup.py backup --conf-path conf.yaml [<other argument>]

To backup, mysqlbackup.py does following things:

  • Backup data from an active mysql instance.

  • Backup my.cnf(It does not backup auto.cnf, in order to make it possible for a restored instance to accept binlog events generated by itself from remote).

  • Backup binlog from an active mysql instance. Binlog may contains more events than it is in the data backup.

  • Pack backup dir into a tgz.des3 package with des3 encryption.

  • Calculate checksum of the tgz.des3.

  • Upload it to a aws-s3 compatible storage service, if s3 account is provided in conf.

  • Removes temporary backup dir and file.

Restore

Normally restore is done in three main steps:

  • Rebuild data dir.
  • Setup replication, since the backup that is made from other instance may have different replication setup.
  • Catchup binlog those are not in backup from other instances.

mysqlbackup.py can do the first and third part. But setting up replication is related to upper level business logic.

syntax: mysqlbackup.py restore --conf-path conf.yaml [<other argument>]

A useful argument is --clean-after-srestore. It informs mysqlbackup.py to remove backup data if restore succeeds.

To restore, mysqlbackup.py does following things:

  • Downloads backup <backup_fn>.tgz.des3 from aws-s3 compatible storage service, if s3 account is provided.

  • Checks file checksum(sha1 is prefered) against the checksum recorded in meta of object in aws s3.

  • Unpacks <backup_fn>tgz.des3.

  • Copies data back to the dir specified by the my.cnf in the backup.

  • Starts mysql instance in read-only mode and applies binlog events from backup to this instance.

  • Shuts it down.

Catchup binlog

syntax: mysqlbackup.py catchup_binlog --conf-path conf.yaml [<other argument>]

  • Starts mysql with a temporary server-id, and starts binlog replication to let it receive and apply binlog events those are created by itself after backup was made, from other instances in a port group.

  • Shuts it down and leave it there as a ready-to-use mysql data directory.

Other arguments

  • --when no-data-dir: executes(backup/restore/catchup_binlog) only when there is no data dir for the specified port.

    If condition is not satisfied, exit normally.

  • --when stopped: executes only when instance is stopped.

    If condition is not satisfied, exit normally.

How it works

innobackupex is a great tool for backing up data from an active mysql instance. But innobackupex does not backup/restore binlog. Thus we need to backup/restore binlog manually and keep binlog and data consistent.

We can NOT restore binlog from remote instance:

  • There is no easy way to cut binlog at where local data ends.

  • We can not apply binlog to local, if an event has already been in local data, mysql does not produce a binlog.

We backup data first, then binlog.

Since very likely, the backup procedure may be run on an active mysql instance, there are continuous write operations on this instance. After data was backed up, there could be more events in binlog than there are in backup data.

Thus every time to restore an instance, we need to apply binlog events those are in binlog backup but not in data backup, to the restored mysql instance.

It also backup my.cnf but it does not backup auto.cnf(the file in which the uuid of the mysql instance is stored).

The restore part:

After restoring data, and before apply binlog from remote, a third party script is responsible to setup new replication. Because replication topology might change after backup is made.

With new replication setup, start mysql instance with temporary server-id and catchup binlog from all reachable source, until this instance is less than 1 second behind any source instance.

Then shut down mysql and leave it as a ready-to-use instance.

After restoring data and binlog, we also need to setup binlog replication to sync binlog events from other mysql instances in a same port group, to recovery events those are written after this backup was made.

Setting up replication is done in following steps:

  • resets relay log files and index, since relay logs are not backed up.

    flush local relay logs;
    
  • resets slave status:

    reset slave
    
  • restarts replication.

    start slave for channel "**"
    

    Mysql instance must not be stop and restart between the last two steps. Or replication information gets lost.

    And start slave must be done on every channel explicitly. Just start slave does not start replication, although it should, according to mysql ref manual.

But, because mysql just ignores binlog those are created by itself, it is impossible to restore binlog events from other instance to recover events those are created after backup and before crash.

Thus every time restoring a mysql instance, it creates a new uuid(by starting mysql without auto.cnf, mysql will create a new one), and start mysql with a temporary server-id.

A temporary server-id must not be used to generate any binlog events, and not equal to any existent server-id.

This way the new instance will accept all binlog event(created by itself or other instances) those it does not have.

We can NOT use --replicate-same-server-id to force mysql to apply binlog either.

Because --replicate-same-server-id can not be used together with --log-slave-updates, which is required in our scenario(multiple master with multiple way replication), to let every mysql instance keeps full history binlog from several instances.

After syncing enough binlog, stop this temporary mysql instance.

For the next time it starts, with the original server-id, just like normally.

The following diagram illustrates how backup and restore works. There are two instance instace 1 and instance 2.

instance 1 is backed up. instance 2 is a remote replication with the same data in it.

Initial:  instance 1                            instance 2
          uuid = aaa:                           uuid = ccc:

Make backup from 1, without latest binlog gtid:5 in either data or binlog.
Instance 2 has the latest gtid:5

          bkp:    data:     aaa:1-3             ccc:    data:     aaa:1-5
                  binlog:   aaa:1-4                     binlog:   aaa:1-5

Restored from backup, with new created uuid = bbb:

          bbb:    data:     aaa:1-3

Apply events in binlog but not in backup-data:

          bbb:    data:     aaa:1-4

Copy back binlog, now data and binlog are consistent but behind ccc:

          bbb:    data:     aaa:1-4
                  binlog:   aaa:1-4

Setup replication and apply binlog gtid:5 from ccc:

          bbb:    data:     aaa:1-5          <- ccc:    data:     aaa:1-5
                  binlog:   aaa:1-5                     binlog:   aaa:1-5

If there is new write on bbb, it should be replicated to ccc too:

          bbb:    data:     aaa:1-5, bbb:1-1 -> ccc:    data:     aaa:1-5, bbb:1-1
                  binlog:   aaa:1-5, bbb:1-1            binlog:   aaa:1-5, bbb:1-1

Dependency

  • xtrabackup, 2.4 or newer.

    yum install libev-devel libev perl-DBD-MySQL
    

    download xtrabackup

    xtrabackup website

  • python libs:

    pip install MySQLdb
    pip install boto3
    pip install yaml
    pip install argparse
    

Scripts

It also provides with a collection of script for easing daily devops work.

watch-slave.sh

watch-slave.sh <port>

Display replication status on <port>, refreshing every seconds.

Author

Zhang Yanpo (张炎泼) [email protected]

Copyright and License

The MIT License (MIT)

Copyright (c) 2015 Zhang Yanpo (张炎泼) [email protected]

Releases

No releases published

Packages

No packages published