Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Point In Time Recovery #531

Open
wants to merge 10 commits into
base: main
Choose a base branch
from
Open

Point In Time Recovery #531

wants to merge 10 commits into from

Conversation

Zvirovyi
Copy link

@Zvirovyi Zvirovyi commented Nov 25, 2024

Important!

This PR relies on the canonical/charmed-mysql-snap#56. Also, it's a copy of the VM PR.

Overview

MySQL stores binary transactions logs. This PR adds a service job to upload these logs to the S3 bucket and the ability to use them later for a point-in-time-recovery with a new restore-to-time parameter during restore. This new parameter accepts MySQL timestamp or keyword latest (for replaying all the transaction logs).

Also, a new application blocked status is introduced - Another cluster S3 repository to signal user that used S3 repository is claimed by the another cluster and binlogs collecting job is disabled and creating new backups is restricted (these are the only workload limitation). This is crucial to keep stored binary logs safe from the another clusters. This check uses @@GLOBAL.group_replication_group_name.
After restore, cluster group replication is reinitialized, so practically it becomes a new different cluster. For these cases, Another cluster S3 repository message is changed to the Move restored cluster to another S3 repository to indicate this event more conveniently for the user.
Both the block messages will disappear when S3 configuration is removed or changed to the empty repository.

Usage example

  1. deploy mysql + s3-integrator and integrate them
  2. create full backup
  3. create test data:
create database zvirovyi;
use zvirovyi;
create table asd(message varchar(255) primary key);
select current_timestamp; # 2024-11-20 17:10:01
insert into asd values ('hello');
select current_timestamp; # 2024-11-20 17:10:12
insert into asd values ('world');
flush binary logs;
  1. wait several minutes for binlogs to be uploaded
  2. restore: juju run mysql/leader restore backup-id=2024-11-20T17:08:24Z restore-to-time="2024-11-20 17:10:01"
use zvirovyi;
select * from asd; # empty set returned
  1. observe application block message
  2. restore: juju run mysql/leader restore backup-id=2024-11-20T17:08:24Z restore-to-time="latest"
use zvirovyi;
select * from asd; # hello, world returned

Key notes

@Zvirovyi Zvirovyi marked this pull request as ready for review December 8, 2024 09:11
@Zvirovyi
Copy link
Author

Zvirovyi commented Dec 8, 2024

Tests are WIP!
UPD: integration tests done, and I will fix + add unit tests after PR review.

not self.charm.unit.is_leader() or "binlogs-collecting" not in self.charm.app_peer_data
):
logger.debug("Stopping binlogs collector")
container.stop(MYSQL_BINLOGS_COLLECTOR_SERVICE)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

same question as on VM, but for workload container restart: canonical/mysql-operator#551 (comment)

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done for both VM and K8s implementations.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants