Cassandra backup/restore #29
Comments
Both Priam and cassandra_snap leverage Cassandra snapshots (see Priam Backup). We could also consider leveraging EBS snapshots. FireCamp Cassandra enables remote JMX, so we could run one or more job containers that use nodetool to connect to the Cassandra containers and flush the memtables to disk. After the flush finishes on all replicas, another job container (or several) could take snapshots of the EBS volumes. As with taking Cassandra snapshots of all replicas, the result is only eventually consistent, and it relies on Cassandra's built-in consistency mechanisms to bring the restored snapshot back to a consistent state.
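To make the ordering concrete, here is a minimal sketch of the flush-then-snapshot sequence described above. It only assembles `nodetool flush` and `aws ec2 create-snapshot` command lines and runs every flush before any snapshot. The helper names (`plan_backup`, `run_plan`), the host names, the volume IDs, and the default JMX port 7199 are assumptions for illustration; nothing here is FireCamp's actual implementation.

```python
import subprocess
from typing import Callable, List, Sequence

Command = List[str]


def plan_backup(
    cassandra_hosts: Sequence[str],
    volume_ids: Sequence[str],
    jmx_port: int = 7199,  # assumed JMX port; adjust to the service's setting
    label: str = "cassandra-backup",  # hypothetical snapshot description
) -> List[Command]:
    """Build the command list: all memtable flushes first, then EBS snapshots."""
    flushes = [
        ["nodetool", "-h", host, "-p", str(jmx_port), "flush"]
        for host in cassandra_hosts
    ]
    snapshots = [
        ["aws", "ec2", "create-snapshot", "--volume-id", vol, "--description", label]
        for vol in volume_ids
    ]
    return flushes + snapshots


def run_plan(plan: Sequence[Command], runner: Callable = subprocess.run) -> None:
    """Run the commands in order, stopping at the first failure."""
    for command in plan:
        runner(command, check=True)


if __name__ == "__main__":
    # Print the plan instead of executing it; hosts and volumes are placeholders.
    for cmd in plan_backup(["cass-0", "cass-1"], ["vol-aaa", "vol-bbb"]):
        print(" ".join(cmd))
```

Passing the runner in as a parameter keeps the plan testable without nodetool or AWS credentials; a job container would call `run_plan(plan)` with the default `subprocess.run`.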
Thanks for sharing your thoughts! I might be wrong, but according to Netflix/Priam#649, Priam can't be used purely as a backup solution. What do you think of having a cron job on each C* node that runs nodetool snapshot, followed by aws ec2 create-snapshot and aws ec2 delete-snapshot (for old snapshots) on that node's volumes? The job could be created, altered, or deleted by a command sent to firecamp-manageserver. Besides the backup time, we could set tags on the volume snapshots (for example, with the node information), a retention time, an email address or SNS topic for alerts in case of issues, and so on. Having this implemented would also simplify launching a new C* service from an existing backup.
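The tagging and retention parts of this proposal could look roughly like the sketch below. It assumes snapshot records shaped like the entries returned by boto3's `describe_snapshots` (a `SnapshotId` string and a timezone-aware `StartTime`), and a tag structure in the form the EC2 CreateSnapshot API accepts for `TagSpecifications`. The function names and the `service`/`node` tag keys are hypothetical.

```python
from datetime import datetime, timedelta, timezone
from typing import Dict, List, Optional


def snapshot_tag_spec(node_name: str, service: str = "cassandra") -> List[Dict]:
    """Build a TagSpecifications value that records which node a snapshot came from."""
    return [
        {
            "ResourceType": "snapshot",
            "Tags": [
                {"Key": "service", "Value": service},  # hypothetical tag key
                {"Key": "node", "Value": node_name},   # hypothetical tag key
            ],
        }
    ]


def expired_snapshot_ids(
    snapshots: List[Dict],
    retention_days: int,
    now: Optional[datetime] = None,
) -> List[str]:
    """Return the IDs of snapshots that started before the retention cutoff."""
    now = now or datetime.now(timezone.utc)
    cutoff = now - timedelta(days=retention_days)
    return [s["SnapshotId"] for s in snapshots if s["StartTime"] < cutoff]
```

The IDs returned by `expired_snapshot_ids` are the ones a cleanup step would pass to `aws ec2 delete-snapshot`; keeping the selection separate from the deletion makes it easy to review what would be removed before anything is deleted.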
Yes, Priam is more than backup/recovery. I haven't checked its design and implementation in detail, but as you posted, it might not be possible to use only the backup function. A cron job may not be the best option. The nodes in one cluster may run multiple services, and different services will have different backup requirements, so the cron job would end up having to handle all of them. It would be better to use a job container, which could be triggered on demand. The job container could run nodetool flush and then call the AWS API to create the EBS snapshot. Every service could have its own job container. Yes, we could automate the restore. A list command will be part of the general data management framework. We will evaluate other services as well when designing that framework.
By C* node I meant a container where the C* daemon is running, not an EC2 instance. Sorry for the confusion. It looks like a separate container (within the same task) is indeed better than a cron job for managing backups of different services: each service could have a backup job container with its own logic.
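Created two scripts to back up and restore:
https://gist.github.com/jazzl0ver/c6859e1615a0f97b8704052db0745e25
https://gist.github.com/jazzl0ver/c87c5ebfd76c07b56ffe8448f40e737b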
Hi Jazz,
Could we get similar backup and restore scripts for the Kafka and MongoDB services?
Thanks,
Nagaraj
On Wed, 29 May 2019, 21:05, jazzl0ver wrote:
Created two scripts to backup and restore:
https://gist.github.com/jazzl0ver/c6859e1615a0f97b8704052db0745e25
https://gist.github.com/jazzl0ver/c87c5ebfd76c07b56ffe8448f40e737b
I don't use MongoDB, so no luck there. As for Kafka, why do you need to back it up?
Yes, agreed. Kafka data expires automatically anyway. It would be good if we could get something for the MongoDB service.
Thanks,
Nagaraj
I'm not part of the CloudStax team, so I can't spend time on services we don't use. Sorry about that.
I'd like to discuss ideas on the best way to implement Cassandra backup/restore. I considered integrating Netflix's Priam, but it doesn't seem to work as a standalone backup/restore solution.
Another cool tool is https://github.com/pearsontechnology/cassandra_snap. However, it needs SSH access to each instance and requires listing every node to back up explicitly, rather than discovering the nodes automatically.
What are your thoughts?