
feature request: amazon cloud drive integration #10

Open
chad3814 opened this issue Jun 4, 2015 · 6 comments

Comments

@chad3814

chad3814 commented Jun 4, 2015

There's a set of Python command-line tools (in conjunction with an appspot service) that provide CLI support for Cloud Drive here: yadayada/acd_cli@aed16fc
I haven't used Python in a few years (though I've done a ton of AWS work in bash and Node). Is it straightforward to make a new backend store for Cloud Drive?

@AmesCornish
Owner

Buttersink is segmented, so the different backends are pretty independent. I think you'd essentially be rewriting the "S3Store.py" file for ACD. The main class derives from Store in Store.py, which is where most of the documentation comments are.
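
If it helps to see the shape of it, here's a very rough sketch of what a new backend class might look like. I'm just sketching from memory: every method name below is a placeholder for illustration, not the actual interface -- the real abstract methods and their docs are in Store.py.

```python
# Purely illustrative skeleton -- these names are NOT buttersink's actual
# interface; see Store.py for the real abstract methods and their docs.

class Store(object):
    """Stand-in for the real base class defined in buttersink's Store.py."""
    pass


class ACDStore(Store):
    """Hypothetical Amazon Cloud Drive backend, analogous to S3Store."""

    def __init__(self, path):
        # Authenticate against Cloud Drive here; acd_cli already handles
        # the OAuth dance, so its client code might be reusable.
        self.path = path

    def listVolumes(self):
        # Enumerate the diffs already uploaded, so buttersink can work out
        # which incremental sends are still needed.
        raise NotImplementedError

    def receiveVolume(self, stream):
        # Upload one "btrfs send" stream (a volume difference) to ACD.
        raise NotImplementedError

    def sendVolume(self, stream):
        # Download a stored difference so it can be btrfs-received locally.
        raise NotImplementedError
```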

Unless ACD presents a "copy-on-write" interface, you'd probably be storing volume differences on ACD, the same way I do with S3. You wouldn't be able to browse the files directly, but you could sync to and from your local btrfs.

Out of curiosity, why not just use the S3 backend? S3 is inexpensive and might give you even more functionality than you'd get out of an ACD backend. If you want to "browse" your files and backups from S3, you can spin up a server in EC2, and sync to that -- which is very quick and essentially free.

Let me know if you need any pointers.

@chad3814
Author

I have 4T in btrfs that I want to back up off-site; that would cost $120/mo on S3. Even with reduced redundancy it's still about $100/mo. I also have a Prime account, so for $60/yr I can get unlimited storage.
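
Rough math, assuming S3 list prices of about $0.03/GB-month standard and $0.024/GB-month reduced redundancy (ballpark figures; they vary by region):

```python
gb = 4 * 1024        # 4 TB expressed in GB
print(gb * 0.030)    # ~$123/month on standard S3 storage
print(gb * 0.024)    # ~$98/month on reduced-redundancy storage
```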

Imagine if I tried to back up my entire 40T btrfs array :) (although I imagine Amazon would contact me if I tried storing 40T in ACD)

@AmesCornish
Owner

I see your logic. Glacier would be cheaper than S3, but still more than "unlimited". Compression might help a bit too...

  • Ames


@GrahamCobb

I am looking at a somewhat similar use case, although using Glacier instead of ACD. A couple of questions:

  1. Does buttersink work (effectively) with Glacier? In particular, does it keep enough information locally that it doesn't have to retrieve anything back out of Glacier in order to work out how to do the sync (which snapshots to use for diffs, etc.)?

  2. Is there a way to upload the initial snapshot using an out-of-band channel -- in particular, by importing a physical disk? There is no way I can upload several TB of data over the Internet. I could, however, do a btrfs send to a file on an external disk and ship that to Amazon to import into my Glacier storage. My daily changes to the data are much more reasonable to upload, as long as the base snapshot is already there.

@AmesCornish
Owner

Graham,

1 - Theoretically buttersink should work effectively with S3 objects that have been migrated to Glacier. It does not need to read the snapshots back out of S3 to sync new snapshots up to it. If you try it, I would be very interested in any bug reports or feature requests for it to work better.
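
For reference, the migration itself would just be a lifecycle rule on whatever bucket/prefix buttersink is writing to; something along these lines (boto3 shown here; the bucket name, prefix, and timing are placeholders):

```python
# Example lifecycle rule: transition everything under the buttersink
# prefix to Glacier after 30 days.  Bucket, prefix, and days are
# placeholders -- adjust to your setup.
import boto3

s3 = boto3.client("s3")
s3.put_bucket_lifecycle_configuration(
    Bucket="my-backup-bucket",
    LifecycleConfiguration={
        "Rules": [{
            "ID": "archive-btrfs-snapshots",
            "Filter": {"Prefix": "butter/"},
            "Status": "Enabled",
            "Transitions": [{"Days": 30, "StorageClass": "GLACIER"}],
        }]
    },
)
```

One caveat: a prefix rule like that will sweep up the small ".bs" files too, which is what the next point is about.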

One note is that buttersink might get confused if the ".bs" files are in Glacier and not retrievable. They are very small and are not essential, however, so I would recommend either not archiving them in Glacier or just deleting them when you archive the associated snapshots.
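
If you go the "just delete them" route, a small boto3 loop is enough (again, the bucket and prefix are placeholders):

```python
# Delete the small ".bs" metadata files under the buttersink prefix once
# the snapshots they describe have been archived.  They are not essential.
import boto3

s3 = boto3.client("s3")
paginator = s3.get_paginator("list_objects_v2")
for page in paginator.paginate(Bucket="my-backup-bucket", Prefix="butter/"):
    for obj in page.get("Contents", []):
        if obj["Key"].endswith(".bs"):
            s3.delete_object(Bucket="my-backup-bucket", Key=obj["Key"])
            print("deleted", obj["Key"])
```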

2 - AWS does have an out-of-band import service. IIRC, you send them physical media, and they copy it and mount it in EBS. You could use this to ship them your snapshots on a btrfs volume, mount that on an EC2 instance, and then use buttersink to sync from that instance up to S3. Later syncs could be done from a local machine. Again, I have not tried this, and would be interested in hearing whether/how it worked for you.

HTH!

  • Ames

@Arzte

Arzte commented Nov 14, 2016

@GrahamCobb Did you ever have any luck using buttersink with Glacier? I'm also interested in using it this way, since it's much cheaper.
