Synchronizing two computers using Amazon S3

18 April 2007 By Avinash Meetoo 11 Comments

I’ve moved to Amazon S3. I can now store 1 GB of data online at $0.15 per month (Rs 5). Transfers cost $0.20 per GB (Rs 6.50). This is extremely cheap for the peace of mind you can enjoy when you know “your data is safe sitting on Amazon’s geo-redundant servers right between some bits describing a new book from Oprah and a bad review on latest Ben Affleck movie!”
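To make the arithmetic concrete, here is a small sketch of the monthly bill. It only uses the two April 2007 prices quoted above; the function and constant names are mine, and it ignores any other charges Amazon may apply.

```python
# Prices as quoted in this post (April 2007). Treat them as assumptions,
# not as current Amazon rates.
STORAGE_USD_PER_GB_MONTH = 0.15
TRANSFER_USD_PER_GB = 0.20


def monthly_cost_usd(stored_gb, transferred_gb):
    """Storage charge plus transfer charge for one month, in US dollars."""
    return (stored_gb * STORAGE_USD_PER_GB_MONTH
            + transferred_gb * TRANSFER_USD_PER_GB)


if __name__ == "__main__":
    # Keeping 1 GB online and moving 1 GB in the same month.
    print(f"${monthly_cost_usd(1, 1):.2f}")
```

So the month in which the first gigabyte goes up costs about 35 cents, and later months cost mostly the storage part.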

I have downloaded and started using Jungle Disk and it works great! Jungle Disk (simply said) allows you to mount your S3 online storage as a normal drive. It even caches data and transfers in the background so, in most cases, you don’t even notice that you are working online. It is multiplatform (Linux, Mac OS X and Windows) and is free (for the time being – the source is also available).

This is all great… except for one major issue: I don’t want to do backups! I want synchronization!

Let me explain. I use my MacBook at home and one Linux PC at work. I want to be able to access my files both at home and at work. Naturally, I can do this by moving all my files to S3 and then use Jungle Disk to access them transparently. Unfortunately, this scheme does not work when the Internet connection is down and no local cached copy of the data exists. And having no Internet connection in our poor country is relatively common.

One solution would be to have the data locally on the MacBook and on the Linux PC. Then S3 would be used as an intermediate to synchronize the two computers. Unfortunately, I have no idea how to do this. I’ve been reading about s3sync.rb and rsync (over the Jungle Disk mount) but I do not understand how to do file synchronization with them. I’ve also read about unison but I am not too confident of its performance with respect to S3…

Any idea?

Comments

John says

19 April 2007 at 03:09

You might want to check out jets3t.

http://jets3t.s3.amazonaws.com/applications/synchronize.html
avinash says

19 April 2007 at 08:49

Thanks John for this tip and your inspiring article
frederic says

19 April 2007 at 09:06

It isn’t relevant with the limited upload bandwidth we have here in Mauritius. I just can’t see myself uploading 1 GB or more of data anywhere. That’s just not feasible.
avinash says

19 April 2007 at 11:26

Yes, 128kbit/s is slow. But only the initial upload will take a long time. As soon as this is done (i.e. after some days or even weeks) then you can enjoy having all your data online.

The fact is you don’t have to wait. The data is still available locally. The initial upload can proceed in the background without you even noticing. And you can do it little by little. No need to have your computer on for weeks…

For the time being I only want to upload around 1 GB of work-related data (my personal data is regularly backed up on an external drive) to be able to work from home… And I only create a maximum of 1 MB of new work-related data every day so incrementally updating S3 will be extremely fast if I can identify a nice file synchronisation application which is multiplatform and free.
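
As a rough check on these figures, here is a back-of-the-envelope calculation. It assumes the 128 kbit/s applies to uploads, that the link runs at its full rate with no protocol overhead, and that 1 GB means 10^9 bytes; real transfers will be slower.

```python
GB = 10**9  # decimal gigabyte, an assumption
MB = 10**6  # decimal megabyte, an assumption


def upload_seconds(size_bytes, link_kbit_per_s):
    """Transfer time at the full line rate, ignoring protocol overhead."""
    bits = size_bytes * 8
    return bits / (link_kbit_per_s * 1000)


initial = upload_seconds(1 * GB, 128)
daily = upload_seconds(1 * MB, 128)

print(f"initial 1 GB: {initial / 3600:.1f} hours")  # 17.4 hours
print(f"daily 1 MB: {daily:.1f} seconds")           # 62.5 seconds
```

About 17 hours of continuous uploading turns into days once it is split into short sessions on a shared connection, while the daily increment needs roughly a minute.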
Eddy Young says

19 April 2007 at 16:26

One word: CVS. OK, I’ll throw in another one for FREE: Subversion :-)

Seriously, this is interesting in that it opens avenues for RDBMS-less applications; just dump objects into S3! I’ll have to explore that.

Eddy
avinash says

19 April 2007 at 17:47

But what CVS and Subversion server?

Is there something somewhere where I pay only $0.15 per GB (unlimited) apart from S3? I’m planning (in the very very long run) to have 100 GB there (all my files + MP3 + photos + film on my kids growing up + etc).

PS: I am exploring synchronize (from jets3t) right now and it seems to be very good. In fact, I am uploading my first GB at this very moment from work…
Eddy Young says

19 April 2007 at 19:58

I have read about the S3 tools summarily, but all seem to work on the concept of mapping S3 to a local drive. So… you could set up a local repository on that drive, then use CVS/Subversion commands to keep your files in sync.

I would recommend Assembla for free Subversion, but the repository size limits (200 MB per “space”) may not suit your needs.

http://assembla.com/tour

I am also looking into S3 although I’m a bit wary of putting my life onto someone else’s servers.
avinash says

20 April 2007 at 00:27

Yes and no.

When you use something like Jungle Disk or FUSE then you essentially have a mapped drive onto S3.

But I am currently using jets3t to upload files to S3 and it feels much more like a FTP server. The trick is to work locally (say at home) and, just before switching off, synchronise your HD with S3. Then you go to work and synchronise your computer there to get the new files…

It’s a pain in the ass, I know (it would have been much better to work online) but bandwidth and reliability of the connection are laughable here as you know.
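
The "synchronise before switching off, synchronise again on arrival" routine boils down to comparing two file listings. Here is a minimal sketch of that comparison. It is not how jets3t works internally (I have not checked); the function names are hypothetical, the remote listing is assumed to be a plain dictionary of path to MD5 digest obtained by whatever tool you use, and the most recent push simply wins, with no conflict detection.

```python
import hashlib
from pathlib import Path


def build_manifest(root):
    """Map each file's path (relative to root) to the MD5 of its contents."""
    root = Path(root)
    manifest = {}
    for path in sorted(root.rglob("*")):
        if path.is_file():
            digest = hashlib.md5(path.read_bytes()).hexdigest()
            manifest[path.relative_to(root).as_posix()] = digest
    return manifest


def plan_push(source, target):
    """Return (to_copy, to_delete) so that target ends up mirroring source.

    Both arguments are dicts of relative path -> content digest.
    """
    # Copy anything that is missing from the target or differs there.
    to_copy = sorted(p for p, h in source.items() if target.get(p) != h)
    # Delete anything the target has that the source no longer has.
    to_delete = sorted(p for p in target if p not in source)
    return to_copy, to_delete


if __name__ == "__main__":
    home = {"notes.txt": "aaa", "report.doc": "bbb"}
    s3 = {"notes.txt": "aaa", "report.doc": "old", "draft.txt": "ccc"}
    print(plan_push(home, s3))  # (['report.doc'], ['draft.txt'])
```

Before switching off at home you would call plan_push(local, remote) and upload the first list; at work you swap the arguments, plan_push(remote, local), and download instead. Only changed files travel, which is what keeps the daily step short on a slow line.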
Eddy Young says

21 April 2007 at 00:10

I tried jets3t, and indeed synchronize seems to be the perfect fit for your needs. The usage would be exactly similar to that of a CVS client: update from server, modify, commit to server, and so on.

You may want to look into a version-control system also. Not only would you be able to synchronise your files, but you’d get the added benefit of managing versions. I use Subversion with Assembla, and the combination works great. The big problem remains finding a free/cheap repository that would give you the same amount of storage space as Amazon S3.
Khalil A. says

21 April 2007 at 10:04

Well, I’m lucky that you’ve blogged about Amazon S3 about now. I’ve been looking at a variety of hosts and I must say that Amazon S3 beats them all.

Gosh, I miss blogging. But once I’m set-up, I should be back quickly enough.
econclubmu says

25 April 2007 at 06:39

I use Xdrive; it’s an AOL subsidiary. It gives you 5 GB for free. All you need is an AIM account, or you can sign up for one with them. It’s been pretty reliable for the past year that I’ve used it and has an easy interface. http://www.xdrive.com/
