Linux Server Backup on Amazon S3 Cloud Using JungleDisk

On backups and drives

More and more online content is added directly to servers, so there is no local copy. Think uploaded files, think content management systems.

As a consequence, server data backup is getting even more important than before.

You can back up on the same server, or better, on a different drive. That is not good enough: your hard drive might break, you might lose access to the server for whatever reason, or the server room might be destroyed in an earthquake.
It's better to back up to a NAS, a tape streamer, or optical disks, but these are costly in terms of both licensing and maintenance.
Better still is to back up to a different computer. For example, you can download the backups to your desktop computer and then store them away.
If you have another server and it runs an FTP service, you can copy the backups there, but FTP is inherently insecure and you need to have an FTP server in the first place.

The most sophisticated and elegant way, though, is only now becoming possible: store your data in a computer network cloud, a virtual, practically unlimited, reliable and extremely low-cost drive. Meet Amazon S3!

Still, Amazon S3 is just an API; it needs creative programmers to develop interfaces that let you do the backup. By far the best tool existing today is called JungleDisk. Though the name (and logo!) might appear rather silly at first, JungleDisk will surprise you with such qualities (including rsync-like block-level transfers!) that after a while you will end up loving the name as well (well, that's what happened to me!). The developer, Dave Wright, and his colleagues are moreover extremely quick to respond and provide excellent support via a lively forum.

The desktop version of JungleDisk is trouble-free: just get an Amazon S3 account, download JungleDisk, and you'll be able to enjoy an infinite hard drive always at your disposal, wherever you are.

Linux server backup

It took me a lot of time to research and put together a clear procedure for installing and running the command-line Linux version of JungleDisk, so I have decided to post it here as a case study.

System: CentOS 5 64-bit (you can also get one from the excellent LayeredTech.com).

I downloaded the USB version of JungleDisk for Windows and configured it in the GUI, which created the settings file jungledisk-settings.xml. Note that you can encrypt your data, including file names, if you so choose, increasing the already high security of the stored files.

I opened an SSH terminal connection to the Linux server, using PuTTY, and created a folder /var/www/vhosts/example.com/jungledisk/

I downloaded the Linux version of JungleDisk and uploaded only the command-line executable jungledisk and the above jungledisk-settings.xml config file to that folder: /var/www/vhosts/example.com/jungledisk/

Then I went to that directory in PuTTY:
cd /var/www/vhosts/example.com/jungledisk

Then I needed to install fuse. I tried yum install fuse, which seemed to work, but fuse then did not appear to function.

So (with a hint from Dave Wright) I got and installed the RPMs for dkms and dkms-fuse, like this:

wget ftp://fr2.rpmfind.net/linux/dag/redhat/el5/en/x86_64/dag/RPMS/dkms-2.0.19-2.el5.rf.noarch.rpm
rpm -i dkms-2.0.19-2.el5.rf.noarch.rpm
wget ftp://fr2.rpmfind.net/linux/dag/redhat/el5/en/x86_64/dag/RPMS/dkms-fuse-2.7.3-1.nodist.rf.noarch.rpm
rpm -i dkms-fuse-2.7.3-1.nodist.rf.noarch.rpm

Then I thought I was ready to mount the infinite drive, of course specifying the config file, so I used:
/var/www/vhosts/example.com/jungledisk/jungledisk /mnt/jungledisk -o config=/var/www/vhosts/example.com/jungledisk/jungledisk-settings.xml

That did not give any error message, but when I looked at the error log:
more /var/log/jungledisk.log

It said that fuse was missing, which called for loading the kernel module:
modprobe fuse
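
To avoid digging through the log next time, this check can be scripted. What follows is only a minimal sketch: the function name fuse_loaded is my own invention, and it assumes that a loaded fuse module is listed in /proc/modules (a kernel with fuse compiled in would not show up there).

```shell
#!/bin/sh
# fuse_loaded is a hypothetical helper, not part of JungleDisk or fuse.
# It succeeds when a line of the module list starts with "fuse ".
# The optional argument (default: /proc/modules) lets you try it on a sample file.
fuse_loaded() {
    grep -q '^fuse ' "${1:-/proc/modules}"
}
```

As root, something like fuse_loaded || modprobe fuse could then run before each mount attempt.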

After running modprobe fuse I again issued:
/var/www/vhosts/example.com/jungledisk/jungledisk /mnt/jungledisk -o config=/var/www/vhosts/example.com/jungledisk/jungledisk-settings.xml
but the log said the port was occupied, presumably because the jungledisk process from my first attempt was still running. That was a good occasion to learn how to quit the jungledisk process:
killall -QUIT jungledisk
and start it once again:
/var/www/vhosts/example.com/jungledisk/jungledisk /mnt/jungledisk -o config=/var/www/vhosts/example.com/jungledisk/jungledisk-settings.xml
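
The stop-and-start routine above can be wrapped in a small script. This is a sketch under my own assumptions, not anything JungleDisk ships: the function names are made up, the mount is detected by looking for the mount point in /proc/mounts, the two-second pause after killall is a guess, and mkdir -p is there in case the mount point does not exist yet.

```shell
#!/bin/sh
JD_DIR=/var/www/vhosts/example.com/jungledisk
JD_MNT=/mnt/jungledisk

# is_mounted DIR [MOUNTS_FILE]: succeeds if DIR appears as a mount point
# (second field) in the mounts list, /proc/mounts by default.
is_mounted() {
    awk -v m="$1" '$2 == m { found = 1 } END { exit !found }' "${2:-/proc/mounts}"
}

# restart_jungledisk: quit any running instance, then mount again
# with the same command line as used in this article.
restart_jungledisk() {
    killall -QUIT jungledisk 2>/dev/null
    sleep 2                 # assumed grace period, adjust as needed
    mkdir -p "$JD_MNT"      # assumption: the mount point may not exist yet
    "$JD_DIR/jungledisk" "$JD_MNT" -o config="$JD_DIR/jungledisk-settings.xml"
}
```

A line such as is_mounted "$JD_MNT" || restart_jungledisk then brings the drive up only when it is not already there.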

You will be surprised, but that's all there is to it! Now anything you copy (or rsync, if you want) to /mnt/jungledisk will be securely stored in the distributed cloud storage, so you can access it from anywhere in the world!

The last thing you need to do is set up a crontab entry that runs the copying regularly, while you do something more exciting!
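
For the crontab part, here is the kind of script I would start from. It is a sketch, not a recipe: the source directory, the backup subfolder, the lock path, and the script name backup-to-jungledisk.sh are all placeholders of mine, and rsync is assumed to be installed on the server.

```shell
#!/bin/sh
# backup-to-jungledisk.sh (hypothetical name)
SRC=/var/www/vhosts/example.com/httpdocs   # placeholder: what to back up
DEST=/mnt/jungledisk/backup                # placeholder: folder on the mounted drive
LOCK=/tmp/jungledisk-backup.lock           # placeholder: lock directory

run_backup() {
    # mkdir either creates the lock or fails, so overlapping runs are skipped.
    if ! mkdir "$LOCK" 2>/dev/null; then
        echo "previous backup still running, skipping" >&2
        return 1
    fi
    mkdir -p "$DEST"
    rsync -a "$SRC/" "$DEST/"
    status=$?
    rmdir "$LOCK"
    return "$status"
}

# Only copy when asked to, so the file can be sourced without side effects.
if [ "${1:-}" = "--run" ]; then
    run_backup
fi

# Example crontab line (added with crontab -e), every night at 03:00:
# 0 3 * * * /var/www/vhosts/example.com/jungledisk/backup-to-jungledisk.sh --run
```

The lock matters because the first upload to S3 can take far longer than the interval between two cron runs.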

Enjoy!

It is clear that cloud / network storage is set to change both the Internet and personal storage and hosting forever (and it was about time!). Among others, look out for the emerging hosting counterpart of the Amazon S3 cloud storage, Amazon EC2, which allows you to use the cloud like any other Linux (or Windows, whatever!) server, again with immense stability, scalability, access speed, and security. No more need for dedicated physical servers, complicated failover systems, geographic redirecting, etc. Google is taking, somewhat belatedly, a rather different approach with their Google Apps. And there are many others trailing behind, hoping to ride the wave, all for the benefit of the users :-)

Tomáš Fülöpp

Sint-Agatha-Berchem, Belgium

July 18, 2008
