Syncing your files with an S3 account on Linux

Amazon's cloud is incredibly useful for a number of things, including backup. I don't, as a rule, trust much to 'the cloud', but it does provide a nice, easy way to achieve off-site backups at a reasonable cost.

This documentation explains one of the many ways to achieve automated backup to your S3 account (set one up here).

To begin with, we need to install s3cmd (I've had limited success with the FUSE driver, so opted for the more reliable route).
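As an aside, most distributions package s3cmd these days, so if you'd rather not build from source you may be able to install it straight from your repositories instead (the packaged version can lag a little behind, though). On Ubuntu that would be something like

sudo apt-get install s3cmd

If you go down that route you can skip straight to the s3cmd --configure step below.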

In a console

sudo -s
[Enter your password]
cd /usr/local/src
wget "http://bit.ly/xlzw8C" -O s3cmd.tar.gz
tar xvzf s3cmd.tar.gz
cd s3cmd-1.0.1
python setup.py install
# If this last command returns an error, try installing the Python setuptools package
# On Ubuntu: apt-get install python2.4-setuptools
exit
s3cmd --configure

Enter your S3 Access Key (you can view it here) and press Enter, then enter your Secret Key (it's on the same page, but you'll need to click 'Show') and press Enter again.

You can use an encryption password if you want; I tend not to, for simplicity's sake. Whether or not you need one depends on what you'll be backing up, but it's probably better to set something! Press Enter to accept each of the remaining defaults unless you know yours differ.

When asked whether to use HTTPS, I usually answer Yes. Read the information it gives you, but I've yet to run into any issues.

Make sure you enter "Y" to save settings!
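On the versions I've used, everything you enter here is saved to a plain config file in your home directory, so if you later need to swap keys or toggle HTTPS you don't have to remember the wizard's answers

cat ~/.s3cfg        # your credentials and options live here - keep this file private!
s3cmd --configure   # re-running the wizard just updates the same file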


s3cmd is now set up, so we need to create a bucket for our backups. Let's start by checking whether we already have any

s3cmd ls

If you've set up a new S3 account, this won't return anything. Otherwise it'll list any buckets you already have; if you're planning on using one of those, skip the next step. Bucket names have to be unique across the whole of S3, so you may need to get a little creative. In either case, remember to substitute your own bucket name for the one I've used!

s3cmd mb s3://mynewbucket

Now if we run the ls command again we should see some output

s3cmd ls

2012-03-01 12:55  s3://mynewbucket

Now we'll try syncing some files (which is how we'll perform the backup). We'll start by creating a test folder to sync

mkdir testdir
echo "" > testdir/test.txt
mkdir testdir/testdir2
echo "Test2" > testdir/testdir2/test.txt
s3cmd sync --recursive testdir s3://mynewbucket

Your system should now have uploaded the folder testdir and everything below it to S3. You can check with

s3cmd ls s3://mynewbucket

From here it should be reasonably obvious what you need to do to sync your own folders. You can also specify a directory on the S3 side, so we could re-sync the folder like so

s3cmd sync --recursive testdir s3://mynewbucket/testdir
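One thing that's easy to trip over: like rsync, s3cmd sync treats a trailing slash on the source as "the contents of this directory" rather than the directory itself (at least in the versions I've used), so the following two commands don't put files in the same place

s3cmd sync --recursive testdir s3://mynewbucket/    # files end up under s3://mynewbucket/testdir/
s3cmd sync --recursive testdir/ s3://mynewbucket/   # contents land directly at the top of the bucket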

So let's remove our test folder and then create a backup script

s3cmd del --recursive s3://mynewbucket/testdir

From here it's reasonably simple

nano ~/.s3backup.sh
#!/bin/bash
s3cmd sync --recursive /home/$USER/mydir/to/backup s3://mynewbucket
# Repeat for each of the directories you want to back up
# Lines should look like
# s3cmd sync --recursive /home/ben/Pictures s3://mynewbucket

# Press Ctrl-X to exit, hit Y to save
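If you're backing up more than a couple of directories it can be neater to keep them in a single list and log each run. Here's a slightly fuller sketch of the same script - the directories and log path are just examples, so adjust them to suit

#!/bin/bash
# Directories to back up - edit this list for your own machine
DIRS="/home/ben/Pictures /home/ben/Documents"
LOG="$HOME/.s3backup.log"

for DIR in $DIRS; do
    echo "$(date) - syncing $DIR" >> "$LOG"
    s3cmd sync --recursive "$DIR" s3://mynewbucket >> "$LOG" 2>&1
done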

We now need to make that script executable and add it to the crontab


chmod +x ~/.s3backup.sh
crontab -e

## Add the following line, replacing $USER with your username
0 0 * * * /home/$USER/.s3backup.sh

# Ctrl-X to exit, Y to save (assuming Nano opened)
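If you'd like a record of what each nightly run actually did, you can redirect the script's output to a log file in the crontab entry instead - again, swap in your own username and whatever log path suits you

0 0 * * * /home/$USER/.s3backup.sh >> /home/$USER/.s3backup.log 2>&1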

Your sync script will now run every night at midnight. Only files that have changed will be uploaded, although the first run will of course involve uploading everything. If you want to run it immediately

~/.s3backup.sh
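If you're not sure what that first big run is going to upload, s3cmd has a dry-run mode that lists what it would transfer without actually sending anything - handy for sanity-checking your paths before letting cron loose

s3cmd sync --dry-run --recursive /home/ben/Pictures s3://mynewbucket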


Simple!