Backups are great, and they’re not rocket science. I’m writing up how I do backups, not because I think it’s a cool or unique setup (it’s not), but to highlight how effective a simple solution can be.
We use rsync to take a local copy of whatever is on our web host without wasting bandwidth downloading files that aren’t needed. The layout looks like this:
Our hosting provider is accessible via ssh, and the backup box we use is a Raspberry Pi model B, costing approximately 50 AUD to get running.
On the server
On the server, I back up databases with mysqldump. To do this, you need to enter user details into a .my.cnf file, and then something like this will do the trick:
The above script is named database-dump.sh, and is called from the backup box to dump the databases to a file before grabbing all the files.
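A minimal sketch of what such a dump script could look like, not necessarily the exact script used here. It assumes ~/.my.cnf has a [client] section with the user and password; the function name dump_databases, the output file name databases.sql and the example directory are made up for illustration.

```shell
#!/bin/sh
# database-dump.sh (sketch): dump all MySQL databases to one compressed file.
# Credentials are read from ~/.my.cnf, which would contain something like:
#   [client]
#   user=backup          (hypothetical user name)
#   password=...         (fill in your own)
# Keep that file private, e.g. chmod 600 ~/.my.cnf
set -eu

dump_databases() {
    dump_dir=$1
    mkdir -p "$dump_dir"
    # --all-databases dumps every database this user can read;
    # --single-transaction takes a consistent snapshot of InnoDB tables.
    mysqldump --all-databases --single-transaction > "$dump_dir/databases.sql"
    # Compress the dump, replacing any archive left from a previous run.
    gzip -f "$dump_dir/databases.sql"
}

# Only run when an output directory is given, e.g.:
#   ./database-dump.sh "$HOME/backup"
if [ "$#" -ge 1 ]; then
    dump_databases "$1"
fi
```

Writing the dump to a file first, rather than piping it straight into gzip, means a failing mysqldump stops the script instead of silently producing an empty archive.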
On the backup box
First, a script to get the files. I use password-less (key-based) login, set up with ssh-copy-id, so that this works non-interactively:
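As a rough sketch of such a fetch script, under some assumptions: the host user@example.com, the remote path ./ (the whole home directory), the local directory and the function name fetch_files are all placeholders, and database-dump.sh is assumed to sit in the remote home directory and to accept an output directory as its argument.

```shell
#!/bin/sh
# backup.sh (sketch): run the remote database dump, then mirror the files.
# One-time setup for key-based login (run once, interactively):
#   ssh-keygen
#   ssh-copy-id user@example.com
set -eu

# Hypothetical paths; override them through the environment.
REMOTE_DIR="${REMOTE_DIR:-./}"
LOCAL_DIR="${LOCAL_DIR:-$HOME/backup/current}"

fetch_files() {
    remote=$1
    mkdir -p "$LOCAL_DIR"
    # Dump the databases on the server first, so the dump is copied too.
    # The single quotes make $HOME expand on the server, not locally.
    ssh "$remote" './database-dump.sh "$HOME/backup"'
    # -a keeps permissions and timestamps, -z compresses in transit,
    # --delete removes local files that no longer exist on the server.
    rsync -az --delete "$remote:$REMOTE_DIR" "$LOCAL_DIR/"
}

# Only run when a remote host is given, e.g.:
#   ./backup.sh user@example.com
if [ "$#" -ge 1 ]; then
    fetch_files "$1"
fi
```

Because rsync only transfers files that changed, repeated runs use little bandwidth. Note that --delete makes the local copy an exact mirror, so deleted files survive only in older archives.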
We save a copy of the files as they stand on each run in a dated archive, so we can go back in time to recover things that have since been deleted. At the end of the above script:
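One way this archiving step might look, as a sketch only. It assumes the mirrored files live in a directory named current under a backup root; the archive naming scheme and the function names archive_name and archive_snapshot are invented for the example.

```shell
#!/bin/sh
# Sketch: pack the current mirror into an archive named after today's date.
set -eu

# Build an archive file name from a date string such as 2024-01-31.
archive_name() {
    echo "backup-$1.tar.gz"
}

# Archive the directory "current" inside $1 into $1/archive/.
archive_snapshot() {
    backup_root=$1
    stamp=$(date +%Y-%m-%d)
    mkdir -p "$backup_root/archive"
    # -C changes into the backup root so paths in the archive stay relative.
    tar -czf "$backup_root/archive/$(archive_name "$stamp")" \
        -C "$backup_root" current
}

# Only run when a backup root is given, e.g.:
#   ./archive.sh "$HOME/backup"
if [ "$#" -ge 1 ]; then
    archive_snapshot "$1"
fi
```

Each run produces a full copy, so disk use grows with every archive; pruning old ones is left out of this sketch.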
There aren’t a huge number of changes to record daily, so I get cron to run the above script weekly on the backup box. Read man crontab for how to do this.
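For illustration, a weekly schedule could be installed along these lines. The time (03:00 every Sunday), the script path and the helper name add_cron_line are assumptions, not details from this setup.

```shell
#!/bin/sh
# Sketch: add a weekly cron entry without duplicating it on repeated runs.
set -eu

# Fields: minute hour day-of-month month day-of-week command.
# "0 3 * * 0" means 03:00 every Sunday. The path and host are hypothetical.
CRON_LINE="0 3 * * 0 $HOME/bin/backup.sh user@example.com"

# Print the existing crontab minus any identical line, then the new line.
add_cron_line() {
    # crontab -l fails when no crontab exists yet; treat that as empty.
    # grep exits non-zero when it prints nothing, hence the final "|| true".
    { crontab -l 2>/dev/null || true; } | grep -vxF "$1" || true
    echo "$1"
}

# To install the result as the new crontab:
#   add_cron_line "$CRON_LINE" | crontab -
```

Editing the table by hand with crontab -e and pasting in the same line works just as well.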
The elephant in the room
If you think you shouldn’t be doing backups, you’re wrong.
I just want to wrap up this blog post by addressing some of the most common rationalisations I’ve seen for skipping over this.
Excuse 1: Trust
This is the idea that whoever has the data won’t lose it.
Our host is pretty good, but their terms of service state that they won’t be responsible for any data loss. Even providers which have support agreements can make mistakes. You’ll also be able to work faster if you’re not paranoid about any mistake being unrecoverable.
Excuse 2: Expense
This is the idea that backup is a nice idea but not worth it.
I think that underlying this excuse is an ingrained idea that a backup system must have a particular set of (expensive) characteristics, when in reality a scheduled rsync is an improvement over no backup at all.
It’s dirt cheap, you can learn to do it yourself, and once running it requires virtually no administration.
Excuse 3: RAID
Lastly, the idea that after investing in RAID, or some other kind of redundancy, you don’t need backups.
If you accidentally delete something, or notice that some of your files have been tampered with, then RAID will not help you. If there is a problem (e.g. fire) at the hosting location, then you will be in trouble regardless of disk redundancy.