
Back Up Your Files with Simple Bash Scripts

Published June 25, 2017, 8:38 pm

Ever lost data stored on a USB drive just because the drive stopped working and you did not have a backup? How often did you promise yourself to set up a backup system so this would not happen again, only to forget about it a few days later? You are not alone, I did the same. That changed a few months ago, when I decided to store my data on my own NAS, run by a RaspberryPi 3 and OwnCloud, to give me the feeling of being in control of where my data is physically stored: on a USB drive below my desk, without a popup reminding me that my Dropbox is running out of space.

As hard drives tend to fail, I decided to put a backup system in place so the data is safe as long as only one of the two hard drives stops working. And this was quite easy, so I want to share the simple bash scripts I use to create incremental backups of my data.

The Strategy

First, here is the backup strategy I implemented:
 – For the last 5 years, I want to keep one backup per year: the first backup of that year
 – For the last year, I want to keep the first backup of each month
 – For the last month, I want to keep the backup of every Monday
 – For the last week (7 days), I want to keep the backup of every day

That sounds like a lot of data to store. But it is not, if you use the rsync argument --link-dest <folder>. With it, rsync does not copy files that are unchanged compared to <folder>. Instead, it creates hard links in the target folder that point to the existing files in <folder>. So, after the initial full copy, every new backup only needs a little additional space: the data that actually changed (which is exactly the data we want to back up), plus some overhead for folders and the hard links.
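To make the space saving more tangible, here is a minimal sketch of what a hard link does, independent of rsync. The folder and file names are made up for the demo, and everything happens in a temporary directory.

```shell
#!/bin/bash
# Hard links in a nutshell, on throwaway folders (all names are made up).
WORK=$(mktemp -d)
mkdir "${WORK}/day1" "${WORK}/day2"

# A 1 MiB file of random bytes in the first "backup".
dd if=/dev/urandom of="${WORK}/day1/photo.bin" bs=1024 count=1024 2>/dev/null

# The second "backup" gets a hard link instead of a copy.
ln "${WORK}/day1/photo.bin" "${WORK}/day2/photo.bin"

# du counts a file with several hard links only once per invocation.
ONE_DAY_KB=$(du -sk "${WORK}/day1" | cut -f1)
BOTH_DAYS_KB=$(du -sk "${WORK}" | cut -f1)
echo "day1 alone:    ${ONE_DAY_KB} KiB"
echo "day1 and day2: ${BOTH_DAYS_KB} KiB"

# Deleting the older folder does not harm the newer one: the data stays
# on disk as long as at least one link to it exists.
rm -r "${WORK}/day1"
ls -l "${WORK}/day2/photo.bin"
```

Both folders together should take barely more space than one of them alone, and the file in day2 survives the removal of day1. This is why old backups can later be deleted without breaking newer ones.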
Here is the command we can use to create such incremental backups with rsync:

rsync -a --delete --link-dest ${LASTDAYPATH} ${DATADIR} ${TODAYPATH}
This command creates a backup of ${DATADIR} to ${TODAYPATH} creating links of unchanged data to ${LASTDAYPATH}.
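To see the effect of --link-dest for yourself, the following sketch runs the same command on throwaway folders. The file names and the temporary directory are assumptions for the demo, and it assumes rsync is installed.

```shell
#!/bin/bash
# Demo of --link-dest on throwaway folders; nothing outside the temp dir is touched.
command -v rsync >/dev/null 2>&1 || { echo "rsync is not installed"; exit 0; }

WORK=$(mktemp -d)
DATADIR=${WORK}/data/
LASTDAYPATH=${WORK}/backup/day1
TODAYPATH=${WORK}/backup/day2
mkdir -p "${DATADIR}" "${LASTDAYPATH}" "${TODAYPATH}"

echo "stays the same" > "${DATADIR}unchanged.txt"
echo "version 1"      > "${DATADIR}changed.txt"

# First backup: a plain full copy.
rsync -a --delete "${DATADIR}" "${LASTDAYPATH}"

# Modify one file, then take the second backup against the first one.
echo "version 2, now longer" > "${DATADIR}changed.txt"
rsync -a --delete --link-dest "${LASTDAYPATH}" "${DATADIR}" "${TODAYPATH}"

# Two paths with the same inode number are hard links to the same data.
inode() { ls -i "$1" | awk '{print $1}'; }

if [ "$(inode "${LASTDAYPATH}/unchanged.txt")" = "$(inode "${TODAYPATH}/unchanged.txt")" ]; then
    UNCHANGED_LINKED=yes
else
    UNCHANGED_LINKED=no
fi
if [ "$(inode "${LASTDAYPATH}/changed.txt")" = "$(inode "${TODAYPATH}/changed.txt")" ]; then
    CHANGED_LINKED=yes
else
    CHANGED_LINKED=no
fi

echo "unchanged.txt is a hard link: ${UNCHANGED_LINKED}"
echo "changed.txt is a hard link:   ${CHANGED_LINKED}"
```

The unchanged file should show up as a hard link, while the modified file is stored as a new copy and the old version stays untouched in the first backup.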

The Scripts

This command should now be executed every night by a cron job. I wrapped it in the following script:

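#!/bin/bash

TODAY=$(date +%Y-%m-%d)
BACKUPDIR=/nas/backup/daily
SCRIPTDIR=/nas/data/backup_scripts
DATADIR=/nas/data/
# Most recent backup other than today's. The folders are named by date,
# so the alphabetical order of ls is also the chronological order.
LASTDAYPATH=${BACKUPDIR}/$(ls "${BACKUPDIR}" | grep -v -x "${TODAY}" | tail -n 1)
TODAYPATH=${BACKUPDIR}/${TODAY}
mkdir -p "${TODAYPATH}"
# Additional command-line arguments are passed on to rsync.
rsync -a --delete --link-dest "${LASTDAYPATH}" "${DATADIR}" "${TODAYPATH}" "$@"
"${SCRIPTDIR}/deleteOldBackups.sh"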
The data hard drive is mounted to /nas/data, the backup hard drive is mounted to /nas/backup. Every day, the backup script creates a backup of the data drive on the backup drive (in the folder daily, which might be a misleading name as we store all the backups in it).
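For the nightly schedule, a crontab entry along the following lines could be used. This is only a sketch: the script name backup.sh, the log file and the time of 03:00 are assumptions, so adjust them to your setup.

```shell
#!/bin/bash
# Hypothetical script name and log path: adjust both to your setup.
BACKUP_SCRIPT=/nas/data/backup_scripts/backup.sh
LOG_FILE=/var/log/nas_backup.log

# A crontab line has five time fields (minute, hour, day of month, month,
# day of week) followed by the command. "0 3 * * *" means every day at 03:00.
CRON_LINE="0 3 * * * ${BACKUP_SCRIPT} >> ${LOG_FILE} 2>&1"

# Print the line so it can be reviewed before it is installed.
echo "${CRON_LINE}"

# To append it to the current user's crontab, uncomment the next line:
# (crontab -l 2>/dev/null; echo "${CRON_LINE}") | crontab -
```

Redirecting the output to a log file makes it easier to check later whether a nightly run failed.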

At the end of the script, we trigger another script that deletes all old backups that are no longer needed according to the backup strategy above.

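#!/bin/bash
# Requires GNU date (for the -d option).

BACKUPDIR=/nas/backup/daily/

# First backup of the current year and of each of the 5 years before it.
function listYearlyBackups() {
    for i in 0 1 2 3 4 5
    do ls "${BACKUPDIR}" | grep -E "$(date +%Y -d "${i} year ago")-[0-9]{2}-[0-9]{2}" | sort -u | head -n 1
    done
}

# First backup of the current month and of each of the 12 months before it.
function listMonthlyBackups() {
    for i in 0 1 2 3 4 5 6 7 8 9 10 11 12
    do ls "${BACKUPDIR}" | grep -E "$(date +%Y-%m -d "${i} month ago")-[0-9]{2}" | sort -u | head -n 1
    done
}

# Backups of the last 5 Mondays.
function listWeeklyBackups() {
    for i in 0 1 2 3 4
    do ls "${BACKUPDIR}" | grep "$(date +%Y-%m-%d -d "last monday -${i} weeks")"
    done
}

# Backups of the last 7 days, including today.
function listDailyBackups() {
    for i in 0 1 2 3 4 5 6
    do ls "${BACKUPDIR}" | grep "$(date +%Y-%m-%d -d "-${i} day")"
    done
}

function getAllBackups() {
    listYearlyBackups
    listMonthlyBackups
    listWeeklyBackups
    listDailyBackups
}

function listUniqueBackups() {
    getAllBackups | sort -u
}

# Join the names to keep into one extended regex (a|b|c) and invert the match.
# If the keep list is empty, the empty pattern matches every line, so nothing is listed.
function listBackupsToDelete() {
    ls "${BACKUPDIR}" | grep -v -E -e "$(echo -n $(listUniqueBackups) | sed "s/ /|/g")"
}

# Stop if the backup folder is missing, so nothing is deleted elsewhere.
cd "${BACKUPDIR}" || exit 1
listBackupsToDelete | while read -r file_to_delete; do
    # Each backup is a folder, so it has to be removed recursively.
    rm -rf -- "${file_to_delete}"
done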

The idea of this script is to first list all the backups that should be kept, according to our strategy, and afterwards invert this selection to find out the ones to delete.
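Before letting such a script loose on real backups, it can help to try the keep-and-invert idea on empty dummy folders first. The following dry run is a condensed sketch, not the script above: the temporary folder, the 60 fake days and the 2001-01-01 folder are made up, the inversion is done with grep -v -x -F (exact, fixed-string matching), and it assumes GNU date. It only prints names and deletes nothing.

```shell
#!/bin/bash
# Dry run of the retention idea on empty dummy folders (requires GNU date).
BACKUPDIR=$(mktemp -d)   # scratch folder standing in for the real backup folder

# Fake 60 days of daily backups, plus one very old backup.
i=0
while [ "${i}" -lt 60 ]; do
    mkdir "${BACKUPDIR}/$(date +%Y-%m-%d -d "${i} days ago")"
    i=$((i + 1))
done
mkdir "${BACKUPDIR}/2001-01-01"

listBackupsToKeep() {
    for i in 0 1 2 3 4 5; do                 # first backup of each year
        ls "${BACKUPDIR}" | grep -E "^$(date +%Y -d "${i} year ago")-" | head -n 1
    done
    for i in 0 1 2 3 4 5 6 7 8 9 10 11 12; do  # first backup of each month
        ls "${BACKUPDIR}" | grep -E "^$(date +%Y-%m -d "${i} month ago")-" | head -n 1
    done
    for i in 0 1 2 3 4; do                   # the last Mondays
        ls "${BACKUPDIR}" | grep -x "$(date +%Y-%m-%d -d "last monday -${i} weeks")"
    done
    for i in 0 1 2 3 4 5 6; do               # the last 7 days
        ls "${BACKUPDIR}" | grep -x "$(date +%Y-%m-%d -d "${i} days ago")"
    done
}

KEEP=$(listBackupsToKeep | sort -u)

# Invert the selection: every folder whose name is not exactly one of the
# names in KEEP (one pattern per line, matched as fixed strings).
DELETE=$(ls "${BACKUPDIR}" | grep -v -x -F -e "${KEEP}")

echo "Would keep:"
echo "${KEEP}"
echo "Would delete:"
echo "${DELETE}"
```

Today's folder should appear in the keep list and 2001-01-01 in the delete list. The rest depends on the day the dry run is executed.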

And that’s it! Not much magic in creating incremental backups without needing too much space. My NAS has been running these scripts every night for 10 months now, currently backing up 607 gigabytes. The backups currently take up 630 gigabytes, which is only about 4% more than the data itself. Find the current version of my simple bash scripts in this GitHub repository: https://github.com/NautiluX/backup_scripts