Apr 09 2012
 
CrashPlan - Verifying

CrashPlan is (so far) a pretty good service: unlimited storage space (not that I would want to store terabytes of data, as backing up and restoring would simply take too long), a decent price, and Linux integration. But it has one drawback: no verification of backups.

The workaround is to restore files and compare them. That’s what we do at work once a month: recall a random tape and see if it can be restored. We simply expect the backup software to detect any corruption and tell us about it. CrashPlan should do the same, but older versions did not always do so. I am quite sure that bug has been squashed, but better safe than sorry: verify now, while you can still do something about it. Since there’s no verify function available, I created my own helper script:

#!/bin/bash
# Compare 2 directories
# Used for verifying backup done by backup software
# Checks missing or extra files and file contents
# Harald Kubota 2012-04-09

if [[ $# -ne 2 ]] ; then
 echo "Usage: $0 dir1 dir2"
 echo "Compare contents of dir1 and dir2 and all files in those directories"
 exit 10
fi

if [[ ! -d $1 ]] ; then
 echo "$1 must be a directory"
 exit 11
fi

if [[ ! -d $2 ]] ; then
 echo "$2 must be a directory"
 exit 11
fi

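# Temporary files holding the sorted file list of each directory.
# Keep the files mktemp created (deleting them first invites a race).
t1=$(mktemp)
t2=$(mktemp)
trap 'rm -f "$t1" "$t2"' EXIT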

exitcode=0

# List all regular files relative to each directory root.
# The subshell keeps the cd from affecting the rest of the script.
( cd "$1" && find . -type f | sed 's/^\.\///' | sort ) >"$t1"

( cd "$2" && find . -type f | sed 's/^\.\///' | sort ) >"$t2"

# Report files that exist on one side only
if ! diff "$t1" "$t2" >/dev/null ; then
 echo "Difference in file lists:"
 echo "< means extra files in $1, > means extra files in $2"
 diff "$t1" "$t2"
 exitcode=1
fi

# Now check every single file which exists in both directories
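# Read the list line by line so file names containing spaces stay intact
while IFS= read -r i ; do
  if [[ -f "$1/$i" && -f "$2/$i" ]] ; then
      a1=$(md5sum "$1/$i" | awk '{print $1}')
      a2=$(md5sum "$2/$i" | awk '{print $1}')
      if [[ "$a1" != "$a2" ]] ; then
          # Show both checksums so the mismatch is visible
          md5sum "$1/$i" "$2/$i"
          exitcode=1
      fi
  fi
done <"$t1"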

rm -f "$t1" "$t2"

if [[ $exitcode -eq 0 ]] ; then
 echo "$1 and $2 look to be identical"
fi

exit $exitcode

You can download this file here.

Usage is simple: restore a complete directory, obviously without overwriting your original data. I restore into ~/Desktop/ by default.

When done, run:

dir-diff.sh ~/Desktop/workspace ~/workspace

and see what the script says. If everything is identical, it will say so. So far, that has been the result in every test I did. Modify a file, create a new one, or delete one, and the script should report a difference.
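To convince yourself that a comparison really catches all three kinds of change, you can run it against a pair of throwaway trees. The following is only a rough sketch: `compare_trees` and `verdict` are hypothetical helper names, and `compare_trees` uses `diff -rq` as a stand-in so the snippet runs on its own. Replace its body with a call to dir-diff.sh to exercise the real script.

```shell
#!/bin/bash
# Self-test sketch: build two identical trees, then introduce a modified,
# an extra and a missing file, and check that each change is detected.

# Hypothetical stand-in for the comparison (assumption: diff -rq).
# Returns 0 if the trees match, non-zero otherwise.
compare_trees() {
    diff -rq "$1" "$2" >/dev/null 2>&1
}

# Print "same" or "differ" for the two test trees
verdict() {
    if compare_trees "$work/orig" "$work/restored" ; then
        echo same
    else
        echo differ
    fi
}

# Create a small "original" tree and an identical "restored" copy
work=$(mktemp -d)
mkdir -p "$work/orig/sub"
printf 'alpha\n' >"$work/orig/a.txt"
printf 'beta\n'  >"$work/orig/sub/b.txt"
cp -r "$work/orig" "$work/restored"

identical=$(verdict)

# Case 1: a file with different contents
printf 'changed\n' >"$work/restored/a.txt"
modified=$(verdict)
cp "$work/orig/a.txt" "$work/restored/a.txt"   # undo the change

# Case 2: an extra file on one side
: >"$work/restored/new.txt"
extra=$(verdict)
rm -f "$work/restored/new.txt"                 # undo the change

# Case 3: a file missing on one side
rm -f "$work/restored/sub/b.txt"
missing=$(verdict)

echo "identical trees: $identical"
echo "modified file:   $modified"
echo "extra file:      $extra"
echo "missing file:    $missing"

rm -rf "$work"
```

The first line of output should say "same" and the other three "differ". If any of the three changes goes unnoticed, the comparison cannot be trusted for verifying a restore.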

At least a test restore is now quick and simple to verify. And the results give me some more faith in CrashPlan.