One advantage of a distributed version control system such as Mercurial is that any clone of a project is complete. The clone contains the entire history of the project. That makes backing up a repository trivial.
Not so for a centralized VCS such as Subversion. If you check out a copy of a project, you get a working copy without the history of the project. Backing up the entire history takes a bit of doing. This post describes a method for creating a live mirror of a Subversion repository.
Here is the problem we are going to solve. You have a working Subversion (svn) repository. It contains valuable code, so naturally you want to back it up, both on a local machine, and maybe an offsite machine as insurance against disaster striking your site.
You could try to back up your repository by backing up the files which constitute it on disk, say using rsync or some such. I don’t like that approach because if you ever have to use your backup, your first step will be to try to turn a bunch of files into a working svn repository. No thanks, I’d rather have a running svn instance all set to go.
I’ll show you how to back up your svn repository using three very simple bash scripts, and later give a link to where you can download the scripts. Here’s the first one, make-svnrepo.
#!/bin/bash
## The user should edit this script to insert appropriate repository paths.
## The goal is to mirror ${REMOTESVN}/${REPO} in ${LOCALSVN}/${REPO}, which is created here.
## The population of ${LOCALSVN}/${REPO} is done via another script, sync-svnrepo
REPO=${1} # name of repository passed in by user
REMOTESVN='10.0.0.11' # omit final "/" since script provides it later
LOCALSVN='/home/drc/repo/saffronsvn' # ditto
LOCALREPO=${LOCALSVN}/${REPO} # the file path to local svn repository about to be created
PROTOCOL=svn # the transfer protocol to be used : one of {svn, svn+ssh, file}
## Sanity check: uncomment the svn info command below, and make sure it works.
## If it does not, you are not talking to the remote repo, and may not have correctly specified some component above.
## svn info ${PROTOCOL}://${REMOTESVN}/${REPO}
## do the magic
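svnadmin create "${LOCALREPO}"
## svnsync must be allowed to change revision properties on the mirror;
## a hook containing only a shebang exits 0 and so permits every such change
echo '#!/bin/bash' > "${LOCALREPO}/hooks/pre-revprop-change"
chmod +x "${LOCALREPO}/hooks/pre-revprop-change"
svnsync init "file://${LOCALREPO}" "${PROTOCOL}://${REMOTESVN}/${REPO}"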
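The shebang-only hook above lets anyone change revision properties on the mirror. If that is too permissive for you, here is a minimal sketch of a stricter hook, written as a function that generates it. The function name write_revprop_hook and the idea of a single dedicated sync user are my assumptions, not part of the original scripts. It relies on Subversion passing the user name as the hook's third argument; over file:// that should be the local account running svnsync, but check this on your own setup.

```bash
#!/bin/bash
# Write a pre-revprop-change hook that lets only one user change revision properties.
# Arguments: $1 = path to the mirror repository
#            $2 = user allowed to run svnsync (hypothetical; pick your own)
write_revprop_hook() {
    local repo=$1 sync_user=$2
    local hook="${repo}/hooks/pre-revprop-change"
    # Unquoted heredoc: ${sync_user} is expanded now, \$3 is left for the hook itself.
    cat > "${hook}" <<EOF
#!/bin/bash
# Subversion calls this hook as: REPOS-PATH REVISION USER PROPNAME ACTION
if [ "\$3" = "${sync_user}" ]; then
    exit 0
fi
echo "Only ${sync_user} may change revision properties" >&2
exit 1
EOF
    chmod +x "${hook}"
}
```

It would be called after svnadmin create, for example as write_revprop_hook "${LOCALREPO}" drc, in place of the echo and chmod lines.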
How would I use this? Well, I have an existing svn repository running on my LAN, on a host with internal IP address 10.0.0.11. Suppose it hosts a project called My-Project. I want to mirror that entire project on my local machine, under the existing directory named by LOCALSVN in the script (/home/drc/repo/saffronsvn above; edit it to suit). I have already installed Subversion on my local machine, and make-svnrepo is marked executable and is in my path. What I have to do is run the following in the console:
make-svnrepo My-Project
This creates an empty local repository and registers the remote repository as its sync source. To populate it, use the next script, sync-svnrepo:
#!/bin/bash
## You will want to run this periodically, say via a cron job. I run it 4x daily.
LOCALSVN='/home/drc/repo/saffronsvn' # should agree with LOCALSVN in make-svnrepo
REPO=${1}
REPOSITORY=${LOCALSVN}/${REPO}
svnsync sync "file://${REPOSITORY}"
As you might expect, we would populate the mirror created in the first step by running the following in a console:
sync-svnrepo My-Project
This has to be done on a regular basis to keep up with changes; cron does the trick for me. I imagine it would be possible to use hooks within the original repo to trigger backups whenever code is committed, but I prefer not to touch the original repository.
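As a rough illustration of the cron approach, here is a small wrapper sketch that syncs several mirrors in one go and reports any that fail. The name sync_all, the SYNC_CMD variable, and the paths in the sample crontab line are all hypothetical; the only thing taken from above is that sync-svnrepo accepts a repository name.

```bash
#!/bin/bash
# Command used to sync one repository; defaults to the sync-svnrepo script above.
SYNC_CMD=${SYNC_CMD:-sync-svnrepo}

# Sync every repository named on the command line.
# Returns non-zero if any single sync failed, so cron can mail the error output.
sync_all() {
    local failed=0 repo
    for repo in "$@"; do
        if ! ${SYNC_CMD} "${repo}"; then
            echo "sync failed: ${repo}" >&2
            failed=1
        fi
    done
    return ${failed}
}

# Sample crontab line for four runs a day (the script path is an assumption):
# 0 */6 * * * /home/drc/bin/sync-all My-Project
```

Saved as a script that ends with sync_all "$@", this gives cron a single entry to run instead of one per repository.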
What I’ve done at work is to mirror our working repository in three separate mirrors. Two are onsite, one is offsite. I use the following script, query-svnrepo, to check up on the status of the mirrors.
#!/bin/bash
NAME=${1}
echo "---> " ${NAME}
## The idea here is that we have mirrored the repo named by NAME in several places
## We want to check that revision numbers on the mirrors more or less agree with the original repo's numbers
ORIGINAL=10.0.0.11 ## on our LAN
MIRROR1="/home/drc/repo/svn" ## file on localhost
MIRROR2=10.0.0.66/svn ## on our LAN
MIRROR3="foo.bar.org/home/drc/repo/svn" ## offsite, requires ssh access
COMMAND='svn info '
## adjust the protocol as required for each mirror
REPOSITORY=${ORIGINAL}
PROTOCOL='svn://'
RESULT=`${COMMAND} ${PROTOCOL}${REPOSITORY}/${NAME} | grep "Revision:"`
MSG="original: ${RESULT}"
echo $MSG
REPOSITORY=${MIRROR1}
PROTOCOL='file://' ## MIRROR1 already begins with /, so the full URL comes out as file:///home/...
RESULT=`${COMMAND} ${PROTOCOL}${REPOSITORY}/${NAME} | grep "Revision:"`
MSG=" mirror1: ${RESULT}"
echo $MSG
REPOSITORY=${MIRROR2}
PROTOCOL='svn://'
RESULT=`${COMMAND} ${PROTOCOL}${REPOSITORY}/${NAME} | grep "Revision:"`
MSG=" mirror2: ${RESULT}"
echo $MSG
REPOSITORY=${MIRROR3}
PROTOCOL='svn+ssh://'
RESULT=`${COMMAND} ${PROTOCOL}${REPOSITORY}/${NAME} | grep "Revision:"`
MSG=" mirror3: ${RESULT}"
echo $MSG
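The script above leaves the comparison to the eye. If you would rather have the numbers compared for you, a minimal sketch follows. The helper names parse_revision and mirror_lag are made up for illustration, and the threshold of 5 revisions in the commented wiring is an arbitrary choice, not a recommendation.

```bash
#!/bin/bash
# Read `svn info` output on stdin and print only the revision number.
parse_revision() {
    awk '/^Revision:/ { print $2 }'
}

# Print how many revisions a mirror is behind the original.
# Arguments: $1 = revision of the original, $2 = revision of the mirror
mirror_lag() {
    echo $(( $1 - $2 ))
}

# Possible wiring, using the original and mirror1 locations from the script above:
# ORIG=$(svn info "svn://10.0.0.11/${NAME}" | parse_revision)
# MIRR=$(svn info "file:///home/drc/repo/svn/${NAME}" | parse_revision)
# [ "$(mirror_lag "${ORIG}" "${MIRR}")" -le 5 ] || echo "mirror1 is lagging" >&2
```

A small lag is normal between cron runs, which is why the check tolerates a few revisions rather than demanding equality.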
This system has worked like a champ for me. I have never had to use the backups, but it’s nice to know they are there. And it is real nice to know that they work. This can easily be tested by checking out the project from one or all of the mirrors.
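The checkout test can itself be scripted. The following is only a sketch: verify_mirror and SVN_CMD are names I made up, and it assumes a throwaway checkout into a temporary directory is acceptable for a project of your size.

```bash
#!/bin/bash
# Client used for the checkout; override it to dry-run the function.
SVN_CMD=${SVN_CMD:-svn}

# Check out a mirror into a scratch directory, then clean up.
# Argument: $1 = full URL of the mirrored project
# Returns the exit status of the checkout.
verify_mirror() {
    local url=$1 scratch status
    scratch=$(mktemp -d) || return 1
    ${SVN_CMD} checkout --quiet "${url}" "${scratch}/wc"
    status=$?
    rm -rf "${scratch}"
    return ${status}
}

# Example call against the local mirror:
# verify_mirror "file:///home/drc/repo/svn/My-Project" && echo "mirror1 checks out"
```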
Screen scraping code can be a pain because blogging software sometimes does weird things to certain characters. You can get clean copies of the scripts from my repository on BitBucket.
If you actually use any of these scripts, let me know how it goes. I’d be happy to hear any suggested improvements.