Tuesday, May 6, 2008

Convert and Filter Subversion to Git

The Challenge
I have one large (25GB) Subversion repository that partly has a structure like this:


/brad
/docs
/finances
/foo
...
I wish to convert the docs subtree (including its history) into its own Git repository, without the foo directory inside it.

The Solution
One way to achieve this would have been to dump the repository, filter the history, create a new repository, load the filtered history and then convert with git-svnimport.

Instead, I did the following:

1. Convert the docs subtree into a Mercurial repository, excluding the foo directory.
$ hg convert --filemap filemap --config convert.hg.usebranchnames=False file:///path/to/svnrepos/brad/docs docs-hg

filemap is a file in the current directory with only one line in it:
exclude foo
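
As a small sketch, the filemap can be created straight from the shell. The commented lines show the other two directives that hg convert's filemap understands (include and rename); the paths in them are hypothetical and were not part of my conversion.

```shell
# Write the one-line filemap used above into the current directory.
printf 'exclude foo\n' > filemap

# Other filemap directives exist as well (hypothetical paths, not used here):
#   include some/dir
#   rename old/path new/path

# Print the file to confirm its contents.
cat filemap
```

Paths in the filemap are relative to the root of the tree being converted, which here is the docs directory.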

The effect of --config convert.hg.usebranchnames=False is to import onto the default branch in Mercurial. Without it, a docs branch would have been used and carried over to Git in the subsequent steps. I want the final Git repository to have just the conventional master branch.

2. Convert the Mercurial repository to Git.
$ mkdir docs
$ cd docs
$ git init


I installed Mercurial via MacPorts, so to get fast-export to work, I needed to use the right Python:
$ export PYTHON="/opt/local/bin/python2.5"
$ /path/to/fast-export/hg-fast-export.sh -A ../authors.txt -r ../docs-hg


The -A ../authors.txt option simply maps each Subversion commit username to a normal Git author (name and email address), the same way an authors file does for git-svnimport.
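
For illustration, an authors file could be generated like this. The usernames, names and addresses are made up, and the `username=Full Name <email>` layout is an assumption based on the git-svnimport style mentioned above, so check the README of your fast-export version for the exact syntax it expects.

```shell
# Hypothetical usernames and addresses; replace them with your own committers.
# Assumed format: one "svn-username=Full Name <email>" mapping per line.
cat > authors.txt <<'EOF'
brad=Brad Example <brad@example.com>
alice=Alice Example <alice@example.com>
EOF

# Show the mapping that would be handed to hg-fast-export.sh via -A.
cat authors.txt
```

Since the commands above refer to ../authors.txt from inside the docs directory, the file would live in the parent directory, next to docs-hg.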

$ git checkout master

3. Remove the intermediate Mercurial repository:
$ cd ..
$ rm -rf docs-hg


I did a diff of the docs subdirectories in the Subversion and Git working copies and did a quick check of the history. Looks like it worked successfully.
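
A check along those lines might look like the sketch below. The function name compare_trees is my own invention for this example, and it assumes a diff that supports -x, as GNU and BSD diff do.

```shell
# compare_trees SVN_DIR GIT_DIR
# Recursively diff two working copies, skipping version-control metadata
# and anything named foo (the directory deliberately left out of the
# conversion). Note that -x matches the name at any depth, not only at
# the top level.
# Exit status is 0 when the trees match and non-zero otherwise.
compare_trees() {
    diff -r -x .svn -x .git -x foo "$1" "$2"
}
```

With a Subversion checkout at a hypothetical path, the call would be something like `compare_trees /path/to/svn-wc/brad/docs docs`; no output means the trees are identical. For the history, paging through `git log --stat` in the new repository is a quick sanity check.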
