When the Drupal Gardens project first switched from Subversion to Git, we adopted the popular git-flow branching model. While this model clearly works for a great many projects, it does not suit every workflow, and not long after we adopted it we decided it didn't suit ours. The branching strategy we ended up switching to is the Branch Per Feature (BPF) model described in this post by its originator, Adam Dymitruk. We were drawn to this model because of the level of control it gives us over what we do and do not include in a given release, and it ties in well with our use of Jira for issue tracking. Here is BPF in a nutshell:
- All work is done in feature branches
- All feature branches start from a common point (branched from master at the point of the last release)
- Merge regularly into a throw-away integration branch to resolve merge conflicts and ensure tests aren't breaking
- QA branch is created off master and gets all completed features merged into it; it is recreated every time a new feature becomes ready, or when a feature that was being QA'd has been deemed not release-ready (so it gets recreated without that feature branch in it).
- Manual testing is done on QA branch, whatever doesn't pass gets left out next time it is recreated
- When ready to release, QA branch is recreated with all features that passed testing, merged into master & tagged
At the heart of this branching strategy is the idea that your feature branches are completely unadulterated by anything else happening in the repo - their history should be a series of commits, each one of which is related solely to the feature itself (i.e. no back merges). Here's an illustration of this clean commit history:
(Feature branches are in pink, QA in green, and the master branch in blue.)
The blog post by Adam Dymitruk linked above does a great job of explaining the rationale for adopting this strategy, but we found it a little short on details with regard to how to implement it. There are certain key pieces, such as the use of a shared rerere cache, that took some figuring out before we could make the switch. We've now been up and running with this model very successfully for several months, and this post aims to complement Dymitruk's by filling in some of these practical details.
But first, some cautionary words, as it is easy to end up Doing It Wrong with this branching strategy.
1. Do not deploy the integration branch
The integration branch will always contain the latest commits from all features, because the developers working on those features will have been merging into it continuously as they go. It can therefore be tempting to deploy the integration branch to a test environment to see how things are coming along. This is an abuse of the integration branch and can lead to an endemic misunderstanding of what the branch is for - it is purely for ensuring that commits are able to merge with each other and are not breaking tests. It might therefore contain all kinds of cruft from features that were never completed*. If at any time during the sprint you need to try out a feature in a test/staging environment, recreate the QA branch with that feature in it and deploy that. You can always recreate it again without that branch if the feature proves not to be QA-ready yet.
2. Do not make creation of the QA branch the task of one single person
The QA branch should get recreated from scratch every time a new feature becomes ready for QA. Don't leave this until the end of the sprint and make it the responsibility of a dedicated release manager. The merging of a feature branch into QA is the responsibility of the developer who worked on that branch. This is important because even though merges into integration will catch most merge conflicts with other features (which are resolved and then the resolution shared using the rerere cache as explained below), it is possible that the order of commits is different when merging into QA so conflicts can easily arise for which no existing rerere will work. The best person to deal with a merge conflict is the person who worked on the feature being merged in. When they resolve conflicts at this point, the shared resolution will work for all future iterations of the QA branch recreation (because the branches get merged in in the same order each time)**. Once we realized that each developer would need to be able to recreate the QA branch, we put together a script that can be set up as a git alias so that this process amounts to running one simple command.
Here's the complete process of working on and completing a feature during a sprint:
$ git checkout master && git pull origin master # Make sure to be working off the latest code in the master branch
$ git checkout -b J-1234 # Where J-1234 is the Jira ticket this feature corresponds to
# Hack hack hack
$ git commit -am "My first commit on my feature branch"
$ git checkout integration && git pull origin integration # Make sure I have an up-to-date local integration branch
$ git merge --no-ff J-1234
# Resolve any conflicts that arise; the resolutions will automatically be pushed to the shared rerere cache
$ git push origin integration
# Hack hack hack and repeat merges to integration
# Feature is ready for code review, submit pull request, get feedback etc., hack some more if necessary
# Now my feature is ready for QA
$ git newqa # This is the alias to the script for recreating the QA branch from scratch
$ git merge --no-ff J-1234
$ git push --force origin QA # Force-push the new QA branch
$ git push origin J-1234 # Make sure my branch is on the remote for future QA branch recreation by other devs
And now for the nitty-gritty on how to set this up. Bear in mind that the set-up described below is in use for a single product codebase, i.e. our team is working on one ongoing project, not a sequence of new client projects. While this strategy may still be suitable for teams working on multiple client projects, some adjustments might need to be made to some of these implementation details.
Dealing with merge conflicts
When your feature branch is merged into the QA branch immediately prior to the QA branch being merged into master, this will be the nth time it has been merged in, where n is the number of times the QA branch has been recreated from scratch since your feature branch became ready for QA. If n = 1, the person doing the merging is you, the developer of that feature, and so any merge conflicts that arise can be easily dealt with by you. However, if n is greater than 1, the chances are it is another developer doing the merging, and this developer likely won't know how to resolve the conflict (and certainly won't want to be dealing with it anyway). So we use git's rerere feature to reuse recorded resolutions.
There's a great explanation of the rerere feature here. In short, "it allows you to ask Git to remember how you've resolved a hunk conflict so that the next time it sees the same conflict, Git can automatically resolve it for you". To set it up, run
$ git config --global rerere.enabled true
And to make sure that whenever a conflict is resolved in this way it is automatically staged and ready to commit, run
$ git config --global rerere.autoupdate true
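To see what these two settings do before relying on them, it can help to try them in a throw-away repository. The following is only a sketch: the function name demo_rerere, the file name and the commit messages are made up for illustration, and the settings are applied to the temporary repo only rather than globally.

```shell
#!/bin/sh
# Sketch: record a conflict resolution with rerere, then replay it.
# demo_rerere is a hypothetical name; everything happens in a temp directory.
demo_rerere() {
    repo=$(mktemp -d) && cd "$repo" || return 1
    git init --quiet .
    git config user.name "Demo" && git config user.email "demo@example.com"
    git config rerere.enabled true
    git config rerere.autoupdate true

    echo "original" > file.txt
    git add file.txt && git commit --quiet -m "Base"
    base=$(git symbolic-ref --short HEAD)

    # Two branches change the same line, so merging them will conflict.
    git checkout --quiet -b feature
    echo "feature change" > file.txt
    git commit --quiet -am "Feature edit"
    git checkout --quiet "$base"
    echo "other change" > file.txt
    git commit --quiet -am "Conflicting edit"

    # First merge: git stops on the conflict (non-zero exit), we resolve it
    # by hand, and committing the merge records the resolution.
    git merge --no-ff feature >&2 || true
    echo "resolved by hand" > file.txt
    git commit --quiet -am "Merge feature, resolved by hand"

    # Throw the merge away and repeat it: rerere replays the resolution and,
    # because of rerere.autoupdate, stages it so we can commit straight away.
    git reset --quiet --hard HEAD~1
    git merge --no-ff feature >&2 || true
    git commit --quiet --no-edit
    cat file.txt
}

# Run in a subshell so the caller's working directory is left alone.
(demo_rerere)
```

The second merge still reports a conflict, but the file should already contain "resolved by hand" and be staged, which is why the commit that follows needs no manual editing.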
Rerere in action
As the developer of a feature branch, you have been dutifully merging into the integration branch on a regular basis and resolving merge conflicts as they arise. With the rerere feature enabled, you record these resolutions. When your feature is finally ready for QA you recreate the QA branch to include your branch, and your recorded resolutions are used to resolve the conflicts (or there may be new conflicts, due to a difference in the order of the commits between the integration branch and the QA branch, in which case you resolve them and record the resolutions).
Later on, your co-worker finishes a feature she's been working on and now she needs to recreate the QA branch to include her branch. As she goes through the previously merged in branches, she comes to yours which has a conflict - how does she have access to your previously recorded resolution? This is where the idea of a shared rerere cache comes in. Normally, git's rerere cache is not something that gets shared between developers because it doesn't get included when you clone a repo - it's your own special cache in the .git directory of your local tree. The solution we implemented to this problem was along the lines of the suggestion here, i.e., to share the rereres in a git repository.
We created a separate "tools" repo that contains, among other things I'll get to later on, a rerere directory. As a one-off configuration step, each Gardens developer clones this tools repo and runs the following commands from within their regular Gardens repo:
$ rm -R .git/rr-cache
$ ln -s /path/to/my/clone/of/tools_repo/rerere/gardens .git/rr-cache
Now when a developer resolves a conflict in the gardens repo and a new rerere is recorded for it, it will be placed in the rerere directory of the tools repo, ready to be added, committed and pushed.
The obvious question then is, how do we make that "add, commit and push" part happen automatically as soon as the resolution happens? And the answer to this is git hooks. You can read up on git hooks here, but in short, they are just executable scripts that get run when particular actions happen. You can have pre-commit, pre-rebase, post-merge, etc. hooks. You can write them as simple bash scripts or use Ruby or Python for example. They reside in the hooks directory inside the .git directory of your local repo. Again, they are a local thing, not shared when a repo is cloned. So again, we use our tools repo to share these hooks. From within their regular Gardens repo, each developer runs:
$ rm -R .git/hooks
$ ln -s /path/to/my/clone/of/tools_repo/hooks/gardens .git/hooks
as another one-off configuration step.
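The two one-off steps could also be wrapped in a small helper so that they work from any subdirectory of the Gardens repo. This is a sketch rather than something from our tools repo: the function name link_tools_repo is hypothetical, and the rerere/gardens and hooks/gardens layout is simply taken from the commands above.

```shell
#!/bin/sh
# Sketch: point .git/rr-cache and .git/hooks at a clone of the tools repo.
# link_tools_repo is a hypothetical helper name.
# Pass an absolute path, since a relative symlink target would be resolved
# relative to the .git directory rather than the current directory.
link_tools_repo() {
    tools_repo=$1
    # Locate the .git directory even when run from a subdirectory.
    git_dir=$(git rev-parse --git-dir) || return 1

    for pair in "rerere/gardens:rr-cache" "hooks/gardens:hooks"; do
        source_dir="$tools_repo/${pair%%:*}"
        link_path="$git_dir/${pair##*:}"
        if [ ! -d "$source_dir" ]; then
            echo "Missing $source_dir" >&2
            return 1
        fi
        # Replace whatever is there (directory or stale link) with a symlink.
        rm -rf "$link_path"
        ln -s "$source_dir" "$link_path"
    done
}
```

It would be called once per clone, for example as link_tools_repo /path/to/my/clone/of/tools_repo.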
Inside the hooks/gardens directory of our tools repo, we have a very simple post-commit script that checks whether the commit was a merge commit (the post-merge hook does not get triggered if there was a merge conflict, even if it was successfully resolved using rerere) and if so, goes to the rerere directory to see if there are any new rereres. If it finds one, it adds it, commits and pushes up to the tools repo so that everyone else will get it when they pull from the remote***.
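For anyone writing their own version, a post-commit hook along these lines might look like the sketch below. It is not our actual hook: the TOOLS_REPO default is a placeholder path, the function names and commit message are made up, and it assumes the tools repo has a remote called origin.

```shell
#!/bin/sh
# .git/hooks/post-commit (sketch, not the actual Gardens hook)
# TOOLS_REPO is an assumed location; point it at your clone of the tools repo.
TOOLS_REPO="${TOOLS_REPO:-/path/to/my/clone/of/tools_repo}"
RERERE_DIR="rerere/gardens"

# A merge commit is one that has a second parent.
is_merge_commit() {
    git rev-parse --verify --quiet "HEAD^2" >/dev/null 2>&1
}

share_new_rereres() {
    # Git may export these variables to hooks; clear them so the commands
    # below act on the tools repo rather than the repo that fired the hook.
    unset GIT_DIR GIT_WORK_TREE GIT_INDEX_FILE
    cd "$TOOLS_REPO" || return 1
    # Is there anything new or modified under the rerere directory?
    if [ -n "$(git status --porcelain -- "$RERERE_DIR")" ]; then
        git add -- "$RERERE_DIR" &&
        git commit --quiet -m "Add recorded conflict resolutions" &&
        git push --quiet origin HEAD
    fi
}

if is_merge_commit; then
    # Subshell so the cd and unset do not leak into the rest of the hook.
    (share_new_rereres)
fi
```

The hook file has to be executable for git to run it. Checking for a second parent is one simple way to detect a merge commit; a hook could equally inspect the commit in some other way.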
Recreating the QA branch
I mentioned a script for recreating the QA branch, which developers have set up as an alias. All they have to do is run
$ git newqa
and the branch gets recreated with all of the feature branches merged into it that had been in the previous QA branch. So, what does this script do exactly? We will be making our script, which is written in Ruby, available once we figure out the best way to package it up for broader consumption, but in the meantime, here's what it does:
- If you don't specify a branch name, it assumes a branch name of "QA" and looks for such a branch in the "origin" remote of your repo
- If such a branch exists it parses its history to come up with a list of branches that were merged into it since it was branched off of master
- It creates a new local "QA" branch off master (having deleted any previously existing one, but not without prompting you for approval) and merges in the remote branches from the list, one by one, resolving conflicts using shared rereres
- Alternatively, if you specify a file containing an explicit list of branches to merge in, it will use this list instead, but can warn you if any of the branches was not merged into the previous QA branch
- If you have specified any branches to exclude using the -x switch, it will not merge them in, regardless of which list it is using
Once you have run the newqa command, you are ready to just merge in your own new feature branch and force-push the new QA branch to the remote.
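Until our Ruby script is available, the core of the behaviour described above can be approximated in a few lines of shell. Treat this as a rough sketch under stated assumptions, not a port of our script: the function names are hypothetical, it relies on git's default merge commit subjects ("Merge branch 'X' ...") to recover the branch list, it resets the local QA branch without prompting, and it leaves out the explicit branch list file.

```shell
#!/bin/sh
# Sketch of the core of a "newqa"-style script (the real one is in Ruby).

# Print the branches merged into $2 since it left $1, oldest first.
# Assumes git's default merge subjects, e.g. "Merge branch 'J-1234' into QA".
list_merged_branches() {
    git log --first-parent --merges --reverse --format=%s "$1..$2" |
        sed -n "s/^Merge \(remote-tracking \)\{0,1\}branch '\([^']*\)'.*/\2/p" |
        sed 's#^origin/##'
}

# Recreate local branch $2 from $1, re-merging what origin/$2 contained and
# skipping any branch names listed in $3 (space-separated, like -x).
recreate_qa() {
    base=$1 qa=$2 exclude=" ${3:-} "
    branches=$(list_merged_branches "$base" "origin/$qa") || return 1
    # -B resets an existing local branch without asking (a simplification).
    git checkout --quiet -B "$qa" "$base" || return 1
    for branch in $branches; do
        case "$exclude" in *" $branch "*) continue ;; esac
        if ! git merge --quiet --no-ff --no-edit "origin/$branch"; then
            # With rerere.autoupdate, a replayed resolution is already staged;
            # only stop if some paths are still unmerged.
            if [ -n "$(git diff --name-only --diff-filter=U)" ]; then
                echo "Unresolved conflict merging $branch" >&2
                return 1
            fi
            git commit --quiet --no-edit || return 1
        fi
    done
}
```

With these functions loaded, recreate_qa master QA "J-1234" would rebuild QA from master without J-1234. A script built around them can be exposed as git newqa through git's alias mechanism, for example an alias.newqa entry whose value starts with ! followed by the path to the script.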
Some clarification around long-running feature branches and hotfixes
With the BPF model, any features that don't make it into the release need to be rebased against the master branch after the release, so that they are then starting from the same point as all new features. We initially thought this meant that when we released a hotfix all feature branches would need to get rebased against master, but this doesn't really make any sense. Hotfixes are unlikely to be relevant to the feature branches in progress - the QA branch, when it gets recreated next time around, will have the hotfix, so there's no need for feature branches to be concerned with it. And besides, rebasing carries with it some dangers, especially if more than one developer has been working on a branch, so insisting that all feature branches be rebased against master after a hotfix simply isn't worth the hassle.
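The post-release rebase of a feature that missed the release can be sketched as follows. The helper name rebase_onto_release is made up for illustration, and J-1234 in the note underneath is just the example ticket used earlier.

```shell
#!/bin/sh
# Sketch: restart an unreleased feature branch from the newly released master.
# rebase_onto_release is a hypothetical helper name.
rebase_onto_release() {
    feature=$1
    base=${2:-master}
    git checkout --quiet "$feature" || return 1
    git rebase --quiet "$base" || {
        echo "Rebase stopped; resolve conflicts, then run: git rebase --continue" >&2
        return 1
    }
}
```

Because a rebase rewrites the branch's history, the rebased branch then has to be force-pushed (for example git push --force-with-lease origin J-1234), which is exactly why this is best coordinated when more than one developer has been working on the branch.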
Interested in trying it?
This is a powerful branching strategy that may well be a good fit for your project - I hope the information presented here is helpful in determining that. We intend to make all the scripts and hooks I've mentioned available in the near future, but in the meantime for anyone happy to have a go at writing their own, I've tried to make it clear exactly what they need to do. I cannot stress enough, however, how important it is that every member on the team understand the fundamentals of this strategy. I highly recommend reading Dymitruk's blog post about it and also this Google plus thread which has a huge amount of discussion and some interesting insights on it.
* It can happen that the integration branch ends up containing a lot of cruft and it makes sense to recreate it from scratch off master - in this case all in-progress feature branches need to get re-merged into it.
** There are exceptions to this but they are fairly edge-casey - e.g. if a hotfix goes into master and is therefore in QA when the QA branch is recreated off the latest master, if that hotfix touches code that is also touched by one of the features, a new conflict can arise. In that case the owner of the feature affected needs to resolve the conflict and recreate the QA branch.
*** This is one step we haven't yet automated, i.e. the updating of your local tools repo, but it just means that each developer has to ensure they have an up-to-date tools repo with all recent rereres before they recreate the QA branch.