Git is my version control system of choice nowadays. Its feature set fits my usual software development workflow extremely well. There are many aspects of git that I could write about, but this particular blog post is about commit behavior.
There is nothing religious about this post, as I’m not religious about git. It baffles me how people can be religious over a certain piece of technology and it seems as if version control systems in particular trigger a lot of emotional responses. But I digress.
In the past, I have used CVS, SourceSafe, Subversion and ClearCase. While they are quite different in many ways, with these I always felt that the primary vehicle of change was the file; what you check in to the VCS is a number of edits to a number of files, at best held together by a common check-in message or tag. But when you look at what’s in the VCS, it’s still just a set of files. You may disagree with this, but that was the feeling I had when using them.
When I started using git, things changed drastically. The primary vehicle of change was now the commit instead of the file. A commit is a single entity that captures a change. That change may of course be distributed over a number of files, but all the edits are held together by something stronger than a message.
Dealing with commits instead of files opens up a lot of interesting workflow changes. In my team back then, we started using Gerrit for gated check-ins. A colleague had to review each commit as a single, self-contained change, and could reject it for a number of reasons, such as:
- It is too big and thus too hard to review.
- It does not, in fact, contain a single change, meaning that it is not coherent.
- It mixes changes to functionality with changes to formatting.
- It is not consistent with other commits in a series of commits.
Pretty quickly, a number of rules for what constitutes a good commit emerged, and with them a habit of making small, coherent commits that “get to the point.” And the great thing about git is that it provides tools for splitting a bad commit into manageable pieces, should you for some reason end up with one. Go ahead and look up `git rebase -i` and `git add -p`/`-i`, for example!
If I have to pick a single git feature that I think most affects good commit behavior, I choose `git revert`. Given a commit, `git revert` adds a new commit that is the exact inverse, thus canceling out the old one. How well this works obviously depends on what has happened since that old commit was added, but in essence undoing a change is very straightforward. A colleague once had to undo a change in ClearCase, and his hair was a lot thinner when he was finished…
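As a small illustration of what that looks like in practice, the sketch below makes a change in a scratch repository and then reverts it. The file name, contents and identity are invented for the example; `--no-edit` simply accepts the default revert message instead of opening an editor.

```shell
set -eu

# Throwaway repository; every name below is hypothetical.
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example"
git config user.email "example@example.com"

echo "hello" > greeting.txt
git add greeting.txt
git commit -q -m "Add greeting"

echo "goodbye" > greeting.txt
git commit -q -am "Change greeting"

# Add a new commit that is the inverse of the latest one.
git revert --no-edit HEAD >/dev/null

cat greeting.txt    # back to "hello"
git log --oneline   # three commits: the original, the change, and its revert
```

Note that nothing is erased: the old commit stays in the history, and the revert is just one more commit on top of it.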
So what does `git revert` have to do with commit behavior? Well, I always think about what happens if I need to undo a certain change, i.e. revert the corresponding commit. In other words, I consider the “revertability” of the commits I make. If I pack two changes into a single commit, I will lose both if I need to undo one. If I do some code cleanup together with a change, just because I’m “touching the file anyway,” the cleanup is lost as well when I undo the change, and the entropy increases again. If I format code with a change… Well, you get the point!
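Here is a rough sketch of why keeping changes apart pays off. A functional change and a cleanup go into separate commits, and later only the functional change is reverted. The files, contents and messages are hypothetical, and the two changes live in different files so that the revert applies without conflicts.

```shell
set -eu

# Throwaway repository; every name below is hypothetical.
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example"
git config user.email "example@example.com"

printf 'v1\n' > feature.txt
printf 'messy\n' > style.txt
git add feature.txt style.txt
git commit -q -m "Initial state"

# Commit 1: the functional change, on its own.
printf 'v2\n' > feature.txt
git commit -q -am "Change feature"
feature_commit=$(git rev-parse HEAD)

# Commit 2: the cleanup, on its own.
printf 'tidy\n' > style.txt
git commit -q -am "Clean up style"

# Undo only the functional change; the cleanup is untouched.
git revert --no-edit "$feature_commit" >/dev/null

cat feature.txt   # v1
cat style.txt     # tidy
```

Had both edits been packed into one commit, reverting it would have brought back `messy` along with `v1`.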
Once you start thinking in terms of commit revertability, following the rules of a good commit comes naturally:
- Keep it small and manageable.
- Only pack a single change.
- Stick to a single theme: functionality change, bug fix, code cleanup, code formatting, etc.
- If you’re on a feature branch, keep all commits in line with the feature.
I’m sure there are other version control systems that are comparable to git in terms of what I have just described, and that’s fine. I’m not saying that git is unique. My point is that once I became familiar with git, my commit behavior changed from being file-oriented to being change-oriented, which from a software development perspective makes a lot more sense!
What’s your commit behavior?