Nathan's Lucubrations; on computers, music and the outdoors


Thu, 19 Nov 2015

The Answer is Always Rebase

While I'm no expert on git, I've used it enough to have formed some opinions, and the first one is that people who call it "overly complicated" or "not user friendly" are wrong. This simple fact is not up for debate.

Where things get interesting is when you have to decide whether to keep a linear history (via rebase) or to use merge commits in all their soul-crushing hairy glory. You might be able to guess where I stand on this matter, but let me just try to convince you, as others have tried to convince me:

Take the argument that pull --rebase will break your unit tests, and there's nothing you can do about it. While it is commendable to ensure that every commit builds and passes tests, the assertion that rebasing on pull will break tests irreparably is flatly not true. How do I know? Because I pull with rebase all the time, and have a strict policy that every commit to the public repository (origin/master) builds and passes tests.

And how do I accomplish this feat? Simple: rebase. The answer is always rebase. Made a typo in your commit message? Rebase (although "commit --amend" handily covers the most recent commit, it's still rewriting history the same way a rebase does). Have a commit that reverts the commit immediately preceding it? Rebase-squash cancels them out. Have a commit with a unit test for a feature implemented in the next commit? Rebase them into one.
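For the curious, here's a rough sketch of two of those cases (the typo and the test-plus-feature squash) in a throwaway repository. The file names, commit messages and identity are made up for illustration, and marking the second commit with --fixup is just one way to let the squash run without opening an editor; an interactive rebase where you edit the todo list by hand gets you to the same place.

```shell
set -eu
# Throwaway identity and repository so the sketch touches nothing real.
export GIT_AUTHOR_NAME="Example" GIT_AUTHOR_EMAIL="example@example.com"
export GIT_COMMITTER_NAME="Example" GIT_COMMITTER_EMAIL="example@example.com"
repo=$(mktemp -d)
cd "$repo"
git init -q

echo "base" > base.txt
git add base.txt
git commit -q -m "Initial comit"          # typo in the message

# Rewrite the message of the most recent commit.
git commit -q --amend -m "Initial commit"

# The unit test lands first (hypothetical file name)...
echo "test" > test_feature.txt
git add test_feature.txt
git commit -q -m "Add feature with unit test"

# ...and the implementation follows, recorded as a fixup of that commit.
echo "feature" > feature.txt
git add feature.txt
git commit -q --fixup=HEAD

# Fold the fixup into its target. GIT_SEQUENCE_EDITOR=true accepts the
# todo list that --autosquash prepares, so no editor opens.
GIT_SEQUENCE_EDITOR=true git rebase -q -i --autosquash HEAD~2

# Show what is left of the history, newest first.
git log --format=%s
```

What's left is one commit for the feature (test and implementation together) sitting on top of the initial commit with its corrected message.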

So, just for more detail, here's a brief overview of my setup: Gerrit hooked up to Jenkins, where every commit must compile and pass over 300 unit/regression tests on six different build configurations between two compilers with every single warning as an error, on two operating systems, pass a wide variety of linters, and also have everything documented down to parameters and return values.

This setup doesn't allow for broken pushes, and if someone pushes something that passes all the automated tests and a human code reviews and accepts it, you will have to pull those changes down and replay your work on top of them. How do you do that? I'll give you a hint: it's one word and the output message from the command sounds similar to my last sentence (hint: it's rebase).
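To make the "pull down and replay" step concrete, here's a minimal sketch with two hypothetical developers (alice and bob) sharing a local bare repository that stands in for origin. There's no Gerrit or Jenkins here, just plain git, and every name and file is invented for the example.

```shell
set -eu
export GIT_AUTHOR_NAME="Example" GIT_AUTHOR_EMAIL="example@example.com"
export GIT_COMMITTER_NAME="Example" GIT_COMMITTER_EMAIL="example@example.com"
work=$(mktemp -d)
cd "$work"

# Seed a repository with one commit on master, then publish it as a bare "origin".
git init -q seed
git -C seed symbolic-ref HEAD refs/heads/master
echo "v1" > seed/lib.txt
git -C seed add lib.txt
git -C seed commit -q -m "Initial commit"
git clone -q --bare seed origin.git

git clone -q origin.git alice
git clone -q origin.git bob

# Alice's change is accepted first.
echo "alice" > alice/alice.txt
git -C alice add alice.txt
git -C alice commit -q -m "Alice: accepted change"
git -C alice push -q origin master

# Bob has committed on top of the now stale master.
echo "bob" > bob/bob.txt
git -C bob add bob.txt
git -C bob commit -q -m "Bob: local work"

# Bob fetches Alice's commit and replays his own on top of it.
git -C bob pull -q --rebase origin master

# The history is a straight line: no merge commit was created.
git -C bob log --oneline --graph
```

Bob's commit ends up with Alice's commit as its parent, which is exactly what he'd need before pushing for review.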

"But what if there's a conflict?" you say. If there's a conflict, then merge would have had a conflict to resolve as well, and at least you know that since you were slow on the draw, you have to change your code to accommodate the code that's already been accepted. "But what if upstream's changes break my changes?" you say. Well, then fix your changes. "But what if it breaks my new, as yet unaccepted tests?" If the previously pushed code is broken, treat it as a bugfix, fix it on a different branch, squash the test and fix into one commit with rebase, and push that, then switch to your original branch, rebase it on top of your bugfix/test change and continue working on your other changes. No matter what version control you use, there will be cases where you have to integrate your changes with other people's changes, and part of being a professional is handling this with grace (or just making your changes modular and isolated enough that they don't conflict in the first place).
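The bugfix detour might look something like the sketch below. The branch names (feature, bugfix), files and messages are all hypothetical, the push for review is left out, and the --fixup plus --autosquash combination is only one way of squashing the test and the fix together.

```shell
set -eu
export GIT_AUTHOR_NAME="Example" GIT_AUTHOR_EMAIL="example@example.com"
export GIT_COMMITTER_NAME="Example" GIT_COMMITTER_EMAIL="example@example.com"
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master

# Previously accepted code that turns out to be broken.
echo "buggy" > lib.txt
git add lib.txt
git commit -q -m "Accepted upstream code"

# Your original branch, with work in progress.
git checkout -q -b feature
echo "wip" > feature.txt
git add feature.txt
git commit -q -m "Work in progress on feature"

# Treat the breakage as a bugfix on its own branch, cut from master.
git checkout -q -b bugfix master
echo "regression test" > test_lib.txt
git add test_lib.txt
git commit -q -m "Fix bug in lib, with regression test"
echo "fixed" > lib.txt
git add lib.txt
git commit -q --fixup=HEAD

# Squash the test and the fix into a single commit.
GIT_SEQUENCE_EDITOR=true git rebase -q -i --autosquash master

# (This is where the bugfix commit would be pushed for review.)

# Back on the original branch, replay the work on top of the bugfix.
git checkout -q feature
git rebase -q bugfix

git log --oneline --graph
```

After this, feature contains the fix, the bugfix is a single self-contained commit, and the history is still one straight line.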

Remember: in my system, any change is rejected if it breaks the tests. Therefore it's impossible to get a change in, merge or rebase, that will break the tests. If you're adding a new test to catch a bug, good! But sometimes that means you have to fix the bug, even if you didn't create it, and even if it didn't exist when you created the test.

So the real question becomes, if it works for both rebase and merge, why not use merge? Well, I'll admit this comes down to personal preference and aesthetics, but I feel with good justification: I don't care about every time a developer reverted an immediately preceding change because they tried something out and found out it didn't work or wasn't what they wanted. I don't want to see the often confused, muddled thought process that eventually made its way to a working piece of code. That shouldn't be in the public repo, and I definitely don't want to have to bisect over it. I want a linear, coherent, cleaned up public repository history that I can quickly and easily bisect, without having to do exponential diving trips down different branches. You get a clean history by rebasing before pushing to the public repository.
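As a small illustration of bisecting a linear history, the sketch below builds eight commits in a throwaway repository, sneaks a marker line ("BUG", purely hypothetical) into the fifth, and lets git bisect run find it. The grep check stands in for a real test suite.

```shell
set -eu
export GIT_AUTHOR_NAME="Example" GIT_AUTHOR_EMAIL="example@example.com"
export GIT_COMMITTER_NAME="Example" GIT_COMMITTER_EMAIL="example@example.com"
repo=$(mktemp -d)
cd "$repo"
git init -q

# Eight linear commits; the fifth one introduces the marker.
for i in 1 2 3 4 5 6 7 8; do
  if [ "$i" -eq 5 ]; then
    echo "BUG" >> code.txt
  fi
  echo "change $i" >> code.txt
  git add code.txt
  git commit -q -m "Change $i"
done

# HEAD is known bad, the first commit (HEAD~7) is known good.
git bisect start HEAD HEAD~7 >/dev/null

# Exit status 0 marks a commit good, 1 marks it bad.
git bisect run sh -c '! grep -q BUG code.txt' >/dev/null

# refs/bisect/bad now points at the first bad commit.
first_bad=$(git rev-parse refs/bisect/bad)
git log -1 --format=%s "$first_bad"

git bisect reset >/dev/null 2>&1
```

With a straight-line history every step halves the remaining range, so eight commits take about three checks to narrow down.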

posted at: 21:30
