Originally published in Webscope's blog: Hunting down regression using "git bisect"
Intro
Regression bugs are the worst! 😠 Especially when you've just introduced a huge change and have no clear idea where the regression stems from.
I'll show you a quick and reliable way to track down regression bugs with a standard Git command that not many people use, which is a pity!
As this article discusses a very specific scenario in a developer's workflow, let me open with a real-world model scenario to give you some context.
You may not even need this article! If you're only here to read about git bisect's API, either jump to the bottom or navigate to the Git docs. The point of this article is to discuss a real-life use case rather than duplicate the Git docs.
A real-world scenario
Storytime! 📚
Imagine you're working on a fancy new feature that you and the product team are very excited about. Once deployed, the feature is gonna bring a lot of value to your customers, moving you another tiny bit ahead of the competition. And you've been part of that! Exciting! 👏
You've spent days, maybe even weeks, working on and perfecting the feature. The day finally comes. You receive the last required approval on your polished pull request. Proudly, you hit the "Merge PR" button and calmly watch the CI/CD pipeline take care of the deployment while sipping your cup of victorious coffee. ☕ You pat yourself on the back and, feeling accomplished, head home.
... fast-forward to the next morning. 🔅
You open your laptop to see a bunch of worrying messages from the product folks. Apparently, while shipping the awesome new feature, a bug 🐛 snuck through all the tests and even the code review! Now your customers are complaining that something that used to work no longer does. You gulp and check out the feature branch you hoped to never see again.
The root cause is not trivial and this is the commit history of the branch:
2bd81f7 chore: Initial commit
fd1c7ae feat: storing search state in user settings
03d78e refactor: basic tag autosuggest implementation
a2c6f9b feat: improve tag autosuggest algorithm
c1b2e7d chore: refactor search filter implementation
9e6c1a8 fix: correct fuzzy search implementation for tags
6c7d2b2 feat: add search suggestions to input field
e1f9a4d refactor: search input component reorganization
8a3b0c9 feat: add support for searching by date
7dfe573 test: add search performance testing
4a7b8f1 feat: integrate search input with external API
3a8bcb4 chore: add keyboard shortcuts for search input
b50b832 feat: allow searching within search results
d0e2b8f feat: add support for searching by category
5ab5e15 fix: improve search input accessibility
e7ba51c feat: add option to save search queries
4f1df68 test: add search input validation testing
1a2e82c feat: add ability to search within specific fields
0c38d2a feat: implement search input highlighting
e0ef1c6 feat: add support for searching within attachments
6d7e25d refactor: improve autocomplete for search input
9d7fae9 feat: add support for searching by location
6c7f6b3 test: implement search input throttling testing
3f49862 feat: add search input to mobile interface
1cbb0c2 refactor: implement search input suggestions from user history
7b642f1 chore: improve search input placeholder text
3e9a8f8 feat: add support for searching within shared documents
1d4e4d7 refactor: implement autocomplete for search input filters
f52973a feat: add support for searching within comments
7b64df1 fix: improve search input styling and layout
(Don't over-analyze the commit history, it's ChatGPT-generated.)
If only you knew where to start...
Taking a naive approach
If you ask me, that is quite an intimidating number of commits. If the commits are not single-purpose or close to atomic, it's very likely that the diff is not gonna be the smallest either.
Since we have no idea where the sneaky bug stems from, it's important to realize that we're partially relying on chance to discover it. Therefore, debugging by browsing the branch and inserting a bunch of pseudo-randomly placed console.log or debug statements while clicking through the app would be very ineffective here.
After all, you're an engineer and there must be a systematic approach, right?
Tilting the odds in our favor
As a general rule, if you're relying on chance, you'd better tilt the odds in your favor. How do we do that? We reduce the size of the faulty diff to an absolute minimum.
What's the smallest primitive we can work down to in a Git-versioned repository? You guessed it, it's a commit. In other words, instead of digging through the diff of the whole branch, we want to be digging through the diff of a single commit.
That sounds like less of a headscratcher, right? 🤔
Leveraging "git bisect"
What it is
git bisect is, unsurprisingly, a Git command, and it does exactly what we defined in the previous section. It helps us reduce the amount of code to dig through by systematically identifying the first bad commit ("bad" is a terminus technicus here) in the Git history.
The process happens in a controlled, iterative fashion (similar to a wizard 🧙), using simple interval halving, a.k.a. bisection.
How does the command work
When you trigger git bisect, the runner requests two inputs.
❌ A commit (hash) that you know is bad, meaning "is broken".
✅ A commit (hash) that you know is good, meaning "works fine".
Once you supply these two interval borders, the runner takes over. Iteratively, it checks out commits and asks you whether things are broken or just fine on that particular commit.
Your only job is to re-run your test scenario, e.g.
Refresh a broken application UI and test the functionality
Re-run the failing test case and check whether it passes
Execute a script and see if it returns 0 this time
... depends on your environment
With each step, it's your job to tell the bisect runner if the commit is good or bad.
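If your test scenario is a script, Git can also ask the script instead of asking you, via git bisect run. The sketch below shows one possible way to write such a check. The file name bisect-check.sh, the function bisect_verdict and the BUILD_CMD and TEST_CMD variables are all made up for illustration; only the exit code convention (0 for good, 125 for "skip this commit", any other code from 1 to 127 for bad) comes from Git itself.

```shell
#!/bin/sh
# bisect-check.sh (hypothetical name): answers "good or bad?" for the commit
# that is currently checked out, so nobody has to type it by hand.
# BUILD_CMD and TEST_CMD are placeholders; point them at your own commands.
BUILD_CMD=${BUILD_CMD:-true}
TEST_CMD=${TEST_CMD:-true}

# Translate the outcome into the exit codes "git bisect run" understands:
#   0   -> good
#   125 -> skip (this commit cannot be tested at all)
#   1   -> bad (any other code from 1 to 127 would also mean bad)
bisect_verdict() (
  sh -c "$1" >/dev/null 2>&1 || exit 125   # build broke: skip this commit
  sh -c "$2" >/dev/null 2>&1 && exit 0     # check passed: good
  exit 1                                   # check failed: bad
)

# The script's exit status is the verdict for the current commit.
bisect_verdict "$BUILD_CMD" "$TEST_CMD"
```

Assuming the script is executable, you would start the session as usual with git bisect start and the bad and good commits, and then hand over with git bisect run ./bisect-check.sh. Returning 125 for commits that don't even build keeps a broken intermediate commit from being blamed for the regression.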
That's it! Since bisection is just another name for binary search, you'll locate the broken commit in roughly log2(number_of_commits) steps. 😎
Looking through a history of 8 commits? You'll know the answer in 3 steps.
64 commits? 6 steps!
Even if you're digging through as many as 1024 commits, you'll know in 10 steps.
You've probably seen a log2(n) chart, right?
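If you'd rather see the numbers than a chart, here is a tiny sketch in plain shell that counts how many times a range of commits has to be halved before a single candidate is left. The function name bisect_steps is made up for this article and is not part of Git; it models the worst case, so treat the result as an estimate of what Git reports, not a guarantee.

```shell
#!/bin/sh
# Count how many halvings it takes to narrow n candidate commits down to one.
# This equals log2(n) rounded up, i.e. the worst-case number of steps.
bisect_steps() (
  n=$1
  steps=0
  while [ "$n" -gt 1 ]; do
    n=$(( (n + 1) / 2 ))     # keep the larger half (worst case)
    steps=$(( steps + 1 ))
  done
  echo "$steps"
)

for n in 8 64 1024; do
  echo "$n commits -> $(bisect_steps "$n") steps"
done
# prints:
# 8 commits -> 3 steps
# 64 commits -> 6 steps
# 1024 commits -> 10 steps
```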
A practical example - Visual
Let's take the series of commits from above. Here's a little animation of how git bisect locates the first bad commit.
Let's take 6d7e25d as our broken commit, which we want to "identify". Below, you can watch a little animation of how we bisect from 2bd81f7 (good) and 7b64df1 (bad) all the way to the culprit.
There should be an animation under this line. If it's not, give it time to load. Or click the link below. ☕
📽️ If the embedded animation is too small, feel free to click through this link for a full-screen high-res version! Hope this animation says a thousand words.
Animation reference
convert -delay 1000 -dispose previous -loop 0 animate-*.png animation.gif
A practical example - CLI
In case you're a more hands-on type of person, let's also analyze the whole sequence of commands that leads us to the culprit (0c38d2a), for completeness. If you got the idea from the animation, feel free to skip this. It's gonna be very linear.
Let's go over the command sequence. We start off by setting the borders of the interval, passing a broken commit and a working commit. In this case HEAD is broken, but 2bd81f7 works just fine.
👨💻 git bisect start HEAD 2bd81f7
Git acknowledges the interval borders and checks out a new commit - 5ab5e15. Then it informs us that the culprit will be identified in roughly 4 steps. Great!
🤖 Bisecting: 14 revisions left to test after this (roughly 4 steps) [5ab5e15] fix: improve search input accessibility
We re-run the target (application, test, script, ...) and assess the commit as good.
👨💻 git bisect good
Git acknowledges the good commit and checks out further, to 9d7fae9.
🤖 Bisecting: 7 revisions left to test after this (roughly 3 steps) [9d7fae9] feat: add support for searching by location
We re-run the target (application, test, script, ...) and assess the commit as bad.
👨💻 git bisect bad
Git acknowledges the bad commit and checks out further, to 1a2e82c.
🤖 Bisecting: 3 revisions left to test after this (roughly 2 steps) [1a2e82c] feat: add ability to search within specific fields
We re-run the target (application, test, script, ...) and assess the commit as good.
👨💻 git bisect good
Git acknowledges the good commit and checks out further, to e0ef1c6.
🤖 Bisecting: 1 revision left to test after this (roughly 1 step) [e0ef1c6] feat: add support for searching within attachments
We re-run the target (application, test, script, ...) and assess the commit as bad.
👨💻 git bisect bad
Git acknowledges and checks out the final commit - 0c38d2a.
🤖 Bisecting: 0 revisions left to test after this (roughly 0 steps) [0c38d2a] feat: implement search input highlighting
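We test this last commit as well and mark it as bad with git bisect bad. With no revisions left to test, the process ends and Git reports 0c38d2a as the first bad commit, the one we need to dig through! Running git bisect reset afterwards ends the session and takes us back to where we started.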
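If you want to replay the walkthrough above without touching a repository, the following sketch simulates it in plain shell. It is a simplified model, not Git's actual implementation: it always picks the middle of the remaining range, which happens to match Git's choices for this linear history but may not for histories with merges. The names commit_at, is_bad and simulate_bisect are made up for illustration, and the "test" is replaced by the assumption that every commit from 0c38d2a onward is broken.

```shell
#!/bin/sh
# Commit hashes from the article's history, oldest first (one per line).
COMMITS="2bd81f7
fd1c7ae
03d78e
a2c6f9b
c1b2e7d
9e6c1a8
6c7d2b2
e1f9a4d
8a3b0c9
7dfe573
4a7b8f1
3a8bcb4
b50b832
d0e2b8f
5ab5e15
e7ba51c
4f1df68
1a2e82c
0c38d2a
e0ef1c6
6d7e25d
9d7fae9
6c7f6b3
3f49862
1cbb0c2
7b642f1
3e9a8f8
1d4e4d7
f52973a
7b64df1"

FIRST_BAD_INDEX=19   # assumption for the simulation: 0c38d2a is the culprit

# Print the hash stored on line $1 of the list.
commit_at() { echo "$COMMITS" | sed -n "${1}p"; }

# Stand-in for the manual check: bad means "at or after the culprit".
is_bad() { [ "$1" -ge "$FIRST_BAD_INDEX" ]; }

simulate_bisect() (
  good=1                                       # known good: 2bd81f7
  bad=$(( $(echo "$COMMITS" | wc -l) ))        # known bad: HEAD (7b64df1)
  while [ $(( bad - good )) -gt 1 ]; do
    mid=$(( (good + bad) / 2 ))                # halve the remaining interval
    if is_bad "$mid"; then
      echo "$(commit_at "$mid") bad"
      bad=$mid
    else
      echo "$(commit_at "$mid") good"
      good=$mid
    fi
  done
  echo "first bad commit: $(commit_at "$bad")"
)

simulate_bisect
# prints:
# 5ab5e15 good
# 9d7fae9 bad
# 1a2e82c good
# e0ef1c6 bad
# 0c38d2a bad
# first bad commit: 0c38d2a
```

Changing FIRST_BAD_INDEX lets you check how the path through the history changes for a different culprit, while the number of checks stays at around five for these 30 commits.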
I'll now leave you in peace to debug your broken commit! 😊
Final words
The whole idea of using git bisect for debugging, or rather for tracing a regression, is to isolate the smallest possible piece of code that we can reliably identify as faulty. The smaller the diff is, the easier it should be to locate the culprit code.
Now we know git bisect can take care of this in roughly log2(number_of_commits) steps. Therefore, even if we're working with a large commit sequence, e.g. of 1024 commits, we can trace the regression bug in a mere 10 steps.
Hope you learned something new today and you'll think twice before trying to debug large branches in the future.