# How to Get Binary Search Right

## Contents

As you already know, binary search is a O(log N) way to search if a particular value exists in a sorted array. There are some reports on how "most of our implementation" of binary searches are broken. The mistake discussed in that post is the overflow when calculating the index of the middle element. But regardless of whether you get this bit right, there's a different problem with all binary search implementations I've seen.

They're just too damn complicated and hard to get right out of your memory.

### Key points

Almost always, if you're relying on your memory, you should prefer simple algorithms to complex, like you should implement a treap instead of an AVL tree. So here's a sequence of steps that helps me to write binary search correctly.

#### Bisection instead of "binary search"

Starting from this point, we aren't searching for anything. Searching for an element in array has a too complex definition. What should we return if the element doesn't exist, or there are several of them equal to each other? That's a bit too confusing. Instead, let's *bisect* an array, pose a problem that always have a single solution no matter what.

**Given a boolean function f, find the smallest k such that, for all i < k, f(a[i]) == false, and for all j >= k, f(a[j] == true)**. In other words, we are to find the smallest k such that for all i within array bounds f(a[i]) == (i<k). If the array is sorted in such a way that f(a[i]) is monotonic, being first false and then true, this problem always has a single solution. In some cases, when f(a[i]) is false for all array elements, the resultant k is greater by one than the last index of the array, but this is still not a special case because this is exactly the answer as defined by bold text above.

This is more powerful than the binary search, which can be reduced to this problem I'll call here "bisection". Indeed, to find an element x in the array a of length N, we can do the following:

Now let's get back to solving the bisection, once we've defined it.

#### Well-defined bounds

The answer to our bisection problem is between 0 and N inclusively, so we will start by assigning i to 0, and j to N. No special cases allowed: we can do without them at all. At each step, the answer we seek will be between our values i <= k <= j, and we will not know any information that would help us to reduce the range size.

#### Non-overflowing pivot point

It's easy to make an overflow mistake in computing the pivot point; i.e. the point at the middle of the array we'll apply f to to make a step. This problem is discussed in the article I linked above more. If we were in the world of non-bounded integers, it would simply be (i+j)/2, but we live in the world of integer overflows. If i and j are positive integers, which makes sense for array indexes in C++, the never-overflowing expression would be:

Another useful property of this expression (in C++, at least) is that it's never equal to j if i < j. Here's why it's important.

#### Well-founded termination

A basic way to make sure that your program terminates some day is to construct such a "loop variant" that *strictly* decreases with each loop iteration, and which can't decrease forever. For our bisection, the natural variant is the length of the part of the original array we currently look for the answer in, which is max(j-i, 0). It can't decrease forever because its value would reach zero at some point, from which there's nowhere to proceed. We only need to prove that this variant decreases each iteration by at least one.

Fun fact: special programs can prove that a program terminates automatically, without human intervention! One of these programs is named... Terminator. You can read more about this in a very accessible article "Proving Program Termination" by Byron Cook et al. I first learned about it in a summer school where Mr. Cook himself gave a talk about this.

First step is to make sure that the time it reaches zero would be the last iteration of the loop (otherwise it wouldn't decrease!), so we use while (i < j).

Now assume that we've understood that the pivot point s does not satisfy the target function f. Our key difference with most binary search algorithms is that **we should assign the new i to s+1 rather than s**. Given our definition of the problem, s is *never* the answer in this case, and we should keep our candidate range as small as possible. If f(a[s]) is true, then s could still be the answer, so it does not apply here. Here's our iteration step now:

We mentioned in the previous section that i <= s < j, hence i < s + 1 and s < j, so each of assignments decreases j-i by at least 1 regardless of f's value. This proves that this loop never hangs up forever.

### The complete program

So here's our bisection that works for all sorted arrays (in the sense of f(a[i]) being first false then true as i increases).

The search itself in terms of finding if a value exists in a sorted array, would be then written (I'm just repeating what I wrote above, no surprises here):

### Proof By Authority

And GNU C++ STL implementation of binary_search follows the same guidelines I posted here. In my version of GCC 4.3.4 binary search looks like this:

Where lower_bound is the bisection function.

### Conclusion

Here's my version of bisection and binary search I usually use when I need one. It is designed to be simple, and simple things are easier to get right. Just make sure your loop variant decreases, you do not overflow, and you get rid of as much special cases as possible.

# Comments imported from the old website

The binary search is a little bit more complicated than usually presented. An exercise that can help a lot in writing a correct version of binary search is given here http://arxiv.org/abs/1402.4843 I highly recommend looking at it. Without it, it looks like guessing game. If we exclude recursive solution, there are no more than 4-5 correct implementations of the algorithm. All solutions including the one you gave can be derived from the table given in exercises. (Actually I am collecting implementations and showing how to do that.) Your example is optimized so it is not a good starting point. j in your code is keeping the index which is one after the last element in the selected subarray. You need to explain this explicitely. GNU code is optimized for several reasons, one of them is speeding up on assembly level. I am saying this is an advanced level what you are writing about. Nobody can use it as a starting point.

Many thanks to Martin Ward who pointed to inaccuracies in termination proof description.