I read the following in chapter seven of Beautiful Code. The author wants to make some points about testing, and opts to use binary search as a compact but non-trivial example.

If you have never implemented binary search, or haven’t done so in a few years, I suggest you try that yourself before going forward; it will give you greater appreciation for what follows.

Binary search is a great example because it’s so simple and yet it’s so easy to implement incorrectly. In Programming Pearls, Jon Bentley shares how, over the years, he asked hundreds of professional programmers to implement binary search after providing them with a description of the basic algorithm. He gave them a very generous two hours to write it, and even allowed them to use the high-level language of their choice (including pseudocode). Surprisingly, only about 10 percent of the professional programmers implemented binary search correctly.

I took the author’s advice and wrote my own binary search. I had three separate errors in my first iteration; all were flushed out by unit testing. My unit test strategy, one I try to use whenever possible, was to generate large numbers of random test cases on the fly. I suppose I should have used eunit, but I just rolled my own. Here’s the code.

-export([search/2, test/1]).

%% Return a random integer in 2 .. 5, or return 0
%% In particular, do not return 1; used for generating test cases
rint() ->

rint(Limit) ->
    Chance = random:uniform(Limit),
	Chance == Limit -> 0;
	true -> 1 + random:uniform(Limit)

%% Return a length N list of integers in non-decreasing order
%% 1: repeated elements can occur,
%% 2: distinct consecutive elements must differ by at least 2
%% rint and rlist were designed to force these properties, useful in tests
rlist(N) ->
    rlist(N, []).

rlist(0, Result)->
rlist(N, []) ->
    rlist(N-1, [rint()]);
rlist(N, [H|T]) ->
    rlist(N-1, [H+rint(),H |T]). %%non-decreasing by construction

%% avoid magic constant
failure() -> -1.

%% take care of trivial 0 element case in binary searh
search(_, []) -> failure();

%% List is a list of integers
%% K is an integer; return index of K in List, or failure()
search(K, List)->
    search(K, List, 1, length(List) ).

%% Lo and Hi inclusively bound the portion
%% of List to be searched for K
search(K, List, Lo, Hi) ->
	Hi < Lo -> failure();
	true ->
	    Probe = (Lo + Hi) div 2,
	    Elt = lists:nth(Probe, List),
		K < Elt ->
		K > Elt ->
		true  ->   %% when K == elt

testcase(N) ->
    List = rlist(N),
    Value = lists:nth(random:uniform(N), List),

    %% we know Value is in List
    Value = lists:nth(search(Value, List), List),

    %% we know some other values not in the list
    Failed = failure(),
    Failed = search(Value+1, List),
    Failed = search(Value-1, List),
    Failed = search(lists:nth(N,List)+1, List),
    Failed = search(lists:nth(1,List)-1, List),

    %% return 1 if the searches do not cause an exception

test(NumCases) ->
    test(NumCases, 0).
test(0, NumCorrect) ->
test(M, NumCorrect) ->
    test(M-1, NumCorrect + testcase(random:uniform(100))).

A million random tests passed, so I feel reasonably confident. Interestingly, Beautiful Code article comments at length about a subtle bug arising in a Java implementation. In the Java equivalent of my

Probe = (Hi + Lo) div 2

one needs to watch out for a possible integer overflow which would render Probe negative. It seems to me the flaw is in Java’s practice of allowing integer overflows without throwing an exception. But better than overflowing with an exception is not overflowing at all, that is, having a proper implementation of integers as do erlang, Haskell, lisp, Mathematica, and a host of other languages.

A final note: this code is merely an exercise, done for the sake of learning some erlang. The code, though very likely logically correct, suffers from a terrible performance flaw: accessing the kth element of an erlang list requires time proportional to k. Normally the whole point of binary search is to find things in logarithmic time. That clearly cannot not happen with this implementation. Caveat emptor.