Binary search
From CodeCodex
A binary search algorithm (or binary chop) is a technique for finding a particular value in a linear array, by ruling out half of the data at each step, widely but not exclusively used in computer science. A binary search finds the median, makes a comparison to determine whether the desired value comes before or after it, and then searches the remaining half in the same manner. A binary search is an example of a divide and conquer algorithm (more specifically a decrease and conquer algorithm) and a dichotomic search (more at Search algorithm).
Contents |
Implementations
Pseudocode
This determines the index of a given value in a sorted list a between indices left and right:
function binarySearch(a, value, left, right)
if right < left
return not found
mid := floor((right-left)/2)+left
if a[mid] = value
return mid
if value < a[mid]
return binarySearch(a, value, left, mid-1)
else
return binarySearch(a, value, mid+1, right)
Because the calls are tail-recursive, this can be rewritten as a loop, making the algorithm in-place:
function binarySearch(a, value, left, right)
while left ≤ right
mid := floor((right-left)/2)+left
if a[mid] = value
return mid
if value < a[mid]
right := mid-1
else
left := mid+1
return not found
In both cases, the algorithm terminates because on each recursive call or iteration, the range of indexes right minus left always gets smaller, and so must eventually become negative.
Algol pseudocode
%Array A, elements 1 to N, to be searched for value X.
L:=0; %Establish outer bounds.
R:=N + 1; %One before, and one after, the first and last.
while(P:=(R - L)/2) > 0 do %Truncate to integer. Beware integer overflow with (L + R)/2.
Case Sign(X - A(P:=P + L)) %Perform the comparison once.
+1:L:=P; %Left bound up: X follows A(P).
-1:R:=P; %Right bound down: X precedes A(P).
0:Return(P); %X is found, here! X equals A(P).
Otherwise; %No other cases are possible.
Return(-L); %Failed.
Returning -L rather than a constant such as -1 is a service for the caller, indicating where the not-found value X would belong in the array should there be an intention to insert it. Note that Algol allows assignments within an expression. Without that facility the while-loop testing becomes difficult and the method has to be recast if repeated calculation is to be avoided, somewhat as follows:
L:=0; %Establish outer bounds.
R:=N + 1; %One before, and one after, the first and last.
Probe:P:=(R - L)/2; %Truncate to integer. Beware integer overflow with (L + R)/2.
if P <= 0 then Return(-L); %Aha! Nowhere!
P:=P + L; %Probe the middle of the span L:R.
Case Sign(X - A(P)) %Perform the comparison once.
+1:L:=P, go to Probe; %Left bound up: X follows A(P).
-1:R:=P, go to Probe; %Right bound down: X precedes A(P).
0:Return(P); %X is found, here! X equals A(P).
Otherwise; %No other cases are possible.
Here, some "go to" statements are explicit.
Assembly
Formatted in MASM. Input: term is the term being searched for, array is the pointer to the array, and asize indicates the size of array in bytes.
bsearch proc term:DWORD,array:DWORD,asize:DWORD mov eax,array mov ecx,array add ecx,asize @@: cmp eax,ecx jg not_found mov edx,eax add edx,ecx shr edx,1 xchg DWORD PTR [edx],eax cmp eax,term xchg DWORD PTR [edx],eax jg search_right jl search_left mov eax,edx sub eax,array ret search_right: mov ecx,edx jmp @B search_left: mov eax,edx jmp @B not_found: mov eax,-1 ret bsearch endp
ss
C++
C++ has a template-based binary_search() in its standard library, which is what should normally be used, as well as the function pointer based bsearch inherited from C.
In C++, we can take advantage of templates to easily generalize the C implementation described at Binary search (C) to arrays of any type. We simply replace the element type everywhere by a template parameter "T" and replace "<" by a user-supplied less-than comparison operator.
<<array binary search>>=
template< typename T, typename compare_less >
int array_binary_search(T a[], int low, int high, T target) {
while (low <= high) {
int middle = low + (high - low)/2;
if (compare_less(target, a[middle]))
high = middle - 1;
else if (compare_less(a[middle], target))
low = middle + 1;
else
return middle;
}
return -1;
}
But we need not stop at arrays — we can generalize the implementation even further to any container permitting random access, such as std::vector. The conventional way of doing this, as done by the standard library's binary_search, is to pass in a begin and end iterator specifying the range. This is very similar to the above implementation, but with the indexes low and high and the return value replaced by iterators. Also, by convention, we make "end" point one past the last element to be searched instead of using an inclusive range, resulting in each instance of "end" being adjusted by 1; if the element is not found we can return the initial value of end, since this is outside the range being searched and so not a normal return value.
<<generic binary search>>=
template< typename T, typename IterT, typename compare_less >
IterT generic_binary_search(IterT begin, IterT end, T target) {
IterT initial_end = end;
while (begin < end) {
IterT middle = begin + (end - begin - 1)/2;
if (compare_less(target, *middle))
end = middle;
else if (compare_less(*middle, target))
begin = middle + 1;
else
return middle;
}
return initial_end;
}
For convenience, we supply an overload that doesn't take a comparison operator, just using "<". We can't use std::less because C++ doesn't support default template parameters on functions:
<<generic binary search with two template arguments>>=
template< typename T, typename IterT >
IterT generic_binary_search(IterT begin, IterT end, T target) {
IterT initial_end = end;
while (begin < end) {
IterT middle = begin + (end - begin - 1)/2;
if (target < *middle)
end = middle;
else if (*middle < target)
begin = middle + 1;
else
return middle;
}
return initial_end;
}
This simple test code tries out the final binary search on an array and a vector, using std::sort to sort them first:
<<binary_search.cpp>>=
#include <algorithm>
#include <iostream>
#include <vector>
#include <cstdlib> // atoi
<<generic binary search>>
<<generic binary search with two template arguments>>
int main(int argc, char* argv[]) {
int num_elements = std::atoi(argv[1]);
int* a = new int[num_elements];
std::vector<int> v;
for(int i=0; i<num_elements; i++) {
do {
a[i] = rand() % num_elements;
} while(a[i] == num_elements/7);
v.push_back(a[i]);
}
std::sort(a, a + num_elements);
std::sort(v.begin(), v.end());
{
int present_val = a[rand() % num_elements];
int* found_at = generic_binary_search(a, a + num_elements, present_val);
if (found_at == a + num_elements || *found_at != present_val) {
puts("ERROR");
}
}
{
int present_val = v[rand() % num_elements];
std::vector<int>::iterator found_at =
generic_binary_search(v.begin(), v.end(), present_val);
if (found_at == v.end() || *found_at != present_val) {
puts("ERROR");
}
}
if (generic_binary_search(a, a + num_elements, num_elements/7) != a + num_elements ||
generic_binary_search(v.begin(), v.end(), num_elements/7) != v.end())
{
puts("ERROR");
}
return 0;
}
Erlang
NOTE: This particular implementation returns 0 if the value cannot be found.
binary_search(Tuple, Key) ->
binary_search(Tuple, Key, 1, size(Tuple)).
binary_search(_, _, Low, High) when Low > High -> 0;
binary_search(Tuple, Key, Low, High) ->
Mid = (Low + High) div 2,
M = element(Mid, Tuple),
if
M > Key -> binary_search(Tuple, Key, Low, Mid-1);
M < Key -> binary_search(Tuple, Key, Mid+1, High);
true -> Mid
end.
Java
The method takes two arguments: The value to search and the list of values to search in. The input list must be sorted in ascending order prior to searching, else the result of this method is undefined. It must also contain at least one element. Through Java generics, the method can be used with any elements that extend Comparable, like Java number objects or strings, making Java 5 a requirement.
While we are searching, the interval [0, left] is empty or contains only list values < value. The interval [right, n-1] is empty or contains only list values > value. The remaining search interval is [left, right]. The parts left and right of that interval have been searched.
<<search>>=
do {
int mid = left + (right - left) / 2;
// mid values is modified - bug fix - Ref @ http://googleresearch.blogspot.com/2006/06/extra-extra-read-all-about-it-nearly.html
T midVal = values.get(mid);
if (value.compareTo(midVal) < 0)
right = mid - 1;
else if (value.compareTo(midVal) > 0)
left = mid + 1;
else
return mid;
} while (left <= right);
Now left > right and the value was not found in the list. We now want to determine the index at which the element should be inserted to maintain the ordering. As the remaining searching interval is empty now, there is an interval i [0, n-1] with two disjunct intervals i1 [0, left] and i2 [right, n-1] with the following attributes: all indices in i1 have list values < value, all indices in i2 have list values > value and i1 merged with i2 equals i. We now have two possible cases:
Case a) value < midVal & right = mid-1. From the first part follows that mid is in interval [right, n-1] and from the second part follows that mid is at the left edge of the interval, so mid is the smallest index with list value > value. So in case a) an insertion must happen on index mid.
Case b) value > midVal & left = mid+1. From the first part follows that mid is in interval [0, left]. From the second part follows that mid is at the right edge of the interval. This means mid is the largest index with a list value < value, so in case b) we need to insert at mid + 1.
<<return>>= return (value.compareTo(midVal) < 0) ? -mid : -(mid + 1);
Usage
Illustration of the functionality using JUnit 4 unit tests for strings and integers:
<<test_strings>>=
assertEquals(-1, BinarySearch.search("Billy", Arrays.asList("Anny","Emmy","Grammy")));
<<test_integers>>= assertEquals(-1, BinarySearch.search(2, Arrays.asList(1, 3, 4)));
The complete program:
<<BinarySearch.java>>=
import static org.junit.Assert.assertEquals;
import org.junit.Test;
import java.util.Arrays;
import java.util.List;
public class BinarySearch {
static <T extends Comparable<? super T>> int search(T value, List<T> values) {
int n = values.size();
int left = 0, right = n - 1;
<<search>>
<<return>>
}
@Test
public void testStringSearch() {
<<test_strings>>
}
@Test
public void testIntegerSearch() {
<<test_integers>>
}
}
OCaml
NOTE: This particular implementation raises the Not_found exception if the value cannot be found.
# let rec search a ?(l=0) ?(u=Array.length a - 1) k =
if l > u then raise Not_found;
let m = l + (u - l) / 2 in
match compare k a.(m) with
| -1 -> search a ~l ~u:(m - 1) k
| 1 -> search a ~l:(m + 1) ~u k
| _ -> m;;
val search : 'a array -> ?l:int -> ?u:int -> 'a -> int = <fun>
For example:
# search [|0;1;2;3;4;5;6;7;8;9|] 3;; - : int = 3
Perl
1:
my @list = qw(Cat Elephant Giraffe Lion Orangutan Zebra);
print binarySearch(\@list, 'Orangutan'), "\n";
sub binarySearch {
my ($list, $query) = @_;
my ($low, $high) = ($[, $#$list);
while ( $low <= $high ) {
my $try = int( ($low + $high) / 2 );
# note: "lt" and "gt" because we are comparing strings
# if we were comparing numbers then we would use "<" and ">"
$list->[$try] lt $query and do { $low = $try + 1; next };
$list->[$try] gt $query and do { $high = $try - 1; next };
return $try; # By this point it must be equal!
}
}
2:
sub binary_search_recursive {
my ($arr, $value, $left, $right) = @_;
return if $right < $left; # not found
my $mid = int(($right-$left)/2)+$left;
return $mid if $arr->[$mid] == $value;
if ($value < $arr->[$mid]) {
return binary_search_recursive($arr, $value, $left, $mid-1);
} else {
return binary_search_recursive($arr, $value, $mid+1, $right);
}
}
sub binary_search_inplace {
my ($arr, $value, $left, $right) = @_;
while ($left <= $right) {
my $mid = int(($right-$left)/2)+$left;
return $mid if $arr->[$mid] == $value;
if ($value < $arr->[$mid]) {
$right = $mid-1;
} else {
$left = $mid+1;
}
}
}
my @l = qw(0 1 2 3 4 5 6 7 8 9);
print binary_search_recursive \@l, 7, $l[0], $l[-1];
print binary_search_inplace \@l, 7, $l[0], $l[-1];
There is another way to do it, using the indexes of the vector. TODO: Isn't this version better?
use warnings;
sub binary_search(@$) {
my ($arr, $value) = @_;
$left = $[;
$right = %#$arr;
while ($left <= $right) {
my $mid = int($right + $left) >> 1;
if ($arr->[$mid] eq $value)
{
return 1
}
if ($value lt $arr->[$mid]) {
$right = $mid - 1
} else {
$left = $mid + 1
}
}
return 0;
}
my @arr = qw(1 2 3 4 5 6 7 8);
for (0 .. 9)
{
print binary_search(\@arr, "$_") . "\n";
}
PHP
NOTE: This particular implementation returns -1 if the value cannot be found.
// $low and $high have to be integers
function BinarySearch( $array, $key, $low, $high )
{
if( $low > $high ) // termination case
{
return -1;
}
$middle = intval( ( $low+$high )/2 ); // gets the middle of the array
if ( $array[$middle] == $key ) // if the middle is our key
{
return $middle;
}
elseif ( $key < $array[$middle] ) // our key might be in the left sub-array
{
return BinarySearch( $array, $key, $low, $middle-1 );
}
return BinarySearch( $array, $key, $middle+1, $high ); // our key might be in the right sub-array
}
Python
NOTE: This particular implementation returns -1 if the value cannot be found.
# low and high have to be integers
def binarySearch(array, key, low, high):
if low > high: # termination case
return -1
middle = (low + high) / 2 # gets the middle of the array
if array[middle] == key: # if the middle is our key
return middle
elif key < array[middle]: # our key might be in the left sub-array
return binarySearch(array, key, low, middle-1)
else: # our key might be in the right sub-array
return binarySearch(array, key, middle+1, high)
Ruby
NOTE: This particular implementation returns -1 if the value cannot be found.
def binary_search(array, key, low=0, high=array.size-1)
return -1 if low > high
mid = (low + high) / 2
return mid if array[mid]==key
if array[mid] > key
high = mid - 1
else
low = mid + 1
end
binary_search(array, key, low, high)
end
ary = [1,2,3,4,5,6,7,8,9]
puts binary_search(ary, 8) #=> 7
Scheme
NOTE: This particular list implementation returns -1 if the value cannot be found.
(define (binary-search list key low high)
(let ((mid (quotient (+ low high) 2)))
(cond ((> low high) -1)
((= (list-ref list mid) key) mid)
((> (list-ref list mid) key) (binary-search list key low (- mid 1)))
(else (binary-search list key (+ mid 1) high)))))
Seed7
NOTE: This particular implementation returns 0 if the value cannot be found.
const func integer: binarySearch (in array elemType: arr, in elemType: aKey, in integer: low, in integer: high) is func
result
var integer: result is 0;
begin
if low <= high then
result := (low + high) div 2;
if aKey < arr[result] then
result := binarySearch(arr, aKey, low, pred(result)); # search left
elsif aKey > arr[result] then
result := binarySearch(arr, aKey, succ(result), high); # search right
end if;
end if;
end func;
Tcl
Binary search in arbitrary list. A compare procedure should be given
proc _bs {comp l key low high} {
if {$low > $high} {return -1} else {
set middle [expr {($low + $high)/2}]
set test [$comp $key [lindex $l $middle]]
if {$test == 0} {
return $middle
} elseif {$test < 0} {
incr middle -1
return [_bs $comp $l $key $low $middle]
} else {
incr middle
return [_bs $comp $l $key $middle $high]
}
}
}
proc binary_search {comp l key} {_bs $comp $l $key 0 [expr {[llength $l] - 1}]}
Searching in an integer list
proc compare_integer {x y} {expr {$x - $y}}
binary_search compare_integer {0 10 20 30 40 50 60 70 80 90 100 110} 90 ;# -> 9
also
lsearch -integer -sorted {0 10 20 30 40 50 60 70 80 90 100 110} 90
Searching in a string list
proc compare_string {x y} {string compare $x $y}
binary_search compare_string {aa bb cc dd ee ff gg hh} cc ;# -> 2
also
lsearch -sorted {aa bb cc dd ee ff gg hh} cc