# Difference between revisions of "Binary search"

A binary search algorithm (or binary chop) is a technique for finding a particular value in a linear array, by ruling out half of the data at each step, widely but not exclusively used in computer science. A binary search finds the median, makes a comparison to determine whether the desired value comes before or after it, and then searches the remaining half in the same manner. A binary search is an example of a divide and conquer algorithm (more specifically a decrease and conquer algorithm) and a dichotomic search (more at Search algorithm).

## Implementations

### C++

C++ has a template-based binary_search() in its standard library, which is what should normally be used, as well as the function pointer based bsearch inherited from C.

In C++, we can take advantage of templates to easily generalize the C implementation described at Binary search (C) to arrays of any type. We simply replace the element type everywhere by a template parameter "T" and replace "<" by a user-supplied less-than comparison operator.

<highlightsyntax language="cpp"> <<array binary search>>= template< typename T, typename compare_less > int array_binary_search(T a[], int low, int high, T target) {

```   while (low <= high) {
int middle = low + (high - low)/2;
if (compare_less(target, a[middle]))
high = middle - 1;
else if (compare_less(a[middle], target))
low = middle + 1;
else
return middle;
}
return -1;
```

} </highlightsyntax>

But we need not stop at arrays — we can generalize the implementation even further to any container permitting random access, such as std::vector. The conventional way of doing this, as done by the standard library's binary_search, is to pass in a begin and end iterator specifying the range. This is very similar to the above implementation, but with the indexes low and high and the return value replaced by iterators. Also, by convention, we make "end" point one past the last element to be searched instead of using an inclusive range, resulting in each instance of "end" being adjusted by 1; if the element is not found we can return the initial value of end, since this is outside the range being searched and so not a normal return value.

<highlightsyntax language="cpp"> <<generic binary search>>= template< typename T, typename IterT, typename compare_less > IterT generic_binary_search(IterT begin, IterT end, T target) {

```   IterT initial_end = end;
while (begin < end) {
IterT middle = begin + (end - begin - 1)/2;
if (compare_less(target, *middle))
end = middle;
else if (compare_less(*middle, target))
begin = middle + 1;
else
return middle;
}
return initial_end;
```

} </highlightsyntax>

For convenience, we supply an overload that doesn't take a comparison operator, just using "<". We can't use std::less because C++ doesn't support default template parameters on functions:

<highlightsyntax language="cpp"> <<generic binary search with two template arguments>>= template< typename T, typename IterT > IterT generic_binary_search(IterT begin, IterT end, T target) {

```   IterT initial_end = end;
while (begin < end) {
IterT middle = begin + (end - begin - 1)/2;
if (target < *middle)
end = middle;
else if (*middle < target)
begin = middle + 1;
else
return middle;
}
return initial_end;
```

} </highlightsyntax>

This simple test code tries out the final binary search on an array and a vector, using std::sort to sort them first:

<highlightsyntax language="cpp"> <<binary_search.cpp>>=

1. include <algorithm>
2. include <iostream>
3. include <vector>
4. include <cstdlib> // atoi

<<generic binary search>> <<generic binary search with two template arguments>>

int main(int argc, char* argv[]) {

```   int num_elements = std::atoi(argv[1]);
int* a = new int[num_elements];
std::vector<int> v;
for(int i=0; i<num_elements; i++) {
do {
a[i] = rand() % num_elements;
} while(a[i] == num_elements/7);
v.push_back(a[i]);
}
std::sort(a, a + num_elements);
std::sort(v.begin(), v.end());
```
```   {
int present_val = a[rand() % num_elements];
int* found_at = generic_binary_search(a, a + num_elements, present_val);
if (found_at == a + num_elements || *found_at != present_val) {
puts("ERROR");
}
}
{
int present_val = v[rand() % num_elements];
std::vector<int>::iterator found_at =
generic_binary_search(v.begin(), v.end(), present_val);
if (found_at == v.end() || *found_at != present_val) {
puts("ERROR");
}
}
if (generic_binary_search(a, a + num_elements, num_elements/7) != a + num_elements ||
generic_binary_search(v.begin(), v.end(),  num_elements/7) != v.end())
{
puts("ERROR");
}
return 0;
```

} </highlightsyntax>

### Pseudocode

This determines the index of a given value in a sorted list a between indices left and right:

```function binarySearch(a, value, left, right)
if right < left
mid := floor((right-left)/2)+left
if a[mid] = value
return mid
if value < a[mid]
return binarySearch(a, value, left, mid-1)
else
return binarySearch(a, value, mid+1, right)
```

Because the calls are tail-recursive, this can be rewritten as a loop, making the algorithm in-place:

```function binarySearch(a, value, left, right)
while left ≤ right
mid := floor((right-left)/2)+left
if a[mid] = value
return mid
if value < a[mid]
right := mid-1
else
left  := mid+1
```

In both cases, the algorithm terminates because on each recursive call or iteration, the range of indexes `right` minus `left` always gets smaller, and so must eventually become negative.

### C

• found is 1 if the item exist otherwise 0
• loc is where the item is if found
• Item is the value you are looking for
• right is the size -1 at first when called
• left is 0 when called
• Array is the sorted array of elements.
```  void binarySearch( int * loc, int * found, int Item, int Array[], int right , int left)
{
int middle = ( right + left ) /2;
(* loc) = -1;
(*found) = 0;
if(( middle>=0)&&(Array[middle]==Item))
{
(*loc)=middle;
(*found)=1;
}
else if(right>left)
{
if(Array[middle]<Item)
binarySearch(loc,found,Item,Array,right,middle+1);
else
binarySearch(loc,found,Item,Array,middle-1,left);
}
}
```

To further adapt it so you can insert an element if it does not exist you can use loc value as follows

```  void binarySearch( int * loc, int * found, int Item, int Array[], int right , int left)
{
int middle = ( right + left ) /2;
(*loc)=right;
if(*loc<0)
(*loc)=0;
(*found) = 0;
if((middle>=0)&&( Array[middle]==Item ))
{
(*loc)=middle;
(*found)=1;
}
else if(right>left)
{
if(Array[middle]<Item)
binarySearch(loc,found,Item,Array,right,middle+1);
else
binarySearch(loc,found,Item,Array,middle-1,left);
}
else if(Array[middle]<Item)
{
(*loc)=right+1;
}
}
```

NOTE: The following implementation returns -1 if the value cannot be found.

```int BinarySearch(int *array, int key, int low, int high)
{
int middle;
if(low > high) /* termination case */
return -1;
middle = (low+high)/2;
if(array[middle] == key)
return middle;
else if(array[middle] > key)
return BinarySearch(array, key, low, middle-1); /* search left */
return BinarySearch(array, key, middle+1, high); /* search right */
}
```

Here's a more elaborate version using a callback to perform the actual element access and comparison. That way, it is possible to search more elaborate structures than simple C arrays, since all knowledge of the structure is delegated to the callback. Also if the element is not found, an index is still returned which can be useful for inserting a new element with that search key.

```#include <stdbool.h>

typedef unsigned long
ulong;

typedef int (*CompareProc)
/* compares an element key against the desired key */
(
ulong Element,
void * Arg
);
/* result should be -1, 0 or +1,
indicating whether given key is less than, equal to or
greater than desired key. */

bool Search
(
ulong LowBound,
ulong HighBound,
CompareProc Compare,
void * CompareArg,
bool ReturnGreater,
ulong * Location
)
/* does a binary search of a sorted list of elements identified
by indexes in the range [LowBound .. HighBound]. The specified
Compare routine is called to compare the key of each element
against the desired one. If a match is found, its index is
returned in Location and the function result is true. If no
match is found, the function result is false, and either the
index of the least element not less than the desired one (if
ReturnGreater is true) or the greatest element not greater than
the desired one (if ReturnGreater is false) is returned in
Location. */
{
ulong MidElement;
bool Found;
int Comparison;
for (;;)
{
if (LowBound > HighBound)
{
Found = false;
if (ReturnGreater)
{
*Location = LowBound;
}
else
{
*Location = HighBound;
} /*if*/
break;
} /*if*/
MidElement = (LowBound + HighBound) / 2;
Comparison = Compare(MidElement, CompareArg);
if (Comparison < 0)
{
LowBound = MidElement + 1;
}
else if (Comparison > 0)
{
HighBound = MidElement - 1;
}
else
{
*Location = MidElement;
Found = true;
break;
} /*if*/
} /*for*/
return
Found;
} /*Search*/
```

### OCaml

NOTE: This particular implementation raises the Not_found exception if the value cannot be found.

```# let rec search a ?(l=0) ?(u=Array.length a - 1) k =
if l > u then raise Not_found;
let m = l + (u - l) / 2 in
match compare k a.(m) with
| -1 -> search a ~l ~u:(m - 1) k
| 1 -> search a ~l:(m + 1) ~u k
| _ -> m;;
val search : 'a array -> ?l:int -> ?u:int -> 'a -> int = <fun>
```

For example:

```# search [|0;1;2;3;4;5;6;7;8;9|] 3;;
- : int = 3
```

### Perl

1: <HIGHLIGHTSYNTAX language="perl"> my @list = qw(Cat Elephant Giraffe Lion Orangutan Zebra);

print binarySearch(\@list, 'Orangutan'), "\n";

sub binarySearch {

```   my (\$list, \$query) = @_;
my (\$low, \$high)   = (0, \$#{\$list});
```
```   while ( \$low <= \$high ) {
my \$try = int( (\$low + \$high) / 2 );
```
```       # note: "lt" and "gt" because we are comparing strings
# if we were comparing numbers then we would use "<" and ">"
\$list->[\$try] lt \$query and do { \$low  = \$try + 1; next };
\$list->[\$try] gt \$query and do { \$high = \$try - 1; next };
```
```       return \$try; # By this point it must be equal!
}
```

} </HIGHLIGHTSYNTAX>

2: <HIGHLIGHTSYNTAX language="perl"> use POSIX qw(floor);

sub binary_search_recursive {

```   my (\$arr, \$value, \$left, \$right) = @_;
my \$mid = floor((\$right-\$left)/2)+\$left;
return \$mid if \$arr->[\$mid] == \$value;
if (\$value < \$arr->[\$mid]) {
return binary_search_recursive(\$arr, \$value, \$left, \$mid-1);
} else {
return binary_search_recursive(\$arr, \$value, \$mid+1, \$right);
};
```

};

sub binary_search_inplace {

```   my (\$arr, \$value, \$left, \$right) = @_;
while (\$left <= \$right) {
my \$mid = floor((\$right-\$left)/2)+\$left;
return \$mid if \$arr->[\$mid] == \$value;
if (\$value < \$arr->[\$mid]) {
\$right = \$mid-1;
} else {
\$left  = \$mid+1;
};
};
```

};

my @l = qw(0 1 2 3 4 5 6 7 8 9); print binary_search_recursive \@l, 7, \$l[0], \$l[-1]; print binary_search_inplace \@l, 7, \$l[0], \$l[-1]; </HIGHLIGHTSYNTAX>

There is another way to do it, using the indexes of the vector. TODO: Isn't this version better?

<HIGHLIGHTSYNTAX language="perl"> use warnings;

sub binary_search(@\$) {

```my (\$arr, \$value) = @_;
\$left = 0;
\$right = @\$arr - 1;
while (\$left <= \$right) {
my \$mid = int(\$right + \$left) >> 1;
if (\$arr->[\$mid] eq \$value)
{
return 1
}
if (\$value lt \$arr->[\$mid]) {
\$right = \$mid - 1
} else {
\$left  = \$mid + 1
}
}
return 0;
```

};

my @arr = qw(1 2 3 4 5 6 7 8);

for (0 .. 9) {

``` print binary_search(\@arr, "\$_") . "\n";
```

} </HIGHLIGHTSYNTAX>

### PHP

NOTE: This particular implementation returns -1 if the value cannot be found.

```// \$low and \$high have to be integers

function BinarySearch( \$array, \$key, \$low, \$high )
{
if( \$low > \$high ) // termination case
{
return -1;
}

\$middle = intval( ( \$low+\$high )/2 ); // gets the middle of the array

if ( \$array[\$middle] == \$key ) // if the middle is our key
{
return \$middle;
}
elseif ( \$key < \$array[\$middle] ) // our key might be in the left sub-array
{
return BinarySearch( \$array, \$key, \$low, \$middle-1 );
}

return BinarySearch( \$array, \$key, \$middle+1, \$high ); // our key might be in the right sub-array
}
```

### Ruby

NOTE: This particular implementation returns -1 if the value cannot be found.

```def binary_search(array, key, low, high)
return -1 if low> high
mid = low + (high -low)/2
return mid if array[mid]==key
if array[mid] > key
high = mid - 1
else
low = mid + 1
end
binary_search(array, key, low, high)
end

```

### Python

NOTE: This particular implementation returns -1 if the value cannot be found.

```# low and high have to be integers

def binarySearch(array, key, low, high):
if low > high: # termination case
return -1

middle = (low + high) / 2 # gets the middle of the array

if array[middle] == key:  # if the middle is our key
return middle
elif key < array[middle]: # our key might be in the left sub-array
return binarySearch(array, key, low, middle-1)
else:                     # our key might be in the right sub-array
return binarySearch(array, key, middle+1, high)
```

### Scheme

NOTE: This particular list implementation returns -1 if the value cannot be found.

```(define (binary-search list key low high)
(let ((mid (quotient (+ low high) 2)))
(cond ((> low high) -1)
((= (list-ref list mid) key) mid)
((> (list-ref list mid) key) (binary-search list key low (- mid 1)))
(else (binary-search list key (+ mid 1) high)))))
```

### Seed7

NOTE: This particular implementation returns 0 if the value cannot be found.

```const func integer: binarySearch (in array elemType: arr, in elemType: aKey, in integer: low, in integer: high) is func
result
var integer: result is 0;
begin
if low <= high then
result := (low + high) div 2;
if aKey < arr[result] then
result := binarySearch(arr, aKey, low, pred(result)); # search left
elsif aKey > arr[result] then
result := binarySearch(arr, aKey, succ(result), high); # search right
end if;
end if;
end func;
```

### Tcl

Binary search in arbitrary list. A compare procedure should be given

```proc _bs {comp l key low high} {
if {\$low > \$high} {return -1} else {
set middle [expr {(\$low + \$high)/2}]
set test [\$comp \$key [lindex \$l \$middle]]
if {\$test == 0} {
return \$middle
} elseif {\$test < 0} {
incr middle -1
return [_bs \$comp \$l \$key \$low \$middle]
} else {
incr middle
return [_bs \$comp \$l \$key \$middle \$high]
}
}
}

proc binary_search {comp l key} {_bs \$comp \$l \$key 0 [expr {[llength \$l] - 1}]}
```

Searching in an integer list

```proc compare_integer {x y} {expr {\$x - \$y}}
binary_search compare_integer {0 10 20 30 40 50 60 70 80 90 100 110} 90 ;# -> 9
```

also

```lsearch -integer -sorted {0 10 20 30 40 50 60 70 80 90 100 110} 90
```

Searching in a string list

```proc compare_string {x y} {string compare \$x \$y}
binary_search compare_string {aa bb cc dd ee ff gg hh} cc ;# -> 2
```

also

```lsearch -sorted {aa bb cc dd ee ff gg hh} cc
```

### Algol pseudocode

```  %Array A, elements 1 to N, to be searched for value X.
L:=0;                        %Establish outer bounds.
R:=N + 1;                    %One before, and one after, the first and last.
while(P:=(R - L)/2) > 0 do   %Truncate to integer. Beware integer overflow with (L + R)/2.
Case Sign(X - A(P:=P + L))  %Perform the comparison once.
+1:L:=P;                   %Left bound up:    X follows A(P).
-1:R:=P;                   %Right bound down: X precedes A(P).
0:Return(P);              %X is found, here! X equals A(P).
Otherwise;                  %No other cases are possible.
Return(-L);                  %Failed.
```

Returning -L rather than a constant such as -1 is a service for the caller, indicating where the not-found value X would belong in the array should there be an intention to insert it. Note that Algol allows assignments within an expression. Without that facility the while-loop testing becomes difficult and the method has to be recast if repeated calculation is to be avoided, somewhat as follows:

```       L:=0;                       %Establish outer bounds.
R:=N + 1;                   %One before, and one after, the first and last.
Probe:P:=(R - L)/2;               %Truncate to integer. Beware integer overflow with (L + R)/2.
if P <= 0 then Return(-L);  %Aha! Nowhere!
P:=P + L;                   %Probe the middle of the span L:R.
Case Sign(X - A(P))         %Perform the comparison once.
+1:L:=P, go to Probe;         %Left bound up:    X follows A(P).
-1:R:=P, go to Probe;         %Right bound down: X precedes A(P).
0:Return(P);                 %X is found, here! X equals A(P).
Otherwise;                  %No other cases are possible.
```

Here, some "go to" statements are explicit.

### Assembly

Formatted in MASM. Input: term is the term being searched for, array is the pointer to the array, and asize indicates the size of array in bytes.

```bsearch proc term:DWORD,array:DWORD,asize:DWORD
mov eax,array
mov ecx,array
@@:
cmp eax,ecx
jg not_found
mov edx,eax
shr edx,1
xchg DWORD PTR [edx],eax
cmp eax,term
xchg DWORD PTR [edx],eax
jg search_right
jl search_left
mov eax,edx
sub eax,array
ret
search_right:
mov ecx,edx
jmp @B
search_left:
mov eax,edx
jmp @B
not_found:
mov eax,-1
ret
bsearch endp
```

### Java

The method takes two arguments: The value to search and the list of values to search in. The input list must be sorted in ascending order prior to searching, else the result of this method is undefined. It must also contain at least one element. Through Java generics, the method can be used with any elements that extend Comparable, like Java number objects or strings, making Java 5 a requirement.

While we are searching, the interval [0, left] is empty or contains only list values < value. The interval [right, n-1] is empty or contains only list values > value. The remaining search interval is [left, right]. The parts left and right of that interval have been searched.

<highlightsyntax language=java> <<search>>=

do {

```   int mid = (left + right) / 2;
T midVal = values.get(mid);
if (value.compareTo(midVal) < 0)
right = mid - 1;
else if (value.compareTo(midVal) > 0)
left = mid + 1;
else
return mid;
```

} while (left <= right); </highlightsyntax>

Now left > right and the value was not found in the list. We now want to determine the index at which the element should be inserted to maintain the ordering. As the remaining searching interval is empty now, there is an interval i [0, n-1] with two disjunct intervals i1 [0, left] and i2 [right, n-1] with the following attributes: all indices in i1 have list values < value, all indices in i2 have list values > value and i1 merged with i2 equals i. We now have two possible cases:

Case a) value < midVal & right = mid-1. From the first part follows that mid is in interval [right, n-1] and from the second part follows that mid is at the left edge of the interval, so mid is the smallest index with list value > value. So in case a) an insertion must happen on index mid.

Case b) value > midVal & left = mid+1. From the first part follows that mid is in interval [0, left]. From the second part follows that mid is at the right edge of the interval. This means mid is the largest index with a list value < value, so in case b) we need to insert at mid + 1.

<highlightsyntax language=java> <<return>>=

return (value.compareTo(midVal) < 0) ? -mid : -(mid + 1); </highlightsyntax>

#### Usage

Illustration of the functionality using JUnit 4 unit tests for strings and integers:

<highlightsyntax language="java"> <<test_strings>>=

assertEquals(-1, BinarySearch.search("Billy", Arrays.asList("Anny","Emmy","Grammy"))); </highlightsyntax>

<highlightsyntax language=java> <<test_integers>>=

assertEquals(-1, BinarySearch.search(2, Arrays.asList(1, 3, 4))); </highlightsyntax> The complete program: <highlightsyntax language=java> <<BinarySearch.java>>=

import static org.junit.Assert.assertEquals; import org.junit.Test;

import java.util.Arrays; import java.util.List;

public class BinarySearch { static <T extends Comparable<? super T>> int search(T value, List<T> values) { int n = values.size(); int left = 0, right = n - 1; <<search>> <<return>> } @Test public void testStringSearch() { <<test_strings>> }

@Test public void testIntegerSearch() { <<test_integers>> } } </highlightsyntax>