Home arrow Practices arrow Page 3 - More Techniques for Finding Things

Binary Search Trade-offs - Practices

In this second part of a two-part series that provides an overview of search techniques for the developer, you'll learn more about the challenges and trade-offs of various approaches. It is excerpted from chapter four of Beautiful Code: Leading Programmers Explain How They Think, written by Andy Oram and Greg Wilson (O'Reilly, 2007; ISBN: 0596510047). Copyright © 2007 O'Reilly Media, Inc. All rights reserved. Used with permission from the publisher. Available from booksellers or direct from O'Reilly Media.

TABLE OF CONTENTS:
  1. More Techniques for Finding Things
  2. Binary Search
  3. Binary Search Trade-offs
  4. Escaping the Loop
  5. Searching the Web
By: O'Reilly Media
Rating: starstarstarstarstar / 4
July 17, 2008

print this article
SEARCH DEV SHED

TOOLS YOU CAN USE

advertisement

Binary search has some very large advantages. First of all, its performance is O(log2 N) . People often donít really grasp how powerful this is. On a 32-bit computer, the biggest log2 youíll ever encounter is 32 (similarly, 64 on a 64-bit computer), and any algorithm that competes in an upper bound of a few dozen steps will be ďgood enoughĒ for many real-world scenarios.

Second, the binary-search code is short and simple. Code that is short and simple is beautiful, for a bunch of reasons. Maybe the most important is that itís easier to understand, and understanding code is harder than writing it. There are fewer places for bugs to hide. Also, compact code plays better with instruction sets, I-caches, and JIT compilers, and thus tends to run faster.

Third, once youíve got that sorted array, you donít need any more index structures; binary search is very space-efficient.

The big downside to binary search is that the data has to be kept in order in memory. There are some data sets for which this is impossible, but fewer than you might think. If you think you have too much data to fit in memory, check the price of RAM these days and make sure. Any search strategy that requires going to disk is going to be immensely more complex, and in many scenarios slower.

Suppose you need to update the data set; you might think that would rule out binary search because you have to update a huge, contiguous array in memory. But that turns out to be easier than you might think. In fact, your programís memory is scattered randomly all over the computerís physical RAM, with the operating systemís paging software making it look sequential; you can do the same kind of trick with your own data.

Some might argue that since a hash table isO(1), that has to be better than binary searchísO(log2N). In practice, the difference may not be that significant; set up an experiment sometime and do some measurements. Also, consider that hash tables, with the necessary collision-resolution code, are considerably more complex to implement.

I donít want to be dogmatic, but in recent years, Iíve started to take the following approach to search problems:

  1. Try to solve it using your languageís built-in hash tables.
  2. Then try to solve it with binary search.
  3. Only then should you reluctantly start to consider other more complex options.



 
 
>>> More Practices Articles          >>> More By O'Reilly Media
 

blog comments powered by Disqus
escort Bursa Bursa escort Antalya eskort
   

PRACTICES ARTICLES

- Calculating Development Project Costs
- More Techniques for Finding Things
- Finding Things
- Finishing the System`s Outlines
- The System in So Many Words
- Basic Data Types and Calculations
- What`s the Address? Pointers
- Design with ArgoUML
- Pragmatic Guidelines: Diagrams That Work
- Five-Step UML: OOAD for Short Attention Span...
- Five-Step UML: OOAD for Short Attention Span...
- Introducing UML: Object-Oriented Analysis an...
- Class and Object Diagrams
- Class Relationships
- Classes

Developer Shed Affiliates

 


Dev Shed Tutorial Topics: