

However, if you use a simple hash function together with what’s called “linear probing” you can create a decent hash table quite easily. Hash tables can seem quite scary: there are a lot of different types, and a ton of different optimizations you can do. Thanks Seth Arnold and Olaf Seibert for the feedback. Not that I’ll be searching a 16 exabyte array on my 64-bit system anytime soon! For further reading, see the article Nearly All Binary Searches and Mergesorts are Broken. However, I’m going to leave the code stand for educational purposes – with the initial overflow check, I don’t think there’s a bug, but it is non-ideal that I’m only allowing half the range of size_t. This would mean changing the mid calculation to low + (high-low)/2. Note: in binary_search, it would be slightly better to avoid the up-front “half size overflow check” and allow the entire range of size_t. Here is the algorithm in C (assuming each array item is a string key and integer value): Searching for bob takes five steps (indexes 0 through 4). If the key matches what you’re looking for, you’re done. You simply start at the beginning ( foo at index 0) and compare each key. Let’s say you’re searching for the key bob in the following array (each item is a string key with an associated integer value): Index With this type of search you’re comparing an average of num_keys/2 items. Linear search also allows you to append new items to the end of the array. This is actually not a bad strategy if you’ve only got a few items – in my simple comparison using strings, it’s faster than a hash table lookup up to about 7 items (but unless your program is very performance-sensitive, it’s probably fine up to 20 or 30 items). The simplest option is to use linear search to scan through an array. This is often necessary in C, but it can also be useful if you need a custom hash table when using another language. We’re going to take a quick look at linear and binary search, and then learn how to write our own hash table. There are many things you can do when you realize this: use linear search, use binary search, grab someone else’s hash table implementation, or write your own hash table. Recently I wrote an article that compared a simple program that counts word frequencies across various languages, and one of the things that came up was how C doesn’t have a hash table data structure in its standard library. Go to: Linear search | Binary search | Hash tables | Implementation | Discussion My goal is to show that hash table internals are not scary, but – within certain constraints – are easy enough to build from scratch. I briefly demonstrate linear and binary search, and then design and implement a hash table.


Summary: An explanation of how to implement a simple hash table data structure using the C programming language.
