Hash Tables part 2: Separate Chaining.
Part 2: expanding the collision resolution repertoire
Welcome back for part two of my articles on hash tables. In my previous article i covered the basics of linear probing. In this article i will discuss another popular collision resolution technique: separate chaining. While open address tables contain just one item per bucket, choosing instead a different slot when a key collision occurs, separate chaining takes the concept of "bucket" to its logical conclusion. By marrying a dynamically sized container, most commonly a linked list to each table slot we can store each key that maps to that slot in said bucket. Other data structures besides linked lists have been explored as well, in languages where dynamically sized arrays are available they can provide better cache locality than a linked list will. In implementations where the number of buckets must be kept small in turn leading to buckets of higher item counts, binary search trees cab be employed to cut down on search times. In this article i will be using singly linked lists to keep the concept its self the focus. Despite their similarities there are significant enough differences between open addressing tables and separate chained tables to warrant significant differences in the organization of the data structures. In that sense, this article may seem like its own stand alone piece than a part two. But without further delay lets dig in to this most useful data structure.Many pieces makes a whole
As previously discussed a hash table is a marriage of concepts requiring many pieces working in concert. These pieces are:- The hash function - for converting keys to table indexes
- The table data structure - an array of "buckets" that key/value pairs will be mapped to, with associated satellite info such as item count and table size.
- The collision resolution strategy - a way to handle multiple keys that map to the same table slot.
Handling Table Insertion
Insertion into a table with separate chaining is a breeze, and is really where this collision resolution shines. Once we have computed the index from our key, we proceed exactly the same as we would for insertion into a singly list. Pushing the new item to the front of the bucket's linked list will yield an any case O(1) insertion algorithm. It is important to note that this is a worst case performance that open addressing implementations cannot match!Retrieving data from the hash table
Once our table is populated, querying it becomes a very simple matter as well. We proceed by computing the hash value of the key for the item in question and then perform a simple sequential sort of the bucket, if the item is in the table, we return its associated value:Deleting Items from the hash table
The deletion items also proceeds exactly as it would from a linked list. The one caveat being that we are using special list head structures, so an additional check is required to see if they item we are searching for is the first item on the list, as this case must be handled differently:But wait! There's more....
With that we have a full implementation of our separate chaining hash table. But that does not conclude my series on hash tables. In the next article I will cover yet more strategies, such as double hashing, cuckoo hashing, and hashing values that are not based on strings.Search
Recent Posts
-
Lossless Compression Part III: Huffman Coding
-
Lossless Compression Part II: The LZ77 Algorithm
-
Lossless Compression Part I: Working with Bits in a Byte Oriented World
-
Bottom Up AVL Tree: The OG Self-Balancing Binary Search Tree
-
A Different Take on Merge Sort: Binary Search Trees?
-
Deleting Entries From Open-Address Hash tables
-
Transform any Binary Search Tree in to a Sorted Linked List
-
From Regular Expressions To NFA by way of Abstract Syntax Trees: Thompsons Construction
-
Extending Sedgewick's explicit Digraph NFA
-
Iterative In-order B Tree Traversal
Meta
Leave A Comment