Digital Search Trees
Digital Search Trees are an interesting data structure. Often times they are used as a gentle introduction to bit-oriented data structures, such as PATRICIA tries. Having just covered some other bit focused data structures and algorithms in the previous posts, I figured now would be as good a time as any to talk about digital search trees.
Digital Search Trees (DST) are very similar to their byte oriented cousin, the Binary Search Tree. DST's operate on the individual bits of the binary representation of the key being used, and are themselves a type of binary tree. They are so similar that the algorithms are practically identical. Despite their similarities there is one very important difference between the two: despite the name of digital search trees, they do not maintain the "search tree" property. An inorder traversal of the tree will not result in a sorted collection.
A Search Tree That Branches on Bits
A Digital Search Tree (DST) is a binary tree which when inserting or searching, makes branch descisions based on the bit value of the bit located in the position at which the current tree depth is. As such, at the root level we check the bit in the left most position (index 0), the roots children use the bit at index 1 and so on. In short, we branch based on the ith bit when at the ith level of the tree.
In a previous article where I introduced my BitStream class used in various projects, I briefly mentioned and gave a snippet of code for a procedure called getBitAtX(). Serendipitously, this procedure turns out to be precisely whats needed for us to implement searching and insertion in digital search trees! The procedure as the name implies, allows as us to retrive the individual bit, indexed by position, of a provided value.
int getBitAtX(BYTE b, int x) {
return b & (1 << x);
}
Armed with this procedure we can now very easily implement digital search trees using the following class definition as a starting point:
template <class K, class V>
class DST {
private:
struct node {
K key;
V value;
node* left;
node* right;
node(K c_, V v_) {
key = c_; value = v_;
left = nullptr;
right = nullptr;
}
};
using link = node*;
link root;
int digit(K c, int d) {
return c & (1 << d);
}
public:
DST() {
root = nullptr;
}
};
As can be seen by the above class definition, and as you will soon see with the algorithms for insertion and searching, the code for digital search tree's and binary search tree's proceed very, VERY similarly.
Searching & Insertion in Digital Search Tree
As mentioned above, during search and insertion we comprare the bit of the keys of the level of the tree we are currently visiting, with the root level being level 0, its children being level one, and so on.
If the bit value of the bit being tested is a 0, the left branch is chosen. If the bit value of the current position is 1, we take the right branch. The end result is a tree structure which tends to be flatter and thus better balanced then a standard byte-oriented Binary Search Tree.
We track the depth currently being visited through the variable 'd' in both the search and insertion algorithm. If we encounter a null link, then the search has failed and the key we are looking for is not in the tree.
V& search(K key) {
link tmp = getR(root);
return tmp == nullptr ? nilValue:tmp->value;
}
link getR(link h, K key, int d) {
if (h == nullptr)
return nullptr;
if (key == h->key)
return h;
if (digit(key, d)) return getR(h->right, key, d+1);
else return getR(h->left, key, d+1);
}
Insertion proceeds similarly, except we create a new node upon a search failure, as we do in binary search trees:
void insert(K key, V value) {
root = putR(root, key, value, 0);
}
link putR(link h, K c, V v, int d) {
if (h == nullptr)
return new node(c, v);
if (digit(c, d)) h->right = putR(h->right, c, v, d+1);
else h->left = putR(h->left, c, v, d+1);
return h;
}
As you can see, I certainly wasnt lying when I said the algorithms were very similar. What is not similar is the resulting shape of the tree. If we insert the same set of keys into a binary search tree and digital search tree, the difference is striking.
-
Digital Search Trees
-
Lossless Compression Part III: Huffman Coding
-
Lossless Compression Part II: The LZ77 Algorithm
-
Lossless Compression Part I: Working with Bits in a Byte Oriented World
-
Bottom Up AVL Tree: The OG Self-Balancing Binary Search Tree
-
A Different Take on Merge Sort: Binary Search Trees?
-
Deleting Entries From Open-Address Hash tables
-
Transform any Binary Search Tree in to a Sorted Linked List
-
From Regular Expressions To NFA by way of Abstract Syntax Trees: Thompsons Construction
-
Extending Sedgewick's explicit Digraph NFA
Leave A Comment