Left Leaning Red/Black Trees, Part 1: Theory, Rotations, Creation and Insertion.
Binary Search Trees
Binary Search Trees are pervasive in computer science, their usefulness can not be overemphasized. From parsing, to searching, to sorting they pop up everywhere, even underpinning other ADT. They offer quick look up times and keep their data stored in sorted order with very little over head. They Are not without their pitfalls however, in the worst case, which results from data being entered in sorted, or almost sorted order you can end up with whats called a "skewed tree", which is when the tree structure devolves into a linked list. In this situation the benefits of a binary search tree in terms of fast lookup times (searching) has been lost and you back at O(n) instead of O(logN).
There is an answer to this problem in the form of "Self Balancing Binary Search Trees".
Self Balancing Binary Search Trees
By employing whats called "Tree Rotations", self balancing binary search trees can modify their shape while still conforming to the Binary Search Tree properties. Their are many types of such trees: Red Black trees, AVL Trees, Scapegoat trees, and many others.
Like so many things in computer science, the concept behind them is relatively simple, the implementation is another matter entirely. In the early 60's two soviet computer scientists invented the AVL Tree. AVL Trees were the first such data structure to make use of tree rotations for self modification, and for a while they stood alone. Then in the late 1970s Red/Black trees appeared on the scene, and during the 1980s many other types of self balancing trees appeared.
While Red/Black trees are not as strictly balanced as AVL trees, they offer faster insertion/deletion times in comparison. AVL Trees however slightly faster look up.
Their usefulness, combined with their worst case performance guarantee assured them a place in many standard libraries and critical applications. The C++ Standard Template Library makes use of Red/Black Trees to implement std::map and std::set. Java also makes use of Red/Black Trees in it's standard library as well. The linux kernel uses it for scheduling purposes, and the list goes on.
With increased performance comes increased code complexity and a proper implementation of a Red/Black tree was not for the faint of heart, with full featured, well working versions often stretching into several HUNDRED lines of code, in comparison a non self balancing BST can be fully implemented in less than 100.
Research Yields a Breakthrough in Code Complexity
Robert Sedgewick of Princeton University, author of the book "Algorithms", and co-inventor of the original red/black trees made major strides in making Red/Black trees easier to code and the early 2000's wrote several papers on What has become known "Left Leaning Red Black Trees." This name comes from the relaxation of one of the rules for implementing Red/Black trees.
Red/Black Trees were developed by representing 2-3-4 Trees (Themselves a special case of B-Trees) as a binary search tree through the use of edge/node "coloring". 2-3-4 Trees can have nodes that have either 2, 3, or 4 key/values stored in them. By changing one of the implementation rules so that "3 nodes" can ONLY be represented by 2 red links leaning left, the number of "special cases" needed when do insertion was reduced from 6 to just 2, making their implementation almost trivially simple. The new research into Left Leaning Red Black Trees also came with a surprise: if the work of splitting nodes is performed on your way back up the tree instead of on your way down during the recursion process, the resulting tree models a 2-3 tree instead of a 2-3-4 tree!
For more information on how Red/Black Trees are related to 2-3-4d trees, read the paper by Sedgewick at
https://www.cs.princeton.edu/~rs/talks/LLRB/RedBlack.pdf
Left Leaning Red Black Tree Operations
Many of the operations conducted on a Binary Search Tree will work on LLRB trees unmodified: getMin(), getMax(), search, and traversals all perform as expected. Insert and delete require some modifications, as this is the operations which change the structure of the tree and so is where the transformations occur. To make these transformations possible, we will need to add some helper functions: rotateLeft(), rotateRight(), colorFlip(), and isRed().
The structure of a LLRB node looks as follows:
static const bool red = true;
static const bool black = false;
class rbnode_t {
public:
int key;
int value;
bool color;
rbnode_t* left,
rbnode_t* right;
rbnode_t(int k, int v, bool c) : key(k), value(v), color(c), right(nullptr), left(nullptr) { }
};
As can be seen, the structure differs only from a normal BST in the addition of a bool value to denote color.
our LLRB tree structure is laid like so:
class LLRB {
private:
bool isRed(rbnode_t* x);
rbnode_t* put(rbnode_t* x, int k, int v);
rbnode_t* rotateLeft(rbnode_t* x);
rbnode_t* rotateRight(rbnode_t* x);
rbnode_t* colorFlip(rbnode_t* x);
rbnode_t* root;
public:
LLRB()
{
root - nullptr;
}
void insert(int key, int value) { root = put(root, key, value); }
};
The above definitions are the functions used for building the tree, more will be added as we add more functionality, but it will get us started. Lets implement the helper functions before we dive into the insertion, starting with the simplest, isRed().
isRed() is used for verifying the color of the links as we work our way through the tree, crucially, it also checks if a link is a null pointer, to avoid us trying to read values from a node that doesn't exist:
bool isRed(rbnode_t* x)
{
if (x == nullptr)
return false;
return (x->color == red);
}
colorFlip() swaps the colors of a node and its children, as LLRB trees are binary tree representations of 2-3-4 trees, colorFlip() is analogous to the process of "splitting a 4 node". Again, this is covered in more depth in the paper i linked to above.
rbnode_t* colorFlip(rbnode_t* x)
{
x->color = !x->color;
x->left->color = !x->left->color;
x->right->color = !x->right->color;
return x;
}
Right rotations and left rotations are used both to enforce the left leaning properties, as well as keep the tree balanced while preserving the binary search tree property. Without rotations self balancing trees would not be possible without rebuilding the entire tree from the ground up (such as in scapegoat trees)
rbnode_t* rotateLeft(rbnode_t* h)
{
rbnode_t* x = h->right;
h->right = x->left;
x->left = h;
x->color = x->left->color;
x->left->color = red;
return x;
}
rbnode_t* rotateRight(rbnode_t* h)
{
rbnode_t* x = h->left;
h->left = x->right;
x->right = h;
x->color = x->right->color;
x->right->color = red;
return x;
}
Performing an Insertion on a Red Black Tree
With our restructuring functions in place we now have everything we need to implement an insertion. It should be noted that unlike the original Red/Black trees which lended themselves well to an iterative process, Left Leaning Red Black trees take better advantage of the recursive nature of binary search trees, and due to their better balancing, even large node counts will avoid excessive stack depth: LLRB have a 2LogN height guarantee.
Inserting a node into a LLRB is essentially a four step process:
- Following the binary search tree property, insert a node at the bottom of the tree
- split any "four nodes" that are encountered on the way down the tree. (colorFlip())
- enforce the "left leaning condition"
- balance any remaining four nodes"
These four steps are accomplished by adding three conditionals to the standard binary search tree insertion:
rbnode_t* put(rbnode_t* x, int k, int v)
{
/* 1) insert new nodes at the bottom of the tree */
if (x == nullptr)
return new node(k, v, red);
/* 2) split any four nodes */
if (isRed(x->left) && isRed(x->right))
x = colorFlip(x);
if (k < x->key)
x->left = put(x->left, k, v);
else
x->right = put(x->right, k, v);
/* 3) Enforce "Left Leaning" property */
if (isRed(x->right) && !isRed(x->left))
x = rotateLeft(x);
/* 4) Balance any "four nodes" */
if (isRed(x->left) && isRed(x->left->left))
x = rotateRight(x);
return x;
}
I kid you not, insertion into a Left Leaning Red Black tree it is as simple as that.
Thoughts
Thanks to continued research what used to be a very complicated, particularly tricky process that took almost 100 lines of code has been reduced to ~35 lines of very understandable code, and with it the power of redblack trees has been distilled to its essences while preserving its power, integrity, and useability. Stay tuned for part 2, where we will covere deleting items from a Left Leaning Red Black tree, another process which was been greatly simplified.
For a full implementation of the code that was used in this article, check out my github at:
https://github.com/maxgoren/LeftLeaningRedBlack/blob/main/lrb-recursivev2.cpp
-
Pratt Parsing: Top-Down Operator Precedence
-
Generating P-Code by AST Traversal
-
Ternary Search Tries: String Specific Ordered Symbol Tables
-
Digital Search Trees
-
Lossless Compression Part III: Huffman Coding
-
Lossless Compression Part II: The LZ77 Algorithm
-
Lossless Compression Part I: Working with Bits in a Byte Oriented World
-
Bottom Up AVL Tree: The OG Self-Balancing Binary Search Tree
-
A Different Take on Merge Sort: Binary Search Trees?
-
Deleting Entries From Open-Address Hash tables
Leave A Comment